NAS-130427 / 25.04 / Update Linux kernel to next LTS release v6.12 #200
This commit adds TrueNAS build customization required for building Debian packages for TrueNAS SCALE kernel. The original commit ported from v6.6 is 0c5b36a.
It seems the "NAS-123903 / NFS41 ACL support for NFS4 protocol" commit, which was previously squashed, somehow got split back up. I don't have a preference, but it makes me wonder whether it is the same version or whether there are any artifacts; it may be worth checking. @bmeagherix
fs/xattr.c
Outdated
if (size > XATTR_SIZE_MAX) {
	if ((size > XATTR_LARGE_SIZE_MAX) ||
	    (IS_LARGE_XATTR(path.dentry->d_inode) == 0)) {
		return -E2BIG;
I have a feeling that path_put(&path) and kvfree(ctx.kvalue) are missing here.
While we're here, I wonder what the point was of checking for size and then for size > XATTR_SIZE_MAX. It is not new, though, and insignificant.
This looks like it got moved from where it should be because of upstream refactoring in xattr.c. It should be in do_setxattr, otherwise you're introducing a difference between setxattr and fsetxattr.
When I was working on an updated 6.6 for testing purposes I did something like this:
commit 8a1aae17e6f54be3fda82a76a03f07004d7a730c
Author: Andrew Walker <[email protected]>
Date:   Thu Oct 31 16:24:00 2024 -0400

    Fix do_setxattr

diff --git a/fs/xattr.c b/fs/xattr.c
index ce0d11556bea..30f2a5822d4f 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -669,6 +669,13 @@ int do_setxattr(struct mnt_idmap *idmap, struct dentry *dentry,
 		return do_set_acl(idmap, dentry, ctx->kname->name,
 				  ctx->kvalue, ctx->size);
 
+#ifdef CONFIG_TRUENAS
+	if (ctx->size &&
+	    (ctx->size > XATTR_SIZE_MAX) &&
+	    (IS_LARGE_XATTR(d->d_inode) == 0)) {
+		return -E2BIG;
+	}
+#endif
 	return vfs_setxattr(idmap, dentry, ctx->kname->name,
 			ctx->kvalue, ctx->size, ctx->flags);
 }
@@ -677,17 +684,6 @@ static int path_setxattr(const char __user *pathname,
 			 const char __user *name, const void __user *value,
 			 size_t size, int flags, unsigned int lookup_flags)
 {
-#ifdef CONFIG_TRUENAS
-	if (size) {
-		if (size > XATTR_SIZE_MAX) {
-			if ((size > XATTR_LARGE_SIZE_MAX) ||
-			    (IS_LARGE_XATTR(d->d_inode) == 0)) {
-				return -E2BIG;
-			}
-		}
-	}
-#endif
-
 	struct xattr_name kname;
 	struct xattr_ctx ctx = {
 		.cvalue   = value,
I have updated as you suggested, can you please take a look?
Looking at this again (it's been a few years), we may also want to set an upper bound in setxattr_copy (unrelated to the current questions):
diff --git a/fs/xattr.c b/fs/xattr.c
index c5a8a7c3fb4a..a9b12be7283f 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -646,7 +646,10 @@ int setxattr_copy(const char __user *name, struct xattr_ctx *ctx)
 	error = 0;
 	if (ctx->size) {
-#ifndef CONFIG_TRUENAS
+#ifdef CONFIG_TRUENAS
+		if (ctx->size > XATTR_LARGE_SIZE_MAX)
+			return -E2BIG;
+#else
 		if (ctx->size > XATTR_SIZE_MAX)
 			return -E2BIG;
 #endif
This avoids a large memdup_user(). Thoughts?
If we make the change above, we can simplify the check in do_setxattr to return -E2BIG when the filesystem does not support large xattrs and the xattr size is larger than XATTR_SIZE_MAX.
Thanks, I have updated accordingly to check for the hard limit of XATTR_LARGE_SIZE_MAX in setxattr_copy.
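The two-stage check agreed on in this thread can be sketched in plain user-space C. This is only an illustration of the logic, not the kernel patch: the function names are hypothetical, the boolean stands in for IS_LARGE_XATTR(), and the 2 MiB value for XATTR_LARGE_SIZE_MAX is an assumption taken from the upper bound mentioned in the SMB streams commit message in this PR.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* 64 KiB: the stock Linux limit (XATTR_SIZE_MAX in <linux/limits.h>). */
#define XATTR_SIZE_MAX        65536u
/* Assumed value for illustration only (2 MiB). */
#define XATTR_LARGE_SIZE_MAX  (2u * 1024u * 1024u)

/* Stage 1, modeled on the setxattr_copy() suggestion: enforce the hard
 * cap before the user buffer would be copied. */
static int copy_stage_check(size_t size)
{
	if (size > XATTR_LARGE_SIZE_MAX)
		return -E2BIG;
	return 0;
}

/* Stage 2, modeled on the simplified do_setxattr() check: sizes above
 * the stock limit are allowed only on filesystems that support large
 * xattrs. */
static int set_stage_check(size_t size, bool fs_supports_large)
{
	if (size > XATTR_SIZE_MAX && !fs_supports_large)
		return -E2BIG;
	return 0;
}

/* Hypothetical wrapper that runs both stages in order. */
int xattr_size_check(size_t size, bool fs_supports_large)
{
	int err = copy_stage_check(size);

	if (err)
		return err;
	return set_stage_check(size, fs_supports_large);
}
```

Because both setxattr and fsetxattr go through do_setxattr, placing the second stage there keeps the two syscalls consistent, which is the concern raised earlier in the thread.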
The indentation is not right again.
Sorry about that, it should be fixed now.
I cherry-picked the last pushed artifacts from the source branch of #142 to break it down and make it easier to port. It's worth taking a look there @bmeagherix.
9515f5f to ed8fc7f
Just reviewed the commits for which I was the original author. LGTM.
I guess you can remove Makefile.orig and smb2pdu.h.orig, as they seem to be leftover files from c80fcec.
ed8fc7f to 8b1a0f2
Thanks, I have updated.
776bcde to d5174dc
Support for alternate datastreams over the SMB protocol has been historically enabled in such a way that Samba writes them as filesystem extended attributes in the user namespace. FreeBSD has no practical limit on xattr size, and so clients (often MacOS) may write ones that exceed the 64 KiB limit imposed by the Linux kernel. Since XATTR_SIZE_MAX is used in many places in the kernel, and not all filesystems support large xattrs, introduce a new constant XATTR_LARGE_SIZE_MAX that is used as an alternate value if the filesystem sb_flags has SB_LARGEXATTR. There will be a corresponding commit in ZFS to set this flag when it is defined and xattrs are enabled on the ZFS dataset. This commit also introduces the flag SB_NFSV4ACL, which will be used to indicate and enable NFSv4-specific behavior in the kernel with regard to permissions. These new features / alternate behavior are controlled by the compile-time kernel configuration flag CONFIG_TRUENAS, which defaults to n (off). In principle, TrueNAS-specific changes that deviate from a vanilla Linux kernel can be removed for testing purposes by setting CONFIG_TRUENAS=n in the relevant build scripts.
Signed-off-by: Andrew Walker <[email protected]>
There are various places in which evaluation of permissions in the presence of an NFSv4 ACL is more nuanced than what is typical when evaluating traditional POSIX permissions. For example, a user may be permitted to delete a file if they have DELETE permission on the file or DELETE_CHILD permission on the parent directory. Traditional POSIX permissions will only check for MAY_WRITE | MAY_EXEC on the parent directory. Several new inode permission masks have been added to facilitate these NFSv4-specific checks, corresponding to different NFSv4 permissions that grant abilities to make changes to files. For the purpose of this commit and the goal of providing a rough approximation of NFSv4 access checks, only write (and not read) access checks have been implemented. This is selectively done in a way to grant minimal compliance with permissions as defined in RFC-5661. The new permissions-related behavior is only applied when the inode sb_flag SB_NFS4ACL is present. In this case, the onus of full implementation of the requisite features to satisfy the ACL behavior specified in RFC-5661 is delegated to the filesystem's inode permissions interface (i_op->permission). If possible, we try to check for the conventional POSIX permission first before trying the NFSv4 equivalent. For example, when writing an xattr, we check for WRITE_DATA before WRITE_NAMED_ATTRS, because in the former case with a trivial ACL we can avoid having to evaluate the full ACL and instead merely look at the POSIX mode.
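The delete example from this commit message can be written out as a small user-space C sketch. The PERM_* bit names and the may_delete() helper are hypothetical stand-ins (they are not the kernel's MAY_* masks), and the function only mirrors the rule as stated in the text: DELETE on the file or DELETE_CHILD on the parent when NFSv4 ACLs are in effect, write plus execute on the parent otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical permission bits, for illustration only. */
enum {
	PERM_WRITE        = 1 << 0,
	PERM_EXEC         = 1 << 1,
	PERM_DELETE       = 1 << 2,
	PERM_DELETE_CHILD = 1 << 3
};

/*
 * nfs4acl:      whether the filesystem advertises NFSv4 ACL semantics
 * file_perms:   permissions the caller holds on the file itself
 * parent_perms: permissions the caller holds on the parent directory
 */
bool may_delete(bool nfs4acl, unsigned int file_perms,
		unsigned int parent_perms)
{
	if (nfs4acl) {
		/* NFSv4 rule from the text: either bit is sufficient. */
		return (file_perms & PERM_DELETE) ||
		       (parent_perms & PERM_DELETE_CHILD);
	}

	/* Traditional POSIX rule: write and execute on the parent. */
	return (parent_perms & (PERM_WRITE | PERM_EXEC)) ==
	       (PERM_WRITE | PERM_EXEC);
}
```

In the actual change the final decision is delegated to the filesystem through i_op->permission, so this sketch only shows why a single write-and-execute check on the parent is not enough to express the NFSv4 case.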
csiostor seems to cause Chelsio T6 firmware to crash. Jira: NAS-110910
When anything is written to it, it waits for all device probes to complete before returning. After that, `udevadm settle`, as used by the ZFS scripts, really can provide the system with all disks detected for boot pool import. Ticket: NAS-108200
Enable NTB and NTB tools in the TrueNAS config. In addition, enable the Intel NTB driver, so that we have at least one NTB driver available.
Added initial support for PLX Non-Transparent Bridge.
Before this change it was impossible to load client modules before NTB hardware is probed. This change removes the limitation. New NTB transports will get their child devices as they come in.
This fixes interrupt storms on hardware using legacy level-triggered interrupts: doorbell processing could take time after the interrupt handler completed, which triggered extra interrupts in a loop.
If a previous successful run is present, then skip re-run for its pull request. Signed-off-by: Umer Saleem <[email protected]>
Previously for TrueNAS, the Debian Linux kernel configuration was used and TrueNAS config options were added on top of it. Because of that, the TrueNAS kernel config in 'scripts/package/truenas/tn.config' has grown very large and difficult to manage for TrueNAS-only options. The Debian Linux kernel configuration for version 6.1.55 has been added as 'debian_amd64.config' to keep the options from Debian separate from the TrueNAS options. TrueNAS-only config options are stored in 'truenas.config'.
Kernel configuration can now be generated as:
1) make ARCH=x86_64 defconfig
2) ./scripts/kconfig/merge_config.sh .config \
   ./scripts/package/truenas/debian_amd64.config
3) ./scripts/kconfig/merge_config.sh .config \
   ./scripts/package/truenas/truenas.config
4) ./scripts/kconfig/merge_config.sh .config \
   ./scripts/package/truenas/debug.config
Signed-off-by: Umer Saleem <[email protected]>
Without this it was impossible to use multiple NTB consumer drivers, since the kernel just attached the first one to all NTBs. This replicates the driver_override device attribute from PCI, plus adds a module parameter to ease default setting.
Signed-off-by: Alexander Motin <[email protected]>
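As a rough illustration of the driver_override idea borrowed from PCI, the match decision can be sketched in user-space C as below. The helper name ntb_driver_matches() is hypothetical and this is not the code from the patch; it only assumes that, as on the PCI bus, a non-empty override restricts binding to the driver with exactly that name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * driver_override: per-device override string, NULL or "" when unset
 * driver_name:     name of the driver being offered to the device
 */
bool ntb_driver_matches(const char *driver_override, const char *driver_name)
{
	/* With an override set, only the named driver may bind. */
	if (driver_override != NULL && driver_override[0] != '\0')
		return strcmp(driver_override, driver_name) == 0;

	/* Without one, any driver matches, so the first registered wins. */
	return true;
}
```

The second branch reproduces the behavior the commit message describes as the problem: with no override, whichever consumer driver comes first is attached to every NTB device.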
… reuse Also now use ACE4SIZE and add a check wrt remaining in convert_nfs41xdr_to_nfs40_acl / convert_to_nfs40_ace
The nfs4_xattr_list_nfs4_acl_xdr, nfs4_xattr_get_nfs4_acl_xdr and nfs4_xattr_set_nfs4_acl_xdr functions are implemented using the existing client-side support for DACL. Currently only OWNER@, GROUP@, EVERYONE@ and numeric uids or gids are supported in the ACEs.
This follows upstream 0192445.
Add an xattr handler with same namespace as ZFS ACL xattr handler to allow userspace utilities to easily preserve and convert contents of SMB Security Descriptor DACL into native ZFS ACL when ingesting data during migration via SMB client (for example using rsync with the explicit option to preserve the xattr in question). This PR also adds a new procfs endpoint: /proc/fs/cifs/zfsacl_configuration_options that can be used to control error handling for cases where we can't convert SID into a Unix ID, and currently also whether we allow setting NT ACL via xattr writes on remote SMB server (disabled by default).
DCB seems to be generating spam on the console for users with Chelsio T4. Disable DCB features for Chelsio T4. Signed-off-by: Umer Saleem <[email protected]>
On several systems we've noticed that when the NTB link goes down, the Physical Layer User Test Pattern registers we use as additional scratchpad registers (which is explicitly allowed by the chip specs) become read-only for about 100us. I see no explanation for this in the chip specs, nor for why it was not seen before; it may be a race. Since we do need these registers, work around it by repeating writes until we succeed or a 1ms timeout expires.
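The workaround amounts to a write, read back, retry loop with a bound. Below is a minimal user-space sketch of that pattern against a simulated register; struct fake_reg, its helpers, and the poll-count bound are all hypothetical, and a real driver would use MMIO accessors and a time-based limit (about 1 ms here) with a short delay between attempts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated register that silently ignores writes while "locked". */
struct fake_reg {
	uint32_t value;
	unsigned int locked_polls;	/* writes still to be ignored */
};

static void reg_write(struct fake_reg *r, uint32_t v)
{
	if (r->locked_polls > 0) {
		r->locked_polls--;	/* write is dropped */
		return;
	}
	r->value = v;
}

static uint32_t reg_read(const struct fake_reg *r)
{
	return r->value;
}

/* Repeat the write until the read-back matches or max_polls attempts
 * have been made. Returns true on success, false on timeout. */
bool write_with_retry(struct fake_reg *r, uint32_t v, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		reg_write(r, v);
		if (reg_read(r) == v)
			return true;
		/* A real driver would wait briefly here before retrying. */
	}
	return false;
}
```

Verifying by read-back rather than assuming the write landed is what makes the loop safe when the register is temporarily read-only.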
Signed-off-by: Umer Saleem <[email protected]>
Debian has defaulted to compressing the modules in XZ format. This introduces some changes that are not desired. The debug info/symbols for modules are not stripped anymore. As a result, the size of the modules increases tenfold. Previously, the debug symbols for modules and kernel were packaged separately in a -dbg package. But after this config update, the -dbg package only contains symbols for the kernel binary itself, and kernel modules have built-in debug symbols. The size of the ISO image increases 3-4 times because of this. Revert to the previous state for this config.
Signed-off-by: Umer Saleem <[email protected]>
This adds support to emulate NVME functionality for the drives connected on Trimode HBA. Userspace tools can talk to the NVME device through passthrough commands, e.g., nvme-cli can identify controller using `nvme id-ctrl /dev/sdX`.
SMB Protocol Background:
------------------------
Filesystems presented over the SMB protocol may support alternate data streams ("named streams") within a file or a directory. This support is designated by the filesystem attribute FILE_NAMED_STREAMS. Named streams are not identical to extended attributes (EAs), which may also be supported by the same SMB server. A named stream is a place within a file in addition to the main stream (normal file data) where data is stored. Named streams have different data than the main stream (and than each other) and may be written and read independently of each other.
Named streams for a file are designated by appending a ":" colon character to the file name followed by the name of the alternate data stream. Stream names may be no more than 255 characters in length and are subject to the characteristics and limitations documented in MS-FSCC Section 2.1.5 Pathname and following. A list of named streams for a file can be gathered by submitting an SMB2_QUERY_INFO request for FILE_STREAM_INFORMATION. The expected server response is documented in MS-FSCC Section 2.4.43 FileStreamInformation. Streams are typically smallish in size (less than 200 bytes individually), and are rarely used apart from MacOS SMB clients.

TrueNAS / ZFS background:
------------------------
Solaris supported a similar feature set through its file-backed xattr capabilities and APIs. This meant that the kernel SMB server in Solaris was able to seamlessly provide support for named streams. When ZFS was ported to FreeBSD and Linux, the extattr and xattr OS APIs were layered on top of the ZFS file-backed xattrs.
As time progressed and ZFS on Linux saw more use, it was determined that the performance and lack of atomicity of operations on file-backed xattrs were insufficient for some application requirements (this was especially the case for Samba shares). This eventually led to the ZFS dataset configuration parameter for SA-backed xattrs on Linux (which is the TrueNAS default). With this configuration, xattrs up to a certain size are written as SA, and larger xattrs are written as files. The practical result of this is that TrueNAS can support extended attributes that are much greater in size than a traditional Linux file server. Unfortunately, due to the inability to perform partial reads and writes on extended attributes, a 2 MiB upper bound is placed on the maximum size of a single extended attribute / named stream in TrueNAS.

Samba background:
-----------------
Samba has the ability to present extended attributes as named streams to SMB clients. This is achieved by prepending a special prefix to the extended attribute (to differentiate the streams xattrs from normal xattrs that are presented as EAs over the SMB protocol). Due to historical design decisions, the Samba module in charge of translating xattrs into streams appends an extra NULL byte to the xattr on writes to the local filesystem and strips it off when converting to a stream for SMB clients.

Implementation details:
-----------------------
This commit adds support for the Linux kernel SMB2/3 client to enumerate streams on a remote SMB server by including them in the output of listxattr with the special Samba prefix. Streams may be written to the remote SMB server via setxattr and read through getxattr. The Samba-specific behavior for appending / removing an extra byte to the xattr can be disabled by setting /proc/fs/cifs/stream_samba_compat to 0.
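The extra-byte handling described above can be illustrated with a short user-space C sketch. The function names are hypothetical, and the direction of the conversion (append one NUL byte when a stream is exposed as an xattr value, drop one byte when an xattr value is written back as a stream) is an assumption based on the description of the Samba behavior, not a copy of the client code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Build the xattr value for a stream. With samba_compat set, one trailing
 * NUL byte is appended. Returns false if the output buffer is too small;
 * on success *out_len holds the number of bytes written.
 */
bool stream_to_xattr(const void *stream, size_t len, void *out,
		     size_t out_cap, bool samba_compat, size_t *out_len)
{
	size_t need = len + (samba_compat ? 1 : 0);

	if (need > out_cap)
		return false;
	memcpy(out, stream, len);
	if (samba_compat)
		((unsigned char *)out)[len] = 0;
	*out_len = need;
	return true;
}

/* Length of the stream data carried by an xattr value of xattr_len bytes:
 * the trailing byte is dropped only in compatibility mode. */
size_t xattr_to_stream_len(size_t xattr_len, bool samba_compat)
{
	if (samba_compat && xattr_len > 0)
		return xattr_len - 1;
	return xattr_len;
}
```

With the compatibility switch off, both helpers pass the data through unchanged, which corresponds to writing 0 to /proc/fs/cifs/stream_samba_compat.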
PCIe ACS (Access Control Services) is the PCIe 2.0+ feature that allows us to control whether transactions are allowed to be redirected in various subnodes of a PCIe topology. For instance, if two endpoints are below a root port or downstream switch port, the downstream port may optionally redirect transactions between the devices, bypassing upstream devices. The same can happen internally on multifunction devices. The transaction may never be visible to the upstream devices.
One upstream device that we particularly care about is the IOMMU. If a redirection occurs in the topology below the IOMMU, then the IOMMU cannot provide isolation between devices. This is why the PCIe spec encourages topologies to include ACS support. Without it, we have to assume peer-to-peer DMA within a hierarchy can bypass IOMMU isolation.
Unfortunately, far too many topologies do not support ACS to make this a steadfast requirement. Even the latest chipsets from Intel are only sporadically supporting ACS. We have trouble getting interconnect vendors to include the PCIe spec required PCIe capability, let alone suggested features.
Therefore, we need to add some flexibility. The pcie_acs_override= boot option lets users opt-in specific devices or sets of devices to assume ACS support. The "downstream" option assumes full ACS support on root ports and downstream switch ports. The "multifunction" option assumes the subset of ACS features available on multifunction endpoints and upstream switch ports are supported. The "id:nnnn:nnnn" option enables ACS support on devices matching the provided vendor and device IDs, allowing more strategic ACS overrides. These options may be combined in any order. A maximum of 16 id specific overrides are available. It's suggested to use the most limited set of options necessary to avoid completely disabling ACS across the topology.
Note to hardware vendors, we have facilities to permanently quirk specific devices which enforce isolation but do not provide an ACS capability. Please contact me to have your devices added and save your customers the hassle of this boot option.
Signed-off-by: Alex Williamson <[email protected]>
Committed-by: Umer Saleem <[email protected]>
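To make the option syntax concrete, here is a user-space C sketch of parsing a pcie_acs_override= value. It is not the parser from the patch: the struct, the function name, and the assumption that options are comma-separated are illustrative choices; only the three option forms and the limit of 16 id entries come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_ACS_IDS 16	/* limit stated in the commit message */

struct acs_override {
	bool downstream;
	bool multifunction;
	unsigned int n_ids;
	struct { unsigned int vendor, device; } ids[MAX_ACS_IDS];
};

/* True if the token arg[0..len) equals the string word exactly. */
static bool token_is(const char *arg, size_t len, const char *word)
{
	return strlen(word) == len && strncmp(arg, word, len) == 0;
}

/* Parse a comma-separated option string (separator is an assumption).
 * Returns 0 on success, -1 on an unknown token or too many ids. */
int parse_acs_override(const char *arg, struct acs_override *out)
{
	memset(out, 0, sizeof(*out));

	while (*arg) {
		size_t len = strcspn(arg, ",");
		unsigned int vendor, device;
		int used = 0;

		if (token_is(arg, len, "downstream")) {
			out->downstream = true;
		} else if (token_is(arg, len, "multifunction")) {
			out->multifunction = true;
		} else if (sscanf(arg, "id:%4x:%4x%n", &vendor, &device,
				  &used) == 2 && (size_t)used == len) {
			if (out->n_ids == MAX_ACS_IDS)
				return -1;
			out->ids[out->n_ids].vendor = vendor;
			out->ids[out->n_ids].device = device;
			out->n_ids++;
		} else {
			return -1;
		}

		arg += len;
		if (*arg == ',')
			arg++;
	}
	return 0;
}
```

For example, "downstream,id:8086:1d3a" would set the downstream flag and record one vendor/device pair, leaving multifunction overrides off.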
The mpt3sas driver does not create a parent end device for PCIe types where the SAS address is stored, causing the ses driver to not add PCIe device types connected to a tri-mode HBA. To address this, the fallback mechanism reads the SAS address from the VPD 0x83 page. This change is inspired by commit 9927c68.
Upstream commit 6ebfede changed API for SMB2_set_eof() which introduced regression in truncation of SMB alternate data streams.
nfsd supports up to 1024 ACEs. Windows servers support a security descriptor size of up to 64 KiB, which translates to around 1700 ACEs, but since we don't locally support that size we'll keep both at 1024 max.
Enable the CONFIG_WERROR option, which treats all warnings as errors during the kernel build. Signed-off-by: Umer Saleem <[email protected]>
The label err_dma_mask is not being used and generates a warning at build time. With CONFIG_WERROR enabled, this warning is treated as an error and breaks the build. This commit removes the label for now. Signed-off-by: Umer Saleem <[email protected]>
d5174dc to 233cc07
Not updating JIRA ticket https://ixsystems.atlassian.net/browse/NAS-130427 target versions as no JIRA version corresponds to this PR
This PR has been merged and conversations have been locked.
All of our updates have been ported on top of Linux v6.12.0. This includes all the patches from #139 (Update to 6.6) and other PRs that were merged into truenas/linux-6.6.
The following patches were updated to resolve merge conflicts while porting to 6.12:
Add initial support for large xattrs
Introduce driver_override for NTB devices (this patch wasn't updated itself, but since struct bus_type ntb_bus is now declared as const, there was a merge conflict)
ahciem: Emulate SES enclosure for AHCI enclosures
Implement native NFSv4 ACLs in NFS server
nvme: skip optional id ctrl csi for versions less than 2.0.0
Introduce NVDIMM NTB mirroring driver
Add DACL support to nfsd (v4.1+)
fs/cifs - add ZFS ACL support to SMB client
A 25.04 image with 6.12 is available here. API tests show 6 failures.