Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store common Linux xattrs as native SA with xattr=sa (WIP) #2809

Closed
wants to merge 5 commits into from

Conversation

dweeezil
Copy link
Contributor

There are 2 commits here. The first, dweeezil/zfs@c53847e, fixes deletion of system attributes.

The second is a intended to be the foundation for fixing #2718 but it should otherwise be able to stand on its own.

I've given this patch stack fairly light manual testing as well as running the "pjd" posixacl test suite and some short ztest runs.

I'm submitting this now as a functional WIP to get feedback from other developers regarding the overall concept and its implementation. There are a handful of issues I'll point out about the code:

  • The list of xattrs to store as native SAs is stored statically in zpl_xattr.c which made it difficult to share with zdb. The same list (in a different format) exists within zdb.c as part of the new dump_znode_native_sa_xattr() function.
  • The management of existing ZPL_DXATTR SAs required a helper function to translate from a zpl_attr_t to its name. As far as I could tell, there were no existing "upcalls" from the ZoL ZPL to the native Solaris ZPL. Rather than trying to #include the ZPL definitions in zfs_sa.c, I simply added an appropriate extern declaration. This goes against the current coding style.
  • The ZPL_DXATTR SA cleanup performed by zfs_sa_remove_xattr adds overhead which could be avoided if it was known that no ZPL_DXATTRs exist. To that end, it's only invoked after determining whether a ZPL_DXATTR exists (which should be a low-overhead operation).
  • I'd like to add a feature flag which is activated when a Native SA xattr is set for the first time; the feature flag would not be able to be deactivated. We might also want a feature flag to do the same with the existing ZPL_DXATTR SAs.

I'd also like feedback on the first commit in this stack. I discovered the problem during testing. My concern is that all other OpenZFS implementations likely share the identical code. I think the problem isn't seen often because SAs are rarely deleted. My test system created sort of a "perfect storm" by "stacking" all the variably-sized SAs at the ends of the registration.

@dweeezil dweeezil force-pushed the xattr-security-posixacl-sa branch from 8e4b9ab to 1e69e36 Compare October 20, 2014 12:49
@dweeezil
Copy link
Contributor Author

@maxximino That's a very good question and is something I've been meaning to test. I whipped up a little script to set a bunch of posix acls:

#!/bin/sh

X=1000
echo blah > blah
while true; do
    setfacl -m u:$X:rw blah
    if [ $? -ne 0 ]; then
        echo failed at $X
        exit 0
    fi
    X=$(( $X + 1 ))
done

And I ran it on ext4, xfs and zfs with my patch.

(ext4)
# ./setfacl-max.sh 
setfacl: blah: No space left on device
failed at 1503
(xfs)
# ~/setfacl-max.sh 
setfacl: blah: Argument list too long
failed at 1021
(zfs with patch)
# ~/setfacl-max.sh 
^C
# getfacl blah | wc
   3211    3216   44925

I ended up interrupting it under zfs because the following was being logged:

[  278.899844] SPL: large kmem_alloc(25636, 0xd0) at zpl_set_acl:874 (3354738/3382344)
[  278.900921] SPL: large kmem_alloc(25644, 0xd0) at zpl_set_acl:874 (3355248/3382862)
[  278.902128] SPL: large kmem_alloc(25652, 0xd0) at zpl_set_acl:874 (3355758/3383380)
[  278.903303] SPL: large kmem_alloc(25660, 0xd0) at zpl_set_acl:874 (3356268/3383898)

Interesting, I didn't realize kmem_alloc() was being used like that up in the SPL layer. I suppose that can be fixed, or we can cap the size at some reasonable limit.

I suppose another question would be to ask what happens on a "normal" ZFS system when its native ZPL_DACL_ACES is too large. Clearly this is something that could happen under Solaris or Illumos.

It doesn't look like, at least for posix acls, that the size of an SA we can hold in a spill block is going to be a limiting factor relative to other file systems in common use.

Of course, this test is only working with posix ACLs, when selinux xattrs are added to the mix, the amount of space available for posix acls will be a bit lower.

That all said, the main reason for this scheme of storing these xattrs is to more easily allow them to be created as an atomic part of file/directory/symlink/whatever creation. Creating them atomically as either directory-style or ZPL_DXATTR-style xattrs would be a major pain in the butt, whereas creating them as native SAs will be fairly trivial. It wouldn't be terribly difficult to let a "set" operation allow to overflow into a traditional directory-style xattr in the context of my patch, but during the creation of a new object, they'd be created as native SA xattrs.

I'll work on refining the patch based on this information.

@dweeezil
Copy link
Contributor Author

I replaced the kmem_alloc with vmem_alloc in zpl_set_acl() and zpl_get_acl() to get an idea how many posix acls we can fit:

# ~/setfacl-max.sh 
setfacl: blah: Argument list too long
failed at 9187

I'm going to poke around in the other native file systems to see how they handle storing the output of posix_acl_to_xattr().

@dweeezil
Copy link
Contributor Author

@behlendorf Thanks for the review. I'd like to try to get a second version of this pushed within the next day.

@behlendorf
Copy link
Contributor

I think the general design here is solid it just needs a little refinement.

As you mentioned in your original comment we should add a new feature flag for this. I think we should consider using the same feature flag for both the old ZPL_DXATTR SA functionality and the new native SAs added by this patch. One nice way to go about this would be to use the spa_feature_incr and spa_feature_decr functionality to count the number of znodes which are using the feature. This would allow us to disable the feature if there were no longer any consumers of it. One small wrinkle with this plan would be how to handle existing filesystems with SA xattrs.

It would be nice to use this pull request to build up a patch stack which completely addresses #2718. Patches early in the stack like this one can lay the needed ground work and subsequent ones can leverage it. Then the whole branch could be merged once everyone was happy with it.

It's also probably the opportunity to determine if we're really happy with the current xattr=SA behavior. I think there's probably a legitimate case to be made that the existing behavior is confusing and should perhaps be simplified.

@dweeezil
Copy link
Contributor Author

@behlendorf My plan was definitely to layer the fix for #2718 atop this foundation. It would make perfect sense for it all to be done as part of this pull request.

@dweeezil dweeezil force-pushed the xattr-security-posixacl-sa branch from 1e69e36 to 533cf28 Compare October 27, 2014 18:43
The sa_modify_attrs() function can add, remove or replace an SA.
The main loop in the function uses the index "i" to iterate over the
existing SAs and uses the index "j" for writing them into a new buffer
bia SA_ADD_BULK_ATTR().  The write index, "j" is incremented on remove
(SA_REMOVE) operations which leads to a corruption in the new SA buffer.
This patch remove the increment for SA_REMOVE operations.
This patch adds a new class of system atributes which are referred to as
"Native SA xattrs".  The facilities described here are only enabled when
"xattr=sa" is set.

If xattr=sa is set, the following "security." and "system." xattrs are
stored as native SA xattrs rather than as elements of the ZPL_DXATTR SA:

          xattr                      System atrribute
    --------------------------------------------------------
    security.selinux            ZPL_SECURITY_SELINUX
    security.capability         ZPL_SECURITY_CAPABILITY
    system.posix_acl_access     ZPL_SYSTEM_POSIX_ACL_ACCESS
    system.posix_acl_default    ZPL_SYSTEM_POSIX_ACL_DEFAULT

Storing these xattrs as native system attributes allows for the ZPL to
more easily and naturally operate on them as an atomic part of other
operations and will be used as the foundation for fixing issue openzfs#2718.

Zdb will display these under the new "Native SA xattrs" section.

Lookups of these xattrs will use the following priority:

    1. Native SA xattr (as shown in the able above)
    2. Linux ZPL_DXATTR nvlist
    3. Traditional ZFS directory-style xattr

Modifications of these xattrs will erase an existing ZPL_DXATTR instance
of an identically-named xattrs but will not change an existing instance
of an identically-named directory-style xattr.

If a modification of the ZPL_DXATTR SA causes it to become empty, the
ZPL_DXATTR SA, itself is deleted to maximize available space in the
bonus buffer.

The effect is that existing ZPL_DXATTR SAs are automatically upgraded
as they are changed.
Add the zpl_init_attrs() function to centralize the creation of xattrs
for newly-created files.
kernelOfTruth added a commit to kernelOfTruth/zfs that referenced this pull request Dec 13, 2014
[dweeezil]

There are 2 commits here. The first, dweeezil/zfs@c53847e, fixes deletion of system attributes.

The second is a intended to be the foundation for fixing openzfs#2718 but it should otherwise be able to stand on its own.

I've given this patch stack fairly light manual testing as well as running the "pjd" posixacl test suite and some short ztest runs.

I'm submitting this now as a functional WIP to get feedback from other developers regarding the overall concept and its implementation. There are a handful of issues I'll point out about the code:

The list of xattrs to store as native SAs is stored statically in zpl_xattr.c which made it difficult to share with zdb. The same list (in a different format) exists within zdb.c as part of the new
dump_znode_native_sa_xattr() function.
The management of existing ZPL_DXATTR SAs required a helper function to translate from a zpl_attr_t to its name. As far as I could tell, there were no existing "upcalls" from the ZoL ZPL to the native
Solaris ZPL. Rather than trying to #include the ZPL definitions in zfs_sa.c, I simply added an appropriate extern declaration. This goes against the current coding style.
The ZPL_DXATTR SA cleanup performed by zfs_sa_remove_xattr adds overhead which could be avoided if it was known that no ZPL_DXATTRs exist. To that end, it's only invoked after determining whether a
ZPL_DXATTR exists (which should be a low-overhead operation).
I'd like to add a feature flag which is activated when a Native SA xattr is set for the first time; the feature flag would not be able to be deactivated. We might also want a feature flag to do the same
with the existing ZPL_DXATTR SAs.
I'd also like feedback on the first commit in this stack. I discovered the problem during testing. My concern is that all other OpenZFS implementations likely share the identical code. I think the
problem isn't seen often because SAs are rarely deleted. My test system created sort of a "perfect storm" by "stacking" all the variably-sized SAs at the ends of the registration.
@behlendorf behlendorf added this to the 0.6.5 milestone Feb 6, 2015
@dweeezil dweeezil force-pushed the xattr-security-posixacl-sa branch from 533cf28 to 8d77a77 Compare March 23, 2015 02:34
@behlendorf behlendorf changed the title Store common Linux xattrs as native SA with xattr=sa Store common Linux xattrs as native SA with xattr=sa (WIP) Jun 18, 2015
@behlendorf behlendorf removed this from the 0.6.5 milestone Jul 16, 2015
@behlendorf
Copy link
Contributor

@dweeezil I'm going to close out this PR for now but I do think this is an issue we'll need to eventually address. By all means open a PR when you have time to work on this again. There's some overlap with existing already merged patches.

21f604d Prevent duplicated xattr between SA and dir
43b4935 Prevent SA length overflow

@behlendorf behlendorf closed this Jan 20, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants