ZFS Crypto support #494
Comments
I'd like to hold off on this for the moment; we have enough other work on our plate and this is a huge change! If Illumos puts together an implementation we'll happily look at integrating it. We could even use the source from ZFS Pool Version 30 if Oracle decides to release the source 6-12 months from now (unlikely but possible). |
If you don't mind LUKS, I might have some time to look at this in a week or two. |
I'm OK with making it easier to layer zfs on top of LUKS, that would be nice. It's just not what most people think of when they say zfs encryption support. |
I was rather thinking of 'cloning'/'copying' the way LUKS works. Or rather, using the LUKS API inside ZFS. LUKS is used by 'cryptsetup' (configures encrypted block devices) and 'dmsetup' (the Linux kernel Device Mapper userspace library). So it seems LUKS is an API for device encryption. Using ZFS 'on top of' something like that would probably be easier but, as you say, that is not the intention... |
It would be interesting to investigate whether what you're suggesting is possible. It would result in a second version of zfs encryption which isn't compatible with the pool v30 version, but that might not be a big deal. We should be integrating the new feature flag support early next year, so it could end up as a Linux-only feature. |
I don't think we're ever going to be compatible with v30... Not any of us, not unless Oracle all of a sudden 'sees the light', and I'm not holding my breath on that! :) Best would be if we could come up with a solution that would be portable to other OSes. I don't know how Linux-specific the 'Linux Unified Key Setup' is, but it's worth a look. I'll start on that once I have a workable sharesmb patch. |
How about at least reverse engineering v30 format? |
Be my guest! Reverse engineering something, especially a crypto algorithm, isn't anywhere near as simple as it sounds! |
We know it is using SHA-256, and AES-128 probably with Incremental mode, so there is actually nothing complicated; only some on-disk metadata needs to be reverse engineered, such as which bit is what, where the salt is, and where the information is stored that says it is AES-128 and not 192 or 256. It should be easy. Unfortunately I do not have access to Solaris right now to test it. |
That DOES sound easy :). Unfortunately, we probably have to... I've spent the day looking into LUKS, but it does not seem to fit the purpose :(. It is intended to be placed between the device and the FS, which means it needs one device (either one physical disk or multiple disks presented as one through raid/md) where it can store data linearly... Kind of. But since ZFS is both a FS and a ... 'device mapper' (?) which has multiple devices, I doubt it will be possible to have LUKS split the data and its key storage partitions over multiple physical disks. I haven't looked at the code yet, just the specs, but that's what it looks like so far. |
Hi, |
Of course, you shouldn't look at the leaked source if you work on ZFS lest Oracle accuse you of copyright infringement. |
Yes, You are right. |
LUKS is not an option. ZFS performs encryption on a per-dataset/volume/file basis; LUKS works at the device level. We already have crypto primitives available in the kernel and we already have an on-disk format designed; we just need to reverse engineer it (which should be slightly easier than designing it, something that is hard to do properly and securely for crypto). The ZIL will probably be the hardest part. Of course, looking at the leaked source code is not an option at all. I wasn't thinking about it even for a second. |
An interim solution is ecryptfs, which can be installed on top of ZFS. Most RPM and DEB systems have built-in management for ecryptfs, which makes it easy to configure. For maximum performance, dedup and compression should be disabled on any ZFS dataset that hosts a crypto layer. |
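The advice to disable compression on a dataset that hosts a crypto layer can be illustrated with a minimal sketch. It assumes that well-encrypted data is statistically indistinguishable from random bytes, so `os.urandom` stands in for real eCryptfs output, and `zlib` stands in for ZFS's own compressors; neither is what ZFS or eCryptfs actually runs.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 6)) / len(data)


# Highly repetitive plaintext, loosely similar to logs or source code.
plaintext = b"zfs crypto support, issue 494\n" * 4096

# ASSUMPTION: random bytes are used as a stand-in for ciphertext, because
# a sound cipher produces output that looks random to a compressor.
ciphertext_like = os.urandom(len(plaintext))

plain_ratio = compression_ratio(plaintext)
cipher_ratio = compression_ratio(ciphertext_like)

print(f"plaintext ratio:       {plain_ratio:.3f}")
print(f"ciphertext-like ratio: {cipher_ratio:.3f}")
```

The plaintext shrinks to a small fraction of its size while the random data does not shrink at all, so compressing underneath an encryption layer spends CPU for no space saving. Dedup suffers for a related reason when identical plaintext blocks no longer produce identical ciphertext blocks.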
This ( http://src.opensolaris.org/source/xref/zfs-crypto ) looks very nice: CDDL and lots of zfs crypto stuff. Maybe we should try to cooperate with illumos on a common port to Linux. In any case, processors with AES-NI should be supported to gain optimal performance. |
I would like to point out that the code linked above has ZPL_VERSION = 3 and SPA_VERSION=15. That's quite old!! |
We should certainly work with the other ZFS implementations when any crypto work is being considered. Also, it's my understanding that the link you're referencing is to some of the early crypto work, which was significantly reworked before being included in v30. That said, it's still probably a reasonable place to get familiar with the basic design decisions. |
Here is Sun's design document for ZFS encryption support: http://hub.opensolaris.org/bin/download/Project+zfs-crypto/files/zfs-crypto-design.pdf We can check out the early code by doing |
I would love to see that as well. crypto is an amazing feature. |
The last post was 5 months ago. Did you guys decide on anything? What is the current state? |
@FloFra |
In the ZFS on Linux area https://groups.google.com/a/zfsonlinux.org/forum/?fromgroups#!searchin/zfs-discuss/crypto https://groups.google.com/a/zfsonlinux.org/forum/?fromgroups#!searchin/zfs-devel/crypto leads to pool version 33, zfs-crypto (2011-12-22). In the illumos area Whilst https://www.illumos.org/projects/zfs-crypto is not recently updated, there's http://wiki.illumos.org/display/illumos/Project+Ideas (2012-07-18) Device drivers
File systems
I'll align myself with the latter. Elsewhere In irc://irc.freenode.net/#zfs on 2013-11-09, someone drew attention to code on GitHub. We acknowledged the need for someone to audit that code, so I didn't follow the link. |
@grahamperrin mentioned some encryption code on github. It was determined on the mailing list that it includes code from the Solaris 11 leak and is therefore encumbered. We will not be using it. |
I believe the encryption code referred to is located at https://github.com/zfsrogue/zfs-crypto. I've been able to merge and build both the SPL changes (https://github.com/zfsrogue/spl-crypto) and the ZFS branch. |
@sempervictus I would be very, very careful using that code (IF you can get it to merge). There's a very, very (yes, yes! :) high risk that that code is the source of my loss of my pool (16TB almost full).... |
Also, Rogue isn't really maintaining the code any more (ZoL has gone through a lot of changes since he created the repository). |
Actually he is, updating the osx one when I ask etc. He's most likely waiting on 0.6.3 to tag and release. You can ask him to do a merge anytime if there is something you want sooner. |
@lundman All I've seen is that he has 'come in' once a month (or 'every now and then') accepting patches, sometimes without any review. There were a couple of pulls I wanted to discuss before merge (they required other ZoL pulls of mine to be accepted first, which they weren't/haven't been yet, and might never be). |
@tcaputi The dir-structure is all fine, and as you say, is like IllumOS. I meant more that any file that includes modctl.h (Linux file yes?) and
I took the assembler out for now, as I have other things to do first before it can even run, so it's not worth worrying about just yet. Also, my comments are not meant to be negative; I'm just reporting in as I port it over to OSX. I am pleased something is happening in this area :) |
The |
Huh that is interesting. Then it would be mixed Linux or Illumos code, so #ifdef's it is :) |
There, I have brought all that mod* stuff back in, and initialise as expected. I have ported everything I had missed. Currently it dies during init, here:
I have no call to initialise |
Hmm actually looks like IllumOS do not initialise it, tsk tsk eh :) |
I tried to catch most of those, but there were a lot of places where they rely on zero'd allocation to initialize locks and such. They also didn't have |
In case anyone is watching the PR, I pushed a couple of small fixes for encrypted zvols (which are also encrypted now) |
@behlendorf or anyone who knows: While I'm asking questions, does anyone know why
Is |
I can report partial success. I cleaned up the missing I can create a pool, and encrypted dataset, and copy a file to it, which does not show in the pool image file. First sign of trouble is trying to import the said pool again, but will look into that too. If any OSXers wish to play with it, it is under https://github.com/openzfsonosx/zfs/compare/wip_crypt3 |
So I've hit a bit of a roadblock and am not sure what the best solution is. If anybody knows a solution offhand, I would appreciate the input.

The problem is regarding partially encrypted types, particularly dnodes. Encrypted datasets currently encrypt dnodes by encrypting bonus buffers, but not the dnode itself. Only a few bonus types need to be encrypted, and for right now I'm just working with System Attribute bonus buffers. Other bonus buffers are left in the clear. This setup should hopefully allow us to scrub datasets and check for errors without the keys being loaded.

Before keys can be loaded, several functions ( The problem arises when the dnode block is used again later, when the keys are loaded. At this point zfs sees that the dnode block is cached in the ARC and doesn't bother to reread the data, so it comes back with the bonus buffer still encrypted. This obviously breaks any code that relies on that.

The solutions I can think of are:
I apologize for the wall of text, but I would appreciate it if anyone had any input on how I should go about doing this. |
@tcaputi: would it be possible to mark the bonus buffers cached in ARC as dirtied elsewhere and thus force a re-read from disk through the decryption routines? Essentially, tell the ARC it's somehow out of sync with the on-disk data instead of asking it to track every object as encrypted or not (which, being a boolean, is small overhead, but as you point out, that's precious space). |
@sempervictus: Since I posted the question I found arc_buf_header_t, pointed to by arc_buf_t, which appears to be 1-1 mapped. This struct does have a field for flags, which solves the space problem. I'm still not sure what changes would need to be made to the ARC for this to work. Off the top of my head, I know that the ARC has a hashmap of buffers, and I imagine trying to maintain 2 copies of the same buffer in that hashmap could cause problems, since the second one would attempt to take the first's place. Short of any other suggestions, I plan on spending a lot of time tomorrow reading through arc.c. |
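The collision concern above (two copies of one block competing for a single hash slot) can be pictured with a toy cache that makes the encrypted/decrypted state part of the lookup key. This is only a sketch under stated assumptions: `BufKey`, `ToyBufferCache` and `toy_xor` are hypothetical names, the XOR "cipher" is a placeholder with no security value, and nothing here reflects how arc.c is actually structured.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class BufKey:
    block_id: int    # hypothetical stand-in for a block's on-disk identity
    encrypted: bool  # True: raw on-disk bytes, False: decrypted bytes


class ToyBufferCache:
    """Toy cache where raw and decrypted copies of a block coexist."""

    def __init__(self, read_raw: Callable[[int], bytes],
                 decrypt: Callable[[bytes, bytes], bytes]) -> None:
        self._bufs: Dict[BufKey, bytes] = {}
        self._read_raw = read_raw
        self._decrypt = decrypt
        self.disk_reads = 0

    def get_raw(self, block_id: int) -> bytes:
        """Return on-disk bytes; usable without a key (e.g. for a scrub)."""
        key = BufKey(block_id, True)
        if key not in self._bufs:
            self.disk_reads += 1
            self._bufs[key] = self._read_raw(block_id)
        return self._bufs[key]

    def get_decrypted(self, block_id: int,
                      dataset_key: Optional[bytes]) -> bytes:
        """Return decrypted bytes, reusing the cached raw copy if present."""
        if dataset_key is None:
            raise PermissionError("dataset key is not loaded")
        key = BufKey(block_id, False)
        if key not in self._bufs:
            raw = self.get_raw(block_id)
            self._bufs[key] = self._decrypt(raw, dataset_key)
        return self._bufs[key]


def toy_xor(data: bytes, key: bytes) -> bytes:
    """Placeholder 'cipher' (NOT secure); XOR is its own inverse."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


disk = {7: toy_xor(b"bonus buffer", b"k3y")}
cache = ToyBufferCache(disk.__getitem__, toy_xor)

raw = cache.get_raw(7)                  # before keys are loaded
plain = cache.get_decrypted(7, b"k3y")  # after keys are loaded
print(raw != plain, plain, cache.disk_reads)
```

Because the flag is part of the key, the decrypted copy never displaces the raw one, and loading the key later does not force a second disk read. The cost is that both copies occupy cache space, which is the trade-off the thread is weighing.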
@tcaputi So what you could do is leave dnode_phys alone and make it always have an encrypted bonus, and make the decryption happen when you read the bonus into the separate bonus buffer. But of course, for other stuff like user data, you'll likely need to mark it as encrypted/decrypted in the ARC. The Illumos camp is working on compressed ARC; you might want to look at it. Maybe you can find some useful ideas. https://drive.google.com/file/d/0B5hUzsxe4cdmbEh2eEZDbjY3LXM/view?usp=sharing |
@tuxoko: would it be feasible to create something along the lines of a negative DVA in the arc_buf_hdr_t structure for encrypted data such that repeated access to the encrypted state can still be cached, and the real DVA for decrypted data? |
@tuxoko: |
I think I'm approaching a solution, although there is now an additional (but hopefully easier) problem. It seems like the best place to decrypt the bonus buffers would be in The issue is that decrypting a single bonus buffer requires decrypting all the bonus buffers in the block. I believe the full DMU_OT_DNODE block should be available as long as a bonus buffer within it is available, but this is still at the very least inefficient. Perhaps this wouldn't be quite as bad an option if it also cached the other decrypted bonus buffers instead of throwing them away. That said, I had mentioned earlier that we could fit the IV and MAC for a dnode into I'm not sure which method is better here, but I'm (again) open to any input. In the meantime I will continue looking myself. |
I had to work on another project for a few weeks, but now I am back to working on this full time. After looking at the status of large dnode support in #3542 I am now less averse to using some of the padding in For now I will start working on adding encryption parameters to |
Is it possible to get this into 0.7.0? |
This change incorporates three major pieces:

The first change is a keystore that manages wrapping and encryption keys for encrypted datasets. These commands mostly involve manipulating the new DSL Crypto Key ZAP Objects that live in the MOS. Each encrypted dataset has its own DSL Crypto Key that is protected with a user's key. This level of indirection allows users to change their keys without re-encrypting their entire datasets. The change implements the new subcommands "zfs load-key", "zfs unload-key" and "zfs change-key" which allow the user to manage their encryption keys and settings. In addition, several new flags and properties have been added to allow dataset creation and to make mounting and unmounting more convenient.

The second piece of this patch provides the ability to encrypt, decrypt, and authenticate protected datasets. Each object set maintains a Merkle tree of Message Authentication Codes that protect the lower layers, similarly to how checksums are maintained. This part impacts the zio layer, which handles the actual encryption and generation of MACs, as well as the ARC and DMU, which need to be able to handle encrypted buffers and protected data.

The last addition is the ability to do raw, encrypted sends and receives. The idea here is to send raw encrypted and compressed data and receive it exactly as is on a backup system. This means that the dataset on the receiving system is protected using the same user key that is in use on the sending side. By doing so, datasets can be efficiently backed up to an untrusted system without fear of data being compromised.

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes openzfs#494
Closes openzfs#5769
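The key indirection described in the commit message (a per-dataset key protected by a user's key, so the user key can change without re-encrypting data) can be sketched as follows. This is a toy model under loud assumptions: the function names, the `WrappedKey` record, the PBKDF2 parameters and the HMAC-based wrapping are all invented for illustration using only the Python standard library. It is not the on-disk format or the algorithm the patch uses, and it should not be used to protect real keys.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WrappedKey:
    salt: bytes  # per-wrap salt for deriving the wrapping key
    blob: bytes  # the dataset key, masked by a keystream
    tag: bytes   # MAC over the blob, detects a wrong user key


def _wrapping_key(passphrase: str, salt: bytes) -> bytes:
    # ASSUMPTION: PBKDF2 with an illustrative (low) iteration count.
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 10_000)


def _mask(wkey: bytes, data: bytes) -> bytes:
    # Toy keystream: HMAC in counter mode, XORed over the data.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = b"wrap" + counter.to_bytes(4, "big")
        stream += hmac.new(wkey, block, hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(dataset_key: bytes, passphrase: str) -> WrappedKey:
    salt = os.urandom(16)  # fresh salt, so the keystream is never reused
    wkey = _wrapping_key(passphrase, salt)
    blob = _mask(wkey, dataset_key)
    tag = hmac.new(wkey, b"tag" + blob, hashlib.sha256).digest()
    return WrappedKey(salt, blob, tag)


def unwrap(wrapped: WrappedKey, passphrase: str) -> bytes:
    wkey = _wrapping_key(passphrase, wrapped.salt)
    expected = hmac.new(wkey, b"tag" + wrapped.blob, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, wrapped.tag):
        raise ValueError("wrong passphrase or corrupted wrapped key")
    return _mask(wkey, wrapped.blob)


def change_key(wrapped: WrappedKey, old: str, new: str) -> WrappedKey:
    # Only the small wrapped record is rewritten; data stays untouched.
    return wrap(unwrap(wrapped, old), new)


dataset_key = os.urandom(32)  # the key that actually encrypts data
stored = wrap(dataset_key, "old passphrase")
stored = change_key(stored, "old passphrase", "new passphrase")
print(unwrap(stored, "new passphrase") == dataset_key)
```

After `change_key`, the dataset key recovered with the new passphrase is byte-for-byte the original, which is why no data blocks need to be re-encrypted; only the small wrapped record changes.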
As of ZFS Pool Version 30, there is support for encryption. This part is unfortunately closed source, so an open source implementation would be required. That means it would probably not be compatible with the Solaris version, 'but who cares' :).
Illumos is apparently working on this at https://www.illumos.org/projects/zfs-crypto. The source repository can be found at https://bitbucket.org/buffyg/illumos-zfs-crypto. Unfortunately there have been no changes since the fork from illumos-gate. Should ZoL start thinking about this or should we just take the back seat?
I don't know how big of a problem this would be, but 'copying' the way that LUKS (Linux Unified Key Setup) does it seems to be a good place to start.