ZFS Encryption #4329
I see my builds all failed because I left in a few bad asserts, and I didn't have debugging enabled, so they didn't cause an issue on my end. Oops. I will fix this tonight.
I also see that the style issues are causing the builds to fail. I will fix that tonight as well.
Does none of this belong in spl?
All of the code in module/icp (with the exception of the algs directory, which has specific cipher implementations) acts as a single framework and should be kept together. Personally, I view it as a standalone module that relies on the spl, which is why I put it where it is. To me it seemed like it was along the same lines as the nvpair module. I can move it if the community feels there is a better place for it.
@ilovezfs Putting any of this into the SPL would require implementing new encryption routines, because the Linux kernel exports its encryption routines as GPL-only symbols and the SPL is not allowed to touch those.
@ryao Then perhaps SPL needs to be renamed LPL "License Porting Layer" since a piece of Solaris porting cannot go in the SPL. Ahem.
@ryao Thank you for mentioning that. I had forgotten about the licensing aspect.
Does anyone know how to re-trigger the automated build tests? The issues from above should have been fixed by my reformatting commit, but I'd like to be sure.
@tcaputi If you force update your branch on GitHub it will rerun all the builds and tests. Usually, I just rebase my work on master and then force update it. As for putting this work in the SPL, I'd prefer to keep it in the ZFS source tree since it's all CDDL and originated from illumos. The SPL itself will in due course get moved into a directory of the ZFS source tree for convenience, so let's try to avoid adding more to it if we don't have to.
6e4b1a8 to 7557c83
@behlendorf It doesn't look like the tests ran. I did a merge against upstream/master and then
@tcaputi: You need to rebase instead of merge, so assuming your index and working tree are clean:
I looked at the logs here. The builds are failing now because zfs requires the (few) changes I made to the SPL. I will make a PR for that too, but I suppose this won't work on the automated builds until then (which will probably take a little while).
I made a (small) corresponding PR against the spl: openzfs/spl#533
@tcaputi if you add the line
@behlendorf I will do that and fix the SPL PR tonight. Thanks for the advice and patience.
31f3656 to 9ec5427
I see that there are a few problems on non-x86_64 architectures. I will fix these soon, but I don't have any local machines to test against. I hope it's OK if I end up hitting the build system a few more times for testing.
@tcaputi By all means keep submitting things until the buildbot is happy. That's what it's there for.
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
The header in this file and any other new files that are not derived from existing files under the unbound CDDL (i.e. no version peg) should state `Common Development and Distribution License, Version 1.0 only`. That way Oracle cannot exercise sections 4.1 and 4.2 of the CDDL to make them available under an open source incompatible license. There are already a few files in the repository like this, such as `./lib/libspl/xdr.c`.
I ran this by @ahrens before posting it. He agrees that we should be going with `Version 1.0 only` in new files.
@ryao OK, thanks. I just copied the license from an existing file. I'll make the change.
005638a to 55ef824
…IL-specific IV derivation
@@ -727,26 +802,41 @@ dsl_dataset_disown(dsl_dataset_t *ds, void *tag)
ASSERT(ds->ds_dbuf != NULL);

mutex_enter(&ds->ds_lock);
if (ds->ds_dir && ds->ds_dir->dd_crypto_obj) {
(void) spa_keystore_remove_mapping(ds->ds_dir->dd_pool->dp_spa,
ds, ds);
My understanding is that `spa_keystore_add_mapping` and `spa_keystore_remove_mapping` need to be paired. Since `dsl_dataset_own` can conditionally `add`, doesn't `dsl_dataset_disown` also need to conditionally `remove`? Otherwise we may `remove` when we didn't `add`.
Since there's only one owner, we should be able to keep track in the `ds` whether the owner did an `add`, in which case we should do a `remove`. That way we don't need to change the args to `_disown`.
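The pairing idea above can be sketched in isolation. This is only an illustration under assumptions: the `sketch_*` types and functions are hypothetical stand-ins, not the real `dsl_dataset_t` or keystore API, and the "mapping" is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real ZFS types. */
typedef struct sketch_keystore {
	int ks_mappings;		/* number of live key mappings */
} sketch_keystore_t;

typedef struct sketch_dataset {
	sketch_keystore_t *ds_keystore;
	void *ds_owner;			/* single owner tag, or NULL */
	bool ds_owner_added_mapping;	/* did own() add a mapping? */
} sketch_dataset_t;

/* Take ownership; add a key mapping only when the caller asks for one. */
int
sketch_dataset_own(sketch_dataset_t *ds, void *tag, bool need_key)
{
	if (ds->ds_owner != NULL)
		return (-1);		/* already owned */
	ds->ds_owner = tag;
	ds->ds_owner_added_mapping = false;
	if (need_key) {
		ds->ds_keystore->ks_mappings++;
		ds->ds_owner_added_mapping = true;
	}
	return (0);
}

/* Drop ownership; remove the mapping only if own() added one. */
void
sketch_dataset_disown(sketch_dataset_t *ds, void *tag)
{
	assert(ds->ds_owner == tag);
	if (ds->ds_owner_added_mapping) {
		ds->ds_keystore->ks_mappings--;
		ds->ds_owner_added_mapping = false;
	}
	ds->ds_owner = NULL;
}
```

Because the flag lives in the dataset, the disown signature stays unchanged, which is the property the review comment asks for.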
As part of the `dsl_dataset_hold_crypt()` change I mentioned above, the `dsl_dataset_rele()` and `dsl_dataset_disown()` will also accept flags to properly manage releasing the keys.
int
dsl_dataset_own_obj(dsl_pool_t *dp, uint64_t dsobj,
dsl_dataset_hold_crypt(dsl_pool_t *dp, const char *name,
void *tag, dsl_dataset_t **dsp)
Add a comment to help the reader understand when they would use `hold` vs `hold_crypt`. E.g.:
Hold the dataset and ensure that the decryption key is available (otherwise, the call will fail and return ENOKEY). Callers must use dsl_dataset_rele_crypt() to release this hold. If callers need access to the decrypted contents, they must use this routine, otherwise they can use dsl_dataset_hold().
I am getting this rewritten now to accommodate this. The new version will have flags passed to `dsl_dataset_rele_flags()` and `dsl_dataset_hold_flags()` and friends. The non-crypt version will be a thin wrapper around this.
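A rough sketch of the flags-plus-thin-wrapper shape described here, assuming a single hypothetical flag. The `sketch_*` names, the flag value, and the counters are invented for illustration; the real functions take a pool, a name, a tag, and a dataset pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef ENOKEY
#define	ENOKEY	126	/* fallback for platforms whose errno.h lacks it */
#endif

/* Hypothetical flag; the real names and values in the patch may differ. */
typedef enum sketch_hold_flags {
	SKETCH_HOLD_FLAG_NONE = 0,
	SKETCH_HOLD_FLAG_DECRYPT = 1 << 0	/* caller needs plaintext */
} sketch_hold_flags_t;

typedef struct sketch_ds {
	bool ds_encrypted;
	bool ds_key_loaded;
	int ds_holds;		/* ordinary holds */
	int ds_key_refs;	/* holds that also pin the key */
} sketch_ds_t;

/* Hold the dataset; with the decrypt flag, fail if the key is missing. */
int
sketch_hold_flags(sketch_ds_t *ds, sketch_hold_flags_t flags)
{
	assert(ds != NULL);
	if ((flags & SKETCH_HOLD_FLAG_DECRYPT) && ds->ds_encrypted) {
		if (!ds->ds_key_loaded)
			return (ENOKEY);
		ds->ds_key_refs++;
	}
	ds->ds_holds++;
	return (0);
}

/* Release a hold; the caller passes the same flags it held with. */
void
sketch_rele_flags(sketch_ds_t *ds, sketch_hold_flags_t flags)
{
	assert(ds != NULL && ds->ds_holds > 0);
	if ((flags & SKETCH_HOLD_FLAG_DECRYPT) && ds->ds_encrypted)
		ds->ds_key_refs--;
	ds->ds_holds--;
}

/* The non-crypt versions are thin wrappers around the flags versions. */
int
sketch_hold(sketch_ds_t *ds)
{
	return (sketch_hold_flags(ds, SKETCH_HOLD_FLAG_NONE));
}

void
sketch_rele(sketch_ds_t *ds)
{
	sketch_rele_flags(ds, SKETCH_HOLD_FLAG_NONE);
}
```

Passing the same flags to the release call keeps the key reference count balanced without storing per-hold state.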
if (os->os_encrypted &&
(spa_keystore_lookup_key(os->os_spa,
os->os_dsl_dataset->ds_object, NULL, NULL) == 0))
key_needed = B_TRUE;
How do you know that this ownership needed the key? See similar comment in `dsl_dataset_disown`.
I already have this fixed locally (and the other one as well). I will push it with the rest of these fixes.
@@ -267,7 +267,7 @@ calculate_range(const char *dataset, err_type_t type, int level, char *range,
* size.
*/
if ((err = dmu_objset_own(dataset, DMU_OST_ANY,
B_TRUE, FTAG, &os)) != 0) {
B_TRUE, B_TRUE, FTAG, &os)) != 0) {
I'm not sure that we actually need any decrypted data here. If we do require it to be decrypted, how does zinject get the keys?
@@ -179,7 +179,7 @@ object_from_path(const char *dataset, const char *path, struct stat64 *statbuf,
*/
sync();

err = dmu_objset_own(dataset, DMU_OST_ZFS, B_TRUE, FTAG, &os);
err = dmu_objset_own(dataset, DMU_OST_ZFS, B_TRUE, B_TRUE, FTAG, &os);
I'm not sure that we actually need any decrypted data here. If we do require it to be decrypted, how does zinject get the keys?
I tried to err on the side of caution here, so anything I wasn't sure of required the keys to be loaded. I will change this to `B_FALSE` (same with the other instance in zinject).
.ad
.sp .6
.RS 4n
Indicates that the zpool command will request encryption keys for all encrypted datasets it attempts to mount as it is bringing the pool online. This is equivalent to running \fBzfs mount\fR on each encrypted dataset immediately after the pool is imported. If any datasets have a \fBprompt\fR keysource this command will block waiting for the key to be entered. Otherwise, encrypted datasets will be left unavailable until the keys are loaded.
> equivalent to running \fBzfs mount\fR on each encrypted dataset
So, it mounts them? That seems counterintuitive. Seems like this should be equivalent to `zfs key -l` on all encrypted filesystems.
> waiting for the key
I think this should be `keys` (plural).
> Otherwise,
I think you mean `If \fB-l\fR is not specified,`, not "If no datasets have a prompt keysource".
Correct on both points, will fix.
Thinking about this some more, on the surface it seems like `-l` does approximately nothing, because we already mount filesystems when the pool is imported, and presumably that will load keys for the filesystems we mount. I think the exception is filesystems with `mountpoint=legacy` or `canmount=off | noauto`. If that's right, I think it would be worth mentioning here that `-l` is used to ensure that all keys are loaded, even for filesystems that are not mounted because they have `mountpoint=legacy` or `canmount=off | noauto`.
`zfs key -l` and `zpool import -l` will also result in encrypted zvols appearing in `/dev/`, by the way. To me, these commands allow the user to access the dataset normally, doing whatever, without worrying about encryption anymore.
That said, I can verify the behavior of this command and make the comment more specific accordingly.
Some more context about the reason for `zpool import -l` after looking back at the code:
> Thinking about this some more, on the surface it seems like -l does approximately nothing, because we already mount filesystems when the pool is imported, and presumably that will load keys for the filesystems we mount.
Right now, `zpool import` (without `-l`) will actually leave encrypted datasets unmounted. The reason for this was compatibility. I was afraid that someone might add an encrypted dataset to a pool that is mounted via an automated script. When the pool is re-imported, the script would hit the prompt for the encrypted dataset and hang indefinitely. As a result, I wanted people to have to opt into the automatic key loading.
> Right now, zpool import (without -l) will actually leave encrypted datasets unmounted.
OK, let's document that in the manpage.
hierarchy, file size, file holes, and dedup tables. Key rotation is managed
internally by the ZFS kernel module and changing the user's key does not
require re-encrypting the entire dataset. Datasets can be scrubbed, resilvered,
moved, and deleted without the encryption keys being loaded (see the zfs key
> moved
`renamed`?
> zfs key
`\fBzfs key\fR`
will fix
provides additional protection against maliciously altered data. Deduplication
is still possible with encryption enabled but for security, datasets will only
dedup against themselves, their snapshots, and their clones. Encrypted data
cannot be embedded via the \fIembedded_data\fR feature.
This paragraph belongs in the zfs.8 manpage.
will fix
encryption datasets may be vulnerable to a CRIME-like attack if applications
accessing the data allow for it. Deduplication with encryption will leak
information about which blocks are equivalent in a dataset and will incur an
extra CPU cost per block written.
This paragraph belongs in the zfs.8 manpage.
will fix
internally by the ZFS kernel module and changing the user's key does not
require re-encrypting the entire dataset. Datasets can be scrubbed, resilvered,
moved, and deleted without the encryption keys being loaded (see the zfs key
subcommand for more info).
All the information in here should also be in the zfs.8 manpage, perhaps in a new subsection of the `DESCRIPTION` (peer to `Clones`, `Mount Points`, etc). The zpool-features manpage should primarily document why you would want to enable this feature. See for example the documentation of the `bookmarks` feature. In this case:
Enabling this property allows setting the \fBencryption\fR property to values other than \fBoff\fR.
Will fix. Would you want me to move some of this information or make a copy of it in zfs.8 (perhaps with different wording)?
I think move it, and leave just the minimal description of what this property does (allows setting encryption), and when it becomes active and enabled.
Sounds good. Will fix.
@ahrens Thanks for helping with the review. I should be able to address all of the comments and make another push early next week.

.LP
.nf
\fB\fBzfs key -l\fR \fIfilesystem\fR | \fIvolume\fR\fR
I think it would be easier to read (e.g. in a script or an email) if we also have long opts for the "verbs" here:
zfs key --load
zfs key --unload
zfs key --change
ZoL actually doesn't have any longopts handling at all. I can add it though, if it will help.
This (the wording of this command) is actually the biggest complaint I've received about the patch so far. I've heard from several people that they don't like that `key` isn't a verb like `create` or `destroy`. That said, I have been unable to come up with any alternatives that don't seem tedious (like `zfs loadkey`, `zfs unloadkey`, `zfs changekey`). If you have any thoughts here, now would probably be a good time, while I'm cleaning up all of this other stuff.
I had the same thoughts... nothing better jumps to mind but I'll keep pondering...
FWIW, I think that the tedious `zfs loadkey` etc. are in keeping with the existing interface, which does have some collections of verbs that all operate on the same concepts (e.g. `zfs allow`, `zfs unallow`; and `zfs hold`, `zfs release`, `zfs holds`).
I can change this if you'd like (it's a bit tedious to make every one of these a separate function in both kernel and userspace). My only other qualm here is that people have asked for a few other key-related subcommands like `zfs verifykey`, which would tell you if a key is correct (even if it's already loaded).
Another one I was going to add while doing all these fixups is `zfs linkkey`, which would tell a dataset to inherit the wrapping keys of a parent. Currently, you can break wrapping key inheritance with `zfs key -c`, but there's no way to relink them later, so this would add that functionality. At that point there are 5 key commands, which seems like a lot of bloat in both `zfs_cmd.c` and `zfs_ioctl.c`.
I'm fine doing it either way. Let me know what you think is best. Maybe @behlendorf could weigh in here too.
If you do create lots of subcommands, I don't think that makes it necessary to create lots of ioctls; you can keep using one ioctl if that's cleaner/simpler.
> zfs linkkey
Too bad key change doesn't really fit in with the property scheme, otherwise `zfs inherit` would be the obvious choice. I'd suggest that if we want this functionality, it may make sense to borrow the "inherit" terminology here, e.g. `zfs inheritkey` or `zfs key --inherit`.
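The "many subcommands, one ioctl" suggestion could look roughly like the following table-driven lookup, where each subcommand name resolves to an operation code that a single request would carry. The subcommand names come from this thread; the `sketch_*` enum, struct, and function are hypothetical and do not reflect the actual `zfs_cmd.c` or `zfs_ioctl.c` layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operation codes carried by one key-management ioctl. */
typedef enum sketch_key_op {
	SKETCH_KEY_OP_INVALID = -1,
	SKETCH_KEY_OP_LOAD,
	SKETCH_KEY_OP_UNLOAD,
	SKETCH_KEY_OP_CHANGE
} sketch_key_op_t;

typedef struct sketch_key_cmd {
	const char *kc_name;	/* user-facing subcommand */
	sketch_key_op_t kc_op;	/* operation passed to the single ioctl */
} sketch_key_cmd_t;

static const sketch_key_cmd_t sketch_key_cmds[] = {
	{ "loadkey",	SKETCH_KEY_OP_LOAD },
	{ "unloadkey",	SKETCH_KEY_OP_UNLOAD },
	{ "changekey",	SKETCH_KEY_OP_CHANGE }
};

/* Map a subcommand name to its operation code. */
sketch_key_op_t
sketch_key_op_lookup(const char *subcommand)
{
	size_t i;
	size_t n = sizeof (sketch_key_cmds) / sizeof (sketch_key_cmds[0]);

	assert(subcommand != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(subcommand, sketch_key_cmds[i].kc_name) == 0)
			return (sketch_key_cmds[i].kc_op);
	}
	return (SKETCH_KEY_OP_INVALID);
}
```

Adding a new verb then means one more table row and one more case in the kernel-side handler, not a new ioctl number.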
.ne 2
.mk
.na
\fB\fBkeysource\fR=<\fBraw\fR | \fBhex\fR | \fBpassphrase\fR>,<\fBprompt\fR | \fBfile://\fR\fI<absolute file path\fR>>
I get wanting to use a URI in case we later add extensions, like to contact some kind of keyserver. But it seems like it would be handy if we could specify an absolute path as a shortcut for `file://<path>`. This seems non-ambiguous since AFAIK a URI can't start with a `/`.
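The shortcut proposed above amounts to a small normalization step before the location is stored. A minimal sketch, assuming the only accepted forms are `prompt`, a `file://` URI with an absolute path, and a bare absolute path; `sketch_normalize_keylocation` is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Expand a key location into its URI form. Returns 0 on success, or -1 if
 * the input is not "prompt", a file:// URI with an absolute path, or a bare
 * absolute path, or if the output buffer is too small.
 */
int
sketch_normalize_keylocation(const char *in, char *out, size_t outlen)
{
	int n;

	assert(in != NULL && out != NULL);
	if (strcmp(in, "prompt") == 0 || strncmp(in, "file:///", 8) == 0)
		n = snprintf(out, outlen, "%s", in);	/* already canonical */
	else if (in[0] == '/')
		n = snprintf(out, outlen, "file://%s", in);	/* shortcut */
	else
		return (-1);

	return ((n < 0 || (size_t)n >= outlen) ? -1 : 0);
}
```

For example, `/etc/zfs/key` would be stored as `file:///etc/zfs/key`, while a relative path such as `etc/zfs/key` is rejected.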
This is another leftover from trying to stick to the Solaris UI. I actually kind of like the file URI, but that's just me. Maybe @behlendorf could be a tie-breaker here?
.ne 2
.mk
.na
\fB\fBkeysource\fR=<\fBraw\fR | \fBhex\fR | \fBpassphrase\fR>,<\fBprompt\fR | \fBfile://\fR\fI<absolute file path\fR>>
If you do `keysource=raw,prompt`, is it really possible to type/paste the raw data (including nul characters) into the terminal? If not, I assume this is still a valid setting because you can pipe raw bytes into `zfs key -l`? That might be worth mentioning in the "STDIN" section of the `zfs key -l` docs.
It actually is possible to type or paste raw characters in (at least in my terminal, maybe not all). But yes, this is still valid for the use case you described. I can add a sentence or two to the man page.
Back to basics for a moment, and apologies if I read these points out of context:
and from the related https://github.com/tcaputi/zfs/blob/2f7c9cb5e27c4feef62971d0c2f802735d60a4e8/man/man8/zfs.8#L1021:
Children and creation of children aside, for a moment. What, if anything, exists (or should exist) to prevent a simple mount of a non-encrypted filesystem at a point within an encrypted filesystem? (Probably treat that as a question about documentation, not an expression of paranoia.)
.ad
.sp .6
.RS 4n
Unloads a key from ZFS, removing the ability to access the dataset and all of its children that inherit the \fBencryption\fR property. This requires that the dataset is not currently open or mounted. When a key is unloaded the \fBkeystatus\fR property will be set to \fBunavailable\fR.
How about `\fBkeystatus\fR property is \fBunavailable\fR`? It seems strange to say that it's "set" because it isn't a settable property.
will fix
.ad
.sp .6
.RS 4n
Allows a user to change the encryption key used to access a dataset. This command requires that the existing key for the dataset is already loaded into ZFS. This command may also be used to change the \fBpbkdf2iters\fR and / or \fBkeysource\fR properties as needed. If the dataset was previously inheriting the \fBencryption\fR property when this command is run it will now be locally set, indicating that this dataset must have its key loaded separately from the parent.
What's the rationale behind requiring the key already be loaded for `zfs key -c`, compared with `zfs mount`, which will load the key for you? I think you mentioned in another comment that there was a concern about what state you're left in (loaded or not) when `zfs key -c` completes. Maybe it should be the same as if you do `zfs mount; zfs unmount`, which I think would leave the key loaded.
The philosophy around key loaded-ness seems to be that in general we attempt to load when the keys are needed, and unload only when explicitly requested (`zfs key -u`). In the current implementation, I'm imagining this user interaction:
$ zfs key -c ...
Sorry, you can't change keys because they aren't loaded. Run "zfs key -l ..." to load them.
# OK fine I'll copy/paste what you said to do, but if you know what I have to do, why didn't you do it for me?
$ zfs key -l ...
$ zfs key -c ...
Seems like we should have a good reason for annoying them in this case.
Maybe a good middle ground is that the keys will remain in whatever state they were in before the `zfs key -c` command was run? Is that reasonable?
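The middle ground floated here (leave the key in whatever state it was in before the change) can be sketched as follows. Everything in it is an assumption made for illustration: the `sketch_*` names are hypothetical, and the wrapping key is reduced to an integer that must match for a load to succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dataset state; the key is just an integer stand-in. */
typedef struct sketch_keyed_ds {
	bool kd_key_loaded;
	unsigned kd_wrapping_key;
} sketch_keyed_ds_t;

/* Load the key if the supplied value matches; 0 on success, -1 otherwise. */
int
sketch_key_load(sketch_keyed_ds_t *ds, unsigned key)
{
	assert(ds != NULL);
	if (key != ds->kd_wrapping_key)
		return (-1);
	ds->kd_key_loaded = true;
	return (0);
}

void
sketch_key_unload(sketch_keyed_ds_t *ds)
{
	assert(ds != NULL);
	ds->kd_key_loaded = false;
}

/*
 * Change the wrapping key. If the key was not loaded, load it with the old
 * key first, then return the dataset to its unloaded state afterwards.
 */
int
sketch_key_change(sketch_keyed_ds_t *ds, unsigned old_key, unsigned new_key)
{
	bool was_loaded;

	assert(ds != NULL);
	was_loaded = ds->kd_key_loaded;
	if (!was_loaded && sketch_key_load(ds, old_key) != 0)
		return (-1);		/* wrong old key; nothing changed */
	ds->kd_wrapping_key = new_key;
	if (!was_loaded)
		sketch_key_unload(ds);	/* restore the prior state */
	return (0);
}
```

With this shape the user never has to run a separate load step, and a dataset that was locked before the change is still locked after it.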
See this comment I made above.
Nothing prevents a mount like that from being done right now, but as far as I am aware it will not automount when the parent is mounted, whereas child datasets do. I can look into adding a sentence or two to the docs about this.
Is this PR good for testing by impatient end users? Is the on-disk format finished?
@mailinglists35 and everyone else:
I would say patch 89b4e7a is good enough to test. This week I will be addressing @ahrens' comments and I will be using buildbot here for testing, so I wouldn't use any of these commits until they're back to being stable (I will squash them all at that time). The on-disk format has not been verified yet. We are working to get there.
I'm still getting patches on a silver platter once the dust settles, right? :)
@lundman Yes. Let me get all of these changes stable first (hopefully sometime this week), and then I'll squash them and send you a nice and tidy patch.
…needs testing. man pages / tests not updated
…w api. fixed a few bugs as they appeared. man page still not updated.

/*
* With the advent of encrypted data in the ARC it is now possible for
* legitimate errors to arise while transforming data into its desired format.
In general, the style/tone of comments should be to describe the code as it is. Consider someone reading this in 5 years' time: the "advent" making it "now possible" will seem misplaced. Instead, consider something like:
Because the ARC can store encrypted data, errors (not due to bugs) may arise while transforming data into its desired format - specifically, when decrypting, the key may not be present, or the HMAC may not be correct, which signifies deliberate tampering with the on-disk state (assuming that the checksum was correct). The "error" parameter will be nonzero in this case, even if there is no associated zio.
Not sure if the technical details above are correct, but hopefully the style makes sense.
That is correct. I will also fix the comment.
@@ -112,20 +121,21 @@ typedef enum arc_flags
ARC_FLAG_L2_WRITING = 1 << 11, /* write in progress */
ARC_FLAG_L2_EVICTED = 1 << 12, /* evicted during I/O */
ARC_FLAG_L2_WRITE_HEAD = 1 << 13, /* head of write list */
ARC_FLAG_ENCRYPT = 1 << 14, /* encrypted on disk */
You might add something like `may or may not be encrypted in memory`, to emphasize this fact (lest anyone wonder if `encrypted on disk` is what you precisely meant).
will fix
* disk as they appear in the main pool. In order for this to work we
* need to pass around the encryption parameters so they can be used
* to write data to the L2ARC. This struct is only defined in the
* arc_buf_hdr_t if the L1 header is defined and the has the
typo: remove extraneous `the`.
will fix
@@ -144,7 +154,8 @@ typedef enum arc_flags

typedef enum arc_buf_flags {
ARC_BUF_FLAG_SHARED = 1 << 0,
ARC_BUF_FLAG_COMPRESSED = 1 << 1
ARC_BUF_FLAG_COMPRESSED = 1 << 1,
ARC_BUF_FLAG_ENCRYPTED = 1 << 2
Add a comment here contrasting this with `ARC_FLAG_ENCRYPT`.
will fix
@@ -1331,13 +1391,30 @@ arc_buf_lsize(arc_buf_t *buf)
return (HDR_GET_LSIZE(buf->b_hdr));
}

boolean_t
arc_is_encrypted(arc_buf_t *buf)
Add a comment explaining what this means. In particular, this tells if the buf is stored encrypted in the ARC, and therefore ... [can't be read unless X flag passed to `arc_read()`? unless key loaded via `foobar()`?]. It returns false if the data is encrypted on disk but decrypted in memory.
(The same type of comment should probably have been added above `arc_get_compression`.)
Will fix. Should I add the one for compression as well while I'm here? Or should that be left to a different PR?
I'm fine with adding the compression comment while you're here.
@@ -208,7 +209,7 @@ typedef struct zio_cksum_salt {
* G gang block indicator
* B byteorder (endianness)
* D dedup
* X encryption (on version 30, which is not supported)
* X encryption
I think this should say `set to zero; see diagram below for encrypted blocks`, and then there should be a separate diagram that shows the layout for encrypted blocks, similar to how we explain embedded BP's (IIRC, checksum[2-3], fill, and dva[2] have different meanings when the X bit is set). The legend of the new diagram might omit fields that are already described here.
will fix
Since I think the in-memory representations of the IV and salt (and MAC?) are treated as byte arrays, be sure to explain how those are encoded into the 64-bit words (e.g. low 8 bits of the word are the low byte of the array, etc). As you probably know, we can't simply treat the byte array as a uint64_t*, because of endianness concerns.
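One endian-independent encoding of the kind described here packs the array with shifts, so byte 0 always lands in the low 8 bits of the word regardless of host byte order. This is a sketch of that convention only; the `sketch_*` helpers are hypothetical and the patch may choose a different layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack 8 bytes into one 64-bit word: bytes[0] goes in the low 8 bits and
 * bytes[7] in the high 8 bits. Using shifts (not a pointer cast) makes the
 * resulting word value identical on big- and little-endian hosts.
 */
uint64_t
sketch_bytes_to_word(const uint8_t bytes[8])
{
	uint64_t word = 0;
	int i;

	assert(bytes != NULL);
	for (i = 0; i < 8; i++)
		word |= (uint64_t)bytes[i] << (8 * i);
	return (word);
}

/* Inverse of sketch_bytes_to_word(). */
void
sketch_word_to_bytes(uint64_t word, uint8_t bytes[8])
{
	int i;

	assert(bytes != NULL);
	for (i = 0; i < 8; i++)
		bytes[i] = (uint8_t)(word >> (8 * i));
}
```

The word can then go through the usual byteswap path for block pointers, while the byte array recovered from it stays the same on every host.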
}

#define BP_GET_IV2(bp) BF64_GET((bp)->blk_fill, 32, 32)
#define BP_SET_IV2(bp, iv2) BF64_SET((bp)->blk_fill, 32, 32, iv2);
These should assert that the BP is encrypted (you can use the comma operator to do the assertion in the `GET`, see `BPE_GET_ETYPE()` for an example).
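A self-contained sketch of the comma-operator pattern mentioned above. It uses the standard `assert()` in place of the kernel `ASSERT()`, simplified stand-ins for the `BF64_GET`/`BF64_SET` bitfield macros, and an assumed position for the encryption bit; all `SKETCH_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced stand-in for a block pointer; only the fields used here. */
typedef struct sketch_blkptr {
	uint64_t blk_prop;
	uint64_t blk_fill;
} sketch_blkptr_t;

/* Simplified bitfield helpers (valid for len < 64). */
#define	SKETCH_BF64_GET(x, low, len) \
	(((x) >> (low)) & ((1ULL << (len)) - 1))
#define	SKETCH_BF64_SET(x, low, len, val) \
	((x) = ((x) & ~(((1ULL << (len)) - 1) << (low))) | \
	(((uint64_t)(val) & ((1ULL << (len)) - 1)) << (low)))

/* Assumed position of the encryption (X) bit; illustrative only. */
#define	SKETCH_BP_IS_ENCRYPTED(bp)	SKETCH_BF64_GET((bp)->blk_prop, 61, 1)

/*
 * The comma operator evaluates the assertion first, then yields the field
 * value, so the getter still works as an expression.
 */
#define	SKETCH_BP_GET_IV2(bp) \
	(assert(SKETCH_BP_IS_ENCRYPTED(bp)), \
	SKETCH_BF64_GET((bp)->blk_fill, 32, 32))

/* The setter is a statement, so a do/while wrapper is enough. */
#define	SKETCH_BP_SET_IV2(bp, iv2) do { \
	assert(SKETCH_BP_IS_ENCRYPTED(bp)); \
	SKETCH_BF64_SET((bp)->blk_fill, 32, 32, iv2); \
} while (0)
```

With these definitions, reading or writing the IV field of a block pointer whose X bit is clear trips the assertion in debug builds instead of silently reinterpreting the fill count.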
will fix
} | ||
|
||
#define BP_GET_IV2(bp) BF64_GET((bp)->blk_fill, 32, 32) | ||
#define BP_SET_IV2(bp, iv2) BF64_SET((bp)->blk_fill, 32, 32, iv2); |
It might be helpful (for readability) to add accessor macros for the other encryption-specific fields (even if they are entire words). Those macros could also assert that they are only used on encrypted BP's.
I actually have full functions for these (see `zio_crypt_encode_params_bp()` in `zio_crypt.c` for instance). Let me know if you think they should be changed to macros and brought to `spa.h`.
@@ -269,6 +270,7 @@ struct spa { | |||
spa_avz_action_t spa_avz_action; /* destroy/rebuild AVZ? */ | |||
uint64_t spa_errata; /* errata issues detected */ | |||
spa_stats_t spa_stats; /* assorted spa statistics */ | |||
spa_keystore_t spa_keystore; /* loaded crypto keys */ |
align the comment with the adjacent ones.
will fix
@@ -86,7 +87,7 @@ typedef struct zil_header { | |||
* number passed in the blk_cksum field of the blkptr_t | |||
*/ | |||
typedef struct zil_chain { | |||
uint64_t zc_pad; | |||
uint64_t zc_mac; /* mac for encryption */ |
align the comment with the adjacent ones
will fix
.ad | ||
.sp .6 | ||
.RS 4n | ||
Indicates that the zpool command will request encryption keys for all encrypted datasets it attempts to mount as it is bringing the pool online. This is equivalent to running \fBzfs mount\fR on each encrypted dataset immediately after the pool is imported. If any datasets have a \fBprompt\fR keysource this command will block waiting for the key to be entered. Otherwise, encrypted datasets will be left unavailable until the keys are loaded. |
Right now, zpool import (without -l) will actually leave encrypted datasets unmounted.
OK, let's document that in the manpage.
@@ -1331,13 +1391,30 @@ arc_buf_lsize(arc_buf_t *buf) | |||
return (HDR_GET_LSIZE(buf->b_hdr)); | |||
} | |||
|
|||
boolean_t | |||
arc_is_encrypted(arc_buf_t *buf) |
I'm fine with adding the compression comment while you're here.
/* | ||
* After encrypting many blocks with the same salt we may start to run | ||
* up against the theoretical limits of how much data can securely be | ||
* encrypted a single key using the supported encryption modes. To |
typo: encrypted *with* a single key
* counteract this we generate a new salt after writing | ||
* ZIO_CRYPT_MAX_SALT_USAGE blocks of data, tracked by zk_salt_count. | ||
* The current value was chosen because it is approximately the number | ||
* of blocks that would have to be written in order to acheive a |
spelling: achieve
* 1 / 1 trillion chance of having an IV collision. Developers looking to | ||
* change this number should make sure they take into account the | ||
* birthday problem in regards to IV generation and the limits of what the | ||
* underlying mode can actually handle. |
I think this comment should lay out the math behind this number. That would make it clear that the birthday problem is accounted for. Also we should simply say that we account for the birthday problem rather than condescend to future readers. E.g. This protects against a birthday attack. With n = 400 million blocks encrypted with the same key and salt, and d = the number of IV's = 2^96, the probability of two blocks using the same IV is: 1 - 1 * (1 - 1 / d ) * ( 1 - 2 / d ) * … * (1 - (n - 1) / d) which is approximated by ... (according to [citation]), which is 10^-12.
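To make the arithmetic in this suggestion concrete, here is a small sketch that evaluates the standard first-order birthday approximation p ≈ n(n - 1) / (2d), which holds when n is much smaller than sqrt(d). The function name is hypothetical, and the inputs (n = 400 million blocks, a 96-bit IV) are the figures quoted in the comment above, not values checked against the PR's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: approximate the probability that at least two
 * of nblocks randomly chosen IVs collide, given iv_bits-bit IVs.
 * Uses p ~= n(n - 1) / (2d) with d = 2^iv_bits.
 */
double
iv_collision_probability(uint64_t nblocks, unsigned iv_bits)
{
	double d = 1.0;
	double n = (double)nblocks;

	/* Keep d finite in double precision. */
	assert(iv_bits < 1024);

	/* d = 2^iv_bits; repeated doubling is exact for powers of two. */
	for (unsigned i = 0; i < iv_bits; i++)
		d *= 2.0;

	return (n * (n - 1.0) / (2.0 * d));
}
```

With `nblocks = 400000000` and `iv_bits = 96` this evaluates to roughly 1.01e-12, which agrees with the "1 / 1 trillion" figure in the code comment under review.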
|
||
/* utility macros */ | ||
#define BITS_TO_BYTES(x) (((x) + 7) >> 3) | ||
#define BYTES_TO_BITS(x) (x << 3) |
I think that the compiler can do the optimization for `/ 8`. How about `x * NBBY` and `(x + NBBY - 1) / NBBY`.
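A minimal sketch of the macros rewritten along these lines follows. The fallback definition of `NBBY` is an assumption for standalone compilation; in the real tree it would come from the system headers. The sketch also parenthesizes the argument of `BYTES_TO_BITS()`, which the version in the diff does not.

```c
#include <assert.h>

#ifndef NBBY
#define	NBBY	8	/* bits per byte; fallback for this sketch only */
#endif

/* Round up to whole bytes; the compiler can turn / 8 into a shift. */
#define	BITS_TO_BYTES(x)	(((x) + NBBY - 1) / NBBY)

/* The argument is parenthesized so expressions like (a | b) are safe. */
#define	BYTES_TO_BITS(x)	((x) * NBBY)

/* Compile-time sanity check: a 96-bit IV occupies 12 bytes. */
_Static_assert(BITS_TO_BYTES(96) == 12, "96 bits should be 12 bytes");
```

Without the inner parentheses, an argument such as `1 | 2` would expand to `1 | 2 << 3` and yield 17 rather than 24.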
#define DSL_CRYPTO_KEY_IV "DSL_CRYPTO_IV" | ||
#define DSL_CRYPTO_KEY_MAC "DSL_CRYPTO_MAC" | ||
#define DSL_CRYPTO_KEY_MASTER_BUF "DSL_CRYPTO_MASTER" | ||
#define DSL_CRYPTO_KEY_HMAC_KEY_BUF "DSL_CRYPTO_HMAC_KEY" |
Any reason for the macro names and values to not be consistent, at least `DSL_CRYPTO_KEY_XXX -> "DSL_CRYPTO_XXX"` (this applies to the last 2, which differ in the `_BUF` suffix)?
extern zio_crypt_info_t zio_crypt_table[ZIO_CRYPT_FUNCTIONS]; | ||
|
||
/* ZAP entry keys for DSL Encryption Keys stored on disk */ | ||
#define DSL_CRYPTO_KEY_CRYPT "DSL_CRYPTO_CRYPT" |
"crypto crypt" is not the most descriptive name. Maybe CRYPTO_FUNC?
* encrypted, this stage determines how the encryption metadata is stored in | ||
* the bp. Decryption is performed during ZIO_STAGE_READ_BP_INIT as a transform | ||
* callback. Encryption is also mutually exclusive with nopwrite, because | ||
* encrypted blocks with the same plaintext will not have matching ciphertexts. |
How about "because blocks with the same plaintext will be encrypted with different salts and therefore different IV's (if dedup is off), and therefore have different ciphertexts"?
will fix
zfeature_register(SPA_FEATURE_ENCRYPTION, | ||
"com.datto:encryption", "encryption", | ||
"Support for dataset level encryption", | ||
0, encryption_deps); |
The encryption feature is deactivated when it's no longer used, but it isn't `PER_DATASET`? I'll look for where this is implemented... it may be simpler to use the `PER_DATASET` flag here.
From the comments around this flag it looks like this might be the case. I'm not familiar with this flag but it looks like all I need to do is set `ds->ds_feature_activation_needed[feature] = B_TRUE` when the dataset starts using the feature. Is this correct? Currently, the implementation increments the feature count when it creates a key object in the spa and decrements it when that object is destroyed.
ZIO_CRYPT_DEFAULT, PROP_ONETIME, ZFS_TYPE_DATASET, | ||
"on | off | aes-128-ccm | aes-192-ccm | aes-256-ccm | " | ||
"aes-128-gcm | aes-192-gcm | aes-256-gcm", "ENCRYPTION", | ||
crypto_table); |
I guess you could argue that this property is inherited, but from the prop management infrastructure, it's ONETIME, so it should probably be registered in the "set once index properties" section
will fix
boolean_t | ||
zfs_prop_valid_keylocation(const char *str) | ||
{ | ||
if (strlen(str) == 6 && strncmp("prompt", str, 6) == 0) |
This is equivalent to `strcmp("prompt", str) == 0`, but more brittle.
will fix
{ | ||
if (strlen(str) == 6 && strncmp("prompt", str, 6) == 0) | ||
return (B_TRUE); | ||
else if (strlen(str) > 8 && strncmp("file:///", str, 8) == 0) |
Rather than having the literal `8`'s, how about something like:
#define FILE_URI_PREFIX "file:///"
#define FILE_URI_PREFIX_LEN strlen(FILE_URI_PREFIX)
Also I'm not sure how valuable the "len>8" check is, since `file:///` is just as wrong as `file:///etc`, but we aren't checking for that here.
This function is something I was trying to think of ways to improve or possibly remove. Realistically, `zfs_prop_valid_keylocation()` is just a sanity check. The `create`, `clone`, and `change-key` code all will actually need to load the key from the given `keylocation` before they can even attempt to talk to the kernel. This is all done in userspace, however, so the kernel can't really be sure if the `keylocation` it receives is valid. As a result, I added this function as a quick smoke test for the kernel. Can you think of a better way to check this?
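The two review suggestions in this thread (plain `strcmp()` for `prompt`, and a named prefix instead of the literal `8`) could be combined roughly as below. This is a userland sketch, not the PR's code: the function name `keylocation_looks_valid()` is hypothetical, it uses `bool` rather than `boolean_t`, and it computes the prefix length with `sizeof` at compile time instead of the `strlen()` the reviewer proposed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define	FILE_URI_PREFIX		"file:///"
/* Compile-time length of the prefix, excluding the trailing NUL. */
#define	FILE_URI_PREFIX_LEN	(sizeof (FILE_URI_PREFIX) - 1)

/*
 * Hypothetical smoke test for a keylocation string. It accepts
 * "prompt" exactly, or the file URI prefix followed by at least
 * one more character (matching the original "len > 8" check).
 */
bool
keylocation_looks_valid(const char *str)
{
	assert(str != NULL);

	if (strcmp(str, "prompt") == 0)
		return (true);

	if (strncmp(str, FILE_URI_PREFIX, FILE_URI_PREFIX_LEN) == 0 &&
	    str[FILE_URI_PREFIX_LEN] != '\0')
		return (true);

	return (false);
}
```

As noted in the thread, a check like this only confirms the shape of the string; it says nothing about whether the referenced file exists or holds a usable key.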
@@ -1527,13 +1536,14 @@ dmu_recv_begin_sync(void *arg, dmu_tx_t *tx) | |||
const char *tofs = drba->drba_cookie->drc_tofs; | |||
dsl_dataset_t *ds, *newds; | |||
uint64_t dsobj; | |||
int flags = DS_HOLD_FLAG_DECRYPT; |
maybe name this `dsflags` for consistency with `dmu_recv_begin_check()`?
I only called it `dsflags` there because `flags` was already being used. Would you like me to change this everywhere?
@@ -2454,14 +2466,16 @@ receive_free(struct receive_writer_arg *rwa, struct drr_free *drrf) | |||
static void | |||
dmu_recv_cleanup_ds(dmu_recv_cookie_t *drc) | |||
{ | |||
int flags = DS_HOLD_FLAG_DECRYPT; |
`dsflags`?
see comment above
*/ | ||
dsl_dataset_disown(drc->drc_ds, dmu_recv_tag); | ||
(void) spa_keystore_remove_mapping(dmu_tx_pool(tx)->dp_spa, | ||
drc->drc_ds, drc->drc_ds); |
this is so that `zfs receive` unloads the key?
So that it releases the key mapping hold it created in `dmu_recv_begin()`. Loading and unloading is for wrapping keys, technically.
@@ -196,8 +197,11 @@ traverse_prefetch_metadata(traverse_data_t *td, | |||
if (BP_GET_LEVEL(bp) == 0 && BP_GET_TYPE(bp) != DMU_OT_DNODE) | |||
return; | |||
|
|||
if ((td->td_flags & TRAVERSE_NO_DECRYPT) && BP_IS_ENCRYPTED(bp)) | |||
zio_flags |= ZIO_FLAG_RAW; |
It might be worth a comment somewhere that this is all about dnode blocks (since traversal doesn't read any other encrypted blocks).
will fix
int32_t i; | ||
int32_t epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; | ||
dnode_phys_t *child_dnp; | ||
|
||
/* | ||
* dnode blocks might have their bonus buffers encrypted, so | ||
* we must be careful to honor TRAVERSE_NO_DECRYPT |
should we assert that other cases here are not encrypted (and therefore it's OK to ignore NO_DECRYPT)?
sure. will fix.
Thank you very much to everyone who has looked at and commented on this PR. As some of you may have noticed, this page is getting a bit too big for GitHub to handle effectively, so I will be moving it to a new PR in a few minutes. I will link it here when it has been made.
The new PR has been opened at #5769. Please leave any future comments and reviews there.
Native encryption in zfsonlinux (See issue #494)
The change incorporates 2 major pieces:
The first feature is a keystore that manages wrapping and encryption keys for encrypted datasets. The commands are similar to those of Solaris, but with a few key enhancements that make them more predictable, more consistent, and less reliant on manual maintenance. It is fully integrated with the existing `zfs create` functions and `zfs clone` functions. It also exposes a new set of commands via `zfs key` for managing the keystore. For more info on the issues with the Solaris implementation see my comments here and here. The keystore operates on a few rules.
The second feature is the actual data and metadata encryption. All user data in an encrypted dataset is stored encrypted on-disk. User-provided metadata is also encrypted, but metadata structures have been left plain so that scrubbing and resilvering still work without the keys loaded. The design was originally inspired by this article but has been changed fairly significantly since.
Implementation details that should be looked at
`key_mapping_t` during `dsl_dataset_tryown()`. I added a flag to this function for code that wishes to own the dataset, but that does not require encrypted data, such as the scrub functions. I did my best to confirm that all owners set this flag correctly, but someone should confirm them, just to be sure.
`zfs send` and `zfs recv` do not currently do anything special with regards to encryption. The format of the send file has not changed and zfs send requires the keys to be loaded in order to work. At some point there should probably be a way to do raw sends.
`lzc_create() and lzc_clone()` to support hidden arguments. I understand that the purpose of libzfs_core is to have a stable API for interacting with the ZFS ioctls. However, these functions need to accept wrapping keys separately from the rest of their parameters because they need to use the (new) hidden_args framework to support hiding arguments from the logs. Without this, the wrapping keys would get printed to the zpool history.
EDIT 5/4/16: Updated to reflect the current state of the PR
EDIT 1/3/17: Updated to reflect the current state of the PR