Seekable Crypto is a Java library that provides the ability to seek within
SeekableInputs while decrypting the underlying contents, along with some
utilities for storing and generating the keys used to encrypt/decrypt the data
streams. An implementation of the Hadoop FileSystem is also included that uses
the Seekable Crypto library to provide efficient and transparent client-side
encryption for Hadoop filesystems.
Currently AES/CTR/NoPadding and AES/CBC/PKCS5Padding are supported.
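To illustrate why a mode such as AES/CTR lends itself to seeking, here is a minimal sketch that uses only the JDK's javax.crypto classes. It is not this library's implementation; the class and method names (CtrSeekSketch, counterAt, decryptFrom) are hypothetical. In CTR mode the keystream for block i depends only on the key and the initial counter plus i, so decryption can start at any block boundary by advancing the counter and then discarding the bytes that precede the requested offset.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration, not part of the Seekable Crypto API.
final class CtrSeekSketch {
    private static final int BLOCK_SIZE = 16; // AES block size in bytes

    /** Returns the initial counter advanced by blockIndex blocks (mod 2^128). */
    static byte[] counterAt(byte[] iv, long blockIndex) {
        byte[] raw = new BigInteger(1, iv).add(BigInteger.valueOf(blockIndex)).toByteArray();
        byte[] out = new byte[BLOCK_SIZE];
        // Keep the low-order 16 bytes, right-aligned, dropping any sign or overflow byte.
        int len = Math.min(raw.length, BLOCK_SIZE);
        System.arraycopy(raw, raw.length - len, out, BLOCK_SIZE - len, len);
        return out;
    }

    static byte[] encrypt(byte[] key, byte[] iv, byte[] plaintext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(plaintext);
    }

    /** Decrypts length bytes starting at offset without touching earlier blocks. */
    static byte[] decryptFrom(byte[] key, byte[] iv, byte[] ciphertext, int offset, int length)
            throws GeneralSecurityException {
        int skip = offset % BLOCK_SIZE;   // bytes inside the first block to discard
        int blockStart = offset - skip;   // start of the block containing offset
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(counterAt(iv, offset / BLOCK_SIZE)));
        byte[] plain = cipher.doFinal(ciphertext, blockStart, skip + length);
        return Arrays.copyOfRange(plain, skip, skip + length);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] iv = new byte[BLOCK_SIZE];
        random.nextBytes(key);
        random.nextBytes(iv);
        byte[] plaintext = "0123456789abcdefghijklmnopqrstuvwxyz".getBytes(StandardCharsets.UTF_8);
        byte[] ciphertext = encrypt(key, iv, plaintext);
        byte[] middle = decryptFrom(key, iv, ciphertext, 20, 5);
        System.out.println(new String(middle, StandardCharsets.UTF_8)); // prints "klmno"
    }
}
```

CBC can be handled along similar lines, since decrypting a block only needs that block and the preceding ciphertext block, but the sketch above covers CTR only.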
Disclaimer: neither supported AES mode is authenticated. Consumers of this library should authenticate ciphertext through an external cryptographic mechanism such as Encrypt-then-MAC. Failing to authenticate ciphertext breaks security in scenarios where an attacker can manipulate ciphertext inputs.
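As a rough sketch of what Encrypt-then-MAC could look like on the consumer side, the following uses the JDK's HmacSHA256 to tag the IV and ciphertext and compares tags in constant time. The class and method names are hypothetical and nothing here is provided by this library. It assumes a MAC key that is independent of the encryption key and a fixed-length IV, so that the concatenation is unambiguous.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration, not part of the Seekable Crypto API.
final class EncryptThenMacSketch {
    /** Computes an HMAC-SHA256 tag over iv || ciphertext. */
    static byte[] tag(byte[] macKey, byte[] iv, byte[] ciphertext) throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
        mac.update(iv);
        return mac.doFinal(ciphertext);
    }

    /** Recomputes the tag and compares it to the stored tag in constant time. */
    static boolean verify(byte[] macKey, byte[] iv, byte[] ciphertext, byte[] expectedTag)
            throws GeneralSecurityException {
        return MessageDigest.isEqual(tag(macKey, iv, ciphertext), expectedTag);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] macKey = new byte[32]; // demo only: use a randomly generated key in practice
        byte[] iv = new byte[16];
        byte[] ciphertext = {10, 20, 30, 40};
        byte[] stored = tag(macKey, iv, ciphertext);

        System.out.println(verify(macKey, iv, ciphertext, stored)); // true
        ciphertext[0] ^= 1; // simulate an attacker flipping a bit
        System.out.println(verify(macKey, iv, ciphertext, stored)); // false
    }
}
```

The tag must be verified before any decrypted bytes are trusted. A single tag over the whole stream requires reading the entire ciphertext once, which is a trade-off to weigh against seeking.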
Source for examples can be found here
byte[] bytes = "0123456789".getBytes(StandardCharsets.UTF_8);
// Store this key material for future decryption
KeyMaterial keyMaterial = SeekableCipherFactory.generateKeyMaterial(AesCtrCipher.ALGORITHM);
SeekableCipher cipher = SeekableCipherFactory.getCipher(AesCtrCipher.ALGORITHM, keyMaterial);
ByteArrayOutputStream os = new ByteArrayOutputStream(bytes.length);
Cipher encrypt = cipher.initCipher(Cipher.ENCRYPT_MODE);
// Encrypt some bytes
CipherOutputStream encryptedStream = new CipherOutputStream(os, encrypt);
encryptedStream.write(bytes);
encryptedStream.close();
byte[] encryptedBytes = os.toByteArray();
// Bytes written to stream are encrypted
assertThat(encryptedBytes, is(not(bytes)));
SeekableInput is = new InMemorySeekableDataInput(encryptedBytes);
DecryptingSeekableInput decryptedStream = new DecryptingSeekableInput(is, cipher);
// Seek to the last byte in the decrypted stream and verify its decrypted value
byte[] readBytes = new byte[bytes.length];
decryptedStream.seek(bytes.length - 1);
decryptedStream.read(readBytes, 0, 1);
assertThat(readBytes[0], is(bytes[bytes.length - 1]));
// Seek to the beginning of the decrypted stream and verify it's equal to the raw bytes
decryptedStream.seek(0);
decryptedStream.read(readBytes, 0, bytes.length);
assertThat(readBytes, is(bytes));
Hadoop Crypto is a library for per-file client-side encryption in Hadoop FileSystems such as HDFS or S3. It provides wrappers for the Hadoop FileSystem API that transparently encrypt and decrypt the underlying streams. The encryption algorithm uses Key Encapsulation: each file is encrypted with a unique symmetric key, which is itself secured with a public/private key pair and stored alongside the file.
The EncryptedFileSystem wraps any FileSystem implementation and
encrypts/decrypts the streams returned by create and open. Each file's streams
use a unique per-file symmetric key, which is passed to the KeyStorageStrategy
to be stored for future access. The provided storage strategy implementation
encrypts the symmetric key using a public/private key pair and then stores the
encrypted key on the FileSystem alongside the encrypted file.
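The key encapsulation idea described above can be sketched with plain JDK classes: a fresh AES key is generated per file, wrapped with the RSA public key, and later recovered with the private key. This is only an illustration under assumptions. The class name, the OAEP padding choice, and the key sizes are not taken from this library, whose FileKeyStorageStrategy may wrap keys differently.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical illustration, not part of the Hadoop Crypto API.
final class KeyWrapSketch {
    // Assumed padding for the sketch; the library's actual choice may differ.
    private static final String WRAP_TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Encrypts (wraps) the per-file symmetric key with the public key. */
    static byte[] wrap(SecretKey fileKey, PublicKey publicKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(WRAP_TRANSFORMATION);
        cipher.init(Cipher.WRAP_MODE, publicKey);
        return cipher.wrap(fileKey);
    }

    /** Recovers the per-file symmetric key with the private key. */
    static SecretKey unwrap(byte[] wrapped, PrivateKey privateKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(WRAP_TRANSFORMATION);
        cipher.init(Cipher.UNWRAP_MODE, privateKey);
        return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator pairGenerator = KeyPairGenerator.getInstance("RSA");
        pairGenerator.initialize(2048);
        KeyPair pair = pairGenerator.generateKeyPair();

        KeyGenerator keyGenerator = KeyGenerator.getInstance("AES");
        keyGenerator.init(256);
        SecretKey fileKey = keyGenerator.generateKey(); // one key per file

        byte[] wrapped = wrap(fileKey, pair.getPublic()); // this is what would be stored next to the file
        SecretKey recovered = unwrap(wrapped, pair.getPrivate());
        System.out.println(Arrays.equals(fileKey.getEncoded(), recovered.getEncoded())); // true
    }
}
```

Only the wrapped key needs to be stored with the file; the private key stays with clients that are allowed to read it.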
The hadoop-crypto-all.jar can be added to the classpath of any client and used
to wrap any concrete backing FileSystem. The scheme of the EncryptedFileSystem
is e[FS-scheme], where [FS-scheme] is the scheme of any FileSystem that can be
instantiated statically using FileSystem#get (e.g. efile). The FileSystem
implementation, public key, and private key must also be configured in
core-site.xml.
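The e[FS-scheme] naming convention can be pictured as a simple mapping between URIs. The sketch below is only an illustration of the convention and is not how the library resolves the backing FileSystem; the class and method names are hypothetical, and the check is naive in that it treats any scheme starting with "e" as an encrypted scheme.

```java
import java.net.URI;

// Hypothetical illustration, not part of the Hadoop Crypto API.
final class EncryptedSchemeSketch {
    /** Maps an e[FS-scheme] URI such as efile:/tmp/x to the backing URI file:/tmp/x. */
    static URI backingUri(URI encrypted) {
        String scheme = encrypted.getScheme();
        if (scheme == null || scheme.length() < 2 || scheme.charAt(0) != 'e') {
            throw new IllegalArgumentException("Not an e[FS-scheme] URI: " + encrypted);
        }
        // An absolute URI string starts with its scheme, so dropping the first
        // character turns "efile:..." into "file:...".
        return URI.create(encrypted.toString().substring(1));
    }

    public static void main(String[] args) {
        System.out.println(backingUri(URI.create("efile:/tmp/file.txt")));   // file:/tmp/file.txt
        System.out.println(backingUri(URI.create("es3a://bucket/dir/key"))); // s3a://bucket/dir/key
    }
}
```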
Add hadoop-crypto-all.jar to the classpath of the CLI (e.g. share/hadoop/common).
openssl genrsa -out rsa.key 2048
# Public Key
openssl rsa -in rsa.key -outform PEM -pubout 2>/dev/null | grep -v PUBLIC | tr -d '\r\n'
# Private Key
openssl pkcs8 -topk8 -inform pem -in rsa.key -outform pem -nocrypt | grep -v PRIVATE | tr -d '\r\n'
<configuration>
<property>
<name>fs.efile.impl</name> <!-- others: fs.es3a.impl or fs.ehdfs.impl -->
<value>com.palantir.crypto2.hadoop.StandaloneEncryptedFileSystem</value>
</property>
<property>
<name>fs.efs.key.public</name>
<value>MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAqXkSOcB2UpLrlG3scAHDavPnSucxOwRWG12woY5JerYlqyIm7xcNuyLQ/rLPxdlCGgOZOoPzKVXc/3pAeOdPM1LcXLNW8d7Uht3vo7a6SR/mXMiCTMn+9wOx40Bq0ofvx9K4RSpW2lKrlJNUJG+RP5lO7OhB5pveEBMn/8OR2yMLgS58rHQ0nrXXUHqbWiMI8k+eYK7aimexkQDhIXtbqmQ5tAXKyoSMDAyeuDNY8WsYaW15OCwGSIRClNAiwPEGLQCYJQi41IxwQxwN42jQm7fwoVSrN4lAfi5B8EHxFglAZcE8nUTdTnXCbUk9SPz8XXmK4hmK9X4L+2Av4ucNLwIDAQAB</value>
</property>
<property>
<name>fs.efs.key.private</name>
<value>MIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQCpeRI5wHZSkuuUbexwAcNq8+dK5zE7BFYbXbChjkl6tiWrIibvFw27ItD+ss/F2UIaA5k6g/MpVdz/ekB4508zUtxcs1bx3tSG3e+jtrpJH+ZcyIJMyf73A7HjQGrSh+/H0rhFKlbaUquUk1Qkb5E/mU7s6EHmm94QEyf/w5HbIwuBLnysdDSetddQeptaIwjyT55grtqKZ7GRAOEhe1uqZDm0BcrKhIwMDJ64M1jxaxhpbXk4LAZIhEKU0CLA8QYtAJglCLjUjHBDHA3jaNCbt/ChVKs3iUB+LkHwQfEWCUBlwTydRN1OdcJtST1I/PxdeYriGYr1fgv7YC/i5w0vAgMBAAECggEASvSLhROEwbzNiRadLmT5Q4Kg19YtRgcC9pOXnbzK7wVE3835HmI55nzdpuj7UGxo+gyBZwoZMD0Tw8MUZOUZeH+7ixye5ddCdGwQo34cIl+DiaH9T20/4Yy2zuYc2QTanqyqZ5z0URejX9FRs9PMkC6EY+/NxetGaiGu3UZoalz7F/5wS8bCaKPkm3AjLvqXHL5KiSbPDPBQj4m+iFWLoWZL9FB1zyif+yBatU4cBCLHaTTgXroItEKcxTwFfyi2l059ItoP5E10djKHpMuPiPrTMS0FHAom3GZAYEFnjRgInR0sIotEwuSDObqcio1PdXRsi5Ul8MxfpXxLSuL+UQKBgQDcvmehBARNDksQJGzIyegKg10eLYdfXFCR+QDZeqJod/pCQ6gtW0aFYAoL0uXiMwQzSb6m7offmXH0JLLqOnjgcZlejHUDSTTWtNOYlGaO7OVgFcnG6/UnCE54eJcaw68auvPB9XW3gm5cfWSNpUI+6aJDBb6BKx8uNMoRreq9wwKBgQDEilhsCgUOIRkJfM5MYUzMT0gR8qt671q+lgTjBDwYvdoQ7BijG6Lbqbp9Xd4nODiw1t7e1Rexw+cuIeRs8NITU4f4Nfe25rRhZ+0n7g9OoCiRUoEsmd7cqDk6pubpw9hW1TKKLzTqExisGFy+bnUA8FFs2TbU9Xeb9kdm1GXgJQKBgAsN9f6YRubc+mFakaAUjGxKW9VxDkB2TQqiX6qEe7GjoILFBJ0Q3x06zAX/j8eeKm2vGb8eXuuRsaU6WUNlnjwPNFEJ06pQdjbyY05W0DQEJRCExtARbPuBbPyXfWm3twMtrZtfAYApJgG3vdtiFUk1Rgz5MqshT7RurFfqT8ElAoGAE2BEOVp/hxYSPtI0EGmjRZ0nUMWozDTesF1f2/Wl6xaEchikkSf/VUKVZRik9x7ez+hPDo7ZiCf1GaIzv926CDe69uhzJG/4JoY1ZjNdBPZbKYCFxZzh0MUw5yxfJXquUFkyY1cmE1GQpB6+vfNry4zlqiJ7+mC8yv5rqaKU7JUCgYBXPYpuQppR1EFj66LSrZ8ebXmt5TtwR839UkgEhLOBkO0cFP2BXVAMx9p0/MYLNIPk7vVpVtRCKYr6tBVdUWCin0obC5O+JzuhilQ0aH3xl5mbiasOvCNPjniaTViRt6zNlaq6RMS4x1LqYUyqc4LUrBbGMWJsdjYqVAi1Rq1FTw==</value>
</property>
</configuration>
./bin/hadoop dfs -put file.txt efile:/tmp/file.txt
./bin/hadoop dfs -ls efile:/tmp
./bin/hadoop dfs -cat efile:/tmp/file.txt
Source for examples can be found here
// Get a local FileSystem
FileSystem fs = FileSystem.get(new URI("file:///"), new Configuration());
// Initialize EFS with random public/private key pair
KeyPair pair = TestKeyPairs.generateKeyPair();
KeyStorageStrategy keyStore = new FileKeyStorageStrategy(fs, pair);
EncryptedFileSystem efs = new EncryptedFileSystem(fs, keyStore);
// Init data and local path to write to
byte[] data = "test".getBytes(StandardCharsets.UTF_8);
byte[] readData = new byte[data.length];
Path path = new Path(folder.newFile().getAbsolutePath());
// Write data out to the encrypted stream
OutputStream eos = efs.create(path);
eos.write(data);
eos.close();
// Reading through the decrypted stream produces the original bytes
InputStream ein = efs.open(path);
IOUtils.readFully(ein, readData);
assertThat(data, is(readData));
// Reading through the raw stream produces the encrypted bytes
InputStream in = fs.open(path);
IOUtils.readFully(in, readData);
assertThat(data, is(not(readData)));
// Wrapped symmetric key is stored next to the encrypted file
assertTrue(fs.exists(new Path(path + FileKeyStorageStrategy.EXTENSION)));
Key | Value | Default |
---|---|---|
fs.efs.cipher | The cipher used to wrap the underlying streams. | AES/CTR/NoPadding |
fs.e[FS-scheme].impl | Must be set to com.palantir.crypto2.hadoop.StandaloneEncryptedFileSystem | |
fs.efs.key.public | Base64 encoded X509 public key | |
fs.efs.key.private | Base64 encoded PKCS8 private key | |
fs.efs.key.algorithm | Public/private key pair algorithm | RSA |
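To make the key formats in the table concrete, here is a small sketch of how Base64 encoded X509 and PKCS8 values can be produced and parsed with the JDK's KeyFactory. The class and method names are hypothetical, and this is not necessarily how the library reads its configuration.

```java
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.spec.PKCS8EncodedKeySpec;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;

// Hypothetical illustration, not part of the Hadoop Crypto API.
final class ConfiguredKeysSketch {
    /** Base64 of the key's default encoding: X.509 for public keys, PKCS#8 for private keys. */
    static String encode(Key key) {
        return Base64.getEncoder().encodeToString(key.getEncoded());
    }

    /** Parses values shaped like fs.efs.key.public and fs.efs.key.private into a KeyPair. */
    static KeyPair decode(String publicBase64, String privateBase64, String algorithm)
            throws GeneralSecurityException {
        KeyFactory factory = KeyFactory.getInstance(algorithm);
        PublicKey publicKey = factory.generatePublic(
                new X509EncodedKeySpec(Base64.getDecoder().decode(publicBase64)));
        PrivateKey privateKey = factory.generatePrivate(
                new PKCS8EncodedKeySpec(Base64.getDecoder().decode(privateBase64)));
        return new KeyPair(publicKey, privateKey);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair original = generator.generateKeyPair();

        String publicValue = encode(original.getPublic());
        String privateValue = encode(original.getPrivate());
        KeyPair parsed = decode(publicValue, privateValue, "RSA");

        System.out.println(parsed.getPublic().equals(original.getPublic())); // true
    }
}
```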
This repository is made available under the Apache 2.0 License.