Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML-KEM] incremental API #757

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

franziskuskiefer
Copy link
Member

An incremental API for ML-KEM that allows encapsulating to a partial public key.

Only the multiplexing API is exposed for incremental.

Copy link

@gram-signal gram-signal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems easily understandable and good to go. A few top-level comments:

  1. why Box<dyn ...> rather than just returning a struct of the right size and with the right stuff inside it as arrays (for State and Keys)
  2. our use case requires storing/retrieving things we use, particularly State and Keys, so we'll need some way to serialize and deserialize them. If they're just byte arrays or structs containing sets of byte arrays (that are public), we can do the {de,}seriallization ourselves.
  3. As mentioned elsewhere, it would be nice to get a (pubkey, privkey) rather than an opaque Keys, so we could throw away the public key and just keep the privkey around for decapsulation, to potentially save some memory space.
  4. There's a bunch of structs in the public API that aren't themselves public. I believe I've added comments for all of them.

Thanks again for this, it's super nice to see!

use super::*;
extern crate alloc;
use alloc::boxed::Box;
use ind_cca::incremental::{self, types::Keys};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keys must be made public with pub use or the value from generate_key_pair can't be stored or used by the caller.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can make it public, but it's not needed (see integration tests).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's not needed when you do:

  let foo = x();
  foo.y();

However, I need to put this in a struct:

struct Foo {
  keys: incremental::Keys,
}

fn a() -> Self {
  Self{ keys: x() }
}
fn b(&mut self) {
  self.keys.y();
}

For this purpose, it appears that public stuff is actually necessary?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm taking this (partially) back. We shouldn't need to have the structs themselves be public if we can serialize/deserialize them. So, if we just get {de,}serialization, we can deserialize when needed and store the serialized stuff within our structs at other times. This will probably be both more space and more computation efficient for our use case.

use ind_cca::incremental::{self, types::Keys};

/// Get the size of the second public key in bytes.
pub const fn pk2_len() -> usize {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please also add a method for pk1_len(), or a constant.

/// The second part of the public key is passed in as byte slice.
/// [`Error::InvalidInputLength`] is returned if `public_key_part` is too
/// short.
pub fn encapsulate2(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encapsulate2 should also return an error (I think?) if the hash from pk1 does not match pk2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encapsulate never returns an error. We should stay coherent with the standard APIs on that here. Instead you can call the validation function for the public key.
We have public key validation functions in libcrux that recompute the hash for you and check it validate_public_key. But you need to re-assemble the key for that first. We may not currently expose the exact version that you want. But I will check to see that we expose validate_public_key for unpacked keys.


/// Decapsulate incremental ciphertexts.
pub fn decapsulate(
private_key: &dyn Keys,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decapsulate should, I believe, be able to work with just a private key, not a full Keys. This might allow some callers to discard their public key early and save some memory usage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly a question of naming. We (re)use unpacked keys here for the incremental API. And if we don't serialize the keys, we need parts of the public key for the private key. That's why it's easier to keep the entire key pair around. Other implementations actually call our key pair private key.
If you indeed don't need the public key ever again and the saved space actually makes a difference, we could create a new private key for the incremental API here and throw away the rest of the public key.

/// **NOTE:** This is a non-standard API. Use with caution!
pub mod incremental {
pub use self::incremental::types::PublicKey1;
use self::incremental::types::{Ciphertext1, Ciphertext2, Error, State};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ciphertext1 and Ciphertext2 need to be pub use in order for a caller to call decapsulate.

State needs to be pub use for capturing the output of encapsulate1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can make it public, but it's not needed (see integration tests).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other thing: because Ciphertext1/2 is parameterized, C1_SIZE_768 also appears to be necessary. And it's somewhat annoying to use... maybe a type alias? Would something like this work?:

pub type Ciphertext1 = types::Ciphertext1<C1_SIZE_768>;


/// Incremental trait for unpacked key pairs.
//<const K: usize, Vector: Operations>
pub trait IncrementalKeyPair {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For our use case, we also need to be able to write Keys out to persistent storage and read it back in. Possibly we could do this by exposing a sk_bytes(&self, bytes: &mut [u8]) and a from_sk_bytes(&[u8]) -> Self?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can definitely expose that too. It's more expensive, but we can add the functions here.

/// The output `bytes` have to be at least 64 bytes long.
///
/// **PANICS:** if the output `bytes` are too short.
fn pk1_bytes(&self, bytes: &mut [u8]);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just return a PublicKey1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we never need the key struct itself (you only want to send it out), it felt unnecessary to have that. But if you want to use the struct for something, we can do that too.
But it would make it asymmetric because we can't do that for PK2, which is platform dependent.


/// Trait container for multiplexing over platform dependent [`EncapsState`].
pub trait State {
fn as_any(&self) -> &dyn Any;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This trait will also need to be stored/retrieved from persistent storage, so also needs some mechanism of being mapped to/from bytes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was one of my main questions. I'll add it then. It's computational overhead, that's why this is nicer. But I'll add something and try to keep it light and safe.

/// The partial ciphertext c1 - first part.
#[repr(transparent)]
pub struct Ciphertext1<const LEN: usize> {
pub value: [u8; LEN],

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice if PublicKey2 were like this, rather than just being a &[u8], so it had a known, type-checked length.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not possible. PK2 is platform dependent. We could always write it out in a fixed length byte array, but that's more expensive.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble understanding this... how can PK2 be platform dependent? I'm going to pass its bytes between different platforms. Or is it that the serialization is constant-sized, but the struct isn't? In that case, can we make a publicly facing PublicKey2 struct that's the right size and use it for returning bytes to the user?

Copy link
Member Author

@franziskuskiefer franziskuskiefer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. why Box<dyn ...> rather than just returning a struct of the right size and with the right stuff inside it as arrays (for State and Keys)

Serialising the key and state costs extra cycles, or is very unsafe. That's why we opted for a safe, efficient version as the default API. But when we add serialisation for everything anyway, we can also add an additional API that serialises the values before it returns them.

  1. our use case requires storing/retrieving things we use, particularly State and Keys, so we'll need some way to serialize and deserialize them. If they're just byte arrays or structs containing sets of byte arrays (that are public), we can do the {de,}seriallization ourselves.

Everything, except PK1 and the ciphertexts are platform dependent unfortunately. That's why we need the extra hoops. But we can add serialisation for all of them. The cheapest versions are just pretty unsafe. I'll think about a good solution and propose something.

  1. As mentioned elsewhere, it would be nice to get a (pubkey, privkey) rather than an opaque Keys, so we could throw away the public key and just keep the privkey around for decapsulation, to potentially save some memory space.

As commented inline, you need parts of the public key. So you can't just throw it away. We can add a separate struct that drops unneeded parts of the key. I'm not sure if it's worth it though. Let's look at this when we're happy with the rest.

  1. There's a bunch of structs in the public API that aren't themselves public. I believe I've added comments for all of them.

I commented on them. They are not needed, but we can of course expose them.


/// Incremental trait for unpacked key pairs.
//<const K: usize, Vector: Operations>
pub trait IncrementalKeyPair {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can definitely expose that too. It's more expensive, but we can add the functions here.

/// The output `bytes` have to be at least 64 bytes long.
///
/// **PANICS:** if the output `bytes` are too short.
fn pk1_bytes(&self, bytes: &mut [u8]);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we never need the key struct itself (you only want to send it out), it felt unnecessary to have that. But if you want to use the struct for something, we can do that too.
But it would make it asymmetric because we can't do that for PK2, which is platform dependent.

/// The partial ciphertext c1 - first part.
#[repr(transparent)]
pub struct Ciphertext1<const LEN: usize> {
pub value: [u8; LEN],
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not possible. PK2 is platform dependent. We could always write it out in a fixed length byte array, but that's more expensive.


/// Trait container for multiplexing over platform dependent [`EncapsState`].
pub trait State {
fn as_any(&self) -> &dyn Any;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was one of my main questions. I'll add it then. It's computational overhead, that's why this is nicer. But I'll add something and try to keep it light and safe.


/// Decapsulate incremental ciphertexts.
pub fn decapsulate(
private_key: &dyn Keys,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly a question of naming. We (re)use unpacked keys here for the incremental API. And if we don't serialize the keys, we need parts of the public key for the private key. That's why it's easier to keep the entire key pair around. Other implementations actually call our key pair private key.
If you indeed don't need the public key ever again and the saved space actually makes a difference, we could create a new private key for the incremental API here and throw away the rest of the public key.

/// The second part of the public key is passed in as byte slice.
/// [`Error::InvalidInputLength`] is returned if `public_key_part` is too
/// short.
pub fn encapsulate2(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encapsulate never returns an error. We should stay coherent with the standard APIs on that here. Instead you can call the validation function for the public key.
We have public key validation functions in libcrux that recompute the hash for you and check it validate_public_key. But you need to re-assemble the key for that first. We may not currently expose the exact version that you want. But I will check to see that we expose validate_public_key for unpacked keys.

use super::*;
extern crate alloc;
use alloc::boxed::Box;
use ind_cca::incremental::{self, types::Keys};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can make it public, but it's not needed (see integration tests).

/// **NOTE:** This is a non-standard API. Use with caution!
pub mod incremental {
pub use self::incremental::types::PublicKey1;
use self::incremental::types::{Ciphertext1, Ciphertext2, Error, State};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can make it public, but it's not needed (see integration tests).


/// Get the state as bytes
pub fn to_bytes(self, out: &mut [u8]) {
debug_assert!(out.len() >= Self::num_bytes());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there's any chance that this serialization will change in the future, it might be prudent to add a "version byte" to the head of this serialization. These serializations may be stored persistently for long periods of time on user devices, and may be processed by different versions of the libcrux library. Should any changes occur here (including length changes, or changes in the layout of bytes or additional information), new library versions should be able to process old library serialization, and old library versions should at least be able to fail gracefully to process new serializations (return a Result?). Some initial byte saying "this is serialization version 1" could aid with that.


/// Build a state from bytes
pub fn from_bytes(bytes: &[u8]) -> Self {
debug_assert!(bytes.len() >= Self::num_bytes());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular assert is probably best as a Result, if there's any possibility of future serialization change.

@@ -677,6 +677,34 @@ pub mod incremental {
RANK_512 * 16 * 32
}

/// The size of the key pair in bytes.
pub const fn key_pair_len() -> usize {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 99% of the way there on using this, but there's still one thing I think should be made public here: the expected lengths of CT1 and CT2. I need to set up a receiver with the expected number of bytes I expect to receive:

  let receiver = ExpectToReceive(CT1_SIZE);
  while !receiver.done() {
    receiver.read_byte(b);
  }

Right now, I'm not seeing a good way to do this with the constnats and functions that are currently publicly exposed. Of course, I could recompute the constants myself, but that seems a bit brittle.

@@ -150,17 +152,30 @@ macro_rules! impl_consistency_incremental {
let mut pk2_bytes = [0u8; pk2_len()];
key_pair.pk2_bytes(&mut pk2_bytes);

// Check that the keys are the same
assert_eq!(pk1_bytes, &key_pair_bytes[0..64]);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provide functions to extract pk1/pk2 from generate_key_pair_bytes output? Something like:

fn pk1_bytes(keys: &[u8]) -> &[u8] {
    &keys[..64]
}

fn pk2_bytes(keys: &[u8]) -> &[u8] {
    &keys[64..][..incremental::pk2_len()]
}

// ... and then to pk2.
// pk2 is passed in as bytes because the deserializaiton is runtime
// platform dependent.
let ct2 = encapsulate2(state.as_ref(), &pk2_bytes).unwrap();

// encaps2 with serialized state
let ct22 = encapsulate2_serialized(&serialized_state, &pk2_bytes).unwrap();
assert_eq!(ct2.value, ct22.value);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does one extract the shared secret from serialized state?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants