Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: re-organize ec-gpu traits #30

Merged
merged 1 commit into from
Aug 9, 2022
Merged

feat: re-organize ec-gpu traits #30

merged 1 commit into from
Aug 9, 2022

Conversation

vmx
Copy link
Contributor

@vmx vmx commented Aug 3, 2022

The ec-gpu traits are re-organized.

BREAKING CHANGE: The GpuEngine trait is removed, as it was specific
to pairing friendlt elliptic curves.

The GpuField trait gains a new method called sub_field_name, which
only matters for extension fields. It has a default implementation, so
existing implementations don't need to change.

A new trait called GpuName is introduced, which every GpuField needs
to implement. It contains the name of the object it's implemented for,
which is then used in the generated source code as identifier.

@vmx
Copy link
Contributor Author

vmx commented Aug 3, 2022

@dignifiedquire I'd like to get a review from you as you did the original work on this crate.

@DrPeterVanNostrand I'd like especially a review of the comments.

For both of you: the implementation of the traits for blstrs can be found here: filecoin-project/blstrs#43, the pasta_curves one is at zcash/pasta_curves#45.

Comment on lines 5 to 21
/// This name is used in the source code that is generated for the GPU. It should be globally
/// unique so that you don't collide with other libraries. It must *not* contain spaces. For
/// fields it is best to use a combination of the curve you are using and the field name. For
/// example `bls12_381_fp`, `bls12_381_scalar`, or `pallas_fp`.
fn name() -> String;
Copy link

@str4d str4d Aug 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This global uniqueness requirement is going to be very hard to enforce; in fact, this documentation recommends a strategy for naming fields that is guaranteed to cause collisions across different implementations of the same field.

However, we know that within a particular binary application, each type will have a unique TypeId. Would it be possible to generate the GPU source code from something based on that identifier, as derived from Self? The TypeId is allowed to change between compilation runs (see this issue for some context), so it wouldn't work if the expectation is that the generated GPU source code can be compiled once and reused indefinitely. But if the source is being generated and compiled live (and thus is already subject to the rustc -Cmetadata flag) then this should both work, and mean downstream crates don't need to do anything (as the changes to GpuField could then be entirely non-breaking, either with this being a default method on GpuField or by inverting the trait bounds so GpuName: GpuField and providing a blanket implementation of GpuName).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about using TypeId, but then thought it might be too brittle. Practically I wouldn't expect that there are bazillions of implementations. I even thought one might perhaps even use a non-unique name intentionally. E.g. you might even want to create the same source for differently implemented fields (thought I'm not really sure if that's a use case). I liked the idea of keeping things pretty straight forward and easy to debug. Hence I think explicit names are better.

Input from other folks would be appreciated.

Even if we would use TypeId it would still be a breaking change for downstream crates, as there are cases where GpuName is implemented for things that aren't fields. So the trait bound cannot be flipped.

Copy link

@str4d str4d Aug 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Practically I wouldn't expect that there are bazillions of implementations.

That may be true, but it only takes two to collide in order for the documented requirement to fail. And there are no clearly documented consequences for a collision.

I even thought one might perhaps even use a non-unique name intentionally. E.g. you might even want to create the same source for differently implemented fields (thought I'm not really sure if that's a use case).

This is where it would be very useful to have more documentation about how this function will (or should) be used. If the contents of this method are not being tightly constrained, then it falls to the consumer of the method to have a well-defined usage contract, so that people implementing this trait know what they are implementing it against.

I liked the idea of keeping things pretty straight forward and easy to debug. Hence I think explicit names are better.

Definitely reasonable, and also why I'd prefer human-meaningful names over compiler-generated identifiers. If debuggability were the only concern, then std::any::type_name would be perfect for this; sadly it doesn't guarantee to include the module path, so it doesn't claim uniqueness.

Maybe the recommended implementation for this method should be:

    fn name() -> String {
        format!("{}::{}", module_path!(), std::any::type_name::<Self>())
    }

(or something more involved that strips the module path if present from the output of type_name). This could even be provided as a macro (used like e.g. ec_gpu::name!(MyType); or more cleanly as a proc macro #[ec_gpu::GpuName] that would always be run in the same module) so that most people use the non-colliding default (while people with more convoluted use cases can implement the trait directly).

Even if we would use TypeId it would still be a breaking change for downstream crates, as there are cases where GpuName is implemented for things that aren't fields. So the trait bound cannot be flipped.

Good to know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @str4d for all the input. I played a bit around with those ideas and pushed a new commit. Now the name is auto-generated and it actually isn't too bad. In case of BLS381-12, it's names like ec_gpu_blstrs__fp__Fp, ec_gpu_blstrs__scalar__Scalar or ec_gpu_blstrs__g1__G1Affine, which are still reasonable. I've also updated the documentation that you shouldn't implement the name yourself.

I thought it would be a good idea to auto-implement it for the GpuField. I don't know if it's still a breaking change (I guess it still is), but at least it will need less code changes. I've also updated the blstrs PR so that folks can see what changes would be needed filecoin-project/blstrs#43. I think the pasta_curves trait wouldn't need any code changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to add that G1 and G2 (https://github.com/filecoin-project/blstrs/pull/43/files#diff-a81fd3c080fd012c0c9155e68a28c5ef92bafcd33eced7847bbb40c5a7dcc116R841) are an example where I need a GpuName which aren't fields. I need it in order to name the Multiexp/MSM kernel correctly.

Comment on lines 15 to 19
fn name() -> String {
//sanitize_name(&format!("{}_{}", module_path!(), std::any::type_name::<Self>()))
format!("{}_{}", module_path!(), std::any::type_name::<Self>())
.replace(|c: char| !c.is_ascii_alphanumeric(), "_")
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This default implementation doesn't quite do the right thing. module_path!() inserts the path for the module in which the macro is evaluated, which in this case is ec_gpu. That's handy for domain-separating multiple versions of ec-gpu, but this could also just be done at the point where GpuName::name() gets called.

What I'd suggested it for was domain-separating the downstream types themselves, and that only works if the module_path!() macro is invoked inside the downstream modules. This is why in my previous comment I suggested providing a wrapping macro that downstreams could use to implement GpuName, which would invoke it in the correct context.

The reason the examples you showed before look right is because std::any::type_name happens to be printing the module path, but it is not guaranteed to do so (it may also just show the type name). I think the best way to combine them would be to check if the output of type_name() is prefixed by module_path!() and prepend the latter if not. In reusable macro form:

#[macro_export]
macro_rules! name {
    ($named:ident) => {
        impl ::ec_gpu::GpuName for $named {
            fn name() -> String {
                let mod_path = module_path!();
                let type_name = std::any::type_name::<Self>();
                if type_name.starts_with(mod_path) {
                    type_name.replace(|c: char| !c.is_ascii_alphanumeric(), "_")
                } else {
                    format!("{}__{}", mod_path, type_name)
                        .replace(|c: char| !c.is_ascii_alphanumeric(), "_")
                }
            }
        }
    };
}

and then the default impl would be removed:

Suggested change
fn name() -> String {
//sanitize_name(&format!("{}_{}", module_path!(), std::any::type_name::<Self>()))
format!("{}_{}", module_path!(), std::any::type_name::<Self>())
.replace(|c: char| !c.is_ascii_alphanumeric(), "_")
}
fn name() -> String;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The macro would then be used like so:

#[cfg(feature = "gpu")]
ec_gpu::name!(Fp2);

#[cfg(feature = "gpu")]
impl ec_gpu::GpuField for Fp2 {
    ...
}

Comment on lines 41 to 43

// Automatically implement `GpuName` for everything that implements `GpuField`.
impl<T> GpuName for T where T: GpuField {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would also be removed, as there'd not be a default impl:

Suggested change
// Automatically implement `GpuName` for everything that implements `GpuField`.
impl<T> GpuName for T where T: GpuField {}

@vmx
Copy link
Contributor Author

vmx commented Aug 5, 2022

I've pushed a new version, which I think is the best of the suggestions from @str4d and my idea. I found the macro a bit too magical. I prefer the code being more explicit and easy to grep. I updated the documentation, so that people know that they should use the ec_gpu::name() macro. @str4d what do you think?

Copy link

@str4d str4d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK. I confirmed that zcash/pasta_curves#45 produces the names pasta_curves__fields__fp__Fp and pasta_curves__fields__fq__Fq.

ec-gpu/src/lib.rs Outdated Show resolved Hide resolved
The ec-gpu traits are re-organized.

BREAKING CHANGE: The `GpuEngine` trait is removed, as it was specific
to pairing friendly elliptic curves.

The `GpuField` trait gains a new method called `sub_field_name`, which
only matters for extension fields. It has a default implementation, so
existing implementations don't need to change.

A new trait called `GpuName` is introduced, which every `GpuField` needs
to implement. It contains the name of the object it's implemented for,
which is then used in the generated source code as identifier. As the
name shoule be unique, a macro called `name` is introduced. This way
implementations can have unique names without much hassle. Example:

    struct Fp;

    impl ec_gpu::GpuName for Fp {
        fn name() -> String {
            ec_gpu::name!()
        }
    }
@vmx
Copy link
Contributor Author

vmx commented Aug 5, 2022

I've squashed it into a single commit with an updated commit message.

@dignifiedquire it would be great if you could also review this code, as you are the maintainer of blstrs.

@DrPeterVanNostrand if you could have a look if I got the comments right, it would be great.

@str4d thanks once again for the thorough review, I'm really happy with the outcome.

Copy link
Contributor

@dignifiedquire dignifiedquire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants