Include Message Descriptor information #336

fdeantoni · 2020-05-25T07:17:33Z

A new trait MessageDescriptor which includes the following info about a message:

package_name which is taken from the package field in the proto file;
message_name which is the message name itself;
type_url which is "type.googleapis.com/" + package_name + "." + message_name;

The prefix "type.googleapis.com" is the default prefix for all protobuf types (see here).

Adding the MessageDescriptor trait will help with the packing and unpacking of messages in the Any type, but may be helpful in other situations as well, and can easily be extended to include more proto information in future (e.g. source file etc).

This pull request can address #277 completely as it will allow an external crate to provide all the functionalities without polluting the prost crate with additional dependencies. For an external crate that provides all the functionality described in #277, see prost-wkt.

A new trait `MessageProto` has been added which includes the following info about a message: - `package_name` which is taken from the `package` field in the proto file; - `message_name` which is the message name itself; - `type_url` which is "type.googleapis.com/" + package_name + "." + message_name; The prefix "type.googleapis.com" is the default prefix for all protobuf types (see [here](https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/Any)). Adding the MessageProto trait will help with the packing and unpacking of messages in the `Any` type, but may be helpful in other situations as well, and can easily be extended to include more proto information in future (e.g. source file etc).

aljoscha · 2020-07-03T13:08:24Z

Is there any update on this PR? I recently stumbled across this because I need to work with Any. I'd really like to use Prost because I think it feels much more idiomatic but right now I'm forced to use protobuf.

teburd · 2020-08-20T16:39:54Z

I'd really like to see this happen, along with perhaps some attribute describing the .proto real field name rather than the more friendly rust one, such as #[prost(package=..., message=..., field=...)] to effectively describe everything needed for prost-derive to possibly generate the reflection API

fdeantoni · 2020-09-16T05:45:48Z

Hi @danburkert , as you suggested in your comment on issue 255, I broke up the requirement into smaller parts. This PR now only implements a single trait with three fields (message_name, package_name, and type_url) for each Message struct, and generates those structs from protobuf messages with a new package macro attribute (to provide the package_name field). It does not pull in any other dependencies, and should be fully backwards compatible with previous releases. Do let me know if this is ok or if you think more work is needed. I really like this library and hope this PR can be included!

# Conflicts: # prost-derive/src/lib.rs # src/message.rs

fdeantoni · 2021-02-09T05:07:16Z

Hi @danburkert, I was wondering if you have plans of adding some kind of message descriptor to Prost in future? Although this PR implements it, I hope any implementation that allows me to retrieve a message type_url can make into the project. Really love Prost and it is the only thing I'm missing from this great library!

jcaden

These changes look good to me. I was wondering if they could be merged as they allow for serializing/Deserializing into/from Any type that is the only missing functionality for us in prost.

haozibi · 2021-02-25T12:39:46Z

WANT MERGE!!!

jcaden · 2021-02-25T14:44:15Z

Whie this is merged, I've managed to do a workaround without forking prost nor prost build. Basically I ask prost-build to left the files in a known directory and then get the path from the file name, using that information a rewrite the generated code adding package information. I also add an attribute to the generated types in order to include some methods. It's not the best solution ever, but at least I can carry on without forking.

The buidl is changed by this function:

/// Generates source from protocol buffer definitions and adds it information about the package
/// That can be used by the proto_package_name macro to generate metadata information
pub fn generate_interface_source<P>(
    out_file: P,
    protos: &[P],
    includes: &[P],
) -> Result<(), Box<dyn std::error::Error>>
where
    P: AsRef<Path>,
{
    let mut out_dir: PathBuf = env::var_os("OUT_DIR")
        .ok_or_else(|| Error::new(ErrorKind::Other, "OUT_DIR environment variable is not set"))
        .map(Into::into)
        .unwrap();

    out_dir.push("generated");

    // Ignoring create_dir error as it can already exist
    let _ = fs::create_dir(&out_dir);

    prost_build::Config::new()
        .out_dir(&out_dir)
        .type_attribute(".", "#[proto_package_name(PACKAGE_NAME)]")
        .compile_protos(protos, includes)?;

    fs::remove_file(&out_file)?;
    let mut generated_output = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(&out_file)?;

    for entry in fs::read_dir(&out_dir)? {
        let entry = entry?;
        let path = entry.path();
        if !path.is_dir() {
            let mut content = String::new();
            File::open(&path).unwrap().read_to_string(&mut content)?;
            let file_name = path.file_stem().unwrap().to_str().unwrap();

            let tokens = file_name.split(".");
            let mut tokens_len = 0;
            for module in tokens {
                writeln!(generated_output, "pub mod {} {{", module)?;
                tokens_len = tokens_len + 1;
            }

            writeln!(
                generated_output,
                "\nuse proto_package_name::proto_package_name;\n"
            )?;

            writeln!(
                generated_output,
                "static PACKAGE_NAME : &str = \"{}\";\n",
                file_name
            )?;

            generated_output.write_all(content.as_bytes())?;

            for _ in 0..tokens_len {
                writeln!(generated_output, "}}")?;
            }
        }
    }

    generated_output.sync_all()?;

    fs::remove_dir_all(&out_dir)?;

    Ok(())
}

The code of the macro looks like this:

#[derive(Debug)]
struct ProtocPackageNameParams {
    module_name_constant: Ident,
}

impl Parse for ProtocPackageNameParams {
    fn parse(input: ParseStream) -> Result<Self> {
        Ok(ProtocPackageNameParams {
            module_name_constant: input.parse()?,
        })
    }
}

#[proc_macro_attribute]
pub fn proto_package_name(attr: TokenStream, item: TokenStream) -> TokenStream {
    let item = parse_macro_input!(item as DeriveInput);
    let struct_name = &item.ident;
    let struct_name_string = quote!(#struct_name).to_string();
    let params = parse_macro_input!(attr as ProtocPackageNameParams);
    let module_name_constant = &params.module_name_constant;

    let item = TokenStream::from(quote!(
        #item impl #struct_name {
            pub fn package_name() -> &'static str{
                &#module_name_constant
            }

            pub fn message_name() -> &'static str {
                #struct_name_string
            }

            pub fn type_url() -> String {
                format!(
                    "type.googleapis.com/{}.{}",
                    Self::package_name(),
                    Self::message_name()
                )
            }
        }
    ));

    item
}

If there is interest I can create a project to implement this workaround.

Khalilw1 · 2021-07-19T07:56:26Z

Hello guys any ETA on this. I really want to use Any without moving to another library (I really like Prost generation)

mkatychev · 2021-07-21T22:47:56Z

@jcaden that looks really interesting, do you mind providing a minimal example of the two snippets in action?

jcaden · 2021-07-22T08:30:58Z

@jcaden that looks really interesting, do you mind providing a minimal example of the two snippets in action?

Actually I found some cases where this is not working as expected when having multiple packages there were some name collisions, not sure if that's the best solution.

teburd · 2021-07-23T14:31:08Z

So my start of this solutions is to effectively include and create a static with the message descriptors

https://github.com/openenergysolutions/openfmb-rs/blob/develop/openfmb-descriptors/src/lib.rs#L9

I then codegen from that crate and have all the metadata available at build and runtime that way.

fdeantoni · 2021-07-27T02:06:30Z

That could work too but I think it is still cleaner to have the descriptor as part of the message via a trait. To me adding a MessageDescriptor trait seems like a relatively safe feature to add, or am I overlooking something important? Really hope the maintainers of Prost can help out and give their feedback...

teburd · 2021-08-02T14:00:06Z

@fdeantoni I'd say implementing the metadata/descriptor bits as functions that pull out data from the static variable is the sanest route, that way you don't end up with circular references which Rust won't like.

The static is just step one, step two might be adding a trait and trait impls being created for each generated message type to get that descriptor data from the static.

fdeantoni · 2021-08-03T00:37:19Z

Thanks for the feedback @BFrog! I'm definitely not against using a static to get the package information, and using a trait impl for each message to get that data. All I want really is just a way to reconstruct the package url from the message :) . However, I figured just placing this data in the trait would be fastest (and safest). In essence the only thing this PR really does is add the following trait:

pub trait MessageDescriptor: Message + 'static {
    fn message_name(&self) -> &'static str;
    fn package_name(&self) -> &'static str;
    fn type_url(&self) -> &'static str;
}

The derive macro then takes care of implementing this trait for each message, for example:

    impl ::prost::MessageDescriptor for Foo {
        fn message_name(&self) -> &'static str {
            "Foo"
        }
        fn package_name(&self) -> &'static str {
            "com.example.package"
        }
        fn type_url(&self) -> &'static str {
            "type.googleapis.com/com.example.package.Foo"
        }
    }

You mention circular references as a potential risk to this implementation though. Can you tell me more how/where this might happen? If there is a weakness in the design then definitely it should be addressed!

teburd · 2021-08-03T21:19:36Z

@fdeantoni As soon as you start looking at the fields of a message you start running into issues

For example the protobuf api itself provides this, https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.message#Message.GetDescriptor.details

A Descriptor provides a way to introspect the fields of a message.

A FieldDescriptor has a method containing_type" call provides the Descriptor for the message type holding that field.

So you have a circular reference in Descriptor <-> FieldDescriptor in the GetDescriptor protobuf introspection API

You don't have to replicate the Descriptor API entirely, but it is really useful, especially when crafting custom code generators that maybe generate new types/functions/etc on top of the existing Message implementations.

fdeantoni · 2021-08-04T01:47:42Z

Ok, but I think that is a much bigger change than what I'm proposing for this PR, which is really a very simple and dumb message descriptor trait containing only the package name and message name (to construct a url). The next step indeed could be adding a static reference to get the field descriptors, but that I think should be a separate PR.

Nonetheless, if the maintainers are willing to go with a much bigger change that implements the full Descriptor API comparable to the cpp version then that's even better, but I was hoping by keeping this PR small and simple (with potential to add features incrementally) it had more chance of being included. Hopefully the maintainers can weigh in here and suggest the best course of action.

neverchanje · 2021-08-18T05:23:31Z

src/message.rs

+pub trait MessageDescriptor: Message + 'static {
+    fn message_name(&self) -> &'static str;
+    fn package_name(&self) -> &'static str;
+    fn type_url(&self) -> &'static str;


We may not need type_url to be included in the trait. Let's take a look on how C++ implements Any::PackFrom:

bool AnyMetadata::PackFrom(Arena* arena, const Message& message, StringPiece type_url_prefix) { type_url_->Set( &::PROTOBUF_NAMESPACE_ID::internal::GetEmptyString(), GetTypeUrl(message.GetDescriptor()->full_name(), type_url_prefix), arena); return message.SerializeToString( value_->Mutable(ArenaStringPtr::EmptyDefault{}, arena)); }

It uses an API called Descriptor::full_name which returns <package>.<message>, and finally generates the type_url via GetTypeUrl, which is hard-coded in the library and can not be extended like a trait.

The type_url was just added for convenience really. If this stops the PR from being merged then I'm all for removing it! A third-party library that implements pack and unpack like prost-wkt can then implement it instead (with the prefix configurable and whatnot).

I really appreciate your work on prost-wkt and I'm actually not one of the prost maintainers, so I couldn't and didn't intend to stop anybody. I left this comment only because I feel a minimum public trait is better and will not lead to incompatibility issues in the future.

Anyway, I'm really looking forward to this PR being merged, then we will not have to use a forked version of prost in prost-wkt any longer. I don't understand why such tiny PR has been blocked so long since last year. 😢

oblique · 2021-10-01T21:34:00Z

prost-derive/src/lib.rs

+            fn type_url(&self) -> &'static str {
+                #type_url
+            }
+        }


I believe the above methods should be associated functions instead. Someone will be able to use Foo::type_url() instead of constructing Foo and then calling the method.

For example this is how pack/unpack to Any will be implemented with associated functions:

fn pack_to_any<M>(msg: M) -> prost_types::Any where M: prost::Message + prost::MessageDescriptor, { prost_types::Any { type_url: M::type_url().to_owned(), value: msg.encode_to_vec(), } } fn unpack_from_any<M>(msg: prost_types::Any) -> Option<M> where M: prost::Message + prost::MessageDescriptor + Default, { if msg.type_url == M::type_url() { Some(M::decode(&msg.value[..]).ok()?) } else { None } }

oblique · 2021-10-01T23:53:31Z

I created a workaround example for packing/unpacking to Any that uses FileDescriptorSet in build.rs: https://github.com/oblique/prost-any-workaround

fdeantoni · 2021-10-25T01:17:01Z

Thanks for the example @oblique. It looks like the prost maintainers will only allow this route to retrieve the package name from the proto files.

For anybody interested, I reworked prost-wkt to use FileDescriptorSets to retrieve the package_name instead and it works ok for my projects. As this is sufficient for my requirements, I will be closing this PR.

Thanks everyone!

fdeantoni mentioned this pull request May 25, 2020

Additional convenience methods to convert from/to Rust standard types. #278

Closed

Various formatting fixes.

4d56eac

fdeantoni mentioned this pull request May 25, 2020

WKT Any support with JSON serialization/deserialization #277

Closed

Renamed MessageProto to MessageDescriptor.

d1f42ab

fdeantoni changed the title ~~Include Message Proto information~~ Include Message Descriptor information May 27, 2020

aljoscha mentioned this pull request Jul 6, 2020

Switch from protobuf crate to prost aljoscha/statefun-rust#4

Open

Merge branch 'master' into meta

9a8445d

fdeantoni added 2 commits September 16, 2020 11:52

Merge branch 'master' of https://github.com/danburkert/prost into meta

a27f565

Merge branch 'meta' of https://github.com/fdeantoni/prost into meta

510cd4d

fdeantoni added 8 commits December 1, 2020 15:59

Merge branch 'master' into meta

fc2bccc

# Conflicts: # prost-derive/src/lib.rs # src/message.rs

Updates due to merge with upstream.

06d6371

Fixed formatting errors.

7bcbed8

Updated well-known-types test.

f8cb1b2

Add static lifetime for generic_derive test.

d42f84d

Add generics to MessageDescriptor.

87c3f3a

Merge branch 'master' into meta

086dcb1

Merge branch 'master' into meta

5d4bd66

jcaden approved these changes Feb 23, 2021

View reviewed changes

Voronar mentioned this pull request Apr 23, 2021

prost-build: keep protobuf's enum variant name as is #207

Open

neverchanje reviewed Aug 18, 2021

View reviewed changes

oblique reviewed Oct 1, 2021

View reviewed changes

fdeantoni added 2 commits October 23, 2021 09:55

Merge branch 'master' into meta.

4989326

Changed visibility of prost_attrs to pub(crate).

319e38f

fdeantoni closed this Oct 25, 2021

rdelfin mentioned this pull request Nov 10, 2021

Add message_path() to Message trait #535

Closed

carols10cents mentioned this pull request Feb 25, 2022

Create a datafusion-proto crate for datafusion protobuf serialization apache/datafusion#1887

Merged

ghost mentioned this pull request Apr 11, 2022

feature request: Encoding and decoding prost_types::Any seamlessly #618

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Include Message Descriptor information #336

Include Message Descriptor information #336

fdeantoni commented May 25, 2020 •

edited

Loading

aljoscha commented Jul 3, 2020

teburd commented Aug 20, 2020 •

edited

Loading

fdeantoni commented Sep 16, 2020

fdeantoni commented Feb 9, 2021

jcaden left a comment

haozibi commented Feb 25, 2021

jcaden commented Feb 25, 2021 •

edited

Loading

Khalilw1 commented Jul 19, 2021

mkatychev commented Jul 21, 2021

jcaden commented Jul 22, 2021

teburd commented Jul 23, 2021

fdeantoni commented Jul 27, 2021

teburd commented Aug 2, 2021

fdeantoni commented Aug 3, 2021

teburd commented Aug 3, 2021 •

edited

Loading

fdeantoni commented Aug 4, 2021

neverchanje Aug 18, 2021 •

edited

Loading

fdeantoni Aug 18, 2021

neverchanje Aug 18, 2021 •

edited

Loading

oblique Oct 1, 2021

oblique commented Oct 1, 2021

fdeantoni commented Oct 25, 2021

Include Message Descriptor information #336

Include Message Descriptor information #336

Conversation

fdeantoni commented May 25, 2020 • edited Loading

aljoscha commented Jul 3, 2020

teburd commented Aug 20, 2020 • edited Loading

fdeantoni commented Sep 16, 2020

fdeantoni commented Feb 9, 2021

jcaden left a comment

Choose a reason for hiding this comment

haozibi commented Feb 25, 2021

jcaden commented Feb 25, 2021 • edited Loading

Khalilw1 commented Jul 19, 2021

mkatychev commented Jul 21, 2021

jcaden commented Jul 22, 2021

teburd commented Jul 23, 2021

fdeantoni commented Jul 27, 2021

teburd commented Aug 2, 2021

fdeantoni commented Aug 3, 2021

teburd commented Aug 3, 2021 • edited Loading

fdeantoni commented Aug 4, 2021

neverchanje Aug 18, 2021 • edited Loading

Choose a reason for hiding this comment

fdeantoni Aug 18, 2021

Choose a reason for hiding this comment

neverchanje Aug 18, 2021 • edited Loading

Choose a reason for hiding this comment

oblique Oct 1, 2021

Choose a reason for hiding this comment

oblique commented Oct 1, 2021

fdeantoni commented Oct 25, 2021

fdeantoni commented May 25, 2020 •

edited

Loading

teburd commented Aug 20, 2020 •

edited

Loading

jcaden commented Feb 25, 2021 •

edited

Loading

teburd commented Aug 3, 2021 •

edited

Loading

neverchanje Aug 18, 2021 •

edited

Loading

neverchanje Aug 18, 2021 •

edited

Loading