-
Notifications
You must be signed in to change notification settings - Fork 516
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include Message Descriptor information #336
Conversation
A new trait `MessageProto` has been added which includes the following info about a message: - `package_name` which is taken from the `package` field in the proto file; - `message_name` which is the message name itself; - `type_url` which is "type.googleapis.com/" + package_name + "." + message_name; The prefix "type.googleapis.com" is the default prefix for all protobuf types (see [here](https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/Any)). Adding the MessageProto trait will help with the packing and unpacking of messages in the `Any` type, but may be helpful in other situations as well, and can easily be extended to include more proto information in future (e.g. source file etc).
Is there any update on this PR? I recently stumbled across this because I need to work with |
I'd really like to see this happen, along with perhaps some attribute describing the .proto real field name rather than the more friendly rust one, such as #[prost(package=..., message=..., field=...)] to effectively describe everything needed for prost-derive to possibly generate the reflection API |
Hi @danburkert , as you suggested in your comment on issue 255, I broke up the requirement into smaller parts. This PR now only implements a single trait with three fields ( |
# Conflicts: # prost-derive/src/lib.rs # src/message.rs
Hi @danburkert, I was wondering if you have plans of adding some kind of message descriptor to Prost in future? Although this PR implements it, I hope any implementation that allows me to retrieve a message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look good to me. I was wondering if they could be merged as they allow for serializing/Deserializing into/from Any type that is the only missing functionality for us in prost.
WANT MERGE!!! |
Whie this is merged, I've managed to do a workaround without forking prost nor prost build. Basically I ask prost-build to left the files in a known directory and then get the path from the file name, using that information a rewrite the generated code adding package information. I also add an attribute to the generated types in order to include some methods. It's not the best solution ever, but at least I can carry on without forking. The buidl is changed by this function: /// Generates source from protocol buffer definitions and adds it information about the package
/// That can be used by the proto_package_name macro to generate metadata information
pub fn generate_interface_source<P>(
out_file: P,
protos: &[P],
includes: &[P],
) -> Result<(), Box<dyn std::error::Error>>
where
P: AsRef<Path>,
{
let mut out_dir: PathBuf = env::var_os("OUT_DIR")
.ok_or_else(|| Error::new(ErrorKind::Other, "OUT_DIR environment variable is not set"))
.map(Into::into)
.unwrap();
out_dir.push("generated");
// Ignoring create_dir error as it can already exist
let _ = fs::create_dir(&out_dir);
prost_build::Config::new()
.out_dir(&out_dir)
.type_attribute(".", "#[proto_package_name(PACKAGE_NAME)]")
.compile_protos(protos, includes)?;
fs::remove_file(&out_file)?;
let mut generated_output = OpenOptions::new()
.read(true)
.write(true)
.create(true)
.open(&out_file)?;
for entry in fs::read_dir(&out_dir)? {
let entry = entry?;
let path = entry.path();
if !path.is_dir() {
let mut content = String::new();
File::open(&path).unwrap().read_to_string(&mut content)?;
let file_name = path.file_stem().unwrap().to_str().unwrap();
let tokens = file_name.split(".");
let mut tokens_len = 0;
for module in tokens {
writeln!(generated_output, "pub mod {} {{", module)?;
tokens_len = tokens_len + 1;
}
writeln!(
generated_output,
"\nuse proto_package_name::proto_package_name;\n"
)?;
writeln!(
generated_output,
"static PACKAGE_NAME : &str = \"{}\";\n",
file_name
)?;
generated_output.write_all(content.as_bytes())?;
for _ in 0..tokens_len {
writeln!(generated_output, "}}")?;
}
}
}
generated_output.sync_all()?;
fs::remove_dir_all(&out_dir)?;
Ok(())
} The code of the macro looks like this: #[derive(Debug)]
struct ProtocPackageNameParams {
module_name_constant: Ident,
}
impl Parse for ProtocPackageNameParams {
fn parse(input: ParseStream) -> Result<Self> {
Ok(ProtocPackageNameParams {
module_name_constant: input.parse()?,
})
}
}
#[proc_macro_attribute]
pub fn proto_package_name(attr: TokenStream, item: TokenStream) -> TokenStream {
let item = parse_macro_input!(item as DeriveInput);
let struct_name = &item.ident;
let struct_name_string = quote!(#struct_name).to_string();
let params = parse_macro_input!(attr as ProtocPackageNameParams);
let module_name_constant = ¶ms.module_name_constant;
let item = TokenStream::from(quote!(
#item impl #struct_name {
pub fn package_name() -> &'static str{
&#module_name_constant
}
pub fn message_name() -> &'static str {
#struct_name_string
}
pub fn type_url() -> String {
format!(
"type.googleapis.com/{}.{}",
Self::package_name(),
Self::message_name()
)
}
}
));
item
} If there is interest I can create a project to implement this workaround. |
Hello guys any ETA on this. I really want to use Any without moving to another library (I really like Prost generation) |
@jcaden that looks really interesting, do you mind providing a minimal example of the two snippets in action? |
Actually I found some cases where this is not working as expected when having multiple packages there were some name collisions, not sure if that's the best solution. |
So my start of this solutions is to effectively include and create a static with the message descriptors I then codegen from that crate and have all the metadata available at build and runtime that way. |
That could work too but I think it is still cleaner to have the descriptor as part of the message via a trait. To me adding a |
@fdeantoni I'd say implementing the metadata/descriptor bits as functions that pull out data from the static variable is the sanest route, that way you don't end up with circular references which Rust won't like. The static is just step one, step two might be adding a trait and trait impls being created for each generated message type to get that descriptor data from the static. |
Thanks for the feedback @BFrog! I'm definitely not against using a static to get the package information, and using a trait impl for each message to get that data. All I want really is just a way to reconstruct the package url from the message :) . However, I figured just placing this data in the trait would be fastest (and safest). In essence the only thing this PR really does is add the following trait: pub trait MessageDescriptor: Message + 'static {
fn message_name(&self) -> &'static str;
fn package_name(&self) -> &'static str;
fn type_url(&self) -> &'static str;
} The derive macro then takes care of implementing this trait for each message, for example: impl ::prost::MessageDescriptor for Foo {
fn message_name(&self) -> &'static str {
"Foo"
}
fn package_name(&self) -> &'static str {
"com.example.package"
}
fn type_url(&self) -> &'static str {
"type.googleapis.com/com.example.package.Foo"
}
} You mention circular references as a potential risk to this implementation though. Can you tell me more how/where this might happen? If there is a weakness in the design then definitely it should be addressed! |
@fdeantoni As soon as you start looking at the fields of a message you start running into issues For example the protobuf api itself provides this, https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.message#Message.GetDescriptor.details A Descriptor provides a way to introspect the fields of a message. A FieldDescriptor has a method containing_type" call provides the Descriptor for the message type holding that field. So you have a circular reference in Descriptor <-> FieldDescriptor in the GetDescriptor protobuf introspection API You don't have to replicate the Descriptor API entirely, but it is really useful, especially when crafting custom code generators that maybe generate new types/functions/etc on top of the existing Message implementations. |
Ok, but I think that is a much bigger change than what I'm proposing for this PR, which is really a very simple and dumb message descriptor trait containing only the package name and message name (to construct a url). The next step indeed could be adding a static reference to get the field descriptors, but that I think should be a separate PR. Nonetheless, if the maintainers are willing to go with a much bigger change that implements the full Descriptor API comparable to the cpp version then that's even better, but I was hoping by keeping this PR small and simple (with potential to add features incrementally) it had more chance of being included. Hopefully the maintainers can weigh in here and suggest the best course of action. |
pub trait MessageDescriptor: Message + 'static { | ||
fn message_name(&self) -> &'static str; | ||
fn package_name(&self) -> &'static str; | ||
fn type_url(&self) -> &'static str; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may not need type_url
to be included in the trait. Let's take a look on how C++ implements Any::PackFrom
:
bool AnyMetadata::PackFrom(Arena* arena,
const Message& message,
StringPiece type_url_prefix) {
type_url_->Set(
&::PROTOBUF_NAMESPACE_ID::internal::GetEmptyString(),
GetTypeUrl(message.GetDescriptor()->full_name(), type_url_prefix),
arena);
return message.SerializeToString(
value_->Mutable(ArenaStringPtr::EmptyDefault{}, arena));
}
It uses an API called Descriptor::full_name
which returns <package>.<message>
, and finally generates the type_url via GetTypeUrl
, which is hard-coded in the library and can not be extended like a trait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type_url
was just added for convenience really. If this stops the PR from being merged then I'm all for removing it! A third-party library that implements pack and unpack like prost-wkt can then implement it instead (with the prefix configurable and whatnot).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really appreciate your work on prost-wkt and I'm actually not one of the prost maintainers, so I couldn't and didn't intend to stop anybody. I left this comment only because I feel a minimum public trait is better and will not lead to incompatibility issues in the future.
Anyway, I'm really looking forward to this PR being merged, then we will not have to use a forked version of prost in prost-wkt any longer. I don't understand why such tiny PR has been blocked so long since last year. 😢
fn type_url(&self) -> &'static str { | ||
#type_url | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the above methods should be associated functions instead. Someone will be able to use Foo::type_url()
instead of constructing Foo
and then calling the method.
For example this is how pack/unpack to Any
will be implemented with associated functions:
fn pack_to_any<M>(msg: M) -> prost_types::Any
where
M: prost::Message + prost::MessageDescriptor,
{
prost_types::Any {
type_url: M::type_url().to_owned(),
value: msg.encode_to_vec(),
}
}
fn unpack_from_any<M>(msg: prost_types::Any) -> Option<M>
where
M: prost::Message + prost::MessageDescriptor + Default,
{
if msg.type_url == M::type_url() {
Some(M::decode(&msg.value[..]).ok()?)
} else {
None
}
}
I created a workaround example for packing/unpacking to Any that uses FileDescriptorSet in build.rs: https://github.com/oblique/prost-any-workaround |
Thanks for the example @oblique. It looks like the For anybody interested, I reworked prost-wkt to use Thanks everyone! |
A new trait
MessageDescriptor
which includes the following info about a message:package_name
which is taken from thepackage
field in the proto file;message_name
which is the message name itself;type_url
which is "type.googleapis.com/" + package_name + "." + message_name;The prefix "type.googleapis.com" is the default prefix for all protobuf types (see here).
Adding the MessageDescriptor trait will help with the packing and unpacking of messages in the
Any
type, but may be helpful in other situations as well, and can easily be extended to include more proto information in future (e.g. source file etc).This pull request can address #277 completely as it will allow an external crate to provide all the functionalities without polluting the
prost
crate with additional dependencies. For an external crate that provides all the functionality described in #277, see prost-wkt.