Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add prefix-symbols feature #1315

Merged
merged 5 commits into from
Nov 27, 2024
Merged

Add prefix-symbols feature #1315

merged 5 commits into from
Nov 27, 2024

Conversation

xdoardo
Copy link
Contributor

@xdoardo xdoardo commented Nov 25, 2024

Hello again!

This small patch adds a prefix-symbols feature that adds a wasmi_ prefix to all the public symbols that did not have it before, keeping the symbols not mangled. This way, users (that need to use wasmi's c_api implementation with other backends) need not worry about Option<Box<_>>es but can simply use bindgen to take care of it.

I think that this method makes more sense than the other one we ended up merging.

Again, thank you in advance for taking the time to review the PR.

Copy link

codecov bot commented Nov 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.58%. Comparing base (8cc0258) to head (9cf2e4a).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1315   +/-   ##
=======================================
  Coverage   81.58%   81.58%           
=======================================
  Files         306      306           
  Lines       25271    25271           
=======================================
  Hits        20617    20617           
  Misses       4654     4654           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Robbepop
Copy link
Member

Robbepop commented Nov 25, 2024

@xdoardo Thanks for the PR!

This small patch adds a prefix-symbols feature that adds a wasmi_ prefix to all the public symbols that did not have it before, keeping the symbols not mangled.

It is intentional that all Wasm C-API symbols do not start with wasmi_ but with wasm since that's what Wasm C-API headers expose.

Unforunately CI is not yet happy and it seems that both new features for the C-API are conflicting each other which itself conflicts with Rust API design guidelines that crate features must be additive. This change has another downside that it is not reflected in the Wasmi C headers.

How could we best resolve this? 🤔

Does this mean that symbols like wasm_store_delete will now be called wasmi_wasm_store_delete?

@xdoardo
Copy link
Contributor Author

xdoardo commented Nov 25, 2024

Hello again @Robbepop , thanks for the response. Let me give a bit of context so to explain better why and how I came to this PR.

In this PR I add support for multiple backends (or runtimes) to Wasmer: among others, we have support for V8, WAMR and wasmi. All of them are embedded using their implementation of the wasm_c_api standard.

Of course, using the respective pristine builds, the three libraries mentioned above all export the wasm_... symbols, resulting in duplicate symbol errors when trying to link them all together in a single wasmer build. For v8 and wamr I was able to use objcopy to rename these wasm_... symbols giving them a prefix. The reason for this PR is, simply, that the same mechanism - which, I'll admit, is also pretty hacky - I think does not play well when the library being embedded in the binary is a Rust library (i.e. does not play too well with cargo: it's not as easy as downloading the library from an URL and running objcopy on it).

The other nice side of this endeavour is that with bindgen I can add a #[link_name = "wasmi_wasm_..."] attribute to the symbols imported from the header. In practice, the result is that the code generated from bindgen is:

#[link_name = "\u{1}_wasmi_wasm_byte_vec_new_empty"]
pub fn wasm_byte_vec_new_empty(out: *mut wasm_byte_vec_t);

and from the library side I can simply import wasm_byte_vec_new_empty; in turn, this allows the library code calling the backends pretty much identical for all wasm_c_api implementers.

To conclude: #1312 had pretty much the same meaning, at least for my use case. However, without the shims created by bindgen, the result of using a library that was specifically thought to be used by C-like code (and not Rust without bindgen, as in my case) is...not good, at least in my experience: long story short, the result is a vast amount of null checks for raw pointers to be able to cast them back to "normal" rust references (again, at least for my use case).

The next question, then, might be: why not use bindgen to generate the shims and then use #[link_name = ...] to link to mangled wasmi_c_impl symbols? The reason is that to do that I need to have a precise mapping I can compute at build time, that is, I'd need a way to get the mangled name of a Rust function in a build.rs file, which is probably not unfeasible but not ideal.

How could we best resolve this? 🤔

If I had to choose, I think that if anyone needs to use the wasmi_c_impl bindings with the caveat of not wanting the "plain" wasm_c_api symbol names (i.e. wasm_store_delete), it'd be better to have the "static" wasmi_wasm_store_delete rather than __ZN96_$LT$wasm_store_delete$GT$17hdfa21c4755fac718E (not a real symbol, only represents Rust's mangling scheme).

Does this mean that symbols like wasm_store_delete will now be called wasmi_wasm_store_delete?

Yes, only with the prefix-symbols feature enabled, of course.

--

This comment has quite a bit of technical details which I tried to report briefly but in a meaningful manner. Of course, if something is not clear or, even better, if I missed some obvious and easier way to do what I'm trying to do without jumping through all these hoops, I'm happy to provide more context.

@Robbepop
Copy link
Member

Robbepop commented Nov 26, 2024

Okay, so this means before we merge this PR we have to revert the other one that enables Rust name mangling conditionally. Both features conflict with each other and if this feature is superior then we only want this solution.

In order to reflect this feature in the C-API headers we probably should also add this feature there so that users can add a prefix to the C headers conditionally. I looked up how this could be done and it is kinda a lot of work but this is roughly it:

#ifndef WASMI_NAME_PREFIX
#define WASMI_NAME_PREFIX // by default: no name prefix
#endif

#define CONCAT(a, b) a##b
#define EXPORT(name) CONCAT(WASMI_NAME_PREFIX, name)

void EXPORT(wasm_store_delete)(void);
// etc..

C user code:

#define WASMI_NAME_PREFIX wasmi_
#include "wasm.h"

void foo(void) {
    wasmi_wasm_store_delete();
}

Though, I am very uncertain if I really want to change the C-API headers which are a given by the Wasm spec standard. This probably is a really bad idea.

@xdoardo
Copy link
Contributor Author

xdoardo commented Nov 26, 2024

Both features conflict with each other and if this feature is superior then we only want this solution.

I agree. I didn't remove the previous feature in this PR because I wanted to come to a consensus before doing it: my previous PR was, in hindsight, not a good solution to the problem at hand. Sorry for that.

Though, I am very uncertain if I really want to change the C-API headers which are a given by the Wasm spec standard. This probably is a really bad idea.

I agree. My thought here is that the asymmetry between the symbols exported in the library and the symbols declared in the C-API header can make a lot of sense or no sense at all depending on the use case. In my case it does, as I can simply copy the code for another C-API implementer, change include!(".../wamr_bindings.rs"); to include!(".../wasmi_bindings.rs"); and have a working wasmi-backed backend, given that I add the link_name attributes while generating shims with bindgen. On the other hand, I also understand that other users might expect the symbols exported in the C-API header to be the same symbols exported from the library, without any asymmetry.

The choice, here, is between changing a standard API or expect users to know what they're doing when enabling this feature (the third option being removing both symbols-related features). We should also take into account that the current mangle-symbols feature already induces an asymmetry between the symbols in the header and those in the library, with the difference that the symbol exposed in the library are pretty much unknown to users (again, something like __ZN96_$LT$wasm_store_delete$GT$17hdfa21c4755fac718E). I think that the second option, that is giving users a bit of responsibility with this feature, is the best as it allows new use-cases and does not change the standard API.

@Robbepop
Copy link
Member

We should also take into account that the current mangle-symbols feature already induces an asymmetry between the symbols in the header and those in the library [..]

Good point. I think it would be fine if we revert your other PR and add docs that describe how to use this new prefix feature without changing our C headers. This way users will understand why it exists and how to make use of the same solution that you used to make use of it.

@xdoardo
Copy link
Contributor Author

xdoardo commented Nov 26, 2024

Uhm, I'm not sure why Build in CI is failing. I don't see any difference between that step and the build (all features) steps in the various Test (*-latest) jobs. Any clue?

@Robbepop
Copy link
Member

Uhm, I'm not sure why Build in CI is failing. I don't see any difference between that step and the build (all features) steps in the various Test (*-latest) jobs. Any clue?

You can view the logs here:

The build failure is weird, I don't know why it fails.

Would you be so kind and separate out the revert of the other PR into another PR? This would make this PR simpler to review.

@xdoardo xdoardo force-pushed the main branch 4 times, most recently from 59f37be to 4728ac8 Compare November 27, 2024 10:18
crates/c_api/macro/lib.rs Outdated Show resolved Hide resolved
Copy link
Member

@Robbepop Robbepop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the clippy CI job here: https://github.com/wasmi-labs/wasmi/actions/runs/12049971351/job/33597861395?pr=1315#step:5:18

Then we should be fine with merging this.
Thanks a lot!

Comment on lines 162 to 196
#[proc_macro_attribute]
pub fn prefix_symbol(
_attributes: proc_macro::TokenStream,
input: proc_macro::TokenStream,
) -> proc_macro::TokenStream {
let mut stream = TokenStream::from(input.clone()).into_iter();

let mut fn_name = None;

while let Some(token) = stream.next() {
match token {
TokenTree::Ident(ref ident) if ident.to_string() == "fn" => {
if let Some(TokenTree::Ident(ident_name)) = stream.next() {
fn_name = Some(ident_name.to_string());
break;
}
}
_ => continue,
}
}

if fn_name.is_none() {
panic!("expected a valid Rust function definition, but it does not appear in: {input:?}");
}

let prefixed_fn_name = format!("wasmi_{}", fn_name.unwrap());

let mut attr: proc_macro::TokenStream = quote! {
#[cfg_attr(feature = "prefix-symbols", export_name = #prefixed_fn_name)]
}
.into();
attr.extend(input);

attr
}
Copy link
Member

@Robbepop Robbepop Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I think this is fine for merging even though it does not really check for function signatures. What I had in my mind is to perform shallow parsing similar to syn but overly specific to our use case as to not have to use syn as dependency. Shouldn't be too much work but I am willing to implement this as a follow-up PR so we can get this PR merged as is.

@Robbepop
Copy link
Member

@xdoardo Could you please remove this massive o file?

crates/c_api/macro/lib.rs Outdated Show resolved Hide resolved
@xdoardo
Copy link
Contributor Author

xdoardo commented Nov 27, 2024

Could you please remove this massive o file?

Yep, sorry, my bad!

crates/c_api/src/extern.rs Outdated Show resolved Hide resolved
crates/c_api/src/extern.rs Outdated Show resolved Hide resolved
crates/c_api/src/frame.rs Outdated Show resolved Hide resolved
crates/c_api/src/func.rs Outdated Show resolved Hide resolved
crates/c_api/src/types/extern.rs Outdated Show resolved Hide resolved
crates/c_api/src/types/import.rs Outdated Show resolved Hide resolved
crates/c_api/src/types/memory.rs Outdated Show resolved Hide resolved
crates/c_api/src/types/table.rs Outdated Show resolved Hide resolved
Copy link
Member

@Robbepop Robbepop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for all the work you've put into this!
LGTM

@Robbepop Robbepop merged commit 4abde9c into wasmi-labs:main Nov 27, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants