Proc macro tweaks #97004
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
Best reviewed one commit at a time.
Apart from the two comments I left everything looks fine.
Same as @bjorn3, primary concern is `Closure<'a` needing its lifetime, and there's a typo in a comment.
-pub struct Buffer<T: Copy> {
-    data: *mut T,
+pub struct Buffer {
+    data: *mut u8,
For the record, I think the reason I made this generic was to avoid accidentally doing the wrong thing because the types happened to match up (especially since there's `unsafe` code here) but I'll review this carefully and hopefully this is fine.
Would any type other than `u8` make sense here? Doesn't seem like it, especially given that `Buffer<u8>` was hardcoded in a bunch of places anyway.
That's not what I meant, but rather that writing parametric code can sometimes help avoid making mistakes (which can be catastrophic in `unsafe` code).
It doesn't make sense at all from the perspective of whether e.g. a C++ container might be templated or not, but it does in Rust because of the type-checking in the generic form.
Anyway it doesn't matter much, I was just explaining that `Foo<T>` can still make sense even if it's only ever used as a single `Foo<Concrete>` and the generic nature "isn't taken advantage of".
Hopefully we can remove this `Buffer` abstraction and move more to e.g. a model closer to read/write syscalls (or `io::{Read,Write}` traits I suppose, at a higher level).
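To illustrate the point about parametric code, here is a minimal sketch of a generic raw buffer. `RawBuf` and its methods are hypothetical names invented for this example and are not the bridge's actual `Buffer`. Because every signature is written in terms of `T`, the compiler rejects any attempt to hand the pointer back as a different element type, even if the code is only ever instantiated at `u8`.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical stand-in for a raw buffer; NOT the real `bridge::Buffer`.
pub struct RawBuf<T: Copy> {
    data: *mut T,
    len: usize,
    capacity: usize,
}

impl<T: Copy> RawBuf<T> {
    /// Takes ownership of the vector's allocation without freeing it.
    pub fn from_vec(v: Vec<T>) -> Self {
        let mut v = ManuallyDrop::new(v);
        RawBuf { data: v.as_mut_ptr(), len: v.len(), capacity: v.capacity() }
    }

    /// Gives the allocation back as a `Vec<T>`. The return type can only be
    /// `Vec<T>`: returning e.g. `Vec<u16>` from a `RawBuf<u8>` does not type-check.
    pub fn into_vec(self) -> Vec<T> {
        // Suppress `Drop` so the allocation is not freed twice.
        let this = ManuallyDrop::new(self);
        // SAFETY: the three fields were produced by `from_vec` from a live
        // `Vec<T>` and are never modified afterwards.
        unsafe { Vec::from_raw_parts(this.data, this.len, this.capacity) }
    }
}

impl<T: Copy> Drop for RawBuf<T> {
    fn drop(&mut self) {
        // SAFETY: same invariant as in `into_vec`; this runs only when
        // `into_vec` was never called on this value.
        unsafe { drop(Vec::from_raw_parts(self.data, self.len, self.capacity)) }
    }
}
```

A round trip such as `RawBuf::from_vec(vec![1u8, 2, 3]).into_vec()` yields the original vector. A non-generic version hardcoded to `u8` would behave the same at run time; the difference is only in how much the type checker can catch while the `unsafe` code is being written.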
-// to avoid borrow conflicts from borrows started by `&mut` arguments.
+// to match the ordering in `reverse_decode`.
I wrote a big comment I just had to remove, heh, until I figured out what was meant by this change.
So yeah it is about the borrow-checker but only directly so for `reverse_decode` and then `reverse_encode` has to play along.
I would still maybe add "(which is forced by borrowck)" or something at the end of the comment just to not make it look like the order is arbitrary.
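The ordering contract discussed here can be sketched in isolation: if the decoder reads arguments last-to-first, the encoder has to write them last-to-first as well. The helper names below (`put`, `get`, `encode_args_reversed`, `decode_args_reversed`) and the length-prefixed wire format are assumptions made for this example; they are not the bridge's macros or its encoding, and the sketch does not reproduce the borrow-checker constraint itself.

```rust
/// Hypothetical helper: writes one length-prefixed byte string.
fn put(buf: &mut Vec<u8>, bytes: &[u8]) {
    assert!(bytes.len() <= u8::MAX as usize, "sketch only supports short arguments");
    buf.push(bytes.len() as u8);
    buf.extend_from_slice(bytes);
}

/// Hypothetical helper: reads one length-prefixed byte string and advances `input`.
fn get<'a>(input: &mut &'a [u8]) -> &'a [u8] {
    let data: &'a [u8] = *input;
    let len = data[0] as usize;
    let (head, tail) = data[1..].split_at(len);
    *input = tail;
    head
}

/// Encodes the arguments last-to-first, mirroring the decoder below.
pub fn encode_args_reversed(args: &[&str]) -> Vec<u8> {
    let mut buf = Vec::new();
    for arg in args.iter().rev() {
        put(&mut buf, arg.as_bytes());
    }
    buf
}

/// Decodes `count` arguments; the first one read is the LAST declared argument.
pub fn decode_args_reversed(mut input: &[u8], count: usize) -> Vec<String> {
    let mut out = Vec::with_capacity(count);
    for _ in 0..count {
        let bytes = get(&mut input).to_vec();
        out.push(String::from_utf8(bytes).expect("argument was not valid UTF-8"));
    }
    out.reverse(); // restore declaration order for the caller
    out
}
```

If the encoder wrote the arguments in declaration order instead, the decoder would silently assign each value to the wrong parameter, which is why the encode order is dictated by the decode order rather than being arbitrary.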
"to match the ordering in `reverse_decode`" was precisely my attempt to make it clear that the order is not arbitrary. I'm happy to hear alternative wording suggestions.
6166758 to 4e7fe1a
I have addressed the comments, and added a small new commit "Rename |
I added another commit: "Inline and remove |
fe001db to e718c47
I have added a few more commits, including "Move HandleStore into server.rs" which fixes a couple of |
LGTM, modulo some comments, and excluding the "inline `Bridge::enter`" commit - having a really hard time reviewing it with the `s/b/buf` change thrown in, can that be its own commit?
-/// Declare an associated item of one of the traits below, optionally
-/// adjusting it (i.e., adding bounds to types and default bodies to methods).
-macro_rules! associated_item {
-    (type FreeFunctions) => (type FreeFunctions: 'static;);
-    (type TokenStream) => (type TokenStream: 'static + Clone;);
-    (type TokenStreamBuilder) => (type TokenStreamBuilder: 'static;);
-    (type TokenStreamIter) => (type TokenStreamIter: 'static + Clone;);
-    (type Group) => (type Group: 'static + Clone;);
-    (type Punct) => (type Punct: 'static + Copy + Eq + Hash;);
-    (type Ident) => (type Ident: 'static + Copy + Eq + Hash;);
-    (type Literal) => (type Literal: 'static + Clone;);
-    (type SourceFile) => (type SourceFile: 'static + Clone;);
-    (type MultiSpan) => (type MultiSpan: 'static;);
-    (type Diagnostic) => (type Diagnostic: 'static;);
-    (type Span) => (type Span: 'static + Copy + Eq + Hash;);
+pub trait Types {
+    type FreeFunctions: 'static;
+    type TokenStream: 'static + Clone;
+    type TokenStreamBuilder: 'static;
+    type TokenStreamIter: 'static + Clone;
+    type Group: 'static + Clone;
+    type Punct: 'static + Copy + Eq + Hash;
+    type Ident: 'static + Copy + Eq + Hash;
+    type Literal: 'static + Clone;
+    type SourceFile: 'static + Clone;
+    type MultiSpan: 'static;
+    type Diagnostic: 'static;
+    type Span: 'static + Copy + Eq + Hash;
+}
It would really be nice if this info was in `with_api!` instead, it just sucks that it would have to look like `: (...);` instead of `: ...;` to be easily digestible by proc macros.
I thought about adding these bounds to `with_api_types!` so that this trait could be a `with_api_types` invocation and decided against it.
I understand the appeal of DRY and the platonic goal of encoding all of the API information in a single macro so that API changes (e.g. adding new types and/or methods) only require modifying a single code location. But I don't think that's so important. If adding a new type and/or method requires modifying two or three places (which the compiler will point out for you) it's not that bad.
What is bad, and what I have had to struggle with for a couple of days now, is how hard it is to read really macro-heavy code. I have found this bridge code hard to read. I find it hard to do macro substitution in my head. I've written out parts of the expansions of several macro invocations in a separate text file just to see what they look like. I'm contemplating putting such example expansions into comments to help future readers of the code (including me). And this was a case where macro invocation would have saved very little in terms of code length.
library/proc_macro/src/bridge/mod.rs
Outdated
+// Similar to `with_api`, but only lists the types, and they are divided into
+// the two storage categories.
+macro_rules! with_api_types {
+    ($m:ident) => {
+        $m! {
+            'owned:
+                FreeFunctions,
+                TokenStream,
+                TokenStreamBuilder,
+                TokenStreamIter,
+                Group,
+                Literal,
+                SourceFile,
+                MultiSpan,
+                Diagnostic,
+
+            'interned:
+                Punct,
+                Ident,
+                Span,
+        }
+    };
+}
This isn't great and it would be useful to have this information either in `with_api!` entirely or not at all there.
I think it's a clear improvement over the existing code. E.g. this macro is right next to `with_api!`, previously this information was in the `define_handles!` macro in a different file.
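Since the thread is about how hard callback-style macros are to expand mentally, here is a small self-contained sketch of the pattern: one macro lists the type names in two categories and passes them to whichever macro is named as its argument. `with_demo_types!`, `define_counters!`, and `DemoCounters` are hypothetical names made up for this example, the type list is shortened, and starting the counters at 1 is an assumption of the sketch.

```rust
use std::sync::atomic::AtomicUsize;

// Consumer macro: receives the two category lists and generates one
// counter field per listed type name.
macro_rules! define_counters {
    ('owned: $($oty:ident,)* 'interned: $($ity:ident,)*) => {
        #[allow(non_snake_case)]
        pub struct DemoCounters {
            $(pub $oty: AtomicUsize,)*
            $(pub $ity: AtomicUsize,)*
        }

        impl DemoCounters {
            pub const fn new() -> Self {
                DemoCounters {
                    $($oty: AtomicUsize::new(1),)*
                    $($ity: AtomicUsize::new(1),)*
                }
            }
        }
    };
}

// Callback-style macro: holds the list in one place and forwards it
// to the macro whose name is passed in as `$m`.
macro_rules! with_demo_types {
    ($m:ident) => {
        $m! {
            'owned:
                TokenStream,
                Group,

            'interned:
                Ident,
                Span,
        }
    };
}

// Expands to a `DemoCounters` struct with the fields
// `TokenStream`, `Group`, `Ident` and `Span`.
with_demo_types!(define_counters);
```

Written out by hand, the invocation on the last line is equivalent to a struct with four `AtomicUsize` fields plus a `new` constructor, which is the kind of example expansion the comment above suggests putting into code comments.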
-pub struct HandleCounters {
-    $($oty: AtomicUsize,)*
-    $($ity: AtomicUsize,)*
+pub(super) struct HandleCounters {
+    $(pub(super) $oty: AtomicUsize,)*
+    $(pub(super) $ity: AtomicUsize,)*
I don't think the fields being public is acceptable. The `FIXME` you went for is likely tied to the fact that it's impossible to have `HandleStore::new` defined anywhere other than in this module.
You could maybe move everything except `HandleStore::new`?
Why is it not acceptable? The new division seems better: all the `HandleStore` stuff is in `server.rs`, where it belongs. `HandleStore` does need access to the fields of `HandleCounters`, but that makes sense given that `HandleStore` is layered on top of `HandleCounters`.
(Why is `HandleCounters` on the client side, BTW? Is it so that each proc macro gets its own separate set of counters? Would there be a problem if it was on the server side and all proc macros shared one set of counters?)
Some good news: your comment also made me realize that there was room for improvement in that last commit. I have removed the `pub(super)` markers on `HandleStore`, `HandleStore`'s fields, `HandleStore::new`, `MarkedTypes`. I have also removed all the `server::` qualifiers from the code moved into `server.rs`. Thanks for that!
> Why is `HandleCounters` on the client side, BTW?
I just tried moving it into `server.rs` and everything worked fine. All tests pass, and it removed the need for the `pub(super)` markings.
Maybe there's a reason for having it in `client.rs`, but there's no apparent explanation in either the code or the test suite. So I've pushed the commit to this PR in case it's usable.
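The layering described in this thread, a store built on top of a shared counter, can be sketched roughly as follows. `DemoStore` is a hypothetical type for illustration and is not the bridge's `HandleStore`; the sketch only shows why the location of the counter matters: every store that shares a counter draws from the same sequence, so a handle is never issued twice among them.

```rust
use std::collections::BTreeMap;
use std::num::NonZeroU32;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical store of owned values, keyed by non-zero integer handles.
pub struct DemoStore<T> {
    // Shared source of fresh handle numbers; assumed to start at 1.
    counter: &'static AtomicUsize,
    data: BTreeMap<NonZeroU32, T>,
}

impl<T> DemoStore<T> {
    pub fn new(counter: &'static AtomicUsize) -> Self {
        DemoStore { counter, data: BTreeMap::new() }
    }

    /// Stores `value` and returns a handle that no store sharing this
    /// counter has issued before.
    pub fn alloc(&mut self, value: T) -> NonZeroU32 {
        let n = self.counter.fetch_add(1, Ordering::SeqCst);
        // Sketch-level overflow handling: a zero handle is treated as fatal.
        let handle = NonZeroU32::new(n as u32).expect("handle counter overflowed");
        assert!(self.data.insert(handle, value).is_none());
        handle
    }

    /// Removes and returns the value behind `handle`; panics on a stale handle.
    pub fn take(&mut self, handle: NonZeroU32) -> T {
        self.data.remove(&handle).expect("use of a stale or foreign handle")
    }
}
```

With one `static` counter shared by two stores, the first `alloc` on either store returns handle 1 and the next returns handle 2, regardless of which store is asked. With a separate counter per store, both would start at 1, and a handle from one store could be mistaken for a valid handle in the other.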
e718c47 to cf47a51
Can you split these commits off into a separate PR?
- Inline and remove `Bridge::enter`.
- Move `HandleStore` into `server.rs`.
- Move counters into `server.rs`.
(r=me on the 12 other commits at the time of this writing, ideally this will land even if I forget about it for another week)
Everything else is small and largely uncontroversial tweaks, and I think this PR has grown past its original scope, and it's hard for me to reason directly about it (e.g. there's FIXMEs I left in the code that seem to lack the precise reason I couldn't do that specific change originally).
Replying to this (#97004 (comment)) here so that it doesn't get lost in review comments.
Just to be clear, the code compiling and the tests passing doesn't mean much about failure modes - we generally (at most) have tests for how the proc macro API is meant to be used, and maybe some high-level errors like trying to create invalid tokens, not low-level abuse (which can be tricky to test, or at the very least require a lot of scaffolding and the test itself may become brittle), or even internal properties of the code that don't affect its behavior (not really testable).
(For example, you can't break compilation or tests by making implementation details of
The reason for client-side
(Apologies for this sort of scenario not being described in code comments, it took a long time to get that code landed so it's possible near the end I didn't even remember myself in great detail every decision)
There is, however, a simplification that can be made: the server can account for never reusing handles if it's the only server that is talking to that "in-memory" instance of the proc macro dylib. This would always be true for process/wasm isolation (where the
Even for the in-process
So maybe all we need is the client to remember the
This has the nice added benefit of removing one of the blockers for any kind of proc macro isolation that doesn't share memory between the client and the server, but I don't think it should be done in this PR (feel free to reference this comment or quote parts of it wherever such an endeavor ends up).
cc @mystor
`u8` is the only type that makes sense for `T`, as demonstrated by the fact that several impls and functions are hardwired to `Buffer<u8>`.
This gives the more obvious derive/attr/bang distinction, and reduces code size slightly.
Similar to the existing `AttrProcMacro` trait.
So it matches the existing `AttrProcMacro` and `BangProcMacro` types.
`reverse_encode` isn't necessary to please the borrow checker, it's to match the ordering done by `reverse_decode`.
There is some non-obvious information required to understand them.
#97445 is the follow-up.
That will break running multiple rustc sessions in the same process. The proc macro will keep being loaded, but rustc will forget all state (or at least it should).
☀️ Test successful - checks-actions
Finished benchmarking commit (f558990): comparison url.
[Result tables: instruction count, max RSS (memory usage), cycles]
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression
Surprising perf regressions. I'll take a look on Monday.
@bjorn3 Just as the proc macro will remain loaded, so will
proc_macro: don't pass a client-side function pointer through the server.

Before this PR, `proc_macro::bridge::Client<F>` contained both:
* the C ABI entry-point `run`, that the server can call to start the client
* some "payload" `f: F` passed to that entry-point
  * in practice, this was always a (client-side Rust ABI) `fn` pointer to the actual function the proc macro author wrote, i.e. `#[proc_macro] fn foo(input: TokenStream) -> TokenStream`

In other words, the client was passing one of its (Rust) `fn` pointers to the server, which was passing it back to the client, for the client to call (see later below for why that was ever needed).

I was inspired by `@nnethercote's` attempt to remove the `get_handle_counters` field from `Client` (see rust-lang#97004 (comment)), which combined with removing the `f` ("payload") field, could theoretically allow for a `#[repr(transparent)]` `Client` that mostly just newtypes the C ABI entry-point `fn` pointer <sub>(and in the context of e.g. wasm isolation, that's *all* you want, since you can reason about it from outside the wasm VM, as just a 32-bit "function table index", that you can pass to the wasm VM to call that function)</sub>.

<hr/>

So this PR removes that "payload".

But it's not a simple refactor: the reason the field existed in the first place is because monomorphizing over a function type doesn't let you call the function without having a value of that type, because function types don't implement anything like `Default`, i.e.:
```rust
extern "C" fn ffi_wrapper<A, R, F: Fn(A) -> R>(arg: A) -> R {
    let f: F = ???; // no way to get a value of `F`
    f(arg)
}
```
That could be solved with something like this, if it was allowed:
```rust
extern "C" fn ffi_wrapper<
    A, R,
    F: Fn(A) -> R,
    const f: F // not allowed because the type is a generic param
>(arg: A) -> R {
    f(arg)
}
```
Instead, this PR contains a workaround in `proc_macro::bridge::selfless_reify` (see its module-level comment for more details) that can provide something similar to the `ffi_wrapper` example above, but limited to `F` being `Copy` and ZST (and requiring an `F` value to prove the caller actually can create values of `F` and it's not uninhabited or some other unsound situation).

<hr/>

Hopefully this time we don't have a performance regression, and this has a chance to land. cc `@mystor` `@bjorn3`
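The workaround described above can be sketched in a few lines. This is a simplified illustration under the constraints the description states (`F` is `Copy`, zero-sized, and the caller supplies a value as proof that `F` is inhabited); the name `reify_zst` is hypothetical, and the actual `selfless_reify` module may differ in its details.

```rust
use std::mem;

/// Hypothetical helper: turns a zero-sized `Copy` callable into a plain
/// `extern "C"` function pointer that carries no payload.
pub fn reify_zst<A, R, F: Fn(A) -> R + Copy>(f: F) -> extern "C" fn(A) -> R {
    // Only zero-sized callables (e.g. non-capturing closures, fn items)
    // can be recreated from nothing.
    assert!(mem::size_of::<F>() == 0, "reify_zst: callable must be zero-sized");

    extern "C" fn wrapper<A, R, F: Fn(A) -> R + Copy>(arg: A) -> R {
        // SAFETY: `wrapper::<A, R, F>` is only ever returned by `reify_zst`,
        // which checked that `F` is zero-sized and received a value of `F`
        // (so `F` is inhabited). `F: Copy` rules out a `Drop` impl, so
        // creating another instance has no observable effect.
        let f: F = unsafe { mem::MaybeUninit::<F>::uninit().assume_init() };
        f(arg)
    }

    // The value itself is not needed; having it is the proof of inhabitedness.
    let _proof = f;
    wrapper::<A, R, F>
}
```

For example, `reify_zst(|x: u32| x + 1)` returns an `extern "C" fn(u32) -> u32` that can be called without the closure value, whereas a closure capturing a variable would fail the size assertion at run time.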
Local measurements indicate that "Make `Buffer` non-generic." is responsible for the perf regression, that's annoying.
It was made non-generic in rust-lang#97004, but that (surprisingly) caused a mild performance regression.
#97539 is the follow-up for the perf regression.
Visiting for weekly performance triage. I won't mark this as triaged quite yet, since PR #97539 has not yet landed, but it certainly sounds like it is under control. Thanks @nnethercote!
This fixes a performance regression caused by making `Buffer` non-generic in rust-lang#97004.
…ods, r=eddyb Inline `bridge::Buffer` methods. This fixes a performance regression caused by making `Buffer` non-generic in rust-lang#97004. r? `@eddyb`
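One plausible reading of why the fix works, stated here as an assumption rather than something the thread confirms: methods of a generic type are instantiated in the crate that uses them and are therefore available for inlining there, while methods of a non-generic type are usually only inlined across crates when marked `#[inline]`. The sketch below shows that kind of annotation on a hypothetical `ByteBuf` type, which is not the real `bridge::Buffer`.

```rust
/// Hypothetical non-generic byte buffer with small, hot methods.
pub struct ByteBuf {
    bytes: Vec<u8>,
}

impl ByteBuf {
    #[inline]
    pub fn new() -> Self {
        ByteBuf { bytes: Vec::new() }
    }

    /// `#[inline]` makes the body available to callers in other crates,
    /// which a generic method would get without the attribute.
    #[inline]
    pub fn push(&mut self, byte: u8) {
        self.bytes.push(byte);
    }

    #[inline]
    pub fn extend_from_slice(&mut self, xs: &[u8]) {
        self.bytes.extend_from_slice(xs);
    }

    #[inline]
    pub fn len(&self) -> usize {
        self.bytes.len()
    }
}
```

The attribute is a hint, so whether a given call is actually inlined is up to the compiler; the measured effect reported in this thread is the only evidence here that it mattered for `Buffer`.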
Various improvements I spotted while looking through the proc macro code.
r? @eddyb