There's been some activity recently around the benchmark suite published by lqd. Those benchmarks are a trove of useful data on how Rust crates are built and where the time is spent.
Looking at the benchmarks, I noticed that in multi-core builds, quite a lot of crates had a bottleneck around serde and serde_derive.
Because serde re-exports serde_derive's macro, the compilation order is always syn -> serde_derive -> serde, sequentially. In the benchmarks, this chain usually accounts for around 9s that Cargo fails to parallelize; rustc can do some parallelizing of its own during the codegen phase, but that still leaves a lot of CPU capacity unused.
Instead I suggest the crate structure should be:
serde
serde_core
serde_derive
With the serde crate being a thin wrapper re-exporting types from the other two. That way the compilation pipeline would be syn -> serde_derive on one side, and serde_core on the other.
Another advantage is that crates could import serde_core instead of serde if they don't care about the macros; that way, they wouldn't be blocked on the compilation of syn + serde even if some other crate in the tree imports serde with the derive feature.
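To make the proposed split concrete, here is a minimal sketch that uses modules as stand-ins for crates. Everything in it is an assumption for illustration: the `Serialize` trait is a toy with a made-up signature (not serde's real trait), `serde_facade` stands in for the thin `serde` wrapper, and the derive macro re-export is only mentioned in a comment because a proc macro cannot live in the same file.

```rust
// Stand-in for the proposed `serde_core` crate: traits and impls only,
// with no dependency on syn or on any proc macro.
mod serde_core {
    // Toy trait for illustration; NOT serde's real `Serialize` signature.
    pub trait Serialize {
        fn serialize(&self) -> String;
    }

    impl Serialize for u32 {
        fn serialize(&self) -> String {
            self.to_string()
        }
    }
}

// Stand-in for the thin `serde` wrapper crate.
mod serde_facade {
    // Re-export the trait so `serde_facade::Serialize` and
    // `serde_core::Serialize` name the same item.
    pub use super::serde_core::Serialize;
    // In the real layout, the derive macro from serde_derive would be
    // re-exported here as well, behind the `derive` feature.
}

// A crate that does not need the macros can depend on the core only.
fn via_core<T: serde_core::Serialize>(value: &T) -> String {
    value.serialize()
}

// A crate that goes through the wrapper sees the very same trait.
fn via_facade<T: serde_facade::Serialize>(value: &T) -> String {
    value.serialize()
}

fn main() {
    println!("{}", via_core(&7u32));
    println!("{}", via_facade(&7u32));
}
```

Since the wrapper only re-exports, a type that implements the trait through one path satisfies bounds written against the other, which is why I'd expect the split to be mostly backwards compatible.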
The benchmarks suggest that doing so could shave 2 or 3 seconds off total build time for multi-core builds. They may not be representative of real projects, though, since they test lib crates, not bins. Overall, the smaller the project, the more likely serde is to be on its critical path, and so the more likely it is to benefit from the refactoring.
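The critical-path argument can be sketched as a small calculation. The per-crate durations below are hypothetical: I only picked them so the current chain adds up to the roughly 9s seen in the benchmarks. The model also assumes unlimited cores, that a crate starts only once all its dependencies are fully built (no pipelining), and that the thin wrapper costs about 0s.

```rust
use std::collections::HashMap;

// Crate name -> (build time in seconds, direct dependencies).
type Graph = HashMap<&'static str, (u32, Vec<&'static str>)>;

// Earliest finish time of `name` with unlimited cores: its own build
// time plus the latest finish time among its dependencies.
fn finish_time(name: &str, graph: &Graph) -> u32 {
    let (duration, deps) = &graph[name];
    let deps_done = deps
        .iter()
        .map(|dep| finish_time(dep, graph))
        .max()
        .unwrap_or(0);
    duration + deps_done
}

// Current layout: syn -> serde_derive -> serde, strictly sequential.
// Durations are hypothetical, chosen to total about 9s.
fn current_layout() -> Graph {
    HashMap::from([
        ("syn", (4, vec![])),
        ("serde_derive", (3, vec!["syn"])),
        ("serde", (2, vec!["serde_derive"])),
    ])
}

// Proposed layout: serde_core builds in parallel with syn -> serde_derive,
// and serde becomes a thin wrapper (assumed to cost about 0s).
fn proposed_layout() -> Graph {
    HashMap::from([
        ("syn", (4, vec![])),
        ("serde_derive", (3, vec!["syn"])),
        ("serde_core", (2, vec![])),
        ("serde", (0, vec!["serde_core", "serde_derive"])),
    ])
}

fn main() {
    let current = current_layout();
    let proposed = proposed_layout();
    println!("current serde ready at {}s", finish_time("serde", &current));
    println!("proposed serde ready at {}s", finish_time("serde", &proposed));
    println!("proposed serde_core ready at {}s", finish_time("serde_core", &proposed));
}
```

With these made-up numbers, serde is ready 2s earlier in the proposed layout, and a crate that only needs serde_core is unblocked as soon as serde_core itself is built. Real savings depend on the actual per-crate timings.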
Possible issues
Backwards compatibility (though it should be fine?)
Documentation: items can either appear as serde::Serialize or serde_core::Serialize. Not sure how much of a problem it actually is.
Thanks for the suggestion! I can see how this would be beneficial, but I think instead of this, I would prefer to pursue Wasm-based precompiled proc macros. serde_derive should take effectively 0 seconds to compile.