Remove PATH_MAX restriction from with_nix_path and further improve performance #1656
- Update nix with my PRs: nix-rust/nix#1656, nix-rust/nix#1655
- Use raw pointer manipulation to extract name cache values
- Use a faster rand implementation with better statistical guarantees
- Tune the task queue capacity and consumption rate and get rid of copies in drain (by using a VecDeque)

Signed-off-by: Alex Saveau <[email protected]>
@rtzoeller Alrighty, this one should be ready for review.
src/lib.rs
if self.len() >= PATH_MAX as usize {
    return Err(Errno::ENAMETOOLONG);
// The real PATH_MAX is 4096, but it's statistically unlikely to have a path longer than
// ~300 bytes. See the appendix to get stats for your own machine.
My numbers starting from ~:
min=17
max=368
mean=125
p50=118
p90=185
p99=244
p999=271
stddev=42
stdvar=1733
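Statistics of this shape can be gathered with a short program. The script referenced in the diff comment is not reproduced on this page, so what follows is only a minimal sketch and not the author's script: the names Summary, percentile, summarize, and collect_lengths are hypothetical, and the nearest-rank percentile method is an assumption about how the figures were computed.

```rust
use std::fs;
use std::path::Path;

/// Summary of path lengths in bytes (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct Summary {
    min: usize,
    max: usize,
    mean: f64,
    p50: usize,
    p90: usize,
    p99: usize,
}

/// Nearest-rank percentile (assumed method) over an ascending, non-empty slice.
/// `p` is a whole percentage in 1..=100.
fn percentile(sorted: &[usize], p: usize) -> usize {
    // Ceiling of p * n / 100, done in integers to avoid rounding surprises.
    let rank = (p * sorted.len() + 99) / 100;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Sorts `lengths` in place and summarizes them; returns None for empty input.
fn summarize(lengths: &mut [usize]) -> Option<Summary> {
    if lengths.is_empty() {
        return None;
    }
    lengths.sort_unstable();
    let sum: usize = lengths.iter().sum();
    Some(Summary {
        min: lengths[0],
        max: lengths[lengths.len() - 1],
        mean: sum as f64 / lengths.len() as f64,
        p50: percentile(lengths, 50),
        p90: percentile(lengths, 90),
        p99: percentile(lengths, 99),
    })
}

/// Recursively records the length of every path under `dir`.
/// Unreadable directories and entries are silently skipped.
fn collect_lengths(dir: &Path, out: &mut Vec<usize>) {
    let Ok(entries) = fs::read_dir(dir) else { return };
    for entry in entries.flatten() {
        let path = entry.path();
        out.push(path.as_os_str().len());
        // DirEntry::file_type does not follow symlinks, so links to
        // directories are counted but not descended into.
        if entry.file_type().map(|t| t.is_dir()).unwrap_or(false) {
            collect_lengths(&path, out);
        }
    }
}
```

Calling collect_lengths on a home directory and passing the collected vector to summarize would produce figures comparable to the ones listed above, minus the p999, stddev, and stdvar entries that this sketch leaves out.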
src/lib.rs
//
// By being smaller than a memory page, we also avoid the compiler inserting a probe frame:
// https://docs.rs/compiler_builtins/latest/compiler_builtins/probestack/index.html
const MAX_STACK_ALLOCATION: usize = 512;
Was there a particular reason you chose 512? Choosing the "best" value for MAX_STACK_ALLOCATION has been bothering me.
Considerations include things like:
- Avoiding __rust_probestack for the "common" case.
  - Only supported on x86 and x86_64 today, so let's assume a page is at least 4096 bytes.
  - Our stack usage is dependent on the size of T, but excluding buf is typically fairly low.
- Avoiding heap allocations where possible.
  - Only fall back to with_nix_path_allocating where it is necessary or otherwise justifiable.
- Avoiding using significantly more stack space than previously.
  - PATH_MAX is 1024 on the BSDs, so indiscriminately picking a much larger value isn't appropriate.
Choosing 4032 (as I believe you had done previously?) seems to mostly alleviate the first two points, but is significantly higher than the current stack utilization on the BSDs. On the other hand, choosing 512 addresses the first and third points, but still leads to heap allocations on all platforms for valid path lengths.
IMO 1024 seems like a reasonable compromise. It should fully avoid __rust_probestack for sane values of T. It ensures that we don't get a heap allocation on the BSDs while not increasing the stack usage, and also ensures we won't get a heap allocation on other platforms for reasonable path lengths. We could optionally have different definitions per-platform, but that feels like we're getting into diminishing returns.
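The trade-off described in this comment can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this PR: with_c_path, with_c_path_allocating, and PathError are hypothetical names, the stack buffer is zero-initialized for simplicity, and the 1024 threshold is just the value proposed here.

```rust
use std::ffi::{CStr, CString};

/// Threshold below which the path is copied onto the stack.
/// 1024 mirrors the value proposed in the discussion.
const MAX_STACK_ALLOCATION: usize = 1024;

/// Error type for this sketch only (hypothetical; not the crate's Errno).
#[derive(Debug, PartialEq)]
enum PathError {
    InteriorNul,
}

/// Runs `f` on a NUL-terminated copy of `path`.
/// Short paths use a fixed stack buffer; longer ones fall back to the heap,
/// so no upper bound on the path length is imposed.
fn with_c_path<T>(path: &[u8], f: impl FnOnce(&CStr) -> T) -> Result<T, PathError> {
    if path.len() >= MAX_STACK_ALLOCATION {
        return with_c_path_allocating(path, f);
    }

    // Zeroed buffer: the byte right after the copied path is already NUL.
    let mut buf = [0u8; MAX_STACK_ALLOCATION];
    buf[..path.len()].copy_from_slice(path);

    // Fails if `path` itself contains a NUL byte.
    match CStr::from_bytes_with_nul(&buf[..=path.len()]) {
        Ok(c_path) => Ok(f(c_path)),
        Err(_) => Err(PathError::InteriorNul),
    }
}

/// Heap fallback for paths that do not fit in the stack buffer.
fn with_c_path_allocating<T>(
    path: &[u8],
    f: impl FnOnce(&CStr) -> T,
) -> Result<T, PathError> {
    match CString::new(path) {
        Ok(c_path) => Ok(f(c_path.as_c_str())),
        Err(_) => Err(PathError::InteriorNul),
    }
}
```

With this shape, a call such as with_c_path(b"/tmp/file", |p| p.to_bytes().len()) stays on the stack, while a 3000-byte path takes the allocating branch instead of being rejected with ENAMETOOLONG. Because the buffer is well under a 4096-byte page, the stack frame stays small for reasonable closure return types, which is the property the comment ties to avoiding __rust_probestack.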
Yeah, 512 was pretty arbitrary. I bumped it to 1024, but I'm even happy to go to 2048 if you think that's worth it. Might be interesting to run that histogram script on your machine to see what your longest path is.
@asomers I'm curious for your thoughts on the maximum stack allocation size here.
There really isn't any magic number for what's acceptable for the stack. I don't have any better idea than Supercilex does.
@SUPERCILEX can you add a changelog note indicating that the |
Signed-off-by: Alex Saveau <[email protected]>
Sweet, done!
bors r+
This PR removes the PATH_MAX limitation since that's wrong anyway: https://insanecoding.blogspot.com/2007/11/pathmax-simply-isnt.html
A nice side effect is that this lets us further optimize with_nix_path by having the compiler not insert probe frames, saving us another fat pile of instructions:
New numbers:
Appendix