-
-
Notifications
You must be signed in to change notification settings - Fork 411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Make JsSymbol
thread-safe
#2539
Conversation
Test262 conformance changes
|
Codecov Report
@@ Coverage Diff @@
## main #2539 +/- ##
==========================================
+ Coverage 49.92% 50.96% +1.03%
==========================================
Files 377 374 -3
Lines 37557 36805 -752
==========================================
+ Hits 18751 18756 +5
+ Misses 18806 18049 -757
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
Benchmark for 1c12fe4Click to view benchmark
|
Benchmark for 37c9a3bClick to view benchmark
|
Some(Self { | ||
// SAFETY: Pointers returned by `Arc::into_raw` must be non-null. | ||
repr: unsafe { Tagged::from_ptr(Arc::into_raw(arc).cast_mut()) }, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks interesting but not sure i understand it.
Could you explain why the JsSymbol holds a Tagged pointer over the Arc? Doesn't the Arc return a pointer anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing about this is that on the previous implementation, a JsSymbol
was 8 bytes in size, making it the same size as an usize
. However, on a previous implementation of the static well-known symbols, I tried to encode statics like so:
enum JsSymbol {
WellKnown(usize),
Runtime(Arc<Inner>)
}
This doubles the size of JsSymbol
to 16 bytes! So, to preserve the original size of 8 bytes, I had to encode the tag itself into the pointer. To do that, I had mainly two options:
- Store an
Arc<Inner>
andtransmute
the tags intoArc<Inner>
to store them. - Store the
NonNull<Inner>
returned byArc::into_raw
, and trivially convertusize
intoNonNull<Inner>
usingptr::invalid
.
The first option seems tempting, but is EXTREMELY hard to maintain; we need to check on every rustc update that the inner representation of Arc
hasn't changed. That's not ideal. And so, I had to go with the second option, which is a bit harder to reason about but just uses APIs provided by Arc
itself, so we don't have to worry about breaking the implementation between rustc version bumps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a JsSymbol was 8 bits in size, making it the same size as an usize
Isn't a usize
32bits on a 32bit machine, 64bits on a 64bit machine? i.e the pointer size? I've not known of a usize
be only 1 byte. https://doc.rust-lang.org/std/primitive.usize.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry! I mistyped. Edited my comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I assume pointers of 64-bit in size in the comment, but also applies to 32-bit pointers because the tag itself is usize
.
Benchmark for 62051deClick to view benchmark
|
Benchmark for 6a33665Click to view benchmark
|
JsSymbol
and JsString
thread-safeJsSymbol
thread-safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, we will take a performance hit switching from Rc to Arc, thats unavoidable, but probbaly explains the +5% on the Symbols (Execution)
benchmark.
As we discussed we probably don't want to take that hit on JsStrings (using atomics) so maybe we can move the penalty into the globalSymbolRegistry.
Benchmark for c3b6ecfClick to view benchmark
|
When I run miri (
@jedel1043 do you get the same result? |
Hm yeah, maybe integrating |
@raskad I did change some code while integrating |
Benchmark for 95cb217Click to view benchmark
|
Benchmark for edb2e74Click to view benchmark
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good :)
d5c1d85
to
a993253
Compare
Benchmark for 6171abcClick to view benchmark
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks perfect to me! :)
bors r+ |
The section about `Symbol` on the [specification](https://tc39.es/ecma262/#sec-ecmascript-language-types-symbol-type) says: > The Symbol type is the set of all non-String values that may be used as the key of an Object property ([6.1.7](https://tc39.es/ecma262/#sec-object-type)). Each possible Symbol value is unique and immutable. Our previous implementation of `JsSymbol` used `Rc` and a thread local `Cell<usize>`. However, this meant that two different symbols in two different threads could share the same hash, making symbols not unique. Also, the [GlobalSymbolRegistry](https://tc39.es/ecma262/#table-globalsymbolregistry-record-fields) is meant to be shared by all realms, including realms that are not in the same thread as the main one; this forces us to replace our current thread local global symbol registry with a thread-safe one that uses `DashMap` for concurrent access. However, the global symbol registry uses `JsString`s as keys and values, which forces us to either use `Vec<u16>` instead (wasteful and needs to allocate to convert to `JsString` on each access) or make `JsString` thread-safe with an atomic counter. For this reason, I implemented the second option. This PR changes the following: - Makes `JsSymbol` thread-safe by using Arc instead of Rc, and making `SYMBOL_HASH_COUNT` an `AtomicU64`. - ~~Makes `JsString` thread-safe by using `AtomicUsize` instead of `Cell<usize>` for its ref count.~~ EDIT: Talked with @jasonwilliams and we decided to use `Box<[u16]>` for the global registry instead, because this won't penalize common usage of `JsString`, which is used a LOT more than `JsSymbol`. - Makes the `GLOBAL_SYMBOL_REGISTRY` truly global, using `DashMap` as our global map that is shared by all threads. - Replaces some thread locals with thread-safe alternatives, such as static arrays and static indices. - Various improvements to all related code for this.
Pull request successfully merged into main. Build succeeded: |
JsSymbol
thread-safeJsSymbol
thread-safe
The section about
Symbol
on the specification says:Our previous implementation of
JsSymbol
usedRc
and a thread localCell<usize>
. However, this meant that two different symbols in two different threads could share the same hash, making symbols not unique.Also, the GlobalSymbolRegistry is meant to be shared by all realms, including realms that are not in the same thread as the main one; this forces us to replace our current thread local global symbol registry with a thread-safe one that uses
DashMap
for concurrent access. However, the global symbol registry usesJsString
s as keys and values, which forces us to either useVec<u16>
instead (wasteful and needs to allocate to convert toJsString
on each access) or makeJsString
thread-safe with an atomic counter. For this reason, I implemented the second option.This PR changes the following:
JsSymbol
thread-safe by using Arc instead of Rc, and makingSYMBOL_HASH_COUNT
anAtomicU64
.MakesEDIT: Talked with @jasonwilliams and we decided to useJsString
thread-safe by usingAtomicUsize
instead ofCell<usize>
for its ref count.Box<[u16]>
for the global registry instead, because this won't penalize common usage ofJsString
, which is used a LOT more thanJsSymbol
.GLOBAL_SYMBOL_REGISTRY
truly global, usingDashMap
as our global map that is shared by all threads.