Normative: Strengthen Atomics.wait/notify synchronization to the level of other Atomics operations #1127

conrad-watt · 2018-03-05T01:34:10Z

See issue #1119 for background on the current behaviour.

This proposal strengthens the synchronization between a waking thread and its wakees as though they were each a coinciding atomic write/read (respectively) to the same memory location, in line with some previous comments by @lars-t-hansen on the informal language of an earlier draft of the memory model proposal.

Since each waker/wakee pair is determined purely operationally, there is no need to struggle to reassociate the events from scratch in the axiomatic semantics (a la the ReadsBytesFrom relation for Shared Data Block events). Instead, each agent tracks which waking events it is responsible for in a new [[AgentSynchronizesWith]] field of its Events Record, which is then unified with the global synchronizes-with relation.
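
To make the bookkeeping above concrete, here is a minimal JavaScript model of it. This is a sketch, not spec text: the record shape, the `recordWake` helper and the string event names are hypothetical, and only the field names [[AgentSynchronizesWith]], [[EventList]], [[EventsRecords]] and [[HostSynchronizesWith]] come from the PR. Pairs contributed by Shared Data Block events are deliberately left out.

```javascript
// Toy model: each agent's Events Record keeps its own list of
// (waker event, wakee event) pairs, appended operationally.
function makeEventsRecord(agentSignifier) {
  return { agentSignifier, eventList: [], agentSynchronizesWith: [] };
}

// Hypothetical helper: called on the waking agent's record when it wakes a waiter.
function recordWake(wakerRecord, wakerEvent, wakeeEvent) {
  wakerRecord.eventList.push(wakerEvent);
  wakerRecord.agentSynchronizesWith.push([wakerEvent, wakeeEvent]);
}

// Axiomatic side (simplified): the global relation includes the host pairs
// and every agent's pairs. Memory-event pairs are omitted from this sketch.
function synchronizesWith(execution) {
  const pairs = [...execution.hostSynchronizesWith];
  for (const eventsRecord of execution.eventsRecords) {
    pairs.push(...eventsRecord.agentSynchronizesWith);
  }
  return pairs;
}

const agent1 = makeEventsRecord('agent-1');
const agent2 = makeEventsRecord('agent-2');
recordWake(agent2, 'wake@agent-2', 'wait@agent-1');

const execution = { eventsRecords: [agent1, agent2], hostSynchronizesWith: [] };
const globalPairs = synchronizesWith(execution);
```

The point of the design is that no search is needed on the axiomatic side: the pairs were fixed when the wake happened, so the global relation only has to collect them.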

I chose to introduce a rather general new "Synchronize" action rather than abuse any of the existing ones, since this spec machinery would be useful to specify a future fork or similar thread spawning operation, or explicit barrier operations, either in ECMAScript or in WebAssembly. This also limits the potential for unfortunate interactions with other parts of the spec which deal with Shared Data Block events.

I also rename the [[EventLists]] property of candidate executions to [[EventsRecords]] since Events Records now contain more than just EventLists, with the added bonus that it makes sentences which formerly read as "Let eventList be the [[EventList]] field of the element in execution.[[EventLists]] whose [[AgentSignifier]] is AgentSignifier()." easier to parse, considering the element of [[EventLists]] in question is actually an Events Record.

Some initial discussion points
1 -
Is "Synchronize" the right name for the new action? I mulled over "Scheduling" or something similar, but as I mentioned earlier these actions could potentially also be used to specify barriers in the future.

2 -
This change should not invalidate any user code - the newly allowed observable behaviour is a strict subset of what was previously allowed.

3 -
A scattering of informative notes throughout the PR may still be required.

4 -
On second thoughts I don't believe this is true
This new specification is still slightly weak. It doesn't incorporate all of the guarantees you might get from an OS barrier in Atomics.wait. For example, an execution for the following program is still allowed where x = 1, y = 1, z = 0.

Line	Thread 1	Thread 2
1	tA[2] = 1	Atomics.store(tA, 0, 1)
2	tA[1] = 1	var x = Atomics.wake(tA, 0, 1)
3	Atomics.wait(tA, 0, 0)	var y = tA[1]
4	-	var z = tA[2]

In a real implementation using OS calls for wake/wait, if x = 1 I'd expect to only observe y = z = 1. However, this certainly isn't a property I'd expect many people to care about or try to rely on. In particular, the execution where x = 0, y = 1, z = 0 must be allowed by the model, so it seems that the only programs that could benefit from a further strengthened model would be rather pathological.

5 -
On x86, this specification is satisfied by implementing Atomics.wait as a regular ECMAScript atomic load followed by a bare Linux wait, and Atomics.wake as a bare Linux wake, but (I believe) it is stronger than the same scheme on ARM/Power, which will necessitate at least an extra read barrier (see below). I'm also not familiar enough with the barrier behaviour of other OSes to know what extra barriers are required, although I would be surprised if they do not at least issue a write barrier upon waking, as Linux does. It should be noted that this is not new - ECMAScript atomic writes are stronger than the (relaxed) atomic writes of ARM/Power, and this is reasonably well-known.

6 -
Implementation schemes - a "defensive" implementation scheme to ensure conformance after this change, assuming nothing about the underlying architecture/OS, is simply to do a full barrier immediately before any call to Atomics.wake and immediately after any call to Atomics.wait - this will always produce the correct behaviour, and is analogous to the "naive code generation scheme" suggested in the ECMAScript standard for atomic loads/stores.

An implementation wishing to do better than this (although I hope there are no benchmarks that depend on a very fast Atomics.wake!) must make use of knowledge about its particular architecture/OS combination. Below is my to-the-best-of-my-knowledge opinion on possible improvements.

Atomics.wait
On x86, a read barrier after Atomics.wait is all that is required for correctness. On ARM/Power, a full barrier is required.

On Linux a full barrier is carried out by the OS call to sleep, so no additional barrier is required by the ECMAScript implementation no matter the architecture. I don't know the behaviour of other OS's off the top of my head, but I imagine similar guarantees could be determined about their barrier behaviour.

Atomics.wake
No barrier at all is required at Atomics.wake if the call to Atomics.wake does not wake any waiting threads, so the cases below assume it wakes at least one.

On x86, a write barrier is sufficient for correctness, and therefore on an OS where the underlying syscall already performs this (Linux, at least, when at least one thread is woken) no additional barrier is required in the ECMAScript implementation.

On ARM/Power, a full barrier is required overall, but if the OS performs a write barrier already, the ECMAScript implementation need only perform an additional read barrier.
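
The case analysis above can be restated as a small lookup. This is only a sketch that encodes the claims in this comment, including its treatment of a full barrier as a write barrier plus a read barrier; it has not been checked against any hardware or OS documentation. The function name `extraBarrier`, its option keys and its string values are all hypothetical.

```javascript
// Returns the additional barrier the ECMAScript implementation must issue,
// according to the analysis in the comment above.
//   op:        'wait' | 'wake'
//   arch:      'x86' | 'arm-power'
//   osBarrier: barrier the OS call already performs: 'none' | 'write' | 'full'
//   wokeAny:   for 'wake', whether at least one waiting thread was woken
function extraBarrier({ op, arch, osBarrier = 'none', wokeAny = true }) {
  let required;
  if (op === 'wait') {
    // After Atomics.wait: read barrier on x86, full barrier on ARM/Power.
    required = arch === 'x86' ? 'read' : 'full';
  } else if (!wokeAny) {
    // A wake that wakes nobody needs no barrier at all.
    required = 'none';
  } else {
    // At Atomics.wake: write barrier on x86, full barrier on ARM/Power.
    required = arch === 'x86' ? 'write' : 'full';
  }

  // Subtract whatever the OS call already provides.
  if (required === 'none' || osBarrier === 'full') return 'none';
  if (osBarrier === 'write') {
    if (required === 'write') return 'none';
    if (required === 'full') return 'read';
  }
  return required;
}

const waitOnArmWithFullOsBarrier = extraBarrier({ op: 'wait', arch: 'arm-power', osBarrier: 'full' });
const wakeOnArmWithOsWriteBarrier = extraBarrier({ op: 'wake', arch: 'arm-power', osBarrier: 'write' });
```

With the Linux behaviour described above, the first call yields 'none' and the second yields 'read'.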

@lars-t-hansen, @syg, and @rossberg PTAL.

littledan · 2018-03-05T11:10:28Z

Why is this marked needs consensus? My understanding is that, although this is a normative change, it really reflects how any reasonable implementation would do things; it's more like a bug fix.

conrad-watt · 2018-03-05T12:04:04Z

I'm not in a position to judge whether this needs consensus, but I briefly discuss in the implementation schemes section why this proposal might not match existing implementations, depending on how defensively they're currently synchronizing.

In short, if implementations have been treating wait/wake like ECMAScript atomic loads/stores (in the absence of formal spec indication), they should already be conformant to this change. But implementations delegating to the OS directly without any barriers of their own might become non-conformant on weaker architectures.

To tie in to issue #1119, the existing spec is much weaker than bare OS calls. This proposal is slightly stronger, in the same way that ECMAScript atomic load/stores are slightly stronger than the relaxed atomic load/stores of ARM/Power.

ljharb · 2018-03-05T19:23:16Z

My understanding is that any normative change needs consensus, unless in the editor's judgement it doesn't need consensus. I lack the context on Atomics to know one way or the other; I'm hoping @bterlson will have an opinion.

syg · 2018-03-05T19:37:11Z

I suppose it's arguable whether it is a bug fix.

The strengthened behavior in the PR disallows some compiler transforms that the spec technically allows, e.g., moving non-atomic stores across Atomics.wake calls. (I say compiler because I don't think there are any OSes that would make such a thing observable.) Those transforms cause astonishment, and the broad consensus @conrad-watt referred to is that, in e-mail correspondence, we agreed that those transforms were probably intended to be disallowed from the beginning, but slipped through the cracks.

I am comfortable treating this as a bug fix. But to fully cover our bases, we could discuss very quickly in March with other interested parties (Waldemar). I'll defer to the editor group.

I'll make time to review this before next week.

conrad-watt · 2018-03-05T22:37:29Z

If it's appropriate, I could also attend any discussion in person. I'm in London giving a presentation at the Formal Methods Meets JavaScript meeting the day before TC39 anyway.

I don't know what the rules are for external contributors attending TC39 meetings.

conrad-watt · 2018-03-06T10:50:06Z

I should also point out that the current semantics of the spec are far more troubling than I've perhaps motivated in my earlier examples. In #1119 I gave an example which could look like an over-eager compiler transformation of non-atomics, but for the following program, an execution where thread 1 reads 0 at line 2 (assuming tA initially zeroed) is allowed, which seems completely incoherent, and certainly won't be observable in practice.

Line	Thread 1	Thread 2
1	Atomics.wait(tA, 0, 0)	Atomics.store(tA, 0, 1)
2	var x = Atomics.load(tA, 0)	Atomics.wake(tA, 0, 1)

This all stems from the lack of happens-before between wake and wait, meaning that the store and load look racy to the axiomatic model.
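
A rough way to see the missing edge is to compute reachability over the ordering edges of the program above. The sketch below is a heavy simplification of the axiomatic model: it treats happens-before as the transitive closure of agent-order plus the wake -> wait edge and ignores memory-order and reads-from entirely. The event names and the `happensBefore` helper are made up for illustration.

```javascript
// Returns true if `to` is reachable from `from` through the directed edges,
// i.e. the pair (from, to) is in the transitive closure of `edges`.
function happensBefore(edges, from, to) {
  const seen = new Set([from]);
  const stack = [from];
  while (stack.length > 0) {
    const current = stack.pop();
    for (const [a, b] of edges) {
      if (a === current && !seen.has(b)) {
        if (b === to) return true;
        seen.add(b);
        stack.push(b);
      }
    }
  }
  return false;
}

// Program order within each thread of the example above.
const agentOrder = [
  ['T1:wait', 'T1:load'],
  ['T2:store', 'T2:wake'],
];
// The edge this PR adds: a wake synchronizes-with the wait it wakes.
const wakeToWait = [['T2:wake', 'T1:wait']];

const orderedBefore = happensBefore(agentOrder, 'T2:store', 'T1:load');
const orderedAfter = happensBefore([...agentOrder, ...wakeToWait], 'T2:store', 'T1:load');
```

Without the wake -> wait edge, nothing orders the store before the load (`orderedBefore` is false); with it, the store reaches the load through wake and wait (`orderedAfter` is true).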

lars-t-hansen · 2018-03-06T12:02:33Z

Going back to a fairly random earlier draft, eg, https://github.com/tc39/ecmascript_sharedmem/blob/7c10268005c7ad11552aa333dc06229f29fd6c54/tc39/spec.html (mid 2016), we find that there are definitions of synchronizes-with and happens-before that include the wake -> wait edge in the synchronization order as a special case.

That draft also has a provision for "... an embedding-specific synchronizing event that has a sender and a receiver (such as sending a SharedArrayBuffer from one agent to another), the sending action synchronizes-with the receiving action". That seems reflected in current prose in the host-synchronizes-with relation.

So it would appear that the wake -> wait edge was truly left out by accident, not being a memory operation and not being a host operation. That is, this is simply a bug. Which of course should not dissuade anyone from discussing it.

littledan · 2018-03-06T17:00:57Z

@conrad-watt To attend TC39 as a subject matter expert and present on a proposal, one prerequisite is to sign this form to license any IPR related to your presentation for use in the ECMAScript specification. We've historically had TC39 members invite experts to participate; not sure if any more formal signoff would be needed here, cc @ecmageneva @RexJaeschke.

It looks like you're associated with Cambridge University--Ecma offers free memberships to academic institutions which could make it easier to participate in the future.

syg

The approach and logic look good to me.

In addition to the specific changes below, please reorganize the commits into an editorial commit for the EventsRecords rename, and a normative commit for the strengthening. Their commit messages should be prefixed with "Normative:" and "Editorial:".

syg · 2018-03-09T19:32:32Z

spec.html

+          </tr>
+          <tr>
+            <td>[[Order]]</td>
+            <td>`"SeqCst"`</td>


Is this intended future-proofing for release-acquire?

The reason RMW events have a fixed [[Order]] field is because they're quantified over along with other Shared Data Blocks that refer to [[Order]] generically. Synchronize events don't participate in that stuff, so we could leave it out for simplicity.

conrad-watt · 2018-03-10

Partially this, and also to typographically give Synchronize events their own table structure on par with SDB events. I've changed to a textual description, which doubles as an explanatory note.

syg · 2018-03-09T19:53:51Z

spec.html

@@ -39885,6 +39929,18 @@ <h1>ValueOfReadEvent( _execution_, _R_ )</h1>
        1. Return ComposeWriteEventBytes(_execution_, _R_.[[ByteIndex]], _Ws_).
      </emu-alg>
    </emu-clause>
+
+    <emu-clause id="sec-agentssynchronizewith" aoid="AgentsSynchronizeWith">
+      <h1>AgentsSynchronizeWith( _execution_ )</h1>


Having both AgentsSynchronizeWith and AgentSynchronizesWith is confusing to me.

In conjunction with my comment below about making [[AgentSynchronizesWith]] a List, I recommend inlining the logic of this AO directly into the synchronizes-with section; it should be a one-liner.

syg

Missed a comment.

syg · 2018-03-09T19:59:46Z

spec.html

@@ -39722,6 +39750,11 @@ <h1>Agent Events Records</h1>
            <td>A List of events</td>
            <td>Events are appended to the list during evaluation.</td>
          </tr>
+          <tr>
+            <td>[[AgentSynchronizesWith]]</td>
+            <td>A Relation on Synchronize events</td>


Please change this to a List.

The framework I've tried to adhere to is that the operational semantics appends events directly to Lists, and Relations are built by the axiomatic semantics from those Lists. (Also the Relations are not constructed by AOs to avoid implying that the axiomatic semantics is a step-by-step thing.)

conrad-watt · 2018-03-10

Done - I may submit a separate editorial PR to make the spec's general use of the word "pair" in different parts of the memory model slightly more consistent.

conrad-watt · 2018-03-10T12:47:40Z

rebased + comments addressed, PTAL

conrad-watt · 2018-03-17T14:21:05Z

I've been told that if I want to attend any discussion on this at next week's meeting (pending sign-off from the ECMA SG), I should note time constraints/preferences here. I would prefer the 20th, although on that day I would have to leave after 2pm. Failing that I'm available any other day for the duration.

I'm assuming this still needs to pass through the committee as a formality, in the absence of indications to the contrary.

Longer term I will look at convincing my department/university to affiliate with TC39 officially.

@IgnoredAmbience

syg

Looks great, I like the simplifications.

Please address the one editorial comment and merge this fixup into the previous normative commit, and it should be good to go on my end.

syg · 2018-03-19T12:21:29Z

spec.html

@@ -40009,7 +39979,7 @@ <h1>synchronizes-with</h1>
          </ul>
        </li>
        <li>For each pair (_E_, _D_) in _execution_.[[HostSynchronizesWith]], (_E_, _D_) is in _execution_.[[SynchronizesWith]].</li>
-        <li>For each pair of Synchronize events (_S_, _Sw_) in AgentsSynchronizeWith(_execution_), (_S_, _Sw_) is in _execution_.[[SynchronizesWith]].</li>
+        <li>For each element _eventsRecord_ of _execution_.[[EventsRecords]], for each pair (_S_, _Sw_) in _eventsRecord_.[[AgentSynchronizesWith]], (_S_, _Sw_) is in _execution_.[[SynchronizesWith]].</li>


Editorial nit: I think it'd be clearer to have the [[AgentSynchronizesWith]] addition be its own <li>

conrad-watt · 2018-03-20

done + rebased

nomadtechie · 2018-03-21T15:03:10Z

Hi @conrad-watt @syg - I've been working on adding test coverage to Atomics.wait and Atomics.wake. What implications (if any) does this change have on test coverage?

ajklein · 2018-03-21T15:10:48Z

@binji @dtig @flagxor heads-up that this change reached consensus at TC39, want to make sure V8 folks have seen it.

conrad-watt · 2018-03-21T15:53:10Z

@nomadtechie here are some representative program fragments for the new behaviour. All assume tA initially zeroed. I should stress that it's very unlikely that any of these were observable before - it would have required a very specific combination of architecture + compilation scheme that I don't have any evidence actually existed in the "wild".

To what extent is test262 the right venue for these kind of concurrent "litmus" tests? They're widely used in the experimental validation of architectural models in academia, but they're highly probabilistic and are often run hundreds of times in an attempt to enumerate the space of observable behaviours.

Example 1: nonsensical, would require specific (crazy) compiler transformations to observe:

Line	Thread 1	Thread 2
1	Atomics.wait(tA, 0, 0)	Atomics.store(tA, 0, 1)
2	var x = Atomics.load(tA, 0)	Atomics.wake(tA, 0, 1)

Before: x must be 0 or 1
Now: x must be 1

Example 2: store buffering across wake, would require a very unusual compilation scheme to observe (or memory model sensitive compiler transformations). Prevented even by the underlying OS barriers on all architectures.

Line	Thread 1	Thread 2
1	Atomics.wait(tA, 0, 0)	tA[1] = 1
2	var x = tA[1]	Atomics.wake(tA, 0, 1)

Before: Thread 1 may get stuck, but if it does not, x must be 0 or 1
Now: If thread 1 doesn't get stuck, x must be 1

This example isn't a good test, since it can potentially fail to terminate. Below is a modification which always terminates, but may fail to detect some miscompilations of wake due to extra barriers potentially introduced by the atomic store.

Line	Thread 1	Thread 2
1	Atomics.wait(tA, 0, 0)	tA[1] = 1
2	var x = tA[1]	Atomics.store(tA, 0, 1)
3	-	Atomics.wake(tA, 0, 1)

Before: x must be 0 or 1
Now: x must be 1
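
As a sketch of how the terminating variant above might be written as a runnable test, the following uses Node.js `worker_threads`, which is an assumption of this example and is not mentioned in the thread. Thread 1 runs on the main thread and Thread 2 in a worker. `Atomics.wake` is spelled `Atomics.notify`, the name the method was later given, and the 5000 ms timeout is an arbitrary safety net that is not part of the litmus test. A single passing run shows very little, since these tests are probabilistic and are normally repeated many times.

```javascript
const { Worker } = require('node:worker_threads');

// tA has two Int32 elements and is initially zeroed.
const sab = new SharedArrayBuffer(2 * Int32Array.BYTES_PER_ELEMENT);
const tA = new Int32Array(sab);

// Thread 2: non-atomic write, then atomic store, then notify.
const thread2 = new Worker(`
  const { workerData } = require('node:worker_threads');
  const tA = new Int32Array(workerData);
  tA[1] = 1;
  Atomics.store(tA, 0, 1);
  Atomics.notify(tA, 0, 1);
`, { eval: true, workerData: sab });

// Thread 1: sleep while tA[0] is still 0, then read tA[1] non-atomically.
// The wait returns "ok" if woken, "not-equal" if Thread 2 already stored 1,
// or "timed-out" if the safety-net timeout expires first.
const waitResult = Atomics.wait(tA, 0, 0, 5000);
const x = tA[1];

console.log(`wait returned ${waitResult}, x = ${x}`);
```

Under the strengthened semantics, any run in which the wait does not time out is expected to observe x = 1.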

Example 3: load buffering across wake: my understanding is that, taking our best formal models of ARM, this is theoretically observable even with OS barriers (requires additional barriers in the compilation scheme). Of the people I talked to at TC39, everyone who was in a position to check said they're already doing the right barriers.

Line	Thread 1	Thread 2
1	Atomics.wait(tA, 0, 0)	var x = tA[1]
2	tA[1] = 1	Atomics.wake(tA, 0, 1)

Before: x must be 0 or 1, thread 1 may get stuck
Now: x must be 0, thread 1 may get stuck

Terminating version (same caveats):

Line	Thread 1	Thread 2
1	Atomics.wait(tA, 0, 0)	var x = tA[1]
2	tA[1] = 1	Atomics.store(tA, 0, 1)
3	-	Atomics.wake(tA, 0, 1)

Before: x must be 0 or 1
Now: x must be 0

binji · 2018-03-26T19:19:42Z

To what extent is test262 the right venue for these kind of concurrent "litmus" tests?

We discussed this some here, but the issue seems to have stalled. One issue is that the tool generated many litmus tests ("The litmus tests cover all possible memory interactions of sizes 1-4. They take up 202MB compressed and about 1.1GB uncompressed."). Another issue, as you mention, is that they are inherently probabilistic.

It seemed as though there was value to adding them to test262, though, perhaps as an addendum.

littledan · 2018-04-04T08:26:44Z

Thanks for the analysis of tests, @binji and @conrad-watt. Seems to me like this fix is ready to land without blocking on tests.

erights · 2018-08-09T22:37:05Z

I think I do not need to review this. Can I painlessly withdraw as a reviewer of this?

bmeck · 2018-08-10T11:07:14Z

@erights yup, just wanted to make sure since I was curious about this after your Spectre talk and saw @syg's comment about ordering. I was mostly curious whether you agreed on that issue in particular. However, if nothing seems controversial, we should land this at this point.

ljharb · 2018-08-10T17:33:51Z

@conrad-watt would you mind rebasing this on the latest master?

conrad-watt · 2018-08-10T19:25:41Z

done

erights · 2018-08-11T06:02:06Z

It does look like this change goes strictly in the right direction --- towards a smaller space of possible outcomes, where the remaining outcomes are more intuitive than the ones ruled out. So on these grounds I am in favor. But I have not dug into this enough to consider my positive reaction a review; nor do I expect to. Thanks.

ljharb

@conrad-watt now that #1220 has landed, wake is likely no longer a term we should be using. Could you update the section I've indicated to use the updated terminology?

spec.html

+          1. Let _execution_ be the [[CandidateExecution]] field of the surrounding agent's Agent Record.
+          1. Let _eventsRecord_ be the Agent Events Record in _execution_.[[EventsRecords]] whose [[AgentSignifier]] is AgentSignifier().
+          1. Let _agentSynchronizesWith_ be _eventsRecord_.[[AgentSynchronizesWith]].
+          1. Let _wakerEventList_ be _eventsRecord_.[[EventList]].


ljharb · 2018-08-22T19:15:16Z

(linking to #1220)

ljharb added the normative change and needs consensus labels Mar 5, 2018

ljharb requested a review from syg March 5, 2018 06:05

lars-t-hansen mentioned this pull request Mar 6, 2018

Small Atomics changes to reduce duplications #835

Closed

ljharb added the spec bug label and removed the needs consensus label Mar 6, 2018

syg requested changes Mar 9, 2018

syg approved these changes Mar 19, 2018

ljharb added the has consensus label Mar 20, 2018

ljharb requested review from bterlson, ljharb and bmeck April 4, 2018 14:10

ljharb assigned bterlson Apr 4, 2018

bmeck requested a review from erights August 9, 2018 18:35

bterlson approved these changes Aug 9, 2018

ljharb assigned erights Aug 9, 2018

ljharb removed the request for review from erights August 10, 2018 17:32

ljharb unassigned erights Aug 10, 2018

bmeck approved these changes Aug 22, 2018

ljharb removed their request for review August 22, 2018 18:59

ljharb assigned ljharb and unassigned bterlson Aug 22, 2018

ljharb requested changes Aug 22, 2018

ljharb changed the title ~~Normative: Strengthen Atomics.wait/wake synchronization to the level of other Atomics operations~~ Normative: Strengthen Atomics.wait/notify synchronization to the level of other Atomics operations Aug 23, 2018

conrad-watt added 2 commits August 23, 2018 13:00

Editorial: rename EventLists internal property to EventsRecords (#1127)

4dbbbda

Normative: strengthen wait-notify synchronization (#1127)

ef0cf19

ljharb approved these changes Aug 23, 2018

ljharb merged commit ef0cf19 into tc39:master Aug 23, 2018

conrad-watt mentioned this pull request Apr 18, 2019

Normative: fix axiomatic model to be SC-DRF, and allow ARMv8 compilation #1511

Merged

conrad-watt mentioned this pull request Jul 5, 2019

remove SynchronizeEventSet? #1611

Open

syg mentioned this pull request Aug 30, 2019

(memory model, wait/notify) Atomics.wait/notify non-SC behaviour, what is expected? #1680

Closed
