forceTransfer/steal lock request option instead of forceRelease() ? #23

ralphch0 · 2017-11-20T21:37:53Z

Current API does not allow a page to ensure that it can take over a lock. Inspecting the locks and calling forceRelease() allows that page to release the lock from the current holder, but it does not guarantee that the lock will be free for it to acquire, since other pages may have queued lock requests.

I'm suggesting instead to introduce a "steal" or "forceTransfer" option to LockOptions that allows the current page to not only to force release the lock, but also to take it over ahead of all queued requests. forceRelease() can still be achieved by acquiring the lock with steal=true, and releasing it.

inexorabletash · 2017-11-20T22:03:51Z

I was assuming (...) that forceRelease() would abort queued requests as well, but I'm not advocating any particular API shape for this.

Presumably, if two acquire(..., {steal:true}) calls take place around the same time the first caller will be told it has acquired the lock, but then the second caller would immediately steal it away?

ralphch0 · 2017-11-20T23:40:23Z

Sure, last one wins should be OK. This relies on the developers writing their code to be resilient to lock steals anyways.

inexorabletash · 2017-11-21T21:45:24Z

Huh... I wonder if we could make seem like less of a wart by making this a non-negative safe integer priority rather than a boolean. A request/held lock has a priority (default 0); if a request comes through that's higher than held/requested priorities then those are dropped.

But maybe that would just encourage the use in normal application logic rather than just recovery scenarios.

@pwnall - thoughts?

ralphch0 · 2017-11-21T22:24:26Z

I originally proposed a priority option, but I split it into two: lockPriority & stealPriority, to make stealing explicit. Regular devs would probably expect that trying to acquire a lock with a higher priority should not automatically steal the lock, but rather just put the request ahead of lower priorities on the wait queue.

Alternatively, lock priority can probably be achieved by storing some additional information in the database:

Acquire lock with ifAvailable=true
If not available, check database for last lock holder information.
If last lock holder is of lower priority, request the lock with steal=true.

Obviously, this is not ideal, because technically the lock holder can change between checking the database and issuing the lock request.

pwnall · 2017-11-22T22:27:12Z

It seems to me that (without additional support in other APIs) adding lock stealing to this API would undermine any sort of guarantees one could expect from such a system.

As far as I know, lock stealing works effectively when paired with operations that error out when the lock is broken (stolen). For example, in a distributed filesystem, reads and writes error out when the associated lock is broken.

For stealing to be meaningful in this API, I think it'd have to be complemented with support in:

postMessage - take a lock, and don't send the message if the lock is broken; this can be used to make sure that at most one worker is a coordinator in some system
indexedDB - associate a lock with a transaction, don't commit if the lock s broken; same as before, but specific to coordinators that revolve around updating database status (example: online sync)
fetch (and XHR, if it's not deprecated) - added for completion, because I/O

A (bold, IMO) alternative would be to restrict locks that can be stolen to worker contexts. When a lock stealing request is issued, the current lock owner's event loop is shut down as soon as possible (possibly like Worker.terminate?). When the stealing request is granted, it is guaranteed that the old worker has been terminated. I think this is simpler to reason about than the proposal above, and works for all coordination scenarios above. I'm labeling this proposal as bold because I don't know enough about worker lifecycle to know if it can be implemented in all browsers, though Worker.terminate's broad support makes me fairly optimistic.

inexorabletash · 2018-01-06T00:54:01Z

I took an initial stab at this in the proto-spec. steal as an option, and the behavior that any previous held locks are dropped and lock requests are aborted; in either case the promise returned from acquire() rejects with AbortError.

pwnall · 2018-01-11T00:24:28Z

I just looked at the spec change above. First off, it's very surprising to me that steal is true by default.

Second, I would like to see concrete use cases for lock stealing discussed. It would help guide my intuition.

The case I've been thinking of is using locks for master election. For simplicity, assume a system where every tab spawns a worker and sends it a request whenever it needs to do something complex. Each worker first runs a master election algorithm. Workers that don't win forward their requests to the master, and forward the response to their tab.

The master election algorithm runs the following steps in an infinite loop:

Acquire the lock with ifAvailable set to true. If the lock is acquired, break out of the loop and become master.
Attempt to connect to the existing master. (e.g, create a MessageChannel and send one of the MessagePorts to a BroadcastChannel, packed in a connect request)
Wait for some amount of time (1 second?) for the connection to complete. If the connection completes, break out of a loop and become a worker with a connection to the master.
If the wait times out, loop to step 1.

Resilience to masters getting stuck (i.e. in a deadlock) can be added with the following measures:

Add timeouts to requests sent to masters. If a worker fails to get an answer in x seconds, go back to the master election stage. This assumes idempotent requests. Effectively, wrap the earlier system description (master election + request forwarding) in a loop.
Add a large timeout for the entire loop above. If a worker isn't able to get help from the master in (say) 60 seconds, it attempts to steal the lock from the current master and become a master.

@ralphch0: Sorry if this is completely disconnected from what you're considering. I need a use case to reason about.

Now that I've laid out a use case, we can start thinking about how the stealing API would work out there. Specifically, I'm concerned with how a master would detect that a lock was stolen from it. The code below is a sketch for a master's request-handling loop.

let lock_stolen = false;
try {
  await navigator.locks.acquire('master-lock', async () => {
    while (!lock_stolen) {
      // process request from queue
    }
  });
} catch(e) {
  if (IsLockStolenAbortError(e)) {
    lock_stolen = true;
  }
}

First, this looks a bit un-intuitive. Second, I'd have to pepper my code with lock_stolen checks (ideally, there'd be a check before every network / database request). Third, there's still potential for races.

Based on what I know so far, I'd still like to see lock-stealing terminate the worker(s) that held the lock, if any.

inexorabletash · 2018-01-11T01:01:53Z

steal being true by default was a typo; and is fixed - thanks for catching that!

ralphch0 · 2018-01-27T00:20:01Z

Just wondering why we're also dropping other lock requests? If the intent is only to steal the lock, then this might cause an unintended side-effect.

I can also maybe see this feature used intelligently by some sites to transfer the ownership of a lock from one tab to another, without affecting any other pages waiting for the lock.

inexorabletash · 2018-01-27T00:38:47Z

Keeping existing requests is doable.

I don't have a great intuition here. Dropping everything seemed straightforward, but dropping only held locks seems simpler.

Aside: what would {mode: 'shared', steal: true} mean if the lock is already held as shared (or more generally: if already grantable).

inexorabletash · 2018-01-27T00:40:26Z

And should {steal: true, ifAvailable: true} be an error?

ralphch0 · 2018-01-27T00:48:01Z

I think in both cases it should succeed and steal as usual. It's not clear how these combinations would be useful, but none of these options are mutually conflicting.

inexorabletash · 2018-01-27T00:59:52Z

Thanks. I'll rework the spec and my implementation (just finished a first pass) ASAP next week.

inexorabletash · 2018-01-30T19:07:17Z

Proto-spec updated to leave the requests intact. The "steal" request jumps to the top of the request queue and is granted immediately.

The {steal: true, ifAvailable: true} combination ends up being slightly odd (in both spec and impl), since steal will ensure the request is granted and therefore ifAvailable is effectively ignored.

The {mode: 'shared', steal: true} combination causes all previous held locks — shared or exclusive — to be dropped, and the new lock is shared meaning any pending shared requests might be granted. A bit odd, especially if it was held as shared already, but consistent.

ralphch0 · 2018-01-30T19:25:07Z

Ah, I can see now how ifAvailable & steal are kind of confusing together. And since ifAvailable is completely ignored, maybe it's better to disallow this combination and force devs fix their options?

For steal and shared, I also see the gotcha there. We could either clarify this in the spec, or disallow if we don't think this is likely to be useful.

pwnall · 2018-01-31T01:49:44Z

(general note: I still think stealing is a footgun without something like worker termination)

I like the new approach of throwing if both steal and ifAvailable are specified. Seems like having both true can only happen in case of developer confusion, and correcting perception early in the application development cycle is good.

I find the behavior for stealing shared locks confusing. It seems to me that "steal" means "I want this lock right now without waiting, and I'm willing to step on anyone who gets in my way". In the current context of S/X locks, I would have guessed that stealing an X lock should break the lock for all current holders, while stealing an S lock would only break the lock if it's held in X mode. In a more general model, stealing a lock would break all current holders whose mode conflicts with the stealing request's mode.

Am I confusing "steal" with "preempt"? It seems to me that the current behavior of booting all S lock holders when stealing an S lock is closer to "purge" than to "steal".

ralphch0 · 2018-01-31T18:57:38Z

I generally don't have strong opinions about shared locks, simply because we currently don't have a scenario where we would use them. So if we decide to restrict this combination, I would not object. There is also a way to simulate this behavior:
Code can take an exclusive lock with steal = true, and then immediately release it and then take a shared lock.

While on the subject, "ifAvailable" and "signal" (abort controller) also seem to fall into the same category as "ifAvailable" & "steal", so maybe that should error out as well.

inexorabletash · 2018-01-31T19:23:53Z

Code can take an exclusive lock with steal = true, and then immediately release it and then take a shared lock.

Not if there's already a queued request for an exclusive lock. That would be run ahead of the request for the shared lock.

While on the subject, "ifAvailable" and "signal" (abort controller) also seem to fall into the same category as "ifAvailable" & "steal", so maybe that should error out as well.

Good catch. BTW, I pinged the folks working on the abort stuff; sounds like that code might be in to Blink next week.

inexorabletash · 2018-01-31T19:34:31Z

Hrm... and signal + steal is also silly:

ifAvailable + steal - steal would always make it available.
ifAvailable + signal - the availability check/grant would be processed before the signal could change the outcome
steal + signal - the steal would be processed before the signal could change behavior

I'll file another bug about signal edge cases.

inexorabletash · 2018-01-31T19:39:32Z

#34 for signal racing.

#23 (comment)

ralphch0 · 2018-01-31T19:49:17Z

Not if there's already a queued request for an exclusive lock. That would be run ahead of the request for the shared lock.

Good point.

inexorabletash · 2018-02-02T19:53:57Z

Based on offline conversations w/ @pwnall, making steal + shared an error. Easier to forbid now and allow later than the converse.

inexorabletash · 2018-02-08T00:09:27Z

Since we seem to have settled on a syntax and semantics, resolving this.

I'm still sympathetic to the argument that breaking locks is crazy, but let's open new bugs for that. :)

inexorabletash added the api specifics label Nov 21, 2017

inexorabletash mentioned this issue Dec 20, 2017

Diagnostics/Debuggability/Management #10

Closed

inexorabletash added a commit that referenced this issue Dec 21, 2017

Define steal option and behavior. For #23

8064fb9

inexorabletash added a commit that referenced this issue Jan 31, 2018

Make signal + (steal || ifAvailable) disallowed

16e62a1

#23 (comment)

inexorabletash added this to the v1 milestone Feb 7, 2018

inexorabletash closed this as completed Feb 8, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

forceTransfer/steal lock request option instead of forceRelease() ? #23

forceTransfer/steal lock request option instead of forceRelease() ? #23

ralphch0 commented Nov 20, 2017

inexorabletash commented Nov 20, 2017

ralphch0 commented Nov 20, 2017

inexorabletash commented Nov 21, 2017

ralphch0 commented Nov 21, 2017

pwnall commented Nov 22, 2017

inexorabletash commented Jan 6, 2018

pwnall commented Jan 11, 2018

inexorabletash commented Jan 11, 2018

ralphch0 commented Jan 27, 2018

inexorabletash commented Jan 27, 2018

inexorabletash commented Jan 27, 2018

ralphch0 commented Jan 27, 2018

inexorabletash commented Jan 27, 2018

inexorabletash commented Jan 30, 2018

ralphch0 commented Jan 30, 2018

pwnall commented Jan 31, 2018

ralphch0 commented Jan 31, 2018

inexorabletash commented Jan 31, 2018

inexorabletash commented Jan 31, 2018

inexorabletash commented Jan 31, 2018

ralphch0 commented Jan 31, 2018 •

edited

Loading

inexorabletash commented Feb 2, 2018

inexorabletash commented Feb 8, 2018

forceTransfer/steal lock request option instead of forceRelease() ? #23

forceTransfer/steal lock request option instead of forceRelease() ? #23

Comments

ralphch0 commented Nov 20, 2017

inexorabletash commented Nov 20, 2017

ralphch0 commented Nov 20, 2017

inexorabletash commented Nov 21, 2017

ralphch0 commented Nov 21, 2017

pwnall commented Nov 22, 2017

inexorabletash commented Jan 6, 2018

pwnall commented Jan 11, 2018

inexorabletash commented Jan 11, 2018

ralphch0 commented Jan 27, 2018

inexorabletash commented Jan 27, 2018

inexorabletash commented Jan 27, 2018

ralphch0 commented Jan 27, 2018

inexorabletash commented Jan 27, 2018

inexorabletash commented Jan 30, 2018

ralphch0 commented Jan 30, 2018

pwnall commented Jan 31, 2018

ralphch0 commented Jan 31, 2018

inexorabletash commented Jan 31, 2018

inexorabletash commented Jan 31, 2018

inexorabletash commented Jan 31, 2018

ralphch0 commented Jan 31, 2018 • edited Loading

inexorabletash commented Feb 2, 2018

inexorabletash commented Feb 8, 2018

ralphch0 commented Jan 31, 2018 •

edited

Loading