Initial implementation of async context tracking #208

Merged
merged 1 commit into from
Jan 12, 2023

@jasnell jasnell commented Dec 6, 2022

Initial experimental async context tracking (internals only). This is intended to set the stage for an eventual implementation of AsyncLocalStorage (and later the TC-39 AsyncContext proposal, which is based on AsyncLocalStorage).

This initial PR seeks only to add the internal bookkeeping for the async context, including implementing the basic V8 promise hooks, as well as attaching the async context tracking to setTimeout, setInterval, and queueMicrotask.

This PR implements the AsyncLocalStorage API as well.

Note that we eventually expect V8 to implement these core mechanisms as part of the AsyncContext proposal implementation as that moves forward in TC-39, but that will be some time off and there's no guarantee that proposal will advance through TC-39. The goal of this implementation here is to remain simple. It is modeled after Node.js' approach which is known to work very well.

For the AsyncLocalStorage piece... the API itself is not yet exposed to JavaScript to use. That'll come later. When it does come, users will use an import or require to utilize it:

// We are *not* implementing the full async_hooks API, just a subset...
import * as async_hooks from 'node:async_hooks';

const als = new async_hooks.AsyncLocalStorage();

export default {
  async fetch(request, env, ctx) {
    return als.run(123, async () => {
      // Do async stuff.
      console.log(als.getStore()); // 123
      return new Response("ok");
    });
  }
};

The ALS implementation will support the als.run(...), als.exit(...), and als.getStore() methods. The als.enterWith() and als.disable() methods will not be implemented.
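
As a rough illustration of the three supported methods, the sketch below runs against Node.js' own `node:async_hooks` module, on the assumption that the workerd subset is meant to behave the same way. The `seen` array is only an illustrative helper for recording what each call observes.

```javascript
// Runs under Node.js; assumes the workerd subset mirrors this behavior.
const { AsyncLocalStorage } = require('node:async_hooks');

const als = new AsyncLocalStorage();
const seen = []; // illustrative helper: records what getStore() returns

seen.push(als.getStore());       // outside any run(): no store

als.run(123, () => {
  seen.push(als.getStore());     // inside run(): the value passed to run()
  als.exit(() => {
    seen.push(als.getStore());   // inside exit(): the store is hidden
  });
  seen.push(als.getStore());     // after exit() returns: the store is back
});

seen.push(als.getStore());       // after run() returns: no store again

console.log(seen); // [ undefined, 123, undefined, 123, undefined ]
```

Note that `als.exit()` only hides the store for the duration of its callback; it does not end the surrounding `als.run()`.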

TODOs

  • The implementation of AsyncLocalStorage here is not yet correct in that each ALS instance does not yet maintain its own storage context. Specifically, in Node.js' implementation, multiple ALS instances can safely exist simultaneously without risk of interfering with each other's data. The implementation here currently does not have that protection. That needs to be addressed before this is complete. Update: each ALS instance now maintains its own distinct storage cell.
  • We should also implement AsyncResource to allow for custom contexts
  • Currently we always enter the async context in run even if we are already in it. We can optimize things by checking to see if we're already within the context and if the given store is already the stored value.
  • There's currently a challenge when a tracked promise is gc'd while its AsyncResource is still on the stack, causing a UAF.
  • We need to identify which internal Things should be async resources. Currently this instruments promises, timers, and microtasks:
    • unhandledrejection/rejectionhandled events will pick up the async context of their associated promises.
    • HtmlRewriter -- Looks like there is a non-trivial performance issue enabling this with HtmlRewriter as is. Will need to investigate how to make it not as costly. There's also a crash that I need to investigate when enabling this.
    • Decide if the IoContext is itself an async resource so that we can differentiate between the top level async context (at global scope) and the context within a single request. Will handle these separately in other PRs if we choose to do anything.
    • EventTarget things (like WebSocket) could potentially also be async contexts. This would be inconsistent with Node.js, however, in that Node.js does not treat EventEmitter instances as async resources. Do we want to make things like WebSocket an async resource? Will handle these separately in other PRs if we choose to do anything.
  • Tests
    • Basic internal workers tests
    • test bin in workerd. Not critical for landing; we have ew-tests internally. May add this with a separate PR.
  • Documentation
  • Performance profiling
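
The first TODO item, isolation between ALS instances, can be illustrated with Node.js' `node:async_hooks`, whose behavior this PR is modeled on. This is a sketch only; the instance names and the `observed` object are made up for the example.

```javascript
// Runs under Node.js; shows the isolation property the first TODO refers to.
const { AsyncLocalStorage } = require('node:async_hooks');

// Two independent instances (hypothetical names for illustration).
const requestId = new AsyncLocalStorage();
const userName = new AsyncLocalStorage();
const observed = {};

requestId.run('req-1', () => {
  userName.run('alice', () => {
    // Both stores are visible, each through its own instance.
    observed.inner = [requestId.getStore(), userName.getStore()];
  });
  // Leaving userName.run() must not disturb requestId's store.
  observed.outer = [requestId.getStore(), userName.getStore()];
});

console.log(observed);
// { inner: [ 'req-1', 'alice' ], outer: [ 'req-1', undefined ] }
```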

@jasnell jasnell force-pushed the jsnell/async-context-tracking branch 2 times, most recently from 839e276 to 1ab5c3f Compare December 7, 2022 01:04
@tom-sherman

Is there risk in shipping non-standardised APIs from a different JS environment? This feels unprecedented in Workers.

@jasnell
Member Author

jasnell commented Dec 7, 2022

Is there risk in shipping non-standardised APIs from a different JS environment? This feels unprecedented in Workers.

Yes, definitely. We plan to ship implementations of a subset of Node.js APIs gated by a compatibility flag to explicitly opt-in to their use and requiring import/require to use them. There is certainly an amount of risk involved.

@jasnell jasnell force-pushed the jsnell/async-context-tracking branch 3 times, most recently from a74f816 to 6df0c43 Compare December 8, 2022 23:10
@jasnell jasnell marked this pull request as ready for review December 8, 2022 23:12
@jasnell jasnell force-pushed the jsnell/async-context-tracking branch 3 times, most recently from 2103091 to fc18885 Compare December 9, 2022 16:26
@jasnell jasnell force-pushed the jsnell/async-context-tracking branch from 81ea14c to f1c48ed Compare December 12, 2022 15:07
@jasnell jasnell force-pushed the jsnell/async-context-tracking branch from 87a5e19 to 232b511 Compare December 15, 2022 23:11
@jasnell
Member Author

jasnell commented Dec 16, 2022

Fixups were squashed, commits rebased to resolve conflicts, new commit added to propagate context to unhandledrejection error. I'll likely go ahead and remove the HtmlRewriter changes from this PR so that we can further evaluate how that should be enabled. Done... the HtmlRewriter changes have been pulled out of this. We'll figure out the right model for context propagation for that separately.

@jasnell jasnell force-pushed the jsnell/async-context-tracking branch 2 times, most recently from f2cdbb0 to e3fe065 Compare December 16, 2022 14:40
Collaborator

@harrishancock harrishancock left a comment

I'm about halfway done reviewing.

Collaborator

@harrishancock harrishancock left a comment

Finished first review pass.

// Serves the same purpose as attach() in KJ things. Ensures that we hold a reference
// to the AsyncResource object wrapper for as long as the function is held.
jsg::check(bound->SetPrivate(js.v8Isolate->GetCurrentContext(),
v8::Private::ForApi(js.v8Isolate, jsg::v8StrIntern(js.v8Isolate, "ref")),
Collaborator

I worry that "ref" is an awfully generic name for this. I suppose the likelihood of collisions in our usage of private symbols is low since we don't really use them anywhere yet (except here), but I wonder if we should be namespacing them better in some way.

Member Author

Yep, if we can come up with a reasonable convention now I'm happy to use it, otherwise it is a potential problem that we can solve later.

Member

@kentonv kentonv left a comment

If I understand correctly, the overall design here is something like this:

  • An AsyncResource stores the current mapping of ALS IDs to values.
  • At any given time there is a "current" AsyncResource, which is used to implement als.getStore().
  • als.run() doesn't create a new map. It actually modifies the current resource's map, then runs the callback, then reverts the change to the resource.
  • When the context needs to be inherited into an async callback (e.g. a promise continuation), this is always accomplished by making a whole new AsyncResource, which makes a full copy of the whole current storage map. The continuation can't just hold on to the current resource, since the value set by als.run() would be reverted before the continuation actually runs.

I would have expected a design more like this:

  • The mapping of ALS keys to values is held by an "async context frame".
  • als.run() creates a new frame, inheriting the contents of the current frame, but changing the one value. The callback runs with the new frame set "current". Upon return, the thread reverts to the previous frame being current.
  • Promise continuations inherit the current frame from the point where .then() is called. This simply references the existing frame, there is no need to allocate a new object or copy anything, since each frame is immutable once it is constructed.

This approach would be much more efficient, as it doesn't require any allocations except when the ALS API is actually used.
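
To make the frame-based proposal concrete, here is a toy model in plain JavaScript. It is a sketch under assumptions, not workerd's code: `Frame`, `ToyStorage`, `bindToCurrentFrame`, and the `framesAllocated` counter are all hypothetical names, and `bindToCurrentFrame` stands in for what a promise hook would do when `.then()` is called.

```javascript
// Toy model only: hypothetical names, not workerd's actual classes.
class Frame {
  constructor(parent, key, value) {
    // Copy the parent's entries once at construction time.
    this.entries = new Map(parent ? parent.entries : []);
    if (key !== undefined) this.entries.set(key, value);
    // Freeze the frame object; the Map is read-only by convention here.
    Object.freeze(this);
  }
}

let currentFrame = new Frame(null); // root frame with an empty map
let framesAllocated = 1;            // counter used to show allocation cost

class ToyStorage {
  run(value, fn) {
    const previous = currentFrame;
    currentFrame = new Frame(previous, this, value); // the only allocation
    framesAllocated++;
    try {
      return fn();
    } finally {
      currentFrame = previous; // revert even if fn throws
    }
  }
  getStore() {
    return currentFrame.entries.get(this);
  }
}

// Stand-in for a continuation capturing context at .then() time.
function bindToCurrentFrame(fn) {
  const captured = currentFrame; // just a reference: no copy, no allocation
  return (...args) => {
    const previous = currentFrame;
    currentFrame = captured;
    try {
      return fn(...args);
    } finally {
      currentFrame = previous;
    }
  };
}

const als = new ToyStorage();
const before = framesAllocated;
const continuation = als.run(123, () => bindToCurrentFrame(() => als.getStore()));
const result = continuation(); // runs after run() has already returned

console.log(result, framesAllocated - before); // 123 1
```

Because a frame never changes after construction, the continuation can safely keep a reference to it even though `run()` has already restored the previous frame.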

(Note that my comments below are more tactical and mostly written before I really understood the overall design as discussed above.)

@jasnell
Member Author

jasnell commented Dec 24, 2022

If I understand correctly, the overall design here is something like this:

  • An AsyncResource stores the current mapping of ALS IDs to values.

I would recommend not getting too hung up on the naming of this construct. If it makes more sense to you calling it an "async context frame" then ok. Given that this is part of the Node.js compatibility bit, I prefer terminology and a model that aligns with Node.js but the naming isn't something I consider to be important at all.

That said, yes, this "async context frame" (or AsyncResource in Node.js terminology) stores a mapping of AsyncLocalStorage key to value.

In the Node.js model, every async context frame / AsyncResource has a numeric id (its asyncId) and references the id of the async context frame / AsyncResource that "triggered" it (its triggerAsyncId).

  • At any given time there is a "current" AsyncResource, which is used to implement als.getStore().

Put als.getStore() aside for a second. Yes, at any given time there is a current async context frame (or AsyncResource, in Node.js terminology). These are implemented as a stack.

A single async context frame / AsyncResource may be entered, pushing it to the top of stack, making it current.

While a particular async context frame / AsyncResource is current, als.getStore() retrieves the value associated with als's key in that async context frame / AsyncResource.

  • als.run() doesn't create a new map. It actually modifies the current resource's map, then runs the callback, then reverts the change to the resource.

Correct. This is exactly what we want.

  • When the context needs to be inherited into an async callback (e.g. a promise continuation), this is always accomplished by making a whole new AsyncResource, which makes a full copy of the whole current storage map...

Whenever a new async context frame / AsyncResource is created, it should inherit a copy of the ALS storage map of the current async context frame.

So, for instance, an individual Promise is an async context frame / AsyncResource.

Let's assume that the current async context frame has asyncId equal to 1 and triggerAsyncId 0.

I create a new Promise using const p = new Promise((r) => res = r). This new promise will have a new asyncId (let's say 2) with triggerAsyncId equal to 1.

Then I create another new Promise using const p2 = p.then(() => { console.log(executionAsyncId(), triggerAsyncId()) }). This new promise will have a new asyncId (let's say 3) with triggerAsyncId equal to 2.

When we resolve the first promise by calling res() from within asyncId 1, the console.log statement inside the promise continuation will print 3 for executionAsyncId() and 2 for triggerAsyncId(), which is exactly what we would expect.

Now, whether or not the current implementation correctly captures this model, I'm not entirely sure because I haven't yet implemented all of the tests to verify. It's entirely possible that some of the details in the current implementation are wrong and will still need to be verified. For example, given the following...

let r;
const p = als.run(123, () => new Promise((res) => r = res));
const p2 = p.then(() => { console.log(als.getStore()); });
r();

To match Node.js' model, the console.log should print undefined because p2 inherits from the ALS storage map that was current when that Promise was created.

The continuation can't just hold on to the current resource, since the value set by als.run() would be reverted before the continuation actually runs.

Again, als.run() just sets the value associated with the ALS key for a period of time. Any async context frame / AsyncResource created within that period of time will inherit that value. So, for instance,

const als = new AsyncLocalStorage();
let res;
const p0 = new Promise((r) => res = r);
const p1 = als.run(123, () => p0.then(() => { console.log(als.getStore()) }));
const p2 = als.run(321, () => p0.then(() => { console.log(als.getStore()) }));
res();

Here, in Node.js' model there are five distinct async context frames / AsyncResources:

  • The root (0)
  • The top level (1)
  • The first promise p0
  • The second promise p1
  • The third promise p2

The value associated with als inside 1 is set to 123 when p1 is created.
The value associated with als inside 1 is set to 321 when p2 is created.

I would have expected a design more like this:

  • The mapping of ALS keys to values is held by an "async context frame".

Again, I wouldn't get too hung up on the naming. What you call async context frame IS AsyncResource.

  • als.run() creates a new frame, inheriting the contents of the current frame, but changing the one value. The callback runs with the new frame set "current". Upon return, the thread reverts to the previous frame being current.

This is incorrect and is not what we want. als.run() neither creates a new frame nor enters one. It only sets the value associated with als's key in the current frame for the duration of the callback and that is all.

  • Promise continuations inherit the current frame from the point where .then() is called. This simply references the existing frame, there is no need to allocate a new object or copy anything, since each frame is immutable once it is constructed.

Again, a new promise is a new async context frame. Whenever a new async context frame is created, it inherits a copy of the als storage of the current async context frame. The current implementation here is written to be functional and not necessarily performant. The goal is to first match the correct behavior as modeled by Node.js (which I'm absolutely not yet convinced it currently does) and then to make it more performant afterwards. I'll work on optimizing performance once I'm sure the basic model is correct.

To summarize...

  1. What you call an async context frame is what Node.js calls an AsyncResource. Given that this is being implemented as part of the Node.js compat work, my preference is to use Node.js' terminology.
  2. There are many different types of AsyncResources: promises, timers, requests to read from a file that will invoke a later callback, etc.
  3. An AsyncResource can (and often will) be a JavaScript object (e.g. a promise, or any arbitrary user-created type).
  4. At any given time, there is a current AsyncResource.
  5. Any new AsyncResource created inherits a copy of the storage map of the current AsyncResource at the moment it is created; and must do so in a way that does not prevent the current AsyncResource from being garbage collected later.
  6. Any single AsyncResource can become the current ("entered") multiple times.

Item 5 is the key piece we need to verify here and optimize for. Verification will come as the set of tests are expanded (as I said, I'm not convinced this impl is quite right yet with regards to the promise hooks). Optimization will come after things are verified.
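
The model summarized above (copy the storage map when a resource is created, mutate and revert it in run(), re-enter a resource any number of times) can be sketched as a toy in plain JavaScript. All names here (`ToyResource`, `ToyStorage`, `runInScope`) are hypothetical and exist only for illustration; this is not the workerd or Node.js implementation.

```javascript
// Toy model only: hypothetical names, not workerd's or Node.js' classes.
let current;      // the current resource (item 4)
const stack = []; // previously current resources

class ToyResource {
  constructor() {
    // Item 5: copy the current map; keep no reference to the creator.
    this.storage = new Map(current ? current.storage : []);
  }
  // Item 6: the same resource can be entered any number of times.
  runInScope(fn) {
    stack.push(current);
    current = this;
    try {
      return fn();
    } finally {
      current = stack.pop();
    }
  }
}

current = new ToyResource(); // top-level resource with an empty map

class ToyStorage {
  run(value, fn) {
    // Mutate the current resource's map, then revert after the callback.
    const map = current.storage;
    const had = map.has(this);
    const old = map.get(this);
    map.set(this, value);
    try {
      return fn();
    } finally {
      if (had) map.set(this, old);
      else map.delete(this);
    }
  }
  getStore() {
    return current.storage.get(this);
  }
}

const als = new ToyStorage();
// Created inside run(), so it inherits a copy containing 123.
const timer = als.run(123, () => new ToyResource());
const firstTick = timer.runInScope(() => als.getStore());
const secondTick = timer.runInScope(() => als.getStore());
const topLevel = als.getStore(); // run() already reverted its change

console.log(firstTick, secondTick, topLevel); // 123 123 undefined
```

The cost this model pays is visible in the constructor: every new resource copies the whole map, whether or not any ALS value was ever set.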

To illustrate point 6, consider the example:

als.run(123, () => {
  setInterval(() => {
    console.log(als.getStore())
  }, 1000);
});

als.run() temporarily sets the current stored value to 123. setInterval(...) creates the AsyncResource (the timer), which inherits the value 123 associated with als. Whenever the timer callback is triggered, we "enter" the AsyncResource (make it current).

And this example:

const als = new AsyncLocalStorage();

const fn = als.run(123, () => AsyncResource.bind(() => console.log(als.getStore())));

als.run(321, fn);  // prints 123
als.run(321, fn);  // prints 123
als.run(321, fn);  // prints 123
als.run(321, fn);  // prints 123

The function returned by AsyncResource.bind(...) is a single AsyncResource. Each call to als.run() sets the current store value when it is called, but does not create a new frame. Instead, when fn is called, it becomes the current frame.

@kentonv
Member

kentonv commented Dec 30, 2022

James,

I think you misunderstood what I was trying to say. I am suggesting an alternate implementation which achieves the same semantics but will be much more efficient.

Your current implementation creates a new AsyncResource, and therefore creates a whole new copy of the entire ALS map, for every .then() or await. This seems excessively inefficient to me, I have a hard time imagining it performing well.

I wouldn't get too hung up on the naming. What you call async context frame IS AsyncResource.

No it isn't. I'm trying to describe a different implementation, which works differently. It's not a difference in naming, it's actually a different design.

als.run() neither creates a new frame nor enters one.

Yes it does. This is a key difference in the implementation I'm proposing. The application-visible semantics are the same, but the approach avoids the need to do any additional allocation on .then()/await.

@jasnell jasnell force-pushed the jsnell/async-context-tracking branch from 170dbda to fa30035 Compare January 3, 2023 16:13
@jasnell jasnell force-pushed the jsnell/async-context-tracking branch from d45c248 to 308e2fd Compare January 3, 2023 23:16
@jasnell
Member Author

jasnell commented Jan 3, 2023

@harrishancock @kentonv ... updated to address review feedback. Please take a look.

@jasnell jasnell force-pushed the jsnell/async-context-tracking branch from 1be71fa to 8b04de0 Compare January 4, 2023 21:34
Member

@kentonv kentonv left a comment

Thanks, this is looking a lot nicer now. Not done reviewing yet but here's a few comments.

JSG_NESTED_TYPE(AsyncResource);

if (flags.getNodeJsCompat()) {
JSG_TS_ROOT();
Contributor

@mrbbot mrbbot Jan 6, 2023

This will put AsyncLocalStorage and AsyncResource in the global typing environment, whereas we want them in a module like (assuming they're accessible via named exports):

declare module "node:async_hooks" {
  export class AsyncLocalStorage { ... }
  export class AsyncResource { ... }
}

This will need additional work in the type generation scripts.

I think we'll either want to add something like JSG_TS_MODULE("node:async_hooks") which acts like JSG_TS_ROOT(), but puts the visited types in a declare module, or we could try calling registerNodeJsCompatModules in api-encoder.

I'm happy to contribute this in a follow-up PR. 👍

Member Author

Built-ins using a JSG object do not yet support named exports. Everything is currently imported via the default export, e.g.

import { default as async_hooks } from 'node:async_hooks';

I'll be hoping to get to named export support soon (hopefully next week).

Member

@kentonv kentonv left a comment

I think I actually managed to read everything this time.

Collaborator

@mikea mikea left a comment

I took a look at everything besides async-context and async-hooks. Will take a bit to understand those.

@@ -0,0 +1,23 @@
import { default as async_hooks } from 'node:async_hooks';
const { AsyncLocalStorage, AsyncResource } = async_hooks;
Collaborator

hm. doesn't import AsyncLocalStorage, AsyncResource from "" work?

Member Author

It would if these were TypeScript modules. Built-ins backed by a jsg::Object do not yet support named exports.

@jasnell
Member Author

jasnell commented Jan 11, 2023

@kentonv ... I'd like to get this squashed down and rebased on current main. Could I ask you to take a look at the most recent fixup commit that addresses your most recent set of comments first?

Member

@kentonv kentonv left a comment

Getting closer.

@jasnell jasnell force-pushed the jsnell/async-context-tracking branch 2 times, most recently from f2c2598 to 2eee21f Compare January 11, 2023 20:11
Member

@kentonv kentonv left a comment

Would still be nice to add a test for the case of a wrapped function throwing an exception but the implementation looks good to me at this point.

@jasnell
Member Author

jasnell commented Jan 11, 2023

Would still be nice to add a test for the case of a wrapped function throwing an exception...

That's added in the internal test.
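
The internal test is not shown in this thread. As a guess at the kind of check being discussed, the sketch below uses Node.js' `node:async_hooks` to confirm that the surrounding store is intact after a function wrapped with AsyncResource.bind() throws. The `results` object and `boundThrower` name are illustrative only.

```javascript
// Runs under Node.js; a guess at the shape of such a test, not the real one.
const { AsyncLocalStorage, AsyncResource } = require('node:async_hooks');

const als = new AsyncLocalStorage();
const results = {};

// A function bound inside run(123) that records the store, then throws.
const boundThrower = als.run(123, () =>
  AsyncResource.bind(() => {
    results.insideBound = als.getStore();
    throw new Error('boom');
  }));

als.run(321, () => {
  try {
    boundThrower();
  } catch (err) {
    results.message = err.message;
  }
  // The caller's context should be restored despite the exception.
  results.afterThrow = als.getStore();
});

results.outside = als.getStore();

console.log(results);
// { insideBound: 123, message: 'boom', afterThrow: 321, outside: undefined }
```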

@jasnell jasnell force-pushed the jsnell/async-context-tracking branch 2 times, most recently from de260f3 to a97d62f Compare January 11, 2023 21:35