Zones language proposal and Node.js #4

rvagg · 2016-06-09T13:33:00Z

Zones is a JavaScript language feature proposal that:

is in stage 0 of the TC39 process, and is getting formalized and fleshed out in preparation for further advancement. It was originally presented at the January 2016 TC39 meeting.

They

are meant to help with the problems of writing asynchronous code. They are designed as a primitive for context propagation across multiple logically-connected async operations.

This is an area that we have a lot of opinions as a group and it would be good to focus some effort here on identifying Node.js-specific concerns as this proposal makes its way through TC39. From the discussion so far I doubt very much whether we will come to a unified position, but the discussion will be helpful for identifying the major areas where there are concerns or whether there is strong interest.

There is a lot of discussion already at the zones repo, including this one on adding error handling that is causing some consternation because of the overlap with Domains and the PTSD that many collaborators have with that.
Trevor did a nice postmortem of Domains if you need a reminder about why it was deprecated: https://github.com/nodejs/node/blob/master/doc/topics/domain-postmortem.md
This is a discussion that arises out of calls for in-built execution context hook functionality: continuation-local-storage without domains? diagnostics#57 (think continuation-local-storage, also the opinions I dropped regarding the need for this).
The AsyncWrap EP is in progress @ AsyncWrap public API proposal node-eps#18 and is worth reviewing because there is some overlap in scope.

Perhaps we should start by discussing:

Key areas of concern that folks have with Zones
Why this kind of functionality is important for users and what it adds that we can't already do with JavaScript and Node.js specifically

/cc @Fishrock123 @trevnorris @bmeck @ofrobots

benjamingr · 2016-06-09T14:05:16Z

Thanks for keeping track of this. I've been keeping an eye and @trevnorris has been providing excellent feedback and outlining problems.

I think it's important to point out in bold letters that zones are entirely opt in. As a platform Node can adopt them or not.

Even if Node core doesn't - frameworks like express can still use them and benefit from them and in fact a big motivator for zone inclusion is Angular 2 in the browser which is not platform aware.

That said, I think Node should keep a close eye on the proposal so that we end up with something that we can inter are in core if we want to.

bmeck · 2016-06-09T14:13:10Z

@benjamingr they have opt in entry point, they don't have a clear opt out point is one of the topics. Should express start to use them, all JS code will be within that zone transparently unless someone manually resets the zone to a different zone (yes, the proposal has a universal unescapable zone).

benjamingr · 2016-06-09T16:35:16Z

I understand, and express couldn't build error handling over it since across the "stack" there are core functions as well as user ones and they won't clean up nicely. This is why promises are nice - they don't have that problem (hides).

That's the point @trevnorris brought up in the domain postmortem and why error handling this way won't work.

What will work is things that do not modify the flow such as telemetry, metrics and tooling which is basically async_wrap.

If node is to support zones in any way we need async_wrap, not domains 2.0

trevnorris · 2016-06-10T19:30:21Z

Even if Node core doesn't - frameworks like express can still use them and benefit from them and in fact a big motivator for zone inclusion is Angular 2 in the browser which is not platform aware.

Here's what is missing. For anyone to use Zones node will have to integrate them into the API. At every I/O junction we'll have to store Zone.current and then run all callbacks in zone.run() whether or not anyone is using them. Because that's the way they are. It's all or nothing.

benjamingr · 2016-06-10T19:55:55Z

Here's what is missing. For anyone to use Zones node will have to integrate them into the API. At every I/O junction we'll have to store Zone.current and then run all callbacks in zone.run() whether or not anyone is using them. Because that's the way they are. It's all or nothing.

I believe Zone.current gets stored automatically between async invocations - that's why zones are a language proposal.

trevnorris · 2016-06-10T20:52:45Z

@benjamingr But how does v8 know there's an async invocation? Because we tell it there's one. I've had a long discussion with Domenic about this. The onus is on implementors to wrap their internals with Zone support. Here's a basic example (leaving out some unrelated bits):

fs.open = function(path, flags, mode, callback_) {
  // ...
  var req = new FSReqWrap();
  req.oncomplete = callback;
  req.zone = Zone.current;  // Store current zone for later

Then from MakeCallback we would need to run callback in zone.run().

Here's another example as it relates to the event emitter:

net.createServer((c) => {
  const z = Zone.current.fork({ name: 'foo', errorHandler: () => true });
  // Events must store the Zone in which they were called.
  z.run(() => {
    z.on('data', chunk => {});
  });
});

Now, unlike domains, the Zone propagates with any API call. So the callback passed to 'data' must run in its own applicable domain. Requiring we change our _addListener() code to be something like
this:

  listener = { listener, zone: Zone.current };
  // OR
  listener = Zone.current.wrapGuarded(listener);

ofrobots · 2016-06-10T22:28:06Z

• Why this kind of functionality is important for users and what it adds that we can't already do with JavaScript and Node.js specifically

APM is one specific use-case that would greatly benefit from async context propagation. Basically, I want to be able to track async work. AsyncWrap helps here greatly. The one place where AsyncWrap doesn't (can't) help, is the problem of what I call user-space-queuing problem. Others (@piscisaureus) call this the 'pooling problem'. It is common for various ecosystem modules to take an async callback, put it in a pool/queue, and then call them back later in a batch. Good examples of this would be mysql, mongodb, redis, bluebird, grpc.

Since the callbacks happen in 'batches', anyone tracking the async chain loses context at this point. On whose behalf is the callback being processed? The de-facto solution for this today is that every APM vendor monkey patches all the known such user-space queueing modules and insert appropriate Function.bind calls at the appropriate points to save and restore the context that they need to track. This is extremely brittle as the monkey patching may be needed on deep private internals of such modules. Semver-Minor or even server-patch level updates of these modules might break the async context tracking tools. It is unknown how many such modules exist that need monkey patching for an APM implementaiton to work correctly. It is not possible to have the full list of such modules. APM vendors also monkey patch Node-core, but that need will go away with AsyncWrap.

Another way of thinking about this would be to imagine if you were to write a long stack traces module. The stack reported by such a module would somtimes be incorrect, if the user is using mongodb, reds, mysql, grpc, etc. The stack trace generated by the callback is placed on top of the queued location instead of the original queuing location.

To solve this problem, at the end of the day, what you need is a mechanism available to the language implementation (engine), the host environment (Node), third party modules (mysql, etc.) and the user-space code to track async flow. All of these actors may be involved in async 'hops'. A solution like AsyncWrap provides the ability to track async flow for the hops that are done by the host environment, but it doesn't help module and user-space code that create their own hops. We further need to get the same mechanism available to the engine. Furthermore it would be good that the concept of async flow is shared, and works correctly, between frontend and backend code. An engine and a host environment alone cannot do this. Userspace code has to participate to make things work. It is incredibly difficult to get userspace to agree on a proper way to do things. If domains didn't have big glaring design flaws, they might have been the convention that everyone could be forced to follow.

Anyway. Those are just two use-case that I personally care about. Others may have more use-cases.

Note that other languages that expose async programming models do offer solutions to the async context propagation problem (see this research that we did a while back).

If this wasn't a real need, people would not be continually trying to come up ways to do context propagation for JavaScript (see: Domains, zone.js, StrongLoop's zone, continuation-local-storage, etc.). Note that not all of these are about error propagation. The question isn't whether the need is real but how it is going to be solved.

I appreciate that others have suffered through the flaws of domains. I do think their position and learnings are important.

Zones have use-cases beyond just error-handling. I think most of flaws of Domains have been fixed in Zones through explicit design choices. I think gulf that remains is only due to communication difficulties: see this thread. Maybe we can setup an in-person half day meeting with some of the people to hash out the details in depth and avoid a case of people talking past each other? Perhaps some time around Node Summit? Perhaps it would also be good to have more sets of eyes to have an objective look at the problem space and the proposal (/cc @chrisdickinson).

trevnorris · 2016-06-11T05:02:31Z

Regardless of the merits of Zones, the fact that it will place a performance burden on node, even for those who are not using it, is real. I put in non-trivial effort to make sure code related to domains was not executed. Now something similar is being forced upon us in a way that doesn't allow simply not executing the relevant code when it isn't being used. For example, we have the following in this very walked code path:

  if (env->in_domain())
    object->Set(env->domain_string(), env->domain_array()->Get(0));

Since users must use our API to enable a domain we are able to toggle a single bit in a C++ class when they're activated. Making this check practically a noop. I could only hope that v8 would supply something like isolate->CurrentZoneIsRoot().

Furthermore, the fact that the Zones spec is being pushed forward despite the fact that error handling hasn't been flushed out, or even written into the existing published spec, is a red flag. In the issue @ofrobots linked I've posted two challenges that represent very real usability concerns. Neither of which have been addressed. And those are only two of the most obvious. The fact that an exception propagates along .parent, and the fact that a forked Zone can define its own parent, means that exceptions can be handled in very non-obvious ways. And in ways that can prevent the author of an application from even knowing it happened. But I won't go on about this. I'll take further discussion on the other issue.

Fishrock123 · 2016-06-17T15:33:10Z

Zones seems like a bad implementation of a feature that is otherwise useful to some people. Virtually any option for CLS (continuation-local-storage) tracking induces some perf cost, and isn't useful for everyone, so having it be opt-in is very important.

As for potential error handling (always discussed in conjunction with the idea, even if not in the proposed "Spec"), it seems like more or less a worse version of Domains...

Perhaps the API is more usable than Domains (a big problem for them), but it seems to come at a universal cost, given everything requires wrapping unconditionally. I don't think it will be acceptable to tell everyone in some future version of node "oh hey everything is a bunch slower now, too bad".

Trott · 2016-10-13T03:35:46Z

If I'm not missing something, it seems that activity on Zones has died down for now. The last activity I can find on the Zones proposal repo is from June. What's more, it seems that Trevor and Bradley were active on an issue that specifically sought comments from the Node.js community. I'm going to close this and leave this note here to let people who want to get involved in the conversation know that domenic/zones#9 might be a better place, although it's been pretty quiet over there too.

In any event, if anyone feels that closing this is the wrong move, please re-open. Thanks.

Jeff-Lewis mentioned this issue Jul 25, 2016

continuation-local-storage without domains? nodejs/diagnostics#57

Closed

Trott closed this as completed Oct 13, 2016

hashseed mentioned this issue Aug 10, 2017

Call with V8 team, #3 #156

Closed

shaunwarman mentioned this issue Dec 5, 2017

Partially fix domain memory leak krakenjs/kraken-js#499

Closed

etsuo mentioned this issue Dec 31, 2017

Zonejs Integration sakuraapi/core#56

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Zones language proposal and Node.js #4

Zones language proposal and Node.js #4

rvagg commented Jun 9, 2016

benjamingr commented Jun 9, 2016

bmeck commented Jun 9, 2016

benjamingr commented Jun 9, 2016

trevnorris commented Jun 10, 2016

benjamingr commented Jun 10, 2016

trevnorris commented Jun 10, 2016

ofrobots commented Jun 10, 2016

trevnorris commented Jun 11, 2016

Fishrock123 commented Jun 17, 2016

Trott commented Oct 13, 2016

Zones language proposal and Node.js #4

Zones language proposal and Node.js #4

Comments

rvagg commented Jun 9, 2016

benjamingr commented Jun 9, 2016

bmeck commented Jun 9, 2016

benjamingr commented Jun 9, 2016

trevnorris commented Jun 10, 2016

benjamingr commented Jun 10, 2016

trevnorris commented Jun 10, 2016

ofrobots commented Jun 10, 2016

trevnorris commented Jun 11, 2016

Fishrock123 commented Jun 17, 2016

Trott commented Oct 13, 2016