AsyncLocalStorage kills 97% of performance in an async environment #34493
Comments
I've tried to run this test on the latest master.

So, it's slightly better than the reported result, yet ALS (or, more precisely, the PromiseHook it installs) still has a significant overhead. I've also tried to run a stripped-down hook:

```cpp
// Minimal hook: skip all bookkeeping except allocating a new async id.
static void FastPromiseHook(PromiseHookType type, Local<Promise> promise,
                            Local<Value> parent) {
  Local<Context> context = promise->CreationContext();
  Environment* env = Environment::GetCurrent(context);
  if (env == nullptr) return;
  env->new_async_id();
}
```

and got the following result:
As you may see in this benchmark, with that hook we get around 50% of the baseline throughput.

BTW I've also experimented with internal fields (a couple of them) instead of symbol properties and got no significant performance improvement. That's mainly because, with this approach, the hook has to do more operations on average than the current implementation does.
Well, only slightly better. The throughput with ALS enabled is 8.9% of the throughput with ALS disabled, which means that 91% is lost, and not 97% (which was the consistent number I got on all machines). If only there was a way to get a hook installed just for the cases that actually need it…
Yes and that aligns with the benchmark results I got for #34523. There is still a lot to wish for in terms of the PromiseHook overhead. However, in real-world applications the overhead will be much smaller.
That's what the PromiseHook used by AsyncLocalStorage costs at the moment.
I'd consider this a somewhat unrealistic micro-benchmark, as it's awaiting a non-promise in a loop, which is naturally going to have a lot more machinery in the barrier marking and tracking than in the non-async function call it is giving to the await. I did a few runs locally, replacing the function with some more realistic variations:
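The exact variants didn't survive the copy above. As a rough, hypothetical sketch of the kind of spectrum being described, it may have looked something like this; the names, constants, and functions below are mine, not from the original comment:

```js
'use strict';
// Hypothetical reconstruction: a spectrum from "almost free" to real async work.
const variants = {
  nonPromise: () => 42,                                        // what the issue's test awaited
  resolvedPromise: () => Promise.resolve(42),                  // already-settled promise
  immediate: () => new Promise((r) => setImmediate(r)),        // defers to the event loop
  timer: () => new Promise((r) => setTimeout(r, 1)),           // closest to real I/O
};

// Run each variant through the same timed loop and count completed awaits.
async function measure(fn, ms) {
  const end = Date.now() + ms;
  let iterations = 0;
  while (Date.now() < end) {
    await fn();
    iterations++;
  }
  return iterations;
}

(async () => {
  for (const [name, fn] of Object.entries(variants)) {
    console.log(name, await measure(fn, 1000));
  }
})();
```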
Most promises would be I/O-related and would sit somewhere between the last two tests. It's fair to want the performance to be better, but the path your code tested is actually quite different, according to the spec, from what an actual, typical promise-based await would look like. I don't think it's too helpful to rely on micro-benchmarks, especially with a system this complex. I do think there may be some optimization potential here, though.
I agree that micro-benchmarks do not present the whole picture. So if the hit in the micro-benchmark was 20%, that would be acceptable. But that's not the case.
As I said, what you were testing was not async. The performance difference would almost never look like that in reality because an actual async promise has a bunch more overhead than the non-async test function you awaited in your test code.
So you're saying the numbers I've posted are not bad enough?
@Qard thanks for these experiments. I also agree that ALS is meant to be used in real-world applications and it makes sense to benchmark it in a different way. Micro-benchmarks are certainly valuable for various optimizations; however, web applications do more than what's done in the original snippet, and ALS' overhead will be much lower for them. @danielgindi however, this doesn't mean that the overhead won't be improved.
They are bad. However, it is very rare for a real application to sit in a tight loop awaiting work that completes synchronously. By using a 2ms timer to simulate some actual async work:

```js
const { AsyncLocalStorage } = require('async_hooks');
const asyncLocalStorage = new AsyncLocalStorage();
const { promisify } = require('util')
let fn = promisify(setTimeout).bind(null, 2);
let runWithExpiry = async (expiry, fn) => {
let iterations = 0;
while (Date.now() < expiry) {
await fn();
iterations++;
}
return iterations;
};
(async () => {
console.log(`Performed ${await runWithExpiry(Date.now() + 100, fn)} iterations to warmup`);
asyncLocalStorage.run({}, () => {});
let withAls = await runWithExpiry(Date.now() + 10000, fn);
console.log(`Performed ${withAls} iterations (with ALS enabled)`);
asyncLocalStorage.disable();
let withoutAls = await runWithExpiry(Date.now() + 10000, fn);
console.log(`Performed ${withoutAls} iterations (with ALS disabled)`);
console.log('ALS penalty: ' + Math.round((1 - (withAls / withoutAls)) * 10000) / 100 + '%');
})();
```

We get:
This is totally acceptable and well within range.
@mcollina I get that. Although some async operations may take less than 2ms, while still being async and taking advantage of libuv threads. I'm going to do a wet test today on a large-scale application that has all imaginable scenarios implemented, and that should serve hundreds of millions of users. I'll let you guys know how it goes :-)
Since #36394 has landed and shipped in v16.2.0, I tried running the original script with that release. Here is the result:

The ALS penalty goes from 96.8% down to 49.42% here. The script from #34493 (comment) (the 2ms `setTimeout` one) now gives the following:

The penalty goes from 4.23% down to 1.75% here. The improvement is quite dramatic (thanks to @Qard), especially considering that the original script does almost nothing on the hot path but creating promises. I think this issue can now be closed.
This is great! A really dramatic improvement. 😀
What steps will reproduce the bug?
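The original reproduction script did not survive in this copy. Judging from the discussion above (awaiting a non-promise in a tight loop, with and without an active AsyncLocalStorage), it was roughly along the following lines; the exact code, constants, and output format here are assumptions, not the original script:

```js
// Hypothetical reconstruction of the reported benchmark, not the actual script.
const { AsyncLocalStorage } = require('async_hooks');

const als = new AsyncLocalStorage();
const work = () => 42; // a non-promise value, awaited in a tight loop

// Count how many awaits complete within the given time budget.
async function countIterations(ms) {
  const end = Date.now() + ms;
  let iterations = 0;
  while (Date.now() < end) {
    await work();
    iterations++;
  }
  return iterations;
}

(async () => {
  const withoutAls = await countIterations(5000);
  als.run({}, () => {}); // the first run() enables the underlying PromiseHook
  const withAls = await countIterations(5000);
  console.log('iterations without ALS:', withoutAls);
  console.log('iterations with ALS:   ', withAls);
  console.log('penalty:', Math.round((1 - withAls / withoutAls) * 10000) / 100 + '%');
})();
```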
Output:
How often does it reproduce? Is there a required condition?
In any case that `await` is used.

What is the expected behavior?
Should be around 10% penalty.
What do you see instead?
I see 97% reduction in performance.
Additional information
I've played in the past with a Zone polyfill of my own (`zone-polyfill`), and at the beginning I tried using the stack traces that are generated in recent V8 versions, where they keep their context after `await` calls. Combining that with the basic technique for tracking context, I was able to track the zone, but due to the string nature of the stack traces I had to map them in memory, and there was no `WeakMap` available for string keys. Using a NAN solution I got to about a 15% penalty.

Now, assuming that these capabilities are available on the C++ side and much more natively, with the option to directly leverage the compiler instead of generating stack traces to query for info, I'd assume at most a 10% penalty for a native implementation.
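For reference, a minimal sketch (not from the original report) of the context tracking across `await` that AsyncLocalStorage provides, using only the documented API:

```js
// Minimal sketch of AsyncLocalStorage context propagation across await.
const { AsyncLocalStorage } = require('async_hooks');

const als = new AsyncLocalStorage();
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function handler() {
  await sleep(10);
  // The store passed to run() is still visible after the await;
  // this propagation is what the PromiseHook overhead pays for.
  console.log('request', als.getStore().id, 'still sees its context');
}

als.run({ id: 1 }, handler);
als.run({ id: 2 }, handler);
```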