Faster sync API via worker_thread #590
Comments
Oh wow what a hack. I love it. It looks like you're using the
I think this could be good to add to esbuild, assuming benchmarks show that it's a big win (I expect them to). I want to keep esbuild dependency-free so it should be implemented as a part of esbuild itself instead of as a library that esbuild uses. This shouldn't require much additional code to support so including it in esbuild is fine.
It definitely shouldn't be implemented such that it requires WebAssembly though. The WebAssembly version is up to 10x slower than the native version for a variety of reasons. Instead, it should be implemented such that it works equally well with either the native version or the WebAssembly version. Both versions use the same TypeScript library code (they just create a different executable as a child process) so this should naturally happen without any extra work.
This could be used to add sync API support to the browser too. However, I think we shouldn't do that at this time. We should keep this node-only for now. There's a big push in the browser community to avoid blocking the main thread and having this be available would likely result in people using it because it's convenient without understanding the consequences.
This could be used to add support for plugins to the
I think I'd prefer if this were exposed as a part of the existing
Keep in mind that working on this won't touch any Go code. The natural way to implement this is all in TypeScript. I'm happy to accept a PR for this, and I'm also happy to implement this myself if you were hoping to work on Go code. |
Is there any reason to suspect a service will be juggling async and sync operations during its lifetime? If not, then maybe |
Thanks for the detailed response. FWIW, I learned about this ability to block the main thread from some of the core node team in an ESM modules design thread. They experimented with it for blocking
I believe Atomics are a part of the JS standard, or browser standard, right? https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics My impression from the work I've done so far is that they need to be robust on all platforms because they're a spec-ed part of the web standards. You're concerned that old node versions will be buggy? It could be supported where available and fall back to process spawning on incompatible platforms. Though the performance disparity between new and ancient node versions might lead to confused bug reports.
Have you thought about exposing functions via
Truth be told, I'm most interested in learning about the language's internals, the runtime, how it compares to rust, how it interops with the web, stuff like that. So this is an interesting problem space, even if it only involves writing TS. Given what you said about dependencies, probably best if you implement in esbuild and I'll focus on turning what I have into a reusable library. What is the motivation for keeping esbuild dependency free? Do you bundle third-party components, or is the goal to control every line of code? Regardless, I'm very excited to start using |
This can get messy because an argument to the factory affects the type signatures of the API elsewhere. |
The type signatures are easily solved :) Just putting it out there since it may reduce the overall code needed within the service -- though probably not. Either way, just a suggestion that came to mind when reading over the thread. Thanks for sharing! |
I see what you mean, but I still prefer the approach I described. It's more a question of API design. First of all, I see the
I'm concerned that support for the atomics API might have, say, landed in later node versions for MIPS processors than it did for x64 processors. But it's not a big deal if that happened as long as it's supported now.
There is some experimental work along those lines here: #248. Right now esbuild is highly portable and easy to build. I see this as a plus. Integrating it with node adds complexity to the build process and makes it less portable, so doing this is not without drawbacks. And esbuild's primary purpose is being a bundler. The transform library use case is secondary. So I have some reservations about taking esbuild in that direction.
Sounds good! No worries. I'll add this in the next release then. The goal is pretty much to control every line of code (independent of code from the Go standard library obviously). I have several motivations:
I realize this approach is somewhat unusual, but it's not unheard of. The Rome Toolchain also deliberately has zero dependencies. They say "This allows us to develop faster and provide a more cohesive experience by integrating internal libraries more tightly and sharing concepts and abstractions. There always exist opportunities to have a better experience by having something purpose-built." |
Wait, that wouldn't work. Plugins are code and code can't be transferred between threads. |
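A small sketch of why this fails, assuming Node's `worker_threads` message ports and a purely illustrative plugin object (the `name`/`setup` shape here is a stand-in, not a confirmed API from this thread). Messages between threads are copied with the structured clone algorithm, and functions are not cloneable, so posting an object that contains one throws:

```javascript
const { MessageChannel } = require('node:worker_threads');

const { port1 } = new MessageChannel();

// Hypothetical plugin object: the important part is that it carries a function.
const plugin = { name: 'example', setup() {} };

let errorName = null;
try {
  // postMessage serializes its argument; a function cannot be serialized.
  port1.postMessage(plugin);
} catch (err) {
  errorName = err.name;
}
port1.close();

console.log(errorName);
```

Plain data (strings, numbers, arrays, plain objects, typed arrays) crosses the boundary without trouble; it is specifically the callbacks that cannot.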
Looks like it's 1.5x to 15x faster according to a quick benchmark. I tested |
I discovered Also: I'm hitting some intermittent CI hangs on Windows, so this approach may have problems on Windows. Either that or there's something iffy about my implementation. I'm going to release this approach as initially opt-in to avoid breaking people. If this approach works out then it'd be great for it to just always be enabled. |
Nice, hopefully I can try this out soon. I think I found the (potential) bug in your code. It's possible to https://github.com/cspotcode/typescript-http-support/blob/main/packages/sync_worker/src/worker.ts#L49 |
Thanks for pointing that out. I bet that's what happened. Will fix. I think there's actually potentially another race condition, one which is present in your library as well. If there are two synchronous messages, the call to I'm going to try to describe it. Given this condensed form of the code:
// Main thread
/* A */ const signalBefore = Atomics.load(sharedBufferView, 0);
/* B */ mainPort.postMessage(...);
/* C */ Atomics.wait(sharedBufferView, 0, signalBefore);
/* D */ receiveMessageOnPort(mainPort);
// Worker thread
/* X */ workerPort.postMessage(...);
/* Y */ Atomics.add(sharedBufferView, 0, 1);
/* Z */ Atomics.notify(sharedBufferView, 0);
Say the following sequence of events happens:
A fix for this is to use a separate shared buffer for each request. |
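The per-request buffer idea could be sketched roughly as follows. This is a minimal illustration and not esbuild's actual implementation: `callSync`, the message shape, the doubling "work", and the 10-second timeout are all assumptions made for the example. Each call allocates its own 4-byte `SharedArrayBuffer`, so a notification meant for one request cannot be mistaken for another's.

```javascript
const { Worker, MessageChannel, receiveMessageOnPort } = require('node:worker_threads');

// Worker side: post the reply first, then set and notify the signal that
// belongs to this particular request.
const workerSource = `
  const { workerData } = require('node:worker_threads');
  workerData.port.on('message', ({ signalBuffer, value }) => {
    const signal = new Int32Array(signalBuffer);
    workerData.port.postMessage({ result: value * 2 }); // stand-in for real work
    Atomics.store(signal, 0, 1);
    Atomics.notify(signal, 0);
  });
`;

const { port1: mainPort, port2: workerPort } = new MessageChannel();
const worker = new Worker(workerSource, {
  eval: true,
  workerData: { port: workerPort },
  transferList: [workerPort],
});
worker.unref(); // do not keep the process alive just for the worker

// Hypothetical helper: blocks the calling thread until the worker replies.
function callSync(value) {
  const signalBuffer = new SharedArrayBuffer(4); // fresh signal per request
  const signal = new Int32Array(signalBuffer);
  mainPort.postMessage({ signalBuffer, value });
  // Returns "not-equal" right away if the worker already stored 1.
  if (Atomics.wait(signal, 0, 0, 10000) === 'timed-out') {
    throw new Error('worker did not reply in time');
  }
  const received = receiveMessageOnPort(mainPort);
  if (received === undefined) throw new Error('signal fired without a message');
  return received.message.result;
}

console.log(callSync(21));
```

The cost is one small allocation per call; whether that matters next to the work being done would need measuring.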
Awesome! Funnily enough, the Atomics approach is exactly the same one I used for https://github.com/lambci/sync-threads Source here: https://github.com/lambci/sync-threads/blob/master/src/index.js I think it's probably a little too simple for your use case here though |
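The main difference I can see is that you're using postMessage to communicate data back and forth whereas I'm using the SharedArrayBuffer itself. Using the SharedArrayBuffer might be faster because it's one less copy, but it does require you to get the buffer size right. |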
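To make the trade-off concrete, here is a rough sketch of returning a reply through the shared memory itself. It does not reproduce sync-threads; the layout (a ready flag, a byte length, then a fixed payload area), the `MAX_BYTES` capacity, and the `reverseSync` helper are all assumptions for illustration. The fixed capacity is where "getting the buffer size right" shows up: a reply that does not fit has to be rejected or handled some other way.

```javascript
const { Worker } = require('node:worker_threads');

const MAX_BYTES = 1024; // assumed capacity for a reply
const shared = new SharedArrayBuffer(8 + MAX_BYTES);
const header = new Int32Array(shared, 0, 2); // [0] = ready flag, [1] = byte length
const payload = new Uint8Array(shared, 8, MAX_BYTES);

const workerSource = `
  const { workerData, parentPort } = require('node:worker_threads');
  const header = new Int32Array(workerData.shared, 0, 2);
  const payload = new Uint8Array(workerData.shared, 8);
  parentPort.on('message', (text) => {
    // Stand-in for real work: reverse the string.
    const bytes = new TextEncoder().encode(text.split('').reverse().join(''));
    if (bytes.length > payload.length) {
      Atomics.store(header, 1, -1); // reply does not fit
    } else {
      payload.set(bytes);
      Atomics.store(header, 1, bytes.length);
    }
    Atomics.store(header, 0, 1);
    Atomics.notify(header, 0);
  });
`;

const worker = new Worker(workerSource, { eval: true, workerData: { shared } });
worker.unref();

// Hypothetical helper: the request still travels via postMessage,
// but the reply is read straight out of the shared buffer.
function reverseSync(text) {
  Atomics.store(header, 0, 0); // reset the ready flag
  worker.postMessage(text);
  if (Atomics.wait(header, 0, 0, 10000) === 'timed-out') {
    throw new Error('worker did not reply in time');
  }
  const length = Atomics.load(header, 1);
  if (length < 0) throw new Error('reply larger than the shared buffer');
  // slice() copies into ordinary memory before decoding.
  return new TextDecoder().decode(payload.slice(0, length));
}

console.log(reverseSync('abc'));
```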
Rather than using a separate buffer and channel, I would wrap Atomics.wait
in a loop that waits for the value to be incremented. So being woken up is
not sufficient to trigger receiving a message. If the value has not
incremented, the thread will Atomics.wait again.
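A rough sketch of that loop, reusing the names from the condensed code earlier in the thread (`sharedBufferView`, `mainPort`); the `callSync` helper and the upper-casing "work" are made up for the example. Whether this closes every interleaving discussed above is not established here; it only shows the shape of the suggestion.

```javascript
const { Worker, MessageChannel, receiveMessageOnPort } = require('node:worker_threads');

const sharedBuffer = new SharedArrayBuffer(4);
const sharedBufferView = new Int32Array(sharedBuffer);

const workerSource = `
  const { workerData } = require('node:worker_threads');
  const view = new Int32Array(workerData.sharedBuffer);
  workerData.port.on('message', (value) => {
    workerData.port.postMessage(value.toUpperCase()); // stand-in for real work
    Atomics.add(view, 0, 1);
    Atomics.notify(view, 0);
  });
`;

const { port1: mainPort, port2: workerPort } = new MessageChannel();
const worker = new Worker(workerSource, {
  eval: true,
  workerData: { port: workerPort, sharedBuffer },
  transferList: [workerPort],
});
worker.unref();

// Hypothetical helper: being woken up is not treated as proof of a reply;
// only a changed counter is.
function callSync(value) {
  const signalBefore = Atomics.load(sharedBufferView, 0);
  mainPort.postMessage(value);
  while (Atomics.load(sharedBufferView, 0) === signalBefore) {
    Atomics.wait(sharedBufferView, 0, signalBefore);
  }
  return receiveMessageOnPort(mainPort).message;
}

console.log(callSync('hello'));
```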
…On Sat, Dec 12, 2020, 8:30 AM Michael Hart ***@***.***> wrote:
The main difference I can see is that you're using postMessage to
communicate data back and forth whereas I'm using the SharedArrayBuffer
itself. Using the SharedArrayBuffer might be faster because it's one less
copy, but it does require you to get the buffer size right.
|
FYI I am reverting the introduction of |
Ah bummer. I filed this bug report with node, which may be related to a root cause. |
The same message that gives the worker its https://github.com/nodejs/node/blob/master/lib/internal/main/worker_thread.js#L98-L131 I'm not saying these hacks and workarounds should be released to production, but it's useful to understand the problem more clearly. There may be an appetite for this feature to be available under an |
I have been playing around with a babel-node inspired approach that uses esbuild for running scripts in my codebase. In an effort to get it to work with source maps, yarn pnp and support for Here are some stats that might interest you @evanw and that have kept me on version 0.8.2 for a while:
|
I hope it's ok to keep peppering relevant information here. If it feels spammy, I can stop and consolidate it somewhere. One tl;dr from nodejs/node#36531 is that worker_threads inherit I also had a thought: users may want to consume the esbuild API from within their own worker thread, so checking nodejs/node#36531 means that |
Thanks for the heads up. This is clearly a performance regression in 0.8.4. I'll get it fixed in the next release regardless of what ends up happening with Edit: I just did a similar performance test with a large code base I have access to. Here are my performance results:
Looks like using yarn will slow down your code, presumably due to the custom path resolution. Just a heads up in case you're trying to get your code to go as fast as possible. |
I'm brand new to esbuild, but am looking for excuses to cut my teeth on more golang projects.
I see from various issues and pull requests that esbuild's sync APIs spawn a new external process for each transformation call, because this is the only way to synchronously pipe data to the external process and wait for a response.
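For reference, the process-per-call pattern looks roughly like this. The sketch uses Node itself as a stand-in child that upper-cases its stdin; it does not invoke the esbuild binary, and `transformSync` here is a made-up helper, not esbuild's function of the same name. The point is only that `spawnSync` can pipe input in and wait for output, at the cost of starting a process every time.

```javascript
const { spawnSync } = require('node:child_process');

// Stand-in child program: read all of stdin, write it back upper-cased.
const childScript = `
  let input = '';
  process.stdin.setEncoding('utf8');
  process.stdin.on('data', (chunk) => { input += chunk; });
  process.stdin.on('end', () => { process.stdout.write(input.toUpperCase()); });
`;

// Hypothetical helper: one child process per call, blocking until it exits.
function transformSync(source) {
  const child = spawnSync(process.execPath, ['-e', childScript], {
    input: source,
    encoding: 'utf8',
  });
  if (child.error) throw child.error;
  if (child.status !== 0) throw new Error(child.stderr);
  return child.stdout;
}

console.log(transformSync('let x = 1'));
```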
To reduce the per-transformation overhead, I'm wondering if either of these solutions is a) possible and b) something I can work on as a pull request:
worker_thread to perform async communication with an esbuild process, blocking the main thread. I've implemented code that blocks the main thread while doing async stuff in a worker, so I could publish that as a reusable library and use it in esbuild. (Here is my code, specifically the sync_worker package: https://github.com/cspotcode/typescript-http-support) I imagine usage would look like this pseudocode. (Forgive me if this is obvious; as I say, I'm brand new to esbuild.)
EDIT: I see there is already a WebAssembly package. In that case, I can implement option (1) using the WebAssembly build.