-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow main thread to be supplanted by another #38454
Comments
This is definitely a difficult problem given the relationship of That is, if we think about the current structure as a DAG, what we currently have is:
To do what you're suggesting, we would likely need to move to a model like:
From there, we could allow a parent worker to be replaced by one of the children, but as you point out it gets complicated when we ask what happens with the other child.
There are a couple of options. (a) We could terminate orphan workers automatically, (b) We could just leave them orphaned, (c) We could transfer responsibility for them to the new parent. I think we can all agree that (b) is not a great choice. We can let the user code decide between (a) and (c). For (c), the DAG would end up as:
Obviously there are a ton of complexities wrapped up in that! The key question I would have initially is: Is replacing threads really what we need? Alternatively, we could implement the ability for a Worker Thread to transfer ownership/responsibility for a child thread to a different thread. For instance, the Node.js main process thread can load the apm implementation in a separate thread, which does its thing and spawns the instrumented application code thread. Once the application code thread is spun up, the APM thread transfers ownership of that back to the main thread then exits. This, however, obviously does not provide the requirement of the spun up worker acting like it's the main thread (no piped i/o, etc) The other possibility is whether APM's could just be written to act as alternative launchers, or embedders, of the Node.js process so that they can inject the loaders the extensive code changes that would make replacing the thread possible. I know that has its own associated complexities, however. |
Well, the problem is only the main thread can act like a main thread given how worker_threads are implemented, we could also make this just be some form of allowing worker threads to look and act exactly like the main thread? I didn't want to approach it that way since things can break if running in worker threads and they don't have the proper APIs due to limitations on worker_threads.
This would seem fine to me, but still has the problem of the application living in a worker_thread environment which doesn't exactly match the main thread environment. So, this doesn't seem viable and more than other approaches.
How would this be any more able to solve the real world issues than the existing |
Theoretically couldn't you use a cjs entrypoint that calls |
@devsnek if the loader hasn't been used yet maybe (--require can call out to import)? but it also wouldn't solve for ESM entrypoints. |
As I said, it has it's own complexities ;-) ... there's not going to be an easy and simple solution to the problem, unfortunately. At this point I'm just saying we need to consider alternatives and rule them out to try to narrow in on the least complicated and difficult approach. I'm nowhere near convinced that the work necessary to make threads replaceable is it. |
I'd note that solving for "least complicated and difficult" varies wildly on what you are trying to optimize for. I think we should optimize for some usability beyond custom builds of node since this space is a fairly generic one.
As I've stated above, allowing worker threads to act exactly like the main thread would be sufficient as well. Though it would mean effectively leave a dead thread if the original isn't destroyed I'd assume. Also, I would not I am not talking about "native" threads here but conceptual ones. I could easily imagine the replace behavior spinning up an Isolate in the same native thread. |
This is an interesting approach also, and one that I've been stewing over... essentially, allowing multiple |
For multiple |
Oh very nice. I had missed that! |
What about dynamic import in an ESM entrypoint? |
@cspotcode this leads to a similar situation to how Loaders work today. Loaders actually have a separate Module Map (Registry if you like that term) from application code due to some modules being loaded in the instrumentation and having side effects. To avoid side effects like prototype pollution and/or leaking via singletons of the instrumentation code (not the globals, use --frozen-primordials for now) a separate module instance is loaded for the same URL for Loaders different from the one loaded for Application code. Overall, the problem with stating the entrypoint should just use dynamic import is similar to what requiring a CJS entrypoint is since you can use dynamic import also in CJS. |
I'm thinking that the application entrypoint can be something like this:
And the instrumentation -- whatever it may be -- will apply to the application's entire module graph, with the exception of It seems like the above should work if the instrumentation is bundled (via bundleDependencies or other solution) to avoid sharing modules with application code. Is that the issue, that instrumentation might share dependencies with the application, and thus instrumentation-loaded dependencies will not be instrumented? |
Half of that.
Yes
This is just one reason, other reasons are things like adding state in observable ways through instrumentation (e.g. how eventEmitter._events is visible and/or doing things like prepend / removing of instrumentation via APIs), or instrumentation code using singleton modules where state is stored. Honestly, in order to force things to be instrumented for application code, it isn't terrible, just make all app code marked w/ a special query parameter that you strip out of import.meta.url on each module loaded. That however, is a terrible solution due to a variety of reasons like mutating source code and/or some not having query parameter equivalents. This kind of terrible solution was done to prove service workers could in theory emulate the loader API in https://github.com/bmeck/node-sw-compat-loader-test and it is kind of a mess as can be seen. |
Is your feature request related to a problem? Please describe.
Please describe the problem you are trying to solve.
ESM in Node requires command line flags in order to use
--loader
. This is because if we try to provide an API to customize the loader, the entire module graph for the application is already linked and it is too late to do so. Common tools like APMs run on environments like cloud functions that do not support customizing CLI flags but need to muck with the loading systems.Describe the solution you'd like
Please describe the desired behavior.
When spinning up workers, we can already custom configure
--loader
since that module graph hasn't linked yet. I would like for applications to be able to do so in a way that makes the worker act exactly as the original main thread: no piping of I/O, no limitations on process, etc.This likely would be something like:
Likely this feature needs to sort out file descriptor behavior in particular when replacing the main/current thread and what happens with potentially orphaned workers.
Describe alternatives you've considered
Please describe alternative solutions or features you have considered.
None given how ESM works and links eagerly this is the only solution I've found.
The text was updated successfully, but these errors were encountered: