Are user-land callback queues in *uninstrumented* modules the responsibility of the APM agent? #2498
Comments
My expectation would be option (2) as well. This is also how, for example, the Java agent deals with certain instrumentation modules. For instance, if a user disables the Vert.x instrumentation in Java, this also breaks context propagation rather than only stopping span capture.
+1 on option 2
If there is an important use case for disabling just the mysql span creation while keeping the context propagation, we could think about splitting the instrumentation into multiple parts.
For interest, Thomas thought out loud about something similar (adding more structure to
As a workaround, a user who was relying on "no mysql span creation, but mysql run-context propagation" could use a span filter to filter out the created MySQL spans after the fact. That works, but still incurs the overhead of creating those spans.
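A minimal sketch of that workaround, assuming the agent's `apm.addSpanFilter()` API and assuming that MySQL span payloads carry `type: 'db'` and `subtype: 'mysql'`. The function name `dropMysqlSpans` is hypothetical, and the registration is shown only as a comment because it needs the agent installed.

```javascript
// Hypothetical span filter: drop spans that look like MySQL spans.
// Assumption: the span payload passed to the filter has `type` and `subtype` fields.
function dropMysqlSpans (payload) {
  if (payload.type === 'db' && payload.subtype === 'mysql') {
    return null // a falsy return value tells the agent to drop this span
  }
  return payload // everything else is passed through unchanged
}

// Registration, in an application where the agent is installed and started:
//   const apm = require('elastic-apm-node').start({ /* ... */ })
//   apm.addSpanFilter(dropMysqlSpans)
```

The spans are still created and only discarded just before sending, which is the overhead mentioned above.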
[Alan (on private chat)]
To re-state: Q: Are user-land callback queues in uninstrumented modules the responsibility of the APM agent? I will add a note to the changelog for the 3.26.0 release about this and resolve this issue.
When disableInstrumentations support was added in #353 it only half-disabled these instrumentations, briefly mentioning "continuation patches". I believe this was about user-land callback queue handling that should no longer be necessary after run-context and #2430 work. In #2498 it was discussed and decided that if listed in disableInstrumentations the agent should not touch the module at all. Refs: #2498
(This stems from #2487 (comment) and the set of changes to instrumentations to change/fix some run-context handling.)
background
Many Node.js packages, especially clients for databases, support internal queuing: the user code makes one or more client requests, and the client internally queues those up, gets a connection to the database, dispatches the queued commands (serially or in parallel), and calls the given callbacks when results are in. Thomas dubbed these "user-land callback queues".
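The queuing behaviour described here can be sketched as follows. This is an illustration only: `FakeClient` and its methods are hypothetical and do not reflect the API of any real database client, and the "connection" is simulated with `setImmediate`.

```javascript
// Hypothetical client, for illustration only (not the API of any real package).
class FakeClient {
  constructor () {
    this.queue = [] // user-land callback queue: pending [command, callback] pairs
    this.connected = false
  }

  query (command, callback) {
    this.queue.push([command, callback])
    if (this.connected) this._dispatch()
  }

  connect () {
    // Simulate an asynchronous connection; flush the queue once "connected".
    setImmediate(() => {
      this.connected = true
      this._dispatch()
    })
  }

  _dispatch () {
    // Dispatch queued commands serially, in the order they were issued.
    while (this.queue.length > 0) {
      const [command, callback] = this.queue.shift()
      callback(null, `result of ${command}`)
    }
  }
}

const results = []
const client = new FakeClient()
client.query('SELECT foo', function onFoo (err, res) { results.push(res) })
client.query('SELECT bar', function onBar (err, res) { results.push(res) })
client.connect() // onFoo and onBar are only called after the connection is up
```

Note that `onFoo` and `onBar` are invoked from inside the client's own dispatch loop, not from the code path that called `query()`.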
One of the jobs of the Node.js APM agent is to track asynchronous context, so that when the result of some action (say, reading a file via `fs.readFile()`) is ready sometime later, the user code that receives the result has a "current run context" that follows from when the action was initiated.
The user-land callback queues mentioned above can break expectations. Using the MySQL example above, if the package creates its queue dispatcher when `.connect()`'ing, then the callbacks `onFoo` and `onBar` will be in the context of that `.connect()` call.
Using "play-mysql2-run-context.js" as a demo and running with the APM agent disabled, the manual spans `s1` and `s2` end up on the transaction `t0` that was active at the time the client connected, rather than the transactions active when the query calls were made.
Normally the APM agent handles this when it instruments the module, by "binding" the context to the callback.
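The effect, and what "binding" a callback does about it, can be reproduced without the agent. The sketch below is not the agent's implementation: it uses Node core's `AsyncLocalStorage` as a stand-in for the run context and `AsyncResource.bind()` as a stand-in for the agent's binding, and `FakeClient`, `runContext`, and the transaction names are hypothetical.

```javascript
const { AsyncLocalStorage, AsyncResource } = require('node:async_hooks')

// Stand-in for the agent's run context: the stored value names the "current transaction".
const runContext = new AsyncLocalStorage()

// Hypothetical client whose queue dispatcher is created in connect().
class FakeClient {
  constructor () {
    this.queue = []
    this.timer = null
  }

  connect () {
    // The dispatcher is created here, so it (and every callback it calls)
    // runs in the async context that was current at connect() time.
    this.timer = setInterval(() => {
      while (this.queue.length > 0) this.queue.shift()()
    }, 1)
  }

  query (callback) {
    this.queue.push(callback)
  }

  end () {
    clearInterval(this.timer)
  }
}

const seen = {}
const client = new FakeClient()

runContext.run('t0', () => client.connect())

// Unbound callback: inherits the dispatcher's context ('t0').
runContext.run('t1', () => {
  client.query(() => { seen.unbound = runContext.getStore() })
})

// Bound callback: AsyncResource.bind() captures the context current at bind time ('t2').
runContext.run('t2', () => {
  client.query(AsyncResource.bind(() => { seen.bound = runContext.getStore() }))
})

setTimeout(() => {
  client.end()
  console.log(seen) // seen.unbound is 't0', seen.bound is 't2'
}, 20)
```

The unbound callback reports the context of the `connect()` call, which mirrors the spans landing on `t0`. The bound callback reports the context in which the query was issued.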
the question
Now what should the APM agent do if the user disabled instrumentation for this mysql module?
Should `disableInstrumentations` mean:
The equivalent in OTel (not explicitly setting up a particular `opentelemetry-plugin-<module>`) is (2).
current state
This is coming up now because the current Elastic Node.js APM agent does user-land callback queue handling for some modules even when that module is listed in `disableInstrumentations`, but not for others. For example it does (or did) so for the `mysql`, `pg` (I think), `ioredis`, `redis`, and `mysql2` packages (sometimes buggily); but not for `express-queue`, even though its instrumentation was explicitly added for user-land callback queue handling in #373. I say "or did" because as part of the #2430 work I had (somewhat unwittingly) been removing this handling because I did not fully understand it. There was a mention of "continuation patches" in the PR that added `disableInstrumentations` support, but no details. There are also no tests for these cases, so I didn't notice the breakage.
Hence this issue to discuss it and address it consistently.
At time of writing it has been removed for mysql, pg, ioredis, and redis; and PR #2487 is open which removes it for mysql2.