Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Updated undici instrumentation to fix crash with trying to calculate exclusive duration on a segment that no longer exists #2884

Merged
merged 1 commit into from
Jan 16, 2025

Conversation

bizob2828
Copy link
Member

Description

v12.11.0 refactored our context manager and separated child segments from the current segment and stored them on the trace. Undici instrumentation had a bug when relying on the undici_async_tracking that took advantage of async_hooks executionAsyncResource. It was keeping a reference to transaction and parent segment that existed from a previous request. I suspect this has to do with the keepAlive logic in undici where it'd return the same async id for different requests.

When the trace for the new undici segment was being finalized it tried to calculate the exclusive time of the undici segment and it crashed with:

TypeError: Cannot destructure property 'children' of 'node' as it is null.
    at ExclusiveCalculator.process (/opt/application/node_modules/newrelic/lib/transaction/trace/exclusive-time-calculator.js:22:13)
    at TraceSegment.getExclusiveDurationInMillis (/opt/application/node_modules/newrelic/lib/transaction/trace/segment.js:248:16)
    at externalRecorder (/opt/application/node_modules/newrelic/lib/metrics/recorders/http_external.js:13:31)
    at Transaction.record (/opt/application/node_modules/newrelic/lib/transaction/index.js:680:23)
    at Transaction.end (/opt/application/node_modules/newrelic/lib/transaction/index.js:212:10)
    at ServerResponse.instrumentedFinish (/opt/application/node_modules/newrelic/lib/instrumentation/core/http.js:207:15)
    at Object.onceWrapper (node:events:638:28)
    at ServerResponse.emit (node:events:536:35)
    at AsyncLocalStorage.run (node:internal/async_local_storage/async_hooks:91:14)
    at AsyncLocalContextManager.runInContext (/opt/application/node_modules/newrelic/lib/context-manager/async-local-context-manager.js:58:38)

That was because its transaction and parent segments were from a previous request. The introduction of the undici_async_tracking was an attempt to fix an issue where if you had concurrent undici requests within a transaction the 2nd request's parent was the first and not the parent of both requests. This causes a slight inaccuracy in the parent/child relationship in a trace and transaction trace. There's no good way to fix this because undici relies on diagnostic channel and its behavior isn't the same as our traditional monkey patching instrumentation.

I also add more assertions to existing tests around concurrent requests with a note stating that the parentId of the 2nd undici segment is the first request for a reason. Lastly, I added a test that reproduces the issue that this is fixing where two concurrent transactions are occurring and this is where the state confliation was existing in the old instrumentation.

Lastly, I "released" the feature flag for undici_async_tracking as there is no longer code wired up to support it.

Related Issues

Closes #2881

…late exclusive duration on a segment that no longer exists
Copy link

codecov bot commented Jan 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.35%. Comparing base (dd30ad7) to head (5ce4f1b).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2884      +/-   ##
==========================================
- Coverage   97.40%   97.35%   -0.05%     
==========================================
  Files         308      308              
  Lines       47369    47330      -39     
==========================================
- Hits        46138    46077      -61     
- Misses       1231     1253      +22     
Flag Coverage Δ
integration-tests-cjs-18.x 73.18% <100.00%> (-0.02%) ⬇️
integration-tests-cjs-20.x 73.18% <100.00%> (-0.02%) ⬇️
integration-tests-cjs-22.x 73.20% <100.00%> (-0.03%) ⬇️
integration-tests-esm-18.x 49.83% <30.00%> (-0.01%) ⬇️
integration-tests-esm-20.x 49.84% <30.00%> (-0.01%) ⬇️
integration-tests-esm-22.x 49.88% <30.00%> (-0.01%) ⬇️
unit-tests-18.x 89.11% <100.00%> (-0.01%) ⬇️
unit-tests-20.x 89.11% <100.00%> (-0.01%) ⬇️
unit-tests-22.x 89.12% <100.00%> (-0.01%) ⬇️
versioned-tests-18.x 79.19% <100.00%> (-0.21%) ⬇️
versioned-tests-20.x 79.20% <100.00%> (-0.21%) ⬇️
versioned-tests-22.x 79.21% <100.00%> (-0.21%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@jsumners-nr jsumners-nr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this over Zoom. 👍

@bizob2828 bizob2828 merged commit 3b7e4bf into newrelic:main Jan 16, 2025
28 of 29 checks passed
@bizob2828 bizob2828 deleted the fix-undici-instrumentation branch January 16, 2025 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TypeError: Cannot destructure property 'children' of 'node' as it is null.
2 participants