Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand the area of ScopeTrackedObject usage #14249

Closed
KBaichoo opened this issue Dec 2, 2020 · 6 comments · Fixed by #15314
Closed

Expand the area of ScopeTrackedObject usage #14249

KBaichoo opened this issue Dec 2, 2020 · 6 comments · Fixed by #15314
Assignees
Labels
area/server stale stalebot believes this issue/PR has not been touched recently

Comments

@KBaichoo
Copy link
Contributor

KBaichoo commented Dec 2, 2020

Expand the area of ScopeTrackedObject usage.

Currently the ScopeTrackedObject is used at the L7 stream level. The mechanism essentially registers that the dispatcher is doing some particular work.

When a Fatal Error occurs on a thread, we invoke several debug actions on the dispatcher that encountered the error, trying to also leverage the ScopeTrackedObject (what it was working on at the time) to give additional insight into the crash.

With this mechanism as is, there are many parts of the pipeline (see image below) that if we "burst" (crashed) there we'd get no additional information. Furthermore, even when we do crash at the L7 level we have room to improve what we dump -- we don't dump some useful L4 information such as IP address IIRC.

Stream Pipeline

Ideally, we should be able to crash along any point of the pipeline and get information from the earlier phase from the pipeline as well. This will likely mean extending what objects implement the ScopedTrackedObject interface, and probably changing how the dispatcher tracks ScopeTrackedObjects, perhaps as a stack vs as a single object.

Let me know if that makes sense, thanks.

@KBaichoo KBaichoo added enhancement Feature requests. Not bugs or questions. triage Issue requires triage labels Dec 2, 2020
@mattklein123
Copy link
Member

+1 this makes sense to me. I agree that we should be able to track multiple scopes and then just have them all dump at once.

@KBaichoo
Copy link
Contributor Author

KBaichoo commented Dec 3, 2020

/assign @KBaichoo

@antoniovicente
Copy link
Contributor

Tracking a stack of ScopeTrackedObjects makes sense to me. I would suggest changing the fatal action debug interface to take a list/vector of tracked objects rather than invoking the interface for each object in the stack.

@antoniovicente
Copy link
Contributor

Another possible question is if it would be possible to dump the HTTP context when there's a crash at the connection level. This could be possible if there was a way for objects to register trackers with their parent object. Note that there would have to be some indirections in the ScopeTrackedObjects registration, so that an HTTP context that is added or removed after connection processing starts is properly dumped on crash.

@mattklein123 mattklein123 added area/server and removed enhancement Feature requests. Not bugs or questions. triage Issue requires triage labels Dec 7, 2020
@github-actions
Copy link

github-actions bot commented Jan 6, 2021

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Jan 6, 2021
@github-actions
Copy link

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

htuch pushed a commit that referenced this issue Mar 3, 2021
This allows us to more easily test response crashes.

Risk Level: low (this is a test-only filter AFAIK)
Testing: unit test, integration test
Docs Changes: Included
Release Notes: Included
Platform Specific Features: N/A
Related Issue #14249

Signed-off-by: Kevin Baichoo <[email protected]>
htuch pushed a commit that referenced this issue Mar 8, 2021
The separate components where we have crash dumping support have individual unit tests, some with more mocks than others. E2E tests will build more confidence in the crash dumping pipeline.

Risk Level: Low (tests only)
Testing: Integration tests
Docs Changes: NA
Release Notes: NA
Platform Specific Features: Systems that use the crash handlers (windows doesn't currently)
Fixes #14249

Signed-off-by: Kevin Baichoo <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/server stale stalebot believes this issue/PR has not been touched recently
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants