-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sync: avoid creating mutex and rwlock resource_span with parent, use None parent instead #6107
sync: avoid creating mutex and rwlock resource_span with parent, use None parent instead #6107
Conversation
OnceCell, Semaphore etc probably also have the same issue, IMO it's a good idea to also change that in this PR |
Yeah, if possible I'd like to fix all instances of this. I don't think we have many tests for the tracing stuff.
If you want to add tests, then those are always welcome. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Alice and Jiahao–can you make changes to tokio::sync:OnceCell
, tokio::sync::Semaphore
, and tokio::sync::Barrier
similar to my suggestions on tokio::sync::RwLock
/tokio::sync::Mutex
?
@xuorig On the subject of tests, I know that @hds was looking at adding tests for resources in Tokio using |
@davidbarsky absolutely, this can wait. I've added a mock currently but happy to swap it out later 👍 |
5bd0a2f
to
5a74dd0
Compare
I just committed a test using tracing-mock before realizing this PR is introducing it more generally. Will wait for it to be merged and modify this PR. |
The other PR is merged now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some prototyping with tokio-console (the console-subscriber specifically) and discussions with @hawkw and @davidbarsky, I believe we should not add follows_from
relationship to the resource span (and for the same reasons, I don't plan to add it to the task span either).
TLDR: I propose that this change should only add parent: None
to these resource spans and make no other changes.
The reason for this is a bit long, but I felt it made sense to write it out so that the justification is available when people come looking at this change.
The reason is that using follows_from
with the current span is a bit delicate. You're using the global current span, but specific tracing-subscriber Layer
s may not be recording that span (it could be filtered out) and this results in the layer never receiving a call to on_follows_from
.
Here is an example (I'm illustrating based on task spans, but the same is the case for the sleep resource):
async fn spawn_parent() {
tokio::task::Builder::new()
.name("parent_ends")
.spawn(parent_spawns_child())
.unwrap();
}
#[tracing::instrument]
async fn parent_spawns_child() {
tokio::task::Builder::new()
.name("child_continues")
.spawn(async {
loop {
tokio::time::sleep(Duration::from_secs(100)).await;
}
})
.unwrap();
tokio::time::sleep(Duration::from_secs(5)).await;
}
Because of the #[tracing::instrument]
on parent_spawns_child()
, the span hierarchy looks like this:
- runtime.spawn[task.name=parent_ends] (id=1)
- parent_spawns_child (id=274877906945)
- runtime.spawn[task.name=child_continues] (id=274877906946)
And that parent_spans_child
span in the middle means that a Layer
which is only looking for tokio instrumentation (i.e. console-subscriber) will never receive a call to on_follow_from
. If I take the #[tracing::instrument]
out, then it will.
This means that the console-subscriber (or any subscriber layer) won't be able to rely on follows_from
unless it registers for all spans which are active. On the other side, the behavior which we proposed here (providing a "weak" connection to the context within which a resource or task is created) can be safely done within the subscriber layer itself, as the layer has the current span stack from its point of view (containing only spans that it is aware of) and in the example above, the console-subscriber would not see the parent_spans_child
span and could itself keep a connection between the two spans that won't cause the parent span to remain open longer than it should.
f20a43f
to
1578ab8
Compare
PR updated:
|
…arent instead A child span stored on sync primitives can keep the parent span open, unable to be closed by subscribers due to the sync resource referencing it. Fixes: tokio-rs#6106
1578ab8
to
fae8768
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but i'll let @hds confirm that it's correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! Thanks for your work on this change!
I will just note here that the tests (which are great and something very valuable to add), don't test the change, which is to say they're not testing that the resource spans are being created as explicit roots (parent: None
). However, there's actually a defect in tracing-mock which means that you can't test for that right now anyway (tokio-rs/tracing#2440).
Once that defect gets fixed, these tests can be updated, but for now we have significantly more test coverage than before, so let's go ahead and merge this change!
@hds one question about this, just for my own learnings:
I was under the impression they did: .new_span(mutex_span.clone().with_explicit_parent(None)) Now I understand there's a defect with the above, but are you saying this wasn't the way to do it in the first place? |
@xuorig Your expectations are correct. But they're not actually testing what you want them to because there is no contextual parent span when the resource gets created. To complete the test, we would need to do something like the following (for the #[tokio::test]
async fn test_barrier_creates_span() {
let barrier_span = expect::span()
.named("runtime.resource")
.with_target("tokio::sync::barrier");
let size_event = expect::event()
.with_target("runtime::resource::state_update")
.with_fields(expect::field("size").with_value(&1u64));
let arrived_event = expect::event()
.with_target("runtime::resource::state_update")
.with_fields(expect::field("arrived").with_value(&0));
let contextual_parent = expect::span().named("contextual parent");
let (subscriber, handle) = subscriber::mock()
// We need to expect our contextual parent span is created and entered
.new_span(contextual_parent.clone())
.enter(contextual_parent)
.new_span(barrier_span.clone().with_explicit_parent(None))
.enter(barrier_span.clone())
.event(size_event)
.event(arrived_event)
.exit(barrier_span.clone())
.drop_span(barrier_span)
.run_with_handle();
{
let _guard = tracing::subscriber::set_default(subscriber);
{
// We enter into a span so that the barrier will have a contextual parent.
let _span = tracing::info_span!("contextual parent").entered();
let _barrier = sync::Barrier::new(1);
}
}
handle.assert_finished();
} If you run this test on current |
In Tokio, tasks are optionally instrumented with tracing spans to allow analysis of the runtime behavior to be performed with tools like tokio-console. The span that is created for each task gets currently follows the default tracing behavior and has a contextual parent attached to it based on the span that is actual when `tokio::spawn` or similar is called. However, in tracing, a span will remain "alive" until all its children spans are closed. This doesn't match how spawned tasks work. A task may outlive the context in which is was spawned (and frequently does). This causes tasks which spawn other - longer living - tasks to appear in `tokio-console` as having lost their waker when instead they should be shown as completed (tokio-rs/console#345). It can also cause undesired behavior for unrelated tracing spans if a subscriber is receiving both the other spans as well as Tokio's instrumentation. To fix this mismatch in behavior, the task span has `parent: None` set on it, making it an explicit root - it has no parent. The same was already done for all spans representing resources in #6107. This change is made within the scope of #5792. Due to a defect in the currently available `tracing-mock` crate, it is not possible to test this change at a tracing level (tokio-rs/tracing#2440). Instead, a test for the `console-subscriber` has been written which shows that this change fixes the defect as observed in `tokio-console` (tokio-rs/console#490).
In Tokio, tasks are optionally instrumented with tracing spans to allow analysis of the runtime behavior to be performed with tools like tokio-console. The span that is created for each task gets currently follows the default tracing behavior and has a contextual parent attached to it based on the span that is actual when `tokio::spawn` or similar is called. However, in tracing, a span will remain "alive" until all its children spans are closed. This doesn't match how spawned tasks work. A task may outlive the context in which is was spawned (and frequently does). This causes tasks which spawn other - longer living - tasks to appear in `tokio-console` as having lost their waker when instead they should be shown as completed (tokio-rs/console#345). It can also cause undesired behavior for unrelated tracing spans if a subscriber is receiving both the other spans as well as Tokio's instrumentation. To fix this mismatch in behavior, the task span has `parent: None` set on it, making it an explicit root - it has no parent. The same was already done for all spans representing resources in #6107. This change is made within the scope of #5792. Due to a defect in the currently available `tracing-mock` crate, it is not possible to test this change at a tracing level (tokio-rs/tracing#2440). Instead, a test for the `console-subscriber` has been written which shows that this change fixes the defect as observed in `tokio-console` (tokio-rs/console#490).
Bumps tokio from 1.34.0 to 1.35.0. Release notes Sourced from tokio's releases. Tokio v1.35.0 1.35.0 (December 8th, 2023) Added net: add Apple watchOS support (#6176) Changed io: drop the Sized requirements from AsyncReadExt.read_buf (#6169) runtime: make Runtime unwind safe (#6189) runtime: reduce the lock contention in task spawn (#6001) tokio: update nix dependency to 0.27.1 (#6190) Fixed chore: make --cfg docsrs work without net feature (#6166) chore: use relaxed load for unsync_load on miri (#6179) runtime: handle missing context on wake (#6148) taskdump: fix taskdump cargo config example (#6150) taskdump: skip notified tasks during taskdumps (#6194) tracing: avoid creating resource spans with current parent, use a None parent instead (#6107) tracing: make task span explicit root (#6158) Documented io: flush in AsyncWriteExt examples (#6149) runtime: document fairness guarantees and current behavior (#6145) task: document cancel safety of LocalSet::run_until (#6147) #6001: tokio-rs/tokio#6001 #6107: tokio-rs/tokio#6107 #6144: tokio-rs/tokio#6144 #6145: tokio-rs/tokio#6145 #6147: tokio-rs/tokio#6147 #6148: tokio-rs/tokio#6148 #6149: tokio-rs/tokio#6149 #6150: tokio-rs/tokio#6150 #6158: tokio-rs/tokio#6158 #6166: tokio-rs/tokio#6166 #6169: tokio-rs/tokio#6169 #6176: tokio-rs/tokio#6176 #6179: tokio-rs/tokio#6179 #6189: tokio-rs/tokio#6189 #6190: tokio-rs/tokio#6190 #6194: tokio-rs/tokio#6194 Commits 92a3455 chore: prepare Tokio v1.35.0 (#6197) 1968565 chore: use relaxed load for unsync_load (#6203) c9273f1 sync: improve safety comments for WakeList (#6200) e05d0f8 changelog: fix missing link for 1.8.2 (#6199) debcb22 Revert "net: add SocketAddr::as_abstract_namespace (#6144)" (#6198) 83b7397 io: drop the Sized requirements from AsyncReadExt.read_buf (#6169) 3991f9f docs: fix typo in 'tokio/src/sync/broadcast.rs' (#6182) 48c0e62 chore: use relaxed load for unsync_load on miri (#6179) d561b58 taskdump: skip notified tasks during taskdumps (#6194) 3a4aef1 runtime: reduce the lock contention in task spawn (#6001) Additional commits viewable in compare view Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase. Dependabot commands and options You can trigger Dependabot actions by commenting on this PR: @dependabot rebase will rebase this PR @dependabot recreate will recreate this PR, overwriting any edits that have been made to it @dependabot merge will merge this PR after your CI passes on it @dependabot squash and merge will squash and merge this PR after your CI passes on it @dependabot cancel merge will cancel a previously requested merge and block automerging @dependabot reopen will reopen this PR if it is closed @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
When instrumenting resources in Tokio, a span is created for each resource. Previously, all resources inherited the currently active span as their parent (tracing default). However, this would keep that parent span alive until the resource (and its span) were dropped. This is often not correct, as a resource may be created in a task and then sent elsewhere, while the originating task ends. This artificial extension of the parent span's lifetime would make it look like that task was still alive (but idle) in any system reading the tracing instrumentation in Tokio, for example Tokio Console as reported in tokio-rs/console#345. In #6107, most of the existing resource spans were updated to make them explicit roots, so they have no contextual parent. However, 2. were missed: - `Sleep` - `BatchSemaphore` This change alters the resource spans for those 2 resources to also make them explicit roots. (so that the span doesn't have a parent). This is necessary, because otherwise a resource that is created inside a task (or some other span) and then sent elsewhere will keep that contextual parent span open (but not necessarily active) for the lifetime of the resource itself.
When instrumenting resources in Tokio, a span is created for each resource. Previously, all resources inherited the currently active span as their parent (tracing default). However, this would keep that parent span alive until the resource (and its span) were dropped. This is often not correct, as a resource may be created in a task and then sent elsewhere, while the originating task ends. This artificial extension of the parent span's lifetime would make it look like that task was still alive (but idle) in any system reading the tracing instrumentation in Tokio, for example Tokio Console as reported in tokio-rs/console#345. In #6107, most of the existing resource spans were updated to make them explicit roots, so they have no contextual parent. However, 2. were missed: - `Sleep` - `BatchSemaphore` This change alters the resource spans for those 2 resources to also make them explicit roots.
…ts (#6727) When instrumenting resources in Tokio, a span is created for each resource. Previously, all resources inherited the currently active span as their parent (tracing default). However, this would keep that parent span alive until the resource (and its span) were dropped. This is often not correct, as a resource may be created in a task and then sent elsewhere, while the originating task ends. This artificial extension of the parent span's lifetime would make it look like that task was still alive (but idle) in any system reading the tracing instrumentation in Tokio, for example Tokio Console as reported in tokio-rs/console#345. In #6107, most of the existing resource spans were updated to make them explicit roots, so they have no contextual parent. However, 2. were missed: - `Sleep` - `BatchSemaphore` This change alters the resource spans for those 2 resources to also make them explicit roots.
Most of the context for this PR can be found in this issue.
The
TRACE
resource_span
thatMutex
andRWLock
create when usingnew
can keep open their parent for as long as the mutex lives. This pull request changes these spans to have a parent ofNone
rather than having a parent-child relation ship with the current span.As mentioned by @davidbarsky in the linked issue, another possibility could be to alter the API and accept a parent span. I'm open to changing this PR although I agree it feels strange when it comes to
tracing
.Looking for existing tests, I wasn't sure if regression testing this (which would require bringing a tracing subscriber most likely) was necessary here. Open to adding this as well.