Implement cooperative multitasking for UnixDatagram
#5967
It looks like a number of tests are now failing on CI: https://github.com/tokio-rs/tokio/actions/runs/6037644879/job/16382367407?pr=5967 I believe the reason that we're seeing failures in tests for unrelated APIs such as |
@@ -411,3 +411,47 @@ async fn poll_ready() -> io::Result<()> {

    Ok(())
}

#[tokio::test(flavor = "current_thread")]
the test macro will actually default to this
async move {
    loop {
        tokio::time::sleep(Duration::from_millis(250)).await;
I would recommend avoiding time in tests like this and instead using yield_now, so that tasks wait a certain number of ticks rather than an amount of time. Timing makes tests brittle.
I'm not sure how I would do that. Are there any existing tests in the code base that I could reference as a template?
What I'd recommend doing here is rewriting this test so that the main task uses a pair of UDS sockets to message itself and reads the datagram in a for loop that runs just often enough to deplete its budget once. Before the loop, spawn a second task that sets a boolean flag and exits, to indicate that it ran.
You'd then assert after the loop in the main task that the flag was flipped and the second task ran.
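To make the shape of that test concrete, here is a minimal std-only simulation. It is a sketch, not tokio code: `BUDGET`, `ReadOnce`, `NoopWaker`, and `flag_seen_by_main` are hypothetical names, and the round-robin loop only stands in for a current-thread runtime. The main task performs `reads` budget-charged operations, a second task flips a flag, and the function reports whether the main task saw the flag after its loop.

```rust
use std::cell::Cell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

thread_local! {
    // Hypothetical stand-in for the runtime's per-task coop budget.
    static BUDGET: Cell<u32> = Cell::new(0);
}

/// Hypothetical leaf future: an always-ready "read" that costs one unit of budget.
struct ReadOnce;

impl Future for ReadOnce {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let left = BUDGET.with(|b| b.get());
        if left == 0 {
            // Budget exhausted: yield back to the scheduler.
            // The toy executor below re-polls regardless of wake-ups.
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        BUDGET.with(|b| b.set(left - 1));
        Poll::Ready(())
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

type Task = Pin<Box<dyn Future<Output = ()>>>;

/// Runs the two-task scenario and returns whether the main task observed the
/// flag after its loop. `budget` must be greater than zero.
fn flag_seen_by_main(budget: u32, reads: u32) -> bool {
    let flag = Rc::new(Cell::new(false));
    let seen = Rc::new(Cell::new(false));

    let (f, s) = (flag.clone(), seen.clone());
    let main_task: Task = Box::pin(async move {
        for _ in 0..reads {
            ReadOnce.await;
        }
        // Equivalent of the assertion after the loop in the main task.
        s.set(f.get());
    });

    let f2 = flag.clone();
    let second: Task = Box::pin(async move {
        f2.set(true);
    });

    // Round-robin scheduling; each poll starts with a fresh budget.
    let mut queue: VecDeque<Task> = VecDeque::from([main_task, second]);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    while let Some(mut task) = queue.pop_front() {
        BUDGET.with(|b| b.set(budget));
        if task.as_mut().poll(&mut cx).is_pending() {
            queue.push_back(task);
        }
    }
    seen.get()
}
```

With a budget of 4 and 5 reads, the main task is forced to yield once, the second task runs, and `flag_seen_by_main(4, 5)` returns `true`. With a budget of 100 the loop never yields and the result is `false`, which is the starvation the real test is meant to catch.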
Yeah, I think we should either pull this out of there or move this up to the leaf functions.
Indeed. One notable example is Should I open a different issue for this matter, or discussing here is fine? cc @Darksonn
The solution I had been planning on here was moving budgeting into leaf futures and calls rather than doing it in here. Let's leave the behavior of the try_ functions as is for now.
We could introduce a new future named
cc16dfa to f37a7b1
So, one important factor here is that we should only consume budget when we actually perform an action. Having
Thoughts? @Noah-Kennedy @carllerche
@Darksonn Yep, that seems right to me. The key to consider is 1 call to
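The idea of charging budget only when an operation actually does something can be sketched in plain Rust. This is an illustration under assumptions, not the code from this PR: `BUDGET`, `set_budget`, `budget`, and `try_with_budget` are hypothetical names, and treating every outcome other than `WouldBlock` as progress is my own choice.

```rust
use std::cell::Cell;
use std::io;

thread_local! {
    // Hypothetical stand-in for the runtime's per-task coop budget.
    static BUDGET: Cell<u32> = Cell::new(0);
}

fn set_budget(n: u32) {
    BUDGET.with(|b| b.set(n));
}

fn budget() -> u32 {
    BUDGET.with(|b| b.get())
}

/// Runs `op` if budget remains. Returns `None` when the budget is exhausted,
/// meaning the caller should yield instead of attempting the operation.
/// Budget is charged only if `op` made progress, i.e. did not return `WouldBlock`.
fn try_with_budget<T>(op: impl FnOnce() -> io::Result<T>) -> Option<io::Result<T>> {
    let left = budget();
    if left == 0 {
        return None;
    }
    let result = op();
    // Assumption: anything other than WouldBlock counts as an action performed.
    let made_progress =
        !matches!(&result, Err(e) if e.kind() == io::ErrorKind::WouldBlock);
    if made_progress {
        set_budget(left - 1);
    }
    Some(result)
}
```

A call that returns `WouldBlock` leaves the budget untouched, so a task that merely checks readiness and then parks is not penalized. Only a call that moves data costs a unit.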
if !crate::runtime::coop::has_budget_remaining() {
// Wasn't ready, take the lock (and check again while locked).
let mut waiters = scheduled_io.waiters.lock();

let w = unsafe { &mut *waiter.get() };

if *is_waiter_registered {
I already mentioned this on our Discord chat, but I'll put it here too. This code isn't correct because when the coop budget is empty, you're registering the waker with the IO driver. But in this case, we need to be woken up by the coop system, and not by the IO waker. So you should register for readiness with the coop system instead.
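A small std-only sketch of the distinction, with hypothetical names throughout (`BUDGET`, `BudgetedRead`, `CountingWaker`, `poll_once_with_budget`): when the budget is empty, the future asks the scheduler to poll the task again by waking its own waker, and it does not register a waiter with any IO source. Self-waking is used here as the simplest way to model "woken by the coop system"; I am not claiming this is how the runtime implements it internally.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

thread_local! {
    // Hypothetical stand-in for the runtime's per-task coop budget.
    static BUDGET: Cell<u32> = Cell::new(0);
}

/// Hypothetical leaf future over a resource that is already readable.
struct BudgetedRead;

impl Future for BudgetedRead {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let left = BUDGET.with(|b| b.get());
        if left == 0 {
            // Out of budget. The resource may already be ready, so an IO
            // waiter might never fire. Request a re-poll from the scheduler.
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        BUDGET.with(|b| b.set(left - 1));
        Poll::Ready(())
    }
}

/// Waker that counts how often it is woken, to make the behavior observable.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Polls `BudgetedRead` once with the given budget.
/// Returns (whether it completed, how many times the waker was woken).
fn poll_once_with_budget(budget: u32) -> (bool, usize) {
    BUDGET.with(|b| b.set(budget));
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut fut = BudgetedRead;
    let ready = Pin::new(&mut fut).poll(&mut cx).is_ready();
    (ready, counter.0.load(Ordering::SeqCst))
}
```

`poll_once_with_budget(0)` yields `(false, 1)`: the future is pending and has already scheduled its own wake-up. `poll_once_with_budget(1)` yields `(true, 0)`.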
Yes. Thank you for pointing it out. I'm swamped right now with my job and university. I'll get to it as soon as I find some free time.
No worries. That's perfectly fine.
@maminrayej I can take this over from you if you're busy. I'd like to get this change over the finish line relatively quickly.
@Noah-Kennedy Absolutely, feel free to take it over. Thanks for stepping in.
no problem!
Since Noah has offered to take this over, I'll close the PR to remove it from the list of things to review. However, if something changes and you would still like to continue instead of Noah, then feel free to reopen.
Motivation
The recv function of UnixDatagram does not participate in cooperative multitasking, as illustrated in issue #5946.

Solution
UnixDatagram utilizes Registration's async_io function, which depends on the Readiness future. I've modified its poll function to be budget-aware.

NOTE: The async_io function is not exclusive to UnixDatagram but is also used in udp.rs. This PR might address potential starvation in those functions as well. I'll verify this through testing.

Closes #5946.