Implement cooperative multitasking for `UnixDatagram` #5967

maminrayej · 2023-08-31T12:58:49Z

Motivation

The `recv` function of `UnixDatagram` does not participate in cooperative multitasking, as illustrated in issue #5946.

Solution

UnixDatagram utilizes Registration's async_io function, which depends on the Readiness future. I've modified its poll function to be budget-aware.

NOTE: The async_io function is not exclusive to UnixDatagram but is also used in udp.rs. This PR might address potential starvation in those functions as well. I'll verify this through testing.

Closes #5946.

hawkw · 2023-08-31T16:18:54Z

It looks like a number of tests are now failing on CI: https://github.com/tokio-rs/tokio/actions/runs/6037644879/job/16382367407?pr=5967

I believe the reason that we're seeing failures in tests for unrelated APIs such as TcpStream is that these types also use Registration::readiness, in addition to Registration::poll_ready.

Noah-Kennedy · 2023-08-31T16:07:26Z

tokio/tests/uds_datagram.rs

@@ -411,3 +411,47 @@ async fn poll_ready() -> io::Result<()> {

    Ok(())
 }
+
+#[tokio::test(flavor = "current_thread")]


The test macro actually defaults to this flavor.

Noah-Kennedy · 2023-08-31T16:27:18Z

tokio/tests/uds_datagram.rs

+
+        async move {
+            loop {
+                tokio::time::sleep(Duration::from_millis(250)).await;


I would recommend avoiding time in tests like this, and instead using yield_now to ensure that tasks wait a certain number of ticks rather than an amount of time. Timing makes things brittle.

I'm not sure how I would do that. Are there any existing tests in the code base that I could use as a template?

What I'd recommend here is rewriting this test so that the main task uses a pair of UDS sockets to message itself and reads the datagram in a for loop that runs just often enough to deplete its budget once. Before the loop, spawn a second task that sets a boolean flag and exits, to indicate that it ran.

You'd then assert after the loop in the main task that the flag was flipped and the second task ran.
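The shape of that test can be sketched without Tokio at all. The following is a minimal std-only model, not the test from this PR: `Recv`, `BUDGET`, `REMAINING`, and `second_task_ran` are hypothetical names, the value 128 is an assumption chosen for illustration, and a hand-rolled round-robin loop stands in for the runtime. A real test would use a `UnixDatagram` pair and `tokio::spawn` instead.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Assumed per-poll budget for this model only.
const BUDGET: u32 = 128;

thread_local! {
    // Budget left for the task that is currently being polled.
    static REMAINING: Cell<u32> = Cell::new(BUDGET);
}

/// Stand-in for a socket `recv` that always has data available.
/// When `cooperative` is true it yields once the budget is spent.
struct Recv {
    cooperative: bool,
}

impl Future for Recv {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if !self.cooperative {
            return Poll::Ready(());
        }
        REMAINING.with(|left| {
            if left.get() == 0 {
                // Out of budget: ask to be polled again and yield.
                cx.waker().wake_by_ref();
                Poll::Pending
            } else {
                left.set(left.get() - 1);
                Poll::Ready(())
            }
        })
    }
}

/// The loop below polls every task each round, so the waker can be a no-op.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Runs the "main" task and a flag-setting second task on a toy executor.
/// Returns whether the main task saw the flag set after its loop.
fn second_task_ran(cooperative: bool) -> bool {
    let flag = Rc::new(Cell::new(false));

    // "Spawned" before the loop: sets the flag and exits.
    let setter_flag = flag.clone();
    let mut second = Box::pin(async move {
        setter_flag.set(true);
    });
    let mut second_done = false;

    // Main task: one more receive than the budget allows, then read the flag.
    let main_flag = flag.clone();
    let mut main_task = Box::pin(async move {
        for _ in 0..=BUDGET {
            Recv { cooperative }.await;
        }
        main_flag.get()
    });

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    loop {
        // Each poll of a task starts with a fresh budget.
        REMAINING.with(|left| left.set(BUDGET));
        if let Poll::Ready(seen) = main_task.as_mut().poll(&mut cx) {
            return seen;
        }
        if !second_done {
            REMAINING.with(|left| left.set(BUDGET));
            second_done = second.as_mut().poll(&mut cx).is_ready();
        }
    }
}
```

Calling `second_task_ran(true)` models the budget-aware behavior: the main loop yields once, the second task gets a turn, and the flag is observed as set. Calling `second_task_ran(false)` models the starvation reported in #5946: the loop finishes in a single poll and the second task never runs. No timers are involved, so the outcome does not depend on wall-clock timing.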

Noah-Kennedy · 2023-08-31T20:07:53Z

> It looks like a number of tests are now failing on CI: https://github.com/tokio-rs/tokio/actions/runs/6037644879/job/16382367407?pr=5967
>
> I believe the reason that we're seeing failures in tests for unrelated APIs such as TcpStream is that these types also use Registration::readiness, in addition to Registration::poll_ready.

Yeah, I think we should either pull this out of there or move this up to the leaf functions.

maminrayej · 2023-08-31T20:10:08Z

> I believe the reason that we're seeing failures in tests for unrelated APIs such as TcpStream is that these types also use Registration::readiness, in addition to Registration::poll_ready.

Indeed. One notable example is writable in TcpStream becoming budget-aware and causing tests like try_read_write to fail. Of course I can rewrite the tests so they don't overuse the limited budget, but I believe some functions becoming budget-aware as a result of this PR will have noticeable effects on downstream crates.

Should I open a separate issue for this matter, or is discussing it here fine?

cc @Darksonn

Noah-Kennedy · 2023-08-31T20:17:34Z

> > I believe the reason that we're seeing failures in tests for unrelated APIs such as TcpStream is that these types also use Registration::readiness, in addition to Registration::poll_ready.
>
> Indeed. One notable example is writable in TcpStream becoming budget-aware and causing tests like try_read_write to fail. Of course I can rewrite the tests so they don't overuse the limited budget, but I believe some functions becoming budget-aware as a result of this PR will have noticeable effects on downstream crates.
>
> Should I open a separate issue for this matter, or is discussing it here fine?
>
> cc @Darksonn

The solution I had been planning on here was moving budgeting into leaf futures and calls rather than doing it in here.

Let's leave the behavior of the try_ functions as is for now.

maminrayej · 2023-09-02T15:57:12Z

We could introduce a new future named BudgetAwareReadiness that wraps the existing Readiness future. This would enable us to selectively render specific functions, such as those in UnixDatagram and UdpSocket (as outlined in #5946), budget-aware while preserving the current behavior in others, like TcpStream::writable. Does this approach seem like a balanced compromise?
Additionally, I can create a tracking issue to enumerate the tasks required for achieving a more consistent behavior throughout.

Darksonn · 2023-09-03T10:16:46Z

So, one important factor here is that we should only consume budget when we actually perform an action. Having writable return Ready is not really an action. So perhaps this makes sense?

- Have readable/writable and friends return Pending if the budget is exhausted, but don't consume any budget.
- Have try_* methods consume budget if they succeed, but otherwise they don't look at the budget and won't fail if the budget is empty.

Thoughts? @Noah-Kennedy @carllerche

carllerche · 2023-09-28T16:15:10Z

@Darksonn Yep, that seems right to me. The key to consider is 1 call to writable() may result in N calls to try_write, so it is important to consume the budget on the actual op.
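The policy agreed on above can be modeled in a few lines. This is a hedged sketch, not Tokio's implementation: `Budget`, `poll_ready`, `try_op`, and `one_ready_many_ops` are hypothetical names, and the `io_ready` flag stands in for real socket readiness.

```rust
use std::io;
use std::task::Poll;

/// Hypothetical stand-in for a task's coop budget.
struct Budget {
    remaining: u32,
}

impl Budget {
    /// Models `readable`/`writable`: returns Pending when the budget is
    /// exhausted (or the socket is not ready) and never consumes budget.
    fn poll_ready(&self, io_ready: bool) -> Poll<()> {
        if self.remaining == 0 || !io_ready {
            Poll::Pending
        } else {
            Poll::Ready(())
        }
    }

    /// Models a `try_*` call: it does not inspect the budget beforehand,
    /// and consumes one unit only when the operation succeeds.
    fn try_op(&mut self, io_ready: bool) -> io::Result<()> {
        if !io_ready {
            return Err(io::ErrorKind::WouldBlock.into());
        }
        self.remaining = self.remaining.saturating_sub(1);
        Ok(())
    }
}

/// One readiness check followed by several operations.
/// Returns (budget after the readiness check, budget after the operations,
/// whether readiness was Ready at first and Pending once the budget ran out).
fn one_ready_many_ops() -> (u32, u32, bool) {
    let mut budget = Budget { remaining: 3 };

    let ready_at_first = budget.poll_ready(true).is_ready();
    let after_ready = budget.remaining;

    // Four operations against a budget of three: the fourth still succeeds,
    // because `try_op` does not fail on an empty budget.
    for _ in 0..4 {
        budget.try_op(true).expect("socket is ready in this model");
    }

    let pending_when_empty = budget.poll_ready(true).is_pending();
    (after_ready, budget.remaining, ready_at_first && pending_when_empty)
}
```

In this model a single readiness check costs nothing, while each successful operation after it is charged, which is the case of one `writable()` call followed by N `try_write` calls. Once the budget reaches zero, only the readiness check starts returning Pending.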

Darksonn · 2023-10-22T09:13:00Z

tokio/src/runtime/io/scheduled_io.rs

+        if !crate::runtime::coop::has_budget_remaining() {
+            // Wasn't ready, take the lock (and check again while locked).
+            let mut waiters = scheduled_io.waiters.lock();
+
+            let w = unsafe { &mut *waiter.get() };
+
+            if *is_waiter_registered {


I already mentioned this on our Discord chat, but I'll put it here too. This code isn't correct because when the coop budget is empty, you're registering the waker with the IO driver. But in this case, we need to be woken up by the coop system, and not by the IO waker. So you should register for readiness with the coop system instead.
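The distinction between the two wake sources can be illustrated with a small std-only model. It assumes that yielding for budget reasons means waking the task's own waker so the scheduler polls it again, which is my understanding of how Tokio's coop module behaved at the time; `BudgetGate`, `CountingWaker`, and `wake_count_after_poll` are hypothetical names, and `std::future::pending` stands in for a socket on which no I/O event ever arrives.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Counts wake-ups so the example can observe whether the task was woken.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical wrapper around an I/O readiness future.
struct BudgetGate<F> {
    inner: F,
    budget_left: bool,
}

impl<F: Future + Unpin> Future for BudgetGate<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        if !self.budget_left {
            // Out of budget: do not poll `inner`, since that would leave the
            // task waiting on an I/O event that may never come. Wake the task
            // directly so it is rescheduled, then yield.
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        // Budget available: defer to the I/O future, which is responsible
        // for registering the waker with its own event source.
        Pin::new(&mut self.inner).poll(cx)
    }
}

/// Polls the gate once and reports how many times the task was woken.
fn wake_count_after_poll(budget_left: bool) -> usize {
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);

    let mut gate = BudgetGate {
        inner: std::future::pending::<()>(),
        budget_left,
    };
    assert!(Pin::new(&mut gate).poll(&mut cx).is_pending());

    counter.0.load(Ordering::SeqCst)
}
```

With `budget_left` set to false the task is woken once and will be polled again on its next turn. With `budget_left` set to true the wake count stays at zero, because the task is now waiting on the idle I/O source. Registering with the I/O driver in the out-of-budget case would put the task in that second state even though nothing was wrong with the socket.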

Yes. Thank you for pointing it out. I'm swamped right now with my job and university. I'll get to it as soon as I find some free time.

No worries. That's perfectly fine.

@maminrayej I can take this over from you if you're busy, I'd like to get this change over the finish line relatively quickly.

@Noah-Kennedy Absolutely, feel free to take it over. Thanks for stepping in.

no problem!

Darksonn · 2023-11-05T13:34:01Z

Since Noah has offered to take over this, I'll close the PR to remove it from the list of things to review. However, if something changes and you would still like to continue instead of Noah, then feel free to reopen.

maminrayej added 3 commits August 30, 2023 09:37

instrument the Readiness future with budgeting

3d4d096

add test to assert UnixDatagram cooperates

2d4b02c

reduce the test duration and relax its assertion

b55690e

maminrayej mentioned this pull request Aug 31, 2023

Datagram sockets do not appear to participate in cooperative multitasking #5946

Closed

Noah-Kennedy reviewed Aug 31, 2023


Darksonn added the A-tokio (Area: The main tokio crate), M-net (Module: tokio/net), and M-coop (Module: tokio/coop) labels Sep 1, 2023

remove assumption about constantly being ready

f37a7b1

maminrayej force-pushed the coop_for_ud branch from cc16dfa to f37a7b1 Compare September 2, 2023 22:51

use u32 instead of u64

c995a75

consume budget when async_io makes progress

b507e63

Darksonn reviewed Oct 22, 2023


Darksonn closed this Nov 5, 2023
