Handling non-blocking file descriptors consistently. #364

Closed
ioquatix opened this issue Jun 16, 2021 · 107 comments

Comments
@ioquatix
Contributor

This is more of a discussion than an actual issue at the moment, but I hope to provide more concrete details as we explore the use cases.

We have an interesting problem regarding IORING_OP_READV/IORING_OP_WRITEV. In a general programming language environment, we don't always know if the file descriptor is in non-blocking mode or not.

But the behaviour of IORING_OP_READV/IORING_OP_WRITEV depends on it. In non-blocking mode, we can get a result of EAGAIN.

However, this is not very useful in the case of event-driven concurrency, where we want operations like this to be "asynchronously blocking" - i.e. the CQE should not be generated until the read/write can make progress.

In general, my understanding is that we'd need 3-4 system calls to work around this, something like:

int flags = fcntl(fd, F_GETFL);
if (flags & O_NONBLOCK) fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
/* ... submit the IORING_OP_READV/IORING_OP_WRITEV for fd ... */
if (flags & O_NONBLOCK) fcntl(fd, F_SETFL, flags);

We can cache this to reduce the round trips, but it feels like this would be something better solved by a flag on IORING_OP_READV/IORING_OP_WRITEV. In fact, I'm having a hard time imagining any valid use case for non-blocking operation in the uring.

I'll try to test out specific cases where this is a problem and report back.

@YoSTEALTH
Contributor

The whole point of io_uring is that it's asynchronous by design! If a call would block, io_uring either handles it asynchronously or spins up a kernel worker thread for that request.

By using O_NONBLOCK, what you are saying is: let me choose what to do next when blocking happens, instead of letting io_uring deal with it.

There is already a flags option:

fd = open('path/to/file', O_RDONLY)  # normal blocking, read-only open
...
io_uring_prep_readv(sqe, fd, iovecs, nr_vecs, offset)
sqe.rw_flags |= RWF_NOWAIT  # now the read is non-blocking
...
# catch `-EAGAIN` in `cqe.res`, ...
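
For reference, roughly the same thing in C with liburing might look like the sketch below; queue_nowait_readv is just a hypothetical helper name and the error handling is minimal.

/* Queue a readv that must not block, even on a blocking fd, by setting the
 * per-request RWF_NOWAIT flag as described above. */
#define _GNU_SOURCE
#include <liburing.h>
#include <sys/uio.h>                   /* struct iovec; RWF_NOWAIT with _GNU_SOURCE */

static int queue_nowait_readv(struct io_uring *ring, int fd,
                              struct iovec *iov, unsigned nr_vecs)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

    if (!sqe)
        return -1;                     /* submission queue is full */
    io_uring_prep_readv(sqe, fd, iov, nr_vecs, 0);
    sqe->rw_flags = RWF_NOWAIT;        /* non-blocking for this request only */
    return io_uring_submit(ring);      /* the CQE may then carry -EAGAIN in cqe->res */
}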

@ioquatix
Contributor Author

Is there a flag for the opposite of RWF_NOWAIT?

@YoSTEALTH
Contributor

Not that I know of. If you want a blocking read, just use a normal read(); don't use io_uring_read*.

@ioquatix
Contributor Author

We have file descriptors that might already have O_NONBLOCK set. But we still want to use them with io_uring without getting EAGAIN.

@YoSTEALTH
Contributor

Hmm, that's a tricky situation. Monkey-patching the O_NONBLOCK flag to not be non-blocking might lead to other problems/bugs.

If you have access to where the original flags are being set:

original_flags = O_CREAT | O_RDWR | O_NONBLOCK

You could add a check like

if original_flags & O_NONBLOCK:
    # raise error saying `O_NONBLOCK` is not supported, invalid or something!

I am not really proud of that solution.

Why not just catch -EAGAIN and poll it? io_uring_prep_poll_add(sqe, fd, POLLIN)
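
A sketch of that pattern with liburing: on -EAGAIN, arm a poll and retry the read once the fd is readable. handle_read_cqe is a hypothetical dispatcher in the application, and error handling is omitted.

#include <liburing.h>
#include <poll.h>
#include <errno.h>

static void handle_read_cqe(struct io_uring *ring, struct io_uring_cqe *cqe, int fd)
{
    if (cqe->res == -EAGAIN) {
        /* fd wasn't ready: wait for readability, then retry the read */
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

        io_uring_prep_poll_add(sqe, fd, POLLIN);
        io_uring_submit(ring);
        /* ...when the poll CQE arrives, re-submit the original read/readv... */
    } else {
        /* cqe->res is the number of bytes read, or another error */
    }
}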

@ioquatix
Contributor Author

Yes, it's one option, but we wanted to use the direct read/write operations if possible.

@YoSTEALTH
Contributor

YoSTEALTH commented Jun 16, 2021

I suppose you can also remove the flag like so:

original_flags = O_CREAT | O_RDWR | O_NONBLOCK
new_flags = original_flags & ~O_NONBLOCK

@axboe
Owner

axboe commented Jun 16, 2021

It would be a pretty trivial change to arm poll rather than return -EAGAIN to the application, if O_NONBLOCK is set on the file. The main issue here is how to inform io_uring of that desired behavior. OP_READV/OP_WRITEV flags are really readv2/writev2 flags, so it'd then be a generic change to allow this. Which arguably would make sense, as the justification would be that you can do blocking IO without having to query/set the file flags first.

This applies to more than OP_READV/OP_WRITEV, though. It also applies to SEND/RECV and friends, so it's actually more of a generic change. Which means that an IOSQE_ flag would make more sense. The only potential issue there is that the flag space is pretty small, we have 8 bits and 6 are already used. Not a huge issue, we have two bits left, and the last bit could always be turned into a "look at extended bits" or something like that.

In any case, let me know if you want to test a patch for this. Should be easy to cook up.

@ioquatix
Contributor Author

I am happy to test and help with this.

@ioquatix
Contributor Author

ioquatix commented Jun 16, 2021

Just thinking through what you described, do you think it would make sense for us to propose a flag to readv2/writev2 to be more specific about this?

That way, this doesn't become an io_uring specific feature, but works at the kernel level in a generic way.

e.g.

       RWF_NONBLOCK
              Perform the operation as if the descriptor was in `O_NONBLOCK` mode.
       RWF_BLOCK
              Perform the operation as if the descriptor was not in `O_NONBLOCK` mode.

Maybe it's a crazy proposal, but it seems logical, at least from my POV.
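
To make the asymmetry concrete: the non-blocking direction already exists per call via preadv2(2), while the blocking direction does not. In the sketch below, RWF_BLOCK is the hypothetical flag from this comment, not a real one.

#define _GNU_SOURCE
#include <sys/uio.h>      /* preadv2(); RWF_NOWAIT with _GNU_SOURCE */

static ssize_t read_examples(int fd, const struct iovec *iov, int nr_vecs)
{
    /* Existing: force non-blocking behaviour for this one call, regardless of
     * the fd's O_NONBLOCK state; fails with errno == EAGAIN if it would block. */
    ssize_t n = preadv2(fd, iov, nr_vecs, -1, RWF_NOWAIT);

    /* Proposed above (hypothetical, not a real flag today): force blocking
     * behaviour even if O_NONBLOCK is set on fd:
     *     preadv2(fd, iov, nr_vecs, -1, RWF_BLOCK);
     */
    return n;
}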

@axboe
Owner

axboe commented Jun 16, 2021

As mentioned, that only then works for the read/write variants, not for any other opcode. Hence I'd be leaning towards making it an sqe flag instead. If it was just applicable to OP_READV and friends, then the RWF approach would be preferable.

@ioquatix
Contributor Author

Do other opcodes also take flags? e.g. accept, open, close, etc.

@axboe
Owner

axboe commented Jun 16, 2021

Each opcode has a specific set of flags, like the RWF_* for read/write, but there's also a generic set of IOSQE_* flags that apply to any opcode. That handles things like links, etc. Things that sit above the specific request type.
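
A minimal sketch of the two levels with liburing: rw_flags carries the opcode-specific readv2/writev2 flags, while io_uring_sqe_set_flags() sets the generic IOSQE_* flags.

#include <liburing.h>
#include <sys/uio.h>
#include <linux/fs.h>     /* RWF_NOWAIT */

static void two_flag_levels(struct io_uring *ring, int fd,
                            struct iovec *iov, unsigned nr_vecs)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

    io_uring_prep_readv(sqe, fd, iov, nr_vecs, 0);

    /* opcode-specific flags: for OP_READV/OP_WRITEV these are readv2/writev2 flags */
    sqe->rw_flags = RWF_NOWAIT;

    /* generic per-SQE flags: apply to any opcode (links, drain, async, ...) */
    io_uring_sqe_set_flags(sqe, IOSQE_ASYNC);
}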

@ioquatix
Contributor Author

Oh that makes total sense. Okay, I'm keen to test this when you are ready.

cc @ciconia

@axboe
Owner

axboe commented Jun 17, 2021 via email

@YoSTEALTH
Contributor

It would be a pretty trivial change to arm poll rather than return -EAGAIN to the application, if O_NONBLOCK is set on the file.

@axboe O.o something like IOSQE_IO_BLOCKING_POLL*? If you can make this into a feature I would totally welcome it as well. Currently I have lines of code just checking for -EAGAIN and polling to account for both blocking/non-blocking flags. This feature would save many unnecessary round trips.

Rather than just accounting for EAGAIN, it would be nice if it accounted for "blocking" states like EAGAIN, EALREADY, EINPROGRESS, EWOULDBLOCK, ...
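
A small helper along those lines might look like this; a sketch only, and which errno values count as "blocking" is ultimately the application's call.

#include <errno.h>
#include <stdbool.h>

/* Treat a CQE result as "would have blocked" for the usual try-again /
 * in-progress errno values. cqe_res is the raw cqe->res (negative errno). */
static bool cqe_would_block(int cqe_res)
{
    switch (-cqe_res) {
    case EAGAIN:
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
    case EALREADY:
    case EINPROGRESS:
        return true;
    default:
        return false;
    }
}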

@ioquatix
Contributor Author

I agree; the vast majority of user-space code doesn't care about EAGAIN. It's like the halfway house of concurrency.

@noteflakes

That's a great idea, adding such a flag will be very useful! Keep in mind though that if you want to support older Linux kernel versions (once this feature lands in a kernel release), you'll still have to provide some workaround, either resetting O_NONBLOCK or handling EAGAIN.

@axboe
Owner

axboe commented Jun 17, 2021

Probably something like IOSQE_IGNORE_NONBLOCK or whatever. It'll just be a way to force the normal poll path for pollable IO even if the fd is marked non-blocking.

@axboe
Owner

axboe commented Jun 17, 2021

Here's a first attempt. Let me know if you'd prefer a patch somewhere, not sure what this does to formatting... If it destroys whitespace, probably just use patch -l to ignore that part.

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 46a25a7cb70a..25f72f35902d 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -108,7 +108,7 @@
 
 #define SQE_VALID_FLAGS	(IOSQE_FIXED_FILE|IOSQE_IO_DRAIN|IOSQE_IO_LINK|	\
 				IOSQE_IO_HARDLINK | IOSQE_ASYNC | \
-				IOSQE_BUFFER_SELECT)
+				IOSQE_BUFFER_SELECT | IOSQE_IGNORE_NONBLOCK)
 
 #define IO_TCTX_REFS_CACHE_NR	(1U << 10)
 
@@ -704,6 +704,7 @@ enum {
 	REQ_F_HARDLINK_BIT	= IOSQE_IO_HARDLINK_BIT,
 	REQ_F_FORCE_ASYNC_BIT	= IOSQE_ASYNC_BIT,
 	REQ_F_BUFFER_SELECT_BIT	= IOSQE_BUFFER_SELECT_BIT,
+	REQ_F_IGNORE_NONBLOCK_BIT	= IOSQE_IGNORE_NONBLOCK_BIT,
 
 	/* first byte is taken by user flags, shift it to not overlap */
 	REQ_F_FAIL_BIT		= 8,
@@ -740,6 +741,8 @@ enum {
 	REQ_F_FORCE_ASYNC	= BIT(REQ_F_FORCE_ASYNC_BIT),
 	/* IOSQE_BUFFER_SELECT */
 	REQ_F_BUFFER_SELECT	= BIT(REQ_F_BUFFER_SELECT_BIT),
+	/* IOSQE_IGNORE_NONBLOCK */
+	REQ_F_IGNORE_NONBLOCK	= BIT(REQ_F_IGNORE_NONBLOCK_BIT),
 
 	/* fail rest of links */
 	REQ_F_FAIL		= BIT(REQ_F_FAIL_BIT),
@@ -2444,6 +2447,15 @@ static bool io_resubmit_prep(struct io_kiocb *req)
 	return true;
 }
 
+/*
+ * Honor REQ_F_NOWAIT only if REQ_F_IGNORE_NONBLOCK isn't set
+ */
+static bool io_req_nowait(struct io_kiocb *req)
+{
+	return (req->flags & (REQ_F_NOWAIT | REQ_F_IGNORE_NONBLOCK)) ==
+		REQ_F_NOWAIT;
+}
+
 static bool io_rw_should_reissue(struct io_kiocb *req)
 {
 	umode_t mode = file_inode(req->file)->i_mode;
@@ -2451,7 +2463,7 @@ static bool io_rw_should_reissue(struct io_kiocb *req)
 
 	if (!S_ISBLK(mode) && !S_ISREG(mode))
 		return false;
-	if ((req->flags & REQ_F_NOWAIT) || (io_wq_current_is_worker() &&
+	if (io_req_nowait(req) || (io_wq_current_is_worker() &&
 	    !(ctx->flags & IORING_SETUP_IOPOLL)))
 		return false;
 	/*
@@ -3223,7 +3235,7 @@ static bool io_rw_should_retry(struct io_kiocb *req)
 	struct kiocb *kiocb = &req->rw.kiocb;
 
 	/* never retry for NOWAIT, we just complete with -EAGAIN */
-	if (req->flags & REQ_F_NOWAIT)
+	if (io_req_nowait(req))
 		return false;
 
 	/* Only for buffered IO */
@@ -3303,7 +3315,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 		if (!force_nonblock && !(req->ctx->flags & IORING_SETUP_IOPOLL))
 			goto done;
 		/* no retry on NONBLOCK nor RWF_NOWAIT */
-		if (req->flags & REQ_F_NOWAIT)
+		if (io_req_nowait(req))
 			goto done;
 		/* some cases will consume bytes even on error returns */
 		iov_iter_revert(iter, io_size - iov_iter_count(iter));
@@ -3311,7 +3323,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 	} else if (ret == -EIOCBQUEUED) {
 		goto out_free;
 	} else if (ret <= 0 || ret == io_size || !force_nonblock ||
-		   (req->flags & REQ_F_NOWAIT) || !(req->flags & REQ_F_ISREG)) {
+		   io_req_nowait(req) || !(req->flags & REQ_F_ISREG)) {
 		/* read all, failed, already did sync or don't want to retry */
 		goto done;
 	}
@@ -3434,7 +3446,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	if (ret2 == -EOPNOTSUPP && (kiocb->ki_flags & IOCB_NOWAIT))
 		ret2 = -EAGAIN;
 	/* no retry on NONBLOCK nor RWF_NOWAIT */
-	if (ret2 == -EAGAIN && (req->flags & REQ_F_NOWAIT))
+	if (ret2 == -EAGAIN && io_req_nowait(req))
 		goto done;
 	if (!force_nonblock || ret2 != -EAGAIN) {
 		/* IOPOLL retry should happen for io-wq threads */
@@ -6455,7 +6467,7 @@ static void __io_queue_sqe(struct io_kiocb *req)
 		} else {
 			io_put_req(req);
 		}
-	} else if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) {
+	} else if (ret == -EAGAIN && !io_req_nowait(req)) {
 		if (!io_arm_poll_handler(req)) {
 			/*
 			 * Queued up for async execution, worker will release
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index f1f9ac114b51..582eee6e898b 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -70,6 +70,7 @@ enum {
 	IOSQE_IO_HARDLINK_BIT,
 	IOSQE_ASYNC_BIT,
 	IOSQE_BUFFER_SELECT_BIT,
+	IOSQE_IGNORE_NONBLOCK_BIT,
 };
 
 /*
@@ -87,6 +88,8 @@ enum {
 #define IOSQE_ASYNC		(1U << IOSQE_ASYNC_BIT)
 /* select buffer from sqe->buf_group */
 #define IOSQE_BUFFER_SELECT	(1U << IOSQE_BUFFER_SELECT_BIT)
+/* ignore if file has O_NONBLOCK set */
+#define IOSQE_IGNORE_NONBLOCK	(1U << IOSQE_IGNORE_NONBLOCK_BIT)
 
 /*
  * io_uring_setup() flags
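
For the userspace side, usage of the flag from this patch might look roughly like the sketch below. IOSQE_IGNORE_NONBLOCK is only defined by the patch above, not by upstream headers, so this is illustrative only.

#include <liburing.h>

static int queue_blocking_style_read(struct io_uring *ring, int fd,
                                     void *buf, unsigned len)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

    if (!sqe)
        return -1;
    io_uring_prep_read(sqe, fd, buf, len, 0);
    /* ask io_uring to arm poll internally rather than complete with -EAGAIN,
     * even though the fd may have O_NONBLOCK set */
    io_uring_sqe_set_flags(sqe, IOSQE_IGNORE_NONBLOCK);
    return io_uring_submit(ring);
}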

@ioquatix
Contributor Author

I will try this out over the weekend.

@axboe
Owner

axboe commented Jun 17, 2021

I will try this out over the weekend.

I'll run some quick testing on this later and post an updated version here if needed.

@YoSTEALTH
Contributor

YoSTEALTH commented Jun 18, 2021

That's neat!

Does it account for cases where O_NONBLOCK is not set but the MSG_DONTWAIT flag is passed directly to the socket recv* call?

Can this be in the next Linux 5.13-rc* patch, or is it more of a 5.14 thing?

@ioquatix
Contributor Author

I'm hoping we can transparently support this by using the poll fallback - i.e. assume it's going to work, but if we get EAGAIN, go to a fallback path.
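
One way to express that fallback with io_uring itself is to link a poll to the retried read, so the kernel only issues the read once the fd is readable. A sketch, with error handling omitted.

#include <liburing.h>
#include <poll.h>

static int queue_poll_then_read(struct io_uring *ring, int fd,
                                void *buf, unsigned len)
{
    struct io_uring_sqe *sqe;

    sqe = io_uring_get_sqe(ring);
    io_uring_prep_poll_add(sqe, fd, POLLIN);
    io_uring_sqe_set_flags(sqe, IOSQE_IO_LINK);   /* run the next SQE only after this one */

    sqe = io_uring_get_sqe(ring);
    io_uring_prep_read(sqe, fd, buf, len, 0);

    return io_uring_submit(ring);
}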

@isilence
Collaborator

Ok, it seems I need to ask the unpopular questions...

Disregarding generality for other io_uring requests, I have a problem with that design. We have a behaviour, blocking or not, and it is specified by rw flags (and also by file flags). So, it's blocking by default, and RWF_NOWAIT overrides the behaviour to non-blocking. This new flag would be overriding already-overridden behaviour, which doesn't sound good from a design perspective.
To give an idea of the concern, the reductio ad absurdum here would be to also add IOSQE_IGNORE_IGNORE_NONBLOCK.

So, the question here is: does it add something that can't currently be achieved (e.g. by clearing/not setting non-blocking flags)? Finer blocking control, or just yet another BLOCK/NONBLOCK flag doing the same thing? If there are inconsistencies in how io_uring handles non-blocking (and I remember there were at some point), we need to figure out and document how it should be treated.

Also curious, assuming that the library takes fds from the user but doesn't take over ownership, i.e. the user may use them in any other way they wish in parallel:

  1. How come you force 'blocking' behaviour when userspace passes in a non-blocking file, when non-blocking can be the desired outcome?
  2. Why can't the requirement of passing non-blocking files be forced on the user?

@isilence
Collaborator

Can this be in next Linux 5.13-rc* patch or its more of a 5.14 thing.

@YoSTEALTH, 5.13 is closed for new features and will be released pretty soon, so definitely not 5.13.

@axboe
Owner

axboe commented Jun 18, 2021

Pavel, outside of the patch being quick'n'dirty, the intent is only to ignore O_NONBLOCK if it is set. We should not be fiddling with RWF_NOWAIT, nor likewise with send/recvmsg flags. It's just for per-file state, not per-request state.

@YoSTEALTH
Contributor

@axboe My initial thought was: if cqe.res == -EAGAIN and IOSQE_new_flag is set, wait/poll for the task to finish in kernel space, thus preventing a round trip to user space and back. This way no set flags are messed with; it only checks the CQE end of the data. Is that even close to what's happening?

@axboe
Owner

axboe commented Jun 18, 2021

What you described, but only if O_NONBLOCK is set on the file. If you are setting request-specific flags, like RWF_NOWAIT, then the -EAGAIN is returned as before. At that point you are specifically asking for non-blocking behaviour; the new sqe flag should only apply to state that you don't necessarily control (e.g. the file flags).

@YoSTEALTH
Contributor

YoSTEALTH commented Jun 18, 2021

@axboe OK, thank you for clarifying. So it can only be used for *open* at the moment, which is such a small use case! Wouldn't it have been better for the developer to just remove the flag with new_flags = original_flags & ~O_NONBLOCK?

As much as I would like to see a new feature added to io_uring, I also wouldn't want to waste resources that could be better used!

ammarfaizi2 pushed a commit to ammarfaizi2/linux-block that referenced this issue Mar 11, 2023
commit c16bda3 upstream.

If we get woken spuriously when polling and fail the operation with
-EAGAIN again, then we generally only allow polling again if data
had been transferred at some point. This is indicated with
REQ_F_PARTIAL_IO. However, if the spurious poll triggers when the socket
was originally empty, then we haven't transferred data yet and we will
fail the poll re-arm. This either punts the socket to io-wq if it's
blocking, or it fails the request with -EAGAIN if not. Neither condition
is desirable, as the former will slow things down, while the latter
will make the application confused.

We want to ensure that a repeated poll trigger doesn't lead to infinite
work making no progress, that's what the REQ_F_PARTIAL_IO check was
for. But it doesn't protect against a loop post the first receive, and
it's unnecessarily strict if we started out with an empty socket.

Add a somewhat random retry count, just to put an upper limit on the
potential number of retries that will be done. This should be high enough
that we won't really hit it in practice, unless something needs to be
aborted anyway.

Cc: [email protected] # v5.10+
Link: axboe/liburing#364
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
@vasama

vasama commented Aug 15, 2023

What's the status of this issue? Is IORING_SETUP_IGNORE_ONONBLOCK going anywhere?

@axboe
Owner

axboe commented Oct 2, 2024

Should be the case now.

@ioquatix
Contributor Author

ioquatix commented Oct 2, 2024

I will try it out again.

@ioquatix
Contributor Author

ioquatix commented Oct 4, 2024

I've done a basic test, and I'm not seeing EAGAIN any more. However, it's difficult to compare since I don't really have a system to reproduce the old behaviour. Thanks everyone, this looks pretty good to me now.

@axboe axboe closed this as completed Oct 4, 2024
Cloudef added a commit to Cloudef/zig-aio that referenced this issue Jan 28, 2025
This may be a bug in io_uring, but since EventSource's were changed to
O_NONBLOCK, io_uring may sometimes return EAGAIN.

Handle this by simply ignoring the error and hope for the best,
as the tests currently seem to pass.

EAGAIN's generally should not happen though on 6.x kernels
<axboe/liburing#364>