-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix scheduling algorithm #199
Conversation
(Otherwise, new streams will be prioritized over old streams until they catch up.)
This deals with a kind of whiplash effect: - Stream X writes a lot of data - This places X far behind in the priority queue - By the time stream X gets a chance to write again, it has accumulated even more data, making matters even worse Also, the current regime didn't play well with streams that have 0 bytes to write. It would effectively starve other streams. With this change, every stream popped from the queue always advances its timer.
The issue was that in some situations, the scheduler would not flush stream B if there was a stream A with higher priority even if stream A had no data to send. This would not be an issue if streams were constantly sending data, but if all streams stopped sending, frames might be stuck in the scheduler. The fix to this is to traverse the children until a stream that needs to send data is found. Also fixes a possibly broken test. After the changes to the scheduler, a frame that wasn't flushed before is now flushed.
lib/scheduler.ml
Outdated
* Without this check, we might end up in a situation where a stream | ||
* won't be flushed until a few write operations later. *) | ||
then ( | ||
update_t_last p_node i.t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we were previously updating the t_last
of the parent node to the t
value of the stream that was popped from the pq. Now we only do that if we wrote something. Is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this is contrary to the algorithm, where it seems like it updates t_last
to the popped stream even if it's not going to write anything (though this doesn't seem to be very clear to me)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more reasoning that leads me to conclude that your code might be correct as is:
- if we're traversing the children until we write something, the
t_last
for the parent would be dictated by the stream that eventually wins (writes something).- in this case, it doesn't matter whether we update the
t_last
of the parent node or not
- in this case, it doesn't matter whether we update the
- if we end up not writing anything, we should keep
t_last
of the stream that last wrote something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite tricky and I'm not 100% convinced I'm right, but my reasoning was like yours above.
What makes it difficult to reason about is that the tree of priority queues in the present implementation contains all streams regardless of whether they have data to send or not - cf slide 14 where the assumption is that, in contrast to the dependency tree, it only contains the active streams.
Stream | ||
{ descriptor | ||
; t_last = 0 | ||
; t_last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good catch. My understanding:
- streams added under a parent node should inherit its node's
t_last
value. - the
t
value is calculated witht[i] = t_last[p] + nsent[i] * K / weight[i]
, which in this case ist_last[p] + 0 = t_last[p]
because the stream hasn't sent any bytes yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's how I understand it too.
lib/scheduler.ml
Outdated
* won't be flushed until a few write operations later. *) | ||
then ( | ||
update_t_last p_node i.t; | ||
update_t i_node written; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the previous code we were adding i_node
back to the priority queue of children'
. This is important if t
has changed in order to re-balance the tree (just mutating t
won't do it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should be able to write a test that fails with your (seemingly buggy) current implementation because of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
even trickier: we should only put the stream back in the tree if the subtree isn't active.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the previous code we were adding
i_node
back to the priority queue ofchildren'
. This is important ift
has changed in order to re-balance the tree (just mutatingt
won't do it.)
Of course you're right 🤦
I introduced that regression in my latest commit when I though I could "simplify" things. I reverted this commit although I understand you're fixing it in #204.
lib/scheduler.ml
Outdated
implicitly_close_idle_stream i.descriptor max_seen_ids; | ||
(* XXX(anmonteiro): we may not want to remove from the tree right | ||
* away. *) | ||
remove_child p_node id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this was already wrong: I don't think we can remove a child without moving all its children to the parent node too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think #201 fixes this: a subtree is not active if it has children. Therefore, this only removes leaf nodes.
I took a stab at fixing some issues here but couldn't push to your branch, so I rebased your commits + added my fixes in #204. Let's continue the discussion there. |
I believe this is only happening because we're now flushing twice for streams that have their (stream-ending) empty DATA frame pending. I don't think that's necessarily bad, though. |
This reverts commit d25743d.
superseded by #204 |
…2-async (0.10.0) CHANGES: - hpack: fix a case where hpack would raise an array out of bounds exception ([anmonteiro/ocaml-h2#183](anmonteiro/ocaml-h2#183)) ([@jonathanjameswatson](https://github.com/jonathanjameswatson)) - h2: (client) handle multiple RST_STREAM frames ([anmonteiro/ocaml-h2#184](anmonteiro/ocaml-h2#184)) ([@jonathanjameswatson](https://github.com/jonathanjameswatson)) - h2: (client) Fix a race condition with `~flush_headers_immediately:false` and empty request bodies ([anmonteiro/ocaml-h2#186](anmonteiro/ocaml-h2#186)) - h2: Make `H2.Reqd.error_code` part of the public interface ([anmonteiro/ocaml-h2#188](anmonteiro/ocaml-h2#188)) - h2: Add `~request_method` argument to `H2.Method.body_length` ([anmonteiro/ocaml-h2#190](anmonteiro/ocaml-h2#190)) ([@jonathanjameswatson](https://github.com/jonathanjameswatson)) - h2: Don't send any frames on a stream after an `RST_STREAM` frame ([anmonteiro/ocaml-h2#187](anmonteiro/ocaml-h2#187), [anmonteiro/ocaml-h2#194](anmonteiro/ocaml-h2#194)) - h2: call error handler on the client if the remote peer closes the commmunication channel ([anmonteiro/ocaml-h2#177](anmonteiro/ocaml-h2#177), [anmonteiro/ocaml-h2#196](anmonteiro/ocaml-h2#194)) - h2: when reprioritizing a stream, respect its new priority (accounts for inferred default priority when a dependent stream is not in the tree ([RFC7540§5.3.1](https://www.rfc-editor.org/rfc/rfc7540.html#section-5.3.1))) ([anmonteiro/ocaml-h2#200](anmonteiro/ocaml-h2#200)) - h2: don't remove parent streams from the scheduler if they have children ([anmonteiro/ocaml-h2#201](anmonteiro/ocaml-h2#201)) - h2: don't schedule streams as dependencies of others marked for removal ([anmonteiro/ocaml-h2#205](anmonteiro/ocaml-h2#205)) - h2: revise scheduling algorithm to avoid starvation ([anmonteiro/ocaml-h2#199](anmonteiro/ocaml-h2#199), [anmonteiro/ocaml-h2#204](anmonteiro/ocaml-h2#204), reported in [anmonteiro/ocaml-h2#162](anmonteiro/ocaml-h2#162), thanks [@quernd](https://github.com/quernd)) - h2-eio: adapt to the next gluten-eio version ([anmonteiro/ocaml-h2#210](anmonteiro/ocaml-h2#210))
Fixes #162.
The issue was that in some situations, the scheduler would not flush stream B if there was a stream A with higher priority even if stream A had no data to send. Since stream A did not send any data, its priority would not change, so it would remain at the top of the priority queue.
This would not be an issue if streams were constantly sending data at the same rate, but if some streams stopped sending (without closing the body) or streams were sending data at different rates, frames could get be stuck in the scheduler.
The fix to this is to traverse the children until a stream that needs to send data is found, if such a stream exists.
Also fixes a possibly broken test that illustrates the issue. After the changes to the scheduler, a frame that wasn't flushed before is now flushed. The test assumed that only one empty DATA frame should be sent but we closed two streams, so it has to be two frames.