Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix scheduling algorithm #199

Closed
wants to merge 8 commits into from
Closed

Conversation

quernd
Copy link
Contributor

@quernd quernd commented Feb 1, 2023

Fixes #162.

The issue was that in some situations, the scheduler would not flush stream B if there was a stream A with higher priority even if stream A had no data to send. Since stream A did not send any data, its priority would not change, so it would remain at the top of the priority queue.

This would not be an issue if streams were constantly sending data at the same rate, but if some streams stopped sending (without closing the body) or streams were sending data at different rates, frames could get be stuck in the scheduler.

The fix to this is to traverse the children until a stream that needs to send data is found, if such a stream exists.

Also fixes a possibly broken test that illustrates the issue. After the changes to the scheduler, a frame that wasn't flushed before is now flushed. The test assumed that only one empty DATA frame should be sent but we closed two streams, so it has to be two frames.

(Otherwise, new streams will be prioritized over old streams until they catch up.)
This deals with a kind of whiplash effect:
- Stream X writes a lot of data
- This places X far behind in the priority queue
- By the time stream X gets a chance to write again, it has accumulated even more data, making matters even worse

Also, the current regime didn't play well with streams that have 0 bytes to write. It would effectively starve other streams. With this change, every stream popped from the queue always advances its timer.
The issue was that in some situations, the scheduler would not flush
stream B if there was a stream A with higher priority even if stream
A had no data to send. This would not be an issue if streams were
constantly sending data, but if all streams stopped sending, frames
might be stuck in the scheduler.

The fix to this is to traverse the children until a stream that needs
to send data is found.

Also fixes a possibly broken test. After the changes to the scheduler,
a frame that wasn't flushed before is now flushed.
lib/scheduler.ml Outdated
* Without this check, we might end up in a situation where a stream
* won't be flushed until a few write operations later. *)
then (
update_t_last p_node i.t;
Copy link
Owner

@anmonteiro anmonteiro Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we were previously updating the t_last of the parent node to the t value of the stream that was popped from the pq. Now we only do that if we wrote something. Is that intentional?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is contrary to the algorithm, where it seems like it updates t_last to the popped stream even if it's not going to write anything (though this doesn't seem to be very clear to me)?

Copy link
Owner

@anmonteiro anmonteiro Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more reasoning that leads me to conclude that your code might be correct as is:

  • if we're traversing the children until we write something, the t_last for the parent would be dictated by the stream that eventually wins (writes something).
    • in this case, it doesn't matter whether we update the t_last of the parent node or not
  • if we end up not writing anything, we should keep t_last of the stream that last wrote something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite tricky and I'm not 100% convinced I'm right, but my reasoning was like yours above.
What makes it difficult to reason about is that the tree of priority queues in the present implementation contains all streams regardless of whether they have data to send or not - cf slide 14 where the assumption is that, in contrast to the dependency tree, it only contains the active streams.

Stream
{ descriptor
; t_last = 0
; t_last
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good catch. My understanding:

  • streams added under a parent node should inherit its node's t_last value.
  • the t value is calculated with t[i] = t_last[p] + nsent[i] * K / weight[i], which in this case is t_last[p] + 0 = t_last[p] because the stream hasn't sent any bytes yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's how I understand it too.

lib/scheduler.ml Outdated
* won't be flushed until a few write operations later. *)
then (
update_t_last p_node i.t;
update_t i_node written;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the previous code we were adding i_node back to the priority queue of children'. This is important if t has changed in order to re-balance the tree (just mutating t won't do it.)

Copy link
Owner

@anmonteiro anmonteiro Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should be able to write a test that fails with your (seemingly buggy) current implementation because of that.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even trickier: we should only put the stream back in the tree if the subtree isn't active.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the previous code we were adding i_node back to the priority queue of children'. This is important if t has changed in order to re-balance the tree (just mutating t won't do it.)

Of course you're right 🤦
I introduced that regression in my latest commit when I though I could "simplify" things. I reverted this commit although I understand you're fixing it in #204.

lib/scheduler.ml Outdated
implicitly_close_idle_stream i.descriptor max_seen_ids;
(* XXX(anmonteiro): we may not want to remove from the tree right
* away. *)
remove_child p_node id);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this was already wrong: I don't think we can remove a child without moving all its children to the parent node too.

Copy link
Owner

@anmonteiro anmonteiro Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #201 fixes this: a subtree is not active if it has children. Therefore, this only removes leaf nodes.

@anmonteiro
Copy link
Owner

I took a stab at fixing some issues here but couldn't push to your branch, so I rebased your commits + added my fixes in #204. Let's continue the discussion there.

@anmonteiro
Copy link
Owner

Also fixes a possibly broken test that illustrates the issue. After the changes to the scheduler, a frame that wasn't flushed before is now flushed. The test assumed that only one empty DATA frame should be sent but we closed two streams, so it has to be two frames.

I believe this is only happening because we're now flushing twice for streams that have their (stream-ending) empty DATA frame pending. I don't think that's necessarily bad, though.

This reverts commit d25743d.
@anmonteiro
Copy link
Owner

superseded by #204

@anmonteiro anmonteiro closed this Feb 4, 2023
anmonteiro added a commit to anmonteiro/opam-repository that referenced this pull request Mar 17, 2023
…2-async (0.10.0)

CHANGES:

- hpack: fix a case where hpack would raise an array out of bounds exception
  ([anmonteiro/ocaml-h2#183](anmonteiro/ocaml-h2#183))
  ([@jonathanjameswatson](https://github.com/jonathanjameswatson))
- h2: (client) handle multiple RST_STREAM frames
  ([anmonteiro/ocaml-h2#184](anmonteiro/ocaml-h2#184))
  ([@jonathanjameswatson](https://github.com/jonathanjameswatson))
- h2: (client) Fix a race condition with `~flush_headers_immediately:false` and
  empty request bodies
  ([anmonteiro/ocaml-h2#186](anmonteiro/ocaml-h2#186))
- h2: Make `H2.Reqd.error_code` part of the public interface
  ([anmonteiro/ocaml-h2#188](anmonteiro/ocaml-h2#188))
- h2: Add `~request_method` argument to `H2.Method.body_length`
  ([anmonteiro/ocaml-h2#190](anmonteiro/ocaml-h2#190))
  ([@jonathanjameswatson](https://github.com/jonathanjameswatson))
- h2: Don't send any frames on a stream after an `RST_STREAM` frame
  ([anmonteiro/ocaml-h2#187](anmonteiro/ocaml-h2#187),
  [anmonteiro/ocaml-h2#194](anmonteiro/ocaml-h2#194))
- h2: call error handler on the client if the remote peer closes the
  commmunication channel
  ([anmonteiro/ocaml-h2#177](anmonteiro/ocaml-h2#177),
  [anmonteiro/ocaml-h2#196](anmonteiro/ocaml-h2#194))
- h2: when reprioritizing a stream, respect its new priority (accounts for
  inferred default priority when a dependent stream is not in the tree
  ([RFC7540§5.3.1](https://www.rfc-editor.org/rfc/rfc7540.html#section-5.3.1)))
  ([anmonteiro/ocaml-h2#200](anmonteiro/ocaml-h2#200))
- h2: don't remove parent streams from the scheduler if they have children
  ([anmonteiro/ocaml-h2#201](anmonteiro/ocaml-h2#201))
- h2: don't schedule streams as dependencies of others marked for removal
  ([anmonteiro/ocaml-h2#205](anmonteiro/ocaml-h2#205))
- h2: revise scheduling algorithm to avoid starvation
  ([anmonteiro/ocaml-h2#199](anmonteiro/ocaml-h2#199),
  [anmonteiro/ocaml-h2#204](anmonteiro/ocaml-h2#204), reported in
  [anmonteiro/ocaml-h2#162](anmonteiro/ocaml-h2#162), thanks
  [@quernd](https://github.com/quernd))
- h2-eio: adapt to the next gluten-eio version
  ([anmonteiro/ocaml-h2#210](anmonteiro/ocaml-h2#210))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scheduling algorithm starves streams
2 participants