Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WiP admin: add flow control to admin filter #19898

Closed
wants to merge 16 commits into from

Conversation

jmarantz
Copy link
Contributor

Commit Message: This builds on #19831 , adding an attempt to use http filter methods to deploy flow-control, so we can pause generation of admin output until clients are ready for it.

This is not ready yet -- @alyssawilk has explained to me how to get the flow-control to work

  • use high-watermark callback to set a bit in the filter to avoid sending more data
  • use low-watermark callback to clear the above bit and send another chunk, and then queue up a callback for the next chunk (assuming high-water callback is not called).

This pattern should also be in the http cache filter and also the Router, though the latter is a lot more complex.
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

Signed-off-by: Joshua Marantz <[email protected]>
Signed-off-by: Joshua Marantz <[email protected]>
@repokitteh-read-only
Copy link

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #19898 was opened by jmarantz.

see: more, trace.

decoder_callbacks_->encodeData(response, end_stream);
}
if (more_data) {
decoder_callbacks_->dispatcher().post([this]() { nextChunk(); });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was thinking about this while looking at #19831 -- do we expect a handler to be able to return true from nextChunk without adding response data as a signal that it wishes to pause output (e.g. while waiting for some computation or event)? Because this seems like it'll spin (although not block, because of the dispatcher callbacks) calling nextChunk.

Might be something to make clear in the docs for nextChunk, one way or the other. And I guess if it's valid to pause there probably should be a more explicit signal and a way to indicate "ready to resume."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good questions and I'm evolving my understanding of how this ought to work, with much help from @alyssawilk .

One concern here is we don't want to starve other concurrent requests sharing the main thread, so we post as a way of doing cooperative multi-tasking (see #16425). In H2 this would also allow other streams on the same connection to make progress, but here we are also thread-bound.

Another consideration is that admin does not make any upstream requests: all the data it needs is basically ready to go, but it may be too much for the downstream to swallow at once, and we don't want to buffer it all either.

I am not that concerned about spinning with a nextChunk not providing any data, but returning true, in the context of #19693 -- that will not provide any empty chunks. Maybe I should disallow that case as I don't need to have it for that application.

In one intermediate version of this PR I didn't have nextChunk return a bool; it indicated completion by adding nothing to the response. That worked, but the call-sites turned out to be a lot messier.

jmarantz added a commit that referenced this pull request Feb 11, 2022
Commit Message: The admin handler structure uses a single callback that is expected to fill in the entire response body, making it impossible to stream data out as it is generated. This PR refactors the internal admin structures to require a Handler object with a nextChunk method, to enable chunking. It defers to other PRs to
 * actually convert any handlers to use chunking (e.g. WiP stats: Reimplement existing /stats API in a way that can be chunked. #19693)
 * implement proper flow control in the AdminFilter to avoid generating chunks before clients are ready to receive them (WiP admin: add flow control to admin filter #19898)

Additional Description:
Risk Level: medium -- this is a pure refactor but not a trivial one. The same exact admin content should be generated before/after this change
Testing: //test/...
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Joshua Marantz <[email protected]>
joshperry pushed a commit to joshperry/envoy that referenced this pull request Feb 13, 2022
Commit Message: The admin handler structure uses a single callback that is expected to fill in the entire response body, making it impossible to stream data out as it is generated. This PR refactors the internal admin structures to require a Handler object with a nextChunk method, to enable chunking. It defers to other PRs to
 * actually convert any handlers to use chunking (e.g. WiP stats: Reimplement existing /stats API in a way that can be chunked. envoyproxy#19693)
 * implement proper flow control in the AdminFilter to avoid generating chunks before clients are ready to receive them (WiP admin: add flow control to admin filter envoyproxy#19898)

Additional Description:
Risk Level: medium -- this is a pure refactor but not a trivial one. The same exact admin content should be generated before/after this change
Testing: //test/...
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Joshua Marantz <[email protected]>
Signed-off-by: Josh Perry <[email protected]>
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Mar 13, 2022
@jmarantz jmarantz added no stalebot Disables stalebot from closing an issue and removed stale stalebot believes this issue/PR has not been touched recently labels Mar 13, 2022
@jmarantz
Copy link
Contributor Author

This is blocked on #19693

Copy link
Contributor Author

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR went stale and has been patched into #31087

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no stalebot Disables stalebot from closing an issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants