Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm: Add data buffering for chunks #36411

Merged
merged 6 commits into from
Oct 15, 2024

Conversation

juanmolle
Copy link
Contributor

Commit Message: wasm: Add data buffering for chunks

Additional Description:
In HTTP/2 connections, the last chunk, which contains the end_of_stream flag, is not used to call the Wasm callback. This fix addresses the issue by dumping the data into the buffer before calling the Wasm callback, ensuring that the data is now present.

Risk Level: Low
Testing: yes
Docs Changes: n/a
Release Notes: yes
Platform Specific Features: n/a

Fixes #35884

Signed-off-by: Juan Manuel Ollé <[email protected]>
@juanmolle
Copy link
Contributor Author

/retest

@mpwarres
Copy link
Contributor

mpwarres commented Oct 2, 2024

Ack. I am a bit overloaded at the moment but will review this evening.

@alyssawilk
Copy link
Contributor

@mpwarres ping?

yanavlasov
yanavlasov previously approved these changes Oct 9, 2024
@yanavlasov
Copy link
Contributor

This looks good to me. Will wait for @mpwarres review and then commit

/wait-any

Comment on lines +1726 to +1728
if (buffering_request_body_) {
decoder_callbacks_->addDecodedData(data, false);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will make the final output contains repeated data piece?

/wait

Copy link
Contributor Author

@juanmolle juanmolle Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't familiar with that function. and when try to understand where it was used and for what, I found this comment
https://github.com/envoyproxy/envoy/blob/main/source/server/admin/admin_filter.cc#L25
in our case streamming is a supported scenario and that is the reason it is only dumped in buffering scenario

I think I have that scenario covered with the test I have added. the one add at the end of the body '.0'
if other. scenario is not covered let me know I could add a test for that.
I made the suite for http and http/2 because this issue is not happening in http1, it seams the library is always sending the last chunk containing end_of_stream with 0 bytes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the addDecodedData use the move semantics. So, yeah, then this should be safe. I will take a more detailed look this night.

Comment on lines +216 to +217
auto request_body = std::vector<std::string>{{"request_"}, {"body"}};
auto upstream_response_body = std::vector<std::string>{{"upstream_"}, {"body"}};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add a test with 3 (or more) pieces?

For the first piece, the addDecodedData won't be called before it haven't enter the buffering mode.
For the second piece, the addDecodeData will be called but it's not the ending, so, the filter chain will still be stoped to handle the empty piece (the piece has been moved by the addDecodedData).
For the third piece, the addDecodedData will be called and it the ending, so the filter chain will continue.

@wbpcode
Copy link
Member

wbpcode commented Oct 10, 2024

One comment to the test. Other things are fine. Thanks for this great fix. 🌷

Signed-off-by: Juan Manuel Ollé <[email protected]>
@wbpcode
Copy link
Member

wbpcode commented Oct 11, 2024

ping for @mpwarres for final review.

Copy link
Contributor

@mpwarres mpwarres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just minor/cosmetic comments. Thanks!

@@ -1801,7 +1807,7 @@ Http::FilterDataStatus Context::encodeData(::Envoy::Buffer::Instance& data, bool
buffering_response_body_ = false;
switch (result) {
case Http::FilterDataStatus::Continue:
request_body_buffer_ = nullptr;
response_body_buffer_ = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this as well.

logBody(type);
if (end_of_stream) {
getBufferStatus(type, &size, &flags);
setBuffer(type, size, 0, ".0");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional) nit: for consistency with other operation handling in this plugin, where descriptive text (e.g. "partial.replace") is used as the data that is inserted, you might consider using ".end" or ".appended" instead of ".0".

@@ -12,13 +12,21 @@ namespace {

class WasmFilterIntegrationTest
: public HttpIntegrationTest,
public testing::TestWithParam<std::tuple<std::string, std::string, bool>> {
public testing::TestWithParam<
std::tuple<std::tuple<std::string, std::string, bool>, Http::CodecType>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: you could define a helper function similar to wasmDualFilterTestMatrix to allow the test to use a flat tuple<std::string, std::string, bool, Http::CodecType> rather than nesting tuples, which gets a little awkward with the nested std::get<>s.

static std::string
testParamsToString(const ::testing::TestParamInfo<
std::tuple<std::tuple<std::string, std::string, bool>, Http::CodecType>>& p) {
return fmt::format("{}_{}_{}_{}", std::get<2>(std::get<0>(p.param)) ? "downstream" : "upstream",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: if you decide to stick with nested tuples for test params, structured bindings might make this a little easier to read, e.g.

auto [wasm_test_params, codec] = p.param;
auto [runtime, language, direction] = wasm_test_params;
return fmt::format("{}_{}_{}_{}", direction ? "downstream" : "upstream", runtime, language, ...);

Signed-off-by: Juan Manuel Ollé <[email protected]>
@juanmolle juanmolle requested a review from mpwarres October 14, 2024 14:29
Signed-off-by: Juan Manuel Ollé <[email protected]>
wasmDualFilterWithCodecsTestMatrix(bool include_nullvm, bool cpp_only,
std::vector<Http::CodecType> codecs_type) {
std::vector<std::tuple<std::string, std::string, bool, Http::CodecType>> values;
for (const auto& codec_type : codecs_type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicit type is better than auto except the type name is complex.
And you should use value semantics here rather than reference semantics for enum.

By the way, I think you could also use structured bindings in this helper function?

Signed-off-by: Juan Manuel Ollé <[email protected]>
@juanmolle juanmolle requested a review from wbpcode October 14, 2024 18:48
Copy link
Contributor

@mpwarres mpwarres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM!

@wbpcode
Copy link
Member

wbpcode commented Oct 15, 2024

This is a bug fix of wasm and won't make thing worse anyway. So, will let it enter the 1.32 at the last minute.

@wbpcode wbpcode merged commit 2e4ee89 into envoyproxy:main Oct 15, 2024
20 checks passed
@juanmolle juanmolle deleted the fix_wasm_body_buffering branch October 15, 2024 11:33
grnmeira pushed a commit to grnmeira/envoy that referenced this pull request Oct 17, 2024
Commit Message: wasm: Add data buffering for chunks

Additional Description:
In HTTP/2 connections, the last chunk, which contains the end_of_stream
flag, is not used to call the Wasm callback. This fix addresses the
issue by dumping the data into the buffer before calling the Wasm
callback, ensuring that the data is now present.

Risk Level: Low
Testing: yes
Docs Changes: n/a
Release Notes: yes
Platform Specific Features: n/a

Fixes  envoyproxy#35884

---------

Signed-off-by: Juan Manuel Ollé <[email protected]>
Signed-off-by: Gustavo <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Response Body buffer is incomplete when using WASM extension and HTTP/2
5 participants