-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows: Fix quiche test compilation #12898
Windows: Fix quiche test compilation #12898
Conversation
bazel/external/quiche.patch
Outdated
++consecutive_pto_count_; | ||
pending_timer_transmission_count_ = max_probe_packets_per_pto_; | ||
return PTO_MODE; | ||
+ default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part of the patch can be revised if there is a quiche codebase convention for aborting/panicking etc., this is related to the compilation error:
Execution platform: @local_config_platform//:host
C:\_eb\execroot\envoy\bazel-out\x64_windows-opt\bin\external\com_googlesource_quiche\quiche\quic\core\quic_sent_packet_manager.cc(770) : error C2220: the following warning is treated as an error
C:\_eb\execroot\envoy\bazel-out\x64_windows-opt\bin\external\com_googlesource_quiche\quiche\quic\core\quic_sent_packet_manager.cc(770) : warning C4715: 'quic::QuicSentPacketManager::OnRetransmissionTimeout': not all control paths return a value
Target //test/extensions/quic_listeners/quiche:envoy_quic_utils_test failed to build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixing it in #13066
Assigning over to @danzh2010. Can we potentially just fix these issues upstream? |
yes, I can fix it in QUICHE |
Thanks @danzh2010 whenever your changes are ready we can drop the quiche patch and bump the dependency |
Hope you had a good holiday weekend @danzh2010 - just a weekly ping to check in on this issue and unblock us. (Let us know if it will take a few weeks, and we can always patch and revert later once a fix lands upstream.) |
# GSO not supported on Windows | ||
"//bazel:windows_x86_64": [], | ||
"//conditions:default": [":udp_gso_batch_writer_lib"], | ||
}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danzh2010 it isn't at all apparent to me why this is still required after the last quiche refactoring merged to this patch, but dropping it does break compilation on linux release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the select is still needed.
deps = [ | ||
":quic_test_utils_for_envoy_lib", | ||
":test_utils_lib", | ||
"//source/extensions/quic_listeners/quiche:active_quic_listener_config_lib", | ||
"//source/extensions/quic_listeners/quiche:active_quic_listener_lib", | ||
"//source/extensions/quic_listeners/quiche:envoy_quic_utils_lib", | ||
"//source/extensions/quic_listeners/quiche:udp_gso_batch_writer_lib", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danzh2010 should I try reverting the block above in source/e/q/q/BUILD and also guard this in a select? Or is there a better component to target now in this deps list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you try reverting the changes in active_quic_listener_lib and also remove this line here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hope you had a good holiday weekend @danzh2010 - just a weekly ping to check in on this issue and unblock us. (Let us know if it will take a few weeks, and we can always patch and revert later once a fix lands upstream.)
Sorry for late reply, I was OOO till today. I'm working on the upstream fix now. It's not gonna take a few weeks.
# GSO not supported on Windows | ||
"//bazel:windows_x86_64": [], | ||
"//conditions:default": [":udp_gso_batch_writer_lib"], | ||
}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the select is still needed.
deps = [ | ||
":quic_test_utils_for_envoy_lib", | ||
":test_utils_lib", | ||
"//source/extensions/quic_listeners/quiche:active_quic_listener_config_lib", | ||
"//source/extensions/quic_listeners/quiche:active_quic_listener_lib", | ||
"//source/extensions/quic_listeners/quiche:envoy_quic_utils_lib", | ||
"//source/extensions/quic_listeners/quiche:udp_gso_batch_writer_lib", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you try reverting the changes in active_quic_listener_lib and also remove this line here?
/retest |
Retrying Azure Pipelines, to retry CircleCI checks, use |
@danzh2010 Looks like this patch is back in shape with master, if you could take a look at what remains of the initial delta. The net effect is
|
@danzh2010 my efforts to pick up the new work at quiche are proving fruitless, I expect we need you to bump and then we can clean up this PR and get the tests running. Dinking with bazel/repository_locations.bzl;
...
I experimented with multiple hacks to test this. Of the list below, the version isn't recognized at storage.googleapis.com, on quiche.googlesource.com the sha256 changes every blinking time it gets downloaded (generated on the fly with the current timestamp perhaps?) and finally, from the third mirror/fork below...
the branch's quiche.BUILD appears broken or out-of-sync;
and haven't been able to work out precisely what broke here. |
We are on hold for #13066 to be fixed in gcc and merged, at which point this patch should become simpler. /wait |
- Skip uses of GSO as it is not available on Windows - Fix syscall result types - Fix usage of Edge event trigger - move ifdef that avoids a test on non-Linux to ensure test doesn't appear to pass in test results - quiche integration test is still failing, requires further investigation Signed-off-by: Sunjay Bhatia <[email protected]> Signed-off-by: William A Rowe Jr <[email protected]>
/retest |
Retrying Azure Pipelines, to retry CircleCI checks, use |
/retest-circle |
🔨 rebuilding |
@danzh2010 after all of your efforts, the proposal is reduced to this to bring two handfuls of skipped tests back to action on Windows. I'd like you to do a first-pass over the remaining bits, I've force-merged a rebase to avoid the confusion of all the small tweaks we had had to apply. If you can give it a pass we can drop the skipped on windows by a third. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I did work on this personally with sunjay so we would both appreciate more eyeballs, the resulting state of this PR looks great to me.
Signed-off-by: Sunjay Bhatia <[email protected]> Co-authored-by: William A Rowe Jr <[email protected]>
…tests-compilation Co-authored-by: Sunjay Bhatia <[email protected]> Signed-off-by: William A Rowe Jr <[email protected]>
@ggreenway and @danzh2010 can you review and confirm our approach for getting quiche up on Windows, LGTM? Would like to accelerate this particular patch since it sits ahead of toggling clang-cl (we've already validated this quiche logic against that bump.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have an alternate suggestion for all the pragmas to disable warnings: Could we create quiche_include_pre.h and quiche_include_post.h, and bracket those around quiche includes, and put all the warning-disable flags, and compiler ifdefs for warning-disable flags, into the pre and post, so we don't duplicate all that junk in every file?
Network::UdpPacketWriterPtr udp_packet_writer = | ||
std::make_unique<Quic::UdpGsoBatchWriter>(io_handle, scope); | ||
std::make_unique<Network::UdpDefaultWriter>(io_handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The need for this change will go away after #13086 merges.
@danzh2010 what was the reason for using the GSO batch writer in this test instead of the default writer? Coverage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's wait for the other PR to merge and then come back to this. It should merge soon.
/wait
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The need for this change will go away after #13086 merges.
@danzh2010 what was the reason for using the GSO batch writer in this test instead of the default writer? Coverage?
Just to test more the interaction between the batch writer and QUIC listener because mostly likely the batch writer will be used in production.
I would not do that. precisely.... rather have a quiche_core_includes.h header that handles the 80/20 of the most commonly needed quiche headers by many sources, and handles the pragmas and inclusions. The other rare exceptions can be handled by the sources which consume them. The actual pragmas warnings ignored required do vary by included header and compiler (including clang/gcc differences which are glossed over at present.) |
For compiling external-to-envoy source, I'd say we can just disable all the warnings that occur in any quiche header, for every quiche header. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danzh2010 I'm not sure where you would like these notes recorded, but this is feedback of the latest effort to upstream problems we are hitting with quiche;
We still have to disable -wd4068, because of unguarded #pragma GCC (originally, clang) occurring in quiche external prototype headers. But that isn't a big problem if we modify quiche.genrule_cmd to do the right thing with respect to clang, vs gcc, vs msvc, and I'm adding a note for bazel_internal.bzl to that effect.
This patch now includes many commits just to undo the #pragma GCC headache that existed at envoy. The guards could be more carefully focused as needed, by checking the GCC version, use clang or msvc equivalents where appropriate, and even be fancier about it if we want; https://stackoverflow.com/questions/3378560/how-to-disable-gcc-warnings-for-a-few-lines-of-code - this is just a straightforward stopgap.
Globally from bazel/external/quiche.BUILD, this is a newer addition to gcc since (Ubuntu 9.3.0-11ubuntu0~18.04.1) 9.3.0;;
cc1plus: error: unrecognized command line option '-Wno-range-loop-analysis' [-Werror] cc1plus: all warnings being treated as errors
It seems we need a feature detection here and dodge specific gcc flavors.
We have this compilation failure on gcc (Ubuntu 9.3.0-11ubuntu0~18.04.1) 9.3.0;
bazel-out/k8-fastbuild/bin/external/com_googlesource_quiche/quiche/quic/core/quic_framer.cc: In member function 'bool quic::QuicFramer::AppendIetfAckFrameAndTypeByte(const quic::QuicAckFrame&, quic::QuicDataWriter*)': bazel-out/k8-fastbuild/bin/external/com_googlesource_quiche/quiche/quic/core/quic_framer.cc:5546:61: error: comparison of integer expressions of different signedness: 'size_t' {aka 'long unsigned int'} and 'int' [-Werror=sign-compare] 5546 | static_cast<size_t>(writer->remaining() - ecn_size) < | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ 5547 | QuicDataWriter::GetVarInt62Len(gap) + | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5548 | QuicDataWriter::GetVarInt62Len(ack_range)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Perhaps a judicious -Wno-sign-compare would work around this, but that's just a hack. What about ssize_t rather than of size_t? I'm curious why only we see it on gcc default bazel build, and it isn't observed in CI.
Fastbuild does not set _DEBUG. It also does not set NDEBUG; examining this error on windows msvc-cl;
ERROR: C:/_eb/external/com_googlesource_quiche/BUILD.bazel:2192:17: C++ compilation of rule '@com_googlesource_quiche//:quic_core_frames_frames_lib' failed (Exit 2) C:\_eb\execroot\envoy\bazel-out\x64_windows-fastbuild\bin\external\com_googlesource_quiche\quiche/quic/core/frames/quic_frame.h(99): error C2926: 'quic::QuicFrame::<unnamed-tag>::delete_forbidden': a default member initializer is not allowed for a member of an anonymous struct within a union
It isn't clear if this defect is specific to Windows (due to the previous error on gcc), but by papering over this in quiche.BUILD with a feature define to force QUIC_FRAME_DEBUG=0 got us past this new problem.
I will fix these 3 issues in upstream. As to #pragma GCC
override, I don't have strong preference in either way, they all seems to be one time change and eventually, QUICHE will be in a state that we don't need them.
tags = [ | ||
"fails_on_windows", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious which test fails on Windows. Can you file an issue about the test failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup (actually, now two failures since the last merge) #13273
Network::UdpPacketWriterPtr udp_packet_writer = | ||
std::make_unique<Quic::UdpGsoBatchWriter>(io_handle, scope); | ||
std::make_unique<Network::UdpDefaultWriter>(io_handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The need for this change will go away after #13086 merges.
@danzh2010 what was the reason for using the GSO batch writer in this test instead of the default writer? Coverage?
Just to test more the interaction between the batch writer and QUIC listener because mostly likely the batch writer will be used in production.
…tests-compilation Signed-off-by: William A Rowe Jr <[email protected]>
That's awesome news, thank you. Lets please still land this patch in the interim? I'd rather we didn't miss the 1.16 window due to upstream/elsewhere issues. @ggreenway I think you might agree that there is zero return for investing more cycles at refactoring the existing pragma GCC mess, if we expect to eliminate them wholesale as the quiche project catches up with the language standards? |
Makes sense. Let's go with the pragma changes in this PR; we can revisit later if the situation changes. |
Signed-off-by: William A Rowe Jr <[email protected]>
Signed-off-by: William A Rowe Jr <[email protected]>
Signed-off-by: Sunjay Bhatia <[email protected]> Signed-off-by: William A Rowe Jr <[email protected]> Co-authored-by: Sunjay Bhatia <[email protected]> Co-authored-by: William A Rowe Jr <[email protected]>
Signed-off-by: William A Rowe Jr <[email protected]> Co-authored-by: Sunjay Bhatia <[email protected]>
- Mark 2 tests failing - one due to expecting buffered Gso style support - expect that the integration test is failing for a common defect between other failing integration tests - not regressions of this PR - explicitly discard result in test Signed-off-by: Sunjay Bhatia <[email protected]> Co-authored-by: William A Rowe Jr <[email protected]>
@mattklein123 I think this satisfies all your questions and concerns; the ASAN failure is our shared CI issue, not a regression. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Let's ship and iterate to avoid future breakage.
Commit Message:
Windows: Fix quiche test compilation
appear to pass in test results
investigation
Additional Description:
Regarding quiche patch:
Risk Level: Low
Testing: Enables quiche tests on Windows, verified locally on Windows
Docs Changes: N/A
Release Notes: N/A
Fixes #10420