sqlproxyccl: flake in TestProxyProtocol #105585

Closed
yuzefovich opened this issue Jun 26, 2023 · 1 comment · Fixed by #105589
Labels
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-server-and-security: DB Server & Security

Comments

yuzefovich commented Jun 26, 2023

Seen on unrelated change here:

=== RUN   TestProxyProtocol/allow=false
    proxy_handler_test.go:2856: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/ccl/sqlproxyccl/proxy_handler_test.go:2883
        	Error:      	Expected nil, but got: &sqlproxyccl.errWithCode{
        	            	    code:  4,
        	            	    cause: &withstack.withStack{
        	            	        cause: &errutil.withPrefix{
        	            	            cause:  &errors.errorString{s:"invalid length of startup packet: 369295612"},
        	            	            prefix: "while receiving startup message",
        	            	        },
        	            	        stack: &withstack.stack{0x447439d, 0x44781ca, 0x447e879, 0x154b0a6, 0x4c4b41},
        	            	    },
        	            	}
        	Test:       	TestProxyProtocol/allow=false
    --- FAIL: TestProxyProtocol/allow=false (0.02s)

Jira issue: CRDB-29112

@yuzefovich yuzefovich added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jun 26, 2023
@blathers-crl blathers-crl bot added the T-server-and-security DB Server & Security label Jun 26, 2023
@jaylim-crl

Thanks - will take a look.

craig bot pushed a commit that referenced this issue Jun 27, 2023
105316: obsservice: migrate gRPC ingest to follow new architecture r=knz a=abarganier

**Reviewer note: review this PR commit-wise.**

----

In the original design of the obsservice, exported events were
intended to be written directly to storage. The idea was that
exported events would experience minimal transformation once
ingested, meaning that work done to "package" events properly
was left up to the exporting client (CRDB). The obsservice
would then store the ingested events into a target storage.
This concept of target storage has been removed for now as
part of this patch.

In the new architecture, exported events are more "raw", and
we expect the obsservice to heavily transform & aggregate the
data externally, where the aggregated results are flushed
to storage instead.

This patch takes the pre-existing gRPC events ingester, and
modifies it to meet the new architecture.

The events ingester will now be provided with a consumer with
which it can feed ingested events into the broader pipeline.
It is no longer the responsibility of the ingester to write
ingested events to storage.

For now, we use a simple STDOUT consumer that writes all
ingested events to STDOUT, but in the future, this will
be a more legitimate component - part of a chain that
eventually buffers ingested events for aggregation.
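
The ingester/consumer split described above could be sketched roughly as follows. This is an illustrative sketch only: `Event`, `EventConsumer`, `ConsumerFunc`, `StdoutConsumer`, and `Ingester` are hypothetical names chosen for the example and are not taken from the obsservice code.

```go
package main

import (
	"fmt"
	"os"
)

// Event is a hypothetical stand-in for an exported event.
type Event struct {
	Type    string
	Payload string
}

// EventConsumer is a hypothetical interface: the ingester hands every
// ingested event to a consumer instead of writing to storage itself.
type EventConsumer interface {
	Consume(ev Event) error
}

// ConsumerFunc adapts a plain function to the EventConsumer interface.
type ConsumerFunc func(Event) error

// Consume calls the wrapped function.
func (f ConsumerFunc) Consume(ev Event) error { return f(ev) }

// StdoutConsumer writes each event to STDOUT, as a placeholder for a
// later pipeline stage that would buffer events for aggregation.
type StdoutConsumer struct{}

// Consume prints the event as a single line.
func (StdoutConsumer) Consume(ev Event) error {
	_, err := fmt.Fprintf(os.Stdout, "%s: %s\n", ev.Type, ev.Payload)
	return err
}

// Ingester forwards events to its consumer and knows nothing about storage.
type Ingester struct {
	consumer EventConsumer
}

// Ingest feeds each event to the consumer, stopping at the first error.
func (i Ingester) Ingest(events []Event) error {
	for _, ev := range events {
		if err := i.consumer.Consume(ev); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	ing := Ingester{consumer: StdoutConsumer{}}
	if err := ing.Ingest([]Event{{Type: "eventlog", Payload: "node started"}}); err != nil {
		fmt.Fprintln(os.Stderr, "ingest failed:", err)
	}
}
```

Because the ingester depends only on the interface, swapping the STDOUT consumer for a buffering one would not require changes to the ingester in this sketch.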

Release note: none

Epic: CRDB-28526

105589: ccl/sqlproxyccl: fix possible flake in TestProxyProtocol r=pjtatlow a=jaylim-crl

Fixes #105585.

This commit updates the TestProxyProtocol test to only test the case where RequireProxyProtocol=true. There's no point testing the case where the RequireProxyProtocol field is false, since none of the other tests use the proxy protocol (so that case is already implicitly covered by them).

It's unclear what is causing this test flake (and it is extremely rare, i.e. 1 legit failure out of 1000 runs [1]). It may be due to some sort of race within the tests, but given that the case is covered by all other tests, this commit opts to remove the test entirely.

[1] https://teamcity.cockroachdb.com/test/-1121006080109385641?currentProjectId=Cockroach_Ci_TestsGcpLinuxX8664BigVm&expandTestHistoryChartSection=true

Release note: None

Release justification: Fixes a test flake.

Epic: none

105630: roachtest: handle panics in `mixedversion` r=smg260 a=renatolabs

Previously, a panic in a user function in a roachtest using the `mixedversion` package would crash the entire roachtest process. This is because all steps run in a separate goroutine, so if panics are not captured, the entire process crashes.

This commit updates the test runner so that all steps (including those that are part of the test infrastructure) run with panics captured. The panic message is returned as a regular error which should lead to usual GitHub error reports. The stack trace for the panic is also logged so that we can pinpoint the exact offending line in the test.
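
The general technique described above (recovering a panic inside the step's goroutine, logging the stack trace, and returning the panic as a regular error) can be sketched as follows. This is a minimal sketch, not the `mixedversion` implementation; `runStepSafely` and the step name are hypothetical.

```go
package main

import (
	"fmt"
	"log"
	"runtime/debug"
)

// runStepSafely is a hypothetical helper: it runs step and converts a panic
// into an ordinary error, logging the stack trace so the offending line can
// still be located.
func runStepSafely(name string, step func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			log.Printf("panic in step %q: %v\n%s", name, r, debug.Stack())
			err = fmt.Errorf("panic (stack trace logged): %v", r)
		}
	}()
	return step()
}

func main() {
	errCh := make(chan error, 1)
	// Steps run in their own goroutine; without the recover above, a panic
	// here would crash the whole process.
	go func() {
		errCh <- runStepSafely("user-step", func() error {
			var m map[string]int
			m["x"] = 1 // writing to a nil map panics
			return nil
		})
	}()
	fmt.Println("step result:", <-errCh)
}
```

The recover must happen in a deferred function on the goroutine that panics; a recover in the goroutine that spawned it would not catch the panic.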

Epic: CRDB-19321

Release note: None

Co-authored-by: Alex Barganier
Co-authored-by: Jay
Co-authored-by: Renato Costa
@craig craig bot closed this as completed in bcfe501 Jun 27, 2023
blathers-crl bot pushed a commit that referenced this issue Jun 27, 2023