Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[receiver/windowseventlog] Flaky test TestReadWindowsEventLogger #34687

Open
pjanotti opened this issue Aug 14, 2024 · 9 comments · Fixed by #34793
Open

[receiver/windowseventlog] Flaky test TestReadWindowsEventLogger #34687

pjanotti opened this issue Aug 14, 2024 · 9 comments · Fixed by #34793

Comments

@pjanotti
Copy link
Contributor

Component(s)

receiver/windowseventlog

Describe the issue you're reporting

Flaky test: https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10391822945/job/28775648414?pr=34656#step:6:571

=== FAIL: . TestReadWindowsEventLogger (20.08s)
    receiver_windows_test.go:108: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:108
        	Error:      	Condition never satisfied
        	Test:       	TestReadWindowsEventLogger

=== FAIL: . TestReadWindowsEventLogger (re-run 1) (20.02s)
    receiver_windows_test.go:108: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:108
        	Error:      	Condition never satisfied
        	Test:       	TestReadWindowsEventLogger
@pjanotti pjanotti added the needs triage New item requiring triage label Aug 14, 2024
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@pjanotti
Copy link
Contributor Author

/label os:windows

@djaglowski djaglowski removed the needs triage New item requiring triage label Aug 16, 2024
@djaglowski
Copy link
Member

+1: https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10422926675/job/28869515325?pr=34720#step:6:253

I thought this was due to other changes in the PR but it appears to be pre-existing.

@pjanotti
Copy link
Contributor Author

Yes, @djaglowski - I was glancing at the test code and there are many non-reliable things happening in the tests. I will start working on improving these tests.

@songy23 songy23 added the flaky test a test is flaky label Aug 16, 2024
f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this issue Sep 12, 2024
)

**Description:**
There are 3 main issues being addressed on this PR:

* There is a race between start and generating the events in the tests:
the start require goroutines so a delay before generating the events
will be advisable (not perfect but good enough per my tests)
* The tests assume that only the tests are generating events in that
time window, that is impossible to guarantee on a windows box, so
instead of checking for exactly one event they need to get new events
and check the event provider instead
* The exclusion test checks if zero events were delivered by the
receiver, it should check if one not filtered out was received, and this
one should be generated after the event from the provider that should be
excluded.

**Link to tracking Issue:**
Fixes open-telemetry#34687

**Testing:**
Multiple local runs, we should do a few runs on CI given the runner
characteristics.

**Documentation:**
N/A
@songy23
Copy link
Member

songy23 commented Oct 16, 2024

This is still happening https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/11371732717/job/31634469890

=== Failed
=== FAIL: . TestReadWindowsEventLoggerRaw (13.26s)
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:330
        	            				C:/hostedtoolcache/windows/go/1.22.8/x64/src/runtime/asm_amd64.s:1695
        	Error:      	"[]" should have 1 item(s), but has 0
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:328
        	            				D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:209
        	Error:      	Condition never satisfied
        	Test:       	TestReadWindowsEventLoggerRaw
    receiver_windows_test.go:210: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:210
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestReadWindowsEventLoggerRaw

=== FAIL: . TestExcludeProvider/with_Raw (13.01s)
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:330
        	            				C:/hostedtoolcache/windows/go/1.22.8/x64/src/runtime/asm_amd64.s:1695
        	Error:      	"[]" should have 1 item(s), but has 0
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:328
        	            				D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:277
        	Error:      	Condition never satisfied
        	Test:       	TestExcludeProvider/with_Raw
    receiver_windows_test.go:278: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:278
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestExcludeProvider/with_Raw

=== FAIL: . TestExcludeProvider (17.78s)

=== FAIL: . TestReadWindowsEventLoggerRaw (re-run 1) (13.27s)
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:330
        	            				C:/hostedtoolcache/windows/go/1.22.8/x64/src/runtime/asm_amd64.s:1695
        	Error:      	"[]" should have 1 item(s), but has 0
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:328
        	            				D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:209
        	Error:      	Condition never satisfied
        	Test:       	TestReadWindowsEventLoggerRaw
    receiver_windows_test.go:210: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:210
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestReadWindowsEventLoggerRaw

DONE 2 runs, 20 tests, 4 failures in 68.683s

@songy23 songy23 reopened this Oct 16, 2024
@pjanotti
Copy link
Contributor Author

A different test, but, for now keeping track of this failure here, they may have the same root cause - will open separate issue/PR is appropriate.

Data race:
https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/11805232755/job/32887233059#step:8:581

=== FAIL: . TestExcludeProvider/with_Raw (13.02s)
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:330
        	            				C:/hostedtoolcache/windows/go/1.22.8/x64/src/runtime/asm_amd64.s:1695
        	Error:      	"[]" should have 1 item(s), but has 0
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:328
        	            				D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:277
        	Error:      	Condition never satisfied
        	Test:       	TestExcludeProvider/with_Raw
==================
WARNING: DATA RACE
Read at 0x00c000008ba0 by goroutine 59:
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.assertExpectedLogRecords()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:333 +0x20e
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.TestExcludeProvider.func1()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:277 +0x5e4
  testing.tRunner()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1742 +0x44

Previous write at 0x00c000008ba0 by goroutine 102:
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.assertExpectedLogRecords.func1()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:329 +0xa4
  github.com/stretchr/testify/assert.EventuallyWithT.func1()
      C:/Users/runneradmin/go/pkg/mod/github.com/stretchr/[email protected]/assert/assertions.go:1983 +0xb7

Goroutine 59 (running) created at:
  testing.(*T).Run()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1742 +0x825
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.TestExcludeProvider()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:248 +0x21b
  testing.tRunner()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1742 +0x44

Goroutine 102 (finished) created at:
  github.com/stretchr/testify/assert.EventuallyWithT()
      C:/Users/runneradmin/go/pkg/mod/github.com/stretchr/[email protected]/assert/assertions.go:1978 +0x42a
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.assertExpectedLogRecords()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:328 +0x204
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.TestExcludeProvider.func1()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:277 +0x5e4
  testing.tRunner()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1742 +0x44
==================
    receiver_windows_test.go:278: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:278
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestExcludeProvider/with_Raw
    testing.go:1398: race detected during execution of test

=== FAIL: . TestExcludeProvider (17.81s)

=== FAIL: . TestExcludeProvider/with_Raw (re-run 1) (13.03s)
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:330
        	            				C:/hostedtoolcache/windows/go/1.22.8/x64/src/runtime/asm_amd64.s:1695
        	Error:      	"[]" should have 1 item(s), but has 0
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:328
        	            				D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:277
        	Error:      	Condition never satisfied
        	Test:       	TestExcludeProvider/with_Raw
==================
WARNING: DATA RACE
Read at 0x00c000342420 by goroutine 16:
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.assertExpectedLogRecords()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:333 +0x20e
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.TestExcludeProvider.func1()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:277 +0x5e4
  testing.tRunner()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1742 +0x44

Previous write at 0x00c000342420 by goroutine 59:
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.assertExpectedLogRecords.func1()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:329 +0xa4
  github.com/stretchr/testify/assert.EventuallyWithT.func1()
      C:/Users/runneradmin/go/pkg/mod/github.com/stretchr/[email protected]/assert/assertions.go:1983 +0xb7

Goroutine 16 (running) created at:
  testing.(*T).Run()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1742 +0x825
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.TestExcludeProvider()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:248 +0x21b
  testing.tRunner()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1742 +0x44

Goroutine 59 (finished) created at:
  github.com/stretchr/testify/assert.EventuallyWithT()
      C:/Users/runneradmin/go/pkg/mod/github.com/stretchr/[email protected]/assert/assertions.go:1978 +0x42a
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.assertExpectedLogRecords()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:328 +0x204
  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/windowseventlogreceiver.TestExcludeProvider.func1()
      D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:277 +0x5e4
  testing.tRunner()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1[689](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/11805232755/job/32887233059#step:8:690) +0x21e
  testing.(*T).Run.gowrap1()
      C:/hostedtoolcache/windows/go/1.22.8/x64/src/testing/testing.go:1742 +0x44
==================
    receiver_windows_test.go:278: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:278
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestExcludeProvider/with_Raw
    testing.go:1398: race detected during execution of test
make[2]: *** [../../Makefile.Common:131: test] Error 1
make[1]: *** [Makefile:200: receiver/windowseventlogreceiver] Error 2
make[1]: *** Waiting for unfinished jobs....

=== FAIL: . TestExcludeProvider (re-run 1) (13.57s)

@pjanotti
Copy link
Contributor Author

Another error on the same receiver, a different one:

https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/11804643847/job/32885316624#step:8:578

=== FAIL: . TestExcludeProvider/with_EventXML (13.02s)
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:330
        	            				C:/hostedtoolcache/windows/go/1.22.8/x64/src/runtime/asm_amd64.s:1695
        	Error:      	"[]" should have 1 item(s), but has 0
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:328
        	            				D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:277
        	Error:      	Condition never satisfied
        	Test:       	TestExcludeProvider/with_EventXML
    receiver_windows_test.go:278: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:278
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestExcludeProvider/with_EventXML

=== FAIL: . TestExcludeProvider (17.84s)

=== FAIL: . TestExcludeProvider/with_EventXML (re-run 1) (13.01s)
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:330
        	            				C:/hostedtoolcache/windows/go/1.22.8/x64/src/runtime/asm_amd64.s:1695
        	Error:      	"[]" should have 1 item(s), but has 0
    receiver_windows_test.go:328: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:328
        	            				D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:277
        	Error:      	Condition never satisfied
        	Test:       	TestExcludeProvider/with_EventXML
    receiver_windows_test.go:278: 
        	Error Trace:	D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/windowseventlogreceiver/receiver_windows_test.go:278
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestExcludeProvider/with_EventXML

=== FAIL: . TestExcludeProvider (re-run 1) (13.53s)

@pjanotti
Copy link
Contributor Author

pjanotti commented Nov 16, 2024

The data race at #34687 (comment) seems due to the use of assert.EventuallyWithT in a case that the waitFor expires but the lambda function doing the evaluation is still scheduled. Ideally this should use require.EventuallyWithT, but, we had some problems when the code was using it, but, since that it was before other changes it is time to revisit that.

TylerHelmuth pushed a commit that referenced this issue Nov 18, 2024
Fixes a data race detected
[here](#34687 (comment)).
This doesn't solve the flakiness detected on this receiver tests, only
avoids the data race.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants