Fix race condition when reloading configuration #32433

belimawr · 2022-07-21T10:42:49Z

What does this PR do?

This commit fixes the race condition introduced by
f3d1010 that aimed to fix some race
conditions.

A test has its logging updated to use t.Log, so it only logs when
verbosity is on, and the test line issuing the log is also shown.

Some t.Helper() calls are also added.

Why is it important?

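It fixes a bug: without this change, some running modules may never be stopped when the configuration is reloaded, depending on timing.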

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
~~- [x] I have made corresponding changes to the documentation~~
~~- [x] I have made corresponding change to the default configuration files~~
~~- [x] I have added tests that prove my fix is effective or that my feature works~~
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

~~## Author's Checklist~~

How to test this PR locally

TestAutodiscoverWithMutlipleEntries from libbeat/autodiscover/autodiscover_test.go fails intermittently if the bug is not fixed. A set of 5 consecutive runs always triggered the bug for me. So to verify the fix, just run that test a few times:

cd libbeat/autodiscover/
go test -run="TestAutodiscoverWithMutlipleEntries" -count 100

100 runs take about 10s on my machine. All tests must pass.

Related issues

Fixes Build 666 for main with status FAILURE #32337
~~## Use cases~~
~~## Screenshots~~
~~## Logs~~

cmacknz · 2022-07-21T12:29:31Z

This will close #32337 when merged

elasticmachine · 2022-07-21T12:34:37Z

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

cmacknz · 2022-07-21T12:36:49Z

The 8.3.3 build candidate was generated earlier this morning. If we want this in the 8.3 release we need to raise it as a blocker on the release scheduled for next week and have a new build candidate generated. Otherwise this won't be fixed until the 8.4 release at the end of August.

@belimawr should I raise this as a blocker for 8.3.3 to have the build candidate regenerated?

CHANGELOG.next.asciidoc

cmacknz · 2022-07-21T12:38:24Z

libbeat/cfgfile/list.go

 			defer wg.Done()
 			runner.Stop()
 			r.logger.Debugf("Runner: '%s' has stopped", runner)
-		}()
+		}(runner)


I wish the linter would have caught this...

Am I correct that this means some running modules will never actually be stopped depending on the timing? This definitely feels like something we should include in the 8.3.3 release.

belimawr · 2022-07-21

> I wish the linter would have caught this...
>
> Am I correct that this means some running modules will never actually be stopped depending on the timing? This definitely feels like something we should include in the 8.3.3 release.

Yes, it does. Without this PR, the situation is worse than the original bug 😢

But the original buggy PR is very recent, so I strongly believe no releases were made in the meantime.

belimawr · 2022-07-21

> I wish the linter would have caught this...

Me too, there should be a linter for that. It's such a basic case of variable shadowing...

cmacknz · 2022-07-21

go vet can do it, but it is disabled by default: https://golangci-lint.run/usage/linters/#govet

It will probably be noisy, but I can try turning it on and see what results it gives me.

belimawr · 2022-07-21

I tried to run go vet, but for some reason it didn't detect it. I even tried go vet -loopclosure, still no error shown :/

But I didn't spend much time trying to get it working.

efd6 · 2022-07-21

> I tried to run go vet, but for some reason it didn't detect it. I even tried go vet -loopclosure, still no error shown :/

https://cs.opensource.google/go/x/tools/+/refs/tags/v0.1.7:go/analysis/passes/loopclosure/loopclosure.go;l=24;bpv=1 (thanks to gopher slack tools people).

elasticmachine · 2022-07-21T12:42:22Z

💚 Build Succeeded

Build stats

Start Time: 2022-07-21T13:30:13.348+0000
Duration: 81 min 53 sec

Test stats 🧪

Test	Results
Failed	0
Passed	22483
Skipped	1937
Total	24420

💚 Flaky test report

Tests succeeded.

cmacknz · 2022-07-21T12:51:35Z

Tests failed with an unrelated environment error:

[2022-07-21T11:33:42.617Z] C:\Users\jenkins\workspace\PR-32433-1-146d402b-7897-4e7f-be39-eb41bfe01c93\src\github.com\elastic\beats>IF NOT EXIST C:\Python38\python.exe (
[2022-07-21T11:33:42.617Z] REM Install python 3.8  
[2022-07-21T11:33:42.617Z]  choco install python -y -r --no-progress --version 3.8.5   || exit /b 1 
[2022-07-21T11:33:42.617Z] ) 
[2022-07-21T11:33:43.442Z] Installing the following packages:
[2022-07-21T11:33:43.443Z] python
[2022-07-21T11:33:43.443Z] By installing you accept licenses for the packages.
[2022-07-21T11:34:06.306Z] python not installed. An error occurred during installation:
[2022-07-21T11:34:06.306Z]  The remote server returned an error: (500) Internal Server Error. Internal Server Error
[2022-07-21T11:34:06.306Z] python package files install completed. Performing other installation steps.
[2022-07-21T11:34:06.306Z] The install of python was NOT successful.
[2022-07-21T11:34:06.306Z] python not installed. An error occurred during installation:
[2022-07-21T11:34:06.306Z]  The remote server returned an error: (500) Internal Server Error. Internal Server Error
[2022-07-21T11:34:06.306Z] 
[2022-07-21T11:34:06.306Z] Chocolatey installed 0/1 packages. 1 packages failed.
[2022-07-21T11:34:06.306Z]  See the log for details (C:\ProgramData\chocolatey\logs\chocolatey.log).

belimawr · 2022-07-21T13:26:39Z

> The 8.3.3 build candidate was generated earlier this morning. If we want this in the 8.3 release we need to raise it as a blocker on the release scheduled for next week and have a new build candidate generated. Otherwise this won't be fixed until the 8.4 release at the end of August.
>
> @belimawr should I raise this as a blocker for 8.3.3 to have the build candidate regenerated?

Yes, it's better to raise it as a blocker.

This commit fixes the race condition introduced by f3d1010 that aimed to fix some race conditions. A test has its logging updated to use t.Log, so it only logs when verbosity is on and test line issuing the log is also shown. Some t.Helper() calls are also added. (cherry picked from commit 2777272) # Conflicts: # libbeat/autodiscover/autodiscover_test.go

cmacknz · 2022-07-21T15:07:19Z

I can confirm this isn't in 8.3.2 so the bug was never released: https://github.com/elastic/beats/commits/v8.3.2

The original backport was merged here 2f69f86

…on (#32443) This commit fixes the race condition introduced by f3d1010 that aimed to fix some race conditions. A test has its logging updated to use t.Log, so it only logs when verbosity is on and test line issuing the log is also shown. Some t.Helper() calls are also added. (cherry picked from commit 2777272) Co-authored-by: Tiago Queiroz <[email protected]>

belimawr requested a review from a team as a code owner July 21, 2022 10:42

belimawr requested review from rdner and cmacknz and removed request for a team July 21, 2022 10:42

botelastic bot added the needs_team label Jul 21, 2022

mergify bot assigned belimawr Jul 21, 2022

belimawr added the backport-v8.3.0 and backport-7.17 labels Jul 21, 2022

cmacknz mentioned this pull request Jul 21, 2022

Build 666 for main with status FAILURE #32337

Closed

cmacknz added the Team:Elastic-Agent-Data-Plane label Jul 21, 2022

botelastic bot removed the needs_team label Jul 21, 2022

cmacknz reviewed Jul 21, 2022

CHANGELOG.next.asciidoc (outdated, resolved)

cmacknz reviewed Jul 21, 2022

cmacknz mentioned this pull request Jul 21, 2022

Build 690 for main with status FAILURE #32427

Closed

belimawr force-pushed the fix-fix-race-condition branch from 634246f to 1794e60 Compare July 21, 2022 13:29

belimawr mentioned this pull request Jul 21, 2022

x-pack/filebeat/input/httpjson: add transaction tracer #32412

Merged

cmacknz approved these changes Jul 21, 2022

belimawr merged commit 2777272 into elastic:main Jul 21, 2022

belimawr deleted the fix-fix-race-condition branch July 21, 2022 14:57

mergify bot mentioned this pull request Jul 21, 2022

[7.17](backport #32433) Fix race condition when reloading configuration #32443

Merged

mergify bot mentioned this pull request Jul 21, 2022

[8.3](backport #32433) Fix race condition when reloading configuration #32444

Merged

cmacknz added the blocker label Jul 21, 2022

cmacknz removed the blocker label Jul 22, 2022
