Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky tcp connection trace test fix #510

Merged

Conversation

gregschohn
Copy link
Collaborator

Description

tcpConnections weren't being properly tracked. Now they are because they're tied to the netty channel itself. I've also improved some logging and fixed a multi-threading bug in shutdown.

  • Category Bug/test fix
  • Why these changes are required? Metrics and traces were wrong.
  • What is the old behavior before changes and new behavior after changes? Flaky test and now that test should work all the time

Issues Resolved

https://opensearch.atlassian.net/browse/MIGRATIONS-1502

Testing

./gradlew build slowTest

Check List

  • New functionality includes testing
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

The bug was in shutdown logic.  Cleanup work was being done for some connection data structures from threads other than the assigned eventloop one, causing exception.

Signed-off-by: Greg Schohn <[email protected]>
…StreamWithRequestsWithCloseIsCommittedOnce that prevented a potentially second call to shutdown.

The latest code does a join() on the shutdown call before it proceeds to shut down the http server.  That seemed to be more important for a stable test, especially since it's a valid use of a second shutdown.

Signed-off-by: Greg Schohn <[email protected]>

# Conflicts:
#	TrafficCapture/trafficReplayer/src/test/java/org/opensearch/migrations/replay/FullReplayerWithTracingChecksTest.java
…e target itself. This made the code WAAYYYY simpler too.

Signed-off-by: Greg Schohn <[email protected]>
@gregschohn gregschohn merged commit 18e2a95 into opensearch-project:main Feb 16, 2024
4 of 5 checks passed
@gregschohn gregschohn deleted the FlakyTcpConnectionTraceTestFix branch February 16, 2024 20:56
gregschohn added a commit to gregschohn/opensearch-migrations that referenced this pull request Feb 20, 2024
Bugfixes and minor logging improvements.
Move tcpConnection tracing into the netty channel that connects to the target itself.  This made the code WAAYYYY simpler too.
There was a bug in shutdown logic.  Cleanup work was being done for some connection data structures from threads other than the assigned eventloop one, causing exception.

Signed-off-by: Greg Schohn <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants