[BUG] Flaky test - testRestoreShardFromRemoteStore #5209

Closed
mch2 opened this issue Nov 10, 2022 · 6 comments · Fixed by #5399
Labels
bug Something isn't working flaky-test Random test failure that succeeds on second run

Comments

@mch2
Member

mch2 commented Nov 10, 2022

  2> REPRODUCE WITH: gradlew ':server:test' --tests "org.opensearch.index.shard.IndexShardTests.testRestoreShardFromRemoteStore" -Dtests.seed=709E54060BABD98F -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=sr-ME -Dtests.timezone=Europe/Kiev -Druntime.java=17
  2> java.lang.AssertionError: expected:<0> but was:<2>
        at __randomizedtesting.SeedInfo.seed([709E54060BABD98F:DAD18613CF69A4F0]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:647)
        at org.junit.Assert.assertEquals(Assert.java:633)
        at org.opensearch.index.shard.IndexShardTests.testRestoreShardFromRemoteStore(IndexShardTests.java:2710)

Ran on Windows CI against 2.4.

@mch2 mch2 added bug Something isn't working untriaged flaky-test Random test failure that succeeds on second run labels Nov 10, 2022
@kartg kartg removed the untriaged label Nov 17, 2022
@Bukhtawar
Collaborator

cc: @sachinpkale

@mch2
Member Author

mch2 commented Nov 28, 2022

@sachinpkale
Member

Let me take a look and fix it on priority.

@mch2
Member Author

mch2 commented Nov 29, 2022

Looks like a Windows-specific issue. I tried deleting in a loop while length > 0, and all files do get deleted. I think files are being moved to pendingDeleteFiles and not purged until another check.

I've hit this with 1 file remaining (usually write.lock) and with 2 files remaining (write.lock and .cfe).
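
For illustration, a minimal sketch of the "delete in a loop until nothing is left" experiment described above. It uses only `java.nio.file` rather than the Lucene `Directory` the test actually exercises, and the class name `RetryingDelete`, the method names, and the `maxAttempts` bound are all hypothetical. It is not the fix from the PR, just a way to picture why a single delete pass can leave files behind on Windows while repeated passes eventually clear them.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenSearch or Lucene.
public class RetryingDelete {

    // Lists the regular files currently present in dir.
    static List<Path> listFiles(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    files.add(p);
                }
            }
        }
        return files;
    }

    // Tries to delete every file in dir, re-listing after each pass.
    // Returns whatever is still present after maxAttempts passes.
    static List<Path> deleteAllWithRetry(Path dir, int maxAttempts) throws IOException {
        List<Path> remaining = listFiles(dir);
        for (int attempt = 0; attempt < maxAttempts && !remaining.isEmpty(); attempt++) {
            for (Path file : remaining) {
                try {
                    Files.deleteIfExists(file);
                } catch (IOException e) {
                    // Assumption: the file may still be held open (common on Windows),
                    // so leave it for the next pass instead of failing.
                }
            }
            remaining = listFiles(dir);
        }
        return remaining;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("retry-delete");
        Files.createFile(dir.resolve("write.lock"));
        Files.createFile(dir.resolve("_0.cfe"));
        List<Path> left = deleteAllWithRetry(dir, 5);
        System.out.println("files remaining: " + left.size());
        Files.delete(dir);
    }
}
```

A test that asserts the file count immediately after one delete pass corresponds to `maxAttempts = 1` here, which is where the leftover `write.lock` / `.cfe` would show up.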

@mch2
Member Author

mch2 commented Nov 29, 2022

@sachinpkale Put up a PR to fix this. It's not specific to remote store; rather, it's buggy delete logic on Windows.

@sachinpkale
Member

Great! Thanks @mch2
