HDFS-16184. De-flake TestBlockScanner#testSkipRecentAccessFile #3329

Merged
merged 1 commit into from
Aug 25, 2021

Contributor

@virajjasani virajjasani commented Aug 24, 2021

Description of PR

Test TestBlockScanner#testSkipRecentAccessFile is flaky:

[ERROR] testSkipRecentAccessFile(org.apache.hadoop.hdfs.server.datanode.TestBlockScanner)  Time elapsed: 3.936 s  <<< FAILURE!
java.lang.AssertionError: Scan nothing for all files are accessed in last period.
	at org.junit.Assert.fail(Assert.java:89)
	at org.apache.hadoop.hdfs.server.datanode.TestBlockScanner.testSkipRecentAccessFile(TestBlockScanner.java:1015)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)

Reason for failure:

The VolumeScanner thread scans blocks continuously. Before scanning a block, it runs TestScanResultHandler#setup, where it waits on the info object until the main thread in testSkipRecentAccessFile() sets info.shouldRun to true and notifies info, which lets the VolumeScanner thread return from TestScanResultHandler#setup. The main test then asserts that info.blocksScanned stays 0. That holds only if the VolumeScanner thread never reaches the info.blocksScanned++ statement, and nothing guarantees this, so the test is flaky.
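The race described above can be modelled in a few lines. This is a simplified sketch, not the Hadoop test code: the class `UngatedScanRaceSketch`, its `Info` holder, and the `run()` method are hypothetical stand-ins, and `join()` stands in for a main thread that is delayed long enough for the scanner thread to get ahead of it.

```java
final class UngatedScanRaceSketch {
  /** Hypothetical stand-in for the per-volume info object used by the test. */
  static final class Info {
    boolean shouldRun = false;
    int blocksScanned = 0;
  }

  /** Returns blocksScanned as seen by a main thread that checks "too late". */
  static int run() throws InterruptedException {
    final Info info = new Info();

    Thread scanner = new Thread(() -> {
      // Models the setup step: wait until the main thread allows us to run.
      synchronized (info) {
        while (!info.shouldRun) {
          try {
            info.wait();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
          }
        }
      }
      // Nothing stops the scanner from getting here once shouldRun is true.
      synchronized (info) {
        info.blocksScanned++;
      }
    });
    scanner.start();

    // Main thread: release the scanner from its setup wait.
    synchronized (info) {
      info.shouldRun = true;
      info.notify();
    }

    // Models a slow main thread: the scanner finishes before we look.
    scanner.join();
    synchronized (info) {
      return info.blocksScanned;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("blocksScanned = " + run()); // prints "blocksScanned = 1"
  }
}
```

In this model, an assertion that `blocksScanned` is still 0 passes or fails purely depending on which thread runs first after the notify.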

Fix:

To fix this, the main thread in testSkipRecentAccessFile() initializes info.sem as a Semaphore with a single permit (effectively a mutex) and acquires that permit itself. The block scanner thread has to acquire the Semaphore before it can increment the blocksScanned count, so it blocks there until the main thread releases the permit. While the main thread holds the permit, the scanner thread cannot reach info.blocksScanned++, so the assertion can no longer fail because of thread timing.
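A minimal sketch of that gating idea follows. It is an illustration under assumptions, not the patch itself: `SemaphoreGateSketch`, `Info`, `handle()`, and `run()` are hypothetical names, and the short sleep only gives the scanner thread time to reach the gate.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class SemaphoreGateSketch {
  /** Hypothetical stand-in for the test's info object. */
  static final class Info {
    final Semaphore sem = new Semaphore(1); // single permit, used like a mutex
    final AtomicInteger blocksScanned = new AtomicInteger();
  }

  /** Hypothetical stand-in for the scanner callback that counts a scanned block. */
  static void handle(Info info) throws InterruptedException {
    info.sem.acquire();                   // blocks while the main thread holds the permit
    info.blocksScanned.incrementAndGet();
  }

  /** Returns {count while the permit is held, count after it is released}. */
  static int[] run() throws InterruptedException {
    final Info info = new Info();
    info.sem.acquire();                   // main thread takes the only permit first

    Thread scanner = new Thread(() -> {
      try {
        handle(info);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    scanner.start();

    Thread.sleep(200);                    // give the scanner time to reach the gate
    int whileHeld = info.blocksScanned.get();

    info.sem.release();                   // open the gate
    scanner.join();
    int afterRelease = info.blocksScanned.get();
    return new int[] {whileHeld, afterRelease};
  }

  public static void main(String[] args) throws InterruptedException {
    int[] r = run();
    System.out.println(r[0] + " " + r[1]); // prints "0 1"
  }
}
```

The first value is 0 regardless of scheduling, because the increment sits behind `acquire()` and the main thread took the permit before the scanner thread was started.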

How was this patch tested?

Unit tests

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 42s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 30m 55s trunk passed
+1 💚 compile 1m 21s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 1m 14s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 checkstyle 1m 1s trunk passed
+1 💚 mvnsite 1m 25s trunk passed
+1 💚 javadoc 0m 56s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 31s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 9s trunk passed
+1 💚 shadedclient 16m 14s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 11s the patch passed
+1 💚 compile 1m 13s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 1m 13s the patch passed
+1 💚 compile 1m 9s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 javac 1m 9s the patch passed
+1 💚 blanks 0m 1s The patch has no blanks issues.
+1 💚 checkstyle 0m 51s the patch passed
+1 💚 mvnsite 1m 13s the patch passed
+1 💚 javadoc 0m 46s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 24s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 7s the patch passed
+1 💚 shadedclient 15m 59s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 235m 35s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 47s The patch does not generate ASF License warnings.
319m 46s
Reason Tests
Failed junit tests hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3329/1/artifact/out/Dockerfile
GITHUB PR #3329
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 780bcd5f80f1 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 32976fb
Default Java Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3329/1/testReport/
Max. process+thread count 3286 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3329/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@virajjasani
Contributor Author

@ayushtkn @tasanuma could you please review this PR? Thanks

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 47s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 30m 50s trunk passed
+1 💚 compile 1m 23s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 1m 15s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 checkstyle 1m 1s trunk passed
+1 💚 mvnsite 1m 27s trunk passed
+1 💚 javadoc 0m 57s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 30s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 9s trunk passed
+1 💚 shadedclient 16m 12s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 12s the patch passed
+1 💚 compile 1m 15s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 1m 15s the patch passed
+1 💚 compile 1m 7s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 javac 1m 7s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 51s the patch passed
+1 💚 mvnsite 1m 14s the patch passed
+1 💚 javadoc 0m 48s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 19s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 4s the patch passed
+1 💚 shadedclient 15m 57s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 233m 25s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 47s The patch does not generate ASF License warnings.
317m 34s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3329/2/artifact/out/Dockerfile
GITHUB PR #3329
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux c4ba9578d2a6 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 32976fb
Default Java Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3329/2/testReport/
Max. process+thread count 3433 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3329/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 55s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 37m 36s trunk passed
+1 💚 compile 1m 43s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 1m 27s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 checkstyle 1m 11s trunk passed
+1 💚 mvnsite 1m 34s trunk passed
+1 💚 javadoc 1m 5s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 40s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 44s trunk passed
+1 💚 shadedclient 18m 1s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 24s the patch passed
+1 💚 compile 1m 29s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 1m 29s the patch passed
+1 💚 compile 1m 15s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 javac 1m 15s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 59s the patch passed
+1 💚 mvnsite 1m 18s the patch passed
+1 💚 javadoc 0m 50s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 24s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 15s the patch passed
+1 💚 shadedclient 16m 29s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 233m 55s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 47s The patch does not generate ASF License warnings.
329m 44s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3329/3/artifact/out/Dockerfile
GITHUB PR #3329
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux a643b7af123d 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 32976fb
Default Java Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3329/3/testReport/
Max. process+thread count 3340 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3329/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@ayushtkn
Member

Can you update the description with the reason for the failure and details about the fix?

@virajjasani
Contributor Author

Done @ayushtkn. Thanks

Member

@ayushtkn ayushtkn left a comment

Thanks @virajjasani for the fix. I tried reproducing this with a simple sleep and got the same exception, and with your fix the failure went away.
Changes LGTM

Will wait some time for Takanobu before pushing this.

@virajjasani
Contributor Author

Thanks @ayushtkn for detailed review with cross-verification.

Member

@tasanuma tasanuma left a comment

Thanks for the nice fix, @virajjasani. LGTM.

@virajjasani
Contributor Author

Thanks to both of you for quick reviews @tasanuma @ayushtkn !!

@virajjasani
Contributor Author

Also, it's good to see two consecutive +1s from QA bot for hadoop-hdfs. Haven't seen more than single +1 from QA build for long time on trunk.

@tasanuma tasanuma merged commit 1b9927a into apache:trunk Aug 25, 2021
@tasanuma
Member

Merged. Thanks for your contribution, @virajjasani. Thanks for your review, @ayushtkn.

> Also, it's good to see two consecutive +1s from QA bot for hadoop-hdfs. Haven't seen more than single +1 from QA build for long time on trunk.

That's true. QA results seem pretty good recently. Really thanks for fixing several flaky tests, @virajjasani.

tasanuma pushed a commit that referenced this pull request Aug 25, 2021
Reviewed-by: Ayush Saxena <[email protected]>
Signed-off-by: Takanobu Asanuma <[email protected]>
(cherry picked from commit 1b9927a)
@virajjasani virajjasani deleted the HDFS-16184-trunk branch August 30, 2021 07:28
@virajjasani
Contributor Author

> Merged. Thanks for your contribution, @virajjasani. Thanks for your review, @ayushtkn.
>
> > Also, it's good to see two consecutive +1s from QA bot for hadoop-hdfs. Haven't seen more than single +1 from QA build for long time on trunk.
>
> That's true. QA results seem pretty good recently. Really thanks for fixing several flaky tests, @virajjasani.

Thanks @tasanuma for all the reviews!!
Big thanks to @ayushtkn for fixing a bunch of them with #2860 !!

kiran-maturi pushed a commit to kiran-maturi/hadoop that referenced this pull request Nov 24, 2021