Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDFS-16427. Add debug log for BlockManager#chooseExcessRedundancyStriped #3888

Merged
merged 4 commits into from
Jan 27, 2022

Conversation

tomscut
Copy link
Contributor

@tomscut tomscut commented Jan 13, 2022

JIRA: HDFS-16427.

To solve this issue HDFS-16420 , we added some debug logs, which were also necessary. If there are other problems, we set the log level to DEBUG, which is convenient to analyze it.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 59s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 35m 14s trunk passed
+1 💚 compile 1m 27s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 23s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 58s trunk passed
+1 💚 mvnsite 1m 26s trunk passed
+1 💚 javadoc 1m 2s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 28s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 26s trunk passed
+1 💚 shadedclient 25m 26s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 19s the patch passed
+1 💚 compile 1m 23s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 23s the patch passed
+1 💚 compile 1m 14s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 14s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 54s the patch passed
+1 💚 mvnsite 1m 20s the patch passed
+1 💚 javadoc 0m 53s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 22s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 25s the patch passed
+1 💚 shadedclient 26m 32s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 370m 9s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
479m 25s
Reason Tests
Failed junit tests hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/1/artifact/out/Dockerfile
GITHUB PR #3888
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux fb57d5b9b15e 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / b9c642a
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/1/testReport/
Max. process+thread count 1979 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@tomscut
Copy link
Contributor Author

tomscut commented Jan 20, 2022

Hi @tasanuma @ayushtkn , could you please take a look. Thanks.

@tasanuma
Copy link
Member

@tomscut Sorry for being late. I agree with adding more EC logs, but I feel that the log information is too detailed, and some variables duplicate with other variables. Could you please make it more concise and descriptive?

@tomscut
Copy link
Contributor Author

tomscut commented Jan 24, 2022

@tomscut Sorry for being late. I agree with adding more EC logs, but I feel that the log information is too detailed, and some variables duplicate with other variables. Could you please make it more concise and descriptive?

Thanks @tasanuma for your review.

These states are related to each other, and if the upper and lower states don't match, there is something wrong with the process in between.

Maybe nonExcess and storage2index are the same. But in some abnormal cases, we are not sure whether index will be calculated incorrectly, storage2index may cause elements of the same DN with different index to be overwritten.

If simplify this log, I would remove nonExcess and keep the rest. Do you think this is OK? Thank you.

@tomscut
Copy link
Contributor Author

tomscut commented Jan 24, 2022

At that time, we found that the candidates and replicasToDelete were inconsistent, so we determined that the placementPolicy had a problem and resolved it.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 8s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 38m 3s trunk passed
+1 💚 compile 1m 50s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 32s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 7s trunk passed
+1 💚 mvnsite 1m 42s trunk passed
+1 💚 javadoc 1m 12s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 37s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 54s trunk passed
+1 💚 shadedclient 27m 36s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 33s the patch passed
+1 💚 compile 1m 39s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 39s the patch passed
+1 💚 compile 1m 26s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 59s the patch passed
+1 💚 mvnsite 1m 32s the patch passed
+1 💚 javadoc 1m 1s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 32s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 58s the patch passed
+1 💚 shadedclient 26m 50s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 381m 29s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 1m 1s The patch does not generate ASF License warnings.
499m 20s
Reason Tests
Failed junit tests hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/2/artifact/out/Dockerfile
GITHUB PR #3888
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux f25a0a3859e5 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 509cac8
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/2/testReport/
Max. process+thread count 1939 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@tomscut
Copy link
Contributor Author

tomscut commented Jan 25, 2022

Hi @tasanuma , could you please take a look again. Thank you very much.

@tasanuma
Copy link
Member

tasanuma commented Jan 26, 2022

@tomscut IMHO, adding unit tests and assertions would be better than logging all variables to prevent inconsistent conditions between variables. For example, I suggest adding the following Preconditions to avoid inconsistencies between candidates and replicasToDelete.

Preconditions.checkArgument(candidates.containsAll(replicasToDelete));

I still agree with adding more EC debug logs for not only developers but also non-developers. I prefer more descriptive logging rather than using variable names directly. And I think targetIndex, found, and duplicated should be omitted as they can be speculated from storage2index. (As I mentioned, we may want to add assertions or unit tests if they can be inconsistent.)

LOG.debug("Choose redundant EC replicas to delete from blk_{} which is located in {}", sblk.getBlockId(), storage2index);
LOG.debug("Storages with candidate blocks to be deleted: {}", candidates);
LOG.debug("Storages with blocks to be deleted: {}", replicasToDelete);

@tomscut
Copy link
Contributor Author

tomscut commented Jan 26, 2022

@tomscut IMHO, adding unit tests and assertions would be better than logging all variables to prevent inconsistent conditions between variables. For example, I suggest adding the following Preconditions to avoid inconsistencies between candidates and replicasToDelete.

Preconditions.checkArgument(candidates.containsAll(replicasToDelete));

I still agree with adding more EC debug logs for not only developers but also non-developers. I prefer more descriptive logging rather than using variable names directly. And I think targetIndex, found, and duplicated should be omitted as they can be speculated from storage2index. (As I mentioned, we may want to add assertions or unit tests if they can be inconsistent.)

LOG.debug("Choose redundant EC replicas to delete from blk_{} which is located in {}", sblk.getBlockId(), storage2index);
LOG.debug("Storages with candidate blocks to be deleted: {}", candidates);
LOG.debug("Storages with blocks to be deleted: {}", replicasToDelete);

Thanks @tasanuma for your advice, it makes sense to me.

I updated it. Please take a look. Thank you.

Co-authored-by: tasanuma <[email protected]>

Co-authored-by: Takanobu Asanuma <[email protected]>
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 55s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 35m 53s trunk passed
+1 💚 compile 1m 28s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 20s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 58s trunk passed
+1 💚 mvnsite 1m 28s trunk passed
+1 💚 javadoc 1m 1s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 32s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 24s trunk passed
+1 💚 shadedclient 25m 41s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 20s the patch passed
+1 💚 compile 1m 23s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 23s the patch passed
+1 💚 compile 1m 14s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 14s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 54s the patch passed
+1 💚 mvnsite 1m 22s the patch passed
+1 💚 javadoc 0m 53s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 24s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 28s the patch passed
+1 💚 shadedclient 25m 28s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 325m 38s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 38s The patch does not generate ASF License warnings.
434m 22s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/4/artifact/out/Dockerfile
GITHUB PR #3888
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 1334b4677e84 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 69e22b5
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/4/testReport/
Max. process+thread count 1905 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 2m 27s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 42s trunk passed
+1 💚 compile 1m 48s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 31s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 7s trunk passed
+1 💚 mvnsite 1m 45s trunk passed
+1 💚 javadoc 1m 12s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 37s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 53s trunk passed
+1 💚 shadedclient 27m 36s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 32s the patch passed
+1 💚 compile 1m 37s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 37s the patch passed
+1 💚 compile 1m 25s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 56s the patch passed
+1 💚 mvnsite 1m 34s the patch passed
+1 💚 javadoc 1m 2s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 34s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 55s the patch passed
+1 💚 shadedclient 27m 2s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 438m 50s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 1m 18s The patch does not generate ASF License warnings.
563m 26s
Reason Tests
Failed junit tests hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/3/artifact/out/Dockerfile
GITHUB PR #3888
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 35ee40fdfee8 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c61ec23
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/3/testReport/
Max. process+thread count 1988 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3888/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@tasanuma tasanuma merged commit 6136d63 into apache:trunk Jan 27, 2022
@tasanuma
Copy link
Member

Merged it. Thanks for your contribution, @tomscut!

tasanuma pushed a commit that referenced this pull request Jan 27, 2022
tasanuma pushed a commit that referenced this pull request Jan 27, 2022
bogthe pushed a commit to bogthe/hadoop that referenced this pull request Feb 2, 2022
HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants