HDFS-15201 SnapshotCounter hits MaxSnapshotID limit #1870

Merged
merged 5 commits into apache:trunk on Mar 24, 2020

Conversation

karthikhw
Contributor

@karthikhw karthikhw commented Mar 2, 2020

Jira: https://issues.apache.org/jira/browse/HDFS-15201

Users reported that they are unable to take HDFS snapshots because their snapshot counter has hit the MaxSnapshotID limit, which is 16777215.

SnapshotManager.java

private static final int SNAPSHOT_ID_BIT_WIDTH = 24;

/**
 * Returns the maximum allowable snapshot ID based on the bit width of the
 * snapshot ID.
 *
 * @return maximum allowable snapshot ID.
 */
public int getMaxSnapshotID() {
  return ((1 << SNAPSHOT_ID_BIT_WIDTH) - 1);
}

I think SNAPSHOT_ID_BIT_WIDTH is too low. It may be a good idea to increase SNAPSHOT_ID_BIT_WIDTH to 31 to align with our CURRENT_STATE_ID limit (Integer.MAX_VALUE - 1).

/**
 * This id is used to indicate the current state (vs. snapshots)
 */
public static final int CURRENT_STATE_ID = Integer.MAX_VALUE - 1;
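
For reference, a minimal Java sketch of the arithmetic (the class name and the widths 28 and 31 are illustrative, not part of the patch; it just evaluates the same ((1 << width) - 1) expression used by getMaxSnapshotID()):

public class SnapshotIdWidthSketch {
  public static void main(String[] args) {
    // 24 is the current width; 28 and 31 are candidate widths discussed in this PR.
    int[] widths = {24, 28, 31};
    for (int w : widths) {
      // Note: for w = 31 this relies on Java int overflow; see the -1 vs -2 discussion below.
      System.out.println("width " + w + " -> max snapshot ID " + ((1 << w) - 1));
    }
    // CURRENT_STATE_ID = Integer.MAX_VALUE - 1 = 2147483646
    System.out.println("CURRENT_STATE_ID = " + (Integer.MAX_VALUE - 1));
  }
}

Expected output: 16777215, 268435455, and 2147483647 for widths 24, 28, and 31 respectively, versus CURRENT_STATE_ID = 2147483646.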

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 32s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 19m 26s trunk passed
+1 💚 compile 1m 8s trunk passed
+1 💚 checkstyle 0m 45s trunk passed
+1 💚 mvnsite 1m 15s trunk passed
+1 💚 shadedclient 16m 3s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 47s trunk passed
+0 🆗 spotbugs 2m 50s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 2m 48s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 1m 7s the patch passed
+1 💚 compile 1m 1s the patch passed
+1 💚 javac 1m 1s the patch passed
+1 💚 checkstyle 0m 37s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 10 unchanged - 1 fixed = 10 total (was 11)
+1 💚 mvnsite 1m 8s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 13m 58s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 42s the patch passed
+1 💚 findbugs 2m 58s the patch passed
_ Other Tests _
-1 ❌ unit 80m 30s hadoop-hdfs in the patch passed.
+0 🆗 asflicense 0m 40s ASF License check generated no output?
146m 50s
Reason Tests
Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestRandomOpsWithSnapshots
hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens
hadoop.hdfs.server.namenode.TestCacheDirectives
hadoop.hdfs.server.namenode.snapshot.TestSnapshot
hadoop.hdfs.server.namenode.TestFsck
hadoop.hdfs.server.namenode.TestLargeDirectoryDelete
hadoop.hdfs.server.namenode.TestReencryptionWithKMS
hadoop.hdfs.server.namenode.TestNestedEncryptionZones
hadoop.hdfs.server.namenode.TestFSImage
hadoop.hdfs.server.namenode.TestReencryption
hadoop.hdfs.server.namenode.sps.TestStoragePolicySatisfierWithStripedFile
hadoop.hdfs.server.namenode.TestFavoredNodesEndToEnd
hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
hadoop.hdfs.server.namenode.TestAuditLoggerWithCommands
hadoop.hdfs.server.namenode.TestFileTruncate
hadoop.hdfs.server.namenode.TestNameNodeRespectsBindHostKeys
hadoop.hdfs.server.namenode.snapshot.TestSnapshotStatsMXBean
Subsystem Report/Notes
Docker Client=19.03.6 Server=19.03.6 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/1/artifact/out/Dockerfile
GITHUB PR #1870
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 45d4fd00ddea 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / edc2e9d
Default Java 1.8.0_242
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/1/testReport/
Max. process+thread count 4152 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/1/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@karthikhw
Contributor Author

Submitted new PR with the changes requested.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 31s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 18m 53s trunk passed
+1 💚 compile 1m 9s trunk passed
+1 💚 checkstyle 0m 44s trunk passed
+1 💚 mvnsite 1m 18s trunk passed
+1 💚 shadedclient 16m 3s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 46s trunk passed
+0 🆗 spotbugs 2m 49s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 2m 48s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 1m 7s the patch passed
+1 💚 compile 1m 0s the patch passed
+1 💚 javac 1m 0s the patch passed
+1 💚 checkstyle 0m 38s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 10 unchanged - 1 fixed = 10 total (was 11)
+1 💚 mvnsite 1m 7s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 15m 39s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 44s the patch passed
+1 💚 findbugs 2m 54s the patch passed
_ Other Tests _
-1 ❌ unit 92m 36s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 41s The patch does not generate ASF License warnings.
159m 57s
Reason Tests
Failed junit tests hadoop.hdfs.TestEncryptionZonesWithKMS
hadoop.hdfs.TestEncryptionZones
Subsystem Report/Notes
Docker Client=19.03.6 Server=19.03.6 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/2/artifact/out/Dockerfile
GITHUB PR #1870
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 828dfedc29e0 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / edc2e9d
Default Java 1.8.0_242
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/2/testReport/
Max. process+thread count 4480 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/2/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@jojochuang
Contributor

@karthikhw can you double-check the failed tests, especially the snapshot tests?

@jojochuang
Contributor

I've gone through all the usages of the snapshot id. The only concern I had was bitwise operations on the snapshot id, but I didn't find any. Widening the allowed range shouldn't be a problem.

Contributor Author

@karthikhw karthikhw left a comment


Done!

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 31s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 19m 2s trunk passed
+1 💚 compile 1m 6s trunk passed
+1 💚 checkstyle 0m 46s trunk passed
+1 💚 mvnsite 1m 12s trunk passed
+1 💚 shadedclient 15m 58s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 44s trunk passed
+0 🆗 spotbugs 2m 51s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 2m 49s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 1m 6s the patch passed
+1 💚 compile 1m 3s the patch passed
+1 💚 javac 1m 3s the patch passed
+1 💚 checkstyle 0m 39s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 10 unchanged - 1 fixed = 10 total (was 11)
+1 💚 mvnsite 1m 10s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 13m 52s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 40s the patch passed
+1 💚 findbugs 2m 53s the patch passed
_ Other Tests _
-1 ❌ unit 92m 45s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 40s The patch does not generate ASF License warnings.
158m 10s
Reason Tests
Failed junit tests hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks
hadoop.hdfs.TestEncryptionZonesWithKMS
hadoop.hdfs.TestEncryptionZones
Subsystem Report/Notes
Docker Client=19.03.7 Server=19.03.7 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/3/artifact/out/Dockerfile
GITHUB PR #1870
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux bb84c0e515d8 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 3afd4cb
Default Java 1.8.0_242
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/3/testReport/
Max. process+thread count 4515 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1870/3/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@karthikhw
Contributor Author

It looks like the test case failures are unrelated to this issue.

@karthikhw
Contributor Author

@arp7 @jojochuang Can you please review this change when you get some time?

Contributor

@arp7 arp7 left a comment


+1

@szetszwo do you want to take a look too?

@szetszwo
Contributor

The changes look good. Just a question: why change getMaxSnapshotID() from -1 to -2?

@karthikhw
Contributor Author

@szetszwo

I changed getMaxSnapshotID() from -1 to -2 because CURRENT_STATE_ID (Integer.MAX_VALUE - 1) already takes the -1 value at the top of the range.

If SNAPSHOT_ID_BIT_WIDTH is 28 then we are fine with -1, but if SNAPSHOT_ID_BIT_WIDTH is later changed to 31, getMaxSnapshotID() would have to be updated from -1 to -2.
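
To make the -1 vs -2 distinction concrete, here is a jshell-style sketch of the arithmetic (the width-31 case is hypothetical and relies on Java's 32-bit int overflow):

// Width 28: well below CURRENT_STATE_ID, so the -1 offset is safe.
System.out.println((1 << 28) - 1);          // 268435455
// Width 31: (1 << 31) wraps to Integer.MIN_VALUE, so subtracting 1 wraps
// again to Integer.MAX_VALUE, which is no longer below CURRENT_STATE_ID.
System.out.println((1 << 31) - 1);          // 2147483647 (Integer.MAX_VALUE)
System.out.println((1 << 31) - 2);          // 2147483646
System.out.println(Integer.MAX_VALUE - 1);  // 2147483646 (CURRENT_STATE_ID)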

@szetszwo
Contributor

Let's keep it "-1" since we are using 28 for the moment. If there is still a problem later on, we can think about what to do then; we would not necessarily change it to 31 at that time.

@karthikhw
Contributor Author

Thank you @szetszwo, I changed it back to -1.
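
For readers following the thread, the agreed end state (width 28 with the unchanged -1 offset) would look roughly like the following; this is a sketch inferred from the discussion, not a verbatim copy of the merged patch:

// SnapshotManager.java (sketch; values per the discussion above)
private static final int SNAPSHOT_ID_BIT_WIDTH = 28;

/**
 * Returns the maximum allowable snapshot ID based on the bit width of the
 * snapshot ID.
 *
 * @return maximum allowable snapshot ID, i.e. (1 << 28) - 1 = 268435455.
 */
public int getMaxSnapshotID() {
  return ((1 << SNAPSHOT_ID_BIT_WIDTH) - 1);
}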

@szetszwo
Contributor

+1 the latest change looks good. Thanks @karthikhw

Contributor

@lokeshj1703 lokeshj1703 left a comment


@karthikhw Thanks for working on this! The changes look good to me. +1.

@lokeshj1703 lokeshj1703 merged commit 5250cd6 into apache:trunk Mar 24, 2020
@lokeshj1703
Contributor

@karthikhw Thanks for the contribution! @jojochuang @mukul1987 @arp7 @szetszwo Thanks for reviewing the PR! I have committed it to the trunk branch.

RogPodge pushed a commit to RogPodge/hadoop that referenced this pull request Mar 25, 2020
@shenxingwuying

shenxingwuying commented May 9, 2020

Submitted new PR with the changes requested.

@karthikhw

At the beginning, SNAPSHOT_ID_BIT_WIDTH was set to 31 in this PR, but it was later changed to 28.

Why was SNAPSHOT_ID_BIT_WIDTH changed from 31 to 28? I think it might be:

  1. because a lot of unit test cases failed?
  2. because SNAPSHOT_ID_BIT_WIDTH=31 would cause a Java OOM? (I still have not understood why 31 would cause an OOM but 28 would not.)

zhangxiping1 pushed a commit to zhangxiping1/hadoop that referenced this pull request Dec 13, 2022
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
(cherry picked from commit 5250cd6)
Change-Id: Ibf48916c28f35e866d8b441af65de1a0b92b1733
(cherry picked from commit 20ea94d4a940cef35f0ff873dfaea19c6e5a7b83)