HDFS-13616. Batch listing of multiple directories #1725
Conversation
A rebase of the original patch by @umbrant. There were not too many conflicts during the rebase. IMO the original patch is already in very good shape. I'll start from here and see if people have any comments. cc @jojochuang
Glad that this is being picked back up! Thanks Chao!
💔 -1 overall
This message was automatically generated.
Quick first pass
```java
@@ -697,6 +697,12 @@ boolean mkdirs(String src, FsPermission masked, boolean createParent)
  DirectoryListing getListing(String src, byte[] startAfter,
      boolean needLocation) throws IOException;

  @Idempotent
  BatchedDirectoryListing getBatchedListing(
```
I believe we want to add the annotation
@ReadOnly(isCoordinated = true)
Good catch. Done.
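For reference, a sketch of how the `ClientProtocol` declaration would look with the suggested annotation applied; the signature is the one shown elsewhere in this diff, so treat this as illustrative rather than the final hunk:

```java
@Idempotent
@ReadOnly(isCoordinated = true)
BatchedDirectoryListing getBatchedListing(String[] srcs, byte[] startAfter,
    boolean needLocation) throws IOException;
```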
```java
@Override
public BatchedDirectoryListing getBatchedListing(String[] srcs,
    byte[] startAfter, boolean needLocation) throws IOException {
  throw new UnsupportedOperationException("Not implemented");
```
Let's file a JIRA to support this API in RBF.
Done. Filed https://issues.apache.org/jira/browse/HDFS-15029
```java
public byte[] getSrcPathsHash(String[] srcs) {
  MessageDigest md;
  try {
    md = MessageDigest.getInstance("MD5");
```
We should refine this method (see the sketch below):
- Synchronize this method.
- MessageDigest.getInstance() has a big overhead, so we should reuse the instance (it instantiates a new object each time).
- After use, call MessageDigest#reset().

https://stackoverflow.com/questions/13802627/when-to-use-messagedigest-reset
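A minimal sketch of the suggested refactoring, assuming the method lives on an object whose lifetime spans many calls (the `SrcPathsHasher` wrapper class is hypothetical). Note that `MessageDigest#digest()` already resets the instance, so an explicit `reset()` is only needed if a computation is abandoned midway:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class SrcPathsHasher {
  // Created once and reused; MessageDigest.getInstance() is comparatively
  // expensive, so we avoid calling it on every invocation.
  private final MessageDigest md;

  SrcPathsHasher() {
    try {
      md = MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
      // Every compliant JRE ships MD5, so this should be unreachable.
      throw new IllegalStateException(e);
    }
  }

  // Synchronized because MessageDigest instances are not thread-safe.
  synchronized byte[] getSrcPathsHash(String[] srcs) {
    for (String src : srcs) {
      md.update(src.getBytes(StandardCharsets.UTF_8));
    }
    // digest() completes the hash and resets md for the next caller.
    return md.digest();
  }
}
```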
Good point. Will fix this.
```java
    return md.digest();
  }
```
```java
BatchedDirectoryListing getBatchedListing(String[] srcs, byte[] startAfter,
```
This method is longer than I can possibly understand.
I found it a little hard to refactor this given that several variables, such as lastListing, are used throughout the method. Would it be helpful if we added more comments in the method body, explaining what each section is doing?
ok. i'm just being lazy.
💔 -1 overall
This message was automatically generated.
Also looks like it doesn't compile.
- Caches the digest and makes accesses to it synchronized.
- Changes `parent` to `listedPath` according to the CR on the JIRA itself.
- Added documentation for a few public methods.
- Added metrics for the listing operation.
One thing I'm not totally sure about is the case where the input …, which I think is not very accurate. Is …
💔 -1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
The last one looks good to me.
💔 -1 overall
This message was automatically generated.
Hello, I encountered a bug when using the batch method: when I input a directory with more than 1000 files in it, with 2 replications of each file's data block, only the first 500 files of the directory are returned and then it stops. I think the getBatchedListing() method in hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java should be modified, as follows.
The main reason for this bug is that the result returned by getListingInt(dir, pc, src, indexStartAfter, needLocation) is limited by both the number of files in the directory and the number of data blocks and replications of those files, while getBatchedListing() only exits its loop once the number of returned results exceeds 1000.
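To make the reported mismatch concrete, here is a toy model of the arithmetic, not HDFS code; the per-call budget of 1000 items and the assumption that block locations count toward it are inferred from the report:

```java
public class ListingBudgetToy {
  public static void main(String[] args) {
    // Assumed: the per-call listing budget counts block locations when
    // needLocation is true, not just directory entries.
    int budget = 1000;         // assumed per-call item budget
    int locationsPerFile = 2;  // replication factor from the bug report
    // The directory holds 1000 files, but the budget runs out first:
    int filesReturned = budget / locationsPerFile;
    System.out.println(filesReturned);  // prints 500, matching the report
  }
}
```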
@fanlinqian Best to file an HDFS issue on the Apache JIRA server.
(cherry picked from commit d7c4f8a)
Change-Id: I7b058f0761a66558907401adf0340c86de3a5993
One of the dominant workloads for external metadata services is listing of partition directories. This can end up being bottlenecked on round-trip time (RTT) when partition directories contain a small number of files. This is fairly common, since fine-grained partitioning is used for partition pruning by query engines.
A batched listing API that takes multiple paths amortizes the RTT cost. Initial benchmarks show a 10-20x improvement in metadata loading performance.
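To make the amortization concrete, here is a minimal client-side sketch. The method `batchedListStatusIterator`, the `PartialListing` type, and its `get()`/`getListedPath()` accessors are assumptions inferred from this patch's discussion (including the `listedPath` rename above), and the paths are made up:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.PartialListing;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class BatchedListingExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // Assumes the default filesystem is HDFS.
    DistributedFileSystem dfs =
        (DistributedFileSystem) new Path("/").getFileSystem(conf);

    // Many small partition directories: one batched listing instead of one
    // getListing RPC per directory amortizes the round-trip cost.
    List<Path> partitions = Arrays.asList(
        new Path("/warehouse/tbl/dt=2019-12-01"),
        new Path("/warehouse/tbl/dt=2019-12-02"),
        new Path("/warehouse/tbl/dt=2019-12-03"));

    RemoteIterator<PartialListing<FileStatus>> it =
        dfs.batchedListStatusIterator(partitions);
    while (it.hasNext()) {
      PartialListing<FileStatus> batch = it.next();
      // getListedPath() identifies which source directory this batch is for.
      for (FileStatus st : batch.get()) {
        System.out.println(batch.getListedPath() + " -> " + st.getPath());
      }
    }
  }
}
```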