HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called. #1388
Conversation
cc @steveloughran for review.
🎊 +1 overall
This message was automatically generated.
    renameOpt = Options.Rename.OVERWRITE;
  }
  final Options.Rename opt = renameOpt;
  renameInternal(src, dst, (s, d) -> getMyFs().rename(s, d, opt));
Would it be wrong to delegate to myFs like here:
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FilterFs.java, line 254 (commit d992c95):
  myFs.renameInternal(src, dst, overwrite);
Just curious. What is the concrete class of the AbstractFileSystem field?
ChecksumFs should deal with the checksum file by itself (others don't even have to know the checksum exists, once all operations pass through ChecksumFs's overridden methods): that's why the method should be overridden. Just delegating rename would give the same result, a checksum file leak.
What is the concrete class of the AbstractFileSystem field?
If the source file is from the local filesystem, I think it's RawLocalFs. It could be DelegateToFileSystem, but it will redirect to RawLocalFs eventually.
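To make the point concrete, here is a minimal sketch (not the committed patch) of the shape such an override could take. getMyFs() follows the FilterFs accessor naming; getChecksumFile() and the existence check via getFileStatus() are assumptions for illustration:

import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.Options;
import org.apache.hadoop.fs.Path;

// Inside a ChecksumFs-like class; sketch only, directory handling omitted.
@Override
public void renameInternal(Path src, Path dst, boolean overwrite)
    throws IOException {
  final Options.Rename opt = overwrite
      ? Options.Rename.OVERWRITE : Options.Rename.NONE;
  // Move the data file first.
  getMyFs().rename(src, dst, opt);
  // Then move the companion checksum file so it is not leaked.
  final Path srcCrc = getChecksumFile(src);   // assumed helper
  try {
    getMyFs().getFileStatus(srcCrc);          // does a .crc exist?
    getMyFs().rename(srcCrc, getChecksumFile(dst), opt);
  } catch (FileNotFoundException e) {
    // No checksum file for the source; nothing more to move.
  }
}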
Looking at this, it's the more elegant functional API, which is nice for Hadoop 3+. But I fear it's probably going to lose all that elegance on branch-2 (assuming you do want a backport). If you do, then simply copying the existing renameInternal to one with a new signature is going to be the simplest.
🎊 +1 overall
This message was automatically generated.
/**
 * Defines a functional interface having two inputs which throws IOException.
 */
@FunctionalInterface
public interface CheckedBiFunction<LEFT, RIGHT, THROWABLE extends IOException> {
  // Single abstract method, matching the lambda usage shown above.
  void apply(LEFT left, RIGHT right) throws THROWABLE;
}
Cute! I never knew you could do that with templates and exceptions!
- Can you put it into org.apache.hadoop.fs.impl, where the other internal-for-fs-only lambda stuff is going?
- Be advised that for backports to branch-2 we will have to make things compile on Java 8. Mostly this is just using the IDE to convert things to callables. It doesn't mean they shouldn't be used, only that once you get sufficiently advanced, things become unbackportable. This patch looks fine.
Actually, CheckedBiFunction is already available in another place (HDDS). I feel these interfaces/classes would be better moved into the common module once Hadoop can forget about JDK 7.
Yeah, they should. I'm doing this in new code that I know isn't going to be backportable to JDK 7, not now that we have to worry about JDK 11.
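For readers unfamiliar with the trick, a self-contained illustration (the demo class and names are mine, not from the patch):

import java.io.IOException;

// The checked exception is a type parameter, so a lambda whose body
// throws IOException satisfies the interface without any wrapping.
@FunctionalInterface
interface CheckedBiFunction<LEFT, RIGHT, THROWABLE extends IOException> {
  void apply(LEFT left, RIGHT right) throws THROWABLE;
}

public class CheckedBiFunctionDemo {
  static void withRename(
      CheckedBiFunction<String, String, IOException> fn) throws IOException {
    fn.apply("src", "dst");
  }

  public static void main(String[] args) throws IOException {
    // The lambda may throw IOException directly; the caller sees it
    // as an ordinary checked exception.
    withRename((s, d) -> System.out.println("rename " + s + " -> " + d));
  }
}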
Review comments on hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestChecksumFs.java (all resolved).
Core patch LGTM for a branch-3+ patch; I made some minor comments about tests and imports. If you want this in branch-2, then I think a copy-and-paste solution is the simpler one.
Sigh... I forgot Hadoop 2.x runs on JDK 7+. Actually I did copy-and-paste at first and expected review comments on the duplication, hence taking this approach, but I forgot about that. Thanks for the reminder! I'll just roll back to the copy-and-paste solution.
This reverts commit 06df471.
Thanks for reviewing. I addressed the review comments, including rolling back to the copy-and-paste approach. Please take a look.
🎊 +1 overall
This message was automatically generated.
All is good, +1 for trunk, and I'll actually pull it into 3.2 as well. For branch-2 I'll give you the homework of the cherry-pick and retest before it goes in.
Ok, committed to branch-3.2 and trunk. |
Thanks for reviewing and merging! I'll create a separate PR for branch-2. Thanks for the guidance.
Great! FWIW, you made the 3.2 branch about 8h before the 3.2.1 branch was made, so this will be in that release. Please help test those RCs to make sure they do the right thing for you.
Please refer to https://issues.apache.org/jira/browse/HADOOP-16255 for more details.
FYI, FileContext.rename(path, path, options) leaks the crc file for the source of the rename when ChecksumFs or one of its descendants is used as the underlying filesystem. https://issues.apache.org/jira/browse/SPARK-28025 took a workaround of removing the crc file manually, and we hope to get rid of that workaround eventually.
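A sketch of how the leak can be observed on the local filesystem; the class name, paths, and test values are illustrative, not from the PR:

import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Options;
import org.apache.hadoop.fs.Path;

public class CrcLeakRepro {
  public static void main(String[] args) throws Exception {
    // The local FileContext is ChecksumFs over RawLocalFs, so every
    // write produces a hidden .<name>.crc companion file.
    FileContext fc = FileContext.getLocalFSFileContext(new Configuration());
    Path src = new Path("/tmp/data.txt");
    Path dst = new Path("/tmp/renamed.txt");
    FSDataOutputStream out = fc.create(src,
        EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE));
    out.write(42);
    out.close();
    fc.rename(src, dst, Options.Rename.OVERWRITE);
    // Before the fix, the source's checksum file survives the rename:
    System.out.println(".crc leaked: "
        + fc.util().exists(new Path("/tmp/.data.txt.crc")));
  }
}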