HBASE-27522 Use reflect to read HDFS ReadStatistics #4917
Conversation
Have you tried depending on the 2.5.2-hadoop3 artifacts? They are compiled with Hadoop 3.2.4, which should have fixed the problem. https://mvnrepository.com/artifact/org.apache.hbase/hbase-client/2.5.2-hadoop3
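For reference, pointing a downstream pom at the Hadoop 3 artifact line would look roughly like this (a sketch based on the artifact linked above; hbase-server would be switched the same way):

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>2.5.2-hadoop3</version>
</dependency>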
It looks like your HBase was built with Hadoop 2 but run with a Hadoop 3 runtime. Make sure your build and runtime dependencies are the same. For what it's worth, I specifically wrote this part of the code to avoid using reflection.
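Concretely, the incompatibility is in the linked method descriptor, not the method name (an annotation derived from the NoSuchMethodError in the reproduction below):

// Hadoop 2: ReadStatistics is an inner class of DFSInputStream.
//   getReadStatistics() descriptor: ()Lorg/apache/hadoop/hdfs/DFSInputStream$ReadStatistics;
// Hadoop 3 (HDFS-8905): ReadStatistics became a top-level class.
//   getReadStatistics() descriptor: ()Lorg/apache/hadoop/hdfs/ReadStatistics;
// The JVM resolves invocations by exact descriptor, so a call site compiled
// against Hadoop 2 throws NoSuchMethodError on a Hadoop 3 runtime even though
// a method named getReadStatistics() still exists.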
IMO this PR is not needed for the master branch, because master is supposed to depend on the Hadoop 3 library only. It would make more sense for branch-2.x.
Thank you for your reminder. I tested it and it works with the 2.5.2-hadoop3 artifacts.
Yes, Hudi depends on HBase; you can find issue 5765 in the Hudi GitHub repository.
Reflection does have a slight performance penalty.
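The lookup cost can also be paid just once by caching the java.lang.reflect.Method, leaving only the small invoke() overhead per call. A minimal sketch of that idea (a hypothetical helper, not the PR's actual code):

import java.lang.reflect.Method;
import org.apache.hadoop.hdfs.client.HdfsDataInputStream;

public final class ReadStatisticsAccessor {
  // Resolved once per JVM. getMethod() matches by name and parameters only,
  // so it succeeds on both Hadoop 2 and Hadoop 3 despite the changed return type.
  private static volatile Method getReadStatistics;

  public static Object readStatistics(HdfsDataInputStream in) {
    try {
      Method m = getReadStatistics;
      if (m == null) {
        m = HdfsDataInputStream.class.getMethod("getReadStatistics");
        getReadStatistics = m;
      }
      // Returned as Object so no Hadoop-version-specific type appears here.
      return m.invoke(in);
    } catch (ReflectiveOperationException e) {
      return null; // statistics are best-effort; degrade gracefully
    }
  }
}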
Thank you for the reminder; I just noticed that the master branch only supports Hadoop 3.
Current exception:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()Lorg/apache/hadoop/hdfs/DFSInputStream$ReadStatistics;
  at org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.updateInputStreamStatistics(FSDataInputStreamWrapper.java:249)
  at org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.close(FSDataInputStreamWrapper.java:296)
  at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.closeStreams(HFileBlock.java:1825)
  at org.apache.hadoop.hbase.io.hfile.HFilePreadReader.close(HFilePreadReader.java:107)
  at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.close(HFileReaderImpl.java:1421)
  at org.test.TestHFile.main(TestHFile.java:39)

TestHFile:
package org.test;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.io.hfile.HFileContext;
import org.apache.hadoop.hbase.io.hfile.HFileContextBuilder;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import java.io.IOException;
public class TestHFile {
public static void main(String[] args) throws IOException {
// Start an in-process HDFS mini cluster so the reader goes through a real DFS client.
MiniDFSCluster dfsCluster = new MiniDFSCluster.Builder(new Configuration()).build();
FileSystem fs = dfsCluster.getFileSystem();
Configuration conf = fs.getConf();
Path file = new Path("/tmp/test_hbase_file.hfile");
HFileContext fileContext = new HFileContextBuilder().build();
CacheConfig cacheConfig = new CacheConfig(conf);
// Write a single KeyValue into an HFile on the mini cluster.
HFile.Writer writer = HFile.getWriterFactory(conf, cacheConfig)
.withPath(fs, file)
.withFileContext(fileContext)
.create();
KeyValue kv = new KeyValue("k1".getBytes(), null, null, "v1".getBytes());
writer.append(kv);
writer.close();
// Closing this reader calls FSDataInputStreamWrapper.updateInputStreamStatistics(),
// which throws the NoSuchMethodError above on a Hadoop 3 runtime.
HFile.Reader reader = HFile.createReader(fs, file, conf);
try {
reader.close();
} finally {
dfsCluster.shutdown();
}
}
}

pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.example</groupId>
<artifactId>hbase_hfile_test</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<name>hbase_hfile_test</name>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<hbase.version>2.4.9</hbase.version>
<!-- <hbase.version>2.5.2-hadoop3</hbase.version>-->
</properties>
<profiles>
<profile>
<id>hadoop-2.10</id>
<properties>
<hadoop.version>2.10.0</hadoop.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>${hadoop.version}</version>
<classifier>tests</classifier>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
<classifier>tests</classifier>
</dependency>
</dependencies>
</profile>
<profile>
<id>hadoop-3.0</id>
<properties>
<hadoop.version>3.0.0</hadoop.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs-client</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client-minicluster</artifactId>
<version>${hadoop.version}</version>
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client-api</artifactId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
</profile>
</profiles>
<dependencies>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>${hbase.version}</version>
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
</dependency>
</dependencies>
</project>
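To reproduce under each Hadoop line, the profiles above can be activated on the command line (a sketch; exec:java assumes the exec-maven-plugin, which Maven resolves from the default org.codehaus.mojo plugin group):

mvn -Phadoop-2.10 clean compile exec:java -Dexec.mainClass=org.test.TestHFile
mvn -Phadoop-3.0 clean compile exec:java -Dexec.mainClass=org.test.TestHFile

With the default hbase.version of 2.4.9 (a Hadoop 2 build), the second command reproduces the NoSuchMethodError above.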
As I said above, if you want to use Hadoop 3, you can now depend on hbase-client 2.5.2-hadoop3. Let's just bump the HBase version in Hudi to solve the problem? Thanks.
Thank you for the comments, @Apache9 @jojochuang.
https://issues.apache.org/jira/browse/HBASE-27522

This PR uses reflection to read the HDFS ReadStatistics for better compatibility. In Hadoop 3.0.0 (HDFS-8905), the return type of the getReadStatistics method of class HdfsDataInputStream was changed. HBASE-8868 added metrics that read getReadStatistics, so when an HBase build compiled against Hadoop 2 reads an HFile in a Hadoop 3 environment, an error is reported when the reader is closed.
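For illustration, a hedged sketch of what the reflective read could look like (the actual patch may structure this differently; getTotalBytesRead and getTotalLocalBytesRead are standard ReadStatistics accessors):

// Inside something like FSDataInputStreamWrapper.updateInputStreamStatistics(stream):
if (stream instanceof HdfsDataInputStream) {
  try {
    // Look up by name only, so the changed return type never appears in the bytecode.
    Object stats = HdfsDataInputStream.class
        .getMethod("getReadStatistics").invoke(stream);
    long totalBytesRead = (Long) stats.getClass()
        .getMethod("getTotalBytesRead").invoke(stats);
    long totalLocalBytesRead = (Long) stats.getClass()
        .getMethod("getTotalLocalBytesRead").invoke(stats);
    // ... update the HBase read metrics with these counters ...
  } catch (ReflectiveOperationException e) {
    // Statistics are best-effort; swallow and continue closing the stream.
  }
}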