Migrate Hive DirectoryLister to TrinoFileSystem #17323
Note to self,
Why do we need "Support Locations without authorities on all schemes"?
plugin/trino-hive/src/main/java/io/trino/plugin/hive/fs/IOIterator.java
    return COMPLETED_FUTURE;
}

- private List<TrinoFileStatus> listBucketFiles(Path path, FileSystem fs, String partitionName)
+ private List<TrinoFileStatus> listBucketFiles(Location location, TrinoFileSystem fs, String partitionName)
Nit: put the file system argument first
plugin/trino-hive/src/main/java/io/trino/plugin/hive/fs/TrinoFileStatus.java
The product tests hit a few cases where we were listing locations like
lib/trino-filesystem/src/main/java/io/trino/filesystem/Location.java
lib/trino-filesystem/src/test/java/io/trino/filesystem/TestLocation.java
Just fixed a checkstyle error.
Not sure if this is related or flaky:
lib/trino-filesystem/src/test/java/io/trino/filesystem/AbstractTestTrinoFileSystem.java
lib/trino-hdfs/src/main/java/io/trino/filesystem/hdfs/HdfsFileSystem.java
plugin/trino-hive/src/main/java/io/trino/plugin/hive/fs/DirectoryListingFilter.java
plugin/trino-hive/src/main/java/io/trino/plugin/hive/BackgroundHiveSplitLoader.java
Comments applied. I had to resolve some conflicts from #17432 as well.
It also looks like TestFileSystemCache was just a flake: #17158
trino/lib/trino-hdfs/src/main/java/io/trino/filesystem/hdfs/HdfsFileSystem.java Lines 199 to 222 in fad8847
@alexjo2144 @electrum Is it necessary for HdfsFileSystem to use the hierarchical method to determine whether the file system is a hierarchical file system or an object store? When HdfsFileSystem is used, the file system is certainly not an object store.
It is for now, because we haven't introduced native
Thanks for replying, got it.
Description
Refactoring some of the Hadoop FileSystem uses to instead use TrinoFileSystem.
Additional context and related issues
Removing the hierarchical listing API from the DirectoryLister was a necessary requirement for the migration, since TrinoFileSystem does not have a directory based listing API, only a recursive one.
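As a rough illustration of why a directory-based API is not strictly needed, a non-recursive view can be derived from a recursive listing by keeping only the entries that sit directly under the requested directory. The sketch below uses plain strings rather than Trino types; the class name `DirectChildrenFilter`, the method `directChildren`, and the example paths are hypothetical and are not taken from this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Trino: filters a recursive listing
// down to the files located directly inside one directory.
class DirectChildrenFilter
{
    static List<String> directChildren(String directory, List<String> recursiveListing)
    {
        // Normalize so that "dir" and "dir/" behave the same
        String prefix = directory.endsWith("/") ? directory : directory + "/";
        List<String> result = new ArrayList<>();
        for (String file : recursiveListing) {
            if (!file.startsWith(prefix)) {
                continue; // outside the requested directory
            }
            String relative = file.substring(prefix.length());
            // A direct child has no further '/' in its relative path
            if (!relative.isEmpty() && relative.indexOf('/') < 0) {
                result.add(file);
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<String> listing = List.of(
                "s3://bucket/table/part=1/000000_0",
                "s3://bucket/table/part=1/nested/000001_0",
                "s3://bucket/table/part=2/000000_0");
        // prints [s3://bucket/table/part=1/000000_0]
        System.out.println(directChildren("s3://bucket/table/part=1", listing));
    }
}
```

The trade-off of this approach is that the whole subtree is still enumerated, so it can be more expensive than a true single-level listing on deeply nested locations.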
There are a couple of functional changes that I think are unavoidable:
- Reading from Hive tables pointing to a location that does not exist produces no rows rather than failing
- hive.ignore-absent-partitions now applies not only to partitions whose directories do not exist, but also to ones whose directories exist but are empty

Edit: There are no longer any behavior changes
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: