ObjectStore Directory Semantics #2445
Comments
Regardless of the chosen approach (ObjectStore vs FileSystem), I would consider making the trait (and its methods) consistent. Currently the trait is named ObjectStore, but it only has methods related to files. Either update/rename the methods (and datatypes), such as fn list_object(s) -> ObjectMetadata, or rename the trait to FileSystem.
Also consider Issue-2465. The objectstore is requested to list files that match prefix "/Users/blah//". One could claim that this path does not match the prefix. For globbing to work, within a key/prefix concept, the returned objects/files should carry the "key" that matches the prefix. (Currently my fix in the mentioned issue works the other way round -> globbing is adapted to filesystem implementation). |
Also relevant: As part of that PR, I plan on creating a generic suite of tests to validate a
For FileSystem vs ObjectStore, I'm only familiar with implementations of the first in the context of query engines (such as Arrow C++'s FileSystem or Python's fsspec). Are there examples of ObjectStore implementations? My preference is for a "FileSystem" approach since that's more familiar, but I'm open to the ObjectStore approach as long as it can be used to read and write in a way compatible with other systems that may use a FileSystem approach. (For example, the current implementation doesn't force delimiting paths with
I'm not sure what you mean by this, but object stores are really just key value stores with a vaguely RESTful API, i.e.
There are more complex APIs for things like multipart uploads, bucket creation, etc... but in terms of what a client would be interested in that is the entirety of the API. To put it another way, the interface of object storage is significantly less expressive than that of a filesystem. This is why object storage is scalable, and things like NFS, EFS, are... not 😅 Trying to make object storage behave exactly like a filesystem is impossible (e.g. S3 doesn't support CreateIfNotExists), however, my thesis is that no query engine actually wants filesystem semantics, and this is why these linked abstractions kind of work (#2205 (comment)). My suggestion is that by instead implementing the less expressive object storage semantics, we can avoid a whole host of funky edge-cases around directories, paths, buffering, read-ahead etc...
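As a rough illustration of the "key value store" view described above, here is a minimal in-memory sketch. The type `InMemoryStore` and its operation set (put, get, delete, list by prefix) are assumptions chosen for illustration; they are not DataFusion's trait or any vendor's API.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory object store: a flat, sorted map from key to bytes.
/// There is no notion of directories; "/" is just another character in a key.
#[derive(Default)]
pub struct InMemoryStore {
    objects: BTreeMap<String, Vec<u8>>,
}

impl InMemoryStore {
    /// Store (or overwrite) the object at `key`.
    pub fn put(&mut self, key: &str, data: Vec<u8>) {
        self.objects.insert(key.to_string(), data);
    }

    /// Fetch the object at `key`, if present.
    pub fn get(&self, key: &str) -> Option<&[u8]> {
        self.objects.get(key).map(|v| v.as_slice())
    }

    /// Remove the object at `key`; returns true if it existed.
    pub fn delete(&mut self, key: &str) -> bool {
        self.objects.remove(key).is_some()
    }

    /// List keys that start with `prefix`, in lexicographic order.
    /// Keys sharing a prefix are contiguous in a sorted map, so we can
    /// seek to the prefix and stop at the first key that no longer matches.
    pub fn list(&self, prefix: &str) -> Vec<&str> {
        self.objects
            .range(prefix.to_string()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, _)| k.as_str())
            .collect()
    }
}
```

Note that `list("da")` and `list("data/")` are both valid calls here: the prefix is matched as a plain string, not resolved as a path.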
Could you expand on what you mean by this? Do you mean being able to read data written by another system, which should be trivial, or are you talking about some sort of API-level integration like FFI?
The canonical example of The idea of the "ObjectStore" interface in DataFusion was to provide API access to the lowest common denominator feature set across several storage implementations. For example, here are three implementations for S3, HDFS, and Azure specifically:
In terms of "glob"ing, that is typically not a feature provided by object stores (e.g. there is no such thing in S3, which instead offers a much more restricted notion of You can see another example of a Rust API to object storage in IOx: https://github.com/influxdata/influxdb_iox/blob/main/object_store |
It would help me significantly to understand the globbing use case more: when exactly are you selecting a subset of files in a directory via a glob? Most analytic systems I have seen tend to assume data has been pre-grouped into directories (or equivalent). AWS Redshift does offer the ability to specify a subset of files that are not all in the same directory, but it does so by taking a manifest file: https://docs.aws.amazon.com/redshift/latest/dg/loading-data-files-using-manifest.html
Also, @carols10cents spent considerable time sorting out consistent directory semantics for object stores and local files in https://github.com/influxdata/influxdb_iox/blob/main/object_store -- maybe we can just use those semantics (or maybe even the code?) |
Sorry that wasn't clear. I pointed out two implementations of an abstraction over object stores (S3, GCS, etc.) that are like filesystems (in that they have a notion of directories, not that they make any guarantees about atomicity). These are used by analytics systems like Dask and PyArrow, so there's some evidence we can build useful query engines on top of such an abstraction. Thanks @alamb for the IOx example.
I largely agree. I think the main thing these "FileSystem" abstractions provide is a notion of "directory", which is important in directory-partitioned datasets. The existing API can handle that fine with delimiter, but it does seem a little funny you can provide whatever delimiter you want.
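For readers unfamiliar with how a delimiter yields a notion of "directory" on top of flat keys, here is a minimal sketch modelled loosely on S3-style listing, where keys under the prefix that contain the delimiter are rolled up into common prefixes. The names `ListResult` and `list_with_delimiter` and the sample keys are hypothetical, and this is not DataFusion's implementation.

```rust
use std::collections::BTreeSet;

/// Result of a delimiter-aware listing (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct ListResult {
    /// Keys directly under the prefix (no further delimiter).
    pub objects: Vec<String>,
    /// Rolled-up "directories": prefix plus the next segment and delimiter.
    pub common_prefixes: Vec<String>,
}

/// List `keys` under `prefix`, grouping anything past the next `delimiter`.
/// An empty delimiter disables grouping and returns every matching key.
pub fn list_with_delimiter(keys: &[&str], prefix: &str, delimiter: &str) -> ListResult {
    let mut objects = Vec::new();
    let mut common = BTreeSet::new();
    for key in keys {
        // Skip keys that do not start with the requested prefix.
        let rest = match key.strip_prefix(prefix) {
            Some(rest) => rest,
            None => continue,
        };
        let split = if delimiter.is_empty() { None } else { rest.find(delimiter) };
        match split {
            // The key continues past a delimiter: report only the group.
            Some(idx) => {
                common.insert(format!("{}{}", prefix, &rest[..idx + delimiter.len()]));
            }
            // No delimiter left: this is an object directly under the prefix.
            None => objects.push(key.to_string()),
        }
    }
    objects.sort();
    ListResult {
        objects,
        common_prefixes: common.into_iter().collect(),
    }
}
```

With `/` as the delimiter this behaves like listing one directory level. Passing a different delimiter such as `=` also works, which is the slightly odd flexibility mentioned above.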
Yeah I think as long as you could do the expected filesystem operations on top of the API, then that seems fine. For context, I plan to wrap the But I think I'll scale back my changes in #2246 and remove the
That sounds very promising @alamb. Thanks for pointing that out!
If we like the IOx object store interface and want to reuse the implementation, I can probably see about getting it published to crates.io, just let me know. It wasn't my intent with this issue, rather I just wanted clarity on what I should be reviewing 😅, but I would be happy to help make it happen if there is consensus on it being a good idea |
I would be supportive of that, but we probably would need to discuss what that means for Do we want to create a new issue to discuss that? |
@alamb The globbing is mainly relevant in raw/ingestion folders. E.g., we end up with a structure such as:
In a typical job we would then process and prepare the data for consumption:
I don't need access to all sorts of key filters (compared to all the key filters in a system such as HBase), but globbing is not something I would push back to the end user. (In Hadoop this is also supported by the alternative (S3, Azure) Hadoop filesystem implementations.)
In summary, I agree with the ObjectStore semantics being sufficient. I also do want to point out that globbing is nothing more than making the suffix filter more powerful (instead of matching against a static suffix (eg: ".parquet") it allows matching against a pattern).
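The point that a glob is just a more powerful suffix filter can be sketched as follows: list by string prefix, then match the remainder of each key against a pattern. This is a minimal sketch under stated assumptions; `glob_match` and `list_matching` are hypothetical helpers, only `*` and `?` are supported, and `*` is allowed to match `/` (many glob dialects do not allow that).

```rust
/// Match `text` against `pattern`, where `*` matches any run of characters
/// (including none, and including '/') and `?` matches exactly one character.
pub fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Position just after the last '*' seen, and the text index it covers up to.
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Tentatively let '*' match nothing; remember where to backtrack.
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Mismatch: let the last '*' absorb one more character and retry.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Any trailing '*' in the pattern can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Keep keys that start with `prefix` and whose remainder matches `pattern`.
pub fn list_matching<'a>(keys: &[&'a str], prefix: &str, pattern: &str) -> Vec<&'a str> {
    keys.iter()
        .copied()
        .filter(|k| {
            k.strip_prefix(prefix)
                .map_or(false, |rest| glob_match(pattern, rest))
        })
        .collect()
}
```

In this formulation a static suffix filter such as ".parquet" is simply the pattern `*.parquet`, and the store itself only ever needs to support listing by prefix.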
Currently the globbing implementation in DataFusion is somewhat blurry, because it tries to work around a limitation of the LocalFileSystem ObjectStore implementation. As we all seem to agree, the proper solution would be to fix the LocalFileSystem implementation so that it does not err on a prefix that does not represent a file or directory.
Apologies for going back and forth; next time I'll save you from my out-loud thinking and only post a coherent answer. Last realisation: by having an ObjectStore that can only filter/scan on prefix, we take away the possibility for object stores to optimise eventual suffix filters (predicate pushdown for file searching, if you will).
In my case, the existing design of the ObjectStore interface forced me to re-engineer ListingTable in order to provide yet another way of listing a data source. From my perspective, it might be beneficial to push information about the data source from TableProvider to ObjectStore. An ObjectStore for a local file system would then combine the data (table) location and the strategy for listing that kind of storage. As a result, the listing methods present in ObjectStore could drop the concept of a path as a way to access data. ObjectStore could then offer a more generic interface with two methods:
Such an interface should allow us to provide any kind of listing approach (dir, glob, etc.). What do you think? It's not a necessity, but the last component bound to a path is SizedFile, whereas outside of ObjectStore it should be treated as an abstract blob with characteristics e.g.
I really like the idea of providing an extensible storage interface that allows APIs such as those suggested by @Cheappie and @timvw. Given these APIs seem to be adding semantics to the list of files on ObjectStorage, perhaps we could add an extra layer specifically in the APIs rather than trying to extend
I think it is important to keep a separation between:
In particular, there is a very common use case where an additional catalog is used to improve query performance, because listing files, performing schema inference, etc. is not cheap. By keeping the concerns separate we can ensure this remains well supported. Currently I would view the catalog abstraction as
FWIW I created some tickets a while back on supporting external catalogs (e.g. #2206, #2208 and #2209) which may be relevant here. I also created tickets to make the file operators themselves less coupled with the catalog: #2291 and #2293.
@tustvold thank you very much for driving these efforts. I apologize I have not been able to contribute much to the conversation or code on these. Based on my current capacity I will likely be limited in what I can contribute on most of these in the foreseeable future - the one exception being #2206 which would actually be very helpful on my side. Perhaps I could work with @timvw to get a first cut of this created in |
Getting there (I still want to test some things and change some signatures, e.g. return Vec instead of Result when adding multiple tables at once): https://github.com/timvw/datafusion-catalogprovider-glue
I'm going to close this as I think it is superseded by #2504. Thank you all for helping move this forward 👍
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
LocalFileSystem interprets the prefix passed to ObjectStore::list_file as the path to a directory, and then proceeds to enumerate this directory recursively. S3FileSystem, however, interprets the prefix as a string prefix.
The distinction arises if you consider a file structure like
If called with a prefix of fo, LocalFileSystem will return an error, whereas S3FileSystem will return both files.
Describe the solution you'd like
I personally would expect something called ObjectStore to behave like an object store, and not a filesystem. In particular I would expect it to behave like a KV store without any notion of directories.
I would therefore suggest:
Describe alternatives you've considered
We could instead call the trait something like FileSystem and give it file system like semantics.
Additional context
I noticed this whilst reviewing #2394 - it seems off to me that we should need to split based on path delimiters given object stores don't have such a concept.
Thoughts @matthewmturner @alamb @timvw ?
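To make the difference between the two interpretations of the prefix concrete, here is a minimal sketch that simulates both over an in-memory list of keys. It is not the DataFusion implementation: the function names `list_string_prefix` and `list_directory` are hypothetical, and the two keys used with them are made up, since the original file structure is not reproduced above.

```rust
/// Object-store style: the prefix is matched as a plain string.
pub fn list_string_prefix<'a>(keys: &[&'a str], prefix: &str) -> Vec<&'a str> {
    keys.iter()
        .copied()
        .filter(|k| k.starts_with(prefix))
        .collect()
}

/// Filesystem style (simulated): the prefix must name a directory, i.e. some
/// key must live under "<prefix>/". Otherwise the call fails, mimicking the
/// error a local filesystem raises when asked to read a missing directory.
pub fn list_directory<'a>(keys: &[&'a str], prefix: &str) -> Result<Vec<&'a str>, String> {
    let dir = format!("{}/", prefix.trim_end_matches('/'));
    let found: Vec<&'a str> = keys
        .iter()
        .copied()
        .filter(|k| k.starts_with(dir.as_str()))
        .collect();
    if found.is_empty() {
        Err(format!("no such directory: {}", prefix))
    } else {
        Ok(found)
    }
}
```

With the hypothetical keys `foo/a.parquet` and `foo/b.parquet`, a prefix of `fo` returns both keys from `list_string_prefix` but an error from `list_directory`, while a prefix of `foo` returns both keys from either function.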