Add ObjectStoreScheme (#4047) #4184
/// This can be combined with the [with_url](crate::aws::AmazonS3Builder::with_url) methods
/// on the corresponding builder to construct the relevant type of store
#[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash)]
pub enum ObjectStoreScheme {
I don't fully understand why you are proposing to add this to the object store crate.
Users of object_store would still have to match on the resulting scheme and instantiate a builder / configuration appropriate to whatever they wanted. The extra value to having a hard coded list of url prefixes seems relatively minimal.
Maybe this is just a first step.
If I were a user I would want something that took a url like s3://foo-bucket or https://andrew:[email protected]/path and returned an Arc<dyn ObjectStore>.
For convenience the object_store crate could have default interpretations of these urls, but also some way to extend the API.
Basically I think the API here makes a lot of sense https://docs.rs/datafusion/latest/datafusion/datasource/object_store/trait.ObjectStoreRegistry.html
the API here makes a lot of sense
This API is just a trait? It doesn't contain any parsing logic, nor any logic to interpret schemes directly
The extra value to having a hard coded list of url prefixes seems relatively minimal
It isn't just schemes FWIW
why you are proposing to add this to the object store crate
If I were a user I would want something that took a url like s3://foo-bucket or https://andrew:[email protected]/path and returned an Arc<dyn ObjectStore>
As explained in the description, because there isn't a way I can see to make such an API that is both coherent and not extremely opinionated...
This API is just a trait? It doesn't contain any parsing logic, nor any logic to interpret schemes directly
The API is a trait but DataFusion provides default parsing logic / scheme interpretation in https://docs.rs/datafusion/latest/datafusion/datasource/object_store/struct.DefaultObjectStoreRegistry.html
As explained in the description, because there isn't a way I can see to make such an API that is both coherent and not extremely opinionated...
I was trying to suggest an API that let users implement their own opinions while also providing a default implementation that worked for simple cases (with whatever opinions you wanted)
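To make the suggestion concrete, here is a minimal sketch of that shape: a trait users can implement with their own opinions, plus a default implementation for simple cases. It is loosely modelled on the DataFusion registry linked above and is not code from this PR. `Store`, `StoreRegistry`, `DefaultRegistry`, `Named` and `key` are hypothetical names, and `Store` is a placeholder for `object_store::ObjectStore` so the sketch has no dependencies.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Placeholder for `object_store::ObjectStore` (assumption: keeps the sketch dependency-free).
pub trait Store: Send + Sync {
    fn name(&self) -> String;
}

/// Hypothetical extension point: downstreams implement this to apply their own opinions.
pub trait StoreRegistry {
    fn get_store(&self, url: &str) -> Option<Arc<dyn Store>>;
}

/// Hypothetical default implementation: stores are looked up by `scheme://authority`.
#[derive(Default)]
pub struct DefaultRegistry {
    stores: HashMap<String, Arc<dyn Store>>,
}

impl DefaultRegistry {
    /// Register a store for a prefix such as `s3://foo-bucket`.
    pub fn register(&mut self, prefix: &str, store: Arc<dyn Store>) {
        self.stores.insert(prefix.to_string(), store);
    }
}

/// Reduce a URL to its `scheme://authority` prefix; returns `None` if there is no `://`.
fn key(url: &str) -> Option<String> {
    let (scheme, rest) = url.split_once("://")?;
    // Everything up to the first `/` is treated as the authority.
    let authority = rest.split('/').next().unwrap_or("");
    Some(format!("{scheme}://{authority}"))
}

impl StoreRegistry for DefaultRegistry {
    fn get_store(&self, url: &str) -> Option<Arc<dyn Store>> {
        self.stores.get(&key(url)?).cloned()
    }
}

/// Trivial store used only to exercise the registry.
pub struct Named(pub &'static str);

impl Store for Named {
    fn name(&self) -> String {
        self.0.to_string()
    }
}
```

A caller would register `Arc::new(Named("s3"))` under `s3://foo-bucket` and then resolve `s3://foo-bucket/a/b` through `get_store`; anything with different requirements implements `StoreRegistry` itself.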
I will add a build_with_options or similar method to this
Which issue does this PR close?
Closes #4047
Rationale for this change
Inspired by the work in #4077 I started work on a generic ObjectStoreBuilder, however, this ran into a couple of challenges
Whilst this PR does to a certain extent punt the issue onto downstreams, I think this standardises the non-trivial URL recognition logic, whilst allowing downstreams to make their own opinionated decision w.r.t the above.
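As a rough illustration of why the recognition logic goes beyond a list of prefixes, the sketch below classifies a URL by scheme and, for http(s), by host. It is not the implementation in this PR: `Scheme`, `recognise` and `describe` are hypothetical names, the variants and host suffixes are assumptions, and IPv6 hosts are ignored for brevity.

```rust
/// Hypothetical stand-in for `ObjectStoreScheme`; the variants are assumptions.
#[derive(Debug, Copy, Clone, Eq, PartialEq)]
pub enum Scheme {
    Local,
    Memory,
    AmazonS3,
    GoogleCloudStorage,
    MicrosoftAzure,
    Http,
}

/// Classify `url`, returning the store type and everything after `scheme://`.
pub fn recognise(url: &str) -> Option<(Scheme, &str)> {
    let (scheme, rest) = url.split_once("://")?;
    // Authority is everything up to the first `/`; drop userinfo and port to get the host.
    let authority = rest.split('/').next().unwrap_or("");
    let host_port = authority.rsplit('@').next().unwrap_or(authority);
    let host = host_port.split(':').next().unwrap_or(host_port);

    let kind = match scheme {
        "file" => Scheme::Local,
        "memory" => Scheme::Memory,
        "s3" | "s3a" => Scheme::AmazonS3,
        "gs" => Scheme::GoogleCloudStorage,
        "az" | "abfs" | "abfss" => Scheme::MicrosoftAzure,
        // For http(s) the scheme alone is not enough, so the host decides.
        "http" | "https" => {
            if host.ends_with("amazonaws.com") {
                Scheme::AmazonS3
            } else if host.ends_with("blob.core.windows.net")
                || host.ends_with("dfs.core.windows.net")
            {
                Scheme::MicrosoftAzure
            } else {
                Scheme::Http
            }
        }
        _ => return None,
    };
    Some((kind, rest))
}

/// A downstream still chooses what to build for each store type.
pub fn describe(url: &str) -> String {
    match recognise(url) {
        Some((Scheme::AmazonS3, rest)) => format!("configure an S3 builder for {rest}"),
        Some((kind, rest)) => format!("configure a {kind:?} store for {rest}"),
        None => format!("unrecognised url: {url}"),
    }
}
```

The `match` in `describe` is where a downstream would plug in its own credentials, defaults and builder options.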
What changes are included in this PR?
Are there any user-facing changes?