(object_store): Instantiate object store from provided url #4077
Looking good, mostly just some comments on error handling
@tustvold one question I had regarding the tests: I want to test parse_url for each supported store, so shall I put the tests separately in each module or create one big test in the same file?
I suspect this will make the feature flags easier
This looks very nice to me, perhaps @roeap may want to give it a look over
/// # Examples
///
/// ```
///
Brilliant example 😄
still a WIP 😂
there are some open items in my checklist
Co-authored-by: Raphael Taylor-Davies <[email protected]>
object_store/src/options.rs
Outdated
    .iter()
    .map(|(key, value)| {
        let conf_key =
            AzureConfigKey::from_str(&key.to_ascii_lowercase()).unwrap();
should we either return an error or omit the keys that will panic here?
I meant making the function fallible and returning the error, rather than unwrapping and thus panicking ...
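For illustration, a minimal sketch of what a fallible version of this conversion could look like. `ConfigKey`, `UnknownKey` and `parse_options` are hypothetical stand-ins invented for this example (they are not the crate's `AzureConfigKey` or its error type); the point is only that collecting into a `Result` surfaces an unknown key to the caller instead of panicking.

```rust
use std::collections::HashMap;
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for a store-specific key type such as `AzureConfigKey`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ConfigKey {
    AccountName,
    AccessKey,
}

/// Hypothetical error returned when an option key is not recognised.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownKey(pub String);

impl fmt::Display for UnknownKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown configuration key: {}", self.0)
    }
}

impl std::error::Error for UnknownKey {}

impl FromStr for ConfigKey {
    type Err = UnknownKey;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "account_name" => Ok(Self::AccountName),
            "access_key" => Ok(Self::AccessKey),
            other => Err(UnknownKey(other.to_string())),
        }
    }
}

/// Converts raw string options into typed keys. An unknown key is
/// returned as an error instead of causing a panic.
pub fn parse_options(
    raw: &HashMap<String, String>,
) -> Result<HashMap<ConfigKey, String>, UnknownKey> {
    raw.iter()
        .map(|(key, value)| {
            // Each item is a `Result`, so no `unwrap` is needed here.
            key.to_ascii_lowercase()
                .parse::<ConfigKey>()
                .map(|conf_key| (conf_key, value.clone()))
        })
        // Collecting into `Result<HashMap<_, _>, _>` stops at the first error.
        .collect()
}
```

The alternative mentioned above, silently omitting unknown keys, would replace the final `collect` with a `filter_map` over the successful parses; returning the error is usually easier for callers to debug.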
@roeap Thanks for the review.
I'm not sure what exactly you meant here, but based on my understanding and a little searching, I found something called error propagation and have made some changes for this.
Can you please check if this is what you meant and if this is on the right track?
I've marked this as a draft as it doesn't appear to be ready for review
also one more question - Since each object store has its own |
pub fn parse_url(
    url: impl AsRef<str>,
    store_options: Option<impl Into<StoreOptions>>,
    _from_env: bool,
I wonder if we could instead use a builder pattern here 🤔 I'll have a play over the weekend and see what I can come up with
yes, I thought about that as well. let me do something about this and then we can merge our ideas on this.
goals -
- allow URL based object_store instantiation
- store options are either explicitly passed or picked from env or both (preference given to explicit over env)
- optionally allow internal ClientOptions to be passed
- should work for all object stores like local, HTTP, AWS, GCS, Azure, mem
- user facing API should be simple and natural
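To make the first two goals concrete, here is a rough sketch of scheme-based dispatch and of the "explicit over env" precedence. Everything in it is an assumption for illustration: `StoreKind`, `store_kind`, `merge_options` and the scheme names are made up for this example and are not the crate's API or its definitive scheme mapping.

```rust
use std::collections::HashMap;

/// Hypothetical set of backends, mirroring the stores named in the goals.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StoreKind {
    Local,
    Memory,
    Http,
    Aws,
    Gcs,
    Azure,
}

/// Picks a backend from the URL scheme. The scheme names are assumptions
/// chosen for illustration only.
pub fn store_kind(url: &str) -> Option<StoreKind> {
    // A URL without "://" has no scheme we can dispatch on.
    let (scheme, _rest) = url.split_once("://")?;
    match scheme.to_ascii_lowercase().as_str() {
        "file" => Some(StoreKind::Local),
        "memory" => Some(StoreKind::Memory),
        "http" | "https" => Some(StoreKind::Http),
        "s3" => Some(StoreKind::Aws),
        "gs" => Some(StoreKind::Gcs),
        "az" => Some(StoreKind::Azure),
        _ => None,
    }
}

/// Merges options so that explicitly passed values win over values
/// picked up from the environment.
pub fn merge_options(
    explicit: &HashMap<String, String>,
    from_env: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut merged = from_env.clone();
    // Inserting the explicit values last overwrites env-derived duplicates.
    merged.extend(explicit.iter().map(|(k, v)| (k.clone(), v.clone())));
    merged
}
```

With this shape, a caller would resolve the backend once from the URL and then hand the merged options to that backend's own configuration, so the precedence rule lives in a single place.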
Created a builder pattern for this.
let me know if you had something like this in mind?
https://github.com/apache/arrow-rs/pull/4077/files#r1182195434
/// .with_env_variables(true)
/// .build();
/// ```
pub struct ObjectStoreBuilder {
@tustvold Based on your comment I have created a builder pattern for object store.
If this looks good, then we can get rid of the options.rs I added, as it contains duplicate code.
Also added better examples :)
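For readers following along, a stripped-down sketch of the builder shape being discussed. Only the names `ObjectStoreBuilder`, `with_env_variables` and `build` come from the diff above; `StoreConfig`, `new`, `with_url`, `with_option`, the fields and the error type are assumptions made up for this example, and a real `build` would presumably return an object store rather than a plain config struct.

```rust
use std::collections::HashMap;

/// Hypothetical resolved configuration; stands in for the store that a
/// real builder would construct.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoreConfig {
    pub url: String,
    pub options: HashMap<String, String>,
    pub use_env: bool,
}

/// Minimal builder sketch. Fields and most methods are assumed.
#[derive(Debug, Default)]
pub struct ObjectStoreBuilder {
    url: Option<String>,
    options: HashMap<String, String>,
    use_env: bool,
}

impl ObjectStoreBuilder {
    /// Starts with no URL, no options, and environment lookup disabled.
    pub fn new() -> Self {
        Self::default()
    }

    /// Sets the URL that identifies the store (assumed method).
    pub fn with_url(mut self, url: impl Into<String>) -> Self {
        self.url = Some(url.into());
        self
    }

    /// Adds one explicitly passed option (assumed method).
    pub fn with_option(
        mut self,
        key: impl Into<String>,
        value: impl Into<String>,
    ) -> Self {
        self.options.insert(key.into(), value.into());
        self
    }

    /// Controls whether options may also be read from the environment.
    pub fn with_env_variables(mut self, use_env: bool) -> Self {
        self.use_env = use_env;
        self
    }

    /// Validates the collected settings; fails if no URL was provided.
    pub fn build(self) -> Result<StoreConfig, String> {
        let url = self.url.ok_or_else(|| "a URL is required".to_string())?;
        Ok(StoreConfig {
            url,
            options: self.options,
            use_env: self.use_env,
        })
    }
}
```

Compared with a `parse_url(url, options, from_env)` signature, each setting here is named at the call site and new settings can be added later without breaking existing callers.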
Apologies I have a bit of an interrupted week, and want to give some thought into how this would integrate with things like DataFusion's ObjectStoreRegistry or downstream ObjectStore implementations like HDFS. Thank you for sticking with this, I'll get back to you as soon as I am able
Thank you for your work on this, I've raised #4184 with a proposal inspired by this. PTAL and let me know what you think
Closing this as #4200 has now been merged, thank you for your work on this
Which issue does this PR close?
Closes #4047.
Closes #2304
Rationale for this change
This PR proposes a standardized way to create an object store from a provided URL and options. It would make things significantly simpler for developers using the crate.
What changes are included in this PR?
Check list
parse_url( ... )
from_env to parse_url
parse_url( ... )
Are there any user-facing changes?
Yes, parse_url( ... ) is user-facing. No breaking changes.