-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce the datafusion-objectstore-hdfs in datafusion-contrib as an object store feature #260
Conversation
Will be ready after #258 merged. |
3fbad0b
to
8e549a3
Compare
/// Return the key and object store | ||
#[allow(unused_variables)] | ||
fn get_by_url(&self, url: &Url) -> Option<Arc<dyn ObjectStore>> { | ||
#[cfg(feature = "hdfs")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being able to enable/disable object stores based on features sounds sensible, but I would recommend supporting multiple based on the scheme. Having to recompile different binaries for different backends is unfortunate. It's possible you intend to do this anyway, but thought I would mention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a standalone system, we should know which object stores to be used before running it. Therefore, it's OK to make the decision at the compiling phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you distribute binaries for such a system?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We run it on K8S with replicas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me phrase it differently, how would the ballista project distribute binaries/docker images for general consumption in arbitrary environments. I presume we want to provide this? People shouldn't need to compile from source for their specific environment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We made an agreement that it's supposed to be able to find all of the object stores from the ObjectStoreProvider. Otherwise, an error should be thrown. We'll change the related ObjectStoreProvider interface after we upgrading to latest datafusion.
} | ||
} | ||
|
||
None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following the current construction this could return
"FeatureBasedObjectStoreProvider failed to create store for url {}"
This is already an improvement as it tells us that the provider was registered and used.
As described above, I think it would be even better if it could provide insight into why nothing was found, e.g. not compiled with the right feature, unsupported scheme, etc..
… object store feature
8e549a3
to
1bdb71b
Compare
Which issue does this PR close?
Closes #259, #6.
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?