Make ConfigOption names into an Enum #4517
Comments
So I started bashing this out, with the incremental way like:

/// DataFusion specific configuration names.
pub enum ConfigName {
    /// Configuration option "datafusion.execution.target_partitions"
    TargetPartitions,
    /// Configuration option "datafusion.catalog.create_default_catalog_and_schema"
    CreateDefaultCatalogAndSchema,
    /// Configuration option "datafusion.catalog.information_schema"
    InformationSchema,
    /// Configuration option "datafusion.optimizer.repartition_joins"
    RepartitionJoins,
    /// Configuration option "datafusion.optimizer.repartition_aggregations"
    RepartitionAggregates,
    /// Configuration option "datafusion.optimizer.repartition_windows"
    RepartitionWindows,
    ...
}

But then I was thinking, if we are going to change the API anyways, why not go all the way with something fully statically typed and removing

/// DataFusion specific configuration names.
pub enum ConfigValue {
    /// Configuration option "datafusion.execution.target_partitions"
    TargetPartitions(usize),
    /// Configuration option for arbitrary user defined data
    UserDefined {
        name: String,
        value: Option<ScalarValue>,
    },
    ...
}
impl ConfigValue {
    /// Return the name of this configuration value
    fn name(&self) -> &str {
        match self {
            Self::TargetPartitions(_) => "datafusion.execution.target_partitions",
            ...
        }
    }

    /// Return the human readable description for this configuration value
    fn description(&self) -> &str {
        match self {
            Self::TargetPartitions(_) =>
                "Number of partitions for query execution. Increasing partitions can increase \
                 concurrency. Defaults to the number of cpu cores on the system.",
            ...
        }
    }

    /// Set the value of this configuration value
    fn set_value(&mut self, new_value: ScalarValue) -> Result<()> {
        // Capture the name first: `self` is moved into the match tuple below
        let name = self.name().to_string();
        match (self, new_value) {
            (Self::TargetPartitions(v), ScalarValue::UInt64(Some(n))) => *v = n as usize,
            (Self::TargetPartitions(_), other) => {
                // Any suitable DataFusionError variant would do here
                return Err(DataFusionError::Internal(format!(
                    "Expected uint64 for {} but got {:?}", name, other
                )));
            }
            ...
        }
        Ok(())
    }
}

But before I go crank that through the process, I wanted to get some feedback on whether that is a desirable way to go. I would retain the ability to store arbitrary name/value pairs in the metadata. What do you think @thinkharderdev @yahoNanJing @andygrove @avantgardnerio?
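For reference, here is a minimal self-contained sketch of the enum idea above that compiles on its own. It rests on several assumptions: `ScalarValue` is a two-variant stand-in rather than DataFusion's real type, errors are plain `String`s rather than `DataFusionError`, and the `InformationSchema(bool)` variant is added only to show a second value type.

```rust
/// Hypothetical stand-in for DataFusion's `ScalarValue` (assumption: only two variants).
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarValue {
    UInt64(Option<u64>),
    Boolean(Option<bool>),
}

/// Statically typed configuration values, one variant per option.
#[derive(Debug, Clone, PartialEq)]
pub enum ConfigValue {
    /// Configuration option "datafusion.execution.target_partitions"
    TargetPartitions(usize),
    /// Configuration option "datafusion.catalog.information_schema"
    /// (carrying a bool here is an assumption of this sketch)
    InformationSchema(bool),
    /// Arbitrary user defined data
    UserDefined { name: String, value: Option<ScalarValue> },
}

impl ConfigValue {
    /// Return the name (key) of this configuration value.
    pub fn name(&self) -> &str {
        match self {
            Self::TargetPartitions(_) => "datafusion.execution.target_partitions",
            Self::InformationSchema(_) => "datafusion.catalog.information_schema",
            Self::UserDefined { name, .. } => name.as_str(),
        }
    }

    /// Set the value, rejecting scalars of the wrong type.
    pub fn set_value(&mut self, new_value: ScalarValue) -> Result<(), String> {
        match (&mut *self, new_value) {
            (Self::TargetPartitions(v), ScalarValue::UInt64(Some(n))) => *v = n as usize,
            (Self::InformationSchema(v), ScalarValue::Boolean(Some(b))) => *v = b,
            // User defined entries accept any scalar.
            (Self::UserDefined { value, .. }, other) => *value = Some(other),
            // Everything else is a type mismatch.
            (this, other) => {
                return Err(format!("Unexpected value {:?} for {}", other, this.name()));
            }
        }
        Ok(())
    }
}
```

With this shape, a type mismatch shows up as an `Err` from `set_value` instead of a silently stored value of the wrong type.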
If you're going to go for statically typed, why not just go with a struct? We could always add a HashMap for custom extensions as the final field.
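A rough sketch of what the struct suggestion could look like. The struct name, the choice of fields, the default values, and the `HashMap<String, String>` extension type are all assumptions made for illustration, not DataFusion's actual API.

```rust
use std::collections::HashMap;

/// Hypothetical statically typed configuration struct.
#[derive(Debug, Clone, PartialEq)]
pub struct ConfigOptions {
    /// "datafusion.execution.target_partitions"
    pub target_partitions: usize,
    /// "datafusion.catalog.information_schema"
    pub information_schema: bool,
    /// "datafusion.optimizer.repartition_joins"
    pub repartition_joins: bool,
    /// Custom extensions, kept as the final field (string values are an assumption).
    pub extensions: HashMap<String, String>,
}

impl Default for ConfigOptions {
    fn default() -> Self {
        Self {
            // Placeholder defaults for this sketch only.
            target_partitions: 4,
            information_schema: false,
            repartition_joins: true,
            extensions: HashMap::new(),
        }
    }
}
```

Built-in options become ordinary typed fields (`opts.target_partitions`), while anything custom goes through `opts.extensions`.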
☝️ what he said :) I'd like to be able to use
BTW, thank you @alamb for taking on this thankless task. Recurring conflicts between multiple PRs, each changing config in its own way, are why I still have two of my own sitting in limbo. It will be nice to have this sorted: it should decrease the friction for everyone and PRs should flow faster.
One potential issue with struct update syntax is that any additional config is technically a breaking change. Unfortunately
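To illustrate the breaking-change concern in general Rust terms: with all-public fields, a caller may name every field in a struct literal, and that code stops compiling once a field is added, whereas a caller using struct update syntax keeps compiling. Marking the struct `#[non_exhaustive]` would rule out the first form, but Rust then also rejects struct expressions, including update syntax, from other crates. The struct and field names below are illustrative assumptions.

```rust
/// Hypothetical options struct with public fields.
#[derive(Debug, Clone, PartialEq)]
pub struct ExecutionOptions {
    pub target_partitions: usize,
    pub coalesce_batches: bool,
}

impl Default for ExecutionOptions {
    fn default() -> Self {
        // Placeholder defaults for this sketch only.
        Self { target_partitions: 4, coalesce_batches: true }
    }
}

/// Names every field: this stops compiling as soon as a new field is added.
pub fn exhaustive_literal() -> ExecutionOptions {
    ExecutionOptions { target_partitions: 8, coalesce_batches: true }
}

/// Struct update syntax: new fields are filled in from `Default`.
pub fn with_update_syntax() -> ExecutionOptions {
    ExecutionOptions { target_partitions: 8, ..Default::default() }
}
```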
I don't have a strong opinion either way. The one thing I do like about @alamb's original enum proposal is that we get a nice way to associate each config with a config key and description, which would maybe allow a more generic way of handling the decoding from CLI params and spitting out documentation/help text. Although you could probably do the same with a macro in the struct case.
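As a sketch of the "macro in the struct case" idea, a declarative `macro_rules!` macro can generate the struct, its `Default` impl, and a listing of key, description, and current value for each field. The macro name `config_struct!`, the struct name `SketchConfig`, the default values, and the second description string are hypothetical; this is not taken from DataFusion or from any code linked in this thread.

```rust
/// Hypothetical macro: declares a config struct plus per-field key and description.
macro_rules! config_struct {
    ($name:ident { $( $field:ident : $ty:ty = $default:expr, $key:literal, $doc:literal; )* }) => {
        #[derive(Debug, Clone, PartialEq)]
        pub struct $name {
            $( #[doc = $doc] pub $field: $ty, )*
        }

        impl Default for $name {
            fn default() -> Self {
                Self { $( $field: $default, )* }
            }
        }

        impl $name {
            /// (key, description, current value as text) for every option,
            /// usable for help text or generated documentation.
            pub fn entries(&self) -> Vec<(&'static str, &'static str, String)> {
                vec![ $( ($key, $doc, self.$field.to_string()), )* ]
            }
        }
    };
}

config_struct! {
    SketchConfig {
        target_partitions: usize = 4,
            "datafusion.execution.target_partitions",
            "Number of partitions for query execution.";
        information_schema: bool = false,
            "datafusion.catalog.information_schema",
            "Illustrative description for this sketch.";
    }
}
```

The fields stay plain typed struct members, and `entries()` gives the generic key/description view that the enum proposal offered through `name()` and `description()`.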
I don't think I have a strong opinion either as long as we retain the current capabilities, specifically:
I had a play around and came up with this: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=8c1ababec354db7118d3ff5ffd878f1e
I think it meets all of the above requirements whilst being a "simple" structured type and not requiring a proc-macro. Let me know what you think; if people are happy with it, I'd be happy to get something like it integrated into DataFusion.
I think it looks good to me 👍
Nice, seems like a good approach
The rationale here is to make the configuration code more "Rust-like" and to follow the Rust practice of strong typing where possible.
Originally posted by @thinkharderdev in #4492 (comment)