-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added self documentation to the KSQL config public config variables. #422
Added self documentation to the KSQL config public config variables. #422
Conversation
retest this please |
ConfigDef.Type.SHORT, | ||
defaultSinkNumberOfReplications, | ||
ConfigDef.Importance.MEDIUM, | ||
"The default number of replications for the topics created by KSQL." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replications -> replicas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
ConfigDef.Type.STRING, | ||
KSQL_PERSISTENT_QUERY_NAME_PREFIX_DEFAULT, | ||
ConfigDef.Importance.MEDIUM, | ||
"Second part of the prefix for persitent queries.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a description of what the final query name would look like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an example.
ConfigDef.Type.STRING, | ||
KSQL_TRANSIENT_QUERY_NAME_PREFIX_DEFAULT, | ||
ConfigDef.Importance.MEDIUM, | ||
"Second part of the prefix for transient queries.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an example.
|
||
public int defaultSinkNumberOfPartitions = 4; | ||
public short defaultSinkNumberOfReplications = 1; | ||
public static int defaultSinkNumberOfPartitions = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these need to be public
, i.e., they are set as the defaults anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily, changed them to private.
KSQL_TABLE_STATESTORE_NAME_SUFFIX_DEFAULT, | ||
ConfigDef.Importance.MEDIUM, | ||
"Suffix for state store names in Tables.") | ||
.define(DEFAULT_SINK_NUMBER_OF_PARTITIONS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we have DEFAULT_SOME_PROPERTY
and SOME_PROPERTY
? We just define what the default value is here and then we can use SOME_PROPERTY
everywhere. The ConfigDef
takes care of providing the default value if none is supplied. So further down from line 141 etc you don't need to create a new copy of the Map and put the defaults in, they will already be there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to let user define the default for a set of statements. For instance if user want to have the default value of partitions to be 10 for all of his/her queries they can set the default and just write their queries instead of having to specify it in each of their queries. On the other hand, if user wants to have a different value for a specific query he/she can set the value in the specific query syntax and it will be applied to only that one. User can change the default to another value and that value will be used as default partition number for all the queries after that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hjafarpour in the KsqlConfig
ctor you shouldn't need to create a map for the ksqlConfigProps
and add the defaults. That will already be done for you by the ConfigDef
. The only map we need to create is the one for ksqStreamConfigProps
and for that you should still be able to pull the defaults out of the ConfigDef
@dguy the properties in |
@hjafarpour sure, but you can still populate the maps from the defaults provided by the config, i.e, you can use |
@dguy changed the init for the configs based on your feedback. |
|
||
ksqlStreamConfigProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, defaultAutoOffsetRestConfig); | ||
ksqlStreamConfigProps.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, defaultCommitIntervalMsConfig); | ||
ksqlStreamConfigProps.put( | ||
StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, defaultCacheMaxBytesBufferingConfig); | ||
ksqlStreamConfigProps.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, defaultNumberOfStreamsThreads); | ||
|
||
for (Map.Entry<?, ?> entry : props.entrySet()) { | ||
for (Map.Entry<?, ?> entry : originals().entrySet()) { | ||
final String key = entry.getKey().toString(); | ||
if (key.toLowerCase().startsWith(KSQL_CONFIG_PREPERTY_PREFIX)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we'd already have all the properties in the ksqlConfigProps
by virtue of constructing the map from super.values()
, so we could skip updating the ksqlConfigProps
and only update ksqlStreamConfigProps
if the key doesn't start with KSQL_CONFIG_PROPERTY_PREFIX
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just one more comment, otherwise LGTM
…via values. Correct the CLI test to expect the correct properties.
@dguy made the change. |
public static final String SINK_NUMBER_OF_REPLICATIONS = "REPLICATIONS"; | ||
public static final String SINK_NUMBER_OF_REPLICATIONS_PROPERTY = "ksql.sink.replications"; | ||
public static final String DEFAULT_SINK_NUMBER_OF_REPLICATIONS = "ksql.sink.replications.default"; | ||
public static final String SINK_NUMBER_OF_REPLICAS = "REPLICAS"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name and the value of this property don't seem to line up. What does the REPLICAS
value mean exactly? Would it be more appropriate to have a numeric value? This applies to the SINK_NUMBER_OF_PARTITIONS
property above as well. Also where is this SINK_NUMBER_OF_REPLICAS
string used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are not part of the config variables that can be set by the user via the KsqlConfig but they will be set in the query to specify the number of partitions, replications and the timestamp column for the result stream/table. This is just a constant to represent the name of the attributes.
Here is an example:
CREATE STREAM enrichedpv1 with (timestamp='pageid', partitions = 3) AS SELECT users.userid AS userid, pageid as pageid, region, gender FROM pageview LEFT JOIN users ON pageview.userid = users.userid;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Then why is it in KsqlConfig
? There is also a SINK_NUMBER_OF_REPLICAS_PROPERTY
which I assume does go in the ConfigDef
? It would be better to have the literals that are in the query moved to a separate file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I'll move them to a new file, KsqlConstants.
ConfigDef.Type.STRING, | ||
KSQL_PERSISTENT_QUERY_NAME_PREFIX_DEFAULT, | ||
ConfigDef.Importance.MEDIUM, | ||
"Second part of the prefix for persitent queries. For instance if the prefix is transient_" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean transient_
here? IT doesn't fit in with the rest of the description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the second part. The first part come from KSQL_SERVICE_ID_CONFIG(ksql.service.id)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I am still not sure I follow. I think it would make sense for the example to include the full chain, so that it is clearer to the user how it all comes together at each stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added more details with an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The doc string here as written is:
Second part of the prefix for persitent queries. For instance if the prefix is transient_query_ the query name will be ksql_query_1.
Did you really mean for the example prefix to be 'transient_query_'?
ConfigDef.Type.LONG, | ||
defaultSinkWindowChangeLogAdditionalRetention, | ||
ConfigDef.Importance.MEDIUM, | ||
"The default window change log additional retention time." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to drop the 'default' here. Also, it may make sense to expand the description. It doesn't make sense as is. Perhaps a sentence like 'The amount of time to retain window change logs'. Actually I am still not sure what that means Also, there should be a time unit in the name of the config and the description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, added the time unit and more details to the description.
"The default window change log additional retention time. This is a streams " | ||
+ "config value which will be added to a windows maintainMs to ensure data is not " | ||
+ "deleted from " | ||
+ "the " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why these new lines?
ie. why have this string over so many lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed it, seems that my intelliJ formatted it that way automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates. It looks good to me, barring some doc strings which I think have some copy/paste errors.
ConfigDef.Type.STRING, | ||
KSQL_PERSISTENT_QUERY_NAME_PREFIX_DEFAULT, | ||
ConfigDef.Importance.MEDIUM, | ||
"Second part of the prefix for persitent queries. For instance if the prefix is transient_" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The doc string here as written is:
Second part of the prefix for persitent queries. For instance if the prefix is transient_query_ the query name will be ksql_query_1.
Did you really mean for the example prefix to be 'transient_query_'?
ConfigDef.Type.STRING, | ||
KSQL_TRANSIENT_QUERY_NAME_PREFIX_DEFAULT, | ||
ConfigDef.Importance.MEDIUM, | ||
"Second part of the prefix for transient queries. For instance if the prefix is " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating the example. It makes sense now!
ConfigDef.Importance.MEDIUM, | ||
"Suffix for state store names in Tables. For instance if the suffix is _ksql_statestore the state " | ||
+ "store name would be ksql_query_1_ksql_statestore" | ||
+ "_ksql_statestore ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you add this extra line by mistake?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch for 'transient_query_'
, it should be 'query_'
. Fixed it.
IntelliJ added this, fixed it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
No description provided.