-
Notifications
You must be signed in to change notification settings - Fork 457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve config validation for prometheus receiver and target allocator #1729
Improve config validation for prometheus receiver and target allocator #1729
Conversation
e2507b9
to
e2c14ac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a nice addition! 💪 I added only a few nits.
cmd/otel-allocator/main.go
Outdated
setupLog.Error(err, "Unable to apply initial configuration") | ||
return err | ||
if cfg.Config != nil { | ||
err = targetDiscoverer.ApplyConfig(allocatorWatcher.EventSourceConfigMap, cfg.Config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether it would make more sense to check for cfg.Config == nil
inside of ApplyConfig
and if so, just do early return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it, but realistically, this is the only place in the whole package where this can happen, because we're getting the config from user input. So I think it's better to keep this bit of complexity here instead of in the component. We could also just have the Config itself come with a non-nil default if the user doesn't supply it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you actually reverse this condition such that we do
if cfg.Config == nil {
setupLog.Error("cannot apply an empty configuration")
return fmt.Error("...")
}
That way we prevent startup correctly on bad config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't reverse it if it stays where it is in main
, as cfg.Config == nil
by itself does not imply an error. Or do you mean doing this inside ApplyConfig
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't cfg.Config == nil
imply an error is i guess my conjecture here. i.e. if we don't have a config to apply at all we should fail fast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@swiatekm-sumo IMHO still it's an exported method, could be used anywhere else. That way whoever re-uses this method in whichever place they do, do not need to worry about the nil
condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matej-g makes sense, you've convinced me.
@jaronoff97 Having the raw Prometheus config be nil and Prometheus CR handling enabled looks like a valid configuration to me. That's why I added the validate method and check for this specific state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the check to ApplyConfig and added a Discoverer test for this.
cmd/otel-allocator/config/config.go
Outdated
if !*cliConfig.PromCRWatcherConf.Enabled && | ||
(config.Config == nil || | ||
len(config.Config.ScrapeConfigs) == 0) { | ||
return fmt.Errorf("either at least one scrape config must be defined, or Prometheus CR watching must be enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nit
return fmt.Errorf("either at least one scrape config must be defined, or Prometheus CR watching must be enabled") | |
return fmt.Errorf("at least one scrape config must be defined, or Prometheus CR watching must be enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd prefer us to actually split up this if in to three statements. Although it's more LOC we'll be able to give a more descriptive error back which I think would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote it this way anticipating that we might want to add more checks in the future, in which case they all need to be negative checks (as in "this error condition doesn't occur"). And I can't really rewrite it into three separate statements, as it's logically an OR.
I could make it nested instead, but I'm not sure if you can really make the error message any more useful that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rewrote this condition to be clearer, let me know if this helps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a few thoughts, overall this is going to be super useful though. thank you!
cmd/otel-allocator/config/config.go
Outdated
if !*cliConfig.PromCRWatcherConf.Enabled && | ||
(config.Config == nil || | ||
len(config.Config.ScrapeConfigs) == 0) { | ||
return fmt.Errorf("either at least one scrape config must be defined, or Prometheus CR watching must be enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd prefer us to actually split up this if in to three statements. Although it's more LOC we'll be able to give a more descriptive error back which I think would be better.
cmd/otel-allocator/main.go
Outdated
setupLog.Error(err, "Unable to apply initial configuration") | ||
return err | ||
if cfg.Config != nil { | ||
err = targetDiscoverer.ApplyConfig(allocatorWatcher.EventSourceConfigMap, cfg.Config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you actually reverse this condition such that we do
if cfg.Config == nil {
setupLog.Error("cannot apply an empty configuration")
return fmt.Error("...")
}
That way we prevent startup correctly on bad config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a few thoughts, overall this is going to be super useful though. thank you!
@swiatekm-sumo my colleague @kristinapathak is also going to take a look at this as well before i merge. We've been deep in to configuration related PRs for the TA as of late so this will be a good continuation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Left two questions/suggestions.
@@ -57,6 +57,11 @@ func ReplaceConfig(instance v1alpha1.OpenTelemetryCollector) (string, error) { | |||
return "", getCfgPromErr | |||
} | |||
|
|||
validateCfgPromErr := ta.ValidatePromConfig(promCfgMap, instance.Spec.TargetAllocator.Enabled, featuregate.EnableTargetAllocatorRewrite.IsEnabled()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is ValidatePromConfig()
meant to be called every time ta.ConfigToPromConfig()
is called? Put another way, is there ever a situation when we want to call one function without the other? I'm wondering the benefit of two separate functions. Is it separation of concerns? Not making a breaking change to the adapters
pkg API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not making a breaking change and keeping current semantics of ta.ConfigToPromConfig()
is part of it, although the separation of concerns is a nice bonus from that. Though overall there isn't any strong reason behind it, if we're ok breaking ConfigToPromConfig
, we could simply make validation a private function and have ConfigToPromConfig
call it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that sounds like a good followup for the future :)
7479876
to
1515f43
Compare
b774b95
to
c09cce3
Compare
@@ -74,6 +74,10 @@ func main() { | |||
setupLog.Error(configLoadErr, "Unable to load configuration") | |||
} | |||
|
|||
if validationErr := config.ValidateConfig(&cfg, &cliConf); validationErr != nil { | |||
setupLog.Error(validationErr, "Invalid configuration") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in a future version should we fast fail here?
@@ -57,6 +57,11 @@ func ReplaceConfig(instance v1alpha1.OpenTelemetryCollector) (string, error) { | |||
return "", getCfgPromErr | |||
} | |||
|
|||
validateCfgPromErr := ta.ValidatePromConfig(promCfgMap, instance.Spec.TargetAllocator.Enabled, featuregate.EnableTargetAllocatorRewrite.IsEnabled()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that sounds like a good followup for the future :)
open-telemetry#1729) * Improve validation for prometheus receiver config * Improve config validation for target allocator * add changelog entry * review fixes * log empty Prometheus configs passed to discoverer * add tests for ValidateConfig
In a follow-up to #1688, improve validation for prometheus receiver configuration. This turned out to be more complicated than expected, so I'd like some more eyes to verify I haven't missed anything.
The validation logic for the prometheus receiver config itself is:
config
key is requiredconfig
has to exist to inject http_sd_configs, ortarget_allocator
needs to existHowever, I also noticed that target allocator itself doesn't correctly validate its configuration. For example, it's possible to pass an empty Prometheus config, but this will cause a crash in the targetDiscoverer. So I added an additional check, requiring that either:
I submitted both of these changes in a single PR, as they're related, but I can resubmit separately if that'll make review easier.