Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: validate benchmark type #583

Merged
merged 4 commits into from
Dec 19, 2022
Merged

Conversation

olegsu
Copy link
Contributor

@olegsu olegsu commented Dec 15, 2022

What does this PR do?
This will use the new config.v1 structure to validate that the cloudbeat knows how to run the selected benchmark.
In case the benchmark is not supported it may fail fast or ignore and update the status.

  1. When cloudbeat just started and the given configuration is based on a benchmark that is not supported, fail fast as there is nothing to do.
  2. When a reconfiguration comes with an unsupported benchmark ignore it and update the status

This PR only uses config.v1.benchmark to not deal with the new structure that integrations pr adding ( cc @kfirpeled)

Related Issues
Related #239
Related elastic/integrations#4752

Checklist

  • I have added tests that prove my fix is effective or that my feature works
  • I have added the necessary README/documentation (if appropriate)

@olegsu olegsu requested review from a team as code owners December 15, 2022 14:25
@mergify mergify bot assigned olegsu Dec 15, 2022
@mergify
Copy link

mergify bot commented Dec 15, 2022

This pull request does not have a backport label. Could you fix it @olegsu? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v./d./d./d is the label to automatically backport to the 8./d branch. /d is the digit
    NOTE: backport-skip has been added to this pull request.

@olegsu olegsu force-pushed the health-status-agent-v2 branch from 1dc08a6 to 4fa50cf Compare December 15, 2022 14:26
@github-actions
Copy link

@olegsu olegsu force-pushed the health-status-agent-v2 branch from d98fccb to aa9af3a Compare December 15, 2022 16:02
@olegsu olegsu linked an issue Dec 18, 2022 that may be closed by this pull request
3 tasks
@olegsu olegsu force-pushed the health-status-agent-v2 branch from aa9af3a to 2ab04f5 Compare December 18, 2022 08:32
@oren-zohar oren-zohar requested a review from amirbenun December 18, 2022 08:36
@olegsu olegsu force-pushed the health-status-agent-v2 branch 4 times, most recently from 2ba924e to f9a9517 Compare December 18, 2022 09:56
Copy link
Contributor

@amirbenun amirbenun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

# exec_pod $POD "elastic-agent restart" # https://github.com/elastic/cloudbeat/pull/458#issuecomment-1308837098
# done
for P in $(get_agents); do
exec_pod $POD "elastic-agent restart" # https://github.com/elastic/cloudbeat/pull/458#issuecomment-1308837098
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the comment.
Also, I just noticed we are missing POD=$(echo $P | cut -d '/' -f 2).
Can you verify that it works?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, will verify

AWSConfig aws.ConfigAWS `config:",inline"`
RuntimeCfg *RuntimeConfig `config:"runtime_cfg"`
Fetchers []*config.C `config:"fetchers"`
KubeConfig string `config:"kube_config"`
Period time.Duration `config:"period"`
Processors processors.PluginConfig `config:"processors"`
BundlePath string `config:"bundle_path"`
Benchmark string `config:"config.v1.benchmark"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change requires a matching change to the integration, isn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -239,6 +242,9 @@ func (l *launcher) reconfigureWait(timeout time.Duration) (*config.C, error) {
err := l.validator.Validate(update)
if err != nil {
l.log.Errorf("Config update validation failed: %v", err)
if errors.Is(err, cloudbeat_config.ErrBenchmarkNotSupported) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The launcher was designed in a way it is agnostic to cloudbeat or any Beater implementation.
In that case, we can extend the Validator to support a special kind of error that will send that degraded message.
The only difference will be that the Validator accepts config.C instead of our struct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, we can update the validator behavior to return a boolean whether the configuration is valid or not.
So the validator will also receive the management.Manager to update the status.

I am not sure it's better as we still have a coupling between the launcher and the validator.

@olegsu olegsu force-pushed the health-status-agent-v2 branch 2 times, most recently from 16fcd2b to 9daf134 Compare December 18, 2022 14:03
errors/errors.go Outdated
// can help to end user to operate cloudbeat
// for example when a cloudbeat configuration is invalid, this error will include
// more information about what is missing/expected and maybe links to external sources as well
type CloudbeatError struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked it, since I still don't want the launcher to be aware of cloudbeat I would just rename it.
Maybe BeaterDegradeError / BeaterUnhealthyError or some other name that indicates what is about to happen once this error is returned.
Also, it should reside near the launcher IMO.

Copy link
Contributor Author

@olegsu olegsu Dec 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good naming, will change the name.

Also, it should reside near the launcher IMO.

This might be an issue with imports

@olegsu olegsu force-pushed the health-status-agent-v2 branch from 9daf134 to 9f2c7b1 Compare December 18, 2022 18:40
@olegsu olegsu force-pushed the health-status-agent-v2 branch from 9f2c7b1 to 1ebe7b9 Compare December 18, 2022 18:40
@olegsu olegsu enabled auto-merge (squash) December 19, 2022 07:56
Copy link
Contributor

@amirbenun amirbenun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last touch

@@ -239,6 +242,10 @@ func (l *launcher) reconfigureWait(timeout time.Duration) (*config.C, error) {
err := l.validator.Validate(update)
if err != nil {
l.log.Errorf("Config update validation failed: %v", err)
heatlhErr := &cb_errors.BeaterUnhealthyError{}
if errors.As(err, heatlhErr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's test this new functionality, it can be a part of TestLauncherValidator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, note that you are checking the error that returned from the Validator but BeaterUnhealthyError error is actually returned from config.New and wrapped in the validator and probably won't get detected here.

Copy link
Contributor Author

@olegsu olegsu Dec 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's test this new functionality, it can be a part of TestLauncherValidator

I dont think we have an easy way to mock the Manager.UpdateStatus tbh.

Also, note that you are checking the error that returned from the Validator but BeaterUnhealthyError error is actually returned from config.New and wrapped in the validator and probably won't get detected here.

The test for the errors pkg checking exactly this. First, create the BeaterUnhealthyError and then wrap it, later test that using errors.As it will get the original error.

From golang https://pkg.go.dev/errors#As

As finds the first error in err's chain that matches target, and if one is found, sets target to that error value and returns true. Otherwise, it returns false.

errors/errors.go Outdated
msg string
}

func New(msg string) BeaterUnhealthyError {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, rename the function to indicate the error type, maybe errors.NewBeaterUnhealthy(...)?

@olegsu olegsu force-pushed the health-status-agent-v2 branch from fc5eda9 to db6084f Compare December 19, 2022 10:06
@mergify
Copy link

mergify bot commented Dec 19, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b health-status-agent-v2 upstream/health-status-agent-v2
git merge upstream/main
git push upstream health-status-agent-v2

Copy link
Collaborator

@gurevichdmitry gurevichdmitry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olegsu LGTM. scripts/common.sh approved.

@olegsu olegsu force-pushed the health-status-agent-v2 branch 2 times, most recently from 90b3186 to 36d98fb Compare December 19, 2022 15:31
@olegsu olegsu disabled auto-merge December 19, 2022 15:32
@olegsu olegsu force-pushed the health-status-agent-v2 branch from 36d98fb to db6084f Compare December 19, 2022 15:36
@mergify
Copy link

mergify bot commented Dec 19, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b health-status-agent-v2 upstream/health-status-agent-v2
git merge upstream/main
git push upstream health-status-agent-v2

@olegsu olegsu merged commit d2638ce into elastic:main Dec 19, 2022
@olegsu olegsu deleted the health-status-agent-v2 branch December 19, 2022 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support health status interface
3 participants