Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Appsec high availability using replicas #208

Merged
merged 12 commits into from
Dec 16, 2024

Conversation

srkoster
Copy link
Contributor

@srkoster srkoster commented Nov 29, 2024

Thanks @he2ss for implementing Appsec into the Crowdsec helmchart. This PR builds upon #189 and #206 and make Appsec high availability possible by defining appsec.replicas.

Other changes:

  • RegistrationToken generation is moved into _helpers.tpl. It can now also be passed through values.yaml
  • appsec-deployment.yaml is structured more like agent-deployment.yaml
  • Added the appsec registration_token config into values.yaml (starred out)
  • ci\crowdsec-values.yaml: removed agent.additionalAcquisition as the dummy values make the helm chart test fail because the agent pod isn't able to start.
  • ci\crowdsec-values.yaml: added agent.metrics.enabled: true to use the agent metrics endpoint in the helm chart test
  • test_agent_up.yaml: Using the LAPI watchers/login endpoint with username and password is no longer an option when using autoregistration. Instead check the agent metrics endpoint to see if the agent is running during the helm chart test.

@github-actions github-actions bot added the needs/kind Kind label required label Nov 29, 2024
Copy link

@srkoster: There are no 'kind' label on this PR. You need a 'kind' label to generate the release automatically.

  • /kind feature
  • /kind enhancement
  • /kind fix
  • /kind chore
  • /kind dependencies
Details

I am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the forked project rr404/oss-governance-bot repository.

Copy link

@srkoster: There are no area labels on this PR. You can add as many areas as you see fit.

  • /area agent
  • /area local-api
  • /area cscli
  • /area security
  • /area configuration
Details

I am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the forked project rr404/oss-governance-bot repository.

@srkoster
Copy link
Contributor Author

/kind enhancement
/area configuration

@he2ss
Copy link
Member

he2ss commented Nov 29, 2024

Hi @srkoster,

I'm sorry I didn't reply quickly to the old PR. IMO, for the high availability, we can play with replicas, and that's it.
The DaemonSet for the agent is different because the agent pod is mounting the node logs directory and analyzing them while the appsec is communicating over HTTP with the ingress controllers via the bouncer.
So what I suggest, is to have only the deployment for the appsec, so users can handle HA with replicas and annotations if they want.

@srkoster
Copy link
Contributor Author

@he2ss no worries. Thanks for explaining why the agent is defined as a daemonset and your thoughts about the appsec deployment. I've removed the daemonset option from the PR.

@srkoster srkoster changed the title Appsec high availability through deployment with replicas or daemonset Appsec high availability using replicas Nov 29, 2024
@he2ss
Copy link
Member

he2ss commented Dec 9, 2024

Hi @srkoster,

Sorry, I got busy with other tasks. So I have a last comment. Otherwise the PR seems good, I also tested it and seems working 👍

srkoster and others added 2 commits December 9, 2024 17:47
…ompatibility
@srkoster
Copy link
Contributor Author

srkoster commented Dec 9, 2024

I resolved the merge conflict by accepting the config.yaml.local.api.server.auto_registration comment from main

@he2ss
Copy link
Member

he2ss commented Dec 16, 2024

Hi @srkoster,

I'll bother you again. The chart testing has been falling since this PR:#209

We may want to autoregister an agent before connecting to Lapi in the test chart. Can you please handle this?

@srkoster
Copy link
Contributor Author

Hi @he2ss,
Yes, i'll have a look.

Using the LAPI watchers/login endpoint with username and password is no longer an option when using autoregistration. Instead check the agent metrics endpoint to see if the agent is running.
@srkoster
Copy link
Contributor Author

@he2ss
The chart testing fails for 2 reasons:

  • The test-agent pod can't be created because it still relies on the agent-credentials secret, which is removed in favor of autoregistration. I've fixed this in this PR by removing the environment variables referring to it and changing the test to do a curl on the agent metrics port. This test is imho a better implementation than the current one, as it checks if the agent is running successfully instead of logging in the LAPI with the test-agent pod.
  • The agent pod isn't running but in status error. This is because the ci/crowdsec-values.yaml contains agent.additionalAcquisition. These additionalAcquisitions don't work, prevent the agent pod from running successfully and thus make the test-agent pod fail after my modification above. For example, the chart test shows:
time="2024-12-16T11:53:47Z" level=error msg="datasource 'journalctl' is not available: exec: \"journalctl\": executable file not found in $PATH"
time="2024-12-16T11:53:47Z" level=error msg="aws_region is not specified, specify it or aws_config_dir" group=/aws/my/group type=cloudwatch
time="2024-12-16T11:53:47Z" level=fatal msg="crowdsec init: while loading acquisition config: while configuring datasource of type cloudwatch from /etc/crowdsec/acquis.yaml (position: 4): failed to configure datasource cloudwatch: aws_region is not specified, specify it or aws_config_dir"

I've removed the agent.additionalAcquisition entry in a separate branch and the chart test is successful. Do you want me to modify the agent.additionalAcquisition entry in ci/crowdsec-values.yaml here or create a new PR?

@he2ss
Copy link
Member

he2ss commented Dec 16, 2024

Thanks @srkoster for this troubleshoot.
Yes you can remove the additionalAcquisitions as was done to test the jsonSchema, but not relevant to put random info in those additionalAcquistions.

These agent.additionalAcquisition dummy values make the helm chart test fail because the agent pod isn't able to start.
@srkoster
Copy link
Contributor Author

@he2ss I've removed the agent.additionalAcquisition entry in this PR and modified the PR comment to reflect all changes.

@he2ss he2ss merged commit 31d4478 into crowdsecurity:main Dec 16, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants