Add ability to deploy custom elastic-agents on different OS or runtimes #787
Comments
Regarding #786: I recommend not modifying the test runner much, if at all. We'd like to rewrite that code eventually, as it has too many responsibilities and is error-prone now. I see a few approaches that we can apply here.

Extend "profiles" with local patches

When a user or CI executes

Extend Compose stack definition with environment variables

Let's not add anything special except allowing for var customizations in the Compose stack. Whenever a user or CI executes

Problem: it doesn't solve the problem of elastic-package stack booting on the Windows machine. Should we move it to another, separate issue? Actually, you can pair it with image overrides.

Hack: retag elastic-agent image

It's a hack but may work temporarily. Before running the |
Another option could be to move agent initialization to the system test runner and remove it from the default stack definition. Having it in the runner would give the runner full control of the started agents, allowing it to start them with different options and to handle platform-specific needs, as could be the case for Windows. We already have a custom agent for the Kubernetes service deployer; we could follow a similar strategy for any other deployer. If we remove the agent from the default stack definition, we could still have a stack subcommand to start agents for manual tests. Something like this could also cover #548. I think this could be a more future-proof option, but it may require significant effort. Another option, for the use case of starting an agent but no service, would be to add a new "system" deployer that just starts an agent with a given configuration, intended for system-level monitoring. This could help with packages for auditbeat or for the system module itself. It could be extended in the future to start completely different OSs using VMs. This would be more in line with #786, but without needing to hack the current test runner and compose deployer. |
There are two constraints related to this approach:
I like the approach of having the custom agent setup. You're right that we could apply similar logic as for
I had that in mind before, hence the issue, but always considered its complexity as +Inf. Maybe we can evaluate it as a good first issue and "rebuild the stack command"? To sum up, my vote would go to custom agent setup. |
Could this be done without modifying runners? |
Yes, I think so, in the same way that most of the changes for the Kubernetes service deployer are enclosed in this file. There might be one inconvenience: the agent will be deployed during the first run of the system test. |
Ah ok, but it would be modifying service deployers. Would you prefer to add an agent to the compose deployer, or to add a new deployer for these use cases? |
It looks like it depends on the final infrastructure setup. Not sure if that option will work for @marc-gr and Windows containers.
This option seems to be pluggable and flexible in terms of specific configuration properties or OS-specific logic. It also has an extra benefit: it avoids copying custom agent code to multiple places. I'm now wondering whether we are close to introducing a feature for using an agent under development. This way you could even use standalone builds. Maybe we should implement a proxy instead :) |
Discussed this offline with Marc; he is going to explore implementing something like #786, but as a new deployer, so the runner is not modified. This could cover the current auditbeat needs. We also discussed that we probably need something like VMs for system tests; this will be necessary to support running tests with Windows, or even with Linux if not enough privileges can be granted with containers for some use cases. |
For Linux, https://multipass.run/ is a good cross-platform solution as long as you are fine with only supporting Ubuntu VMs. For Windows there is no cross-platform equivalent; you have to provision cloud VMs. This is generally what we do in the Elastic Agent test framework, https://github.com/elastic/elastic-agent/blob/main/docs/test-framework-dev-guide.md. You can test locally against multipass Ubuntu VMs; otherwise we are provisioning Linux and Windows machines in the cloud. macOS VM support is TBD. It would be good if we could align the provisioning here with the agent framework so we aren't maintaining this functionality twice. The only quirk with the agent test framework is that it uses https://github.com/adam-stokes/ogc for provisioning; we'd prefer to use Terraform but we haven't implemented that yet. elastic/elastic-agent#2935 |
CC @blakerouse |
When enabling independent Elastic Agents, some packages take around 3 hours to finish their tests (mainly system tests). Added a new PR to allow creating a new Agent Policy for each test executed: #1866 This will allow us to:
|
Two new PRs created to change how test runners work in
These two PRs introduce two different interfaces to manage runners and tests:
|
Next step is adding support to run system tests in parallel in This work is being done in two different PRs:
This will allow us to run system tests in parallel in packages with a large number of system tests like |
Running some tests in this PR from integrations with just 2 packages ( Comparing times among the different settings:
CI builds:
|
I was thinking of closing this issue once this PR (#1909) is merged. All the support related to independent Elastic Agents and running system tests in parallel would be completed at that point. What would still be missing:
For that, a follow-up issue could be created to enable those features in the integrations repository. There will be some packages to update while doing so. At least, another issue could be created to run the system tests using the independent Elastic Agents by default. Could this be done as part of a different issue too? However, that means that developers would be triggering the tests using the Elastic Agent from the stack while the CI would be using the new independent Elastic Agents. If they want to run independent Elastic Agents, they would need to set the environment variable: WDYT about closing this one (once the PR is merged) in favor of creating those new issues? @jsoriano @kpollich |
Just to add to the previous comment, the docs should also be updated to cover these new settings. I'll update the current PR with the required docs changes: EDIT: updated in 419f8ea |
Yep, I mostly agree with closing this once we can enable independent agents more generally. But please take into account that one of the original motivations for this issue was to be able to run winlog tests, and we are still unable to run Windows agents for this. |
That's true. I could keep this issue open (since there are other issues already linked to this one) even if the PRs mentioned above are merged, until we can find time to work on adding support for running Elastic Agents in other OS or runtimes. |
Created package-spec release 3.2.0 (elastic/package-spec#764), which includes the definition of the new configuration files to enable or disable parallel system tests. |
As a summary of what has been achieved so far, with the latest pull requests linked to this issue merged, there is now support in
As a follow-up, I created this issue to enable these features in the integrations repository: It will be pending here to allow running Elastic Agents in other OS (e.g. Windows) or in other runtimes (VMs?). Updated title and description accordingly. |
Thanks for providing a summary of where we are today, @mrodm. I'm moving this into a quality sprint for now as we'll need to dedicate a large amount of time here if we prioritize adding cross-platform support to this new type of test. |
A use case where this feature could be helpful would be for the This would allow running the system tests with Elastic Agents on Linux distributions other than Ubuntu, e.g. Fedora, so it could be tested that the agent can collect the required logs from the rpm package manager. |
Some integrations might require elastic-agents with custom configurations for the agents or their containers, e.g. winlogbeat requires a Windows container, auditbeat requires special container capabilities, etc.
I initially created #786, which adds the ability to deploy custom agents as test services; the code specific to the Windows scenario is still missing.
I'm opening this thread to discuss other approaches that might avoid adding complexity to the test runner, if possible.
EDIT:
As mentioned in #787 (comment), there is now support in elastic-package to:
It will be pending here to allow running Elastic Agents in other OS (e.g. Windows) or in other runtimes (VMs?).