[Action] Decide framework for benchmark tests #16
Comments
Excellent outline @nikimanoledaki - Favoring the self-hosted GitHub Action runners option. @maxgio92 is our infra management expert.
Wonderful - we'll start on this immediately. In the meantime, we will need your help with the following requirements:
Also - @incertum, are you currently using the microservice demo that is currently deployed on the cluster for these stress tests, or are you planning to use it? Or can we remove it from the cluster for now? We can just comment it out so that Flux stops reconciling it. Please let me know :)
Hi @nikimanoledaki, I'd propose guaranteeing quality of service for the benchmark jobs and the ARC. WDYT?
Hi @maxgio92! 👋 @rossf7 has been working on provisioning an isolated worker node for Falco.
Please let us know if you have suggestions on any further isolation that could help with the benchmark tests :) I'm not 100% sure if it would be best for the ARC runner Pod to run on the system node or the Falco-only node. I don't think it should run in the test environment - running everything ARC-related on one of the system nodes would be better. WDYT? 🤔
Hi @maxgio92,
Yes, we will provision dedicated nodes for Falco using the labels defined in #2. This is done via our tofu automation.
@nikimanoledaki I think it would be better to run the ARC pods on our system node, to keep the nodes we're collecting measurements on as isolated as possible. If we get short on resources we could move some of our internal components to the control plane node.
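To make the node-isolation idea above concrete, here is a minimal sketch of pinning Falco to the dedicated node with a node label and taint. The label and taint names are hypothetical placeholders; the real labels are the ones defined in #2 and applied by the tofu automation.

```yaml
# Sketch only: keep the measured node dedicated to Falco.
# "cncf-project" and "dedicated" below are hypothetical label/taint names.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: falco
  namespace: falco
spec:
  selector:
    matchLabels:
      app: falco
  template:
    metadata:
      labels:
        app: falco
    spec:
      nodeSelector:
        cncf-project: falco          # schedule only onto the dedicated node
      tolerations:
        - key: dedicated             # tolerate the taint that keeps other pods away
          operator: Equal
          value: falco
          effect: NoSchedule
      containers:
        - name: falco
          image: falcosecurity/falco:latest
```

Because the ARC runner pods and other system components would not carry this toleration, the scheduler keeps them off the measured node.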
Thanks @rossf7 and @nikimanoledaki!
We are not using it yet, but yes please keep it deployed. Much appreciated!
Hi @incertum 👋
During our last discussion, we were not sure about the goal of this microservices deployment. It was also noticed that there's a stress test Deployment shipped ([1] and [2]).
We previously discussed that for a v1 we will use the following synthetic workloads:
Hi, we added a lot of new documentation to our website (https://falco.org/) explaining what Falco does and how it works, if you are interested in more details. Falco is a Linux kernel security monitoring tool, passively hooking into syscall tracepoints. The more syscalls that happen on a server, the more work Falco has to do (simplified). Notably, Falco does not interact with the synthetic workloads; rather, we use them to increase the frequency of syscalls, thereby making our testbed resemble real-life production environments where a diverse set of applications runs 24/7. What additional questions do you have for us?
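To illustrate the point about syscall frequency, here is a minimal sketch of a synthetic workload that simply raises the syscall rate on the node Falco is watching. The image, stressor mix, and resource names are illustrative assumptions, not the working group's agreed configuration.

```yaml
# Sketch only: a synthetic workload whose sole purpose is to generate syscalls
# so that Falco has realistic work to do during a measurement window.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: syscall-load                # hypothetical name
  namespace: falco
spec:
  replicas: 1
  selector:
    matchLabels:
      app: syscall-load
  template:
    metadata:
      labels:
        app: syscall-load
    spec:
      containers:
        - name: stress-ng
          image: colinianking/stress-ng   # example public stress-ng image
          args:
            - "--cpu"
            - "2"                   # CPU stressors
            - "--io"
            - "2"                   # I/O stressors generate read/write/sync syscalls
          # with no --timeout, the stressors run until the pod is deleted
```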
A few questions provided by @roobre :)
Thank you @incertum for the clarifications, that is really helpful!
Thanks @raymundovr & @incertum. Rewording my questions for clarity:
For example, we discussed specific syscalls from I/O or networking in the past. However, we're doing …
This uses … A different way to do this would be with …
I'm trying to understand if we want to log the type of stressor as a variable. Does it matter? Or does it not matter as long as the target kernel event rate is reached?
This is just for me to understand how we're setting up the benchmark tests, but I fully trust @incertum and team with owning the test scenarios etc. Thank you! :)
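One way to record the stressor type as an explicit variable would be to parameterise the benchmark run, for example with a job matrix, so that every result is labelled with the stressor class that produced it. This is only an illustration; the workflow name, stressor classes, and steps are assumptions, not an agreed design.

```yaml
# Sketch only: run the benchmark once per stressor class and log the class
# alongside the measurements.
name: falco-benchmark-matrix        # hypothetical workflow name
on:
  workflow_dispatch:
jobs:
  benchmark:
    strategy:
      matrix:
        stressor: [cpu, io, network]    # illustrative stressor classes
    runs-on: ubuntu-latest
    steps:
      - name: Run benchmark for one stressor class
        run: |
          echo "stressor=${{ matrix.stressor }}"
          # deploy the synthetic workload for this stressor class, wait for the
          # measurement window, then export metrics labelled with the class
```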
Hi, I am trying to sum up here a very interesting discussion we had around the proposal for the benchmark test in the public Slack channel of the working group. Thanks @leonardpahlke for suggesting public runners, @nikimanoledaki for steering the discussion, and all the others participating: @rossf7, @dipankardas011. This is the 3rd proposal: Modular GitHub Action workflow (public runners). Here you can find an overview drawn by @leonardpahlke. Workflow:
News:
This approach emphasizes sustainability, collaboration, and operational simplicity, which are crucial for the ongoing success and scalability of the green-reviews-tooling initiative.
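As a rough illustration of the modular idea, the pipeline could be split into small, independent jobs triggered from the Falco release process via workflow_dispatch. The workflow name, input, and job steps below are placeholders, not the agreed design.

```yaml
# Sketch only: modular workflow running on public (GitHub-hosted) runners.
name: green-review-benchmark        # hypothetical workflow name
on:
  workflow_dispatch:
    inputs:
      falco_version:
        description: Falco version to benchmark
        required: true
jobs:
  deploy-falco:
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploy Falco ${{ inputs.falco_version }} to the test cluster"
  run-benchmark:
    needs: deploy-falco
    runs-on: ubuntu-latest
    steps:
      - run: echo "start the synthetic workloads and wait for the measurement window"
  collect-metrics:
    needs: run-benchmark
    runs-on: ubuntu-latest
    steps:
      - run: echo "query Prometheus/Kepler and publish the results"
```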
@nikimanoledaki I have responded here #13 (comment) to your feedback regarding the synthetic workloads composition, thank you!
@AntonioDiTuri 🚀 thank you very much for taking the time and posting an update here #16 (comment). Amazing, we are looking forward to receiving clearer templates or instructions. As a heads-up, we need to be mindful of @maxgio92's availability as well, not just mine, since Max is our infra expert and we will need his help 🙃. Some initial feedback:
Issues go stale after 90d of inactivity. Mark the issue as fresh with `/remove-lifecycle stale`. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with `/close`. Provide feedback via https://github.com/falcosecurity/community. /lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh with `/remove-lifecycle rotten`. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with `/close`. Provide feedback via https://github.com/falcosecurity/community. /lifecycle rotten
/remove-lifecycle rotten
Issues go stale after 90d of inactivity. Mark the issue as fresh with `/remove-lifecycle stale`. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with `/close`. Provide feedback via https://github.com/falcosecurity/community. /lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh with `/remove-lifecycle rotten`. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with `/close`. Provide feedback via https://github.com/falcosecurity/community. /lifecycle rotten
Rotten issues close after 30d of inactivity. Reopen the issue with `/reopen`. Mark the issue as fresh with `/remove-lifecycle rotten`. Provide feedback via https://github.com/falcosecurity/community. /close
@poiana: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Motivation
As part of the CNCF Green Reviews WG's milestone for KubeCon+CloudNativeCon Europe '24, our main goal is to create the first benchmark test for Falco.
Feature
Proposal 1: Self-hosted runners with Actions Runner Controller (ARC)
Self-hosted GitHub Action runners could help us achieve this, specifically via the Actions Runner Controller (ARC). We could add a self-hosted runner in this repo (falcosecurity/cncf-green-review-testing) so that the Falco maintainers have ownership of the benchmark tests. The benchmark tests can then be run in the cluster where Kepler and Prometheus are running and collecting energy metrics, along with other metrics for the SCI.
Stretch: the workflow could be triggered when there are new releases of Falco. GitHub Action workflows can be triggered by the build pipeline through a workflow_dispatch.
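For concreteness, here is a minimal sketch of what the ARC side of Proposal 1 could look like, assuming the community (summerwind) actions-runner-controller CRDs; the runner name, namespace, and label are placeholders.

```yaml
# Sketch only: a small pool of self-hosted runners registered against the repo.
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: benchmark-runners           # hypothetical name
  namespace: arc-runners            # hypothetical namespace
spec:
  replicas: 1
  template:
    spec:
      repository: falcosecurity/cncf-green-review-testing
      labels:
        - benchmark                 # referenced by runs-on in the workflow
```

The benchmark workflow would then declare `runs-on: [self-hosted, benchmark]` and a `workflow_dispatch` trigger so the Falco build pipeline can start it.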
Alternatives
Proposal 2: bash script that runs as a Kubernetes CronJob
We could create and maintain bash scripts that run the steps, and run these as Kubernetes CronJobs.
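A minimal sketch of Proposal 2, assuming a hypothetical benchmark script baked into an image; the schedule, image, and script path are placeholders.

```yaml
# Sketch only: run the benchmark steps on a schedule inside the same cluster
# where Kepler and Prometheus are collecting metrics.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: falco-benchmark             # hypothetical name
  namespace: falco
spec:
  schedule: "0 2 * * *"             # e.g. nightly at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: benchmark
              image: ghcr.io/example/benchmark-runner:latest      # placeholder image
              command: ["/bin/bash", "/scripts/run-benchmark.sh"] # placeholder script
```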
Additional context
Suggested steps
- … the `falco` namespace on the isolated worker node.
- … the `falco` namespace.

Benchmark Test Acceptance Criteria