persistent worker: exit after a worker ran a single task #809

Open
0xB10C opened this issue Nov 12, 2024 · 3 comments · May be fixed by #813
Labels
enhancement (New feature or request), question (Further information is requested)

Comments

@0xB10C

0xB10C commented Nov 12, 2024

To avoid keeping (possibly malicious) state after a task has finished, we'd like to stop and remove an ephemeral container/VM we use to run cirrus worker run --token <token>. This could be done by exiting the process after a task finishes and the results are reported back to Cirrus. Having an option along the lines of cirrus worker runonce or cirrus worker run --one-task would be good. We'd make sure to spin up a new container/VM with a worker that registers for another single run after that.

I don't think this is currently supported, or am I missing something? Do you see any problems with this approach (e.g. churn on your persistent runner registration interface)?
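
To make the requested behaviour concrete, here is a minimal sketch of a worker loop with a run-once switch. It is not the cirrus worker implementation: `run_worker`, `poll_task`, `run_task`, `report` and `max_polls` are hypothetical names used only for illustration, and the real worker may be structured quite differently.

```python
from typing import Callable, Optional


def run_worker(
    poll_task: Callable[[], Optional[str]],
    run_task: Callable[[str], None],
    report: Callable[[str], None],
    one_task: bool = False,
    max_polls: int = 100,
) -> int:
    """Poll for tasks and run them; returns the number of completed tasks.

    With one_task=True the loop ends right after the first task has been
    run and reported, which is the behaviour asked for in this issue.
    max_polls only bounds the sketch so that it always terminates.
    """
    completed = 0
    for _ in range(max_polls):
        task = poll_task()
        if task is None:
            continue  # nothing scheduled yet, keep polling
        run_task(task)
        report(task)  # results go back to the service before exiting
        completed += 1
        if one_task:
            break  # the caller can now tear down the container/VM
    return completed
```

The point of the sketch is the ordering: the exit happens only after the result has been reported, so tearing down the container/VM cannot lose a task result.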

@fkorotkov
Copy link
Contributor

Will the isolation feature work for you? It supports https://github.com/cirruslabs/vetu too.

How do you orchestrate those VMs/containers?

fkorotkov added the enhancement and question labels on Nov 12, 2024
@0xB10C
Author

0xB10C commented Nov 16, 2024

Thanks! I've spent the past few days having a look at vetu and set up an example infrastructure with it. While the CI runs and works, I'm missing a few features from vetu. I understand that it's probably still in an early stage of development.

  • With tart and containers you are able to mount volumes and shared directories for e.g. caches. With vetu you can't do this yet.
  • In general, you can't set limits on the vetu VM disk size, memory size, or CPU count on the host side. These are all set in the .cirrus.yml in the repository. If someone (maliciously) increases these in a PR, the cirrus runner would spawn a VM that takes all available CPU cores, uses so much memory that the cirrus-worker crashes and other tasks fail too, or fills up the disk and makes the system unusable.
  • vetu doesn't seem to have many users yet, the documentation is a bit sparse, and it's hard to control VM details
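
The host-side limit described in the second bullet could look roughly like the sketch below: whatever the repository's .cirrus.yml requests is clamped to a cap configured on the host. This is an assumption about how such a feature might work, not something vetu or the cirrus worker provides according to this thread; `Resources` and `clamp` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu: int
    memory_mb: int
    disk_gb: int


def clamp(requested: Resources, host_cap: Resources) -> Resources:
    """Return the requested resources, limited field by field to the host cap."""
    return Resources(
        cpu=min(requested.cpu, host_cap.cpu),
        memory_mb=min(requested.memory_mb, host_cap.memory_mb),
        disk_gb=min(requested.disk_gb, host_cap.disk_gb),
    )


# A PR that asks for far more than the host is willing to give
# would be cut down to the cap instead of starving other tasks.
cap = Resources(cpu=4, memory_mb=8192, disk_gb=50)
print(clamp(Resources(cpu=64, memory_mb=262144, disk_gb=500), cap))
```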

From the vetu README:

We say effortlessly, because the existing virtualization solutions like the traditional QEMU and the new-wave Firecracker and Cloud Hypervisor provide lots of options and require users to essentially build a tooling on top of them to be able to simply run a basic VM.

I think, in our case, we'd want to have this control over the VMs and provide these options to the VMs ourselves.

How do you orchestrate those VMs/containers?

After having tried a few options, our current plan would be to have a few fixed-size (CPU, memory, disk space) QEMU/cloud-hypervisor/firecracker VMs on a single host. These can then share cache directories. Inside each VM, we'd run a persistent worker. With something like cirrus worker run --one-task, the cirrus process would end after completing a single task and we'd trigger a VM shutdown. Before bringing it up again, the VM's volumes would be re-created. Once back up, it will be waiting in standby for a new task or pick up an existing task.
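
A rough sketch of that lifecycle, split into the part that runs inside the VM and the part that runs on the host. Several things here are assumptions: `--one-task` is only the flag proposed in this issue and does not exist yet, `./recreate-volumes.sh` and `./boot-vm-and-wait.sh` are hypothetical placeholder scripts for whatever QEMU/cloud-hypervisor/firecracker tooling is used, and the command runner is passed in so the flow can be exercised without real VMs.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command line and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run_cmd(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def guest_entrypoint(token: str, run: Runner = run_cmd) -> None:
    """Inside the VM: run exactly one task, then power the VM off."""
    # --one-task is the flag proposed in this issue, not an existing option.
    run(["cirrus", "worker", "run", "--token", token, "--one-task"])
    # Power off regardless of the worker's exit code so no state lingers.
    run(["systemctl", "poweroff"])


def host_loop(vm: str, cycles: int, run: Runner = run_cmd) -> None:
    """On the host: give the VM fresh volumes before every boot."""
    for _ in range(cycles):
        run(["./recreate-volumes.sh", vm])  # hypothetical helper script
        run(["./boot-vm-and-wait.sh", vm])  # hypothetical; blocks until the VM powers off
```

Recreating the volumes before each boot, rather than after shutdown, means a host crash between cycles still cannot hand stale state to the next task.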


I'm happy to work on an implementation of --one-task, give it a try, and then PR it here.

@fkorotkov
Contributor

Thank you for the details. We use vetu for the Linux part of Cirrus Runners and it's been working great! But yeah, folder mounts are not yet supported.

It seems this functionality could live under an --ephemeral flag, similar to GitHub Actions. PRs are always appreciated!

0xB10C linked a pull request on Nov 26, 2024 that will close this issue
2 participants