Use pre-commit in CI as well to reduce duplication #11211
Hi @VannTen, great idea. @floryut may have the permission :-) |
I've added it to the project. See the linked PR (#11223).

Looks like pre-commit can be run from the CI, to avoid duplication as mentioned:

```yaml
build:
  image: python:3.8
  script:
    - pip install pre-commit
    - pip install .
    - pre-commit run --all-files
```

That way it can fail at an early stage and not trigger all the tests/builds if needed |
Looks like we should disable the auto-fixing PRs, since they mess with
the CLA stuff and all (and I'd rather have PR authors fix their issues
anyway). I think it's configurable by adding a pre-commit config file to the
repository.
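(A rough sketch of what that might look like, assuming pre-commit.ci's top-level `ci:` block in `.pre-commit-config.yaml` is the mechanism; the exact keys should be double-checked against the pre-commit.ci docs:)

```yaml
# Sketch: pre-commit.ci settings living alongside the existing hooks.
ci:
  autofix_prs: false            # don't push auto-fix commits to contributors' PRs
  autoupdate_schedule: monthly  # illustrative; how often hook revisions get bumped
# repos:
#   ... existing hooks stay as they are ...
```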
> Looks like the pre-commit can be run from the CI, to avoid duplication as mentioned:
> to add in .gitlab-ci.yml (the job snippet quoted above):
> That way it can fail at an early stage and not trigger all the tests/builds if needed
Well, if we also run pre-commit in GitLab CI, that's not avoiding duplication, is it? I'm not sure what you mean?
Besides, if we were to run it in GitLab CI (that's possible as well, I was simply eyeing pre-commit.ci because it seems less work for us), we should use a dedicated image and cache the envs, otherwise it's going to take a lot of unnecessary time (and parallelize the runs somehow, also).
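(For the caching part, a minimal sketch, assuming a dedicated job in .gitlab-ci.yml; the job name, image and cache key are illustrative, and pre-commit honors `PRE_COMMIT_HOME` for where it builds its hook environments:)

```yaml
# Sketch: cache pre-commit's hook environments between pipelines.
pre-commit:
  image: python:3.11             # illustrative image
  variables:
    PRE_COMMIT_HOME: ${CI_PROJECT_DIR}/.cache/pre-commit
  cache:
    key: pre-commit-cache
    paths:
      - .cache/pre-commit
  script:
    - pip install pre-commit
    - pre-commit run --all-files
```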
|
Which duplications are you talking about, then? |
I mean if we run it both in GitLab CI and in pre-commit.ci. (Though yeah, it's still deduplicating between .pre-commit-config.yaml and .gitlab-ci/, but we should probably not run them twice, I think?) |
My understanding is that some of the tests are duplicated, for example in:
.gitlab-ci/lint.yaml
Instead of having them duplicated, as you suggested, we can have all the pre-commit-config.yaml hooks run as one or a few jobs in the current pipeline by using the command `pre-commit run --all-files`. |
Yes, there are two different things we are talking about, I think:
1. deduplicating the checks themselves, so they are defined once in .pre-commit-config.yaml instead of repeated in .gitlab-ci/
2. where to run them: in GitLab CI or on pre-commit.ci

I think 1. is needed anyway, but for 2. I'm not sure what the preferred option is. pre-commit.ci is more rigid, but we don't have to maintain it. We should also check whether it blocks the tide merge correctly. |
Yes 100% with 1. |
> for 2., I tend to go for fewer systems rather than more. It will still be something to look at, configure, upgrade...
Upgrade, presumably not, but yeah, I see your point, that's valid.
> Another issue is that it's no longer part of the pipeline and tests will move to the next stage regardless of the pre-commit.ci result.
...
I actually think we should have fewer dependencies between jobs, but with a better cancellation story (aka, stopping pipelines immediately, or almost, if there is a new push).
The question in that case is of course garbage collection.
(For failure cases, it's less clear cut, because there is usually some value in getting other jobs output even when one fails)
If we could achieve that, it would mean faster pipelines and maximize resource utilization.
That's for later though, something I'd like to tackle in the (not too far, hopefully) future.
In any case, I think my linked PR should achieve the deduplication while keeping stuff in gitlab-ci.
I haven't converted everything, but that should not be that hard (for Vagrant, given HashiCorp's recent licensing shenanigans, we should probably drop it at some point, but for now I guess it can stay that way)
|
We don't want to spend "money" on tests for a PR that doesn't pass unit tests and/or fails to deploy a simple setup. How many pushes per PR on average? How many commits failed on early tests?
That's assuming resources are not utilized; they are. Each job in each stage is configured to run in parallel when possible. Sometimes they don't, because the limit has already been reached. We have a fixed set of resources / $ to use. |
> If we could achieve that, it would mean faster pipelines and maximize resource utilization.
> That's assuming resources are not utilized; they are.
They are, but not *always*, and that's where it could be improved. The
most annoying problem we regularly face is more related to using gitlab.com than to the cluster capacity itself:
if scheduling a pipeline would go over 500 jobs (in total), the pipeline
is not scheduled and gets an error. It's more problematic than just
waiting for CI, because the pipeline is not re-scheduled at all, and I
think that, plus the confusion with Prow, confuses a lot of contributors (see
the regular /retest comments, even though they don't do anything :/ )
> We have a fixed set of resources / $ to use.
That said, the pipeline can still be reorganized and improved, but an overall guideline is that it should not use much "more".
I don't want to use more resources, I want to use the same amount, but
faster (in other words, maximize resource utilization). The cluster does not autoscale, right ?
IMO, we should aim for:
- prioritizing fast-failing jobs
- auto-cancellation on re-push / failure (that would need a rework of the provisioning so that GC is independent from the GitLab CI jobs, I think. Maybe doable with ownerRefs in the cluster 🤔)
=> that would allow us to rely less on manual ordering / gating and use
as many resources as are available, with minimal waste.
Even better would be coscheduling (so that we only schedule a complete testcase inventory in a max-usage scenario). Since we now have it in Kubespray, that might be easier than I think.
|
Such changes were explored several years back (everything in a single stage) and rolled back due to too-high usage, and we had even more resources then than now: #2325
It does for some jobs, up to 12 machines. Otherwise, it's only one machine running it all, and it's already heavily used in terms of memory; of course there are low-usage periods in the day.
Last I checked, there's already a GC on schedule, so that's fine. And a new push on the same PR should cancel the previous pipeline (if not, it's misconfigured).
^ not sure I understand this |
@VannTen Once we remove the autoscaling part (all jobs with the tag "c3.small.x86"), things can be tested and remapped. |
> I don't want to use more resources, I want to use the same amount, but faster (in other words, maximize resource utilization). The cluster does not autoscale, right?

> It does for some jobs, up to 12 machines. Otherwise, it's only one machine running it all, and it's already heavily used in terms of memory; of course there are low-usage periods in the day.
Those are the docker-machine jobs?
I'm pretty sure we should get rid of those completely (in fact, I'd go as far as to use exclusively the Kubernetes executor).
docker-machine is abandoned (well, maintained by GitLab, but it's a dead end).
> auto-cancellation on re-push / failure (that would need a rework of the provisioning so that GC is independent from the GitLab CI jobs, I think. Maybe doable with ownerRefs in the cluster 🤔)

> Last I checked, there's already a GC on schedule, so that's fine.
Yeah, but the schedule is hourly, right? (that's the openstack-cleanup in https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/pipeline_schedules ?)
What I was thinking about was setting ownerRefs pointing to the job Pod on the created KubeVirt objects, so that once the job is over and the pod is discarded, the KubeVirt stuff is always GC'ed by Kubernetes immediately.
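(A rough sketch of the idea; the resource names and UID are illustrative, and the ownerReference has to point at an object in the same namespace as the created VM:)

```yaml
# Sketch: a KubeVirt VM created by a CI job carries an ownerReference to the
# GitLab runner job pod, so Kubernetes garbage-collects the VM as soon as the
# pod is deleted, independently of any scheduled cleanup.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: testcase-node-1              # illustrative name
  ownerReferences:
    - apiVersion: v1
      kind: Pod
      name: runner-abc123-project-0  # the CI job pod (illustrative)
      uid: 00000000-0000-0000-0000-000000000000  # must be the live pod's UID
spec:
  # ... usual VM template spec ...
```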
> And a new push on the same PR should cancel the previous pipeline (if not, it's misconfigured)
It does not work completely, because some jobs are marked `interruptible: false`, so once they have started the pipeline does not get cancelled.
I'm not sure if that's because it would leave stuff behind otherwise?
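(For reference, a minimal sketch of the knob in question; the job name and script path are illustrative:)

```yaml
# Sketch: a job that GitLab may auto-cancel when a newer pipeline starts on
# the same ref. With interruptible: false, a job that has already started
# keeps the whole pipeline from being auto-cancelled.
provision-vms:
  interruptible: true
  script:
    - ./tests/scripts/provision.sh   # hypothetical path
```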
> Even better would be coscheduling (so that we only schedule a complete testcase inventory in a max-usage scenario). Since we now have it in Kubespray, that might be easier than I think.

> ^ not sure I understand this
Basically, if we have two jobs which each provision several KubeVirt VMs:
jobA creates kubevirt-A-1, kubevirt-A-2, kubevirt-A-3
jobB creates kubevirt-B-1, kubevirt-B-2, kubevirt-B-3
When the cluster is full, it's possible to get into a state where
kubevirt-A-1, kubevirt-A-2, kubevirt-B-1 are scheduled -> Running
kubevirt-A-3, kubevirt-B-2, kubevirt-B-3 are Pending because of node capacity
Thus neither jobA nor jobB can progress, but they still block resources :/
Coscheduling[1] would ensure that a set of pods is either scheduled all together or not at all (well, until more resources are freed). So it minimizes the impact of "pending" jobs.
[1]: https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/pkg/coscheduling/README.md
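(A rough sketch of what that could look like with the scheduler-plugins coscheduling plugin; names and image are illustrative, and it assumes the plugin's scheduler is deployed in the CI cluster:)

```yaml
# Sketch: a PodGroup requiring all 3 pods of a testcase to be schedulable
# together before any of them is admitted.
apiVersion: scheduling.x-k8s.io/v1alpha1
kind: PodGroup
metadata:
  name: testcase-job-a            # illustrative name
spec:
  minMember: 3                    # all kubevirt-A-* pods, or nothing
---
apiVersion: v1
kind: Pod
metadata:
  name: kubevirt-a-1
  labels:
    scheduling.x-k8s.io/pod-group: testcase-job-a  # ties the pod to the group
spec:
  schedulerName: scheduler-plugins-scheduler       # the plugin's scheduler
  containers:
    - name: launcher
      image: registry.example.com/launcher:latest  # placeholder image
```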
|
Yes
Sounds good
I don't know. I suggest removing `interruptible: false` and evaluating the situation.
Yes; also, the VM / docker-machines had an automatic teardown after 4h (I reduced it to 1h last week)
Yes, the idea was in the air at some point. Not sure why it wasn't implemented, could be lack of time from maintainers or a blocker that I'm not aware of. Worth investigating again |
What would you like to be added
Switch some of our CI (all the tests mirroring our pre-commit hooks) to pre-commit.ci
Why is this needed
This would avoid duplication between .pre-commit-config.yaml and our .gitlab-ci/ files
It also frees some of our CI resources.
We should check if pre-commit.ci works correctly with the GitHub status API (Prow relies on that to merge PRs), and we probably don't want the autofix PR stuff.
I don't know who can add the app to the repository though, are approvers also repository admins ?
@yankay @mzaian @floryut