RFE | Re-run the CI before officially merging a PR #3974

fidencio · 2021-09-16T10:40:37Z

We've hit issues in the past where a PR "A" is opened against one state of our tree, all tests pass, but due to another PR "B" being merged earlier, tests break after "A" getting merged.

A way to solve this would be to ensure that before merging a PR to our tree we run the tests again with this PR "merged" atop of the current HEAD.

Examples of such issues are:

This is something that will take advantage and also provide more safeguards to the green ci effort.

I'm not sure whether GitHub provides something we can easily use, but that's open for discussion.

cc @Jakob-Naucke @bergwolf @chavafg @GabyCT @ariel-adam @wainersm

Ah, by the way, CRI-O does something like that already.

Jakob-Naucke · 2021-09-16T10:45:39Z

Ah, by the way, CRI-O does something like that already.

By CRI-O, you mean our CRI-O Jenkins CIs?

This sounds sweet at first glance, but I suppose we have to die one of two deaths:

You click "merge", the actual merge happens 30 minutes later because tests have to be run (during which all other merges would have to be locked? because now we're in possible conflict again)
For every update to the base branch, tests are re-run (CI load, costs, and energy consumption go brrr)

fidencio · 2021-09-16T11:31:30Z

Ah, by the way, CRI-O does something like that already.

By CRI-O, you mean our CRI-O Jenkins CIs?

No, I mean CRI-O does that with the PRs going to their project.

This sounds sweet at first glance, but I suppose we have to die one of two deaths:

You click "merge", the actual merge happens 30 minutes later because tests have to be run (during which all other merges would have to be locked? because now we're in possible conflict again)

For every update to the base branch, tests are re-run (CI load, costs, and energy consumption go brrr)

Yep, but I think it's still worth the cost when we consider the time of n developers looking for a regression at some given point.

GabyCT · 2021-09-17T15:09:00Z

I am not sure about how much that will increase our costs and how limited we are in terms of budget
^ @chavafg
As everytime that we are running the CI we are spending and this and the daily main CI will impact

Jakob-Naucke · 2021-09-17T15:12:23Z

@GabyCT Well, I mean, if we follow the approach of

You click "merge", the actual merge happens 30 minutes later because tests have to be run (during which all other merges would have to be locked? because now we're in possible conflict again)

I guess it won't be all that bad (how many PR tests are we running? how many PRs are we merging? this is the difference we'd be looking at). The other approach... is going to be more. But we don't have to follow it.

GabyCT · 2021-09-17T15:20:40Z

Maybe something that we can do first is to create the baseline jobs for the CI and then @chavafg can analyze how much impact in the cost we have and then we can see how much will this have...everytime that a CI job is trigger is less money that we have

fidencio · 2021-09-17T15:45:19Z

Maybe something that we can do first is to create the baseline jobs for the CI and then @chavafg can analyze how much impact in the cost we have and then we can see how much will this have...everytime that a CI job is trigger is less money that we have

Totally, I don't think we should do this with the current state of our CI, but I think this will help to keep baseline green once we have this effort running.

wainersm · 2021-09-17T16:12:42Z

We've hit issues in the past where a PR "A" is opened against one state of our tree, all tests pass, but due to another PR "B" being merged earlier, tests break after "A" getting merged.

Thanks for bringing it up, this is a legit problem. In your experience, @fidencio, does it occurs very often on our project; or the two examples are just unfortunate ones (maybe #2655 just show how unlucky @Jakob-Naucke is :)

I'm not sure whether GitHub provides something we can easily use, but that's open for discussion.

I don't know if GitHub provides such as feature either. I never worked on a project that run CI tests at the moment someone hits the "merge" button. What seems more common is to run jobs after the merge actually.

In the "green CI" proposal we have the "maintainer" role. If we run the baseline jobs after each merge then the maintainers will immediately notice jobs that became red. They can immediately action on it (e.g. revert patches); it doesn't prevent but rather mitigate the problem. On another hand there is this concern of consuming our CI re$ources. That's why I started asking you @fidencio , if this problem happens often enough to justify the extra costs ...

All in all, I tend to work with "baby-steps" in mind. So I would implement the initial "green CI" proposal as is and observe over time the areas that could (and must!) be improved.

Notes: regarding the worries about our CI costs... my recent initiates to ease the execution of the jobs on developer's local machine can help us to save some good money. After this #3965 I think the Vagrantfile will be in good shape for broaden use; and I will continue to work on #3751 . After merging those two PR I should be probably advocating for our developers to stop running the CI jobs for the sake of development.

fidencio · 2021-09-17T18:19:37Z

We've hit issues in the past where a PR "A" is opened against one state of our tree, all tests pass, but due to another PR "B" being merged earlier, tests break after "A" getting merged.

Thanks for bringing it up, this is a legit problem. In your experience, @fidencio, does it occurs very often on our project; or the two examples are just unfortunate ones (maybe #2655 just show how unlucky @Jakob-Naucke is :)

This doesn't happen often, but I we haven't had many refactors happening often either.

I'm not sure whether GitHub provides something we can easily use, but that's open for discussion.

I don't know if GitHub provides such as feature either. I never worked on a project that run CI tests at the moment someone hits the "merge" button. What seems more common is to run jobs after the merge actually.

You've been working with OpenShift CI and I'm pretty sure you can have a better answer for that than I do.
What I see on CRI-O is that merges are not done by maintainers, but by a bot. So, there they enforce the need of an /approve and a /lgtm, once those are there and all the tests are green, their bot will merge the code, and at this time the CI is triggered again.

Do you think we could leverage the infra they use in our favour? :-)
Mainly considering that kata-containers is now (even indirectly) part of OpenShift?

In the "green CI" proposal we have the "maintainer" role. If we run the baseline jobs after each merge then the maintainers will immediately notice jobs that became red. They can immediately action on it (e.g. revert patches); it doesn't prevent but rather mitigate the problem. On another hand there is this concern of consuming our CI re$ources. That's why I started asking you @fidencio , if this problem happens often enough to justify the extra costs ...

I'd be interested to know the costs and then judge whether it's worth it or not. :-)

All in all, I tend to work with "baby-steps" in mind. So I would implement the iitial "green CI" proposal as is and observe over time the areas that could (and must!) be improved.

Yep, and in this case I think having the green-ci may be a pre-req for this one.

c3d · 2021-09-28T14:14:30Z

Do we have a project for the "green CI" yet? If so, this could be linked to it.

Jakob-Naucke · 2021-09-30T08:47:50Z

Thanks for bringing it up, this is a legit problem. In your experience, @fidencio, does it occurs very often on our project; or the two examples are just unfortunate ones (maybe #2655 just show how unlucky @Jakob-Naucke is :)

@wainersm maybe I'm just extra unlucky but we saw this again in kata-containers/kata-containers#2736.

fidencio added enhancement Improvement to an existing feature needs-review Needs to be assessed by the team. labels Sep 16, 2021

fidencio changed the title ~~Re-run the CI before officially merging a PR~~ RFE | Re-run the CI before officially merging a PR Sep 16, 2021

c3d removed the needs-review Needs to be assessed by the team. label Sep 28, 2021

c3d added the area/ci label Sep 28, 2021

Jakob-Naucke mentioned this issue Sep 30, 2021

cmd: Fix mismatched types in testModuleData kata-containers/kata-containers#2736

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFE | Re-run the CI before officially merging a PR #3974

RFE | Re-run the CI before officially merging a PR #3974

fidencio commented Sep 16, 2021

Jakob-Naucke commented Sep 16, 2021 •

edited

Loading

fidencio commented Sep 16, 2021

GabyCT commented Sep 17, 2021 •

edited

Loading

Jakob-Naucke commented Sep 17, 2021

GabyCT commented Sep 17, 2021

fidencio commented Sep 17, 2021

wainersm commented Sep 17, 2021

fidencio commented Sep 17, 2021

c3d commented Sep 28, 2021

Jakob-Naucke commented Sep 30, 2021

RFE | Re-run the CI before officially merging a PR #3974

RFE | Re-run the CI before officially merging a PR #3974

Comments

fidencio commented Sep 16, 2021

Jakob-Naucke commented Sep 16, 2021 • edited Loading

fidencio commented Sep 16, 2021

GabyCT commented Sep 17, 2021 • edited Loading

Jakob-Naucke commented Sep 17, 2021

GabyCT commented Sep 17, 2021

fidencio commented Sep 17, 2021

wainersm commented Sep 17, 2021

fidencio commented Sep 17, 2021

c3d commented Sep 28, 2021

Jakob-Naucke commented Sep 30, 2021

Jakob-Naucke commented Sep 16, 2021 •

edited

Loading

GabyCT commented Sep 17, 2021 •

edited

Loading