Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding codecov causes spurious PR check failures #2727

Closed
jaraco opened this issue Jul 11, 2021 · 11 comments · Fixed by #2762
Closed

Adding codecov causes spurious PR check failures #2727

jaraco opened this issue Jul 11, 2021 · 11 comments · Fixed by #2762

Comments

@jaraco
Copy link
Member

jaraco commented Jul 11, 2021

In #2693, this project added codecov back in. Following that addition, I created a pull request (#2726) that didn't change any line of code but still somehow caused a check failure (indicating that code coverage was reduced by .04% and causing the checks to fail.

I'm hoping @tanvimoharir or @webknjaz can figure out how to either:

  • fix the root cause of these spurious check failures, or
  • devise a project-agnostic way to make the checks less sensitive (allowed to pass unless there are significant regressions in coverage).

I guess I'd also entertain simply disabling the ability for the checks to fail (report coverage, but never fail), which would be comparable to the prior behavior.

The reason I want the solution to be project-agnostic is because I'd like to be able to apply this technique across a suite of projects and I don't want each project to have to custom-tune the coverage in order to reach a reasonably quiet check. It's okay if a project wants to tune to be more sensitive, but the default should be to produce far less noise than signal.

@tanvimoharir
Copy link
Contributor

It does show coverage change:
https://app.codecov.io/gh/pypa/setuptools/compare/2726/changes

I will try to find the root cause (referring https://docs.codecov.com/docs/unexpected-coverage-changes)

jaraco added a commit that referenced this issue Jul 18, 2021
@tanvimoharir
Copy link
Contributor

tanvimoharir commented Jul 22, 2021

It seems like there are around 7 lines under setuptools/command/test.py which is causing this difference:
If you see lines 241-246 and 248 at
https://app.codecov.io/gh/pypa/setuptools/compare/2726/changes#D1L241

it shows that these lines were hit at the base but were not hit at head which is to say that disabling gcov introduced it?
I checked the build reports which were uploaded to codecov manually to confirm the same.
I'm trying to understand the test itself.

@webknjaz
Copy link
Member

I think what happened was a result of some environment change in CI. It may've been caused by the test env deps being unpinned as well.

If you looks at the older coverage here https://codecov.io/gh/pypa/setuptools/src/3f20c10f04c7777a3dafa3033be05cc07d9e0bb0/setuptools/command/test.py#L241 and the newer one here https://codecov.io/gh/pypa/setuptools/src/ef9b8dd0b12de5833a7967bea8719cd33cea216a/setuptools/command/test.py#L241, you may notice that in the "better covered" commit, there are just 2 hits on those lines (and 0 in the following commits).

This makes me think that the if-clause was True only in 2 jobs (or maybe just one job but two tests hit it). Then, some transitive deps got updated, or the GHA VMs got updates, and that conditional became False in all of the envs we have.

The coverage drop looks legit. Also, I must note that the coverage is not measured by Codecov, it's only reported there. One way to improve the debugability would be to store those XML reports as artifacts in the CI so we could download them and check locally next time this happens. We could also produce HTML coverage to include better context information on what lines were hit by which tests and so on.

@webknjaz
Copy link
Member

https://codecov.io/gh/pypa/setuptools/src/3f20c10f04c7777a3dafa3033be05cc07d9e0bb0/setuptools/command/test.py#L241 has flags at the top. I've toggled them and discovered that those lines were originally only covered by tests running against Python 3.6 under macOS. So that's what must've changed!

@webknjaz
Copy link
Member

I compared https://github.com/pypa/setuptools/runs/3038459747 and https://github.com/pypa/setuptools/runs/3040476882 but this doesn't give me any clue about what changed — they seem identical (OS, CPython, libs are all the same).

@webknjaz
Copy link
Member

FWIW it's possible to change the threshold in Codecov if you want it to be lower. It can be added to .codecov.yml that is already present in the project root.

@jaraco
Copy link
Member Author

jaraco commented Jul 23, 2021 via email

@webknjaz
Copy link
Member

I'm pretty sure that this loss was real.

@tanvimoharir
Copy link
Contributor

Should we go ahead with changing the threshold limit?

@webknjaz
Copy link
Member

webknjaz commented Aug 5, 2021

Yes. You may use this https://github.com/ansible-community/ansible-compat/blob/main/codecov.yml as a reference.

OTOH, it'd be useful to also figure out why that chunk stopped being covered and try to understand how to recover it.

@tanvimoharir tanvimoharir mentioned this issue Aug 30, 2021
2 tasks
@tanvimoharir
Copy link
Contributor

Yes. You may use this https://github.com/ansible-community/ansible-compat/blob/main/codecov.yml as a reference.

OTOH, it'd be useful to also figure out why that chunk stopped being covered and try to understand how to recover it.

I have added the same configurations for now (in a draft PR)
How much should the threshold value generally be?
Yes, I will try to investigate further into the failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants