
Misc CI improvements #289

Closed
wants to merge 8 commits

Conversation

Member

@FFY00 FFY00 commented Feb 2, 2023

Just a couple of small things; it didn't seem worth opening separate PRs.

```diff
-  - python3 -m pip install .[test]
-  - python3 -m pytest --showlocals -vv
+  - python3 -m pip install .[test] pytest-codecov
+  - python3 -m pytest --showlocals -vv --cov --cov-report=xml:coverage-$CIRRUS_TASK_NAME.xml --codecov
```
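For context, a minimal sketch of how these commands could sit in a Cirrus task; the task name, container image, and the encrypted-token placeholder are illustrative assumptions, not the repository's actual .cirrus.yml:

```yaml
# Hypothetical Cirrus CI task; the task name and image are assumptions.
debian_task:
  container:
    image: debian:unstable
  env:
    # Assumption: the Codecov upload token is supplied via a
    # Cirrus-encrypted environment variable; ENCRYPTED[...] is a placeholder.
    CODECOV_TOKEN: ENCRYPTED[...]
  test_script:
    - python3 -m pip install .[test] pytest-codecov
    - python3 -m pytest --showlocals -vv --cov --cov-report=xml:coverage-$CIRRUS_TASK_NAME.xml --codecov
```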
Member

What's the point of reporting coverage data to codecov if we don't do anything with it? I would just get rid of the coverage reporting altogether.

Member Author

We do: https://github.com/seantis/pytest-codecov is a plugin that uploads the coverage data.

Member

I know, but where is the data visualized?

Member Author

https://app.codecov.io/gh/mesonbuild/meson-python/ and the badge in the README and docs. I'd like to improve the coverage, so IMO it's definitely useful.

Member

There must be something wrong with codecov.io: it does not point to the source code in this repository, and looking at the details for any given file just shows "There was a problem getting the source code from your provider. Unable to show line by line coverage."

I don't expect that running the code on different flavors of Linux will exercise different code paths, and this doubles the CI execution time. I prefer fast feedback loops. Are we sure we really need this? Does codecov provide a measure of how much each "measurement" from each test run contributes to the total coverage?

@FFY00 FFY00 commented Feb 2, 2023

Well, apparently debian-unstable just fixed itself 🤣

@dnicolodi dnicolodi commented Feb 2, 2023

@FFY00 I pushed the obvious fixes to the CI on main. Please rebase. Let me know if rebasing looks too tedious and I'll do it for you.

EDIT: you can get the rebased commits here: https://github.com/dnicolodi/meson-python/tree/pr289

@FFY00 FFY00 commented Feb 3, 2023

Next time please just push to my branch; it's easier and I don't mind 👍

@dnicolodi dnicolodi force-pushed the cov-igore-test-packages branch from b137ed7 to 14d23e6 on February 3, 2023 18:41
@dnicolodi

Sure. I'm never sure whether people mind me messing with their branches. Force-pushed now.

@FFY00 FFY00 added the enhancement New feature or request label Mar 1, 2023
@FFY00 FFY00 commented Mar 10, 2023

I think we should be okay to merge this now, right @dnicolodi?

@dnicolodi

I still think that reporting coverage from the different Linux flavors we are testing on is completely useless. There is no code in meson-python that executes only under conditions that depend on which Linux distribution it is running on (well, with the exception of a few lines in mesonpy._introspection, but that code is going away soon). Doubling the time it takes to run the CI for no reason really sucks.

@dnicolodi

Also, there are still commits that need to be squashed.

```diff
@@ -111,6 +111,11 @@ omit = [

 [tool.coverage.report]
 ignore_errors = true
+exclude_lines = [
+    '\#\s*pragma: no cover', # we need this because this field overrides the default
+    '^\s*raise NotImplementedError\b',
```
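Applied, the coverage section of pyproject.toml would presumably read roughly as below; the closing bracket is an assumption, and since the hunk adds five lines while the truncated snippet shows three, there may be a further pattern not visible here:

```toml
[tool.coverage.report]
ignore_errors = true
exclude_lines = [
    '\#\s*pragma: no cover',           # re-adds coverage.py's default, which setting this field overrides
    '^\s*raise NotImplementedError\b', # don't count placeholder code as missed lines
]
```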
Member

Why exclude raise NotImplementedError? Error conditions are supposed to be tested as well.

Member Author

Yes, but NotImplementedError is for something that is supposed to work eventually but doesn't currently, so it's different from a case where we actually intend to raise an error. Writing tests to check that something raises NotImplementedError doesn't seem very productive; I'm not really sure what actual value it brings in most situations.
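A hypothetical illustration of that distinction (the function and messages are made up, not taken from meson-python):

```python
def build_editable(wheel_directory):
    # A genuine error condition: this behaviour is part of the contract
    # and deserves a test.
    if not wheel_directory:
        raise ValueError('wheel_directory must be a non-empty path')
    # A placeholder for functionality that should work eventually but
    # doesn't yet; the '^\s*raise NotImplementedError\b' exclude pattern
    # keeps coverage.py from counting this line as a miss.
    raise NotImplementedError('editable installs are not supported yet')
```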

@FFY00 FFY00 commented Mar 10, 2023

Doubling the time it takes to run the CI for no reason really sucks.

I don't recall it being that bad when I checked, but we should definitely look at it before merging then. Thanks for pointing it out 👍

@dnicolodi

I usually need a good reason to do something, not a good reason not to do it. Can you explain what you plan to gain by running coverage on the different flavors of Linux?

@FFY00 FFY00 commented Mar 10, 2023

Well, this was triggered by the custom Debian code. Right now it still provides us valuable information: for example, in #280, where we are planning to remove the custom Debian code, coverage will tell us whether any of that code is actually being hit at the moment. We assume none of it is, but it's good to have data showing it.

I think that, due to the nature of the project, it being very dependent on the Python it runs on, it makes sense to have coverage running in Cirrus if it's not a big performance hit. Before, with the Debian-specific code, I'd have been okay with a moderate performance hit, but since that's not the case anymore, the value drops, and so does the sensible tradeoff.
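As an aside, whether the Debian-only code is ever executed can also be checked locally with standard pytest-cov flags; this is a generic invocation, not a command taken from the repository's CI:

```sh
# Run the test suite with line coverage for the mesonpy package and
# list the lines that were never executed.
python3 -m pytest --cov=mesonpy --cov-report=term-missing
```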

@dnicolodi

Right now it still provides us valuable information: for example, in #280, where we are planning to remove the custom Debian code, coverage will tell us whether any of that code is actually being hit at the moment. We assume none of it is, but it's good to have data showing it.

The actual data is provided by removing the code and observing that nothing breaks.

@FFY00 FFY00 commented Mar 10, 2023

Yeah, sure, but there might be edge cases that are not triggered by any test. This just gives you more visibility into the code and what's actually happening.

Anyway, like I said, right now I think this only makes sense if it doesn't have a big performance hit, which I think is reasonable, no?

@dnicolodi

Yeah, sure, but there might be edge cases that are not triggered by any test.

If the edge cases are not covered by any test, the test coverage of that code would be exactly zero. Therefore, I don't see how coverage data can inform any decision: whatever you do to that code, the fact that it is not tested does not change.

@rgommers (Contributor)

My 2c: I'd personally not bother with code coverage on more than one CI job; it's not worth it. Also, Codecov has a very bad track record, both security- and reliability-wise (often the numbers are just nonsensical, especially for diffs, because it compares PRs to main incorrectly). Adding it to multiple CI providers is effort and increases risk for very limited (if any) gains.

@FFY00 FFY00 commented Mar 14, 2023

If the edge cases are not covered by any test, the test coverage of that code would be exactly zero. Therefore, I don't see how coverage data can inform any decision: whatever you do to that code, the fact that it is not tested does not change.

Yes, but we don't have a 100% test coverage policy, so the coverage reports do provide information we might not get from the tests alone.

My 2c: I'd personally not bother with code coverage on more than one CI job; it's not worth it.

That depends on the project, but sure.

Also, Codecov has a very bad track record, both security- and reliability-wise (often the numbers are just nonsensical, especially for diffs, because it compares PRs to main incorrectly). Adding it to multiple CI providers is effort and increases risk for very limited (if any) gains.

I don't really think there's any risk; the only additional thing that could be exposed is the Codecov key, which only lets you upload coverage reports.

Regarding reliability: yes, Codecov has been quite bad, but we are already using it anyway.

Increased effort would be the main argument, but I already spent the time implementing it, and it shouldn't require much maintenance; if it does, we can just rip it out.

@FFY00 FFY00 closed this Mar 14, 2023
@dnicolodi

If the edge cases are not covered by any test, the test coverage of that code would be exactly zero. Therefore, I don't see how coverage data can inform any decision: whatever you do to that code, the fact that it is not tested does not change.

Yes, but we don't have a 100% test coverage policy, so the coverage reports do provide information we might not get from the tests alone.

This statement does not follow any logic: if some code is not tested it has zero test coverage, by definition. How can the lack of test coverage provide any information, other than the obvious one that it is not tested? If it is tested, the test coverage stays the same whether the code works as intended or not. The only thing that test coverage tells you is, unsurprisingly, the amount of code covered by the test suite. This can only be used to inform decisions about which tests to write.

@FFY00 FFY00 commented Mar 14, 2023

This statement does not follow any logic

Please refrain from making this kind of statement in the future if you can. It's not constructive or beneficial; the only thing it can do is raise tensions. Thanks ❤️

if some code is not tested it has zero test coverage, by definition. How can the lack of test coverage provide any information, other than the obvious one that it is not tested?

The issue is that some code paths might only be hit on certain systems, like Debian. Having the coverage metrics can tell us if code that used to be run isn't anymore, hinting that it might not be needed any longer, for example.

Anyway, this PR is closed. No need to keep wasting energy on it; I think we all have many other things to deal with that are more beneficial.
