Misc CI improvements #289
Conversation
- python3 -m pip install .[test]
- python3 -m pytest --showlocals -vv
- python3 -m pip install .[test] pytest-codecov
- python3 -m pytest --showlocals -vv --cov --cov-report=xml:coverage-$CIRRUS_TASK_NAME.xml --codecov
What's the point of reporting coverage data to codecov if we don't do anything with it? I would just get rid of the coverage reporting altogether.
We do: https://github.com/seantis/pytest-codecov is a plugin to upload the coverage data.
I know, but where is the data visualized?
https://app.codecov.io/gh/mesonbuild/meson-python/ and the badge in the README and docs. I'd like to improve the coverage, so IMO it's definitely useful.
There must be something wrong with codecov.io: it does not point to the source code in this repository, and looking at the details for any given file just shows "There was a problem getting the source code from your provider. Unable to show line by line coverage."
I don't expect that running the code on different flavors of Linux will exercise different code paths, and this doubles the CI execution time. I prefer fast feedback loops. Are we sure we really need this? Does codecov provide a measure of how much each "measurement" from each test run contributes to the total coverage?
Well, apparently debian-unstable just fixed itself 🤣
@FFY00 I pushed the obvious fixes to the CI to main. Please rebase. Let me know if rebasing looks too tedious and I'll do it for you. EDIT: you can get the rebased commits here https://github.com/dnicolodi/meson-python/tree/pr289
Next time please just push to my branch, it's easier and I don't mind 👍
Force-pushed from b137ed7 to 14d23e6
Sure. I'm never sure whether people mind me messing with their branches. Force pushed now.
I think we should be okay to merge this now, right @dnicolodi?
I still think that reporting coverage from the different Linux flavors we are testing on is completely useless. There is no code in meson-python that executes only under conditions that depend on which Linux distribution it is running on (well, with the exception of a few lines in …)
Also, there are still commits that need to be squashed.
@@ -111,6 +111,11 @@ omit = [

[tool.coverage.report]
ignore_errors = true
exclude_lines = [
    '\#\s*pragma: no cover', # we need this because this field overrides the default
    '^\s*raise NotImplementedError\b',
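For illustration, a minimal hypothetical module (not from the meson-python tree) showing which lines those two patterns exclude. coverage.py treats each exclude_lines entry as a regex: a matching line is excluded from the report, and when the match is on a branch header the whole block under it is excluded as well.

def build_wheel(platform: str) -> str:
    if platform == 'unsupported':  # pragma: no cover
        # The 'if' line matches '\#\s*pragma: no cover', so coverage.py
        # excludes it together with this entire branch.
        raise RuntimeError(f'cannot build a wheel for {platform}')
    return 'wheel built'


def build_sdist() -> str:
    # Matches '^\s*raise NotImplementedError\b': the line is excluded,
    # so this untested stub does not show up as missing coverage.
    raise NotImplementedError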
Why exclude raise NotImplementedError? Error conditions are supposed to be tested as well.
Yes, but NotImplementedError is for something that is supposed to work eventually but doesn't currently, so it's different from something where we actually intend to throw an error. Writing tests to check that something raises NotImplementedError doesn't seem very productive; I'm not really sure what actual value it brings in most situations.
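For reference, such a test is nearly a one-liner with pytest; a minimal sketch, with build_sdist standing in as a hypothetical not-yet-implemented stub:

import pytest


def build_sdist() -> str:
    # Hypothetical stub for a code path that should work eventually.
    raise NotImplementedError


def test_build_sdist_not_implemented():
    # Asserts only that the stub raises; this is the kind of test the
    # comment above argues adds little value.
    with pytest.raises(NotImplementedError):
        build_sdist()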
I don't recall it being that bad when I checked, but we should definitely look at it before merging then. Thanks for pointing it out 👍
I usually need a good reason to do something, not a good reason not to do it. Can you explain what you plan to gain by running coverage on the different flavors of Linux?
Well, this was triggered by the custom Debian code. Right now it still provides us valuable information: for example in #280, where we are planning to remove the custom Debian code, it will tell us whether any of that code is actually being hit at the moment, which we assume will be none, but it's good to have data showing it. I think that due to the nature of the project, it being very dependent on the Python it runs on, it makes sense to have coverage running in Cirrus if it's not a big performance hit. Before, with the Debian-specific code, I'd have been okay with a moderate performance hit, but since that's not the case anymore, the value drops, and so does the sensible tradeoff.
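To make the Debian point concrete, a hypothetical sketch (the real meson-python code differs) of the kind of distribution-dependent branch whose per-distribution coverage would show whether it is ever hit. It assumes a Debian-patched Python, which adds extra sysconfig install schemes such as 'deb_system':

import sysconfig


def get_install_paths() -> dict:
    # Hypothetical Debian-only branch: only Debian-patched Pythons
    # expose a 'deb_system' scheme, so tests exercise this line only
    # when they run on a Debian-derived system.
    if 'deb_system' in sysconfig.get_scheme_names():
        return sysconfig.get_paths(scheme='deb_system')
    return sysconfig.get_paths()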
The actual data is provided by removing the code and observing that nothing breaks.
Yeah, sure, but there might be edge cases that are not triggered by any test. This just gives you more visibility over the code and what's actually happening. Anyway, like I said, right now I think this only makes sense if it doesn't have a big performance hit, which I think is reasonable, no?
If the edge cases are not covered by any test, the test coverage of that code would be exactly zero. Therefore, I don't see how coverage data can inform any decision: whatever you do to that code, the fact that it is not tested does not change.
My 2c: I'd personally not bother with code coverage on >1 CI job - it's not worth it. Also, Codecov has a very bad track record, both security- and reliability-wise (often the numbers are just nonsensical, especially for diffs, because it compares PRs to main incorrectly). Adding it to multiple CI providers is effort and increases risk for very limited (if any) gains.
Yes, but we don't have a 100% test coverage policy, so the coverage data can provide information we might not get from the tests alone.
That depends on the project, but sure.
I don't really think there's any risk: the only additional thing that could be exposed is the codecov key, which only lets you upload coverage reports. Regarding reliability, yes, codecov has been quite bad, but we are already using it anyway. Increased effort would be the main argument, but I already spent the time implementing it, and it shouldn't really require much maintenance; if it does, we can just rip it out anyway.
This statement does not follow any logic: if some code is not tested, it has zero test coverage, by definition. How can the lack of test coverage provide any information, other than the obvious one that it is not tested? If it is tested, the test coverage stays the same whether the code works as intended or not. The only thing that test coverage tells you is, unsurprisingly, the amount of code covered by the test suite. This can only be used to inform decisions about which tests to write.
Please refrain from making this kind of statement in the future if you can. It's not constructive or beneficial; the only thing it can do is raise tensions. Thanks ❤️
The issue is that the code might only be hit on certain systems, like Debian. Having the coverage metrics can tell us if code that used to be run isn't being run anymore, hinting that it might not be needed anymore, for example. Anyway, this PR is closed. No need to keep wasting energy on it; I think we all have many other things to deal with that are more beneficial.
Just a couple of small things; it didn't seem worth opening separate PRs.