Add support for CheckM as alternative to BUSCO #350
…-sample rather than per-contig)
This PR is against the
CHANGELOG.md
Outdated
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### `Changed`

- [#340](https://github.com/nf-core/mag/pull/340) - Update to nf-core 2.6.1 `TEMPLATE`
- [#350](https://github.com/nf-core/mag/pull/350) - Adds support for CheckM as alternative bin completeness and QC tool (added by @jfy133).
Is there a new convention for how to state individual contributions? One could also think about changing the contributions section in the README.
Not exactly, but if you tag people in the README, on release they get shown as contributors to that specific release, which is a nice acknowledgment of the specific work they do (rather than just the general 'catch all' on the README). I did this on the last 2 MAG releases too:
https://github.com/nf-core/mag/releases/tag/2.2.1
https://github.com/nf-core/mag/releases/tag/2.2.0
But I'm happy to remove it if you don't like it.
I see, I just thought it would make sense to do this consistently
Can also do that! Tell me which way you prefer and I'll update accordingly :)
Did you do this in other pipelines as well? I don't mind, we can keep it like this
OK, I don't mind either way. I'd suggest we then mention each person for each PR in the CHANGELOG (that makes it easier to paste the changelog straight into the release notes with the mentions already there, rather than us having to go back and check each PR afterwards), but ultimately this is your pipeline so you can choose ;)
Hi, I was thinking again about this issue and whether I should add my name now or not, and in the end I would prefer not to start this (at least as long as it's not done nf-core wide), because then everyone who is not doing this would be excluded.
OK, well actually in a later PR (from yesterday), I've gone back through all the merged PRs for the upcoming release and I've added everyone:
So we won't have missed anyone now. Note that these tags only reflect what is displayed in each release (not the contributors throughout the entire project as a whole, which is still displayed here).
Things like this don't necessarily have to be nf-core wide, to be fair. But as this display of contributors in the release notes is pretty new anyway, I was sort of thinking maybe nf-core/mag could be the 'forerunner' of a nice practice ;).
But if you prefer, I could also do a poll on Slack to ask if people think we should do it nf-core wide or not?
I sort of like it because it makes people's contributions much more visible, and will help encourage people to keep contributing in the long run. But it's up to you! I can also remove all the tags from PR #350 again if you prefer.
Ah, you mean #380, sorry didn't see it.
A poll would be nice though, then we could see how to do it consistently. Even if it doesn't necessarily have to be consistent, it's just easier to stick to a certain way across multiple pipelines.
So don't remove it for now.
Oops yes, typo.
Poll here: https://nfcore.slack.com/archives/C043UU89KKQ/p1673621984306899
I will have a more thorough look here and also look at the python scripts asap :)
Hi @jfy133 and @d4straub, do you have any opinion about which of the checkm metrics should end up in the Ok, maybe the question is more: the output of Edit: will just use all
But I believe Alex only actually uses output of qa
Yes, I usually first run the
I actually have no preference for CheckM, because I am not used to it. Usually the more the merrier ;)
@alexhbnr what are your preferred columns?
Here is the list that I usually would have a look at:
@jfy133 maybe you can have a look if this seems alright to you. However, the
Python linting (
Maybe worth mentioning: the CheckM results "Bin Id" does not contain a ".fa" extension and I didn't change it, while for the other tools and summaries the name for the bin does contain it.
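The naming mismatch mentioned above matters as soon as CheckM results are joined with the other per-bin summaries. A minimal sketch of one way to reconcile the names, assuming the other tables are keyed by a column called `bin` and that bins end in `.fa` (the helper names, the `bin` and `depth` columns, and the sample values are hypothetical and are not taken from the pipeline's scripts):

```python
def normalize_bin_id(name, extensions=(".fa", ".fasta", ".fna")):
    """Strip one trailing FASTA extension so bin names from different tools match."""
    for ext in extensions:
        if name.endswith(ext):
            return name[: -len(ext)]
    return name


def merge_by_bin(checkm_rows, other_rows, checkm_key="Bin Id", other_key="bin"):
    """Join CheckM rows with rows from another summary on the normalized bin name.

    Both inputs are lists of dicts (e.g. from csv.DictReader). "Bin Id" is the
    CheckM column named in the discussion; "bin" is an assumed column name.
    """
    other_by_bin = {normalize_bin_id(row[other_key]): row for row in other_rows}
    merged = []
    for row in checkm_rows:
        key = normalize_bin_id(row[checkm_key])
        combined = dict(other_by_bin.get(key, {}))  # empty if no matching bin
        combined.update(row)
        merged.append(combined)
    return merged


# Made-up example rows, for illustration only.
checkm_rows = [{"Bin Id": "sample1.001", "Completeness": "95.2"}]
other_rows = [{"bin": "sample1.001.fa", "depth": "12.5"}]
merged = merge_by_bin(checkm_rows, other_rows)
```

Normalizing both sides at join time leaves the CheckM output itself untouched, which matches the choice above not to change "Bin Id".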
docs/usage.md
Outdated
It is _highly_ recommended to pass utilise the `--checkm_db` flag to pass pre-downloaded and uncompressed directory of required CheckM references.

If you do not use this, this will result in the tool detecting the database is missing, and download the files for EVERY CheckM process that is executed.
This should not be the case anymore, right?
Do you mean inside the output table? Otherwise I should have addressed all your other comments @skrakau :)
OK, for some reason the Black Python linting error is now coming up even though I didn't touch that ... when I run Black locally nothing changes... could you try running it and see if it fixes it for you @skrakau?
Yes
LGTM :)
Note this does not include the integration with the bin_summary reports etc. I will need someone else to contribute that, as it requires Python script updates (which I'm not really comfortable with).
Closes #326
PR checklist
- `nf-core lint`
- `nextflow run . -profile test,docker --outdir <OUTDIR>`
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).