Let PR workflow check file sizes #4005
cf7727a to ea288f5
I get a bit different numbers:
$ find . -size +500k | grep -v '^./.git/' | wc -l
324
$ find . -size +1M | grep -v '^./.git/' | wc -l
153
These are the worst offenders:
$ find . -size +10M | grep -v '^./.git/' | sort | xargs ls -1sh
24M ./tools/feelnc/test-data/genome_chr38.fa
62M ./tools/gatk4/test-data/chr20.fa
15M ./tools/kallisto/test-data/cached_locally/sacCer2_chrX.kallistei
20M ./tools/khmer/test-data/test-abund-read-2.oxlicg
11M ./tools/maxbin2/test-data/3/out.reassem/out.reads.noclass
13M ./tools/maxbin2/test-data/interleavedPE_unmapped_Sample3_small.fasta
15M ./tools/nanoplot/test-data/alignment.bam
47M ./tools/pangolin/test-data/2021-04-21/data/decision_tree_rules.txt
46M ./tools/pangolin/test-data/2021-04-21/data/lineages.metadata.csv
14M ./tools/pygenometracks/test-data/Li_et_al_2015.h5
22M ./tools/ucsc_blat/test-data/amaVit1_Gallus/amaVit1.fa
Co-authored-by: Nicola Soranzo <[email protected]>
Can you add a (temporary) commit adding a large test file to see it at work?
Excellent idea. I would expect at least the following files to be reported (they seemed to be the smallest file larger than 500k and the largest file in the repo):
…as/tools-iuc into topic/ci-file-sizes
Looks good:
If you like, I could add the size for each file. The artifact is also produced.
Co-authored-by: Nicola Soranzo <[email protected]>
Nice!
Indeed. I could also imagine that there are tools where it is hard or impossible to find smaller test data. Then we should probably still accept it. Is there a way to still be able to merge if this job fails? Just remove it from Comment is also an idea.
Co-authored-by: Nicola Soranzo <[email protected]>
It may very well be that the comment won't work until this gets merged.
@nsoranzo any idea why the workflow isn't running anymore?
It seems workflows are running now.
This reverts commit 56409e0.
From https://github.com/peter-evans/create-or-update-comment:
seems not possible
Large test data should be avoided, so let's automate this check in the PR workflow. Sometimes large files can even create problems: galaxyproject/galaxy#12604
At the moment we have 410 files > 500k and 225 files > 1M
We can argue whether we want determine-success to fail if large files are found.
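For illustration, the kind of check discussed here could be sketched as the shell function below. It is only a sketch under stated assumptions, not the step this PR actually adds: the function name `list_large_files`, its arguments, and the 500k default are hypothetical, and it uses `-name .git -prune` rather than the `grep -v` filter from the commands in the conversation.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the PR's actual workflow step.
# Lists regular files above a size threshold, skipping any .git directory,
# and returns a non-zero status if at least one such file is found.
list_large_files() {
    local dir="${1:-.}"           # directory to scan (assumed default: current directory)
    local threshold="${2:-500k}"  # any size accepted by `find -size`, without the leading +
    local found
    # -prune stops find from descending into .git; only regular files are reported.
    found=$(find "$dir" -name .git -prune -o -type f -size "+$threshold" -print | sort)
    if [ -n "$found" ]; then
        echo "Files larger than $threshold:"
        echo "$found"
        return 1
    fi
    echo "No files larger than $threshold."
    return 0
}
```

Calling `list_large_files . 500k` from the repository root would print the offending paths and return status 1 if any exist, so a CI step running it would fail. Whether that failure should also make the overall PR check fail is the open question raised above.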