[Feature] Add a tool to find broken files. #482
Codecov Report
@@            Coverage Diff             @@
##           master     #482      +/-   ##
==========================================
+ Coverage   78.01%   78.77%   +0.76%
==========================================
  Files         101      103       +2
  Lines        5617     5702      +85
  Branches      923      927       +4
==========================================
+ Hits         4382     4492     +110
+ Misses       1108     1088      -20
+ Partials      127      122       -5
I pushed a commit with some changes; please check it. @Ezra-Yu
I used verify_dataset.py to check my datasets, and it reported 1885 broken files. The datasets are organized in the ImageNet layout and all the images are JPG. I wonder why they were flagged as broken, and I would like to know how to handle these 1885 files so that they pass verify_dataset.py.
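One way to start narrowing down why files get flagged is to look at their raw bytes. The snippet below is only a rough sketch and not what verify_dataset.py does: `diagnose` and `SIGNATURES` are hypothetical names, it assumes every file is expected to be a JPEG, and it relies on byte-level heuristics (file signatures and the JPEG end-of-image marker) rather than actually decoding the image.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of a few common image formats.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
}


def diagnose(path):
    """Return a short, heuristic reason why `path` might fail to load as a JPEG."""
    data = Path(path).read_bytes()
    if not data:
        return "empty file"
    kind = next(
        (name for sig, name in SIGNATURES.items() if data.startswith(sig)), None
    )
    if kind is None:
        return "unrecognized header"
    if kind != "jpeg":
        return f"{kind} data, not jpeg"
    # A complete JPEG stream ends with the EOI marker FF D9
    # (trailing zero padding is ignored here).
    if not data.rstrip(b"\x00").endswith(b"\xff\xd9"):
        return "truncated jpeg (no end-of-image marker)"
    return "header and trailer look fine; decode the file to investigate further"
```

Running `diagnose` over the reported paths and counting the reasons should show whether the files are mostly empty, truncated, or stored in another format; files in the first two groups usually have to be re-downloaded or re-exported, while the last group can be converted.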
* add verify dataset
* add phase
* rm attr of single_process
* Use `mmcv.track_parallel_progress` to track the validation.

Co-authored-by: mzr1996 <[email protected]>
Motivation
After a dataset is prepared, some of its files may be broken. Add a tool to find all the broken files.
Modification
Add a verify_dataset.py tool
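The general idea of such a tool can be sketched as follows. This is not the code added in this PR: `find_broken_files` and `looks_like_complete_jpeg` are hypothetical names, the sketch uses a standard-library thread pool instead of `mmcv.track_parallel_progress`, and the default check only looks for the JPEG start and end markers as a cheap stand-in for really loading each image.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def looks_like_complete_jpeg(path):
    """Stand-in check: the file is readable and has JPEG start/end markers."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError:
        return False
    return data.startswith(b"\xff\xd8") and data.rstrip(b"\x00").endswith(b"\xff\xd9")


def find_broken_files(root, check=looks_like_complete_jpeg, workers=8):
    """Walk `root`, run `check` on every file, and return the sorted failing paths."""
    paths = [
        os.path.join(dirpath, name)
        for dirpath, _, names in os.walk(root)
        for name in names
    ]
    # Checks are I/O bound, so a thread pool is enough for this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(check, paths))
    return sorted(p for p, ok in zip(paths, results) if not ok)
```

Passing a different `check` callable, for example one that decodes the image with the same loader used for training, would make the result match what the training pipeline can actually read.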
BC-breaking (Optional)
No.
Use cases (Optional)
Checklist
Before PR:
After PR: