Finding invalid extensions #678

Merged
merged 7 commits into from
Apr 22, 2022

qarmin (Owner) commented Apr 17, 2022

Support for finding files with invalid extensions

Algorithm

  • Get the original extension of the file
  • Check what content is inside the file (the detection returns a single extension)
  • Look up the MIME type for that detected extension
  • Load all extensions associated with that MIME type
  • Check whether the original extension is among the extensions for that MIME type
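
The steps above can be sketched roughly as follows. This is a minimal, std-only illustration, not the code from this PR: the `MIME_EXTENSIONS` table, `detect_extension`, `extensions_for`, `check_extension`, and the `Verdict` enum are hypothetical names, and the magic-number sniffing only stands in for whatever content-detection library the app uses. Treating a file with no extension or with undetectable content as `Unknown` is also an assumption.

```rust
use std::path::Path;

/// Hypothetical table: (MIME type, extensions commonly used for it).
const MIME_EXTENSIONS: &[(&str, &[&str])] = &[
    ("image/jpeg", &["jpg", "jpeg", "jpe"]),
    ("image/png", &["png"]),
    ("application/pdf", &["pdf"]),
];

/// Outcome of checking one file.
#[derive(Debug, PartialEq)]
enum Verdict {
    Valid,
    Invalid { detected: &'static str },
    Unknown,
}

/// Step 2 stand-in: sniff a few well-known magic numbers, return one extension.
fn detect_extension(content: &[u8]) -> Option<&'static str> {
    if content.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Some("jpg")
    } else if content.starts_with(b"\x89PNG\r\n\x1a\n") {
        Some("png")
    } else if content.starts_with(b"%PDF-") {
        Some("pdf")
    } else {
        None
    }
}

/// Steps 3 and 4: find the MIME type that lists `ext`, return all its extensions.
fn extensions_for(ext: &str) -> Option<&'static [&'static str]> {
    MIME_EXTENSIONS
        .iter()
        .find(|(_, exts)| exts.iter().any(|e| *e == ext))
        .map(|(_, exts)| *exts)
}

/// Steps 1 and 5: compare the original extension with the allowed ones.
fn check_extension(path: &Path, content: &[u8]) -> Verdict {
    // Step 1: original extension, lowercased for comparison.
    let original = match path.extension().and_then(|e| e.to_str()) {
        Some(e) => e.to_ascii_lowercase(),
        None => return Verdict::Unknown, // assumption: nothing to compare
    };
    let detected = match detect_extension(content) {
        Some(d) => d,
        None => return Verdict::Unknown, // assumption: content not recognized
    };
    match extensions_for(detected) {
        Some(exts) if exts.iter().any(|e| *e == original.as_str()) => Verdict::Valid,
        Some(_) => Verdict::Invalid { detected },
        None => Verdict::Unknown,
    }
}
```

Under these assumptions, JPEG bytes stored as `a.jpeg` pass, because `jpeg` shares a MIME type with the detected `jpg`, while the same bytes stored as `a.png` would be reported with `jpg` as the detected extension.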

Because the libraries used do not detect extensions with 100% accuracy, several workarounds are included to reduce the number of incorrectly flagged files.

The app can still show invalid results, especially for system files or lesser-known extensions, so I suggest using it only on your own well-known files, e.g. photo collections.

TODO
Button to rename files (not sure how it should work, because most files have multiple possible proper extensions)

Fixes #611
Related to #651

qarmin added the enhancement (New feature or request) label Apr 17, 2022
qarmin force-pushed the find_invalid_extensions branch from f3f93de to 884cf7e April 19, 2022 04:55
qarmin force-pushed the find_invalid_extensions branch from 7e65827 to a8caf95 April 19, 2022 19:13
qarmin (Owner, Author) commented Apr 19, 2022

HELP NEEDED IN FINDING A PROPER LIBRARY

Well,
The entire logic is in place, but I didn't properly read the description of the mime_guess crate: "MIME/MediaType guessing by file extension." XD (It was quite strange that ~500000 checked files had proper extensions.)

So I'm looking for a replacement for that library.
Currently it looks like this library might be suitable:

qarmin marked this pull request as ready for review April 21, 2022 05:31
qarmin merged commit 8e4e1f5 into master Apr 22, 2022
qarmin deleted the find_invalid_extensions branch April 22, 2022 19:46
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Music Duplicates doesn't search AAC(.m4a) files anymore