Similar videos: Failed to hash file, reason Too short #605
Comments
30s is limitation from used library - https://github.com/Farmadupe/vid_dup_finder/ |
I understand, thank you anyway for giving me this information! :) |
Hello! I am the author of vid_dup_finder_lib! I am currently drafting an update to vid_dup_finder_lib with support for videos shorter than 30 seconds. When that is done I will try to create a pull request to bring this feature into czkawka. It's a hobby project though, so I cannot estimate when I will be able to publish this update. (I think eventually it should be possible to support videos down to ~two seconds in length, as long as there are at least 64 frames of content.) |
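To make the "at least 64 frames" figure concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical and the threshold is simply the number quoted in the comment above; none of this is taken from vid_dup_finder_lib itself.

```python
import math

# Threshold quoted in the comment above; an assumption, not read from the library.
MIN_FRAMES = 64


def frame_count(duration_s: float, fps: float) -> int:
    """Approximate number of whole frames in a clip of the given length."""
    return math.floor(duration_s * fps)


def long_enough(duration_s: float, fps: float, min_frames: int = MIN_FRAMES) -> bool:
    """True if the clip would contain at least `min_frames` frames."""
    return frame_count(duration_s, fps) >= min_frames
```

Under this assumption a two-second clip passes at 50 FPS (100 frames) but not at 30 FPS (60 frames), which is why the limit is better stated in frames than in seconds.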
It's good to know, I understand, it was just a friendly question haha Thank you for your work! I'll be looking at the project! :D |
Can't wait for this update!!! |
All of my 2160p files also reported "Failed to hash file, reason Too short", even though those files were all longer than 10 min |
Same here, most of my videos are waaaaay longer than 30s, so the length doesn't seem to be the cause. @qarmin: Could this issue be re-opened? |
I agree |
Can anyone post an example video here? |
Does any such video have chapters? I think vid_dup_finder_lib is checking the length of the first chapter/stream, which may be short. Either way, it would be useful if an example video file can be provided. |
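One way to check the first-stream hypothesis above is to list the streams that ffprobe reports and compare the first stream with the first video stream. The sketch below assumes ffprobe is on the PATH; the helper names are hypothetical, and it does not reproduce how vid_dup_finder_lib actually reads metadata.

```python
import json
import subprocess
from typing import List, Optional


def probe_streams(path: str) -> List[dict]:
    """Run ffprobe and return its per-stream metadata as a list of dicts."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json", "-show_streams", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)["streams"]


def first_video_stream(streams: List[dict]) -> Optional[dict]:
    """Return the first stream whose codec_type is 'video', or None."""
    for stream in streams:
        if stream.get("codec_type") == "video":
            return stream
    return None
```

If `streams[0]` is an audio or timecode stream while `first_video_stream(streams)` has a different index, a reader that only looks at the first stream would see the wrong duration.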
Thank you. Fail - 50 FPS
Success 50 FPS
Fail - 59.940 FPS |
Thanks for the mediainfo output; there are some good clues about what may be the cause. The underlying library uses a commandline tool called
(You might get errors if your video filename contains spaces. If so, the easiest fix is to temporarily rename the video to a different name and run the command again) It would also be useful to see if there is any error when ffmpeg opens the file to extract the frames. If you are able, could you run this command and paste the output?
FWIW, the only way the 'too short' error can be generated is when FFMPEG exits before it has sent 10 frames to the underlying library. My current theories are:
|
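The "too short" condition described above (FFmpeg exiting before 10 frames have been received) can be illustrated with a small sketch. It assumes fixed-size raw frames arriving on a byte stream; the `TooShort` exception and `read_frames` function are hypothetical names, not the library's real types.

```python
import io

# Threshold quoted in the comment above; an assumption, not read from the library.
MIN_FRAMES = 10


class TooShort(Exception):
    """Hypothetical error type standing in for the library's 'Too short' reason."""


def read_frames(stream, frame_size: int, min_frames: int = MIN_FRAMES) -> list:
    """Read fixed-size frames until the stream ends; fail if too few arrived."""
    frames = []
    while True:
        chunk = stream.read(frame_size)
        if len(chunk) < frame_size:
            # The producer exited (or sent a partial frame): stop reading.
            break
        frames.append(chunk)
    if len(frames) < min_frames:
        raise TooShort(f"got {len(frames)} frames, need {min_frames}")
    return frames
```

The point of the sketch is that the error depends only on how many frames arrive, not on the file's real duration, so a long video can still be reported as too short if the decoder stops early.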
Thanks again, here is the file. PS. I just realized that I can upload files in the comment. 😅 FFMPEG 50F.txt with BtbN/FFmpeg-Builds e6e28d4 Latest Auto-Build (2022-11-13 12:37) |
Apologies for the late reply. When I compare FFPROBE 50F.txt and FFPROBE 50P.txt I see that the F file contains two streams (0: Audio; 1: Video) and the P file contains three (0: Video; 1: Audio; 2: Tmcd). I think it's possible that vid_dup_finder_lib might be reading the metadata for the wrong stream. You may be able to test this by stripping all non-video streams from the F files and trying again. For example (If I get some time, I will try force-feeding the ffprobe output into vid_dup_finder_lib to see if any internal miscalculation occurs.) When looking at the ffmpeg outputs I think I can see that all F videos have a SAR (sample aspect ratio) = 0, but all the P videos have a SAR = 1: I do not know if this is an error or not, but it might be suspicious. |
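As a rough sketch of the stream-stripping test suggested above, the snippet below builds an ffmpeg command that copies only the first video stream into a new file without re-encoding. It is not the command from the original comment; the function names and file paths are placeholders, and it assumes ffmpeg is on the PATH.

```python
import subprocess
from typing import List


def strip_to_video_cmd(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command that keeps only the first video stream of `src`."""
    return [
        "ffmpeg",
        "-i", src,
        "-map", "0:v:0",  # select the first video stream of input 0
        "-c", "copy",     # copy the stream as-is, no re-encoding
        dst,
    ]


def strip_to_video(src: str, dst: str) -> None:
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(strip_to_video_cmd(src, dst), check=True)
```

Passing the arguments as a list also sidesteps shell quoting problems with filenames that contain spaces.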
Any update on this? I just posted under #968 that I am seeing similar issues on my local install and Flatpak install but not on my docker install. |
I'm not trying |
I tried the test with the AppImage and, surprisingly, that does find the duplicates properly. |
Just chiming in here, I was able to reproduce this issue using the docker image being maintained here: https://hub.docker.com/r/jlesage/czkawka/ EDIT: Confirmed to be an issue on Windows using the GUI variant as well. Here's the media info from ffprobe (location data redacted, FYI)
|
I've run this program (not the underlying lib itself) via the windows gui and limited the Threads from all to just 8 to test if the memory could be a problem. Turns out with limited Threads it works fine and doesn't give the too short error. With all Threads active my RAM got full pretty fast and it even used the swap file. With only 8 Threads the RAM usage was always max 3/4 full and all files got checked perfectly fine. So if others encounter the file too short error, try limiting the used Threads, restart the application and clear the cache (those two are important). If that helps, don't forget to report back so that the problem can get narrowed down. |
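The effect described above, where fewer worker threads keep memory use bounded, can be illustrated generically. This is only a sketch of the idea in Python, not how czkawka (a Rust application) schedules its work; `hash_all` and the `hash_one` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def hash_all(paths: Iterable[str], hash_one: Callable[[str], object],
             max_workers: int = 8) -> Dict[str, object]:
    """Hash every path using at most `max_workers` files in flight at once."""
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with path_list.
        results = list(pool.map(hash_one, path_list))
    return dict(zip(path_list, results))
```

If each in-flight file holds decoded frames in memory, peak usage scales roughly with `max_workers`, which would be consistent with the report that a lower thread count avoided the failures.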
I'll give that a try and report back. |
Wow, this method yields more results. I tested with 2 threads on my 4790K and got more videos in the list. |
Tested this on the flatpak but this unfortunately did not change anything there. |
I am new to this, but I am coming across the same issue. I know this has been a long-running situation, as I've read through the thread; I just wanted to find out if there is any update or workaround. I was able to process one folder of videos fine, but for the other folder of videos to compare I am getting the same error message. |
Hello! In this issue the author states that he has a functioning branch that allows videos under 30 seconds, and that branch is in every way better than main. Could this be incorporated? I have huge folders of videos under 30 seconds to dedupe. Thank you! |
I didn't know about this, would be great. I still have a ton of short videos that I need to find duplicates for... 😅 |
I tested library from https://github.com/Farmadupe/vid_dup_finder_lib/tree/dct3d with medium sized collection of short videos and small collection of bigger and:
So it looks like the library is going in the correct direction, but without a stable version on crates.io, I am unable to update this library in the application.
|
just a quick reply:
I thought code quality is actually quite good and I am surprised there is a panic. Since it is an overflow error I think this may be because I never compile without optimizations. Hopefully if this error is fixed the code should be reliable. But the proof is in the pudding. |
P.S I think the new version of the library uses internal frame streaming, so it will not store all frames in memory at the same time. Hopefully it will not cause memory exhaustion. |
Any update to this? I have a folder with 70k 28s videos from security camera footage over the years that has a whole lot of junk in it where PIR is triggered by something just out of frame or directly below the camera causing a 28s recording of nothing happening... a lot. It would be nice to be able to identify and delete all the identical recordings of nothing going on. |
An unstable version of the app with added support for files shorter than 30 seconds is available as artifacts of #1356. This PR has a few problems that need to be solved before it can be merged. |
I gave those artifacts a try (Windows), but all I get is a crash, immediately, even for folders that contain nothing at all. Is there anything else required that is not part of the artifacts? A special ffmpeg? Other libs? "Standard" Czkawka works without any issues. |
Would love to see movement on this, or a recommendation for a tool that has this capability. |
Was it completed @qarmin ? Last release was 4 months ago and does not handle shorter videos. Is there a release coming soon with the feature? Thanks! |
It was added in #1425 (I forgot to close it when closing that PR) |
I could be missing something, but it looks like last nightly release on the release tab was prior to the 8.0.0 release? @qarmin |
Hello! I'm using the similar videos function, I have like 3000 videos, many of them are just a few seconds long, and less than 1 Megabyte in size.
When using the similar videos function, it fails with a lot of files saying this in the log:
############### WARNING(53) ###############
Failed to hash file, reason Too short: E:\videos\EXAMPLE.m4v
I read the documentation and it says that czkawka only compares files longer than 30 seconds. There must be a reason for this, so I'll just ask: Is there a way to change this parameter? If there is not, thank you anyway!