Similar videos: Failed to hash file, reason Too short #605

Closed
Ciberbago opened this issue Jan 22, 2022 · 36 comments

@Ciberbago

Hello! I'm using the Similar Videos function. I have about 3000 videos; many of them are just a few seconds long and under 1 MB in size.

When I run it, it fails on a lot of files, with this in the log:

############### WARNING(53) ###############
Failed to hash file, reason Too short: E:\videos\EXAMPLE.m4v

I read the documentation, and it says that czkawka only compares files longer than 30 seconds. There must be a reason for this, so I'll just ask: is there a way to change this parameter? If not, thank you anyway!

@qarmin
Owner

qarmin commented Jan 24, 2022

The 30 s limit comes from the library czkawka uses: https://github.com/Farmadupe/vid_dup_finder/
The limit may be changed or removed, but that requires either adding the feature to that library or switching to a different algorithm/library.

@Ciberbago
Author

I understand, thank you anyway for giving me this information! :)

@Farmadupe
Contributor

Farmadupe commented Feb 1, 2022

Hello! I am the author of vid_dup_finder_lib!

I am currently drafting an update to vid_dup_finder_lib with support for videos shorter than 30 seconds. When that is done I will try to create a pull request to bring this feature into czkawka. It's a hobby project though, so I cannot estimate when I will be able to publish this update.

(I think eventually it should be possible to support videos down to ~two seconds length, as long as there are at least 64 frames of content.)
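
As a rough illustration of where "about two seconds" comes from: the shortest usable video is the required frame count divided by the frame rate, so 64 frames at 30 fps is a little over 2.1 s. The sketch below is only that arithmetic; the function name is made up for this example and is not part of vid_dup_finder_lib.

```python
def min_duration_seconds(frames_needed: int, frame_rate: float) -> float:
    """Shortest video length (in seconds) that contains `frames_needed` frames."""
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    return frames_needed / frame_rate


# 64 frames at a few common frame rates (illustrative inputs only).
for fps in (24.0, 30.0, 50.0, 60.0):
    print(f"{fps:g} fps -> {min_duration_seconds(64, fps):.2f} s")
```

Higher frame rates reach 64 frames sooner, so the practical lower bound would depend on the file.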

@Ciberbago
Author

It's good to know, I understand, it was just a friendly question haha

Thank you for your work! I'll be looking at the project! :D

@AlphaHasher

Can't wait for this update!!!

@DawgNewb

DawgNewb commented Nov 7, 2022

All of my 2160p files also failed with "Failed to hash file, reason Too short", even though they are all longer than 10 minutes.

@penyuan

penyuan commented Nov 7, 2022

Same here. Most of my videos are waaaaay longer than 30 s, so this doesn't seem to be caused by the limit in the vid_dup_finder library.

@qarmin: Could this issue be re-opened?

@AlphaHasher

I agree

@qarmin
Owner

qarmin commented Nov 7, 2022

Can anyone post an example video here?

qarmin reopened this Nov 7, 2022

@Farmadupe
Contributor

Does any such video have chapters? I think vid_dup_finder_lib is checking the length of the first chapter/stream, which may be short.

Either way, it would be useful if an example video file can be provided.

@DawgNewb

DawgNewb commented Nov 8, 2022

Thank you.
I rechecked, and not all of my 2160p files are affected by this, so I'm attaching MediaInfo output for one file that hashed successfully and one that failed.
Sorry, I don't know whether this is useful; it's kinda long.

Fail - 50 FPS

General
Count                                    : 331
Count of stream of this kind             : 1
Kind of stream                           : General
Kind of stream                           : General
Stream identifier                        : 0
Count of video streams                   : 1
Count of audio streams                   : 1
OtherCount                               : 1
Video_Format_List                        : AVC
Video_Format_WithHint_List               : AVC
Codecs Video                             : AVC
Audio_Format_List                        : AAC LC
Audio_Format_WithHint_List               : AAC LC
Audio codecs                             : AAC LC
Other_Format_List                        : QuickTime TC
Other_Format_WithHint_List               : QuickTime TC
Other_Codec_List                         : QuickTime TC
Other_Language_List                      : English
Complete name                            : X:\xxxxxxxxxxxxx\xxxxxxxxxxx.mp4
Folder name                              : X:\xxxxxxxxxxxxx\
File name extension                      : xxxxxxxxxxx.mp4
File name                                : xxxxxxxxxxx
File extension                           : mp4
Format                                   : MPEG-4
Format                                   : MPEG-4
Format/Extensions usually used           : braw mov mp4 m4v m4a m4b m4p m4r 3ga 3gpa 3gpp 3gp 3gpp2 3g2 k3g jpm jpx mqv ismv isma ismt f4a f4b f4v
Commercial name                          : MPEG-4
Format profile                           : Base Media
Internet media type                      : video/mp4
Codec ID                                 : isom
Codec ID                                 : isom (isom/iso2/avc1/mp41)
Codec ID/Url                             : http://www.apple.com/quicktime/download/standalone.html
CodecID_Compatible                       : isom/iso2/avc1/mp41
File size                                : 10996860149
File size                                : 10.2 GiB
File size                                : 10 GiB
File size                                : 10 GiB
File size                                : 10.2 GiB
File size                                : 10.24 GiB
Duration                                 : 1747627
Duration                                 : 29 min 7 s
Duration                                 : 29 min 7 s 627 ms
Duration                                 : 29 min 7 s
Duration                                 : 00:29:07.627
Duration                                 : 00:29:07:29
Duration                                 : 00:29:07.627 (00:29:07:29)
Overall bit rate mode                    : VBR
Overall bit rate mode                    : Variable
Overall bit rate                         : 50339621
Overall bit rate                         : 50.3 Mb/s
Frame rate                               : 50.000
Frame rate                               : 50.000 FPS
Frame count                              : 87379
Stream size                              : 1540230
Stream size                              : 1.47 MiB (0%)
Stream size                              : 1 MiB
Stream size                              : 1.5 MiB
Stream size                              : 1.47 MiB
Stream size                              : 1.469 MiB
Stream size                              : 1.47 MiB (0%)
Proportion of this stream                : 0.00014
HeaderSize                               : 32
DataSize                                 : 10995319939
FooterSize                               : 1540178
IsStreamable                             : No
Encoded date                             : xxxxxxxxxxx
Tagged date                              : xxxxxxxxxxx
File creation date                       : xxxxxxxxxxx
File creation date (local)               : xxxxxxxxxxx
File last modification date              : xxxxxxxxxxx
File last modification date (local)      : xxxxxxxxxxx
Writing application                      : Blackmagic Design DaVinci Resolve
Writing application                      : Blackmagic Design DaVinci Resolve

Video
Count                                    : 382
Count of stream of this kind             : 1
Kind of stream                           : Video
Kind of stream                           : Video
Stream identifier                        : 0
StreamOrder                              : 0
ID                                       : 1
ID                                       : 1
Format                                   : AVC
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format/Url                               : http://developers.videolan.org/x264.html
Commercial name                          : AVC
Format profile                           : [email protected]
Format settings                          : CABAC / 2 Ref Frames
Format settings, CABAC                   : Yes
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 2
Format settings, Reference frames        : 2 frames
Internet media type                      : video/H264
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 1747580
Duration                                 : 29 min 7 s
Duration                                 : 29 min 7 s 580 ms
Duration                                 : 29 min 7 s
Duration                                 : 00:29:07.580
Duration                                 : 00:29:07:29
Duration                                 : 00:29:07.580 (00:29:07:29)
Bit rate mode                            : VBR
Bit rate mode                            : Variable
Bit rate                                 : 50013914
Bit rate                                 : 50.0 Mb/s
Maximum bit rate                         : 768000
Maximum bit rate                         : 768 kb/s
Width                                    : 3840
Width                                    : 3 840 pixels
Height                                   : 2160
Height                                   : 2 160 pixels
Sampled_Width                            : 3840
Sampled_Height                           : 2160
Pixel aspect ratio                       : 1.000
Display aspect ratio                     : 1.778
Display aspect ratio                     : 16:9
Rotation                                 : 0.000
Frame rate mode                          : CFR
Frame rate mode                          : Constant
Frame rate                               : 50.000
Frame rate                               : 50.000 FPS
FrameRate_Num                            : 50
FrameRate_Den                            : 1
Frame count                              : 87379
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Chroma subsampling                       : 4:2:0
Bit depth                                : 8
Bit depth                                : 8 bits
Scan type                                : Progressive
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.121
Delay                                    : 3600000
Delay                                    : 1 h 0 min
Delay                                    : 1 h 0 min 0 s 0 ms
Delay                                    : 1 h 0 min
Delay                                    : 01:00:00.000
Delay                                    : 01:00:00:00
Delay                                    : 01:00:00.000 (01:00:00:00)
Delay_Settings                           : DropFrame=No / 24HourMax=No / IsVisual=No
Delay_DropFrame                          : No
Delay, origin                            : Container
Delay, origin                            : Container
Stream size                              : 10925414506
Stream size                              : 10.2 GiB (99%)
Stream size                              : 10 GiB
Stream size                              : 10 GiB
Stream size                              : 10.2 GiB
Stream size                              : 10.18 GiB
Stream size                              : 10.2 GiB (99%)
Proportion of this stream                : 0.99350
Encoded date                             : xxxxxxxxxxx
Tagged date                              : xxxxxxxxxxx
Buffer size                              : 768000
colour_description_present               : Yes
colour_description_present_Source        : Container / Stream
Color range                              : Limited
colour_range_Source                      : Container / Stream
Color primaries                          : BT.709
colour_primaries_Source                  : Container / Stream
Transfer characteristics                 : BT.709
transfer_characteristics_Source          : Container / Stream
Matrix coefficients                      : BT.709
matrix_coefficients_Source               : Container / Stream
Codec configuration box                  : avcC

Audio
Count                                    : 285
Count of stream of this kind             : 1
Kind of stream                           : Audio
Kind of stream                           : Audio
Stream identifier                        : 0
StreamOrder                              : 1
ID                                       : 2
ID                                       : 2
Format                                   : AAC
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Commercial name                          : AAC
Format_AdditionalFeatures                : LC
Codec ID                                 : mp4a-40-2
Duration                                 : 1747627
Duration                                 : 29 min 7 s
Duration                                 : 29 min 7 s 627 ms
Duration                                 : 29 min 7 s
Duration                                 : 00:29:07.627
Duration                                 : 00:29:07.627
Bit rate mode                            : CBR
Bit rate mode                            : Constant
Bit rate                                 : 320001
Bit rate                                 : 320 kb/s
Channel(s)                               : 2
Channel(s)                               : 2 channels
Channel positions                        : Front: L R
Channel positions                        : 2/0/0
Channel layout                           : L R
Samples per frame                        : 1024
Sampling rate                            : 48000
Sampling rate                            : 48.0 kHz
Samples count                            : 83886096
Frame rate                               : 46.875
Frame rate                               : 46.875 FPS (1024 SPF)
Frame count                              : 81920
Compression mode                         : Lossy
Compression mode                         : Lossy
Delay                                    : 3600000
Delay                                    : 1 h 0 min
Delay                                    : 1 h 0 min 0 s 0 ms
Delay                                    : 1 h 0 min
Delay                                    : 01:00:00.000
Delay                                    : 01:00:00.000
Delay_DropFrame                          : No
Delay, origin                            : Container
Delay, origin                            : Container
Delay relative to video                  : 0
Delay relative to video                  : 00:00:00.000
Delay relative to video                  : 00:00:00.000
Stream size                              : 69905413
Stream size                              : 66.7 MiB (1%)
Stream size                              : 67 MiB
Stream size                              : 67 MiB
Stream size                              : 66.7 MiB
Stream size                              : 66.67 MiB
Stream size                              : 66.7 MiB (1%)
Proportion of this stream                : 0.00636
Default                                  : Yes
Default                                  : Yes
Alternate group                          : 1
Alternate group                          : 1
Encoded date                             : xxxxxxxxxxx
Tagged date                              : xxxxxxxxxxx

Other
Count                                    : 193
Count of stream of this kind             : 1
Kind of stream                           : Other
Kind of stream                           : Other
Stream identifier                        : 0
StreamOrder                              : 2
ID                                       : 3
ID                                       : 3
Type                                     : Time code
Format                                   : QuickTime TC
Format                                   : QuickTime TC
Commercial name                          : QuickTime TC
Duration                                 : 1747580
Duration                                 : 29 min 7 s
Duration                                 : 29 min 7 s 580 ms
Duration                                 : 29 min 7 s
Duration                                 : 00:29:07.580
Duration                                 : 00:29:07:29
Duration                                 : 00:29:07.580 (00:29:07:29)
Frame rate                               : 50.000
Frame rate                               : 50.000 FPS
FrameRate_Num                            : 50
FrameRate_Den                            : 1
Frame count                              : 87379
Time code of first frame                 : 01:00:00:00
Time code of last frame                  : 01:29:07:28
TimeCode_DropFrame                       : No
Time code, stripped                      : Yes
Time code, stripped                      : Yes
Language                                 : en
Language                                 : English
Language                                 : English
Language                                 : en
Language                                 : eng
Language                                 : en
Default                                  : No
Default                                  : No
Encoded date                             : xxxxxxxxxxx
Tagged date                              : xxxxxxxxxxx

Success - 50 FPS

General
Count                                    : 331
Count of stream of this kind             : 1
Kind of stream                           : General
Kind of stream                           : General
Stream identifier                        : 0
Count of video streams                   : 1
Count of audio streams                   : 1
OtherCount                               : 1
Video_Format_List                        : AVC
Video_Format_WithHint_List               : AVC
Codecs Video                             : AVC
Audio_Format_List                        : AAC LC
Audio_Format_WithHint_List               : AAC LC
Audio codecs                             : AAC LC
Other_Format_List                        : QuickTime TC
Other_Format_WithHint_List               : QuickTime TC
Other_Codec_List                         : QuickTime TC
Other_Language_List                      : English
Complete name                            : X:\xxxxxxxxxxxxx\xxxxxxxxxxx.mp4
Folder name                              : X:\xxxxxxxxxxxxx\
File name extension                      : xxxxxxxxxxx.mp4
File name                                : xxxxxxxxxxx
File extension                           : mp4
Format                                   : MPEG-4
Format                                   : MPEG-4
Format/Extensions usually used           : braw mov mp4 m4v m4a m4b m4p m4r 3ga 3gpa 3gpp 3gp 3gpp2 3g2 k3g jpm jpx mqv ismv isma ismt f4a f4b f4v
Commercial name                          : MPEG-4
Format profile                           : Base Media
Internet media type                      : video/mp4
Codec ID                                 : isom
Codec ID                                 : isom (isom/iso2/avc1/mp41)
Codec ID/Url                             : http://www.apple.com/quicktime/download/standalone.html
CodecID_Compatible                       : isom/iso2/avc1/mp41
File size                                : 12046424294
File size                                : 11.2 GiB
File size                                : 11 GiB
File size                                : 11 GiB
File size                                : 11.2 GiB
File size                                : 11.22 GiB
Duration                                 : 1919339
Duration                                 : 31 min 59 s
Duration                                 : 31 min 59 s 339 ms
Duration                                 : 31 min 59 s
Duration                                 : 00:31:59.339
Duration                                 : 00:31:59:14
Duration                                 : 00:31:59.339 (00:31:59:14)
Overall bit rate mode                    : VBR
Overall bit rate mode                    : Variable
Overall bit rate                         : 50210721
Overall bit rate                         : 50.2 Mb/s
Frame rate                               : 50.000
Frame rate                               : 50.000 FPS
Frame count                              : 95964
Stream size                              : 1689746
Stream size                              : 1.61 MiB (0%)
Stream size                              : 2 MiB
Stream size                              : 1.6 MiB
Stream size                              : 1.61 MiB
Stream size                              : 1.611 MiB
Stream size                              : 1.61 MiB (0%)
Proportion of this stream                : 0.00014
HeaderSize                               : 32
DataSize                                 : 12044734568
FooterSize                               : 1689694
IsStreamable                             : No
Encoded date                             : xxxxxxxxxxx
Tagged date                              : xxxxxxxxxxx
File creation date                       : xxxxxxxxxxx
File creation date (local)               : xxxxxxxxxxx
File last modification date              : xxxxxxxxxxx
File last modification date (local)      : xxxxxxxxxxx
Writing application                      : Blackmagic Design DaVinci Resolve
Writing application                      : Blackmagic Design DaVinci Resolve

Video
Count                                    : 382
Count of stream of this kind             : 1
Kind of stream                           : Video
Kind of stream                           : Video
Stream identifier                        : 0
StreamOrder                              : 0
ID                                       : 1
ID                                       : 1
Format                                   : AVC
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format/Url                               : http://developers.videolan.org/x264.html
Commercial name                          : AVC
Format profile                           : [email protected]
Format settings                          : CABAC / 2 Ref Frames
Format settings, CABAC                   : Yes
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 2
Format settings, Reference frames        : 2 frames
Format settings, GOP                     : M=2, N=120
Internet media type                      : video/H264
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 1919280
Duration                                 : 31 min 59 s
Duration                                 : 31 min 59 s 280 ms
Duration                                 : 31 min 59 s
Duration                                 : 00:31:59.280
Duration                                 : 00:31:59:14
Duration                                 : 00:31:59.280 (00:31:59:14)
Bit rate mode                            : VBR
Bit rate mode                            : Variable
Bit rate                                 : 49885211
Bit rate                                 : 49.9 Mb/s
Maximum bit rate                         : 768000
Maximum bit rate                         : 768 kb/s
Width                                    : 3840
Width                                    : 3 840 pixels
Height                                   : 2160
Height                                   : 2 160 pixels
Sampled_Width                            : 3840
Sampled_Height                           : 2160
Pixel aspect ratio                       : 1.000
Display aspect ratio                     : 1.778
Display aspect ratio                     : 16:9
Rotation                                 : 0.000
Frame rate mode                          : CFR
Frame rate mode                          : Constant
Frame rate                               : 50.000
Frame rate                               : 50.000 FPS
FrameRate_Num                            : 50
FrameRate_Den                            : 1
Frame count                              : 95964
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Chroma subsampling                       : 4:2:0
Bit depth                                : 8
Bit depth                                : 8 bits
Scan type                                : Progressive
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.120
Delay                                    : 3600000
Delay                                    : 1 h 0 min
Delay                                    : 1 h 0 min 0 s 0 ms
Delay                                    : 1 h 0 min
Delay                                    : 01:00:00.000
Delay                                    : 01:00:00:00
Delay                                    : 01:00:00.000 (01:00:00:00)
Delay_Settings                           : DropFrame=No / 24HourMax=No / IsVisual=No
Delay_DropFrame                          : No
Delay, origin                            : Container
Delay, origin                            : Container
Stream size                              : 11967961070
Stream size                              : 11.1 GiB (99%)
Stream size                              : 11 GiB
Stream size                              : 11 GiB
Stream size                              : 11.1 GiB
Stream size                              : 11.15 GiB
Stream size                              : 11.1 GiB (99%)
Proportion of this stream                : 0.99349
Encoded date                             : xxxxxxxxxxx
Tagged date                              : xxxxxxxxxxx
Buffer size                              : 768000
colour_description_present               : Yes
colour_description_present_Source        : Container / Stream
Color range                              : Limited
colour_range_Source                      : Container / Stream
Color primaries                          : BT.709
colour_primaries_Source                  : Container / Stream
Transfer characteristics                 : BT.709
transfer_characteristics_Source          : Container / Stream
Matrix coefficients                      : BT.709
matrix_coefficients_Source               : Container / Stream
Codec configuration box                  : avcC

Audio
Count                                    : 285
Count of stream of this kind             : 1
Kind of stream                           : Audio
Kind of stream                           : Audio
Stream identifier                        : 0
StreamOrder                              : 1
ID                                       : 2
ID                                       : 2
Format                                   : AAC
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Commercial name                          : AAC
Format_AdditionalFeatures                : LC
Codec ID                                 : mp4a-40-2
Duration                                 : 1919339
Duration                                 : 31 min 59 s
Duration                                 : 31 min 59 s 339 ms
Duration                                 : 31 min 59 s
Duration                                 : 00:31:59.339
Duration                                 : 00:31:59.339
Bit rate mode                            : CBR
Bit rate mode                            : Constant
Bit rate                                 : 319999
Bit rate                                 : 320 kb/s
Channel(s)                               : 2
Channel(s)                               : 2 channels
Channel positions                        : Front: L R
Channel positions                        : 2/0/0
Channel layout                           : L R
Samples per frame                        : 1024
Sampling rate                            : 48000
Sampling rate                            : 48.0 kHz
Samples count                            : 92128272
Frame rate                               : 46.875
Frame rate                               : 46.875 FPS (1024 SPF)
Frame count                              : 89969
Compression mode                         : Lossy
Compression mode                         : Lossy
Delay                                    : 3600000
Delay                                    : 1 h 0 min
Delay                                    : 1 h 0 min 0 s 0 ms
Delay                                    : 1 h 0 min
Delay                                    : 01:00:00.000
Delay                                    : 01:00:00.000
Delay_DropFrame                          : No
Delay, origin                            : Container
Delay, origin                            : Container
Delay relative to video                  : 0
Delay relative to video                  : 00:00:00.000
Delay relative to video                  : 00:00:00.000
Stream size                              : 76773478
Stream size                              : 73.2 MiB (1%)
Stream size                              : 73 MiB
Stream size                              : 73 MiB
Stream size                              : 73.2 MiB
Stream size                              : 73.22 MiB
Stream size                              : 73.2 MiB (1%)
Proportion of this stream                : 0.00637
Default                                  : Yes
Default                                  : Yes
Alternate group                          : 1
Alternate group                          : 1
Encoded date                             : xxxxxxxxxxx
Tagged date                              : xxxxxxxxxxx

Other
Count                                    : 193
Count of stream of this kind             : 1
Kind of stream                           : Other
Kind of stream                           : Other
Stream identifier                        : 0
StreamOrder                              : 2
ID                                       : 3
ID                                       : 3
Type                                     : Time code
Format                                   : QuickTime TC
Format                                   : QuickTime TC
Commercial name                          : QuickTime TC
Duration                                 : 1919280
Duration                                 : 31 min 59 s
Duration                                 : 31 min 59 s 280 ms
Duration                                 : 31 min 59 s
Duration                                 : 00:31:59.280
Duration                                 : 00:31:59:14
Duration                                 : 00:31:59.280 (00:31:59:14)
Frame rate                               : 50.000
Frame rate                               : 50.000 FPS
FrameRate_Num                            : 50
FrameRate_Den                            : 1
Frame count                              : 95964
Time code of first frame                 : 01:00:00:00
Time code of last frame                  : 01:31:59:13
TimeCode_DropFrame                       : No
Time code, stripped                      : Yes
Time code, stripped                      : Yes
Language                                 : en
Language                                 : English
Language                                 : English
Language                                 : en
Language                                 : eng
Language                                 : en
Default                                  : No
Default                                  : No
Encoded date                             : xxxxxxxxxxx
Tagged date                              : xxxxxxxxxxx

Fail - 59.940 FPS
Success - 59.940 FPS
Fail - 60 FPS
Success - 60 FPS

@Farmadupe
Contributor

Thanks for the mediainfo output; there are some good clues about what may be the cause.

The underlying library uses a command-line tool called ffprobe to get information about videos, and it would be really useful if I could see some of its output (ideally for one 'Success' and one 'Fail' video). If you are able, the command would be:

ffprobe  -v quiet -show_format -show_streams -print_format json FILE_NAME_HERE

(You might get errors if your video filename contains spaces. If so, the easiest fix is to temporarily rename the video to a different name and run the command again)

It would also be useful to see if there is any error when ffmpeg opens the file to extract the frames. If you are able, could you run this command and paste the output?

ffmpeg -hide_banner -loglevel verbose -nostats -threads 1 -ss 30 -i FILE_NAME_HERE -vf fps=69905/16384 -vframes 15 -pix_fmt gray -c:v rawvideo -f null - > NUL

FWIW, the only way the 'too short' error can be generated is when FFMPEG exits before it has sent 10 frames to the underlying library. My current theories are:

  • ffmpeg is killed after running out of memory because of the large 4K video frame size (feels unlikely)
  • My library code has a wrong calculation and tries to extract frames from a non-existent time index, e.g. after the end of the video (current best guess)
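
To make the second theory concrete, here is a rough Python sketch of how many frames the ffmpeg command above could deliver for a given duration. The function names and the simple `floor((duration - seek) * fps)` model are assumptions made for illustration; the constants are taken from the command and the 10-frame threshold mentioned above. This is not vid_dup_finder_lib's actual code.

```python
from fractions import Fraction
from math import floor

SEEK_SECONDS = 30.0                           # value passed to -ss
SAMPLE_FPS = float(Fraction(69905, 16384))    # value passed to -vf fps=... (about 4.27)
MAX_FRAMES = 15                               # value passed to -vframes
MIN_FRAMES = 10                               # threshold mentioned in the comment above


def expected_frames(duration_seconds):
    """Roughly estimate how many frames ffmpeg can emit after seeking."""
    remaining = max(0.0, duration_seconds - SEEK_SECONDS)
    return min(MAX_FRAMES, floor(remaining * SAMPLE_FPS))


def is_too_short(duration_seconds):
    """True when the (assumed) model yields fewer frames than the threshold."""
    return expected_frames(duration_seconds) < MIN_FRAMES


for duration in (187.016, 31.0, 20.0):
    print(duration, expected_frames(duration), is_too_short(duration))
```

Under this model, a long video would only be flagged as too short if the duration or seek position fed into the calculation were wrong, which is what the second theory suggests.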

@DawgNewb

DawgNewb commented Nov 13, 2022

Thanks again, here is the file.
These are probably different files from the ones in my mediainfo post, for both Success and Fail.
Since I can't remember which ones I used, I re-ran "Similar Videos" after clearing the cache, but hashing still failed.

PS. I just realized that I can upload files in the comment. 😅

FFMPEG 50F.txt
FFMPEG 50P.txt
FFMPEG 59.94F.txt
FFMPEG 59.94P.txt
FFMPEG 60F.txt
FFMPEG 60P.txt
FFPROBE 50F.txt
FFPROBE 50P.txt
FFPROBE 59.94F.txt
FFPROBE 59.94P.txt
FFPROBE 60F.txt
FFPROBE 60P.txt

with BtbN/FFmpeg-Builds e6e28d4 Latest Auto-Build (2022-11-13 12:37)

@Farmadupe
Contributor

Apologies for the late reply.

When I compare FFPROBE 50F.txt and FFPROBE 50P.txt I see that the F file contains two streams (0: Audio; 1: Video) and the P file contains three (0: Video; 1: Audio; 2: Tmcd). I think it's possible that vid_dup_finder_lib might be reading the metadata for the wrong stream. You may be able to test this by stripping all non-video streams from the F files and trying again. For example:

ffmpeg -i $input_file -vcodec copy -an $output_file

(If I get some time, I will try force feeding the ffprobe output into vid_dup_finder_lib to see if any internal wrong calculation occurs.)
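
The suspected stream mix-up could be avoided by selecting the stream by `codec_type` rather than by position. A minimal Python sketch, assuming the JSON layout printed by `ffprobe -print_format json -show_streams`; the function name and the sample data are hypothetical, and nothing here is confirmed to match what vid_dup_finder_lib does.

```python
import json


def first_video_stream(ffprobe_json):
    """Return the first stream whose codec_type is 'video', or None."""
    data = json.loads(ffprobe_json)
    for stream in data.get("streams", []):
        if stream.get("codec_type") == "video":
            return stream
    return None


# Hypothetical sample mimicking the F files: audio first, video second.
SAMPLE = json.dumps({
    "streams": [
        {"index": 0, "codec_type": "audio", "duration": "186.962417"},
        {"index": 1, "codec_type": "video", "duration": "187.016244"},
    ]
})

video = first_video_stream(SAMPLE)
print(video["index"], video["duration"])
```

Reading `streams[0]` directly would pick up the audio stream in this sample, which is the kind of mistake the comment above speculates about.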

When looking at the ffmpeg outputs I think I can see that all F videos have a SAR (sample aspect ratio) of 0/1, but all the P videos have a SAR of 1/1:
FFMPEG.50F.txt: [auto_scale_0 @ 00000242b36c7ac0] w:3840 h:2160 fmt:yuv420p sar:0/1 -> w:3840 h:2160 fmt:gray sar:0/1 flags:0x0
FFMPEG.50P.txt: [auto_scale_0 @ 000002b0dc622c00] w:3840 h:2160 fmt:yuv420p sar:1/1 -> w:3840 h:2160 fmt:gray sar:1/1 flags:0x0

I do not know if this is an error or not, but it might be suspicious.
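
For anyone comparing more of these logs, the SAR value can be pulled out of each `auto_scale` line with a small script. This is only a sketch: the regular expression and the function name are assumptions, and the two sample lines are copied from the logs quoted above.

```python
import re

SAR_PATTERN = re.compile(r"sar:(\d+)/(\d+)")


def input_sar(log_line):
    """Return the first sar value on the line as a (num, den) tuple, or None."""
    match = SAR_PATTERN.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


LINE_F = ("[auto_scale_0 @ 00000242b36c7ac0] w:3840 h:2160 fmt:yuv420p sar:0/1 "
          "-> w:3840 h:2160 fmt:gray sar:0/1 flags:0x0")
LINE_P = ("[auto_scale_0 @ 000002b0dc622c00] w:3840 h:2160 fmt:yuv420p sar:1/1 "
          "-> w:3840 h:2160 fmt:gray sar:1/1 flags:0x0")

print(input_sar(LINE_F), input_sar(LINE_P))
```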

@The-Istar

Any update on this?

I just posted under #968 that I am seeing similar issues on my local install and Flatpak install but not on my docker install.
Even though all use the latest version of Czkawka and the same ffmpeg version.

@DawgNewb

DawgNewb commented May 12, 2023

Any update on this?

I just posted under #968 that I am seeing similar issues on my local install and Flatpak install but not on my docker install. Even though all use the latest version of Czkawka and the same ffmpeg version.

I haven't tried the ffmpeg -i $input_file -vcodec copy -an $output_file method yet. I'll need to free up some space first haha

@The-Istar

I tried the test with the AppImage and, surprisingly, that does find the duplicates properly.
So whatever is different with the AppImage should give us a clue about what is not working.

@VicePrez

VicePrez commented Jul 19, 2023

Just chiming in here, was able to reproduce this issue using the docker image being maintained here: https://hub.docker.com/r/jlesage/czkawka/

EDIT: Confirmed to be an issue on Windows using the GUI variant as well

Here's the media info from ffprobe (location data redacted, FYI):

/path/to/file # ffprobe -show_format -show_streams SAMPLEFILE.mp4

ffprobe version 5.1.3 Copyright (c) 2007-2022 the FFmpeg developers
  built with gcc 12.2.1 (Alpine 12.2.1_git20220924-r4) 20220924
  configuration: --prefix=/usr --enable-avfilter --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-gnutls --enable-gpl --enable-libass --enable-libmp3lame --enable-libpulse --enable-libvorbis --enable-libvpx --enable-libxvid --enable-libx264 --enable-libx265 --enable-libtheora --enable-libv4l2 --enable-libdav1d --enable-lto --enable-postproc --enable-pic --enable-pthreads --enable-shared --enable-libxcb --enable-librist --enable-libsrt --enable-libssh --enable-libvidstab --disable-stripping --disable-static --disable-librtmp --disable-lzma --enable-libaom --enable-libopus --enable-libsoxr --enable-libwebp --enable-vaapi --enable-vdpau --enable-vulkan --enable-libdrm --enable-libzmq --optflags=-O2 --disable-debug --enable-libsvtav1
  libavutil      57. 28.100 / 57. 28.100
  libavcodec     59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter     8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample   4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'SAMPLEFILE.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    creation_time   : 2023-01-07T18:21:13.000000Z
    location        : #########################          // Redacted
    location-eng    : #########################          // Redacted
    com.android.version: 13
    com.android.capture.fps: 60.000000
  Duration: 00:03:07.02, start: 0.000000, bitrate: 72260 kb/s
  Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 3840x2160, 71998 kb/s, 59.79 fps, 59.94 tbr, 90k tbn (default)
    Metadata:
      creation_time   : 2023-01-07T18:21:13.000000Z
      handler_name    : VideoHandle
      vendor_id       : [0][0][0][0]
    Side data:
      displaymatrix: rotation of -90.00 degrees
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 256 kb/s (default)
    Metadata:
      creation_time   : 2023-01-07T18:21:13.000000Z
      handler_name    : SoundHandle
      vendor_id       : [0][0][0][0]
[STREAM]
index=0
codec_name=h264
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
profile=High
codec_type=video
codec_tag_string=avc1
codec_tag=0x31637661
width=3840
height=2160
coded_width=3840
coded_height=2160
closed_captions=0
film_grain=0
has_b_frames=0
sample_aspect_ratio=N/A
display_aspect_ratio=N/A
pix_fmt=yuv420p
level=52
color_range=tv
color_space=bt709
color_transfer=bt709
color_primaries=bt709
chroma_location=left
field_order=progressive
refs=1
is_avc=true
nal_length_size=4
id=0x1
r_frame_rate=60000/1001
avg_frame_rate=503190000/8415731
time_base=1/90000
start_pts=0
start_time=0.000000
duration_ts=16831462
duration=187.016244
bit_rate=71998179
max_bit_rate=N/A
bits_per_raw_sample=8
nb_frames=11182
nb_read_frames=N/A
nb_read_packets=N/A
extradata_size=35
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
DISPOSITION:captions=0
DISPOSITION:descriptions=0
DISPOSITION:metadata=0
DISPOSITION:dependent=0
DISPOSITION:still_image=0
TAG:creation_time=2023-01-07T18:21:13.000000Z
TAG:language=eng
TAG:handler_name=VideoHandle
TAG:vendor_id=[0][0][0][0]
[SIDE_DATA]
side_data_type=Display Matrix
displaymatrix=
00000000:            0       65536           0
00000001:       -65536           0           0
00000002:            0           0  1073741824

rotation=-90
[/SIDE_DATA]
[/STREAM]
[STREAM]
index=1
codec_name=aac
codec_long_name=AAC (Advanced Audio Coding)
profile=LC
codec_type=audio
codec_tag_string=mp4a
codec_tag=0x6134706d
sample_fmt=fltp
sample_rate=48000
channels=2
channel_layout=stereo
bits_per_sample=0
id=0x2
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/48000
start_pts=1392
start_time=0.029000
duration_ts=8974196
duration=186.962417
bit_rate=256006
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=8764
nb_read_frames=N/A
nb_read_packets=N/A
extradata_size=2
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
DISPOSITION:captions=0
DISPOSITION:descriptions=0
DISPOSITION:metadata=0
DISPOSITION:dependent=0
DISPOSITION:still_image=0
TAG:creation_time=2023-01-07T18:21:13.000000Z
TAG:language=eng
TAG:handler_name=SoundHandle
TAG:vendor_id=[0][0][0][0]
[/STREAM]
[FORMAT]
filename=SAMPLEFILE.mp4
nb_streams=2
nb_programs=0
format_name=mov,mp4,m4a,3gp,3g2,mj2
format_long_name=QuickTime / MOV
start_time=0.000000
duration=187.016200
size=1689230313
bit_rate=72260277
probe_score=100
TAG:major_brand=mp42
TAG:minor_version=0
TAG:compatible_brands=isommp42
TAG:creation_time=2023-01-07T18:21:13.000000Z
TAG:location=#########################          // Redacted
TAG:location-eng==#########################     // Redacted
TAG:com.android.version=13
TAG:com.android.capture.fps=60.000000
[/FORMAT]
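
As a reading aid for the dump above, here is a small Python sketch that parses the `key=value` lines of one `[STREAM]` section and compares the nominal and average frame rates. The parser and its name are hypothetical helpers written for this illustration; the sample values are copied from the video stream above.

```python
from fractions import Fraction


def parse_flat_section(text):
    """Parse ffprobe 'key=value' lines into a dict, skipping section markers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("[") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        fields[key] = value
    return fields


VIDEO_STREAM = """\
[STREAM]
index=0
codec_type=video
r_frame_rate=60000/1001
avg_frame_rate=503190000/8415731
duration=187.016244
nb_frames=11182
[/STREAM]
"""

fields = parse_flat_section(VIDEO_STREAM)
nominal = Fraction(fields["r_frame_rate"])
average = Fraction(fields["avg_frame_rate"])
from_count = int(fields["nb_frames"]) / float(fields["duration"])
print(float(nominal), float(average), from_count)
```

For this sample the average rate (about 59.79 fps) is lower than the nominal 59.94 fps, which suggests a variable frame rate recording. Whether that plays any role in the hashing failure is not established in this thread.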

@NoMoreAngel

#605 (comment)

FWIW, the only way the 'too short' error can be generated is when FFMPEG exits before it has sent 10 frames to the underlying library. My current theories are:

* Ffmpeg killed due to running out of memory because of large 4k video frame size (feels unlikely)

* My library code has a wrong calculation, and tries to extract frames from a non-existent time index e.g after the end of the video (current best guess)

I've run this program (not the underlying lib itself) via the Windows GUI and limited the threads from all to just 8 to test whether memory could be the problem. It turns out that with limited threads it works fine and doesn't give the "too short" error. With all threads active my RAM filled up pretty fast and it even used the swap file. With only 8 threads, RAM usage never went above about 3/4 and all files were checked without problems.

So if others encounter the "too short" error, try limiting the number of threads, then restart the application and clear the cache (those two steps are important). If that helps, don't forget to report back so that the problem can be narrowed down.
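
A back-of-the-envelope Python sketch of why the thread count could matter for 4K files. It only counts raw frame buffers and assumes every worker holds all of its extracted frames at once; the function names, the worker counts and that buffering assumption are illustrative, and real usage also includes decoder memory that is not modelled here.

```python
def raw_frame_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return int(width * height * bytes_per_pixel)


def buffered_bytes(width, height, bytes_per_pixel, frames, workers):
    """Raw frame memory if each worker keeps `frames` frames in memory."""
    return raw_frame_bytes(width, height, bytes_per_pixel) * frames * workers


GRAY = 1.0       # bytes per pixel for -pix_fmt gray
YUV420P = 1.5    # bytes per pixel for 8-bit yuv420p

print(raw_frame_bytes(3840, 2160, GRAY))      # one gray 4K frame
print(raw_frame_bytes(3840, 2160, YUV420P))   # one decoded yuv420p 4K frame
for workers in (8, 32):                       # 32 is a hypothetical "all threads" value
    print(workers, buffered_bytes(3840, 2160, GRAY, 15, workers))
```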

@VicePrez

I'll give that a try and report back.

@DawgNewb

#605 (comment)

FWIW, the only way the 'too short' error can be generated is when FFMPEG exits before it has sent 10 frames to the underlying library. My current theories are:

* Ffmpeg killed due to running out of memory because of large 4k video frame size (feels unlikely)

* My library code has a wrong calculation, and tries to extract frames from a non-existent time index e.g after the end of the video (current best guess)

I've run this program (not the underlying lib itself) via the Windows GUI and limited the threads from all to just 8 to test whether memory could be the problem. It turns out that with limited threads it works fine and doesn't give the "too short" error. With all threads active my RAM filled up pretty fast and it even used the swap file. With only 8 threads, RAM usage never went above about 3/4 and all files were checked without problems.

So if others encounter the "too short" error, try limiting the number of threads, then restart the application and clear the cache (those two steps are important). If that helps, don't forget to report back so that the problem can be narrowed down.

Wow, this method yields more results. I tested with 2 threads on my 4790K and got more videos on the list.

@The-Istar

Tested this on the Flatpak, but unfortunately it did not change anything there.

@deadbeatdandylyon

I am new to this, but I am running into the same issue. I know this has been going on for a long time, as I've read through the thread; I just wanted to find out if there is any update or workaround. I was able to process one folder of videos fine, but with the other folder of videos to compare I am getting the same error message.

@evanheckert

Farmadupe/vid_dup_finder

Hello! In this issue the author states that he has a functioning branch that allows videos under 30 seconds, and that branch is in every way better than main. Could this be incorporated? I have huge folders of videos under 30 seconds to dedupe. Thank you!

Farmadupe/vid_dup_finder#3

@Ciberbago
Author

Farmadupe/vid_dup_finder

Hello! In this issue the author states that he has a functioning branch that allows videos under 30 seconds, and that branch is in every way better than main. Could this be incorporated? I have huge folders of videos under 30 seconds to dedupe. Thank you!

Farmadupe/vid_dup_finder#3

I didn't know about this, would be great. I still have a ton of short videos that I need to find duplicates for... 😅

@qarmin
Owner

qarmin commented Oct 18, 2023

I tested the library from https://github.com/Farmadupe/vid_dup_finder_lib/tree/dct3d with a medium-sized collection of short videos and a small collection of bigger ones, and:

  • It is slower than version 0.1 - this is mostly visible with short videos (probably a lot of this slowdown is caused by hashing videos that were previously not hashed)
  • It sometimes crashes - it looks like the library has problems with cropping invalid videos
  • I am not sure exactly how the algorithm works, but it looks like it still likes to report movies with similar beginnings as duplicates
  • The general quality of duplicates is OK and, most importantly, it works fine with short videos - it still finds some false positives, but this is unavoidable with algorithms like this, which need to be fast and acceptably correct
  • API changes are minimal (at least from Czkawka's perspective), so the update should be easy

So it looks like the library is going in the correct direction, but without a stable version on crates.io, I am unable to update this library in the application.

thread '<unnamed>' panicked at /home/rafal/.cargo/git/checkouts/vid_dup_finder_lib-c8e12077b81b399b/6eb9c60/vid_dup_finder_common/src/crop.rs:70:21:
attempt to subtract with overflow

@Farmadupe
Contributor

just a quick reply:

  • Performance is slower than the previous version (not just due to being able to hash shorter videos). This is due to using more frames to generate hashes (the old version used 10; I think the new version uses 64). Performance depends on the number of frames, but it is not 6.4x slower (due to the fixed cost of starting to decode a video)
  • I may be able to address such a crash (I am surprised because I thought quality is already quite good!).
    • But I have made no edits to the codebase in several months, so I can offer no guarantee
    • If you do not want to wait (for a long time) until I fix the code, anybody may fix it themselves and release to crates.io. No special permission is needed from me and the license already permits this.
  • The algorithm is similar to before. The old hash was formed from 10 separate 2D cosine transforms. The new version uses one 3D cosine transform. Quality is quite improved
  • The code still only uses the first 30 seconds, because:
    1. It is very slow to decode an entire big/long video.
    2. There is no scene change detection or other alignment, so an imperceptible speed difference (0.1%) between duplicates would break the algorithm for long videos.
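
The alignment point can be illustrated with simple arithmetic. The Python sketch below estimates how far two copies drift apart when one plays 0.1% faster; the function name and the two-hour example are assumptions chosen for illustration.

```python
def drift_seconds(elapsed_seconds, relative_speed_difference):
    """Time offset accumulated between two copies after `elapsed_seconds`."""
    return elapsed_seconds * relative_speed_difference


SPEED_DIFFERENCE = 0.001   # the 0.1% figure from the list above

print(drift_seconds(30, SPEED_DIFFERENCE))        # within the first 30 seconds
print(drift_seconds(2 * 3600, SPEED_DIFFERENCE))  # after a hypothetical two hours
```

Within the first 30 seconds the offset stays at a few hundredths of a second, while after two hours it grows to several seconds, so frames sampled at the same timestamps would no longer show the same content.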

I thought code quality is actually quite good and I am surprised there is a panic. Since it is an overflow error I think this may be because I never compile without optimizations. Hopefully if this error is fixed the code should be reliable. But the proof is in the pudding.
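
Some background on why the build mode matters: Rust panics on integer overflow in debug builds, while release builds with default settings wrap around silently. The Python sketch below emulates both behaviours for an unsigned subtraction. The 64-bit width and the function names are assumptions made for illustration; the actual expression and types in crop.rs are not shown in this thread.

```python
BITS = 64                  # assumed width; the real type in crop.rs is not known here
MODULUS = 1 << BITS


def checked_sub(a, b):
    """Mimic a debug build: fail instead of going below zero."""
    if b > a:
        raise OverflowError("attempt to subtract with overflow")
    return a - b


def wrapping_sub(a, b):
    """Mimic a release build with default settings: wrap modulo 2**BITS."""
    return (a - b) % MODULUS


print(wrapping_sub(3, 5))      # a huge value instead of -2
try:
    checked_sub(3, 5)
except OverflowError as error:
    print("error:", error)
```

A wrapped result used as a crop width or offset would be nonsensical rather than fatal, which may explain why the problem never showed up in optimized builds.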

@Farmadupe
Contributor

P.S I think the new version of the library uses internal frame streaming, so it will not store all frames in memory at the same time. Hopefully it will not cause memory exhaustion.

@All-The-Foxes

All-The-Foxes commented Apr 7, 2024

Any update to this? I have a folder with 70k 28s videos from security camera footage over the years that has a whole lot of junk in it where PIR is triggered by something just out of frame or directly below the camera causing a 28s recording of nothing happening... a lot. It would be nice to be able to identify and delete all the identical recordings of nothing going on.

@qarmin
Owner

qarmin commented Sep 27, 2024

An unstable version of the app with added support for files shorter than 30 seconds is available as artifacts of #1356.

This PR has a few problems that need to be solved before it can be merged.

@Hoernchen

I gave those artifacts a try (Windows), but all I get is an immediate crash, even for folders that contain nothing at all. Is there anything else required that is not part of the artifacts? A special ffmpeg? Other libs? "Standard" Czkawka works without any issues.

@evanheckert

Would love to see movement on this, or a recommendation for a tool that has this capability.

@qarmin qarmin closed this as completed Feb 26, 2025
@evanheckert

Was it completed @qarmin ? Last release was 4 months ago and does not handle shorter videos. Is there a release coming soon with the feature? Thanks!

@qarmin
Owner

qarmin commented Feb 26, 2025

It was added in #1425 (I forgot to close this issue when closing that PR).
Not sure when the next release will be ready, but nightly builds are available in the release tab (some functionality may be broken).

@evanheckert

evanheckert commented Feb 28, 2025

nightly builds are available in release tab(some functionalities may be broken)

I could be missing something, but it looks like the last nightly release on the release tab was prior to the 8.0.0 release? @qarmin

@qarmin
Owner

qarmin commented Feb 28, 2025
