
Cache management and Event Clips #942

Closed
mssaleh opened this issue Mar 27, 2021 · 4 comments
mssaleh commented Mar 27, 2021

I had previously planned to contribute to this project starting this month, but pressure at work hasn't left me enough time to do so.

However, I tried to test a few things on a large real-world deployment (40 RTSP Wisenet cameras) using a modest PC. I ran into a few issues and still couldn't get Frigate to work as expected.

Issues:

  • Issue 1: Docker's tmpfs mounted at /tmp/cache fills up after 30 minutes, and then a cascade of errors occurs in the detect and events processes.
  • Issue 2: Event clips are not saved (or, more accurately, not copied from cache) to the clips folder, even before the cache is full or cleared.
  • Issue 3: In the web UI's Events page, the camera drop-down menu doesn't work when the number of cameras is too large, so I cannot select any camera. I can work around this by going to a camera's page and clicking Events there.

Notes:
The DB and MQTT still have all events logged and published correctly at all times.

Errors logged:

frigate.events        DEBUG   :Cleaning up cached file xxxx-20210327183421.mp4
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x556535d40140] moov atom not found
/tmp/cache/xxxx-20210327184635.mp4: Invalid data found when processing input
frigate.events        INFO    : bad file: xxxx-20210327184551.mp4
ffmpeg.xxxx.detect    ERROR   : Guessed Channel Layout for Input Stream #0.1 : mono
ffmpeg.xxxx.detect    ERROR   : Could not write header for output file #0 (incorrect codec parameters ?): No space left on device
ffmpeg.xxxx.detect    ERROR   : Consider increasing the value for the 'analyzeduration' and 'probesize' options

Setup:

  • PC: Core i5 9400, 16GB RAM, 1TB SSD, Coral Edge TPU M2
  • OS: Ubuntu 20.04
  • Docker 20.10.5, Docker-Compose 1.28.6
  • Host Intel driver: intel-media-va-driver-non-free 21.1.2+i520~u20.04

Config:
docker-compose.yml

version: '3.9'
services:
  frigate:
    container_name: frigate
    privileged: true
    restart: unless-stopped
    image: blakeblackshear/frigate:stable-amd64
    devices:
      - /dev/bus/usb:/dev/bus/usb
      - /dev/apex_0:/dev/apex_0
      - /dev/dri/renderD128:/dev/dri/renderD128
    volumes:
      - /var/run/dbus:/var/run/dbus
      - /etc/localtime:/etc/localtime:ro
      - ./config/config.yml:/config/config.yml
      - .:/media/frigate
      - type: tmpfs
        target: /tmp/cache
    ports:
      - '5050:5000'
      - '1935:1935'
    environment:
      - LIBVA_DRIVER_NAME=iHD
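
Note: Compose's long mount syntax can also cap a tmpfs so the cache cannot grow unbounded. A minimal sketch of what that could look like for the mount above (the ~1 GB size is an assumed example, not a value tested in this thread):

volumes:
  - type: tmpfs
    target: /tmp/cache
    tmpfs:
      size: 1000000000   # cap in bytes (~1 GB); assumed value, tune to available RAM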

config.yml (one example camera, of 40 in my actual file, along with the global options):

cameras:
  xxxx:
    best_image_timeout: 60
    clips:
      enabled: True
      post_capture: 3
      pre_capture: 3
    detect:
      enabled: True
      max_disappeared: 20
    ffmpeg:
      inputs:
        - path: rtsp://xxxx:[email protected]:554/profile2/media.smp
          roles:
            - detect
            - rtmp
            - clips
    fps: 4
    height: 1080
    mqtt:
      bounding_box: True
      crop: True
      enabled: True
      height: 270
      timestamp: False
    rtmp:
      enabled: True
    snapshots:
      bounding_box: True
      crop: False
      enabled: False
      timestamp: False
    width: 1920

clips:
  max_seconds: 300
  retain:
    default: 15
    objects:
      car: 15
      person: 15

detect:
  max_disappeared: 20

detectors:
  coral_pci:
    device: pci
    type: edgetpu

ffmpeg:
  global_args: '-hide_banner -loglevel warning -analyzeduration 2000000 -probesize 2000000'
  hwaccel_args: '-init_hw_device qsv=qsvgpu:/dev/dri/renderD128 -hwaccel auto -qsv_device qsvgpu -vcodec h264_qsv -hwaccel_output_format yuv420p'
  input_args: '-an -dn -stimeout 5000000 -fflags nobuffer -flags low_delay -flags output_corrupt -strict experimental -use_wallclock_as_timestamps 1 -rtsp_transport tcp -avoid_negative_ts make_zero -fflags +igndts+genpts'
  output_args:
    detect: -f rawvideo -pix_fmt yuv420p
    record: -f segment -segment_time 60 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c copy -an
    clips: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c copy -an
    rtmp: -c copy -f flv

logger:
  default: debug
  logs:
    frigate.mqtt: error

motion:
  contour_area: 100
  delta_alpha: 0.2
  frame_alpha: 0.2
  frame_height: 180
  threshold: 20

mqtt:
  client_id: frigate
  host: xxx.xxx.xxx.xxx
  password: xxxx
  port: xxxx
  stats_interval: 120
  topic_prefix: xxxx
  user: xxxx

objects:
  filters:
    car:
      max_area: 2000000
      min_area: 400
      min_score: 0.5
      threshold: 0.7
    person:
      max_area: 2000000
      min_area: 400
      min_score: 0.5
      threshold: 0.7
  track:
    - person
    - car
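
The global_args above already set analyzeduration and probesize to 2000000; if ffmpeg keeps printing the "Consider increasing..." warning shown in the log earlier, those values could be raised further. A sketch only (the 5000000 values are assumptions, not something validated in this thread):

ffmpeg:
  global_args: '-hide_banner -loglevel warning -analyzeduration 5000000 -probesize 5000000'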

Tests:
I also tried replacing the container's Intel driver and ffmpeg with versions (intel-non-free) matching what I have on the host OS, using Frigate's final container as a base.
Details are in this container on my Docker Hub
and this Dockerfile on my GitHub.
Both the original Frigate container and my modified version behaved exactly the same.

I am ready to provide any additional info that may help, and I look forward to your insight on this issue.

Thanks in advance.

blakeblackshear (Owner)

If you don't have enough RAM, you can't use a tmpfs volume for the cache. It will work fine without tmpfs. Not sure why #2 would happen. Does it happen with fewer cameras?
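
Dropping the tmpfs mount just means removing that entry from docker-compose.yml, so /tmp/cache falls back to the container's writable layer on disk instead of RAM. A minimal sketch of the volumes section from the compose file above, without the tmpfs entry:

volumes:
  - /var/run/dbus:/var/run/dbus
  - /etc/localtime:/etc/localtime:ro
  - ./config/config.yml:/config/config.yml
  - .:/media/frigate
  # no tmpfs entry: /tmp/cache now lives inside the container, on disk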

mssaleh (Author) commented Mar 27, 2021

> If you don't have enough RAM, you can't use a tmpfs volume for the cache. It will work fine without tmpfs. Not sure why #2 would happen. Does it happen with fewer cameras?

Thanks for the quick response, and sorry again for not being able to contribute yet.

I didn't set a size limit on Docker's tmpfs mount (I left it at the default), and of course I didn't use the overlapping option in config.yml.
From glances and htop, RAM usage maxes out at 9 GB (out of 16 GB) with 40 cameras, so I am not sure whether RAM capacity is the issue.
Regarding the number of cameras, it might be a problem. I didn't test incremental numbers, but with 9 cameras things sometimes work (though there are also multiple valid events in the DB that have no MP4 file). Moreover, I noticed that if the number of clips is too big, the Home Assistant media browser integration won't show many of them, but Frigate's web UI does.

mssaleh (Author) commented Mar 27, 2021

I did more testing:

It seems that when only one or two cameras miss frames, or their VBR jumps enough to trip ffmpeg, ffmpeg stops splitting the segments or generally goes haywire (spiking CPU use). Then all streams suffer, even those from good, stable cameras, and no MP4 files are saved into the clips folder. The cache gradually fills up, then cache clearing kicks in, causing even more errors in the log. I guess RAM never fills up with tmpfs because of this periodic clearing.

Now I have a fully working setup with 36 streams from 1080p Wisenet cameras. Usage: CPU ~80%, RAM ~5 GB, GPU ~75%.
This is a really amazing result and a testament to the good work done by you and everyone involved in this project.

However, to get to this point I went through some quirks that I still don't understand:

  • Removed the tmpfs mount from docker-compose.yml, then started with 20 good cameras >> everything works.
  • Added more and more cameras until I tripped ffmpeg (I don't know whether the camera or ffmpeg is to blame) and eventually got 36 of the 40 cameras working.
  • Re-added the tmpfs mount to docker-compose.yml to see whether it was the issue >> everything works too!

So now I am back at exactly the same config I started with, minus 4 cameras, and all is working fine!
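
For the four cameras I dropped, a possible interim step (using the same per-camera keys shown in the config above) would be to keep detection running but disable clips while troubleshooting. A sketch only; the camera name is hypothetical:

cameras:
  flaky_camera:        # hypothetical name for one of the 4 removed cameras
    clips:
      enabled: False   # don't save event clips for this camera
    detect:
      enabled: True    # keep object detection and MQTT events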

stale bot commented Apr 27, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label on Apr 27, 2021
mssaleh closed this as completed on Apr 27, 2021