[Bug]: Frigate starts writing to local disk if network storage is intermittently not available, does not recover, bricks Home Assistant #13951

DReffects · 2024-09-25T09:26:19Z

Checklist

I have updated to the latest available Frigate version.
I have cleared the cache of my browser.
I have tried a different browser to see if it is related to my browser.
I have tried reproducing the issue in incognito mode to rule out problems with any third party extensions or plugins I have installed.

Describe the problem you are having

Two days ago I did some maintenance on my central storage system, so the SMB share for Frigate was unavailable for a while.
Today, when looking at System Metrics -> Storage, I noticed that the wrong storage device is being used to calculate disk usage.
In my case the system's internal 240GB SSD is shown (Home Assistant is installed on an x64 NUC), while Frigate media is actually stored on a network share set up within Home Assistant that has 12TB of total space.

Further investigation revealed that frigate created the storage file structure on the local disk (/media/frigate) after the SMB share became unavailable.

This led to the following issues:

Recordings never went back to the SMB share, even after it was back online
The local Home Assistant disk was 100% full after two days
Restarting Home Assistant never resulted in the SMB share being mounted at /media/frigate, since Frigate had created a new local file structure at the same path
The storage metrics statistics are messed up (a minor issue, of course)
Home Assistant was bricked due to 100% disk utilization

This is a very problematic error, since a split-second network outage breaks all video recording, and Home Assistant in the aftermath. While I use a rather large 240GB disk for the HA installation, most people only have very small local storage available, so it bricks the system for them even faster.

Since there is no message or notification whatsoever, this breaks the entire HA installation once the disk runs full. Plus, you lose all recordings from the moment the disk is full if you do not notice immediately.

Also, recovery from this state is rather problematic, as one needs the ability to SSH into the machine and clean up the storage manually to even make HA run again.

I really like that Frigate tries to compensate for an outage in the media storage location, but I strongly suggest that this become configurable in the frigate.yaml config file.

The outage of centralized storage is not a question of "if" but only "when". You update the firmware of your switch? Outage (with Ubiquiti that's every few weeks). You restart the fileserver after an update? Outage. You change out a GBIC module? Outage. You replace a cable in your rack? Outage. You run your stuff via Wi-Fi? Well, outages are preprogrammed then.

I suggest the following:

make two storage paths available via config
primary and secondary
have Frigate periodically check whether primary is available
if not, write to secondary, but with a different retention setting to accommodate the available disk space
issue a notification / warning that the secondary storage is being used
recover automatically when primary storage is available again and move the secondary storage contents to the primary
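
The suggestion above could be sketched roughly as follows. This is only an illustration of the idea, not anything Frigate implements: `pick_storage`, `drain_secondary`, and the `is_available` callback are hypothetical names, and how availability is actually detected is left to the caller.

```python
import shutil
from pathlib import Path
from typing import Callable


def pick_storage(primary: Path, secondary: Path,
                 is_available: Callable[[Path], bool]) -> Path:
    """Return the primary path when it is available, else the secondary one."""
    return primary if is_available(primary) else secondary


def drain_secondary(primary: Path, secondary: Path) -> int:
    """Move every file from secondary into primary, keeping relative paths.

    Returns the number of files moved. Files that already exist on the
    primary are left in place so that nothing is overwritten.
    """
    moved = 0
    # Materialize the file list first so moving files does not disturb the walk.
    for src in sorted(p for p in secondary.rglob("*") if p.is_file()):
        dst = primary / src.relative_to(secondary)
        if dst.exists():
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        moved += 1
    return moved
```

A periodic task would call `pick_storage` before each write and call `drain_secondary` once the primary reports as available again. Retention and notifications are deliberately left out of the sketch.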

It is my point of view that, besides all the fancy detection stuff, the most important part of surveillance is a bulletproof method for recording. The system has to recover from power outages, intermittent loss of connectivity and all kinds of anomalies.

Steps to reproduce

This is a method to reproduce the problem immediately. It also happened to me without an HA restart, just not immediately.

Set up an SMB storage for Frigate in HA on a network share
Take the SMB storage offline
Restart HA
After the restart HA is unable to mount /media/frigate
Frigate starts to write to /media/frigate on the local storage
Any further reboots will not restore recording to the SMB share since /media/frigate is already occupied
The HA disk will fill up quickly
HA is bricked
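
One common way to guard against the failure in these steps is a marker file: a file created once on the share while it is mounted, whose absence means the path currently resolves to the local disk. The sketch below assumes such a file exists. The name `.network_share_marker` and the function `share_is_mounted` are made up for illustration and are not part of Frigate or Home Assistant.

```python
from pathlib import Path

# Hypothetical marker file, created once on the share while it is mounted.
MARKER_NAME = ".network_share_marker"


def share_is_mounted(media_dir: Path, marker_name: str = MARKER_NAME) -> bool:
    """Return True only if the marker file that lives on the share is visible.

    If the mount failed, media_dir points at an empty local directory and the
    marker is missing, so a writer can refuse to record there.
    """
    return (media_dir / marker_name).is_file()
```

A check like this does not depend on mount-table details, so it gives the same answer whether it runs on the host or inside a container.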

Version

0.14.1-f4f3cfa

In which browser(s) are you experiencing the issue with?

No response

Frigate config file

docker-compose file or Docker CLI command

Relevant Frigate log output

Relevant go2rtc log output

Operating system

HassOS

Install method

HassOS Addon

Network connection

Wired

Camera make and model

Dahua

Screenshots of the Frigate UI's System metrics pages

see above

Any other information that may be helpful

No response

NickM-27 · 2024-09-25T11:52:47Z

NickM-27
Sep 25, 2024
Collaborator Sponsor

This is not a bug, and it seems you are misunderstanding how this is working.

Frigate runs in a Docker container. It has no knowledge of what kind of storage it is writing to, whether that storage disconnects, or anything of the sort. That is all handled by the host.

In a Docker config it is trivial to configure a container to stop if the network storage disconnects. If you're running as an addon, then this would be a feature request for the Home Assistant Supervisor.

8 replies

NickM-27 Sep 29, 2024
Collaborator Sponsor

> If it is a known issue with the primary environment frigate runs in and is designed for,

I don't know where you got that idea, but that is incorrect: the docs specifically recommend running in native Docker on a Debian host.

https://docs.frigate.video/frigate/installation#operating-system

> Am I correct in the assumption that if /media/frigate does not exist, frigate (not the host) would create this location?
> I am thinking about a fail safe with a configuration variable $external_storage == true/false check like

You are incorrect; Docker does this before Frigate even begins starting up.

NickM-27 Sep 29, 2024
Collaborator Sponsor

> I guess in my case frigate was unable to recognize that the previously stored data was not available (see my screenshot with the 1888% usage reported) and therefore was also not able to delete the oldest 2 hours of footage to free up space.

That shouldn't interfere with the emergency cleanup; it specifically checks whether the files it deleted were actually deleted before counting them as storage that was cleared.
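
To illustrate what "checking that files were actually deleted" can look like, here is a minimal sketch. It is not Frigate's cleanup code, and `delete_and_count` is a hypothetical helper that only counts bytes for files that are verifiably gone.

```python
from pathlib import Path
from typing import Iterable


def delete_and_count(files: Iterable[Path]) -> int:
    """Delete files and return the bytes freed, counting only verified deletions."""
    freed = 0
    for path in files:
        try:
            size = path.stat().st_size
            path.unlink()
        except OSError:
            # Missing, unreadable, or undeletable: nothing was freed.
            continue
        if not path.exists():
            freed += size
    return freed
```

With this shape, recordings that are listed in a database but no longer reachable on disk contribute nothing to the freed total.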

> If the only fail-safe routine/check within frigate is to delete old recordings I'd suggest combining this with a hard record shut-off threshold, preferably with its own configuration variable to fit different environments. (if diskspace < 4GB then stop recording + alert)

Like I said, and as the docs I pointed to describe, this already exists. Except it doesn't stop recording, it just deletes the oldest recordings that are available.
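
For comparison, the two behaviours discussed here (delete the oldest recordings versus stop recording below a hard limit) can be put side by side in a small sketch. The function names and both thresholds are hypothetical and do not correspond to Frigate configuration options. Only `shutil.disk_usage` is a real standard-library call.

```python
import shutil
from pathlib import Path


def free_gib(path: Path) -> float:
    """Free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 1024**3


def recording_action(free: float, cleanup_below: float, stop_below: float) -> str:
    """Decide what to do for a given amount of free space (all values in GiB)."""
    if free < stop_below:
        return "stop"      # hard shut-off, as proposed in the quoted suggestion
    if free < cleanup_below:
        return "cleanup"   # delete the oldest recordings
    return "record"
```

For example, `recording_action(free_gib(Path("/media/frigate")), 10.0, 4.0)` would apply the 4GB limit from the suggestion above. Note that `free_gib` reports on whichever filesystem currently backs the path, which is the local disk when the share is not mounted.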

> How do I enable debug logs for storage? Is it frigate.record.maintainer: debug?

The last-minute storage check is frigate.storage; frigate.record.maintainer is the part that saves recordings to the disk.

DReffects Sep 29, 2024
Author

Hi Nick,

> I don't know where you got that idea but that is incorrect, the docs specifically recommend running in native docker on a debian host.

I got that from the main page here:

My assumption as an end-user is that if something is "designed for Home Assistant", it will be installed from the end-user's perspective, and therefore as an addon to Home Assistant. The linked doc refers to https://docs.frigate.video/integrations/home-assistant, which then goes on to tell me that the "best way to integrate with Home Assistant" is to use the official integration.
Since I have no clue what a Docker container is or does, or whether Debian is some sort of ex-girlfriend or cookie (coming from the end-user perspective), I just followed the step-by-step guide from the docs and ended up with Frigate as an HA addon plus the integration from HACS. If that's not the preferred way, I am sorry that I missed that. From my point of view it is the addon that broke Home Assistant ;-)

> you are incorrect, docker does this before Frigate even begins starting up.

Damn :-(

What would you recommend to make this more stable for my use case? I do not have the hardware to set up another box just for Frigate. Can I improve the handling of storage from Home Assistant somehow? It was a huge pain in the behind to get HA working again after the disk got full, and it's bound to happen again sooner or later since I will update Ubiquiti switches or reboot my storage system from time to time.

> that shouldn't interfere with the emergency cleanup, it specifically checks if the files it deleted were actually deleted before counting it as storage that was cleared.

Odd. I will enable storage debug logs and test it when I've got spare time. Thanks!!

NickM-27 Sep 29, 2024
Collaborator Sponsor

The integration works regardless of how Frigate is installed, and one of the main advantages is using it with home assistant.

> What would you recommend to make this more stable for my use case? I do not have the hardware to set up another box just for frigate. Can I improve handling of storage from Home Assistant somehow? It was a huge pain in the behind to get HA working again after the disk got full and it's bound to happen again sooner or later since I will update ubiquiti switches or reboot my storage system from time to time.

I am not super familiar with what is available to HA OS with network storage configured, but I can think of a few different ideas, assuming HA doesn't improve this.

If there is a sensor that represents the state of the network storage, then you could turn off recording for cameras in Frigate when it goes offline.
You could also use the Ubiquiti integration to know when that goes offline and turn off Frigate, or just recording.
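
As a rough sketch of the first idea: Frigate's MQTT interface, as far as I understand its documentation, accepts `ON`/`OFF` on a per-camera `recordings/set` topic. The helper below only builds the topic/payload pairs; the topic shape and the default `frigate` prefix should be verified against the docs for your version, and `recording_commands` and the camera names are hypothetical.

```python
from typing import Iterable, List, Tuple


def recording_commands(cameras: Iterable[str], storage_online: bool,
                       topic_prefix: str = "frigate") -> List[Tuple[str, str]]:
    """Build (topic, payload) pairs that switch recording on or off per camera.

    Assumes the <prefix>/<camera>/recordings/set topic shape; publishing is
    left to whatever MQTT client or automation the caller already uses.
    """
    payload = "ON" if storage_online else "OFF"
    return [(f"{topic_prefix}/{camera}/recordings/set", payload)
            for camera in cameras]
```

An automation triggered by the storage sensor going offline would publish each pair, and publish the `ON` variants once the share is reachable again.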

DReffects Sep 29, 2024
Author

Thanks, I'll look into that and get back to you once I have been able to test the issue with the emergency cleanup not working.
