One of Four Cameras Is Failing #114
Comments
Well, looks like the panic in #113 is no longer happening, so that's an improvement at least. Do you still have any logs from the older, working installation? I'd like to see the lines after It's not happy with I see several kinds of errors here:
The I wonder if you can reproduce with the
That command will just open a stream and discard the received bytes until it gets dropped or you hit If that shows problems, it'd be interesting to see if dropping the We could also try getting a packet capture at the same time as a log; it'd be relatively easy then to see if ffmpeg is right about the |
Most of these complaints happen to a lesser extent on the other cameras too.
|
Further breakdown of the
|
I'll see if I have some old logs... I doubt it, though. |
Good luck! fwiw, I think a "consumer" would be more likely to follow the recommended path of using the Docker image, which comes with a fixed version of the ffmpeg libraries. But I don't know for sure yet if that would avoid this problem, or if it's a problem with ffmpeg at all. It's probably not the whole story; I have a pi4 with those same versions and that |
Hmm, in issue #84 you pasted a log including this:
which is the same as the one from this bug:
There could be some subtle difference not captured in the version number but it seems unlikely. |
I found a log file that has:
Moreover, I cannot say for certain that in June or July I was using the same ffmpeg, but here's what is showing on my Raspberry Pi 4 now. I am not familiar enough with how the Raspberry Pi deploys binaries to know whether the July 8th date is the date the binary was installed on my machine or an artificial date created by some package manager.
I think what I need to do is just isolate the ffmpeg command for camera garage_west in its own session and see if I can recreate the error reports. |
So I tried the following:
A) 4 consoles on Raspberry Pi
I connected to the Raspberry Pi and, with screen, attempted to create four windows, each being a session using the following formula:
As can happen with screen, I became lost in its maze after trying to create a four panel window and unsuccessfully trying to give each a title. (Will still try with tmux another time.) So I connected to the Raspberry Pi with four ssh consoles via Putty on Windows and ran ffmpeg against each camera. After about an hour, one then a second camera started to go into an infinite loop state with messages such as:
Again, I cannot say if my ffmpeg on the Raspberry Pi 4 was altered when I performed a system upgrade in the last few months. But I suspected my disparity of results, e.g. June - January vs. the last few weeks, was being caused at the ffmpeg level.
B) Parallel Console On Different Servers & Different ffmpeg
I then wanted to compare my Gentoo VM version of ffmpeg with the Raspberry Pi high watermark. I let the two consoles run overnight and 8 hours later found the Gentoo one in the endless loop "frame=...." above; the Raspberry Pi only had a handful of errors.
Gentoo ffmpeg
Raspberry Pi4 ffmpeg
Note that the Gentoo [server: ares] version does not have the library warning -- one of the benefits of compiling everything in Gentoo, methinks. It is unclear from Google searches what risk the library warning presents. My next step is to "unmask" a more recent version of ffmpeg on my Gentoo VM and see what happens with that. This issue is moving away from Moonfire-nvr and looking to be an ffmpeg issue.
Further thoughts: I suppose I should try to save what comes in via ffmpeg's rtsp call to learn whether the repeated message of "frame=442457 fps= 12 q=-1.0 size..." means something and affects what is processed, or whether it is white noise. I'm going to have to learn more about ffmpeg, which is a monstrous tome, nay, an operating system guised as a program. |
What were the errors? I don't see them here. I'm particularly interested to see if it reports errors similar to the ones you saw when running through Moonfire NVR:
I also don't know what you mean by "library warning". The The repeated |
Here's a copy from the Raspberry Pi console. See on line 5 "WARNING: library configuration mismatch".
|
Okay, those Do they happen on the Gentoo server? Do they happen if you remove the The library warning seems to be saying that it compiled the binary and the libraries with this difference in options:
$ diff -u binary avutil
--- binary 2021-03-12 08:57:21.179545106 -0800
+++ avutil 2021-03-12 08:57:21.179363592 -0800
@@ -64,7 +64,10 @@
--enable-chromaprint
--enable-frei0r
--enable-libx264
+--libdir=/usr/lib/arm-linux-gnueabihf/neon/vfp
+--cpu=cortex-a7
+--arch=armv6t2
+--disable-thumb
--enable-shared
---libdir=/usr/lib/arm-linux-gnueabihf
---cpu=arm1176jzf-s
---arch=arm
+--disable-doc
+--disable-programs
I don't think it's causing any problems. |
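A comparison like the one above can be sketched in a few lines of Python: split each configuration string into one flag per line and run a unified diff over the two lists. This is only an illustration. The function name `diff_configure_flags` is hypothetical, and the two sample strings are shortened stand-ins built from flags that appear in the diff above, not the full configuration lines that ffmpeg prints.

```python
import difflib
import shlex
from typing import List


def diff_configure_flags(binary_cfg: str, library_cfg: str) -> List[str]:
    """Return a unified diff of two configure strings, one flag per line."""
    binary_flags = shlex.split(binary_cfg)
    library_flags = shlex.split(library_cfg)
    return list(
        difflib.unified_diff(
            binary_flags,
            library_flags,
            fromfile="binary",
            tofile="avutil",
            lineterm="",
        )
    )


# Shortened sample strings (assumption: the real configuration lines are
# much longer; only flags visible in the diff above are used here).
BINARY_CFG = (
    "--enable-libx264 --enable-shared "
    "--libdir=/usr/lib/arm-linux-gnueabihf --cpu=arm1176jzf-s --arch=arm"
)
AVUTIL_CFG = (
    "--enable-libx264 --libdir=/usr/lib/arm-linux-gnueabihf/neon/vfp "
    "--cpu=cortex-a7 --arch=armv6t2 --disable-thumb --enable-shared "
    "--disable-doc --disable-programs"
)

if __name__ == "__main__":
    for line in diff_configure_flags(BINARY_CFG, AVUTIL_CFG):
        print(line)
```

Lines starting with `-` are flags present only in the binary's configuration, and lines starting with `+` are present only in the library's.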
So the "frame=..." is a bottom line in a console screen that is continually updated to show progress. I'm assuming this is an ncurses kind of feature to avoid scrolling caused by repetitive output. What I thought was an error with ffmpeg in my earlier comment looks to have been something in my console (or ncurses?): the refresh line no longer replaced itself but moved down a line in the screen buffer, causing the screen to scroll with each line of output.
I now have three consoles running against the same camera, each with a different version of ffmpeg. On my taurus laptop I have the latest and greatest version of ffmpeg, release 4.3.2. I am doing this to see if there are consistent errors in all three sessions at a given point in time. So far, I'm seeing different results. I'm now thinking network issues may be in play. The garage cameras come from the garage through a Ubiquiti wireless link, the Peck comes via a long Cat6 running to the house next door, and the laptop is on my current house's wireless. So, a variety of network connections.
I eventually found all screens populated with red error messages and then hung. No more updates. I entered Control-C and the ffmpeg program terminated and I was returned to a standard prompt. This shows that my ssh session was still alive and the hang was occurring on ffmpeg's side.
I'm starting to think that moonfire-nvr is faced with handling problems from ffmpeg and being able to recover gracefully when ffmpeg hangs, as it appears to have done in several versions on several machines. I'm going to do further testing using just the various ffmpeg rtsp sessions. My approach has been haphazard and not very methodical, and at this point we need some methodically produced results. |
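For anyone capturing this output to a file: as far as I know, ffmpeg's status line is not an ncurses feature. It is a plain line terminated with a carriage return, so the terminal overwrites it in place. If the carriage return is not honored (or the output is logged), every update shows up as its own line. Below is a minimal Python sketch, assuming that behavior, for separating the progress updates from ordinary messages in captured stderr text. The function name `split_progress` and the sample text are invented for illustration and are not real ffmpeg output.

```python
from typing import List, Tuple


def split_progress(stderr_text: str) -> Tuple[List[str], List[str]]:
    """Separate carriage-return progress updates from ordinary log lines."""
    progress: List[str] = []
    messages: List[str] = []
    # Treat "\r\n", "\r", and "\n" all as record separators.
    normalized = stderr_text.replace("\r\n", "\n").replace("\r", "\n")
    for record in normalized.split("\n"):
        record = record.strip()
        if not record:
            continue
        if record.startswith("frame="):
            progress.append(record)
        else:
            messages.append(record)
    return progress, messages


# Invented sample (assumption: status updates end in "\r", messages in "\n").
SAMPLE = (
    "frame=  100 fps= 12 q=-1.0 size=N/A\r"
    "frame=  106 fps= 12 q=-1.0 size=N/A\r"
    "hypothetical warning line\n"
    "frame=  112 fps= 12 q=-1.0 size=N/A\r"
)

if __name__ == "__main__":
    progress, messages = split_progress(SAMPLE)
    print(len(progress), "progress updates;", len(messages), "other messages")
```

Filtering this way makes it easier to compare the actual error messages across several sessions without the repeated "frame=..." lines in between.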
The There's another option Moonfire NVR sets when it opens streams with ffmpeg,
If your network connection is unreliable, that might just be the best that can be done, with ffmpeg or any other RTSP client. Maybe what changed since you had a working setup on the Pi wasn't ffmpeg or Moonfire but the wireless network conditions. |
I've been playing a bit with a Reolink camera (a RLC-410, hardware version IPC_3816M, firmware v2.0.0.1441_19032101). I think it's actually sending interleaved RTP data over a RTSP connection which must be intended for a different RTSP connection:
I imagine these problems are worse if you have multiple NVRs running, record both the Once I move to a non-ffmpeg RTSP implementation, I might be able to work around this problem (by discarding data which doesn't have the expected I also get warnings in the logs about long times to get the next frame if there's a dramatic change to the image. I have the thing in a cabinet with a door right now; if I open or close the door, I can see it in my logs every time. That really shouldn't happen either. Finally I see cases where the RTP timestamp goes slightly backwards; Moonfire NVR drops the connection in this case. (iirc this also happens with my Hikvision and/or Dahua cameras sometimes.) This seems to happen when they do a time step with SNTP—it must be calculating offsets with |
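To illustrate the "timestamp goes slightly backwards" check: RTP timestamps are 32-bit values that wrap around, so a plain `current < previous` comparison would misreport a normal wraparound as a backward step. The Python sketch below compares timestamps modulo 2^32 instead. It is not Moonfire NVR's actual code, and the names `rtp_timestamp_delta` and `first_backward_step` are hypothetical.

```python
from typing import Optional, Sequence

RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit and wrap around


def rtp_timestamp_delta(previous: int, current: int) -> int:
    """Signed difference current - previous, computed modulo 2**32."""
    delta = (current - previous) % RTP_TS_MODULUS
    if delta >= RTP_TS_MODULUS // 2:
        delta -= RTP_TS_MODULUS
    return delta


def first_backward_step(timestamps: Sequence[int]) -> Optional[int]:
    """Return the index of the first timestamp that goes backwards, or None."""
    for i in range(1, len(timestamps)):
        if rtp_timestamp_delta(timestamps[i - 1], timestamps[i]) < 0:
            return i
    return None


if __name__ == "__main__":
    # A small backward step at index 3, as might follow a clock adjustment.
    print(first_backward_step([0, 3000, 6000, 5990, 9000]))
    # A wraparound is a forward step, not a backward one.
    print(rtp_timestamp_delta(0xFFFFFF00, 0x00000100))
```

A client could use a check like this to decide whether to drop the connection, as described above, or to tolerate small negative steps; which policy is better is a separate design question.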
It may not be all the camera's fault, though. The |
I think I've found something that might bear on the problems I encountered. One of the things that was different after I started having problems was that I had upgraded my RaspberryPi4 system. This bug gets into the nitty gritty of kernels and sleep modes for USB attached hard drives, and some of the people posting noted their problem went away when they downgraded to a previous kernel. I just wanted to add this URL because it may have bearing on what I experienced:
I am kind of regretting having gone with RaspberryPi since it depends on an SD HC which has an admittedly limited life. Even Scott's suggestions try to limit the cycles performed against the diskette. I noticed some of the systems Scott recommended are out-of-stock at the linked vendor. I'm thinking for 24/7 video, something not operating on an SD HC is preferable. Just too many write cycles. And the kernel issue in the bug at the URL herein is troubling.
I've also complicated things by using LVM. I used to use LVM for my Xen hypervisor and then I discovered some shortcomings as well as LVM developer malaise that made me decide to go back to the tried-and-true partition approach. What tangled webs we weave. But having video surveillance feeds coming to a generic browser is so worth it, I just wish I had more time. |
From a previous email, I recall you have a 5-bay USB SATA enclosure. Have you considered buying a SATA SSD and sticking in there? The Pi4 supports USB booting, and a real SATA SSD would be much more reliable. |
Good suggestion, but I'm shy about SSDs. Any reason not to use a Western Digital Blue? At $20, it seems like a no-brainer. I don't need the performance an SSD offers; the stability is more important to me. I have two empty slots in my USB array. Edit: changed the URL to point to the 3.5" form; the original link was to a notebook drive, which is a form factor I'm not sure is supported by my array. |
I don't think you need to feel shy about SSDs; they're in a different reliability class than SD cards, and they actually report SMART statistics so you can see how much wear they have on them. I'd expect them to be more reliable as a root volume than a refurbished ("renewed") HDD like that one. But to answer your question: putting the root drive and SQLite database on a HDD is probably fine. Database accesses will be noticeably slower but probably not bad enough that Moonfire NVR will actually break. |
I still see the 2.5" link. Another thing to check is if the 3.5" drive (or whatever you select) is using SMR; I try to avoid those drives. Usually arrays have trays that let you put 2.5" drives in 3.5" slots. SSDs all come in 2.5" form so you'd need this to take my suggestion of putting in a SSD. |
I did more reading about SSD and decided Scott's suggestion was well taken. So I ordered the following:
Installed new Raspberry Pi minimal operating system on PNY.
The goal here is to isolate potential failures to more reliable systems than an SD HC card that has been in use for over 1 year. Yesterday, on the SD HC-based system, something happened and I could not even reboot from a console, the system locked up. /var/log/messages had some ext4 related errors which causes me to think there were disk problems.
|
As this is now understood as an RTSP-level interop problem with these buggy cameras, let's continue discussion at scottlamb/retina#17 as necessary. Current best solution is to use the new --rtsp-transport=udp argument. |
One of four cameras seems to go into a repeated loop of stopping and being unable to accept streams and record them. See the screenshot below of the web UI, where garage_west has an abnormal number of small files.
Running moonfire-nvr updated 3/10/2021. I have run previous versions, including one from circa 6/1/2020, with which I had several months of operation without this kind of problem.
Desired result: equal recording of camera "garage_west" as three other cameras and web interface offering files of similar size as other cameras
moonfire-nvr@raspberrypi:/usr/local/src/moonfire-nvr/server/target/release $ git describe
v0.6.0-40-ge66a88a
moonfire-nvr@raspberrypi:/usr/local/src/moonfire-nvr/server/target/release $ ./moonfire-nvr --version
moonfire-nvr 0.6.1
moonfire-nvr@raspberrypi:/usr/local/src/moonfire-nvr/server/target/release $
Ran with RUST_BACKTRACE=1
log ( 254MB; feel free to select snippets and post herein as I'll remove this large file after 1 week) for approximately 4 hour session this afternoon at:
https://drive.google.com/file/d/1Ir0SZ3jlsiLO5IiyMELQnX4YkJp_Akxk/view?usp=sharing
Screenshot showing the Web interface and the garage_west files constantly restarting
As a check, I have the Reolink client software running and capturing events from garage_west and the three other cameras and there has been no noticeable difference for garage_west's camera. I therefore think something on the receiving side, e.g. moonfire-nvr on my Raspberry Pi4 is causing the problem.