'should always be an unindexed sample' panic #113

Closed
jlpoolen opened this issue Mar 9, 2021 · 9 comments
Labels
bug rust Rust backend work required

Comments

@jlpoolen
Contributor

jlpoolen commented Mar 9, 2021

This issue is being opened just as a placeholder for the moment.
History
I installed Moonfire-nvr in June 2020 on a new, clean RaspberryPi 4 purchased just for this software. My set-up included 8 TB of disk space to hold the feeds of 4 Reolink cameras at high resolutions, e.g. 1920x1080, and high frame rates, e.g. 27+ frames/second. It performed extremely well. Occasionally there were some hiccups, and I would restart one or more cameras.

I then upgraded Moonfire-nvr and probably performed several other upgrades to my Raspberry Pi environment. Since then, I have had intermittent problems with one or more cameras' feeds not being preserved and the web interface showing little, if anything, for some cameras. It did not seem camera-specific; however, I have not fully analyzed the matter.

This issue is being opened to start tracking my investigation into what the problem may be. Yesterday I opened Issue #112 to document what I was doing to capture and preserve colorized logs. Scott's comments therein note that he does not colorize output, so the colorized output is coming from ffmpeg.

Current
Here is an example of my web interface showing two cameras (garage_west & Peck_west) down:
Screenshot_2021-03-09_0836AM_Moonfire NVR
Here is a link to a 14 MB HTML formatted log preserving the coloration: https://drive.google.com/file/d/1UrQgLzgetLfCT8681uOOO9u1a_gGGKjU/view?usp=sharing

I have not reviewed the logs carefully; I just wanted to open this issue to set up a place where I can share my findings and react to suggestions and other comments. Scott had mentioned recreating the ffmpeg command directly in a console against one of the cameras to see if the problems repeat themselves. I want to try building Moonfire-nvr in a Gentoo-based VM, where I have more control over my environment, and see if the same results occur there as on the RaspberryPi. I suspect the problem I am facing is not necessarily related to Moonfire-nvr and is instead a problem with dependencies. Since RaspberryPi is a suggested platform for running this software, it merits further investigation.

@scottlamb
Owner

I can think of two potentially-relevant changes in Moonfire NVR itself between the two versions:

  • in 75dce88 I started doing the equivalent of ffmpeg -fflags nobuffer.
  • It looks like it's doing H.264 decoding. Did you build with the analytics feature? I don't think that existed with the old version.

I would first try building without the analytics feature and see if that solves the problem. It's not well-tested, potentially uses a lot of CPU which might be a cause of connection drops (especially on a Raspberry Pi), and doesn't do much useful yet (it doesn't save any results of its work to the database yet).

It'd also be helpful to compare to logs from the old version, if you still have any around. In particular, I'd like to see the old log's equivalent of the following new log lines, so we know what's changed on ffmpeg's side.

I0304 104038.168 main moonfire_ffmpeg] Initialized ffmpeg. Versions:
avutil: running=56.22.100 compiled=56.22.100
avcodec: running=58.35.100 compiled=58.35.100
avformat: running=58.20.100 compiled=58.20.100
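
To make that comparison less error-prone, the version lines from two logs could be diffed mechanically. The following is only a rough sketch, assuming both logs use the `name: running=X compiled=Y` layout shown above; `LibVersion`, `parse_version_line`, and `changed_libs` are hypothetical helper names, not part of Moonfire NVR.

```rust
/// One parsed line such as `avutil: running=56.22.100 compiled=56.22.100`.
/// (Hypothetical helper type for illustration only.)
#[derive(Debug, PartialEq)]
struct LibVersion {
    name: String,
    running: String,
    compiled: String,
}

/// Parses a single version line; returns None for any line that lacks
/// both a `running=` and a `compiled=` field (e.g. the header line).
fn parse_version_line(line: &str) -> Option<LibVersion> {
    let (name, rest) = line.trim().split_once(':')?;
    let mut running = None;
    let mut compiled = None;
    for field in rest.split_whitespace() {
        match field.split_once('=') {
            Some(("running", v)) => running = Some(v.to_string()),
            Some(("compiled", v)) => compiled = Some(v.to_string()),
            _ => {}
        }
    }
    Some(LibVersion {
        name: name.trim().to_string(),
        running: running?,
        compiled: compiled?,
    })
}

/// Returns (library, old running version, new running version) for every
/// library that appears in both logs with a different running version.
fn changed_libs(old_log: &str, new_log: &str) -> Vec<(String, String, String)> {
    let old: Vec<LibVersion> = old_log.lines().filter_map(parse_version_line).collect();
    new_log
        .lines()
        .filter_map(parse_version_line)
        .filter_map(|new| {
            let prev = old.iter().find(|o| o.name == new.name)?;
            if prev.running != new.running {
                Some((new.name, prev.running.clone(), new.running))
            } else {
                None
            }
        })
        .collect()
}
```

Passing the old and new log text to `changed_libs` would list only the libraries whose running version differs, which is the part of interest here.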

@scottlamb
Owner

I also just noticed this error:

thread 's-peck_west-main' panicked at 'should always be an unindexed sample', /usr/local/src/moonfire-nvr/server/db/writer.rs:750:54
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

That's definitely a bug in Moonfire NVR itself. Could you set the environment variable it mentions to get some more info?

@jlpoolen
Contributor Author

jlpoolen commented Mar 9, 2021

After running for 1.5 hours with "RUST_BACKTRACE=1", I spotted a few backtraces. The current web interface has all four cameras showing, one with an infinity entry; otherwise all look normal.

1.5 hours of log at: https://pastebin.com/9KCUEagm

@jlpoolen
Contributor Author

Here's a log file representing 19 hours. It is just the log file, with ANSI color codes, and it continues from the previous log file I posted in #113 (comment)

https://drive.google.com/file/d/1D9ZRn6aWfbuLqFca-gozmMuwO7d5JB-4/view?usp=sharing

The current state of the web interface is that 2 cameras (this time garage_east & garage_west) have no content. The Reolink client displays all four cameras, and the Reolink cached events have files from all four cameras; there is no indication from the 307 cached files for March 10th (8+ hours) to suggest a failure.
Screenshot_2021-03-10_0822_ Moonfire_NVR

If you think you have identified what may be causing the problem and want to try pushing a change into a development version, I can clone it, build it, and run it to test. Or I can patch my existing instance:

jlpoole@raspberrypi:/usr/local/src/moonfire-nvr $ git show
commit ed521521a411d97c40cee67dd9831237b2fef6a4 (HEAD -> master, origin/master, origin/HEAD)
Author: Scott Lamb <[email protected]>
Date:   Thu Feb 11 20:27:12 2021 -0800

    fix SQLite3 integrity check

diff --git a/server/db/check.rs b/server/db/check.rs
index e6b96f5..bbd814f 100644
--- a/server/db/check.rs
+++ b/server/db/check.rs
@@ -60,9 +60,15 @@ pub fn run(conn: &mut rusqlite::Connection, opts: &Options) -> Result<i32, Error
     let mut printed_error = false;

     info!("Checking SQLite database integrity...");
-    if let Err(e) = conn.execute("pragma check_integrity", params![]) {
-        error!("Database integrity error: {}", e);
-        printed_error = true;
+    {
+        let mut stmt = conn.prepare("pragma integrity_check")?;
+        let mut rows = stmt.query(params![])?;
+        while let Some(row) = rows.next()? {
+            let e: String = row.get(0)?;
+            if e == "ok" { continue; }
+            error!("{}", e);
+            printed_error = true;
+        }
     }
     info!("...done");

jlpoole@raspberrypi:/usr/local/src/moonfire-nvr $

@scottlamb
Owner

Haven't figured this out yet, but if you run the latest version the error messages will be prettier. 🤷‍♂️ I'm going to look more today.

@jlpoolen
Contributor Author

Sorry, I've been slow to pick up your latest builds; thank you for the hint. To that end I performed the following:

  sudo pi
  cd /usr/local/src/moonfire-nvr
  git pull
  git status
  cd server
  cargo build --release
  exit

and I have launched a new session:

sudo moonfire-nvr
cd /usr/local/src/moonfire-nvr/server/target/release 
screen -t MoonfireShell
export START_TIME=`date +"%Y-%b-%d_%H_%M"`
export AV_LOG_FORCE_COLOR=1
export RUST_BACKTRACE=1


script --flush /tmp/moonfire-nvr_${START_TIME}.log
./moonfire-nvr run

[To leave screen (and script): Ctrl-d a]

@scottlamb scottlamb added the bug label Mar 10, 2021
@scottlamb scottlamb changed the title from "Upgrade on RaspberryPi 4 Results in Problems" to "'should always be an unindexed sample' panic" Mar 10, 2021
@scottlamb
Owner

No worries; I was just making fun of myself for fixing the cosmetic stuff before the bug. Those changes shouldn't be necessary to figure out what's going on. I'll let you know if I do need you to pick up logging changes for that.

@scottlamb
Owner

The problem has to be here:

let duration = i32::try_from(pts_90k - i64::from(unindexed.pts_90k))?;

The comment above says we must restore the invariant on all exit paths, but I missed one. If the offset from the previous pts doesn't fit in an i32 (it jumps forward by 2^31 or more, or backward by more than 2^31), the try_from fails, the ? returns early, and the invariant isn't restored.
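
For readers following along, here is a minimal sketch of that failure mode. It is not the actual `writer.rs` code: `Writer`, `Unindexed`, `write_buggy`, and `write_fixed` are hypothetical stand-ins, and the real fix in e66a88a may well take a different shape. The sketch only assumes that the pending sample lives in an `Option` that is taken at the start of each write and must be put back before returning.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

/// Hypothetical stand-in for the pending sample; not Moonfire NVR's real type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Unindexed {
    pts_90k: i64,
}

/// Hypothetical writer. Invariant: `unindexed` is `Some` between calls.
struct Writer {
    unindexed: Option<Unindexed>,
    durations: Vec<i32>,
}

impl Writer {
    fn new(first_pts_90k: i64) -> Self {
        Writer {
            unindexed: Some(Unindexed { pts_90k: first_pts_90k }),
            durations: Vec::new(),
        }
    }

    /// Buggy shape: if the conversion fails, `?` returns right after `take()`,
    /// so `unindexed` stays `None` and the next call panics in `expect`.
    fn write_buggy(&mut self, pts_90k: i64) -> Result<(), TryFromIntError> {
        let unindexed = self
            .unindexed
            .take()
            .expect("should always be an unindexed sample");
        let duration = i32::try_from(pts_90k - unindexed.pts_90k)?;
        self.durations.push(duration);
        self.unindexed = Some(Unindexed { pts_90k });
        Ok(())
    }

    /// One possible repair: put the sample back before propagating the error.
    fn write_fixed(&mut self, pts_90k: i64) -> Result<(), TryFromIntError> {
        let unindexed = self
            .unindexed
            .take()
            .expect("should always be an unindexed sample");
        let duration = match i32::try_from(pts_90k - unindexed.pts_90k) {
            Ok(d) => d,
            Err(e) => {
                // Restore the invariant on this exit path too.
                self.unindexed = Some(unindexed);
                return Err(e);
            }
        };
        self.durations.push(duration);
        self.unindexed = Some(Unindexed { pts_90k });
        Ok(())
    }
}
```

With `write_buggy`, a single jump of 2^31 ticks returns an error and leaves the writer in a state where the following write panics with the message from this issue. With `write_fixed`, the same jump still returns an error but later writes keep working. If `pts_90k` is in 90 kHz units, as the name suggests, 2^31 ticks is roughly 6.6 hours.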

@scottlamb scottlamb added the rust Rust backend work required label Mar 10, 2021
@scottlamb
Owner

e66a88a should fix this.
