starting omxplayer in a while true loop fails after some time #181
Comments
Hi, I have the same or a similar problem running a loop script:
When playing a video with the H.264 codec at HD resolution, the TV sometimes shows only a black screen, but the omxplayer process is still running. I run omxplayer inside screen. After loading the screen session, omxplayer says: /usr/bin/omxplayer: line 67: 30400 Aborted |
Hi I am also bugged by this problem. |
@Avoncliff are you sure you use a recent version? This looks like a bug in dbus-daemon handling that is resolved by now. |
I'm only using the current binary from apt, so if that is now fixed, sorry to trouble you. I have resolved my problem by reverting to a very old version that loops for days on end with no problem. Again, I'm not sure of the version number, as it did not have the -v option, but it is at least as old as January 2013. |
@Avoncliff You should try a recent build from http://omxplayer.sconde.net |
I can confirm this bug, it is very annoying. omxplayer reads the video till the end but then fails to exit, so the loop can't go on. Killing it works though, and the loop resumes as usual. I tried everything (last release from http://omxplayer.sconde.net, firmware update, removing USB devices, disabling overclocking...) without results. I had to implement a crude watchdog that kills any ill-behaving omxplayer.bin |
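A crude watchdog of the kind mentioned above could be sketched with `timeout` from GNU coreutils, which signals a command that outlives a time limit. This is only an illustration, not the poster's actual watchdog: the function name `play_with_watchdog`, the `PLAYER` variable and the choice of limit are assumptions, and the limit should be somewhat longer than the video itself.

```shell
#!/bin/bash
# Hypothetical sketch: run one playback under a time limit so that a
# player that fails to exit cannot stall the loop forever.
PLAYER="${PLAYER:-omxplayer}"   # assumption: player command is overridable

# play_with_watchdog LIMIT_SECONDS FILE
play_with_watchdog() {
    local limit="$1" file="$2"
    # timeout sends SIGKILL to the command once LIMIT_SECONDS have elapsed.
    timeout --signal=KILL "$limit" "$PLAYER" "$file"
}

# Example: a 60 second clip with a 10 second safety margin.
# while true; do play_with_watchdog 70 /path/to/video.mp4; done
```

A non-zero exit status from `play_with_watchdog` then indicates that the player either failed or had to be killed, which is worth logging when counting iterations.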
So I've tried nearly every build from http://omxplayer.sconde.net. Between these two commits (38f05ee...c0dd950), there is fd8fe4a: "Add high level locking to OMXVideo". I don't know if it's the right one, but it sure looks like a good place to start digging. I'm afraid my skills pretty much stop there. I hope this helps and that somebody will figure this out. If necessary, I'm able to compile intermediate versions to narrow it down a bit more. For everybody else having this issue, revert to 0.3.2 |
I can confirm that 38f05ee is not affected. Affected versions can run up to 300 iterations of the while true loop before they break. I've built a recent version (46616c5) with fd8fe4a reverted and the issue still persisted. ec440e9 looks very obscure to me. @popcornmix, can you shed some light on what changes are introduced in this commit? |
I compiled many versions between 38f05ee and c0dd950. I'm now pretty sure the bug was introduced by 5419655 ! Note that I conducted my tests without screen connected to my Raspberries, so it used the composite output. Reading the commit diff, I wonder if the issue will be the same with HDMI. EDIT : Later versions don't lock up as quickly as 5419655. Maybe I've spoken too fast. |
I have spoken too fast. See issues #124 and #12 and commit e239f05. But there still is an issue because 38f05ee is 100% not affected. e239f05 corrects the most frequent bug introduced by 5419655, but there still is something wrong between 38f05ee and now. Maybe it's another bug of 5419655, maybe of a later commit... |
I did exactly what you asked* and I can confirm that it is better than 5419655 but still worse than 38f05ee. The behaviour is consistent with current master branch : it hangs after a little less than 300 iterations (I've seen 38f05ee work for more than 1500 iterations before I stopped it) So my working theory is that 5419655 is still responsible, and while e239f05 surely helps it doesn't fix everything... *Procedure for somebody else willing to test : If needed, I can upload binaries somewhere. |
If you are able to produce a patch that applies to head of tree that restores the 38f05ee reliability I'd be interested. I'll try to have a look at some point, but too busy right now. |
Yeah I understand you have better things to do. I'll continue my investigations, share my findings here, and hopefully I or somebody else will be able to produce a patch ;-) |
While hunting for the bug I tried some good ol' printf debugging... omxplayer hangs at the very end of the exit procedure, while doing
The simple solution (to avoid that line being executed) is to launch omxplayer with the --no-keys option. I've reached more than 3000 iterations without a hang (working with 0.3.5)! I have no clue as to why this simple action works 300 times but not the 301st. I'm also scratching my head over why the bug didn't affect 38f05ee and what difference 5419655 made to the keyboard handling. Maybe some weird memory leak, kernel or firmware bug? tl;dr: found a solution, use --no-keys. |
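For reference, a looping script that applies the --no-keys workaround might look like the sketch below. It is not the script used in the tests above: the function name `play_loop`, the `PLAYER` variable and the optional iteration count are assumptions added so the loop can be dry-run without a Raspberry Pi.

```shell
#!/bin/bash
# Hypothetical sketch of a playback loop with the keyboard listener disabled.
PLAYER="${PLAYER:-omxplayer}"   # assumption: overridable for dry runs

# play_loop FILE [COUNT]: play FILE COUNT times, or forever if COUNT is omitted.
play_loop() {
    local file="$1" count="${2:-}" i=0
    while [ -z "$count" ] || [ "$i" -lt "$count" ]; do
        # --no-keys is the workaround reported in this thread.
        "$PLAYER" --no-keys "$file"
        i=$((i + 1))
    done
}

# Example: play_loop /path/to/video.mp4
```

Passing a count (for example `play_loop video.mp4 300`) makes it easier to compare builds by the number of iterations they survive.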
@Tito1337: Awesome! |
I don't understand the connection of keyboard code and 5419655. |
I'm still investigating with printf debugging :D So I've traced it to the Keyboard thread : when everything should be shutting down at the end, it calls StopThread() and the line
that waits for the thread to finish, hangs, probably because of a lock that never gets released. I'm no expert but I think we face a race condition :
Maybe the simple order or timing differences introduced in 5419655 made this bug appear. I tried to use valgrind (with --tool=helgrind) to check for thread lockups but there is a known, wontfix, incompatibility with Raspbian. At this point I still don't know how to find where the lock is created but I would put my money on DBUS. I'll keep you updated if I find something |
I think the valgrind failure can be avoided by bypassing the accelerated memcpy library.
in /etc/ld.so.preload gdb might be a better tool for debugging this type of hang - you can run "thread apply all bt" to get the backtrace of all threads. You will find keyboard thread is blocked calling something... |
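The gdb suggestion above can be run non-interactively against an already hung player. The sketch below is an assumption about how one might wrap it: `dump_threads` is a hypothetical helper name, and attaching to a running process may require root.

```shell
#!/bin/bash
# Hypothetical helper: print a backtrace of every thread of a running process.
# dump_threads PID
dump_threads() {
    local pid="$1"
    # -p attaches to the process, -batch exits once the commands have run,
    # -ex runs the given gdb command.
    gdb -p "$pid" -batch -ex "thread apply all bt"
}

# Example (assumes exactly one omxplayer.bin process is running):
# dump_threads "$(pidof omxplayer.bin)"
```

Because gdb only attaches after the hang has occurred, playback itself runs at full speed, unlike a run under valgrind.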
Thanks for your help. If you hadn't guessed, I never used valgrind or gdb before! Your valgrind workaround seems to work, but it is very slow, so I don't know if I will ever obtain one of those problematic states. I also tried with gdb as you suggested, but there seems to be a similar issue:
Is there a workaround for that one? EDIT : found one
|
updated to work with the current omxplayer, since the keyboard handling caused omxplayer to crash (issue 181: https://github.com/popcornmix/omxplayer/issues/181)
@Tito1337 Thanks very much for your work on this. Omxplayer stability has been a persistent issue for me with omxplayer builds after July 2013 (when it was very reliable and could run in a loop for weeks on end)...but I have not been able to isolate and resolve the problem. Have you learned anything more about the issue? Recent builds have been more difficult...and using them my application will now frequently hang Videocore. Jamesh commented on my forum post (http://www.raspberrypi.org/forums/viewtopic.php?t=77808) that: "Videocore has run out of memory, something is allocating and not releasing. Difficult to tell if it's Videocore side or ARM side (The ARM can do stuff that causes Videocore allocations, but in those circumstances the ARM also need to deallocate)." I think what's happening is that omxplayer does not shut down cleanly (or at all) after playing a video in a number of situations. Sending SIGINT works in some cases to get omxplayer to shutdown/return...but not all...and then the only thing that works is sending a SIGKILL. I suspect the cumulative effect is Videocore hanging. |
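The SIGINT-then-SIGKILL escalation described above could be scripted roughly as follows. This is a sketch under assumptions: `stop_player` is a hypothetical helper name and the 5 second default grace period is arbitrary. Note that SIGKILL cannot be caught, so the killed process gets no chance to clean up after itself.

```shell
#!/bin/bash
# Hypothetical helper: ask a process to exit with SIGINT, then fall back
# to SIGKILL if it is still alive after a grace period.
# stop_player PID [GRACE_SECONDS]
stop_player() {
    local pid="$1" grace="${2:-5}" waited=0
    kill -INT "$pid" 2>/dev/null || return 0    # process is already gone
    while kill -0 "$pid" 2>/dev/null; do        # kill -0 only tests existence
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null || true   # SIGINT was not enough
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example (assumes exactly one omxplayer.bin process is running):
# stop_player "$(pidof omxplayer.bin)" 5
```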
Like I said before, the hang is due to a lock in the thread listening to keyboard inputs. I don't know what circumstances lock the thread, but there is a simple solution: disable the keyboard listener by using the --no-keys option. My investigations are totally stuck as I can't pinpoint the exact origin of the deadlock, but the --no-keys solution works 100% for the bug I was chasing, so maybe the VideoCore issue is another bug. |
This is my first post so apologies if I don't understand this fully. I have the same problem of Omxplayer stopping from time to time when called from a bash script. I have tried the command "omxplayer --no-keys file.mp4" and I get an error of "unrecognised option" |
Hi Bruddy, what version of omxplayer do you use? All recent versions should You can download the latest build (in the form of a .deb package) from (Note that this thread is for talking about this specific bug, not general support, so keep it clean ;-)) |
That may be my problem. I am using Raspbian and apt-get install and it says that I have the latest version but I presume their repositories are probably not fully up to date? |
@Bruddy |
@lampone1967 This is not the right place. Your issue is due to omxplayer-sync, not omxplayer itself. You can contact the original developer here: https://github.com/turingmachine/omxplayer-sync/issues?labels=question |
unfortunately the --no-keys option doesn't work here with Version : 6ee9a0a [master]. |
I face a similar problem. I am trying to play an MP4 video and it hangs at the end. The latest working build for me is http://omxplayer.sconde.net/builds/omxplayer_0.3.0~git20130729~efd1049_armhf.deb |
@revolunet: what exactly do you mean by "unfortunately the --no-keys option doesnt work here with Version : 6ee9a0a [master]."? @justinasjaronis: I hope the modification and commit for issue 266 work for you. Otherwise I would be interested in the video, too. |
Hey there, |
Dear @popcornmix and @turingmachine and @Tito1337 , |
Given the following bash script:
After some iterations (between 5 and 100) the screen stays black.
That's the console output on failure:
omxplayer and omxplayer.bin are running. The omxplayer.log only contains a correct shutdown sequence of the last run
attaching to the omxplayer.bin process with strace -f -F -p shows: