Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenMAX locked when seeking multiple times #824

Open
carlonluca opened this issue Jun 10, 2017 · 32 comments
Open

OpenMAX locked when seeking multiple times #824

carlonluca opened this issue Jun 10, 2017 · 32 comments

Comments

@carlonluca
Copy link

Hello! I'm experiencing a issue very similar to #449. I can reproduce this with my application, but I can also reproduce with omxplayer.
This is what I experience: repeating a seek command for long time results in a complete lock of the OpenMAX libraries, and often of the entire GPU. If I repeat seek for around 7 hours (but may be more or less), omxplayer freezes, the seek command does not return and the GPU is stuck.
I built omxplayer in debug and this is what I get when locked (I had to build a new gdb cause the one shipped with raspbian was crashing):

(gdb) t a a where

Thread 11 (Thread 0xaf3ae420 (LWP 8650)):
#0  0xb6b84a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0xb6b84af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0xb6dce918 in ilcs_execute_function_ex () from /opt/vc/lib/libopenmaxil.so
#3  0xb6dcf904 in ilcs_pass_buffer () from /opt/vc/lib/libopenmaxil.so
#4  0x00032e98 in COMXCoreComponent::EmptyThisBuffer (this=0x1cc8d8, omx_buffer=0x1bdbb0) at OMXCore.cpp:367
#5  0x00043968 in COMXAudio::AddPackets (this=0x1cbb38, data=0x755098, len=49152, dts=298666.66666666669, pts=298666.66666666669, frame_size=24576) at OMXAudio.cpp:963
#6  0x0004ca84 in OMXPlayerAudio::Decode (this=0xa6458 <m_player_audio>, pkt=0x1d7228) at OMXPlayerAudio.cpp:256
#7  0x0004cdc8 in OMXPlayerAudio::Process (this=0xa6458 <m_player_audio>) at OMXPlayerAudio.cpp:314
#8  0x00029938 in OMXThread::Run (arg=0xa6458 <m_player_audio>) at OMXThread.cpp:105
#9  0xb6b7de90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#10 0xb5b35598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 10 (Thread 0xaffff420 (LWP 8649)):
#0  0xb6b85780 in __lll_lock_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0xb6b801a4 in pthread_mutex_lock () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0x00047854 in OMXClock::Lock (this=0x1672e0) at OMXClock.cpp:63
#3  0x00048480 in OMXClock::OMXMediaTime (this=0x1672e0, lock=true) at OMXClock.cpp:329
#4  0x0004de98 in OMXPlayerSubtitles::__lambda6::operator() (__closure=0xafffed2c) at OMXPlayerSubtitles.cpp:180
#5  0x0004e6c4 in OMXPlayerSubtitles::RenderLoop (this=0xa66b8 <m_player_subtitles>, font_path=..., italic_font_path=..., font_size=0.0549999997, centered=false, ghost_box=true, 
    lines=3, clock=0x1672e0) at OMXPlayerSubtitles.cpp:286
#6  0x0004dcf4 in OMXPlayerSubtitles::Process (this=0xa66b8 <m_player_subtitles>) at OMXPlayerSubtitles.cpp:122
#7  0x00029938 in OMXThread::Run (arg=0xa66b8 <m_player_subtitles>) at OMXThread.cpp:105
#8  0xb6b7de90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#9  0xb5b35598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 9 (Thread 0xb07ff420 (LWP 8648)):
#0  0xb6b827a4 in pthread_cond_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x0004a140 in OMXPlayerVideo::Process (this=0xa6220 <m_player_video>) at OMXPlayerVideo.cpp:228
#2  0x00029938 in OMXThread::Run (arg=0xa6220 <m_player_video>) at OMXThread.cpp:105
#3  0xb6b7de90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#4  0xb5b35598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 8 (Thread 0xb11ff420 (LWP 8647)):
#0  0xb6b86730 in nanosleep () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0xb6b85504 in __pthread_enable_asynccancel () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 7 (Thread 0xb19ff420 (LWP 8646)):
#0  0xb6b84a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0xb6b84af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0xb6c21ca0 in vchiu_queue_pop () from /opt/vc/lib/libvchiq_arm.so
#3  0xb6dcedc8 in ilcs_task () from /opt/vc/lib/libopenmaxil.so
#4  0xb6c05cdc in vcos_thread_entry (arg=0x15d9b8) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#5  0xb6b7de90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
Backtrace stopped: Cannot access memory at address 0x20a250f8

Thread 6 (Thread 0xb2346420 (LWP 8645)):
---Type <return> to continue, or q <return> to quit---
#0  0xb6b827a4 in pthread_cond_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0xb6c05dc0 in _timer_thread (arg=0x15d948) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:722
#2  0xb6b7de90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
Backtrace stopped: Cannot access memory at address 0x18

Thread 5 (Thread 0xb2b46420 (LWP 8644)):
#0  0xb6b84a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0xb6b84af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0xb6dedbfc in cecservice_notify_func () from /opt/vc/lib/libbcm_host.so
#3  0xb6c05cdc in vcos_thread_entry (arg=0xb6e06a90 <cecservice_notify_task>) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#4  0xb6b7de90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#5  0xb5b35598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 4 (Thread 0xb3346420 (LWP 8643)):
#0  0xb6b84a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0xb6b84af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0xb6ded078 in tvservice_notify_func () from /opt/vc/lib/libbcm_host.so
#3  0xb6c05cdc in vcos_thread_entry (arg=0xb6e05d08 <tvservice_notify_task>) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#4  0xb6b7de90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#5  0xb5b35598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 3 (Thread 0xb3b46420 (LWP 8642)):
#0  0xb6b84a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0xb6b84af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0xb6df09b8 in dispmanx_notify_func () from /opt/vc/lib/libbcm_host.so
#3  0xb6c05cdc in vcos_thread_entry (arg=0xb6e078b8 <dispmanx_notify_task>) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#4  0xb6b7de90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#5  0xb5b35598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 2 (Thread 0xb4346420 (LWP 8641)):
#0  0xb5b2df2c in ioctl () from /lib/arm-linux-gnueabihf/libc.so.6
#1  0xb6c1f000 in completion_thread () from /opt/vc/lib/libvchiq_arm.so
#2  0xb6c05cdc in vcos_thread_entry (arg=0xb6c32448 <vchiq_instance+16>) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#3  0xb6b7de90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#4  0xb5b35598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (Thread 0xb4348000 (LWP 8640)):
#0  0xb6b84a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0xb6b84af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0xb6dce918 in ilcs_execute_function_ex () from /opt/vc/lib/libopenmaxil.so
#3  0xb6dcf6a8 in ilcs_execute_function () from /opt/vc/lib/libopenmaxil.so
#4  0xb6dcfffc in vcil_out_set () from /opt/vc/lib/libopenmaxil.so
#5  0x000353f0 in COMXCoreComponent::SetConfig (this=0x167310, configIndex=OMX_IndexConfigTimeScale, configStruct=0xbe956c78) at OMXCore.cpp:1181
#6  0x00048d00 in OMXClock::OMXSetSpeed (this=0x1672e0, speed=1000, lock=false, pause_resume=true) at OMXClock.cpp:499
#7  0x00048b5c in OMXClock::OMXResume (this=0x1672e0, lock=true) at OMXClock.cpp:469
#8  0x00081b18 in main (argc=2, argv=0xbe957704) at omxplayer.cpp:1726

It seems that once again calling the OpenMAX function locks into ilcs_execute_function_ex like in
#449, but in this case the queue is empty. In my tests it seems this typically happens in SetConfig, resuming from the buffering procedure. I have the same identical behaviour in my application, which is very similar to omxplayer.

I can reproduce in a few hours by running omxplayer and executing a script like this:

while [ true ]; do ./dbuscontrol.sh setposition 0; sleep 1; done

This is a huge problem as not only the application freezes, but also the board needs a complete reboot.

@popcornmix
Copy link
Contributor

Can you point to a file that is confirmed to hit this problem? (e.g link to a specific Big Buck Bunny or a sample you have tested).
Just in case it only occurs with certain codecs or timestamps or similar.

@carlonluca
Copy link
Author

Thank you for your answer. I could reproduce with other video files, but the test I reported above was run with the "standard" big buck bunny file from the website: http://download.blender.org/peach/bigbuckbunny_movies/big_buck_bunny_1080p_h264.mov.

I just reproduced the condition again, took me more hours, probably around 24. In this case I ran using start_debug=1 and vcdbg log assert returned:

001229.004: assert( source ) failed; ../../../../../middleware/confzilla/cp_front_fdt.c::cp_front_fdt_load_builtin line 115 rev 3ca4cf4
vcdbg_ctx_get_dump_stack: dump_stack failed
----------------
4161818.063: assert( !edid->RxVSDB.Audio || edid->BcmSupportedAudioFormats[EDID_AudioFormat_ePCM].Supported == 1 ) failed; ../../../../../middleware/hdmi/edid.c::hdmi_edid_initialise line 694 rev 3ca4cf4
vcdbg_ctx_get_dump_stack: dump_stack failed
----------------
3342174.825: assert( !edid->RxVSDB.Audio || edid->BcmSupportedAudioFormats[EDID_AudioFormat_ePCM].Supported == 1 ) failed; ../../../../../middleware/hdmi/edid.c::hdmi_edid_initialise line 694 rev 3ca4cf4
vcdbg_ctx_get_dump_stack: dump_stack failed

Now I killed omxplayer and it can't start again without a reboot. Is there anything else I can do to get some more info about this condition before I reboot?
Thanks!

@popcornmix
Copy link
Contributor

Can you run "sudo vcdbg log ex"

@carlonluca
Copy link
Author

pi@raspberrypi:~ $ sudo vcdbg log ex
No exceptions found

@dimovnike
Copy link

I am experiencing the same issue with a custom application and omxplayer.

@carlonluca
Copy link
Author

@popcornmix were you able to reproduce the issue by any chance? I forgot to say that I'm currently testing with Pi1. Not sure if this happens also in Pi2 and Pi3.

@6by9
Copy link

6by9 commented Jun 27, 2017

I've reproduced this with the VideoCore debugger attached.
The audio render component is busy waiting for the DMA channel for audio to go idle, and that isn't being signalled. I foolishly made it run on without noting the full DMA state, so now have to wait to reproduce it again to get more information :-(
Oh, and that's on a Pi3, so almost certainly independent of Pi variant.

@dimovnike
Copy link

We run many PI's and we had this bug since PI1 (PI B) so I believe that its independent of the PI variant.

@carlonluca
Copy link
Author

@6by9 I appreciate you took the time to have a look at this! Did you succeed in reproducing the issue again by any chance? How much did it take for you to reproduce the first time on Pi3? I guess #449 was different from this, but in that case it was much quicker to reproduce on Pi1 and harder on Pi3 or when overclock was applied. Maybe switching to a Pi1 may help fixing this as well?

@6by9
Copy link

6by9 commented Jul 5, 2017

Yes, I think we know what is happening, although it's only a hypothesis on the cause rather than proof. I had to reproduce the failure several times to get sufficient information, which generally meant leaving it overnight so it's taken a while.

I've a local fix that I left running last night. If it is still running when I get in then I'll suggest popcornmix makes a test firmware with it in for you to try out. The only niggle in my testing is that my monitor doesn't actually have speakers to tell for sure that the audio is still working!

AIUI This is going to be independent of flavour of Pi - it's timing in the DMA for HDMI audio, and that is common across all of them. #449 was the IPC doing the wrong thing, so the ARM spec had an effect.

@pelwell
Copy link
Contributor

pelwell commented Jul 5, 2017

amixer cset numid=3 1 will force on-board audio out through the headphone jack - you can switch it mid-stream. amixer cset numid=3 0 routes it to HDMI. amixer cset numid=3 2 enables auto mode, which uses HDMI if the cable is plugged in, resorting to headphones if not.

@6by9
Copy link

6by9 commented Jul 5, 2017

Locked up in what appears to be a different way and the debugger failed. Need to kick it off again :-(

@carlonluca
Copy link
Author

carlonluca commented Jul 7, 2017

Please let me know if there is anything I can try to do to assist you.

@6by9
Copy link

6by9 commented Jul 11, 2017

Sorry for the delay - it doesn't help that this takes so long to reproduce.

I have seen a different issue with the fix where omxplayer has stalled, but everything on the GPU looks fine, and it is still responding normally. Whilst that instance of omxplayer appears to have locked up (not responding to the dbus commands), killing and restarting it works fine.
I've got it running again and will be digging deeper into what has stalled if I can reproduce it again.

The current change has been merged internally, so should be in the next rpi-update.

@dimovnike
Copy link

dimovnike commented Jul 11, 2017

When can we expect the current the fix in rpi-update?

popcornmix added a commit that referenced this issue Jul 11, 2017
firmware: usb: Force MSD app to use CM3 pin conf

firmware: IL ISP: Fix typo in logging
firmware: IL ISP: Add black level and lens shading controls
firmware: isp: Correct ISP Bayer stride calcs for supported formats
firmware: Remove unused duplicate versions of vc_sm_defs.h
firmware: IL camera: add get_parameter for OMX_IndexConfigCustomAwbGains
firmware: IL resize: Support get_parameter OMX_IndexConfigCommonInputCro
firmware: MMAL/RIL: Add MMAL_PARAMETER_RESIZE_PARAMS / OMX_IndexParamResize mapping
firmware: ISP IL: Add lresize output
firmware: MMAL/RIL: Add mapping for OMX_IndexConfigCommon[In|Out]putCrop
firmware: MMAL/RIL: Correct handling of MMAL_PARAMETER_VIDEO_SOURCE_PATTERN
firmware: IL ISP: Add H & V flip support
firmware: IL ISP: Implement OMX_IndexConfigCommonInputCrop

firmware: imx219: Refactor exposure calculations

firmware: dmalib: Stop spinning on dma_pause if END is signalled
See: #824
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this issue Jul 11, 2017
firmware: usb: Force MSD app to use CM3 pin conf

firmware: IL ISP: Fix typo in logging
firmware: IL ISP: Add black level and lens shading controls
firmware: isp: Correct ISP Bayer stride calcs for supported formats
firmware: Remove unused duplicate versions of vc_sm_defs.h
firmware: IL camera: add get_parameter for OMX_IndexConfigCustomAwbGains
firmware: IL resize: Support get_parameter OMX_IndexConfigCommonInputCro
firmware: MMAL/RIL: Add MMAL_PARAMETER_RESIZE_PARAMS / OMX_IndexParamResize mapping
firmware: ISP IL: Add lresize output
firmware: MMAL/RIL: Add mapping for OMX_IndexConfigCommon[In|Out]putCrop
firmware: MMAL/RIL: Correct handling of MMAL_PARAMETER_VIDEO_SOURCE_PATTERN
firmware: IL ISP: Add H & V flip support
firmware: IL ISP: Implement OMX_IndexConfigCommonInputCrop

firmware: imx219: Refactor exposure calculations

firmware: dmalib: Stop spinning on dma_pause if END is signalled
See: raspberrypi/firmware#824
@popcornmix
Copy link
Contributor

rpi-update is pushed.

@6by9
Copy link

6by9 commented Jul 11, 2017

I've just caught the stall I'd seen overnight again - it looks to be my network locking up, nothing to do with omxplayer or the GPU.
I was running the seek script over SSH, and the video was being read over NFS.
Lose the network and you lose the seeks. omxplayer then plays on until it needs to read more data for the file, at which point it stalls.

New test set up to run overnight with everything local to the Pi.

@dimovnike
Copy link

great news! Thank you. Also a queston, do I understand it correctly that the fix is in one of these files below?

git diff --name-only 400f6d196503e..a4e1434bfa3f
bootcode.bin
fixup.dat
fixup_cd.dat
fixup_db.dat
fixup_x.dat
start.elf
start_cd.elf
start_db.elf
start_x.elf

because only these change since the last commit.

@popcornmix
Copy link
Contributor

Yes

@6by9
Copy link

6by9 commented Jul 17, 2017

My test setup is still misbehaving. Has anyone else tried the new firmware and got any improvement?

@carlonluca
Copy link
Author

Yes, I'm testing it and seems fine so far. More than 4 days and omxplayer is still running properly. I have other pi's running tests and so far no lock. What misbehaviour did you notice?

@6by9
Copy link

6by9 commented Jul 17, 2017

@carlonluca Thanks for the response, and that is good news.

I was running with the debugger attached but that has been known to produce odd results of itself.
Whereas before I was seeing the player lock up with the time overlay on display, the recent lockups were on a totally black screen, and generally happened within 12 hours (I left it overnight). Stopping the debugger generally resulted in the debugger crashing and taking the Pi with it :-(
I was also running with a firmware debug build, although that bit shouldn't make a huge difference.

I'll revert my test rig to running without the debugger overnight and see what I get.

@carlonluca
Copy link
Author

carlonluca commented Jul 17, 2017

Generally happened within 12/24 hours yes. The update seems very good so far. Need more time anyway for precise results.

@carlonluca
Copy link
Author

Hello @6by9, do you have any news about this topic by any chance? Unfortunately it seems a few locks still occur. I just reproduced on omxplayer (unfortunately not a debug build):

Thread 11 (Thread 0x74339420 (LWP 3634)):
#0 0x75b7df2c in ioctl () at ../sysdeps/unix/syscall-template.S:81
#1 0x76c70010 in completion_thread () from /opt/vc/lib/libvchiq_arm.so
#2 0x76c56cc4 in vcos_thread_entry (arg=0x76c83318 <vchiq_instance+16>) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#3 0x76bcee90 in start_thread (arg=0x74339420) at pthread_create.c:311
#4 0x75b85598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 10 (Thread 0x739ff420 (LWP 3635)):
#0 0x76bd5a40 in do_futex_wait (isem=isem@entry=0x76e586e8 <dispmanx_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
#1 0x76bd5af4 in __new_sem_wait (sem=0x76e586e8 <dispmanx_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:69
#2 0x76e42140 in dispmanx_notify_func () from /opt/vc/lib/libbcm_host.so
#3 0x76c56cc4 in vcos_thread_entry (arg=0x76e59428 <dispmanx_notify_task>) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#4 0x76bcee90 in start_thread (arg=0x739ff420) at pthread_create.c:311
#5 0x75b85598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 9 (Thread 0x731ff420 (LWP 3636)):
#0 0x76bd5a40 in do_futex_wait (isem=isem@entry=0x76e57864 <tvservice_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
#1 0x76bd5af4 in __new_sem_wait (sem=0x76e57864 <tvservice_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:69
#2 0x76e3e078 in tvservice_notify_func () from /opt/vc/lib/libbcm_host.so
#3 0x76c56cc4 in vcos_thread_entry (arg=0x76e57878 <tvservice_notify_task>) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#4 0x76bcee90 in start_thread (arg=0x731ff420) at pthread_create.c:311
#5 0x75b85598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 8 (Thread 0x729ff420 (LWP 3637)):
#0 0x76bd5a40 in do_futex_wait (isem=isem@entry=0x76e585ec <cecservice_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
#1 0x76bd5af4 in __new_sem_wait (sem=0x76e585ec <cecservice_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:69
#2 0x76e3ec34 in cecservice_notify_func () from /opt/vc/lib/libbcm_host.so
#3 0x76c56cc4 in vcos_thread_entry (arg=0x76e58600 <cecservice_notify_task>) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#4 0x76bcee90 in start_thread (arg=0x729ff420) at pthread_create.c:311
#5 0x75b85598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 7 (Thread 0x721ff420 (LWP 3638)):
#0 0x76bd37a4 in __pthread_cond_wait (cond=cond@entry=0x207f968, mutex=mutex@entry=0x207f94c) at pthread_cond_wait.c:187
#1 0x76c56da8 in _timer_thread (arg=0x207f948) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:722
#2 0x76bcee90 in start_thread (arg=0x721ff420) at pthread_create.c:311
#3 0x75b85598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 6 (Thread 0x719ff420 (LWP 3639)):
#0 0x76bd5a40 in do_futex_wait (isem=isem@entry=0x207fafc) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
#1 0x76bd5af4 in __new_sem_wait (sem=0x207fafc) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:69
#2 0x76c72b60 in vchiu_queue_pop () from /opt/vc/lib/libvchiq_arm.so
#3 0x76e1fd70 in ilcs_task () from /opt/vc/lib/libopenmaxil.so
#4 0x76c56cc4 in vcos_thread_entry (arg=0x207f9b8) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#5 0x76bcee90 in start_thread (arg=0x719ff420) at pthread_create.c:311
#6 0x75b85598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 5 (Thread 0x711ff420 (LWP 3640)):
#0 0x76bd7730 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1 0x0004f5e0 in ?? ()
#2 0x0004fcd0 in ?? ()
#3 0x000258dc in OMXThread::Run(void*) ()
#4 0x76bcee90 in start_thread (arg=0x711ff420) at pthread_create.c:311
#5 0x75b85598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 4 (Thread 0x706ff420 (LWP 3641)):
#0 0x76bd5a40 in do_futex_wait (isem=isem@entry=0x207fb84) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
#1 0x76bd5af4 in __new_sem_wait (sem=0x207fb84) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:69
#2 0x76e1f8bc in ilcs_execute_function_ex () from /opt/vc/lib/libopenmaxil.so
---Type <return> to continue, or q <return> to quit---
#3 0x76e208a4 in ilcs_pass_buffer () from /opt/vc/lib/libopenmaxil.so
#4 0x0002b194 in ?? ()
#5 0x000320f4 in ?? ()
#6 0x00039624 in ?? ()
#7 0x00039740 in ?? ()
#8 0x000258dc in OMXThread::Run(void*) ()
#9 0x76bcee90 in start_thread (arg=0x706ff420) at pthread_create.c:311
#10 0x75b85598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 3 (Thread 0x6feff420 (LWP 3642)):
#0 0x76bd3b30 in __pthread_cond_timedwait (cond=0x69bc0 <m_player_subtitles+184>, mutex=0x69ba8 <m_player_subtitles+160>, abstime=0x6fefecc8) at pthread_cond_timedwait.c:199
#1 0x0003bd64 in ?? ()
#2 0x0003cb90 in ?? ()
#3 0x000258dc in OMXThread::Run(void*) ()
#4 0x76bcee90 in start_thread (arg=0x6feff420) at pthread_create.c:311
#5 0x75b85598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 2 (Thread 0x6f36d420 (LWP 3643)):
#0 0x76bd7730 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1 0x00038fcc in ?? ()
#2 0x0003ab18 in ?? ()
#3 0x0003ad38 in ?? ()
#4 0x000258dc in OMXThread::Run(void*) ()
#5 0x76bcee90 in start_thread (arg=0x6f36d420) at pthread_create.c:311
#6 0x75b85598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (Thread 0x7433b000 (LWP 3633)):
#0 0x76bd6780 in lll_lock_wait (futex=futex@entry=0x69fc8 <m_player_video+320>, private=<optimized out>) at ../ports/sysdeps/unix/sysv/linux/arm/nptl/lowlevellock.c:46
#1 0x76bd11a4 in GI_pthread_mutex_lock (mutex=0x69fc8 <m_player_video+320>) at pthread_mutex_lock.c:79
#2 0x0003981c in ?? ()
#3 0x00012aa4 in ?? ()
#4 0x00017048 in ?? ()
#5 0x75acc294 in libc_start_main (Cannot access memory at address 0x0
main=0x7e9ca704, argc=1975455744, argv=0x75acc294 <__libc_start_main+276>, init=<optimized out>, fini=0x517cc, rtld_fini=0x76efd408 <_dl_fini>, stack_end=0x7e9ca704) at libc-start.c:287
#6 0x0001826c in ?? ()
Cannot access memory at address 0x0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Again dbus seems stuck during SetPosition:

pi 3625 0.0 0.2 3032 2084 ? S Aug02 0:00 /bin/bash /usr/bin/omxplayer /home/pi/bbb.mov
pi 3633 30.6 3.4 159012 26028 ? Sl Aug02 816:03 /usr/bin/omxplayer.bin /home/pi/bbb.mov
pi 4646 0.2 0.3 3188 2372 ? S Aug02 5:30 /bin/bash ./omxplayer_seek.sh
pi 12137 0.0 0.0 2284 472 ? S 15:32 0:00 dbus-send --print-reply=literal --session --dest=org.mpris.MediaPlayer2.omxplayer /org/mpris/MediaPlayer2 org.mpris.MediaPlayer2.Player SetPosition objpath /not/used int64 276000000

took a couple of days to reproduce. The only difference here is that I changed my seek script to something like this (unsure if this is relevant or not):

while [ true ]; do ./dbuscontrol.sh setposition $((RANDOM%540*1000000)); sleep 1; done

It seems however that restarting omxplayer is still possible now without rebooting first. Do you have any idea?
Thank you for your help.

@carlonluca
Copy link
Author

I was able to reproduce again. This time omxplayer includes the symbols:

(gdb) t a a where

Thread 11 (Thread 0x6f134420 (LWP 10602)):
#0  0x76c08a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x76c08af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0x76e5e8bc in ilcs_execute_function_ex () from /opt/vc/lib/libopenmaxil.so
#3  0x76e5f648 in ilcs_execute_function () from /opt/vc/lib/libopenmaxil.so
#4  0x76e5ff9c in vcil_out_set () from /opt/vc/lib/libopenmaxil.so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 10 (Thread 0x6feff420 (LWP 10601)):
#0  0x76c06b30 in pthread_cond_timedwait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x0004d81c in __gthread_cond_timedwait (__cond=0xa6770 <m_player_subtitles+184>, __mutex=0xa6758 <m_player_subtitles+160>, __abs_timeout=0x6fefeb20)
    at /opt/pi5.8_custom/gcc-linaro-arm-linux-gnueabihf-raspbian/arm-linux-gnueabihf/include/c++/4.8.3/arm-linux-gnueabihf/bits/gthr-default.h:871
#2  0x0005c668 in std::condition_variable::__wait_until_impl<std::chrono::duration<long long, std::ratio<1ll, 1000000000ll> > > (this=0xa6770 <m_player_subtitles+184>, 
    __lock=..., __atime=...) at /opt/pi5.8_custom/gcc-linaro-arm-linux-gnueabihf-raspbian/arm-linux-gnueabihf/include/c++/4.8.3/condition_variable:160
#3  0x00059d38 in std::condition_variable::wait_until<std::chrono::duration<long long, std::ratio<1ll, 1000000000ll> > > (this=0xa6770 <m_player_subtitles+184>, __lock=..., 
    __atime=...) at /opt/pi5.8_custom/gcc-linaro-arm-linux-gnueabihf-raspbian/arm-linux-gnueabihf/include/c++/4.8.3/condition_variable:100
#4  0x0004feb4 in std::condition_variable::wait_until<std::chrono::_V2::system_clock, std::chrono::duration<long long int, std::ratio<1ll, 1000000000ll> >, Mailbox<Ts>::receive_wait(const std::chrono::duration<_Rep, _Period>&, Funs&& ...) [with Rep = long long int; Period = std::ratio<1ll, 1000ll>; Funs = {OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda9, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda10, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda11, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda12, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda13, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda14, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda15, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda16}; Ts = {OMXPlayerSubtitles::Message::Stop, OMXPlayerSubtitles::Message::Flush, OMXPlayerSubtitles::Message::Push, OMXPlayerSubtitles::Message::Touch, OMXPlayerSubtitles::Message::SetPaused, OMXPlayerSubtitles::Message::SetDelay, OMXPlayerSubtitles::Message::DisplayText, OMXPlayerSubtitles::Message::SetRect}]::__lambda4>(std::unique_lock<std::mutex> &, const std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long long, std::ratio<1ll, 1000000000ll> > > &, Mailbox<OMXPlayerSubtitles::Message::Stop, OMXPlayerSubtitles::Message::Flush, OMXPlayerSubtitles::Message::Push, OMXPlayerSubtitles::Message::Touch, OMXPlayerSubtitles::Message::SetPaused, OMXPlayerSubtitles::Message::SetDelay, OMXPlayerSubtitles::Message::DisplayText, OMXPlayerSubtitles::Message::SetRect>::__lambda4) (this=0xa6770 <m_player_subtitles+184>, __lock=..., __atime=..., __p=...)
    at /opt/pi5.8_custom/gcc-linaro-arm-linux-gnueabihf-raspbian/arm-linux-gnueabihf/include/c++/4.8.3/condition_variable:123
#5  0x0004fbdc in std::condition_variable::wait_for<long long int, std::ratio<1ll, 1000ll>, Mailbox<Ts>::receive_wait(const std::chrono::duration<_Rep, _Period>&, Funs&& ...) [with Rep = long long int; Period = std::ratio<1ll, 1000ll>; Funs = {OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda9, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda10, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda11, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda12, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda13, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda14, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda15, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda16}; Ts = {OMXPlayerSubtitles::Message::Stop, OMXPlayerSubtitles::Message::Flush, OMXPlayerSubtitles::Message::Push, OMXPlayerSubtitles::Message::Touch, OMXPlayerSubtitles::Message::SetPaused, OMXPlayerSubtitles::Message::SetDelay, OMXPlayerSubtitles::Message::DisplayText, OMXPlayerSubtitles::Message::SetRect}]::__lambda4>(std::unique_lock<std::mutex> &, const std::chrono::duration<long long, std::ratio<1ll, 1000ll> > &, Mailbox<OMXPlayerSubtitles::Message::Stop, OMXPlayerSubtitles::Message::Flush, OMXPlayerSubtitles::Message::Push, OMXPlayerSubtitles::Message::Touch, OMXPlayerSubtitles::Message::SetPaused, OMXPlayerSubtitles::Message::SetDelay, OMXPlayerSubtitles::Message::DisplayText, OMXPlayerSubtitles::Message::SetRect>::__lambda4) (this=0xa6770 <m_player_subtitles+184>, 
    __lock=..., __rtime=..., __p=...) at /opt/pi5.8_custom/gcc-linaro-arm-linux-gnueabihf-raspbian/arm-linux-gnueabihf/include/c++/4.8.3/condition_variable:139
#6  0x0004f904 in Mailbox<OMXPlayerSubtitles::Message::Stop, OMXPlayerSubtitles::Message::Flush, OMXPlayerSubtitles::Message::Push, OMXPlayerSubtitles::Message::Touch, OMXPlayerSubtitles::Message::SetPaused, OMXPlayerSubtitles::Message::SetDelay, OMXPlayerSubtitles::Message::DisplayText, OMXPlayerSubtitles::Message::SetRect>::receive_wait<long long int, std::ratio<1ll, 1000ll>, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda9, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda10, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda11, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda12, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda13, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda14, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda15, OMXPlayerSubtitles::RenderLoop(const string&, const string&, float, bool, bool, unsigned int, OMXClock*)::__lambda16>(const std::chrono::duration<long long, std::ratio<1ll, 1000ll> > &) (this=0xa6730 <m_player_subtitles+120>, 
    rel_time=...) at utils/Mailbox.h:76
#7  0x0004e690 in OMXPlayerSubtitles::RenderLoop (this=0xa66b8 <m_player_subtitles>, font_path=..., italic_font_path=..., font_size=0.0549999997, centered=false, ghost_box=true, 
    lines=3, clock=0xb3d2e0) at OMXPlayerSubtitles.cpp:282
---Type <return> to continue, or q <return> to quit---
#8  0x0004dcf4 in OMXPlayerSubtitles::Process (this=0xa66b8 <m_player_subtitles>) at OMXPlayerSubtitles.cpp:122
#9  0x00029938 in OMXThread::Run (arg=0xa66b8 <m_player_subtitles>) at OMXThread.cpp:105
#10 0x76c01e90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#11 0x75bba598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 9 (Thread 0x706ff420 (LWP 10600)):
#0  0x76c067a4 in pthread_cond_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x0004a140 in OMXPlayerVideo::Process (this=0xa6220 <m_player_video>) at OMXPlayerVideo.cpp:228
#2  0x00029938 in OMXThread::Run (arg=0xa6220 <m_player_video>) at OMXThread.cpp:105
#3  0x76c01e90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#4  0x75bba598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 8 (Thread 0x711ff420 (LWP 10599)):
#0  0x76c0a730 in nanosleep () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x76c09504 in __pthread_enable_asynccancel () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 7 (Thread 0x719ff420 (LWP 10598)):
#0  0x76c08a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x76c08af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0x76ca4b60 in vchiu_queue_pop () from /opt/vc/lib/libvchiq_arm.so
#3  0x76e5ed70 in ilcs_task () from /opt/vc/lib/libopenmaxil.so
#4  0x76c88cc4 in vcos_thread_entry (arg=0xb339b8) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#5  0x76c01e90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
Backtrace stopped: Cannot access memory at address 0x25ea4ff8

Thread 6 (Thread 0x721ff420 (LWP 10597)):
#0  0x76c067a4 in pthread_cond_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x76c88da8 in _timer_thread (arg=0xb33948) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:722
#2  0x76c01e90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
Backtrace stopped: Cannot access memory at address 0x18

Thread 5 (Thread 0x729ff420 (LWP 10596)):
#0  0x76c08a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x76c08af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0x76e7dc34 in cecservice_notify_func () from /opt/vc/lib/libbcm_host.so
#3  0x76c88cc4 in vcos_thread_entry (arg=0x76e97600 <cecservice_notify_task>) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#4  0x76c01e90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#5  0x75bba598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 4 (Thread 0x731ff420 (LWP 10595)):
#0  0x76c08a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x76c08af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0x76e7d078 in tvservice_notify_func () from /opt/vc/lib/libbcm_host.so
#3  0x76c88cc4 in vcos_thread_entry (arg=0x76e96878 <tvservice_notify_task>) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#4  0x76c01e90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#5  0x75bba598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

---Type <return> to continue, or q <return> to quit---
Thread 3 (Thread 0x739ff420 (LWP 10594)):
#0  0x76c08a40 in do_futex_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x76c08af4 in sem_wait@@GLIBC_2.4 () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0x76e81140 in dispmanx_notify_func () from /opt/vc/lib/libbcm_host.so
#3  0x76c88cc4 in vcos_thread_entry (arg=0x76e98428 <dispmanx_notify_task>) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#4  0x76c01e90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#5  0x75bba598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 2 (Thread 0x743d4420 (LWP 10593)):
#0  0x75bb2f2c in ioctl () from /lib/arm-linux-gnueabihf/libc.so.6
#1  0x76ca2010 in completion_thread () from /opt/vc/lib/libvchiq_arm.so
#2  0x76c88cc4 in vcos_thread_entry (arg=0x76cb5318 <vchiq_instance+16>) at /home/dc4/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#3  0x76c01e90 in start_thread () from /lib/arm-linux-gnueabihf/libpthread.so.0
#4  0x75bba598 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (Thread 0x743d6410 (LWP 10592)):
#0  0x76c09780 in __lll_lock_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x76c043c4 in pthread_mutex_lock () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0x0003e9a4 in CCriticalSection::Lock (this=0xba2b2c) at utils/SingleLock.h:40
#3  0x0003e9f8 in CSingleLock::CSingleLock (this=0x7ef3cb08, cs=...) at utils/SingleLock.h:55
#4  0x00044080 in COMXAudio::GetDelay (this=0xba1b38) at OMXAudio.cpp:1066
#5  0x0004d5a8 in OMXPlayerAudio::GetDelay (this=0xa6458 <m_player_audio>) at OMXPlayerAudio.cpp:472
#6  0x00081490 in main (argc=2, argv=0x7ef3d5a4) at omxplayer.cpp:1681

again dbus call to SetPosition is hanging. Took me 10/15 hours to reproduce this time.
The backtrace seems to suggest something a bit different this time. This run was started after the previous lock without rebooting. Again, after this lock GPU seems ok and omxplayer can be started again.

@6by9
Copy link

6by9 commented Aug 5, 2017

I haven't been looking into this for a while as your previous response had been that you hadn't managed to reproduce a lockup.
I think the best course of action is for me to return to getting the system dump and offline analysis tools back up and running. If I can do that then it should be possible to dump your locked up system to a file and us analyse what state the GPU is really in.

@carlonluca
Copy link
Author

It would be awesome, thank you.

@carlonluca
Copy link
Author

For some unknown reason it seems I'm now able to reproduce in less than 24 hours again. I'm trying to determine why, but can you confirm, just to be safe, that I'm testing on a firmware which is including your latest patches?

pi@raspberrypi:~ $ sudo vcgencmd version
Jul 11 2017 12:31:25 
Copyright (c) 2012 Broadcom
version aabbc2f1c981c560a6549fba5e3a0425145f8e02 (tainted) (release)

@dimovnike
Copy link

Hello, i have the same question about the firmware. How to make sure all fixes are in effect? It would be great if you could provide with checksums to important files. Thanks.

@6by9
Copy link

6by9 commented Aug 7, 2017

The commit was tagged against this issue, so #824 (reference) showed up in this issue log.

Commited to the firmware and Hexxeh/rpi-firmware repos on 11 July.
firmware: dmalib: Stop spinning on dma_pause if END is signalled

@carlonluca
Copy link
Author

@6by9 I suspect that you actually fixed an issue with your latest patch, but I suspect another issue is still there. I keep getting a similar issue, but at this point it seems that the GPU is not stuck. Killing omxplayer and starting it again is sufficient to make it run properly. No more reboot needed. This is another backtrace that I dumped when omxplayer was stuck:

(gdb) bt
#0  0x76c0d780 in __lll_lock_wait (futex=futex@entry=0xb7240 <m_player_video+320>, private=<optimized out>) at ../ports/sysdeps/unix/sysv/linux/arm/nptl/lowlevellock.c:46
#1  0x76c081a4 in __GI___pthread_mutex_lock (Cannot access memory at address 0x0
mutex=0xb7240 <m_player_video+320>) at pthread_mutex_lock.c:79
#2  0x00051844 in ?? ()
Cannot access memory at address 0x0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) t a a where

Thread 11 (Thread 0x743d8420 (LWP 16299)):
#0  0x75bb6f2c in ioctl () at ../sysdeps/unix/syscall-template.S:81
#1  0x76cad7d8 in completion_thread (arg=0x76cbf248 <vchiq_instance>) at /home/pi/userland/interface/vchiq_arm/vchiq_lib.c:1540
#2  0x76c8d18c in vcos_thread_entry (arg=0x76cbf258 <vchiq_instance+16>) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#3  0x76c05e90 in start_thread (arg=0x743d8420) at pthread_create.c:311
#4  0x75bbe598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 10 (Thread 0x739ff420 (LWP 16300)):
#0  0x76c0ca40 in do_futex_wait (isem=isem@entry=0x76ea6428 <dispmanx_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
#1  0x76c0caf4 in __new_sem_wait (sem=0x76ea6428 <dispmanx_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:69
#2  0x76e8e704 in vcos_event_wait (event=0x76ea6410 <dispmanx_notify_available_event>) at /home/pi/userland/interface/vcos/pthreads/vcos_platform.h:593
#3  0x76e904e8 in dispmanx_notify_func (arg=0x0) at /home/pi/userland/interface/vmcs_host/vc_vchi_dispmanx.c:1294
#4  0x76c8d18c in vcos_thread_entry (arg=0x76ea6438 <dispmanx_notify_task>) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#5  0x76c05e90 in start_thread (arg=0x739ff420) at pthread_create.c:311
#6  0x75bbe598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 9 (Thread 0x731ff420 (LWP 16301)):
#0  0x76c0ca40 in do_futex_wait (isem=isem@entry=0x76ea4854 <tvservice_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
#1  0x76c0caf4 in __new_sem_wait (sem=0x76ea4854 <tvservice_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:69
#2  0x76e86b24 in vcos_event_wait (event=0x76ea483c <tvservice_notify_available_event>) at /home/pi/userland/interface/vcos/pthreads/vcos_platform.h:593
#3  0x76e88598 in tvservice_notify_func (arg=0x76ea3b48 <tvservice_client>) at /home/pi/userland/interface/vmcs_host/vc_vchi_tvservice.c:701
#4  0x76c8d18c in vcos_thread_entry (arg=0x76ea4868 <tvservice_notify_task>) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#5  0x76c05e90 in start_thread (arg=0x731ff420) at pthread_create.c:311
#6  0x75bbe598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 8 (Thread 0x729ff420 (LWP 16302)):
#0  0x76c0ca40 in do_futex_wait (isem=isem@entry=0x76ea55f4 <cecservice_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
#1  0x76c0caf4 in __new_sem_wait (sem=0x76ea55f4 <cecservice_notify_available_event+24>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:69
#2  0x76e8b494 in vcos_event_wait (event=0x76ea55dc <cecservice_notify_available_event>) at /home/pi/userland/interface/vcos/pthreads/vcos_platform.h:593
#3  0x76e8c978 in cecservice_notify_func (arg=0x76ea4950 <cecservice_client>) at /home/pi/userland/interface/vmcs_host/vc_vchi_cecservice.c:560
#4  0x76c8d18c in vcos_thread_entry (arg=0x76ea5608 <cecservice_notify_task>) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#5  0x76c05e90 in start_thread (arg=0x729ff420) at pthread_create.c:311
#6  0x75bbe598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 7 (Thread 0x721ff420 (LWP 16303)):
#0  0x76c0a7a4 in __pthread_cond_wait (cond=0x148f968, mutex=0x148f94c) at pthread_cond_wait.c:187
#1  0x76c8e670 in _timer_thread (arg=0x148f948) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:722
#2  0x76c05e90 in start_thread (arg=0x721ff420) at pthread_create.c:311
#3  0x75bbe598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 6 (Thread 0x719ff420 (LWP 16304)):
#0  0x76c0ca40 in do_futex_wait (isem=isem@entry=0x148fafc) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
#1  0x76c0caf4 in __new_sem_wait (sem=0x148fafc) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:69
#2  0x76cae454 in vcos_event_wait (event=0x148fae4) at /home/pi/userland/interface/vcos/pthreads/vcos_platform.h:593
#3  0x76cae87c in vchiu_queue_pop (queue=0x148fab0) at /home/pi/userland/interface/vchiq_arm/vchiq_util.c:102
#4  0x76e696b4 in ilcs_process_message (st=0x148f930, block=1) at /home/pi/userland/interface/vmcs_host/vcilcs.c:490
#5  0x76e6955c in ilcs_task (param=0x148f930) at /home/pi/userland/interface/vmcs_host/vcilcs.c:440
#6  0x76c8d18c in vcos_thread_entry (arg=0x148f9b8) at /home/pi/userland/interface/vcos/pthreads/vcos_pthreads.c:144
#7  0x76c05e90 in start_thread (arg=0x719ff420) at pthread_create.c:311
---Type <return> to continue, or q <return> to quit---
#8  0x75bbe598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 5 (Thread 0x711ff420 (LWP 16305)):
#0  0x76c0e730 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00080da8 in ?? ()
#2  0x00080f78 in ?? ()
#3  0x00031bb4 in OMXThread::Run(void*) ()
#4  0x76c05e90 in start_thread (arg=0x711ff420) at pthread_create.c:311
#5  0x75bbe598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 4 (Thread 0x706ff420 (LWP 16306)):
#0  0x76c0ca40 in do_futex_wait (isem=isem@entry=0x148fbb8) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
#1  0x76c0caf4 in __new_sem_wait (sem=0x148fbb8) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:69
#2  0x76e68900 in vcos_event_wait (event=0x148fba0) at /home/pi/userland/interface/vcos/pthreads/vcos_platform.h:593
#3  0x76e6a0f4 in ilcs_execute_function_ex (st=0x148f930, func=IL_EMPTY_THIS_BUFFER, data=0x706fec80, len=84, data2=0x706fec40, len2=64, bulk_mem_handle=0, bulk_offset=0x1886a80, 
    bulk_len=17136, resp=0x706fec38, rlen=0x706fec34) at /home/pi/userland/interface/vmcs_host/vcilcs.c:809
#4  0x76e6a644 in ilcs_pass_buffer (st=0x148f930, func=IL_EMPTY_THIS_BUFFER, reference=0x3e7c1a20, pBuffer=0x1500080) at /home/pi/userland/interface/vmcs_host/vcilcs.c:971
#5  0x76e6beb0 in vcil_out_EmptyThisBuffer (hComponent=0x14acd10, pBuffer=0x1500080) at /home/pi/userland/interface/vmcs_host/vcilcs_out.c:568
#6  0x0003af10 in ?? ()
#7  0x00045d78 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 3 (Thread 0x6feff420 (LWP 16307)):
#0  0x76c0ab30 in __pthread_cond_timedwait (cond=0xb7650 <m_player_subtitles+184>, mutex=0xb7638 <m_player_subtitles+160>, abstime=0x6fefeb10) at pthread_cond_timedwait.c:199
#1  0x00055560 in ?? ()
#2  0x00064640 in ?? ()
#3  0x00061db4 in ?? ()
#4  0x00057c2c in ?? ()
#5  0x00057954 in ?? ()
#6  0x00057674 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 2 (Thread 0x6f1ef420 (LWP 16308)):
#0  0x76c0e730 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00050d24 in ?? ()
#2  0x00054704 in ?? ()
#3  0x00054af0 in ?? ()
#4  0x00031bb4 in OMXThread::Run(void*) ()
#5  0x76c05e90 in start_thread (arg=0x6f1ef420) at pthread_create.c:311
#6  0x75bbe598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (Thread 0x743da000 (LWP 16298)):
#0  0x76c0d780 in __lll_lock_wait (futex=futex@entry=0xb7240 <m_player_video+320>, private=<optimized out>) at ../ports/sysdeps/unix/sysv/linux/arm/nptl/lowlevellock.c:46
#1  0x76c081a4 in __GI___pthread_mutex_lock (Cannot access memory at address 0x0
mutex=0xb7240 <m_player_video+320>) at pthread_mutex_lock.c:79
#2  0x00051844 in ?? ()
Cannot access memory at address 0x0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

It takes me 2 or 3 days to get this. What I notice is that this seems to only happen when running on more than 1 core. Pi1 does not seem to be affected and also Pi3 running with only 1 CPU enabled in the kernel seems to be ok. Is it possible this is something only related to omxplayer maybe? Is there something that I can do to investigate this further? Can you advise on something that I can do?

@JamesH65
Copy link
Contributor

This seems to have dropped from view. Anyone have any further comments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants