vc4_bo_create -> Failed to allocate from CMA. #1247
Comments
Does increasing cma help? Add |
OK, I've adjusted my settings, and I am testing now. Thanks. |
Well, it seems like the problem is still there, but it's a different message this time (it still seems to be a CMA allocation error, though). Here are the logs from dmesg. dmesg: |
What's the simplest way to reproduce? E.g. with a clean Raspbian Buster image, if you run VLC (and nothing else) and play a single video longer than 30 minutes, would you see this issue? Is there a freely available video file that exhibits this problem? (This doesn't match my experience: I've left VLC or Chrome playing YouTube videos overnight a number of times and it still seems happy in the morning. But perhaps it is related to the format of the videos, or some other difference in our setups.) |
In my case, it occurs more often when I've switched in and out of Fullscreen mode a couple of times in VLC Media player. (3 - 5 times). And Google chrome seems to have this problem when viewing websites that use a lot of transitions and effects like: apple.com. (But most times Google chrome seems to "Aww, Snap" (crash) when using Hardware "intensive" websites). Here is my config if needed.
|
Ah sorry - cma=512m should be added to end of cmdline.txt, not config.txt. |
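A minimal sketch of that cmdline.txt change, assuming the file holds the usual single line of space-separated kernel parameters. `set_cma` is a hypothetical helper written for illustration, not an existing tool; it drops any earlier `cma=` token so the parameter appears only once, at the end of the line.

```python
def set_cma(cmdline: str, size: str = "512M") -> str:
    """Return the kernel command line with exactly one cma=<size> token at the end."""
    # cmdline.txt is a single line of space-separated parameters.
    tokens = [t for t in cmdline.split() if not t.startswith("cma=")]
    tokens.append(f"cma={size}")
    return " ".join(tokens)
```

The returned string would still have to be written back to cmdline.txt as a single line (with root privileges), followed by a reboot.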
Alright, I've adjusted my settings again and I am testing now. Thanks. |
The issue still persists (though it took much longer to occur this time), with the same CMA allocation error as in the previous comment. It happened in VLC Media player while watching a 1080p60 video (mp4), specifically when trying to rewind 30 seconds in the video. |
So, if you don't wind back, then the problem does not occur? Quite an important bit of information that. |
Let us know if you find a sequence of operations that makes the issue happen repeatably and ideally quickly. We can get the vlc/chromium guy to investigate this, but specific instructions to reproduce make it a lot more likely he'll be able to find and fix the problem. What is your display resolution? Default skin on VLC? |
It can still freeze even if you don't rewind VLC Media player.
Will do! I will make sure to let you know as soon as possible, when I can do so. EDIT: LOL, Found it! So, literally scrubbing the time-line on VLC Media player while the video is playing can cause this error. I am currently playing a 1920x1080p 60fps video (mp4). (My config has not changed since the last time I mentioned it btw.) EDIT 2: So the "after-shock" of this error seems to cause other applications to crash when continuing to use them after that error has been produced. In my case Chromium seems to "Aww, Snap" (crash) the tabs that are using Hardware accelerated rendering such as:
NOTE: It doesn't crash instantly; it only crashes when you try to interact with the website after the error has been produced, and it only stops crashing when you reboot the system (restarting Chromium won't fix it). EDIT 3: Making the Chromium tabs crash seems to push more of the same error into the kernel log, as seen with "dmesg". I also forgot to mention that VLC Media player becomes unresponsive when trying to close it.
Raspberry Pi is running @ 1920x1080p 60fps. VLC is using the "default skin", but I am running the LXDE desktop environment, not the default RPD desktop environment (this issue occurred multiple times before I changed my desktop environment, though). |
I seem to have a similar issue, but the context is a bit different. If this counts as hijacking: sorry, please tell me to open a new issue in that case.
Context: We have a custom-built UI that runs under Wayland, or to be more exact under Weston in fullscreen. The display has a resolution of only 720x480. The user interface renders its content using cairo (with the cairo-gl backend), pango (for font rendering) and librsvg (for displaying some vector graphics).
Issue: On some screens the UI suddenly freezes with errors very similar to those described by @DeviceIoControl.
Both logs also contain the kernel command line, plenty of stack traces, and similar lines like:
and:
System
Please tell me what other information I can provide to help to debug this. |
@sahib Your issue is very different. 4.14 is now very out of date, particularly with regard to 3D and DRM/KMS driver changes. I would strongly recommend you update to 4.19, if not later. |
Thank you very much @6by9, I'm still trying to wrap my head around all of this.
Sorry then about hijacking this issue. I will open a new issue once I have some new information.
Is increasing the amount of space for the CMA supposed to help? Also, can this be an issue in the application itself (i.e. using insane amounts of memory for unknown reasons) or is this more of a driver issue? I suppose the answer is "both"...
I can try and update once I get to it. Should be easily possible.
This seems to be at 19.0.8. I can also try to update this. |
Update on this: I updated the kernel to 4.19.71 and set the CMA memory to 256M (might try lower later). I did not update mesa. So far I have not been able to reproduce the bug. Thanks @6by9 👍 If this crops up again, I will open a new issue. |
Hi, To reproduce, first enable OpenGL in raspi-config and reboot as prompted (fake vs full KMS didn't make a difference). It might also make sense to enable the SSH server so you can log into the half-frozen RPi later. Then build latest dhewm3 git:
Now you should have a dhewm3 executable in dhewm3/build/.
So now your original top dir (probably /home/raspberrypi/ by default) should contain the following stuff:
Now run dhewm3, you should get to see the main menu, where we'll do some configuration before restarting the game:
Now Doom3/dhewm3 is configured to run in the lowest possible settings, and it should have written its config so it remembers those settings after a later crash. Now you can finally run the game:
An excerpt from my dmesg:
[ 20.911620] fuse init (API version 7.27) # the last "normal", old line from boot
[ 253.900107] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from CMA:
[ 253.900122] [drm] V3D: 488460kb BOs (3503)
[ 253.900126] [drm] V3D shader: 260kb BOs (64)
[ 253.900130] [drm] dumb: 9016kb BOs (2)
[ 253.900136] [drm] total purged BO: 712kb BOs (8)
[ 253.900149] vc4_v3d 3fc00000.v3d: Failed to allocate memory for tile binning: -12. You may need to enable CMA or give it more memory.
[ 254.917700] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from CMA:
[ 254.917715] [drm] V3D: 498188kb BOs (3603)
[ 254.917719] [drm] V3D shader: 260kb BOs (64)
[ 254.917723] [drm] dumb: 9016kb BOs (2)
[ 254.917727] [drm] total purged BO: 1636kb BOs (22)
[ 254.918211] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from CMA:
[ 254.918215] [drm] V3D: 497892kb BOs (3598)
[ 254.918219] [drm] V3D shader: 260kb BOs (64)
[ 254.918222] [drm] dumb: 9016kb BOs (2)
[ 254.918226] [drm] total purged BO: 1636kb BOs (22)
[ 254.918685] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from CMA:
[ 254.918689] [drm] V3D: 497892kb BOs (3598)
[ 254.918693] [drm] V3D shader: 260kb BOs (64)
[ 254.918696] [drm] dumb: 9016kb BOs (2)
[ 254.918700] [drm] total purged BO: 1636kb BOs (22)
... (it goes on like this forever)
Basically the same happens when using d3wasm (https://github.com/gabrielcuvillier/d3wasm/), which is based on dhewm3 but has a new renderer, instead of dhewm3 (https://github.com/dhewm/dhewm3/). |
Happens with Kodi as well, when leaving its media player on pause for ~30 min, or just leaving it inactive for a long time (anywhere, including the main interface). |
Got the same issue twice today.
[ 4191.445808] [drm:vc4_bo_create [vc4]] ERROR Failed to allocate from CMA:
Did not do anything particularly strange, other than some browsing, terminal, cmake + emacs.
dtparam=audio=on
[pi4]
[all] |
Setting this in /boot/config.txt seems to help; so far no problems for a day: max_framebuffers=1 |
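To compare how buffer-object (BO) usage grows between failures, the statistics in a dmesg excerpt like the one above can be pulled out with a short script. This is only a sketch: it assumes one log entry per line in the format shown in this thread, and `parse_bo_stats` is a hypothetical helper name, not part of any existing tool.

```python
import re

# Matches statistic lines such as "[  253.900122] [drm] V3D: 488460kb BOs (3503)".
BO_LINE = re.compile(r"\[drm\]\s+(?P<label>[^:]+):\s+(?P<kb>\d+)kb BOs \((?P<count>\d+)\)")


def parse_bo_stats(log_text: str) -> list[dict]:
    """Group the per-category BO lines that follow each CMA allocation failure.

    Returns one dict per failure, mapping category -> (size in kB, number of BOs).
    """
    reports = []
    current = None
    for line in log_text.splitlines():
        if "Failed to allocate from CMA" in line:
            # A new failure report starts here; later statistic lines belong to it.
            current = {}
            reports.append(current)
            continue
        match = BO_LINE.search(line)
        if match and current is not None:
            current[match["label"].strip()] = (int(match["kb"]), int(match["count"]))
    return reports
```

Fed the excerpt above, the first report would show roughly 488 MB held in V3D BOs, which is close to the whole of a 512 MB CMA region.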
Not sure what max_framebuffers does, but I think on a pi3 it defaults to 1??? |
default is 2 |
It simply limits the number of displays that will be instantiated. So if you set it to one on a Pi4 you only get one HDMI port, which can save some memory. TBH, setting it to 2 is fine for almost all use cases, even on devices prior to the 4, pre-KMS, since frame buffers are only created if displays are found. |
cma_lwm/cma_hwm were removed from firmware over two years ago. Identifying the exact line you think helped would be useful (note: it is definitely not cma_lwm/cma_hwm which don't exist). |
Haven't "crashed" yet since that change. |
If you want to help narrow down this issue, then remove the lines one at a time and see if things are still stable after a day or two. Report back if removing any line had an obvious effect on stability. |
It is definitely not fixed. I will try to revert back to the legacy driver and see if it still happens. |
Is there anything else that can be done to track down / solve this? The legacy driver clearly does not crash, but it has the bad habit of not sending my monitor to sleep for instance, so I really miss vc4. |
Isn't this a workaround for this issue? https://www.raspberrypi.org/forums/viewtopic.php?t=223363#p1614476 Our tests are currently running, so I don't know yet if it really solves this problem, but it seems promising. |
I can confirm that the workaround works for dhewm3. With |
Should this be applied automatically by raspi-config when the OpenGL module is selected? |
This seems to be the same issue as anholt/linux#135 |
The workaround doesn't work for me:
pi@raspi:~ $ cat /sys/devices/platform/soc/3fc00000.v3d/power/control
on
Jun 12 18:51:21 raspi lightdm[2064]: Error getting user list from org.freedesktop.Accounts: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.Accounts was not provided by any .service files
Jun 12 18:51:23 raspi lightdm[2127]: Error getting user list from org.freedesktop.Accounts: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.Accounts was not provided by any .service files
Jun 12 22:00:14 raspi kernel: [96643.452290] NMI backtrace for cpu 0
Jun 12 22:00:14 raspi kernel: [96643.452300] CPU: 0 PID: 2069 Comm: Xorg Tainted: G C 4.19.66-v7+ #1253
Jun 12 22:00:14 raspi kernel: [96643.452303] Hardware name: BCM2835
Jun 12 22:00:14 raspi kernel: [96643.452329] [<80111f38>] (unwind_backtrace) from [<8010d4b0>] (show_stack+0x20/0x24)
Jun 12 22:00:14 raspi kernel: [96643.452342] [<8010d4b0>] (show_stack) from [<808191e0>] (dump_stack+0xd4/0x118)
Jun 12 22:00:14 raspi kernel: [96643.452351] [<808191e0>] (dump_stack) from [<8081fc08>] (nmi_cpu_backtrace+0xc8/0xcc)
Jun 12 22:00:14 raspi kernel: [96643.452360] [<8081fc08>] (nmi_cpu_backtrace) from [<8081fd04>] (nmi_trigger_cpumask_backtrace+0xf8/0x134)
Jun 12 22:00:14 raspi kernel: [96643.452368] [<8081fd04>] (nmi_trigger_cpumask_backtrace) from [<8011050c>] (arch_trigger_cpumask_backtrace+0x20/0x24)
Jun 12 22:00:14 raspi kernel: [96643.452380] [<8011050c>] (arch_trigger_cpumask_backtrace) from [<80191ab4>] (rcu_dump_cpu_stacks+0xac/0xdc)
Jun 12 22:00:14 raspi kernel: [96643.452390] [<80191ab4>] (rcu_dump_cpu_stacks) from [<8019133c>] (rcu_check_callbacks+0x8ec/0x968)
Jun 12 22:00:14 raspi kernel: [96643.452398] [<8019133c>] (rcu_check_callbacks) from [<80199440>] (update_process_times+0x40/0x6c)
Jun 12 22:00:14 raspi kernel: [96643.452410] [<80199440>] (update_process_times) from [<801abd70>] (tick_sched_handle+0x64/0x70)
Jun 12 22:00:14 raspi kernel: [96643.452419] [<801abd70>] (tick_sched_handle) from [<801abfe4>] (tick_sched_timer+0x5c/0xb8)
Jun 12 22:00:14 raspi kernel: [96643.452427] [<801abfe4>] (tick_sched_timer) from [<80199fc8>] (__hrtimer_run_queues+0x164/0x320)
Jun 12 22:00:14 raspi kernel: [96643.452435] [<80199fc8>] (__hrtimer_run_queues) from [<8019abe8>] (hrtimer_interrupt+0x130/0x2a4)
Jun 12 22:00:14 raspi kernel: [96643.452447] [<8019abe8>] (hrtimer_interrupt) from [<806abe20>] (arch_timer_handler_phys+0x40/0x48)
Jun 12 22:00:14 raspi kernel: [96643.452458] [<806abe20>] (arch_timer_handler_phys) from [<80184958>] (handle_percpu_devid_irq+0x88/0x23c)
Jun 12 22:00:14 raspi kernel: [96643.452470] [<80184958>] (handle_percpu_devid_irq) from [<8017ea5c>] (generic_handle_irq+0x34/0x44)
Jun 12 22:00:14 raspi kernel: [96643.452480] [<8017ea5c>] (generic_handle_irq) from [<8017f198>] (__handle_domain_irq+0x6c/0xc4)
Jun 12 22:00:14 raspi kernel: [96643.452492] [<8017f198>] (__handle_domain_irq) from [<801021b4>] (bcm2836_arm_irqchip_handle_irq+0x60/0xa4)
Jun 12 22:00:14 raspi kernel: [96643.452500] [<801021b4>] (bcm2836_arm_irqchip_handle_irq) from [<801019bc>] (__irq_svc+0x5c/0x7c)
Jun 12 22:00:14 raspi kernel: [96643.452504] Exception stack(0x9c0efc08 to 0x9c0efc50)
Jun 12 22:00:14 raspi kernel: [96643.452511] fc00: 808367e4 00000000 40000093 40000093 60000013 973b4e40
Jun 12 22:00:14 raspi kernel: [96643.452518] fc20: ffffffff ffffffff 973b4f20 9d199400 973b4f10 9c0efc6c 80d0517c 9c0efc58
Jun 12 22:00:14 raspi kernel: [96643.452522] fc40: 00000000 808367f8 40000013 ffffffff
Jun 12 22:00:14 raspi kernel: [96643.452531] [<801019bc>] (__irq_svc) from [<808367f8>] (_raw_spin_unlock_irqrestore+0x50/0x70)
Jun 12 22:00:14 raspi kernel: [96643.452582] [<808367f8>] (_raw_spin_unlock_irqrestore) from [<7f731140>] (vc4_v3d_get_bin_slot+0x58/0xec [vc4])
Jun 12 22:00:14 raspi kernel: [96643.452668] [<7f731140>] (vc4_v3d_get_bin_slot [vc4]) from [<7f731434>] (validate_tile_binning_config+0x78/0x174 [vc4])
Jun 12 22:00:14 raspi kernel: [96643.452735] [<7f731434>] (validate_tile_binning_config [vc4]) from [<7f731aa8>] (vc4_validate_bin_cl+0xd4/0x2ac [vc4])
Jun 12 22:00:14 raspi kernel: [96643.452799] [<7f731aa8>] (vc4_validate_bin_cl [vc4]) from [<7f729938>] (vc4_submit_cl_ioctl+0x79c/0xd18 [vc4])
Jun 12 22:00:14 raspi kernel: [96643.453002] [<7f729938>] (vc4_submit_cl_ioctl [vc4]) from [<7f57caec>] (drm_ioctl_kernel+0xb4/0xf0 [drm])
Jun 12 22:00:14 raspi kernel: [96643.453262] [<7f57caec>] (drm_ioctl_kernel [drm]) from [<7f57ced4>] (drm_ioctl+0x230/0x3cc [drm])
Jun 12 22:00:14 raspi kernel: [96643.453386] [<7f57ced4>] (drm_ioctl [drm]) from [<802bf870>] (do_vfs_ioctl+0xbc/0x804)
Jun 12 22:00:14 raspi kernel: [96643.453396] [<802bf870>] (do_vfs_ioctl) from [<802bfffc>] (ksys_ioctl+0x44/0x6c)
Jun 12 22:00:14 raspi kernel: [96643.453404] [<802bfffc>] (ksys_ioctl) from [<802c003c>] (sys_ioctl+0x18/0x1c)
Jun 12 22:00:14 raspi kernel: [96643.453414] [<802c003c>] (sys_ioctl) from [<80101000>] (ret_fast_syscall+0x0/0x28)
Jun 12 22:00:14 raspi kernel: [96643.453418] Exception stack(0x9c0effa8 to 0x9c0efff0)
Jun 12 22:00:14 raspi kernel: [96643.453423] ffa0: 010007a0 7eb99e28 0000000d c0b06440 7eb99e28 00000000
Jun 12 22:00:14 raspi kernel: [96643.453430] ffc0: 010007a0 7eb99e28 c0b06440 00000036 76f24968 76f24968 7611f498 7eb99e28
Jun 12 22:00:14 raspi kernel: [96643.453434] ffe0: 76d1208c 7eb99dec 76cf8788 7699551c |
@cagnulein your issue seems to be unrelated. This issue and the workaround cover only the failure to allocate a buffer object from CMA, which is not your case. (No symptoms like |
@j123b567 I had them before adding "/sys/devices/platform/soc/*.v3d/power/control set to on" |
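For reference, the workaround discussed in the last few comments (setting `/sys/devices/platform/soc/*.v3d/power/control` to `on`) could be applied with a script along these lines. It is only a sketch: `force_v3d_power_on` is a hypothetical helper, it has to run as root, and the setting does not survive a reboot, so it would need to be re-applied at boot.

```python
import glob


def force_v3d_power_on(pattern: str = "/sys/devices/platform/soc/*.v3d/power/control") -> list[str]:
    """Write "on" to every matching runtime-PM control file and return the paths written."""
    written = []
    # The glob covers the differing device addresses (e.g. 3fc00000.v3d) across boards.
    for path in glob.glob(pattern):
        with open(path, "w") as control:
            control.write("on")
        written.append(path)
    return written
```

An empty return value means no matching v3d device was found, for example because the vc4 driver is not loaded.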
I still do not understand why this "workaround" is not released properly. |
I recently experienced CMA allocation problems on my RPi4 8GB + Ubuntu 20.10 + Sway (a wayland compositor based on wlroots). I added cma=512M@128M to cmdline.txt and so far that seems to work fine (but more tests are needed).
which means that it is possible to monitor CmaFree with a command such as
I noticed that most applications allocate a few MB from CMA. I assume that this is to store their window content in one or two buffers (reminder: a typical 1920x1080 screen consumes 1920 × 1080 × 4 bytes ≈ 8 MB). However, I also noticed that CmaFree can decrease by hundreds of MB while resizing some windows (all XWayland windows, imv-wayland, Firefox Wayland with layers.acceleration.force-enabled=true, ...). Other windows can be resized without consuming any CMA (wev, Firefox Wayland with layers.acceleration.force-enabled=false, ...). This is consistent with the fact that I experienced a lot of crashes while resizing windows. So I suspect that the problem is that a lot of applications (including XWayland) do not immediately free their old buffers when they receive a resize event. The resource is probably marked as unused until it is released by a garbage collector. A Wayland compositor (or an X11 server?) can send several resize events per second, thus causing a large number of old buffers to remain alive. If I am right then there are 2 ways to solve the problem:
|
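The CmaFree monitoring described in the comment above can be approximated by reading /proc/meminfo, which exposes CmaTotal and CmaFree on kernels built with CMA. The sketch below is an assumption about how one might do it, not the commenter's original command; `parse_cma`, `read_cma` and `buffer_bytes` are hypothetical helper names.

```python
def parse_cma(meminfo_text: str) -> dict[str, int]:
    """Extract CmaTotal and CmaFree (in kB) from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("CmaTotal", "CmaFree"):
            values[key] = int(rest.split()[0])  # first field after the colon is the kB value
    return values


def read_cma(path: str = "/proc/meminfo") -> dict[str, int]:
    """Take one sample of the CMA counters from the running kernel."""
    with open(path) as meminfo:
        return parse_cma(meminfo.read())


def buffer_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed window or screen buffer, e.g. 1920 x 1080 at 4 bytes per pixel."""
    return width * height * bytes_per_pixel
```

Calling `read_cma()` once per second while resizing a window, and dividing the drop in CmaFree by `buffer_bytes(...)` for that window, gives a rough count of how many stale buffers are still alive.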
This issue has been open for long enough; I think it's time for it to go. Thanks to everyone for contributing and helping. |
Has it been fixed? No? Then please don't close it |
I still experience this on Debian 11. |
Hi everyone, I just wanted to clarify that I closed this issue because it had been open for more than 2 years with no mention of a fix in sight. If people still feel strongly about this issue, then I can reopen it, but for the time being it will remain closed. |
I don't think there's any reason to close this given that it hasn't been resolved as far as I can tell. Why do you think it should be closed? |
I don't use my Raspberry Pi anymore, the comments in this issue have become cluttered, and I have pretty much given up on it at this point, since I can't see any resolution after 2 years of it being open. Unless there is a fix in the works that I am unaware of? |
You can unsubscribe from this issue if you don't care about it. Whether you use your RPi or not doesn't change that this issue still affects people. I don't know if a fix is in the works. |
Yes, please reopen it, otherwise someone else would have to create a new issue and all the information collected here is lost and the people subscribed to this post (that still care) are disconnected. Update: Thank you very much! :) |
Bug Description:
Usage of any type of Hardware accelerated application (VLC media player, Kodi, Google Chrome, etc.) for an extended period of time (between 30 mins to 1 hour), causes CMA to fail to allocate memory (out of memory) for the vc4-fkms-v3d driver. This causes all of the Hardware accelerated applications running at that point in time to freeze (turn black) or produce serious visual artifacting.
The only fix that I have found, is to disable the vc4-fkms-v3d driver in the /boot/config.txt file (or by switching the GL Driver to "Legacy" mode in raspi-config), but unfortunately, this results in the loss of Hardware acceleration.
To reproduce:
This typically occurs when watching a video for more than 30 mins in the VLC media player application. Eventually the video will freeze and any attempt to interact with the application will cause the application to freeze or produce visual artifacting.
Expected behaviour:
Hardware accelerated applications should be usable for an extended period of time without freezing or producing visual artifacts.
Actual behaviour:
See "Bug Description" section above.
System:
Device Model -> Raspberry Pi 4 Model B (4GB)
OS -> Raspbian GNU/Linux 10 (buster) armv7l
Firmware version -> a51b488198a8c0360b93351682e7432d89d70411
Kernel version -> 4.19.66-v7l+
Logs:
[3912.566190] [drm:vc4_bo_create [vc4]] ERROR Failed to allocate from CMA:
[3912.573209] [drm] kernel: 5120kb BOs (1)
[3912.579505] [drm] V3D: 15116kb BOs (15)
[3912.585862] [drm] V3D shader: 120kb BOs (29)
[3912.592241] [drm] dumb: 48kb BOs (3)