
kola: 36.20220906.20.0: mm/page_alloc.c kernel warning on aarch64/openstack for 5.19 kernels #1292

Closed
dustymabe opened this issue Sep 6, 2022 · 7 comments

Comments

@dustymabe (Member)

We're seeing this in our testing-devel and branched streams.

Here's what the warning looks like:

[    5.135660] ------------[ cut here ]------------                                     
[    5.139852] WARNING: CPU: 0 PID: 18 at mm/page_alloc.c:5402 __alloc_pages+0x1a0/0x290
[    5.146972] Modules linked in:                                                       
[    5.149828] CPU: 0 PID: 18 Comm: cpuhp/0 Not tainted 5.19.6-200.fc36.aarch64 #1      
[    5.156667] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015           
[    5.163162] pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)          
[    5.169897] pc : __alloc_pages+0x1a0/0x290                                           
[    5.173875] lr : alloc_pages+0xb8/0x16c                                              
[    5.177528] sp : ffff80000810bb90                                                    
[    5.180671] x29: ffff80000810bb90 x28: 0000000000000000 x27: ffff30fe849c1000        
[    5.187576] x26: 0000000000000000 x25: ffff0001fef07940 x24: 000000000000001e        
[    5.195276] x23: 0000000000000dc0 x22: 0000000000000000 x21: 000000000000001e        
[    5.202055] x20: 000000000000001e x19: 0000000000040dc0 x18: ffffffffffffffff        
[    5.208899] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000        
[    5.215898] x14: 0000000000000001 x13: 0000000000000000 x12: 0000000000000001        
[    5.222655] x11: ffffcf037aaef6d0 x10: 0000000000001d90 x9 : ffffcf03789296ec        
[    5.229443] x8 : ffff0000c03b61f0 x7 : 7fffffffffffffff x6 : 000000036312036f        
[    5.236172] x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000        
[    5.242961] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffcf037ae76009        
[    5.249890] Call trace:                                                              
[    5.252240]  __alloc_pages+0x1a0/0x290                                               
[    5.255798]  alloc_pages+0xb8/0x16c                                                  
[    5.259083]  kmalloc_order+0x3c/0xc0                                                 
[    5.262558]  kmalloc_order_trace+0x38/0x144                                          
[    5.266504]  __kmalloc+0x308/0x370                                                   
[    5.269709]  cacheinfo_cpu_online+0x68/0x1d0                                         
[    5.273747]  cpuhp_invoke_callback+0x128/0x4e4                                       
[    5.278107]  cpuhp_thread_fun+0xe0/0x184                                             
[    5.281942]  smpboot_thread_fn+0x1e8/0x220                                           
[    5.285824]  kthread+0xf0/0x100                                                      
[    5.288818]  ret_from_fork+0x10/0x20                                                 
[    5.292210] ---[ end trace 0000000000000000 ]---                                     

It happens consistently in aarch64/openstack in our provider (VexxHost). It may be something specific to their infrastructure (e.g., hypervisor software versions) that is causing us to see this.

It also happens across a large number of tests, which makes me think it isn't specific to any one test but is a general issue.

One full console log:
console.txt

@travier (Member) commented Sep 6, 2022

Do we have a contact at Vexxhost who could take a look?
I don't think this should block us from releasing if it only affects aarch64 OpenStack.
Reporting it upstream / to the provider would be good.

@dustymabe (Member, Author)

This doesn't appear to be happening with current Rawhide (kernel-6.0.0-0.rc4.31.fc38.aarch64), so I assume there is a fix upstream somewhere.

@bgilbert (Contributor) commented Sep 6, 2022

This is fixed by torvalds/linux@e75d18c, which is in 5.19.7. At a quick glance, I think the consequence of the bug is that some CPU cache information doesn't get populated into sysfs.
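To check whether a given machine is affected, a quick sketch like the one below could help: it compares the running kernel version against 5.19.7 (the first release carrying the fix) and lists the sysfs cache directories whose population the bug affects. The `version_lt` helper is defined here for illustration; it is not part of any standard tool.

```shell
#!/bin/sh
# Sketch: does the running kernel predate the cacheinfo fix in 5.19.7?

# version_lt A B: true (exit 0) if A is strictly older than B.
# Relies on GNU sort's -V (version sort) option.
version_lt() {
    [ "$1" != "$2" ] && \
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# Strip the distro suffix: "5.19.6-200.fc36.aarch64" -> "5.19.6"
kver=$(uname -r | cut -d- -f1)

if version_lt "$kver" "5.19.7"; then
    echo "kernel $kver predates the cacheinfo fix (needs >= 5.19.7)"
else
    echo "kernel $kver should include the fix"
fi

# The visible symptom on affected kernels: cache topology entries
# (index0, index1, ...) missing or incomplete under sysfs.
ls /sys/devices/system/cpu/cpu0/cache/ 2>/dev/null
```

On an affected 5.19.6 guest the `ls` output would be missing some or all of the `index*` directories that normally describe the per-level CPU caches.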

@dustymabe (Member, Author)

Override proposed in coreos/fedora-coreos-config#1960.

@dustymabe (Member, Author)

The fix for this went into testing stream release 36.20220918.2.2. Please try out the new release and report any issues.

@dustymabe (Member, Author)

The fix for this went into next stream release 37.20220910.1.0. Please try out the new release and report any issues.

@dustymabe added the status/pending-stable-release label ("Fixed upstream and in testing. Waiting on stable release.") and removed the status/pending-testing-release label ("Fixed upstream. Waiting on a testing release.") on Oct 3, 2022
@dustymabe (Member, Author) commented Oct 18, 2022

The fix for this went into stable stream release 36.20220918.3.0.

@dustymabe removed the status/pending-stable-release label ("Fixed upstream and in testing. Waiting on stable release.") on Oct 18, 2022