This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

BUG: kernel NULL pointer dereference #52

Closed
JohnyPeaN opened this issue Nov 19, 2019 · 3 comments

Comments

@JohnyPeaN

Hi. I got this kernel log on an idle system.
CPU: AMD 3900X
OS: Manjaro
Kernel: 5.3.11 (with the BMQ CPU scheduler)

The kernel is custom built, so any of several customizations could be responsible, but UKSM looks like the cause: the oops below was raised in the uksmd thread. Afterwards the system became unstable. I can still SSH in, but most basic system utilities, such as htop, won't run.

dmesg:
[143121.647917] BUG: kernel NULL pointer dereference, address: 0000000000000800
[143121.647920] #PF: supervisor write access in kernel mode
[143121.647921] #PF: error_code(0x0002) - not-present page
[143121.647921] PGD e4fe0c067 P4D e4fe0c067 PUD dffc48067 PMD c2d97f067 PTE 0
[143121.647924] Oops: 0002 [#1] PREEMPT SMP NOPTI
[143121.647926] CPU: 0 PID: 341 Comm: uksmd Tainted: P OE 5.3.11-1-MANJARO #1
[143121.647927] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Phantom Gaming X, BIOS P2.46 10/30/2019
[143121.647930] RIP: 0010:rb_erase+0xa2/0x380
[143121.647931] Code: 84 0e 02 00 00 49 3b 78 10 0f 84 37 02 00 00 4d 89 48 08 4d 85 d2 0f 84 e2 00 00 00 48 83 c8 01 48 89 0a 49 89 02 c3 48 8b 07 <48> 89 01 48 83 e0 fc 0f 84 cf 01 00 00 48 3b 78 10 0f 84 1d 02 00
[143121.647932] RSP: 0018:ffffb045809e3c68 EFLAGS: 00010246
[143121.647933] RAX: 0000000000000001 RBX: ffff9aa8082458c0 RCX: 0000000000000800
[143121.647934] RDX: 0000000000000000 RSI: ffff9aa5b63e8e98 RDI: ffff9aa5ce5c4668
[143121.647935] RBP: ffff9aa5ce5c4640 R08: 0000000000000000 R09: 0000000000000067
[143121.647935] R10: ffff9aa5ce5c4640 R11: 0000000010088e84 R12: 0000000010088e84
[143121.647936] R13: 0000000000000000 R14: ffff9aa5ce5c4640 R15: 00007f061a238000
[143121.647937] FS: 0000000000000000(0000) GS:ffff9aa8ae600000(0000) knlGS:0000000000000000
[143121.647938] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[143121.647938] CR2: 0000000000000800 CR3: 0000000ea8f54000 CR4: 00000000003406f0
[143121.647939] Call Trace:
[143121.647942] cmp_and_merge_page+0xfca/0x23d0
[143121.647945] ? follow_page+0x1f5/0x290
[143121.647946] scan_vma_one_page+0x447/0x1710
[143121.647949] ? preempt_count_add+0x68/0xa0
[143121.647951] ? _raw_spin_lock+0x13/0x30
[143121.647952] uksm_do_scan+0x18c/0x35b0
[143121.647954] ? _raw_spin_lock_irqsave+0x26/0x50
[143121.647955] ? _raw_spin_unlock_irqrestore+0x20/0x40
[143121.647957] ? del_timer_sync+0xa6/0xc0
[143121.647959] ? schedule_timeout+0x236/0x4c0
[143121.647961] uksm_scan_thread+0x16c/0x1b0
[143121.647963] kthread+0x13b/0x170
[143121.647964] ? uksm_do_scan+0x35b0/0x35b0
[143121.647965] ? kthread_park+0x80/0x80
[143121.647967] ret_from_fork+0x22/0x40
[143121.647969] Modules linked in: snd_seq_dummy snd_seq rpcsec_gss_krb5 rfcomm fuse xt_CHECKSUM xt_MASQUERADE xt_conntrack ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc ipt_REJECT nf_reject_ipv4 xt_tcpudp iptable_filter cmac algif_hash algif_skcipher af_alg bnep zfs(POE) zunicode(POE) nls_iso8859_1 zlua(POE) nls_cp437 zavl(POE) vfat icp(POE) fat edac_mce_amd kvm_amd kvm iwlmvm f2fs mac80211 libarc4 btusb btrtl btbcm crct10dif_pclmul zcommon(POE) crc32_pclmul btintel ghash_clmulni_intel znvpair(POE) iwlwifi snd_usb_audio bluetooth cdc_ether sp5100_tco usbnet mxm_wmi mousedev joydev wmi_bmof snd_usbmidi_lib aesni_intel snd_rawmidi aes_x86_64 snd_seq_device pl2303 crypto_simd input_leds ecdh_generic mc r8152 cfg80211 cryptd spl(OE) ecc glue_helper igb pcspkr i2c_piix4 mii ccp rfkill dca tpm_crb tpm_tis tpm_tis_core tpm wmi rng_core pinctrl_amd mac_hid evdev
[143121.648000] acpi_cpufreq zenpower(OE) uinput nct6775 hwmon_vid vhost_scsi target_core_mod vhost_net snd_hda_codec_realtek tun vhost snd_hda_codec_generic tap ledtrig_audio nfsd sg snd_hda_codec_hdmi crypto_user snd_hda_intel snd_hda_codec auth_rpcgss nfs_acl lockd snd_hda_core snd_hwdep grace snd_pcm sunrpc snd_timer snd soundcore ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 ata_generic pata_acpi sd_mod hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid ahci libahci libata crc32c_intel xhci_pci scsi_mod xhci_hcd amdgpu gpu_sched i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm agpgart msr vfio_pci bcache irqbypass vfio_virqfd vfio_iommu_type1 crc64 vfio zram lz4 lz4_compress
[143121.648027] CR2: 0000000000000800
[143121.648029] ---[ end trace e0e7df9fa2e6061c ]---
[143121.648030] RIP: 0010:rb_erase+0xa2/0x380
[143121.648031] Code: 84 0e 02 00 00 49 3b 78 10 0f 84 37 02 00 00 4d 89 48 08 4d 85 d2 0f 84 e2 00 00 00 48 83 c8 01 48 89 0a 49 89 02 c3 48 8b 07 <48> 89 01 48 83 e0 fc 0f 84 cf 01 00 00 48 3b 78 10 0f 84 1d 02 00
[143121.648033] RSP: 0018:ffffb045809e3c68 EFLAGS: 00010246
[143121.648034] RAX: 0000000000000001 RBX: ffff9aa8082458c0 RCX: 0000000000000800
[143121.648034] RDX: 0000000000000000 RSI: ffff9aa5b63e8e98 RDI: ffff9aa5ce5c4668
[143121.648036] RBP: ffff9aa5ce5c4640 R08: 0000000000000000 R09: 0000000000000067
[143121.648037] R10: ffff9aa5ce5c4640 R11: 0000000010088e84 R12: 0000000010088e84
[143121.648038] R13: 0000000000000000 R14: ffff9aa5ce5c4640 R15: 00007f061a238000
[143121.648038] FS: 0000000000000000(0000) GS:ffff9aa8ae600000(0000) knlGS:0000000000000000
[143121.648039] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[143121.648040] CR2: 0000000000000800 CR3: 0000000ea8f54000 CR4: 00000000003406f0
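
For readers decoding the oops above: the `#PF: error_code(0x0002)` line packs the fault type into individual bits, which is where the "supervisor write access" and "not-present page" descriptions come from. The log also shows CR2 (the faulting address) equal to RCX, both `0x800`. Below is a minimal sketch of that decoding, assuming the standard x86 page-fault error-code bit layout; the helper name `decode_pf_error_code` is hypothetical and not part of any kernel tooling.

```python
import re

# (mask, meaning when the bit is set, meaning when it is clear).
# Bit layout assumed from the x86 page-fault error code definition.
PF_BITS = [
    (0x01, "protection violation", "not-present page"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "supervisor mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error_code(line):
    """Extract the error_code(...) value from a dmesg line and describe its bits."""
    match = re.search(r"error_code\((0x[0-9a-fA-F]+)\)", line)
    if match is None:
        raise ValueError("no error_code(...) field found")
    code = int(match.group(1), 16)
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return code, flags


# Line copied from the dmesg output in this report.
line = "[143121.647921] #PF: error_code(0x0002) - not-present page"
code, flags = decode_pf_error_code(line)
print(hex(code), flags)
```

For this report the result is a write, in supervisor mode, to a page that is not mapped, which matches a store through a near-NULL pointer inside `rb_erase`.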

@JohnyPeaN
Author

I found that the fix for the older bug in #18 was not included. Although that bug appears to be in a different code path, it was also a NULL pointer dereference. I will apply the fix and see whether the problem recurs.

@JohnyPeaN
Author

I ran UKSM with the mentioned patch until 5.4 without any errors. I have now tried porting the patch to 5.4, and it seems to work (my machine hasn't blown up yet). I don't know how to create pull requests, and since I'm not really familiar with kernel internals, I didn't open one. Anyway, here is a patch against uksm-5.3.patch that makes it work on Linux 5.4:

uksm_5.3_5.4.patch

PS: Could we apply the fix from #18 to master?

@dolohow
Owner

dolohow commented Dec 3, 2019

Fix finally incorporated in 5.4. Closing...

@dolohow closed this as completed on Dec 3, 2019