VM crash with UKSM #25
Comments
Sample of the error shown on the VM: !!!! X64 Exception Type - 00(#DE - Divide Error) CPU Apic ID - 00000000 !!!!
Hi, it seems to be a math calculation error. Is this error from the guest OS inside QEMU or from the host OS?
The problem occurs on a VM provided by Cisco (Nexus 9000v, a Cisco-customised Linux).
The issues are not always the same. Sometimes a process crashes with signal 11 (not always the same process), and sometimes there is a kernel failure.
With UKSM disabled, we never see any issue.
Is there any recommendation regarding kernel compile options on the host?
As we can't tune the guest kernel/OS, we can only tune the host kernel or QEMU options.
Alain
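For readers who want to reproduce the "UKSM disabled" comparison mentioned above, here is a minimal sketch for checking and switching off page merging on a host. It assumes stock KSM exposes its switch at /sys/kernel/mm/ksm/run, as the kernel documentation describes, and that a UKSM-patched kernel exposes an equivalent /sys/kernel/mm/uksm/run file; the latter is an assumption and is not confirmed anywhere in this thread. The function names and the `sys_mm` parameter are hypothetical, introduced only for illustration.

```python
from pathlib import Path

# Names of the sysfs directories to probe. "ksm" is the stock kernel one;
# "uksm" is an ASSUMPTION about where a UKSM-patched kernel puts its switch.
MERGE_DIRS = ("ksm", "uksm")


def read_merge_state(sys_mm="/sys/kernel/mm"):
    """Return the content of each existing <sys_mm>/<name>/run file."""
    state = {}
    for name in MERGE_DIRS:
        run_file = Path(sys_mm) / name / "run"
        if run_file.is_file():
            state[name] = run_file.read_text().strip()
    return state


def disable_merging(sys_mm="/sys/kernel/mm"):
    """Write 0 to every run file found; needs root on a real host."""
    changed = []
    for name in read_merge_state(sys_mm):
        (Path(sys_mm) / name / "run").write_text("0\n")
        changed.append(name)
    return changed
```

Calling `read_merge_state()` with no arguments reads the live host, while passing a different directory makes the helpers easy to try against a scratch tree first. Writing 0 only stops further merging; it does not by itself undo pages that were already merged.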
On 9 Sep 2017, at 05:22, naixia wrote:
Hi, it seems a math calculation error. Is this error from the guest OS inside QEMU or from host OS?
More detailed crash information would be helpful; here is a previous closed issue as an example:
#18
It seems like data corruption caused by KSM/UKSM.
See here a scheduler crash we observed:
I have not tested your patch yet. I will launch a compile and test.
Could this bug be related to:
Currently testing the latest patch...
Glad to hear that. This fix will be included in UKSM for v4.13 and later versions.
I confirm. Regards, and hats off!
Looks like I hit the same bug. I'm using EVE-NG version 2.0.3-86 and QEMU version 2.4.0, on a VM with 64 vCPUs and 128 GB of RAM. When I try to run more than 10 Nexus NK9 VMs, I see the crashes and the 11th VM goes into a boot loop. How do I fix this?
Please create a new bug report, with a stack trace and more information attached.
I upgraded the kernel to 4.14.44 and I'm still seeing the problem. I will open a new bug. Thanks. root@eve-ng:~# uname -a
Can you tell me how to patch it? Which patch?
Has this problem been solved?
Hi,
I'm the main developer of EVE-NG, and we have integrated UKSM into our kernel.
We currently use Ubuntu kernel 4.9.40 and we observe a lot of crashes on big QEMU VMs.
Indeed, running 6 big VMs (2 vCPUs + 8 GB of RAM each, with heavy interrupt use inside the VM) is unstable and not safe at all.
I understand that you need information, so could you please list the information required for the investigation?
We could also communicate via mail ( [email protected] )