Guest VM system reset may fail and ACRN DM program hang #7133

Closed
yonghuah opened this issue Feb 21, 2022 · 1 comment
Labels
status: new The issue status: new for creation

Comments

@yonghuah
Contributor

A recursive lock on 'mevent_lmutex' can be detected
in the mevent thread when the user triggers a system
reset from the user VM. In this case, the user VM reboot hangs.

The backtrace for this issue:
#1 in mevent_qlock () at core/mevent.c:93
#2 in mevent_delete_even at core/mevent.c:357 ===> recursive LOCK
#3 in mevent_delete_close at core/mevent.c:387
#4 in acrn_timer_deinit at core/timer.c:106
#5 in virtio_reset_dev at hw/pci/virtio/virtio.c:171
#6 in virtio_console_reset at hw/pci/virtio/virtio_console.c:196
#7 in virtio_console_destroy at hw/pci/virtio/virtio_console.c:1015
#8 in virtio_console_teardown_backend at hw/pci/virtio/virtio_console.c:1042
#9 in mevent_drain_del_list () at core/mevent.c:348 ===> 1st LOCK
#10 in mevent_dispatch () at core/mevent.c:472
#11 in main at core/main.c:1110

So the root cause is:
'mevent_lmutex' is locked recursively by the mevent thread
itself (frame #9 takes the first lock and frame #2 locks it again),
which is not allowed for a mutex with default attributes.
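
To make the failure mode observable, here is a minimal sketch that is not taken from the ACRN sources. Relocking a default-type mutex is undefined behaviour, so the sketch swaps in a PTHREAD_MUTEX_ERRORCHECK mutex, which POSIX requires to report the relock as EDEADLK instead of hanging. The helper name `relock_errorcheck_mutex` is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper (not ACRN code): lock an error-checking mutex twice
 * from the same thread and return the result of the second lock attempt.
 */
int relock_errorcheck_mutex(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t mtx;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&mtx, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&mtx);      /* first lock, analogous to frame #9 */
	rc = pthread_mutex_lock(&mtx); /* relock, analogous to frame #2 */

	/* The relock failed, so the mutex is held exactly once. */
	pthread_mutex_unlock(&mtx);
	pthread_mutex_destroy(&mtx);
	return rc;
}
```

With an error-checking mutex the second lock attempt fails with EDEADLK. With a default mutex the same sequence has no defined result, which is consistent with the hang in the backtrace above.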

@yonghuah yonghuah added the status: new The issue status: new for creation label Feb 21, 2022
@yonghuah
Contributor Author

[External_System_ID] ACRN-7424

yonghuah added a commit to yonghuah/acrn-hypervisor that referenced this issue Feb 21, 2022
 'mevent_lmutex' is initialized with the default type;
 attempting to recursively lock this kind of
 mutex results in undefined behaviour.

 A recursive lock on 'mevent_lmutex' can be detected
 in the mevent thread when the user triggers a system
 reset from the user VM. In this case, the user VM reboot hangs.

 The backtrace for this issue:
  #1 in mevent_qlock () at core/mevent.c:93
  #2 in mevent_delete_even at core/mevent.c:357
    ===> recursive LOCK
  #3 in mevent_delete_close at core/mevent.c:387
  #4 in acrn_timer_deinit at core/timer.c:106
  #5 in virtio_reset_dev at hw/pci/virtio/virtio.c:171
  #6 in virtio_console_reset at
    hw/pci/virtio/virtio_console.c:196
  #7 in virtio_console_destroy at
    hw/pci/virtio/virtio_console.c:1015
  #8 in virtio_console_teardown_backend at
    hw/pci/virtio/virtio_console.c:1042
  #9 in mevent_drain_del_list () at
    core/mevent.c:348 ===> 1st LOCK
  #10 in mevent_dispatch () at core/mevent.c:472
  #11 in main at core/main.c:1110

  So the root cause is:
  'mevent_lmutex' is locked recursively by the mevent thread
  itself (frame #9 takes the first lock and frame #2 locks it again),
  which is not allowed for a mutex with default attributes.

  This patch changes the mutex type of 'mevent_lmutex'
  from default to "PTHREAD_MUTEX_RECURSIVE", because
  recursive locking must be allowed: users of mevent
  may call mevent functions (where the mutex lock may be required)
  in teardown callbacks.

Tracked-On: projectacrn#7133
Signed-off-by: Yonghua Huang <[email protected]>
Acked-by: Yu Wang <[email protected]>
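
The type change described in the commit message can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: `demo_lmutex`, `demo_lmutex_init` and `demo_nested_lock` are hypothetical stand-ins for the lock in core/mevent.c and its callers, whose real definitions are not shown in this issue.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>

/* Hypothetical stand-in for the mevent lock; not the ACRN definition. */
static pthread_mutex_t demo_lmutex;

/* Initialize the lock as PTHREAD_MUTEX_RECURSIVE instead of the default type. */
int demo_lmutex_init(void)
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	if (rc != 0)
		return rc;

	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
	if (rc == 0)
		rc = pthread_mutex_init(&demo_lmutex, &attr);

	pthread_mutexattr_destroy(&attr);
	return rc;
}

/*
 * Mimic the call pattern from the backtrace: an outer lock is held while a
 * teardown callback takes the same lock again on the same thread.
 * Returns the result of the nested lock attempt.
 */
int demo_nested_lock(void)
{
	int rc;

	rc = pthread_mutex_lock(&demo_lmutex);  /* outer lock */
	if (rc != 0)
		return rc;

	rc = pthread_mutex_lock(&demo_lmutex);  /* nested lock from a callback */
	if (rc == 0)
		pthread_mutex_unlock(&demo_lmutex); /* release the nested lock */

	pthread_mutex_unlock(&demo_lmutex);     /* release the outer lock */
	return rc;
}
```

A recursive mutex keeps a lock count per owner, so each successful lock must be paired with one unlock before another thread can acquire it.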
yonghuah added a commit to yonghuah/acrn-hypervisor that referenced this issue Feb 21, 2022
acrnsi-robot pushed a commit that referenced this issue Feb 21, 2022
@fuzhongl fuzhongl closed this as completed Mar 4, 2022