x86/CET: Fix S3 resume with shadow stacks active
The original shadow stack support has an error on S3 resume with very bizarre
fallout.  The BSP comes back up, but APs fail with:

  (XEN) Enabling non-boot CPUs ...
  (XEN) Stuck ??
  (XEN) Error bringing CPU1 up: -5

and then later (on at least two Intel TigerLake platforms), the next HVM vCPU
to be scheduled on the BSP dies with:

  (XEN) d1v0 Unexpected vmexit: reason 3
  (XEN) domain_crash called from vmx.c:4304
  (XEN) Domain 1 (vcpu#0) crashed on cpu#0:

The VMExit reason is EXIT_REASON_INIT, which has nothing to do with the
scheduled vCPU, and will be addressed in a subsequent patch.  It is a
consequence of the APs triple faulting.

The APs triple fault because we don't tear down the stacks on suspend.  The
idle/play_dead loop is killed in the middle of running, meaning that the
supervisor token is left busy.

On resume, SETSSBSY finds the busy bit set, suffers #CP, and triple faults
because the IDT isn't configured this early.

Rework the AP bring-up path to (re)create the supervisor token.  This ensures
the primary stack is non-busy before use.
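
The failure and the fix can be sketched with a toy C model.  This is only an
illustration, not Xen code: struct shstk_model, model_setssbsy() and
model_wrss_token() are hypothetical names, and a single uint64_t stands in
for the 8 bytes of shadow stack memory holding the token.  It assumes the
architectural token format (the token holds its own address, with bit 0 as
the busy flag).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOKEN_BUSY 1ULL /* bit 0 of a supervisor shadow stack token */

/* Hypothetical model: 'mem' stands in for the 8 bytes at 'token_addr'. */
struct shstk_model {
    uint64_t token_addr;
    uint64_t mem;
};

/*
 * Models the SETSSBSY check: the token must equal its own address with
 * the busy bit clear.  On success the busy bit is set.  Returning false
 * corresponds to #CP on real hardware.
 */
static bool model_setssbsy(struct shstk_model *s)
{
    if (s->mem != s->token_addr)
        return false;
    s->mem |= TOKEN_BUSY;
    return true;
}

/* Models the fix: rewrite the token with its own address, busy clear. */
static void model_wrss_token(struct shstk_model *s)
{
    assert((s->token_addr & 7) == 0); /* tokens are 8-byte aligned */
    s->mem = s->token_addr;
}
```

In this model, a second model_setssbsy() call without an intervening
model_wrss_token() fails, which mirrors an AP resuming onto a stack whose
token was left busy by the interrupted idle loop.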

Note: There are potential issues with the IST shadow stacks too, but fixing
      those is more involved.

Fixes: b60ab42 ("x86/shstk: Activate Supervisor Shadow Stacks")
Link: QubesOS/qubes-issues#7283
Reported-by: Thiner Logoer <[email protected]>
Reported-by: Marek Marczykowski-Górecki <[email protected]>
Signed-off-by: Andrew Cooper <[email protected]>
Tested-by: Thiner Logoer <[email protected]>
Tested-by: Marek Marczykowski-Górecki <[email protected]>
Reviewed-by: Jan Beulich <[email protected]>
(cherry picked from commit 7d95892)
andyhhp committed Mar 25, 2022
1 parent 7f35c1f commit 82fc152
Showing 1 changed file with 13 additions and 5 deletions.
18 changes: 13 additions & 5 deletions xen/arch/x86/boot/x86_64.S
@@ -51,13 +51,21 @@ ENTRY(__high_start)
         test    $CET_SHSTK_EN, %al
         jz      .L_ap_cet_done
 
-        /* Derive MSR_PL0_SSP from %rsp (token written when stack is allocated). */
-        mov     $MSR_PL0_SSP, %ecx
+        /* Derive the supervisor token address from %rsp. */
         mov     %rsp, %rdx
+        and     $~(STACK_SIZE - 1), %rdx
+        or      $(PRIMARY_SHSTK_SLOT + 1) * PAGE_SIZE - 8, %rdx
+
+        /*
+         * Write a new supervisor token.  Doesn't matter on boot, but for S3
+         * resume this clears the busy bit.
+         */
+        wrssq   %rdx, (%rdx)
+
+        /* Point MSR_PL0_SSP at the token. */
+        mov     $MSR_PL0_SSP, %ecx
+        mov     %edx, %eax
         shr     $32, %rdx
-        mov     %esp, %eax
-        and     $~(STACK_SIZE - 1), %eax
-        or      $(PRIMARY_SHSTK_SLOT + 1) * PAGE_SIZE - 8, %eax
         wrmsr
 
         setssbsy

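For readers less used to AT&T syntax, the address arithmetic in the hunk can
be restated in C.  This is a sketch under stated assumptions: the constant
values (4 KiB pages, an 8-page stack, shadow stack slot 5) are assumptions
for illustration and should be checked against Xen's headers, and
token_addr_from_rsp() and split_for_wrmsr() are hypothetical helper names
that do not exist in Xen.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; see Xen's headers for the real ones. */
#define PAGE_SIZE          4096ULL
#define STACK_SIZE         (8 * PAGE_SIZE)
#define PRIMARY_SHSTK_SLOT 5

/* Mirrors the and/or pair: round %rsp down to the stack base, then
 * select the last 8 bytes of the primary shadow stack page. */
static uint64_t token_addr_from_rsp(uint64_t rsp)
{
    uint64_t base = rsp & ~(STACK_SIZE - 1);

    return base | ((PRIMARY_SHSTK_SLOT + 1) * PAGE_SIZE - 8);
}

/* WRMSR takes the 64-bit value split across %edx (high) and %eax (low). */
struct msr_halves {
    uint32_t eax;
    uint32_t edx;
};

/* Mirrors "mov %edx, %eax; shr $32, %rdx". */
static struct msr_halves split_for_wrmsr(uint64_t val)
{
    struct msr_halves h = {
        .eax = (uint32_t)val,
        .edx = (uint32_t)(val >> 32),
    };

    assert((val & 7) == 0); /* a token address is 8-byte aligned */
    return h;
}
```

Computing the full 64-bit address in one register first, as the patch does,
gives WRSSQ its operand directly; the low and high halves for WRMSR are then
split off from that same value.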