Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rust: helpers: Clarify comment on size_t = uintptr_t guard #330

Merged
merged 1 commit into from
Jun 3, 2021

Conversation

geofft
Copy link

@geofft geofft commented Jun 3, 2021

Also fix the guard to not fail spriously on ARM. We just need the types
to be ABI-compatible; we don't need C to be willing to implicitly
convert between them.

Signed-off-by: Geoffrey Thomas [email protected]

@geofft
Copy link
Author

geofft commented Jun 3, 2021

Fixes #329.

(Also happy to see this squashed into whatever, it's effectively cleaning up old linux-kernel-module-rust code and making it presentable :) )

@alex
Copy link
Member

alex commented Jun 3, 2021

@geofft can you rebase this on a more recent rust branch? It looks to be several months out of date, which is (I think) why CI isn't running.

@ojeda
Copy link
Member

ojeda commented Jun 3, 2021

@geofft can you rebase this on a more recent rust branch? It looks to be several months out of date, which is (I think) why CI isn't running.

The reason that the CI is not running is that this is against rust-next -- please do not merge!

Given GitHub does not seem to allow to simply disable PR functionality, and only allows to "protect" some branches (which is enforced on pushes too), I will make @ksquirrel warn about it. Give me a moment to test it.

@Rust-for-Linux Rust-for-Linux deleted a comment from ksquirrel Jun 3, 2021
@ksquirrel

This comment has been minimized.

Also fix the guard to not fail spriously on ARM. We just need the types
to be ABI-compatible; we don't need C to be willing to implicitly
convert between them.

Signed-off-by: Geoffrey Thomas <[email protected]>
@geofft geofft changed the base branch from rust-next to rust June 3, 2021 13:03
@ksquirrel
Copy link
Member

Review of 3c262d5ce907:

  • ✔️ Commit 3c262d5: Looks fine!

Copy link
Collaborator

@TheSven73 TheSven73 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(edited) no worries!

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will let other folks have a look at this before merging.

Copy link
Collaborator

@TheSven73 TheSven73 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot! 💯

I also noticed that size_t and unsigned long were completely identical on arm32, yet gcc didn't think they were compatible. I'd love to further understand the underlying reason for that.

But that definitely shouldn't hold up this PR!

@geofft
Copy link
Author

geofft commented Jun 3, 2021

See discussion in #329, but tl;dr the two underlying types are int vs. long, which are both u32 but don't satisfy __builtin_types_compatible_p. Basically I was wrong in my assumption about what __builtin_types_compatible_p does and I am hoping this new check is closer to what we want.

@TheSven73
Copy link
Collaborator

Indeed. I just don't get why two types which are both 32-bit, unsigned, and using the same rules for arithmetic etc would end up as being marked "not compatible" by gcc / __builtin_types_compatible_p.

@ojeda ojeda merged commit 6e51134 into Rust-for-Linux:rust Jun 3, 2021
@ojeda
Copy link
Member

ojeda commented Jun 8, 2021

Indeed. I just don't get why two types which are both 32-bit, unsigned, and using the same rules for arithmetic etc would end up as being marked "not compatible" by gcc / __builtin_types_compatible_p.

It is just what the C standard says in 6.7.2: int and long are distinct types. So are char and signed char. But not int and signed int! :)

@TheSven73
Copy link
Collaborator

It is just what the C standard says in 6.7.2: int and long are distinct types. So are char and signed char. But not int and signed int! :)

I'm looking for the "facepalm" emoji but can't find it on GitHub :)

fbq pushed a commit that referenced this pull request Sep 30, 2024
Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
on x86 due to a chain of locks and SRCU synchronizations.  Translating the
below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
fairness of r/w semaphores).

    CPU0                     CPU1                     CPU2
1   lock(&kvm->slots_lock);
2                                                     lock(&vcpu->mutex);
3                                                     lock(&kvm->srcu);
4                            lock(cpu_hotplug_lock);
5                            lock(kvm_lock);
6                            lock(&kvm->slots_lock);
7                                                     lock(cpu_hotplug_lock);
8   sync(&kvm->srcu);

Note, there are likely more potential deadlocks in KVM x86, e.g. the same
pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with
__kvmclock_cpufreq_notifier():

  cpuhp_cpufreq_online()
  |
  -> cpufreq_online()
     |
     -> cpufreq_gov_performance_limits()
        |
        -> __cpufreq_driver_target()
           |
           -> __target_index()
              |
              -> cpufreq_freq_transition_begin()
                 |
                 -> cpufreq_notify_transition()
                    |
                    -> ... __kvmclock_cpufreq_notifier()

But, actually triggering such deadlocks is beyond rare due to the
combination of dependencies and timings involved.  E.g. the cpufreq
notifier is only used on older CPUs without a constant TSC, mucking with
the NX hugepage mitigation while VMs are running is very uncommon, and
doing so while also onlining/offlining a CPU (necessary to generate
contention on cpu_hotplug_lock) would be even more unusual.

The most robust solution to the general cpu_hotplug_lock issue is likely
to switch vm_list to be an RCU-protected list, e.g. so that x86's cpufreq
notifier doesn't to take kvm_lock.  For now, settle for fixing the most
blatant deadlock, as switching to an RCU-protected list is a much more
involved change, but add a comment in locking.rst to call out that care
needs to be taken when walking holding kvm_lock and walking vm_list.

  ======================================================
  WARNING: possible circular locking dependency detected
  6.10.0-smp--c257535a0c9d-pip #330 Tainted: G S         O
  ------------------------------------------------------
  tee/35048 is trying to acquire lock:
  ff6a80eced71e0a8 (&kvm->slots_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x179/0x1e0 [kvm]

  but task is already holding lock:
  ffffffffc07abb08 (kvm_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x14a/0x1e0 [kvm]

  which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

  -> #3 (kvm_lock){+.+.}-{3:3}:
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         kvm_dev_ioctl+0x4fb/0xe50 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #2 (cpu_hotplug_lock){++++}-{0:0}:
         cpus_read_lock+0x2e/0xb0
         static_key_slow_inc+0x16/0x30
         kvm_lapic_set_base+0x6a/0x1c0 [kvm]
         kvm_set_apic_base+0x8f/0xe0 [kvm]
         kvm_set_msr_common+0x9ae/0xf80 [kvm]
         vmx_set_msr+0xa54/0xbe0 [kvm_intel]
         __kvm_set_msr+0xb6/0x1a0 [kvm]
         kvm_arch_vcpu_ioctl+0xeca/0x10c0 [kvm]
         kvm_vcpu_ioctl+0x485/0x5b0 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #1 (&kvm->srcu){.+.+}-{0:0}:
         __synchronize_srcu+0x44/0x1a0
         synchronize_srcu_expedited+0x21/0x30
         kvm_swap_active_memslots+0x110/0x1c0 [kvm]
         kvm_set_memslot+0x360/0x620 [kvm]
         __kvm_set_memory_region+0x27b/0x300 [kvm]
         kvm_vm_ioctl_set_memory_region+0x43/0x60 [kvm]
         kvm_vm_ioctl+0x295/0x650 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #0 (&kvm->slots_lock){+.+.}-{3:3}:
         __lock_acquire+0x15ef/0x2e30
         lock_acquire+0xe0/0x260
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         set_nx_huge_pages+0x179/0x1e0 [kvm]
         param_attr_store+0x93/0x100
         module_attr_store+0x22/0x40
         sysfs_kf_write+0x81/0xb0
         kernfs_fop_write_iter+0x133/0x1d0
         vfs_write+0x28d/0x380
         ksys_write+0x70/0xe0
         __x64_sys_write+0x1f/0x30
         x64_sys_call+0x281b/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

Cc: Chao Gao <[email protected]>
Fixes: 0bf5049 ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
Cc: [email protected]
Reviewed-by: Kai Huang <[email protected]>
Acked-by: Kai Huang <[email protected]>
Tested-by: Farrah Chen <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants