
Make cpu kind information available #80

Merged
merged 22 commits into master on Jun 18, 2024

Conversation

carstenbauer
Member

@carstenbauer commented on Jun 14, 2024

Use case: Distinguishing efficiency and performance cores.

  • num_cpukinds()
  • num_virtual_cores_cpukinds()
  • show kind information in topology()?
  • tests
  • readme

Close #57

@carstenbauer
Member Author

carstenbauer commented Jun 14, 2024

On Mac mini M1 (4 efficiency and 4 performance cores)

julia> using Hwloc

julia> num_cpukinds()
2

julia> num_virtual_cores_cpukinds()
2-element Vector{Int64}:
 4
 4

julia> Hwloc.get_cpukind_info()
2-element Vector{Union{Nothing, @NamedTuple{masks::Vector{UInt64}, efficiency_rank::Int32, infos::Vector{Hwloc.HwlocInfo}}}}:
 (masks = UInt64[0x000000000000000f], efficiency_rank = 0, infos = Hwloc.HwlocInfo[Hwloc.HwlocInfo("DarwinCompatible", "apple,icestorm;ARM,v8")])
 (masks = UInt64[0x00000000000000f0], efficiency_rank = 1, infos = Hwloc.HwlocInfo[Hwloc.HwlocInfo("DarwinCompatible", "apple,firestorm;ARM,v8")])

Integration into topology() via the cpukind kwarg:

julia> topology(; cpukind=true)

Machine (3.49 GB)
    Package L#0 P#0 (3.49 GB)
        NUMANode (3.49 GB)
        L2 (4.0 MB)
            L1 (64.0 kB) + Core L#0 P#0
                PU L#0 P#0 (1, DarwinCompatible=apple,icestorm;ARM,v8)
            L1 (64.0 kB) + Core L#1 P#1
                PU L#1 P#1 (1, DarwinCompatible=apple,icestorm;ARM,v8)
            L1 (64.0 kB) + Core L#2 P#2
                PU L#2 P#2 (1, DarwinCompatible=apple,icestorm;ARM,v8)
            L1 (64.0 kB) + Core L#3 P#3
                PU L#3 P#3 (1, DarwinCompatible=apple,icestorm;ARM,v8)
        L2 (12.0 MB)
            L1 (128.0 kB) + Core L#4 P#4
                PU L#4 P#4 (2, DarwinCompatible=apple,firestorm;ARM,v8)
            L1 (128.0 kB) + Core L#5 P#5
                PU L#5 P#5 (2, DarwinCompatible=apple,firestorm;ARM,v8)
            L1 (128.0 kB) + Core L#6 P#6
                PU L#6 P#6 (2, DarwinCompatible=apple,firestorm;ARM,v8)
            L1 (128.0 kB) + Core L#7 P#7
                PU L#7 P#7 (2, DarwinCompatible=apple,firestorm;ARM,v8)
    CoProc(OpenCL) "opencl0d0"
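The kind annotations shown by `topology(; cpukind=true)` boil down to testing each PU's OS index against the per-kind bitmasks from `get_cpukind_info()`. A minimal sketch in plain Julia (the helper name and the single-word masks are illustrative assumptions, not Hwloc.jl's actual implementation):

```julia
# Hypothetical sketch of mapping a PU's OS index to a CPU kind from the
# bitmasks reported above (0x0f = efficiency, 0xf0 = performance on the
# M1 example). A single 64-bit mask word is assumed; names are
# illustrative, not Hwloc.jl's actual implementation.
const KIND_MASKS = UInt64[0x0f, 0xf0]

function osindex_to_kind(i::Integer, masks::Vector{UInt64} = KIND_MASKS)
    for (kind, mask) in enumerate(masks)
        # Bit i of a kind's mask is set iff the PU with OS index i belongs to it.
        (mask >> i) & 0x1 == 0x1 && return kind
    end
    return nothing  # PU not covered by any reported kind
end

osindex_to_kind(2)  # icestorm (efficiency) PU → 1
osindex_to_kind(5)  # firestorm (performance) PU → 2
```

This is why the annotations above read `(1, …icestorm…)` for OS indices 0–3 and `(2, …firestorm…)` for 4–7.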

@carstenbauer changed the title from "Make cpukinds information available" to "Make cpu kind information available" on Jun 14, 2024
@carstenbauer
Member Author

carstenbauer commented Jun 15, 2024

  • debug Windows failure

@carstenbauer
Member Author

I think this is ready for review/merge. (I apologize for the formatting changes in advance 😄.)

I've tested this on a cluster (a single kind of CPU core) and an M1 Mac mini (two kinds of CPU cores).

@carstenbauer
Member Author

@giordano Can you perhaps run topology(; cpukind=true) on Fugaku? It has a particularly strange mapping of cores to "OS indexes" and I'd like to see what it gives here.

@giordano
Member

I don't have access to Fugaku anymore 😢

@JBlaschke
Contributor

There seems to be a problem on Perlmutter:

julia> topology(; cpukind=true)

Machine (503.14 GB)
    Package L#0 P#0 (251.18 GB)
        Group (62.21 GB)
            NUMANode (62.21 GB)
            L3 (32.0 MB)
                L2 (512.0 kB) + L1 (32.0 kB) + Core L#0 P#0 ERROR: type Nothing has no field masks
Stacktrace:
  [1] getproperty(x::Nothing, f::Symbol)
    @ Base ./Base.jl:37
  [2] _osindex2cpukind(i::Int64)
    @ Hwloc ~/.julia/dev/Hwloc/src/highlevel_api.jl:469
  [3]
    @ Hwloc ~/.julia/dev/Hwloc/src/highlevel_api.jl:77
  [4]
    @ Hwloc ~/.julia/dev/Hwloc/src/highlevel_api.jl:159
  [5]  (repeats 2 times)
    @ Hwloc ~/.julia/dev/Hwloc/src/highlevel_api.jl:154
  [6]  (repeats 4 times)
    @ Hwloc ~/.julia/dev/Hwloc/src/highlevel_api.jl:159
  [7] print_topology(obj::Hwloc.Object; kwargs::@Kwargs{cpukind::Bool})
    @ Hwloc ~/.julia/dev/Hwloc/src/highlevel_api.jl:175
  [8] print_topology
    @ ~/.julia/dev/Hwloc/src/highlevel_api.jl:175 [inlined]
  [9] topology
    @ ~/.julia/dev/Hwloc/src/highlevel_api.jl:198 [inlined]
 [10] top-level scope
    @ REPL[7]:1
Some type information was truncated. Use `show(err)` to see complete types.

Which seems to stem from the fact that the cpukind info is empty:

julia> num_cpukinds()
1

julia>

julia> num_virtual_cores_cpukinds()

julia> Hwloc.get_cpukind_info()
1-element Vector{Union{Nothing, @NamedTuple{masks::Vector{UInt64}, efficiency_rank::Int32, infos::Vector{Hwloc.HwlocInfo}}}}:
 nothing

Any thoughts?

@carstenbauer
Member Author

carstenbauer commented Jun 17, 2024

There seems to be a problem on Perlmutter

@JBlaschke Fixed (tested on a Perlmutter login node). Apparently hwloc can't obtain the CPU kind information on Perlmutter and my handling of this case was flawed.
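The flawed case was presumably a kind whose info entry is `nothing` (as `get_cpukind_info()` returned on Perlmutter) being dereferenced for its `masks` field. A hedged sketch of a defensive lookup for that case (names and structure are illustrative, not the actual Hwloc.jl internals):

```julia
# Hypothetical sketch of a defensive OS-index → CPU-kind lookup. When hwloc
# reports a kind but no details for it (the Perlmutter case), the per-kind
# entry is `nothing` and must be skipped rather than dereferenced.
# Names are illustrative, not the actual Hwloc.jl internals.
function osindex_to_kind(i::Integer, kinds::Vector)
    for (k, info) in enumerate(kinds)
        info === nothing && continue   # kind without info: skip, don't error
        # Bit i of the (possibly multi-word) mask marks the PU with OS index i.
        word, bit = divrem(i, 64)
        if word < length(info.masks) && (info.masks[word + 1] >> bit) & 0x1 == 0x1
            return k
        end
    end
    return nothing  # unknown kind: caller prints the PU without an annotation
end

osindex_to_kind(0, [nothing])                           # Perlmutter-like case → nothing
osindex_to_kind(4, [nothing, (masks = UInt64[0xf0],)])  # → 2
```

With this fallback, `topology(; cpukind=true)` can simply omit the kind annotation instead of raising `type Nothing has no field masks`.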

@carstenbauer
Member Author

(The CI failure seems to be an unrelated hiccup.)

@robertschade

On an Intel 12900K (8 performance cores, 8 efficiency cores),

julia> num_cpukinds()
2
julia> num_virtual_cores_cpukinds()
2-element Vector{Int64}:
  8
 16
julia> Hwloc.get_cpukind_info()
2-element Vector{Union{Nothing, @NamedTuple{masks::Vector{UInt64}, efficiency_rank::Int32, infos::Vector{Hwloc.HwlocInfo}}}}:
 (masks = UInt64[0x0000000000ff0000], efficiency_rank = 0, infos = Hwloc.HwlocInfo[Hwloc.HwlocInfo("FrequencyMaxMHz", "3900"), Hwloc.HwlocInfo("FrequencyBaseMHz", "2400"), Hwloc.HwlocInfo("CoreType", "IntelAtom")])
 (masks = UInt64[0x000000000000ffff], efficiency_rank = 1, infos = Hwloc.HwlocInfo[Hwloc.HwlocInfo("FrequencyMaxMHz", "5100"), Hwloc.HwlocInfo("FrequencyBaseMHz", "3200"), Hwloc.HwlocInfo("CoreType", "IntelCore")])
julia> topology(; cpukind=true)

Machine (30.64 GB)
    Package L#0 P#0 (30.64 GB)
        NUMANode (30.64 GB)
        L3 (30.0 MB)
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#0 P#0 
                PU L#0 P#0 (2, FrequencyMaxMHz=5100; ...)
                PU L#1 P#1 (2, FrequencyMaxMHz=5100; ...)
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#1 P#4 
                PU L#2 P#2 (2, FrequencyMaxMHz=5100; ...)
                PU L#3 P#3 (2, FrequencyMaxMHz=5100; ...)
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#2 P#8 
                PU L#4 P#4 (2, FrequencyMaxMHz=5100; ...)
                PU L#5 P#5 (2, FrequencyMaxMHz=5100; ...)
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#3 P#12 
                PU L#6 P#6 (2, FrequencyMaxMHz=5100; ...)
                PU L#7 P#7 (2, FrequencyMaxMHz=5100; ...)
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#4 P#16 
                PU L#8 P#8 (2, FrequencyMaxMHz=5100; ...)
                PU L#9 P#9 (2, FrequencyMaxMHz=5100; ...)
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#5 P#20 
                PU L#10 P#10 (2, FrequencyMaxMHz=5100; ...)
                PU L#11 P#11 (2, FrequencyMaxMHz=5100; ...)
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#6 P#24 
                PU L#12 P#12 (2, FrequencyMaxMHz=5100; ...)
                PU L#13 P#13 (2, FrequencyMaxMHz=5100; ...)
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#7 P#28 
                PU L#14 P#14 (2, FrequencyMaxMHz=5100; ...)
                PU L#15 P#15 (2, FrequencyMaxMHz=5100; ...)
            L2 (2.0 MB)
                L1 (32.0 kB) + Core L#8 P#32 
                    PU L#16 P#16 (1, FrequencyMaxMHz=3900; ...)
                L1 (32.0 kB) + Core L#9 P#33 
                    PU L#17 P#17 (1, FrequencyMaxMHz=3900; ...)
                L1 (32.0 kB) + Core L#10 P#34 
                    PU L#18 P#18 (1, FrequencyMaxMHz=3900; ...)
                L1 (32.0 kB) + Core L#11 P#35 
                    PU L#19 P#19 (1, FrequencyMaxMHz=3900; ...)
            L2 (2.0 MB)
                L1 (32.0 kB) + Core L#12 P#36 
                    PU L#20 P#20 (1, FrequencyMaxMHz=3900; ...)
                L1 (32.0 kB) + Core L#13 P#37 
                    PU L#21 P#21 (1, FrequencyMaxMHz=3900; ...)
                L1 (32.0 kB) + Core L#14 P#38 
                    PU L#22 P#22 (1, FrequencyMaxMHz=3900; ...)
                L1 (32.0 kB) + Core L#15 P#39 
                    PU L#23 P#23 (1, FrequencyMaxMHz=3900; ...)
    HostBridge 
        PCI 00:02.0 (VGA)
            GPU "renderD128"
            GPU "card0"
        PCIBridge 
            PCI 01:00.0 (NVMExp)
                Block(Disk) "nvme0n1"
        PCIBridge 
            PCI 02:00.0 (NVMExp)
                Block(Disk) "nvme1n1"
        PCIBridge 
            PCI 05:00.0 (Ethernet)
                Net "enp5s0"
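The `[8, 16]` counts follow directly from the masks: each PU with OS index `i` sets bit `i` of its kind's mask, so the per-kind virtual-core count is just the population count. A small illustrative check (mask values copied from the output above):

```julia
# Masks from the 12900K output above: the E-core PUs occupy OS indices 16–23,
# the P-core hardware threads occupy OS indices 0–15 (SMT, two PUs per core).
masks = UInt64[0x0000000000ff0000, 0x000000000000ffff]

# Population count of each mask = number of PUs (virtual cores) of that kind.
counts = count_ones.(masks)  # [8, 16], matching num_virtual_cores_cpukinds()
```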

@carstenbauer
Member Author

That looks nice / as expected. Thanks for testing @robertschade!

@JBlaschke
Contributor

There seems to be a problem on Perlmutter

@JBlaschke Fixed (tested on a Perlmutter login node). Apparently hwloc can't obtain the CPU kind information on Perlmutter and my handling of this case was flawed.

Thanks! Do you know why this information isn't available on PM? We've got normal AMD Milan CPUs there...

@carstenbauer
Member Author

Do you know why this information isn't available on PM? We've got normal AMD Milan CPUs there...

No idea, hwloc is my abstraction and it doesn't provide it (lstopo --cpukinds also prints nothing).

@carstenbauer
Member Author

FYI: I plan to merge this tomorrow.

@JBlaschke
Contributor

No idea, hwloc is my abstraction and it doesn't provide it (lstopo --cpukinds also prints nothing).

OK, so that's probably our fault then -- I'll follow up internally.

FYI: I plan to merge this tomorrow.

Brilliant! Thanks for your work on this :)

@carstenbauer merged commit 2ee0444 into master on Jun 18, 2024
39 of 40 checks passed
Successfully merging this pull request may close these issues.

Try to support --cpukinds information