Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add get_online_cpus() utiilty function #39

Merged
merged 6 commits into from
May 14, 2024

Conversation

gmarler
Copy link
Collaborator

@gmarler gmarler commented May 10, 2024

Attaching perf events can only be done for CPUs that are online - merely being present or possible isn't sufficient to allow such an event to be attached.

It's also possible for online CPU ranges to not be contiguous, due to NUMA topology or CPUs being offlined in some fashion.

So adding a utility function that provides a Vec of individual (not ranges of) online CPUs and using that instead of num_possible_cpus(), which could be much larger than the number of online CPUs.

Also updating code that tries to size certain data structures based on the number of online CPUs.

There are still places where the possible number of CPUs is still needed.

This is due to certain eBPF (libbpf) internals that have to allocate buffers based on the possible number of CPUs, rather than the online number - you'll only notice this on machines where they differ, which is pretty often for both physical and virtual machines.

Tested this one line of code on a host where the online number of CPUs was 4, but the possible was 8 to verify it works as expected (and fails when you try to use the online number instead).

@gmarler gmarler requested a review from javierhonduco May 10, 2024 13:48
Copy link
Owner

@javierhonduco javierhonduco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this! Left some comments (might be slow to reply today, not feeling very well)

src/util.rs Outdated
cpu_range
.trim_end()
.parse::<u32>()
.expect("Unable to parse lone CPU number"),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about returning an error here rather than panicking?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the intent - fixing that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

src/util.rs Outdated
let end = cpu_range[index + 1..]
.trim_end()
.parse::<u32>()
.expect("ending CPU number");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here and above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

assert_eq!(
cpus,
(0..=1)
.chain(3..=3)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL about chain(), very cool!

Copy link
Owner

@javierhonduco javierhonduco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this very needed change! LGTM.

I thought I had left a comment on some small improvement to how we test this. Ideally we would not rely on the filesystem (it might be wiped during test execution and cause the test to fail), which perhaps we could address by passing a memory buffer rather than an actual file (see how we do this for the kernel symbols parser). This is not a high priority of course, so I am happy to merge as-is and change this later on!

@javierhonduco javierhonduco merged commit 8525177 into javierhonduco:main May 14, 2024
4 checks passed
@javierhonduco
Copy link
Owner

BTW I tested it locally on my x86 box and worked like a charm 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants