deterministic hugepage size, set without kexec #12
@@ -0,0 +1 @@
CMDLINE_LINUX="$CMDLINE_LINUX default_hugepagesz=1G"
We do not want to change the hugepagesz, the default (2MB) is fine.
Just to follow up on the discussion we had recently:
On Cascade Lake we have:
- L1 TLB:
- 4x32 2MB entries covering 128MB of memory
- 1x4 1G entries covering 4GB of memory
Whether 2M or 1G pages are the better option pretty much depends on the locality of your workload. My suggestion would be the same as @fwiesel's: let's stick with 2MB pages for the time being.
Once everything is working, we can still run benchmarks to figure out whether 1G pages actually help.
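For anyone who wants to check which default hugepage size a booted node actually ended up with, something like the following could work. It is only a minimal sketch: the function name default_hugepage_kb and its optional file argument are made up for illustration, and the only thing assumed about the system is the Hugepagesize field in /proc/meminfo.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): print the default hugepage size in kB.
# Reads /proc/meminfo unless another file is passed as $1 (handy for testing).
default_hugepage_kb() {
    meminfo="${1:-/proc/meminfo}"
    # The relevant line looks like "Hugepagesize:       2048 kB".
    awk '$1 == "Hugepagesize:" { print $2 }' "$meminfo"
}
```

Calling default_hugepage_kb without arguments should print 2048 with the default 2MB pages and 1048576 if default_hugepagesz=1G were set on the kernel command line.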
#cmdline="$(</proc/cmdline) hugepages=$hugepages"
#release=$(uname -r)

#NEWROOT=${NEWROOT:-/sysroot}

#kexec \
# -l $NEWROOT/boot/vmlinuz-${release} \
# --initrd=$NEWROOT/boot/initrd.img-${release} \
# --command-line="$cmdline"

#kexec -e --reset-vga
Could you please remove it instead?
@@ -12,5 +12,5 @@ depends() {
# Install the required file(s) and directories for the module in the initramfs.
install() {
    inst_hook pre-pivot 00 "$moddir/ensure-hugepages.sh"
    inst kexec
    #inst kexec
Please remove that comment as well.

NEWROOT=${NEWROOT:-/sysroot}
echo $hugepages > /proc/sys/vm/nr_hugepages
# might get more hugepages and faster with the kernel cmdline, but like this you avoid a kexec reboot.
Just for my understanding: What exactly do you mean by this comment? Why do we avoid a kexec reboot when we do it like this?
@gehoern moved applying hugepages to a slightly later point in the boot process, which means the Linux kernel might have to move some pages around to do so (if any, then only a few at this point in time).
@NotTheEvilOne this part I understand, I just don't understand why we would need a kexec cycle otherwise.
kexec was originally used to apply the changes. This has been rewritten in that PR.
Ah, then the comment makes sense. Thanks for elaborating.
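Since writing to nr_hugepages at runtime can yield fewer pages than requested when memory is already fragmented (the "might get more hugepages ... with the kernel cmdline" part of the code comment), it may be worth reading the value back after the write. A rough sketch of that idea follows; the function name request_hugepages and its second argument are hypothetical and not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): request N hugepages, then verify.
# $1 = requested page count, $2 = sysctl file (defaults to the real one).
request_hugepages() {
    want="$1"
    nr_file="${2:-/proc/sys/vm/nr_hugepages}"
    echo "$want" > "$nr_file" || return 1
    # Reading the file back returns the number of pages actually allocated,
    # which can be lower than the number requested.
    got=$(cat "$nr_file")
    if [ "$got" -lt "$want" ]; then
        echo "warning: requested $want hugepages, got $got" >&2
        return 2
    fi
    echo "$got"
}
```

In the dracut hook this could be called as request_hugepages "$hugepages", with the non-zero return code used to log or react to a shortfall.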
hugepages have not been set deterministically, fixed by adding it to the kernel cmdline
added a direct application of hugepages to dracut -> no kexec needed anymore
has a slight delay but in initrd this still should be fast.