Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port of illumos #7446 - zpool create should support efi system partition #11029

Closed
wants to merge 1 commit into from

Conversation

nshalman
Copy link

@nshalman nshalman commented Oct 8, 2020

Motivation and Context

Original illumos issue from @tsoome: https://illumos.org/issues/7446

Since we support whole-disk configuration for boot pool, we also will need whole disk support with UEFI boot and for this, zpool create should create efi-system partition.

We have borrowed the idea from Oracle Solaris, and introduced a zpool create -B switch to provide a way to specify that a boot partition should be created.

The default partition size is 256MB (the minimum size for FAT32 with 4k blocks).
To support custom sizes, the set on creation "bootsize" property is created and so the custom size can be set as e.g.:
zpool create -B -o bootsize=34MB rpool sda

After the pool is created, the "bootsize" property is read only. When the -B switch is not used, the bootsize defaults to 0 and is shown in zpool get output with value '-'. Older zfs/zpool implementations are ignoring this property.

Abandoned previous attempt: #7664

Description

Since we support whole-disk configuration for boot pool, we also will need whole disk support with UEFI boot and for this, zpool create should create efi-system partition.

How Has This Been Tested?

Local compilation and testing on Debian Buster using a 4.19 kernel:

  • Able to create both with and without -B
  • Able to import and export pools created in both fashions.

Help Requested

There's one Linux specific file that presumably needs to be ported to FreeBSD. Pinging @tsoome (thanks for the original work!!!) and @allanjude for visibility.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

Authored by: Toomas Soome <[email protected]>
Reviewed by: Andrew Stormont <[email protected]>
Reviewed by: Yuri Pankov <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
Ported-by: Nahum Shalman <[email protected]>

OpenZFS-issue: https://illumos.org/issues/7446
OpenZFS-commit: openzfs/openzfs@7855d95

Reverts "Remove unused `zpool_is_bootable`"
This reverts commit 663a070.
@PrivatePuffin
Copy link
Contributor

PrivatePuffin commented Oct 8, 2020

@nshalman Awesome, was looking forward to the (draft) PR of this! :)

Some sidenotes :

OpenZFS-issue: https://illumos.org/issues/7446

illumnos != (or =/= ) OpenZFS, OpenZFS issues are not filled @ illumnos anymore...

OpenZFS-commit: openzfs/openzfs@7855d95

OpenZFS issues and PR's are filed here. The OpenZFS/OpenZFS repo has been depricated/archieved about 10 months ago and replaced with this one.

You can safely remove those two references and copy any relevant info to this place instead, to keep a clean PR log. 💯


On-topic/code:
I'm not the one to review this fully, but love the simplicity and care for documentation!

@nshalman
Copy link
Author

nshalman commented Oct 8, 2020

@Ornias1993

You can safely remove those two references and copy any relevant info to this place instead, to keep a clean PR log. 💯

I've cleaned that up a bit. Let me know if you'd like further changes.

On-topic/code:
I'm not the one to review this fully, but love the simplicity and care for documentation!

@tsoome gets all credit for the work and the documentation!

((vtoc->efi_lbasize == 512 &&
boot_size < 33 * 1024 * 1024) ||
(vtoc->efi_lbasize == 4096 &&
boot_size < 256 * 1024 * 1024))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this size may not be enough to create fat32

@nshalman
Copy link
Author

nshalman commented Oct 8, 2020

Some notes from IRC:

<nahamu>        tsoome: I'm trying to port your create -B work again: https://github.com/openzfs/zfs/pull/11029
<tsoome>        nahamu please note it does have 2 bugs and one concern:)
<nahamu>        tsoome: Thanks!
<nahamu>        tsoome: are they fixed in follow up commits somewhere I can look?
<tsoome>        one concern is that ESP size calculation for 4k sector size may result in too small partition
<nahamu>        I know we're also going to need a FreeBSD port.
<tsoome>        no, i just have not had time to address them
<tsoome>        if you have access to 4kn disk, check the min size to create esp, its likely just above 256MB
<nahamu>        what's the failure mode? how would I trigger it?
<tsoome>        fat32 creation will fail if there are too few sectors
<nahamu>        and your concern is only that we want a reasonable default; if someone wants to make it smaller that's their business, right?
<tsoome>        with 512 the min FAT32 is ~33MB
<tsoome>        no, the boot_size < 256 * 1024 * 1024 may need larger constant
<nahamu>        should the constant in the code be a number of sectors and the default codepath should multiply that sector count by the sector size?
<tsoome>        so, the other bug is about hiding ESP size from pool; we do use ZPOOL_PROP_BOOTSIZE property, but some pool layout (was it mirror?) was still showing that space unfder expandsz prop
<nahamu>        tsoome: how tricky do think fixing that part will be?
<tsoome>        it needs experiment to create fat32 with 4kn sector size
<nahamu>        Oh I meant the hiding the space from expandsize
<tsoome>        ah, that one needs to confirm the issue, then the vdev type specific fix.
<tsoome>        raidz is OK, so I think it was mirror
<tsoome>        and the second bug is a bit more ugly -- the current partition creation code is assuming 512B sectors...  it does not break things, but is wasting space:D
<tsoome>        I did start something to address the issue, but it is still wip, but perhaps it can still be helpful https://code.illumos.org/c/illumos-gate/+/207
<nahamu>        thanks
<tsoome>        as you see, those things are linked:)
<tsoome>        anyhow, those are not fatal problems, just wanted to point out we are not perfect yet:)

zpool_label_name(vtoc->efi_parts[0].p_name, EFI_PART_NAME_LEN);
/* first fix the partition start block */
if (start_block == MAXOFFSET_T)
start_block = NEW_START_BLOCK;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future reference, there is a small issue - the block numbers around here are in units of 512B.

vtoc->efi_parts[8].p_tag = V_RESERVED;
/*
* EFI System partition is using slice 0.
* ZFS is on slice 1 and slice 8 is reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we finally get rid of the old Sun manufacturing slice 8? it has nothing to do with ZFS

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The disk expansion code in lib/libefi/rdwr_efi.c which is needed when doing an expansion with zpool online -e <pool> <disk> (which is a feature I care about) seems to currently also depend on the existence of that reserved partition.
Also, unfortunately in direct tension with this PR as written is the fact that the linux specific zfs_append_partition in lib/libzutil/os/linux/zutil_device_path_os.c currently seems to assume that the pool is on always partition 1.

So in principle, "sure." In practice, maybe that should be a separate PR? Also, I'm not sure how much backward compatibility we want to retain for expanding pools both with and without that extra partition. Maintaining the backward compatibility might be harder than keeping the partition around on new pools.

@tsoome
Copy link
Contributor

tsoome commented Oct 18, 2020

Some notes from IRC:

<nahamu>        tsoome: I'm trying to port your create -B work again: https://github.com/openzfs/zfs/pull/11029
<tsoome>        nahamu please note it does have 2 bugs and one concern:)
<nahamu>        tsoome: Thanks!
<nahamu>        tsoome: are they fixed in follow up commits somewhere I can look?
<tsoome>        one concern is that ESP size calculation for 4k sector size may result in too small partition
<nahamu>        I know we're also going to need a FreeBSD port.
<tsoome>        no, i just have not had time to address them
<tsoome>        if you have access to 4kn disk, check the min size to create esp, its likely just above 256MB
<nahamu>        what's the failure mode? how would I trigger it?
<tsoome>        fat32 creation will fail if there are too few sectors
<nahamu>        and your concern is only that we want a reasonable default; if someone wants to make it smaller that's their business, right?
<tsoome>        with 512 the min FAT32 is ~33MB
<tsoome>        no, the boot_size < 256 * 1024 * 1024 may need larger constant
<nahamu>        should the constant in the code be a number of sectors and the default codepath should multiply that sector count by the sector size?
<tsoome>        so, the other bug is about hiding ESP size from pool; we do use ZPOOL_PROP_BOOTSIZE property, but some pool layout (was it mirror?) was still showing that space unfder expandsz prop
<nahamu>        tsoome: how tricky do think fixing that part will be?
<tsoome>        it needs experiment to create fat32 with 4kn sector size
<nahamu>        Oh I meant the hiding the space from expandsize
<tsoome>        ah, that one needs to confirm the issue, then the vdev type specific fix.
<tsoome>        raidz is OK, so I think it was mirror
<tsoome>        and the second bug is a bit more ugly -- the current partition creation code is assuming 512B sectors...  it does not break things, but is wasting space:D
<tsoome>        I did start something to address the issue, but it is still wip, but perhaps it can still be helpful https://code.illumos.org/c/illumos-gate/+/207
<nahamu>        thanks
<tsoome>        as you see, those things are linked:)
<tsoome>        anyhow, those are not fatal problems, just wanted to point out we are not perfect yet:)

Please note, the expandsz issue turned out to be entirely illumos issue and result of mis-merge of another feature. See also https://code.illumos.org/c/illumos-gate/+/963

@PrivatePuffin
Copy link
Contributor

Please note: illumnos bugs/issues are not related to this project (anymore). Illumnos is not part of the list of OS’s this project develops for.

Want to say that extra clearly now, due to all the illumnos references.

@tsoome
Copy link
Contributor

tsoome commented Oct 18, 2020

Please note: illumnos bugs/issues are not related to this project (anymore). Illumnos is not part of the list of OS’s this project develops for.

Want to say that extra clearly now, due to all the illumnos references.

This specific feature was developed on, and is being ported from illumos, so the comments and notes are relevant to this PR. Are you saying which OS, and which notes should or should not be mentioned?

@PrivatePuffin
Copy link
Contributor

PrivatePuffin commented Oct 18, 2020

The constant illumnos references just get increasingly confusing....

Als there doesnt exist a “openzfs 7446”, see the new contribution guidelines: openzfs ports are dropped, because this project == openzfs ;)
(So the title is also quite confusing)

@IvanVolosyuk
Copy link
Contributor

IvanVolosyuk commented Oct 18, 2020

I'm trying to bisect the minimum fat32 size using:

g() { truncate -s $1 /var/tmp/fat; losetup -b 4096 /dev/loop0 /var/tmp/fat; mkfs.vfat -F32 /dev/loop0; losetup -d /dev/loop0;rm /var/tmp/fat; }

@tsoome
Copy link
Contributor

tsoome commented Oct 18, 2020

FYI: Looks like minimum size of fat32 with 4096b sectors is 3000000000. Found by careful bisecting using linux mkfs.vfat.

g() { truncate -s $1 /var/tmp/fat; losetup -b 4096 /dev/loop0 /var/tmp/fat; mkfs.vfat -F32 /dev/loop0; losetup -d /dev/loop0;rm /var/tmp/fat; }
$ g 2999999999
losetup: /var/tmp/fat: Warning: file does not fit into a 512-byte sector; the end of the file will be ignored.
mkfs.fat 4.1 (2017-01-24)
$ g 3000000000
mkfs.fat 4.1 (2017-01-24)

2.8GB? that is not right for minimum size, you probably need to set sectors per cluster 1 (-s 1). Should end up somewhere ~260MB.

@IvanVolosyuk
Copy link
Contributor

I'm redoing it now. I made a mistake there was another error reported by losetup. Thanks for '-s 1' I missed it. Here the new result.
$ g() { truncate -s $1 /var/tmp/fat; losetup -b 4096 /dev/loop0 /var/tmp/fat; mkfs.vfat -F32 -s 1 /dev/loop0 2>&1 | grep enough; file -s /dev/loop0; losetup -d /dev/loop0;rm /var/tmp/fat; }

$ g 269062143
losetup: /var/tmp/fat: Warning: file does not fit into a 512-byte sector; the end of the file will be ignored.
WARNING: Not enough clusters for a 32 bit FAT!
/dev/loop0: DOS/MBR boot sector, code offset 0x58+2, OEM-ID "mkfs.fat", Bytes/sector 4096, Media descriptor 0xf8, sectors/track 32, heads 64, sectors 65688 (volumes > 32 MB), FAT (32 bit), sectors/FAT 64, serial number 0x74e83467, unlabeled

$ g 269062144
/dev/loop0: DOS/MBR boot sector, code offset 0x58+2, OEM-ID "mkfs.fat", Bytes/sector 4096, Media descriptor 0xf8, sectors/track 32, heads 64, sectors 65689 (volumes > 32 MB), FAT (32 bit), sectors/FAT 64, serial number 0x74d0a2a0, unlabeled

So, the minimum size seems 269062144.

@nshalman nshalman changed the title OpenZFS 7446 - zpool create should support efi system partition Port of illumos #7446 - zpool create should support efi system partition Oct 26, 2020
@nshalman
Copy link
Author

Als there doesnt exist a “openzfs 7446”, see the new contribution guidelines: openzfs ports are dropped, because this project == openzfs ;)
(So the title is also quite confusing)

Title fixed too.

@nshalman
Copy link
Author

So, the minimum size seems 269062144.

Based in the disk format of my work linux laptop, I'm inclined to just bump the default EFI partition size all the way up to 512MB. Anyone who is using this feature by simply adding -B in 2020 or beyond will probably have enough disk space to be able to afford that, and anyone who cares can manually specify a smaller size that works for them.

@nshalman
Copy link
Author

The code as written doesn't currently work properly because the linux specific zfs_append_partition in lib/libzutil/os/linux/zutil_device_path_os.c currently seems to assume that the pool is on always partition 1.
More work is needed. :(

@nshalman
Copy link
Author

nshalman commented Jul 6, 2021

I think I was in over my head. :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants