This repository has been archived by the owner on Oct 16, 2020. It is now read-only.

coreos-install creates ESP with FAT16 instead of the recommended FAT32 #2246

Closed
mortenlj opened this issue Nov 11, 2017 · 10 comments · Fixed by coreos/scripts#806


@mortenlj

Issue Report

Bug

Container Linux Version

coreos-install taken directly from the master branch; I don't know which version that corresponds to.

Environment

Bare-metal. ASUS laptop (yes, I know this is not the typical target system, but there should be no reason it shouldn't work).

Expected Behavior

The install-script creates an ESP, using FAT32, that will be recognized by the UEFI firmware and used for booting.

Actual Behavior

The FAT16 partition is not recognized as an ESP, and hence the system won't boot.

The ASUS UEFI implementation is probably stricter about what it accepts than others, but as far as I can tell, the UEFI spec does explicitly say that the ESP should use a variant of FAT32.
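
For anyone who wants to check which FAT variant an ESP carries, here is a minimal sketch. It assumes the ESP is readable as an image file or block device, and it only inspects the informational type string in the boot sector (offset 54 for FAT12/FAT16, offset 82 for FAT32). The FAT specification derives the real type from the cluster count, so treat the result as a hint. The function name `fat_type_label` is made up for this example.

```shell
# Print the FAT type string found in the boot sector of an image or device.
# Hint only: this reads the informational label, not the cluster count.
fat_type_label() {
    img=$1
    # FAT32 stores its type string at byte offset 82, FAT12/16 at offset 54.
    l32=$(dd if="$img" bs=1 skip=82 count=8 2>/dev/null | tr -d '\0')
    l16=$(dd if="$img" bs=1 skip=54 count=8 2>/dev/null | tr -d '\0')
    case "$l32" in FAT32*) echo FAT32; return 0 ;; esac
    case "$l16" in
        FAT16*) echo FAT16; return 0 ;;
        FAT12*) echo FAT12; return 0 ;;
    esac
    echo unknown
}
```

Something like `sudo sh -c '. ./fat.sh; fat_type_label /dev/sda1'` would run it against a real partition, assuming the function is saved in a hypothetical `fat.sh`.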

Reproduction Steps

  1. Install Container Linux using the coreos-install script

Other Information

https://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface

Such a setup is usually referred to as UEFI-GPT, while ESP is recommended to be at least 512 MiB in size and formatted with a FAT32 filesystem for maximum compatibility.

@bgilbert
Contributor

Nice catch! If you copy the files out of the ESP, reformat the filesystem, and copy the files back, does your system boot? Be sure to label the filesystem EFI-SYSTEM, i.e.,

mkfs.vfat -F 32 -n EFI-SYSTEM /dev/sda1

coreos-install is really just downloading and writing a disk image, so the problem is in the formatting tool that runs during image creation. It seems we've always had this bug.
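
The copy-out, reformat, copy-back steps could be sketched roughly as follows. This is only an illustration: the mount point and backup directory are hypothetical, `/dev/sda1` is assumed to be the ESP, and a real run needs root. With `DRY_RUN=1` the function only prints the commands it would run.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Back up the ESP contents, reformat the partition as FAT32 with the
# EFI-SYSTEM label, then restore the files. The && chain stops at the
# first failure so the partition is never reformatted without a backup.
reformat_esp() {
    dev=$1      # ESP block device (assumed, e.g. /dev/sda1)
    mnt=$2      # temporary mount point (hypothetical)
    backup=$3   # directory that receives the backup (hypothetical)

    run mkdir -p "$mnt" "$backup" &&
    run mount "$dev" "$mnt" &&
    run cp -a "$mnt/." "$backup/" &&
    run umount "$mnt" &&
    run mkfs.vfat -F 32 -n EFI-SYSTEM "$dev" &&
    run mount "$dev" "$mnt" &&
    run cp -r "$backup/." "$mnt/" &&
    run umount "$mnt"
}
```

A dry run such as `DRY_RUN=1 reformat_esp /dev/sda1 /mnt/esp /tmp/esp-backup` lets you review the eight commands before running anything destructive.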

@mortenlj
Author

I tried copying the files out and reformatting, but for some reason I get an error about invalid GPT signature when I try booting afterwards. I had to manually add the boot entry, for reasons unknown, so I might have done something wrong there.

I booted using EFI\boot\bootx64.efi, which gave me a GRUB with three options, coreos default, usr-a and usr-b. default and usr-a gave me the invalid GPT signature error, usr-b gave me an error about vmlinux-b not found (or something similar).

> coreos-install is really just downloading and writing a disk image, so the problem is in the formatting tool that runs during image creation. It seems we've always had this bug.

Would it just be a matter of adding -F 32 to the command you linked to, or is that function used for other things too so it needs to be configurable?

@bgilbert
Contributor

> I tried copying the files out and reformatting, but for some reason I get an error about invalid GPT signature when I try booting afterwards. I had to manually add the boot entry, for reasons unknown, so I might have done something wrong there.

Container Linux never installs a boot entry; it always boots on EFI via Default Boot Behavior. Installing a boot entry by hand should work, or else deleting all boot entries.
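
As an illustration of adding a boot entry by hand, the sketch below builds an `efibootmgr` command line and prints it instead of running it. It assumes the ESP is partition 1 of `/dev/sda` and that the loader is the default `\EFI\boot\bootx64.efi` mentioned in this thread; the helper name `boot_entry_cmd` and the label are hypothetical.

```shell
# Print an efibootmgr command that would create a boot entry pointing at
# the default loader on the ESP. Nothing is executed; review the output,
# then run it as root if it matches your disk layout.
boot_entry_cmd() {
    disk=$1    # whole disk holding the ESP (assumed, e.g. /dev/sda)
    part=$2    # ESP partition number (assumed, e.g. 1)
    label=$3   # label shown in the firmware boot menu (hypothetical)
    printf '%s\n' "efibootmgr --create --disk $disk --part $part --label \"$label\" --loader '\\EFI\\boot\\bootx64.efi'"
}
```

For example, `boot_entry_cmd /dev/sda 1 CoreOS` prints the command to copy into a root shell.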

Did the switch to FAT32 at least allow you to make progress, or is it possible that the missing boot entry was the original problem as well?

What was the exact text of the invalid signature message?

> I booted using EFI\boot\bootx64.efi, which gave me a GRUB with three options, coreos default, usr-a and usr-b. default and usr-a gave me the invalid GPT signature error, usr-b gave me an error about vmlinux-b not found (or something similar).

On a freshly-installed image, USR-B won't work, and USR-A and default are equivalent except for some detection steps.

> Would it just be a matter of adding -F 32 to the command you linked to, or is that function used for other things too so it needs to be configurable?

I have a two-line patch that adds -F 32 only for partitions whose type GUID corresponds to the EFI system partition. We don't use VFAT for anything else, but the code might as well handle all cases.
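
To illustrate the idea (this is not the actual patch, only a sketch in shell), the extra `mkfs.vfat` arguments can be chosen from the partition's type GUID. The EFI system partition type GUID is `C12A7328-F81F-11D2-BA4B-00A0C93EC93B`; the function name `mkfs_vfat_args` is hypothetical.

```shell
# Type GUID of the EFI system partition, lower-cased for comparison.
ESP_TYPE_GUID="c12a7328-f81f-11d2-ba4b-00a0c93ec93b"

# Print the extra mkfs.vfat arguments for a partition with the given
# type GUID: force FAT32 for the ESP, nothing for any other partition.
mkfs_vfat_args() {
    type_guid=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    if [ "$type_guid" = "$ESP_TYPE_GUID" ]; then
        echo "-F 32"
    fi
}
```

A caller could then splice the result into its formatting command, leaving other VFAT partitions (if any ever appear) on the tool's default FAT size.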

@mortenlj
Author

> Container Linux never installs a boot entry; it always boots on EFI via Default Boot Behavior. Installing a boot entry by hand should work, or else deleting all boot entries.
>
> Did the switch to FAT32 at least allow you to make progress, or is it possible that the missing boot entry was the original problem as well?

Yeah, switching to FAT32 made the firmware detect the partition, so it was definite progress. There were no boot entries defined, so it should have used the default boot behavior, but for some reason that didn't happen. Adding the equivalent boot entry worked.

> What was the exact text of the invalid signature message?

I will try to get time for another attempt this evening, and grab a picture of it so I can get the exact text.

> On a freshly-installed image, USR-B won't work, and USR-A and default are equivalent except for some detection steps.

Ok, I suspected as much. This is to be expected then, so the remaining issue is the invalid GPT signature problem. I will get back to you with details about that when I get a chance to look at it.

@mortenlj
Author

> What was the exact text of the invalid signature message?

Booting `CoreOS default'

error: invalid GPT signature.

Reading or updating the GPT failed!
Please file a bug with any messages above to CoreOS:
 https://issues.coreos.com

Aborted. Press enter to exit GRUB.

I pressed enter.

error: can't find command `exit'.
error: file `/coreos/vmlinuz-b' not found.

Press any key to continue...

I pressed a key...


   Failed to boot both default and fallback entries.

Press any key to continue...

Pressing any key here returns to the GRUB menu, where default, usr-a and usr-b are the options to select from.

I don't know if this is a consequence of the reformatting of the partition, or a separate issue.
I'm out travelling for a few days now, but I can continue troubleshooting this when I get back. I might also just try to run the install-script again, after your fix has been merged.

@bgilbert
Contributor

error: can't find command 'exit' shouldn't happen. It's not directly related to the problem, but it's odd. The subsequent behavior is a result of that error.

invalid GPT signature is coming from GRUB code. It's not immediately clear why, but I agree that testing a fixed image is the next step. When you get a chance, please try this test image:

wget 'https://users.developer.core-os.net/bgilbert/boards/amd64-usr/1590.0.0%2B2017-11-13-1552/coreos_production_image.bin.bz2'
sudo coreos-install -f coreos_production_image.bin.bz2 [...]

@mortenlj
Author

I've tested your image now, and that worked like a charm.

@bgilbert
Contributor

Great. The fix should be included in the next alpha (1618.0.0), due in a couple weeks. Thanks again for reporting!

@ajeddeloh

Reopening, since we are reverting this due to a GRUB bug.

@ajeddeloh reopened this Dec 18, 2017

@bgilbert
Contributor

When this is fixed, the partitioning docs should be updated to say FAT32, since VFAT is not a thing.
