Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

u32 overflow in finding fat_offset from cluster #74

Closed
gauteh opened this issue Feb 20, 2023 · 30 comments · Fixed by #110
Closed

u32 overflow in finding fat_offset from cluster #74

gauteh opened this issue Feb 20, 2023 · 30 comments · Fixed by #110

Comments

@gauteh
Copy link
Contributor

gauteh commented Feb 20, 2023

Screenshot_20230216_230927_Chrome

I only have this message from a device deployed in the field. It seems to crash on:

                let fat_offset = cluster.0 * 4;

where everything involved is u32s.

@gauteh
Copy link
Contributor Author

gauteh commented Jun 7, 2023

This happens on a SD-card with 2.2GB used (2249504 bytes) and exactly 1024 files (of approx 2.3M, 2252800 bytes each).

@thejpster
Copy link
Member

What is the cluster number? You can't have anywhere near 2**30 clusters on a FAT32 volume, so that multiplication should never overflow.

@jonathanpallant
Copy link
Collaborator

Maybe we should test here for the magic cluster IDs. I suspect we've walked too far down the chain and set the current cluster to be a magic cluster ID (like Cluster::END_OF_FILE).

@gauteh
Copy link
Contributor Author

gauteh commented Sep 18, 2023

That could be. When this issue happened I tried to re-create the issue using the LinuxBlockDevice target, but couldn't figure it out at that time. Maybe some usize difference between my arm-target ?

@jonathanpallant
Copy link
Collaborator

If you were able to share a copy of the code and the disk image, that would be brilliant.

Otherwise we probably need a more exhaustive test suite. One approach that works well is to use embedded-sdmmc to perform "operations" on a disk image, and also use the OS native APIs to perform the same "operations" to a local folder (perhaps a loop-back mounted FAT disk image). The "operations" would include randomly creating a new file / folder, randomly truncating a random file by a random amount, appending random data to a random file, overwriting a random file with random data and randomly deleting a file/folder. Run that for a million iterations and then check that the files in both disk images are the same.

@gauteh
Copy link
Contributor Author

gauteh commented Sep 18, 2023

The code that uses the lib is here: https://github.com/gauteh/sfy/blob/main/sfy-buoy/src/storage/mod.rs#L170, I'm not sure which of my code-lines trigger the error.

I don't have a disk image, but I do have a folder with the contents of the SD-card which I will share with you. I think the code now fails at trying to find and open a new data file: https://github.com/gauteh/sfy/blob/main/sfy-buoy/src/storage/mod.rs#L400

https://drive.google.com/file/d/1QBETOG7isGc-aRrycQHYl9z-8alxCza6/view?usp=sharing

@thejpster
Copy link
Member

Your write function doesn't close the file correctly. Files cannot close on drop because they don't hold references to the volume manager.

@gauteh
Copy link
Contributor Author

gauteh commented Sep 22, 2023

@jonathanpallant
Copy link
Collaborator

I did not see the drop impl!!

@gauteh
Copy link
Contributor Author

gauteh commented Sep 22, 2023

Ok 😊 then I think I should be good wrt the file close.

@jonathanpallant
Copy link
Collaborator

Do you have CRC checks enabled on the SD Card interface?

@gauteh
Copy link
Contributor Author

gauteh commented Oct 2, 2023

Do you have CRC checks enabled on the SD Card interface?

Yes.

@chen-gz
Copy link

chen-gz commented Dec 10, 2023

I got the same error. The number of sectors I have is 31116288.
image
The value I have is "cluster.0 = 4294967292"

Following is call stacks
image
image
image

@thejpster
Copy link
Member

That's 0xFFFFFFFC which is not a valid cluster number but a magic value used as a marker (I think it means end of chain).

So I guess the fix is to detect if you are asking for the next cluster if you are at the end of the chain, and report None.

Sounds like the file length and number of clusters in the chain are out of step though. It would be interesting to image the disk, run Scandisk on it, and then image it again. If you can provide an image as a reproducer that would be great.

If you can't provide an image because you're working on a proprietary software then first up, please note that no warranty is given, express or implied, and second, there are companies out there who can do commercial support for open source software, and under NDA if required.

@chen-gz
Copy link

chen-gz commented Dec 12, 2023

I would like to provide the files, but I'll need to find a small disk and create an image for it. It may take two or more days since I've been busy recently.

@chen-gz
Copy link

chen-gz commented Dec 13, 2023

That's 0xFFFFFFFC which is not a valid cluster number but a magic value used as a marker (I think it means end of chain).

So I guess the fix is to detect if you are asking for the next cluster if you are at the end of the chain, and report None.

Sounds like the file length and number of clusters in the chain are out of step though. It would be interesting to image the disk, run Scandisk on it, and then image it again. If you can provide an image as a reproducer that would be great.

If you can't provide an image because you're working on a proprietary software then first up, please note that no warranty is given, express or implied, and second, there are companies out there who can do commercial support for open source software, and under NDA if required.

Do you have any tool recommendation to image the disk? I reproduced the error. But the disk is 32GB. I tried use HxD but it will take very long time to export.
Following code reproduced the error.

    let mut volume_mgr = embedded_sdmmc::VolumeManager::new(sd, ts);
    let volume0 = volume_mgr
        .open_volume(embedded_sdmmc::VolumeIdx(0))
        .unwrap();
    defmt::info!("Volume 0: {:?}", volume0);
    let mut root_dir = volume_mgr.open_root_dir(volume0).unwrap();

    let mut file_num = 0;
    loop {
        file_num += 1;
        let mut file_name = String::<32>::new();
        file_name
            .write_fmt(format_args!("{}.da", file_num))
            .unwrap(); 
        // if put .data as end. I got nametooLong error

        let file = match volume_mgr.open_file_in_dir(
            root_dir,
            file_name.as_str(),
            embedded_sdmmc::Mode::ReadWriteCreateOrTruncate,
        ) {
            Ok(f) => f,
            Err(err) => {
                defmt::error!("open file failed {:?}", err);
                break;
            }
        };
        defmt::info!("open file");
        let buf = b"hello world, from rust";
        let _ = volume_mgr.write(file, &buf[..]).unwrap();

        defmt::info!("write file");
        let _ = volume_mgr.close_file(file).unwrap();
        defmt::info!("close file");
        LED_BLUE.toggle();
    }
    

@jonathanpallant
Copy link
Collaborator

jonathanpallant commented Dec 13, 2023

I would do something like dd if=/dev/mmcblk0 bs=1M status=progress | gzip > disk.img.gz

It will take a very long time to run if you have a 32GB partition, even if most of it is blank.

If you are on Windows, you can use Win32 Disk Imager or similar.

@chen-gz
Copy link

chen-gz commented Dec 13, 2023

Here is image of disk. https://minio.ggeta.com/blog-public-data/disk.img.gz This file is stored in my personal server. Github does not allow me to upload large file. The sha256sum is f0b4b6ac07676351ab98191ddfd479a331862fc7e0e91c89344625ed2453706c disk.img.gz

@thejpster
Copy link
Member

My browser says it's a 5 GiB file and I'm downloading at 60 KiB/sec, so it will take around 24 hours...

@thejpster
Copy link
Member

OK, the start of the file is enough to look at the FAT and the root directory.

$ sudo fdisk ./disk.img
Welcome to fdisk (util-linux 2.38.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): p
Disk ./disk.img: 28.63 MiB, 30020096 bytes, 58633 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x1caebd9f

Device      Boot Start      End  Sectors  Size Id Type
./disk.img1       8192 61067263 61059072 29.1G  c W95 FAT32 (LBA)

So we expect one FAT32 partition starting at 8192 x 512 bytes = 4 MiB into the disk image.

$ hexdump -C disk.img | head -n 20
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  9f bd ae 1c 00 00 00 82  |................|
000001c0  03 00 0c fe ff ff 00 20  00 00 00 b0 a3 03 00 00  |....... ........|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00400000  eb 58 90 4d 53 44 4f 53  35 2e 30 00 02 20 98 0b  |.X.MSDOS5.0.. ..|
00400010  02 00 00 00 00 f8 00 00  3f 00 ff 00 00 20 00 00  |........?.... ..|
00400020  00 b0 a3 03 34 3a 00 00  00 00 00 00 02 00 00 00  |....4:..........|
00400030  01 00 06 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00400040  80 01 29 66 c2 9f a0 4e  4f 20 4e 41 4d 45 20 20  |..)f...NO NAME  |
00400050  20 20 46 41 54 33 32 20  20 20 33 c9 8e d1 bc f4  |  FAT32   3.....|
00400060  7b 8e c1 8e d9 bd 00 7c  88 56 40 88 4e 02 8a 56  |{......|[email protected]|
00400070  40 b4 41 bb aa 55 cd 13  72 10 81 fb 55 aa 75 0a  |@.A..U..r...U.u.|
00400080  f6 c1 01 74 05 fe 46 02  eb 2d 8a 56 40 b4 08 cd  |...t..F..-.V@...|
00400090  13 73 05 b9 ff ff 8a f1  66 0f b6 c6 40 66 0f b6  |.s......f...@f..|
004000a0  d1 80 e2 3f f7 e2 86 cd  c0 ed 06 41 66 0f b7 c9  |...?.......Af...|

Yup, just the partition table inside the first 4 MiB. That's the FAT32 BPB.

Let's see what this crates makes of it:

$ cargo run --example shell -- ~/Downloads/disk.img
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/examples/shell /home/jonathan/Downloads/disk.img`
Opening '/home/jonathan/Downloads/disk.img'...
Volume # 0: found
Failed to open volume 1: FormatError("Partition type not supported")
Failed to open volume 2: FormatError("Partition type not supported")
Failed to open volume 3: FormatError("Partition type not supported")
0:/> stat
Status:
VolumeManager {
    block_device: LinuxBlockDevice {
        file: RefCell {
            value: File {
                fd: 3,
                path: "/home/jonathan/Downloads/disk.img",
                read: true,
                write: true,
            },
        },
        print_blocks: false,
    },
    time_source: Clock,
    id_generator: SearchIdGenerator {
        next_id: 102,
    },
    open_volumes: [
        VolumeInfo {
            volume_id: RawVolume(
                SearchId(
                    100,
                ),
            ),
            idx: VolumeIdx(
                0,
            ),
            volume_type: Fat(
                FatVolume {
                    lba_start: BlockIdx(
                        8192,
                    ),
                    num_blocks: BlockCount(
                        61059072,
                    ),
                    name: "NO NAME    ",
                    blocks_per_cluster: 32,
                    first_data_block: BlockCount(
                        32768,
                    ),
                    fat_start: BlockCount(
                        2968,
                    ),
                    free_clusters_count: Some(
                        1906559,
                    ),
                    next_free_cluster: Some(
                        ClusterId(
                            515,
                        ),
                    ),
                    cluster_count: 1907072,
                    fat_specific_info: Fat32(
                        Fat32Info {
                            first_root_dir_cluster: ClusterId(
                                2,
                            ),
                            info_location: BlockIdx(
                                8193,
                            ),
                        },
                    ),
                },
            ),
        },
    ],
    open_dirs: [
        DirectoryInfo {
            directory_id: RawDirectory(
                SearchId(
                    101,
                ),
            ),
            volume_id: RawVolume(
                SearchId(
                    100,
                ),
            ),
            cluster: ClusterId(
                4294967292,
            ),
        },
    ],
    open_files: [],
}
0:/> 

The FAT starts at block 2968, i.e. there are 2968 reserved blocks. This is a weirdly large number and I expect it to be 32. Anyway, that's byte offset 0x57_3000 in the file.

The info block is block 8193. The root directory starts in Cluster 2, and there are 32 blocks per cluster.

A directory listing shows a SYSTEM~1 folder, and 509 files called like 100.DA which are 22 bytes long; this matches your example code above.

Let's look for the FAT, after 2968 reserved blocks.

$ hexdump -v -C ~/Downloads/disk.img | grep 00573000 -A 25 -B 2
00572fe0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00572ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00573000  f8 ff ff 0f ff ff ff ff  ff ff ff 0f ff ff ff 0f  |................|
00573010  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573020  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573030  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573040  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573050  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573060  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573070  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573080  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573090  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
005730a0  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
005730b0  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
005730c0  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
005730d0  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
005730e0  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
005730f0  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573100  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573110  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573120  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573130  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573140  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573150  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573160  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573170  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573180  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|
00573190  ff ff ff 0f ff ff ff 0f  ff ff ff 0f ff ff ff 0f  |................|

Looks like a FAT. Let's look at it as 32-bit values, with the duplicates removed:

$ hexdump -e '/32 "%08_ax: "' -e '8/4 "%08X " "\n"' ~/Downloads/disk.img  | grep 00573000 -B2 -A16
00401a00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
*
00573000: 0FFFFFF8 FFFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF
00573020: 0FFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF
*
00573800: 0FFFFFFF 0FFFFFFF 0FFFFFFF 00000000 00000000 00000000 00000000 00000000
00573820: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
*
00cb9800: 0FFFFFF8 FFFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF 0FFFFFFF 00000000 00000000
00cb9820: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
*
01400000: 49002042 66006E00 0F006F00 00727200 0061006D 00690074 0000006F 0000006E
01400020: 79005301 74007300 0F006500 006D7200 00560020 006C006F 00000075 0065006D
01400040: 54535953 317E4D45 16202020 B0263D00 578C578C B0270000 0003578C 00000000
01400060: 20202031 20202020 20204144 00000000 00000021 00000000 00060021 00000016
01400080: 20202032 20202020 20204144 00000000 00000021 00000000 00070021 00000016
014000a0: 20202033 20202020 20204144 00000000 00000021 00000000 00080021 00000016
014000c0: 20202034 20202020 20204144 00000000 00000021 00000000 00090021 00000016
014000e0: 20202035 20202020 20204144 00000000 00000021 00000000 000A0021 00000016

0x0FFFFFF8 and 0x0FFFFFFF are end-of-file markers, so the first 515 entries all occupy one cluster each.

With 1907072 clusters in the file, what we see at 0x00cb9800 is the second copy of the FAT, which this library ignores (hence why it doesn't match the first copy). Then at 0x01400000 I assume that's the root directory.

hexdump -C ~/Downloads/disk.img | grep 01400000 -A16
01400000  42 20 00 49 00 6e 00 66  00 6f 00 0f 00 72 72 00  |B .I.n.f.o...rr.|
01400010  6d 00 61 00 74 00 69 00  6f 00 00 00 6e 00 00 00  |m.a.t.i.o...n...|
01400020  01 53 00 79 00 73 00 74  00 65 00 0f 00 72 6d 00  |.S.y.s.t.e...rm.|
01400030  20 00 56 00 6f 00 6c 00  75 00 00 00 6d 00 65 00  | .V.o.l.u...m.e.|
01400040  53 59 53 54 45 4d 7e 31  20 20 20 16 00 3d 26 b0  |SYSTEM~1   ..=&.|
01400050  8c 57 8c 57 00 00 27 b0  8c 57 03 00 00 00 00 00  |.W.W..'..W......|
01400060  31 20 20 20 20 20 20 20  44 41 20 20 00 00 00 00  |1       DA  ....|
01400070  21 00 00 00 00 00 00 00  21 00 06 00 16 00 00 00  |!.......!.......|
01400080  32 20 20 20 20 20 20 20  44 41 20 20 00 00 00 00  |2       DA  ....|
01400090  21 00 00 00 00 00 00 00  21 00 07 00 16 00 00 00  |!.......!.......|
014000a0  33 20 20 20 20 20 20 20  44 41 20 20 00 00 00 00  |3       DA  ....|
014000b0  21 00 00 00 00 00 00 00  21 00 08 00 16 00 00 00  |!.......!.......|
014000c0  34 20 20 20 20 20 20 20  44 41 20 20 00 00 00 00  |4       DA  ....|
014000d0  21 00 00 00 00 00 00 00  21 00 09 00 16 00 00 00  |!.......!.......|
014000e0  35 20 20 20 20 20 20 20  44 41 20 20 00 00 00 00  |5       DA  ....|
014000f0  21 00 00 00 00 00 00 00  21 00 0a 00 16 00 00 00  |!.......!.......|
01400100  36 20 20 20 20 20 20 20  44 41 20 20 00 00 00 00  |6       DA  ....|

Yes, that's a directory.

@thejpster
Copy link
Member

OK, that all seems normal. I'll try your example and see what is going wrong - perhaps it's a bug when the root directory takes up more than one cluster?

big_dir.rs:

extern crate embedded_sdmmc;

mod linux;
use linux::*;

const FILE_TO_APPEND: &str = "README.TXT";

use embedded_sdmmc::{Error, Mode, VolumeIdx, VolumeManager};

fn main() -> Result<(), embedded_sdmmc::Error<std::io::Error>> {
    env_logger::init();
    let mut args = std::env::args().skip(1);
    let filename = args.next().unwrap_or_else(|| "/dev/mmcblk0".into());
    let print_blocks = args.find(|x| x == "-v").map(|_| true).unwrap_or(false);
    let lbd = LinuxBlockDevice::new(filename, print_blocks).map_err(Error::DeviceError)?;
    let mut volume_mgr: VolumeManager<LinuxBlockDevice, Clock, 8, 8, 4> =
        VolumeManager::new_with_limits(lbd, Clock, 0xAA00_0000);
    let mut volume = volume_mgr
        .open_volume(embedded_sdmmc::VolumeIdx(1))
        .unwrap();
    log::info!("Volume: {:?}", volume);
    let mut root_dir = volume.open_root_dir().unwrap();

    let mut file_num = 0;
    loop {
        file_num += 1;
        let file_name = format!("{}.da", file_num);
        let mut file = match root_dir.open_file_in_dir(
            file_name.as_str(),
            embedded_sdmmc::Mode::ReadWriteCreateOrTruncate,
        ) {
            Ok(f) => f,
            Err(err) => {
                log::error!("open file failed {:?}", err);
                break;
            }
        };
        log::info!("open file");
        let buf = b"hello world, from rust";
        let _ = file.write(&buf[..]).unwrap();

        log::info!("write file");
        drop(file);
        log::info!("close file");
    }

    Ok(())
}
$ zcat ./tests/disk.img.gz > disk.img
$ cargo run --example big_dir -- ./disk.img                     
warning: unused imports: `Mode`, `VolumeIdx`
 --> examples/big_dir.rs:8:29
  |
8 | use embedded_sdmmc::{Error, Mode, VolumeIdx, VolumeManager};
  |                             ^^^^  ^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: constant `FILE_TO_APPEND` is never used
 --> examples/big_dir.rs:6:7
  |
6 | const FILE_TO_APPEND: &str = "README.TXT";
  |       ^^^^^^^^^^^^^^
  |
  = note: `#[warn(dead_code)]` on by default

warning: `embedded-sdmmc` (example "big_dir") generated 2 warnings (run `cargo fix --example "big_dir"` to apply 1 suggestion)
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/examples/big_dir ./disk.img`
thread 'main' panicked at /home/jonathan/Documents/github/embedded-sdmmc-rs/src/fat/volume.rs:207:34:
attempt to multiply with overflow
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Hey, I love a reproducer. Thanks! Let's add an explicit check and look at a backtrace.

RUST_BACKTRACE=1 ./target/debug/examples/big_dir ./disk.img 
thread 'main' panicked at /home/jonathan/Documents/github/embedded-sdmmc-rs/src/fat/volume.rs:183:13:
next_cluster called on invalid cluster ClusterId(fffffffc)
stack backtrace:
   0: rust_begin_unwind
             at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/std/src/panicking.rs:597:5
   1: core::panicking::panic_fmt
             at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/panicking.rs:72:14
   2: embedded_sdmmc::fat::volume::FatVolume::next_cluster
             at ./src/fat/volume.rs:183:13
   3: embedded_sdmmc::fat::volume::FatVolume::write_new_directory_entry
             at ./src/fat/volume.rs:405:31
   4: embedded_sdmmc::volume_mgr::VolumeManager<D,T,_,_,_>::open_file_in_dir
             at ./src/volume_mgr.rs:518:45
   5: embedded_sdmmc::filesystem::directory::Directory<D,T,_,_,_>::open_file_in_dir
             at ./src/filesystem/directory.rs:150:17
   6: big_dir::main
             at ./examples/big_dir.rs:28:30
   7: core::ops::function::FnOnce::call_once
             at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

OK, so the problem is that FAT32 doesn't live in a fixed place - the location is given by some metadata. So when you open the root directory, the directory holds the value 0xFFFF_FFFC. There was a bug in the write_new_directory_entry function where it would map that value to the real block on disk for the purposes of loading directory blocks, but it didn't update the cluster number. So, when it went to find the next cluster it was trying to find the next cluster after 0xFFFF_FFFC, which obviously isn't a valid cluster number.

The patch is pretty simple:

-                let mut first_dir_block_num = match dir.cluster {
+                let mut current_cluster = match dir.cluster {
-                    ClusterId::ROOT_DIR => self.cluster_to_block(fat32_info.first_root_dir_cluster),
+                    ClusterId::ROOT_DIR => Some(fat32_info.first_root_dir_cluster),
-                    _ => self.cluster_to_block(dir.cluster),
+                    _ => Some(dir.cluster),
                 };
-                let mut current_cluster = Some(dir.cluster);
+                let mut first_dir_block_num = self.cluster_to_block(dir.cluster);

thejpster added a commit that referenced this issue Dec 17, 2023
Map ROOT_DIR (0xFFFF_FFFC) to an actual cluster number, so if you go off the end of the first cluster of the root directory on FAT32, the code no longer crashes trying to convert the ROOT_DIR magic cluster number into a disk offset.
@thejpster thejpster mentioned this issue Dec 17, 2023
thejpster added a commit that referenced this issue Dec 17, 2023
Map ROOT_DIR (0xFFFF_FFFC) to an actual cluster number, so if you go off the end of the first cluster of the root directory on FAT32, the code no longer crashes trying to convert the ROOT_DIR magic cluster number into a disk offset.
@thejpster
Copy link
Member

and exactly 1024 files

I mean, it was right there. 1024 files, at 32 files per block, is 32 blocks. Which is a pretty common cluster size (16 KiB).

@gauteh
Copy link
Contributor Author

gauteh commented Dec 17, 2023

Yeah, that was pretty suspicious. Nice work figuring it out.

@thejpster
Copy link
Member

Also, note that any directory operation involves a linear directory walk, so ideally you shouldn't have too many files in one directory. Perhaps create /0/1/2/0012345.dat or something.

@gauteh
Copy link
Contributor Author

gauteh commented Dec 17, 2023

Thanks, that's very useful.

@thejpster
Copy link
Member

To see it in action. the big_dir example goes slower and slower and slower the more directories it creates.

@chen-gz
Copy link

chen-gz commented Dec 17, 2023

Thanks for your nice work! I notice the the speed was slow, but I wasn't though it may cause by the size of directory. Thanks for point it out. The reason I put all file in root directory because this repository does not support create new directory.

@thejpster
Copy link
Member

Ah, yeah, because I didn't need that feature for my homebrew computer. PRs are welcome! :)

@thejpster
Copy link
Member

OK, fine: #111

@chen-gz
Copy link

chen-gz commented Dec 17, 2023

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants