MacOS os.statvfs() has rollover for >4TB disks at each 4TB (32bit counter overflow?) #87804
macOS Big Sur (and older), Python 3.9.2 (and older). For disks >4TB, os.statvfs() shows a wrong value for available space: too low, and it always rolls over at each 4TB.
Example: "df -m" does show the correct available space:
df -m /Volumes/Frank/
So available space 19902070164 MB, so about 18.5 TB. Good.
Now Python's os.statvfs():
>>> s = os.statvfs("/Volumes/Frank")
>>> s.f_bavail * s.f_frsize / 1024**2
2658399.39453125
So 2.5TB, and thus wrong. The difference is 16777216 MB, which is exactly 4 times 4TB.
The problem seems to be in macOS statvfs() itself; it is reproducible with a few lines of C code. We have implemented a workaround in our Python program SABnzbd to directly use macOS' libc statfs() call (not statvfs()). A solution / workaround in Python itself would be much nicer. No problem with Python on Linux with >4TB drives. |
Correction of a typo in the original post / to be clear: in the output of "df -m /Volumes/Frank/", Available is 19435615 in units of 1 MB blocks, so 19435615 MB, which is 18.5 TB. All of this is correctly reported by "df -m", but not by os.statvfs(). |
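The rollover arithmetic in the report can be checked with a short simulation. This is only a sketch: the 1024-byte fragment size is an assumption (the report does not state `f_frsize`), chosen because it makes a 32-bit block counter wrap at exactly 4 TiB, and `wrapped_available_mib` is a hypothetical helper name.

```python
# Illustrative only: simulates a 32-bit block counter wrapping around.
FRSIZE = 1024        # ASSUMED fragment size in bytes (not stated in the report)
COUNTER_BITS = 32    # width of the block counters in struct statvfs on macOS


def wrapped_available_mib(true_available_mib: int) -> tuple[float, int]:
    """Return (MiB as seen through a 32-bit counter, number of wraparounds)."""
    true_blocks = true_available_mib * 1024**2 // FRSIZE
    # The high bits are lost; only the remainder survives in the counter.
    wraps, reported_blocks = divmod(true_blocks, 2**COUNTER_BITS)
    return reported_blocks * FRSIZE / 1024**2, wraps


# 19435615 MB is the available space that "df -m" reported.
reported_mib, wraps = wrapped_available_mib(19435615)
print(reported_mib, wraps)
```

Under that assumption the simulated value is 2658399.0 MiB after 4 wraparounds, which agrees with the 2658399.39 reported by os.statvfs() up to rounding in the df figure, and the lost amount is 4 × 4194304 MiB = 16777216 MiB.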
As you note in the title this is a 32-bit overflow in the statvfs system API, the struct it uses contains 32-bit values. |
OK. What would be a solution from/for Python to get the correct available space on a MacOS system? In SABnzbd we implemented a workaround with a direct call to MacOS C lib's statfs(). See https://github.com/sabnzbd/sabnzbd/blob/develop/sabnzbd/filesystem.py#L948-L989 IMHO not a great solution for a python programmer. Could python's os.statvfs() use the correct (64bit) info from statfs()? |
For reference, link to our workaround: |
It seems to me that since I dug into it some more, and found that (from what I can tell, anyway) Linux decided to change the types in the struct used by Anyway, to help incentivize a fix, I have posted a bounty for a fix for this issue. |
I am not an expert on this but I think the
and the |
@ned-deily Oooh excellent find! Unfortunately I'm not much of a C programmer so I won't be taking a crack at it, but it seems like my bounty should be pretty easy to claim for someone who does know C. |
Implementing os.statvfs using statfs(2) isn't as easy as I'd like: the f_namemax field of struct statvfs is not available in struct statfs. It is possible to use pathconf(2) instead, or keep using statvfs(2) to get this field. |
@ronaldoussoren did you have any luck using @ned-deily's find? It might be possible to just compile with that macro set and then the current implementation using |
The block counts reported by statvfs are always 32-bit; the macros mentioned by Ned are for statfs (note the lack of a 'v' in the name). I've looked at the statvfs source code (yay open source) for macOS 13 and even f_namemax is easy to replicate: libc just sets that field to a fixed value and not something that is filesystem specific. In hindsight that was expected, since all filesystems users are likely to encounter support 255 characters in file names... Longer term it might be interesting to expose statfs directly, but that's for a different issue and something that needs design work, because the statfs structures used by Linux and macOS are different (and I haven't looked at other OSes). |
On macOS the statvfs interface returns block counts as 32-bit integers, and that results in bad reporting for larger disks. Therefore reimplement statvfs in terms of statfs, which does use 64-bit integers for block counts. Tested using a sparse filesystem image of 100TB.
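The idea of building statvfs-style results from statfs data can be sketched in Python. This is not the actual CPython C code: `StatfsLike` and `statvfs_from_statfs` are hypothetical names, the flag constants are assumed values from the macOS headers, and the mapping of `f_iosize`/`f_bsize` onto `f_bsize`/`f_frsize` is an assumption about how libc fills these fields.

```python
from dataclasses import dataclass

# ASSUMED constant values (macOS <sys/mount.h> and <sys/statvfs.h>).
MNT_RDONLY, MNT_NOSUID = 0x1, 0x8
ST_RDONLY, ST_NOSUID = 0x1, 0x2
NAME_MAX = 255  # fixed value, as described in the thread


@dataclass
class StatfsLike:
    """Hypothetical stand-in for the 64-bit fields of struct statfs."""
    f_bsize: int    # fundamental block size
    f_iosize: int   # optimal transfer size
    f_blocks: int
    f_bfree: int
    f_bavail: int
    f_files: int
    f_ffree: int
    f_flags: int


def statvfs_from_statfs(st: StatfsLike) -> dict:
    """Map statfs fields to statvfs-style names without 32-bit truncation."""
    flag = 0
    if st.f_flags & MNT_RDONLY:
        flag |= ST_RDONLY
    if st.f_flags & MNT_NOSUID:
        flag |= ST_NOSUID
    return {
        "f_bsize": st.f_iosize,   # assumption: statvfs f_bsize <- statfs f_iosize
        "f_frsize": st.f_bsize,   # assumption: statvfs f_frsize <- statfs f_bsize
        "f_blocks": st.f_blocks,
        "f_bfree": st.f_bfree,
        "f_bavail": st.f_bavail,
        "f_files": st.f_files,
        "f_ffree": st.f_ffree,
        "f_favail": st.f_ffree,   # statfs has no separate unprivileged count
        "f_flag": flag,
        "f_namemax": NAME_MAX,
    }


# A volume with more than 2**32 available blocks keeps its full count.
big = StatfsLike(4096, 1048576, 6 * 2**32, 5 * 2**32, 5 * 2**32, 10, 5, MNT_RDONLY)
result = statvfs_from_statfs(big)
print(result["f_bavail"] * result["f_frsize"] / 1024**4, "TiB available")
```

Because Python integers are unbounded, the sketch only shows the field mapping; in C the point is that the statfs counters are 64-bit, so nothing is lost before the values reach Python.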
@ronaldoussoren ah, thank you for explaining! I misunderstood some things! Looks like you have found a path forward though, because I see a PR. That's awesome, thank you for working on this!! |
I have (finally!) merged my PR. I won't merge the PR into 3.11 and 3.12, because the change is a bit too large for that to my taste. |
Sorry, I've missed this PR during the review phase :( |
After the error handling fix, the refleaks buildbot started failing. |
Reposting investigation from @Eclips4
|
Working on the fix! Thanks for the report. |
Apparently, the leak only occurs on macOS, on the ARM64 MacOS M1 Refleaks NoGIL 3.x worker: https://buildbot.python.org/all/#/builders/1368/builds/184
|
…statfs` (#115335) It was the macro expansion! Sorry!
Looks like new runs of m1 free-threaded refleaks buildbot are successful: https://buildbot.python.org/all/#/builders/1368/builds/188 We can hopefully close this issue now :) |
…structstatfs` (python#115335) It was the macro expansion! Sorry!
Linked PRs
_pystatvfs_fromstructstatfs #115236
_pystatvfs_fromstructstatfs #115335