Unexpected behavior in mmap causing issues with rpm #3939
Comments
Wonderbar. But I've got to ask or it is going to nag at me. The following from Hayden's test case:
... is semantically the same as...
...which is #902. Which was allegedly impossible because:
I'll take the win. But either (a) #902 just got fixed too or (b) I'm missing something stupid.
When a file is mapped, we create a section with that file as the backing store. In NT, when you have a file-backed section you can't make the file smaller -- the section is essentially locking those file "pages". That is causing the #902 problem and is the rationale behind the cited comment. In the current scenario, where mmap is bigger than the file size, that is supported, and the only interesting bit is when the actual view of the extended part of the file is mapped. Currently this is done on-demand via guard pages, after verifying that the file is appropriately sized.
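For readers who want to see the #902-style case in isolation, here is a minimal sketch of shrinking a file while a shared mapping of it is still live. It is not taken from either issue's test case: the helper name `shrink_while_mapped`, the scratch path, and the sizes passed in are assumptions for illustration. On native Linux the `ftruncate` call is expected to succeed and `fstat` to report the smaller size; the explanation above suggests this is the step that a file-backed NT section cannot honor.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from the issue): create a file of old_size bytes,
 * map it MAP_SHARED, then shrink it to new_size while the mapping is live.
 * Returns the size reported by fstat() after the shrink, or -1 on failure. */
long shrink_while_mapped(const char *path, long old_size, long new_size)
{
    assert(path != NULL && new_size >= 0 && new_size < old_size);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    long result = -1;
    void *map = MAP_FAILED;

    /* Give the file its initial size, then map all of it. */
    if (ftruncate(fd, old_size) == 0)
        map = mmap(NULL, (size_t)old_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);

    if (map != MAP_FAILED) {
        struct stat st;
        /* Shrink the file underneath the live mapping. The mapping is
         * deliberately not touched afterwards: pages wholly beyond the
         * new end of file would raise SIGBUS on Linux. */
        if (ftruncate(fd, new_size) == 0 && fstat(fd, &st) == 0)
            result = (long)st.st_size;
        munmap(map, (size_t)old_size);
    }

    close(fd);
    unlink(path);
    return result;
}
```

Calling `shrink_while_mapped("/tmp/shrink_test.bin", 8192, 16)` on a Linux filesystem should return 16; a return of -1 would indicate that one of the steps, including the shrink itself, failed.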
Thanks much. Makes sense. In the #902 test case, maybe instead of actually truncating, make a note locally in the WSL VFS that new file size is 16 bytes (not 8192 bytes), make a note the current high water mark is 8192, and don't call down to NT at all. In other words, lie to userspace that you've shrunk the file. A If that means the mapped pointer still has valid pages containing
Fixed in Windows Insider Build 18890
@Brian-Perkins: unfortunately no.
@therealkenc what is the chance of this fix being added to a 18362 servicing build? Or do we need to wait until next year / WSL2 to get a fix in a Production release?
We'd like to see this fixed in WSL1 because we use VMware and VirtualBox on our workstations.
I don't have control over that sort of thing, but historically speaking, chances low to nonexistent. It should appear in the Fall 2019 release tho. You won't have to wait until 20H1 or WSL2.
I need this fix. How can I upgrade to 18890? Preferably while keeping my subsystem data.
^--- pretty sure it didn't make 19H2 aka 1909 (18363 < 18890) but 2004 should be good.
Not seeing it in Server 2019 yet, is it being applied there? |
This is the result of several weeks of collaboration between the Pengwin team, WSL community members, outside experts we retained, and partners at Oracle on a bug affecting Red Hat Package Manager (rpm) on WSL, in CentOS, RHEL, Fedora (partial), Oracle Linux, and Scientific Linux. Those efforts are extensively documented here. After working through several theories and workarounds, we believe we have narrowed the issue down to unusual mmap handling on WSL affecting the implementation of Berkeley DB inside rpm. The unusual mmap behavior is consistent with our workarounds and mitigations, so we have medium to high confidence this is the issue. A handful of occasionally vague mmap issues have been reported here before, see #902 and #658, but they are not quite on point; the closest is #2852. Because this issue affects a broad array of distros on WSL, we appreciate Microsoft's attention to it.
Your Windows build number: 17134 and 17763.
What you're doing and what's happening:
su - into root.
Would expect rpm --rebuilddb would rebuild a working rpmdb.
Running rpm --rebuilddb breaks rpmdb.
"When the underlying file is extended, the extended part of the mapping is actually mapped back to the beginning of the file. This is why BDB would crash when it extended the size of the file that backed the in-memory cache (one of the __db.### files), and why setting the cache size to a small value works as a work around" - Dr. Lauren Foutz, Oracle
C code which replicates the issue:
mmap_extend.c.txt
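The attachment itself is not reproduced in this thread. As a rough sketch only (not the attached mmap_extend.c), the check below follows the behavior described in the quote: map a region larger than the file, extend the file, write through the extended part of the mapping, and confirm the data lands in the extended part of the file rather than at the beginning. The function name `check_extended_mapping`, the fill bytes, and the two-page layout are assumptions made for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical check (not the attached test case).
 * Returns 0 if the extended part of the mapping is backed by the extended
 * part of the file, 1 if the data ended up somewhere else (for example at
 * the beginning of the file), and -1 if a setup step failed. */
int check_extended_mapping(const char *path)
{
    assert(path != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    int result = -1;
    size_t len = 2 * (size_t)page;
    unsigned char *map = MAP_FAILED;

    /* The file starts at one page, but two pages are mapped, so the
     * second page of the mapping lies beyond end of file for now. */
    if (ftruncate(fd, page) == 0)
        map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (map != MAP_FAILED) {
        memset(map, 'A', (size_t)page);          /* first page: 'A' */

        /* Extend the file so the second mapped page becomes valid. */
        if (ftruncate(fd, 2 * page) == 0) {
            memset(map + page, 'B', (size_t)page); /* extended part: 'B' */

            unsigned char on_disk = 0;
            /* Flush, then read the first byte of the second page
             * through the ordinary file I/O path. */
            if (msync(map, len, MS_SYNC) == 0 &&
                pread(fd, &on_disk, 1, page) == 1) {
                /* Expected: first page still 'A', second page 'B'. */
                result = (map[0] == 'A' && on_disk == 'B') ? 0 : 1;
            }
        }
        munmap(map, len);
    }

    close(fd);
    unlink(path);
    return result;
}
```

On native Linux this should return 0. A return of 1 would be consistent with the aliasing described in the quote, where writes to the extended part of the mapping show up at the start of the file.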
See strace, procmon, and etl files here.
This issue does not affect OpenSUSE's implementation of rpm because of a unique rpm implementation in that distro.