DrvFs: compiler reports file missing, but it is present #2712
Comments
After a clean re-start, the compilation has failed again, exactly as above, but with a different file:
But the file is alive and well, where it is supposed to be. |
After a further clean re-start, the same compilation fails in a different place. A file appears to be missing, but it is safely in the right place when found with
but:
|
@Warblefly |
Thank you for replying. I'm afraid this workaround, which I've already tried, is unsuccessful in build 17046. It is a slightly different problem: the file is occasionally simply "not found"; this isn't an EINVAL return. |
#2484 (sic). |
Well spotted, @therealkenc. |
@Warblefly Seems like you got all the compilation issues one can get. Lel Maybe for you, since you are beyond 1709, your fallback is the faulty one present in 16299 because it's the current stable one. Anyway my issue was this one: #2464 (comment) I just posted in related threads to see if someone knew about it because there was no specific issue for it until now. @therealkenc The cumulative update to 16299.98 fixes this issue. But it seems performance has taken a hit. Kernel compile went from ~240 with ccache to ~350 with ccache. |
Thanks for the info @nutcasev15 — indeed, whether or not the DrvFs filesystem is mounted with, or without, the "fallback" workaround given some time ago, the fault occurs. It makes the job impossible, unfortunately. |
2. Cross-reference, possibly related: RoliSoft/WSL-Distribution-Switcher#69 |
@AnneTheAgile Thank you for asking these questions. I am happy to offer this information: (1) The build is 17046.rs_prerelease.171118-1403. |
A few members of my team (myself, @rajsesh, @yiyang-msft) are seeing this as well, on builds in the 1705x-1706x range. |
Correction: I mis-typed the build. Here, it's 17046. |
this can be "reproduced" fairly "consistently" in my setup in several completely different use cases, see (duplicate) issue @Warblefly linked above. |
This issue can be reproduced in my environment during GCC cross-compiling (Windows version 17063). Version 17040 works fine. |
still an issue in 17074 |
Please fix this, it is sooo annoying |
Does anyone have a targeted repro for this issue? I can try with @Warblefly's project, but compiling that seems to be a long process. |
unfortunately the errors (in my cases) only happen when there are a lot of file actions going on, i.e. during long compile/render processes. |
No matter which angle you look at this issue from, it looks like a race condition. It will be a difficult one to chase. But if someone can provide a specific repro here that fails with any reasonable consistency, it will be very helpful. I tried @Warblefly's project, but it has a lot of dependencies and the instructions in the repo do not cover all of them. Chasing each one of them down seems time consuming. I am trying other projects to see if I can get a local repro. |
Thank you for checking @sunilmut — just updated the repo from my live, functioning tree. I'm now compiling it under Ubuntu under Hyper-V Manager and it is perfect. But under Ubuntu Bash on Windows, the failures at random points continue. I don't test Ubuntu Bash on Windows any more; not enough time. |
Issue still exists in 17093. I tried to create a simple program that simulates this behavior. On DrvFS it reproduces the problem; on VolFS it runs without errors.
|
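The test program referred to in the comment above is not included here. As a rough illustration only, a stress test of this kind could look like the sketch below. It is a hypothetical reconstruction, not the commenter's program: the function names, file counts, and worker counts are assumptions, and it has not been confirmed to trigger the bug.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def create_then_open(directory, index):
    """Write a file, then immediately check that it can be found and read back."""
    path = os.path.join(directory, f"file_{index}.txt")
    payload = f"payload {index}\n"
    with open(path, "w") as handle:
        handle.write(payload)
    try:
        os.stat(path)  # the reported symptom is ENOENT on a file that exists
        with open(path) as handle:
            return handle.read() == payload
    except FileNotFoundError:
        return False


def count_failures(directory, files=2000, workers=16):
    """Run many create-then-open checks concurrently and count the failures."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda i: create_then_open(directory, i), range(files)))
    return results.count(False)


if __name__ == "__main__":
    # Hypothetical usage: replace the scratch directory with a path on the
    # filesystem under test to compare behaviour between mounts.
    with tempfile.TemporaryDirectory() as scratch:
        print("failures:", count_failures(scratch))
```

To compare filesystems as the comment describes, `directory` would point at a path on a DrvFs mount (somewhere under /mnt/c, for example) for one run and at a VolFs path for another. A healthy filesystem should report zero failures.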
in my scenarios, the error also happens on VolFS, albeit not as often as on DrvFS |
I think it is a problem with make -j X? |
We believe we have found the cause of this issue, and are working on a fix. We're trying to get this fix into the upcoming Spring Update as well (probably as an update after release). |
@therealkenc as said in a few comments before and shown in the above strace, the issue still exists for |
Repetition is not constructive. |
just wanted to be helpful by pointing out that the test case only tests one part of the problem, or one problem in a family of problems, and that closing this issue might have been premature. but before I start "repeating" myself again: where do you want me to continue with my unlink-while-heavy-concurrency-i/o: new issue? #2780? here? |
@heldchen - the ESPIPE error you are seeing looks like it is on stdin (fd=0) so is likely unrelated. If you can get a repro where one of the other file system commands fails it would be great if you could share (we can route depending on what the error is, and if there is already an existing issue covering it). Unlink on /mnt/x (DrvFS) has had a few reported issues but has been improving over time. |
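Some background on why ESPIPE on fd 0 points away from the file system: on POSIX systems, seeking on a pipe fails with ESPIPE no matter where the files involved live. The sketch below assumes stdin was a pipe in the traced process (the strace output is not shown here, so that is an assumption), and `seek_errno` is a hypothetical helper name.

```python
import errno
import os


def seek_errno(fd):
    """Return the errno raised by seeking on fd, or None if the seek succeeds."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)
    except OSError as error:
        return error.errno
    return None


# A pipe stands in for a piped stdin; no DrvFs or VolFs path is involved.
read_end, write_end = os.pipe()
try:
    pipe_result = seek_errno(read_end)
finally:
    os.close(read_end)
    os.close(write_end)

print(pipe_result == errno.ESPIPE)
```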
thanks @Brian-Perkins for confirming the ESPIPE is not relevant. turns out @therealkenc was right to close this issue after all. I spent some more hours logging all threaded unlink-actions using strace and comparing them. it looks like the Magento framework is using a file-based self-rolled locking mechanism. unfortunately they did not account for multiple threads trying to acquire the lock at the same time, so in some rare circumstances more than one thread would think that it owns the lock. in these cases, according to my strace logs, the chances are actually quite high that two threads try to release their lock by unlinking the file at virtually the same time. when that happens, PHP's file stats cache seems to momentarily report outdated info to the losing thread, thus resulting in the "errors" I've been observing. |
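For readers unfamiliar with this failure mode, here is a minimal sketch of a file-based lock that avoids the double-owner race described above. It is not Magento's code (which is PHP and is not shown in this thread). It only illustrates the general technique of creating the lock file atomically with `O_CREAT | O_EXCL`, so that exactly one contender wins, and of tolerating a lock file that is already gone on release. All function names are hypothetical.

```python
import os
import tempfile
import threading


def try_acquire(lock_path):
    """Atomically create the lock file; return True only for the single winner."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def release(lock_path):
    """Remove the lock file, tolerating the case where it is already gone."""
    try:
        os.unlink(lock_path)
    except FileNotFoundError:
        pass


def contend(lock_path, threads=8):
    """Start several threads that race for the lock; return how many acquired it."""
    winners = []
    barrier = threading.Barrier(threads)

    def worker():
        barrier.wait()  # line the threads up so they race as closely as possible
        if try_acquire(lock_path):
            winners.append(1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return len(winners)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "demo.lock")
        print("threads holding the lock:", contend(path))
        release(path)
```

A check-then-create scheme (test whether the file exists, then create it) leaves a window between the two steps in which a second thread can also conclude the lock is free. Doing both in one atomic call closes that window.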
I guess this manifests in 17134 (1803 RTM). |
The rest of this bug should be fixed in 17677. Do let us know if this issue is completely resolved. |
@sunilmut thanks for the fix! |
@jowadmax - Once we get some kind of confirmation (from the community) that the issue is fully resolved by the fixes, yes, the fix will be considered for backporting to 1803. |
@sunilmut - Hello and thanks for the fix ! |
@sunilmut Is there a way to install the insider build without permanently enabling preview builds? I might be able to convince my company to let me install one preview build so I can check if the fix works, but I sure don't want to get preview builds all the time. |
@pgroke-dt you might be able to install the preview on a virtual machine and run your compile job for testing. You can use VirtualBox's shared directory feature to access your local files from the VM. Unfortunately there are no ISOs available for the Insider Builds (afaik), so you'll need to enroll the Windows version that's on the virtual machine into the Insiders Program. |
Can confirm this issue is fixed for me on build 17692. Would love to have this backported to 1803 so I don't have to leave my work machine on an Insider build. |
@jowadmax Thanks for the tip, unfortunately I don't have enough space for another Windows installation in a VM on my system. |
@sunilmut |
@sunilmut A backport would be great. My 1803 is frustrating me. |
@therealkenc Seeing this in 17134 (1803), confirming your suspicion. A backport or workaround would be wonderful as I'm unable/unwilling (requirement of Microsoft Account/non-system local account) to install one of the insider builds. Update: Jumped through the hoops to get on insiders and installed build 17741. Confirmed fixed for me. |
Yeah the December OP reported 17046 (before 17134) so there was no real reason to expect it to work on the April Update. My marking it insidertransient for a few days was just "wrong" on my part; apologies. It was reported that 16299 is okay, fueling the incorrect assumption. At the time it was somewhat confusing as to what broke when, what was being fixed precisely, and in what release (#2484 still flaps in the wind uncategorized). Beyond that, four asks for a backport is now three too many. I don't have any specific insight on the inner workings of the backport process (and I get the impression it is somewhat opaque to the WSL devs as well and not entirely in their control). But, given that it was fixed in late May, this is mid-August, and the Redstone 5 (1809) release is already starting to be locked down for October... if it were me, I wouldn't place a large bet on a QFE. Who knows. Could happen. Either way, rest assured all the devs get all the issue updates whether you ping them or not. |
On build 17046, while compiling the GCC cross-compiler, it bailed out with a "No such file or directory" error when referencing builtins.def — but the file is present. Could this be a variant of #2448 or #2464?
The error:
But the file reported as missing is present.