Fix possible failure to create file/directory #1253
…ent request to create a file to proceed
Hi @pomaroff, this change seems to make sense! Thank you for providing a fix.
May I ask how you discovered this?
d75ec7a to 88eb96a
Hi @Liryna, Apologies for the scant details, I was hoping to use the build from AppVeyor to test the changes, but I've been unable to install the artifact from there. I'm not sure if it's related to the build failures that are currently happening on master? I may need some help getting an installer to verify the fix, or if you're happy to verify it for me then that'd be great.
I discovered this issue because I'm trying to use MsBuild for some C# solutions inside a Dokan mirror, and this is failing on some solutions. I found that the CreateFile callback in userland code wasn't getting called after looking at procmon, where I found that dotnet.exe can rarely make concurrent requests to create the same directory which both receive
The only relevant place in the Dokan driver where this status is returned without calling back to the userland code is the area I've been looking at in this PR. I'm not entirely convinced this code is necessary at all because we could handle the concurrent requests in the userland code, but nevertheless, I think the changes I'm proposing here would be sufficient to get things working?
This is relatively easy to reproduce in a simple test. See this C# example below, which for me tends to fail after a few loops.
for (var i = 0; i < 100; i++)
{
var path = Path.Combine(mount.Path, $"Directory{i}");
var task1 = Task.Run(() => Directory.CreateDirectory(path));
var task2 = Task.Run(() => Directory.CreateDirectory(path));
Task.WaitAll(task1, task2);
Assert.That(Directory.Exists(path), Is.True, () => $"Failed to create Directory{i}");
}
I've managed to install this locally, and whilst I think it has improved things, it hasn't fixed all scenarios. I will look closer into this area of the code.
… concurrent request to create a file to proceed" This reverts commit 88eb96a.
I tend to agree with you. Checks like this need to be there for correctness, and they could live either in userland or in the driver. Having them in userland means every filesystem implementation has to take care of them (count active handles etc). Thanks for the update @pomaroff. Please let me know if you find the other reason MsBuild fails.
…e the correct value of FileCount and use this to permit the first caller to proceed.
Are we sure this check is correct? Lines 701 to 705 in 255e4a5
My understanding is that
Looking at procmon, I suspect it is simply because there are many concurrent requests on the same (non-existent) directory path. Some of the requests have a Create disposition, but many others have Open dispositions. It's possible to have multiple concurrent active requests on the same path where at least 1 of them has a Create disposition, and by returning
I've been experimenting on this PR with removing this check, and so far I have not encountered any problems with MsBuild after making this change, but I'll continue testing. I have also been trying to fix the race condition on FileCount because the FileSharing logic does not need to apply to the first request on the file.
We could introduce a new
What I am trying to say is that we should drop this if and let the userland implementation take care of it (which is contrary to what I said previously, I know 😄). Feel free to change your PR to remove this if; I can look into adding it to the mirror / memfs samples.
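For illustration, here is a minimal sketch of what such a userland check could look like. It is not the code from the mirror or memfs samples. The helper name and the `SKETCH_` prefixed constants are hypothetical; they are redefined locally so the snippet compiles without Windows headers, with values that mirror the NT create dispositions and status codes. The point is that the collision is decided from whether the file or directory actually exists, not from how many requests are in flight on the same path.

```c
#include <stdbool.h>
#include <stdint.h>

/* NT status codes and create dispositions, redefined locally (assumption:
   a real implementation would take these from the Windows headers). */
#define SKETCH_STATUS_SUCCESS               0x00000000u
#define SKETCH_STATUS_OBJECT_NAME_NOT_FOUND 0xC0000034u
#define SKETCH_STATUS_OBJECT_NAME_COLLISION 0xC0000035u

#define SKETCH_FILE_SUPERSEDE    0u
#define SKETCH_FILE_OPEN         1u
#define SKETCH_FILE_CREATE       2u
#define SKETCH_FILE_OPEN_IF      3u
#define SKETCH_FILE_OVERWRITE    4u
#define SKETCH_FILE_OVERWRITE_IF 5u

/* Hypothetical helper: decide the status for a create request purely from
   the requested disposition and whether the target really exists. */
uint32_t sketch_check_disposition(uint32_t disposition, bool exists_on_disk)
{
    switch (disposition) {
    case SKETCH_FILE_CREATE:
        /* Create-only: fails if and only if the target is already there. */
        return exists_on_disk ? SKETCH_STATUS_OBJECT_NAME_COLLISION
                              : SKETCH_STATUS_SUCCESS;
    case SKETCH_FILE_OPEN:
    case SKETCH_FILE_OVERWRITE:
        /* Open-only and overwrite-only: the target must already exist. */
        return exists_on_disk ? SKETCH_STATUS_SUCCESS
                              : SKETCH_STATUS_OBJECT_NAME_NOT_FOUND;
    default:
        /* SUPERSEDE, OPEN_IF and OVERWRITE_IF accept both cases.
           Unknown values are not validated in this sketch. */
        return SKETCH_STATUS_SUCCESS;
    }
}
```

With a check of this shape, two concurrent Create requests on a missing directory are each judged against the real state of the directory at the moment the callback runs, so only a request that arrives after the directory exists gets the collision status.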
Thanks Liryna. Regarding this:
I think we already cover this in the documentation:
Correct! 😎 good eye
What do you think of the other change in this PR, which is to move the VCB Unlock until after the FCB is Locked? In doing so, we ensure that:
Holding the Vcb lock does resolve that issue, but it is usually not safe unless done very carefully. If any of the called functions triggers a reentry (it easily happens) that also ends up requiring the Vcb lock, we will deadlock. It is safer to just move the condition to userland.
Thanks @Liryna. I'm not sure if it would have caused a deadlock, but I've reverted it to be on the safe side as it's not strictly relevant to the root problem I'm trying to solve here. I can always raise another PR in the future if needed. Thanks.
Thanks @pomaroff ! I will take care of the possible sample changes 👍
If there are 2 (or more) concurrent requests to access a file, any 1 request with a CREATE disposition will be rejected with STATUS_OBJECT_NAME_COLLISION. The driver should not return this error code merely because there are other open requests; it should first call the userland code to confirm that the file handle is actually open and that the file or directory exists. We also do not need to run the file sharing logic for the first of many concurrent requests on the same file.
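The "first of many concurrent requests" idea can be modelled in a few lines. This is only a sketch under stated assumptions, not the driver's code: the struct and function names are hypothetical, and a C11 atomic counter stands in for the FileCount field discussed above. The caller that moves the counter from 0 to 1 is the first opener and could skip the sharing check; every later caller would still run it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-file control block; only the count of
   open requests is modelled here. */
typedef struct {
    atomic_long file_count;
} sketch_fcb;

/* Register one more open request on the file. Returns true only for the
   caller that moved the count from 0 to 1. atomic_fetch_add returns the
   value held before the increment, so exactly one concurrent caller can
   observe 0. */
bool sketch_register_open(sketch_fcb *fcb)
{
    return atomic_fetch_add(&fcb->file_count, 1) == 0;
}

/* Drop one open request, for example on cleanup or when the create fails. */
void sketch_release_open(sketch_fcb *fcb)
{
    atomic_fetch_sub(&fcb->file_count, 1);
}
```

Reading the counter and incrementing it in two separate steps would let two racing callers both believe they are first (or both believe they are not), which is the kind of race on FileCount mentioned in the conversation; doing both in one atomic operation avoids that.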