fs: undeprecate existsSync; use access instead of stat for performance #7455
See also interminable issue #1592 and PR #4217. I hope that we merge this PR #7455, but if the team refuses to merge this PR, then I hope one of #7455 or #4217 can be merged. No new tests were added in this branch, but all existing tests pass, and I'm not exactly sure how to write a fair benchmark for this. Your suggestions are welcome. |
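On the benchmark question, one possible starting point is sketched below. It is a rough micro-benchmark, not a fair or definitive measurement: the function names `checkWithStat`, `checkWithAccess`, and `timeIt` are hypothetical, the two check functions are stand-ins for the strategies discussed in this PR rather than Node's internal code, and the iteration count is arbitrary. It times both strategies against an existing path (the Node binary itself) and a path assumed to be missing.

```javascript
const fs = require('fs');

// Hypothetical stand-ins for the two strategies (not Node's internal code).
function checkWithStat(path) {
  try { fs.statSync(path); return true; } catch (ignored) { return false; }
}
function checkWithAccess(path) {
  try { fs.accessSync(path); return true; } catch (ignored) { return false; }
}

// Time `iterations` calls of `fn(path)` and return elapsed milliseconds.
function timeIt(fn, path, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn(path);
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// Assumption: no file with this name exists in the working directory.
const missingPath = 'no-such-file-for-benchmark';
const candidates = [['stat', checkWithStat], ['access', checkWithAccess]];
for (const [name, fn] of candidates) {
  console.log(name, 'existing:', timeIt(fn, process.execPath, 10000).toFixed(1), 'ms');
  console.log(name, 'missing: ', timeIt(fn, missingPath, 10000).toFixed(1), 'ms');
}
```

Results from a loop like this depend heavily on the file system cache and the platform, so it would need repeated runs and warm-up before drawing conclusions.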
cc @nodejs/fs |
Why un-deprecate it if it's essentially just going to be a wrapper for |
Generating an exception just so you can ignore it hurts performance. In PR #4217 we're considering adding a new method Furthermore,
if (fs.existsSync(".git/rebase-apply/rebasing")) {
  console.error("You are in the middle of a rebase!");
}
To write that with
try {
  fs.accessSync(".git/rebase-apply/rebasing");
  console.error("You are in the middle of a rebase!");
} catch (ignored) {
  // The file is missing or inaccessible: no rebase in progress.
}
Perhaps if annoyance were the only problem, we could all just use a one-liner
|
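For readers following along, the one-liner alluded to above could look something like this minimal sketch. The helper name `existsViaAccess` is hypothetical and is not part of Node's `fs` API. It only wraps `fs.accessSync`, so it still pays for a thrown and discarded exception whenever the path is missing, which is the cost this PR objects to.

```javascript
const fs = require('fs');

// Hypothetical helper (not a Node API): report whether `path` is visible to
// the current process by swallowing the error that fs.accessSync throws.
function existsViaAccess(path) {
  try {
    fs.accessSync(path, fs.constants.F_OK);
    return true;
  } catch (ignored) {
    return false;
  }
}

if (existsViaAccess('.git/rebase-apply/rebasing')) {
  console.error('You are in the middle of a rebase!');
}
```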
I'm -1 on this for the reasons stated in the other, related issues. |
Please leave a corresponding vote on PR #4217, preferably a +1 vote. |
We can't change the behavior to use |
-1. To keep it short, what @mscdex and @Fishrock123 said. |
@Fishrock123 I think you'll find that the current implementation behaves exactly the same as the old behavior. I realize that a lot of ink has been spilled on this topic, but I have read the megathread, and can find no specific information about how this would break the behavior of old For example, I tried reproducing the rumored Windows EPERM issue and could not. My implementation behaves the same as the old implementation when it comes to inaccessible files. (Turns out that when you don't have access to a file, you can't stat it, either! Lucky us.) This PR passes all existing tests. If there's a bug in this implementation, please tell me what it is and I will fix it. |
I understand the urge to do operations like:
try {
  fs.accessSync(".git/rebase-apply/rebasing");
  console.error("You are in the middle of a rebase!");
} catch (ignored) {
  // The file is missing or inaccessible: no rebase in progress.
}
But developers need to be aware that that is still a race condition. Instead the operation should be performed as such:
fs.open('.git/rebase-apply/rebasing', 'wx', (err, fd) => {
  if (err) throw err; // e.g. the file already exists
  // Great! we have a lock on the file and can now perform our operations.
  doStuff(fd);
});
function doStuff(fd) {
  // Perform operations
  fs.closeSync(fd);
  fs.unlinkSync('.git/rebase-apply/rebasing');
}
Using the above, the ability of any two programs to race against the lock file should be removed. I may see the usefulness of checking file existence from a process that isn't creating/removing the lock file, when you don't care about false positives. But this is conditional upon the processes that are creating/removing the lock file doing so properly. |
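A synchronous variant of the same lock-file idea, as a minimal sketch: the helper names `tryAcquireLock` and `releaseLock` are hypothetical. The sketch differs from the snippet above in one respect, which is an assumption about what a caller would want: it treats `EEXIST` as "somebody else holds the lock" and rethrows every other error instead of throwing on all of them.

```javascript
const fs = require('fs');

// Hypothetical helper: try to create the lock file atomically.
// Returns a file descriptor on success, or null if the lock already exists.
function tryAcquireLock(lockPath) {
  try {
    // 'wx' fails with EEXIST if the path already exists, so the existence
    // check and the creation happen in a single step.
    return fs.openSync(lockPath, 'wx');
  } catch (err) {
    if (err.code === 'EEXIST') return null; // someone else holds the lock
    throw err; // any other failure is a real error
  }
}

// Hypothetical helper: close the descriptor and remove the lock file.
function releaseLock(fd, lockPath) {
  fs.closeSync(fd);
  fs.unlinkSync(lockPath);
}
```

A caller would check for `null` before doing its work and call `releaseLock` when done, ideally in a `finally` block.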
TOCTOU bugs are a risk for Thus, no matter how you feel about the TOCTOU risk, it's irrelevant to the two PRs I've filed, PR #4217 or PR #7455. @trevnorris I thought you and I had actually come to agreement on this in the thread for PR #4217 :-( (For the record, I picked on the There is nothing I want to do with |
I understand. Just hashing through key points for the PR. Unless someone can create a test showing how using It may alleviate concerns if more documentation was added about the racy-ness of Also some documentation about the use cases, e.g. not using it to see if the file can be created, and instead using it for a lock file like you have shown above, but only if the application has no intent of creating the same lock file. This way any failures should be kept as false positives. |
Agreed that documentation can completely alleviate the problem (after that, it's "use at your own risk"). 99% of the time, checking just for file existence causes races (although I'll acknowledge that there are valid use cases). +1 if we add appropriate warnings (and an explanation of what the right way is to mess about with files) to exists-documentation. |
The documentation has been warning about that for three years (d97ea06), with seemingly little effect. |
Ah, apologies. I should've done some homework there. Some extra "how to deal with the open error" info (ie: check for ENOENT) would be welcome though. I'm not really invested in this feature tbh, and it seems more than enough people are quite opinionated about this one, so I'll stay out of it :) |
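As an illustration of the "check for ENOENT" suggestion, here is a minimal sketch of attempting the operation directly and handling the missing-file case, instead of checking for existence first. The helper name `readIfPresent` is hypothetical, and returning `null` for a missing file is just one possible convention.

```javascript
const fs = require('fs');

// Hypothetical helper: read a file if it is there, without checking first.
// Returns the contents, or null when the file does not exist.
function readIfPresent(filePath) {
  try {
    return fs.readFileSync(filePath, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return null; // the file is simply not there
    throw err; // permission problems and other failures still surface
  }
}
```

Because there is no separate existence check, there is no window between the check and the read in which another process could remove the file.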
@bnoordhuis my suggestion is that the documentation be expanded with specifics. Like alternative code examples. We say to use |
I opened a separate PR #7832 around providing better examples for how to use |
The documentation PR #7832 with code examples is getting closer to merging, but we're not much closer to having this PR #7455 or its evil twin #4217 either merged or closed. I was really hoping to land one of those PRs in Node 7, which seems to be fast approaching. @trevnorris @bnoordhuis What's the next step on this PR? |
@Fishrock123 you have a qualm w/ this if it lands as semver-major? |
@Fishrock123 said earlier that it would be a large breaking change, but I don't think it is a breaking change; I can't repro any regressions in my implementation. If anyone can reproduce a bug where my implementation differs from the old |
The existing implementation throws ignored exceptions, which is slow. The new implementation returns a simple boolean without throwing.
access is faster than stat, because we don't have to return the actual stat results.
There's no alternative method to use that doesn't throw an exception when the file doesn't exist.
We couldn't repro any problems with realpath either. I'm not comfortable with this... |
I'll stick it on the CTC agenda, I see #7455 (comment) as a good use-case. I doubt we'll be able to switch the impl though. |
@bnoordhuis you had Opinions here, could you take a look at that use-case? |
Ok, my thoughts:
So what I think we should do is keep those soft deprecated until async/await is there, and then hard-deprecate those, telling people to switch to What I perhaps could be fine with is introducing another API endpoint without touching these for a truly sync version, like |
With If you promisify If you want a fast, easy-to-use |
Hm, I missed that, thanks. Another issue with |
I claim that |
Is there actually a case where the user cannot access the file, stat returns true, but that is what the user expects? Or does stat only return true for non-accessible files on Unix? |
There are two questions here:
I claim that the answer is "no" in both cases. http://linux.die.net/man/2/access lists five ways http://linux.die.net/man/2/stat indicates EACCES, ELOOP, ENAMETOOLONG, ENOENT, and ENOTDIR as possible failure modes, in addition to EBADF, ENOMEM, and EOVERFLOW which are the caller's fault. To pick off the obvious cases, it seems pretty clear that if So I think we can boil this down to four sub questions with a matrix of stat/access and ENOENT/EACCES: 1a) Can I claim that we can answer "no" to all of these sub-questions. Specifically, for the ENOENT cases 1a and 2a, I again can't even imagine sane FS behavior where 2b is the only case where I think there could possibly be a difference; it is at least imaginable that a FS could say "this file exists, but you're not allowed to stat it." If I understand correctly, this is what some folks claimed would happen in certain Windows permissions cases. But I tried testing this myself and I couldn't get it to repro. In my tests, I never found a way to access a file without being allowed to stat it or vice versa. If anybody can exhibit a test case, please let me know. |
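Anyone who wants to try filling in the stat/access matrix on their own platform could start from a probe like the following sketch. The helper names `errorCodeOf` and `compareStatAndAccess` are hypothetical, and the example paths are assumptions (the Node binary as a path that exists, and a made-up path that should not). Interesting cases such as restricted Windows ACLs would have to be set up by hand.

```javascript
const fs = require('fs');

// Hypothetical helper: call `fn(path)` and return the error code it throws,
// or null if the call succeeds.
function errorCodeOf(fn, path) {
  try {
    fn(path);
    return null;
  } catch (err) {
    return err.code;
  }
}

// Compare how stat and access fail (or succeed) for the same path.
function compareStatAndAccess(path) {
  return {
    stat: errorCodeOf(fs.statSync, path),
    access: errorCodeOf(fs.accessSync, path),
  };
}

console.log(compareStatAndAccess(process.execPath));
console.log(compareStatAndAccess('/definitely/not/a/real/path'));
```

A path for which the two fields differ would be exactly the kind of test case requested above.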
I'm mostly worried about Windows tbh. |
I think the best way forward here is to separate the un-deprecation from the underlying change. Is that possible @dfabulich? I may need a refresher on what exactly was wrong with |
There's now an open PR #8364 to undeprecate The only reason they were combined in the first place was that in other threads folks told me that a PR like #8364 would not be merged, because there was never ever ever any good reason to use If the usability argument has convinced the team, then we can merge #8364 and then consider the performance implications of using |
Are any @nodejs/collaborators strongly in favor of undeprecating |
+1 to separating the undeprecation from the internal change. I signed off on the other PR here. I think that PR should land before this one, and this one should be modified to just provide the internal change. |
closing in favor of #8364 |
This has been dragged through various long discussions and has been elevated to the CTC multiple times.
As noted in #7455 (comment), while this API is still generally considered an anti-pattern, there are still use-cases it is best suited for, such as checking if a git rebase is in progress by looking if ".git/rebase-apply/rebasing" exists.
The general consensus is to undeprecate just the sync version, given that the async version still has the "arguments order inconsistency" problem.
The consensus at the two last CTC meetings this came up at was also to undeprecate existsSync() but keep exists() deprecated. See: #8242 & #8330
(Description write-up by @Fishrock123)
Fixes: #1592
Refs: #4217
Refs: #7455
PR-URL: #8364
Reviewed-By: James M Snell
Reviewed-By: Ilkka Myller
Reviewed-By: Matteo Collina
Reviewed-By: Benjamin Gruenbaum
Reviewed-By: Сковорода Никита Андреевич
Reviewed-By: Jeremiah Senkpiel
Checklist
make -j4 test (UNIX), or vcbuild test nosign (Windows) passes
Affected core subsystem(s)
fs
Description of change
existsSync by using access without throwing an ignored exception.
exists faster by using access instead of stat (without returning an ignored stat result)
existsSync because there's no alternative method that doesn't throw an exception.