possible bug with symlink following #106
Version info:
(note to self, that's current master).
Hey @BurntSushi,
Coincidentally, I'm right in the middle of reworking the directory tree traversal, so this may get changed in the process. It's not really a bug though, but I suppose it depends on how you look at it. E.g. try this:
$ ucg --type=cc -w '[A-Z]+_SUSPEND' ~/src/TestCorpus/linux --dirjobs=1 | wc -l
404
$ ucg --type=cc -w '[A-Z]+_SUSPEND' ~/src/TestCorpus/linux --dirjobs=2 | wc -l
392
Now do something similar with egrep:
$ egrep -R -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND' ~/src/TestCorpus/linux | wc -l
404
$ egrep -r -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND' ~/src/TestCorpus/linux | wc -l
390
That's exactly what's going on, but as always, it's a bit more complicated than that. When running with more than 1 dir tree thread, I have to handle symlink loop detection myself. I do that by not re-visiting directories I've already visited. But with only one thread, the fts library I use is able to handle this for me; however, it appears to be detecting only real cycles, and not just previously-visited dirs. So in that case, any "forward-crosslinks" (gotta be an actual term for that) get visited more than once.
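To illustrate the "don't re-visit directories" policy described above, here is a minimal Python sketch. It is not ucg's actual code: the function names, the demo tree, and the choice to identify directories by their (device, inode) pair are all assumptions made for the example.

```python
import os
import tempfile


def walk_skip_visited(root):
    """Follow symlinks, but never enter the same directory twice.

    Hypothetical sketch: directories are identified by (st_dev, st_ino).
    """
    seen = set()   # identities of every directory entered so far
    files = []
    stack = [root]
    while stack:
        path = stack.pop()
        st = os.stat(path)               # stat() resolves symlinks
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue                     # already visited, loop or not
        seen.add(key)
        for entry in sorted(os.scandir(path), key=lambda e: e.name):
            if entry.is_dir():           # follows symlinks by default
                stack.append(entry.path)
            else:
                files.append(entry.path)
    return files


def make_crosslink_tree():
    """Build root/a/file.c plus a cross-link root/b/link_to_a -> root/a."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "a"))
    os.makedirs(os.path.join(root, "b"))
    with open(os.path.join(root, "a", "file.c"), "w") as fh:
        fh.write("int x;\n")
    os.symlink(os.path.join(root, "a"), os.path.join(root, "b", "link_to_a"))
    return root


if __name__ == "__main__":
    for path in walk_skip_visited(make_crosslink_tree()):
        print(path)
```

With this policy file.c is reported once, even though it is reachable both directly and through the symlink, which is the kind of difference that would lower a match count relative to a traversal that only stops on real cycles.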
I noticed this when I first went to multithreaded directory traversal, and made the decision that:
So it's not a bug of omission.
Indeed. It probably would make sense to at least reword the warning in this case. Thanks for the report, Andrew. Looks like you've been pretty busy! GRVS
I admit, it's a little strange for the results returned to vary based on the number of threads you're using, but I can see why skipping previously visited directories might be helpful (if a bit surprising, based on the behavior of other tools).
I just recently added a lock-free parallel recursive directory iterator myself, which also of course has to handle symlinks manually, but it only detects true loops (which matches single-threaded behavior). Rust being fairly new, I also had to write the single-threaded iterator. :P They both check for symlink loops in exactly the same way.
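For contrast, a rough Python sketch of "true loop only" detection: a directory is skipped only when it is already one of its own ancestors on the current path. This is an illustration under assumptions, not the Rust iterator mentioned above; the function names and the demo tree are made up for the example.

```python
import os
import tempfile


def walk_detect_cycles(path, ancestors=()):
    """Follow symlinks; skip a directory only if it is its own ancestor.

    Hypothetical sketch: `ancestors` holds the (st_dev, st_ino) pairs of
    the directories on the path from the root down to `path`.
    """
    st = os.stat(path)                   # stat() resolves symlinks
    key = (st.st_dev, st.st_ino)
    if key in ancestors:
        return []                        # real loop: stop descending here
    chain = ancestors + (key,)
    files = []
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        if entry.is_dir():               # follows symlinks by default
            files.extend(walk_detect_cycles(entry.path, chain))
        else:
            files.append(entry.path)
    return files


def make_loop_and_crosslink_tree():
    """root/a/file.c, a real loop root/a/up -> root, and a cross-link
    root/b/link_to_a -> root/a."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "a"))
    os.makedirs(os.path.join(root, "b"))
    with open(os.path.join(root, "a", "file.c"), "w") as fh:
        fh.write("int x;\n")
    os.symlink(root, os.path.join(root, "a", "up"))
    os.symlink(os.path.join(root, "a"), os.path.join(root, "b", "link_to_a"))
    return root


if __name__ == "__main__":
    for path in walk_detect_cycles(make_loop_and_crosslink_tree()):
        print(path)
```

Here the loop through "up" is cut off, so the walk terminates, but file.c is reported twice: once under a and once through link_to_a, since the cross-link is not a cycle.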
This has been resolved since release 0.3.2 in a few ways:
Closing as resolved.
To reproduce:
In particular, I don't believe there are any recursive directory loops.
Are you perhaps not descending into the symlinked directory because it links to a directory that has already been searched? (Either way, it's at least not a loop.)