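Hey @cpburnz, thanks for the great lib!
In match_files() (https://github.com/cpburnz/python-path-specification/blob/c00b332b2075548ee0c0673b72d7f2570d12ffe6/pathspec/pathspec.py#L170), the line (L190) requires files to be completely exhausted before even the first file is matched. If files is list-like, this is not a problem, but when it is called from the tree_*() methods, the whole iterator mechanism is pretty much useless.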
It also means that if I have a folder with a very complex structure that I want pathspec to ignore, pathspec will still search through it, although there is no way it will play a role in the results.
As an example, for an automation I'm writing on a real-life repository containing a frontend application, the scan of npm-generated files ran for about 10 minutes without yielding a single result, and then I gave up and stopped it.
I think a possible solution is to remove this dictionary and simply do:
    for file in files:
        if util.match_file(self.patterns, util.normalize_file(file)):
            yield file
(I bypassed util.match_files() here because it, too, is not a generator and will try to convert files to a list first.)
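To make the difference concrete, here is a minimal, self-contained sketch that contrasts the two approaches. It does not use pathspec itself: `fnmatch` stands in for `util.match_file()`, a simple slash replacement stands in for `util.normalize_file()`, and the names `match_files_eager`, `match_files_lazy`, and `counting` are hypothetical. The eager variant only assumes that the current code builds a dictionary from all of files before matching, as described above.

```python
import fnmatch
from typing import Iterable, Iterator, List


def match_files_eager(patterns: List[str], files: Iterable[str]) -> Iterator[str]:
    """Builds a dict from every file first, so `files` is exhausted up front."""
    file_map = {f.replace("\\", "/"): f for f in files}  # consumes the whole iterator
    for norm_file, orig_file in file_map.items():
        if any(fnmatch.fnmatch(norm_file, p) for p in patterns):
            yield orig_file


def match_files_lazy(patterns: List[str], files: Iterable[str]) -> Iterator[str]:
    """Matches each file as it arrives, pulling from `files` only on demand."""
    for f in files:
        if any(fnmatch.fnmatch(f.replace("\\", "/"), p) for p in patterns):
            yield f


def counting(files: Iterable[str], counter: List[int]) -> Iterator[str]:
    """Wraps an iterable and counts how many items have been pulled from it."""
    for f in files:
        counter[0] += 1
        yield f


names = ["a.py", "b.txt", "c.py", "d.txt"]

eager_seen = [0]
first_eager = next(match_files_eager(["*.py"], counting(names, eager_seen)))

lazy_seen = [0]
first_lazy = next(match_files_lazy(["*.py"], counting(names, lazy_seen)))

print(first_eager, eager_seen[0])  # all four inputs were read before the first match
print(first_lazy, lazy_seen[0])    # only one input was read before the first match
```

With a directory walk as the input instead of a short list, the eager variant would have to finish the entire walk before producing anything, which matches the behaviour described above.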
This is fixed and will be in v0.10.0 when it is released. All of the PathSpec.match_*() methods now properly use files as an iterator without exhausting it beforehand, and yield results as the iterator is consumed (i.e., basically what your example does). I also improved the util.iter_tree_*() functions (used by PathSpec.match_tree_*()) to use os.scandir(), which is supposed to be more efficient than os.listdir() with large directories.
Yeah, util.match_files() doesn't have a useful signature because it returns a set rather than an iterator.
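For readers unfamiliar with os.scandir(), the following is a rough sketch of a lazy tree walk built on it. It is not pathspec's actual implementation: `walk_files_lazily` and the `skip_dir` callback are hypothetical names, and pruning directories by name is an assumption made only to illustrate how a walk can avoid descending into a folder such as node_modules.

```python
import os
import tempfile
from typing import Callable, Iterator


def walk_files_lazily(root: str, skip_dir: Callable[[str], bool]) -> Iterator[str]:
    """Yield file paths relative to `root` one at a time, never entering skipped dirs."""
    stack = [""]  # relative directories still to visit
    while stack:
        rel_dir = stack.pop()
        with os.scandir(os.path.join(root, rel_dir)) as entries:
            for entry in entries:
                rel_path = os.path.join(rel_dir, entry.name) if rel_dir else entry.name
                if entry.is_dir(follow_symlinks=False):
                    if not skip_dir(entry.name):
                        stack.append(rel_path)  # descend later, only if not skipped
                elif entry.is_file():
                    yield rel_path


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "src"))
    os.makedirs(os.path.join(tmp, "node_modules", "pkg"))
    for rel in ("README.md",
                os.path.join("src", "app.py"),
                os.path.join("node_modules", "pkg", "index.js")):
        open(os.path.join(tmp, rel), "w").close()

    found = sorted(
        p.replace(os.sep, "/")
        for p in walk_files_lazily(tmp, lambda name: name == "node_modules")
    )

print(found)
```

Because the function is a generator, a caller that feeds it into a lazy matcher gets the first result as soon as the first file is found, rather than after the whole tree has been scanned.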