create a quadtree implementation adapted for svg paths and segments #188
d9cc279 to 353ebd4
I was thinking and I could also extend this in a way that
Hi @Esser50K, thanks for the suggestion and PR. If there's a significant speedup, yeah, that sounds great. I think there's pretty good test coverage for intersections, so we could even set your method as the default.
Alright, I moved the Quadtree inside the Path. I guess a performance test would make sense. The quadtree only gives marginal improvements for small SVGs with few paths; however, when the number of paths is large (and the paths have many segments) this method is a lot faster. In theory the performance of the other intersection method was simply quadratic
564c235 to c38dda7
I don't like the idea of always storing QuadTree inside Path. You can have a lot of paths and now all implementations would have a QuadTree object in them. Not only that, they build the quadtree when it's created, for a function that might not be called. That seems like a lot of overhead in the generic case. It would seem that, at worst, you should build a quadtree cache and reset it in the same cases where

The actual methodology for doing this faster would be to use the Bentley-Ottmann algorithm (https://en.wikipedia.org/wiki/Bentley%E2%80%93Ottmann_algorithm) for finding line intersections (https://www.youtube.com/watch?v=qkhUNzCGDt0). Given the bezier curves you might need to do some weird things with creating rational bezier curves (or segmenting them) etc.

The time complexity here is actually the same as it ever was. You're still dealing with O(N²); the difference is that you're not doing a brute-force N² test of everything, you're only testing those segments with overlapping bounding boxes. In the worst case you built a quadtree and then did all the same intersection checks you did before. In the best case you did no intersection checks since nothing overlapped at all. Time complexities for acceleration structures are a bit weird, but it speeds up the average case.

Here you're doing fast hit testing: if the bounding boxes of the two segments do not overlap, they cannot intersect. This means most of the time you skip most of the potential intersections. Bounding box checks on bezier curves are a bit hefty. There's actually a note that There's probably a couple of shorter methods too rather than calculating the

For example, if we're looking for a line-cubicbezier intersection:

    # Compare the line's extent with the extent of the cubic's control points.
    if min(other_seg.start.real, other_seg.end.real) > max(self.start.real, self.control1.real, self.control2.real, self.end.real):
        return None
    if max(other_seg.start.real, other_seg.end.real) < min(self.start.real, self.control1.real, self.control2.real, self.end.real):
        return None
    if min(other_seg.start.imag, other_seg.end.imag) > max(self.start.imag, self.control1.imag, self.control2.imag, self.end.imag):
        return None
    if max(other_seg.start.imag, other_seg.end.imag) < min(self.start.imag, self.control1.imag, self.control2.imag, self.end.imag):
        return None

These checks are going to be quite fast compared to the other checks. So really, the performance of the quadtree should be measured against this sort of added fast-fail routine, which should speed the whole thing up quite considerably anyway.
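To make the worst-case and best-case remark concrete, here is a minimal sketch that counts how many segment pairs survive a bounding-box prefilter. It is not svgpathtools code: segments are represented as plain tuples of complex control points, and `bbox`, `boxes_overlap`, and `count_candidate_pairs` are hypothetical helper names. Since a Bézier curve lies inside the convex hull of its control points, the control-point box is a conservative bound.

```python
from itertools import product


def bbox(points):
    """Axis-aligned box (xmin, xmax, ymin, ymax) of complex control points."""
    xs = [p.real for p in points]
    ys = [p.imag for p in points]
    return min(xs), max(xs), min(ys), max(ys)


def boxes_overlap(a, b):
    """True if two (xmin, xmax, ymin, ymax) boxes share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]


def count_candidate_pairs(segs1, segs2):
    """Return (total pairs, pairs whose boxes overlap).

    Only the overlapping pairs would still need the exact intersection test.
    """
    boxes1 = [bbox(s) for s in segs1]
    boxes2 = [bbox(s) for s in segs2]
    total = len(boxes1) * len(boxes2)
    candidates = sum(boxes_overlap(a, b) for a, b in product(boxes1, boxes2))
    return total, candidates


# Worst case: every segment covers the same region, so nothing is pruned.
stacked = [(0j, 10 + 10j)] * 3
# Best case: the segments are far apart, so everything is pruned.
spread = [(complex(100 * i, 0), complex(100 * i + 1, 1)) for i in range(1, 4)]

print(count_candidate_pairs(stacked, stacked))  # (9, 9)
print(count_candidate_pairs(stacked, spread))   # (9, 0)
```

The pair count is still N×M either way; what changes is how many of those pairs reach the expensive solver.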
    if isinstance(other_seg, (Line, QuadraticBezier, CubicBezier)):
        ob = [e.real for e in other_seg.bpoints()]
        sb = [e.real for e in self.bpoints()]
        if min(ob) > max(sb):
            return []
        if max(ob) < min(sb):
            return []
        ob = [e.imag for e in other_seg.bpoints()]
        sb = [e.imag for e in self.bpoints()]
        if min(ob) > max(sb):
            return []
        if max(ob) < min(sb):
            return []

Is probably about right. Might be faster to hardcode it rather than doing the list comprehension but adding that as the first value in
Here's a somewhat fair test for random intersections:

    def test_random_intersections(self):
        import time
        from random import Random

        r = Random()
        distance = 100        # maximum extent of a single segment
        distribution = 10000  # size of the field the segments are scattered over
        count = 500           # loop iterations per path

        def random_complex(offset_x=0.0, offset_y=0.0):
            return complex(r.random() * distance + offset_x, r.random() * distance + offset_y)

        def random_line():
            offset_x = r.random() * distribution
            offset_y = r.random() * distribution
            return Line(random_complex(offset_x, offset_y), random_complex(offset_x, offset_y))

        def random_quad():
            offset_x = r.random() * distribution
            offset_y = r.random() * distribution
            return QuadraticBezier(random_complex(offset_x, offset_y), random_complex(offset_x, offset_y), random_complex(offset_x, offset_y))

        def random_cubic():
            offset_x = r.random() * distribution
            offset_y = r.random() * distribution
            return CubicBezier(random_complex(offset_x, offset_y), random_complex(offset_x, offset_y), random_complex(offset_x, offset_y), random_complex(offset_x, offset_y))

        def random_path():
            path = Path()
            for i in range(count):
                # randint is inclusive on both ends, so a 3 appends nothing.
                type_segment = r.randint(0, 3)
                if type_segment == 0:
                    path.append(random_line())
                if type_segment == 1:
                    path.append(random_quad())
                if type_segment == 2:
                    path.append(random_cubic())
            return path

        path1 = random_path()
        path2 = random_path()
        t = time.time()
        path1.intersect(path2)
        print(f"\nIntersection calculation took {time.time() - t} seconds.\n")

This goes from 47 seconds to 1.5 seconds with the given distribution, with fast failing turned on.
I couldn't run this as a test. Even when I did manage to force it to run, the test gave me 50 seconds, which is worse than I got for the default. The incompatibilities, however, are critical.
        for segment in path:
            self.insert_segment(PathSegment(segment, original_path=path))

    def _get_segments_in_area(self, area: Rect, out=None) -> list[PathSegment]:
For good or ill, the setup.py has svgpathtools compatible down to Python 2.7, and these are Python 3.9 type hints.
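For reference, a signature like the one flagged here can be annotated without the 3.9-only `list[...]` syntax. This is a minimal sketch with stand-in `Rect` and `PathSegment` classes and hypothetical function names (the real classes live in the PR). `typing.List` works on Python 3.5+, while a type comment is the form that also stays valid on 2.7, where `typing` is only available as a backport package.

```python
from typing import List, Optional


class Rect(object):
    """Stand-in for the PR's Rect class (hypothetical, for illustration)."""


class PathSegment(object):
    """Stand-in for the PR's PathSegment class (hypothetical)."""


# Python 3.5+: typing.List instead of the 3.9-only list[PathSegment].
def get_segments_typed(area: Rect, out: Optional[List[PathSegment]] = None) -> List[PathSegment]:
    return [] if out is None else out


# Python 2.7-compatible form: the annotation lives in a type comment.
def get_segments_commented(area, out=None):
    # type: (Rect, Optional[List[PathSegment]]) -> List[PathSegment]
    return [] if out is None else out
```

Only the second function uses syntax that Python 2.7 can parse; the first is shown for projects that support Python 3 releases older than 3.9.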
        out.extend(self._path_segments)
        if self._is_split:
            out = out.union(self._subtreeNE._get_segments_in_area(area, out))
out is a list(); there is no .union method for list, or at least not in 3.8, which I run.
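One way to avoid the problem is to let the recursion append to a single shared list instead of merging return values. The sketch below uses a hypothetical dict-based tree and a hypothetical `collect` function, not the PR's QuadTree class; it also shows that `union` exists on `set` but not on `list`.

```python
def collect(node, out=None):
    """Gather items from a nested tree into one shared list."""
    if out is None:
        out = []
    out.extend(node["items"])
    for child in node.get("children", []):
        collect(child, out)  # children append to the same list in place
    return out


tree = {"items": ["a"], "children": [{"items": ["b"]}, {"items": ["c", "d"]}]}
print(collect(tree))                      # ['a', 'b', 'c', 'd']
print(hasattr([], "union"))               # False
print({"a"}.union({"b"}) == {"a", "b"})   # True
```

If duplicates are possible (for example, a segment stored in more than one subtree), accumulating into a `set` and using `update` would be the equivalent fix, provided the stored objects are hashable.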
@tatarize can you confirm these are your suggestions in summary:
Or do you think using a quadtree for accelerating intersections is incorrect altogether and one should use the
Do you have any examples to demonstrate the speedup @Esser50K ? I'm tempted to drop 2.7 support just to have type annotations.
Thanks to some great work by @tatarize , you can now run This also currently produces a
Nice to see an update here.
e2f4588 to e4090e0
I believe that the fast-fail code from @tatarize is doing something very similar to the quadtree (in the sense that it is looking at the bounds and checking if they overlap at all). Test output logs:
@mathandy feel free to close this PR in that case
The massive increase from the quadtree was actually because it did fast fail. The actual data structure itself wasn't speeding things up much, if at all. Basically, if the AABBs (axis-aligned bounding boxes) of the curves do not overlap then they can't intersect, so we can do some very basic subtraction and rule out something like 95% (varies a lot) of possible intersections. Fast fail just does a very tiny amount of subtraction to see if intersection is even possible. It's like 4 subtractions and a compare, which shouldn't add much overhead at all. In fact, if there's no overlap across the x-axis it doesn't even check the y-axis at all.

I also submitted #192 for some review. While it's technically imprecise and can fail in some edge cases, it's also 50x faster (and much more exact for where the intersection occurs) and works for all curves including arcs. So if finding intersections was mission critical it can be sped up still. It would be possible to change that code to perform AABB overlap checks to actually exactly solve the intersections, or at least show that an extremely small region of space still contains both curves (without knowing if they necessarily overlap).
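As a rough illustration of how much the fast-fail test alone can prune, the sketch below scatters random boxes over a field and reports the share of pairs rejected. Everything here is an assumption made for illustration: the box model, the helper names `boxes_disjoint`, `random_box`, and `pruned_fraction`, and the fixed seed. It does not reproduce the figures quoted in this thread.

```python
from random import Random


def boxes_disjoint(a, b):
    """a, b are (xmin, xmax, ymin, ymax); x is tested first and `or` short-circuits."""
    return a[0] > b[1] or a[1] < b[0] or a[2] > b[3] or a[3] < b[2]


def random_box(rng, size, spread):
    """Box of at most `size` per side, placed anywhere in a `spread`-wide field."""
    x, y = rng.random() * spread, rng.random() * spread
    w, h = rng.random() * size, rng.random() * size
    return (x, x + w, y, y + h)


def pruned_fraction(n, size, spread, seed=0):
    """Share of the n*n box pairs rejected by the fast-fail test alone."""
    rng = Random(seed)
    first = [random_box(rng, size, spread) for _ in range(n)]
    second = [random_box(rng, size, spread) for _ in range(n)]
    rejected = sum(boxes_disjoint(a, b) for a in first for b in second)
    return rejected / float(n * n)


# Small boxes in a large field are almost always rejected; when the boxes are
# about as large as the field, far fewer pairs can be ruled out.
print(pruned_fraction(100, size=100, spread=10000))
print(pruned_fraction(100, size=100, spread=100))
```

The pruned share depends heavily on how densely the segments are packed, which matches the "varies a lot" caveat.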
Regardless, thanks for your work on it. You are appreciated!
I've been working on a project that relies heavily on the intersection calculation capabilities of this lib.
However, I noticed it being too slow when I loaded it with larger SVGs; essentially this is because for every path it goes through every segment every single time.
I experimented with loading the segments into a quadtree so they would be quicker to look up, and it works quite well, especially for SVGs with a lot of paths and paths with a lot of segments.
I know this doesn't make the cut according to the contribution guidelines, but I'm happy to make the changes if there is actual interest in this.
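For readers unfamiliar with the data structure, here is a minimal region-quadtree sketch for bounding boxes. It is not the implementation from this PR: the class layout, the `capacity` and `max_depth` parameters, and the `(xmin, ymin, xmax, ymax)` box format are all assumptions for illustration, and every inserted box is assumed to lie inside the root bounds.

```python
class QuadTree(object):
    """Minimal quadtree of (box, item) pairs; boxes are (xmin, ymin, xmax, ymax)."""

    def __init__(self, bounds, capacity=4, depth=0, max_depth=8):
        self.bounds = bounds
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.entries = []     # entries kept at this node
        self.children = None  # four sub-quadrants once split

    @staticmethod
    def _overlaps(a, b):
        return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

    @staticmethod
    def _contains(outer, inner):
        return (outer[0] <= inner[0] and outer[1] <= inner[1]
                and inner[2] <= outer[2] and inner[3] <= outer[3])

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        quadrants = [(x0, y0, mx, my), (mx, y0, x1, my),
                     (x0, my, mx, y1), (mx, my, x1, y1)]
        self.children = [QuadTree(q, self.capacity, self.depth + 1, self.max_depth)
                         for q in quadrants]
        # Push down the entries that fit entirely inside one quadrant.
        old, self.entries = self.entries, []
        for box, item in old:
            self.insert(box, item)

    def insert(self, box, item):
        if self.children is None:
            self.entries.append((box, item))
            if len(self.entries) > self.capacity and self.depth < self.max_depth:
                self._split()
            return
        for child in self.children:
            if self._contains(child.bounds, box):
                child.insert(box, item)
                return
        # The box straddles a quadrant boundary: keep it at this level.
        self.entries.append((box, item))

    def query(self, area, out=None):
        """Collect items whose boxes overlap `area`."""
        if out is None:
            out = []
        if not self._overlaps(self.bounds, area):
            return out
        out.extend(item for box, item in self.entries if self._overlaps(box, area))
        if self.children is not None:
            for child in self.children:
                child.query(area, out)
        return out


tree = QuadTree((0, 0, 100, 100), capacity=2)
for i in range(10):
    tree.insert((i * 10, i * 10, i * 10 + 5, i * 10 + 5), "seg%d" % i)
print(sorted(tree.query((0, 0, 12, 12))))  # ['seg0', 'seg1']
```

A query only descends into quadrants that overlap the search area, so segments in distant parts of the drawing are never visited. The items returned are candidates only; each still needs an exact intersection test.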