Is it possible to improve the performance of line()? #456

Open
ceball opened this issue Sep 8, 2017 · 6 comments

@ceball
Member

ceball commented Sep 8, 2017

I started with examples/timeseries.ipynb, modifying it to have only one column:

[Screenshot: 2017-09-08 at 11:31:52 pm]

In a somewhat ugly process, I then timed canvas.line() and canvas.points() for an increasing number of repeats of the dataframe:

[Screenshot: 2017-09-08 at 11:33:20 pm]
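Since the timing code above only survives as a screenshot, here is a rough sketch of what such a timing loop could look like. It is not the code from the notebook: the helper name `time_over_sizes` and its parameters are made up for illustration, and it uses only the standard library so that it runs anywhere.

```python
import time


def time_over_sizes(funcs, make_input, sizes, repeats=3):
    """Time each callable in `funcs` on inputs of increasing size.

    funcs      -- dict mapping a label to a one-argument callable (hypothetical)
    make_input -- callable taking a size n and returning the input to time
    sizes      -- iterable of sizes, e.g. numbers of dataframe repeats
    repeats    -- runs per measurement; the fastest run is kept

    Returns a dict mapping each label to a list of best times in seconds,
    one entry per size.
    """
    results = {name: [] for name in funcs}
    for n in sizes:
        data = make_input(n)  # build the input once per size
        for name, fn in funcs.items():
            best = float("inf")
            for _ in range(repeats):
                start = time.perf_counter()
                fn(data)
                best = min(best, time.perf_counter() - start)
            results[name].append(best)
    return results


if __name__ == "__main__":
    # Stand-in workloads; replace with the calls you actually want to compare.
    timings = time_over_sizes(
        {"sum": sum, "sorted": sorted},
        make_input=lambda n: list(range(n)),
        sizes=[1_000, 10_000, 100_000],
    )
    for name, times in timings.items():
        print(name, ["%.6f" % t for t in times])
```

For the comparison in this issue, `make_input` would presumably build the repeated dataframe (for example with `pd.concat([df] * n)`), and the two callables would wrap canvas.line() and canvas.points() on the same canvas.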

line seems to be a lot slower than points:

[Screenshot: 2017-09-08 at 11:46:38 pm]

[Screenshot: 2017-09-08 at 11:47:37 pm]

(Sorry for the screenshots, but GitHub doesn't seem to allow notebooks to be attached.)

@ceball
Member Author

ceball commented Sep 8, 2017

I just looked at the source code and see that line uses agg=any() by default, while points uses agg=count(). For the record, the findings above don't change much if I use the same reduction for both line and points.

@jbednar
Member

jbednar commented Sep 8, 2017 via email

@ceball
Member Author

ceball commented Sep 8, 2017

Yes, I guess the title should have been more like, 'Is line the expected amount slower than points?', or maybe, 'Can line be made faster?'.

ceball changed the title from "Why is line slower than points?" to "Is it possible to improve the performance of line()?" on Sep 9, 2017

@jbcrail
Contributor

jbcrail commented Sep 11, 2017

My immediate suggestions for performance gains are:

  1. Switch to an integer-specific Bresenham line-drawing algorithm

We currently use the more general algorithm that supports both ints and floats. However, by the time we need to draw a line, the points have been mapped to an integer space, so float support is redundant.

  2. Switch line-clipping algorithm

We currently use Cohen-Sutherland for clipping a line to a bounding box, but the Liang-Barsky algorithm is considered significantly more efficient.
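
To illustrate the first suggestion, here is a minimal sketch of an integer-only Bresenham line walk. It is a textbook formulation written for this thread, not the code currently in the library; the function name `bresenham` and the choice to return a list of pixels are assumptions made for the example.

```python
def bresenham(x0, y0, x1, y1):
    """Return the integer pixels on the line from (x0, y0) to (x1, y1).

    Both endpoints are included. Only integer additions, comparisons,
    and a doubling are used, so no float arithmetic is needed once the
    points have been mapped to pixel coordinates.
    """
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1  # step direction along x
    sy = 1 if y0 < y1 else -1  # step direction along y
    err = dx + dy              # combined error term for both axes

    pixels = []
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:  # error favours a step in x
            err += dy
            x0 += sx
        if e2 <= dx:  # error favours a step in y
            err += dx
            y0 += sy
    return pixels
```

A line from (0, 0) to (3, 3) yields the four diagonal pixels, and in general the result has max(|dx|, |dy|) + 1 entries. A real implementation would update the aggregate in place rather than build a list.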

@jbednar
Member

jbednar commented Sep 12, 2017

I'd be very happy to try both suggestions; we're using line a lot in some performance-critical applications, so anything will help...

jbcrail added a commit that referenced this issue Oct 16, 2017
I switched from Cohen-Sutherland to Liang-Barsky. The performance gains
for random lines range from 50-75% improvement for a million lines.

Related to #456
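
For readers unfamiliar with the algorithm named in this commit, the following is a minimal sketch of Liang-Barsky clipping against an axis-aligned box. It is a generic textbook version, not the code from the commit; the function name `liang_barsky`, its argument order, and the choice to return `None` for a rejected segment are assumptions made for the example.

```python
def liang_barsky(x0, y0, x1, y1, xmin, ymin, xmax, ymax):
    """Clip the segment (x0, y0)-(x1, y1) to the box [xmin, xmax] x [ymin, ymax].

    Returns the clipped segment as (cx0, cy0, cx1, cy1), or None if the
    segment lies entirely outside the box.
    """
    dx = x1 - x0
    dy = y1 - y0
    t0, t1 = 0.0, 1.0  # parametric range of the segment still inside

    # One (p, q) pair per box edge: left, right, bottom, top.
    edges = (
        (-dx, x0 - xmin),
        (dx, xmax - x0),
        (-dy, y0 - ymin),
        (dy, ymax - y0),
    )
    for p, q in edges:
        if p == 0:
            if q < 0:
                return None  # parallel to this edge and outside it
            continue
        t = q / p
        if p < 0:
            # Segment enters the half-plane at t.
            if t > t1:
                return None
            t0 = max(t0, t)
        else:
            # Segment leaves the half-plane at t.
            if t < t0:
                return None
            t1 = min(t1, t)

    return (x0 + t0 * dx, y0 + t0 * dy, x0 + t1 * dx, y0 + t1 * dy)
```

Unlike Cohen-Sutherland, which may loop and recompute outcodes after each intersection, this version makes a single pass over the four edges and computes the clipped endpoints once at the end.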
jbednar pushed a commit that referenced this issue Oct 16, 2017
Switched from Cohen-Sutherland to Liang-Barsky. The performance gains
for random lines range from 50-75% improvement for a million lines. Related to #456.
@jbednar
Member

jbednar commented Oct 18, 2017

It would be nice to re-run those benchmarks now that #495 has been merged.

jbednar pushed a commit that referenced this issue Oct 30, 2017
Switched from Cohen-Sutherland to Liang-Barsky. The performance gains
for random lines range from 50-75% improvement for a million lines. Related to #456.