Improve performance of Query method #40

Open
kevina opened this issue Jun 26, 2016 · 5 comments
Labels
help wanted Seeking public contribution on this issue status/deferred Conscious decision to pause or backlog

Comments

@kevina
Contributor

kevina commented Jun 26, 2016

In ipfs/kubo#2760 @whyrusleeping said in a line comment:

Yeah, using a channel as an iterator sucks. If one of you wants to work on improving the perf of query that would be great.

We could change the interface to not use a channel and instead have it return the next value directly. On top of that, we could provide a method for turning the direct query result into a channel-buffered one for use cases that need it.
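As a rough illustration of that suggestion, here is a minimal Go sketch of a pull-style iterator with an adapter that turns it into a buffered channel. All names here (`Result`, `Iterator`, `sliceIterator`, `ToChannel`, `collectKeys`) are hypothetical and are not taken from the datastore package; the in-memory iterator only stands in for a real query.

```go
package main

import "fmt"

// Result is a hypothetical query result: a key/value pair or an error.
type Result struct {
	Key   string
	Value []byte
	Err   error
}

// Iterator is a hypothetical pull-style interface: the caller asks for the
// next result directly instead of receiving it from a channel.
type Iterator interface {
	// Next returns the next result; ok is false once results are exhausted.
	Next() (r Result, ok bool)
	// Close releases any resources held by the iterator.
	Close() error
}

// sliceIterator is a toy in-memory Iterator used only for illustration.
type sliceIterator struct {
	keys []string
	pos  int
}

func (it *sliceIterator) Next() (Result, bool) {
	if it.pos >= len(it.keys) {
		return Result{}, false
	}
	r := Result{Key: it.keys[it.pos]}
	it.pos++
	return r, true
}

func (it *sliceIterator) Close() error { return nil }

// ToChannel adapts an Iterator to a buffered channel for callers that want
// channel semantics. The goroutine stops when the iterator is exhausted or
// when done is closed; it then closes both the iterator and the channel.
func ToChannel(it Iterator, bufSize int, done <-chan struct{}) <-chan Result {
	out := make(chan Result, bufSize)
	go func() {
		defer close(out)
		defer it.Close()
		for {
			r, ok := it.Next()
			if !ok {
				return
			}
			select {
			case out <- r:
			case <-done:
				return
			}
		}
	}()
	return out
}

// collectKeys drains a channel of results and returns the keys in order.
func collectKeys(ch <-chan Result) []string {
	var keys []string
	for r := range ch {
		keys = append(keys, r.Key)
	}
	return keys
}

func main() {
	// Direct iteration: no goroutine or channel involved.
	it := &sliceIterator{keys: []string{"/a", "/b", "/c"}}
	for r, ok := it.Next(); ok; r, ok = it.Next() {
		fmt.Println("direct:", r.Key)
	}

	// Channel-based iteration layered on top of the same interface.
	done := make(chan struct{})
	defer close(done)
	ch := ToChannel(&sliceIterator{keys: []string{"/a", "/b", "/c"}}, 128, done)
	fmt.Println("channel:", collectKeys(ch))
}
```

With this shape, callers that care about throughput can use `Next` directly, while existing channel-based callers go through the adapter and choose their own buffer size.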

@kevina
Contributor Author

kevina commented Jun 26, 2016

@whyrusleeping I will be happy to look into this and determine where the bottleneck is. It may be as simple as increasing the buffer size. I will also try a direct iterator approach and see if that helps.

@kevina kevina self-assigned this Jun 28, 2016
@kevina
Contributor Author

kevina commented Jun 29, 2016

Here are some performance numbers for doing a key-only query on the leveldb datastore:

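[Plot: key-only query performance on the leveldb datastore by channel buffer size, with direct iteration for comparison]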

"Buffer size" is the channel buffer size; "direct" is the result of querying leveldb directly, without a channel.

And here are some results from the flatfs datastore:

[Plot: query performance on the flatfs datastore by channel buffer size]

It seems that, at least for key-only queries, 128 is the optimal buffer size.
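For readers who want to try a comparison of this kind, the following is a minimal Go sketch, not the code used to produce the graphs. It times direct iteration against iteration through channels of several buffer sizes. The synthetic in-memory key list, the function names, and the chosen buffer sizes are all assumptions for illustration; a real datastore does more work per key, so the relative channel overhead will differ.

```go
package main

import (
	"fmt"
	"time"
)

// makeKeys builds n synthetic keys; it stands in for a real datastore.
func makeKeys(n int) []string {
	keys := make([]string, n)
	for i := range keys {
		keys[i] = fmt.Sprintf("/key/%d", i)
	}
	return keys
}

// countDirect visits every key in the calling goroutine, with no channel.
func countDirect(keys []string) int {
	n := 0
	for range keys {
		n++
	}
	return n
}

// countViaChannel sends every key through a channel with the given buffer
// size from a producer goroutine and counts them on the receiving side.
func countViaChannel(keys []string, bufSize int) int {
	ch := make(chan string, bufSize)
	go func() {
		defer close(ch)
		for _, k := range keys {
			ch <- k
		}
	}()
	n := 0
	for range ch {
		n++
	}
	return n
}

func main() {
	keys := makeKeys(200000)

	start := time.Now()
	countDirect(keys)
	fmt.Printf("direct:      %v\n", time.Since(start))

	// Buffer sizes are arbitrary sample points, not the ones from the graphs.
	for _, size := range []int{0, 1, 16, 128, 1024} {
		start := time.Now()
		countViaChannel(keys, size)
		fmt.Printf("buffer %4d: %v\n", size, time.Since(start))
	}
}
```

A single timed run like this is noisy; for anything beyond a quick look, Go's `testing` package benchmarks give more stable numbers.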

@whyrusleeping
Member

@kevina thanks for these graphs. I think you're right: we should buffer the channels at 128 for now and, if we need more perf later, give the option for direct iteration.

@kevina
Contributor Author

kevina commented Jun 30, 2016

I updated the graph for flatfs queries. It seems there is enough overhead in filepath.Walk that, once the buffer is large enough, the overhead of the channel and goroutine is insignificant.
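To make the flatfs case concrete, here is a hedged Go sketch of a directory walk feeding a buffered channel. It is not the flatfs implementation: `walkKeys` and `makeTempTree` are hypothetical helpers, and the file layout is invented for the example. Each key costs a filesystem visit inside `filepath.Walk`, which is the kind of per-item work that can dwarf the cost of a channel send.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// walkKeys walks root in a goroutine and sends the path of each regular
// file, relative to root, on a channel with the given buffer size.
func walkKeys(root string, bufSize int) <-chan string {
	out := make(chan string, bufSize)
	go func() {
		defer close(out)
		// The walk error is dropped for brevity; real code should report it.
		_ = filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
			if err != nil {
				return err
			}
			if info.Mode().IsRegular() {
				rel, relErr := filepath.Rel(root, path)
				if relErr != nil {
					return relErr
				}
				out <- rel
			}
			return nil
		})
	}()
	return out
}

// makeTempTree creates a temporary directory holding n empty files and
// returns its path together with a cleanup function.
func makeTempTree(n int) (string, func(), error) {
	dir, err := os.MkdirTemp("", "walk-sketch")
	if err != nil {
		return "", nil, err
	}
	cleanup := func() { os.RemoveAll(dir) }
	for i := 0; i < n; i++ {
		name := filepath.Join(dir, fmt.Sprintf("key-%03d.data", i))
		if err := os.WriteFile(name, nil, 0o644); err != nil {
			cleanup()
			return "", nil, err
		}
	}
	return dir, cleanup, nil
}

func main() {
	dir, cleanup, err := makeTempTree(10)
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer cleanup()

	count := 0
	for range walkKeys(dir, 128) {
		count++
	}
	fmt.Println("files seen:", count)
}
```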

kevina added a commit that referenced this issue Jun 30, 2016
Use "make benchmark" to run.
@kevina
Contributor Author

kevina commented Jun 30, 2016

I pushed the (somewhat hackish) code used to create the graphs to kevina/query-benchmarks, for lack of a better place.

kevina added a commit that referenced this issue Jun 30, 2016
Use "make benchmark" to run.
@whyrusleeping whyrusleeping added the help wanted Seeking public contribution on this issue label Sep 14, 2016
@flyingzumwalt flyingzumwalt added the status/deferred Conscious decision to pause or backlog label Sep 26, 2016