Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split PBF Parser into a single Read and a single Parse thread to increase parsing speed. #20

Merged
merged 2 commits into from
Nov 3, 2011

Conversation

sivetic
Copy link
Contributor

@sivetic sivetic commented Oct 14, 2011

I noticed extracting was inefficient in the original threading model, and I wasn't making any significant gains by increasing the number of threads to more than 1 (the default). I split the PBF Parser code into two separate threads instead - one thread reads all data from the PBF, and another thread parses all the data. The Read thread places all data into a thread-safe queue. I added a maximum queue size limit to avoid issues with reading all of the data into RAM (data is read quicker than it is parsed). I noticed an improvement of around 30% over original parsing method. Threading is done with Boost threads. I removed the OMP pragmas as they were no longer necessary.

Note - Read/Parse methods are NOT thread safe any longer. Only one thread may Read and Parse at the same time. I experimented with multiple Parse threads, but the improvement was negligible due to greedy locking of the STXXLContainers object. I'm guessing a better locking mechanism may improve the performance, though I did not have time to test it out.

@@ -32,6 +32,84 @@ or see http://www.gnu.org/licenses/agpl.txt.
#include "ExtractorStructs.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since concurrent_queue is not just a small struct, it is justified to put into its own class file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will split up into its own file.

@DennisOSRM
Copy link
Collaborator

Great addition, I put a few comments into the pull request


bool empty() const
{
boost::mutex::scoped_lock lock(queue_mutex);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a lock on the empty function really necessary. This is a concurrent read and it does not ensure that the queue does not change after the lock goes out of scope.

DennisOSRM pushed a commit that referenced this pull request Nov 3, 2011
Split PBF Parser into a single Read and a single Parse thread to increase parsing speed.
@DennisOSRM DennisOSRM merged commit a5c4d21 into Project-OSRM:master Nov 3, 2011
@DennisOSRM
Copy link
Collaborator

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants