Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add datafacade allocator backed by mmap #4881

Merged
merged 2 commits into from
Feb 26, 2018
Merged

Add datafacade allocator backed by mmap #4881

merged 2 commits into from
Feb 26, 2018

Conversation

TheMarex
Copy link
Member

@TheMarex TheMarex commented Feb 13, 2018

Issue

This is a first step towards #1947. We don't mmap the OSRM dataset directly but create a new file that basically stores the memory content as it is now. This is less effective both from a disk storage and startup time perspective, but should give us a good idea how the query performance using mmap without any optimizations.

Tasklist

  • Try on biggest production dataset (bike?) and measure performance
  • Investigate performance degradation under memory pressure.
  • review
  • adjust for comments
  • cherry pick to release branch

Requirements / Relations

#1947

@TheMarex
Copy link
Member Author

As a first prototype I tried to see what the general overhead is for using mmap instead of process internal memory and ran 50000 random queries on Bayern. Blue is internal memory red is mmap.
As you can see there is already a slight slowdown of just using mmap over internal memory even without memory pressure.
I ran two experiments once just running osrm-extract and osrm-contract and one that also ran osrm-partition to renumber the nodes in the graph using the partition information.

No renumbering

image

With renumbering

image

You can see that without memory pressure both perform about the same. The hope here is that renumbering will improve the locality.

@TheMarex
Copy link
Member Author

TheMarex commented Feb 16, 2018

We ran some experiments using the bicycle profile on North-America. To test the behavior under memory pressure, we first determined the minimal memory usage as 5.8G (first time serving queries completed without an OOM). Shows are the results for random queries which are generally the worst-case. The difference between 20% and 25% is quite large since we start to trash pages at a high rate due to the random access. For a 20% memory reduction, we would expect a slowdown of factor two in the worst case. All these results were obtained with 8 concurrent queries to test multi-thread page trashing.

image

@TheMarex
Copy link
Member Author

If we use renumbering (running osrm-partition) we see a much better performance under memory pressure:

image

For completeness I also run tests using MLD to see how this approach fairs:

image

We can see that MLD is in general more resilient towards memory pressure than CH, but is overall slower.

Copy link
Contributor

@oxidase oxidase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I have two remarks to fix appveyor builds.

{
// Create a new file with the given size in bytes
boost::iostreams::mapped_file_params params;
params.path = file.c_str();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file.string() would be better here as params.path is std::string.

// Create a new file with the given size in bytes
boost::iostreams::mapped_file_params params;
params.path = file.c_str();
params.mode = std::ios::in | std::ios::out;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mode is deprecated, better to use params.flags = boost::iostreams::mapped_file::readwrite;

@TheMarex
Copy link
Member Author

@oxidase can you check if this works for you? 🙇‍♂️

Copy link
Contributor

@oxidase oxidase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheMarex works 👍, lgtm

@TheMarex TheMarex merged commit 31d6d74 into master Feb 26, 2018
@TheMarex TheMarex deleted the mmap_facade branch February 26, 2018 22:32
@TheMarex TheMarex mentioned this pull request Mar 12, 2018
6 tasks
@danpat danpat mentioned this pull request Oct 20, 2018
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants