Add datafacade allocator backed by mmap #4881

TheMarex · 2018-02-13T15:43:51Z

Issue

This is a first step towards #1947. We don't mmap the OSRM dataset directly but create a new file that basically stores the memory content as it is now. This is less efficient in terms of both disk storage and startup time, but it should give us a good idea of the query performance with mmap, without any optimizations.
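As a rough illustration of the idea (create a file of the required size, map it, and write the memory content into the mapping), here is a minimal sketch. It uses POSIX calls directly, whereas the review snippets further down show that the PR itself goes through `boost::iostreams`. The class name `MappedRegion` is hypothetical and not part of OSRM.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper, not part of OSRM: owns one shared, file-backed mapping.
class MappedRegion
{
  public:
    // Create (or truncate) `path`, grow it to `size` bytes and map it read-write.
    MappedRegion(const std::string &path, std::size_t size) : size_(size)
    {
        const int fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (fd == -1)
            throw std::runtime_error("open failed: " + path);
        if (::ftruncate(fd, static_cast<off_t>(size)) == -1)
        {
            ::close(fd);
            throw std::runtime_error("ftruncate failed: " + path);
        }
        void *ptr = ::mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        ::close(fd); // the mapping stays valid after the descriptor is closed
        if (ptr == MAP_FAILED)
            throw std::runtime_error("mmap failed: " + path);
        data_ = static_cast<char *>(ptr);
    }

    ~MappedRegion() { ::munmap(data_, size_); }

    MappedRegion(const MappedRegion &) = delete;
    MappedRegion &operator=(const MappedRegion &) = delete;

    char *data() { return data_; }
    std::size_t size() const { return size_; }

  private:
    char *data_ = nullptr;
    std::size_t size_;
};
```

Because the mapping is `MAP_SHARED`, the kernel is free to evict its pages and reload them from the file later, which is what makes it possible to serve queries with less resident memory than the dataset size.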

Tasklist

Try on biggest production dataset (bike?) and measure performance
Investigate performance degradation under memory pressure.
review
adjust for comments
cherry pick to release branch

Requirements / Relations

#1947

TheMarex · 2018-02-14T15:52:53Z

As a first prototype I tried to see what the general overhead is of using mmap instead of process-internal memory, and ran 50000 random queries on Bayern. Blue is internal memory, red is mmap.
As you can see, there is already a slight slowdown from just using mmap over internal memory, even without memory pressure.
I ran two experiments: one running just osrm-extract and osrm-contract, and one that also ran osrm-partition to renumber the nodes in the graph using the partition information.

No renumbering

With renumbering

You can see that without memory pressure both perform about the same. The hope here is that renumbering will improve the locality.
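To make the locality argument concrete, here is a toy sketch (not OSRM code). It assumes each node is a fixed-size record stored at offset `id * record_size` in the mapped file, and counts how many distinct pages a set of node IDs touches. The function name `count_touched_pages`, the record layout, and the 4096-byte default page size are assumptions for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Toy model: node `id` lives at byte offset id * record_size in a mapped file.
// Returns the number of distinct pages that contain the start of a touched record.
std::size_t count_touched_pages(const std::vector<std::uint32_t> &ids,
                                std::size_t record_size,
                                std::size_t page_size = 4096)
{
    std::set<std::size_t> pages;
    for (const auto id : ids)
        pages.insert(static_cast<std::size_t>(id) * record_size / page_size);
    return pages.size();
}
```

With 16-byte records, the IDs `{0, 1000, 2000, 3000}` fall on four different pages, while the same four nodes renumbered to `{0, 1, 2, 3}` share a single page. Fewer touched pages means fewer page faults once the kernel starts evicting.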

TheMarex · 2018-02-16T16:13:54Z

We ran some experiments using the bicycle profile on North America. To test the behavior under memory pressure, we first determined the minimal memory usage as 5.8G (the first limit at which serving queries completed without an OOM). Shown are the results for random queries, which are generally the worst case. The difference between 20% and 25% is quite large, since we start to thrash pages at a high rate due to the random access. For a 20% memory reduction, we would expect a slowdown of a factor of two in the worst case. All these results were obtained with 8 concurrent queries to test multi-threaded page thrashing.

TheMarex · 2018-02-17T16:11:57Z

If we use renumbering (running osrm-partition) we see much better performance under memory pressure:

For completeness I also ran tests using MLD to see how this approach fares:

We can see that MLD is in general more resilient towards memory pressure than CH, but is overall slower.

oxidase

Looks good! I have two remarks to fix appveyor builds.

oxidase · 2018-02-23T08:22:32Z

include/util/mmap_file.hpp

+    {
+        // Create a new file with the given size in bytes
+        boost::iostreams::mapped_file_params params;
+        params.path = file.c_str();


file.string() would be better here as params.path is std::string.

oxidase · 2018-02-23T10:40:06Z

include/util/mmap_file.hpp

+        // Create a new file with the given size in bytes
+        boost::iostreams::mapped_file_params params;
+        params.path = file.c_str();
+        params.mode = std::ios::in | std::ios::out;


mode is deprecated, better to use params.flags = boost::iostreams::mapped_file::readwrite;

TheMarex · 2018-02-26T14:25:54Z

@oxidase can you check if this works for you? 🙇‍♂️

oxidase

@TheMarex works 👍, lgtm

TheMarex added Work In Progress Experimental - Do not merge labels Feb 13, 2018

TheMarex requested a review from danpat February 13, 2018 15:45

TheMarex self-assigned this Feb 13, 2018

TheMarex force-pushed the mmap_facade branch from 93718b5 to 6dd5143 Compare February 18, 2018 16:46

TheMarex added Review and removed Experimental - Do not merge Work In Progress labels Feb 18, 2018

TheMarex requested a review from oxidase February 21, 2018 14:33

oxidase requested changes Feb 23, 2018

TheMarex added 2 commits February 26, 2018 13:36

Add mmap allocator

f6dc19a

Update documentation and changelog

d67e400

TheMarex force-pushed the mmap_facade branch from ab7971a to d67e400 Compare February 26, 2018 13:37

TheMarex added Review - In feedback and removed Review labels Feb 26, 2018

oxidase approved these changes Feb 26, 2018

TheMarex merged commit 31d6d74 into master Feb 26, 2018

TheMarex deleted the mmap_facade branch February 26, 2018 22:32

TheMarex mentioned this pull request Mar 12, 2018

Implement mmapDataFacade #1947

Closed

6 tasks

danpat mentioned this pull request Oct 10, 2018

How can I make osrm-datastore to use swap memory? #5182

Closed

danpat mentioned this pull request Oct 20, 2018

Support directly mmap-ing datafiles #5242

Merged

8 tasks
