Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graph data structure for MLD #3779

Closed
TheMarex opened this issue Mar 6, 2017 · 5 comments
Closed

Graph data structure for MLD #3779

TheMarex opened this issue Mar 6, 2017 · 5 comments
Assignees
Labels
Milestone

Comments

@TheMarex
Copy link
Member

TheMarex commented Mar 6, 2017

We need a data structure that makes the following operations fast:

  1. Scanning all boundary arcs of a node at a certain level
  2. Scanning all arcs to nodes of the same cell on level 1
  3. Identify when a node is a boundary node on a certain level

For this I propose the following data structure that basically just relies on a StaticGraph but enforces a node ordering that is specific to the partition.

  1. Sort all nodes by the following criterion:

    1. The highest level for which it is a boundary node (or never)
    2. The cell in which it is contained on that level
    3. Node ID (even better: BFS ID, but we can do that later)
  2. Node IDs of boundary nodes at a certain level will be consecutive. That way we can easily implement a fast map: Node ID -> Boundary ID (row/column) for each cell using a vector for each level, without too much memory overhead.

  3. We will get a lot of cache efficiency because the search on the level 0 base graph will only be within one part of the adjacency array.

We can create this in the partitioner just after we have determined the MultiLevelPartition and before we will create the CellStorage. CellStorage will then depend on the graph having a certain ID ordering to facilitate fast lookups.

EDIT: We need to do this before the MultiLevelPartition because we store data reference by node ID.

/cc @oxidase

@TheMarex TheMarex added the MLD label Mar 6, 2017
@TheMarex
Copy link
Member Author

TheMarex commented Mar 7, 2017

Will try to sketch this data structure out today.

@TheMarex
Copy link
Member Author

TheMarex commented Mar 7, 2017

Resorting the edges makes us dependent on #3737 since the current format of the lookup file assumes edges are always sorted in the way we emit them in the EdgeBasedGraphFactory.

@danpat danpat added this to the 5.8.0 milestone May 10, 2017
@TheMarex
Copy link
Member Author

TheMarex commented May 10, 2017

Thinking a little bit more on how to do this, I think it can be done bypassing some of the structural problems here:

  1. What we want to re-number are the EdgeBasedNode ids. These are referenced in multiple files after executing osrm-extract:

    • .fileIndex stores a reference to the forward/backward node in the search graph.
    • .ebg stores edges as (source, target, data)
    • .ebg_node stores additional information indexed by node ID
  2. We already read the .ebg file in osrm-partition anyway, we could also renumber every edge and write it out again.

  3. Loading the .fileIndex file in osrm-partition and renumbering it is kind of ugly but works.

  4. There should be no downside from doing something like osrm-extract then run osrm-partition (modified the .ebg file) osrm-contract (reads the .ebg file modied by osrm-partition). Best case this will also improve cache locality for CH.

  5. Multiple runs of osrm-partition are not affected by the renumbered file.

Brain dump on how to do this:

  1. Compute partition, don't create MultiLevelPartition yet, work on the raw vectors.
  2. Scan all nodes and mark every node that is a border node.
  3. Create a permutation array for every node.
  4. Partition nodes first by level on which they are a border node.
  5. Top-down stable sort on each level partition

@TheMarex TheMarex self-assigned this May 10, 2017
@TheMarex
Copy link
Member Author

TheMarex commented May 23, 2017

First progress report: Got the renumbering working up until the customization step (the .ebg_nodes and .fileIndex files still need to be renumbered).
This already enabled me to measure the impact of the new numbering on the customization:

Heat map of the access pattern on the graph during customization:

The X-axis shows the node ID space, the Y-axis is the time of execution (axis is flipped, top is start).

master
renumbered_access_master

renumbered
renumbered_access_new

As can already be guessed from these images, the renumbering reduces the cache-misses by more then 40%.

That said, this actually seems to decrease performance, which doesn't really make sense. That is the output of perf. The number of cycles is higher for this branch. My current theory is that maybe the graph is corrupted, can't really tell before the whole pipeline works.

Never mind this was a measurement error locally, new results confirm this translates to speedup as expected:

master

 Performance counter stats for './osrm-customize ../../osrm-data/bayern-latest.osrm':

      11855.821761      task-clock:u (msec)       #    2.737 CPUs utilized          
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
             9,461      page-faults:u             #    0.798 K/sec                  
    36,612,730,500      cycles:u                  #    3.088 GHz                    
    34,531,501,103      instructions:u            #    0.94  insn per cycle         
     7,020,323,106      branches:u                #  592.141 M/sec                  
       367,569,729      branch-misses:u           #    5.24% of all branches        

       4.331524738 seconds time elapsed

renumbered

 Performance counter stats for './osrm-customize ../../osrm-data/bayern-latest.osrm':

      11382.990306      task-clock:u (msec)       #    2.803 CPUs utilized          
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
             9,445      page-faults:u             #    0.830 K/sec                  
    35,151,807,276      cycles:u                  #    3.088 GHz                    
    33,939,053,523      instructions:u            #    0.97  insn per cycle         
     6,866,671,197      branches:u                #  603.240 M/sec                  
       337,103,764      branch-misses:u           #    4.91% of all branches        

       4.061277007 seconds time elapsed

TheMarex added a commit that referenced this issue Jun 2, 2017
The new numbering uses the partition information
to sort border nodes first to compactify storages
that need access indexed by border node ID.

We also get an optimized cache performance for free
sincr we can also recursively sort the nodes by cell ID.

This implements issue #3779.
TheMarex added a commit that referenced this issue Jun 2, 2017
The new numbering uses the partition information
to sort border nodes first to compactify storages
that need access indexed by border node ID.

We also get an optimized cache performance for free
sincr we can also recursively sort the nodes by cell ID.

This implements issue #3779.
@TheMarex
Copy link
Member Author

TheMarex commented Jun 8, 2017

This was addressed with #4089.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants