-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Inertial Flow Partitioner #3205
Comments
The Inertial Flow Partitioner and the dummy partitioner outlined in #3204 will share the same interface - only the implementation will change. Let's spec out how this interface will look like (and answer open questions). Eventually we need a partition on the edge based graph for the customization to make use of it (the call site is outlined in #3552). The Intertial Flow partitioner needs an embedding, which we already have for the node based graph. This and the fact that @MoKob already had some ideas how to transfer a partition on the node based graph to the edge based graph makes it a viable option to partition the node based graph already.
In case we want to make the partitioner a new tool
For the partitioner's interface, let's spec this out in two steps. First, partitioning the node based graph. And second, transforming the node based graph partition to the edge based graph. We can load the node based graph (which is a NodeBasedDynamicGraph) and partition it with Inertial Flow. The output of this is a vector (keyed by node based node) storing a partition id for each node (the partition id is generated from a 0/1 (left/right) bisection per depth). For the partitioner interface I can imagine a similar interface to what we had in rtd (check vector<uint32_t> InertialFlow(const NodeBasedGraph& graph, const vector<Coordinate>& embedding); leaving out tuning parameters and implementation specifics such as
On the implementation side rtd just destructed the original graph creating two subgraphs (which was somewhat expensive but worked out for North America and even when we did things in parallel). It's not clear to me if we should do this again here. Also in rtd we tried spacial orders again and again for each subgraph to get the best cut. What we could do instead this time: do n spacial orders on the full node based graph and keep them around. Never do spacial orders again but re-use.
At this point we should have a partition id for each edge based graph node. We still need to combine several partition levels into a partition id usable for the customization (rtd called this annotation).
Please fill in the blanks and discuss! cc @MoKob @TheMarex @danpat @oxidase |
To turn a node-based partition into an edge based one, there is essentially one item to consider. In the top left, I've put the node based graph with a very simple partition. If we would simply use that partition, we would end up with more border nodes than we want, as shown in the right image (everything that has the color red would be, in some sense, connected to the border of the partition). What we would have to do is to introduce artificial nodes in between every two nodes that are border nodes in the node-based-partition. We introduce a intermediary segment of length 0 that we don't snap to. For the interface, I'll need some time to think about how we would best integrate this into our existing pipeline. To me it is not clear, yet, how we intend to split the different steps so that we can re-use much of what is happening. To summarise:
|
What about trying to adapt the inertial flow algorithm directly to the edge-based-graph? Instead of sorting by node coordinates directly, we could try sorting by centeroid of the edge-based-node (e.g. centeroid of several segments). Or as already proposed just translate the from the node-based-graph to the edge-based-graph. The main motivation here is that we want a clear separation between algorithm independent pre-processing ( That is why I propose to work directs on the
Try to find an automated strategy here sounds great, as we works with data sets of widely varying size and we need to account for that. Maybe we can base this on a goal for a given number of vertices per level? I think keeping the number of levels constant might be easier to work with down the road.
It would be great if we could make |
I have a few reasons why I would not separate these processing toolchains: From our prototype we had the following measurements: Why using node based:
Added benefits for CH:
Refactor Involved:
In total, this would be a battle that is actually worth picking to me. |
I think we need to step back here a little bit and look at this from a high-level view and set some constraints:
This split is important to retain because our whole processing tool-chain is build around it. Ideally we can ship MLD support without anyone noticing that is is there. Breaking compatibility will set us back for at least three months (thinking back to the last time we did that) as we will need to adapt all downstream users and triage release processes. The goal of the initial implementation is: Change as little as possible to get MLD into OSRM. That is step one, anything beyond that is something we should absolutely discuss when we actually have something running and a system we can improve incrementally. With that constraints it means we need to keep the general structure of the edge-based-graph intact because our data indexing that we compute in |
This, to me, is just another case of you knowing what should be done, but telling no one in advance. Adding a single parameter to |
The basic steps for
Up until 4 should be clear by now, so I'm going to focus on 4-6. Step 4At this point we have a partition of the node based graph, staying in the same example as @MoKob above: What we need is something to translate from the NBG node ids to EBG nodes, because that is what we want to partition eventually. In order to do that we need to write out more auxiliary information while the EBG is generated to get that mapping. The best location to capture this is probably here. You will need to write out the following information:
Step 5You load the edge based graph that is a list of edges and construct a (dynamic) graph for them. The best example on how to do that is in the constructor of GraphContractor. Now we iterate over all nodes in that graph:
In the example above the small dots are the nodes in the EBG overlaid over the NBG for orientation. The nodes in the middle that are colored in red and green are the border vertexes that we need to fix. Step 6In the last step we need to resolve the problem arising from these border nodes we can't assign a partition to. There are at least two methods to fix these. Method 1Just pick one of the partitions. This can be done randomly or by just checking how much border edges we would introduce by picking either partition (== counting the adjacent number of nodes in the EBG that have different colors). This of course has all the problems that @MoKob outline above. However for testing this is trivial to implement and gives you a partition instantly. Method 2Here we need to modify the graph in the way @MoKob suggested above. The important thing here is that we are not allowed to modify the IDs of existing nodes in the EBG (because they are saved in the StaticRTree as entry point) but additive changes are fine:
Culprits for this approach are that we need to keep a marker around that identifies "real" arcs (e.g. non-border arcs). We can mark these edges by assigning them a special data ID (this is what we save for example here in the EBGF). Unpacking can remove these edges easily before annotation. |
The base version of this shipped using a simple heuristic that picks the |
Once we have a dummy partitioner (ticketed in #3204) we unblock further steps in the pipeline. Eventually we want to replace the dummy partitioner with a proper partitioning approach: Inertial Flow.
See: http://sommer.jp/roadseparator.htm
We already successfully prototyped a internal routing engine with that - some hints are over at
http://daniel-j-h.github.io/post/selection-algorithms-for-partitioning/
our multiple candidate cuts and parallelization should be doable in osrm, too.
Tasks:
The text was updated successfully, but these errors were encountered: