Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unconnected boundary nodes from partition #3826

Merged
merged 1 commit into from
Mar 17, 2017

Conversation

TheMarex
Copy link
Member

@TheMarex TheMarex commented Mar 15, 2017

Issue

Addresses #3780 and #3818 and refactors assigning IDs from node-based edges to edge-based edges.

We now chose the BisectionID of the side that would induce the least amount of boundary edges (since we are assigning BisectionIDs and not CellIDs this is only heuristic, since two different BisectionIDs can later be contained in the same cell).

We also remove all unconnected boundary nodes.

Tasklist

  • review
  • adjust for comments

Copy link
Contributor

@oxidase oxidase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be closed without merge? the original issue is already closed.

EDIT: 🤕 sorry, the second issue is still open

@TheMarex TheMarex changed the title Initial steps to force forward and backward node in same cell Remove unconnected boundary nodes from partition Mar 16, 2017
@TheMarex
Copy link
Member Author

@oxidase ah sorry my bad, renamed the PR title. Going to squash the commits before merging, the reverts are just there for transparency.

@oxidase
Copy link
Contributor

oxidase commented Mar 16, 2017

@TheMarex it is really hard to compare heuristics, because non-covered edge cases could give different results. I checked only two cases:

  • ./osrm-partition bayern-latest.osrm on master and the branch: number of unconnected nodes goes from 0,2,3,4 to 18,19 (numbers are different between runs, because partitioning is not stable).
  • ./osrm-partition germany-latest.osrm number of unconnected nodes goes from 43,45 to 89,94.

These numbers are really small (~10^-6 part of all nodes), so the difference is almost insignificant.

I think it is better to use heuristic as simple as possible or even better avoid any kind by partitioning the edge-based graph, but don't know if it must be prioritized because of so minor win.

@TheMarex
Copy link
Member Author

TheMarex commented Mar 16, 2017

@oxidase I will try your test datasets. This method should guarantee exactly zero unconnected boundary nodes, might be a bug.

@TheMarex
Copy link
Member Author

TheMarex commented Mar 16, 2017

@oxidase fixed a bug in the code that breaks the partition, which was causing this weirdness.

Did some investigation around the increased border nodes: Picking the minimum border edge BisectionID does actually decrease quality. I tried a few combinations and surprisingly picking u for the forward and v for the backward node works the best.

My hypothesis is this is because this comes closest to a partitioning on the actual embedding. (forward node would be represented by the start of the edge and the backward node by the target)

Result on Bayern and master are virtually the same.

[info] Level 1 #cells 13847 #nodes 2745649,   source nodes: average 148244 (5.39923%) invalid 23 (0.000837689%),   destination nodes: average 211618 (7.70739%) invalid 9 (0.000327791%)
[info] Level 2 #cells 1013 #nodes 2745649,   source nodes: average 24509 (0.892649%) invalid 7 (0.000254949%),   destination nodes: average 33516 (1.22069%) invalid 0 (0%)
[info] Level 3 #cells 67 #nodes 2745649,   source nodes: average 4417 (0.160873%) invalid 0 (0%),   destination nodes: average 5948 (0.216634%) invalid 0 (0%)
[info] Level 4 #cells 3 #nodes 2745649,   source nodes: average 179 (0.00651941%) invalid 0 (0%),   destination nodes: average 221 (0.0080491%) invalid 0 (0%)

[info] Level 1 #cells 13837 #nodes 2745649,   source nodes: average 148191 (5.3973%) invalid 24 (0.00087411%),   destination nodes: average 211538 (7.70448%) invalid 8 (0.00029137%)
[info] Level 2 #cells 1009 #nodes 2745649,   source nodes: average 24397 (0.88857%) invalid 7 (0.000254949%),   destination nodes: average 33389 (1.21607%) invalid 0 (0%)
[info] Level 3 #cells 66 #nodes 2745649,   source nodes: average 4323 (0.157449%) invalid 0 (0%),   destination nodes: average 5804 (0.211389%) invalid 0 (0%)
[info] Level 4 #cells 3 #nodes 2745649,   source nodes: average 179 (0.00651941%) invalid 0 (0%),   destination nodes: average 221 (0.0080491%) invalid 0 (0%)

[info] Level 1 #cells 13864 #nodes 2745649,   source nodes: average 148367 (5.40371%) invalid 23 (0.000837689%),   destination nodes: average 211783 (7.7134%) invalid 7 (0.000254949%)
[info] Level 2 #cells 1020 #nodes 2745649,   source nodes: average 24579 (0.895198%) invalid 7 (0.000254949%),   destination nodes: average 33648 (1.2255%) invalid 0 (0%)
[info] Level 3 #cells 66 #nodes 2745649,   source nodes: average 4415 (0.1608%) invalid 0 (0%),   destination nodes: average 5929 (0.215942%) invalid 0 (0%)
[info] Level 4 #cells 3 #nodes 2745649,   source nodes: average 179 (0.00651941%) invalid 0 (0%),   destination nodes: average 221 (0.0080491%) invalid 0 (0%)

Compared to master:

[info] Level 1 #cells 13863 #nodes 2745649,   source nodes: average 148513 (5.40903%) invalid 23 (0.000837689%),   destination nodes: average 212019 (7.722%) invalid 10 (0.000364213%)
[info] Level 2 #cells 1009 #nodes 2745649,   source nodes: average 24400 (0.888679%) invalid 7 (0.000254949%),   destination nodes: average 33363 (1.21512%) invalid 0 (0%)
[info] Level 3 #cells 64 #nodes 2745649,   source nodes: average 4224 (0.153843%) invalid 0 (0%),   destination nodes: average 5654 (0.205926%) invalid 0 (0%)
[info] Level 4 #cells 3 #nodes 2745649,   source nodes: average 179 (0.00651941%) invalid 0 (0%),   destination nodes: average 221 (0.0080491%) invalid 0 (0%)

Copy link
Contributor

@oxidase oxidase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just checked on germany-latest - no warnings during CellStorage construction

[info] Loaded edge based graph for mapping partition ids: 51583110 edges, 13943644 nodes
[info] Fixed 1694 unconnected nodes
[info] Edge-based-graph annotation:
[info]   level 1 #cells 70419 bit size 17
[info]   level 2 #cells 5128 bit size 13
[info]   level 3 #cells 315 bit size 9
[info]   level 4 #cells 10 bit size 4
[info] MultiLevelPartition constructed in 12.3158 seconds
[info] CellStorage constructed in 2.4778 seconds

This commit removes all occurences of unconnected boundary nodes
and switches to the simple heuristic of picking U for the forward
and V for the backward node. This performs better than several
fancy heuristics.
@TheMarex TheMarex force-pushed the partition/forwardbackward branch from 7393470 to dfe4ae5 Compare March 16, 2017 21:52
@TheMarex TheMarex merged commit 57c6c6e into master Mar 17, 2017
@TheMarex TheMarex deleted the partition/forwardbackward branch March 17, 2017 11:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants