Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor nodes file #4036

Merged
merged 9 commits into from
May 17, 2017
Merged

Refactor nodes file #4036

merged 9 commits into from
May 17, 2017

Conversation

oxidase
Copy link
Contributor

@oxidase oxidase commented May 12, 2017

Issue

PR fixes #4028

Tasklist

  • Cleanup fileIndex, nodes and nodes_data files
  • [ ] Use Geometry views to prevent vector copies
  • Check size reduction for germany-latest.osm.pbf
  • update relevant Wiki pages
  • review
  • adjust for comments

Requirements / Relations

PR partially duplicates #3446

@oxidase
Copy link
Contributor Author

oxidase commented May 12, 2017

Changed sizes:

  • germany-latest.osrm.fileIndex 817152000 -> 614072320 ~25%
  • germany-latest.osrm.ramIndex 836852-> 629372 ~25%
  • germany-latest.osrm.names 9595145->9594817 = -328 must be checked

@oxidase oxidase force-pushed the refactor/nodes-file branch from 0d495bd to 2a3f890 Compare May 15, 2017 09:08
@TheMarex TheMarex added this to the 5.8.0 milestone May 15, 2017
@oxidase oxidase force-pushed the refactor/nodes-file branch 2 times, most recently from 3969e88 to e699a4a Compare May 15, 2017 20:37
@oxidase
Copy link
Contributor Author

oxidase commented May 15, 2017

@TheMarex u and v can also be removed, but it will require indirection forward_segment_id -> geometry_id -> geometry -> geometry[fwd_segment_position] -> coordinates. So phantom nodes snapping will require a copy of a geometry vector. The copying of vectors can be avoided by returning boost::iterator_range but this will expose internal iterator types in a datafacade, so i would like to skip it in this PR. Also removing reverse_segment_id will require an additional index and/or changes in snapping phantom nodes.

With the current implementation size of osrm::extractor::EdgeBasedNode changes from 32 to 20 bytes.

@oxidase
Copy link
Contributor Author

oxidase commented May 16, 2017

Size reduction for modified files of germany-latest.osrm is 26%:

  • fileIndex 817152000->511221760 -37%
  • ramIndex 836852->524036 -37%
  • nodes_data 126112613->182162657 +44%

EDIT: single-threaded extraction results for the branch and master

-rw-r--r-- 1 miha miha  9594992 May 16 11:29 germany-latest.osrm.names
-rw-r--r-- 1 miha miha  9594992 May 16 10:36 germany-latest.osrm.names.master

@oxidase
Copy link
Contributor Author

oxidase commented May 16, 2017

Runtime memory consumption for CH+MLD germany-latest.osm.pbf data slightly increased from

[info] Allocating shared memory of 3731803922 bytes

to

[info] Allocating shared memory of 3787552858 bytes

because part of fileIndex data moved into a shared memory block

@TheMarex
Copy link
Member

@oxidase is that a measurement against master or against 5.7.0?

@oxidase
Copy link
Contributor Author

oxidase commented May 16, 2017

@TheMarex no, it is master against the branch

Copy link
Member

@TheMarex TheMarex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just some smaller naming related things. 👍

@@ -93,7 +93,8 @@ class EdgeBasedGraphFactory

// The following get access functions destroy the content in the factory
void GetEdgeBasedEdges(util::DeallocatingVector<EdgeBasedEdge> &edges);
void GetEdgeBasedNodes(std::vector<EdgeBasedNode> &nodes);
void GetEdgeBasedNodes(EdgeBasedNodeDataContainer &ebg_node_data_container);
void GetNodeBasedEdges(std::vector<EdgeBasedNode> &nodes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming here is a little bit confusing. This function returns a EdgeBasedNode but is named GetNodeBasedEdge. I think we should rename that class, it does not represent a single EdgeBasedNode but actually a single segment. Maybe ExternalSegmentData?

@@ -65,6 +68,13 @@ template <storage::Ownership Ownership> class EdgeBasedNodeDataContainerImpl
travel_modes[node_id] = travel_mode;
}

// Used by EdgeBasedGraphFactory to fill data structure
template <typename = std::enable_if<Ownership == storage::Ownership::Container>>
void SetData(NodeID node_id, ComponentID component_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a less generic name: SetComponentID.

@oxidase oxidase force-pushed the refactor/nodes-file branch from b76ba89 to 5bfb104 Compare May 16, 2017 17:23
@@ -359,18 +362,6 @@ void Extractor::FindComponents(unsigned max_edge_id,
}
}

// connect forward and backward nodes of each edge
Copy link
Contributor Author

@oxidase oxidase May 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheMarex just forgot to mention about the removed part that was introduced in c80c223. i don't get idea why graph must be extended with potentially non-exisiting edges. So if there is a use case for it, the place must be reverted before merge

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is necessary to enforce that the component ID of the forward and backward node is the same (the implicit assumption is that we are always about to do a u-turn on the spot). There are of course cases where this is generally not true, but we can't handle these right now anyway (for that we would need to split the PhantomNode into forward/backward and have two component IDs).

@oxidase
Copy link
Contributor Author

oxidase commented May 16, 2017

@TheMarex as discussed per chat EdgeBasedNode is renamed to EdgeBasedNodeSegment

@oxidase oxidase force-pushed the refactor/nodes-file branch from 5bfb104 to 75e6808 Compare May 16, 2017 18:30
@oxidase oxidase force-pushed the refactor/nodes-file branch from 75e6808 to 7a26333 Compare May 17, 2017 05:57
@oxidase oxidase force-pushed the refactor/nodes-file branch from 7a26333 to 3ee629f Compare May 17, 2017 07:15
@TheMarex TheMarex merged commit 3546abc into master May 17, 2017
@TheMarex TheMarex deleted the refactor/nodes-file branch May 17, 2017 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reduce data duplication in fileIndex files
2 participants