-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
De-duplicate edge-based-node data in RTree #3446
Conversation
@@ -15,43 +15,48 @@ namespace osrm | |||
namespace extractor | |||
{ | |||
|
|||
struct RoadSegment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use the space savings and use a little bit of that to inline the coordinates in the struct. That would be 16 bytes more, but would solve the following ugliness:
- Speed: It will probably be much faster, no double-indirection
- No need for a separate std::vector<std::pair<NodeID, NodeID>> on construction
- No cyclic dependency on
datafacade
- No dependency on the coordinates array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TheMarex yeah, good idea. The first round here was to see just how much space saving there was to be had.
In theory we can cram +/- 1m precision lon/lat values into 22-23 bits each - we may be able to pack lon/lat/fwd_segment_position into 64 bits and only cost an additional 4 bytes.
If you get the indexing for the What we would need for that is to modify the path unpacking to not create |
Closing this as it's being replaced by #4036 |
Issue
The StaticRTree contains lots of duplicated data. Every road segment contains a complete copy of the edge-based-node data that it's related to.
While this is good for cache efficiency, it's quite wasteful space wise.
This PR refactors all the common data into a standalone vector. This adds an additional lookup layer for every RTree operation, but reduces the total RTree datasize by >50% (in testing so far, Texas-sized RTree data went from 42MB to 17MB).
This PR is a WIP - I'm not 100% clear on the performance impact of this change yet. I'm hoping that it's relatively minor, and the significant space savings are worth the cost.
Tasklist