OSRM Normalized File Format - Read Issues #984

Closed
jlaura opened this issue Apr 18, 2014 · 23 comments

@jlaura

jlaura commented Apr 18, 2014

I am attempting to parse the .osrm file to extract the nodes and edges for use in some custom analytics written in Python. Using the mapping of the normalized file format, I am able to skip the header and read in the nodes:

[(32261224, -110980190, 2522869424L, 0)
 (32261237, -110978633, 2522869425L, 0)
 (32261262, -110982280, 2522869426L, 0)
 (32261267, -110981328, 2522869427L, 0)
 (32261306, -110978515, 2522869428L, 0)
 (32261427, -110978378, 2522869429L, 0)
 (32261540, -110978326, 2522869430L, 0)
 (32261616, -110978328, 2522869431L, 0)
 (32261625, -110979222, 2522869432L, 0)
 (32261638, -110978327, 2522869433L, 0)]

I do not need the flags and so am not extracting them.
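
For readers following along, here is a minimal sketch of how fixed-size node records like the ones above can be read with a NumPy structured dtype. The field names, the field widths (two signed 32-bit integers, an unsigned 32-bit ID, a 32-bit flags word), and the 1e6 coordinate scale are assumptions inferred from the sample values in this post, not a confirmed description of the .osrm layout.

```python
import struct
import numpy as np

# Assumed record layout (hypothetical, inferred from the sample output above):
# lat and lon as signed 32-bit ints, node id as unsigned 32-bit, flags as 32 bits.
NODE_DTYPE = np.dtype([
    ("lat", "<i4"),
    ("lon", "<i4"),
    ("id", "<u4"),
    ("flags", "<u4"),
])

def parse_nodes(buf, count, offset=0):
    """Interpret `count` fixed-size records in `buf`, starting at byte `offset`."""
    return np.frombuffer(buf, dtype=NODE_DTYPE, count=count, offset=offset)

# Two records taken from the post, packed into bytes to stand in for file content.
sample = [(32261224, -110980190, 2522869424, 0),
          (32261237, -110978633, 2522869425, 0)]
raw = b"".join(struct.pack("<iiII", *rec) for rec in sample)

nodes = parse_nodes(raw, count=len(sample))
wanted = nodes[["lat", "lon", "id"]]   # leave the flags column out

# Assumed scale: the integers look like degrees multiplied by 1e6.
lat_deg = nodes["lat"] / 1e6
lon_deg = nodes["lon"] / 1e6
```

For a real file, `raw` would be the file's bytes and `offset` the size of whatever precedes the node records, which this thread does not pin down.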

I cannot extract reasonable-looking edges, however, and am seeing:

[(4183983810L, 2522869435L, 0, 18302, -1893072404, -1694, 2522869436L, 0L)
 (32262024L, 4183986981L, -1772097859, 0, 1200160768, 492, 4183987106L, 2522869438L)]

Length 0 and negative length make no sense.
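
Several values in the edge output above resemble node data: 2522869435 and 2522869436 look like node IDs, 32262024 looks like a latitude, and -1772097859 is what 2522869437 becomes when read as a signed 32-bit integer. That pattern is consistent with a wrong start offset or record size, although it does not prove it. The sketch below reproduces the effect on synthetic data; the four-field edge record is hypothetical and is not the real .osrm edge layout.

```python
import struct
import numpy as np

# Hypothetical edge record, for illustration only: source, target, length, flags.
EDGE_DTYPE = np.dtype([
    ("source", "<u4"),
    ("target", "<u4"),
    ("length", "<i4"),
    ("flags", "<u4"),
])

edges = [(2522869424, 2522869425, 146, 0),
         (2522869425, 2522869426, 344, 0),
         (2522869426, 2522869427, 90, 0)]
raw = b"".join(struct.pack("<IIiI", *e) for e in edges)

# Correct offset: every field lands in its own column.
aligned = np.frombuffer(raw, dtype=EDGE_DTYPE, count=3, offset=0)

# Off by one 4-byte field: the zero flags word is read as the length.
shifted4 = np.frombuffer(raw, dtype=EDGE_DTYPE, count=2, offset=4)

# Off by two fields: the next record's source id is read as a signed length,
# and an id above 2**31 wraps around to a negative number.
shifted8 = np.frombuffer(raw, dtype=EDGE_DTYPE, count=2, offset=8)
```

Here `shifted4["length"]` is all zeros and `shifted8["length"]` is negative, the same two symptoms reported above.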

  1. Is the normalized file format in line with the current code, or has the structure changed?
  2. In the .osrm file, the vertex count appears to occupy a full 16 bytes (4 for the count and 12 of padding), while the edge count is 4 bytes with no padding. Is this correct?
  3. Does the header contain information describing the scaling of the lat/lon values (32.2, -110.98 in the above example)?
@woodbri

woodbri commented Apr 19, 2014

Also see #825. You might find https://github.com/woodbri/osrm-tools useful, although it is a little out of date relative to the current release of Project-OSRM, as I have not had time to update it, and Dennis has #825 somewhere (hopefully) on his todo list.

@jlaura
Author

jlaura commented Apr 19, 2014

@woodbri I saw #825, and a reader or plain text file would be wonderful. The latter does raise performance concerns, but access to the data is better than no access. Can you speak to the accuracy of the wiki, specifically the layout of the edges? Are you able to properly read an OSRM normalized file with your tool (it looks like it is geared more towards writing)? Thanks!

@woodbri

woodbri commented Apr 19, 2014

@jlaura Yes, my tool is more oriented to writing. I updated the wiki page the last time I sync'd my tool with Project-OSRM, but I suspect that it is out of date again. I had to go through the source code to update it last time. As Project-OSRM becomes more popular, getting a clean, well-supported API becomes more critical, but until Dennis has time to do something on this front, it will be up to the community to deal with this.

@jlaura
Author

jlaura commented Apr 19, 2014

@woodbri Thanks for the info... to the source I'll go. I appreciate the wiki page to get rolling; thanks for the hard work getting that done.

@jlaura
Author

jlaura commented Apr 22, 2014

Here is a Python script that will parse the .osrm file into a pair of NumPy arrays, one containing nodes (with flags all set to 0) and one containing edges. GIST

Still no information on what is in the header. It looks like a 128 byte UUID followed by a count of the number of edges (padded after 12 bytes), but I'm not 100% sure. Like @woodbri, I just skip it.
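
The layout guessed at in this thread (an opaque header, a 4-byte count with optional padding, then fixed-size records) can be walked with `struct` and relative seeks. This is a minimal sketch: the header size, padding sizes, and record formats are all parameters because none of them is confirmed here, and the values used in the demonstration are placeholders for a synthetic file, not the real .osrm values.

```python
import io
import struct

def read_counted_block(f, record_fmt, pad_after_count=0):
    """Read a little-endian uint32 count, skip padding, then read that many records."""
    (count,) = struct.unpack("<I", f.read(4))
    f.seek(pad_after_count, io.SEEK_CUR)
    rec = struct.Struct(record_fmt)
    return [rec.unpack(f.read(rec.size)) for _ in range(count)]

def read_graph(f, header_size, node_fmt, edge_fmt, node_pad=0, edge_pad=0):
    """Skip an opaque header, then read a node block followed by an edge block."""
    f.seek(header_size, io.SEEK_CUR)
    nodes = read_counted_block(f, node_fmt, node_pad)
    edges = read_counted_block(f, edge_fmt, edge_pad)
    return nodes, edges

# Synthetic file: 8-byte dummy header, 2 nodes (count padded by 12 bytes), 1 edge.
blob = (b"\x00" * 8
        + struct.pack("<I", 2) + b"\x00" * 12
        + struct.pack("<iiII", 32261224, -110980190, 2522869424, 0)
        + struct.pack("<iiII", 32261237, -110978633, 2522869425, 0)
        + struct.pack("<I", 1)
        + struct.pack("<IIi", 2522869424, 2522869425, 146))

nodes, edges = read_graph(io.BytesIO(blob), header_size=8,
                          node_fmt="<iiII", edge_fmt="<IIi", node_pad=12)
```

Keeping the sizes as arguments makes it cheap to try different guesses against a real file and check whether the decoded counts and coordinates look plausible.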

@jlaura closed this as completed Apr 22, 2014
@emiltin
Contributor

emiltin commented Apr 23, 2014

what about an option in osrm-extract and osrm-prepare to export to some stable/standard format? #825

@DennisOSRM
Collaborator

I am not sure if we are at this point yet. The .osrm format has been pretty stable, but with all the new and coming features there will be breaking changes.

What we could do, though, is use something like protobuf to have some backward compatibility.

@emiltin
Contributor

emiltin commented Apr 23, 2014

you mean export nodes/edges in protobuffer format?

@DennisOSRM
Collaborator

Right, the advantage would be to not only have some standardized format with (limited) backward compatibility but also bindings for several programming languages. And we are already using protobuffer for parsing the osm.pbf files.

@emiltin
Contributor

emiltin commented Apr 23, 2014

can protobuffer be used for saving the contracted graph? or do people only need the extracted graph?

@emiltin
Contributor

emiltin commented Apr 23, 2014

but yes, that's what i was suggesting, exporting to some standard format

@DennisOSRM
Collaborator

Good question on what people need. The protobuffer library lets you define schemas for any kind of data to de-/serialize.

@emiltin
Contributor

emiltin commented Apr 23, 2014

the current issue #984 mentions .osrm, which would be the contracted graph. on the other hand, #825 talks about osrm-extract. in any case you could imagine use cases for both.

@emiltin
Contributor

emiltin commented Apr 23, 2014

do you have experience with http://kentonv.github.io/capnproto?

@DennisOSRM
Collaborator

Some limited experience. capnproto is basically the same as protobuf from a functional point of view. That being said, it has a good reputation in terms of efficiency. The only downside is that capnproto would introduce another dependency where protobuf is already used.

@emiltin
Contributor

emiltin commented Apr 23, 2014

yeah, absolute performance would not be the main goal with an export function. i would stay with pbf

@jlaura
Author

jlaura commented Apr 23, 2014

@emiltin Just to confirm for my use case: is the contracted graph the full graph with coincident nodes and edges removed, or has some other processing occurred? I need access to the nodes, the edges (from which I can extract the graph topology), and the edge lengths.

@emiltin
Contributor

emiltin commented Apr 23, 2014

the contracted data is a lot different from the basic topology. it's an edge expanded graph that has been contracted using the contraction hierarchies algorithm. if you just want the topology, you probably want the output from osrm-extract, which is more like a filtered version of the osm data.

@woodbri

woodbri commented Apr 23, 2014

@emiltin It seems that this discussion should be part of #825 from the point of view of making a stable way to import or export the OSRM files/data. While I'm not opposed to using something like protobuf, there are a lot of advantages to supporting something like simple CSV text files, but that could be achieved with a pbf2txt utility of some kind.

Regarding text files: they are not efficient, but the primary goal here is accessibility rather than efficiency. They also only need to be read or written once per transfer, so performance is not an issue, and they can be written as a gzip-compressed stream to address space efficiency.
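
As a rough illustration of the gzip-compressed text idea, the sketch below writes an edge list as gzip-compressed CSV and reads it back using only the Python standard library. The column names and the three-field edge rows are hypothetical; nothing in this thread defines an export schema.

```python
import csv
import gzip
import io

def write_edges_csv_gz(path_or_file, edges):
    """Write (source, target, length) rows as gzip-compressed CSV."""
    with gzip.open(path_or_file, "wt", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["source", "target", "length"])  # hypothetical header
        writer.writerows(edges)

def read_edges_csv_gz(path_or_file):
    """Read the rows back as tuples of ints, skipping the header row."""
    with gzip.open(path_or_file, "rt", newline="") as fh:
        reader = csv.reader(fh)
        next(reader)
        return [tuple(int(v) for v in row) for row in reader]

edges = [(2522869424, 2522869425, 146), (2522869425, 2522869426, 344)]
buf = io.BytesIO()            # stands in for a file on disk
write_edges_csv_gz(buf, edges)
buf.seek(0)
round_trip = read_edges_csv_gz(buf)
```

Passing a file name instead of the in-memory buffer works the same way, since `gzip.open` accepts either.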

While my use case does not currently need the contracted graph, I could see this as being potentially useful in the future, but clearly not a high priority in my mind.

@emiltin
Contributor

emiltin commented Apr 23, 2014

yes i agree it's essentially the same discussion as #825.
sorry if i'm a bit slow here, but what's the point in exporting osrm-extract data - is it to get access to OSRM's lua based filtering and speed assignments?

@woodbri

woodbri commented Apr 23, 2014

I can speak to @jlaura's specific needs, but there are a lot of applications that cannot be solved today with OSRM that could be solved in pgRouting or potentially other routing engines, for example: adjusting costs or filtering the edges based on traffic or accident data, time-based cost adjustments, and dynamically supporting different profiles without having multiple instances of OSRM. There are obviously other ways to get OSM data into pgRouting, but in @jlaura's case it is a Python app and not pgRouting.

@DennisOSRM
Collaborator

@jlaura from what I get from your post above, you would need the .osrm file which contains all the segments of the road network.

@jlaura
Author

jlaura commented Apr 23, 2014

@DennisOSRM Sounds like I'm on the right path. My workflow (and the linked gist above) is aimed at extracting the .osrm file. This should be the output from osrm-extract.

I believe I could also do the extraction myself from the OSM data, but happened upon this project and saw that you were performing that extraction as your first processing phase, which generates the .osrm file. It looks like the second phase is what OSRM was originally designed for: the generation of high performance routing. (The .edges and .nodes files contain a reduced graph if I understand correctly).

My use case is performing network constrained spatial analysis on nonplanar graphs. The first step is the extraction of the network graph to be used in the generation of an adjacency matrix (contiguity, kernel, or distance based). From the source, it looks like OSRM fulfills the extraction need.
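
To make the adjacency-matrix step concrete, here is a minimal NumPy sketch that turns an edge list into either a binary (contiguity) or a length-weighted (distance based) matrix. It assumes edges have already been extracted as (source id, target id, length) rows and treats them as undirected; both are assumptions for illustration, and the function name is hypothetical.

```python
import numpy as np

def adjacency_from_edges(edges, weighted=False):
    """Build a dense symmetric adjacency matrix from (source, target, length) rows."""
    edges = np.asarray(edges, dtype=np.int64)
    ids = np.unique(edges[:, :2])              # sorted unique node ids
    src = np.searchsorted(ids, edges[:, 0])    # map node ids to row/column indices
    dst = np.searchsorted(ids, edges[:, 1])
    values = edges[:, 2].astype(float) if weighted else 1.0
    adj = np.zeros((len(ids), len(ids)))
    adj[src, dst] = values
    adj[dst, src] = values                     # treat edges as undirected
    return ids, adj

edges = [(10, 20, 146), (20, 30, 344)]
ids, contiguity = adjacency_from_edges(edges)
_, distance = adjacency_from_edges(edges, weighted=True)
```

A dense matrix is only practical for small networks; for a city-scale graph the same index mapping would feed a sparse matrix instead.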
