Data mapping
OpenStreetMap data is organized in a relational model composed of three data primitives: node, way and relation. These objects are linked to each other by their unique osmid. Being relational, this model fits well in an RDBMS (commonly PostgreSQL + PostGIS) and can be exported.
Even though XML is the official representation, OpenStreetMap also supports more compact formats such as PBF (Protocol Buffers) or BZ2 (compressed XML). These files can be easily found on the Internet (see the Quick-start's Get OSM data section).
Osmosis is able to read both XML and PBF formats: it deserializes data into Java objects that can be processed through plugins. In our case, the elasticsearch-osmosis-plugin converts these Java objects into their JSON equivalent before inserting them into elasticsearch, using their osmid as document key.
The examples and explanations below are based on this sample.osm file:
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="OpenStreetMap server" copyright="OpenStreetMap and contributors"
attribution="http://www.openstreetmap.org/copyright" license="http://opendatacommons.org/licenses/odbl/1-0/">
<node id="497017646" version="2" changeset="12638179" visible="true" timestamp="2012-08-06T19:59:42Z"
lat="48.6757054" lon="2.3794174" user="inoskyh" uid="785001"/>
<node id="497017647" version="2" changeset="12638179" visible="true" timestamp="2012-08-06T19:59:42Z"
lat="48.6755698" lon="2.3795879" user="inoskyh" uid="785001"/>
<node id="343866517" version="10" changeset="12638179" visible="true" timestamp="2012-08-06T19:59:42Z"
lat="48.6752788" lon="2.3799338" user="inoskyh" uid="785001"/>
<way id="40849832" visible="true" timestamp="2012-08-06T19:59:43Z" version="3" changeset="12638179"
user="inoskyh" uid="785001">
<nd ref="497017646"/>
<nd ref="497017647"/>
<nd ref="343866517"/>
<tag k="highway" v="residential"/>
<tag k="name" v="Avenue Marc Sangnier"/>
</way>
</osm>
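As a rough illustration of the conversion described above, the sketch below reads the nodes of the sample file and builds one JSON document per node, keyed by osmid. It is not the plugin's code: Python's standard xml.etree module stands in for Osmosis, the name node_to_document is hypothetical, and the document layout simply follows the node documents shown on this page.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copy of the nodes from sample.osm (metadata attributes omitted).
SAMPLE = """<osm version="0.6">
  <node id="497017646" lat="48.6757054" lon="2.3794174"/>
  <node id="497017647" lat="48.6755698" lon="2.3795879"/>
  <node id="343866517" lat="48.6752788" lon="2.3799338"/>
</osm>"""


def node_to_document(node):
    """Return (osmid, document) for a <node> element (hypothetical helper)."""
    lon, lat = float(node.get("lon")), float(node.get("lat"))
    # Child <tag k="..." v="..."/> elements become a flat key/value object.
    tags = {t.get("k"): t.get("v") for t in node.findall("tag")}
    document = {
        "centroid": [lon, lat],  # longitude first, then latitude
        "shape": {"type": "point", "coordinates": [lon, lat]},
        "tags": tags,
    }
    return node.get("id"), document


# The osmid is used as the document key.
documents = dict(node_to_document(n) for n in ET.fromstring(SAMPLE).iter("node"))

for osmid, document in documents.items():
    print(osmid, json.dumps(document, separators=(",", ":")))
```

Keeping the osmid outside the document body mirrors the idea that it serves as the document key rather than as a field.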
The elasticsearch-osmosis-plugin converts and inserts OSM entities into an elasticsearch index, in different indices depending on the OSM entity type.
Produced JSON documents share common fields to allow geo-querying on multiple indices:
- The centroid field represents the centroid of the entity (equal to its coordinates if the entity is a point). It is mapped as geo_point type, allowing the use of the distance filter, distance range filter, polygon filter, bounding box filter and distance facets
- The shape field represents the shape of the entity. It is mapped as geo_shape type, allowing the use of the shape query and shape filter
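To make the geo-querying point concrete, here is a minimal sketch of a distance filter on the centroid field. It assumes the filtered query syntax of the elasticsearch versions contemporary with this page (later versions removed it); the helper name centroid_distance_query and the default distance are illustrative choices, not part of the plugin.

```python
import json


def centroid_distance_query(lon, lat, distance="200m"):
    """Build a search body matching documents whose centroid lies
    within `distance` of the given point (hypothetical helper)."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        # Object form avoids any lon/lat ordering ambiguity.
                        "centroid": {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


# Search around the middle node of the sample way.
body = centroid_distance_query(2.3795879, 48.6755698)
print(json.dumps(body, indent=2))
```

Because the centroid field is shared by all documents, the same body could in principle be sent to a search spanning several entity types at once.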
In addition, each way document contains the following fields:
- The lengthKm field represents the length (or perimeter if the way is closed) in kilometers
- The areaKm2 field represents the area in square kilometers (equals 0 if the way is not closed)
Given the extract above, you can expect the following:
- All nodes will be stored in the node index, with their osmid as elasticsearch id
{"centroid":[2.3794174,48.6757054],"shape":{"type":"point","coordinates":[2.3794174,48.6757054]},"tags":{}}
{"centroid":[2.3795879,48.6755698],"shape":{"type":"point","coordinates":[2.3795879,48.6755698]},"tags":{}}
{"centroid":[2.3799338,48.6752788],"shape":{"type":"point","coordinates":[2.3799338,48.6752788]},"tags":{}}
- All ways will be stored in the way index, with their osmid as elasticsearch id
{
"centroid": [2.379676881568899,48.67549366663964],
"lengthKm": 0.07448669438396566,
"areaKm2": 0,
"shape": {
"type": "linestring",
"coordinates": [[2.3794174,48.6757054],[2.3795879,48.6755698],[2.3799338,48.6752788]]
},
"tags": {
"highway": "residential",
"name": "Avenue Marc Sangnier"
}
}
- All relations and bounds (not present in this example) are ignored because they are not yet implemented.
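The centroid and lengthKm values of the way document above appear consistent with a planar computation done in degrees: a length-weighted centroid of the segments, and a length converted to kilometers at roughly 111.195 km per degree. This is inferred from the sample numbers only, not from the plugin's source, so treat the sketch below as an approximation under that assumption. The names way_document and DEG_TO_KM are hypothetical, and only open ways are handled (areaKm2 is then 0).

```python
import math

# Assumption: kilometers per degree on a sphere of mean radius 6371.0087714 km.
DEG_TO_KM = 2 * math.pi * 6371.0087714 / 360


def way_document(coords, tags):
    """Build a document for an open way given [lon, lat] pairs.
    Assumes at least two distinct points (hypothetical helper)."""
    total = cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)  # planar segment length, in degrees
        total += seg
        # Each segment contributes its midpoint, weighted by its length.
        cx += seg * (x1 + x2) / 2
        cy += seg * (y1 + y2) / 2
    return {
        "centroid": [cx / total, cy / total],
        "lengthKm": total * DEG_TO_KM,
        "areaKm2": 0,  # open way: no area
        "shape": {"type": "linestring", "coordinates": coords},
        "tags": tags,
    }


COORDS = [[2.3794174, 48.6757054], [2.3795879, 48.6755698], [2.3799338, 48.6752788]]
TAGS = {"highway": "residential", "name": "Avenue Marc Sangnier"}

doc = way_document(COORDS, TAGS)
print(doc["centroid"], doc["lengthKm"])
```

The printed centroid and length should land close to the values in the sample way document; small differences in the last digits are to be expected since the plugin's exact constants are not known here.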
The following mapping is applied by default on all indices of the index. You can override it if needed (see the Usage section).
{
"_all": {"enabled": false},
"dynamic_templates": [
{
"tags_exceptions": {
"path_match": "tags.*",
"match": "(name.*)",
"match_pattern": "regex",
"mapping": {
"store": "no",
"type": "multi_field",
"fields": {
"{name}": {"type": "string", "index": "not_analyzed"},
"analyzed": {"type": "string", "index": "analyzed"}
}
}
}
},
{
"tags_default": {
"path_match": "tags.*",
"mapping": {"index": "not_analyzed", "store": "no"}
}
}
],
"properties": {
"centroid": {"type": "geo_point"},
"shape": {"type": "geo_shape"}
}
}
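To show how the two dynamic templates in this mapping divide the tags between them, the sketch below mimics the selection rule: templates are tried in order and the first match wins. It is a simplified model, not elasticsearch code. It assumes that regex patterns must match the whole value, that match is tested against the last path component, and that tag keys contain no dots; the function name template_for is hypothetical.

```python
import fnmatch
import re


def template_for(field_path):
    """Return the name of the first dynamic template that would apply
    to a new field, or None if no template matches."""
    field_name = field_path.rsplit(".", 1)[-1]

    # tags_exceptions: match_pattern is "regex", so both patterns are
    # treated as regular expressions (assumed to require a full match).
    if re.fullmatch(r"tags.*", field_path) and re.fullmatch(r"(name.*)", field_name):
        return "tags_exceptions"

    # tags_default: plain wildcard on the full path.
    if fnmatch.fnmatchcase(field_path, "tags.*"):
        return "tags_default"

    return None


for path in ("tags.name", "tags.name:fr", "tags.highway", "centroid"):
    print(path, "->", template_for(path))
```

Under this reading, name-like tags get both a not_analyzed and an analyzed sub-field for full-text search, while every other tag is indexed as a single not_analyzed value.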