Work-in-progress implementation of new flex backend
joto committed Jan 11, 2020
1 parent 1268f10 commit 8b14c67
Showing 30 changed files with 4,546 additions and 2 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -8,7 +8,7 @@ addons:
 # Here we install only packages that are the same for all builds on ubuntu.
 apt:
 packages: ['python3-psycopg2', 'libexpat1-dev', 'libpq-dev', 'libbz2-dev',
-'libproj-dev', 'libluajit-5.1-dev',
+'libproj-dev', 'libluajit-5.1-dev', 'lua-messagepack',
 'libboost-dev', 'libboost-system-dev', 'libboost-filesystem-dev']

 # env: T="...." // please set an unique test id (T="..")
150 changes: 150 additions & 0 deletions flex-config/README.md
@@ -0,0 +1,150 @@
# Flex Backend Configuration

The "Flex" backend is configured through a Lua file that defines the structure
of the output tables and maps OSM data to the format used in the database.
This gives you a lot of control over what the data looks like in the
database.

## Lua config file

All configuration is done through the `osm2pgsql` object in Lua. It has the
following fields:

* `osm2pgsql.version`: The version of osm2pgsql as a string.
* `osm2pgsql.srid`: The SRID set on the command line (with `-l`, `-m`, or `-E`).
* `osm2pgsql.mode`: Either `create` or `append` depending on the command line options.
* `osm2pgsql.stage`: Either 1 or 2 (1st/2nd stage processing the data). See below.
* `osm2pgsql.userdata`: To store your user data. See below.

The following functions are defined:

* `osm2pgsql.define_node_table(name, columns)`: Define a node table with the
specified name and columns.
* `osm2pgsql.define_way_table(name, columns)`: Define a way table with the
specified name and columns.
* `osm2pgsql.define_relation_table(name, columns)`: Define a relation table
with the specified name and columns.
* `osm2pgsql.define_area_table(name, columns)`: Define an area table
with the specified name and columns.
* `osm2pgsql.define_table(data)`: Define a table.
* `osm2pgsql.mark(type, id)`: Mark the OSM object of the specified type ('w'
or 'r') with the specified id. The OSM object will trigger a call to the
processing function again in the second stage.
* `osm2pgsql.get_bbox()`: Get the bounding box of the current node or way. Only
works inside the `osm2pgsql.process_node()` and `osm2pgsql.process_way()`
functions.

You are expected to define one or more of the following functions:

* `osm2pgsql.process_node(data)`: Called for each node.
* `osm2pgsql.process_way(data)`: Called for each way.
* `osm2pgsql.process_relation(data)`: Called for each relation.
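
As a minimal sketch of how these pieces fit together, the following config defines one table and fills it from a callback. The table name `pois` and the column names are illustrative, and `data.tags` is assumed to be the tag table passed to the callback, as in the example config files in this directory.

```lua
-- Define one node table with a tags column and a point geometry column.
local pois = osm2pgsql.define_node_table('pois', {
    { column = 'tags', type = 'hstore' },
    { column = 'geom', type = 'point' },
})

-- Called by osm2pgsql for each node in the input.
function osm2pgsql.process_node(data)
    -- Skip nodes without any tags.
    if next(data.tags) == nil then
        return
    end

    -- Only the tags are passed here; the id and geometry columns are
    -- left to osm2pgsql.
    pois:add_row({ tags = data.tags })
end
```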

Any fields starting with an underscore (`_`) are reserved for internal use
of osm2pgsql and must not be accessed in any way.
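
The `osm2pgsql.mark()` function, the `osm2pgsql.stage` field, and `osm2pgsql.userdata` are meant to be used together. The following sketch, condensed from the `advanced.lua` example in this directory, remembers which relation a member way belongs to and writes that id when the way is processed again in the second stage. The table name, the column names, and the `w2r` key are illustrative, and the sketch assumes that `data.members` entries carry `type` and `ref` fields as they do in that example.

```lua
local members = osm2pgsql.define_way_table('route_members', {
    { column = 'rel_id', type = 'int8' },
    { column = 'geom', type = 'linestring' },
})

function osm2pgsql.process_relation(data)
    if data.tags.type ~= 'route' then
        return
    end

    -- Create the lookup table on first use.
    osm2pgsql.userdata.w2r = osm2pgsql.userdata.w2r or {}

    for _, member in ipairs(data.members) do
        if member.type == 'w' then
            -- Ask for process_way() to be called again in stage 2.
            osm2pgsql.mark('w', member.ref)
            osm2pgsql.userdata.w2r[member.ref] = data.id
        end
    end
end

function osm2pgsql.process_way(data)
    -- Do nothing in the first stage; marked ways come back in stage 2.
    if osm2pgsql.stage ~= 2 then
        return
    end

    members:add_row({ rel_id = osm2pgsql.userdata.w2r[data.id] })
end
```

Like `advanced.lua`, this sketch keeps only one relation id per way, so a way that is a member of several relations is not handled correctly.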

### Defining a table

You have to define one or more tables where your data should end up. This
is done with the `osm2pgsql.define_table()` function or one of the slightly
more convenient functions `osm2pgsql.define_(node|way|relation|area)_table()`.

Each table is either a *node table*, *way table*, *relation table*, or *area
table*. This means that the data for that table comes primarily from a node,
way, relation, or area respectively. Osm2pgsql makes sure that the OSM object
id will be stored in the table so that later updates to those OSM objects (or
deletions) will be properly reflected in the tables. Area tables are special:
they can contain data derived from ways or from relations. Way ids will be
stored as is; relation ids will be stored as negative numbers. (You can define
tables that don't have any ids, but those tables will never be updated by
osm2pgsql.)

If you are using the `osm2pgsql.define_(node|way|relation|area)_table()`
convenience functions, osm2pgsql will automatically create an id column named
`(node|way|relation|area)_id`, respectively. If you want more control over
the id column(s), use the `osm2pgsql.define_table()` function.
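
For comparison, the two forms might look like this. The first leaves the id column name (`area_id`) to osm2pgsql; the second sets it to `osm_id` through the `ids` field. The `ids` and `id_column` keys are taken from the `advanced.lua` example in this directory; the table names are illustrative.

```lua
-- Convenience form: the id column is named "area_id" automatically.
local buildings = osm2pgsql.define_area_table('buildings', {
    { column = 'tags', type = 'hstore' },
    { column = 'geom', type = 'geometry' },
})

-- General form: choose the id type and the id column name yourself.
local landuse = osm2pgsql.define_table{
    name = 'landuse',
    ids = { type = 'area', id_column = 'osm_id' },
    columns = {
        { column = 'tags', type = 'hstore' },
        { column = 'geom', type = 'geometry' },
    }
}
```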

Most tables will have a geometry column. (Currently only zero or one geometry
columns are supported.) The possible types of the geometry column depend on
the type of the input data. For node tables you are pretty much restricted
to point geometries, but relation tables, for instance, offer a variety of
options.

Supported geometry types:
* `geometry`: Any kind of geometry. Also used for area tables that should hold
both polygon and multipolygon geometries.
* `point`: Point geometry, usually created from nodes.
* `linestring`: Linestring geometry, usually created from ways.
* `polygon`: Polygon geometry for area tables, created from ways or relations.
* `multipoint`: Currently not used.
* `multilinestring`: Created from (possibly split up) ways.
* `multipolygon`: For area tables, created from ways or relations.

The only thing you have to do here is to define the geometry type you want and
osm2pgsql will create the right geometry for you from the OSM data and fill it
in.
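
A sketch of what that looks like in practice, assuming a way table with a `linestring` column and an area table with a `geometry` column. The table names are illustrative, and `data.is_closed` is used the same way as in `advanced.lua`. Neither `add_row()` call passes a geometry.

```lua
local lines = osm2pgsql.define_way_table('lines', {
    { column = 'tags', type = 'hstore' },
    { column = 'geom', type = 'linestring' },
})

local areas = osm2pgsql.define_area_table('areas', {
    { column = 'tags', type = 'hstore' },
    -- "geometry" so the column can hold polygons and multipolygons.
    { column = 'geom', type = 'geometry' },
})

function osm2pgsql.process_way(data)
    if data.is_closed then
        -- Closed ways go to the area table.
        areas:add_row({ tags = data.tags })
    else
        -- Everything else is stored as a line.
        lines:add_row({ tags = data.tags })
    end
end
```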

In addition to id and geometry columns, each table can have any number of
"normal" columns using any type supported by PostgreSQL. Some types are
specially recognized by osm2pgsql, which adds some support for them. But
you can use any SQL type you want, in which case you have to make sure you are
creating the right text format for these columns.

Available column types:
* `text`: Text string
* `boolean`: Interprets values `"true"`, `"yes"` as `true` and everything else
as `false`
* `int2`, `smallint`: 16-bit signed integer
* `int4`, `int`, `integer`: 32-bit signed integer
* `int8`, `bigint`: 64-bit signed integer
* `real`: A real number
* `hstore`: Can be created automatically from a Lua table
* `json` and `jsonb`: Not supported yet
* `direction`: Interprets values `"true"`, `"yes"`, and `"1"` as 1, `"-1"` as
`-1`, and everything else as `0`. Useful for `oneway` tags etc.
* `area`: The area of the (polygon) geometry.
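
As an illustration, a way table mixing several of these column types might be defined as below. The table and column names are made up for the example, and passing raw tag values to `add_row()` for the `text` and `direction` columns is an assumption about how the conversions listed above are applied.

```lua
local roads = osm2pgsql.define_way_table('roads', {
    { column = 'name', type = 'text' },
    { column = 'oneway', type = 'direction' },
    { column = 'tags', type = 'hstore' },
    { column = 'geom', type = 'linestring' },
})

function osm2pgsql.process_way(data)
    -- Only ways tagged as highways end up in this table.
    if not data.tags.highway then
        return
    end

    roads:add_row({
        name = data.tags.name,     -- stored as text
        oneway = data.tags.oneway, -- e.g. "yes" or "-1", see `direction` above
        tags = data.tags,          -- the whole Lua table as hstore
    })
end
```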


## Command line options

Use the command line option `-O flex` or `--output=flex` to enable the flex
backend and the `-S|--style` option to set the Lua config file.

The following command line options have a somewhat different meaning when
using the flex backend:

* `-p|--prefix`: The table names you are setting in your Lua config files
will *not* get this prefix.
* `-S|--style`: Use this to specify the Lua config file. Without it, osm2pgsql
will not work, because it will try to read the default style file which
the flex backend doesn't understand.
* `-G|--multi-geometry` is not used. Set the column type of the output table
to the type you want instead, for instance `polygon` vs. `multipolygon`.

The following command line options are ignored by `osm2pgsql` when using the
flex backend, because they don't make sense in that context:

* `-k|--hstore`
* `-j|--hstore-all`
* `-z|--hstore-column`
* `--hstore-match-only`
* `--hstore-add-index`
* `-K|--keep-coastlines` (Coastline tags are not handled specially in the
flex backend.)
* `--tag-transform-script` (Set the Lua config file with the `-S|--style`
option.)

## Example config files

This directory contains example config files for the flex backend. All config
files contain comments as documentation.

If you are learning about the flex backend, read the config files in the
following order (from easiest to understand to the more complex ones):

1. [simple.lua](simple.lua)
2. [multipolygons.lua](multipolygons.lua)
3. [advanced.lua](advanced.lua)
4. [highway-shields.lua](highway-shields.lua)
5. [unitable.lua](unitable.lua)

166 changes: 166 additions & 0 deletions flex-config/advanced.lua
@@ -0,0 +1,166 @@

-- Read and understand simple.lua and multipolygons.lua first, before you try
-- to understand this file.

inspect = require('inspect')

print("osm2pgsql version: " .. osm2pgsql.version)

-- Are we running in "create" or "append" mode?
print("osm2pgsql mode: " .. osm2pgsql.mode)

-- Which stage in the data processing is this?
print("osm2pgsql stage: " .. osm2pgsql.stage)

-- Uncomment the following line to see the userdata (but, careful, it might be
-- a lot of data)
-- print("osm2pgsql userdata: " .. inspect(osm2pgsql.userdata))

tables = {}

tables.pois = osm2pgsql.define_node_table("pois", {
    { column = 'tags', type = 'hstore' },
    { column = 'geom', type = 'point' },
})

tables.ways = osm2pgsql.define_way_table("ways", {
    { column = 'tags', type = 'hstore' },
    { column = 'geom', type = 'linestring' },
})

-- Using the define_table function allows some more control over the id columns
-- than the more convenient define_(node|way|relation|area)_table functions.
-- In this case we are setting the name of the id column to "osm_id".
tables.polygons = osm2pgsql.define_table{
    name = "polygons",
    ids = { type = 'area', id_column = 'osm_id' },
    columns = {
        { column = 'tags', type = 'hstore' },
        { column = 'geom', type = 'geometry' },
    }
}

-- A table for all route relations
tables.routes = osm2pgsql.define_relation_table("routes", {
    { column = 'tags', type = 'hstore' },
    { column = 'geom', type = 'multilinestring' },
})

-- A table for all individual members of route relations
-- (Note that this script doesn't handle ways in multiple relations correctly.)
tables.route_members = osm2pgsql.define_table{
    name = "route_members",
    ids = { type = 'way', id_column = 'way_id' },
    columns = {
        { column = 'rel_id', type = 'int8' }, -- not a specially handled id column
        { column = 'tags', type = 'hstore' }, -- tags from member way
        { column = 'role', type = 'text' }, -- role in the relation
        { column = 'rtags', type = 'hstore' }, -- tags from relation
        { column = 'geom', type = 'linestring' },
    }
}

function is_empty(some_table)
    return next(some_table) == nil
end

function clean_tags(tags)
    tags.odbl = nil
    tags.created_by = nil
    tags.source = nil
    tags["source:ref"] = nil
    tags["source:name"] = nil
end

function osm2pgsql.process_node(data)
    clean_tags(data.tags)
    if is_empty(data.tags) then
        return
    end

    tables.pois:add_row({
        tags = data.tags
    })
end

function osm2pgsql.process_way(data)
    -- print(inspect(data))

    clean_tags(data.tags)
    if is_empty(data.tags) then
        return
    end

    -- osm2pgsql.stage: either 1 or 2 for first or second pass through the data
    if osm2pgsql.stage == 2 then
        local row = {
            rel_id = 0,
            tags = data.tags,
            role = '',
            rtags = {},
        }
        -- Look up the relation data stored for this way in the first stage
        local member_data = osm2pgsql.userdata.w2r[data.id]
        if member_data then
            row.rel_id = member_data.rel_id
            row.role = member_data.role
            row.rtags = osm2pgsql.userdata.route_tags[row.rel_id]
        end
        -- print(inspect(row))
        tables.route_members:add_row(row)
        return
    end

    if data.is_closed then
        tables.polygons:add_row({
            tags = data.tags
        })
    else
        tables.ways:add_row({
            tags = data.tags
        })
    end
end

function osm2pgsql.process_relation(data)
    -- print(inspect(data))

    clean_tags(data.tags)
    if is_empty(data.tags) then
        return
    end

    if data.tags.type == 'multipolygon' or data.tags.type == 'boundary' then
        tables.polygons:add_row({
            tags = data.tags
        })
    elseif data.tags.type == 'route' and data.tags.route == 'hiking' then
        tables.routes:add_row({
            tags = data.tags
        })

        if not osm2pgsql.userdata.route_tags then
            osm2pgsql.userdata.route_tags = {}
        end

        if not osm2pgsql.userdata.w2r then
            osm2pgsql.userdata.w2r = {}
        end

        osm2pgsql.userdata.route_tags[data.id] = data.tags

        -- Go through all the members...
        for i, member in ipairs(data.members) do
            if member.type == 'w' then
                -- Mark the member way as "interesting", the "process_way"
                -- callback will be triggered again in the second stage
                osm2pgsql.mark('w', member.ref)
                -- print("mark way id " .. member.ref)
                osm2pgsql.userdata.w2r[member.ref] = {
                    rel_id = data.id,
                    role = member.role,
                }
            end
        end
    end
end
