Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad label placement at high latitudes due to coordinate calculations in mercator projection #4241

Closed
swedneck opened this issue Nov 1, 2020 · 17 comments · Fixed by #4616
Closed

Comments

@swedneck
Copy link

swedneck commented Nov 1, 2020

Expected behavior

The label for norway (Norge) renders in the middle of mainland norway, next to sweden.

Actual behavior

The label for norway is rendered in the middle of Svalbard, but not mainland norway.

Links and screenshots illustrating the problem

2020-11-01_20-57-11

@jeisenbe
Copy link
Collaborator

jeisenbe commented Nov 1, 2020

This behavior happens because the location of the label is based on ST_PointOnSurface which tries to find the largest polygon in a multipolygon composed of more than one separate area. The area of Svalbard in the Mercator projection (including the nearby ocean areas which are part of the territory) appears to be larger than the area of the mainland part of Norway, and I believe this means that the algorithm selects the center of that area instead.

We could solve the particular problem in this issue by altering the rules for the labels of countries at low zoom level to use a more complicated algorithm, but this might be confusing mappers and to people who want to adapt this map style.

@pnorman
Copy link
Collaborator

pnorman commented Nov 2, 2020

I've had some success doing label calculations for countries in WGS84, finding better results for northern countries.

@imagico
Copy link
Collaborator

imagico commented Nov 2, 2020

Another approach would be to to distinctly transform into a polar equal area projection for geometries at higher latitudes, kind of like _ST_BestSRID() - which would not work because it would fall back to Mercator for larger geometries. Something like

ST_Transform(
  ST_PointOnSurface(
    ST_Transform(
      way, 
      CASE 
        WHEN (ST_YMax(way) > 9000000) AND (ST_YMin(way) > 0) THEN 3408 
        WHEN (ST_YMin(way) < -9000000) AND (ST_YMax(way) < 0) THEN 3409 
        WHEN (ST_YMax(way) <= 9000000) AND (ST_YMin(way) >= -9000000) THEN 900913
        ELSE 4326
      END
    )
  ), 900913)

@imagico
Copy link
Collaborator

imagico commented Nov 2, 2020

For Norway that would of course fall back to EPSG:4326 as well because of Bouvetøya. 😏

@StyXman
Copy link
Contributor

StyXman commented Jun 14, 2021

Notice that CyclOSM and Humanitarian styles use https://www.openstreetmap.org/node/424297198 instead (but they both use different places for Sweden's name!).

@imagico
Copy link
Collaborator

imagico commented Jun 14, 2021

We don't want to support mappers hand placing labels for Mercator maps - even if that is a convenient solution for other map providers. So a good, scalable and sustainable solution for this problem would need to look different. For simple point labels the approach to do label positioning in a projection with locally low shape and scale distortion is generally a sound one.

@imagico imagico changed the title Norge label rendered on svalbard instead of mainland Bad label placement at high latitudes due to coordinate calculations in mercator projection Nov 28, 2021
@pnorman
Copy link
Collaborator

pnorman commented Jul 10, 2022

I looked at some options

Point selection Coordinates Location
ST_PointOnSurface(way) POINT(16.26443388617199 78.88549837213904) Svalbard
ST_PointOnSurface(ST_Transform(way,4326)) POINT(16.60557407083153 78.51240264999998) Svalbard
ST_PointOnSurface(ST_Transform(way,_ST_BestSRID(ST_Transform(way,4326)))) POINT(16.264433169749058 78.88549837224912) Svalbard
ST_PointOnSurface(ST_Transform(way, 3408)) POINT(17.436842360057945 78.72907520088494) Svalbard

If I manually pick EPSG:3049, I get mainland Norway, but PostGIS isn't picking a _ST_BestSRID that has Svalbard having a smaller area than mainland Norway. This is because Bouvetøya is in the southern hemisphere and it has to use mercator as a fallback.

I tried slicing up the multipolygon st_dump and looking at the areas of each after transforming into _st_bestsrid, but Svalbard was still the largest.

It should be possible to ST_Dump a MP, transform each part into an appropriate projection, calculate the area, pick the largest, calculate the label point for that part, and transform that back into Mercator.

@pnorman
Copy link
Collaborator

pnorman commented Jul 10, 2022

I had a go at doing a transformation into 6933, pointonsurface, then back to web mercator. It improved labels for Canada, Australia, Greenland, Brasil, Chile, China, Japan, and perhaps Russia. It needed a new partial index for performance.

@imagico
Copy link
Collaborator

imagico commented Jul 10, 2022

I tried slicing up the multipolygon st_dump and looking at the areas of each after transforming into _st_bestsrid, but Svalbard was still the largest.

ST_Area() on geography does not use _ST_BestSRID(), it directly calculates the spherical area. I think that is more elegant than using an equal-area projection.

@pnorman
Copy link
Collaborator

pnorman commented Jul 10, 2022

I tried slicing up the multipolygon st_dump and looking at the areas of each after transforming into _st_bestsrid, but Svalbard was still the largest.

ST_Area() on geography does not use _ST_BestSRID(), it directly calculates the spherical area. I think that is more elegant than using an equal-area projection.

Ya, but I really don't want to get into the complexity of dumping multipolygons, picking the largest, and operating on it, if there's any other way. In particular since it needs to work with a partial index for performance reasons.

Svaldbard isn't the only bad label placement - Canada, Greenland, and Russia all place the labels pretty far north. They're picking the right polygon in the MP, the problem is the centroid calculation in 3857 gives a poor result.

@imagico
Copy link
Collaborator

imagico commented Jul 10, 2022

By the way: EPSG:6933 is Cylindrical equal-area with standard parallels at 30°. My understanding is that when you determine ST_PointOnSurface() in that you will get a distortion in the opposite direction of Mercator (i.e. towards lower latitudes), least around 30 degrees latitude, higher the further you move away from that.

My understanding of how ST_PointOnSurface() works is also that you will get no such systematic distortion (but of course still the inevitable awkwardness in how ST_PointOnSurface() works in general) if you calculate it in EPSG:4326 because that is equidistant in latitude.

@pnorman
Copy link
Collaborator

pnorman commented Jul 10, 2022

If the centroid is within the polygon, ST_PointOnSurface is the same as ST_Centroid. All projections distort. In general, the centroid in one projection is going to be different from the centroid in another projection, and the same is true with EPSG:4326 and EPSG:3857.

@pnorman
Copy link
Collaborator

pnorman commented Jul 10, 2022

6933
image

4326
image

I think I see how I can get picking the largest polygon in a MP into a suitable expression, so I'm going to look at that.

@pnorman
Copy link
Collaborator

pnorman commented Jul 10, 2022

SELECT 
(SELECT ST_AsText(ST_Transform(ST_PointOnSurface(geom),4326)) FROM ST_Dump(way) ORDER by ST_Area(ST_Transform(geom,4326)::geography) DESC LIMIT 1)
FROM planet_osm_polygon WHERE osm_id = -2978650;

This returns 12.519591021864175 65.43191663124402
image

If I go and transform the part into _ST_BestSRID with

SELECT ST_AsText(ST_Transform(way,4326)) FROM (
SELECT 
(SELECT ST_PointOnSurface(ST_Transform(geom, _ST_BestSRID(ST_Transform(geom,4326))))
FROM ST_Dump(way)
    ORDER by ST_Area(ST_Transform(geom,4326)::geography) DESC LIMIT 1) AS way
FROM planet_osm_polygon WHERE osm_id = -2978650 ) AS q;

I get a better point

image, which is approaching where I would manually place a label1.

Unfortunately, this doesn't help with Canada, where it still places the label in the same spot.

Footnotes

  1. Actually, if manually placing labels, I'd probably write it on an angled line

@imagico
Copy link
Collaborator

imagico commented Jul 10, 2022

All projections distort.

That is too simplistic i think. Just because any projection on a plane is subject to some form of distortion does does not mean you cannot quantitatively distinguish between lower and higher amounts of distortion or identify systematic biases.

Anyway, your samples seem to confirm that cylindrical equal-area will give you a bias in the opposite direction of Mercator. The bias of course also depends on the latitude extent of the polygon. The Japan label is weird, it seems to jump between Honshu and Hokkaido.

If the centroid is within the polygon, ST_PointOnSurface is the same as ST_Centroid.

That is not the case i think, AFAICS PostGIS uses GEOS for ST_PointOnSurface. I tested a few simple convex shapes and the results of ST_PointOnSurface and ST_Centroid differ.

PostGIS seems to support ST_Centroid on the sphere without projection so if you want to use that you can.

@pnorman
Copy link
Collaborator

pnorman commented Jul 10, 2022

That is not the case i think, AFAICS PostGIS uses GEOS for ST_PointOnSurface. I tested a few simple convex shapes and the results of ST_PointOnSurface and ST_Centroid differ.

Looking into it, yes, ST_PointOnSurface differs. They still both use GEOS.

ST_Centroid on the sphere doesn't really help since the centroid can be outside the polygon.

Taking the above work with picking the largest true-area polygon, I looked at both pointonsurface and PIA (MaximumInscribedCircle center) in 4326, 3857, and _st_bestsrid.

As far as I can see, except for Japan and Norway, PointOnSurface in 3857 is what we have now.

PointOnSurface in bestsrid is slightly different for most countries. Importantly, it's unchanged for Canada, Russia, and China. It is improved for Greenland, Norway, Ukraine, and Laos. For other countries with just minor changes, they seem to be slightly better but most of them had good label placement to start with.

PointOnSurface in 4326 should differ at least slightly for every country. In most countries with a uniform north-south distribution of land, it is basically unchanged. Canada, Russia, Greenland, and Laos are substantially improved.

PIA is interesting, but requires PostGIS 3.1+ and GEOS 3.9+, so I would consider it for 6.x.

In all three projection items, the points have significantly shifted.

For 3857, I found Russia, Canada, Norway, the UK, and Laos significantly better. Greenland is worse - that Mercator distortion is awful with it. New Zealand is perhaps worse - the dumbbell shape is hard to label.

4326 PIA isn't much different for most countries. It's better for Greenland, perhaps worse for Canada.

bestsrid PIA is close to PIA for 3857 in most countries. Greenland is hugely improved, no surprise with the mercator distortions gone. Canada and Russia are basically unchanged between the two.

I've attached the geojson files, shell scripts, and made a umap map with layers you can toggle on and off at https://umap.openstreetmap.fr/en/map/label-placement-strategies_786297

point_3857.zip


For 6.x, I would go with PIA without question. PIA in 3857 is likely to give the best placement for keeping the text within the boundary, but I would go with bestsrid. They're basically the same for most countries except Greenland, where 3857 is horrible.

For 5.x, I find it a tossup between bestsrid and 4326.


If I were doing automatic points and complexity weren't an issue and I wasn't considering other labels like city labels, I would probably go with something like

  • Take largest true-area polygon
  • Cut off all water
  • Take convex hull (or perhaps polygon hull or concave hull)
  • Subtract areas that are in any other countries
  • Find PIA

This is beyond what I would want to do in this style.

@imagico
Copy link
Collaborator

imagico commented Jul 11, 2022

What is important to keep in mind is that _st_bestsrid is designed for small geometries, by default using utm projection if a geometry does not have too large an extent in longitude. This does not function very well here with larger geometries like country boundaries.

The big advantage of ST_MaximumInscribedCircle compared to ST_PointOnSurface is not necessarily that it is better for label placement (it is not, look for example at Chile, Pakistan, Vietnam or, as you mentioned, New Zealand) but that it is invariant against rotation of the coordinate system. If you try - instead of _st_bestsrid - to work in polar coordinate system for higher latitudes (like i sketched in #4241 (comment)) you will get very similar results with ST_PointOnSurface for equal area and conformal projections (e.g. EPSG:3408 vs. EPSG:3995) but you will get huge differences for different rotations (e.g. EPSG:3995 vs. EPSG:3413). That would make sense if ST_PointOnSurface was specifically optimized for placing horizontal labels - but it is not, so this is just annoying.

Practically i would probably go for the simplest solution for the specific problem of this issue (that would be ST_PointOnSurface on EPSG:4326 i think) and afterwards see if we can clearly identify and formulate the most significant remaining problems with use of ST_PointOnSurface for labeling and then develop a solution for that under the constraints we have, preferably one that is also usable for the other labels we do based on ST_PointOnSurface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants