Bugfix/refactor area calculation #1699
I've been testing this PR against a polygon drawn about this rectangular building: https://www.openstreetmap.org/#map=17/60.14876/24.92242&layers=TN
This looks ok so far -- we should test in isolation on staging (migration, trigger changes, calculation). I'll leave it to @oliverroick to decide whether we revert the original area commit, test the whole feature in staging, then do a patch release next week, or push the sprint release date by a day or two to accommodate the additional testing.
Let's push the release by a few days and test it next week. I also added another commit that fixes the failing tests. Since we're using geographies now, some spatial lookups are no longer supported. You can cast the geography back to a geometry and run the spatial lookups, which is what I did.
location = duplicate.spatial_units.annotate(
    geom=Cast('geometry', GeometryField())
).get(geom=geom, **attrs)
👍 Thanks for the fixup! I was struggling to understand where the ~= operator was being used. It makes sense that it was this .get call.
Btw, here's a little helper to run from the command line to verify the triggers were installed on your db:
sudo su postgres -c "
psql -c \"
SELECT event_object_table
      ,trigger_name
      ,event_manipulation
      ,action_statement
      ,action_timing
FROM  information_schema.triggers
WHERE event_object_table = 'spatial_spatialunit'
ORDER BY event_object_table
        ,event_manipulation;
\" cadasta "
Output when run as (env)vagrant@vagrant-ubuntu-trusty-64:/vagrant/cadasta$:
 event_object_table  |      trigger_name      | event_manipulation |          action_statement          | action_timing
---------------------+------------------------+--------------------+------------------------------------+---------------
 spatial_spatialunit | calculate_area_trigger | INSERT             | EXECUTE PROCEDURE calculate_area() | BEFORE
 spatial_spatialunit | calculate_area_trigger | UPDATE             | EXECUTE PROCEDURE calculate_area() | BEFORE
(2 rows)
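The listing shows a trigger named calculate_area_trigger that fires BEFORE INSERT and BEFORE UPDATE on spatial_spatialunit and calls calculate_area(). As a rough sketch only, SQL that would produce a listing like this might look as follows, kept as Python strings so it could be handed to Django's migrations.RunSQL. Only the trigger, function and table names come from the listing; the function body, the column names geometry and area, and the build_operations helper are assumptions for illustration and not necessarily the code in this PR.

```python
# Hypothetical sketch: names of the trigger, function and table are taken
# from the information_schema listing; everything else is an assumption.

CREATE_FUNCTION_SQL = """
CREATE OR REPLACE FUNCTION calculate_area() RETURNS trigger AS $$
BEGIN
    -- Assumed columns: "geometry" (geography) and "area" (numeric).
    IF NEW.geometry IS NOT NULL
       AND GeometryType(NEW.geometry::geometry) IN ('POLYGON', 'MULTIPOLYGON')
    THEN
        -- ST_Area on a geography returns square meters.
        NEW.area := ST_Area(NEW.geometry);
    ELSE
        NEW.area := NULL;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
"""

CREATE_TRIGGER_SQL = """
CREATE TRIGGER calculate_area_trigger
    BEFORE INSERT OR UPDATE ON spatial_spatialunit
    FOR EACH ROW EXECUTE PROCEDURE calculate_area();
"""

DROP_SQL = """
DROP TRIGGER IF EXISTS calculate_area_trigger ON spatial_spatialunit;
DROP FUNCTION IF EXISTS calculate_area();
"""


def build_operations():
    """Return migration operations installing the trigger (hypothetical helper)."""
    # Imported lazily so the SQL strings can be inspected without Django.
    from django.db import migrations

    return [
        migrations.RunSQL(
            sql=CREATE_FUNCTION_SQL + CREATE_TRIGGER_SQL,
            reverse_sql=DROP_SQL,
        )
    ]
```

Because the trigger runs BEFORE the row is written, it can assign NEW.area directly, which is why it would also apply to rows written by bulk operations that bypass model signals.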
@oliverroick @amplifi any opposition if I simplify that signal a bit?
I've done some area testing in Helsinki and Singapore and the areas are comparable to the areas computed using QGIS. I get a difference in area of −1.3% in Singapore (~1°N) to +0.6% in Helsinki (~60°N). The difference is probably caused by using the spherical version of I guess this is OK as it is. We just need to be clear to our partners that the area values should be considered only approximate and not to be taken as gospel truth. |
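To get a feel for how much a spherical model can differ from a spheroidal one at these latitudes, here is a minimal sketch. It compares the local area element of the WGS84 ellipsoid (the product of the meridional and prime-vertical radii of curvature) with that of a sphere. The sphere radius of 6371008.8 m and the function name are assumptions, and a given tool may use a different radius. The sketch does not attempt to reproduce the percentages reported above; it only shows that the discrepancy depends on latitude and stays well under 1%.

```python
import math

# WGS84 ellipsoid parameters.
WGS84_A = 6378137.0              # semi-major axis in meters
WGS84_F = 1 / 298.257223563      # flattening

# Assumption: mean Earth radius used for the spherical model.
MEAN_RADIUS = 6371008.8


def spheroid_to_sphere_area_ratio(lat_deg, sphere_radius=MEAN_RADIUS):
    """Ratio of a small patch's area on the ellipsoid to the same patch on a sphere."""
    e2 = WGS84_F * (2 - WGS84_F)                      # first eccentricity squared
    sin2 = math.sin(math.radians(lat_deg)) ** 2
    w = math.sqrt(1 - e2 * sin2)
    meridional = WGS84_A * (1 - e2) / w ** 3          # radius of curvature, north-south
    prime_vertical = WGS84_A / w                      # radius of curvature, east-west
    return (meridional * prime_vertical) / sphere_radius ** 2


if __name__ == "__main__":
    for name, lat in (("Singapore", 1.0), ("Helsinki", 60.0)):
        ratio = spheroid_to_sphere_area_ratio(lat)
        print(f"{name} ({lat:.0f} deg N): spheroid vs sphere {100 * (ratio - 1):+.2f}%")
```

Under these assumptions the ratio is below 1 near the equator and above 1 at 60°N, so the sign of the discrepancy flips with latitude.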
I tested this on staging and it should be good to go from my end.
Let's do that after the code is shipped.
@seav Thanks for the testing. Reading through the link you've added:
We are currently doing calculations on a geography and not turning off the spheroidal calculation, so we are currently returning the most accurate calculation via spheroid (not sphere) that we can do via PostGIS tooling at this given time, right? Would you agree with this statement? The only way we could improve it would rely on the next bit of docs:
We should make sure that we're using the latest version of PostGIS (and thus Proj) to ensure that the calculation is as accurate as possible.
And regarding the slight discrepancies, I would note that I never got a unified result for area calculation throughout my experimentation/development of this feature. As shown in the first comment, PostGIS and Python had slightly different results. QGIS also had slightly different results. Definitely not as bad as ~1%, but they all varied slightly. My guess is that each tool employs slightly different algorithms and the results show this. I am not sure which one is more correct than the others; however, if I were to place a bet, I'd put it on the PostGIS/Proj solution.
Based on the documentation, I guess PostGIS should be using spheroidal calculation. Anyway, I did some more testing and I took your branch, added |
2f299e7 to 511c4dc
Okay, #1709 is available if we want to simplify the signal slightly.
Proposed changes in this pull request
This PR resolves the area calculation issues brought up in #1689.
As mentioned by @oliverroick, there are two changes that need to be made:
1. The area field for all currently existing location instances in our DB.
2. How the area field will be calculated for any location instances saved in the future.
To address the first question, a new migration has been added to instruct the DB to correctly calculate the area value for all rows with a geometry type of POLYGON or MULTIPOLYGON. This will overwrite the currently existing area values. It should be decently performant, faster at least than calculating this data in Python.
To address the second question (calculating area for new SpatialUnit rows), I deleted the currently existing pre_save signal and moved the logic to the DB as a DB trigger. This was done because calculating the area via Python produced slightly different values than via PostGIS. This would mean that the values calculated via DB migration would shift slightly the next time that the SpatialUnit instance was saved. The advantage of using a trigger is that we ensure that the area will be calculated even when using the bulk_create and update methods. The downside is that the DB-calculated value is not set on the object when the object is saved. To get around this, I wired up a post_save signal to retrieve this value and add it to the instance's area field if it should have been generated.
Other changes:
For some reason, MultiPolygons didn't have their area calculated. I've updated the code to support that geometry type.
The SpatialRelationshipManager was checking if one SpatialUnit contained another. This was done via a DB query. Using geography=True on the SpatialUnit.geometry field removed the ability to query via contains. I did a quick refactor to do the containment check in Python via GEOS.
When should this PR be merged
No dependencies.
Risks
None to mention.
Follow-up actions
[List any possible follow-up actions here; for instance, testing data
migrations, software that we need to install on staging and production
environments.]
Checklist (for reviewing)
General
migration label if a new migration is added.
Functionality
Code
Tests
Security
Documentation