Update habitatmap_stdized and habitatmap_terr #65

Merged
merged 57 commits into from
Sep 27, 2024

Conversation

Contributor

@ToonHub ToonHub commented Mar 4, 2024

I updated habitatmap_stdized and habitatmap_terr based on habitatmap_2023.
I could run most of the original code without any problems.
You can find the rendered HTML files below.

generating_habitatmap_terr.zip
generating_habitatmap_stdized.zip

Once accepted, I will upload the files to Zenodo.

This controls the behaviour of the pipe shortcut key in current RStudio versions.

This project was already using the magrittr pipe.
Member

@florisvdh florisvdh left a comment

Thanks @ToonHub for taking up this work! 👏

The resulting files are perfectly usable and closely align with the setup of the 2020_v1 versions. Some of the comments below will still lead to updates of the files, though.

Also, I suggest waiting for the upcoming work by @cecileherr in this repo + in {n2khab} before uploading the results to Zenodo (hence leave this PR unmerged), just in case she comes across more things that need to be done here. And she may want to review this PR too, as she did in PR #50.

Some of the comments below apply to parts that already existed before, but which I suggest altering.

Project generate_habitatmap_stdized

  • I've sent PR #67 that goes into this one, essentially with some checks added.

  • I think in the following sentence the commas must be underscores (which have a different meaning here), am I right?

    An exception to this rule are following codes: 3130,rbbmr, 3140,rbbmr, 3150,rbbmr and 3160,rbbmr.

Project generate_habitatmap_terr

  • suggestion to apply the same updates as in PR #67.
  • see issue #66 : if code will be added to create an update of habitatmap_terr_2020_v1 first, then that code must be merged into this branch too, even though it won't have an effect on 2023_v1. This is to make sure such an update (adding a summarize() step) would indeed have no effect, and because it would remain in place for future versions too.

Both projects

  • Going forward, we should fix invalid and corrupt geometries in processed data sources.
    • See issue #60, and the way this has been anticipated in read_watersurfaces() (see the fix_geom argument).
    • While read_habitatmap() has not yet been updated to provide this on-the-fly (and it won't be the default), in the two projects you could use some code from read_watersurfaces() to update the polygons coming from habitatmap_2023. EDIT: geometry fixing will only be needed in generate_habitatmap_stdized. Project generate_habitatmap_terr uses the geometries from habitatmap_stdized, so no need for more geometry fixing.
  • Setup chunk: please update the ISO8601 timestamp for the GPKG driver to a 2024 date. The setting stores a reproducible (hence fixed) timestamp in the GeoPackage, but it's still at a 2021 date.
  • Regarding storage of different versions, there's inbo/n2khab#113 which needs further handling.
    • Whatever strategy will eventually be applied in {n2khab}, I wouldn't add custom subdirectories inside a data source directory as the latter already contains the currently used data source: this approach does not clearly separate multiple data sources. IMO a (default) data source directory should just contain one version of the data source; nothing more since extras would (by convention) also be part of that data source.
    • Keeping other versions outside (e.g. next to) the default directory does not need your hack to filter out those subdirectories.
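
For the timestamp item above, a minimal sketch of what such a setup-chunk setting could look like. It assumes the setting is GDAL's `OGR_CURRENT_DATE` configuration option, which the GPKG driver uses to write a fixed `last_change` value; the thread does not name the option, and the date used here is a hypothetical placeholder, not the value chosen in this PR.

```r
# Assumption: the reproducible timestamp is set through GDAL's
# OGR_CURRENT_DATE configuration option, passed as an environment variable.
# The date below is a hypothetical placeholder.
gpkg_timestamp <- "2024-01-01T00:00:00.000Z"

# The GPKG driver expects the form YYYY-MM-DDTHH:MM:SS.fffZ,
# so check the format before setting it.
timestamp_pattern <- paste0(
  "^[0-9]{4}-[0-9]{2}-[0-9]{2}",
  "T[0-9]{2}:[0-9]{2}:[0-9]{2}[.][0-9]{3}Z$"
)
stopifnot(grepl(timestamp_pattern, gpkg_timestamp))

# Set it before any GeoPackage is written in the session.
Sys.setenv(OGR_CURRENT_DATE = gpkg_timestamp)
```

Because the value is fixed, regenerating the GeoPackage from unchanged input should not alter the file merely because the writing date differs.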


The ID of each polygon in the habitatmap contains the year in which the polygon
was last updated.
In the table below we show the number of records per type and per update year.
Member

👍

@ToonHub
Contributor Author

ToonHub commented Mar 12, 2024

I accepted the suggestions by @florisvdh in #67 for habitatmap_stdized and also applied them to habitatmap_terr.

I also updated the ISO8601 timestamp for both gpkg files.

@florisvdh
Member

florisvdh commented Mar 14, 2024

Thanks @ToonHub !

@cecileherr when you start your work in this repo, could you pick up these remaining steps too? See above for explanation.

  • remove the hack to filter out subdirectories of older versions (for version handling, favour an approach that is compatible with current default instead)
  • fix geometries in project generate_habitatmap_stdized
  • create a version habitatmap_terr_2020_v2 and merge the summarize step in here
  • include the solution for #68, regarding rbbvos+

@cecileherr
Collaborator

About fixing geometries in habitatmap (the raw data) with a new fix_geom:

I would have liked to add a check of the geometries (detect invalid polygons, visualize them on a map, try to correct them - and see how long it takes - and check the corrected layer) somewhere, so we could see what the impact is of fixing the geometries with read_habitatmap(fix_geom = TRUE) for each new version of habitatmap.
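
A minimal sketch of such a check, using {sf} on toy polygons instead of the real habitatmap layer. The object names and the toy data are hypothetical, and `sf::st_make_valid()` is used here as a stand-in for whatever `fix_geom = TRUE` would do internally.

```r
library(sf)

# Toy data (hypothetical): a self-intersecting "bow-tie" polygon
# next to a valid square.
bowtie <- st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0))))
square <- st_polygon(list(rbind(c(2, 0), c(3, 0), c(3, 1), c(2, 1), c(2, 0))))
polygons <- st_sf(
  polygon_id = c("a", "b"),
  geometry = st_sfc(bowtie, square)
)

# Detect invalid polygons; compute validity once and reuse the result.
# NA (geometry too corrupt to evaluate) is treated as invalid as well.
valid_before <- st_is_valid(polygons)
is_invalid <- is.na(valid_before) | !valid_before
invalid_polygons <- polygons[is_invalid, ]

# Try to correct them and record how long it takes.
timing <- system.time(
  polygons_fixed <- st_make_valid(polygons)
)

# Check the corrected layer.
valid_after <- st_is_valid(polygons_fixed)
message(
  sum(is_invalid), " invalid polygon(s) before fixing, ",
  sum(is.na(valid_after) | !valid_after), " after; elapsed: ",
  round(timing[["elapsed"]], 3), " s"
)
```

`invalid_polygons` can then be plotted on a map (for example with `plot(st_geometry(invalid_polygons))`) to see where the problems are located.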

Unfortunately (but logically) there is no generate_habitatmap project for this layer. Shall I just use habitatmap.Rmd under n2khab-preprocessing\src\miscellaneous ? @florisvdh: or do you have an alternative suggestion?

@florisvdh
Member

Shall I just use habitatmap.Rmd under n2khab-preprocessing\src\miscellaneous ?

Yes, that would be the most logical place indeed (that directory is mostly about exploration & discussion in preparation of further steps).

The file's current contents seem to be checks of habitatmap_stdized: this may be superfluous due to overlap with the generate_habitatmap_stdized project (and it also contradicts the filename), hence that part could be dropped.

Good idea to do the checks that you propose BTW.

ToonHub and others added 5 commits September 3, 2024 09:12
apply suggestions by florisvdh

Co-authored-by: Floris Vanderhaeghe <[email protected]>
- calculate st_is_valid only once
- remove a reference to fix_geom in the section about the object
Generate habitatmap_stdized, habitatmap_terr, watersurfaces_hab 2023: a few tweaks
@cecileherr cecileherr merged commit e3ccb73 into main Sep 27, 2024
@cecileherr cecileherr deleted the update_habitatmap_stdized_habitatmap_terr branch September 27, 2024 08:27