-
Notifications
You must be signed in to change notification settings - Fork 1
/
aoh.Rmd
250 lines (185 loc) · 19.7 KB
/
aoh.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
---
title: "Getting started"
date: "`r Sys.Date()`"
output:
rmarkdown::html_vignette:
toc: true
fig_caption: true
self_contained: yes
fontsize: 11pt
documentclass: article
bibliography: references.bib
csl: reference-style.csl
vignette: >
%\VignetteIndexEntry{Getting started}
%\VignetteEngine{knitr::rmarkdown_notangle}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
# figure sizes
h <- 3.5
w <- 3.5
# detect if vignette being built under package check
is_check <- ("CheckExEnv" %in% search()) || any(c("_R_CHECK_TIMINGS_",
"_R_CHECK_LICENSE_") %in% names(Sys.getenv()))
# set default chunk settings
knitr::opts_chunk$set(fig.align = "center", eval = !is_check)
```
```{r, include = FALSE}
devtools::load_all()
```
## Introduction
Area of Habitat (AOH) maps aim to delineate the spatial distribution of suitable habitat for a species [@r1]. They are used to assess performance of protected area systems, measure impacts of threats to biodiversity, and identify priorities for conservation actions [@r2; @r3; @r7]. These maps are generally produced by obtaining geographic range data for a species, and then removing areas that do not contain suitable habitat or occur outside the known elevational limits for the species [@r1]. To help make these maps accessible, the _aoh R_ package provides routines for automatically creating Area of Habitat data based on the [International Union for Conservation of Nature (IUCN) Red List of Threatened Species](https://www.iucnredlist.org/). After manually downloading species range data from the [IUCN Red List](https://www.iucnredlist.org/resources/spatial-data-download), users can import them (using `read_spp_range_data()`), prepare them and collate additional information for subsequent processing (using `create_spp_info_data()`), and then create Area of Habitat data (using `create_spp_aoh_data()`). Global elevation and habitat classification data are automatically downloaded [@r4; @r5; @r9], and data on species' habitat preferences and elevational limits are obtained automatically using the [IUCN Red List API](https://apiv3.iucnredlist.org/). Since accessing the IUCN Red List requires a token, users may need to [obtain a token](https://apiv3.iucnredlist.org/) and update their _R_ configuration to recognize the token (see README for details).
## Tutorial
Here we provide a tutorial for using the _aoh R_ package. In this tutorial, we will generate Area of Habitat data for the following Iberian species: Pyrenean brook salamander (_Calotriton asper_), Iberian frog (_Rana iberica_), western spadefoot toad (_Pelobates cultripes_), and golden striped salamnader (_Chioglossa lusitanica_). To start off, we will load the package. We will also load the [_rappdirs_](https://CRAN.R-project.org/package=rappdirs) _R_ package to cache data, and the [_terra_](https://CRAN.R-project.org/package=terra) and [_ggplot2_](https://CRAN.R-project.org/package=ggplot2) _R_ packages to visualize results.
```{r, message = FALSE, warning = FALSE}
# load packages
library(aoh)
library(terra)
library(rappdirs)
library(ggplot2)
```
Now we will import range data for the species. Although users would typically obtain range data from the [International Union for Conservation of Nature (IUCN) Red List of Threatened Species](https://www.iucnredlist.org/), here we will use built-in species range data that distributed with the package for convenience. **Please note that these data were not obtained from the IUCN Red List, and were manually generated using occurrence records from the [Global Biodiversity Information Facility](https://www.gbif.org/).**
```{r}
# find file path for data
path <- system.file("extdata", "EXAMPLE_SPECIES.zip", package = "aoh")
# import data
spp_range_data <- read_spp_range_data(path)
# preview data
print(spp_range_data)
```
Next, we will prepare all the range data for generating Area of Habitat data. This procedure -- in addition to repairing any geometry issues in the spatial data -- will obtain information on the species' habitat preferences and elevational limits (via the IUCN Red List of Threatened Species). We also specify a folder to cache the downloaded data so that we won't need to re-download it again during subsequent runs.
```{r, message = FALSE, results = "hide"}
# specify cache directory
cache_dir <- user_data_dir("aoh")
# create cache_dir if needed
if (!file.exists(cache_dir)) {
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
}
# prepare information
spp_info_data <- create_spp_info_data(spp_range_data, cache_dir = cache_dir)
```
We can now generate Area of Habitat data for the species. By default, these data will be generated using elevation data derived from @r4 and habitat data derived from @r9. Similar to before, we also specify a folder to cache the downloaded datasets so that we won't need to re-downloaded again during subsequent runs.
```{r, message = FALSE, results = "hide"}
# specify cache directory
cache_dir <- user_data_dir("aoh")
# specify folder to save Area of Habitat data
## although we use a temporary directory here to avoid polluting your computer
## with examples files, you would normally specify the folder
## on your computer where you want to save data
output_dir <- tempdir()
# generate Area of Habitat data
## note that this function might take a complete because it will need to
## download the global habitat and elevation data that first time you run it.
spp_aoh_data <- create_spp_aoh_data(
spp_info_data, output_dir = output_dir, cache_dir = cache_dir
)
```
While running the code, we see that it displayed a message telling us that certain habitat classes were not available (i.e., `"7.1"`, `"7.2"`). **This is fine. It is not an error.** The reason we see this message is because although the global habitat dataset contains the majority of [IUCN habitat classes](https://www.iucnredlist.org/resources/habitat-classification-scheme) for terrestrial environments, it does not contain every single IUCN habitat class [see @r9 for details]. Upon checking the [IUCN habitat classes](https://www.iucnredlist.org/resources/habitat-classification-scheme), we can see that these classes correspond to artificial aquatic areas and also caves and other subterranean environments. Although failing to account for such habitats could potentially be an issue, here we will assume that accounting for the species' non-subterranean habitats is sufficient to describe their spatial distribution [@r6].
```{r}
# preview results
## resulting dataset is a simple features (sf) object containing
## spatial geometries for cleaned versions of the range data
## (in the geometry column) and the following additional columns:
##
## - id_no : IUCN Red List taxon identifier
## - seasonal : integer identifier for seasonal distributions
## - category : character IUCN Red List threat category
## - full_habitat_code: All IUCN Red List codes for suitable habitat classes
## (multiple codes are delimited using "|" symbols)
## - habitat_code : IUCN Red List codes for suitable habitat classes
## used to create AOH maps
## - elevation_lower : lower limit for the species on IUCN Red List
## - elevation_upper : upper limit for the species on IUCN Red List
## - xmin : minimum x-coordinate for Area of Habitat data
## - xmax : maximum x-coordinate for Area of Habitat data
## - ymin : minimum y-coordinate for Area of Habitat data
## - ymax : maximum y-coordinate for Area of Habitat data
## - path : file path for Area of Habitat data (GeoTIFF format)
##
## since data obtained from the IUCN Red List cannot be redistributed,
## we will only show some of the columns in this object
##
## N.B., you can view all columns on your computer with:
##> print(spp_aoh_data, width = Inf)
print(spp_aoh_data[, c("id_no", "binomial", "seasonal", "path")])
```
After generating the Area of Habitat data, we can import them.
```{r}
# import the Area of Habitat data
## since the data for each species have a different spatial extent
## (to reduce file sizes), we will import each dataset separately in a list
spp_aoh_rasters <- lapply(spp_aoh_data$path, rast)
# preview raster data
print(spp_aoh_rasters)
```
We can see that the Area of Habitat data for each species are stored in separate spatial (raster) datasets with different extents. Although this is useful because it drastically reduces the total size of the data for each species, it can make it difficult to work with data for multiple species. To address this, we can use the `terra_combine()` function to automatically align and combine the spatial data for all species' distributions into a single spatial dataset.
```{r}
# combine raster data
spp_aoh_rasters <- terra_combine(spp_aoh_rasters)
# assign identifiers to layer names
names(spp_aoh_rasters) <- paste0(
"AOH_", spp_aoh_data$id_no, "_", spp_aoh_data$seasonal
)
# preview raster data
print(spp_aoh_rasters)
```
Finally, let's create some maps to compare the range data with the Area of habitat data. Although we could create these maps manually (e.g., using the [_ggplot2_](https://CRAN.R-project.org/package=ggplot2) _R_ package), we will use a plotting function distributed with the _aoh R_ package for convenience.
```{r, message = FALSE, warning = FALSE, dpi = 200, fig.width = 5.5, fig.height = 4, out.width = "90%"}
# create maps
## N.B. you might need to install the ggmap package to create the maps
map <-
plot_spp_aoh_data(
spp_aoh_data,
zoom = 6,
maptype = "stamen_toner_background"
) +
scale_fill_viridis_d() +
scale_color_manual(values = c("range" = "red")) +
scale_size_manual(values = c("range" = 0.5)) +
theme(
axis.title = element_blank(),
axis.text = element_text(size = 6),
strip.text = element_text(color = "white"),
strip.background = element_rect(fill = "black", color = "black")
)
# display maps
print(map)
```
## Frequently asked questions
Here we provide answers to some of the frequently asked questions encountered when using the package.
* **I see the following error message `Error: need an API key for Red List data`, how do I resolve this?**
This error message indicates that you need to obtain a token to access the [IUCN Red List API](https://apiv3.iucnredlist.org/), or that you need to complete the setup process so that R can use the token. For further details on resolving this issue, please see below for details on obtaining access to the IUCN Red List API. If you have previously completed the set up procedures and still receive this error message, please try completing them again.
* **How do I obtain access to the IUCN Red List API?**
You will need to obtain a token to access the [IUCN Red List API](https://apiv3.iucnredlist.org/) (if you do not have one already). To achieve this, please visit the IUCN API website (<https://apiv3.iucnredlist.org/>), click the "Generate a token" link at the top of the web page, and fill out the form to apply for a token. You should then receive a token shortly after completing the form (but not immediately). After receiving a token, you will need to complete some additional steps so that R can use this token to access the IUCN Red List API.
Please open the `.Renviron` file on your computer (e.g., using `usethis::edit_r_environ()`). Next, please add the following text to the file (replacing the string with the token) and save the file:
```
IUCN_REDLIST_KEY="your_actual_token_not_this_string"
```
Please restart your R session. You should now be able to access the IUCN Red List API. To verify this, please try running the following _R_ code and -- assuming everything works correctly -- you should see the current version of the IUCN Red List:
```{r, eval = FALSE}
# verify access to IUCN Red List API
rredlist::rl_version()
```
If these instructions did not work, please consult the documentation for the [_rredlist_](https://CRAN.R-project.org/package=rredlist) _R_ package for further details.
* **Where can I find species range data for generating Area of Habitat data?**
Species range data can be obtained from the [IUCN Red List](https://www.iucnredlist.org/) (see [Spatial Data Download resources](https://www.iucnredlist.org/resources/spatial-data-download)). They can also be obtained from other data sources (see the question below for details).
* **I keep seeing this message `Error in x$.self$finalize() : attempt to apply non-function`, what does it mean?**
This message is commonly encountered when using the _terra_ package with large datasets. Although there is currently no known solution to prevent this message from appearing, the message can be safely ignored ([see here for details](https://github.com/rspatial/terra/issues/30)). This is because the message does not stop R from completing spatial data processing -- meaning that R will continue processing data even when this message is displayed -- and the underlying cause of the message is not thought to result in incorrect calculations.
* **Can I produce Area of Habitat data for thousands of species globally?**
Yes, the package can generate Area of Habitat data for all terrestrial amphibians, mammals, birds, and reptiles. To accomplish this, you will need a system with at least 16 Gb RAM and 65 Gb disk space. The processing could take a couple of days (e.g., if processing all amphibian species) or weeks (e.g., if processing all bird species) to complete. An example script for processing global data is available on the online code repository ([see here](https://github.com/prioritizr/aoh/tree/master/inst/examples)). Additionally, since a lot of memory is required to process data for all bird species globally, it is recommended to split the full dataset containing all bird species into multiple chunks (e.g., six chunks) and process each of these chunks separately.
* **How can I speed up the processing for Area of Habitat data?**
The `create_spp_aoh_data()` function can use different software engines for data processing (specified via the `engine` parameter). Although each engine produces the same results, some engines are more computationally efficient than others. The default `"terra"` engine uses the _terra_ package for processing. Although this engine is easy to install and fast for small datasets, it does not scale well for larger datasets. It is generally recommended to use the `"gdal"` engine in all cases where possible. Although the `"gdal"` engine requires installation of additional software (see the package README for instructions), it is much faster than the other engines. Additionally, the `"grass"` engine is also available. This engine can be faster than the `"terra"` engine when processing many species across large spatial extents. However, benchmarks indicate that it is slower than the `"gdal"` engine.
* **Can I use species range data from other data sources (instead of the IUCN Red List)?**
Yes, you can use species range data from a variety of sources. For example, species range data could be obtained from governmental (e.g., data for federally listed species in Canada are available through the [Government of Canada data portal](http://open.canada.ca/en/using-open-data)) and non-governmental organizations (e.g., [Botanical Information and Ecology Network](https://biendata.org/) and [Map of Life](https://mol.org/)). These data can also be produced using observation records [e.g., following @r8] from data repositories (e.g., [Global Biodiversity Information Facility](https://www.gbif.org/) and [Atlas of Living Australia](https://www.ala.org.au/)). After obtaining species range data, they will need to be formatted so that they follow the data format conventions used by the IUCN Red List. This means that the species range data must contain the following columns: `id_no`, `presence`, `origin`, `seasonal`, `terrestrial` (or `terrestial`), `freshwater`, and `marine`. For further details on what values these columns should contain, please see the **Species range data format** section in the documentation for `create_spp_info_data()` and the [IUCN Red List documentation](https://www.iucnredlist.org/resources/mappingstandards). Additionally, note that if you wish to use the IUCN Red List for specifying habitat preference data, please ensure that the `id_no` specified for each species follows the taxon identifiers used by the IUCN Red List. For example, the tutorial used manually generated species range data for the Pyrenean brook salamander (_Calotriton asper_). To ensure that correct habitat preference data were obtained for this species from the the IUCN Red List, the `id_no` value specified for this species was specified as `59448`.
* **Can I use species elevational limit data from other sources?**
Yes, you can use elevational limit data from other sources. For example, for birds it is important to use the "Occasional minimum altitude" and "Occasional maximum altitude" estimates for those species for which these are coded in the IUCN Red List. These data are available on request from BirdLife International (please contact <[email protected]>). To use such data, you will need to manually create a table containing the species’ summary information. This table will need to contain information on the species' elevational limits, as well as their habitat preferences and threat status (see `get_spp_summary_data()` for the correct format). After preparing these data, they can be used to collate the information needed for processing Area of Habitat data (via `create_spp_info_data()`) and, in turn, create Area of Habitat data (via `create_spp_aoh_data())`.
* **Can I use elevation data from other data sources?**
Yes, you can use elevation data from a variety of sources. For example, elevation data could be derived from [NASA's Shuttle Radar Topography Mission (SRTM)](https://csidotinfo.wordpress.com/data/srtm-90m-digital-elevation-database-v4-1/). After preparing the elevation data, they can be used to create Area of Habitat data (via `create_spp_aoh_data()`). For more information, see the [Customization vignette](customization.html).
* **Can I use habitat classification data from other data sources?**
Yes, you can use habitat classification data from a variety of sources. For example, the habitat classification data could be derived from [Copernicus Corine Land Cover](https://land.copernicus.eu/en/products/corine-land-cover), and [MODIS Land Cover data (MCD12Q1)](https://lpdaac.usgs.gov/products/mcd12q1v061/)). To use such data, you will also need to develop a crosswalk table to specify which land cover (or habitat) classes correspond to which habitat classes as defined by the [IUCN Red List Habitat Classification Scheme](https://www.iucnredlist.org/resources/habitat-classification-scheme) [e.g., see @r2; @r9]. After preparing the habitat classification data and the crosswalk table, they can be used to create Area of Habitat data (via `create_spp_aoh_data()`). For more information, see the [Customization vignette](customization.html).
* **The output Area of Habitat data have different spatial extents, how can I combine them together?**
The `terra_combine()` function can be used to align and combine a `list` of raster `terra::rast()` objects into a single object. Note that this procedure is only recommended when all species occur in the same geographic region. Please see the tutorial above for an example of using this function to combine Area of Habitat data for multiple species into a single object.
## Conclusion
Hopefully, this vignette has provided a useful introduction to the package. If you encounter any issues when running the code in this tutorial -- or when adapting the code for your own work -- please see the following section. Additionally, if you have any questions about using the package or suggestions for improving it, please [file an issue at the package's online code repository](https://github.com/prioritizr/aoh/issues).
## References