Catalog Comparisons
[1]:

import pandas as pd
import rioxarray as rxr

from gval.catalogs.catalogs import catalog_compare

Initializing Catalogs
The cataloging functionality is designed to facilitate batch comparisons of maps that reside locally, in a service, or in the cloud. The example catalogs used here are loaded as follows:
[2]:

TEST_DATA_DIR = './'

candidate_continuous_catalog = pd.read_csv(f'{TEST_DATA_DIR}candidate_catalog_0.csv')
benchmark_continuous_catalog = pd.read_csv(f'{TEST_DATA_DIR}benchmark_catalog_0.csv')
candidate_categorical_catalog = pd.read_csv(f'{TEST_DATA_DIR}candidate_catalog_1.csv')
benchmark_categorical_catalog = pd.read_csv(f'{TEST_DATA_DIR}benchmark_catalog_1.csv')

Candidate Catalog
[3]:

candidate_categorical_catalog['catalog_attribute_1'] = [1, 2]
candidate_categorical_catalog

[3]:

 | map_id | compare_id | agreement_maps | catalog_attribute_1 |
---|---|---|---|---|
0 | ./candidate_categorical_0.tif | compare1 | agreement_categorical_0.tif | 1 |
1 | ./candidate_categorical_1.tif | compare2 | agreement_categorical_1.tif | 2 |
The candidate catalog should have columns representing:
1. An identifier for the candidate map (in this case, compare_id)
2. The location of the candidate map (in this case, map_id)
3. The name of the agreement map to be created (in this case, agreement_maps)
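As a minimal sketch of the layout described above, a candidate catalog can also be assembled directly in pandas instead of being read from a CSV file. The paths mirror the table above. The `required_columns` check is a hypothetical convenience for illustration, not part of gval.

```python
import pandas as pd

# Column names follow the candidate catalog shown above.
candidate_catalog = pd.DataFrame(
    {
        "map_id": [
            "./candidate_categorical_0.tif",
            "./candidate_categorical_1.tif",
        ],
        "compare_id": ["compare1", "compare2"],
        "agreement_maps": [
            "agreement_categorical_0.tif",
            "agreement_categorical_1.tif",
        ],
    }
)

# Hypothetical sanity check (not a gval function): fail early if a
# column the comparison relies on is absent.
required_columns = {"map_id", "compare_id", "agreement_maps"}
missing = required_columns - set(candidate_catalog.columns)
if missing:
    raise ValueError(f"candidate catalog is missing columns: {sorted(missing)}")
```

Extra columns such as catalog_attribute_1 can be added freely; in the outputs of this tutorial they are carried through to the resulting table.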
Benchmark Catalog
[4]:

benchmark_categorical_catalog['catalog_attribute_2'] = [3, 4]
benchmark_categorical_catalog

[4]:

 | map_id | compare_id | catalog_attribute_2 |
---|---|---|---|
0 | ./benchmark_categorical_0.tif | compare1 | 3 |
1 | ./benchmark_categorical_1.tif | compare2 | 4 |
Similar to the candidate catalog, the benchmark catalog should have columns representing:
1. An identifier used to match the benchmark map to a candidate map (in this case, compare_id)
2. The location of the benchmark map (in this case, map_id)
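The pairing of candidate and benchmark rows can be pictured as a table join on the shared identifier. The sketch below uses a plain pandas merge with the same `on` and `how` values that this tutorial passes to catalog_compare. It is an illustration under the assumption that the pairing behaves like such a join; the suffixes are chosen to match the column names in the tutorial's output, and nothing here claims to be gval's internal implementation.

```python
import pandas as pd

candidate = pd.DataFrame(
    {
        "map_id": ["./candidate_categorical_0.tif", "./candidate_categorical_1.tif"],
        "compare_id": ["compare1", "compare2"],
        "agreement_maps": ["agreement_categorical_0.tif", "agreement_categorical_1.tif"],
        "catalog_attribute_1": [1, 2],
    }
)
benchmark = pd.DataFrame(
    {
        "map_id": ["./benchmark_categorical_0.tif", "./benchmark_categorical_1.tif"],
        "compare_id": ["compare1", "compare2"],
        "catalog_attribute_2": [3, 4],
    }
)

# Assumption for illustration: an inner join keeps only identifiers
# present in both catalogs; the clashing map_id columns get suffixes.
paired = candidate.merge(
    benchmark,
    on="compare_id",
    how="inner",
    suffixes=("_candidate", "_benchmark"),
)
print(paired.columns.tolist())
```

With an inner join, a candidate row whose compare_id has no benchmark counterpart would simply drop out of the paired table.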
Categorical Catalog Comparison
When compare_type is set to ‘categorical’, the catalogs are run as categorical comparisons. The arguments and the resulting comparison metrics are shown below:
[5]:

arguments = {
    "candidate_catalog": candidate_categorical_catalog,
    "benchmark_catalog": benchmark_categorical_catalog,
    "on": "compare_id",
    "map_ids": "map_id",
    "how": "inner",
    "compare_type": "categorical",
    "compare_kwargs": {
        "metrics": (
            "critical_success_index",
            "true_positive_rate",
            "positive_predictive_value",
        ),
        "encode_nodata": True,
        "nodata": -9999,
        "positive_categories": 2,
        "negative_categories": 1
    },
    "open_kwargs": {
        "mask_and_scale": True,
        "masked": True
    }
}

agreement_categorical_catalog = catalog_compare(**arguments)
agreement_categorical_catalog.transpose()

[5]:

 | 0 | 1 | 2 |
---|---|---|---|
map_id_candidate | ./candidate_categorical_0.tif | ./candidate_categorical_1.tif | ./candidate_categorical_1.tif |
compare_id | compare1 | compare2 | compare2 |
agreement_maps | agreement_categorical_0.tif | agreement_categorical_1.tif | agreement_categorical_1.tif |
catalog_attribute_1 | 1 | 2 | 2 |
map_id_benchmark | ./benchmark_categorical_0.tif | ./benchmark_categorical_1.tif | ./benchmark_categorical_1.tif |
catalog_attribute_2 | 3 | 4 | 4 |
band | 1 | 1 | 2 |
fn | 844.0 | 844.0 | 844.0 |
fp | 844.0 | 844.0 | 844.0 |
tn | 5939.0 | 5939.0 | 5939.0 |
tp | 1977.0 | 1977.0 | 1977.0 |
critical_success_index | 0.539427 | 0.539427 | 0.539427 |
true_positive_rate | 0.700815 | 0.700815 | 0.700815 |
positive_predictive_value | 0.700815 | 0.700815 | 0.700815 |
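The three metrics in the table follow from the contingency counts reported alongside them. The sketch below recomputes them from the tp, fp and fn values above using the standard definitions of each metric. It is a hand check, not a call into gval.

```python
# Contingency counts copied from the output table above.
tp, fp, fn, tn = 1977.0, 844.0, 844.0, 5939.0

# Standard definitions; tn does not enter any of these three metrics.
critical_success_index = tp / (tp + fp + fn)
true_positive_rate = tp / (tp + fn)
positive_predictive_value = tp / (tp + fp)

print(round(critical_success_index, 6))
print(round(true_positive_rate, 6))
print(round(positive_predictive_value, 6))
```

Because fp and fn are equal here, true_positive_rate and positive_predictive_value coincide, which matches the identical values in the last two rows of the table.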
The agreement maps are plotted below. They also show why the metrics are identical across the comparisons: the datasets were essentially equivalent.
[6]:

for ag_map in agreement_categorical_catalog['agreement_maps'].unique():
    rxr.open_rasterio(ag_map, mask_and_scale=True).gval.cat_plot(
        title=f'Agreement Map {int(ag_map.split("_")[-1][0]) + 1}'
    )

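The plot title above is derived from the file name: the text after the last underscore is taken, its first character is read as the map index, and 1 is added. That works for single-digit indices such as those in this tutorial. A slightly more defensive variant is sketched below. `agreement_map_title` is a hypothetical helper introduced only for illustration and is not part of gval.

```python
import re
from pathlib import Path


def agreement_map_title(path):
    """Build a 1-based plot title from a name like 'agreement_categorical_0.tif'."""
    # Hypothetical helper: read all trailing digits of the file stem,
    # so multi-digit indices are handled as well.
    match = re.search(r"_(\d+)$", Path(path).stem)
    if match is None:
        raise ValueError(f"no trailing index found in {path!r}")
    return f"Agreement Map {int(match.group(1)) + 1}"


print(agreement_map_title("agreement_categorical_0.tif"))
print(agreement_map_title("./agreement_continuous_1.tif"))
```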
Continuous Catalog Comparison
The continuous catalogs are as follows:
[7]:

candidate_continuous_catalog['catalog_attribute_1'] = [1, 2]
candidate_continuous_catalog

[7]:

 | map_id | compare_id | agreement_maps | catalog_attribute_1 |
---|---|---|---|---|
0 | ./candidate_continuous_0.tif | compare1 | ./agreement_continuous_0.tif | 1 |
1 | ./candidate_continuous_1.tif | compare2 | ./agreement_continuous_1.tif | 2 |
[8]:

benchmark_continuous_catalog['catalog_attribute_2'] = [3, 4]
benchmark_continuous_catalog

[8]:

 | map_id | compare_id | catalog_attribute_2 |
---|---|---|---|
0 | ./benchmark_continuous_0.tif | compare1 | 3 |
1 | ./benchmark_continuous_1.tif | compare2 | 4 |
As with the categorical case, setting compare_type to ‘continuous’ runs the catalogs as continuous comparisons:
[9]:

arguments = {
    "candidate_catalog": candidate_continuous_catalog,
    "benchmark_catalog": benchmark_continuous_catalog,
    "on": "compare_id",
    "map_ids": "map_id",
    "how": "inner",
    "compare_type": "continuous",
    "compare_kwargs": {
        "metrics": (
            "coefficient_of_determination",
            "mean_absolute_error",
            "mean_absolute_percentage_error",
        ),
        "encode_nodata": True,
        "nodata": -9999,
    },
    "open_kwargs": {
        "mask_and_scale": True,
        "masked": True
    }
}

agreement_continuous_catalog = catalog_compare(**arguments)
agreement_continuous_catalog.transpose()

[9]:

 | 0 | 1 | 2 |
---|---|---|---|
map_id_candidate | ./candidate_continuous_0.tif | ./candidate_continuous_1.tif | ./candidate_continuous_1.tif |
compare_id | compare1 | compare2 | compare2 |
agreement_maps | ./agreement_continuous_0.tif | ./agreement_continuous_1.tif | ./agreement_continuous_1.tif |
catalog_attribute_1 | 1 | 2 | 2 |
map_id_benchmark | ./benchmark_continuous_0.tif | ./benchmark_continuous_1.tif | ./benchmark_continuous_1.tif |
catalog_attribute_2 | 3 | 4 | 4 |
band | 1 | 1 | 2 |
coefficient_of_determination | -0.06616 | -2.829421 | 0.10903 |
mean_absolute_error | 0.317389 | 0.485031 | 0.485031 |
mean_absolute_percentage_error | 0.159568 | 0.202235 | 0.153235 |
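To make the three continuous metrics concrete, the sketch below computes them with NumPy on a tiny made-up pair of arrays. It uses common textbook definitions with the benchmark treated as the reference. This is an assumption for illustration: gval's own implementation may differ in details such as nodata handling. The function name `continuous_metrics` and the sample values are hypothetical.

```python
import numpy as np


def continuous_metrics(candidate, benchmark):
    """Textbook MAE, MAPE and R^2 with the benchmark as reference (illustrative only)."""
    candidate = np.asarray(candidate, dtype=float)
    benchmark = np.asarray(benchmark, dtype=float)

    # Keep only cells that are valid in both arrays.
    valid = ~(np.isnan(candidate) | np.isnan(benchmark))
    c, b = candidate[valid], benchmark[valid]

    error = c - b
    ss_res = np.sum(error ** 2)              # residual sum of squares
    ss_tot = np.sum((b - b.mean()) ** 2)     # spread of the benchmark itself
    return {
        "mean_absolute_error": float(np.mean(np.abs(error))),
        "mean_absolute_percentage_error": float(np.mean(np.abs(error) / np.abs(b))),
        "coefficient_of_determination": float(1.0 - ss_res / ss_tot),
    }


# Made-up values, not taken from the tutorial data.
metrics = continuous_metrics([1.0, 2.0, 3.0, 5.0], [1.0, 2.0, 4.0, 4.0])
print(metrics)
```

Under this definition the coefficient of determination is not bounded below by zero: a negative value, as in the first two columns of the table, means the candidate matches the benchmark worse than a constant equal to the benchmark mean would.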
We can see the continuous agreement maps below:
[10]:

for ag_map in agreement_continuous_catalog['agreement_maps'].unique():
    rxr.open_rasterio(ag_map, mask_and_scale=True).gval.cont_plot(
        title=f'Agreement Map {int(ag_map.split("_")[-1][0]) + 1}'
    )
