Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing simulations in IPCC_annual_global_tas.csv #268

Closed
aulemahal opened this issue Oct 5, 2023 · 9 comments · Fixed by #270
Closed

Missing simulations in IPCC_annual_global_tas.csv #268

aulemahal opened this issue Oct 5, 2023 · 9 comments · Fixed by #270

Comments

@aulemahal
Copy link
Collaborator

Generic Issue

  • xscen version: master
  • Python version: any
  • Operating System: any

Description

Some members of ESPO-G6-R2 are missing from the global tas timeseries dataset.

CMIP6_CMCC-ESM2_ssp245_r1i1p1f1
CMIP6_CMCC-ESM2_ssp370_r1i1p1f1
CMIP6_CMCC-ESM2_ssp585_r1i1p1f1
CMIP6_EC-Earth3-CC_ssp245_r1i1p1f1
CMIP6_EC-Earth3-CC_ssp585_r1i1p1f1
CMIP6_KACE-1-0-G_ssp245_r1i1p1f1
CMIP6_KACE-1-0-G_ssp370_r1i1p1f1
CMIP6_KACE-1-0-G_ssp585_r1i1p1f1
CMIP6_TaiESM1_ssp245_r1i1p1f1
CMIP6_TaiESM1_ssp370_r1i1p1f1

I'm not familiar with the official source for this data, but I can do the addition if someone points me to it!

@juliettelavoie
Copy link
Contributor

juliettelavoie commented Oct 5, 2023

Ça vient d'ici: https://github.com/IPCC-WG1/Atlas/tree/main/datasets-aggregated-regionally/data/CMIP6/CMIP6_tas_landsea (la colonne world)
en plus de calculs maison de @mccrayc .
Il faudra peut-être faire le calcul nous-même pour les manquants..

@juliettelavoie
Copy link
Contributor

Quand j'ai commencé à regarder les niveaux de réchauffement, j'avais mis plusieurs ressource ici: Ouranosinc/xclim#1014

Il y avait d'autres sources qui donnaient des niveaux de réchauffement, mais je ne suis pas certaine qu'ils utilisaient la même méthode que le giec.

@aulemahal
Copy link
Collaborator Author

aulemahal commented Oct 5, 2023

import xscen as xs
import pandas as pd

scats = xs.search_data_catalogs('/tank/scenario/catalogues/simulation.json', {'tas': 'MS'}, other_search_criteria={'source': 'TaiESM1', 'experiment':'ssp585', 'processing_level':'raw'}, match_hist_and_fut=True)
ds = xs.extract_dataset(scats['CMIP6_ScenarioMIP_AS-RCEC_TaiESM1_ssp585_r1i1p1f1_global'])['MS']

dsm = xs.spatial_mean(
    ds.resample(time='YS').mean(),
    region='global',
    method='cos-lat',
    spatial_subset=False
)

tas_glob = xc.units.convert_units_to(dsm.tas, '°C')
tas_glob = tas_glob.rename(time='year').assign_coords(year=dsm.time.dt.year.values)

tas_ref = pd.read_csv('data/IPCC_annual_global_tas.csv').set_index('year')['CMIP6_TaiESM1_ssp585_r1i1p1f1']

abs(tas_glob - tas_ref).mean()

donne 0.0015 °C. J'ai une erreur assez uniforme de 0.0013-0.0018. L'erreur augmente à 0.82 ± 0.07 si j'utilise xESMF pour la moyenne.

Avec les données quotidiennes (tasmin et tasmax), j'ai plutôt une erreur 0.04 ± 0.03... (cos-lat)

Est-ce que quelqu'un à une idée ?

Sinon, ça me semble quand même raisonnable pour des warmings levels.

@juliettelavoie
Copy link
Contributor

juliettelavoie commented Oct 5, 2023

Pour reproduire exactement, il faut regriller je pense.
Chris m'avait envoyé ça avec "J’ai pu reproduire exactement les mêmes valeurs en utilisant tas (les fichiers tas_Amon_*.nc dans nos répertoires CMIP5" :

'''
Calculate yearly global mean temperatures for the other CanESM2 members as done for the IPCC  
'''
#for ens_memb in ['r2i1p1','r3i1p1','r4i1p1','r5i1p1']:
for ens_memb in ['r1i1p1']:
    with xr.open_dataset('REDACTED/CMIP5/CCCMA/CanESM2/historical/mon/atmos/'+ens_memb+'/tas/tas_Amon_CanESM2_historical_'+ens_memb+'_185001-200512.nc') as ds_gcm_past:
        with xr.open_dataset('/REDACTED/CMIP5/CCCMA/CanESM2/rcp85/mon/atmos/'+ens_memb+'/tas/tas_Amon_CanESM2_rcp85_'+ens_memb+'_200601-210012.nc') as ds_gcm_fut:
            ds_gcm = xr.concat([ds_gcm_past, ds_gcm_fut], dim='time')
            ds_gcm = ds_gcm.groupby('time.year').mean()
            #Regridding
            lon_bnds = np.append(ds_gcm.lon_bnds.sel(bnds=0)[0].values, ds_gcm.lon_bnds.sel(bnds=1)[1].values[-1])
            lat_bnds = np.append(ds_gcm.lat_bnds.sel(bnds=0)[0].values, ds_gcm.lat_bnds.sel(bnds=1)[1].values[-1])
            grid_gcm = {'lon': ds_gcm['lon'].values,
                       'lat': ds_gcm['lat'].values,
                       'lon_b': lon_bnds,
                       'lat_b': lat_bnds, # fix half-polar cells
                      }
            #Create output grid 2x2°
            grid_out = xe.util.grid_global(2, 2)
            regridder = xe.Regridder(grid_gcm, grid_out, 'conservative')
            weights = np.cos(np.deg2rad(grid_out.lat))
            weights.name = "weights"
            #Regrid the tas field
            ds_gcm_regridded = regridder(ds_gcm['tas'])
            #Apply weights to the grid        
            yearly_mean_ts = ds_gcm_regridded.weighted(weights).mean(("x",'y'))
            #Output csv
            yearly_mean_ts.to_pandas().to_csv('../data/cmip5_tas/CMIP5_yearly_mean_CanESM2_'+ens_memb+'.csv')
            #pre_industrial = ds_canesm2.sel(time=slice('1850-01-01','1900-12-31')).mean()['tas'].weighted(weights)
            #yearly_mean_ts = ds_canesm2.groupby('time.year').mean().mean(['lat','lon'])#-pre_industrial

@aulemahal
Copy link
Collaborator Author

Oh je vais essayer ça merci!

@aulemahal
Copy link
Collaborator Author

En faisant le regridding vers 2°, l'erreur augmente à 0.02 °C.

Je vais donc faire l'hypothèse que mon 0.0015 °C provient du pré-traitement des données qu'on a sur disque. Genre passages entre float32 et float64, compression et décompression, etc. Ça reste une erreur maximale de 0.014%, on va vivre avec! Je calculerai les séries manquantes sans faire le changement de grille vers 2°.

@juliettelavoie
Copy link
Contributor

Ça me va

@aulemahal
Copy link
Collaborator Author

Funny enough, I tried to use the new resample, with weights for the annual mean from monthly data, and it was a worst error again. 🤷‍♂️

@mccrayc
Copy link
Collaborator

mccrayc commented Oct 10, 2023

I just went back and tested my code and am getting similar differences as you @aulemahal, when I said 'exact' I'm guessing what I meant with 'exact enough' ;)

In the IPCC Atlas files, the header specifies that CMIP5 data is regridded to 2deg but that CMIP6 is regridded to 1deg, which may explain why you get worse results when regridding CMIP6 to 2deg. It also specifies that it uses the CDO conservative remapping algorithm (remapcon), which I assume is the equivalent to 'conservative' with xESMF but may be another source of slight differences.

Otherwise, I agree that this level of error is acceptable!

aulemahal added a commit that referenced this issue Dec 11, 2023
<!-- Please ensure the PR fulfills the following requirements! -->
<!-- If this is your first PR, make sure to add your details to the
AUTHORS.rst! -->
### Pull Request Checklist:
- [x] This PR addresses an already opened issue (for bug fixes /
features)
    - This PR fixes #268
- [x] (If applicable) Documentation has been added / updated (for bug
fixes / features).
- [x] (If applicable) Tests have been added.
- [x] This PR does not seem to break the templates.
- [x] HISTORY.rst has been updated (with summary of main changes).
- [x] Link to issue (:issue:`number`) and pull request (:pull:`number`)
has been added.

### What kind of change does this PR introduce?

Addition of:
```
CMIP6_CMCC-ESM2_ssp245_r1i1p1f1
CMIP6_CMCC-ESM2_ssp370_r1i1p1f1
CMIP6_CMCC-ESM2_ssp585_r1i1p1f1
CMIP6_EC-Earth3-CC_ssp245_r1i1p1f1
CMIP6_EC-Earth3-CC_ssp585_r1i1p1f1
CMIP6_KACE-1-0-G_ssp245_r1i1p1f1
CMIP6_KACE-1-0-G_ssp370_r1i1p1f1
CMIP6_KACE-1-0-G_ssp585_r1i1p1f1
CMIP6_TaiESM1_ssp245_r1i1p1f1
CMIP6_TaiESM1_ssp370_r1i1p1f1
```
to `xscen/data/IPCC_annual_global_tas.csv`.


* Change from CSV to netCDF
* Accept DataArrays, DataFrames in `get_warming_level`, output as those
or list.
* Multiple models and multiple warming levels in `subset_warming_level`.

### Does this PR introduce a breaking change?
Yes, if `get_warming_level` is called with a sequence of models, the
output is now a list of the same length. The dictionary can be retrieved
by passing `output='selected'`. I found it way more useful this way!

### Other information:
The new global tas were computed with the following code:
```python3
def get_global_tas(source, exp, member):
    scats = xs.search_data_catalogs(
        'catalogues/simulation.json',
        {'tas': 'MS'},
        other_search_criteria={'source': source, 'experiment': exp, 'member': member, 'processing_level': 'raw'},
        match_hist_and_fut=True
    )
    ds = xs.extract_dataset(scats.popitem()[1])['MS']
    day = ds.tas.resample(time='YS').mean()
    dam = xs.spatial_mean(
        day,
        region='global',
        method='cos-lat',
        spatial_subset=False
    )
    tas = xc.units.convert_units_to(dam, '°C')
    return tas.rename(time='year').assign_coords(year=tas.time.dt.year.values)
```

See issue #268 for discussion of small differences between this and the
official IPCC-provided data.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants