\n",
" \n",
" 6. Pandas\n",
@@ -277,24 +278,24 @@
"\n",
"For data, *Lagrangian* refers to oceanic and atmosphere information acquired by observing platforms drifting with the flow they are embedded within, but also refers more broadly to the data originating from uncrewed platforms, vehicles, and animals that gather data along their unrestricted and often complex paths. Because such paths traverse both spatial and temporal dimensions, Lagrangian data often convolve spatial and temporal information that cannot always readily be organized in common data structures and stored in standard file formats with the help of common libraries and standards. As such, for both originators and users, Lagrangian data present challenges that the [EarthCube CloudDrift](https://github.com/Cloud-Drift) project aims to overcome.\n",
"\n",
- "This notebook consists of systematic evaluations and comparisons of workflows for Lagrangian data, using as a basis the velocity and sea surface temperature datasets emanating from the drifting buoys of the [Global Drifter Program](https://www.aoml.noaa.gov/phod/gdp/) (GDP). Specifically, we consider the interplay between diverse storage file formats ([NetCDF](https://www.unidata.ucar.edu/software/netcdf/), [Parquet](https://github.com/apache/parquet-format)) and the data structure associated with common existing libraries in Python ([xarray](https://docs.xarray.dev/en/stable/), [pandas](https://pandas.pydata.org), and [Awkward Array](https://awkward-array.org/quickstart.html)) in order to test their adequacies for performing three common Lagrangian tasks:\n",
+     "This notebook consists of systematic evaluations and comparisons of workflows for Lagrangian data, using as a basis the velocity and sea surface temperature datasets emanating from the drifting buoys of the [Global Drifter Program](https://www.aoml.noaa.gov/phod/gdp/) (GDP). Specifically, we consider the interplay between diverse storage file formats ([NetCDF](https://www.unidata.ucar.edu/software/netcdf/), [Parquet](https://github.com/apache/parquet-format)) and the data structures associated with common existing libraries in *Python* ([xarray](https://docs.xarray.dev/en/stable/), [pandas](https://pandas.pydata.org), and [Awkward Array](https://awkward-array.org/quickstart.html)) in order to test their adequacy for performing three common Lagrangian tasks:\n",
"\n",
- "1. binning of a variable on an Eulerian grid (e.g. mean temperature map),\n",
+     "1. binning of a variable on a spatially fixed grid (e.g. mean temperature map),\n",
"2. extracting data within given geographical and/or temporal windows (e.g. Gulf of Mexico),\n",
"3. analyses per trajectory (e.g. single statistics, spectral estimation by Fast Fourier Transform).\n",
"\n",
- "Since the *CloudDrift* project aims at accelerating the use of Lagrangian data for atmospheric, oceanic, and climate sciences, we hope that the users of this notebook will provide us with feedback on its ease of use and the intuitiveness of the proposed methods in order to guide the on-going development of the *clouddrift* python package.\n",
+     "Since the *CloudDrift* project aims at accelerating the use of Lagrangian data for atmospheric, oceanic, and climate sciences, we hope that the users of this notebook will provide us with feedback on its ease of use and the intuitiveness of the proposed methods in order to guide the ongoing development of the *clouddrift* *Python* package.\n",
"\n",
"## Technical contributions\n",
"\n",
"- Description of some challenges arising from the analysis of large, heterogeneous Lagrangian datasets.\n",
- "- Description of some data formats for Lagrangian analysis with python.\n",
- "- Comparison of performances of established and developing python packages and libraries.\n",
+ "- Description of some data formats for Lagrangian analysis with *Python*.\n",
+ "- Comparison of performances of established and developing *Python* packages and libraries.\n",
"\n",
"## Methodology\n",
"\n",
"The notebook proceeds in three steps:\n",
- "1. First, we download a subset of the hourly dataset of the GDP. Specifically, we access the version 2.00 (beta) of the dataset that consists of a collection of 17,324 NetCDF files available from a FTP server of the GDP. Alternative methods to download these data are described on the website of the [GDP DAC at NOAA AOML](https://www.aoml.noaa.gov/phod/gdp/hourly_data.php) and includes a newly-formed collection from the NOAA National Centers for Environmental Information with [doi:10.25921/x46c-3620](https://doi.org/10.25921/x46c-3620). We download a subset (which size can be scaled up or down) then proceed to aggregate the data from the individual files in one single file using a suggested format (the contiguous ragged array).\n",
+     "1. First, we download a subset of the hourly dataset of the GDP. Specifically, we access version 2.00 (beta) of the dataset, which consists of a collection of 17,324 NetCDF files, one for each drifting buoy, available from an HTTPS (or FTP) [server](https://www.aoml.noaa.gov/ftp/pub/phod/lumpkin/hourly/v2.00/netcdf/) of the GDP. Alternative methods to download these data are described on the website of the [GDP DAC at NOAA AOML](https://www.aoml.noaa.gov/phod/gdp/hourly_data.php) and include a newly formed collection from the NOAA National Centers for Environmental Information with [doi:10.25921/x46c-3620](https://doi.org/10.25921/x46c-3620). We download a subset (whose size can be scaled up or down), then aggregate the data from the individual files into a single file using a suggested format (the contiguous ragged array).\n",
"\n",
"2. Second, we benchmark three libraries—*xarray*, *Pandas*, and *Awkward Array*—with typical Lagrangian workflow tasks such as the geographical binning of a variable, the extraction of the data for a given region, and operations performed per drifter trajectory.\n",
"\n",
@@ -304,7 +305,7 @@
"\n",
"In terms of data file format, we tested both NetCDF and Parquet file formats but did not find significant performance gain from using one or the other. Because NetCDF is a well-known and established file format in Earth sciences, we save the contiguous ragged array as a single NetCDF archive. \n",
"\n",
- "In terms of python packages, we find that *Pandas* is intuitive with a simple syntax but does not perform efficiently with large dataset. The complete GDP hourly dataset is currently *only* ~15 GB, but as part of *CloudDrift* we also want to support larger Lagrangian datasets (>100 GB). On the other hand, *xarray* can interface with *Dask* to efficiently *lazy-load* large dataset but it requires custom adaptation to operate on a ragged array. In contrast, *Awkward Array* provides a novel approach by storing alongside the data an offset index in a manner that is transparent to the user, simplifying the analysis of non-uniform Lagrangian datasets. We find that it is also *fast* and can easily interface with *Numba* to further improve performances.\n",
+     "In terms of *Python* packages, we find that *Pandas* is intuitive with a simple syntax but does not perform efficiently with a large dataset. The complete GDP hourly dataset is currently *only* ~15 GB, but as part of *CloudDrift* we also want to support larger Lagrangian datasets (>100 GB). On the other hand, *xarray* can interface with *Dask* to efficiently *lazy-load* large datasets, but it requires custom adaptation to operate on a ragged array. In contrast, *Awkward Array* provides a novel approach by storing alongside the data an offset index in a manner that is transparent to the user, simplifying the analysis of non-uniform Lagrangian datasets. We find that it is also *fast* and can easily interface with *Numba* to further improve performance.\n",
"\n",
"In terms of benchmark speed, each package show similar results for the geographical binning (test 1) and the operation per trajectory (test 3) benchmarks. For the extraction of a given region (test 2), *xarray* was found to be slower than both *Pandas* and *Awkward Array*. We note that speed performance may not the deciding factor for all users and we believe that ease of use and simple intuitive syntax are also important."
]
@@ -443,7 +444,7 @@
"\n",
"In the first step of this notebook, we present the current format of the [Global Drifter Program (GDP)](https://www.aoml.noaa.gov/phod/gdp/) dataset, and show how to transform it into a single archival file in which each variable is stored in an ragged fashion.\n",
"\n",
- "The GDP produces two interpolated datasets of drifter position, velocity and sea surface temperature from more than 20,000 drifters that have been released since 1979. One dataset is at 6-hour resolution ([Hansen and Poulain 1996](http://dx.doi.org/10.1175/1520-0426(1996)013<0900:QCAIOW>2.0.CO;2)) and the other one is at hourly resolution ([Elipot et al. 2016](http://dx.doi.org/10.1002/2016JC011716)). The files, one per drifter identified by its unique identification number (ID), are updated on a quarterly basis and are available via the FTP server of the GDP Data Assembly Center (DAC).\n",
+ "The GDP produces two interpolated datasets of drifter position, velocity and sea surface temperature from more than 20,000 drifters that have been released since 1979. One dataset is at 6-hour resolution ([Hansen and Poulain 1996](http://dx.doi.org/10.1175/1520-0426(1996)013<0900:QCAIOW>2.0.CO;2)) and the other one is at hourly resolution ([Elipot et al. 2016](http://dx.doi.org/10.1002/2016JC011716)). The files, one per drifter identified by its unique identification number (ID), are updated on a quarterly basis and are available via the [HTTPS server](https://www.aoml.noaa.gov/ftp/pub/phod/lumpkin/hourly/v2.00/netcdf/) of the GDP Data Assembly Center (DAC).\n",
"\n",
"Here we use a subset of the hourly drifter dataset of the GDP by setting the variable `subset_nb_drifters = 500`. The suggested number is large enough to create an interesting dataset, yet without making the downloading cumbersome and the data processing too expensive. Feel free to scale down or up this value (from 1 to 17324), but beware that if you are running this notebook in a binder there is some memory limitation (500 should work). "
]
@@ -469,8 +470,8 @@
"output_type": "stream",
"text": [
"Fetching the 500 requested netCDF files (as a reference ~2min for 500 files).\n",
- "CPU times: user 88.3 ms, sys: 24.4 ms, total: 113 ms\n",
- "Wall time: 3.03 s\n"
+ "CPU times: user 53 ms, sys: 15.4 ms, total: 68.4 ms\n",
+ "Wall time: 2.39 s\n"
]
}
],
@@ -929,16 +930,16 @@
" acknowledgement: Elipot et al. (2016), Elipot et al. (2021) to...\n",
" history: Version 2.00. Metadata from dirall.dat and d...\n",
" interpolation_method: \n",
- " imei:
xarray.Dataset
traj: 1
obs: 1095
ID
(traj)
|S10
...
long_name :
Global Drifter Program Buoy ID
cf_role :
trajectory_id
array([b'2592'], dtype='|S10')
rowsize
(traj)
int32
...
long_name :
number of obs for this trajectory
sample_dimension :
obs
array([1095], dtype=int32)
WMO
(traj)
float64
...
long_name :
World Meteorological Organization buoy identification number
history :
From dirall.dat
array([4400509.])
expno
(traj)
float64
...
long_name :
Experiment number
history :
From dirall.dat
units :
count
array([9046.])
deploy_date
(traj)
float32
...
long_name :
Deployment date and time
units :
seconds since 1970-01-01 00:00:00 UTC
history :
From deplog.dat
array([9.886752e+08], dtype=float32)
deploy_lat
(traj)
float64
...
long_name :
Deployment latitude
units :
degrees_north
history :
From deplog.dat
array([47.433])
deploy_lon
(traj)
float64
...
long_name :
Deployment longitude
units :
degrees_east
history :
From deplog.dat
array([-52.167])
end_date
(traj)
float32
...
long_name :
End date and time
units :
seconds since 1970-01-01 00:00:00 UTC
history :
From dirall.dat
array([9.926496e+08], dtype=float32)
end_lat
(traj)
float64
...
long_name :
End latitude
units :
degrees_north
history :
From dirall.dat
array([47.22])
end_lon
(traj)
float64
...
long_name :
End longitude
units :
degrees_east
history :
From dirall.dat
array([-54.04])
drogue_lost_date
(traj)
float32
...
long_name :
Date of drogue loss (missing value=drogue still attached; 0=drogue status uncertain from beginning)
units :
seconds since 1970-01-01 00:00:00 UTC
history :
From dirall.dat
array([0.], dtype=float32)
typedeath
(traj)
float64
...
long_name :
Type of death (0=buoy still alive, 1=buoy ran aground, 2=picked up by vessel, 3=stop transmitting, 4=sporadic transmissions, 5=bad batteries, 6=inactive status)
history :
From dirall.dat
array([1.])
typebuoy
(traj)
|S10
...
long_name :
Buoy type (see https://www.aoml.noaa.gov/phod/dac/dirall.html)
Estimated near-surface sea water temperature from drifting buoy measurements. It is the sum of the fitted near-surface non-diurnal sea water temperature and fitted diurnal sea water temperature anomaly. Discrepancies may occur because of rounding.
NOAA Atlantic Oceanographic and Meteorological Laboratory
creator_name :
GDP drifter DAC
creator_url :
https://www.aoml.noaa.gov/phod/gdp
creator_email :
aoml.dftr@noaa.gov
doi :
10.1002/2016JC011716TBA
summary :
Global Drifter Program hourly data
comment :
Global Drifter Program hourly data
standard_name_vocabulary :
CF Standard Name Table v30
instrument_vocabulary :
n/a
keywords_vocabulary :
n/a
keywords :
drifter, surface current, interpolated
DeployingShip :
MAERSK PLACENTIA
DeploymentStatus :
good
BuoyTypeManufacturer :
Technocean
BuoyTypeSensorArray :
SVP
CurrentProgram :
9046
PurchaserFunding :
United States
SensorUpgrade :
Transmissions :
United States
DeployingCountry :
United States
DeploymentComments :
ManufactureYear :
NaN
ManufactureMonth :
NaN
ManufactureSensorType :
SVP
ManufactureVoltage :
56 volts
FloatDiameter :
38 cm
SubsfcFloatPresence :
0
DrogueType :
HOLY098
DrogueLength :
5.5 m
DrogueBallast :
2 kg
DragAreaAboveDrogue :
16.1 m^2
DragAreaOfDrogue :
712.8 m^2
DragAreaRatio :
44.2733
DrogueCenterDepth :
15 m
DrogueDetectSensor :
submergence
acknowledgement :
Elipot et al. (2016), Elipot et al. (2021) to be submitted. Global Drifter Program quality-controlled hourly interpolated data from ocean surface drifting buoys, version 2.00. NOAA National Centers for Environmental Information. https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2016JC011716TBA.
history :
Version 2.00. Metadata from dirall.dat and deplog.dat
interpolation_method :
imei :
"
+ " imei:
"
],
"text/plain": [
"\n",
@@ -992,7 +993,7 @@
"source": [
"## Contiguous Ragged Array\n",
"\n",
- "In the GDP dataset, the number of observations varies from `len(['obs'])=13` to `len(['obs'])=66417`. As such, it seems inefficient to create bidimensional datastructure `['traj', 'obs']`, commonly used by Lagrangian numerical simulation tools such as [Ocean Parcels](https://oceanenDrift](https://opendrift.github.io/) and [OpenDrift](https://opendrift.github.io/) that tend to generate trajectories of equal or similar lengths.\n",
+     "In the GDP dataset, the number of observations varies from `len(['obs'])=13` to `len(['obs'])=66417`. As such, it seems inefficient to create a two-dimensional data structure `['traj', 'obs']`, commonly used by Lagrangian numerical simulation tools such as [Ocean Parcels](https://oceanparcels.org/) and [OpenDrift](https://opendrift.github.io/) that tend to generate trajectories of equal or similar lengths.\n",
"\n",
"Here, we propose to combine the data from the individual netCDFs files into a [*contiguous ragged array*](https://cfconventions.org/cf-conventions/cf-conventions.html#_contiguous_ragged_array_representation) eventually written in a single NetCDF file in order to simplify data distribution, decrease metadata redundancies, and efficiently store a Lagrangian data collection of uneven lengths. The aggregation process (conducted with the `create_ragged_array` function found in the module `preprocess.py`) also converts to variables some of the metadata originally stored as attributes in the individual NetCDFs. The final structure contains 21 variables with dimension `['obs']` and 38 variables with dimension `['traj']`."
]
@@ -1387,7 +1388,7 @@
" location_type (traj) bool False False False ... True True True\n",
" WMO (traj) int32 4400509 1600536 ... 4601712 4601740\n",
" expno (traj) int32 9046 9435 7325 ... 21312 21312 21312\n",
- " deploy_date (traj) datetime64[ns] 2001-05-01 2001-01-11 ... NaT\n",
+ " deploy_date (traj) datetime64[ns] 2001-05-01 ... 1970-01-01\n",
" deploy_lon (traj) float32 -52.17 71.24 -97.16 ... -151.0 -143.4\n",
" ... ...\n",
" err_sst (obs) float32 ...\n",
@@ -1400,7 +1401,7 @@
" title: Global Drifter Program hourly drifting buoy collection\n",
" history: Version 2.00. Metadata from dirall.dat and deplog.dat\n",
" Conventions: CF-1.6\n",
- " date_created: 2022-04-14T23:14:58.694974\n",
+ " date_created: 2022-04-15T15:08:31.898904\n",
" publisher_name: GDP Drifter DAC\n",
" publisher_email: aoml.dftr@noaa.gov\n",
" ... ...\n",
@@ -1409,31 +1410,31 @@
" contributor_role: Data Acquisition Center\n",
" institution: NOAA Atlantic Oceanographic and Meteorological Laboratory\n",
" acknowledgement: Elipot et al. (2022) to be submitted. Elipot et al. (2...\n",
- " summary: Global Drifter Program hourly data
Estimated near-surface sea water temperature from drifting buoy measurements. It is the sum of the fitted near-surface non-diurnal sea water temperature and fitted diurnal sea water temperature anomaly. Discrepancies may occur because of rounding.
[4786301 values with dtype=float32]
sst1
(obs)
float32
...
long_name :
Fitted non-diurnal sea water temperature
units :
Kelvin
comments :
Estimated near-surface non-diurnal sea water temperature from drifting buoy measurements
[4786301 values with dtype=float32]
sst2
(obs)
float32
...
long_name :
Fitted diurnal sea water temperature anomaly
units :
Kelvin
comments :
Estimated near-surface diurnal sea water temperature anomaly from drifting buoy measurements
[4786301 values with dtype=float32]
err_sst
(obs)
float32
...
long_name :
Standard uncertainty of fitted sea water temperature
units :
Kelvin
comments :
Estimated one standard error of near-surface sea water temperature estimate from drifting buoy measurements
[4786301 values with dtype=float32]
err_sst1
(obs)
float32
...
long_name :
Standard uncertainty of fitted non-diurnal sea water temperature
units :
Kelvin
comments :
Estimated one standard error of near-surface non-diurnal sea water temperature estimate from drifting buoy measurements
[4786301 values with dtype=float32]
err_sst2
(obs)
float32
...
long_name :
Standard uncertainty of fitted diurnal sea water temperature anomaly
units :
Kelvin
comments :
Estimated one standard error of near-surface diurnal sea water temperature anomaly estimate from drifting buoy measurements
Global Drifter Program hourly drifting buoy collection
history :
Version 2.00. Metadata from dirall.dat and deplog.dat
Conventions :
CF-1.6
date_created :
2022-04-14T23:14:58.694974
publisher_name :
GDP Drifter DAC
publisher_email :
aoml.dftr@noaa.gov
publisher_url :
https://www.aoml.noaa.gov/phod/gdp
licence :
MIT License
processing_level :
Level 2 QC by GDP drifter DAC
metadata_link :
https://www.aoml.noaa.gov/phod/dac/dirall.html
contributor_name :
NOAA Global Drifter Program
contributor_role :
Data Acquisition Center
institution :
NOAA Atlantic Oceanographic and Meteorological Laboratory
acknowledgement :
Elipot et al. (2022) to be submitted. Elipot et al. (2016). Global Drifter Program quality-controlled hourly interpolated data from ocean surface drifting buoys, version 2.00. NOAA National Centers for Environmental Information. https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2016JC011716TBA. Accessed [date].
Estimated near-surface sea water temperature from drifting buoy measurements. It is the sum of the fitted near-surface non-diurnal sea water temperature and fitted diurnal sea water temperature anomaly. Discrepancies may occur because of rounding.
[4786301 values with dtype=float32]
sst1
(obs)
float32
...
long_name :
Fitted non-diurnal sea water temperature
units :
Kelvin
comments :
Estimated near-surface non-diurnal sea water temperature from drifting buoy measurements
[4786301 values with dtype=float32]
sst2
(obs)
float32
...
long_name :
Fitted diurnal sea water temperature anomaly
units :
Kelvin
comments :
Estimated near-surface diurnal sea water temperature anomaly from drifting buoy measurements
[4786301 values with dtype=float32]
err_sst
(obs)
float32
...
long_name :
Standard uncertainty of fitted sea water temperature
units :
Kelvin
comments :
Estimated one standard error of near-surface sea water temperature estimate from drifting buoy measurements
[4786301 values with dtype=float32]
err_sst1
(obs)
float32
...
long_name :
Standard uncertainty of fitted non-diurnal sea water temperature
units :
Kelvin
comments :
Estimated one standard error of near-surface non-diurnal sea water temperature estimate from drifting buoy measurements
[4786301 values with dtype=float32]
err_sst2
(obs)
float32
...
long_name :
Standard uncertainty of fitted diurnal sea water temperature anomaly
units :
Kelvin
comments :
Estimated one standard error of near-surface diurnal sea water temperature anomaly estimate from drifting buoy measurements
Global Drifter Program hourly drifting buoy collection
history :
Version 2.00. Metadata from dirall.dat and deplog.dat
Conventions :
CF-1.6
date_created :
2022-04-15T15:08:31.898904
publisher_name :
GDP Drifter DAC
publisher_email :
aoml.dftr@noaa.gov
publisher_url :
https://www.aoml.noaa.gov/phod/gdp
licence :
MIT License
processing_level :
Level 2 QC by GDP drifter DAC
metadata_link :
https://www.aoml.noaa.gov/phod/dac/dirall.html
contributor_name :
NOAA Global Drifter Program
contributor_role :
Data Acquisition Center
institution :
NOAA Atlantic Oceanographic and Meteorological Laboratory
acknowledgement :
Elipot et al. (2022) to be submitted. Elipot et al. (2016). Global Drifter Program quality-controlled hourly interpolated data from ocean surface drifting buoys, version 2.00. NOAA National Centers for Environmental Information. https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2016JC011716TBA. Accessed [date].
summary :
Global Drifter Program hourly data
"
],
"text/plain": [
"\n",
@@ -1463,7 +1464,7 @@
" title: Global Drifter Program hourly drifting buoy collection\n",
" history: Version 2.00. Metadata from dirall.dat and deplog.dat\n",
" Conventions: CF-1.6\n",
- " date_created: 2022-04-14T23:14:58.694974\n",
+ " date_created: 2022-04-15T15:08:31.898904\n",
" publisher_name: GDP Drifter DAC\n",
" publisher_email: aoml.dftr@noaa.gov\n",
" ... ...\n",
@@ -1859,7 +1860,7 @@
"Dimensions without coordinates: traj\n",
"Attributes:\n",
" long_name: Number of observations per trajectory\n",
- " units: -
"
],
"text/plain": [
"\n",
@@ -1904,7 +1905,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "In the second and next step of the notebook, we benchmark different data science python libraries. For this, the following sections present three typical Lagrangian tasks which are conducted successively using *xarray*, *Pandas*, and finally *Awkward Arrays*."
+     "In the second step of the notebook, we benchmark different data science *Python* libraries. For this, the following sections present three typical Lagrangian tasks, which are conducted successively using *xarray*, *Pandas*, and finally *Awkward Array*."
]
},
{
@@ -2319,7 +2320,7 @@
" title: Global Drifter Program hourly drifting buoy collection\n",
" history: Version 2.00. Metadata from dirall.dat and deplog.dat\n",
" Conventions: CF-1.6\n",
- " date_created: 2022-04-14T23:14:58.694974\n",
+ " date_created: 2022-04-15T15:08:31.898904\n",
" publisher_name: GDP Drifter DAC\n",
" publisher_email: aoml.dftr@noaa.gov\n",
" ... ...\n",
@@ -2328,7 +2329,7 @@
" contributor_role: Data Acquisition Center\n",
" institution: NOAA Atlantic Oceanographic and Meteorological Laboratory\n",
" acknowledgement: Elipot et al. (2022) to be submitted. Elipot et al. (2...\n",
- " summary: Global Drifter Program hourly data
xarray.Dataset
traj: 500
obs: 4786301
ID
(traj)
int64
dask.array<chunksize=(500,), meta=np.ndarray>
long_name :
Global Drifter Program Buoy ID
units :
-
\n",
+ " summary: Global Drifter Program hourly data