code refactor #31

matt-long · 2020-03-07T03:12:54Z

I am thinking about my OMWG talk...so this list is coming out of me right now. I think my refactor of the nutrient plotting tries to move towards some of these principles.

Principles

Rely on APIs for data access

Data access based on hard-coded paths is fragile
Sometimes, the relationship between the data and the actual desired product involves computation
- For instance, pop_tools.get_grid(...) reads CESM INPUTDATA files via web protocol and
  does the same computations as the model to derive the grid variables. Super portable. Efficient because of Numba!
Access details are messy
- Simulation output is spread over an arbitrary number of files (time levels, variables, ensemble
  members); this layout does not correspond to a conceptually meaningful discretization
- Many file formats, temporal frequencies, spatial resolutions: APIs enable standardization steps to be applied en route
An API can be parameterized
- Flexibility enables more reusable components
- Build code to perform operations over the key dimensions of query results
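
To make the "parameterized API" idea concrete, here is a minimal sketch using only the standard library. Everything in it is hypothetical: the `CATALOG` records, the file names, and the `search` and `group_paths` helpers are made up for illustration and are not part of any existing package. The point is only that the caller asks for "NO3, grouped by member" and never touches a path directly.

```python
from collections import defaultdict

# Hypothetical catalog: one record per file, described by the attributes
# a user actually cares about (file names are invented for this sketch).
CATALOG = [
    {"variable": "NO3", "member": 1, "path": "no3.001.1990-1999.nc"},
    {"variable": "NO3", "member": 1, "path": "no3.001.2000-2009.nc"},
    {"variable": "NO3", "member": 2, "path": "no3.002.1990-1999.nc"},
    {"variable": "PO4", "member": 1, "path": "po4.001.1990-1999.nc"},
]


def search(catalog, **query):
    """Return the records whose attributes match every key/value in the query."""
    return [rec for rec in catalog if all(rec.get(k) == v for k, v in query.items())]


def group_paths(records, key):
    """Group file paths by one key dimension, e.g. ensemble member."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["path"])
    return dict(groups)


# The query is the only thing that changes between use cases.
no3_by_member = group_paths(search(CATALOG, variable="NO3"), key="member")
```

Downstream code can then loop over `no3_by_member` and apply the same operation to each group, regardless of how many files each group happens to contain.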

Consume and produce `xarray` datasets

xarray.Dataset objects encapsulate data and metadata
I/O and cloud storage formats are well supported
Operators enable rapid dimension reduction, interpolation, resampling, etc.
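
A small sketch of what those operators look like in practice. The dataset here is synthetic: the variable name `NO3`, the `z_t` coordinate values, and the units string are placeholders chosen for illustration, not real model output.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic dataset: 24 monthly time levels by 3 depth levels.
time = pd.date_range("2000-01-01", periods=24, freq="MS")
depth = [5.0, 15.0, 25.0]
values = np.arange(24 * 3, dtype=float).reshape(24, 3)

ds = xr.Dataset(
    {"NO3": (("time", "z_t"), values, {"units": "mmol/m^3"})},
    coords={"time": time, "z_t": depth},
)

# Dimension reduction: average over time, keeping the metadata.
time_mean = ds.mean("time", keep_attrs=True)

# Resampling: monthly values to annual means.
annual = ds.resample(time="YS").mean()

# Label-based selection: the level at z_t == 5.0.
surface = ds.sel(z_t=5.0)
```

Because every function receives and returns a `Dataset`, coordinates and attributes travel with the data and steps like these can be chained or reordered freely.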

Plotting code should not do "computation"

Computation should be clearly separate from visualization
The data behind plots should be cached
If the data to make plots is available, we can build web-visualization maps around it
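
One possible shape for this separation, sketched with the standard library only. Plain lists and a JSON file stand in for xarray datasets and a real cache format, and all function names (`compute_zonal_mean`, `cached`, `plot_profile`) are hypothetical. The plotting function only renders the numbers it is handed.

```python
import json
import tempfile
from pathlib import Path


def compute_zonal_mean(field):
    """Computation step: average each row (latitude band) of a 2-D list."""
    return [sum(row) / len(row) for row in field]


def cached(cache_file, compute, *args):
    """Return the cached result if present; otherwise compute and store it."""
    cache_file = Path(cache_file)
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    result = compute(*args)
    cache_file.write_text(json.dumps(result))
    return result


def plot_profile(values):
    """Visualization step: only renders the numbers it is given (text bars here)."""
    return "\n".join("#" * round(v) for v in values)


field = [[1.0, 3.0], [2.0, 6.0], [4.0, 4.0]]
cache_dir = Path(tempfile.mkdtemp())
zonal_mean = cached(cache_dir / "zonal_mean.json", compute_zonal_mean, field)
figure = plot_profile(zonal_mean)
```

With this layout the cached file is the product; any other front end (a notebook, a web page) can read it without rerunning the computation.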

Isolate dependencies on glade

Data behind an authentication layer precludes reproducibility

To be continued....

matt-long · 2020-03-07T03:20:10Z

How many local modules do we need?
