More Mature Datamodels #88

DropD · 2025-01-14T15:42:23Z

Problem

The typing in pydantic models created from user config obviously allows for configs to exist which can not successfully be turned into a workflow. In addition, None is often used as a default value, when there seems to be a deeper meaning to "information not provided in the yaml file". Some of the downsides of this are:

Looking at the models it is not immediately clear which information is required, has to come from the user, can be read from elsewhere
Looking at the models it is not immediately clear what it means if certain values are (apparently) legitimately None. For example, does period=None, start_date != end_date != None mean "run at start date" or "never run" or "run at start and end date"?
Any code that operates on instances of models has to check for a whole load of None values if not more possible types for static analysis to be able to confirm it's not doing anything wrong, while many of those checks are not necessary, because validators have already taken care of it.
It is very hard to infer from the code what a meaningful unit test should be.

All of these things block us from making use of unit tests and static type analysis, which are major factors to keep up velocity in the mid-to-long term.

Proposed solution:

I propose to solve the problem incrementally by doing the following for each of the data models:

write a doctest example which demonstrates how to obtain the class from yaml
write more extensive unit tests for the desired state after all validators have run
add unit tests for how they are used in workflow.Workflow creation
ask the hard questions about whether None or an empty container is meaningful in attributes and if yes, what the meaning is.
insert more meaningful sentinel values (enums, types) and additional validators as required
insert additional "finalized" data models and canonicalization functionality if validators are not enough
test everything that was added
switch workflow.Workflow building to the new canonicalized types and adapt the unit tests

If at any point any of this requires changes to the existing integration tests, everyone should agree that it is necessary.

The text was updated successfully, but these errors were encountered:

This was referenced Jan 14, 2025

Tests and canonicalization for ConfigWorkflow #89

Merged

Example canonicalization for ConfigWorkflow #90

Merged

agoscinski added this to the Testing components (parser, core, workgraph) milestone Jan 20, 2025

leclairm added enhancement New feature or request refactor labels Jan 23, 2025

This was referenced Jan 24, 2025

Separate namelist validation inConfigIconTask to dedicated namelist class #84

Open

Icon namelist non optional #109

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More Mature Datamodels #88

More Mature Datamodels #88

DropD commented Jan 14, 2025

More Mature Datamodels #88

More Mature Datamodels #88

Comments

DropD commented Jan 14, 2025

Problem

Proposed solution: