Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pack vector components into higher-rank vector fields. #218

Merged
merged 20 commits into from
Sep 16, 2024

Conversation

odlomax
Copy link
Contributor

@odlomax odlomax commented Aug 9, 2024

This is a utility that packs vector fields components, stored in individual fields, into single vector fields.

For example, given rank-2 fields "eastward wind" and "northward wind", it can be configured to pack them into a rank-3 field "wind", where original fields populate the "variables" dimension. There is also an unpacking operation to return them to their original arrangement.

The code would look this:

\\ vector field components from somewhere
const auto componentFieldSet = ...;

\\ configuration for packing class (see below)
const auto fieldPackingConf = ...;
const auto fieldPacker = util::PackVectorFields(fieldPackingConf);

\\ components are packed into vector field
const auto packedFields = fieldPacker.pack(componentFieldSet);

\\ unpack to component fields.
const auto unpackedFields = fieldPacker.unpack(packedFields);

\\ OR if we want to unpack them into the original fieldset to preserve the order.
unpackFields(packedFields, componentFieldSet);

An example of fieldPackingConf could be something like this (in yaml format):

   vector fields:
   - name: wind
     components:
     - eastward wind
     - northward wind~~

Edit: After a bit of thinking, I've massively simplified how this works. In this example, the fields "eastward wind" and "northward wind" would need to both have the metadata {"vector field name", "wind"} added. After that, the pack_vector_fields and unpack_vector_fields methods can figure the rest out. The code would look like this:

\\ vector field components and scalar fields from somewhere
\\ vector fields have "vector field name" metadata set
auto componentFieldSet = ...;

\\ components are packed into vector field
const auto packedFields = util::pack_vector_fields(componentFieldSet);

\\ unpack to component fields
const auto unpackedFields = util::unpack_vector_fields(packedFields);

\\ OR if we want to unpack them into the original fieldset to preserve the order
util::unpack_vector_fields(packedFields, componentFieldSet);

I believe the method I've implemented is reasonably generic. The main limitation is that it require fields to have the standard [horizontal, levels, variables] shape, where the levels dimension is optional.

This PR addresses part of issue #217

@odlomax
Copy link
Contributor Author

odlomax commented Aug 12, 2024

Apologies if you have already started looking at this, @wdeconinck. I realised that the configuration element of the code was more complicated than it needed to be. I've updated the branch to simplify things.

Copy link
Contributor

@twsearle twsearle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change looks good. I understand it is mostly just used internal to atlas at the moment, but it looks like it would also be possible to use this in a project that is calling the atlas library?

One of the more exotic bits of the ocean quality control and DA makes use of standard errors on two different timescales. These are used separately for some checks, and then combined to produce a total error for other checks. Combining errors is done in quadrature, so I am imagining it would be simple enough to mark these fields as a vector field so that they can be a single object.

We are still trying to find a nice way to represent this situation within the JEDI framework /ufo. at the moment we use a sort of hack because we can't associate two or more error fields with a single model variable.

I have tested this change on Met Office infrastructure and the new tests pass on our systems, including the EXZ.

@odlomax
Copy link
Contributor Author

odlomax commented Aug 12, 2024

I think this change looks good. I understand it is mostly just used internal to atlas at the moment, but it looks like it would also be possible to use this in a project that is calling the atlas library?

One of the more exotic bits of the ocean quality control and DA makes use of standard errors on two different timescales. These are used separately for some checks, and then combined to produce a total error for other checks. Combining errors is done in quadrature, so I am imagining it would be simple enough to mark these fields as a vector field so that they can be a single object.

We are still trying to find a nice way to represent this situation within the JEDI framework /ufo. at the moment we use a sort of hack because we can't associate two or more error fields with a single model variable.

I have tested this change on Met Office infrastructure and the new tests pass on our systems, including the EXZ.

Thanks for having a look, @twsearle!

As you can see, after an overly complex false start, this boiled to into two straightforward non-class functions. I was thinking of using them to help with interpolation, but I'm thrilled you've found another use-case!

@twsearle
Copy link
Contributor

Thanks for having a look, @twsearle!

As you can see, after an overly complex false start, this boiled to into two straightforward non-class functions. I was thinking of using them to help with interpolation, but I'm thrilled you've found another use-case!

No worries I had some time this afternoon to review and I was subscribed to the change! I think I am a ways away from making use of something like this (I think it would involve fields being in their packed state to begin with so that only one oops "model variable name" is associated with the error on that model). It is something to think about though! I think it has the potential to make things significantly less ugly in our configurations.

Copy link
Member

@wdeconinck wdeconinck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All really good!
The only change I would request is to use underscores instead of spaces in the configuration options.
This would potentially allow e.g. python API to use option names as keyword arguments (kwargs).

@odlomax
Copy link
Contributor Author

odlomax commented Sep 16, 2024

All really good! The only change I would request is to use underscores instead of spaces in the configuration options. This would potentially allow e.g. python API to use option names as keyword arguments (kwargs).

Thanks, @wdeconinck. Makes perfect sense. I've done as you've suggested.

@wdeconinck wdeconinck merged commit e2974ad into ecmwf:develop Sep 16, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants