Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat[cartesian]: read-only data dims direct access & Fields #1451

Merged

Conversation

FlorianDeconinck
Copy link
Contributor

@FlorianDeconinck FlorianDeconinck commented Feb 12, 2024

Description

We introduce a ROField which is a data dimensions only Field. Access is provided via .at[] which also becomes a way to access data dimensions on regular Field.

Technically, we re-use the data dims concept, empty out the cartesian axes and make sure we have protection in place against write to the ROField. Definition follows the already established data dimensions format for now: [(type, (axes...))]

Requirements

  • All fixes and/or new features come with corresponding tests.
  • Important design decisions have been documented in the approriate ADR inside the docs/development/ADRs/ folder.

`.at` operator to index globally
utest
Better error message.
@havogt
Copy link
Contributor

havogt commented Feb 12, 2024

Just a quick feedback (didn't look close): couldn't that be implemented with just a a normal field with no I,J or K dimension? Since our iteration space is always 3D, writing to a pure I field would also result in multiple writes from different threads to the same memory location, therefore the limit of no dimension is not really different.

@FlorianDeconinck
Copy link
Contributor Author

Technically the ROField is exactly that: a no spatial dim, only data dim Field. I made a different FieldDescriptor to not have spaghetti code and be more verbose user side with those kind of field which are not on the grid

@FlorianDeconinck
Copy link
Contributor Author

E.g.

class _ReadOnlyFieldDescriptorMaker(_FieldDescriptorMaker):
    def __getitem__(self, field_spec):
        if not isinstance(field_spec, collections.abc.Collection) and not len(field_spec) == 2:
            raise ValueError("ROField is defined by a tuple (type, [axes_size..])")

        dtype, data_dims = field_spec

        return _FieldDescriptor(dtype, [], data_dims)

@FlorianDeconinck
Copy link
Contributor Author

NB: One side effect of the way I coded up .at is that you can always access data dimension that way for read operation, including on a normal Field with spatial dimensions (you get centered spatial dim)

@havogt
Copy link
Contributor

havogt commented Feb 12, 2024

I meant read-only fields and fields with no spatial dimensions are probably both useful features, but they are orthogonal. I’ll have a closer look tomorrow.

@FlorianDeconinck
Copy link
Contributor Author

Yes, I did think about that. And technically the read-only Field is enforced by the fact you can only do .at for access which is then blocked from write at the gtscript level. So the Field itself is not RO, which is confusing for devs.

At first I wrote it as a "DataField".

Let's exchange tomorrow at community meeting

Make utest more readable
@FlorianDeconinck
Copy link
Contributor Author

I renamed ROField to OffgridField.

I looked into using Field as-is, but Field by default have a IJK spatial axes which is used to be able to write Field[np.float64] and have it default to a 3d field.

I didn't come up with an alternate way to write the definition that was clean enough. So out of abundance of caution around the feature, I opted again for the OffgridField which forces no spatial dimensions

Copy link
Contributor

@havogt havogt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'll discuss with @egparedes when he is back from vacation.

# GTScript builtins: variable annotations
Field = _FieldDescriptorMaker()
"""Field descriptor."""

OffgridField = _ReadOnlyFieldDescriptorMaker()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally like ConstantField (constant with respect to the grid), but I think not everyone likes it for confusion with non-mutable. Here it both meanings coincide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConstantField would be confusing for users I think. Though, indeed, the system make it constant in stencils, the Field is still modifiable outside of gtscript

src/gt4py/cartesian/frontend/gtscript_frontend.py Outdated Show resolved Hide resolved
@FlorianDeconinck FlorianDeconinck changed the title feat[cartesian]: read-only absolute indexed field feat[cartesian]: read-only data dims direct access & Fields Feb 14, 2024
@egparedes
Copy link
Contributor

egparedes commented Feb 23, 2024

I fully agree that the two features added in this PR make sense: the shortcut to access data dimensions in the current point instead of the two pairs of brackets, and a way to pass ndarrays of data which are not linked to the grid, however I disagree with the names.

Since the content of the new OffgridField concept is not linked to any position in the grid, it's not at all a physical field (https://en.wikipedia.org/wiki/Field_(physics)) and in my opinion should be just called Array, NdArray, Tensor or something along this lines. Additionally, an Array object uses absolute indexing (as any other collection instance in Python) while fields interpret indices as offsets relative to the current iteration point in the grid (numba's stencil decorator offers a similar feature by tagging a parameter as standard_indexing: https://numba.readthedocs.io/en/stable/user/stencil.html#standard-indexing). The difference with fields with data dimensions is that they are still fields, since there is a tensor quantity associated to every point in the grid.

Regarding the new .at data dimensions accessor, I would use something else since in other frameworks like jax, .at has a different meaning (https://jax.readthedocs.io/en/latest/_autosummary/jax.numpy.ndarray.at.html) and we are considering to add something similar in gt4py.next which would make the two different uses of .at in gt4py.cartesian very confusing. Other options I see could be .data , .local, .current, .this, .point, ... or single letter shortcuts in the spirit of numpy's .T like .A (array) or .D (data).

@FlorianDeconinck
Copy link
Contributor Author

Two very good points.

The generic Array or NdArray could be confusing for users as a generic Array that transfers to any other framework (same for ndarray). I reckon we'd rather limit that concept in scope. What about GlobalMemory ? Not the greatest but hopefully you see my point.

As for the re-use of at if you are planning on shadowing jax at then of course we need something else. A single letter shortcut seems like a way to go. A for Array or Access is fine.

@egparedes
Copy link
Contributor

The generic Array or NdArray could be confusing for users as a generic Array that transfers to any other framework (same for ndarray). I reckon we'd rather limit that concept in scope. What about GlobalMemory ?

Yes, I see your point and it makes sense. What about GlobalParameter or GlobalTable?

@FlorianDeconinck
Copy link
Contributor Author

GlobalTable, I like that. It conveys both the array-ness and a form of read-only.

@egparedes
Copy link
Contributor

From my side at least, you can go ahead and change the names of the two features, and I think the PR would already be ready to be merged.

@FlorianDeconinck
Copy link
Contributor Author

Done. GlobalTable and .A for access

@FlorianDeconinck
Copy link
Contributor Author

Ready to be merged.

Copy link
Contributor

@egparedes egparedes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me.

Copy link
Contributor

@egparedes egparedes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the PR to the new formatting tools (it should be very easy)

@FlorianDeconinck
Copy link
Contributor Author

cscs-ci run

@havogt
Copy link
Contributor

havogt commented Feb 29, 2024

cscs-ci run

Copy link
Contributor

@egparedes egparedes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ILGTM

@FlorianDeconinck FlorianDeconinck merged commit 0151b05 into GridTools:main Mar 4, 2024
31 checks passed
@FlorianDeconinck FlorianDeconinck deleted the cartesian/global_field_index branch March 4, 2024 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants