Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grid Dimension Ordering #46

Closed
bet20ICL opened this issue May 31, 2024 · 6 comments · Fixed by #52
Closed

Grid Dimension Ordering #46

bet20ICL opened this issue May 31, 2024 · 6 comments · Fixed by #52
Assignees
Labels
documentation Improvements or additions to documentation

Comments

@bet20ICL
Copy link
Collaborator

Hi, I ran into some issues using create_mesh.py for my dataset. I'm wondering if there might be a bug in the documentation for the code?

The code comments consistently say that a sample has dimensions (N_t', dim_x, dim_y, d_features').

full_sample = torch.tensor(
np.load(sample_path), dtype=torch.float32
) # (N_t', dim_x, dim_y, d_features')

Similarly, nwp_xy.npy, the coordinates of each grid point also has dimensions in the same order (2, N_x, N_y).

# -- Static grid node features --
grid_xy = torch.tensor(
np.load(os.path.join(static_dir_path, "nwp_xy.npy"))
) # (2, N_x, N_y)

Now, these lines in create_mesh.py imply that the first slice of nwp_xy.npy hold the x coordinates, and vice versa.

neural-lam/create_mesh.py

Lines 110 to 112 in 9d558d1

def mk_2d_graph(xy, nx, ny):
xm, xM = np.amin(xy[0][0, :]), np.amax(xy[0][0, :])
ym, yM = np.amin(xy[1][:, 0]), np.amax(xy[1][:, 0])

I now refer to nwp_xy.npy as grid_xy.
Therefore, given the dimension ordering (2, N_x, N_y), I would reason that grid_xy[0][0] is a length N_y vector holding the x coordinates of vertical slice of the dataset. I would expect grid_xy[0][0] to have the same x coordinate. However, I see that isn't the case.

grid_xy = np.load("data/meps_example/static/nwp_xy.npy")
print(grid_xy.shape)
print(grid_xy[0][0].shape)
print(grid_xy[0][0])

>>>
(2, 268, 238)
(238,)
array([-1.0601221e+06, -1.0501221e+06, -1.0401222e+06, -1.0301222e+06,
...

Trying to create a mesh with my own dataset with the ordering (2, N_x, N_y) gives me a buggy graph. However, when changing the ordering to (2, N_y, N_x), the graph seems fine.

Is this an issue with the code comments? Should the correct dimension ordering be N_y, N_x in all cases instead of N_x, N_y?

@joeloskarsson
Copy link
Collaborator

Hi, thanks for pointing this out and pointing to the different places in the code where it is used. This is indeed inconsistent and unclear. I think that when I wrote WeatherDataset and create_grid_features.py I did not put much thought into what dimension to name x or y. The result of this is that if we follow:

dim_x = 268
dim_y = 238
N_grid = 268x238 = 63784

then xy = np.load(nwp_xy.npy) is a (2, dim_x, dim_y) array with xy[0] containing y-coordinates and xy[0] containing x-coordinates. That does not make much sense.

I think that this also goes against the actual coordinate system used in MEPS. Will check with Tomas that this

dim_x = 268
dim_y = 238
N_grid = 268x238 = 63784

should actually be

dim_y = 268
dim_x = 238
N_grid = 268x238 = 63784

I also think @sadamov had the same issue at some point? I remember the x/y ordering causing issues when you made your first graph?

This documentation issue should be easy to fix by changing all the comments in the files. In the long run, I hope that these kinds of issues will be avoided as we are moving to xarray-based input data handling.

@joeloskarsson joeloskarsson added the documentation Improvements or additions to documentation label Jun 1, 2024
@TomasLandelius
Copy link

Yes,

dim_y = 268
dim_x = 238
N_grid = 268x238 = 63784

is the correct description.

@joeloskarsson
Copy link
Collaborator

Great! I'll take on the task to update these comments to the correct ordering. I'll try to go through and check if there are additional places (maybe in the plotting?) where this is also inconsistent.

@bet20ICL would you be up for reviewing that pull request? Since you noted this and have an understanding for the problem already. I can add you to the organization so you are able to.

@joeloskarsson joeloskarsson self-assigned this Jun 3, 2024
@bet20ICL
Copy link
Collaborator Author

bet20ICL commented Jun 3, 2024

Yes I am happy to!

@leifdenby
Copy link
Member

@bet20ICL I just want to say welcome too! And thanks for your contribution :)

I don't know if you are aware of it, but there is a little community that has formed around the neural-lam codebase and we have monthly calls discussing our progress. You would be very welcome to join us if you would like to. It would be really cool to hear what you've been doing with neural-lam. You can find more details in our "development plan" google doc: http://bit.ly/mllam-plan, you can find the details of the next meeting there too.

@bet20ICL
Copy link
Collaborator Author

bet20ICL commented Jun 4, 2024

Thanks so much for the invite @leifdenby! I will definitely stop by to have a chat.

joeloskarsson added a commit that referenced this issue Jun 4, 2024
…S data (#52)

The x- and y-dimensions for the MEPS data are swapped in comments
describing tensor shapes, and also in some variable names. This change
swaps from (x, y) ordering to the correct (y, x) ordering.

This fixes #46. See the issue for a more clear description.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants