Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

row/column major considerations #1553

Closed
germasch opened this issue Jun 27, 2019 · 10 comments
Closed

row/column major considerations #1553

germasch opened this issue Jun 27, 2019 · 10 comments

Comments

@germasch
Copy link
Contributor

As @pnorbert seems to be working on BP4s on-disk representation (and dumping it), here's a wishlist item that I think would be worth considering:

Currently, ADIOS2 makes the decisions based on row vs col-major layout based on which API is invoked (C++/C: row-major, Fortran: col-major). There are some considerations with that, though. E.g., some (many?) projects switching parts of their code/kernels to C/C++/CUDA, and as that transition continues, they may want to be able to write data in a single app from both languages. e.g., they might want to write Fortran-layout data using the C++ API.

Also, C++'s data layout isn't natively row-major, I'd argue, it's mostly natively non-existant (except for compile-time fixed dimension arrays). Many of the libraries that do multi-d arrays (boost, xtensor, Kokkos) support both layouts, as does Python. So even in a native C++ application, one might use and want to write col-major data. Some apps may be using a mix of row-major and col-major layouts for performance reasons.

So I think there's some work that could be done on the API side, that lets the application specify the data layout they want to use for a given variable. But that's something that can be added over time relatively easily. If supporting mixed layouts is an eventual goal, though, the output format needs to be able to handle it, too, though. I believe BP3 keeps a global layout flag (well, maybe per PG or something), so I think it may not be able to handle mixed layouts well without some kind of compatibility issues, and that's fine. But for BP4, it might be useful to consider this case now, while the binary format is still in flux. It'd probably be easy to add a row-major / col-major flag per Variable.

[One could also consider a more flexible approach that allows for a general strided approach, e.g., for the case that some array dimensions used in an application has been padded for alignment.]

@williamfgc
Copy link
Contributor

williamfgc commented Jun 28, 2019

@germasch thanks for bringing this up. In adios there is no real concept of row-major/column-major as it's all indexed contiguous memory. We reuse these familiar concepts to add another degree of freedom to the Shape, Start, Count per Variable dimension. Basically, adios needs to understand which one is the fastest (and the slowest) dimension index.

ADIOS has today two requirements:

  • We assume the ordering of the Variable dimensions is natural to the language in use and the user is responsible for swapping those accordingly (which somehow they have to do it way before they use ADIOS2, if let's say going from Fortran to CUDA, see bottom of this message).

  • bpls being a C++ reader always presents the data in "row-major" order, basically the last dimension is the fastest/contiguous one, regardless.

The mentioned libraries allocate data using multidimensional array abstractions because they have to produce the memory. ADIOS is a consumer of this allocated memory at both write and read, hence we are not stopping apps to describe their fast(f) and slow(s) dimensions:

In C++:
Variable 1: (s,...,f) is Row Major -> to ADIOS C++ bindings as (s,...,f)
If variable 2 is created by a library that supports column-major indexing:
Variable 2: (f,...,s) is Column Major -> to ADIOS C++ bindings as (s,...,f)

In Fortran:
Variable 1: (f, ..., s) is Column Major -> CUDA and ADIOS C++ bindings as (s,...,f)

@berkgeveci
Copy link

I would like to add my 2 cents from the visualization perspective (as a VTK / ParaView developer). It would be nice to have some kind of API in ADIOS to identify if ADIOS swapped the order of Shape, Start and Count (when reading from C++ API data coming from Fortran API). This is because the positions in the arrays often have spatial importance (x, y & z for example). Without this kind of meta-data, the visualization tool using a generic reader (without specific knowledge of the data source) would end up loading in a transformed way (z, y, & x) confusing the end user. Please note that all would be necessary is access to the appropriate meta-data nothing more.

@williamfgc
Copy link
Contributor

@berkgeveci thanks for the input, point taken. You're correct as it's a simple accessor to a flag in the current deserializer. I went through the same in my own VTK development for image data and unstructured grid support, see merge request for vti and vtu branch I remember discussing this with @chuckatkins very briefly a while ago, it's running on our CI right now. Thanks.

@germasch
Copy link
Contributor Author

@williamfgc, I think @berkgeveci understood what I was getting it, which I may not have managed ot make entirely clear. I understand how the current model works (you may remember that I added support for SetMemorySelection on read.)

We assume the ordering of the Variable dimensions is natural to the language in use and the user is responsible for swapping those accordingly (which somehow they have to do it way before they use ADIOS2, if let's say going from Fortran to CUDA, see bottom of this message).

What ADIOS2 does today is already better(IMO) than what HDF5 does. HDF5 always assumes row-major layout, so if one uses col major, one has to reverse dims (e.g. in Fortran) oneself. In HDF5, one has no way of knowing that a 2x3x4 dataset is actually a 4x3x2 dataset which was written in Fortran. ADIOS2, does save the host language used, so if writing a 4x3x2 dataset in Fortran, it will also show as a 2x3x4 dataset in C++, but the info is there to know that it was actually written in Fortran so that the dims are reversed, and, e.g., a viz app can use that to get xyz right. (Or the python interface could expose directly it as a ndarray with the original coordinate order, though I don't think that currently happens). Your PR #1573 exposes that info to the user, and that's good (I agree with that approach of providing additional info to have the user handle it if they want to -- the alternative would be to transpose the data, which is slow). It allows the viz tool to display the data as the original 4x3x2.

However, if I have an array in C++ that's 4x3x2 in column major order (say because it was passed from Fortran, for performance reasons, whatever) you consider that to be not to C++'s natural order. So when I write it with ADIOS2, I have to reverse dimensions myself (that's what I actually do right now). The issue then is, that ADIOS2 has no way to know that I reversed the dims, so it thinks I wrote a 2x3x4 row major array. The visualization tool will thus display the data as 2x3x4, ie, with the coordinate axes reversed.

In order to fix this problem, a per-Variable flag is needed to indicate what the original layout was. I don't think that's easy to add to BP3 without breaking compatibility (and I don't think compatibility should be broken for it). But I did see @pnorbert changing the binary layout of BP4 files, so I wanted to point out that this would be a good time to add such a flag to be prepared to support this feature in the feature without breaking BP4's binary compatibility. I don't know what the plans are about BP4, but if the upcoming release will freeze BP4's binary on-disk layout, it'd be particularly important to do this before the layout is fixed. The API side of it isn't that urgent, and I'd be happy to suggest/implement an API for it.

To be specific, what I'm talking about is (BP3Serializer.cpp):

    const char columnMajor =
        (helper::IsRowMajor(hostLanguage) == false) ? 'y' : 'n';
    helper::InsertToBuffer(metadataBuffer, &columnMajor);
    helper::CopyToBuffer(dataBuffer, dataPosition, &columnMajor);

This is exactly what's needed, but it's currently (in BP3 and BP4) per PG, not per Variable. As the layout of BP4 is in flux, it shouldn't be hard to add this flag to the Variable record as well, in order to allow this per-Variable in the future.

@pnorbert
Copy link
Contributor

I will see if I can add a flag for each variable in the BP4 format for the release but no promises because how slowly I am getting anything into the release (me and CI combined).

BP4 exists for two reasons: fast append and opportunity to stream properly through files. Everyone in the team has one or two issues with the BP format, and therefore we will need to start early to design a new format and to take our time designing it.

@williamfgc
Copy link
Contributor

@germasch yeah, I misunderstood the original request. A current workaround is to add a variable attribute, until we get a function to identify each Variable Put block. Still this is not on a per-block basis.

While related, I see @berkgeveci request as yet another degree of freedom (to add to the confusion) since x, y, z have a physical meaning independently of which one is fastest or slowest. I can see this might be coming from vtkDataArray being a column-major C++ object (to add even more to the confusion), which was a very reasonable decision at the time it was designed.

But I agree with you, BP3 was not designed to handle such scenarios on a per-Variable block basis. We'll have to review this for future formats, while it sounds trivial I'd like to work with an app that has a use-case (WDM?) of some blocks being processes by the GPU thus data coming from CUDA C APIs, while other blocks might come directly from Fortran. The latter is a reasonable scenario to exploit current hardware on Summit, since we are having more GPUs and RDMA with NVLink, but I am not aware of a current use-case.

@chuckatkins
Copy link
Contributor

I will see if I can add a flag for each variable in the BP4 format for the release

I'd really rather not put it in the release. This feature needs some thought and some back and forth with the people that need it. Given that we need to cut the release tomorrow at the latest there's not time for that. I'd rather put it in master for the next release and do it right rather than throw something over the fence for the release that may not really satisfy the needs of the applications that need it.

@williamfgc
Copy link
Contributor

williamfgc commented Jun 29, 2019

@chuckatkins since this request is coming from @berkgeveci and yourself I'll let you guys sort out where my PR #1573 should go. That PR only extends a flag in the Variable Info struct, I think @pnorbert addition is more complicated.

@pnorbert
Copy link
Contributor

@chuckatkins Okay, let's think this through first.

For notes, in the BP4 Data format, there is a byte for isDimensionVariable flag, which is not used anymore (but maybe someone wants them back?) and also a Path variable (2 bytes for length), so we can repurpose some of them for this new functionality (per variable row/colum major flag). If we want a per-block flag, that will become a new characteristic, so the format does not change (only is extended).

In the BP4 Metadata format I don't see the dimension variable flag, but we have unused 2 + 2 bytes for Path and Group Name, both of which are not used.

@germasch
Copy link
Contributor Author

germasch commented Aug 5, 2019

superseded by #1661

@germasch germasch closed this as completed Aug 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants