Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

data layout / BP-X #1661

Open
germasch opened this issue Aug 5, 2019 · 22 comments
Open

data layout / BP-X #1661

germasch opened this issue Aug 5, 2019 · 22 comments

Comments

@germasch
Copy link
Contributor

germasch commented Aug 5, 2019

This is somewhat BP-X related, but also a more generic API issue, so I figured I'll put the details here.

Language-independent handling of row-major vs col-major data layout

Here's a proposal to increase flexibility in handling data layout while keeping the existing behavior / APIs unchanged.

Rationale

Currently, the ADIOS2 API is built around the assumption that the language of the API bindings determines the data layout that the application is using, in particular Fortran/Matlab/R: column-major; otherwise: row-major (see adiosSystem.cpp). That covers most common applications. In particular reading and writing data from the same language works nicely. However, mixing languages or data layouts is more complicated.

When Fortran writes, say, a 400x200 (column-major) array, it will be presented in the C/C++ API as a 200x400 row-major array. In order to present the data as a row-major array, there are only two options: reverse the shape, or transpose the data. Since transposition is potentially expensive, ADIOS2 make the perfectly reasonable choice to instead reverse the dimensions.

However, the assumption that the language of the API determines the data layout used is not always valid. For example:

  • The main part of a legacy code remains written in Fortran, hence it's using column-major data layout for its arrays. However, some parts of the code are rewritten in C++, e.g., in order to enable use of CUDA / Kokkos (this is currently happening for XGC). At this point, if one writes data using the adios2 C++ API, it makes the assumption that the data layout is row-major, even though it actually is not. This can be worked around on the application side by pretending that the column-data is row-major if the app reverses dims itself. However, one loses the information whether a dataset is actually row-major, or just pretending to be.

  • Essentially the same issue happens in a pure C++ code that uses data structures that aren't (necessarily) row-major, like, e.g., xtensor or Kokkos Views. Again, it can be worked around by reversing dims on the application side, but the info what the real layout is gets lost.

  • A general visualization app, say written in C++, sees a data set of size (200x400). It doesn't know whether that means there are really 200 points in x direction and 400 points in y direction, or whether it is looking at an actual 400x200 data set that had been written using the Fortran API. This is somewhat solved for the unique language->layout mapping case by Exposes IsReverseDims in BlockInfo #1573, though (a) the app has to go down to the block level to check the IsReverseDims flag and (b) the IsReverseDims flag doesn't directly tell the original data layout, but requires additional logic from the application. The viz app then has to reverse dims itself again to read data into a data structure using the original layout.

    The visualization app still has no way of knowing whether the writing app had reversed the dims itself because it falls into either of the cases above.

How relevant is this?

  • XGC is going to a mixed Fortran / C++ model, where C++ is using col-major data structures. For now, writing the data is still handled in Fortran, so there is no actual issue yet.

  • My particle-in-cell code (PSC) falls into the category of a C++ code using column-major data layout (for historic reasons), so I currently I'm passing fake (reversed) dims to ADIOS2 to make things work. (Actually, the same has to be done when using HDF5, but I think this is really a weakness of HDF5 which shouldn't be perpetuated.)

  • @berkgeveci supported the usefulness of knowing row-major vs col-major in row/column major considerations #1553.

Proposed data format changes

BP3 / BP4 have global flags that indicate the host language, ie., assumed data layout. That's helpful, but having per-dataset granularity would be better. Per-dataset granualarity would be better, so one could write both row-major and col-major dataset into the same file. The API changes I'm proposing don't necessarily require changes to the data format, which might well be not possible at this point for compatibility reasons, but I think it'd be good to consider this for future evolutions of the data formats. One could use attributes to store the information as attributes given that existing mechanism, but I think it's a bit iffy since there is no distinction between user-defined and system-defined attributes, so there could be
conflicts if one were to do that.)
The existing BP3 / BP4 would support the additional API described below, but they would not support mixing row-major and col-major datasets in a single file.

Proposed API

(I'm just making this up as I'm writing this. But I hope it'll give a useful basis for thinking about it.)

Each dataset (Variable) would have a Layout property associated with it. When creating/inquiring variable for writing/reading, the Layout property would be set to the current host language.

Add a Variable::setLayout() function that would allow the application to override the default layout. E.g., in C++ I could go say myVar.setLayout(Layout::ColumnMajor) to tell adios2 that what I'm passing to it will be laid out in ColumMajor. For now, adios2 would just revere the dims for me, writing the data while pretending it's the row-major layout that it's expecting. Ideally, adios2 would also record the original data layout, but that's data format dependent
(see above).

When reading, the same mechanism would apply. If I know the data I'll be reading is actually column-major, and my data structure expects the blob of data to be in column-major, I'll do a
myVar.SetLayout(Layout::ColumnMajor) first. If the actual data is not already columnn-major, adios2 would reverse dims for me (basically just like what currently happens when file language and API host language don't match, but it'd be based on the layout I'm telling, not the language I'm using, though in the default case nothing would change.)

Finally, add a Variable::GetOriginalLayout() function, which would for a given variable tell me what the original layout was when it was written. So a dataset written by Fortran would return "ColumnMajor", while a dataset written in C++ would return "RowMajor". That way, an application that uses a data structure that supports both row-major and col-major could load the data back into the appropriate data structure. without having to deal with either reversing dims or transposing data. Something like this:

auto myVar = InquireVariable(...);
auto layout = myVar.GetOriginalLayout();
myVar.SetLayout(layout);
auto shape = myVar.Shape();
xt::xarray<double, xt::layout_type::dynamic> var;
if (layout == Layout::RowMajor) {
  var = {shape, xt::layout_type::row_major};
} else {
  var = {shape, xt::layout_type::column_major};
}
myVar.Get(var.data());
// no need to worry about data layout or reversed dims in the 
// var we read from here on out, it'll just work :)
@williamfgc
Copy link
Contributor

williamfgc commented Aug 9, 2019

@germasch do you have a timeline of when XGC would be ready for this? Also, I am very wary of adding API functionality when there is a solution in place. APIs can get very bloated pretty easily in any library and over time it becomes very hard to not only maintain, but to attract new projects.

Back to the issue, exposing IsReverseDimension does allow the reader to discover the original layout of the data in terms of fastest/slowest index, not a hardcoded description. The x,y,z issue is different and it's very much problem dependent. VTK assumes "x" is always the fastest index, so it's even more restrictive as it assumes Fortran arrays are always written as A(x,y,z) and C arrays as A(z,y,x). I have worked with VTK visualizing Fortran code in which x is the slowest varying index A(z,y,x), or that doesn't assume x,y,z, but any other orthogonal basis (n1,n2,n3).
So index names and ordering don't correlate and any changes in adios2 won't leverage the VTK assumption.

My point is: adios2 already provides a solution following standardized convention for each language, but gives you info if the dimensions were reversed at read so you know the original data layout in terms of fast/slow indices.

I am curious to see real code comparing the current solution and another one adding APIs. I personally feel more confused when handling new functions I need to learn than just going with something standard. Having code will also allow asking the practitioners' opinion based on their experience and which approach is easier to understand. I will write in another box to compare side-by-side.

@williamfgc
Copy link
Contributor

williamfgc commented Aug 9, 2019

Variable is written as A(10,20) in Column Major, 10 is fastest

shape = var.Shape(); //shape = {20,10}

auto col = var.GetOriginalLayout();
var.SetLayout(col);
shape = var.Shape(); // shape = {10,20}
engine.Get(var, *v);  // is v {10,20} (does it mean data is transposed?) or {20,10}?

Also, with the proposed functions we still must go down to the block level
Write:

engine.Put(var, *v1);
var.SetLayout(columnMajor);
engine.Put(var, *v2);
engine.Close();

Read: (one block is row-major and the other column-major)

auto layout = var.GetLayout();
//layout = Column-Major or Row-Major?

@germasch
Copy link
Contributor Author

germasch commented Aug 9, 2019

@germasch do you have a timeline of when XGC would be ready for this? Also, I am very wary of adding API functionality when there is a solution in place. APIs can get very bloated pretty easily in any library and over time it becomes very hard to not only maintain, but to attract new projects.

I can't really predict what's going to happen with XGC, as I said there's currently work underway of rewriting part of the code into C++, but I don't think I/O is part of that current effort.

I do have code to show, and that's from my own code (PSC, a particle-in-cell code):

inline kg::io::Dims makeDims(int m, const Int3& dims)
{
  return kg::io::Dims{size_t(m), size_t(dims[2]), size_t(dims[1]),
                      size_t(dims[0])};
}

There is some wrapping layer involved, but you get the idea: While the data is indexed as (i,j,k,m), where m is a component number, I have to pretend that the dimensions are reversed when writing to adios2, because it thinks the data is in row-major layout, but it really isn't.

adios2 doesn't offer a solution to this issue currently, that's why I reverse the dims myself. Obviously, since I'm lying to adios2, it has no idea what's really going on, so there is no way that it could tell a visualization tool that the dims really should be the other way around. Reading the data back in the code itself is basically not an issue, because I'm just reversing the dims there as well. But when, e.g., accessing it via python, the indices are reversed, and that's unpleasant.

Most applications deal with the row-major / column-major the same way, that is, if your data doesn't match the layout you're using, you just reverse the dims, because the alternative of transposing the data is expensive. That's what you got to do when using HDF5 from Fortran, or, apparently, when using vtk with row-major arrays. I'm not arguing with that, that's the reasonable thing to do, but when one has to reverse dims for this reason, it'd be nice to have a way to tell that's happened. I take it from what @berkgevici wrote in the other thread, that while vtk may always use col-major arrays, it stores the info that the original layout is row-major so that the data can be displayed correctly in terms of x,y,z, order. That's the main thing I'm asking for, and it's not supported for my use case (above) in the current API.

How one wants to handle the mismatching data layout case (if at all) is up to the application. In fortran, you don't really have a choice. In C++, you can choose to always do row-major -- but you can also choose to use an multi-dimensional array class that can support both layouts natively. The same thing goes for Python -- ndarrays do support both data layouts natively. If you choose to use an multi-dimensional array that supports both layouts, it obviously makes sense to use that functionality, because then you can just access things in the natural index order and not worry about what the underlying layout is.

Back to the issue, exposing IsReverseDimension does allow the reader to discover the original layout of the data in terms of fastest/slowest index, not a hardcoded description. The x,y,z issue is different and it's very much problem dependent. VTK assumes "x" is always the fastest index, so it's even more restrictive as it assumes Fortran arrays are always written as A(x,y,z) and C arrays as A(z,y,x). I have worked with VTK visualizing Fortran code in which x is the slowest varying index A(z,y,x), or that doesn't assume x,y,z, but any other orthogonal basis (n1,n2,n3).

Sure, some people in C++ use var[k][j][i] to access their arrays, and if people want to do that, that's fine with me. Since C++ does support operator overloading, I personally prefer to use var(i,j,k) to do the same thing. Same thing in python, having to do var(k,j,i) when loading Fortran data with h5py keeps confusing me still after all these years...

Just to be clear, what I've proposed is an addition to the API which is likely going to be used only by a select group of users, e.g., people dealing with generic visualization tools, or people like me, who use column major arrays in C++ and like them to be represented correctly in their output. Most people will be just fine doing Fortran and column major, and won't see any difference.

@germasch
Copy link
Contributor Author

germasch commented Aug 9, 2019

Variable is written as A(10,20) in Column Major, 10 is fastest

shape = var.Shape(); //shape = {20,10}

auto col = var.GetOriginalLayout();
var.SetLayout(col);
shape = var.Shape(); // shape = {10,20}
engine.Get(var, *v);  // is v {10,20} (does it mean data is transposed?) or {20,10}?

You got the idea. The way it's written, you get 10x20 col-major data. If one didn't do the SetLayout, one would get 20x10 row major data, that's the current behavior. Note that the data is actually exactly the same in both cases, so no transposing of the data ever happens. If you use the original data layout when reading, no reversing of dims happens, either.

It's really just an extension of current behavior: If the data layout you're reading into is the same that the data was written with, you get that data layout with the original dims. If you're reading the data into the opposite layout, the dims are reversed, the data is the same. The difference is only that instead of assuming "C++/Python -> layout must be row-major", you get a chance to tell adios2 what it actually is.

Also, with the proposed functions we still must go down to the block level
Write:

engine.Put(var, *v1);
var.SetLayout(columnMajor);
engine.Put(var, *v2);
engine.Close();

I think allowing for different layouts at the block level is overengineering at best, and trying to implement that correctly would make my head spin. But I don't see any possible application where this could be beneficial, either. So the data layout should be a property of a given variable, not of its block. I think I said that, by the way, when you added an API that provides IsReverseDims on the block level, I can't come up with any use case where it wouldn't be the same for all blocks of a given variable.

So changing the layout in the middle of a write would be an error, ideally one which gets caught, but at least undefined behavior. Note that this is no different then changing other properties of a variable in the middle of a write (say, you wrote one block to a variable, and then you called SetShape to change its global shape. Doesn't make any sense, and no good will come from it).

@williamfgc
Copy link
Contributor

williamfgc commented Aug 9, 2019

Obviously, since I'm lying to adios2,

Let's clarify, adios2 needs to know which is the fastest and the slowest index adopting language convention, that's all adios2 asks. Ultimately, it becomes either calling something like std::reverse (which is standard C++) outside adios2 or adding API calls to adios2 so adios2 can reverse this internally to get the slowest/fastest info. I don't see the "lying" aspect.

current:

std::reverse(shape.begin(), shape.end())
variable.SetShape( shape );

or:

variable.SetLayout( ColumnMajor );
variable.SetShape( shape );

Just to be clear, what I've proposed is an addition to the API which is likely going to be used only by a select group of users

if that's the case, then it's easier (and less costly) to place this functionality in the workflow of the selected group of users instead of adding more APIs in adios2. The end result is the same, we can help providing the required support.

@williamfgc
Copy link
Contributor

williamfgc commented Aug 9, 2019

But I don't see any possible application where this could be beneficial, either. So the data layout should be a property of a given variable, not of its block.

I think it's important to check with different communities and their execution plans for the future and current architectures. This is why I was curious about XGC plans. Are they planning to use NVLink on Summit? Can some blocks be written in Fortran and some from the host C++ code running with CUDA/hip device code? (these are questions, I am not assuming much at this stage). There are new things that would come out from the learning process of working with apps.

@germasch
Copy link
Contributor Author

germasch commented Aug 9, 2019

Obviously, since I'm lying to adios2,

Let's clarify, adios2 needs to know which is the fastest and the slowest index adopting language convention, that's all adios2 asks. Ultimately, it becomes either calling something like std::reverse (which is standard C++) outside adios2 or adding API calls to adios2 so adios2 can reverse this internally to get the slowest/fastest info. I don't see the "lying" aspect.

"Lying" was a bit tongue-in-cheek. But the point is, I'm giving a shape to adios2 which isn't the actual true shape, and adios2 has no way of knowing that I did that.

current:

std::reverse(shape.begin(), shape.end())
variable.SetShape( shape );

or:

variable.SetLayout( ColumnMajor );
variable.SetShape( shape );

Sure, in terms of the length of the code, the two are about the same. In terms of looking at the code and understanding what it does, the second one is definitely much clearer (there's a coding principle saying that code gets written once, but read many times, so one should keep that in mind). But those are subjective.

The main point is, if I reversed the dims, adios2 doesn't have that info. So if someone else reads the data / visualizes the data / loads it into python, they end up with the reversed dims, not the original ones.

Just to be clear, what I've proposed is an addition to the API which is likely going to be used only by a select group of users

if that's the case, then it's easier (and less costly) to place this functionality in the workflow of the selected group of users instead of adding more APIs in adios2. The end result is the same, we can help providing the required support.

See above: If you don't give the information to adios2, there is no way that it can be stored and retrieved back.

In terms of maintenance cost, I think there's a pretty good chance that implementing this feature would actually lead to a reduction in lines of code, since reversing dims or not would be decided and done at the core level, so it wouldn't need to be done by each engine, removing duplicated logic. (However, getting there wouldn't be entirely trivial.)

@germasch
Copy link
Contributor Author

germasch commented Aug 9, 2019

I think it's important to check with different communities and their execution plans for the future and current architectures. This is why I was curious about XGC plans. Are they planning to use NVLink on Summit? Can some blocks be written in Fortran and some from the host C++ code running with CUDA/hip device code? (these are questions, I am not assuming much at this stage). There are new things that would come out from the learning process of working with apps.

I'm not sure what you mean? I don't think there are any plans beyond some GPU-aware MPI (GPUdirect) in terms of communication. If you mean GPUdirect Storage, no, I don't think anyone is thinking about that, though I guess it's something that you guys could be thinking about, because that's hopefully something that applications won't have to deal with themselves.

In terms of writing some blocks from Fortran and some from C++, I could see that happening theoretically in some apps, but I wouldn't anticipate it for XGC. Besides, concurrent use of the C++ and Fortran API isn't supported by adios2, anyway. If one wanted to go really fancy, I could see some case for having different data layouts in Fortran vs CUDA code, though not so much about row-major vs col-major, but rather about AOS vs SOA. Which I think might have some interesting opportunities, but that's beyond this topic.

@williamfgc
Copy link
Contributor

williamfgc commented Aug 9, 2019

The main point is, if I reversed the dims, adios2 doesn't have that info. So if someone else reads the data / visualizes the data / loads it into python, they end up with the reversed dims, not the original ones.

I recently put VTK reader for adios2 image data and unstructured grids, and this info was never needed (HDF5 readers would never work), 'cause I am not assuming that A is A(x,y,z), this is purely application dependent (in fact, it could be misleading as explained in previous messages). The reader only needs to know the fastest and slowest index info and map this to VTK's "x" (fast), "y", and "z" (slow).

In terms of maintenance cost, I think there's a pretty good chance that implementing this feature would actually lead to a reduction in lines of code, since reversing dims or not would be decided and done at the core level

Check the number of HDF5 API calls. By maintenance costs I mean: documentation, tutorials, bug fixes, testing, CI, apps integration, issue tickets....lines of code is hardly an indicator of cost. The current APIs already provide a solution we can ship.

Besides, concurrent use of the C++ and Fortran API isn't supported by adios2, anyway.

  1. It doesn't have to be concurrent. 2) Concurrency doesn't provide any speedup at the Engine level if blocks go to a single buffer as serialization reduces any parallel portion. 3) Concurrency can easily happen at the IO and ADIOS component levels (they are pretty independent). PIConGPU was doing I/O and computation concurrently with adios1.

I'm not sure what you mean?

Going from Titan to Summit (and Frontier) would require new strategies in the computation and data placement. We still need to study these systems and their policies based on what apps need and we can/can't do before making decisions. I am collecting this info from other communities as well so we can add real value to apps, that's all.

@germasch
Copy link
Contributor Author

germasch commented Aug 9, 2019

The main point is, if I reversed the dims, adios2 doesn't have that info. So if someone else reads the data / visualizes the data / loads it into python, they end up with the reversed dims, not the original ones.

I recently put VTK reader for adios2 image data and unstructured grids, and this info was never needed (HDF5 readers would never work), 'cause I am not assuming that A is A(x,y,z), this is purely application dependent (in fact, it could be misleading as explained in previous messages). The reader only needs to know the fastest and slowest index info and map this to VTK's "x" (fast), "y", and "z" (slow).

So I guess I don't know what adios2 image data is (maybe I missed it, or is just 2d arrays?). Unstructured grids are usually 1-d lists of nodes and associated data, aren't they? So the index order doesn't really apply. Generic HDF5 readers work most of the time, but not always, because they actually tend to assume that while HDF5 says the data is row-major, the actual data is column major so they un-reverse the dims, assuming they'd be reversed when writing. That works in the vast majority cases, because almost all scientific data tends to be in column-major order, and so the viz app's guess that what's really in the file is column-major even though it says otherwise tends to be a good guess. But you'll find plenty of people that are unhappy that they write their HDF5 file according to the standard but Paraview transposes the data for them because it thinks it knows better. The point is that Paraview has to make a guess, because HDF5 has no way to recover the information. Adios2 is doing better in that it stores whether the data has been written by Fortran or C, but it gets things wrong if one wants to write column-major from C/C++. Which, again, is a really common case, of Fortran's importance in HPC, so people moving to other languages tend to keep the data layout the same so that the data can be shared between the languages. Which is exactly the reason why you may see a[k][j][i], it's not because people like to be different, it's because that's the easiest way to access col-major data from C. In fact, I'm sure those people who use a[k][j][i] would be happy if there were a SetLayout(ColumnMajor), so that their data would be displayed correctly in analysis tools.

You don't have to take it from me, I'll quote this part from @berkgevici from #1553:

This is because the positions in the arrays often have spatial importance (x, y & z for example). Without this kind of meta-data, the visualization tool using a generic reader (without specific knowledge of the data source) would end up loading in a transformed way (z, y, & x) confusing the end user.

Of course, not everything is x,y,z. Say, people may use 3d arrays indexed by (longitude,latitude,height). Even in this case, though, it would be nice if the data after reading it in a generic tool would not be indexed by (height,latitude,longitude), though visualizing it as (x,y,z) would be not optimal either way. So maybe one shouldn't be focused on getting "xyz" vs "zyx" correct, what this is about is to get the indices to be in the same order as the original data.

Check the number of HDF5 API calls. By maintenance costs I mean: documentation, tutorials, bug fixes, testing, CI, apps integration, issue tickets....lines of code is hardly an indicator of cost. The current APIs already provide a solution we can ship.

I don't think I suggested to follow an HDF5 API (which in all those calls still missed describing the data layout). Actually, LOC is a pretty good metric for maintenance cost, and many of the things you mention (bug fixes, testing, CI, bugs) tend to scale with LOC. I don't think one should freely add all kinds of stuff to a API for no good reason and without proper consideration. I don't think my proposed way is the only possible way to do it, but it does fill a gap which is not handled by the current API.

Also, after me having the point made so many times, I think it's misleading to say "The current APIs already provide a solution we can ship." It does not provide a solution to reading data back with its original indexing preserved if one uses C/C++ to write column-major data, which, again, is a common case. Of course you can ship it anyway, and it's still usable. HDF5 has lived with this limitation forever, but why not use the opportunity and learn from others' mistakes?

Having said that, there are some real issues / costs:

  • First of all the new API has to be implemented and the engines have to be adapted, that's definitely a non-trivial amount of work. So that's a definite cost, and it would be fair to say that this is not a priority you want to spend resources on. (I might be able / willing to do it at some point, but right now it's not a high priority item for me, either.)
  • Even if the API exists, to make it work well, it needs support in the engines. With the current row-major / column-major flags being per PG in BP3 (and probably BP4), support for mixing layouts would be limited.

Besides, concurrent use of the C++ and Fortran API isn't supported by adios2, anyway.

  1. It doesn't have to be concurrent. 2) Concurrency doesn't provide any speedup at the Engine level if blocks go to a single buffer as serialization reduces any parallel portion. 3) Concurrency can easily happen at the IO and ADIOS component levels (they are pretty independent). PIConGPU was doing I/O and computation concurrently with adios1.

I didn't mean concurrent in the sense of multi-threaded or anything, I mean the Fortran bindings aren't designed in a way where you can open a file in C++ and write some blocks for a Variable, and then write some more blocks in Fortran, even in a serial program.

@williamfgc
Copy link
Contributor

williamfgc commented Aug 9, 2019

It does not provide a solution to reading data back with its original indexing preserved if one uses C/C++ to write column-major data, which, again, is a common case.

This goes back to supporting the selected group of users by adding an extra attribute if that information is important to preserve. My experience is that the latest hasn't been required for visualization (knowing which index is fast/slow is sufficient and if the dims were reversed). What Berk requested was addressed by adding the info at the block level, check the last sentence from his request (after the quote you C&P) which predates that PR.

But you'll find plenty of people that are unhappy that they write their HDF5 file according to the standard but Paraview transposes the data for them because it thinks it knows better.

This is independent of I/O library, I can save file in VTK format and still be presented in Paraview as VTK's "x"->fastest, "y", "z"-> slowest. I don't think there is a workaround if "x" is not your fastest index (or an index at all) in your simulation as it's a design decision in VTK (just like it's a design decision in HDF5, readers just need to accommodate to VTK). I put more info in a previous response.

Anyways, I'd like to hear more from other users and bring awareness of what adios2 does on the topic. You showed that your app still can use adios2 in its current state.

@germasch
Copy link
Contributor Author

germasch commented Aug 9, 2019

It does not provide a solution to reading data back with its original indexing preserved if one uses C/C++ to write column-major data, which, again, is a common case.

This goes back to supporting the selected group of users by adding an extra attribute if that information is important to preserve. My experience is that the latest hasn't been required for visualization (knowing which index is fast/slow is sufficient and if the dims were reversed). What Berk requested was addressed by adding the info at the block level, check the last sentence from his request (after the quote you C&P) which predates that PR.

You're exactly right that knowing the provided data layout and whether dims is sufficient (and necessary). However, you also are telling me that I need to reverse the dims myself, which I do, but there is no way to tell adios2 that I did, so that information is not stored and hence not available when reading the data back in generic tools. So that information you seem to agree is needed is not available.

I've said that the API would primarily be used by a small group of developers, those that work on generic analysis / viz tools. However, lots of people use those tools, and they're affected when the dims end up being reversed. That includes people using the included adios2 python interface. I'll file that as a separate issue.

@williamfgc
Copy link
Contributor

williamfgc commented Aug 9, 2019

Either way we need to track layout outside of adios2 (it's either a call to std::reverse or an adios2 API function).
Let me summarize with code (I just feel that having code would help understand pros/cons, etc.).
Today we can do this:
Write:

// in C++ so adios2 expects row-major representation
std::reverse(shape.begin(), shape.end() );
var.SetShape( shape );

Read:

if( var.BlockInfo[0].IsReverseDims )
{ // if true we know original source ...this is sufficient to map to external tools (if needed). 
// In VTK we map VTK's "x" to the fastest index, not a big deal, 
// we just accommodate to their design decision.
}

The proposed changes would be:
Write

var.SetLayout(column); // hides the fact that we reverse dims internally
var.SetShape(shape);

Read

auto col = var.GetOriginalLayout();
var.SetLayout(col); // do we have to set the layout at read?, what's the default layout?
shape = var.Shape();
if(col){
}

The one thing it's still an open question is if SetLayout is different from the original layout then adios2 would transpose the data. Which is expensive, but that would be my first impression when looking at the code below:

Read

auto col = var.GetOriginalLayout();
if(col){
  var.SetLayout(row);
  var.Get(var, *v); // are we transposing the data?
}

@germasch
Copy link
Contributor Author

germasch commented Aug 9, 2019

Either way we need to track layout outside of adios2 (it's either a call to std::reverse or an adios2 API function).

Well, if I use an API function, I'm telling adios2, so that it can keep track internally instead, that's really the major goal here.

Let me summarize with code (I just feel that having code would help understand pros/cons, etc.).

I like code ;)

Today we can do this:
Write:

// in C++ so adios2 expects row-major representation
std::reverse(shape.begin(), shape.end() );
var.SetShape( shape );

Read:

if( var.BlockInfo[0].IsReverseDims )
{ // if true we know original source ...this is sufficient to map to external tools (if needed). 
// In VTK we map VTK's "x" to the fastest index, not a big deal, 
// we just accommodate to their design decision.
}

But this wouldn't work, because I did std::reverse myself, so adios2 cannot know about it, and isReverseDims would be false.

Side note: I guess you better not run this on an MPI process that happened to not write a block for this variable, because var.BlockInfo[0] is invalid then.

The proposed changes would be:
Write

var.SetLayout(column); // hides the fact that we reverse dims internally
var.SetShape(shape);

Read

auto col = var.GetOriginalLayout();
var.SetLayout(col); // do we have to set the layout at read?, what's the default layout?
shape = var.Shape();
if(col){
}

You don't have to set the layout at read time, it'll be row-major by default (this is all a C++ API, and row-major is what's currently always assumed, so behavior is unchanged).

If you do, you get the shape that matches the layout you just selected. If you select the original layout, you get the original shape, which is good if you can support it (python/ndarray can, in C++ it depends on what you use for your multi-d arrays. If you only use a multi-d array that's fixed to be row-major, you don't need write the code above, you just read it, and deal with the fact that maybe it's reversed. That's what vtk would do, it'd just read as row-major. The dims may be reversed, and that info is available from GetOriginalLayout(), but you wouldn't do SetLayout() in this case.

The new API, as used above, is admittedly somewhat awkward. It would be more natural to just always return the data in its original shape and layout, as should (and would) be done by the python bindings. But in the C++-API, there isn't any data model beyond using simple pointers, because there is no standardized multi-d array data structure in C++ in the first place. So this only provides the pieces that can be used to build such behavior on top of it. And it's done in a way that preserves compatibility with current behavior.

The one thing it's still an open question is if SetLayout is different from the original layout then adios2 would transpose the data. Which is expensive, but that would be my first impression when looking at the code below:

Read

auto col = var.GetOriginalLayout();
if(col){
  var.SetLayout(row);
  var.Get(var, *v); // are we transposing the data?
}

No, the data would never be transposed, only the dims get reversed, just like is currently the case (that's my proposal, anyway, one could transpose the data, but as you say, that's expensive, so it's usually avoided).

@pnorbert
Copy link
Contributor

pnorbert commented Aug 9, 2019

Kai, let me try to recap what you are proposing here:

  • allow user (code) to define for an output variable if it is row major or column major
    • this flag is currently automatic based on language, set for entire output, and is hidden
  • allow user (code) to query this flag and form its read selection (the order of dimensions) accordingly
    • so paraview could opt for keeping dimensions in column-major ordering but
    • by default a C++ reader would still see a column-major variable as row-major with dimensions reversed.

What happens if a reader sets the layout to the opposite layout in the same language? Do we need to transpose the data, or assume that the user reverses the dimensions in the read selection for the proper reading, or throw an exception? Probably you said so already, to go for the second option, right?

@pnorbert
Copy link
Contributor

pnorbert commented Aug 9, 2019

How about an extension to the API with new DefineVariableRowMajor() and DefineVariableColumnMajor() and a read side InquireVariableKeepLayout()? I don't think an extra optional flag fits in the original ones, hence the new names. And then a var.GetLayout() query function.

This way we could have an unmutable property of the Variable object on both sides. Majority of the codes can use the original functions, while Fortran/C++ mixed codes and others could declare the order. Same effect as your SetLayout() function but it does not allow for messing with the blocks and across processes (assuming a code uses the same define function everywhere).

On the read size, InquireVariable() would still do the auto-dim-reversal, and read selections should be in the language's default order, as is right now.
On the other hand, InquireVariableKeepLayout() would return the Variable with dimensions as is in the input. Then it's the user's responsibility to form the read selection accordingly.

I don't think I added or removed anything from your original proposal, so correct me if I misunderstood what you have been proposing.

@germasch
Copy link
Contributor Author

germasch commented Aug 9, 2019

Basically, it is about keeping the current approach of:

  • if the data layout of the in-memory variable and what's in the file is the same, do nothing special
  • if it's different, reverse the dimensions

The primary difference is in allowing C++ (and hence python) to state what data layout they have/want, rather than assuming it must be row-major.

BP3/BP4 already have a flag that describes whether data is row-major/col-major. Eventually, in future formats, it'd be helpful to support having that flag be per-variable, rather than (somewhat) global (it's per PG right now, I believe).

The invariant is that the data is never reordered.
Fortran:

  • write a dataset -> recorded as col-major
  • read a dataset: if row-major on disk, present as col major by reversing dims

CXX

  • write a dataset: by default, written and recorded as row-major.
  • write a dataset: if requested by SetLayout(ColumnMajor), write and record as col-major (if supported by format)
  • read a dataset: if row-major on disk, by default present and read as row-major
  • read a dataset: if col-major on disk, by default present as col-major by reversing dims
  • read a dataset: if row-major on disk, but requested to be read as col-major, present as col-major by reversing dims
  • read a dataset: if col-major on disk, and requested to be read as col-major, present and read as col-major

The most interesting case: Data has been written (probably by fortran) as col-major. It's originally 5x10. On disk it's recorded as "col-major 5x10". Currently, in C++, we see
var.Shape() == {10,5}. After reading, the data needs to be accessed as data[j][i]. This continues to work the same.

But if you're using a C++ multi-d array class that supports flexible data layouts, or python, you can say

// var.Shape() == {10,5} as of now
layout = var.GetOriginalLayout(); // returns ColumnMajor
if (layout == ColumnMajor) {
  var.SetLayout(ColumnMajor);
  // var.Shape returns {5, 10} now
  auto field = Array<ColMajor>(var.Shape());
  var.Get(field.data()); // read it
  // can use field(i, j) to access things in original order
} else {
  auto field = Array<RowMajor>(var.Shape());
  var.Get(field.data()); // read it
  // can use field(i, j) to access things in original order
}

@germasch
Copy link
Contributor Author

germasch commented Aug 9, 2019

Sorry for the cross-over, I just wrote the reply above before reading your subsequent comment.

You're right, DefineVariable{Row,Column}Major, InquireVariableKeepType and a GetLayout would be an alternative to achieve the same thing, so that would do. It's nicer in that the code after InquireVariableKeepType would be more symmetric. On the other hand, I don't really like to have separate functions for what kinda would better be overloads. I think at least DefineVariable could be overloaded instead, and actually, I think there's a way to do it for InquireVariable, too:

auto var = io.InquireVariable<T>("abc", Layout::Original);

which would also allow the flexibility to specify, e.g., Layout::ColumnMajor if one was sure that's what one wants; Layout::RowMajor in that context would be the same behavior that's currently there.

I like an enum class because of future extensibility. For example, when using the hdf5 engine, it could return Layout::Unknown from GetLayout(), because with hdf5 there really is no way to know. And I have some separate ideas about supporting AoS vs SoA in adios2 which could fit in there.

While I'm not too concerned about someone calling SetLayout() in between writing two blocks (you can call SetShape() then, too, if you really want to shoot yourself in the foot), I think I like the your suggestion of using DefineVariable and InquireVariable, but using overloads rather than adding suffixes. It's nicer than having things change under you after a SetLayout() call.

@williamfgc
Copy link
Contributor

williamfgc commented Aug 9, 2019

In C++ I'd suggest to follow boost, Eigen in case this ever becomes standard in C++ (makes life easier to explain to C++ users) as an extra template parameter:
boost: matrix<T,column_major, std::vector<T> > m (size1, size2)
eigen: Matrix<int, 3, 4, ColMajor> Acolmajor;

so something along the lines of what @pnorbert suggested:

auto var = io.DefineVariable<T, order> (name, shape, start, count, adios2::isConstantDims);

auto var = io.InquireVariable<T, order> (name);

order is a extendable type, so far the same as numpy: ColumnMajor, RowMajor, Unchanged
I still have trouble figuring out A in numpy:

order no copy copy=True
‘K’ unchanged F & C order preserved, otherwise most similar order
‘A’ unchanged F order if input is F and not C, otherwise C order
‘C’ C order C order
‘F’ F order F order

@pnorbert @germasch what do you think?

@pnorbert
Copy link
Contributor

pnorbert commented Aug 9, 2019

@williamfgc I am fine with that. I was just trying to move the additional functionality into the Define and Inquire steps to have an immutable flag but did not know how to do it right in C++.

But can we keep the original DefineVariable format as well?

@williamfgc
Copy link
Contributor

williamfgc commented Aug 9, 2019

@pnorbert yes, we'll have to assume defaults not to break current APIs. I'd have to experiment a bit before confirming this 100% applying partial template specialization.

DefineVariable<T, ordering = RowMajor >

@germasch
Copy link
Contributor Author

As far as 'A' is concerned, what we're trying to do here is the "no copy" approach, so K and A behave the same. (Even though we really are copying, from the BP3 buffer to the provide variable, but we want to do a straight contiguous copy, ie, not transpose or anything.) In numpy, the object you're creating a new array from can be many things, not just row-major vs col-major, it supports at least arbitrarily strided objects, and I think more beyond that. But for adios2, for now there are only exactly two cases for what's on disk / in the serialized buffer, row-major and col-major. So "K", "C", "F" exactly correspond to Original, RowMajor, ColumnMajor. One could add "A" as an alias for "K", but I'm not sure that wouldn't make things more confusing.

chuckatkins pushed a commit to chuckatkins/ADIOS2 that referenced this issue Feb 27, 2020
Code extracted from:

    https://github.com/pybind/pybind11.git

at commit 80d452484c5409444b0ec19383faa84bb7a4d351 (v2.4.3).

Upstream Shortlog
-----------------

Ahuva Kroizer (1):
      8f5b7fce FAQ addition (ornladios#1606)

Alexander Gagarin (2):
      0071a3fe Fix async Python functors invoking from multiple C++ threads (ornladios#1587) (ornladios#1595)
      b3bf248e Fix casting of time points with non-system-clock duration with VS (ornladios#1748)

Allan Leal (1):
      e76dff77 Fix for Issue ornladios#1258 (ornladios#1298)

Andre Schmeißer (1):
      19189b4c Make `overload_cast_impl` available in C++11 mode. (ornladios#1581)

Ansgar Burchardt (1):
      a22dd2d1 correct stride in matrix example and test

Antony Lee (6):
      a303c6fc Remove spurious quote in error message. (ornladios#1202)
      0826b3c1 Add spaces around "=" in signature repr.
      8fbb5594 Clarify error_already_set documentation.
      55dc1319 Clarify docs for functions taking bytes and not str.
      58e551cc Properly report exceptions thrown during module initialization.
      baf6b990 Silence GCC8's -Wcast-function-type. (ornladios#1396)

Axel Huebl (9):
      4b84bad7 Fix Travis GCC 7 Python 3.6.6 (ornladios#1436)
      97b20e53 CMake: Remember Python Version (ornladios#1434)
      3a94561c Debug Builds: -DPy_DEBUG (ornladios#1438)
      435dbdd1 add_module: allow include as SYSTEM (ornladios#1416)
      9424d5d2 type_record: Uninit Member (ornladios#1658)
      1c627c9e pybind11_getbuffer: useless safe nullptr check (ornladios#1664)
      a2cdd0b9 dict_readonly: member init (ornladios#1661)
      000aabb2 Test: Numpy Scalar Creation (ornladios#1530)
      38f408fc value_and_holder: uninit members (ornladios#1660)

Baljak (1):
      81da9888 Fix Intel C++ compiler warning on Windows (ornladios#1608)

Blake Thompson (1):
      30c03523 Added __contains__ to stl bindings for maps (ornladios#1767)

Boris Dalstein (2):
      b30734ee Fix typo in doc: build-in -> built-in
      96be2c15 Fix version mismatch typos in .travis.yml (ornladios#1948)

Boris Staletic (4):
      289e5d9c Implement an enum_ property "name"
      f4b4e2e9 Use new Doxygen archive URL - fixes Travis
      0ca6867e Avoid Visual Studio 2017 15.9.4 ICE
      b3f0b4de new sphinx (ornladios#1786)

Borja Zarco (2):
      e2b884c3 Use `PyGILState_GetThisThreadState` when using gil_scoped_acquire. (ornladios#1211)
      b2fdfd12 Avoid use of lambda to work around a clang bug. (ornladios#1883)

Bruce Merry (2):
      1e6172d4 Fix some minor mistakes in comments on struct instance
      3b265787 Document using atexit for module destructors on PyPy (ornladios#1169)

Chris Rusby (1):
      22859bb8 Support more natural syntax for vector extend

Christoph Kahl (1):
      640b8fe6 fix ornladios#1406 add mingw compatibility (ornladios#1851)

Dan (10):
      a175b21e Avoid decoding already-decoded strings from cindex.
      4612db54 Try to autodetect the location of the clang standard libraries.
      b46bb64d Allow user to override default values of -x and -std=.
      a33212df Wrap the main functionality of mkdoc in a function.
      ede328a7 Allow writing output to file instead of stdout.
      a163f881 Delete partially-written file in the event of an error.
      590e7ace Avoid storing global state.
      2c8c5c4e Split into seperate functions for easier invocation from python.
      e0b8bbbc Use a file-local constant for non-prefixing nodes.
      41f29ccd Parse command-line args in a separate function.

Darius Arnold (1):
      09330b94 Fix typos in documentation (ornladios#1635)

David Caron (1):
      307ea6b7 Typo

Davis E. King (1):
      9343e68b Fix cmake scripts so projects using CUDA .cu files build correctly. (ornladios#1441)

Dean Moldovan (3):
      c10ac6cf Make it possible to generate constexpr signatures in C++11 mode
      56613945 Use semi-constexpr signatures on MSVC
      0aef6422 Simplify function signature annotation and parsing

Dennis Luxen (1):
      221fb1e1 Untangle cast logic to not implicitly require castability (ornladios#1442)

Dmitry (1):
      8f5a8ab4 Don't strip debug symbols in RelWithDebInfo mode (ornladios#1892)

Elliott Sales de Andrade (1):
      5e7591c6 Update PyPI URLs.

Eric Cousineau (2):
      e9ca89f4 numpy: Add test for explicit dtype checks. At present, int64 + uint64 do not exactly match dtype(...).num
      4a3464fd numpy: Provide concrete size aliases

Francesco Biscani (1):
      ba33b2fc Add -Wdeprecated to test suite and fix associated warnings (ornladios#1191)

François Becker (1):
      ce9d6e2c Fixed typo in classes.rst (ornladios#1388)

Guilhem Saurel (2):
      e7ef34f2 compatibility with pytest 4.0, fix ornladios#1670
      43a39bc7 ignore numpy.ufunc size warnings

Henry Schreiner (10):
      04b41f03 Upgrading to Xcode 9 & fix OSX/Py3 build failure
      cf0d0f9d Matching Python 2 int behavior on Python 2 (ornladios#1186)
      6c62d279 Fix for conda failures on Windows
      ffd56ebe Fix pip issues on AppVeyor CI (ornladios#1369)
      3789b4f9 Update C++ macros for C++17 and MSVC Z mode (ornladios#1347)
      0f404a5d Allow recursive checkout without https (ornladios#1563)
      ae951ca0 CI fixes (ornladios#1744)
      73b840dc Fix for issue in latest conda (ornladios#1757)
      9bb33131 Fixing warnings about conversions in GCC 7+ (ornladios#1753)
      047ce8c4 Fix iostream when used with nogil (ornladios#1368)

Ian Bell (1):
      502ffe50 Add docs and tests for unary op on class (ornladios#1814)

Igor Socec (1):
      a301c5ad Dtype field ordering for NumPy 1.14 (ornladios#1837)

Ivan Smirnov (1):
      d1db2ccf Make register_dtype() accept any field containers (ornladios#1225)

Ivor Wanders (1):
      2b045757 Improve documentation related to inheritance. (ornladios#1676)

Jamie Snape (1):
      a0b8f70d Ensure PythonLibsNew_FOUND is set in FindPythonLibsNew module (ornladios#1373)

Jason Rhinelander (31):
      c6a57c10 Fix dtype string leak
      d2757d04 Remove superfluous "requires_numpy"
      64a99b92 Specify minimum needed cmake version in test suite
      1b08df58 Fix `char &` arguments being non-bindable
      7672292e Add informative compilation failure for method_adaptor failures
      6a81dbbb Fix 2D Nx1/1xN inputs to eigen dense vector args
      a582d6c7 Build /permissive- under VS2017
      835fa9bc Miscellaneous travis-ci updates/fixes
      32ef69ac Qualify `cast_op_type` to help ICC
      5c7a290d Fix new flake8 E741 error from using `l` variable
      71178922 __qualname__ and nested class naming fixes (ornladios#1171)
      086d53e8 Clean up eigen download code (and bump to 3.3.4)
      3be401f2 Silence new MSVC C++17 deprecation warnings
      48e1f9aa Fix premature destruction of args/kwargs arguments
      367d723a Simplify arg copying
      b48d4a01 Added py::args ref counting tests
      88efb251 Fixes for numpy 1.14.0 compatibility
      507da418 Use a named rather than anon struct in instance
      326deef2 Fix segfault when reloading interpreter with external modules (ornladios#1092)
      adbc8111 Use stricter brace initialization
      657a51e8 Remove unnecessary `detail::`
      add56ccd MSVC workaround for broken `using detail::_` warning
      431fc0e1 Fix numpy dtypes test on big-endian architectures
      1ddfacba Fix for Python3 via brew
      e88656ab Improve macro type handling for types with commas
      9f41c8ea Fix class name in overload failure message
      6d0b4708 Reimplement version check and combine init macros
      6862cb9b Add workaround for clang 3.3/3.4
      e763f046 Base class destructor should be virtual
      f7bc18f5 Fix compatibility with catch v2
      177713fa Fix gcc-8 compilation warning

Jeff VanOss (3):
      05d379a9 fix return from std::map bindings to __delitem__ (ornladios#1229)
      01839dce remove duplicate feature from list (ornladios#1476)
      77ef03d5 compile time check that  properties have no py:arg values (ornladios#1524)

Jeffrey Quesnelle (1):
      f93cd0aa PYBIND11_TLS_REPLACE_VALUE should use macro argument value in Python 3.7+ (ornladios#1683)

Jeremy Maitin-Shepard (1):
      a3f4a0e8 Add support for __await__, __aiter__, and __anext__ protocols (ornladios#1842)

Josh Kelley (1):
      741576dd Update documentation for initialize_interpreter (ornladios#1584)

Justin Bassett (1):
      2cbafb05 fix detail::pythonbuf::overflow()'s return value to return not_eof(c) (ornladios#1479)

Jörg Kreuzberger (1):
      69dc380c ornladios#1208 Handle forced unwind exception (e.g. during pthread termination)

Karl Haubenwallner (1):
      e9d6e879 Added a debug flag to the PYBIND11_INTERNALS_VERSION (ornladios#1549)

Khachajantc Michael (1):
      e3cb2a67 Use std::addressof to obtain holder address instead of operator&

Krzysztof Fornalczyk (1):
      5c8746ff check for already existing enum value added; added test (ornladios#1453)

Lori A. Burns (3):
      bdbe8d0b Enforces intel icpc >= 2017, fixes ornladios#1121 (ornladios#1363)
      868d94fc Apply c++ standard flag only to files of CXX language. (ornladios#1678)
      f6c4c104 restores __invert__ to arithmetic-enabled enum, fixes ornladios#1907 (ornladios#1909)

Maciek Starzyk (1):
      9b028562 Update PyPI URLs

Manuel Schneider (1):
      492da592 another typo (ornladios#1675)

Marc Schlaich (1):
      ab003dbd Correct VS version in FAQ

Matthias Geier (1):
      7bb1da96 fix copy-paste error: non-const -> const

Michael Goulding (1):
      77374a7e VS 15.8.0 Preview 4.0 has a bug with alias templates (ornladios#1462)

Michał Wawrzyniec Urbańczyk (1):
      978d439e Add PYBIND11_ prefix to the THROW macro to prevent name collisions. (ornladios#1578)

Naotoshi Seo (1):
      5ef1af13 Fix SEGV to create empty shaped numpy array (ornladios#1371)

Nathan (1):
      9b3fb053 Allow Windows.h min/max to coexist with pybind11 (ornladios#1847)

Omar Awile (2):
      ac6cb91a Fixed small typo (ornladios#1633)
      95f750a8 Add optional buffer size to pythonbuf::d_buffer constructor (ornladios#1687)

Patrik Huber (1):
      41a4fd8a Fix missing word typo

Pauli Virtanen (1):
      c9d32a81 numpy: fix refcount leak to dtype singleton (ornladios#1860)

Roland Dreier (2):
      7a24bcf1 Fix malformed reST (ornladios#1802)
      1aa8dd17 Fix assertion failure for unions (ornladios#1685) (ornladios#1709)

Rune Paamand (2):
      73634b6d Update iostream.h: Changed a local varname 'self' to 'self_' (ornladios#1535)
      06d021b6 Issue ornladios#1532: Incompatible config options, /MP vs /Gm for MSVC in DEBUG (ornladios#1533)

Ryota Suzuki (1):
      1377fbf7 Fix unintentional escaping of character on Windows (ornladios#1574) (ornladios#1575)

Samuel Debionne (2):
      87fa6a43 Detect whether we are running in a Conda environment and adjust get_include() (ornladios#1877)
      6ca312b3 Avoid infinite recursion in is_copy_constructible (ornladios#1910)

Saran Tunyasuvunakool (2):
      b60fd233 Make sure `detail::get_internals` acquires the GIL before making Python calls. (ornladios#1836)
      bdf1a2cc In internals.h, only look at _DEBUG when compiling with MSVC. (ornladios#1855)

Semen Yesylevskyy (1):
      ef13fb2e Info about inconsistent detection of Python version between pybind11 … (ornladios#1093)

Sergei Izmailov (3):
      979d75de doc: Add note about casting from `None` to `T*` (ornladios#1760)
      09f08294  Avoid conversion to `int_` rhs argument of enum eq/ne (ornladios#1912)
      6cb584e9 Adapt to python3.8 C API change (ornladios#1950)

Sergei Lebedev (2):
      08b0bda4 Added set::contains and generalized dict::contains (ornladios#1884)
      046267c6 Added .empty() to all collection types (ornladios#1887)

Stephen Larew (1):
      5b4751af Add const to buffer:request() (ornladios#1890)

Steven Johnson (1):
      4ddf7c40 Add missing includes for better Bazel compatibility (ornladios#1255)

Tarcísio Fischer (1):
      54eb8193 Fix scoped enums comparison for equal/not equal cases (ornladios#1339) (ornladios#1571)

Ted Drain (1):
      0a0758ce Added write only property functions for issue ornladios#1142 (ornladios#1144)

Thomas Hrabe (1):
      534b756c Minor documentation clarification in numpy.rst (ornladios#1356)

Thomas Peters (1):
      dffe869d quiet clang warning by adding default move ctor (ornladios#1821)

Tom de Geus (1):
      a7ff616d Simplified example allowing more robust usage, fixed minor spelling issues

Tomas Babej (1):
      01fada76 Minor typo

Toru Niina (1):
      74d335a5 Replace a usage of C++14 language features with C++11 code (ornladios#1833)

Trevor Laughlin (1):
      63c2a972 Enable unique_ptr holder with mixed Deleters between base and derived types (ornladios#1353)

Unknown (1):
      0b3f44eb Trivial typos

Vladimír Vondruš (2):
      5b0ea77c Fix -Wmissing-prototypes warning on Clang. (ornladios#1863)
      04c8f4b5 Expose BufferError among other pybind11 exceptions. (ornladios#1852)

Wenzel Jakob (51):
      f94d7598 updated changelog for v2.2.1 release
      6d19036c support docstrings in enum::value() (ornladios#1160)
      e7d304fb added citation reference (fixes ornladios#767) (ornladios#1189)
      15e0e445 Moved section on licensing of contributions (fixes ornladios#1109) (ornladios#1188)
      ff6bd092 Fix pybind11 interoperability with Clang trunk (ornladios#1269)
      2d0507db added v2.2.2 changelog
      060936fe Detect pybind11 header path without depending on pip internals (fixes ornladios#1174) (ornladios#1190)
      ed670055 Minor fix for MSVC warning CS4459 (ornladios#1374)
      f5f66189 updated changelog for v2.2.3
      cbd16a82 stl.h: propagate return value policies to type-specific casters (ornladios#1455)
      d4b37a28 added py::ellipsis() method for slicing of multidimensional NumPy arrays
      885b5b90 Eigen test suite: don't create a np.matrix
      e0f3a766 Fixed flake8 error in test_iostream.py
      44e39e0d fix regression reported by @cyfdecyf in ornladios#1454 (ornladios#1517)
      35c82c72 changelog for version 2.2.4 & features targeted for 2.3.0
      06710020 object_api: support the number protocol
      b4b22924 relax operator[] for tuples, lists, and sequences
      f4245181 enum_: move most functionality to a non-template implementation
      c8e9f3cc quench __setstate__ warnings (fixes ornladios#1522)
      c9b8933e flake8 fix
      9f73060c std::array<> caster: support arbitrary sequences (ornladios#1602)
      adc2cdd5 fixed regression in STL type caster RVPs (fixes ornladios#1561) (ornladios#1603)
      e2eca4f8 Support C++17 aligned new statement (ornladios#1582)
      cea42467 fix py::cast<void *> (ornladios#1605)
      64205140 added std::deque to overview.rst
      d1f64fa9 AppVeyor: quench pip deprecation warnings for v2.7
      ccbe68b0 added binding delattr() -> PyObject_DelAttr analogous to hasattr()
      25abf7ef flake8 fixes
      e11e71d8 Make compiler flags for -Werror specific to GNU, Clang, or Intel
      51ca6b08 Update docs on std::out_of_range exception mapping (ornladios#1254)
      cf36e3d9 updated changelog
      64f2a5f8 begin work on v2.3.1
      ed39c504 README.md: added several folks who've made repeated contributions
      a1b71df1 fix issue ornladios#1804 (warning about redefined macros)
      8b90b1da error_already_set: acquire GIL one line earlier (fixes ornladios#1779)
      9fd47121 fix test suite (pytest changes in ExceptionInfo class)
      b2c4ff60 renamed local gil_scoped_acquire to gil_scoped_acquire_local to avoid ambiguity
      c9f5a464 pybind11 internals: separate different compilers
      00a0aa99 v2.4.0 release
      e825205a begin working on v2.4.1
      21d0eb46 Fix Python 3.8 test regression
      5fd187eb minor changelog cleanup
      31680e6f Implicit conversion from enum to int for Python 3.8 (fix by @sizmailov)
      e44fcc3c v2.4.1 release
      82cf7935 begin working on next version
      f3109d84 future-proof Python version check from commit 31680e6 (@lgritz)
      7f5dad7d Remove usage of C++14 constructs (fixes ornladios#1929)
      7ec2ddfc v2.4.2 release
      2abd7e1e updated release.rst to remove parts that are now automated
      34c2281e begin working on next version
      80d45248 v2.4.3 release

Yannick Jadoul (4):
      b4719a60 Switching deprecated Thread Local Storage (TLS) usage in Python 3.7 to Thread Specific Storage (TSS) (ornladios#1454)
      085a2943 Increasing timeout in test_gil_scoped.py to get AppVeyor to succeed
      97784dad [BUGFIX] Fixing pybind11::error_already_set.matches to also work with exception subclasses (ornladios#1715)
      d23c821b Make static member functions, added with `def_static`, `staticmethod` descriptor instances (ornladios#1732)

Zach DeVito (1):
      03874e37 Fix leak in var arg handling

ali-beep (1):
      5ef13eb6 Add negative indexing support to stl_bind. (ornladios#1882)

cdyson37 (1):
      111b25b2 Mention flake8 and check-style.sh in CONTRIBUTING (ornladios#1567)

kingofpayne (1):
      12e8774b Added support for list insertion. (ornladios#1888)

luz.paz (2):
      28cb6764 misc. typos
      13c08072 Typo

luzpaz (2):
      4b874616 Misc. typos (ornladios#1384)
      21bf16f5 misc. comment typo (ornladios#1629)

martinRenou (1):
      35045eee Add getters for exception type, value and traceback (ornladios#1641)

nstelzen (1):
      c2514340 Added note in documentation regarding make install (ornladios#1801)

oremanj (2):
      fd9bc8f5 Add basic support for tag-based static polymorphism (ornladios#1326)
      e7761e33 Fix potential crash when calling an overloaded function (ornladios#1327)

phil-zxx (1):
      c6b699d9 Added ability to convert from datetime.date to system_clock::time_point (ornladios#1848)

sizmailov (1):
      21c3911b add signed overload for `py::slice::compute`

voxmea (1):
      17983e74 Adds type_caster support for std::deque. (ornladios#1609)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants