Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a region calculus #995

Open
wants to merge 75 commits into
base: dev
Choose a base branch
from

Conversation

eschnett
Copy link
Contributor

Closes #989 .

@eschnett
Copy link
Contributor Author

This is a draft only.

There are three classes:

  • point<T,D>, which describes a point in D-dimensional space
  • box<T,D>, which describes a rectangular region in D-dimensional space. A box is described by two points, its lower and upper bounds.
  • region<T,D>, which describes an arbitrary region. A region is implemented as a set of boxes. Obviously, regions that look "blocky" are most efficient.

Each class has three variants:

  • point<T,D>, efficient to use in C++ if you know the dimension
  • vpoint<T>, a virtual base class for points with any dimension, convenient to use from Python
  • a helper class wpoint<T>, a wrapper about point<T,D> that is a subtype of vpoint<T>, and which deals in pointers to points (instead of points).
    Similarly for box and region.

Please advise:

  • I am currently using Google Test for self-tests. What testing framework do you use? Since Google Test currently isn't available, the code doesn't build.
  • I did not modify the coding style at all. I'll do that later.
  • The current code is in one large file. It should probably be split into three, since there are three major classes in there.
  • The dimension-agnostic wrapper classes use unique_ptr for efficiency. Maybe they should use shared_ptr for convenience instead?
  • There are functions for serializing/deserializing to YAML. These require a YAML library. Are these functions of interest, or should I remove them?

@ax3l ax3l self-requested a review May 28, 2021 01:22
@ax3l ax3l added api: new additions to the API frontend: C++17 labels May 28, 2021
@ax3l ax3l requested a review from franzpoeschel May 28, 2021 01:23
@ax3l
Copy link
Member

ax3l commented May 28, 2021

Thanks for the details!

Is wpoint a view or non-owning point?

I am currently using Google Test for self-tests. What testing framework do you use? Since Google Test currently isn't available, the code doesn't build.

We are using Catch2: https://github.com/catchorg/Catch2
As far as I can see, both have a very similar syntax :)
https://github.com/catchorg/Catch2/blob/devel/docs/tutorial.md

Example tests are located in test/ and you could add to the AuxiliaryTest.cpp file or add a new one :)

I did not modify the coding style at all. I'll do that later.
The current code is in one large file. It should probably be split into three, since there are three major classes in there.

Sounds great, please also consider adding Doxygen strings in the headers.

The dimension-agnostic wrapper classes use unique_ptr for efficiency. Maybe they should use shared_ptr for convenience instead?

If you can use a unique_ptr that is certainly the cleanest way. But I haven't gotten into the detail yet enough to understand where you use it.

There are functions for serializing/deserializing to YAML. These require a YAML library. Are these functions of interest, or should I remove them?

We vendor a JSON library (https://github.com/nlohmann/json) that we can use for serialization, but don't yet use a yaml lib.

If we can separate them out well, we make them opt-in. I would try to avoid another mandatory dependency for now.

@eschnett
Copy link
Contributor Author

Hiding the dimension is somewhat difficult in C++.

  • point<T,D> is a regular C++ class, not using inheritance, straightforward to use if one knows D at compile time
  • vpoint<T> is an abstract base class
  • wpoint<T,D> is a subtype of vpoint<D>. It contains (owns) a point<T,D>. However, to make its functions virtual, they need to accept pointers as arguments, and return pointers, which is weird by modern standards. To ensure safety, I use unique_ptr<wpoint<T,D>> here.
  • Thinking about this a bit more, to make these types easily usable from C++, I should introduce another class ndpoint<T> that wraps a unique_ptr<vpoint<D>>, which would hide the unique_ptr business that is necessary for virtual functions.

I'll implement a proof of concept for discussion.

Thanks for the other comments.

@eschnett
Copy link
Contributor Author

I added new (temporary) files RegionCalculus2.hpp as well as respective catch-based tests. These introduce two user-visible classes:

  • point<T,D>, for C++ use when the number of dimensions is known at compile time
  • ndpoint, owning a point<T,D> for some D, for use when the number of dimensions is only known at run time.

The former is necessary for efficiency. The latter is convenient when creating high-level language bindings.

The two other classes vpoint and wpoint are now hidden; they are not supposed to be externally visible.

If you agree with this generic setup, then I'll clean up my implementation.

@lgtm-com
Copy link
Contributor

lgtm-com bot commented May 28, 2021

This pull request introduces 1 alert when merging 84796f8 into 9170612 - view on LGTM.com

new alerts:

  • 1 for Comparison result is always the same

@eschnett
Copy link
Contributor Author

eschnett commented Jun 2, 2021

@ax3l Can you comment? Do you want to discuss further, or would my API choices have a good chance of being accepted?

@ax3l
Copy link
Member

ax3l commented Jun 2, 2021

Thanks for the ping!
I think the latest design concept (RegionCalculus2) looks great.

It's a bit challenging for compile-time that we have to put so much in header-only files due to templates. But since we make the headers opt-in for users, this should not cause too much trouble.

Can you please at the end add Doxygen strings to classes and methods?
I would also wrap the Region calculus helper namespace once more in the usual openPMD:: namespace.
Let's also add a new example and go through it on a page in the manual. That page would also explain the concept mentioned above.

With respect to naming conventions, we could try to make the class spellings similar to existing classes in the library (uppercase first letter), unless that feels very unnatural for you.

@eschnett
Copy link
Contributor Author

eschnett commented Jun 5, 2021

I find that I'm using C++17, because that's quite convenient, especially if constexpr statements. I see that openPMD-api only uses C++14 elsewhere. If you're planning on upgrading soon, I'll leave the code, otherwise I'll have to downgrade my code.

@eschnett
Copy link
Contributor Author

eschnett commented Jun 6, 2021

Are you using a particular benchmarking infrastructure?

@ax3l
Copy link
Member

ax3l commented Jun 7, 2021 via email

@ax3l
Copy link
Member

ax3l commented Jun 7, 2021 via email

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Jun 27, 2021

This pull request introduces 1 alert when merging 15399b1 into ed8f16f - view on LGTM.com

new alerts:

  • 1 for Comparison result is always the same

# Conflicts:
#	.github/workflows/unix.yml
@lgtm-com
Copy link
Contributor

lgtm-com bot commented Jul 8, 2021

This pull request introduces 1 alert when merging c436db6 into 239cf68 - view on LGTM.com

new alerts:

  • 1 for Comparison result is always the same

CMakeLists.txt Show resolved Hide resolved
include/openPMD/regions/Box.hpp Outdated Show resolved Hide resolved
include/openPMD/regions/Box.hpp Outdated Show resolved Hide resolved
@franzpoeschel
Copy link
Contributor

franzpoeschel commented Jul 12, 2021

Fun fact: This seems to uncover a compiler bug on g++ 7.5.0:

[ 93%] Building CXX object CMakeFiles/13_regions.dir/examples/13_regions.cpp.o
In file included from /home/franzpoeschel/git-repos/openPMD-api/include/openPMD/regions/Box.hpp:5:0,
                 from /home/franzpoeschel/git-repos/openPMD-api/include/openPMD/regions/Regions.hpp:4,
                 from /home/franzpoeschel/git-repos/openPMD-api/examples/13_regions.cpp:1:
/home/franzpoeschel/git-repos/openPMD-api/include/openPMD/regions/Point.hpp: In instantiation of 'openPMD::Regions::fmap(const F&, const openPMD::Regions::Point<T, D>&, const openPMD::Regions::Point<Args, D>& ...)::<lambda(openPMD::Regions::Point<T, D>::size_type)> [with F = openPMD::Regions::operator*(const int&, const openPMD::Regions::Point<int, 2>&)::<lambda(const int&)>; Args = {}; R = int; T = int; long unsigned int D = 2; openPMD::Regions::Point<T, D>::size_type = long int]':
/home/franzpoeschel/git-repos/openPMD-api/include/openPMD/regions/Point.hpp:103:48:   required from 'struct openPMD::Regions::fmap(const F&, const openPMD::Regions::Point<T, D>&, const openPMD::Regions::Point<Args, D>& ...) [with F = openPMD::Regions::operator*(const int&, const openPMD::Regions::Point<int, 2>&)::<lambda(const int&)>; Args = {}; R = int; T = int; long unsigned int D = 2]::<lambda(openPMD::Regions::Point<int, 2>::size_type)>'
/home/franzpoeschel/git-repos/openPMD-api/include/openPMD/regions/Point.hpp:102:29:   required from 'constexpr openPMD::Regions::Point<R, D> openPMD::Regions::fmap(const F&, const openPMD::Regions::Point<T, D>&, const openPMD::Regions::Point<Args, D>& ...) [with F = openPMD::Regions::operator*(const int&, const openPMD::Regions::Point<int, 2>&)::<lambda(const int&)>; Args = {}; R = int; T = int; long unsigned int D = 2]'
/home/franzpoeschel/git-repos/openPMD-api/include/openPMD/regions/Point.hpp:313:16:   required from here
/home/franzpoeschel/git-repos/openPMD-api/examples/13_regions.cpp:29:20:   in constexpr expansion of 'openPMD::Regions::operator*(2, y)'
/home/franzpoeschel/git-repos/openPMD-api/include/openPMD/regions/Point.hpp:103:36: internal compiler error: in tsubst_pack_expansion, at cp/pt.c:11554
         [&](size_type d) { return f(x.elts[d], args.elts[d]...); });
                                   ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.

Building with clang++ 11.0.1 did work, however I get a lot of warnings for comparing unsigned (std::size_t D) integers with signed integers (typedef std::ptrdiff_t size_type).

Also conceptually, I understand it correctly, that this code is currently not supposed to interact with other parts of openPMD apart from often being useful to have?
It looks like some of the functionality introduced here could be interesting to use in openPMD once upon switching to C++17, especially Boxes for dealing with chunks. This includes normal loading/storing, but also things like chunk distribution algorithms may become more legible if using these high-level classes.

@eschnett
Copy link
Contributor Author

@franzpoeschel I will have a look at the signed/unsigned warnings. I don't recall seeing these in the code I added.

Conceptually, Regions would be added without making any other changes to openPMD-api at first. Later, it would be unobtrusive to allow NDPoint and NDBox types where currently Offset and/or Extent are used. In a second step, it would be straightforward to replace ChunkInfo with NDBox.

It might also make sense to offer certain helper routines that are sometimes needed to read or write chunks. Examples would be extracting or inserting a sub-box into a box: A given array in memory is defined by a box, and the chunk that is read or written is only a part of this (another box, contained in the first), and there could be efficient functions for reading/writing these. Alternatively, the chunk reading/writing functions could be extended to support these operations. HDF5 calls this "hyperslab selection".

The core contribution is the NDRegion class, which is an efficient implementation of arbitrarily shaped regions. I assume that these would mostly be used by consumers of openPMD-api, e.g. by post-processing or visualization tools.

@franzpoeschel
Copy link
Contributor

Thank you for the clarification!

Introducing the region calculus as an independent library segment first and later using it where applicable is a good approach in my opinion. I have not really had a very thorough look into the code itself, rather than skimming it for a first understanding, but this looks like a nice addition to openPMD overall, thank you for contributing it :)

@eschnett
Copy link
Contributor Author

I see no warnings when building with Apple clang version 12.0.0 (clang-1200.0.32.29). Are you using special options to enable compiler warnings?

@franzpoeschel
Copy link
Contributor

franzpoeschel commented Jul 12, 2021

After some playing around, I only see those warnings when building with CMAKE_BUILD_TYPE=Debug, Release mode results in a quiet build.
@ax3l should we maybe do the clangtidy build in Debug mode?

@eschnett
Copy link
Contributor Author

I still see no warnings when building in Debug mode. Using -DopenPMD_USE_INVASIVE_TESTS=ON doesn't show them either. It must be the architecture – I'm using macOS and Apple Clang.

@eschnett
Copy link
Contributor Author

I disabled Regions by default.

@franzpoeschel
Copy link
Contributor

I still see no warnings when building in Debug mode. Using -DopenPMD_USE_INVASIVE_TESTS=ON doesn't show them either. It must be the architecture – I'm using macOS and Apple Clang.

Hm, weird. I used no special flags otherwise for compiling.

@franzpoeschel
Copy link
Contributor

I have tried to fix the warnings on this branch. Is there a particular reason that you would use std::ptr_diff for the size_types? I did not find an immediate reason for it. Is it because of a size computation by subtracting end - start? I did not find something like this and something like this should probably be treated as an implementation detail anyway.
After removing every instance of std::ptr_diff, no warnings show for me any longer.

@eschnett
Copy link
Contributor Author

@franzpoeschel I looked at the code, and the type size_type is currently indeed used only in very few places, and could indeed be set to std::size_t. I'm sure there was a good reason in an earlier version of the code...

Can you open a pull request against my branch? This will allow me to see a diff for the changes you made.

@franzpoeschel
Copy link
Contributor

franzpoeschel commented Jul 15, 2021

I have opened a pull request: eschnett#1 @eschnett

@eschnett
Copy link
Contributor Author

@franzpoeschel I merged you pull request.

# Conflicts:
#	.github/workflows/unix.yml
@ax3l ax3l added this to the 0.16.0 milestone Aug 1, 2023
@BenWibking
Copy link

Just wanted to check - is this PR still active?

@ax3l ax3l modified the milestones: 0.16.0, 0.17.0 Jun 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: new additions to the API frontend: C++17
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement a bounding box calculus
4 participants