Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Observable edges #3608

Merged
merged 10 commits into from
Apr 17, 2020
Merged

Add Observable edges #3608

merged 10 commits into from
Apr 17, 2020

Conversation

jngrad
Copy link
Member

@jngrad jngrad commented Mar 30, 2020

Fixes partially #3599

Description of changes:

  • convert CylindricalProfile to CylindricalProfileObservable to make the observables framework homogeneous
  • for all four *ProfileObservable classes, add a method edges() to calculate the coordinates of the bins, producing the same output as numpy.histogramdd()
  • fix broken LB sample

jngrad added 4 commits March 30, 2020 21:50
Make CylinderProfile a subclass of Observable, cleanup include
statements, add a Base member in the script interface classes.
std::vector<size_t> shape() const override {
return {n_x_bins, n_y_bins, n_z_bins};
}

/** Calculate the bin edges for each dimension */
std::vector<std::vector<double>> edges() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole class seems to be identical CylindricalProfileObservable except for the variable names, maybe this repetition can be avoided?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent 10 days refactoring these observables classes (#3599), but the AutoParameters framework keeps blocking me no matter what I try. I gave up using class templates, otherwise I won't get anything done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is the core code, no? Also I don't think even a template is needed, they can just be the sample class, the only difference is the name of the data members for the size and the limits?

class ProfileObservable : virtual public Observable {
 std::array<std::pair<double, double>, 3> m_limits;
std::array<size_t, 3> m_bins;
public:
   ProfileObservable = ...

  std::vector<size_t> shape() const override {
    return {m_bins.begin(), m_bins.end()};
  }

  std::array<std::vector<double>, 3> edges() {
   std::array<std::vector<double>, 3> ret;

  for(i in {0, 1, 2}) {
   boost::copy(Utils::make_lin_space(m_limits[i].first, m_limits[i].second, bins[i]),     std::back_inserter(profile_edges[i]));
   } 
    return ret;
  }
};

class CartesianProfileObservable : public ProfileObservable {
  public:
  size_t n_bins_x() const { return m_bins[0]; }
 ... 
}

class CylindricalProfileObservable : public ProfileObservable {
  public:
  size_t n_bins_phi() const { return m_bins[0]; }
 ... 
}

I'm not even sure if the derived classes are really needed. If not
you can use the same interface. Otherwise you don't have to use
the fact that they have a common base in the script interface, so
it should not matter there. Disclaimer: Just for illustration, but I
think something like this shoudl work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had another solution in mind, with a template class that takes a coordinate system struct as template parameter, and where the n_bins_*... are aliases (e.g. size_t &n_bins_x = n_bins_0) to avoid writing getters. I would prefer to defer this change to another PR to keep this one simple, and because such a change would have to be carried out on the LBProfileObservable classes as well, which will then collide with #3607.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't want to change this now, you cold also put the implementation of edges() into a free function an reuse it without messing with the class hierarchy.

Copy link
Contributor

@fweik fweik Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you add reference data members, the class is no longer default constructible, because references can not be unbound. Just use functions, your construction looks fishy anyway.

Copy link
Member Author

@jngrad jngrad Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've converted them to getters in jngrad/espresso:observable-edges-5 but I still get the same compiler errors, plus new ones that are so long they don't even fit on my fullscreen terminal window. Probably caused by a typo, but I don't have the patience to dissect these 18136-character long compiler warnings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting. From what I can tell the problem is already with the core classes, e.g. Observables::DensityProfile can not be constructed. This seems to be related to virtual
inheritance, or the lack thereof to be more precise. The following very simple unit test
shows the issue:

#define BOOST_TEST_MODULE Observables::DensityProfile
#define BOOST_TEST_DYN_LINK
#include <boost/test/unit_test.hpp>

#include "observables/DensityProfile.hpp"

BOOST_AUTO_TEST_CASE(ctor) {
  Observables::DensityProfile densityProfile({}, {}, {}, {}, {}, {}, {}, {}, {}, {});
}

Leading to

[...]
/ssd/fweik/espresso/src/core/observables/DensityProfile.hpp:33:31: error: no matching function for call to ‘Observables::ProfileObservableBase::ProfileObservableBase()’

which lets me suspect that there is a second base ProfileObservableBase, in addition to the one whose ctor is explicitly called from ProfileObservableBase, which is then tried to default
construct, which fails, because there is no default ctor.

I don't understand how this happens, nor why there is virtual inheritance in this hierarchy in the first place?!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for DensityProfile, the issue can be fixed by not deriving ProfileBase from Observable, and implementing shape in the derived classes (most of them by either just calling ProfileBase::shape, or adding something to the result).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue can be fixed by not deriving ProfileBase from Observable

This is also what I ended up doing in c3c9afc.

jngrad and others added 2 commits March 31, 2020 00:16
Co-authored-by: Florian Weik <[email protected]>
Optional parameter `density=True` is not allowed in np.histogramdd()
in numpy 1.11.
@codecov
Copy link

codecov bot commented Mar 30, 2020

Codecov Report

Merging #3608 into python will increase coverage by 0%.
The diff coverage is 85%.

Impacted file tree graph

@@          Coverage Diff           @@
##           python   #3608   +/-   ##
======================================
  Coverage      88%     88%           
======================================
  Files         521     521           
  Lines       22430   22463   +33     
======================================
+ Hits        19836   19867   +31     
- Misses       2594    2596    +2     
Impacted Files Coverage Δ
...ace/observables/CylindricalLBProfileObservable.hpp 9% <0%> (ø)
.../core/observables/CylindricalProfileObservable.hpp 87% <84%> (ø)
...ore/observables/CylindricalLBProfileObservable.hpp 100% <100%> (ø)
...re/observables/CylindricalPidProfileObservable.hpp 100% <100%> (ø)
src/core/observables/ProfileObservable.hpp 100% <100%> (ø)
...ce/observables/CylindricalPidProfileObservable.hpp 21% <100%> (+9%) ⬆️
...ript_interface/observables/LBProfileObservable.hpp 17% <100%> (+2%) ⬆️
...ipt_interface/observables/PidProfileObservable.hpp 22% <100%> (+9%) ⬆️
src/core/polymer.cpp 92% <0%> (-6%) ⬇️
src/core/electrostatics_magnetostatics/p3m.cpp 85% <0%> (-1%) ⬇️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 06e8d3f...3a6ccc9. Read the comment docs.

@jngrad jngrad requested a review from RudolfWeeber March 30, 2020 22:45
@fweik fweik self-assigned this Mar 31, 2020
@fweik
Copy link
Contributor

fweik commented Apr 1, 2020

Maybe we should start by first doing the core changes, with corresponding tests, and then doing the interface. I understand that this is a frustrating experience for you, but going forward with the messed
up class hierarchy in the core would probably come back to us. Maybe we can also recruit @KaiSzuttor to help with this, as he is responsible for the original code, iirc.

@KaiSzuttor
Copy link
Member

I'm not so sure if I'm able to invest too much time here. I can have a look though.

@jngrad
Copy link
Member Author

jngrad commented Apr 2, 2020

@KaiSzuttor We can wait for this PR to be merged first. The WIP in question is in c3c9afc, where the Cartesian/Cylindrical/Spherical Profile classes don't depend on Observable anymore.

@fweik
Copy link
Contributor

fweik commented Apr 14, 2020

@jngrad I'm no clear what your plan is here? First merge this and then fix the core stuff separately?

@jngrad
Copy link
Member Author

jngrad commented Apr 14, 2020

I've already done some refactoring here, but it's minimal. I would like to keep this PR self-contained and not delay it any further. I'll do the ObservableProfile refactor in a subsequent PR, most likely in a WIP since I'm not totally sure which level of abstraction to use in the class design.

@fweik
Copy link
Contributor

fweik commented Apr 14, 2020

Alright, then I think we can just move forward with this. Discussing the design for the rest in an successive PR is a good idea.

fweik
fweik previously approved these changes Apr 14, 2020
@fweik
Copy link
Contributor

fweik commented Apr 14, 2020

@RudolfWeeber I've already reviewed this.

@KaiSzuttor KaiSzuttor added the automerge Merge with kodiak label Apr 14, 2020
@KaiSzuttor KaiSzuttor removed the request for review from RudolfWeeber April 14, 2020 09:44
@RudolfWeeber RudolfWeeber removed the automerge Merge with kodiak label Apr 14, 2020
@RudolfWeeber
Copy link
Contributor

Please wait. I'd like to take alook.

@jngrad
Copy link
Member Author

jngrad commented Apr 14, 2020

I vaguely remember we briefly discussed the output of the edges() function, e.g. whether to return the N+1 values between the bins or the N bin centers. Should I add a parameter to the method to decide which tick marks to return?

@fweik
Copy link
Contributor

fweik commented Apr 14, 2020

@jngrad the centers can be easily calculated from the edges, so I wouldn't bother

@KaiSzuttor
Copy link
Member

I think we agreed that the default behavior should be compatible to numpy which it is now

@RudolfWeeber
Copy link
Contributor

I didn't look in detail yet, but a few first impresssions:

From reading the code, it is not immediately apparent to me if I can do

value = obs.result[i_x,i_y,i_z]
lower_edge = obs.edges()[i_x,i_y_iz]

Independently of what Numpy does, this is, what should work, IMO. Does it?

Is ther doc on this, somewhere?

IMO, rather than edges(), it should be bin_edges() for clarity's sake. I'd also find the bin_centers() method useful, but it can also be done in a subsequent PR.

Do we do something about the LB profile observables. I assume, the behavior of the bin edge calculation should be independent of the sampling_offset (as it is now)).

@jngrad
Copy link
Member Author

jngrad commented Apr 14, 2020

@RudolfWeeber are you sure obs.edges()[i_x,i_y,i_z] is what we should be able to use? This requires returning a 4D matrix with the dimensionality of the actual histogram for the first 3 dimensions and a 3-tuple for the fourth dimension. I think this is quite wasteful. At the moment, edges() returns a 2D array. To get the lower corner of a bin, you would do the following:

obs_edges = np.array(density_profile.edges())
obs_edges[(0,1,2),(i_x, i_y, i_z)]

To plot it, you would either discard the last value to get the same dimensions of the histogram, or calculate the bin centers with a one-liner. To get the bin edges as a 4D matrix, there is probably a numpy function that can do the work for us.

If this is ok with you, I'll rename the method as bin_edges(), create bin_centers(), return them as numpy arrays directly, and add documentation.

@jngrad
Copy link
Member Author

jngrad commented Apr 14, 2020

ok I just realized that we cannot do the indexing in obs_edges[(0,1,2),(i_x, i_y, i_z)] for a numpy array where the row sizes are different... Which is the most common case: rectangular boxes and cylindrical histograms almost always have different sizes per dimension. The only valid syntax to get the lower corner is then:

obs_edges = density_profile.edges()
[obs_edges[i][j] for i,j in zip((0,1,2),(i_x, i_y, i_z))]

@KaiSzuttor
Copy link
Member

Maybe we should write down a use-case sample...

@jngrad
Copy link
Member Author

jngrad commented Apr 14, 2020

My understanding was that bin edges were a needed feature to facilitate plotting of histogram slices. @RudolfWeeber do you have another application in mind?

@RudolfWeeber
Copy link
Contributor

RudolfWeeber commented Apr 14, 2020 via email

@fweik
Copy link
Contributor

fweik commented Apr 14, 2020

@jngrad you don't have to return the matrix, just a python object that behaves like it.

@fweik fweik closed this Apr 14, 2020
@fweik fweik reopened this Apr 14, 2020
@jngrad
Copy link
Member Author

jngrad commented Apr 14, 2020

@RudolfWeeber do you have a MWE that could help me understand how you intend to use this 4D object? I thought you would pass the bin centers to matplotlib.pyplot.axis to label your axes, but what you have in mind is rather different.

@RudolfWeeber
Copy link
Contributor

RudolfWeeber commented Apr 14, 2020 via email

jngrad added 2 commits April 15, 2020 18:54
Reproduce the core ProfileObservable class in the Python interface.
Calculate the bin edges and centers in Python.
Discuss ongoing efforts to convert functionality from the Analysis
module to the Observable framework. Rewrite section explaining how
to instantiate and use observables. Document the new bin_edges()
and bin_centers() methods and provide a matplotlib script to
illustrate how they differ.
@jngrad
Copy link
Member Author

jngrad commented Apr 15, 2020

@RudolfWeeber Profile observables now provide methods bin_centers() and bin_edges() for convenience. Turns out, most matplotlib functions require bin centers, but some require bin edges, in particular pcolormesh() will silently trim the last coordinate of the dataset if you give it bin centers! The Observables documentation now has a matplotlib script to showcase these 2 methods.

@fweik fweik added the automerge Merge with kodiak label Apr 17, 2020
@kodiakhq kodiakhq bot merged commit 18ffb6f into espressomd:python Apr 17, 2020
@jngrad jngrad deleted the observable-edges-3 branch January 18, 2022 12:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants