Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring the PidObservable and ProfileObservable framework #3599

Open
jngrad opened this issue Mar 26, 2020 · 6 comments
Open

Refactoring the PidObservable and ProfileObservable framework #3599

jngrad opened this issue Mar 26, 2020 · 6 comments

Comments

@jngrad
Copy link
Member

jngrad commented Mar 26, 2020

We currently have the following observable framework:

class Observable
class PidObservable: Observable
class ProfileObservable: Observable
class CylindricalProfile
class PidProfileObservable: PidObservable, ProfileObservable
class CylindricalPidProfileObservable: PidObservable, CylindricalProfile

The cartesian and cylindrical observables have a completely different inheritance diagram despite doing the same thing with different names (attributes x, y, z become r, phi, z). There is also un-necessary code duplication in the script interface, where each core class must be re-declared with AutoParameters. Finally, adding another coordinate system (e.g. SphericalProfile for RDF) or another particle trait (e.g. particle type instead of id for RDF) leads to a combinatorial growth of the number of core classes and script interface classes.

We could refactor these classes to take a template argument that selects the coordinate system (Cartesian, Cylindrical, Spherical) and add a PtypeObservable:

class Observable
class PidObservable: Observable
class PtypeObservable: Observable
class ProfileObservable<CoordSystem>: Observable
class PidProfileObservable<CoordSystem>: PidObservable, ProfileObservable<CoordSystem>
class PtypeProfileObservable<CoordSystem>: PtypeObservable, ProfileObservable<CoordSystem>

This was attempted in branch jngrad/espresso:refactor-analysis-rdf4. This new observable framework is now more easily extensible, however the introduction of PtypeObservable lead to a lot of code duplication in the script interface (PtypeObservable is essentially a copy-paste of PidObservable). I don't think it's the right solution. This un-necessary code duplication could probably be avoided by making coordinate system observable classes completely independent from the Observable class, but the constraints of the AutoParameters framework makes script interface class design a little bit unpredictable. For example, the cylindrical coordinate system requires an extra axis argument compared to spherical and cartesian coordinates, and I couldn't figure out a working solution for a variadic template ctor in an AutoParameters class. It also doesn't seem possible to have AutoParameters classes forward an incomplete list of arguments to a ctor containing only optional parameters.

Perhaps the solution here is to provide a unified PtraitObservable that replaces PidObservable and PtypeObservable, with ctor PtraitObservable(std::vector<int> ids={}, std::vector<int> types={}, ...) such that we can instantiate it with either particle ids or particle types (the ctor would convert the types to ids). In that case, any future plan for a particle selection feature based on other criteria would need to be implemented at the Python level (except for filtering out virtual particles, see #3154).

Another issue that is blocking the conversion of RDF to an observable is that RDF takes two lists of particle ids as argument. I don't see an easy way to extend our current framework to take two lists instead of one. One could write a Python ctor that automatically flattens the two arrays into a single one with a sentinel value as separator, and have the core observable un-flatten the array. This would probably work, despite being inelegant. We cannot write an RDF class directly and have it work in the script interface, because we can only register classes that inherit from a base class where all attributes are already bound via AutoParameters. This means that to write an RDF class, one has to write a dummy parent class (e.g. TypesObservable in jngrad/espresso:refactor-rdf-4) in both the core and script interface, which causes too much code duplication in the script interface.

Almost orthogonal to that is the question of returning the range of values on the axes of a ProfileObservable. At the moment it is the user's responsibility to come up with the correct Python code for each observable. This should be done at the core level.

@jngrad
Copy link
Member Author

jngrad commented Mar 27, 2020

Concerning the last point, I tried returning the histogram edges from the core observables, but I can't get the AutoParameters interface to work out. The WIP is in jngrad/espresso:observable-edges. The compiler error in a118cc3 is not human-readable. Converting the read-only AutoParameter (the lambda is a getter) by a call_method() in ab4b84a produces a compiler error that is also not human readable. To fix it, I added a new type in the Variant with 5550126, because the std::vector<boost::recursive_variant_> type is not actually a recursive type? In any case, I now have a runtime error AttributeError: Class DensityProfile does not have an attribute edges.

@fweik how do I:

  • get an edges() method in the Python interface starting from 5550126?
  • pass a std::vector<std::vector<T>> between the core and python starting from 688bcc7?

@fweik
Copy link
Contributor

fweik commented Mar 29, 2020

pass a std::vector<std::vector> between the core and python starting from 688bcc7

You can not get a std::vector<std::vector<int>> from a Variant, because this is not a type
that it can hold. You can get a std::vector<Variant>, and then a std::vector<int> from every entry.
From the python side it should just work, e.g. a list of lists of int should be converted to a std::vector<Variant>. Does this answer your question?

@fweik
Copy link
Contributor

fweik commented Mar 29, 2020

pass a std::vector<std::vector> between the core and python starting from 688bcc7?

The change in 688bcc7 is not required, the variant can already contain nested data, because
its a recursive variant.

@KaiSzuttor
Copy link
Member

So is the discussion about the overall observables design finished? Otherwise I would avoid thinking about implementation details.
E.g. it's not totally obvious to me why we need a PtypeObservable (btw. should be called PartTypeObservable or PTypeObservable). Filtering the particles can be done on the interface level very easily or do you want to dynamically include particles in the observable, i.e. particles that change type during the simulation?

@fweik
Copy link
Contributor

fweik commented Mar 30, 2020

@jngrad, if the AutoParameters stuff makes you too much trouble, you don't have to use it. You can also just directly implement ScriptInterfaceBase and do by your own rules. I know that all of this
business is fairly horrible. I have an considerable improvement in the pipeline, but I can't promise when this will be ready. Frankly, I'm already overloaded, and as per usual there is no plan for anything.

@KaiSzuttor lets' have that discussion separately, but I agree we should have it now. I think the earlier designs are discussed, the better. Ideally before implementation starts.

@jngrad
Copy link
Member Author

jngrad commented Mar 30, 2020

@fweik I finally managed to get the core to send a vector of vectors back to Python using the recursive Variant. Let's see if I can figure out how to pass a list of lists from Python to the core...

The frustration with the AutoParameters infrastructure stems from the limited documentation (I have to read the AutoParameter implementation to understand the feature) and from the unreadable template compiler errors generated by gcc and Clang.

The np.reshape callback we introduced in the Cython Observables recently could probably be replaced by a ScriptInterface method that convert the flattened array into an array of arrays.

@KaiSzuttor the particle type-based observable is causing more trouble than it is worth, let's drop it.

kodiakhq bot added a commit that referenced this issue Apr 17, 2020
Fixes partially #3599

Description of changes:
- convert `CylindricalProfile` to `CylindricalProfileObservable` to make the observables framework homogeneous
- for all four `*ProfileObservable` classes, add a method `edges()` to calculate the coordinates of the  bins, producing the same output as `numpy.histogramdd()`
- fix broken LB sample
kodiakhq bot added a commit that referenced this issue May 15, 2020
Folllow-up to #3599

Description of changes:
- replaced the RDF analysis tool by an RDF observable
- removed the particle position storage mechanism from `statistics.cpp`
- documented observable `TimeSeries`
kodiakhq bot added a commit that referenced this issue Nov 18, 2020
Description of changes:
- remove code duplication in profile observables
- unify `ProfileObservable` and `CylindricalProfileObservable` classes (partial fix for #3599)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants