Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add introspection to eckit::Configuration #101

Merged
merged 2 commits into from
May 30, 2024

Conversation

wdeconinck
Copy link
Member

This proposed addition allows to make sure that a configured value is of an expected type without resorting to try / catch.

@wdeconinck wdeconinck force-pushed the feature/config_introspection branch from 50c50ce to a9f8ea9 Compare January 10, 2024 10:02
@codecov-commenter
Copy link

codecov-commenter commented Jan 12, 2024

Codecov Report

Attention: Patch coverage is 73.10924% with 32 lines in your changes are missing coverage. Please review.

Project coverage is 61.97%. Comparing base (4fdc02f) to head (44e7ddd).

Files Patch % Lines
src/eckit/config/Configuration.cc 56.16% 32 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #101      +/-   ##
===========================================
+ Coverage    61.94%   61.97%   +0.03%     
===========================================
  Files          899      900       +1     
  Lines        49652    49799     +147     
  Branches      3730     3745      +15     
===========================================
+ Hits         30756    30865     +109     
- Misses       18896    18934      +38     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pmaciel
Copy link
Member

pmaciel commented Jan 17, 2024

Sorry to have taken some time to return to this.

So, eckit::Parametrisation is a eckit::Configuration's base class, with just the barebones interface full of pure virtual methods (like a Java interface). Configuration holds a "root" eckit::Value supporting all the nested structrure of values we care to contain.

Adding introspection to the Configuration is certainly useful -- but, with the reflection time I had to think about the class structure, I think the current version has a few inconsistencies (obviously, a matter of opinion):

  1. the list of new "isType..." is not complete in respect to the internal eckit::Value possible related types (via eckit::Content); I understand it is following the <type_traits> conterparts, but it isn't comprehensive in that respect either. Hence I would suggest to convert this family of methods to a template "is()", or overloading "has()" maybe, where one can pass the type_traits definition, or otherwise map to the eckit::Value possible derived types?
  2. If the class contains a "root" eckit::Value, why not expose it like in const eckit::Value& getValue(const std::string& name), which we could expose the introspection of eckit::Value instead?
  3. The same logic would be applicable to the isConvertible -- because it would better semantically apply (directly) to eckit::Value than (indirectly) to eckit::Configuration.

But in any case I'm even a proponent to change/revamp eckit::Value completely, as modern (and not so modern) C++ has both introspection (<type_traits>) and conversion (std::to_string, std::istream, leveraging std::is_arithmetic and the like) native support. There's a whole variety of unrelated eckit::Value related types via eckit::Content like "Date", "DateTime", and "Time" that wouldn maybe better be moved downstream so this hierarchy could be closer to native C++; Some of these related types also offer limited operations like compare..., add..., sub..., mul..., div..., mod... which would likely be better suited downstream. This PR isn't touching this (neither are immediate dependant projects like Atlas or MIR, but of course configuration is suitable for many other packages), and a revision would allow for a significant simplification and adhering to the standards more closely.

So I don't know if this should follow a cleanup. As a compromise, notwithstanding a complete review of eckit::Configuration and eckit::Value, I suggest adding a Configuration::getValue accessor and moving the introspection/conversion to the eckit::Value class?

So consider my tacit aproval, as of course I know this PR adds significant convenience. I'm just not sure it is in the right place, considering upcoming use of these classes.

@pmaciel
Copy link
Member

pmaciel commented Jan 17, 2024

(A possible Configuration::getValue could be just a redirect of the Configuration::lookUp method, which is currently protected).

Copy link
Member

@pmaciel pmaciel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I approve because the PR contains good development, all here is helpful and my comments are very minor in general.

I'm just not sure this is the time to fully review the eckit::Configuration and eckit::Value classes? (What do you think?)

@@ -130,6 +131,70 @@ class Configuration : public Parametrisation {

virtual void hash(eckit::Hash&) const;

// -- Introspection methods

bool isSubConfiguration(const std::string& name) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List of methods that ideally should follow either eckit::Value variants, or <type_traits> (and maybe eckit::Value should follow that too). I only find the extensive list of isType... and the template isConvertible to be inconsistent (both should maybe be template, and leverage <type_traits>? not sure)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please show an example API of what you have in mind?

Copy link
Member

@pmaciel pmaciel Jan 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean a choice between

  1. comprehensively to the class accessors introspect the eckit::Configuration types, as
    instead of (isSubConfiguration is special, so ommited):
    `
    bool isIntegral(const std::string&) const;

    bool isBoolean(const std::string&) const;

    bool isFloatingPoint(const std::string&) const;

    bool isString(const std::string&) const;

    bool isList(const std::string&) const;

    bool isSubConfigurationList(const std::string&) const;

    bool isIntegralList(const std::string&) const;

    bool isBooleanList(const std::string&) const;

    bool isFloatingPointList(const std::string&) const;

    bool isStringList(const std::string&) const;`

    do
    `
    bool isBool(const std::string&) const;

    bool isInt(const std::string&) const;

    bool isLong(const std::string&) const;

    bool isUnsigned(const std::string&) const;

    bool isInt32(const std::string&) const;

    bool isInt64(const std::string&) const;

    bool isFloat(const std::string&) const;

    bool isDouble(const std::string&) const;

    bool isString(const std::string&) const;

    bool isIntVector(const std::string&) const;

    bool isLongVector(const std::string&) const;

    bool isUnsignedVector(const std::string&) const;

    bool isInt32Vector(const std::string&) const;

    bool isInt64Vector(const std::string&) const;

    bool isFloatVector(const std::string&) const;

    bool isDoubleVector(const std::string&) const;

    bool isStringVector(const std::string&) const;`

  2. or, comprehensively (to eckit::Value, because this is what implements conversion) introspect the eckit::Configuration types, as instead of (same as above), do
    ` bool isNumber(const std::string&) const;

    bool isBool(const std::string&) const;

    bool isDouble(const std::string&) const;

    bool isString(const std::string&) const;

    bool isNil(const std::string&) const;

    bool isList(const std::string&) const;

    bool isMap(const std::string&) const;

    bool isDate(const std::string&) const;

    bool isTime(const std::string&) const;

    bool isDateTime(const std::string&) const;

    bool isOrderedMap(const std::string&) const;`

In the end I'm only mentioning consistency so logical patterns emerge. I would think this is a good time to re-think some of the classes purposes as some inconsistency is of course already present -- otherwise we wouldn't need to discuss! :-)

Copy link
Member Author

@wdeconinck wdeconinck Jan 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean.

eckit::Value has isNumber and isDouble whereas I opted for isIntegral and isFloatingPoint which resembles the STL type_traits naming more closely. I'm OK to use eckit::Value nomenclature instead if that is a more expected naming, but thought that would not matter too much with eckit::Value being an implementation detail.

I am also OK to use the current Configuration getTYPE naming for isTYPE or hasTYPE.
Fact is that getFloat and getDouble both work. Same with getInt and getLong.
Would you then say that isFloat and isDouble should both return "true" for an eckit::Value that encodes a double?
That is why I had opted for the isConvertible<T> in the end as an extra API on top of isFloatingPoint.

Please let me know what you prefer, also @tlmquintino and @simondsmart and/or others , please provide input here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely agree! I'm only trying to limit the amount of variations on these "families of methods" across the classes (I count 3 variants before the PR?). An idea for the future would be to add a isConvertible to eckit::Value instead (avoiding some if's, the necessary hardcode would fall in the mythical "right place" (tm)) and Configuration delegating to that.

I think we're not really discussing code but architecture, as the PR is obviously excellent (approved!), we only need to find if it fits the classes intention.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one small comment. We explicitly do not want a method to expose the internal eckit::Value root.

Copy link
Contributor

@geier1993 geier1993 Jan 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already had discussions with @simondsmart half a year ago.

One the one side we do not want to expose eckit::Value, on the otherside we have no mechanisms to efficiently copy a value from one map to another.
Inspecting a type is very cumbersome. In multio we have some code that runs a lot of if (isList()) else if (isMap()) else if(isNumber()) .... Which is super inefficient, many unnecessary function calls and checks.

Given that we the list of types is already fixed, it would be more useful to

  • either create a enum ValueType listing all the types and a virtual function ValueType getType() const to retrieve the type and properly switch on it
  • or to implement a visitor with a double-disptach, on which a properly typed overload is called with a reference to the value.. This would allow doing every typed specialization we want behind the interface boundary with 2 function calls.

Then we can use std::is_convertible_v (https://en.cppreference.com/w/cpp/types/is_convertible) or eckit::IsTranslatable in combination with eckit::translate.

The isConvertible call is likely to be followed by further inspection calls or eventually by a get call... which will quickly sum up in a lot of more calls... Instead with a visitor we are already describing what we are going to do.

With C++17 and lambdas creating a Visitor is also not that cumbersome anymore.
We can always create an overload set and generate most of the visitor.

E.g.

value.visit(Overloaded{
[&](std::string& str) {},
...
[&](auto& v){
// Anything else
}])

or

value.visit([&](auto& v){
  using T = std::decay_t<decltype(v)>;
  if constexpr (std::is_integral_v<T>) {
  } else if constexpr (...){
  }
})

We can even pretend to have arbitrary return values behind the interface by using a mutable Visitor that assigns to an std::optional<RetType> etc.

I don't think it is feasible to add more and more functions on top of the interface to get the "type problem" solved.
With the way modern C++ is taking, I would even avoid hiding everything behind a dynamic interface unless there's a real need to and otherwise use all the template & type mechanisms C++ offers.

All these compile time features c++ offers are clearly things that allows C++ to compile & optimize more and distinguishes it from most other languages.

With C++20 concepts & templated lambdas this will eventually also be readable, with the same logic behind.

Copy link
Member Author

@wdeconinck wdeconinck Jan 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading this comment I was wondering how multio managed to do this introspection without this PR changes.
It seems that the "eckit::Value as implementation detail" contract is broken and the eckit::Value is extracted from the LocalConfiguration. This should not have been allowed. e.g. here:

eckit::Value getInputGrid(const eckit::LocalConfiguration& cfg, message::Metadata& md) {
    if (md.has("atlas-grid-kind")) {  // metadata has always precedence
        // TODO: name is bad on purpose (no software support this at the moment)
        return eckit::Value{md.getSubConfiguration("atlas-grid-kind").get()};
    }

    if (cfg.has("input")) {  // configuration file is second option
        return eckit::Value{cfg.getSubConfiguration("input").get()};
    }

    throw eckit::SeriousBug("action-interpolate :: Unable to identify input grid", Here());
}

And then down the line eckit::Value is used to do the introspection (perhaps for another occurrence of this pattern).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I was done already somewhere else. Think even in atlas? And without that there is no way to copy an entry with arbitrary type from one configuration to another...

But the points I want to make:

  • We already deal with a fixed set of types. So I think we should make that more explicit to use the other features & power of C++ and avoid the definitions of a lot of virtual functions....

  • eckit::Configuration etc. are actually a map. If we do not expose an interfacing value type. All access and operations have to be made through a lookup first and then on the type. With all the introspecive operations we will repeatedly lookup the same key just to operate on the value.
    The pure C++ way would probably be to perform a lookup and return an iterator - that's not possible for an interface. However, that's more reason to have a wrapper type like eckit::Value.

@wdeconinck
Copy link
Member Author

The design of Configuration was that eckit::Value is an implementation detail. see e.g. comment

class Configuration : public Parametrisation {
    /// @note Do NOT expose eckit::Value in the interface of configuration
    ///       eckit::Value should remain an internal detail of configuration objects
    ///       Clients should use typed configuration parameters

I don't mind but others may have issue with "const Value& Configuration::getValue( name );"

@tlmquintino
Copy link
Member

The design of Configuration was that eckit::Value is an implementation detail. see e.g. comment

class Configuration : public Parametrisation {
    /// @note Do NOT expose eckit::Value in the interface of configuration
    ///       eckit::Value should remain an internal detail of configuration objects
    ///       Clients should use typed configuration parameters

I don't mind but others may have issue with "const Value& Configuration::getValue( name );"

that's what I meant with my comment on the other thread

@wdeconinck
Copy link
Member Author

@tlmquintino , @simondsmart could you please advise on the desired API as discussed above?

@geier1993
Copy link
Contributor

The design of Configuration was that eckit::Value is an implementation detail. see e.g. comment

class Configuration : public Parametrisation {
    /// @note Do NOT expose eckit::Value in the interface of configuration
    ///       eckit::Value should remain an internal detail of configuration objects
    ///       Clients should use typed configuration parameters

I don't mind but others may have issue with "const Value& Configuration::getValue( name );"

But in multio we have exactly the problem that the assumption of having typed configuration parameters is too strong.
We have actions that may contain a subconfigurations with configurations and settings for other libraries like eccodes & MIR to be passed down.

Without that kind of feature, we would have to add a lot of more that is just explicitly performing a lot of explicity queries to the configuration. This list might grow very large and potentially change with updates to these libraries.

@wdeconinck
Copy link
Member Author

@tlmquintino , @simondsmart could you please advise on the desired API as discussed above?

ping @tlmquintino @simondsmart

@wdeconinck wdeconinck force-pushed the feature/config_introspection branch from 5eddd33 to 04f18bd Compare May 15, 2024 14:15
@wdeconinck
Copy link
Member Author

I think 4 months should be more than enough for a PR waiting for review. Unless someone objects before Friday 24/05/2024, I will assume everything is OK.

@geier1993
Copy link
Contributor

We will need this to in multio to resolve all of the deprecation warnings.

Moreover I will need some of it in my mars2grib work on metkit to distinguish between int, string and float.

@wdeconinck wdeconinck force-pushed the feature/config_introspection branch from 04f18bd to 44e7ddd Compare May 30, 2024 07:31
@wdeconinck wdeconinck merged commit 3a95f91 into develop May 30, 2024
193 of 194 checks passed
@wdeconinck wdeconinck deleted the feature/config_introspection branch May 30, 2024 07:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants