Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ponded water/alternative solutions - object-oriented #192

Closed

Conversation

billsacks
Copy link
Member

NOTE: This PR was originally created in the temporary CTSM repository by @billsacks, Sep 8, 2017, where it was NCAR/clm-ctsm#14. I am moving over most of the comments from there: both my own and those of @martynpclark .

Original comment from @billsacks Sep 8, 2017 (with references to PRs and commits changed for this new repository)

NOTE: This PR has not been tested (though it does compile). It is for discussion only.

This is a counterpart to #191 , but with an object-oriented implementation.

In order to facilitate comparison with @martynpclark 's implementation, I started with his changes, and then made changes on top of that. To see just what is different in the object-oriented version, see 98da3bf . You can see from this that the two versions are very similar. The biggest addition in the OO version is the declaration of a type, which holds input arguments and a type-bound procedure declaration:

  type, extends(flux_type) :: drainPond_type
     real(r8) :: smoothScale
     real(r8) :: drainMax
   contains
     procedure :: getFlux => drainPondFlux
  end type drainPond_type

I prefer this version over the internal subroutine for the reasons I laid out here: #191 (comment)

But I'm open to discussion.

Compared with @bandre-ucar 's suggestion: I think the two would be pretty similar. I'm not sure I see any real downsides of this one... Ben's just wouldn't have drainPondFlux be a type-bound procedure. But making drainPondFlux a type-bound procedure opens some additional doors that I hope to show in some upcoming commits.

@mvertens I think you were also interested to see this.

Martyn Clark and others added 11 commits December 28, 2017 12:09
Also fix some indentation
This simplifies the use of implicitEuler. Martyn Clark supported doing
this, since it will be needed in general:
https://github.com/NCAR/clm-ctsm/pull/12#discussion_r137650687

Also, move the endrun clal into implicitEuler, rather than requiring the
caller to check the error code. This simplifies the interface, and works
more smoothly now that the final flux calculation is also in this
routine.
Main point is to decrease the clutter in the main QflxH2osfcDrain
routine - trying to keep the focus more on the science than on the
alternative numerical solutions.
…hods"

This reverts commit c5abd576a98204374efdd1912bddb04896aa9771.

At today's CTSM meeting, people felt that this had more negatives than
positives.
@billsacks
Copy link
Member Author

Comment from @martynpclark Sep 8, 2017

@billsacks -- very nice.

@@ -5,24 +5,21 @@ module computeFluxMod
use shr_kind_mod , only : r8 => shr_kind_r8
implicit none

type, abstract :: flux_type
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment from @martynpclark Sep 8, 2017

This is pedantic, I know, though can we call this computeFlux_type (or something similar)? The reason is that it is easy to confuse flux_type with WaterFlux_type and EnergyFlux_type, and folks could assume that the type only includes fluxes (whereas, in fact, it will include all variables required to compute fluxes).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment from @billsacks Sep 11, 2017

Fixed - thanks. Not pedantic at all - this is a very good point.

@billsacks
Copy link
Member Author

Comment from @billsacks Sep 11, 2017, edited to point to PRs and commits in this new repo

One thing that was bothering me a bit with this prototype (and the same issue exists in the prototype in #191 ) is the extra complexity introduced into the science routines by having the alternative numerical solutions inline along with the science. With these long select case statements embedded in the science routines, it gets hard to see what's really going on at a higher level.

ea15490 shows one possible partial solution to this problem. Downsides are: (1) There's a bit more boiler-plate needed to define the different subroutines; (2) the extra subroutine call could lead to a loss of efficiency, though I hope that the compiler is smart enough to inline this. (And (2) should be less of an issue – at least for global runs – once we rewrite this so the flux calculation routines operate on arrays rather than single points.) That said, I still like this solution because it makes it easier to see the high-level structure of the main routine without extra clutter. I'd be curious what others think about this.

billsacks added a commit that referenced this pull request Dec 28, 2017
(Note that the setMetadata method is never called here: a full
implementation would need to call that in initialization for each
flux that builds on computeFlux_type.)

The purpose of this change was to remove what would presumably become
duplication of this select case statement throughout the code
base. However, I'm not very happy with how this change turned out. The
particular challenge here was handling the fact that different fluxes
would provide a different set of solution methods. This led me to put
stub methods in the base class that die at runtime if they are not
overridden, but I feel this makes the use of the base class less obvious
and more error-prone. (But I couldn't see a better way to do this - at
least, not with introducing a lot of extra complexity.)

Whether this extra step is worthwhile depends on how many instances of
this select case statement are sprinkled throughout the code, and how
much this is cluttering up the code. But overall, given that we
potentially have a different set of allowed methods for each flux, I'd
prefer not to go with the changes here - instead having some partial
duplication of the select case statements. Or we could go with the
solution suggested here
ea15490#r26526545
of having function pointers, which would allow moving these select case
statements to initialization (even though that would still involve many
partially-duplicated select case statements throughout the code).

Another problem with the changes here is that there will be some
performance overhead with the additional subroutine calls -
specifically, one extra call to 'solveForFlux' rather than having that
inlined. This is likely very bad with the current implementation -
point-by-point. With an array implementation, this cost is amortized
over all array elements, but may still be non-trivial - and will still
be a high cost for single-point runs.

--

In some ways, this could be clearer & simpler if we kept
computeFlux_type as it was in pondedWater/alternativeSolutions_oo, and
instead introduced a separate class to handle the selection between
methods. That separate class would have the stub implementations of the
various computeFlux methods. Then a flux would be required to provide
(some) implementations of the computeFlux* methods, but it would only
need to provide an implementation of computeFlux_type (i.e., the getFlux
method) if it wanted to call a solution method that required it (like
implicitEuler). The instance of the new class would then need to contain
an instance of the computeFlux child type, so that it can be referenced
by things like comuteFluxImplicitEuler.

The main advantage of this would be to provide a more clear signal about
whether a given flux can be plugged into things like implicitEuler,
which expects an implementation of a particular interface.

However, downsides are:
- Requires more boilerplate to set up the (now) two classes for a given
  flux
- I think there would be some awkwardness / duplication in terms of the
  internal variables that are stored in the class: I think these would
  now be needed in both the computeFlux child class and the new child
  class. (Though there may be some way to get the computeFlux class to
  reference things in the new class, e.g., by having a pointer to an
  object of the new class.)

One possible compromise is: keep the computeFlux_type as it was in
pondedWater/alternativeSolutions_oo, and make that a base class for the
new type. This would provide a more clear signal regarding what's needed
in implicitEuler, and a cleaner mechanism to plug a flux into
implicitEuler even if it doesn't want to extend the new class (because
you could extend computeFlux_type without extending the new class). The
new class would provide a default implementation of getFlux, which just
calls endrun. Then specific flux implementations would extend the new
class, and could then provide implementations for whatever computeFlux*
methods they want, as well as possibly providing an implementation for
getFlux (if needed).
- i.e., the inheritance chain would be: computeFlux_type (abstract
  interface of getFlux) -> some new class (provides stub implementations
  for getFlux and various computeFlux* methods) -> specific flux class
  (provides actual implementations as needed)
- But I'm not sure that buys us much relative to what we had in
  pondedWater/alternativeSolutions_oo: It does make it easier to use
  implicitEuler (for fluxes that don't want the functionalities of the
  new class – i.e., the selection between calculation mechanisms), but
  at the cost of introducing a new class into the design. This could be
  worth doing if this benefit would be useful in practice.

--

Another possibility would be to replace the 'select case' for the
solution method with polymorphism. However, this would require having a
separate class for each solution method for each flux, which feels like
a lot of coding overhead.

In this case, stuff shared between the different solution methods – such
as data that need to be set initially and the getFlux method – could be
in a base class for that flux, and then each solution method would
extend that base class, providing a computeFlux method.

But I think this is too much overhead. I think a better solution than
that would be to ditch the whole select case and solution_method stuff,
and go back to my earlier branch (pondedWater/alternativeSolutions_oo as
referenced in #192), and go with the
idea of using function pointers, as suggested by
ea15490 .
billsacks added a commit that referenced this pull request Dec 28, 2017
(Note that the setMetadata method is never called here: a full
implementation would need to call that in initialization for each
flux that builds on computeFlux_type.)

The purpose of this change was to remove what would presumably become
duplication of this select case statement throughout the code
base. However, I'm not very happy with how this change turned out. The
particular challenge here was handling the fact that different fluxes
would provide a different set of solution methods. This led me to put
stub methods in the base class that die at runtime if they are not
overridden, but I feel this makes the use of the base class less obvious
and more error-prone. (But I couldn't see a better way to do this - at
least, not with introducing a lot of extra complexity.)

Whether this extra step is worthwhile depends on how many instances of
this select case statement are sprinkled throughout the code, and how
much this is cluttering up the code. But overall, given that we
potentially have a different set of allowed methods for each flux, I'd
prefer not to go with the changes here - instead having some partial
duplication of the select case statements. Or we could go with the
solution suggested here
ea15490#r26526545
of having function pointers, which would allow moving these select case
statements to initialization (even though that would still involve many
partially-duplicated select case statements throughout the code).

Another problem with the changes here is that there will be some
performance overhead with the additional subroutine calls -
specifically, one extra call to 'solveForFlux' rather than having that
inlined. This is likely very bad with the current implementation -
point-by-point. With an array implementation, this cost is amortized
over all array elements, but may still be non-trivial - and will still
be a high cost for single-point runs.

--

In some ways, this could be clearer & simpler if we kept
computeFlux_type as it was in pondedWater/alternativeSolutions_oo, and
instead introduced a separate class to handle the selection between
methods. That separate class would have the stub implementations of the
various computeFlux methods. Then a flux would be required to provide
(some) implementations of the computeFlux* methods, but it would only
need to provide an implementation of computeFlux_type (i.e., the getFlux
method) if it wanted to call a solution method that required it (like
implicitEuler). The instance of the new class would then need to contain
an instance of the computeFlux child type, so that it can be referenced
by things like comuteFluxImplicitEuler.

The main advantage of this would be to provide a more clear signal about
whether a given flux can be plugged into things like implicitEuler,
which expects an implementation of a particular interface.

However, downsides are:
- Requires more boilerplate to set up the (now) two classes for a given
  flux
- I think there would be some awkwardness / duplication in terms of the
  internal variables that are stored in the class: I think these would
  now be needed in both the computeFlux child class and the new child
  class. (Though there may be some way to get the computeFlux class to
  reference things in the new class, e.g., by having a pointer to an
  object of the new class.)

One possible compromise is: keep the computeFlux_type as it was in
pondedWater/alternativeSolutions_oo, and make that a base class for the
new type. This would provide a more clear signal regarding what's needed
in implicitEuler, and a cleaner mechanism to plug a flux into
implicitEuler even if it doesn't want to extend the new class (because
you could extend computeFlux_type without extending the new class). The
new class would provide a default implementation of getFlux, which just
calls endrun. Then specific flux implementations would extend the new
class, and could then provide implementations for whatever computeFlux*
methods they want, as well as possibly providing an implementation for
getFlux (if needed).
- i.e., the inheritance chain would be: computeFlux_type (abstract
  interface of getFlux) -> some new class (provides stub implementations
  for getFlux and various computeFlux* methods) -> specific flux class
  (provides actual implementations as needed)
- But I'm not sure that buys us much relative to what we had in
  pondedWater/alternativeSolutions_oo: It does make it easier to use
  implicitEuler (for fluxes that don't want the functionalities of the
  new class – i.e., the selection between calculation mechanisms), but
  at the cost of introducing a new class into the design. This could be
  worth doing if this benefit would be useful in practice.

--

Another possibility would be to replace the 'select case' for the
solution method with polymorphism. However, this would require having a
separate class for each solution method for each flux, which feels like
a lot of coding overhead.

In this case, stuff shared between the different solution methods – such
as data that need to be set initially and the getFlux method – could be
in a base class for that flux, and then each solution method would
extend that base class, providing a computeFlux method.

But I think this is too much overhead. I think a better solution than
that would be to ditch the whole select case and solution_method stuff,
and go back to my earlier branch (pondedWater/alternativeSolutions_oo as
referenced in #192), and go with the
idea of using function pointers, as suggested by
ea15490#r26526545
@billsacks
Copy link
Member Author

billsacks commented Dec 29, 2017

Comment from @billsacks Sep 12, 2017, edited to refer to the PR in this new repo

I was trying for some improvements in #193 , but I don't like how that turned out, so I'm closing that PR. (I just wanted to use the PR mechanism to record my ideas in case I want to come back to it later.)

@billsacks
Copy link
Member Author

Comment from @billsacks Sep 13, 2017

For this and similar situations: Probably it would make sense to move the QflxH2osfcDrain subroutine into the drainPond_type class. This would pull together all related routines into this class, and would allow following the convention (which we're trying to enforce in CLM) that instance variables in a class (even public ones) should only be set by methods of that class.

@billsacks
Copy link
Member Author

Comment from @billsacks Sep 13, 2017, edited to refer to the PR in this new repo

I was thinking about where the drainPond_inst instance should be stored, once we move to having this work on arrays rather than one element at a time. We want to avoid doing dynamic memory allocation in every time step.

So that leaves us with two options:

  1. Allocate space for it and store it somewhere in initialization

  2. Keep it declared as a local variable in the calling subroutine. Make it a parameterized derived type for the array lengths (i.e., parameterized on bounds%endc, etc.)

(1) is required if we want to save any history variables here, or if we want to do the namelist read in initialization in this class. This is more in line with what we do elsewhere in CLM.

(2) is more efficient in terms of memory (because the necessary memory doesn't need to stick around for the whole run) and may be more efficient in time (performance of stack vs. heap memory). But this will only work right now (until we move to always using array lower bounds of 1) if we can specify array lower bounds in parameterized derived types (I'm not sure if this is allowed).

(Note that the notion of what the calling subroutine is would change if we followed the suggestion of #192 (comment) : In this case, the calling subroutine would become UpdateH2osfc, so the instance would become a local variable to that subroutine.)

@billsacks
Copy link
Member Author

Comment from @billsacks Sep 13, 2017

I was thinking more about #192 (comment) . I do like the function pointer idea, the start of whose implementation could look like this:

diff --git a/components/clm/src/biogeophys/SoilHydrologyMod.F90 b/components/clm/src/biogeophys/SoilHydrologyMod.F90
index 88b4572..ff9d9a4 100644
--- a/components/clm/src/biogeophys/SoilHydrologyMod.F90
+++ b/components/clm/src/biogeophys/SoilHydrologyMod.F90
@@ -65,11 +65,18 @@ module SoilHydrologyMod
      real(r8) :: smoothScale = 0.05_r8  ! smoothing scale
    contains
      procedure :: getFlux => drainPondFlux  ! required method to get the current flux estimate
-     procedure :: drainPondExplicitEuler    ! compute the drainPond flux using an explicit euler method
-     procedure :: drainPondImplicitEuler    ! compute the drainPond flux using an implicit euler method
-     procedure :: drainPondAnalytical       ! compute the drainPond flux using an analytical solution
+     procedure(computeDrainPond_interface), pointer :: computeDrainPond
   end type drainPond_type

+  abstract interface
+     subroutine computeDrainPond_interface(this, h2osfc, qflx_h2osfc_drain)
+       import :: drainPond_type
+       class(drainPond_type), intent(in) :: this
+       real(r8) , intent(in)  :: h2osfc            ! drainPond storage
+       real(r8) , intent(out) :: qflx_h2osfc_drain ! drainage flux
+     end subroutine computeDrainPond_interface
+  end interface
+
   !-----------------------------------------------------------------------
   real(r8), private :: baseflow_scalar = 1.e-2_r8

However, this only makes sense if we have a place to store the drainPond_inst instance persistently, so that it's set up in initialization (rather than having it be a local variable to the calling subroutine - in which case we'd need to recreate the function pointer every time step). We may want to solve that problem anyway, in order to (a) allocate necessary arrays, and (b) do a namelist read in that class. See also the related comment #192 (comment)

integer, parameter :: ixImplicitEuler=1002 ! implicit Euler solution with operator splitting
integer, parameter :: ixAnalytical=1003 ! analytical solution with operator splitting
integer :: ixSolution=ixExplicitEuler

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment from @martynpclark Sep 13, 2017

We should talk about how this is implemented. In principle it is possible to have multiple different numerical solutions for every single flux. This would provide scope to mix-and-match different numerical solutions in different parts of the code (sort of the way that CLM is developed now, although CLM does not have multiple numerical options for each flux). This would be incredibly cumbersome and likely confusing. At the other end of the spectrum we use the same numerical options everywhere in the code. This could be too restrictive.

I initially included these alternative options as (1) an example of how we can increase the accuracy of the operator splitting constrained explicit Euler solution without any increase in cost, expecting that there may be cases where people desire to retain operator splitting; and (2) begin the process of separating the physics from the numerical solution.

In my opinion the alternative numerical options for a single flux (as implemented here) may not be a representative use case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment from @billsacks Sep 14, 2017

I'd also like to discuss this point further. And I'd be interested to see a more representative use case at some point.

One other concern I have is: If each flux has multiple possible parameterization options, each with multiple numerical solution options, the code is going to get very complex. (Much of the work I was doing earlier this week was to try to play around with ways to reduce that complexity, but there's no way to remove it entirely.)

Originally I had imagined that, for the most part, the choice of numerical solution method would just appear at high level in the code, not down at the level of individual fluxes. That would reduce this problem of a combinatorial explosion of complexity. I think that's what you're saying, too.

@billsacks
Copy link
Member Author

Comment from @billsacks Sep 14, 2017, edited to point to commits in this new repo

The general feeling from today's CTSM meeting was that people were happy with the main changes to make this object oriented - i.e., the change in 98da3bf . However, at least some people (e.g., @mvertens ) felt that the changes in ea15490 had more negatives than positives.

(Note that the changes in the commits between 98da3bf and ea15490 involved minor cleanup that @martynpclark and I discussed already. This means that people are happy to go with something like b872d21, which is the last commit before ea15490.)

@billsacks
Copy link
Member Author

Comment from @billsacks Sep 14, 2017

Okay, I have reverted the commit that people didn't like.

@billsacks
Copy link
Member Author

Comment from @billsacks Sep 14, 2017

@martynpclark I have updated this branch to include the latest changes on the main 'ctsm' branch, which include my reworked truncation code for h2osfc. So you can now base your next changes off of this branch. Feel free to either work directly on this branch or to create your own branch off of this one (at which point you could do a PR back into this branch to highlight your changes).

@billsacks
Copy link
Member Author

I'm closing this PR that was mainly opened for discussion purposes. I'm deleting the pondedWater/alternativeSolutions_oo branch from the escomp/ctsm repository, but I'm pushing it to billsacks/ctsm (and it should also be available from the escomp repo for a while by checking out the last commit: 9133ffd).

@billsacks billsacks closed this May 18, 2018
@billsacks billsacks deleted the pondedWater/alternativeSolutions_oo branch May 30, 2018 02:39
samsrabin pushed a commit to samsrabin/CTSM that referenced this pull request May 3, 2024
handle edge case model no_leap, data gregorian
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant