Ponded water/alternative solutions - object-oriented #192

billsacks · 2017-12-28T22:02:42Z

NOTE: This PR was originally created in the temporary CTSM repository by @billsacks, Sep 8, 2017, where it was NCAR/clm-ctsm#14. I am moving over most of the comments from there: both my own and those of @martynpclark .

Original comment from @billsacks Sep 8, 2017 (with references to PRs and commits changed for this new repository)

NOTE: This PR has not been tested (though it does compile). It is for discussion only.

This is a counterpart to #191 , but with an object-oriented implementation.

In order to facilitate comparison with @martynpclark 's implementation, I started with his changes, and then made changes on top of that. To see just what is different in the object-oriented version, see 98da3bf . You can see from this that the two versions are very similar. The biggest addition in the OO version is the declaration of a type, which holds input arguments and a type-bound procedure declaration:

  type, extends(flux_type) :: drainPond_type
     real(r8) :: smoothScale
     real(r8) :: drainMax
   contains
     procedure :: getFlux => drainPondFlux
  end type drainPond_type

I prefer this version over the internal subroutine for the reasons I laid out here: #191 (comment)

But I'm open to discussion.

Compared with @bandre-ucar 's suggestion: I think the two would be pretty similar. I'm not sure I see any real downsides of this one... Ben's just wouldn't have drainPondFlux be a type-bound procedure. But making drainPondFlux a type-bound procedure opens some additional doors that I hope to show in some upcoming commits.

@mvertens I think you were also interested to see this.

Also fix some indentation

This simplifies the use of implicitEuler. Martyn Clark supported doing this, since it will be needed in general: https://github.com/NCAR/clm-ctsm/pull/12#discussion_r137650687 Also, move the endrun clal into implicitEuler, rather than requiring the caller to check the error code. This simplifies the interface, and works more smoothly now that the final flux calculation is also in this routine.

Main point is to decrease the clutter in the main QflxH2osfcDrain routine - trying to keep the focus more on the science than on the alternative numerical solutions.

…hods" This reverts commit c5abd576a98204374efdd1912bddb04896aa9771. At today's CTSM meeting, people felt that this had more negatives than positives.

…ns_oo

billsacks · 2017-12-28T22:04:27Z

Comment from @martynpclark Sep 8, 2017

@billsacks -- very nice.

billsacks · 2017-12-28T22:06:08Z

src/utils/computeFluxMod.F90

@@ -5,24 +5,21 @@ module computeFluxMod
  use shr_kind_mod      , only : r8 => shr_kind_r8
  implicit none

+  type, abstract :: flux_type


Comment from @martynpclark Sep 8, 2017

This is pedantic, I know, though can we call this computeFlux_type (or something similar)? The reason is that it is easy to confuse flux_type with WaterFlux_type and EnergyFlux_type, and folks could assume that the type only includes fluxes (whereas, in fact, it will include all variables required to compute fluxes).

Comment from @billsacks Sep 11, 2017

Fixed - thanks. Not pedantic at all - this is a very good point.

billsacks · 2017-12-28T22:10:53Z

Comment from @billsacks Sep 11, 2017, edited to point to PRs and commits in this new repo

One thing that was bothering me a bit with this prototype (and the same issue exists in the prototype in #191 ) is the extra complexity introduced into the science routines by having the alternative numerical solutions inline along with the science. With these long select case statements embedded in the science routines, it gets hard to see what's really going on at a higher level.

ea15490 shows one possible partial solution to this problem. Downsides are: (1) There's a bit more boiler-plate needed to define the different subroutines; (2) the extra subroutine call could lead to a loss of efficiency, though I hope that the compiler is smart enough to inline this. (And (2) should be less of an issue – at least for global runs – once we rewrite this so the flux calculation routines operate on arrays rather than single points.) That said, I still like this solution because it makes it easier to see the high-level structure of the main routine without extra clutter. I'd be curious what others think about this.

(Note that the setMetadata method is never called here: a full implementation would need to call that in initialization for each flux that builds on computeFlux_type.) The purpose of this change was to remove what would presumably become duplication of this select case statement throughout the code base. However, I'm not very happy with how this change turned out. The particular challenge here was handling the fact that different fluxes would provide a different set of solution methods. This led me to put stub methods in the base class that die at runtime if they are not overridden, but I feel this makes the use of the base class less obvious and more error-prone. (But I couldn't see a better way to do this - at least, not with introducing a lot of extra complexity.) Whether this extra step is worthwhile depends on how many instances of this select case statement are sprinkled throughout the code, and how much this is cluttering up the code. But overall, given that we potentially have a different set of allowed methods for each flux, I'd prefer not to go with the changes here - instead having some partial duplication of the select case statements. Or we could go with the solution suggested here ea15490#r26526545 of having function pointers, which would allow moving these select case statements to initialization (even though that would still involve many partially-duplicated select case statements throughout the code). Another problem with the changes here is that there will be some performance overhead with the additional subroutine calls - specifically, one extra call to 'solveForFlux' rather than having that inlined. This is likely very bad with the current implementation - point-by-point. With an array implementation, this cost is amortized over all array elements, but may still be non-trivial - and will still be a high cost for single-point runs. -- In some ways, this could be clearer & simpler if we kept computeFlux_type as it was in pondedWater/alternativeSolutions_oo, and instead introduced a separate class to handle the selection between methods. That separate class would have the stub implementations of the various computeFlux methods. Then a flux would be required to provide (some) implementations of the computeFlux* methods, but it would only need to provide an implementation of computeFlux_type (i.e., the getFlux method) if it wanted to call a solution method that required it (like implicitEuler). The instance of the new class would then need to contain an instance of the computeFlux child type, so that it can be referenced by things like comuteFluxImplicitEuler. The main advantage of this would be to provide a more clear signal about whether a given flux can be plugged into things like implicitEuler, which expects an implementation of a particular interface. However, downsides are: - Requires more boilerplate to set up the (now) two classes for a given flux - I think there would be some awkwardness / duplication in terms of the internal variables that are stored in the class: I think these would now be needed in both the computeFlux child class and the new child class. (Though there may be some way to get the computeFlux class to reference things in the new class, e.g., by having a pointer to an object of the new class.) One possible compromise is: keep the computeFlux_type as it was in pondedWater/alternativeSolutions_oo, and make that a base class for the new type. This would provide a more clear signal regarding what's needed in implicitEuler, and a cleaner mechanism to plug a flux into implicitEuler even if it doesn't want to extend the new class (because you could extend computeFlux_type without extending the new class). The new class would provide a default implementation of getFlux, which just calls endrun. Then specific flux implementations would extend the new class, and could then provide implementations for whatever computeFlux* methods they want, as well as possibly providing an implementation for getFlux (if needed). - i.e., the inheritance chain would be: computeFlux_type (abstract interface of getFlux) -> some new class (provides stub implementations for getFlux and various computeFlux* methods) -> specific flux class (provides actual implementations as needed) - But I'm not sure that buys us much relative to what we had in pondedWater/alternativeSolutions_oo: It does make it easier to use implicitEuler (for fluxes that don't want the functionalities of the new class – i.e., the selection between calculation mechanisms), but at the cost of introducing a new class into the design. This could be worth doing if this benefit would be useful in practice. -- Another possibility would be to replace the 'select case' for the solution method with polymorphism. However, this would require having a separate class for each solution method for each flux, which feels like a lot of coding overhead. In this case, stuff shared between the different solution methods – such as data that need to be set initially and the getFlux method – could be in a base class for that flux, and then each solution method would extend that base class, providing a computeFlux method. But I think this is too much overhead. I think a better solution than that would be to ditch the whole select case and solution_method stuff, and go back to my earlier branch (pondedWater/alternativeSolutions_oo as referenced in #192), and go with the idea of using function pointers, as suggested by ea15490 .

(Note that the setMetadata method is never called here: a full implementation would need to call that in initialization for each flux that builds on computeFlux_type.) The purpose of this change was to remove what would presumably become duplication of this select case statement throughout the code base. However, I'm not very happy with how this change turned out. The particular challenge here was handling the fact that different fluxes would provide a different set of solution methods. This led me to put stub methods in the base class that die at runtime if they are not overridden, but I feel this makes the use of the base class less obvious and more error-prone. (But I couldn't see a better way to do this - at least, not with introducing a lot of extra complexity.) Whether this extra step is worthwhile depends on how many instances of this select case statement are sprinkled throughout the code, and how much this is cluttering up the code. But overall, given that we potentially have a different set of allowed methods for each flux, I'd prefer not to go with the changes here - instead having some partial duplication of the select case statements. Or we could go with the solution suggested here ea15490#r26526545 of having function pointers, which would allow moving these select case statements to initialization (even though that would still involve many partially-duplicated select case statements throughout the code). Another problem with the changes here is that there will be some performance overhead with the additional subroutine calls - specifically, one extra call to 'solveForFlux' rather than having that inlined. This is likely very bad with the current implementation - point-by-point. With an array implementation, this cost is amortized over all array elements, but may still be non-trivial - and will still be a high cost for single-point runs. -- In some ways, this could be clearer & simpler if we kept computeFlux_type as it was in pondedWater/alternativeSolutions_oo, and instead introduced a separate class to handle the selection between methods. That separate class would have the stub implementations of the various computeFlux methods. Then a flux would be required to provide (some) implementations of the computeFlux* methods, but it would only need to provide an implementation of computeFlux_type (i.e., the getFlux method) if it wanted to call a solution method that required it (like implicitEuler). The instance of the new class would then need to contain an instance of the computeFlux child type, so that it can be referenced by things like comuteFluxImplicitEuler. The main advantage of this would be to provide a more clear signal about whether a given flux can be plugged into things like implicitEuler, which expects an implementation of a particular interface. However, downsides are: - Requires more boilerplate to set up the (now) two classes for a given flux - I think there would be some awkwardness / duplication in terms of the internal variables that are stored in the class: I think these would now be needed in both the computeFlux child class and the new child class. (Though there may be some way to get the computeFlux class to reference things in the new class, e.g., by having a pointer to an object of the new class.) One possible compromise is: keep the computeFlux_type as it was in pondedWater/alternativeSolutions_oo, and make that a base class for the new type. This would provide a more clear signal regarding what's needed in implicitEuler, and a cleaner mechanism to plug a flux into implicitEuler even if it doesn't want to extend the new class (because you could extend computeFlux_type without extending the new class). The new class would provide a default implementation of getFlux, which just calls endrun. Then specific flux implementations would extend the new class, and could then provide implementations for whatever computeFlux* methods they want, as well as possibly providing an implementation for getFlux (if needed). - i.e., the inheritance chain would be: computeFlux_type (abstract interface of getFlux) -> some new class (provides stub implementations for getFlux and various computeFlux* methods) -> specific flux class (provides actual implementations as needed) - But I'm not sure that buys us much relative to what we had in pondedWater/alternativeSolutions_oo: It does make it easier to use implicitEuler (for fluxes that don't want the functionalities of the new class – i.e., the selection between calculation mechanisms), but at the cost of introducing a new class into the design. This could be worth doing if this benefit would be useful in practice. -- Another possibility would be to replace the 'select case' for the solution method with polymorphism. However, this would require having a separate class for each solution method for each flux, which feels like a lot of coding overhead. In this case, stuff shared between the different solution methods – such as data that need to be set initially and the getFlux method – could be in a base class for that flux, and then each solution method would extend that base class, providing a computeFlux method. But I think this is too much overhead. I think a better solution than that would be to ditch the whole select case and solution_method stuff, and go back to my earlier branch (pondedWater/alternativeSolutions_oo as referenced in #192), and go with the idea of using function pointers, as suggested by ea15490#r26526545

billsacks · 2017-12-29T02:22:51Z

Comment from @billsacks Sep 12, 2017, edited to refer to the PR in this new repo

I was trying for some improvements in #193 , but I don't like how that turned out, so I'm closing that PR. (I just wanted to use the PR mechanism to record my ideas in case I want to come back to it later.)

billsacks · 2017-12-29T02:23:32Z

Comment from @billsacks Sep 13, 2017

For this and similar situations: Probably it would make sense to move the QflxH2osfcDrain subroutine into the drainPond_type class. This would pull together all related routines into this class, and would allow following the convention (which we're trying to enforce in CLM) that instance variables in a class (even public ones) should only be set by methods of that class.

billsacks · 2017-12-29T02:25:02Z

Comment from @billsacks Sep 13, 2017, edited to refer to the PR in this new repo

I was thinking about where the drainPond_inst instance should be stored, once we move to having this work on arrays rather than one element at a time. We want to avoid doing dynamic memory allocation in every time step.

So that leaves us with two options:

Allocate space for it and store it somewhere in initialization
Keep it declared as a local variable in the calling subroutine. Make it a parameterized derived type for the array lengths (i.e., parameterized on bounds%endc, etc.)

(1) is required if we want to save any history variables here, or if we want to do the namelist read in initialization in this class. This is more in line with what we do elsewhere in CLM.

(2) is more efficient in terms of memory (because the necessary memory doesn't need to stick around for the whole run) and may be more efficient in time (performance of stack vs. heap memory). But this will only work right now (until we move to always using array lower bounds of 1) if we can specify array lower bounds in parameterized derived types (I'm not sure if this is allowed).

(Note that the notion of what the calling subroutine is would change if we followed the suggestion of #192 (comment) : In this case, the calling subroutine would become UpdateH2osfc, so the instance would become a local variable to that subroutine.)

billsacks · 2017-12-29T02:28:16Z

Comment from @billsacks Sep 13, 2017

I was thinking more about #192 (comment) . I do like the function pointer idea, the start of whose implementation could look like this:

diff --git a/components/clm/src/biogeophys/SoilHydrologyMod.F90 b/components/clm/src/biogeophys/SoilHydrologyMod.F90
index 88b4572..ff9d9a4 100644
--- a/components/clm/src/biogeophys/SoilHydrologyMod.F90
+++ b/components/clm/src/biogeophys/SoilHydrologyMod.F90
@@ -65,11 +65,18 @@ module SoilHydrologyMod
      real(r8) :: smoothScale = 0.05_r8  ! smoothing scale
    contains
      procedure :: getFlux => drainPondFlux  ! required method to get the current flux estimate
-     procedure :: drainPondExplicitEuler    ! compute the drainPond flux using an explicit euler method
-     procedure :: drainPondImplicitEuler    ! compute the drainPond flux using an implicit euler method
-     procedure :: drainPondAnalytical       ! compute the drainPond flux using an analytical solution
+     procedure(computeDrainPond_interface), pointer :: computeDrainPond
   end type drainPond_type

+  abstract interface
+     subroutine computeDrainPond_interface(this, h2osfc, qflx_h2osfc_drain)
+       import :: drainPond_type
+       class(drainPond_type), intent(in) :: this
+       real(r8) , intent(in)  :: h2osfc            ! drainPond storage
+       real(r8) , intent(out) :: qflx_h2osfc_drain ! drainage flux
+     end subroutine computeDrainPond_interface
+  end interface
+
   !-----------------------------------------------------------------------
   real(r8), private :: baseflow_scalar = 1.e-2_r8

However, this only makes sense if we have a place to store the drainPond_inst instance persistently, so that it's set up in initialization (rather than having it be a local variable to the calling subroutine - in which case we'd need to recreate the function pointer every time step). We may want to solve that problem anyway, in order to (a) allocate necessary arrays, and (b) do a namelist read in that class. See also the related comment #192 (comment)

billsacks · 2017-12-29T02:30:58Z

src/biogeophys/SoilHydrologyMod.F90

+  integer, parameter :: ixImplicitEuler=1002  ! implicit Euler solution with operator splitting
+  integer, parameter :: ixAnalytical=1003     ! analytical solution with operator splitting
+  integer  :: ixSolution=ixExplicitEuler
+


Comment from @martynpclark Sep 13, 2017

We should talk about how this is implemented. In principle it is possible to have multiple different numerical solutions for every single flux. This would provide scope to mix-and-match different numerical solutions in different parts of the code (sort of the way that CLM is developed now, although CLM does not have multiple numerical options for each flux). This would be incredibly cumbersome and likely confusing. At the other end of the spectrum we use the same numerical options everywhere in the code. This could be too restrictive.

I initially included these alternative options as (1) an example of how we can increase the accuracy of the operator splitting constrained explicit Euler solution without any increase in cost, expecting that there may be cases where people desire to retain operator splitting; and (2) begin the process of separating the physics from the numerical solution.

In my opinion the alternative numerical options for a single flux (as implemented here) may not be a representative use case.

Comment from @billsacks Sep 14, 2017

I'd also like to discuss this point further. And I'd be interested to see a more representative use case at some point.

One other concern I have is: If each flux has multiple possible parameterization options, each with multiple numerical solution options, the code is going to get very complex. (Much of the work I was doing earlier this week was to try to play around with ways to reduce that complexity, but there's no way to remove it entirely.)

Originally I had imagined that, for the most part, the choice of numerical solution method would just appear at high level in the code, not down at the level of individual fluxes. That would reduce this problem of a combinatorial explosion of complexity. I think that's what you're saying, too.

billsacks · 2017-12-29T02:34:33Z

Comment from @billsacks Sep 14, 2017, edited to point to commits in this new repo

The general feeling from today's CTSM meeting was that people were happy with the main changes to make this object oriented - i.e., the change in 98da3bf . However, at least some people (e.g., @mvertens ) felt that the changes in ea15490 had more negatives than positives.

(Note that the changes in the commits between 98da3bf and ea15490 involved minor cleanup that @martynpclark and I discussed already. This means that people are happy to go with something like b872d21, which is the last commit before ea15490.)

billsacks · 2017-12-29T02:35:45Z

Comment from @billsacks Sep 14, 2017

Okay, I have reverted the commit that people didn't like.

billsacks · 2017-12-29T02:36:16Z

Comment from @billsacks Sep 14, 2017

@martynpclark I have updated this branch to include the latest changes on the main 'ctsm' branch, which include my reworked truncation code for h2osfc. So you can now base your next changes off of this branch. Feel free to either work directly on this branch or to create your own branch off of this one (at which point you could do a PR back into this branch to highlight your changes).

billsacks · 2018-05-18T01:51:18Z

I'm closing this PR that was mainly opened for discussion purposes. I'm deleting the pondedWater/alternativeSolutions_oo branch from the escomp/ctsm repository, but I'm pushing it to billsacks/ctsm (and it should also be available from the escomp repo for a while by checking out the last commit: 9133ffd).

handle edge case model no_leap, data gregorian

Martyn Clark and others added 11 commits December 28, 2017 12:09

alternative numerical solutions for surface runoff of ponded water

3932256

change scope of numerical options

3ce6202

alternative numerical solutions for drainage of ponded water

0041a92

Fixes to get Martyn's code to compile

dd3ec25

Also fix some indentation

Reimplement Martyn's idea in an object-oriented way

98da3bf

Rename flux_type to computeFlux_type

cba171d

Rename flux_inst to computeFlux_inst for consistency

87406e2

Move different drainPond calculation options to their own methods

ea15490

Main point is to decrease the clutter in the main QflxH2osfcDrain routine - trying to keep the focus more on the science than on the alternative numerical solutions.

Revert "Move different drainPond calculation options to their own met…

e4d56cd

…hods" This reverts commit c5abd576a98204374efdd1912bddb04896aa9771. At today's CTSM meeting, people felt that this had more negatives than positives.

Merge branch 'unified_land_model' into pondedWater/alternativeSolutio…

9133ffd

…ns_oo

billsacks commented Dec 28, 2017

View reviewed changes

billsacks mentioned this pull request Dec 28, 2017

Ponded water/alternative solutions #191

Closed

billsacks mentioned this pull request Dec 29, 2017

Move the solution method 'select case' block to the base class #193

Closed

billsacks commented Dec 29, 2017

View reviewed changes

billsacks closed this May 18, 2018

billsacks deleted the pondedWater/alternativeSolutions_oo branch May 30, 2018 02:39

samsrabin pushed a commit to samsrabin/CTSM that referenced this pull request May 3, 2024

Merge pull request ESCOMP#192 from jedwards4b/edge_cases2

f285d23

handle edge case model no_leap, data gregorian

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ponded water/alternative solutions - object-oriented #192

Ponded water/alternative solutions - object-oriented #192

billsacks commented Dec 28, 2017

billsacks commented Dec 28, 2017

billsacks Dec 28, 2017

billsacks Dec 28, 2017

billsacks commented Dec 28, 2017

billsacks commented Dec 29, 2017 •

edited

Loading

billsacks commented Dec 29, 2017

billsacks commented Dec 29, 2017

billsacks commented Dec 29, 2017

billsacks Dec 29, 2017

billsacks Dec 29, 2017

billsacks commented Dec 29, 2017

billsacks commented Dec 29, 2017

billsacks commented Dec 29, 2017

billsacks commented May 18, 2018

Ponded water/alternative solutions - object-oriented #192

Ponded water/alternative solutions - object-oriented #192

Conversation

billsacks commented Dec 28, 2017

billsacks commented Dec 28, 2017

billsacks Dec 28, 2017

Choose a reason for hiding this comment

billsacks Dec 28, 2017

Choose a reason for hiding this comment

billsacks commented Dec 28, 2017

billsacks commented Dec 29, 2017 • edited Loading

billsacks commented Dec 29, 2017

billsacks commented Dec 29, 2017

billsacks commented Dec 29, 2017

billsacks Dec 29, 2017

Choose a reason for hiding this comment

billsacks Dec 29, 2017

Choose a reason for hiding this comment

billsacks commented Dec 29, 2017

billsacks commented Dec 29, 2017

billsacks commented Dec 29, 2017

billsacks commented May 18, 2018

billsacks commented Dec 29, 2017 •

edited

Loading