Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline some of the code to construct the matrix in SoilTemperature #323

Closed
billsacks opened this issue Mar 15, 2018 · 12 comments
Closed

Inline some of the code to construct the matrix in SoilTemperature #323

billsacks opened this issue Mar 15, 2018 · 12 comments
Assignees
Labels
enhancement new capability or improved behavior of existing capability

Comments

@billsacks
Copy link
Member

The SoilTemperature module has an extreme level of modularity, with a few levels of subroutine calls in order to do any work. As noted in #301 , @swensosc investigated this, and found that, by inlining all of the calls to set up the matrix, he improved the performance of the matrix construction by about 1/3. We should probably do at least some of this inlining - both to improve readability and to improve performance. But @martynpclark and @swensosc feel we could strike a compromise: keep some of the modularity, but inline some pieces.

@billsacks billsacks added the enhancement new capability or improved behavior of existing capability label Mar 15, 2018
@negin513
Copy link
Contributor

There could be a way to achieve this level of inlining using compiler flags that specifies inlining levels: such as --inline all or --inline speed or specifying -inline-level.
If compiler inlining using the above flags works, we don't need to compromise the modularity of the Fortran code and we can achieve the speedup (to a degree). I am going to look at the *.optrpt files to check the optimization and test the compiler flags.

@billsacks
Copy link
Member Author

This isn't just a performance thing: The readability of this module is currently being hurt by the deep nesting of subroutine calls. So reducing the deep nesting of calls here is actually a win-win, and we should go with that approach rather than just changing compiler flags.

@negin513
Copy link
Contributor

I agree that if the readability of the module is being hurt by too much modularity, the best and easiest way forward is simplifying the module and subroutine calls. Because of the word "compromise", I was under the impression that there is a demand for this level of modularity.

@billsacks
Copy link
Member Author

I'll clarify: the "compromise" is between the extreme level of modularity in the current code (which is bad for both performance and readability) and having 0 modularity (which may also be bad for readability and maintainability).

@bishtgautam
Copy link

One possible option would be to only split the code only for Snow, SSW, and Soil and not subdivide the code for Urban and NonUrban.

@negin513
Copy link
Contributor

negin513 commented Jan 31, 2020

There are different possible solutions to address this issue and decrease the modularity level.

The complex modular structure for this module introduced in CLM4.5. We can come up with some ideas for better inlining these subroutines by comparing the non-modularized code from CLM4 source to this. (Thanks to @swensosc for pointing me to the non-modularized version.)

Currently, in this module (i.e. SoilTemperatureMod.F90), we have SetMatrix subroutine for snow, soil, standing surface water, etc. The snippet below demonstrates the nested call tree for one of these subroutines (subroutine SetMatrix_snow):

subroutine SetMatrix_snow
      call SetMatrix_SnowUrban
              call SetMatrix_SnowUrbanNonRoad
              call SetMatrix_SnowUrbanRoad
      call SetMatrix_SnowNonUrban

A similar subroutine structure is used for Snow_Soil, Soil, Standing Surface Water, etc. For example:

subroutine SetMatrix_snow_Soil
      call SetMatrix_Snow_SoilUrban
              call SetMatrix_Snow_SoilUrbanNonRoad
              call SetMatrix_Snow_SoilUrbanRoad
      call SetMatrix_Snow_SoilNonUrban

The code *UrbanNonRoad and *UrbanRoad subroutines for each of the area types are very similar and can be easily merged with conditionals to handle both of UrbanNonRoad and UrbanRoad.

do j = -nlevsno+1,0
    do fc = 1,num_nolakec
         c = filter_nolakec(fc)
          l = col%landunit(c)
          if (lun%urbpoi(l)) then

               ! this handles  urban road columns
               if (col%itype(c) == icol_road_imperv .or. col%itype(c) == icol_road_perv) then
                .... 
               ! this handles  urban non- road columns
               if ((col%itype(c) == icol_sunwall .or. col%itype(c) == icol_shadewall &	
                    .or. col%itype(c) == icol_roof))
                .... 

          end if
    end do
end do

We can further simplify this code by combining Urban and NonUrban subroutines for each of the area types. This requires adding conditionals for urban points and non-urban points. So the loop above will be:

do j = -nlevsno+1,0
    do fc = 1,num_nolakec
         c = filter_nolakec(fc)
          l = col%landunit(c)

          ! this handles  only urban columns 
          if (lun%urbpoi(l)) then

               ! this handles  urban road columns
               if (col%itype(c) == icol_road_imperv .or. col%itype(c) == icol_road_perv) then
                .... 
               ! this handles  urban non- road columns
               if ((col%itype(c) == icol_sunwall .or. col%itype(c) == icol_shadewall &	
                    .or. col%itype(c) == icol_roof))
                .... 
          ! this handles  only non-urban columns 
          if (.not. lun%urbpoi(l)) then
                .... 
          end if
    end do
end do

So for each of the area types, we will have one subroutine SetMatrix_${TYPE} that handles Urban (both road and non-road) and non-Urban columns, instead of having five different subroutines.

@billsacks
Copy link
Member Author

This sounds like a good plan @negin513 - thanks for writing it up.

@negin513
Copy link
Contributor

These are my previous plans and I am writing just for documenting them:

Writing SetMatrix for UrbanNonRoad, UrbanRoad and NonUrban in one loop (for each layer type) with conditionals and simply inlining the codes results in a very big loop which I think is more confusing to follow.

Upon further inspection, I've noticed that most lines are identical between UrbanNonRoad and UrbanRoad and NonUrban, so I tried to simplify these conditionals as much as possible or reduce the conditionals. For this purpose, I am defining an array called nlev_thresh which is equal to nlevgnd for NonUrban and UrbanNonRoad and is equal to nlevurb for UrbanRoad.
Using nlev_thresh gives us the capability to use the main SetMatrix loop for all urban and nonurban conditions.

Here I combined all five subroutine: SetMatrix_Snow , SetMatrix_SnowUrban, SetMatrix_SnowUrbanRoad, SetMatrix_SnowUrbanNonroad, and SetMatrix_ nonUrban into one SetMatrix_Snow, which has two loops like below:

      ! one loop to set nlev_thresh

      do j = -nlevsno+1,0
         do fc = 1,num_nolakec
            c = filter_nolakec(fc)
            l = col%landunit(c)
            ! urban non-road and nonurban columns
            nlev_thresh(c) = nlevgrnd

            ! urban road columns
            if ((lun%urbpoi(l)) &
               .and.  &
               (col%itype(c) == icol_sunwall &
               .or. col%itype(c) == icol_shadewall &
               .or. col%itype(c) == icol_roof)) then

               nlev_thresh(c) = nlevurb

            end if
         enddo
       end do

The SetMatrix loop for all columns without the conditionals for urban, nonurban, etc.

      do j = -nlevsno+1,0
         do fc = 1,num_nolakec
            c = filter_nolakec(fc)
            l = col%landunit(c)
            if (j >= col%snl(c)+1) then
                 if (j == col%snl(c)+1) then
                    dzp     = z(c,j+1)-z(c,j)
                    bmatrix_snow(c,4,j-1) = 0._r8
                    bmatrix_snow(c,3,j-1) = 1._r8+(1._r8-cnfac)*fact(c,j)*tk(c,j)/dzp-fact(c,j)*dhsdT(c)
                    if ( j /= 0) then
                       bmatrix_snow(c,2,j-1) =  -(1._r8-cnfac)*fact(c,j)*tk(c,j)/dzp
                    end if
                 else if (j <= nlev_thresh(c) -1) then
                    dzm     = (z(c,j)-z(c,j-1))
                    dzp     = (z(c,j+1)-z(c,j))
                    bmatrix_snow(c,4,j-1) =   - (1._r8-cnfac)*fact(c,j)* tk(c,j-1)/dzm
                    bmatrix_snow(c,3,j-1) = 1._r8+ (1._r8-cnfac)*fact(c,j)*(tk(c,j)/dzp + tk(c,j-1)/dzm)
                    if (j /= 0) then
                       bmatrix_snow(c,2,j-1) =   - (1._r8-cnfac)*fact(c,j)* tk(c,j)/dzp
                    end if
                 end if
            end if
         enddo
      enddo

Please note that exactly similar logic will be used for other Snow_Soil, Soil, Standing Surface Water, etc.
My previous plan was to do similar simplifications and inlining for setting matrix for each layer type (i.e. snow, soil, ...etc) separately.

@negin513
Copy link
Contributor

After discussions with @swensosc, he pointed out that instead of using a different set of subroutines for treating different types of layers, we can combine them into one subroutine for all layers. He prefers to have one SetMatrix that is the same for snow, soil, snow-soil interfaces, etc.

This requires a significant code refactoring in this module but I believe it makes the code much more readable because we are technically doing the same thing for Snow, Soil, and their interfaces.

@negin513
Copy link
Contributor

negin513 commented Mar 12, 2020

Upon further investigations on this subroutine, I've noticed that there are many redundancies and logical issues in the original modular code. I am only including a few examples here, but there are more.

Example 1)

SetMatrixSnow subroutine loops over j= -nlevsno+1, 0

and set dzp to z(c,j+1)-z(c,j).

Similarly SetMatrixSnowSoil subroutine loops over do j = 0,0 *

and set dzp to z(c,j+1)-z(c,j).

setting dzp to z(c,j+1)-z(c,j) is included in both of the subroutines for j=0 and is done twice.

*Please note that do j = 0,0 is exactly copied from the module and I am just quoting here. I do understand that this means j=0 .

@negin513
Copy link
Contributor

Example 2)

The following logical issue is in the SetMatrix_SoilNonUrban

else if (j == 1) then
! this is the snow/soil interface layer
dzm = (z(c,j)-z(c,j-1))
dzp = (z(c,j+1)-z(c,j))
if (j /= 1) then
bmatrix_soil(c,4,j) = - frac_sno_eff(c) * (1._r8-cnfac) * fact(c,j) &
* tk(c,j-1)/dzm
end if
bmatrix_soil(c,3,j) = 1._r8 + (1._r8-cnfac)*fact(c,j)*(tk(c,j)/dzp &
+ frac_sno_eff(c) * tk(c,j-1)/dzm) &
- (1._r8 - frac_sno_eff(c))*fact(c,j)*dhsdT(c)
bmatrix_soil(c,2,j) = - (1._r8-cnfac)*fact(c,j)*tk(c,j)/dzp
else if (j <= nlevgrnd-1) then

Here we have an if block to check if (j == 1) and then inside the same if block we have another conditional to check if (j /= 1).
It is apparent that the condition of if (j /= 1) is never going to be met inside if (j == 1) block.
The same logic issues and redundancies (ifs that are never getting hit) is also appearing in other subroutines in the original module.

@negin513
Copy link
Contributor

I am including these notes here for people who are interested to know the logic behind this solution and those who would like to follow the conditionals to understand the unified SetMatrix subroutine.

Imagine we have a column of snow soil layers as shown below:

     atmosphere
----------------------
|   snow  layer snl+1|
----------------------
|   snow  layer ...  | 
----------------------
|   snow  layer -2   | 
---------------------- 
|   snow  layer -1   |  
----------------------
|   snow  layer 0    |  
----------------------
|   soil  layer 1    | 
---------------------- 
|   soil  layer 2    |  
----------------------
|   soil  layer 3    |  
----------------------
|   soil  layer ...  | 
----------------------
| soil layer nlevgrnd|  
----------------------

The banded matrix for the numerical solution of temperature (including coefficients in equation 6.16 of CLM technical note):

  1. For the top soil layer j=1 ( when no snow is present) or top snow layer j =snl+1:
dzp     = z(c,j+1)-z(c,j)
bmatrix(c,4,j) = 0._r8
bmatrix(c,3,j) = 1+(1._r8-cnfac)*fact(c,j)*tk(c,j)/dzp-fact(c,j)*dhsdT(c)
bmatrix(c,2,j) =  -(1._r8-cnfac)*fact(c,j)*tk(c,j)/dzp
  1. For the interior snow-soil layers ( snl+1 <j<nlevgrnd) excluding the top soil layer (j=1):
dzm     = (z(c,j)-z(c,j-1))
dzp     = (z(c,j+1)-z(c,j))
bmatrix(c,4,j) =   - (1._r8-cnfac)*fact(c,j)* tk(c,j-1)/dzm
bmatrix(c,3,j) = 1._r8+ (1._r8-cnfac)*fact(c,j)*(tk(c,j)/dzp + tk(c,j-1)/dzm)
bmatrix(c,2,j) =   - (1._r8-cnfac)*fact(c,j)* tk(c,j)/dz
  1. At the bottom of snow-soil column ( j = nlevgrnd):
dzm     = (z(c,j)-z(c,j-1))
bmatrix(c,4,j) =   - (1._r8-cnfac)*fact(c,j)*tk(c,j-1)/dzm
bmatrix(c,3,j) = 1._r8+ (1._r8-cnfac)*fact(c,j)*tk(c,j-1)/dzm
bmatrix(c,2,j) = 0._r8
  1. Finally for the snow-soil interface layer (i.e. soil top layer when snow is present) (j=1):
dzm     = (z(c,j)-z(c,j-1))
dzp     = (z(c,j+1)-z(c,j)
bmatrix(c,3,j) = 1._r8 + (1._r8-cnfac)*fact(c,j)*(tk(c,j)/dzp &
                         + frac_sno_eff(c) * tk(c,j-1)/dzm) &
                         - (1._r8 - frac_sno_eff(c))*fact(c,j)*dhsdT(c)
bmatrix(c,2,j) = - (1._r8-cnfac)*fact(c,j)*tk(c,j)/dz
  • please note that for top soil layer when snow is not present this would be the same as top snow layer (condition 1).

  1. When surface water exists, we have the following for layer 0 (j=0) (SSW layer):
! surface water layer has two coefficients
dzm=(0.5*dz_h2osfc(c)+z(c,1))

!flux to top soil layer
bmatrix(c,2,0)= -(1._r8-cnfac)*(dtime/c_h2osfc(c))*tk_h2osfc(c)/dzm 

!interaction from atm
bmatrix(c,3,0)= 1+(1._r8-cnfac)*(dtime/c_h2osfc(c)) &
            *tk_h2osfc(c)/dzm -(dtime/c_h2osfc(c))*dhsdT(c) 
  1. When water is present (frac_h2osfc>0), we should do the following corrections for the top soil layer (j=1):
 ! diagonal element correction for the presence of h2osfc

if ( frac_h2osfc(c) /= 0.0_r8 )then
          bmatrix(c,4,1)=  - frac_h2osfc(c) * (1._r8-cnfac) * fact(c,1) &
             * tk_h2osfc(c)/dzm !flux from h2osfc
          bmatrix(c,3,1)=bmatrix(c,3,1)+ frac_h2osfc(c) &
               *((1._r8-cnfac)*fact(c,1)*tk_h2osfc(c)/dzm + fact(c,1)*dhsdT(c))
else
          bmatrix(c,4,1)= 0.0_r8
end if

The above is written for non-urban and urban non-road columns. Instead of nlevgrnd, we use nlevurb for Urban road columns:

Using this we can loop over all layers as do j = -nlevsno+1,nlevgrnd.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new capability or improved behavior of existing capability
Projects
Status: Done (non release/external)
Development

No branches or pull requests

3 participants