Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve mechanism for determining where to allocate memory for urban when running with dynamic urban #1572

Closed
billsacks opened this issue Dec 8, 2021 · 4 comments
Labels
enhancement new capability or improved behavior of existing capability

Comments

@billsacks
Copy link
Member

billsacks commented Dec 8, 2021

(Moving this from some discussion in #1546 - see #1546 (review) )

In the soon-to-be-merged PR to enable dynamic urban (#1546 ), the temporary mechanism for allocating memory for urban is to set run_zero_weight_urban to .true. That seems reasonable for development, but I think what we want long-term is something similar to lake; this is less of a performance and memory hit than the current run_zero_weight_urban mechanism (I'm not sure if you want to keep run_zero_weight_urban in addition to this mechanism). (although @olyson showed in #1445 (comment) that the performance hit isn't too great for the current mechanism). Here's what I think should be done:

  • Add a hasurban variable, similar to haslake (do we want a separate one for each urban landunit, or just one overall for urban – so if there is ever any urban in a given grid cell in a transient run, we'll allocate memory for all urban landunits in that grid cell?)
  • Add a subroutine urban_landunit_exists, similar to the lake_landunit_exists function added in ctsm5.1.dev003
  • I think we can (and should) change the urban logic in is_active_l to be similar to lakes: if memory is allocated for an urban column, then make it active, period. I think what we want is to have some logic that determines whether a given urban landunit should exist in memory, and then as long as the landunit exists in memory, it should be active so that it is spunup. (I think that's what we want, but I'm open to discussion here.)
  • Add code like surfrd_lakemask for urban in surfrdMod

@olyson replied to my original comment:

We'd like to do a separate PR for the hasurban implementation, so if you could open a separate issue for that, that would be great. We also think it would be beneficial to have separate hasurban for each urban landunit (likely a 3d variable instead of 3 separate variables). For example, the tall building district class is generally sparse and is likely to generally remain that way in the future, so we could probably save some memory by considering it separately. We plan on keeping the run_zero_weight_urban option for now.

@olyson @fang-bowen @Face2sea @keerzhang1

This is connected with #1445

@billsacks billsacks added the enhancement new capability or improved behavior of existing capability label Dec 8, 2021
@olyson
Copy link
Contributor

olyson commented Dec 8, 2021

Thanks for opening this PR.
A couple of questions:

  1. It looks like HASLAKE is a double precision variable on the surface dataset that is used to test dynamic lake. So I've put a double precision HASURBAN on the surface dataset that is used to test dynamic urban. However, @keerzhang1 points out that we could also use an integer for this variable. Is there a reason that this has to be a double?
  2. Ideally, we'd like to start the development for this PR from the dynamic urban branch that is nearly complete. Is it possible or good practice to branch off of a branch? In other words, can/should we branch off of Bowen's dynamic_urban branch for this work?

@billsacks
Copy link
Member Author

  1. It looks like HASLAKE is a double precision variable on the surface dataset that is used to test dynamic lake. So I've put a double precision HASURBAN on the surface dataset that is used to test dynamic urban. However, @keerzhang1 points out that we could also use an integer for this variable. Is there a reason that this has to be a double?

I can't remember why this was made a double precision variable. @Ivanderkelen do you remember if there was a reason for this or if it was an arbitrary choice? The only thing I can think of is that it is similar to the PCT_CFT_MAX, PCT_CROP_MAX and other variables, which are also double precision. I think @lawrencepj1 created those. If it's useful to have the actual max value for any reason, then it makes sense to have these double precision, but if all we need is a 1/0 flag, then I agree that an int is fine (or a byte, for that matter, though I'm not sure if we have the correct routines to read byte values into CTSM right now).

  1. Ideally, we'd like to start the development for this PR from the dynamic urban branch that is nearly complete. Is it possible or good practice to branch off of a branch? In other words, can/should we branch off of Bowen's dynamic_urban branch for this work?

I'm about to basically approve that branch, so it could come to master soon. But if you want to get started on this, then it is perfectly acceptable & good practice to start a branch off of a branch like this. Then, once #1546 is merged, you can merge that version of master into your branch, which git will handle cleanly.

@Ivanderkelen
Copy link
Contributor

@billsacks : I think the double precision for HASLAKE was arbitrary, as it is only a boolean flag. I agree with the reasoning setting it to int would be ok.

fang-bowen added a commit to fang-bowen/CTSM that referenced this issue Dec 13, 2021
@billsacks
Copy link
Member Author

Reviewing the changes in #1579 , I'm thinking it might be best to use PCT_URBAN_MAX instead of HASURBAN. This would be analogous to what we do for PCT_NAT_PFT_MAX and PCT_CROP_MAX. The advantage of this is that it would be more future-proof in case we ever want to have a more nuanced rule than just present / absent (e.g., include any urban point whose % cover is above some threshold). I'm often not a fan of doing something just because we might need it in the future, but in this case, I think it's justified because it can be a pain to need to change something in a coordinated fashion between the landuse timeseries files and the code, and also a pain to need to remake all the landuse timeseries files.

We would then also want to make a similar change in #1073 - changing HASLAKE to PCT_LAKE_MAX.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new capability or improved behavior of existing capability
Projects
None yet
Development

No branches or pull requests

3 participants