-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix problematic ancillary pixels #44
base: main
Are you sure you want to change the base?
Conversation
… put new problematic_pixel code. Moved the mule replace operator to the common_utilities.py script because it is being repeated.
…ll three replace land/surface scripts.
…to the common_utilities.py script
Sorry @engelca, I'll be away for the next couple weeks and won't be able to review this, so I'll remove my request for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @engelca,
I left a some comments and changes related to clarity and performance.
Feel free to ask in the specific comment for any further details about specific changes.
print('transform') | ||
return sources[1] | ||
|
||
def replace_in_ff_problematic(f, mf_out, replace, stashcode, canopy_pixels, landsea_pixels): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add a docstring to better explain the purpose of the function.
I would also rename to something a bit clearer like replace_problematic_pixels_in_ff
|
||
mf_out.fields.append(replace([f, data])) | ||
|
||
def problematic_pixels(infile): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename the function to be a bit more specific of its purpose. For example:
- get_problematic_pixels
- find_problematic_pixels
current_data = f.get_data() | ||
data=current_data.copy() | ||
|
||
if stashcode == 218: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to capture this number in a constant variable at the beinning of the script (outside the function) that makes it clearer what it stands for.
Something like:
STASH_CANOPY_HEIGHT=218
then change
if stashcode == 218: | |
if stashcode == STASH_CANOPY_HEIGHT: |
ix=canopy_pixels[j][1] | ||
if np.isnan(data[iy,ix]): | ||
data[iy,ix]=1. | ||
elif stashcode == 33: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above with the hardcoded number for the canopy height stash.
@@ -59,13 +59,13 @@ def main(): | |||
|
|||
# If necessary replace ERA5 land/surface fields with higher-resolution options | |||
if "era5land" in args.type: | |||
replace_landsurface_with_ERA5land_IC.swap_land_era5land(args.mask, args.file, t) | |||
replace_landsurface_with_ERA5land_IC.swap_land_era5land(args.mask, args.file, t, fix_problematic_pixels="yes") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace_landsurface_with_ERA5land_IC.swap_land_era5land(args.mask, args.file, t, fix_problematic_pixels="yes") | |
replace_landsurface_with_ERA5land_IC.swap_land_era5land(args.mask, args.file, t, fix_problematic_pixels=True) |
lats=orog_data['latitude'].data | ||
lons=orog_data['longitude'].data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lats=orog_data['latitude'].data | |
lons=orog_data['longitude'].data |
# Printing information to the standard output for reporting purposes (so scientists can be aware) | ||
npoints,nxy = (canopy_pixels.shape) | ||
if npoints>0: | ||
for i in range(npoints): | ||
print("%.1f, %.1f, Nan canopy"%(lons[canopy_pixels[i,1]],lats[canopy_pixels[i,0]])) | ||
|
||
npoints,nxy = (landsea_pixels.shape) | ||
if npoints>0: | ||
for i in range(npoints): | ||
print("%.1f, %.1f, Misplaced Orography"%(lons[landsea_pixels[i,1]],lats[landsea_pixels[i,0]])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Printing information to the standard output for reporting purposes (so scientists can be aware) | |
npoints,nxy = (canopy_pixels.shape) | |
if npoints>0: | |
for i in range(npoints): | |
print("%.1f, %.1f, Nan canopy"%(lons[canopy_pixels[i,1]],lats[canopy_pixels[i,0]])) | |
npoints,nxy = (landsea_pixels.shape) | |
if npoints>0: | |
for i in range(npoints): | |
print("%.1f, %.1f, Misplaced Orography"%(lons[landsea_pixels[i,1]],lats[landsea_pixels[i,0]])) | |
# Printing information to the standard output for reporting purposes (so scientists can be aware) | |
tmp = missing_canopy.where(missing_canopy.compute(), drop=True) | |
for lon, lat in zip(tmp.longitude, tmp.latitude): | |
print(f"{lat:.1f}, {lon:.1f}, NaN Canopy") | |
tmp = misclassified_land.where(misclassified_land.compute(), drop=True) | |
for lon, lat in zip(tmp.longitude, tmp.latitude): | |
print(f"{lat:.1f}, {lon:.1f}, Misplaced Orography") |
Also a minor thing, I swapped the printing of 'latitude' and 'longitude' values (placing 'latitude' first), to be more consistent with the usual coordinate specification.
print('transform') | ||
return sources[1] | ||
|
||
def replace_in_ff_problematic(f, mf_out, replace, stashcode, canopy_pixels, landsea_pixels): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def replace_in_ff_problematic(f, mf_out, replace, stashcode, canopy_pixels, landsea_pixels): | |
def replace_in_ff_problematic(f, mf_out, replace, missing_canopy, misclassified_land): | |
stashcode = f.lbuser4 |
I would avoid passing an argument that can be computed within the function.
stashcode
is always going to be f.lbuser4
, but since f
is an argument of the function, we can compute it inside the function itself.
Note the function call should be modified in the other files after this modification.
For changes in canopy_pixels
and landsea_pixels
, refer to the related comments below.
print("%.1f, %.1f, Misplaced Orography"%(lons[landsea_pixels[i,1]],lats[landsea_pixels[i,0]])) | ||
|
||
# returning pixel locations so they can be processed appropriately | ||
return canopy_pixels,landsea_pixels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return canopy_pixels,landsea_pixels | |
return missing_canopy,misclassified_land |
This would simplify the logic in the replace_in_ff_problematic
function (see related comments above for details).
current_data = f.get_data() | ||
data=current_data.copy() | ||
|
||
if stashcode == 218: | ||
for j in range(len(canopy_pixels)): | ||
iy=canopy_pixels[j][0] | ||
ix=canopy_pixels[j][1] | ||
if np.isnan(data[iy,ix]): | ||
data[iy,ix]=1. | ||
elif stashcode == 33: | ||
for j in range(len(landsea_pixels)): | ||
data[landsea_pixels[j][0],landsea_pixels[j][1]]=0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
current_data = f.get_data() | |
data=current_data.copy() | |
if stashcode == 218: | |
for j in range(len(canopy_pixels)): | |
iy=canopy_pixels[j][0] | |
ix=canopy_pixels[j][1] | |
if np.isnan(data[iy,ix]): | |
data[iy,ix]=1. | |
elif stashcode == 33: | |
for j in range(len(landsea_pixels)): | |
data[landsea_pixels[j][0],landsea_pixels[j][1]]=0. | |
if stashcode == 218: | |
data = np.where(missing_canopy, 1., f.get_data()) | |
elif stashcode == 33: | |
data = np.where(misclassified_land, 0., f.get_data()) |
If we use missing_canopy
and misclassified_land
(the masks) as input arguments we can simplify the fixing logic without needing loops.
This pull request adds the ability to fix problematic ancillary pixels "on-the-fly" in the replace_landsurface scripts (when handling start dumps).