Allow clean functions to handle _avg variables #377
@jmcvey3 thanks for submitting this. I think most of these questions are for me but maybe I found something helpful.
# Use "avg" velocity if standard isn't available.
# Should not matter which is used.
tag = []
if hasattr(ds, "vel"):
    tag += [""]
if hasattr(ds, "vel_avg"):
    tag += ["_avg"]

# This finds the maximum of the echo profile:
- inds = np.argmax(ds["amp"].values, axis=1)
+ inds = np.argmax(ds["amp" + tag[0]].values, axis=1)
tag[0] could raise an IndexError, since tag is initialized as an empty list and stays empty if neither variable is present. I think this should have an else branch that sets tag = [""] (or maybe tag is initialized as an empty string and "_avg" is only added when needed?), or have better error handling:
raise ValueError("Neither 'vel' nor 'vel_avg' found in dataset")
Good thoughts; however, if "vel" doesn't exist, "amp" and "corr" will also not exist. The signal amplitude and correlation are the quality analysis of the velocity ping's signal.
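The fallback discussed in this thread could be sketched as follows. This is only an illustration, not the code in the PR: select_tag is a hypothetical helper, and a plain set of names stands in for the dataset's variables.

```python
def select_tag(variables):
    """Return "" if "vel" is present, "_avg" if only "vel_avg" is present.

    Hypothetical helper: `variables` is any container of variable names.
    Raises ValueError instead of failing later with an IndexError.
    """
    for tag in ("", "_avg"):
        if "vel" + tag in variables:
            return tag
    raise ValueError("Neither 'vel' nor 'vel_avg' found in dataset")


# Plain sets stand in for the variable names of a dataset.
standard = {"vel", "amp", "corr", "time"}
averaged = {"vel_avg", "amp_avg", "corr_avg", "time_avg"}

print(repr(select_tag(standard)))  # ''
print(repr(select_tag(averaged)))  # '_avg'
```

Returning a single string rather than a list removes the need for tag[0], and the explicit error covers the case where neither velocity variable exists.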
@@ -199,7 +207,7 @@ def water_depth_from_amplitude(ds, thresh=10, nfilt=None) -> None:

ds["depth"] = xr.DataArray(
    d.astype("float32"),
-   dims=["time"],
+   dims=["time" + tag[0]],
Are we guaranteed to have a time_avg if we have a vel_avg?
Yes, I hardcoded it. The data stored under the "averaging" ID has its own timestamp, and I log "time" from that data ID with the "_avg" tag.
mhkit/dolfyn/adp/clean.py
d = np.median(D, axis=0)

# Throw out values that do not increase near the surface by *thresh*
- for ip in range(ds["vel"].shape[1]):
+ for ip in range(ds["vel" + tag[0]].shape[1]):
    itmp = np.min(inds[:, ip])
    if (edf[itmp:, :, ip] < thresh).all():
        d[ip] = np.nan
On line 188 you take the median of d1 and d2, and I notice that on line 194 you assign NaN into the array. Will d1 or d2 ever contain NaN? If so, np.median will return NaN for any profile that contains one. Is that the behavior you want, or would np.nanmedian be preferred?
Hmmm, this is a Levi function... I'll switch all of those median, min, and max calls to their nan-aware versions, because it would be a problem if this is called after another QC function.
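To illustrate the difference raised above, here is a small sketch with made-up depth estimates (the values of d1 and d2 are assumptions for the example, not data from the PR). It compares np.median with np.nanmedian when one estimate is NaN.

```python
import numpy as np

# Two made-up depth estimates per profile; d1 has one NaN,
# as could happen after an earlier QC step.
d1 = np.array([10.2, 10.4, np.nan, 10.3])
d2 = np.array([10.1, 10.5, 10.6, 10.2])
D = np.vstack((d1, d2))

plain = np.median(D, axis=0)         # NaN wherever either estimate is NaN
nan_aware = np.nanmedian(D, axis=0)  # ignores NaN within each column

print(plain)
print(nan_aware)
```

Note that np.nanmedian still returns NaN (with a RuntimeWarning) for a column in which every value is NaN, so fully masked profiles remain flagged.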
    raise NameError("The variable 'temp' does not exist.")

# Density calculation
- P = ds["pressure"].values
T = ds["temp"].values  # temperature, degC
+ P = ds[pressure[0]].values  # pressure, dbar
Do all instruments use dbar (rather than Pa), or is this well described in the examples? Should it be added to the docstring?
Yes, they all use dbar because it translates nearly 1:1 to meters beneath the surface (assuming the pressure sensor was zeroed before deploying, of course).
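As a rough check of the "nearly 1:1" statement, the hydrostatic relation depth = P / (rho * g) can be evaluated for 1 dbar. The seawater density of 1025 kg/m^3 and g = 9.81 m/s^2 are assumed typical values, and depth_from_pressure is a hypothetical helper, not an MHKiT function.

```python
RHO = 1025.0         # assumed seawater density, kg/m^3
G = 9.81             # assumed gravitational acceleration, m/s^2
PA_PER_DBAR = 1.0e4  # 1 dbar = 10^4 Pa


def depth_from_pressure(p_dbar, rho=RHO, g=G):
    """Hydrostatic depth in meters for a gauge pressure given in dbar."""
    return p_dbar * PA_PER_DBAR / (rho * g)


print(round(depth_from_pressure(1.0), 4))  # 0.9945
```

Under these assumptions 1 dbar corresponds to about 0.99 m of water, which is why a zeroed pressure reading in dbar can be read almost directly as depth in meters.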
mhkit/dolfyn/adp/clean.py
# Fetch cell size
cs = [
    a
    for a in ds.attrs
    if (
        ("cell_size" in a)
        and ("_bt" not in a)
        and ("_alt" not in a)
        and ("wave" not in a)
    )
]
Should this raise an error when no matching attribute is found, e.g.
raise KeyError("No valid 'cell_size' attribute found in dataset.")
Or will there always be a "cell_size"?
There will always be a "cell_size" if the user doesn't remove it. I'll add a code block for user input if need be.
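One possible shape for the guard suggested above, shown as a sketch: find_cell_size_key is a hypothetical helper, and a plain dict stands in for ds.attrs. The attribute names in the example are made up for illustration.

```python
def find_cell_size_key(attrs):
    """Return the first attribute name that holds the profile cell size.

    Hypothetical helper: skips bottom-track ("_bt"), altimeter ("_alt"),
    and wave entries, and raises KeyError if nothing is left.
    """
    excluded = ("_bt", "_alt", "wave")
    keys = [
        a
        for a in attrs
        if "cell_size" in a and not any(x in a for x in excluded)
    ]
    if not keys:
        raise KeyError("No valid 'cell_size' attribute found in dataset.")
    return keys[0]


# A plain dict stands in for ds.attrs; the names are illustrative only.
attrs = {"cell_size_bt": 1.0, "cell_size": 0.5, "blank_dist": 0.2}
print(find_cell_size_key(attrs))  # cell_size
```

Raising here gives a clear message if a user has removed the attribute, instead of an IndexError when the empty list is indexed later.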
With Sterling's commits to |
Makes the ADCP cleaning functions more robust - updated based on latest reader updates for dual profiling instruments. Solution for Issue #373