Allow clean functions to handle _avg variables #377

jmcvey3 · 2025-02-06T20:03:48Z

Makes the ADCP cleaning functions more robust. Updated to work with the latest reader changes for dual-profiling instruments. Resolves issue #373.

ssolson

@jmcvey3 thanks for submitting this. I think most of these questions are for me, but maybe I found something helpful.

ssolson · 2025-02-13T17:53:43Z

mhkit/dolfyn/adp/clean.py

+    # Use "avg" velocty if standard isn't available.
+    # Should not matter which is used.
+    tag = []
+    if hasattr(ds, "vel"):
+        tag += [""]
+    if hasattr(ds, "vel_avg"):
+        tag += ["_avg"]
+
    # This finds the maximum of the echo profile:
-    inds = np.argmax(ds["amp"].values, axis=1)
+    inds = np.argmax(ds["amp" + tag[0]].values, axis=1)


tag[0] raises an IndexError if the dataset has neither "vel" nor "vel_avg", since tag is initialized as an empty list.

I think this should have an else branch that sets tag = [""] (or maybe tag is initialized as an empty string and "_avg" is only added when needed?), or have better error handling:
raise ValueError("Neither 'vel' nor 'vel_avg' found in dataset")

jmcvey3 · 2025-02-14

Good thoughts; however, if "vel" doesn't exist, "amp" and "corr" will also not exist. The signal amplitude and correlation are the quality metrics for the velocity ping's signal.
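
For reference, the explicit-error option raised in this thread could look roughly like the sketch below. `select_tag` is a hypothetical helper, not part of MHKiT; it assumes only that the dataset supports `in` membership tests on variable names, as an `xarray.Dataset` or a plain dict does.

```python
def select_tag(ds):
    """Return "" if ds has "vel", otherwise "_avg" if it has "vel_avg".

    Hypothetical helper for illustration; not part of MHKiT.
    """
    for tag in ("", "_avg"):
        if "vel" + tag in ds:
            return tag
    # Neither variable is present, so fail with a clear message
    # instead of an IndexError from indexing an empty list.
    raise ValueError("Neither 'vel' nor 'vel_avg' found in dataset")


# Plain dicts stand in for datasets here.
standard = {"vel": [], "amp": []}
averaged = {"vel_avg": [], "amp_avg": []}

print(repr(select_tag(standard)))  # ''
print(repr(select_tag(averaged)))  # '_avg'
```

The returned tag can then be appended to "amp", "vel", and "time" in the same way the diff appends tag[0].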

ssolson · 2025-02-13T17:58:08Z

mhkit/dolfyn/adp/clean.py

@@ -199,7 +207,7 @@ def water_depth_from_amplitude(ds, thresh=10, nfilt=None) -> None:

    ds["depth"] = xr.DataArray(
        d.astype("float32"),
-        dims=["time"],
+        dims=["time" + tag[0]],


Are we guaranteed to have a time average if we have a vel_avg?

jmcvey3 · 2025-02-14

Yes, I hardcoded it. The data stored under the "averaging" ID have their own timestamps, and I log "time" from that data ID with the "_avg" tag.

ssolson · 2025-02-13T18:01:54Z

mhkit/dolfyn/adp/clean.py

    d = np.median(D, axis=0)

    # Throw out values that do not increase near the surface by *thresh*
-    for ip in range(ds["vel"].shape[1]):
+    for ip in range(ds["vel" + tag[0]].shape[1]):
        itmp = np.min(inds[:, ip])
        if (edf[itmp:, :, ip] < thresh).all():
            d[ip] = np.nan


On line 188 you take the median of d1 and d2, and I notice that on line 194 you add a NaN to the array.

Will d1 or d2 ever contain NaN? If so, np.median will return NaN wherever one is present. Is that the behavior you want, or would np.nanmedian be preferred?

jmcvey3 · 2025-02-14

Hmmm, this is a Levi function... I'll switch all of those median, min, and max calls to their NaN-aware versions, because calling this after another QC function would otherwise be a problem.
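
The difference is easy to see on a small array. This is a minimal sketch with made-up values; the names d1, d2, and D follow the discussion above, but the shapes used in the actual function are not shown in this thread and are assumed here.

```python
import numpy as np

# Two hypothetical depth estimates per ping; NaN marks a ping that an
# earlier QC step rejected.
d1 = np.array([10.0, np.nan, 10.4])
d2 = np.array([10.2, 10.1, 10.6])
D = np.vstack((d1, d2))

plain = np.median(D, axis=0)      # NaN propagates into the second ping
robust = np.nanmedian(D, axis=0)  # NaN is ignored where other data exist

print(plain)
print(robust)
```

Note that np.nanmedian still returns NaN (with a RuntimeWarning) for a slice that is entirely NaN, so fully rejected pings stay flagged.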

ssolson · 2025-02-13T18:14:47Z

mhkit/dolfyn/adp/clean.py

        raise NameError("The variable 'temp' does not exist.")

    # Density calcation
-    P = ds["pressure"].values
-    T = ds["temp"].values  # temperature, degC
+    P = ds[pressure[0]].values  # pressure, dbar


Do all instruments use dbar (rather than Pa)? Or is it well described in the examples? Should this be added to the docstring?

jmcvey3 · 2025-02-14

Yes, they all use dbar because it translates nearly 1:1 to meters beneath the surface (assuming the pressure sensor was zeroed before deployment, of course).
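
The near 1:1 relationship follows from the hydrostatic equation, depth = P / (rho * g). The sketch below is illustrative only: `depth_from_pressure` is a hypothetical name, and the seawater density and gravity defaults are assumed nominal values, not taken from MHKiT.

```python
def depth_from_pressure(p_dbar, rho=1025.0, g=9.81):
    """Hydrostatic depth in meters from gauge pressure in dbar.

    rho (kg/m^3) and g (m/s^2) are assumed nominal values.
    """
    pascals = p_dbar * 1.0e4  # 1 dbar = 10,000 Pa
    return pascals / (rho * g)


# 10 dbar of gauge pressure corresponds to a little under 10 m of seawater.
print(depth_from_pressure(10.0))
```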

ssolson · 2025-02-13T18:25:20Z

mhkit/dolfyn/adp/clean.py

+    # Fetch cell size
+    cs = [
+        a
+        for a in ds.attrs
+        if (
+            ("cell_size" in a)
+            and ("_bt" not in a)
+            and ("_alt" not in a)
+            and ("wave" not in a)
+        )
+    ]
+


Should this raise if nothing matches, e.g.:
raise KeyError("No valid 'cell_size' attribute found in dataset.")

Or will there always be a "cell_size"?

jmcvey3 · 2025-02-14

There will always be a "cell_size" if the user doesn't remove it. I'll add a code block for user input if need be.
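
A guarded version of the lookup might look like the following sketch. `find_cell_size_key` is a hypothetical helper, not the merged implementation; the exclusion list mirrors the diff above, and a plain dict stands in for ds.attrs.

```python
def find_cell_size_key(attrs):
    """Return the first profile 'cell_size' attribute name in attrs.

    Hypothetical helper; the exclusion list mirrors the diff above.
    """
    excluded = ("_bt", "_alt", "wave")
    matches = [
        name
        for name in attrs
        if "cell_size" in name and not any(x in name for x in excluded)
    ]
    if not matches:
        # The user removed the attribute or the file never carried it.
        raise KeyError("No valid 'cell_size' attribute found in dataset.")
    return matches[0]


attrs = {"cell_size": 0.5, "cell_size_bt": 1.0, "fs": 2.0}
print(find_cell_size_key(attrs))  # cell_size
```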


akeeste · 2025-02-20T19:46:04Z

With Sterling's commits to develop pulled in, tests are now passing

akeeste · 2025-02-25T15:20:44Z

@jmcvey3 are there any other outstanding changes in this PR from @ssolson's review?

jmcvey3 · 2025-02-25T17:10:53Z

> @jmcvey3 are there any other outstanding changes in this PR from @ssolson's review?

I was waiting on a response in issue #373, but looks like we're good to go.

jmcvey3 added 3 commits February 6, 2025 12:02

Allow clean functions to handle _avg variables

9074cb6

Cleanup for fixes

40d4be0

Not sure what this newest black formatting is actually changing

31701d1

jmcvey3 force-pushed the clean_avg branch from ed81e98 to 31701d1 Compare February 12, 2025 17:00

ssolson reviewed Feb 13, 2025


jmcvey3 and others added 2 commits February 14, 2025 12:34

Handle nan's in amplitude surface function, set cellsize as user input

9717b3b

Merge branch 'develop' of https://github.com/MHKiT-Software/MHKiT-Python into clean_avg

0fbf031

jmcvey3 merged commit 7e91cea into MHKiT-Software:develop Feb 25, 2025
43 checks passed

jmcvey3 deleted the clean_avg branch February 25, 2025 17:11
