Test merge split beam angle (#15)

* create general structure of add_splitbeam_angle function * create simrad.py and start working on a check of the waveform and encode modes * rearange and improve logic for checking the encode and waveform modes * finish documenting functions in simrad and finalizing their outputs, incorporate new waveform/encode mode check into calibrate_ek and consolidate api * allow for 2 beam groups for waveform_mode=BB and add check_waveform_encode_mode to compute_Sv and compute_TS * move core split-beam angle functions to split_beam_angle.py, finish implementation of _get_splitbeam_angle_power_CW, _add_splitbeam_angle_to_ds, add an empty test suite for echodata.simrad, and start working on test_add_splitbeam_angle * make check_waveform_encode_mode return only the echodata group name corresponding to the encode mode * change test_add_splitbeam_angle so that it takes a list of echoview mat paths, construct a function that creates a numpy array from mat paths, and select echodata beam_group data in add_splitbeam_angle based on ds channel dims * add pulse_compression input to add_splitbeam_angle, implement (CW, complex) and (BB, complex, no pulse compression) options, and create tests for them * modify test_add_splitbeam_angle pytest parameters to account for data location, finish docstring for add_splitbeam_angle, drop beam wherever necessary * add comment at the top of test_consolidate.py specifying where the split-beam angle tests data exists * creat apply_mask function without implementing it * create general structure implementation of apply_mask * create first simple test for apply_mask with mock Dataset creation * create apply_mask paramter test for when mask is a list of DataArrays * change the function name validate_source_ds to validate_source_ds_da and allow it to handle a DataArray input, modify existing tests for validate_source_ds * create routines that check the input of apply_mask and form the inputs for the core method in apply_mask * allow get_mock_source_ds_apply_mask to produced delayed variables and add associated test in test_apply_mask * add code and parameters to test_apply_mask to test for a mask that is provided as a path * add tests for different fill_value types and accout for type(fill_value)=DataArray conflict with xr.where * correct docstring in _check_mode_input_without_data Co-authored-by: Wu-Jung Lee <[email protected]> * raise ValueError instead of RuntimeError in _check_mode_input_without_data * remove TODO in _check_mode_input_with_data_EK60 * improve grammar in error output in _check_mode_input_with_data_EK80 Co-authored-by: Wu-Jung Lee <[email protected]> * remove check for EK80-like sensors in _check_mode_input_with_data_EK80 as it is not necessary * improve docstring function description for add_splitbeam_angle Co-authored-by: Wu-Jung Lee <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove States in pulse_compression docstring description Co-authored-by: Wu-Jung Lee <[email protected]> * improve logic associated with split-beam angle function calls in add_splitbeam_angle Co-authored-by: Wu-Jung Lee <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove remnant else statement caused by GH commit in add_splitbeam_angle * rename _check_mode_input_without_data to _check_input_args_combination * add reminder to add a test for _check_input_args_combination * change _check_mode_input_with_data_EK60/EK80 to _retrieve_correct_beam_group_EK60/EK80 * rename check_waveform_encode_mode to retrieve_correct_beam_group * finish constructing test_check_input_args_combination * remove reference to MVBS for ds input in add_splitbeam_angle function Co-authored-by: Wu-Jung Lee <[email protected]> * rename ds to source_Sv in add_splitbeam_angle * allow source_Sv to be a Dataset, str, or pathlib.Path * add return_dataset argument to add_splitb_ang and modify _add_splitbeam_angle_to_ds so that it will write the data to a zarr or netcdf, if necessary * remove auto chunking when opening a Dataset since netcdf files have issues with this and free resources linked to ds in _add_splitbeam_angle_to_ds as netcdf also has an issue if the file remains open * modify test_add_splitbeam_angle to allow writing ds_Sv to a file and add associated test parameters * raise not implemented error if source_Sv corresponds to MVBS * implement split-beam angle calculation for waveform_mode=BB, encode_mode=complex, and pulse_compression=False, and add corresponding test * Improve documentation by clarifying statements Co-authored-by: Wu-Jung Lee <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove whitespace caused by GH commit * replace RuntimeError with ValueError * replace chunks=auto with chunks={} * add default value to apply_mask input var_name and modify documentation accordingly * Fix meta_source_filenames bug and enable (meta)source_filenames appending of path and list (#908) * Expand prov.source_files_vars to support path sequences that mix a str/path and another sequence * For EK80, ES70, ES80, EA640, conversion was inserting an unnecessary, empty meta_source_filenames variable * Add more comprehensive and readable prov source-files type hints; rename _source_files * Fix prov type hint bug with Py 3.8 * Add unit test for prov _sanitize_source_files; plus small fixes to np.ndarray type references in prov * fix validate_source_ds to validate_source_ds_da * improve Notes section in docstring for add_splitbeam_angle Co-authored-by: Wu-Jung Lee <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * clarify when the split-beam angle data potentially exist Co-authored-by: Wu-Jung Lee <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix long lines * Add clarification in NotImplemented error statement Co-authored-by: Wu-Jung Lee <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * split up line in NotImplemented error statement * rename function compute_split_beam_beamtype1 to _e2f * put radian to degree conversion on one line Co-authored-by: Wu-Jung Lee <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change sin to arcsin Co-authored-by: Wu-Jung Lee <[email protected]> * remove unnecessary statement Co-authored-by: Wu-Jung Lee <[email protected]> * replace RuntimeError with ValueError * change theta_fc --> theta and phi_fc --> phi * allow splitbeam calculation for power and CW options when beam_type is not equal to zero * replace core functions in _get_splitbeam_angle_complex_CW with get_splitbeam and then remove them * shorten function names, minor docstring tweaks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add TODO to test * fix bug in logic for at least 1 channel being split-beam Co-authored-by: b-reyes <[email protected]> Co-authored-by: b-reyes <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Emilio Mayorga <[email protected]>
OSOceanAcoustics · Dec 22, 2022 · ad48768 · ad48768
1 parent 7c399c8
commit ad48768
Show file tree

Hide file tree

Showing 7 changed files with 1,159 additions and 84 deletions.
diff --git a/echopype/calibrate/calibrate_ek.py b/echopype/calibrate/calibrate_ek.py
@@ -3,6 +3,7 @@
 from scipy import signal
 
 from ..echodata import EchoData
+from ..echodata.simrad import retrieve_correct_beam_group
 from ..utils import uwa
 from ..utils.log import _init_logger
 from .calibrate_base import CAL_PARAMS, CalibrateBase
@@ -134,30 +135,24 @@ def get_cal_params(self, cal_params, waveform_mode, encode_mode):
             else beam["equivalent_beam_angle"]
         )
 
-    def _cal_power(self, cal_type, use_beam_power=False) -> xr.Dataset:
+    def _cal_power(self, cal_type: str, power_ed_group: str = None) -> xr.Dataset:
         """Calibrate power data from EK60 and EK80.
 
         Parameters
         ----------
-        cal_type : str
+        cal_type: str
             'Sv' for calculating volume backscattering strength, or
             'TS' for calculating target strength
-        use_beam_power : bool
-            whether to use beam_power.
-            If ``True`` use ``echodata["Sonar/Beam_group2"]``;
-            if ``False`` use ``echodata["Sonar/Beam_group1"]``.
-            Note ``echodata["Sonar/Beam_group2"]`` could only exist for EK80 data.
+        power_ed_group:
+            The ``EchoData`` beam group path containing the power data
 
         Returns
         -------
         xr.Dataset
             The calibrated dataset containing Sv or TS
         """
         # Select source of backscatter data
-        if use_beam_power:
-            beam = self.echodata["Sonar/Beam_group2"]
-        else:
-            beam = self.echodata["Sonar/Beam_group1"]
+        beam = self.echodata[power_ed_group]
 
         # Harmonize time coordinate between Beam_groupX data and env_params
         for p in self.env_params.keys():
@@ -284,10 +279,16 @@ def get_env_params(self, **kwargs):
             )
 
     def compute_Sv(self, **kwargs):
-        return self._cal_power(cal_type="Sv")
+        power_ed_group = retrieve_correct_beam_group(
+            echodata=self.echodata, waveform_mode="CW", encode_mode="power", pulse_compression=False
+        )
+        return self._cal_power(cal_type="Sv", power_ed_group=power_ed_group)
 
     def compute_TS(self, **kwargs):
-        return self._cal_power(cal_type="TS")
+        power_ed_group = retrieve_correct_beam_group(
+            echodata=self.echodata, waveform_mode="CW", encode_mode="power", pulse_compression=False
+        )
+        return self._cal_power(cal_type="TS", power_ed_group=power_ed_group)
 
 
 class CalibrateEK80(CalibrateEK):
@@ -883,78 +884,22 @@ def _compute_cal(self, cal_type, waveform_mode, encode_mode) -> xr.Dataset:
         xr.Dataset
             An xarray Dataset containing either Sv or TS.
         """
-        # Raise error for wrong inputs
-        if waveform_mode not in ("BB", "CW"):
-            raise ValueError(
-                "Input waveform_mode not recognized! "
-                "waveform_mode must be either 'BB' or 'CW' for EK80 data."
-            )
-        if encode_mode not in ("complex", "power"):
-            raise ValueError(
-                "Input encode_mode not recognized! "
-                "encode_mode must be either 'complex' or 'power' for EK80 data."
-            )
+
+        power_ed_group = retrieve_correct_beam_group(
+            echodata=self.echodata,
+            waveform_mode=waveform_mode,
+            encode_mode=encode_mode,
+            pulse_compression=False,
+        )
 
         # Set flag_complex
         #  - True: complex cal
         #  - False: power cal
-        # BB: complex only, CW: complex or power
+        flag_complex = False
         if waveform_mode == "BB":
-            if encode_mode == "power":  # BB waveform forces to collect complex samples
-                raise ValueError("encode_mode='power' not allowed when waveform_mode='BB'!")
             flag_complex = True
-        else:  # waveform_mode="CW"
-            if encode_mode == "complex":
-                flag_complex = True
-            else:
-                flag_complex = False
-
-        # Raise error when waveform_mode and actual recording mode do not match
-        # This simple check is only possible for BB-only data,
-        #   since for data with both BB and CW complex samples,
-        #   frequency_start will exist in echodata["Sonar/Beam_group1"] for the BB channels
-        if waveform_mode == "BB" and "frequency_start" not in self.echodata["Sonar/Beam_group1"]:
-            raise ValueError("waveform_mode='BB' but broadband data not found!")
-
-        # Set use_beam_power
-        #  - True: use self.echodata["Sonar/Beam_group2"] for cal
-        #  - False: use self.echodata["Sonar/Beam_group1"] for cal
-        use_beam_power = False
-
-        # Warn user about additional data in the raw file if another type exists
-        # When both power and complex samples exist:
-        #   complex samples will be stored in echodata["Sonar/Beam_group1"]
-        #   power samples will be stored in echodata["Sonar/Beam_group2"]
-        # When only one type of samples exist,
-        #   all samples with be stored in echodata["Sonar/Beam_group1"]
-        if self.echodata["Sonar/Beam_group2"] is not None:  # both power and complex samples exist
-            # If both beam and beam_power groups exist,
-            #   this means that CW data are encoded as power samples and in beam_power group
-            if waveform_mode == "CW" and encode_mode == "complex":
-                raise ValueError("File does not contain CW complex samples")
-
-            if encode_mode == "power":
-                use_beam_power = True  # switch source of backscatter data
-                logger.info(
-                    "Only power samples are calibrated, but complex samples also exist in the raw data file!"  # noqa
-                )
-            else:
-                logger.info(
-                    "Only complex samples are calibrated, but power samples also exist in the raw data file!"  # noqa
-                )
-        else:  # only power OR complex samples exist
-            if (
-                "backscatter_i" in self.echodata["Sonar/Beam_group1"].variables
-            ):  # data contain only complex samples
-                if encode_mode == "power":
-                    raise TypeError(
-                        "File does not contain power samples! Use encode_mode='complex'"
-                    )  # user selects the wrong encode_mode
-            else:  # data contain only power samples
-                if encode_mode == "complex":
-                    raise TypeError(
-                        "File does not contain complex samples! Use encode_mode='power'"
-                    )  # user selects the wrong encode_mode
+        elif encode_mode == "complex":
+            flag_complex = True
 
         # Compute Sv
         if flag_complex:
@@ -964,7 +909,7 @@ def _compute_cal(self, cal_type, waveform_mode, encode_mode) -> xr.Dataset:
         else:
             # Power samples only make sense for CW mode data
             self.compute_range_meter(waveform_mode="CW", encode_mode=encode_mode)
-            ds_cal = self._cal_power(cal_type=cal_type, use_beam_power=use_beam_power)
+            ds_cal = self._cal_power(cal_type=cal_type, power_ed_group=power_ed_group)
 
         return ds_cal
 

diff --git a/echopype/consolidate/__init__.py b/echopype/consolidate/__init__.py
@@ -1,3 +1,3 @@
-from .api import add_depth, add_location, swap_dims_channel_frequency
+from .api import add_depth, add_location, add_splitbeam_angle, swap_dims_channel_frequency
 
-__all__ = ["swap_dims_channel_frequency", "add_depth", "add_location"]
+__all__ = ["swap_dims_channel_frequency", "add_depth", "add_location", "add_splitbeam_angle"]
diff --git a/echopype/consolidate/api.py b/echopype/consolidate/api.py
@@ -1,10 +1,20 @@
 import datetime
-from typing import Optional
+import pathlib
+from typing import Optional, Union
 
 import numpy as np
 import xarray as xr
 
 from ..echodata import EchoData
+from ..echodata.simrad import retrieve_correct_beam_group
+from ..utils.io import validate_source_ds_da
+from .split_beam_angle import (
+    add_angle_to_ds,
+    get_angle_complex_BB_nopc,
+    get_angle_complex_BB_pc,
+    get_angle_complex_CW,
+    get_angle_power_CW,
+)
 
 
 def swap_dims_channel_frequency(ds: xr.Dataset) -> xr.Dataset:
@@ -174,3 +184,168 @@ def sel_interp(var):
     interp_ds["longitude"] = interp_ds["longitude"].assign_attrs({"history": history})
 
     return interp_ds.drop_vars("time1")
+
+
+def add_splitbeam_angle(
+    source_Sv: Union[xr.Dataset, str, pathlib.Path],
+    echodata: EchoData,
+    waveform_mode: str,
+    encode_mode: str,
+    pulse_compression: bool = False,
+    storage_options: dict = {},
+    return_dataset: bool = True,
+) -> xr.Dataset:
+    """
+    Add split-beam (alongship/athwartship) angles into the Sv dataset.
+    This function calculates the alongship/athwartship angle using data stored
+    in the beam groups of ``echodata``. In cases when angle data does not already exist
+    or cannot be computed from the data, an error is issued and no angle variables are
+    added to the dataset.
+
+    Parameters
+    ----------
+    source_Sv: xr.Dataset or str or pathlib.Path
+        The Sv Dataset or path to a file containing the Sv Dataset, which will have the
+        split-beam angle data added to it
+    echodata: EchoData
+        An ``EchoData`` object holding the raw data
+    waveform_mode : {"CW", "BB"}
+        Type of transmit waveform
+
+        - ``"CW"`` for narrowband transmission,
+          returned echoes recorded either as complex or power/angle samples
+        - ``"BB"`` for broadband transmission,
+          returned echoes recorded as complex samples
+
+    encode_mode : {"complex", "power"}
+        Type of encoded return echo data
+
+        - ``"complex"`` for complex samples
+        - ``"power"`` for power/angle samples, only allowed when
+          the echosounder is configured for narrowband transmission
+    pulse_compression: bool, False
+        Whether pulse compression should be used (only valid for
+        ``waveform_mode="BB"`` and ``encode_mode="complex"``)
+    storage_options: dict, default={}
+        Any additional parameters for the storage backend, corresponding to the
+        path provided for ``source_Sv``
+    return_dataset: bool, default=True
+        If True, ``source_Sv`` with the split-beam angle data added to it
+        will be returned, else it will not be returned. A value of ``False``
+        is useful in the situation where ``source_Sv`` is a path and the user
+        only wants to write the split-beam angle data to the path provided.
+
+    Returns
+    -------
+    xr.Dataset or None
+        If ``return_dataset=False``, nothing will be returned. If ``return_dataset=True``
+        either the input dataset ``source_Sv`` or a lazy-loaded Dataset (obtained from
+        the path provided by ``source_Sv``) with the split-beam angle data added
+        will be returned.
+
+    Raises
+    ------
+    ValueError
+        If ``echodata`` has a sonar model that is not analogous to either EK60 or EK80
+    ValueError
+        If the input ``source_Sv`` does not have a ``channel`` dimension
+    ValueError
+        If ``source_Sv`` does not have appropriate dimension lengths in
+        comparison to ``echodata`` data
+    ValueError
+        If the provided ``waveform_mode``, ``encode_mode``, and ``pulse_compression`` are not valid
+    NotImplementedError
+        If an unknown ``beam_type`` is encountered during the split-beam calculation
+
+    Notes
+    -----
+    Split-beam angle data potentially exist for the following echosounders depending on
+    the instrument configuration and recording setting:
+
+        - Simrad EK60 echosounder paired with split-beam transducers and
+          configured to store angle data
+        - Simrad EK80 echosounder paired with split-beam transducers and
+          configured to store angle data
+
+    In most cases where the type of samples collected by the echosounder (power/angle
+    samples or complex samples) and the transmit waveform (broadband or narrowband)
+    are identical across all channels, the channels existing in ``source_Sv`` and `
+    `echodata`` will be identical. If this is not the case, only angle data corresponding
+    to channels existing in ``source_Sv`` will be added.
+
+    For EK80 broadband data, the split-beam angles can be estimated from the complex data.
+    The current implementation generates angles estimated *without* applying pulse compression.
+    Estimating the angle with pulse compression will be added in the near future.
+    """
+
+    # ensure that echodata was produced by EK60 or EK80-like sensors
+    if echodata.sonar_model not in ["EK60", "ES70", "EK80", "ES80", "EA640"]:
+        raise ValueError(
+            "The sonar model that produced echodata does not have split-beam "
+            "transducers, split-beam angles cannot be added to source_Sv!"
+        )
+
+    # validate the source_Sv type or path (if it is provided)
+    source_Sv, file_type = validate_source_ds_da(source_Sv, storage_options)
+
+    # initialize source_Sv_path
+    source_Sv_path = None
+
+    if isinstance(source_Sv, str):
+
+        # store source_Sv path so we can use it to write to later
+        source_Sv_path = source_Sv
+
+        # TODO: In the future we can improve this by obtaining the variable names, channels,
+        #  and dimension lengths directly from source_Sv using zarr or netcdf4. This would
+        #  prevent the unnecessary loading in of the coordinates, which the below statement does.
+        # open up Dataset using source_Sv path
+        source_Sv = xr.open_dataset(source_Sv, engine=file_type, chunks={}, **storage_options)
+
+    # raise not implemented error if source_Sv corresponds to MVBS
+    if source_Sv.attrs["processing_function"] == "preprocess.compute_MVBS":
+        raise NotImplementedError("Adding split-beam data to MVBS has not been implemented!")
+
+    # check that the appropriate waveform and encode mode have been given
+    # and obtain the echodata group path corresponding to encode_mode
+    encode_mode_ed_group = retrieve_correct_beam_group(
+        echodata, waveform_mode, encode_mode, pulse_compression
+    )
+
+    # check that source_Sv at least has a channel dimension
+    if "channel" not in source_Sv.variables:
+        raise ValueError("The input source_Sv Dataset must have a channel dimension!")
+
+    # set ds_beam, select the same channels that are in source_Sv
+    ds_beam = echodata[encode_mode_ed_group].sel(channel=source_Sv.channel.values)
+
+    # fail if source_Sv and ds_beam do not have the same lengths
+    # for ping_time, range_sample, and channel
+    same_dim_lens = [
+        ds_beam.dims[dim] == source_Sv.dims[dim] for dim in ["channel", "ping_time", "range_sample"]
+    ]
+    if not same_dim_lens:
+        raise ValueError(
+            "Input source_Sv does not have the same dimension lengths as all dimensions in ds_beam!"
+        )
+
+    # obtain split-beam angles from
+    # CW mode data
+    if waveform_mode == "CW":
+        if encode_mode == "power":  # power data
+            theta, phi = get_angle_power_CW(ds_beam=ds_beam)
+        else:  # complex data
+            theta, phi = get_angle_complex_CW(ds_beam=ds_beam)
+    # BB mode data
+    else:
+        if pulse_compression:  # with pulse compression
+            theta, phi = get_angle_complex_BB_pc(ds_beam=ds_beam)
+        else:  # without pulse compression
+            theta, phi = get_angle_complex_BB_nopc(ds_beam=ds_beam, ed=echodata)
+
+    # add theta and phi to source_Sv input
+    source_Sv = add_angle_to_ds(
+        theta, phi, source_Sv, return_dataset, source_Sv_path, file_type, storage_options
+    )
+
+    return source_Sv