Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Krill inversion component sketch #321

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 18 commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
34be8cc
Update `requirements-dev.txt`
brandynlucca Nov 8, 2024
84d616c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 8, 2024
f87c582
Add `jupyter-book` tests to `tox`
brandynlucca Nov 8, 2024
f8cdfa3
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 8, 2024
8b1e97f
Merge branch 'OSOceanAcoustics:main' into main
brandynlucca Nov 8, 2024
0ee25b9
Update example notebook
brandynlucca Nov 16, 2024
fee0021
Fix validation function
brandynlucca Nov 16, 2024
f320013
Merge branch 'main' of https://github.com/brandynlucca/echopop
brandynlucca Nov 25, 2024
fd8d357
Merge branch 'main' of https://github.com/brandynlucca/echopop
brandynlucca Nov 27, 2024
21f89ea
Merge branch 'main' of https://github.com/brandynlucca/echopop
brandynlucca Dec 10, 2024
921d32a
Merge branch 'main' of https://github.com/brandynlucca/echopop
brandynlucca Dec 24, 2024
184578e
Initial scoping
brandynlucca Dec 31, 2024
3698253
Merge branch 'main' of https://github.com/brandynlucca/echopop into i…
brandynlucca Dec 31, 2024
38921f4
Further scoping
brandynlucca Jan 6, 2025
0e92b69
Some typing edits
brandynlucca Jan 7, 2025
69cd8ec
Add comment steps for averaging functions
brandynlucca Jan 7, 2025
9670fbd
Add patching and small doc updates elsewhere
brandynlucca Jan 7, 2025
f499a1c
Adjustments based on writeup
brandynlucca Jan 9, 2025
845a4e4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 10, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Empty file.
98 changes: 98 additions & 0 deletions echopop/extensions/inversion/data_ingestion.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
from pydantic import BaseModel
from pandera import DataFrameModel
from lmfit import Parameters
from typing import Any, Dict, List, Union

import pandas as pd
from pathlib import Path

####################################################################################################
# Validation / preparation
####################################################################################################

class inversion_configuration_validator(BaseModel):
"""
Pydantic model for validating configuration parameters
"""

# RETURNS: Dict[str, Any]
pass

class dataset_validator(DataFrameModel):
"""
Pandera model for validating dataset values
"""

# RETURNS: pd.DataFrame
pass

def prepare_scattering_model_inputs(scattering_config: Dict[str, Any]) -> Dict[str, Any]:
"""
Prepare scattering model parameter inputs
"""
# == functions/set_para.m
# == functions/inversion_para.m

# PHASE 1) INGEST VALUES FROM CONFIGURATION FILE
# PHASE 2) VALIDATE USING `inversion_configuration_validator`
# PHASE 3) COMPUTE INTERMEDIATE VARIABLES (e.g. acoustic wavenumber, position matrix)
# PHASE 4) PASS TO SCATTERER CLASS
# --> EXTERNAL TO THIS FUNCTION

# RETURNS: Validated scattering model inputs
pass

def prepare_dataset(dataset: pd.DataFrame) -> Dict[str, Any]:
"""
Prepare dataset inputs
"""

# PHASE 1) INGEST DATASET (*.xlsx)
# PHASE 2) VALIDATE USING `dataset_validator`
# PHASE 3) PARTITION DATASET BASED ON DIFFERENT ECHOMETRICS (e.g. mean Sv, median Sv)

# RETURNS: Validated dataset DataFrame objects used for inversion
pass

def prepare_inversion_settings(inversion_config: Dict[str, Any]) -> Dict[str, Any]:
"""
Prepare inversion configuration and parameterization
"""

# PHASE 1) INGEST VALUES FROM CONFIGURATION FILE
# PHASE 2) VALIDATE USING `inversion_configuration_validator`
# PHASE 3) COMPUTE INTERMEDIATE VARIABLES (e.g. acoustic wavenumber, position matrix)

# RETURNS: Validated inversion and optimization parameters
pass


####################################################################################################
# Data ingestion
####################################################################################################

def yaml_configuration_reader(config_file: Union[str, Path]) -> Dict[str, Union[float, int, Parameters, pd.DataFrame, str]]:
"""
Read and validate the input parameterization YAML configuration
"""
# == functions/load_para_data.m
# == functions/load_geo_phy_para.m
# == functions/get_simu_para.m

# PHASE 1) READ CONFIGURATION FILE

# RETURNS: Raw Dict[str, Any]
pass

def dataset_reader(data_file: Union[str, Path]) -> pd.DataFrame:
"""
Read aggregate acoustic backscatter measurements
"""
# == functions/get_acoustic_data.m
# == functions/load_MOCNESS_data.m
# == functions/load_BIOMAPPER_data.m

# PHASE 1) READ IN FILES

# RETURNS: Raw pd.DataFrame (or Dict[str, Any]: see `prepare_dataset`)
pass
43 changes: 43 additions & 0 deletions echopop/extensions/inversion/invert.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
from typing import Any, Callable, Dict, Literal, Union
from numpy.typing import ArrayLike
import pandas as pd

def normalize_scattering_model_parameters(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can go into optimize.py since the output of the optimizer is doing the "inversion" so can just be packaged there.

scattering_model_parameters: Dict[str, Any],
) -> Dict[str, Any]:
"""
Normalize the scattering model parameters
"""
# == model_para_conversion.m
pass

def Sv_prediction_error(
measured_Sv: ArrayLike[float],
predicted_Sv: ArrayLike[float],
):
"""
Compute inverted volumetric backscattering strength ($S[v]$) prediction error
"""
pass

def invert_population(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This invert_population and Sv_prediction_error are not strictly part of the "inversion" but are derived computations. Let me read through the rest of doc and see where these might go.

measured_Sv: ArrayLike[float],
predicted_Sv: ArrayLike[float],
inverted_ts: ArrayLike[float],
kwargs # other parameters
) -> ArrayLike[float]: # or just a full DataFrame given the multiple estimates being calculated
"""
Generate population estimates based on inverted TS model parameters
"""

# PHASE 1) MEAN NUMBER DENSITY
# PHASE 2) AREAL NUMBER DENSITY
# PHASE 3) ABUNDANCE
# PHASE 4) ANIMAL BODY DENSITY (g/cm^3)
# PHASE 5) BIOMASS
# PHASE 6) AREAL BIOMASS DENSITY
# PHASE 7) COMPUTE TOTAL PREDICTION ERROR ("Qe")
total_error = Sv_prediction_error(measured_Sv, predicted_Sv)

# RETURNS: Array or DataFrame ot population estimates
pass
120 changes: 120 additions & 0 deletions echopop/extensions/inversion/math.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
"""
Mathematical and numerical utility functions.
"""

from scipy.special import spherical_jn, spherical_yn
import numpy as np
from typing import Any, Dict, Literal
from numpy.typing import ArrayLike

def spherical_hn(n, z, derivative=False) -> ArrayLike:
"""
Spherical Bessel function of the third kind (Hankel function) or its derivative

Defined as [1]_,

.. math:: h_n^{(1)}(z)=j_n(z)+in_n(z),

where :math:`h_n^{(1)}` is the spherical Bessel function of the third kind (or Hankel function
of the first kind), :math:`j_n` is the spherical Bessel function of the first kind, :math:`n_n`
is the spherical Bessel function of the second kind (or Neumann function), :math:`n` is the
order of the function (:math:`n>=0`), :math:`z` is the Bessel function argument value, and
:math:`i` is an imaginary number.

Parameters
----------
n: int
Order of the Bessel function (n >= 0)
z: Union[float, np.complex]
Argument of the Bessel function
derivative: Optional[bool]
When True, the derivative is computed

Notes
-----
The derivative is computed using the relations [2]_,

.. math::
\frac{n}{z} h^{(1)}_n - h^{(1)}_{n+1}(z)

References
----------
.. [1] https://dlmf.nist.gov/10.47#E5
.. [2] https://dlmf.nist.gov/10.51#E2

"""
# == lib/sphhn.m

# Define internal function
def _spherical_hn(n, z):
return spherical_jn(n, z) + 1j * spherical_yn(n, z)

# Computing derivative
if derivative:
return (n/z) * _spherical_hn(n, z) - _spherical_hn(n+1, z)
else:
return _spherical_hn(n, z)

def length_average(
length: ArrayLike[float],
form_function: ArrayLike[complex],
distribution_kwargs: Dict[str, float],
distribution: Literal["gaussian", "uniform"] = "gaussian",
) -> ArrayLike:
"""
Compute the length-averaged linear backscattering cross-section (:math:`\sigma_{bs}(L)`)
"""
# == Scat_models/length_ave.m

# PHASE 1) EXTRACT RELEVANT PARAMETERS (e.g. ka)
# PHASE 2) GENERATE PDF BASED ON SELECTED DISTRIBUTION
if distribution == "gaussian":
pass
elif distribution == "uniform":
pass
else:
raise ValueError("Invalid distribution type. Choose 'gaussian' or 'uniform'.")
# PHASE 3) SQUARE SIGMA_BS
# PHASE 4) COMPUTE SIGMA_BS OVER CONFIGURED PDF BINS AT EACH DEFINED FREQUENCY

# RETURNS: sqrt(sum(sigma_bs))
pass

def orientation_average(
angle: ArrayLike[float],
form_function: ArrayLike[complex],
distribution_kwargs: Dict[str, float],
distribution: Literal["gaussian", "uniform"] = "gaussian",
) -> ArrayLike:
"""
Compute the orientation-averaged linear backscattering cross-section :math:`\sigma_{bs}(\theta)`
"""
# == Scat_models/orient_ave.m

# PHASE 1) EXTRACT RELEVANT PARAMETERS (e.g. ka)
# PHASE 2) GENERATE PDF BASED ON SELECTED DISTRIBUTION
if distribution == "gaussian":
pass
elif distribution == "uniform":
pass
else:
raise ValueError("Invalid distribution type. Choose 'gaussian' or 'uniform'.")
# PHASE 3) SQUARE SIGMA_BS
# PHASE 4) COMPUTE SIGMA_BS OVER CONFIGURED PDF BINS AT EACH DEFINED FREQUENCY

# RETURNS: sqrt(sum(sigma_bs))
pass

def fit_rayleigh_pdf(
measured: ArrayLike[float],
density: ArrayLike[float],
mean: float,
standard_deviation: float,
lower_bounds: float,
upper_bounds: float,
arg_distribution: Literal["exponential", "gaussian"] = "gaussian",
):
"""
Fit a single-parameter Rayleigh probability density function to the measured data
"""
pass
75 changes: 75 additions & 0 deletions echopop/extensions/inversion/optimize.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
import numpy as np
from lmfit import Minimizer, Parameters
from typing import Any, Callable, Dict, Literal, Union
from numpy.typing import ArrayLike

def mae(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be mad? it's used below in prepare_optimization

prediction: ArrayLike[float],
measurement: ArrayLike[float],
):
"""
Mean absolute deviation (MAD) in logarithmic space (dB)
"""
# == functions/cost_functionALL.m
pass

def rmse(
prediction: ArrayLike[float],
measurement: ArrayLike[float],
):
"""
Root mean square deviation (RMSE) in logarithmic space (dB)
"""
# == functions/cost_functionALL.m
pass

def normalize_optimization_parameters(parameters: Dict[str, Any]) -> Dict[str, Any]:
"""
Normalize the optimization parameters
"""
pass

def prepare_optimization(
scattering_model_parameters: Dict[str, Any],
optimization_settings: Dict[str, Any],
cost_function: Callable = mad,
) -> Dict[str, Union[Minimizer, Parameters]]:
"""
Prepare optimization settings
"""

# PHASE 1) EXTRACT RELEVANT SCATTERING MODEL PARAMETERS
# PHASE 2) CONVERT RELEVANT OPTIMIZATION PARAMETERS INTO ASSOCIATED `lmfit::Parameters`
params = Parameters(**scattering_model_parameters) # not actual code, just a placeholder
# PHASE 3) WITH COST-FUNCTION, CREATE `lmfit::Minimizer` OBJECT
# not actual code, just a placeholder
minim = Minimizer(cost_function, params, **optimization_settings)
# RETURNS: Dictionary with optimization parameters and minimizer
return {"parameters": params, "minimizer": minim}

def optimize_scattering_model(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be renamed to something like optimize_model_parameters, since it is not the model that gets optimized, but the parameters.

predicted_Sv: ArrayLike[float],
measured_Sv: ArrayLike[float],
parameters: Parameters,
cost_function: Minimizer,
optimization_settings: Dict[str, Any],
) -> Dict[str, Any]:
"""
Optimize scattering model parameters
"""
# == functions/SVpredictionALL.m
# == KrillSvInversion_simu_data_2020_05_01.m

# PHASE 1) RUN OPTIMIZATION
# not actual code, just a placeholder
parameters_optimized = cost_function.minimize(
method="least_squares",
**optimization_settings["config"]
)
# PHASE 2) CALCULATE MEAN ABSOLUTE DEVIATION
mad_optimized = np.mean(np.abs(parameters_optimized.residual))
# PHASE 3) EXTRACT THE BEST-FIT PARAMETERS
best_fit_params = parameters_optimized.params.valuesdict()

# RETURNS: Best-fit scattering model parameters
return best_fit_params
44 changes: 44 additions & 0 deletions echopop/extensions/inversion/patcher.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
from pathlib import Path
import pandas as pd
from typing import Any, Dict, Union
from ...survey import Survey

def inversion_pipeline(
Copy link
Member Author

@leewujung leewujung Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way things are designed now are completely separate from the hake component, which is not what I envisioned. I think it is fine as an intermediate step, but ultimately a more modularized framework will be needed to make the package more useful as an open-source project, and also remove the need for most "patcher" type of functions. Let's talk more about this when we meet.

acoustic_dataset_file: Union[str, Path],
scattering_config_file: Union[str, Path],
inversion_config_file: Union[str, Path],
) -> Dict[str, Any]:
"""
Consolidated workflow for predicting volumetric backscatter using inverted scattering model
parameters/inputs
"""

# PHASE 1) READ IN DATASET FILE
# PHASE 2) READ IN CONFIGURATION FILES
# PHASE 3) PREPARE DATASET FOR INVERSION
# PHASE 4) PREPARE CONFIGURATION SETTINGS FOR OPTIMIZATION CALCULATIONS
# PHASE 5) PREPARE SCATTERING MODEL AND OBJECT
# PHASE 6) INVERT SCATTERING MODEL
# PHASE 7) CALCULATE POPULATION ESTIMATES

# RETURNS: A dictionary with grouped DataFrame (or new columns appended to acoustic dataset
# columns?) objects that incorporate the estimated population estimates from the inverted
# scattering model. One key in this output would also comprise the simulation results from the
# optimization for user scrutinization
pass

def inversion_survey_patch(
self: Survey,
acoustic_dataset_file: Union[str, Path],
scattering_config_file: Union[str, Path],
inversion_config_file: Union[str, Path],
) -> None:
"""
Patching method to add `inversion_pipeline` function as a method to the base `echopop::Survey`
class
"""

# NOTE: This would be patched using the import functions defined in
# `extentions/survey_extentions.py`

return inversion_pipeline(acoustic_dataset_file, scattering_config_file, inversion_config_file)
Loading
Loading