Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: Residual range limit #316

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions docs/config_yaml.md
Original file line number Diff line number Diff line change
Expand Up @@ -114,8 +114,8 @@ The information that should be defined in the config.yaml file is as follows:
- **Array Center** The approximate center of the array.
- **Transponders** A priori information for each transponder in the array.
GNATSS is currently configured to operate on arrays of 3 transponders. GNATSS
will assign names to the transponders based upn the order they are included in
the configuration file. The first entry will be named "SITE-1", the second
will assign names to the transponders based upon the order they are included
in the configuration file. The first entry will be named "SITE-1", the second
entry "SITE-2", etc. The following information must be provided:
- lat: The latitude of the transponder in decimal degrees
- lon: The longitude of the transponder in decimal degrees
Expand Down
4 changes: 2 additions & 2 deletions docs/gnssa_preprocessing.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,11 +19,11 @@ Furthermore, there is a discrepancy in the time sampling of these instruments
and the epochs when the transducer receives an acoustic reply since the two-way
travel times are of arbitrary length dependent on the state of the ocean at any
particular moment. GNATSS is tuned to process data collected by the Sonardyne
GNSS-Acoutic payload mounted on an LRI model SV-3 wave glider, as currently
GNSS-Acoustic payload mounted on an LRI model SV-3 wave glider, as currently
employed in the Near Trench Community Geodetic Experiment. This instrument
collects the following data:

- Psuedorange data at a 10 Hz sampling rate, which the user may choose to
- Pseudorange data at a 10 Hz sampling rate, which the user may choose to
process at 1 or 10 HZ for GNSS antenna positions
- Wave glider velocities and roll, pitch, heading values at a 20 Hz sampling
rate
Expand Down
18 changes: 13 additions & 5 deletions docs/input.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
GNATSS has two processing modes, a "posfilter" mode and a "solver" mode, which
are used at different phases of the processing chain. It is possible to run both
processing modes in sequence or individually. The posfilter module outputs
GNSS-Acosutic travel times and transducer positions in the community standard
GNSS-Acoustic travel times and transducer positions in the community standard
GNSS-Acoustic data format, so if you have obtained data already in that format
you may skip the posfilter mode and immediately run the solver.

Expand All @@ -28,10 +28,17 @@ Step 4: Run the solver to calculate an array position
gnatss run --solver --distance-limit 150 --residual-limit 10000 config.yaml
```

Step 5: Repeat Step 4, reducing the residual limit as desired in each successive
iteration in order to remove erroneous residuals
Step 5: Run the solver again to with the _--remove-outliers_ option to remove
flagged residuals.

Step 6: Array positions, offsets, and statistics are stored in the
```bash
gnatss run --solver --distance-limit 150 --remove-outliers config.yaml
```

Step 6: Repeat Steps 4-5, reducing the residual limit as desired in each
successive iteration in order to remove erroneous residuals.

Step 7: Array positions, offsets, and statistics are stored in the
process_dataset.nc file.

_Alternative_: GNATSS supports end-to-end processing, so you may execute Steps 3
Expand All @@ -42,8 +49,9 @@ to skip the posfilter and immediately load gps solutions from your output
directory.

```bash
gnatss run --distance-limit 150 --residual-limit 10000 config.yaml
gnatss run --distance-limit 150 config.yaml
gnatss run --from-cache --distance-limit 150 --residual-limit 10000 config.yaml
gnatss run --from-cache --distance-limit 150 --remove-outliers config.yaml
```

## Required Input Data
Expand Down
5 changes: 3 additions & 2 deletions docs/install.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,8 @@
# Installation Guide

GNATSS requires Python v3.10 or greater to run. If you have a conda
installation, we recommend creating a fresh environment to install GNATSS in.
GNATSS requires Python v3.10 or v3.11 to run. It is not currently compatible
with Python 3.12. If you have a conda installation, we recommend creating a
fresh environment to install GNATSS in.

```
conda create -n gnatss -c conda-forge --yes python=3.10 ipykernel
Expand Down
120 changes: 84 additions & 36 deletions docs/position_inversion.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,13 +15,13 @@ and place a configuration file inside of it. This configuration file may be
edited from the template available on the [GNATSS Config File](./config_yaml.md)
page or the [GNATSS GitHub](https://github.com/seafloor-geodesy/gnatss). You may
name this file whatever you like, but for the purposes of this example, we will
name the configuration file _NDP1_2022.yaml_. Edit _NDP1_2022.yaml_ to reflect
the a priori site information and locations of the input data files. Once you
have done this, run GNATSS in the working directory with the following command
on the command line interface:
name the configuration file _NDP1_2022_config.yaml_. Edit
_NDP1_2022_config.yaml_ to reflect the a priori site information and locations
of the input data files. Once you have done this, run GNATSS in the working
directory with the following command on the command line interface:

```
gnatss run --distance-limit 150 NDP1_2022.yaml
gnatss run --distance-limit 150 NDP1_2022_config.yaml
```

After a successful run, GNATSS will generate an _output_ directory defined in
Expand All @@ -32,7 +32,7 @@ the following files:
This file is only generated when the user runs the _posfilter_ mode of GNATSS.
- `process_dataset.nc`: NetCDF file with array offsets.
- `residuals.csv`: File with the GNSS-Acoustic residuals, the difference between
the measured and modelled two-way travel times for each ping. Residual values
the measured and modeled two-way travel times for each ping. Residual values
are converted to centimeters from seconds with the mean sound velocity.
- `deletions.csv`: File with a list of residuals with poor data quality, to be
removed the next time gnatss is executed.
Expand Down Expand Up @@ -78,12 +78,12 @@ _gps_solution.csv_ file in the output directory. We don't even need to update
the configuration file!

```
gnatss run --from-cache --distance-limit 150 --residual-limit 500 NDP1_2022.yaml
gnatss run --from-cache --distance-limit 150 --residual-limit 500 NDP1_2022_config.yaml
```

After doing this, you will notice that the solution itself has not changed.
However, if we open look a the residuals again, we can see that the vertical
spkies due to the GNSS cycle slips have been flagged.
spikes due to the GNSS cycle slips have been flagged.

![residuals_2](./residuals_2.png)

Expand All @@ -101,7 +101,7 @@ the "--remove-outliers" flag, which will direct it to automatically remove any
flagged residuals:

```
gnatss run --from-cache --distance-limit 150 --remove-outliers NDP1_2022.yaml
gnatss run --from-cache --distance-limit 150 --remove-outliers NDP1_2022_config.yaml
```

GNATSS has now removed the residuals that had been flagged during the previous
Expand All @@ -113,55 +113,103 @@ There are still some outlier residuals, so let's try tightening the outlier
threshold. The strategy used to flag and remove residuals is up to the user's
discretion. In this case, there is a fairly consistent band of residuals near 0
cm that I interpret as the primary geodetic signal within the time series. I
choose a strict outlier threshold of 30 cm in order to isolate that signal.
choose a strict outlier threshold of 100 cm in order to isolate that signal.

```
gnatss run --from-cache --distance-limit 150 --residual-limit 30 NDP1_2022.yaml
gnatss run --from-cache --distance-limit 150 --residual-limit 100 NDP1_2022_config.yaml
```

This flags the following residuals:

![residuals_4](./residuals_4.png)

Since I am satisfied that only outliers from the main time series are being
flagged, I lock in the deletions by running GNATSS again. In this case I'll call
both the "--remove-outliers" flag and the "--residual-limit" flag, although I
won't change the residual limit.
flagged, I lock in the deletions by running GNATSS again.

```
gnatss run --from-cache --distance-limit 150 --residual-limit 30 --remove-outliers NDP1_2022.yaml
gnatss run --from-cache --distance-limit 150 --remove-outliers NDP1_2022_config.yaml
```

This generates the following residual plot:

![residuals_5](./residuals_5.png)

Notice how new residuals have been flagged! Since the final position is
generated by averaging all of the residuals in the time series, outliers can
bias the result. Because of this, when we removed the previous outliers, the
mean position shifted enough that residuals just below the threshold were pushed
beyond it. However, although it is possible to direct GNATSS to remove outliers
and flag new residuals in a single run, we do not recommend doing so since it
increases the risk of the user accidentally flagging good data. In the case that
you flag residuals you do not want to discard, you may either manually delete
the _outliers.csv_ file or run gnatss again with a new "--residual-limit flag"
to update _outliers.csv_. As long as you do not execute GNATSS with the
"--remove-outliers" flag, any flagged residuals will not be removed from the
positioning.

I choose to run GNATSS a few more times in order to remove all of the residuals
right at the 30 cm threshold until no new residuals are flagged. The final
residual plot looks like this:

![residuals_6](./residuals_6.png)

There is a definitive cyclical signal within these residuals due to
oceanographic effects. However, the important thing to note is that all three
residual time series for the transponders are capturing similar signals from the
ocean and plotting on top of each other. Because of this, if we average the
residuals over space to estimate the offstes of the array center, we will see
residuals over space to estimate the offsets of the array center, we will see
that this cyclical signal mostly maps in to the vertical component and that the
horizontal offsets are pretty stable. This can be verified in the
_residuals_enu_components.png_ plot:

![residuals_enu_components_6](./residuals_enu_components_6.png)
![residuals_enu_5](./residuals_enu_5.png)

However, even though the apparent east and north positions are reasonably stable
there are some large excursions on 2022-08-03. It is unlikely that the array
moved many centimeters and then returned to its original position within the
span of a single day without triggering a seismic response, so it is likely this
signal is caused by our assumptions in the GNSS-A positioning briefly breaking
down. This could be due to oceanographic effects (such as from eddies) or errors
in the GNSS positions. Unfortunately, the poor residuals that occur during
2022-08-03 have a similar magnitude as many of the good residuals occurring
earlier in the data set, so if we naively set an outlier threshold as we have
previously done, we will risk flagging a volume of good data. To further refine
this solution, we will have to adopt a different strategy for flagging
residuals.

## Alternative Residual Flagging

GNATSS has an alternative method for flagging residuals, which is the
"--residual-range-limit" option. This option accepts a user-defined range in cm,
which defines a threshold for the maximum and minimum residuals for a given
ping. If the range, defined as the $\Delta a_{max}-\Delta a_{min}$, is greater
than the user-defined threshold, then the residuals for that ping will be
flagged for removal.

This works because of the underlying assumption in GNATSS processing that any
oceanographic variation must occur in the upper water column, where the acoustic
rays to each transponder in the array are close to one another when surveying
from the array center. If this assumption is true, each ray captures a similar
time delay due to the oceanographic generation so the residuals will be very
close. When this assumption starts to break down, the residuals will no longer
be close since they will be capturing different acoustic delays and so the
difference between the maximum and minimum residual increases. Oceanographic
phenomena that may cause these kinds of anisotropic delays include ocean
currents (particularly deep currents), eddies, or internal tides and waves, all
of which may cause horizontal gradients within the water column.

We can demonstrate this flagging strategy with the 2022 NDP1 data set we left in
the previous section. Starting from where we left off, having flagged residuals
that were >100 cm, we first observe the residuals and estimate how far outlier
replies are from replies received from the same ping. In this data set, it
appears that when the replies diverge from one another that they are separated
by >25 cm, so we will set this as our threshold:

```
gnatss run --from-cache --distance-limit 150 --residual-range-limit 25 NDP1_2022_config.yaml
```

This command will flag the following residuals:

![residuals_6](./residuals_6.png)

Notice how this has flagged many of the poor residuals from before. Upon
removing them, we arrive at the following:

![residuals_7](./residuals_7.png)

We can verify that this has removed many of the apparent excursions from the
average position with the ENU residual plot.

![residuals_enu_7](./residuals_enu_7.png)

From here we can keep tuning the solution by flagging more residuals until we
are satisfied that we have removed many of the outliers that may bias the
solution. In general, a good strategy is to flag residuals conservatively so
that you don't accidentally remove good data. This is particularly important
after the first run of GNATSS, since very large (~1000s of cm) residuals can
bias the average position so much that the primary residual time series diverge.
However, if you remove only the largest residuals, the residuals will usually
converge. Because of this, it is also good practice to verify which residuals
you wish to flag and remove by looking at the residual plots beforehand.
Binary file modified docs/residuals_1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/residuals_2.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/residuals_3.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/residuals_4.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/residuals_5.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/residuals_6.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/residuals_7.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/residuals_enu_5.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/residuals_enu_7.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file removed docs/residuals_enu_components_6.png
Binary file not shown.
4 changes: 2 additions & 2 deletions docs/travel_time_delays.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

In general, a GNSS-Acoustic Two-Way Travel Time (TWTT) measurement will include
two additional timing delays in addition to the time when the acoustic pulse is
travelling gthrough the water column. We refer to these as:
traveling through the water column. We refer to these as:

- **The Transducer Delay Time** The delay time of the surface platform between
an interrogate command being sent to the transducer and the interrogation ping
Expand Down Expand Up @@ -31,7 +31,7 @@ phases:
In GNSS-Acoustic processing, the critical transducer positions that must be
known are the position at ping send (Phase 3) and ping reply (Phase 9) since
these positions are when the acoustic pulse physically starts and finishes
travelling through the water column to and from the transducer. Consistent with
traveling through the water column to and from the transducer. Consistent with
this, by convention the TAT is thus included in the TWTT measurement while the
transducer delay is excluded from the TWTT measurement. GNATSS requires the
TWTTs submitted in the _pxp_tt_ file to adhere to this convention, in which case
Expand Down
7 changes: 7 additions & 0 deletions src/gnatss/cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,12 @@ def run(
None,
help=(f"{Solver.model_fields.get('residual_limit').description}" f". {OVERRIDE_MESSAGE}"),
),
residual_range_limit: float = typer.Option(
None,
help=(
f"{Solver.model_fields.get('residual_range_limit').description}" f". {OVERRIDE_MESSAGE}"
),
),
qc: bool = typer.Option(
True, help="Flag to plot residuals from run and store in output folder."
),
Expand Down Expand Up @@ -101,6 +107,7 @@ def run(
config_yaml=config_yaml,
distance_limit=distance_limit,
residual_limit=residual_limit,
residual_range_limit=residual_range_limit,
outlier_threshold=outlier_threshold,
from_cache=from_cache,
remove_outliers=remove_outliers,
Expand Down
9 changes: 9 additions & 0 deletions src/gnatss/configs/solver.py
Original file line number Diff line number Diff line change
Expand Up @@ -171,6 +171,15 @@ class Solver(BaseModel):
"which data points will be excluded from solution"
),
)
residual_range_limit: float = Field(
20000.0,
ge=0.0,
description=(
"Maximum residual range (maximum - minimum) "
"in centimeters for a given epoch, beyond which "
"data points will be excluded from solution"
),
)
residual_outliers_threshold: float = Field(
25.0,
ge=0.0,
Expand Down
8 changes: 8 additions & 0 deletions src/gnatss/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ def run_gnatss(
config_yaml: str,
distance_limit: float | None = None,
residual_limit: float | None = None,
residual_range_limit: float | None = None,
outlier_threshold: float | None = None,
from_cache: bool = False,
return_raw: bool = False,
Expand Down Expand Up @@ -45,6 +46,12 @@ def run_gnatss(

*Setting this argument will override the value set in the configuration.*

residual_range_limit : float | None, optional
Maximum residual range (maximum - minimum) in centimeters for
a given epoch, beyond which data points will be excluded from solution

*Setting this argument will override the value set in the configuration.*

outlier_threshold : float | None, optional
Residual outliers threshold acceptable before throwing an error in percent

Expand Down Expand Up @@ -95,6 +102,7 @@ def run_gnatss(
config_yaml,
distance_limit=distance_limit,
residual_limit=residual_limit,
residual_range_limit=residual_range_limit,
outlier_threshold=outlier_threshold,
from_cache=from_cache,
remove_outliers=remove_outliers,
Expand Down
2 changes: 2 additions & 0 deletions src/gnatss/ops/data.py
Original file line number Diff line number Diff line change
Expand Up @@ -405,6 +405,7 @@ def data_loading(
config_yaml: str,
distance_limit: float | None = None,
residual_limit: float | None = None,
residual_range_limit: float | None = None,
outlier_threshold: float | None = None,
from_cache: bool = False,
remove_outliers: bool = False,
Expand All @@ -415,6 +416,7 @@ def data_loading(
config_yaml,
distance_limit=distance_limit,
residual_limit=residual_limit,
residual_range_limit=residual_range_limit,
outlier_threshold=outlier_threshold,
)
# Switch off cache if gps solution does not exist
Expand Down
6 changes: 6 additions & 0 deletions src/gnatss/ops/io.py
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,7 @@
config,
distance_limit: float | None = None,
residual_limit: float | None = None,
residual_range_limit: float | None = None,
residual_outliers_threshold: float | None = None,
):
# Override the distance and residual limits if provided
Expand All @@ -108,6 +109,9 @@
if residual_outliers_threshold is not None:
config.solver.residual_outliers_threshold = residual_outliers_threshold

if residual_range_limit is not None:
config.solver.residual_range_limit = residual_range_limit

Check warning on line 113 in src/gnatss/ops/io.py

View check run for this annotation

Codecov / codecov/patch

src/gnatss/ops/io.py#L113

Added line #L113 was not covered by tests

return config


Expand Down Expand Up @@ -189,13 +193,15 @@
config_yaml: str,
distance_limit: float | None = None,
residual_limit: float | None = None,
residual_range_limit: float | None = None,
outlier_threshold: float | None = None,
):
config = load_configuration(config_yaml)
return set_limits(
config,
distance_limit=distance_limit,
residual_limit=residual_limit,
residual_range_limit=residual_range_limit,
residual_outliers_threshold=outlier_threshold,
)

Expand Down
Loading