Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linear samples not matching time series 'valid observation' threshold #351

Open
adeane-ga opened this issue Aug 19, 2021 · 0 comments
Open
Labels

Comments

@adeane-ga
Copy link
Contributor

adeane-ga commented Aug 19, 2021

Description

  • There is per interferogram masking that occurs in previous steps which reduces the total number of usable observation in the time series step.
  • In the configuration file we have a parameter (ts_pthr) that describes the user set threshold for minimum number of valid observations allowed for a time series and rate calculation to be performed.
  • We are seeing areas with less observations than the set threshold based on the resulting linear_samples.tif.

Expected behaviour
If we set ts_pthr to 10, then we should see a velocity map that includes results only for those pixels where there are 10 or more observations.

Summary of findings

  • The input parameter ts_pthr is being used correctly in step to calculate time series.
  • But after this, more observations are removed beyond this threshold when the linear rate is calculated because we are removing rank deficient rows before calculating the linear rate.
  • Hence we can still end up with a velocity calculation based on only 2 observations, even if our user set threshold is higher than that.

Where is this happening?
These code links are from the time_series.py script

  • Line 180 shows where the if condition is that uses the user set threshold to calculate time series or not.

  • Line 191 calls a function to then remove rank deficient rows.

    def _time_series_pixel(row, col, b0_mat, sm_factor, sm_order, ifg_data, mst,
    nvelpar, p_thresh, interp, vcmt, method):
    """
    Wrapper function to compute time series for single pixel.
    """
    # check pixel for non-redundant ifgs
    sel = np.nonzero(mst[:, row, col])[0] # trues in mst are chosen
    if len(sel) >= p_thresh:
    ifgv = ifg_data[sel, row, col]
    # make design matrix, b_mat
    b_mat = b0_mat[sel, :]
    if interp == 0:
    # remove rank deficient rows
    rmrow = asarray([0]) # dummy
    while len(rmrow) > 0:
    # if b_mat.shape[0] <=1 then we return nans
    if b_mat.shape[0] > 1:
    b_mat, ifgv, sel, rmrow = _remove_rank_def_rows(
    b_mat, nvelpar, ifgv, sel)
    else:
    return np.empty(nvelpar) * np.nan
    # Some epochs have been deleted; get valid epoch indices
    velflag = sum(abs(b_mat), 0)
    # remove corresponding columns in design matrix
    b_mat = b_mat[:, ~np.isclose(velflag, 0.0)]
    else:
    velflag = np.ones(nvelpar)
    if method == 1: # Use Laplacian smoothing method
    tsvel = _solve_ts_lap(nvelpar, velflag, ifgv, b_mat,
    sm_order, sm_factor, sel, vcmt)
    elif method == 2: # Use SVD method
    tsvel = _solve_ts_svd(nvelpar, velflag, ifgv, b_mat)
    else:
    raise TimeSeriesError("Unrecognised time series method")
    return tsvel
    else:
    return np.empty(nvelpar) * np.nan

  • Further along when the linear rate is being calculated, the number of samples is returned from line 322.

  • This is the final number of samples used after previous steps, and this can be below the user set threshold.

    def linear_rate_pixel(y, t):
    """
    Calculate best fitting linear rate to cumulative displacement
    time series for one pixel using linear regression.
    :param ndarray y: 1-dimensional vector of cumulative displacement time series
    :param ndarray t: 1-dimensional vector of cumulative time at each epoch
    :return: linrate: Linear rate; gradient of fitted line.
    :rtype: float
    :return intercept: Y-axis intercept of fitted line at t = 0
    :rtype: float
    :return: rsquared: R-squared value of the linear regression
    :rtype: float
    :return: error: Standard error of the linear regression
    :rtype: float
    :return: samples: Number of observations used in linear regression
    :rtype: int
    """
    # Mask to exclude nan elements
    mask = ~isnan(y)
    # remove nan elements from both arrays
    y = y[mask]
    try:
    t = t[mask]
    except IndexError:
    raise TimeSeriesError("linear_rate_pixel: y and t are not equal length")
    # break out of func if not enough time series obs for line fitting
    nsamp = len(y)
    if nsamp < 2:
    return nan, nan, nan, nan, nan
    # compute linear regression of tscuml
    linrate, intercept, r_value, p_value, std_err = linregress(t, y)
    return linrate, intercept, r_value**2, std_err, int(nsamp)

Data Example
Here I am working on a small cropped data set to document issue. There are 17 IFGs and 8 time series epochs. I have ts_pthr set to 4. So our time series and rates should only include 4 or more obervations per pixel. But we see that there is as little as two for some pixels.

Linear rate map:
image

Linear samples map:
image

Print out of relevant variables
The array sel is the IFGs after MST selection. I printed the length of this out per pixel after the if statement which uses the ts_pthr value of 4 as a threshold:

  • note that there are no results for a query below 4.

image

I did the same thing for the variable nsamp which is the final amount of samples in a pixel:

  • note there are now pixels with samples below 4.

image

Remaining questions

  • is the way to solve this to repeat the ts_pthr threshold at the rate calculation step as well?
  • is the threshold operating on total number of observations, or number of sequential observations? I think its the former.
@adeane-ga adeane-ga added the bug label Sep 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant