Linear samples not matching time series 'valid observation' threshold #351

adeane-ga · 2021-08-19T02:01:35Z

Description

There is per interferogram masking that occurs in previous steps which reduces the total number of usable observation in the time series step.
In the configuration file we have a parameter (ts_pthr) that describes the user set threshold for minimum number of valid observations allowed for a time series and rate calculation to be performed.
We are seeing areas with less observations than the set threshold based on the resulting linear_samples.tif.

Expected behaviour
If we set ts_pthr to 10, then we should see a velocity map that includes results only for those pixels where there are 10 or more observations.

Summary of findings

The input parameter ts_pthr is being used correctly in step to calculate time series.
But after this, more observations are removed beyond this threshold when the linear rate is calculated because we are removing rank deficient rows before calculating the linear rate.
Hence we can still end up with a velocity calculation based on only 2 observations, even if our user set threshold is higher than that.

Where is this happening?
These code links are from the time_series.py script

Line 180 shows where the if condition is that uses the user set threshold to calculate time series or not.

Line 191 calls a function to then remove rank deficient rows.

PyRate/pyrate/core/timeseries.py

Lines 173 to 212 in dde59c6

    
           def _time_series_pixel(row, col, b0_mat, sm_factor, sm_order, ifg_data, mst, 
        
                                     nvelpar, p_thresh, interp, vcmt, method): 
        
               """ 
        
               Wrapper function to compute time series for single pixel. 
        
               """ 
        
               # check pixel for non-redundant ifgs 
        
               sel = np.nonzero(mst[:, row, col])[0]  # trues in mst are chosen 
        
               if len(sel) >= p_thresh: 
        
                   ifgv = ifg_data[sel, row, col] 
        
                   # make design matrix, b_mat 
        
                   b_mat = b0_mat[sel, :] 
        
                   if interp == 0: 
        
                       # remove rank deficient rows 
        
                       rmrow = asarray([0])  # dummy 
        
                       while len(rmrow) > 0: 
        
                           # if b_mat.shape[0] <=1 then we return nans 
        
                           if b_mat.shape[0] > 1: 
        
                               b_mat, ifgv, sel, rmrow = _remove_rank_def_rows( 
        
                                   b_mat, nvelpar, ifgv, sel) 
        
                           else: 
        
                               return np.empty(nvelpar) * np.nan 
        
                       # Some epochs have been deleted; get valid epoch indices 
        
                       velflag = sum(abs(b_mat), 0) 
        
                       # remove corresponding columns in design matrix 
        
                       b_mat = b_mat[:, ~np.isclose(velflag, 0.0)] 
        
                   else: 
        
                       velflag = np.ones(nvelpar) 
        
                   if method == 1: # Use Laplacian smoothing method 
        
                       tsvel = _solve_ts_lap(nvelpar, velflag, ifgv, b_mat, 
        
                                             sm_order, sm_factor, sel, vcmt) 
        
                   elif method == 2: # Use SVD method 
        
                       tsvel = _solve_ts_svd(nvelpar, velflag, ifgv, b_mat) 
        
                   else: 
        
                       raise TimeSeriesError("Unrecognised time series method") 
        
                   return tsvel 
        
               else: 
        
                   return np.empty(nvelpar) * np.nan

Further along when the linear rate is being calculated, the number of samples is returned from line 322.

This is the final number of samples used after previous steps, and this can be below the user set threshold.

PyRate/pyrate/core/timeseries.py

Lines 292 to 329 in dde59c6

    
           def linear_rate_pixel(y, t): 
        
               """ 
        
               Calculate best fitting linear rate to cumulative displacement 
        
               time series for one pixel using linear regression. 
        
               :param ndarray y: 1-dimensional vector of cumulative displacement time series 
        
               :param ndarray t: 1-dimensional vector of cumulative time at each epoch 
        
               :return: linrate: Linear rate; gradient of fitted line. 
        
               :rtype: float 
        
               :return intercept: Y-axis intercept of fitted line at t = 0 
        
               :rtype: float 
        
               :return: rsquared: R-squared value of the linear regression 
        
               :rtype: float 
        
               :return: error: Standard error of the linear regression 
        
               :rtype: float 
        
               :return: samples: Number of observations used in linear regression 
        
               :rtype: int 
        
               """ 
        
               # Mask to exclude nan elements 
        
               mask = ~isnan(y) 
        
               # remove nan elements from both arrays 
        
               y = y[mask]     
        
               try: 
        
                   t = t[mask] 
        
               except IndexError: 
        
                   raise TimeSeriesError("linear_rate_pixel: y and t are not equal length") 
        
               # break out of func if not enough time series obs for line fitting 
        
               nsamp = len(y) 
        
               if nsamp < 2: 
        
                   return nan, nan, nan, nan, nan 
        
               # compute linear regression of tscuml  
        
               linrate, intercept, r_value, p_value, std_err = linregress(t, y) 
        
               return linrate, intercept, r_value**2, std_err, int(nsamp)

Data Example
Here I am working on a small cropped data set to document issue. There are 17 IFGs and 8 time series epochs. I have ts_pthr set to 4. So our time series and rates should only include 4 or more obervations per pixel. But we see that there is as little as two for some pixels.

Linear rate map:

Linear samples map:

Print out of relevant variables
The array sel is the IFGs after MST selection. I printed the length of this out per pixel after the if statement which uses the ts_pthr value of 4 as a threshold:

note that there are no results for a query below 4.

I did the same thing for the variable nsamp which is the final amount of samples in a pixel:

note there are now pixels with samples below 4.

Remaining questions

is the way to solve this to repeat the ts_pthr threshold at the rate calculation step as well?
is the threshold operating on total number of observations, or number of sequential observations? I think its the former.

The text was updated successfully, but these errors were encountered:

adeane-ga added the bug label Sep 7, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Linear samples not matching time series 'valid observation' threshold #351

Linear samples not matching time series 'valid observation' threshold #351

adeane-ga commented Aug 19, 2021 •

edited

Loading

Linear samples not matching time series 'valid observation' threshold #351

Linear samples not matching time series 'valid observation' threshold #351

Comments

adeane-ga commented Aug 19, 2021 • edited Loading

adeane-ga commented Aug 19, 2021 •

edited

Loading