Classification and Analysis Summary

cjthomas730 edited this page Jul 5, 2016 · 1 revision

Summary of Issue #52 in which Andrew details his talks with researchers at Duke about the ideal means of classifying MTR sites, the best practice to determine NDVI thresholds, and how to verify the accuracy of our results.

Andrew's full write-up is in this Google Doc.

Based on his work, this is the short list of what needs to be done:

When we are doing the accuracy assessment, we want to use as many non-Landsat images as possible when manually classifying mine sites. This could mean using NAIP for more recent years, or finding other historical aerial imagery over our study area. Finding alternate imagery may not be possible for every year, but it is considered a best practice for remote sensing projects and will likely be something journal reviewers ask us about if we don't address it.

## Summary and Next Steps

In review, we have three concrete phases of this mining analysis still to perform. First, we must decide upon and calculate the yearly NDVI threshold. Second, we must use that threshold to run the mining script. Third, we must assess the accuracy of that threshold using a large set of sample points. To accomplish these goals, we have a list of tasks still to perform, many of which we must do in approximately this order:

  1. Update the study area extent with the list of counties as provided by Matt Wasson.
     • This is available here

  2. Create our own yearly greenest-pixel composites (or not), deciding whether we want to exclude outlying values or to perform some sort of cloud masking.

  3. Attain the updated version of the magic mask.
     • This is available here

  4. Create 30 training sample points per year, for use in the Otsu algorithm.

  5. Use the greenest-pixel composites and the training data in the Otsu method to get a yearly threshold.

  6. Use those yearly thresholds to run the mining script, which will reveal probable mining areas.

  7. Determine 10(?) randomly located sample plots per year, of area 25 km²(?), making sure that each sample plot has at least one active mine during that year.

  8. Randomly generate at least 100(?) points within each sample plot per year, making sure that 50 points fall over mined areas (i.e., generate more random points as needed until the number of points in mined areas equals 50).

  9. Extract the NDVI values per year at those points.

  10. Work as a team to manually classify those sample points into two categories: whether they do or do not contain mining that year.

  11. Create yearly confusion matrices and related statistics by comparing the manually classified points to how the threshold would have classified those points.

  12. If necessary (depending on the results of the confusion matrices), establish a new threshold and calculate new confusion matrices.
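The Otsu step above (training points → yearly threshold) can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline: the function name, bin count, and synthetic NDVI values are all assumptions, and the real inputs would be NDVI values sampled from a year's greenest-pixel composite at that year's 30 training points.

```python
# Illustrative sketch of Otsu's method for picking a yearly NDVI threshold.
# All names and sample values here are hypothetical.
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the split that maximizes between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    hist = hist.astype(float)
    centers = (edges[:-1] + edges[1:]) / 2
    weight1 = np.cumsum(hist)                 # points at or below each bin
    weight2 = np.cumsum(hist[::-1])[::-1]     # points at or above each bin
    mean1 = np.cumsum(hist * centers) / np.maximum(weight1, 1e-12)
    mean2 = (np.cumsum((hist * centers)[::-1])[::-1]
             / np.maximum(weight2, 1e-12))
    # Between-class variance for every candidate split point
    variance = weight1[:-1] * weight2[1:] * (mean1[:-1] - mean2[1:]) ** 2
    return centers[np.argmax(variance)]

# Synthetic bimodal training sample: bare/mined ground near NDVI 0.1,
# vegetated ground near NDVI 0.7 (15 points each, mirroring the 30-point plan)
rng = np.random.default_rng(0)
ndvi = np.concatenate([rng.normal(0.1, 0.05, 15),
                       rng.normal(0.7, 0.05, 15)])
threshold = otsu_threshold(ndvi)
```

The returned threshold falls in the gap between the two NDVI clusters; pixels below it would be flagged as probable mining.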
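The point-generation step (at least 100 points per plot, with at least 50 over mined areas) amounts to rejection sampling. A minimal sketch, assuming a rectangular plot and a stand-in `is_mined` predicate; in practice the mined test would query the mining-script output at each point, and the function and parameter names here are hypothetical.

```python
# Illustrative sketch of drawing random sample points within a plot until
# both quotas are met. The mined-area test is a placeholder predicate.
import random

def sample_points(plot_bounds, is_mined, n_total=100, n_mined=50, seed=42):
    """Draw uniform random (x, y) points within plot_bounds until there
    are at least n_total points AND at least n_mined satisfy is_mined."""
    xmin, ymin, xmax, ymax = plot_bounds
    rng = random.Random(seed)
    points, mined_count = [], 0
    while len(points) < n_total or mined_count < n_mined:
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        points.append((x, y))
        if is_mined(x, y):
            mined_count += 1
    return points

# Toy 5 km x 5 km plot where the lower-left quarter is "mined"
plot = (0.0, 0.0, 5000.0, 5000.0)
pts = sample_points(plot, lambda x, y: x < 2500 and y < 2500)
```

Because extra points are drawn until the mined quota is reached, the final count can exceed 100, matching the "generate more random points as needed" instruction above.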
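The confusion-matrix step compares the team's manual labels against the threshold's labels at the same points. A minimal sketch, with illustrative labels and NDVI values; the accuracy measures shown (overall, user's, producer's) are the ones conventionally reported for remote-sensing accuracy assessments, and all names here are assumptions rather than the project's actual code.

```python
# Illustrative sketch of building a yearly confusion matrix from manual
# vs. threshold-based classifications. Values below are made up.
import numpy as np

def confusion_stats(manual, predicted):
    """2x2 confusion matrix plus overall, user's, and producer's accuracy.

    manual, predicted: boolean sequences, True = point labeled as mined.
    """
    manual = np.asarray(manual, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tp = np.sum(manual & predicted)        # mined, correctly detected
    fp = np.sum(~manual & predicted)       # unmined, flagged as mined
    fn = np.sum(manual & ~predicted)       # mined, missed
    tn = np.sum(~manual & ~predicted)      # unmined, correctly passed
    return {
        "matrix": np.array([[tp, fn], [fp, tn]]),
        "overall_accuracy": (tp + tn) / manual.size,
        "users_accuracy": tp / max(tp + fp, 1),       # commission errors
        "producers_accuracy": tp / max(tp + fn, 1),   # omission errors
    }

# Illustrative comparison for one year's sample points
manual = [True, True, True, False, False, False, False, True]
ndvi = np.array([0.12, 0.20, 0.45, 0.60, 0.70, 0.55, 0.65, 0.15])
predicted = ndvi < 0.4   # below-threshold NDVI -> probable mining
stats = confusion_stats(manual, predicted)
```

In this toy example the threshold misses one manually identified mine point (NDVI 0.45), which is exactly the kind of omission error that would motivate the "establish a new threshold" step.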
