Summarize conversation about our classification #52

Closed
cjthomas730 opened this issue Jun 28, 2016 · 6 comments

@cjthomas730 (Contributor)

@apericak would you mind writing up a summary of the conversation you had with your professor about our pseudo-classification and the possible ways to assess the accuracy of our EE script output, so that we have it on record?

@apericak (Collaborator) commented Jul 1, 2016

Here's the Google Doc with a bunch of my thoughts: https://docs.google.com/document/d/1HrsThiimpRbJWVJswMJxL0yKwAouJv7PtLtXdpP6Rs4/edit?usp=sharing

Near the end is a list of things we still have to do, based on what I wrote in the document. The one thing I'm wondering now, though, is whether we will actually need a unique threshold for each year, or whether we can derive one common threshold as we are currently doing. Despite the autocorrelation issue with the data we were just discussing, the fact that the 0.51 threshold does so well across almost every year suggests that we might be okay with one threshold. Either way, we will first need new training data to establish the threshold.

That being said, if we switch to making our own greenest pixel composites and/or use surface reflectance (SR) imagery rather than top-of-atmosphere (TOA), the universal threshold will most likely change from its current value of 0.51 (especially with SR imagery).
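As a concrete reference for the discussion above, here is a minimal sketch (Earth Engine Python API) of building a greenest-pixel composite and applying the 0.51 cutoff. It assumes Landsat 5 TOA imagery with B4 = NIR and B3 = red, that low NDVI indicates bare/mined ground, and uses a placeholder collection ID, date range, and geometry rather than the project's actual assets:

```python
import ee

ee.Initialize()

# Placeholder study area; the real extent would come from the AppVoices county list.
study_area = ee.Geometry.Rectangle([-82.5, 37.0, -81.5, 38.0])

def add_ndvi(img):
    # Landsat 5 TOA band names assumed: B4 = NIR, B3 = red.
    return img.addBands(img.normalizedDifference(['B4', 'B3']).rename('NDVI'))

# Greenest-pixel composite: per pixel, keep the observation with the highest NDVI.
composite = (ee.ImageCollection('LANDSAT/LT05/C02/T1_TOA')  # placeholder collection ID
             .filterBounds(study_area)
             .filterDate('2011-01-01', '2011-12-31')
             .map(add_ndvi)
             .qualityMosaic('NDVI'))

# Candidate mine pixels: composite NDVI below the provisional universal threshold.
THRESHOLD = 0.51
candidate_mines = composite.select('NDVI').lt(THRESHOLD).rename('mine')
```

qualityMosaic('NDVI') is one common way to roll our own greenest-pixel composite; if we stick with the premade composites, that step would just be replaced by loading them directly.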

There are three critical first steps that we should aim to have completed ASAP (also outlined in the list at the end of the document):

  1. We need an updated Fusion table of the study extent, using the counties as provided by AppVoices (maybe you all have already done that?)
  2. We need to figure out and finalize what imagery we are using (TOA vs SR, premade composites vs ones we make ourselves).
  3. We need the updated/complete magic mask, so that we can intelligently get training data for setting the threshold (a minimal sampling sketch follows this list).
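As a rough illustration of step 3, here is a short sketch of sampling composite NDVI values only where the mask allows. Every asset ID below is a made-up placeholder, and the band name, scale, and sample size are assumptions:

```python
import ee

ee.Initialize()

# Hypothetical asset IDs (placeholders): an exported greenest-pixel composite with
# an 'NDVI' band, the "magic mask" (1 = eligible for training, 0 = excluded), and
# the study-extent counties.
composite = ee.Image('users/example/greenest_pixel_composite_2011')
magic_mask = ee.Image('users/example/magic_mask')
study_area = ee.FeatureCollection('users/example/appvoices_counties').geometry()

# Pull NDVI samples only inside the mask; these values would feed threshold setting.
training_points = (composite.select('NDVI')
                   .updateMask(magic_mask)
                   .sample(region=study_area, scale=30, numPixels=2000, seed=42))
```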

@cjthomas730 (Contributor, Author)

@JerrilynGoldberg is just about finished with an updated mask using 2015 data for the county list given by App. Voices, so we should have both that and an updated study extent shortly.

@cjthomas730 (Contributor, Author)

@apericak so here's my takeaway on the major points; let me know if I missed anything:

  1. We need to establish a new NDVI threshold, but we can be deliberately biased (i.e., non-random) in our training-site selection process
  2. The point set created by Tita will be useful moving forward as the means to assess the accuracy of our new threshold
  3. While our current universal threshold is holding up well across sensors, we're better off creating a threshold for each sensor and each year
  4. It is beneficial for us to create our own greenest pixel composites (GPCs) from both a knowledge and control standpoint
    • we should also switch from TOA to SR imagery
  5. There are other metrics we can use to tell us how well our threshold operates (see the metrics sketch after this comment)

Thanks for writing this all up; this is going to be an excellent resource/guide moving forward.
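For point 5, here is a minimal sketch of the kind of summary metrics that could be computed once each reference point has both a manual label and a thresholded prediction. The label arrays below are made-up stand-ins, not project data:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, cohen_kappa_score

# Stand-in labels: 1 = mine, 0 = non-mine. In practice, 'reference' would hold the
# manually classified points and 'predicted' the thresholded NDVI output at those points.
reference = np.array([1, 1, 0, 0, 1, 0, 0, 1, 0, 0])
predicted = np.array([1, 0, 0, 0, 1, 0, 1, 1, 0, 0])

tn, fp, fn, tp = confusion_matrix(reference, predicted).ravel()

overall_accuracy = (tp + tn) / (tp + tn + fp + fn)
users_accuracy = tp / (tp + fp)       # precision: how often a mapped mine really is one
producers_accuracy = tp / (tp + fn)   # recall: how much real mine area the threshold catches
kappa = cohen_kappa_score(reference, predicted)

print(overall_accuracy, users_accuracy, producers_accuracy, kappa)
```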

@apericak (Collaborator) commented Jul 1, 2016

  1. Yep, either one threshold (which might work) or a unique threshold per year
  2. Yeah, since we are going to have to do a large amount of accuracy assessment, that's already 1,200 points for which we know whether or not they're mining, at least for those specific years, so definitely useful data.
  3. This is the one we will have to test. Brady's spreadsheet, even though it assessed accuracy against the training data, still suggests that one threshold might be good enough, so it may not be worth the time to make a unique threshold per year if a single one works. That being said, Otsu's method doesn't require much training data, so I'm going to derive unique thresholds that way anyway and then compare them to a single threshold (see the Otsu sketch after this comment).
  4. Right: at least for training (i.e., setting) the threshold, we essentially don't want random data, so that we are as sure as possible that our threshold is indicating the things we want to see (in this case, mines). When we get to the accuracy assessment, those points should be mostly random, with the exceptions that we will use sample plots and stratified sampling to make sure we sample at least 50 mine points per plot.

We probably will want to switch to SR, but that's not definite yet either. I actually talked to some more people today and learned that there may not be much to gain by using SR rather than TOA. Using SR means we will have less variation in values over time, but it also means we introduce error stemming from the USGS's SR algorithm. What will probably be best is to do a mini-classification over one or a few mines to see whether we get better accuracy with TOA or SR. I'll look into that early this upcoming week.

  5. Yep, true positives / true negatives are probably most important, but there are other metrics we can use to see how the threshold performs overall.
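For point 3, here is a minimal sketch of the Otsu idea: derive a data-driven cutoff from one year's sampled composite NDVI values and compare it with the fixed 0.51 threshold. The NDVI array below is synthetic, standing in for real sampled values:

```python
import numpy as np
from skimage.filters import threshold_otsu

# Synthetic stand-in for NDVI values sampled from one year's greenest-pixel composite:
# a small low-NDVI cluster (bare/mined ground) and a large high-NDVI cluster (forest).
rng = np.random.default_rng(0)
ndvi = np.concatenate([
    rng.normal(0.25, 0.08, 300),    # bare / mined surfaces
    rng.normal(0.80, 0.06, 1700),   # vegetated surfaces
]).clip(-1.0, 1.0)

otsu_threshold = threshold_otsu(ndvi)  # data-driven, per-year cutoff
fixed_threshold = 0.51                 # current universal cutoff

print(f"Otsu threshold: {otsu_threshold:.2f}  vs  fixed: {fixed_threshold:.2f}")
```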

@apericak (Collaborator) commented Jul 1, 2016

Also, one other thing I just learned: when we are doing the accuracy assessment, we want to try to use as many non-Landsat images as possible when manually classifying mine sites. This could mean using NAIP for more-recent years, or finding other historical aerial imagery over our study area. Finding alternate imagery may not be possible for every year, but it is considered a best practice for remote sensing projects and will likely be something journal reviewers ask us about if we don't address it.

@cjthomas730 (Contributor, Author)

Closing the issue. See the Classification and Analysis Summary wiki page for the details of this issue (including links to the discussion and Andrew's full write-up).
