Discuss & comment on Data Processing Levels draft doc #4

Closed
emiliom opened this issue Mar 20, 2023 · 5 comments

Comments

@emiliom
Contributor

emiliom commented Mar 20, 2023

@leewujung we can use this issue to discuss the Data Processing Levels draft markdown doc I've created in this repo.

In my immediate TODO: Add an introductory paragraph. In that paragraph I'll touch on the relationship between data processing levels, provenance information and "data product" specifications.

Later on, I also plan to add discussion in a section below the listing of processing levels, giving some background and citations for the choices described in each level.

@leewujung
Member

> giving some background and citations for the choices described in each level

I think this part will be very useful to have before the symposium -- it will be far more persuasive than the list as it stands. Most people in this field do not think about processing levels day in, day out, so a background explanation of what we are trying to achieve, with concrete citations ("oh, people in other fields have been doing this for a long time!"), will make the case much stronger.

@emiliom
Contributor Author

emiliom commented Mar 20, 2023

What I meant by "immediate" vs "later on" was today or tomorrow vs later this week. I realize that wasn't at all clear.

@leewujung
Member

@emiliom: from discussion with @valentina-s and @Sohambutala today, we have some feedback on the data processing levels draft:

Suggestions:

  • Sv with seafloor removal should be in L4
    • reason: bottom removal requires annotating or detecting where the bottom is, so it essentially operates on derived products to produce another derived data product
  • Level 4 (L4) – Description: Acoustically derived biological, environmental, and other features
    • reason: adding these other qualifiers makes the L4 category broad enough to cover more kinds of products
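
To make the first suggestion concrete, here is a minimal sketch of seafloor removal as a masking step. It assumes Sv is held as a ping-by-range NumPy array and that a per-ping bottom range has already been detected or annotated; the function name `remove_seafloor` and all input values are hypothetical and do not come from the draft doc or any existing library.

```python
import numpy as np

def remove_seafloor(sv, range_m, bottom_range_m):
    """Blank out Sv samples at or below the detected bottom for each ping.

    sv:             (n_ping, n_range) array of Sv values in dB
    range_m:        (n_range,) array of sample ranges in metres
    bottom_range_m: (n_ping,) array of bottom range per ping
                    (hypothetical output of a separate detection step)
    """
    # True where a sample lies strictly above the seafloor for that ping
    water_mask = range_m[np.newaxis, :] < bottom_range_m[:, np.newaxis]
    # Keep water-column samples, replace the rest with NaN
    sv_clean = np.where(water_mask, sv, np.nan)
    return sv_clean, water_mask

# Illustrative values only
sv = np.full((2, 5), -70.0)
range_m = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
bottom_range_m = np.array([35.0, 50.0])
sv_clean, water_mask = remove_seafloor(sv, range_m, bottom_range_m)
```

The sketch returns the mask alongside the cleaned Sv, since the mask is itself a derived product that could be stored.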

Questions:

  • We think it'll be good to store the masks, which are a form of derived data product.
    • Are the masks L4 data products?
    • Could you point us to examples of similar data products, say in the satellite data community?

@emiliom
Contributor Author

emiliom commented Oct 19, 2023

I think seafloor removal and the seafloor mask occupy a gray area in data processing. In general, I think they could be either Level 2 or Level 3, or both.

ADEON and IMOS SOOP-BA don't address them explicitly; instead, they are processing steps that go into their "Level 2" products. However, L2 means different things in those two programs. ADEON "starts" with L1 as calibrated Sv, and L2 is L1 with further processing (e.g., noise and seafloor removal, and depth integration). For IMOS, L2 is a broad bucket that includes what ADEON defines as L1 and L2.

As for NASA, I've found the EOS Data products handbook, Volume 2 a very helpful reference or guide to common practices in the satellite data community. Like most documents of this sort, it doesn't fully explain the rationale for specific processing level designations. But because it covers so many different types of data products, I find it very helpful when making analogies to echosounder products. I've focused mainly on MODIS products. The analogs I find there for "masks" derived from the same sensor (as opposed to external masks) include sea ice cover (MOD 29), snow cover (MOD 10), and land cover type (MOD 12). The first two can be either L2 or L3, and the third one is L3. MOD 12 uses MOD 10 and vegetation indices (MOD 13, also L3), among others, as inputs (see the diagram on p. 32). L4 products are limited to highly processed products / variables that involve some mix of multiple assumptions and inputs, spatio-temporal completeness on regular grids, and potentially a combination of sensors. A good example is primary productivity.

Notice that a "variable" can occur at different levels! Most of the core oceanographic products (SST, Chl a, etc) are like that: they are available as either L2 or L3. The variable itself doesn't determine the processing level; apparently the processing steps involved do.

Taking all that into consideration, here is what I think:

  • A mask can be either L2 or L3, depending on the underlying data used to generate it.
  • While I still think L2B is the best fit for seafloor removal, if it's generated on regridded data, L3B may be appropriate too.
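
As a rough illustration of the two rules above, the following sketch encodes them as small lookup functions. The level labels (L2, L3, L2B, L3B) are the ones used in this thread; the function names and the idea of passing the source level as a string are assumptions made only for this example.

```python
def mask_level(source_level: str) -> str:
    """Return the major level of a mask, inherited from the data it was derived from."""
    major = source_level[:2]  # e.g. "L2B" -> "L2"
    if major not in ("L2", "L3"):
        raise ValueError(f"masks are expected to derive from L2 or L3 data, got {source_level!r}")
    return major

def seafloor_removed_level(on_regridded_data: bool) -> str:
    """Return L3B when seafloor removal runs on regridded data, otherwise L2B."""
    return "L3B" if on_regridded_data else "L2B"

print(mask_level("L2B"))
print(seafloor_removed_level(on_regridded_data=True))
```

The point of the sketch is that the level follows from the input data and processing steps, not from the variable itself.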

@leewujung
Member

I'll close this now, given recent discussions with @valentina-s and @brandynlucca and during the WGFAST GAIN workshop. Hopefully we'll get to discuss the details with a larger group soon.
