Make species info conditionally required on observation_type=Animal #25

Closed
niconoe opened this issue Mar 11, 2021 · 13 comments

@niconoe
Contributor

niconoe commented Mar 11, 2021

In GitLab by @peterdesmet on Aug 21, 2020, 2:31

taxonomic_coverage has 3 properties: species_latin, species_common and count (see also #14 to update these names). The first 2 are required, but I think it cannot always be guaranteed that a species_common/vernacular_name is available. I would make that term optional.

Issue scope is now this: https://gitlab.com/oscf/camtrap-package-schemas/-/issues/25#note_408998361

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @kbubnicki on Aug 28, 2020, 15:45

And what about the proposed taxon_id?

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @peterdesmet on Aug 28, 2020, 15:51

From my experience with Darwin Core, only scientific_name should be required. If we make taxon_id required, people (who do not use a reference list) will populate it with a DB internal ID or one they invented. If that is better than not having a taxon_id, we can make it required.

Note: not all observations (empty, etc.) will have a species observation, so I don't know how we can make any of these 3 terms required without also requiring that they have content. The same is true for e.g. count.

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @yliefting on Aug 28, 2020, 15:56

I still think mentioning and using a documented species list is essential for the standard. Species names change all the time and it's a complicated task to figure out how names changed if you have a 10 year old package without any information on the species list used. It becomes even worse when you allow subspecies, which is relevant to keep track of for many endangered species.

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @kbubnicki on Aug 28, 2020, 17:08

That's right! None of them can actually be required. Maybe we could use some derived rule, e.g. if observation_type == Animal you have to provide at least scientific_name?

But I do not know if this is possible with the JSON Schema specification.
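
For illustration, a minimal sketch of how such a derived rule could be written with the `if`/`then` keywords that JSON Schema gained in draft-07. The schema fragment is hypothetical: the field names follow this thread, the value "Animal" is assumed, and the small `missing_required` helper is a hand-rolled stand-in that only understands the keywords used here, not a general validator.

```python
# Hypothetical schema fragment; not the project's actual schema.
OBSERVATION_SCHEMA = {
    "type": "object",
    "properties": {
        "observation_type": {"type": "string"},
        "scientific_name": {"type": "string"},
    },
    "required": ["observation_type"],
    # The condition: observation_type is present and equals "Animal" (assumed value).
    "if": {
        "properties": {"observation_type": {"const": "Animal"}},
        "required": ["observation_type"],
    },
    # The consequence: scientific_name must then be present.
    "then": {"required": ["scientific_name"]},
}


def missing_required(record, schema=OBSERVATION_SCHEMA):
    """Return the required property names absent from `record`.

    Only handles the `required`, `if`/`then` and `const` keywords used above.
    """
    required = list(schema.get("required", []))
    condition = schema["if"]["properties"]
    if all(record.get(key) == rule["const"] for key, rule in condition.items()):
        required += schema["then"]["required"]
    return [name for name in required if name not in record]


print(missing_required({"observation_type": "Animal"}))
print(missing_required({"observation_type": "Empty"}))
```

With a real validator the schema would be passed to the library as-is. Note that `required` only checks that a property is present, not that it has content.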

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @kbubnicki on Aug 28, 2020, 17:11

I agree but then we would probably also need to define a list of supported taxonomy providers?

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @peterdesmet on Aug 29, 2020, 15:20

I feel like we are going quite far here.

  1. Mentioning the reference list (either controlled or not) is only going to help a little when using a 10 year old package, as reference lists change and (unfortunately) are not always versioned well.
  2. A user identifying a species likely uses a species concept that isn't necessarily the same as the reference list (which is why Darwin Core e.g. has identificationReferences). So even with the reference list you only know so much.
  3. There is an (imperfect) system in place to indicate which old names apply to new species concepts: synonyms, and many reference systems for dealing with those.

I therefore think it is already quite a requirement (which I support) to have taxon_ids and to allow users to indicate the external source (if any) for those. That is already narrower in scope than Darwin Core Archive's "taxonomic scope" (which allows any description about the taxonomic scope). I would therefore not control this field any further, and allow a name, reference, url, ... that could help the user regarding the used taxon_ids.

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @peterdesmet on Sep 2, 2020, 15:23

Addressed in #32: it is possible.

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @peterdesmet on Sep 2, 2020, 15:24

Consensus is:

  • taxon_id: conditionally required (for species observations)
  • scientific_name: conditionally required (for species observations)
  • vernacular_name: optional
  • Other required species fields: make conditionally required.

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @peterdesmet on Sep 8, 2020, 12:04

This issue meandered a bit, and in taxonomic coverage only taxon_id and scientific_name are required. What remains is to make scientific_name etc. conditionally required on observation_type being Animal. Will update issue title to reflect this.

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @peterdesmet on Sep 8, 2020, 12:04

changed title from "Make species_common/vernacular_name optional" to "Make species info conditionally required on observation_type=Animal"

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @kbubnicki on Sep 18, 2020, 12:56

This does not seem to be as easy as I thought. I am not sure if the frictionless table schema specification supports cases like this one. Probably we would have to add a new custom constraint:
https://specs.frictionlessdata.io/table-schema/#constraints

Can you also look at this @peterdesmet?
Maybe we should open the issue about it here:
https://github.com/frictionlessdata/specs/issues ?

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @peterdesmet on Sep 25, 2020, 14:18

Indeed, it is not supported in Table Schema. It was suggested as an issue in 2015, but marked as WONTFIX.

See #32 for a potential solution (extending the schema), but for now I would propose not to implement this and keep scientificName not required.

@kbubnicki is that ok for you?
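
As a rough sketch of what a check outside Table Schema could look like: since Table Schema constraints apply to one field at a time, a cross-field rule can be tested in a separate pass over the rows. Everything below is an assumption for illustration (the function name, the column names and the "Animal"/"Empty" values); it is not part of any Frictionless library.

```python
import csv
import io


def rows_missing_species(csv_text, type_field="observation_type",
                         name_field="scientific_name", animal_value="Animal"):
    """Return 1-based data row numbers where an animal observation has no scientific name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad_rows = []
    for number, row in enumerate(reader, start=1):
        is_animal = row.get(type_field) == animal_value
        has_name = bool((row.get(name_field) or "").strip())
        if is_animal and not has_name:
            bad_rows.append(number)
    return bad_rows


# Hypothetical sample data: the third data row breaks the rule.
SAMPLE = """observation_type,scientific_name,count
Animal,Vulpes vulpes,1
Empty,,
Animal,,2
"""

print(rows_missing_species(SAMPLE))
```

Rows that are not animal observations are left alone, so scientific_name can stay not required in the schema itself.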

@niconoe
Contributor Author

niconoe commented Mar 11, 2021

In GitLab by @kbubnicki on Sep 26, 2020, 12:51

I agree. We can (re)consider adding this once Table Schema supports such custom constraints.

@niconoe niconoe closed this as completed Mar 11, 2021
@peterdesmet peterdesmet changed the title Make species info conditionally required on observation_type=Animal [REPLACEMENT ISSUE] Make species info conditionally required on observation_type=Animal Mar 11, 2021
@tdwg tdwg deleted a comment from niconoe Mar 11, 2021