Replies: 2 comments
-
I work mainly with statistical output tables (unemployment figures and such) where we sometimes also have the uncertainty. Most often, however, this is specified as the lower and upper bounds of a confidence interval. We currently encode this in the variable names (e.g. "measurement_lb" and "measurement_ub"), and it has been on our to-do list for a while to encode it in the metadata. So +1. However, I think we need more than
{
"fields": [
{
"name": "measurement",
"title": "The numeric value",
"type": "number",
},
{
"name": "error",
"title": "The error attached to the numeric value",
"type": "number",
"relation" : { "type": "errorOf", "column": "measurement"}
}
]
}
This will also allow people to specify custom relations, although a list of suggested/default supported relations would be nice.
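To make the idea concrete, here is a minimal Python sketch of how a consumer might read such a descriptor and work out which column holds the error for which measurement. The "relation" property is only the proposal from this comment, not part of any published spec, and the helper name `find_relations` is made up for illustration.

```python
import json

# Schema from the comment above. The "relation" property is the proposal
# under discussion, not an existing part of the spec.
SCHEMA = json.loads("""
{
  "fields": [
    {"name": "measurement", "title": "The numeric value", "type": "number"},
    {"name": "error", "title": "The error attached to the numeric value",
     "type": "number",
     "relation": {"type": "errorOf", "column": "measurement"}}
  ]
}
""")


def find_relations(schema, relation_type):
    """Map each target column to the field related to it by relation_type.

    Hypothetical helper: raises ValueError if a relation points at a column
    that is not declared in the schema.
    """
    names = {field["name"] for field in schema["fields"]}
    linked = {}
    for field in schema["fields"]:
        relation = field.get("relation")
        if relation is None or relation.get("type") != relation_type:
            continue
        target = relation["column"]
        if target not in names:
            raise ValueError(
                f"{field['name']!r} refers to unknown column {target!r}"
            )
        linked[target] = field["name"]
    return linked


print(find_relations(SCHEMA, "errorOf"))  # {'measurement': 'error'}
```

The same lookup would presumably cover the confidence-interval case with other relation types (say "lowerBoundOf" and "upperBoundOf", both hypothetical names), which is why a list of suggested relations would help.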
-
@steko @djvanderlaan I think this is a perfect candidate for a "pattern" proposal. A pattern offers a suggestion for how to solve a particular problem (in this case, linking error information to the main measurement) without being a formal spec.
-
Hey all, based on a discussion with @danfowler I'm submitting this proposal to add support for observational error measurements in data, a rather common occurrence in scientific datasets. I can't draft a full spec at the moment but I hope others will chime in with comments from their specific experience. Examples below are archaeology-based.
While the idea came up in the context of data packages, it seems JSON Table Schema is the area where this kind of support should be added.
Examples
Radiocarbon dates
As can be seen in the Mediterranean Radiocarbon dates dataset (one of the largest open datasets of this kind), radiocarbon dates need to be expressed at least by the conventional radiocarbon age and the error. While it's common to write 3340 ± 45 in text, datasets usually record the two separately. However, the radiocarbon age has no meaning without the attached error.
Neutron activation analysis
Compositional data from INAA (Neutron Activation Analysis) are expressed as parts per million with an attached measurement error as can be seen in the Chemical Composition by Neutron Activation Analysis (INAA) of Neo-Assyrian Palace Ware dataset (a rather common case). In this case, measurement and error are recorded in a single column, separated by ±.
Existing implicit conventions
Separate columns
Single column
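As a rough illustration of what the single-column convention means for anyone consuming the data, here is a minimal Python sketch that splits a "value ± error" cell into two numbers. The function name, the regular expression, and the tolerance for an ASCII "+/-" separator are all assumptions made for this example and are not taken from either dataset.

```python
import re

# Matches "3340 ± 45" and, as an assumed fallback, "3340 +/- 45".
# The value may be signed; the error is taken to be non-negative.
_VALUE_ERROR = re.compile(
    r"^\s*([-+]?\d+(?:\.\d+)?)\s*(?:±|\+/-)\s*(\d+(?:\.\d+)?)\s*$"
)


def split_measurement(cell):
    """Return (value, error) parsed from a 'value ± error' cell."""
    match = _VALUE_ERROR.match(cell)
    if match is None:
        raise ValueError(f"not a 'value ± error' cell: {cell!r}")
    return float(match.group(1)), float(match.group(2))


print(split_measurement("3340 ± 45"))  # (3340.0, 45.0)
```

Every consumer has to repeat this kind of parsing when the link between value and error lives only in the cell format, which is part of the motivation for declaring it in the schema instead.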
Proposed approach
Add a field descriptor in the JSON schema to explicitly mark the values in one field as linked to another field, e.g.:
An alternate approach:
This is just a basic description of the issue to get the discussion started, with no presumption of formal correctness nor exhaustive coverage of the various issues in other disciplines.