Missing GSIM class – quality indicators #6

FlavioRizzolo · 2022-01-15T18:02:54Z

Quality information is important and drives the statistical business process. The GSBPM Quality Indicators show that all GSBPM sub-processes produce quality information in one form or another, yet there is no GSIM information object to capture it. Referential Metadata Set might be used for this, but that would make it difficult to differentiate quality from other types of metadata and would overload Referential Metadata Set.

FrancineK · 2022-05-31T19:44:15Z

A tentative mapping of Referential Metadata

Referential Metadata Set: An organized collection of referential metadata for a given Referential Metadata Subject.
Ex: Methodology, Quality
Referential Metadata Structure: Defines the structure of an organized collection of referential metadata (Referential Metadata Set).
Ex: for Methodology, methodology types:
ERROR_DETECTION, ESTIMATION_COMPILATION, IMPUTATION, VALIDATION, REVISIONS, SAMPLING, ACCURACY, etc.
Ex: for Quality, six dimensions of quality:
relevance, accuracy, timeliness, accessibility, interpretability, coherence
Referential Metadata Attribute: The role given to a Represented Variable to supply information in the context of a Referential Metadata Structure.
Ex: Quality Indicator, Status Flag, Methodology Description, Quality Statement, etc.
Referential Metadata Subject: Identifies the subject of an organized collection of referential metadata.
Ex: DataSet, Represented Variable, Data Point
Referential Metadata Subject Item: Identifies the actual subject for which referential metadata is reported.
Ex: Sampling Frame (DataSet), Population Count (Value in Data Point)
Referential Metadata Content Item: The content describing a particular characteristic of a Referential Metadata Subject.
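
To make the tentative mapping above a bit more concrete, here is a minimal Python sketch of the six classes, filled in with the Quality example. The class and field names (including `dimensions`) are illustrative assumptions derived from the definitions above, not part of GSIM itself, and the content value is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReferentialMetadataSubject:
    """Kind of thing being described, e.g. DataSet."""
    name: str


@dataclass
class ReferentialMetadataSubjectItem:
    """Actual subject for which referential metadata is reported."""
    name: str
    subject: ReferentialMetadataSubject


@dataclass
class ReferentialMetadataAttribute:
    """Role such as Quality Indicator or Quality Statement."""
    name: str


@dataclass
class ReferentialMetadataStructure:
    """Structure of a set, e.g. the six dimensions of quality."""
    name: str
    dimensions: List[str]  # assumed field, for illustration only
    attributes: List[ReferentialMetadataAttribute]


@dataclass
class ReferentialMetadataContentItem:
    """Content describing one characteristic of the subject item."""
    attribute: ReferentialMetadataAttribute
    dimension: str
    value: str


@dataclass
class ReferentialMetadataSet:
    """Organized collection of referential metadata for a subject."""
    structure: ReferentialMetadataStructure
    subject_item: ReferentialMetadataSubjectItem
    content: List[ReferentialMetadataContentItem] = field(default_factory=list)


quality = ReferentialMetadataStructure(
    name="Quality",
    dimensions=["relevance", "accuracy", "timeliness",
                "accessibility", "interpretability", "coherence"],
    attributes=[ReferentialMetadataAttribute("Quality Statement")],
)
frame = ReferentialMetadataSubjectItem(
    "Sampling Frame", ReferentialMetadataSubject("DataSet"))
report = ReferentialMetadataSet(quality, frame)
# Placeholder content: the thread gives no actual quality statement.
report.content.append(ReferentialMetadataContentItem(
    quality.attributes[0], "accuracy", "<placeholder statement>"))
```

Keeping quality in its own structure, as above, is one way a Referential Metadata Set could carry quality without a dedicated class. Whether that is enough to tell quality apart from other metadata is the open question of this issue.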

JALinnerud · 2022-06-01T10:37:16Z

As far as I remember, the referential metadata classes originated from the reference metadata classes in SDMX (see https://raw.githubusercontent.com/UKGovLD/publishing-statistical-data/master/specs/src/main/vocab/sdmx-attribute.ttl). These also relate to SIMS (Single Integrated Metadata Structure), which is used to report quality to Eurostat. We are not currently using SIMS for our quality declarations, but we are looking into using it.

FrancineK · 2022-06-01T12:38:26Z

Here is a proposed simplified version of Referential Metadata that can be linked directly to any GSIM object.

JALinnerud · 2022-06-23T12:48:40Z

Why are the descriptions, names and text only in Bilingual text? Shouldn't they be multilingual?

FrancineK · 2022-06-23T17:17:40Z

> Why are the descriptions, names and text only in Bilingual text? Shouldn't they be multilingual?

Yes, we will need to rework this model. This was done for StatCan internal use.

InKyungChoi · 2022-07-20T14:47:31Z

Use case: Reference Metadata in ESS Standard for Quality Reports Structure (ESQRS)

InKyungChoi · 2022-08-17T11:32:39Z

Current GSIM Referential Metadata

Mapping of GSIM referential metadata area for the ESS Standard for Quality Reports Structure (ESQRS) and Information Management Set (GSIM Issue from Sweden)

Issues:

Referential Metadata Subject is currently constrained by Value Domain. Its explanatory text says "GSIM object type may be Product for which there is a list specified in a Value Domain. The Value Domain specifies the list of actual Products for which reference metadata can be reported or authored using this Referential Metadata Structure." but I think creating a list just to be able to refer to a subject of referential metadata is too much.
-> potential solution: (i) add a relationship with several typical subjects such as Questionnaire, Statistical Program, Data Set; (ii) add a relationship between Referential Metadata Subject and Identifiable Artefact
Referential Metadata Attribute is currently "defined by" Represented Variable. Although the cardinality is 0..1, its definition ("role given to a Represented Variable to supply information in the context of a Referential Metadata Structure") seems to imply that the relationship is not optional.
-> potential solution: change the definition (e.g., "characteristic providing qualitative information for a given Referential Metadata Subject") and remove the relationship with Represented Variable

As a reference, see how it is done in SDMX:

InKyungChoi · 2022-08-18T16:09:52Z

Another mapping example, for the documentation of a statistical register (from Istat's MWW2022 presentation)

InKyungChoi · 2022-09-02T08:37:32Z

Updated model

(based on discussion in #22)

Some remarks:

For now, subjects are kept in the model to see how it works, as proposed in the last meeting. Removing RM Subject and RM Subject Item would simplify the model, but it might be worth keeping them to make clear what the subject is.
The relationship between RM Structure and RM Attribute used to be a composition. Could it be an aggregation instead, so that RM Attributes can be re-used outside the context of a particular RM Structure?
I am also not sure about the cardinalities, so please review them carefully.
Represented Variable is now removed from the picture, but I wonder if we could also link RM Attribute with Identifiable Artefact?

Proposed definition / explanatory text

Referential Metadata Subject
- Definition: subject for which an organised collection of referential metadata is reported
- Explanatory text: Referential Metadata Subject identifies the subject of the metadata that can be reported using this Referential Metadata Structure. These subjects may be any GSIM class for which an organised set of metadata is needed, such as Statistical Program, Data Set, Statistical Classification.
Referential Metadata Structure
- Definition: structure of an organised collection of referential metadata
- Explanatory text: Referential Metadata Structure defines a structured list of Referential Metadata Attributes for a given Referential Metadata Subject (e.g., ESS Standard for Quality Reports Structure)
Referential Metadata Attribute
- Definition: particular characteristic of referential metadata OR characteristic that describes or qualifies Referential Metadata Subject (!! note that this definition is completely different from original definition, feel free to propose a new one!)
- Explanatory text: A set of Referential Metadata Attributes is structured by a Referential Metadata Structure to describe a Referential Metadata Subject. Examples of Referential Metadata Attributes can be a Represented Variable (e.g., "Accuracy", "Timeliness" when describing quality information) or another GSIM class (e.g., Statistical Classification, Contact, Owner)
Referential Metadata Content
- Definition: actual content of Referential Metadata Attribute
- Explanatory text: Referential Metadata Content can take different formats (e.g., text, number, value from a predefined codelist, table)
Referential Metadata Subject Item
- Definition: actual subject for which referential metadata is reported
- Explanatory text: Examples are an actual Product such as Balance of Payments and International Investment Position, Australia, June 2013, or a collection of Data Points such as the Data Points for a single region within a Data Set covering all regions for a country.
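
The remark about aggregation versus composition can be illustrated with a small Python sketch. Under aggregation, a structure references its attributes without owning them, so one Referential Metadata Attribute can be listed by more than one Referential Metadata Structure. The `Attribute` and `Structure` classes and the two structure names are hypothetical stand-ins used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for a Referential Metadata Attribute."""
    name: str


@dataclass
class Structure:
    """Hypothetical stand-in for a Referential Metadata Structure.

    Attributes are referenced, not owned (aggregation), so the same
    Attribute object may appear in several structures.
    """
    name: str
    attributes: List[Attribute] = field(default_factory=list)


accuracy = Attribute("Accuracy")
timeliness = Attribute("Timeliness")

quality_report = Structure("Quality report", [accuracy, timeliness])
register_doc = Structure("Register documentation", [accuracy])

# The shared attribute is the very same object in both structures.
shared = quality_report.attributes[0] is register_doc.attributes[0]

# Dropping one structure leaves the attribute usable by the other.
del quality_report
still_there = register_doc.attributes[0].name
```

With composition, by contrast, each structure would hold its own copy of "Accuracy", and the two copies would have to be kept in sync by hand.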

InKyungChoi · 2022-10-07T07:28:51Z

I found an old GSIM discussion (from 2018) that has very different interpretations...!!! https://statswiki.unece.org/pages/viewpage.action?pageId=129177198

It seems, in short,

Referential Metadata (RM) parts were applied (or originally, primarily meant to be applied) to footnotes of tables, using the table footnote as RM Attribute, which actually got me even more confused about how Represented Variable plays a role here;
it got too complicated;
Guillaume suggested "A MetadataStructureDefinition structures a MetadataFlowDefinition and contains one or more MetadataTarget composed of TargetObject. So basically it is a heap of metadata attributes gathered in a set that targets a flow." which is closer to what we discussed, and pointed out "The problem in this example is that we are mixing the DataStructure and the MetadataStructure areas without being as complete as the SDMX-IM on the MetadataStructure/Set part";
but the way he applied RM to Single Integrated Metadata Structure (SIMS) is quite different;
in the end, there were no big changes made to this part in the new version of GSIM (https://statswiki.unece.org/display/gsim/GSIM+v1.2+main+changes), except "Self-referential relationship name changed from parent/child to parent-child".

@JALinnerud - what do you think? Do you think we are deviating too much from what was originally aimed?
@FrancineK - do you think the new way we use can be applied to footnotes?

If we cannot apply RM to footnotes, we lose something we could do before, but to be honest, I am not sure it COULD be done before.

FrancineK · 2022-10-26T13:18:19Z

Hi @InKyungChoi, I tried to work it out with this page: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=3210000101&request_locale=en.
The Referential Metadata Set:

Frequency, Dataset ID, Release date (these variables come from the Data Structure), Classification (Geography) (this is the identifier component), and Footnotes (Referential Metadata ONLY).
If I were to consider Footnotes only:
Referential Metadata Set: Footnotes
Referential Metadata Structure: Table Footnote: Footnote1, Footnote2, etc.
Referential Metadata Attribute:
-- Footnote1 statement
-- What about Frequency, Dataset ID and Release date, which are variables? Could they be referenced through the link to Represented Variable?
Referential Metadata Subject: Dataset: Stocks of specified dairy products
Referential Metadata Subject Item: Instance of Stocks of specified dairy products with ID 32-10-0001-01
Referential Metadata Content Item: Footnote1 actual statement: As of January 1988 Newfoundland and Labrador stocks are included.

I hope I am making sense and also answering your question.
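
Written out as plain data, the footnote-only mapping above might look like the following Python sketch. The dictionary keys and the `footnotes_for` helper are illustrative assumptions. The values are the ones given in the mapping.

```python
# Footnote-only mapping for the StatCan table, as plain data.
footnote_metadata = {
    "set": "Footnotes",
    "structure": "Table Footnote",
    "attribute": "Footnote1 statement",
    "subject": "Dataset",
    "subject_item": {
        "name": "Stocks of specified dairy products",
        "id": "32-10-0001-01",
    },
    "content_items": {
        "Footnote1": "As of January 1988 Newfoundland and Labrador "
                     "stocks are included.",
    },
}


def footnotes_for(metadata, dataset_id):
    """Return the footnote texts reported for the given data set ID."""
    if metadata["subject_item"]["id"] != dataset_id:
        return []
    return list(metadata["content_items"].values())


notes = footnotes_for(footnote_metadata, "32-10-0001-01")
```

Frequency, Dataset ID and Release date are deliberately left out here, since the mapping treats them as variables coming from the Data Structure rather than as referential metadata.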

InKyungChoi · 2022-10-27T11:21:45Z

Updated version based on meeting notes October 5 #27 (relationship between RM Attribute and IA is added)

Would this provide a reference mechanism? So, for example, a list of footnotes would be a Referential Metadata Structure, and each footnote would be a Referential Metadata Attribute that refers to a Data Point (which is an Identifiable Artefact) or a Represented Variable (which is also an IA).

FrancineK · 2022-11-16T01:28:37Z

Hopefully, this will finally answer the question!!

InKyungChoi · 2022-12-06T10:30:35Z

About Referential Metadata (RM) Subject Item in the example of the StatCan "Nursing and residential care facilities" table.

When the subject is a single Represented Variable, we can say the RM Subject is "Total Residents" (with the RM Attribute being a footnote, the RM Content Item being the particular footnote no. 5, and the RM Structure being a simple footnote). Can we then still say the RM Subject Item is "Total Residents", BUT used in the context of a particular data set? In that case, is the RM Subject Item in fact an Instance Variable?

This is because "Total Residents" would already exist as a Represented Variable, for example in a variable catalog. But when we use this variable in this particular table, "Nursing and residential care facilities", we attach this particular footnote no. 5 to the variable.

flo7894 · 2022-12-07T10:09:17Z

With the latest model there seems to be no link between a ReferentialMetadataSet and its content, which consists of ReferentialMetadataContentItems. Also, the composition relation between ReferentialMetadataContentItem and ReferentialMetadataAttribute seems odd. Shouldn't a ReferentialMetadataContentItem be viewed as an instance of a ReferentialMetadataAttribute in the context of a particular IdentifiableArtefact?

FrancineK · 2022-12-07T13:34:03Z

> With the latest model there seems to be no link between a ReferentialMetadataSet and its content, which consists of ReferentialMetadataContentItems. Also, the composition relation between ReferentialMetadataContentItem and ReferentialMetadataAttribute seems odd. Shouldn't a ReferentialMetadataContentItem be viewed as an instance of a ReferentialMetadataAttribute in the context of a particular IdentifiableArtefact?

You are right @flo7894, on both points. The first one was not obvious to me at first, but I think that linking ContentItem to Attribute as an instance is what was missing from the original model.
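
The two points raised by @flo7894 can be sketched in Python: the set holds its content items directly, and each content item points to the attribute it instantiates and to the Identifiable Artefact it describes. The class and field names (`instance_of`, `describes`, `values_for`) are assumptions for illustration, not the agreed model, and the footnote text is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class IdentifiableArtefact:
    identifier: str


@dataclass(frozen=True)
class ReferentialMetadataAttribute:
    name: str


@dataclass
class ReferentialMetadataContentItem:
    """An instance of an attribute in the context of one artefact."""
    instance_of: ReferentialMetadataAttribute
    describes: IdentifiableArtefact
    value: str


@dataclass
class ReferentialMetadataSet:
    """Holds its content items directly (the missing link)."""
    name: str
    content_items: List[ReferentialMetadataContentItem] = field(
        default_factory=list)

    def values_for(self, artefact: IdentifiableArtefact) -> List[str]:
        """Return the content values reported for one artefact."""
        return [item.value for item in self.content_items
                if item.describes == artefact]


footnote = ReferentialMetadataAttribute("Table footnote")
total_residents = IdentifiableArtefact("Total Residents")
table = IdentifiableArtefact("Nursing and residential care facilities")

footnotes = ReferentialMetadataSet("Footnotes")
footnotes.content_items.append(ReferentialMetadataContentItem(
    footnote, total_residents, "footnote no. 5 (placeholder text)"))
```

Here the same `footnote` attribute could be instantiated many times, once per artefact it annotates, which is the reading of "instance of" proposed above.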

InKyungChoi · 2023-01-31T09:30:03Z

Object	Definition	Explanatory Text
Referential Metadata Structure	structure of a Referential Metadata Set	A Referential Metadata Structure defines a structured list of Referential Metadata Attributes for a given Referential Metadata Subject. Examples of Referential Metadata Structure include structures for describing quality and methodology information (e.g., ESS Standard for Quality Reports Structure) or characteristics of registers, as well as the structure of documentation storing information necessary for internal dataset management (e.g., GDPR status, existence of information on minors).
Referential Metadata Set	organised collection of referential metadata for a given Referential Metadata Subject (Item??)	Each Referential Metadata Set uses a Referential Metadata Structure to define a structured list of Referential Metadata Attributes for a given Referential Metadata Subject.
Referential Metadata Attribute	characteristic that describes or qualifies Referential Metadata Subject	A Represented Variable can often be used to define a Referential Metadata Attribute (e.g., "Accuracy", "Timeliness", "Frequency" when describing quality information), but other GSIM classes can also play the role of Referential Metadata Attribute (e.g., Statistical Classification, Contact, Owner).
Referential Metadata Subject	subject for which referential metadata is reported	The Referential Metadata Subject identifies the subject of the metadata that can be reported using this Referential Metadata Structure. These subjects may be any GSIM class for which an organised set of metadata is needed, such as Statistical Program Cycle, Data Set, Questionnaire and Statistical Classification.
Referential Metadata Subject Item	actual subject for which referential metadata is reported	Examples are an actual Product such as "Balance of Payments and International Investment Position (Australia, June 2013)", or a collection of Data Points such as the Data Points for a single region within a Data Set covering all regions for a country.
Referential Metadata Content Item	actual content for Referential Metadata Attribute	Referential Metadata Content Item can take different formats (e.g., text, number, value from a predefined codelist, table)

Examples (to be used in Specification)

Object	For quality (ESS Standard Quality Report)
Referential Metadata Structure	Structure as specified in the Eurostat metadata content list (1. Contact, 1.1. Contact Organisation, 1.2. Contact unit, 2. Statistical Presentation, 3. Statistical Processing, etc.)
Referential Metadata Set	Structured quality report
Referential Metadata Attribute	Contact, Represented Variables (e.g., accuracy, timeliness), etc.
Referential Metadata Subject	Statistical Program Cycle
Referential Metadata Subject Item	Labour Force Survey (2021 Q1)
Referential Metadata Content Item	"Eurostat" for Contact, textual descriptions and coefficients for accuracy, timeliness, etc.

Object	For register
Referential Metadata Structure	1. Identification information, 2. Main objective, 3. Data source information, etc.
Referential Metadata Set	Structured description for a register
Referential Metadata Attribute	Maintainer, Data Provider, Represented Variables (e.g., frequency, data source type)
Referential Metadata Subject	Register
Referential Metadata Subject Item	"Integrated System of Statistical Registers"
Referential Metadata Content Item	"Tax authority" for Data Provider and textual descriptions for the frequency, data source type, etc.

Object	For data table (footnotes)
Referential Metadata Structure	Implicit (footnote 1, footnote 2, etc.)
Referential Metadata Set	Structured set of footnotes
Referential Metadata Attribute	Table footnote (can be represented by Represented Variable)
Referential Metadata Subject	Data Set, Represented Variable
Referential Metadata Subject Item	"Nursing facilities, total resident by annual (2020)" (for Data Set), "Total Resident" (for Represented Variable)
Referential Metadata Content Item	"the counts in this have been rounded .. to meet the confidentiality requirement" for footnote 1, "Total residents is calculated by ..." for footnote 2, etc.

Questions:

1. Does this work? (It is still not clear how to handle a situation where the Subject is a data table and certain footnotes (Attributes) apply to the entire data table while others apply to a Represented Variable inside the table.)
2. Shouldn't Referential Metadata Set be an "organised collection of referential metadata for a given Referential Metadata Subject Item" rather than for a "given Referential Metadata Subject"?

FlavioRizzolo · 2023-01-31T19:06:05Z

I haven't finished reviewing the whole thing, but I noticed that Referential Metadata Attribute is missing the optional "is defined by" association to Represented Variable. I think we still need that for two reasons: (i) the Referential Metadata Attribute parallels the Attribute Component in Data Structures, and (ii) in many cases we could use a Represented Variable, as your last example shows.

FrancineK · 2023-02-01T13:50:27Z

If Content Item is an instance of Attribute, is the parent-child relationship justified for Content Item?
Both "is an instance of" cardinalities seem to be reversed.

flo7894 · 2023-02-01T15:27:55Z

A ReferentialMetadataSubject refers to a GSIM class, e.g. Dataset, whereas a ReferentialMetadataSubjectItem refers to an instance of Dataset, e.g. "Nursing facilities, total resident by annual (2020)". The IdentifiableArtefact seems more likely to be the instance of Dataset. Maybe there should not be a "refers to" property between IdentifiableArtefact and ReferentialMetadataSubject?

> Questions:
>
> 1. Does this work? (still not clear how to do for a situation where Subject is a data table and certain footnotes (Attributes) are for the entire data table while others are for Represented Variable inside the table)

Regarding question 1, couldn't we have one ReferentialMetadataSet for the entire data table and other ReferentialMetadataSets for the RepresentedVariables, and then group them together with the DataSet in an InformationSet? The Product using the InformationSet would then get all the footnotes.

InKyungChoi · 2023-02-15T16:59:58Z

@flo7894

> A ReferentialMetadataSubject refers to a GSIM class, e.g. Dataset, whereas a ReferentialMetadataSubjectItem refers to an instance of Dataset, e.g. "Nursing facilities, total resident by annual (2020)". The IdentifiableArtefact seems more likely to be the instance of Dataset. Maybe there should not be a "refers to" property between IdentifiableArtefact and ReferentialMetadataSubject?

=> DataSet is a sub-type of IdentifiableArtefact, and many of the existing GSIM classes that can be a ReferentialMetadataSubject are sub-types of IdentifiableArtefact (e.g., StatisticalClassification, StatisticalProgram). So instead of listing all the classes, IdentifiableArtefact was used. I am now wondering whether this creates more confusion.

> Regarding question 1, couldn't we have one ReferentialMetadataSet for the entire data table and other ReferentialMetadataSets for the RepresentedVariables, and then group them together with the DataSet in an InformationSet? The Product using the InformationSet would then get all the footnotes.

=> this works for me!
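
The grouping suggested by @flo7894 might be sketched as below: one Referential Metadata Set for the whole table, one per Represented Variable, all bundled with the Data Set in an Information Set so that a Product built on it can collect every footnote. The classes are simplified, hypothetical stand-ins, and the footnote texts are placeholders rather than the real footnotes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReferentialMetadataSet:
    """Footnotes reported for one subject item (table or variable)."""
    subject_item: str
    footnotes: List[str]


@dataclass
class InformationSet:
    """Bundles a data set with all of its referential metadata sets."""
    data_set: str
    metadata_sets: List[ReferentialMetadataSet] = field(
        default_factory=list)

    def all_footnotes(self) -> List[Tuple[str, str]]:
        """Collect (subject item, footnote) pairs from every set."""
        return [(s.subject_item, note)
                for s in self.metadata_sets
                for note in s.footnotes]


table_name = "Nursing facilities, total resident by annual (2020)"
table_notes = ReferentialMetadataSet(
    table_name, ["footnote 1 (placeholder text)"])
variable_notes = ReferentialMetadataSet(
    "Total Resident", ["footnote 2 (placeholder text)"])

info = InformationSet(table_name, [table_notes, variable_notes])
collected = info.all_footnotes()
```

Because each footnote stays paired with its own subject item, a consumer can still tell table-level notes from variable-level ones after they are gathered.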

InKyungChoi · 2023-02-15T17:03:07Z

Updated:

FrancineK · 2023-02-22T14:50:35Z

Change the link from Subject to Identifiable Artefact so that it runs from Subject Item to Identifiable Artefact, and add an attribute to Subject to indicate that it can be any GSIM information class.
Change the cardinality of Subject to Subject Item from 1 - 0..* to 0..1 - 0..*
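
As a rough illustration of these two changes, the Python sketch below puts the link to Identifiable Artefact on Subject Item, gives Subject an attribute naming the GSIM class, and makes the Subject optional on a Subject Item (0..1) while a Subject can still have any number of items (0..*). All names, including `gsim_class`, `refers_to` and `items_for`, are hypothetical and are not taken from the EA UML model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IdentifiableArtefact:
    identifier: str


@dataclass(frozen=True)
class ReferentialMetadataSubject:
    # Assumed attribute: names any GSIM information class.
    gsim_class: str


@dataclass(frozen=True)
class ReferentialMetadataSubjectItem:
    # The link to Identifiable Artefact now sits on the item.
    refers_to: IdentifiableArtefact
    # 0..1: an item may exist without a declared subject.
    subject: Optional[ReferentialMetadataSubject] = None


def items_for(subject: ReferentialMetadataSubject,
              items: List[ReferentialMetadataSubjectItem]
              ) -> List[ReferentialMetadataSubjectItem]:
    """0..*: a subject may have no items, one item, or many."""
    return [i for i in items if i.subject == subject]


cycle = ReferentialMetadataSubject("Statistical Program Cycle")
lfs = ReferentialMetadataSubjectItem(
    IdentifiableArtefact("Labour Force Survey (2021 Q1)"), cycle)
no_subject = ReferentialMetadataSubjectItem(
    IdentifiableArtefact("Total Resident"))

matching = items_for(cycle, [lfs, no_subject])
```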

FlavioRizzolo · 2023-02-22T14:52:52Z

To be implemented in EA UML

FlavioRizzolo changed the title ~~Missing GSIM information objects – quality indicators~~ Missing GSIM class – quality indicators Feb 16, 2022

InKyungChoi mentioned this issue Aug 17, 2022

Information management datasets #12

Closed

FlavioRizzolo mentioned this issue Sep 9, 2022

Update UML EA model #23

Open

InKyungChoi mentioned this issue Oct 5, 2022

GSIM Specification (with implementation discussion) #10

Closed

InKyungChoi mentioned this issue Oct 13, 2022

GSIM Structure Group definition / explanatory text update #28

Closed

FlavioRizzolo added the V 2.0 label Jan 30, 2023

FlavioRizzolo mentioned this issue Jan 30, 2023

Explore common ways of describing and organizing data and metadata #26

Open

FlavioRizzolo closed this as completed Feb 22, 2023
