Feature: Separating core data and metadata within statements #465

Closed
lgs85 opened this issue Sep 22, 2022 · 4 comments
Comments

@lgs85
Contributor

lgs85 commented Sep 22, 2022

[This ticket helps track progress towards developing a particular feature in BODS where changes or revisions to the standard may be required. It should be placed on the BODS Feature Tracker, under the relevant status column.

See Feature development in BODS in the Handbook.

The title of this GitHub ticket should be 'Feature: XXXXX' where XXXXX is the feature name below. The information in this first post on the thread should be updated as necessary so that it holds up-to-date information. Comments on this ticket can be used to help track high-level work towards this feature or to refine this set of information.]

Feature name: Separating core data and metadata within statements

Feature background

Briefly describe the purpose of this feature

Beneficial ownership data, and BODS by extension, is structurally highly complex, with multiple statement types all rich in information and all of which can change over time. In order to ensure high stakeholder uptake of BODS, it is vital that the data model is conveyed as simply and intuitively as possible.

At its simplest level, beneficial ownership data conveys information about people and about entities, connected by information about ownership-or-control. For each of these, there is a core set of information describing the entity, person or ownership-or-control interest itself: for example names, identifiers and share percentages. But information about people, entities and ownership-or-control changes over time and is supplied from different sources. So, in BODS, there is necessarily a separate set of information about the statement itself (metadata), including when the statement was made, who is making it, and how the statement connects to previous statements.

Both the core information and metadata are necessary features of a BODS statement, but as of BODS v0.3 there is no encapsulation of these sets of information, conceptually or in the data model. Instead, core information about people, entities and ownership-or-control is interspersed with statement metadata within a JSON Statement object, and we have no way in documentation of referring to these different sets of information.
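To make the distinction concrete, here is a minimal Python sketch of what encapsulation could look like. It is illustrative only and is not the schema change proposed in this ticket: the set of keys treated as metadata, the wrapper keys `metadata` and `core`, and the sample values are all assumptions chosen for the example.

```python
# Illustrative partition only: which keys count as metadata is an assumption
# made for this sketch, not something defined by the BODS schema.
METADATA_KEYS = {
    "statementID",
    "statementDate",
    "source",
    "replacesStatements",
    "publicationDetails",
}


def split_statement(statement: dict) -> dict:
    """Return a new object with statement metadata and core data separated.

    The wrapper keys "metadata" and "core" are hypothetical names.
    """
    metadata = {k: v for k, v in statement.items() if k in METADATA_KEYS}
    core = {k: v for k, v in statement.items() if k not in METADATA_KEYS}
    return {"metadata": metadata, "core": core}


# A flat statement in which core data and metadata are interspersed.
# All values are made up for the example.
flat = {
    "statementID": "stmt-001",
    "statementType": "entityStatement",
    "statementDate": "2022-09-22",
    "name": "Example Holdings Ltd",
    "identifiers": [{"scheme": "EXAMPLE-REG", "id": "0001"}],
    "source": {"type": ["officialRegister"]},
}

nested = split_statement(flat)
```

In this sketch any key not listed as metadata falls into `core`, which is a simplification: a real proposal would need to decide explicitly where each field belongs.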

What user needs are met by introducing or developing this feature in BODS?

User stories include:

  • As a developer, I want to be able to clearly identify nodes (entities and persons) and edges (ownership-or-control interests) in BODS statements and separate these from metadata, so that I can easily convert ownership data into a graph database for my platform.

  • As a register analyst, I want to have a data standard with a logical and coherent structure, so that I can easily design data collection forms.

  • As a register developer/architect, I want to understand how our existing data model and fields map to BODS so that I can scope out the work involved in providing BODS-format exports.

  • As a researcher, I want to understand how changing information is represented in BODS so that I can interpret a BODS dataset correctly.
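The developer story above can be sketched roughly as follows. This is a hedged Python illustration, not a reference converter: the function name `to_graph` is hypothetical, the statements are reduced to the few fields needed, and the field names used (`statementType`, `subject.describedByEntityStatement`, `interestedParty.describedByPersonStatement` and so on) are assumed to follow the flat v0.3-style layout.

```python
def to_graph(statements):
    """Split a list of statements into graph nodes and edges.

    Nodes are entity and person statements; edges are ownership-or-control
    statements. Statement metadata is simply ignored here.
    """
    nodes = {}
    edges = []
    for s in statements:
        kind = s.get("statementType")
        if kind in ("entityStatement", "personStatement"):
            nodes[s["statementID"]] = kind
        elif kind == "ownershipOrControlStatement":
            party = s.get("interestedParty", {})
            # The interested party may be a person or another entity.
            owner = party.get("describedByPersonStatement") or party.get(
                "describedByEntityStatement"
            )
            edges.append(
                {
                    "from": owner,
                    "to": s["subject"]["describedByEntityStatement"],
                    "interests": s.get("interests", []),
                }
            )
    return nodes, edges


# Made-up sample data for the example.
statements = [
    {"statementID": "e1", "statementType": "entityStatement"},
    {"statementID": "p1", "statementType": "personStatement"},
    {
        "statementID": "o1",
        "statementType": "ownershipOrControlStatement",
        "subject": {"describedByEntityStatement": "e1"},
        "interestedParty": {"describedByPersonStatement": "p1"},
        "interests": [{"type": "shareholding"}],
    },
]

nodes, edges = to_graph(statements)
```

The converter has to know which fields to skip as metadata; a data model that encapsulates core data would let a consumer take the core object as-is.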

What impact would not meeting these needs have?

  • Implementers may publish non-standard data due to misunderstandings about the purpose of BODS metadata.

  • Developers may spend more time and resource on writing consuming applications for BODS data.

  • Implementers may decide not to publish using BODS due to the complex structure of the standard.

How urgent is it to meet the above needs?

Because distinguishing core data from metadata in BODS is likely to involve a relatively simple restructuring of existing fields, it can be implemented at any stage. However, it is also the case that complex new features may benefit substantially from a clearer BODS structure. It may therefore be sensible to implement this change sooner rather than later.

Are there any obvious problems, dependencies or challenges that any proposal to develop this feature would need to address?

  • Though this feature affects any other feature going forward, there are no real dependencies, as it is likely to involve fairly straightforward restructuring. As above, however, implementing this feature may bring clarity to more complex features such as change over time and representing declarations.

  • There is a challenge in deciding how best to distinguish core data from metadata for ownership-or-control statements. On the one hand, the subject and interested party in an ownership-or-control statement could be viewed as core data. On the other, one could argue that the interests themselves are the core information and the subject and interested party identities are part of the metadata. See this discussion.

Feature work tracking

This GitHub issue expands further on this topic.

Implementation proposal: #477

@lgs85
Contributor Author

lgs85 commented Jan 9, 2023

A key consideration for this proposal is how it interacts with the concept of declarations. In particular, if the concept of a declarationID is introduced to BODS, to represent sets of statements made about ownership of a given entity or group of entities, then the question arises of whether much of the metadata (e.g. source, publicationDetails) can be recorded at the declaration level. This would significantly reduce redundancy in BODS data and address many of the issues outlined above. However, such a change may involve substantial structural changes to the conceptual model and schema. As such, it needs to be worked through carefully.
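As a rough illustration of the redundancy point, the Python sketch below hoists metadata shared by every statement in a declaration up to the declaration level. It is a thought experiment under stated assumptions, not a proposed schema: `declarationID` is the concept floated in this comment, while the output shape, the `statements` key and the function name are hypothetical.

```python
from collections import defaultdict

# Metadata fields that might move to the declaration level (per the comment
# above). Treating exactly these two as shareable is an assumption.
SHARED_KEYS = ("source", "publicationDetails")


def hoist_declaration_metadata(statements):
    """Group statements by declarationID and lift identical metadata upward."""
    groups = defaultdict(list)
    for s in statements:
        groups[s["declarationID"]].append(s)

    declarations = []
    for declaration_id, group in groups.items():
        shared = {}
        for key in SHARED_KEYS:
            values = [s.get(key) for s in group]
            # Only hoist a field when every statement carries the same value.
            if values[0] is not None and all(v == values[0] for v in values):
                shared[key] = values[0]
        slim = [
            {k: v for k, v in s.items() if k not in shared and k != "declarationID"}
            for s in group
        ]
        declarations.append(
            {"declarationID": declaration_id, **shared, "statements": slim}
        )
    return declarations


# Made-up sample data: two statements repeating the same source.
sample = [
    {"statementID": "e1", "declarationID": "d1", "source": {"type": ["selfDeclaration"]}},
    {"statementID": "p1", "declarationID": "d1", "source": {"type": ["selfDeclaration"]}},
]

result = hoist_declaration_metadata(sample)
```

Here the repeated `source` appears once on the declaration instead of once per statement; fields that differ between statements stay where they are.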

@lgs85 lgs85 moved this from 📋 To Do - Additional Tasks to To Do - Features in Release tracker: BODS version 0.4 Jan 18, 2023
@lgs85 lgs85 added this to the BODS v0.4 release milestone Jan 18, 2023
@kd-ods
Collaborator

kd-ods commented Mar 10, 2023

Implementation proposal now available: #477

@kd-ods kd-ods moved this from To Do - Features to 🏗 In progress in Release tracker: BODS version 0.4 Dec 5, 2023
@kd-ods
Collaborator

kd-ods commented Jan 5, 2024

Implementation of this feature (along with two other features) is being tracked via ticket #487.

@kd-ods kd-ods moved this from 🏗 In progress to To Do - Features in Release tracker: BODS version 0.4 Mar 5, 2024
@github-project-automation github-project-automation bot moved this from To Do - Features to ✅ Done in Release tracker: BODS version 0.4 Apr 23, 2024
@kathryn-ods kathryn-ods moved this from ✅ Done to Added to Changelog in Release tracker: BODS version 0.4 May 2, 2024
@kathryn-ods kathryn-ods moved this from Added to Changelog to ✅ Done in Release tracker: BODS version 0.4 May 2, 2024
@kd-ods kd-ods moved this to Done in BODS Feature Tracker May 28, 2024
@kathryn-ods
Contributor

Noting that this feature is complete as of BODS 0.4. See https://standard.openownership.org/en/0.4.0/standard/index.html

Projects
Status: Done
Development

No branches or pull requests

3 participants