18 vocprovider paris broker in system layer #79

Merged
merged 23 commits into from
Mar 31, 2022
Changes from 16 commits
4247417
fix: reordering the system layer sub-sections
sebbader Jan 6, 2022
9315b3a
fix: reordering the system layer sub-sections
sebbader Jan 6, 2022
4f47d3e
fix: adding a first ParIS section to the system layer
sebbader Jan 6, 2022
df14951
fix: starting with the metadata broker in the system layer
sebbader Jan 6, 2022
cc35f3b
fix: general overview of the metadata broker in the system layer
sebbader Jan 6, 2022
2f77b7e
fix: general overview of the vocabulary hub in the system layer
sebbader Jan 7, 2022
3acb8b9
Update 3_5_4_Metadata_Broker.md
jlangkau Jan 18, 2022
d7da59d
fix: Introduction of system Layer
HeinrichPet Jan 21, 2022
006e1fe
Update 3_5_4_Metadata_Broker.md
HeinrichPet Jan 24, 2022
5dd0f42
Integrating the review comments for the ParIS subchapter of the Syste…
sebbader Jan 24, 2022
2781f58
Adding a reference to the ParIS subchapter in the Identity Provider d…
sebbader Jan 24, 2022
2a6f568
Integrating the comments for the Metadata Broker subchapter in the Sy…
sebbader Jan 24, 2022
b345a57
Moving the current ParIS sections to the other Layers.
sebbader Feb 2, 2022
49b2cfa
Update 3_5_4_Metadata_Broker.md
HeinrichPet Feb 25, 2022
34495f5
18 vocprovider paris broker in system layer (#151)
sebbader-sap Mar 11, 2022
094657f
Update 3_5_4_Metadata_Broker.md
HeinrichPet Mar 11, 2022
cf86306
chore: Integrate Feedback to 3_5_0_System_Layer.md
HeinrichPet Mar 14, 2022
9c93194
Merge remote-tracking branch 'remotes/origin/main' into 18-vocprovide…
HeinrichPet Mar 15, 2022
da13a6d
chore: fix typos
tmberthold Mar 17, 2022
63d2305
chore: update interaction image
juliapampus Mar 25, 2022
e5ad072
Update documentation/3_Layers_of_the_Reference_Architecture_Model/3_5…
HeinrichPet Mar 25, 2022
29b92fe
Apply suggestions from code review
HeinrichPet Mar 25, 2022
c9d119a
chore: IAM is not optional fpr ParIS
HeinrichPet Mar 25, 2022
Original file line number Diff line number Diff line change
@@ -1,42 +1,21 @@
# System Layer

On the System Layer, the roles specified on the Business Layer are mapped onto a concrete data and service architecture
in order to meet the requirements specified on the Functional Layer, resulting in what can be considered the technical core
of the International Data Spaces.
On the System Layer, the roles specified on the Business Layer and the processes defined in the Process Layer are mapped onto a concrete data and service architecture, resulting in what can be considered the technical core of the International Data Spaces.

From the requirements identified on the Functional Layer, three major technical components result:
- the Connector,
- the Broker, and
- the App Store.
The IDS consists of the following core components:
- the Identity Provider (consisting of DAPS and ParIS),
- the IDS Connector,
- the App Store,
- the Metadata Broker,
- the Clearing House, and
- the Vocabulary Hub.

How these components interact with each other is depicted
in Figure 3.31.

The Connector, the Broker, and the App Store are supported
by four additional components (which are not specific to the
International Data Spaces, but specified for the International
Data Spaces):

- the Identity Provider as defined in the Security
Perspective,
- the Vocabulary Hub currently as defined outside the IDS,
- the Update Repository (i.e. the source for updates of deployed Connectors) depending on the connectors technology, and
- the Trust Repository (i.e. the source for trustworthy software stacks and fingerprints as well as remote attestation checks) as discussed in the Security Perspective.
A distributed network like the International Data Spaces relies on the connection of different member nodes where IDS Connectors or other core components are hosted (an IDS Connector comprising one or more Data Endpoints). The IDS Connector is responsible for the exchange of data, or acts as a proxy in that exchange, as it executes the complete data exchange process (see Section 3.3.2) from and to the internal data resources and enterprise systems of the participating organizations and the International Data Spaces. It provides metadata to the Metadata Broker as specified in the IDS Connector Self-Description, e.g. technical interface description, authentication mechanism, exposed data sources, and associated data usage policies. It is important to note that the data is transferred between the IDS Connectors of the Data Provider and the Data Consumer (peer-to-peer network concept).
HeinrichPet marked this conversation as resolved.

A distributed network like the International Data Spaces relies on the connection of different member nodes where Connectors or other core components are hosted (a Connector comprising one or more Data Endpoints). The Connector is responsible for the exchange of data or as a proxy in the exchange of data, as it executes the complete data exchange process (see Section 3.3.2) from and to the internal data resources and enterprise systems of the participating organizations and the International Data Spaces. It provides metadata to the Broker as specified in the connector self-description, e.g. technical interface description, authentication mechanism, exposed data sources, and associated data usage policies. It is important to note that the data is transferred between the Connectors of the Data Provider and the Data Consumer (peer-to-peer network concept).
IDS Connectors execute the exchange of data between participants of the International Data Spaces. The International Data Spaces network is constituted by the total of its IDS Connectors. Each IDS Connector provides data via the Data Endpoints it exposes. Applying this principle, there is no need for a central instance for data storage. An IDS Connector is typically operated behind a firewall in a specially secured network segment of a participant (so-called “Demilitarized Zone”, DMZ). It should be possible to reach an IDS Connector using the standard Internet Protocol (IP), and to operate it in any appropriate environment. A participant may operate multiple IDS Connectors (e.g., to meet load balancing or data partitioning requirements). IDS Connectors can be operated on-premises or in a cloud environment.

There may be different types of implementations of the Connector, based on different technologies and depending on what specific functionality is required regarding the purpose of the Connector. Two fundamental variants are the Base Connector and the Trusted Connector (see Section 4.1) as they differ in the capabilities regarding security and data sovereignty.

Connectors can be further distinguished into External Connectors and Internal Connectors:
- An External Connector executes the exchange of data between participants of the International Data Spaces. The
International Data Spaces network is constituted by the total of its External Connectors. Each External Connector
provides data via the Data Endpoints it exposes. Applying this principle, there is no need for a central instance for
data storage. An External Connector is typically operated behind a firewall in a specially secured network segment
of a participant (so-called “Demilitarized Zone”, DMZ). From a DMZ, direct access to internal systems is not possible.
It should be possible to reach an External Connector using the standard Internet Protocol (IP), and to operate it
in any appropriate environment. A participant may operate multiple External Connectors (e.g., to meet load balancing
or data partitioning requirements). External Connectors can be operated on-premises or in a cloud environment.
- An Internal Connector is typically operated in an internal company network (i.e., a network which is not accessible
from outside). Implementations of Internal Connectors and External Connectors may be identical, as only the purpose
and configuration differ. The main task of an Internal Connector is to facilitate access to internal data sources in
order to provide data to External Connectors.
There may be different types of implementations of the IDS Connector, based on different technologies and depending on what specific functionality is required regarding the purpose of the Connector. Officially, we distinguish IDS Connectors according to their certification level defined in [section 4.2](../../4_Perspectives_of_the_Reference_Architecture_Model/4_2_Certification_Perspective/), which indicates, among other things, which security and data sovereignty criteria the Connector implements.
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
# Participant Information Service (ParIS)

From a System Layer view, the internal architecture components and endpoints of a ParIS are very similar to the ones of an IDS Metadata Broker. Both need to receive IDS Self-Descriptions, persist them, and make them available for other IDS Connectors to query. The main difference is the type of Self-Description they manage: Connectors and Resources in the case of Metadata Brokers, Participants in the case of the ParIS.
HeinrichPet marked this conversation as resolved.


## Components

A ParIS typically consists of the following functional building blocks, which can be implemented using different technology stacks and hosting solutions:

- _Server_ to host the IDS Endpoints.
- _Database_ to persist the RDF Self-Descriptions of the registered IDS Participants.
- _Index_ (optional) to increase the speed for read requests.
- _Website_ (optional) for human interactions with the ParIS.
- _IAM_ (optional) for checking the identity claims of clients and to validate their authorization using the IDS DAT. Can be located at the surrounding Identity Provider.
HeinrichPet marked this conversation as resolved.


## Endpoints

The interactions with a ParIS fall into two main categories. The first one is related to the initial provisioning of Participant information during their onboarding in an IDS, as well as the corresponding updates through the operators of the general Identity Provider. As this workflow is completely component-internal, proprietary or custom patterns might be used. This internal endpoint is necessary because Participant metadata requires a higher level of trust. For instance, an incorrect VAT-ID or jurisdiction has direct and concrete legal consequences; therefore, a certain validation workflow at the Identity Provider operator must be enabled.

In addition, an IDS compliant endpoint must be exposed for the communication with IDS Connectors. While this endpoint could also - given proper authentication and authorization procedures - serve the purpose described above, its main concern is to provide querying capabilities and to allow individual Participants to adjust their own Self-Description.


## Search and Querying

Each ParIS instance must provide IDS compliant functions to dereference Participant identifiers. A dereferencing function accepts the Participant identifier, an IRI according to the IDS Information Model, and returns the related Self-Description document. In addition, a ParIS may provide further search capabilities, like full-text search, attribute-based or facet search, or even expose an expressive query language like SPARQL. In any case, the respective capabilities must be outlined in the Self-Description of the ParIS itself, to make them discoverable for IDS Connectors.
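
The dereferencing function described above can be pictured as a plain lookup from IRI to document. The following is only a minimal sketch under stated assumptions: the in-memory store, the `example.org` IRI and the `ex:` property are hypothetical and not part of the IDS specification.

```python
from typing import Optional

# Hypothetical in-memory store: Participant IRI -> Self-Description (JSON-LD as a dict).
# The IRI and the "ex:" property are illustrative assumptions only.
PARTICIPANTS = {
    "https://example.org/participants/acme": {
        "@id": "https://example.org/participants/acme",
        "@type": "ids:Participant",
        "ex:legalName": "ACME GmbH",
    }
}


def dereference(participant_iri: str) -> Optional[dict]:
    """Return the Self-Description stored under the given IRI, or None if unknown."""
    return PARTICIPANTS.get(participant_iri)
```

A real ParIS would sit behind an IDS endpoint and check the requester's DAT before answering; the sketch leaves that out on purpose.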

Contributor

@clange Is there an Participant identifier IRI in the Information Model? I have not found it... And if yes, is it also part of a DAT? Because somehow the DAPS have to link a Connector to a Participant.

Contributor Author

@HeinrichPet the Participant identifier is part of the native RDF model, therefore it does not appear additionally in the model. But in the example the Participant IRI appears in the "@id" value.

Contributor Author

You have a good point about this IRI in the DAT... Unfortunately, I don't remember all details of the discussion we had about DAT attributes anymore. I think the chain is like 'IDS Key' --DAT-attribute-sub--> the DAT itself --DAT-attribute-referringConnector--> Connector SD --maintainer/operator-attribute--> Participant IRI.

Contributor Author

Now that I write it, I don't have a very good feeling anymore either. The Participant should be an explicit DAT attribute, otherwise it's not trustworthy...



## Life Cycle of Participant's Self-Description

Similar to Connector and Resource Self-Descriptions, Participant Self-Descriptions also go through different life cycle stages. The initial version is provided by the Participant itself, either directly as an IDS Information Model instance or as a filled form during the onboarding process. After the IDS identity of the new Participant has been created, this SD is stored at the corresponding ParIS.
HeinrichPet marked this conversation as resolved.

In case mistakes in this SD are noticed or attributes of the Participant change, both the operator of the Identity Provider as well as the Participant itself have the technical means to adjust the Self-Description. Note that the operator of the Identity Provider could also prohibit direct updates due to otherwise skipped validation workflows.

In case a Participant temporarily or completely leaves an IDS, the corresponding Self-Description can also be made unavailable. An unavailable SD is no longer exposed to the regular search and query functionalities. Nevertheless, the ParIS should still keep the SD or at least its identifier, to enable potential later reactivations and especially to prevent identity hijacking attempts. In such an attack, a newly onboarded Participant could try to use an identifier of another Participant that has already left the IDS, and thereby claim the access and usage permissions of the latter.
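
One way to read the hijacking safeguard above is that an identifier, once issued, stays reserved even while its Self-Description is unavailable. The sketch below illustrates that reading only; the class and method names are hypothetical and no IDS component prescribes this interface.

```python
class ParticipantRegistry:
    """Toy ParIS store that never forgets an identifier once it has been issued."""

    def __init__(self):
        # IRI -> {"sd": Self-Description dict, "available": bool}
        self._records = {}

    def onboard(self, iri, self_description):
        # Refuse reuse even if the earlier Participant has left the IDS.
        if iri in self._records:
            raise ValueError(f"identifier already issued: {iri}")
        self._records[iri] = {"sd": self_description, "available": True}

    def set_unavailable(self, iri):
        # The record is hidden, not deleted.
        self._records[iri]["available"] = False

    def reactivate(self, iri):
        self._records[iri]["available"] = True

    def search(self):
        """Return only the Self-Descriptions that are currently available."""
        return [r["sd"] for r in self._records.values() if r["available"]]
```

Because `set_unavailable` keeps the record, a later `onboard` call with the same IRI fails instead of silently handing the old identity to a new Participant.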


## Data Synchronization inside the Identity Provider

The core attributes of an IDS Participant, their IDS Key, UUID, and the IRI identifier, need to be maintained consistently between the different functional components of the Identity Provider. Apart from that, no further synchronization between different ParIS or Identity Provider instances is enforced.
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
# Identity Provider

## DAPS
## Dynamic Attribute Provisioning Service (DAPS)

## ParIS
## Participant Information Service (ParIS)

The [ParIS](./3_5_1_2_ParIS.md) is a vital part of the Identity Provider. While the CA is responsible for issuing and managing technical identity proofs and the DAPS provides time-dependent tokens, the ParIS provides business-related information about IDS Participants in machine- and human-readable form.

This file was deleted.

Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# Metadata Broker

The IDS Metadata Broker consists of an IDS Connector (see section [3.5.2.0](./3_5_2_0_Connector.md)<!--//**TODO** add correct link-->) and an endpoint for the registration, publication, maintenance and query of Self-Descriptions. Therefore, for any interaction with the IDS Metadata Broker, the processes defined on the Process Layer, the descriptions defined on the Information Layer and the descriptions defined on the System Layer can be applied. The Information Layer describes the message types for registration and query. An IDS Metadata Broker may provide additional services, which in turn must be described by using terms from the IDS Information Model in the respective Metadata Broker's Self-Description document.
HeinrichPet marked this conversation as resolved.

**Note: Even though the name might indicate a different purpose, an IDS Metadata Broker is *not* a message broker and does not provide any similar functions to actively distribute data assets by itself.**

As a direct consequence of the IDS Connector nature of the Metadata Broker, each instance must be compliant with the Connector Certification criteria and in particular provide the functionalities and endpoints of general Connectors. For instance, a Metadata Broker must provide a Self-Description that gives other IDS components further information about itself. A Metadata Broker must also have a valid IDS Identity and use a valid DAT in its communication.

In addition to these requirements for each IDS Connector, the Metadata Broker provides further functionalities for a data space. Its main purpose is the persistence of Self-Description documents and offering efficient access and search functions on their content. It therefore requires a reliable and scalable internal database. As the Self-Description documents are encoded in RDF, usually JSON-LD, a graph-oriented database like a triple store or a property graph database might be used. Nevertheless, traditional SQL or NoSQL databases may also be applied, which may not have the same native query support but can still be sufficient. In any case, the internal architecture of a Metadata Broker must be flexible enough to cope with extensions of the data schema. The IDS Information Model can always be enriched with further attributes, so a Metadata Broker must also allow the persistence and querying of information which was not yet known at its deployment time. Furthermore, Metadata Brokers operated for certain domains or dedicated data spaces might also enforce the existence of attributes that are not covered by the core IDS Information Model or part of the IDS namespace. That implies that certain Metadata Broker instances require Self-Descriptions whose information content goes beyond the IDS Information Model. In such cases, the additional requirements are outlined in the Metadata Broker Self-Description as well as in the content of the return messages, in case a Connector has not set such attributes yet.

Furthermore, a Metadata Broker implementation might add indexing or caching modules to reduce the query evaluation time. It can generally be expected that the number of READ requests is significantly higher than the overall number of remote WRITE activities, so a READ-optimized architecture can lead to a better user experience. Such design decisions, however, are the responsibility of the operator.

Additionally, most use cases for Metadata Brokers require a human-oriented interface to the Self-Descriptions. A website with full-text and facet search capabilities is therefore usually provided. The website might further support the creation and management of the locally stored Self-Descriptions. However, as the registration and updating process at the Metadata Broker is centered around Connectors, the authority of the human website user and the asset-hosting Connector must be ensured.
HeinrichPet marked this conversation as resolved.

## Endpoints

Metadata Brokers must provide remote endpoints to their own Self-Description (read-only) as well as to the locally persisted Self-Description graph (read/write for the hosting Connectors, read-only for the others). The server hosting these endpoints receives incoming requests, performs the necessary IDS identity and validity checks, and translates the requests into database operations.
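
The access rules above (read-only for the Broker's own Self-Description, read/write only for the hosting Connector, read-only for everyone else) can be sketched as a small check. This is an illustration under assumptions: the function name, the `owners` mapping and the Broker IRI are hypothetical, registration of entirely new Self-Descriptions is out of scope, and identity-based rejection of reads through IDS Usage Control is ignored.

```python
# Hypothetical IRI of the Broker's own Self-Description.
BROKER_SD = "https://broker.example.org/self-description"


def is_allowed(operation: str, requester: str, target: str, owners: dict) -> bool:
    """Decide whether `requester` may perform `operation` on the Self-Description `target`.

    `owners` maps a Self-Description IRI to the Connector that hosts it.
    """
    if operation == "read":
        # Reads are open in this sketch; a real Broker may still reject them.
        return True
    if operation == "write":
        if target == BROKER_SD:
            # The Broker's own Self-Description is read-only for remote clients.
            return False
        # Only the hosting Connector may change an existing entry.
        return owners.get(target) == requester
    return False
```

In a deployment, `requester` would be derived from a validated DAT rather than passed in as a plain string.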

A Metadata Broker might support endpoints for different IDS protocol bindings. In any case, the content of the responses is protocol-independent. That means a successful read operation using one binding must also be successful through any other if targeting the same Self-Description. A Metadata Broker may, however, discriminate based on the identity of the requester, providing responses to one Connector while rejecting another due to IDS Usage Control configurations.


## Search and Querying

The main purpose of a Metadata Broker is the provisioning of remote search functionalities. This can be done in a resource-oriented manner if the identifiers of the targeted Self-Descriptions are already known in advance. Alternatively, full-text or complex queries might be used. A complex query in this sense is any query that combines filters, uses aggregations, or traverses the Self-Description graph to search for information. Which query language is supported by which Metadata Broker instance is outlined in its own Self-Description.
HeinrichPet marked this conversation as resolved.

The IDS Information Model provides the schema for the searches. Querying Connectors can use their knowledge of the Information Model to formulate their inquiries. Metadata Brokers may also provide additional templates or preformulated queries to support the Connectors.
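
To make the difference between resource-oriented and full-text access concrete, the following sketch runs both against a tiny in-memory collection. It is an assumption-laden illustration: the sample documents, the `ex:title` property and both function names are hypothetical, and a production Broker would delegate such queries to its database or index.

```python
import json

# Hypothetical Self-Descriptions keyed by their identifier.
SELF_DESCRIPTIONS = {
    "https://example.org/connectors/a": {
        "@id": "https://example.org/connectors/a",
        "@type": "ids:BaseConnector",
        "ex:title": "Weather data connector",
    },
    "https://example.org/connectors/b": {
        "@id": "https://example.org/connectors/b",
        "@type": "ids:BaseConnector",
        "ex:title": "Traffic data connector",
    },
}


def by_identifier(iri):
    """Resource-oriented access: the identifier is known in advance."""
    return SELF_DESCRIPTIONS.get(iri)


def full_text(term):
    """Naive full-text search over the serialized documents (case-insensitive)."""
    needle = term.lower()
    return [
        sd for sd in SELF_DESCRIPTIONS.values()
        if needle in json.dumps(sd).lower()
    ]
```

Complex queries that traverse the graph, for example via SPARQL, need a real RDF store and are deliberately not attempted here.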


## Self-Description Life Cycle

Self-Descriptions go through a life cycle. Created Self-Descriptions are in the `active` state as long as they are not set to `unavailable` by their sovereign. It is important to note that the latter state is different from a deletion. It is also important to track the usage of Self-Descriptions, in particular their unique identifier, to avoid name clashes or false flag attacks. A Connector can therefore ask to no longer publicly provide a Self-Description by setting it to `unavailable`, but it cannot force the Metadata Broker or any other Connector to completely delete the information from its internal databases.

A Self-Description can be made `active` again at any time by the respective Connector. In addition, the Connector can overwrite already active Self-Descriptions with a new one. The update of a previously `unavailable` Self-Description, however, will set it back to `active` automatically. Furthermore, new Self-Descriptions must not use the identifier of already existing ones.
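
The life cycle rules of this section can be summarized as a small transition table. The sketch below is one possible reading, not a normative definition: the action names (`register`, `update`, `set_unavailable`, `set_active`) are hypothetical labels chosen for illustration.

```python
# (current state, action) -> next state. None means the identifier is not yet known.
TRANSITIONS = {
    (None, "register"): "active",
    ("active", "update"): "active",                 # overwrite with a new version
    ("active", "set_unavailable"): "unavailable",
    ("unavailable", "set_active"): "active",
    ("unavailable", "update"): "active",            # an update reactivates automatically
}


def next_state(state, action):
    """Return the state after `action`, or raise ValueError if it is not permitted."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"action {action!r} not permitted in state {state!r}")
```

There is intentionally no deletion action, and `register` is rejected for identifiers that already exist in either state.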

## Data Synchronization

The Metadata Broker is an optional component in a data space. That means that there can be data spaces that operate completely without any Metadata Broker. However, there can also be data spaces where several Metadata Broker instances are provided. In such use cases, the synchronization between these instances becomes a topic, in particular to avoid redundant or conflicting information.

At the current state, the IDS does not specify or recommend any technical synchronization mechanism or process. Data space operators may implement such processes, via peer-to-peer architectures, declaring a leading instance, or relying on Distributed Ledger approaches. As a consequence, a Connector communicating with several Metadata Brokers at the same time must not - without having additional information - assume that the Self-Descriptions of the various Metadata Brokers are aligned.
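
Since alignment between Brokers cannot be assumed, a Connector that talks to several of them may want to check for diverging copies itself. The following sketch shows one crude way to do so, under assumptions: the function names and Broker labels are hypothetical, and comparing canonically serialized JSON is only a proxy, because two JSON-LD documents can describe the same RDF graph while differing textually.

```python
import json


def fingerprint(self_description):
    """Serialize with sorted keys so that key order does not affect the comparison."""
    return json.dumps(self_description, sort_keys=True)


def find_conflicts(answers):
    """Return the identifiers whose copies differ, or are missing, at some Broker.

    `answers` maps a Broker label to a dict of {IRI: Self-Description}.
    """
    all_iris = set()
    for documents in answers.values():
        all_iris.update(documents.keys())

    conflicts = set()
    for iri in all_iris:
        versions = {
            fingerprint(documents[iri]) if iri in documents else None
            for documents in answers.values()
        }
        if len(versions) > 1:
            conflicts.add(iri)
    return conflicts
```

How a Connector resolves a detected conflict, for example by preferring a leading instance, is left to the data space operator, as the text above notes.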