15 review process layer and broker #76
… stating that the identity certificate is not necessarily the TLS certificate.
...mentation/3_Layers_of_the_Reference_Architecture_Model/3_3_Process_Layer/3_3_1_Onboarding.md
...tation/3_Layers_of_the_Reference_Architecture_Model/3_3_Process_Layer/3_3_2_Data_Offering.md
@sebbader Integration of domain-specific ontologies in the self-description, or generic key/value pairs, is missing for the Data Offering. We should also think about integrating the Vocabulary Hub here as an added value.
- **//TODO** Handful of sentences stating that the self-description has to be updated accordingly
- **//TODO** Not all metadata is made available to everyone. Usage Policy enforcement starts right here and shows everyone who requests the self-description only the data they could access.
As mentioned already, the Connector hosting the data asset is the sole applicable source of truth regarding the data asset's IDS Self-Description. This implies, in particular, that the hosting Connector, or more precisely the Participant controlling the Connector, can change the data asset as well as its Self-Description at any time. Even though it might be in its interest to establish a reputation as a reliable and trustworthy business partner, it might need to deploy updates without further notice. The Data Provider might want to inform certain other Connectors about changes but is not obliged to do so. It is also not necessary to supply older or outdated data assets or Self-Descriptions. Consequently, the existence of a suitable Self-Description document is not sufficient proof of the existence of the related data asset. A Data Consumer might want to request the latest version of the Self-Description also at the original Connector to be sure.
"It is also not necessary to supply older or outdated data assets or Self-Descriptions. A Data Consumer might want to request the latest version of the Self-Description also at the original Connector to be sure." says much the same as a sentence a few lines above: "Copied or otherwise differently located Self-Descriptions might be outdated or misleading, therefore a potential Data Consumer may want to double-check the correctness of a found Self-Description by also requesting a version directly from the original Connector."
The first warning above could be removed and its text included here, so that there is only one warning about outdated Self-Descriptions.
### Data Consumer searching for Self-Descriptions
To find a Data Provider, the Data Consumer may search in the catalogs of a Metadata Broker Service Provider. Before that, however, the Data Consumer needs to select a suitable Metadata Broker (e.g. based on thematic coverage) and determine the query capabilities (e.g. a graphical search interface or a domain-specific query language). The Metadata Broker then returns the query result to the Data Consumer, who needs to interpret the result to find out about the different data sources available in the International Data Spaces for providing the data specified in the query. Each query result must provide information about each IDS Connector capable of providing the desired data, so that the Data Consumer can retrieve each Connector’s self-description.
"Metadata Broker Service Provider": Within the blue swimlanes in the two attached graphics (vertical headlines), the word "Metadata" in front of "Broker Service Provider" is missing. Maybe update the graphics to match the text?
The first step in a typical data publication process is therefore the proper creation of a data asset Self-Description in JSON-LD. Usually, Connectors provide the technical means to create and maintain them through suitable GUIs. In any case, the created Self-Descriptions are then deployed at the Connector that also hosts the related data assets. This Connector is also the only applicable source of truth for metadata about the data assets. Copied or otherwise differently located Self-Descriptions might be outdated or misleading, therefore a potential Data Consumer may want to double-check the correctness of a found Self-Description by also requesting a version directly from the original Connector.
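To make the idea of a JSON-LD Self-Description more tangible, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `ex:` vocabulary, the property names, and the URLs are placeholders and are not terms of the IDS Information Model.

```python
import json

# Hypothetical vocabulary: the "ex:" terms are placeholders,
# not terms of the IDS Information Model.
self_description = {
    "@context": {"ex": "https://example.org/vocab#"},
    "@id": "https://provider.example/connector/assets/weather-2021",
    "@type": "ex:DataAsset",
    "ex:title": "Hourly weather observations",
    # Points back to the Connector that hosts the asset (the source of truth).
    "ex:hostedBy": {"@id": "https://provider.example/connector"},
    "ex:version": 1,
}


def serialize(description: dict) -> str:
    """Render the Self-Description as a JSON-LD document string."""
    return json.dumps(description, indent=2, sort_keys=True)


document = serialize(self_description)
```

The `ex:hostedBy` reference is what would let a reader of a copied document go back to the original Connector and request the current version.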
After reaching a syntactically and semantically correct Self-Description, the Data Provider may want to announce it in a data space. To do so, it sends the Self-Description to the responsible IDS infrastructure component, an IDS Metadata Broker. Locating available Metadata Broker instances and selecting the appropriate ones is the responsibility of each data space Participant and is not, for now, generally specified. The Metadata Broker then stores the received Self-Descriptions and makes them available for search requests from other Connectors. Potential Data Consumers can search through the stored Self-Descriptions, filter for relevant offers, and then, in the third step of the process, negotiate and request a data asset directly at the hosting Connector.
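The announce-and-search flow, and the reason a Data Consumer may want to double-check at the hosting Connector, can be sketched roughly as follows. This is an in-memory illustration only: the class names, fields, and the keyword search are assumptions and do not reflect an actual Metadata Broker interface.

```python
from dataclasses import dataclass, field


@dataclass
class SelfDescription:
    """Simplified stand-in for a JSON-LD Self-Description (all fields hypothetical)."""
    asset_id: str
    connector_url: str  # lets a Data Consumer reach the hosting Connector
    keywords: set = field(default_factory=set)
    version: int = 1


class MetadataBroker:
    """In-memory sketch: stores announced Self-Descriptions and answers keyword searches."""

    def __init__(self):
        self._catalog = {}

    def announce(self, description):
        # A later announcement for the same asset replaces the stored copy.
        self._catalog[description.asset_id] = description

    def search(self, keyword):
        # Every hit carries connector_url, so the consumer knows whom to contact.
        return [d for d in self._catalog.values() if keyword in d.keywords]


# The Data Provider announces version 1 ...
broker = MetadataBroker()
broker.announce(SelfDescription("weather-2021", "https://provider.example", {"weather"}, 1))

# ... and later changes the asset at its own Connector without telling the broker.
connector_copy = SelfDescription("weather-2021", "https://provider.example", {"weather"}, 2)

hit = broker.search("weather")[0]
is_stale = hit.version != connector_copy.version  # the broker copy is outdated
```

In this sketch the broker never learns about version 2, which is exactly the situation in which the copy found at the broker and the copy at the hosting Connector diverge.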
Another possible approach to find relevant offers in a data ecosystem is a federated catalog. This approach is based on a crawler architecture implementing a federated cache node and a federated cache crawler. Since the Data Provider provides its offers in its Self-Description and further data describing the contents can be requested from the Data Provider, another IDS Connector can cache the available data offerings by crawling the Data Providers. The Data Consumer can then query its own already available cache of all known data offerings. The data offerings cache at the Data Consumer must be updated by the crawler, either periodically or when triggered by an event. Depending on the size of the data space, the Data Consumer can also use multiple crawlers, which would allow large data spaces to be partitioned into crawler regions. For this approach, the Data Consumer's crawler requires an overview of all participants in the data ecosystem.
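A rough sketch of such a federated cache with a crawler, assuming a hypothetical `fetch_offers(url)` callable that stands in for requesting a Data Provider's Self-Description. The names, the round-robin partitioning, and the keyword query are illustrative choices, not part of any specification.

```python
class FederatedCache:
    """Sketch of a Data Consumer's offer cache that is filled by a crawler."""

    def __init__(self, fetch_offers):
        # fetch_offers(url) -> list of offer titles (hypothetical helper).
        self._fetch_offers = fetch_offers
        self.offers = {}        # connector URL -> offers seen at the last successful crawl
        self.last_crawled = {}  # connector URL -> time of the last successful crawl

    def crawl(self, participants, now):
        for url in participants:
            try:
                self.offers[url] = self._fetch_offers(url)
            except ConnectionError:
                continue  # keep any older cache entry; last_crawled shows its age
            self.last_crawled[url] = now

    def query(self, keyword):
        # Answered from the local cache only; no Data Provider is contacted here.
        return [(url, offer)
                for url, offers in self.offers.items()
                for offer in offers if keyword in offer]


def partition(participants, regions):
    """Split the participant list into crawler regions (round-robin)."""
    return [participants[i::regions] for i in range(regions)]
```

Each region returned by `partition` could be handed to its own crawler. The participant list itself has to come from somewhere else, which is the open point of this approach.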
@tmberthold We should include suggestions for how to get such an overview, for example via a Metadata Broker as above, or maybe through ParIS, @sebbader?
Further extension idea: If a central DAPS is used, it knows which Connectors have requested a DAT from it in the last, e.g., 2 days and therefore knows the active participants, given a current DAT validity of one hour. Couldn't a DAPS provide a simple list with the details of the referringConnector for which it issued a DAT within the last 2 days? The crawler could then prioritize crawling these active Connectors instead of crawling possibly inactive ones to the same extent, only to find out that nothing has changed because apparently no UpdateMessage was sent to a Metadata Broker either (much as Google prioritizes crawling webpages that often add content). This would also allow the crawling itself to be split up instead of "crawl all or none".
With built-in versioning in the crawled offerings cache, the crawling Connector could also display when the crawler last crawled data from another Connector (as Google does with webpages), so that it is visible that an entry may be outdated, or that nothing changed because the other Connector doesn't seem to be active at all.
If multiple DAPS are used, some of which my Connector may not know, one could build a link structure for crawling, like Google does by following links: the issuer (iss) in the DAT of an incoming message would introduce my Connector to a previously unknown DAPS, and my Connector could then ask that DAPS for the list of its active participants and crawl them. In the same way, the other participants of that DAPS would get to know my DAPS and could crawl me if they support this feature, and so on.
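To illustrate the prioritization part of this idea, here is a small sketch. It assumes the crawler somehow obtains, per Connector, the time of the latest issued DAT. The `last_dat_issued` mapping is purely hypothetical, and no existing DAPS interface is implied.

```python
from datetime import datetime, timedelta


def prioritize(participants, last_dat_issued, now, window=timedelta(days=2)):
    """Order crawl targets: recently active Connectors first, then the rest.

    last_dat_issued maps a Connector ID to the time of its latest DAT and is a
    hypothetical input; no existing DAPS interface is implied.
    """
    def is_active(connector):
        issued = last_dat_issued.get(connector)
        return issued is not None and now - issued <= window

    # Most recently active Connectors come first.
    active = sorted((c for c in participants if is_active(c)),
                    key=lambda c: last_dat_issued[c], reverse=True)
    # Inactive or unknown Connectors keep their original order at the end.
    inactive = [c for c in participants if not is_active(c)]
    return active + inactive
```

Inactive Connectors are still crawled, only later, so an offer that changed without any recent DAT request would eventually be picked up as well.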
Furthermore, any organization that wants to assume the role of Data Provider or Data Consumer has the option to configure custom access restrictions for bilateral communications. For instance, a Data Provider may want to block certain Connectors or participants from accessing their services, or it may require specific access credentials. Another factor is the secure platform that a Connector instance is deployed on. The platform provides major security features such as process isolation.
These configurations may be set up in the last step of the Security Setup sub-process (see section 4.1). **//TODO** insert link to Business Layer.
**//TODO** the Connector self-description must be correct and valid, which is ensured... Not by the DAT. Ensured by creating and signing it by a certified participant? We have this in the issues. Verifying the self-description is yet TBD. The DAT only brings some claims into the equation.
To enable secure communication, a Certification Authority issues a certificate to the Data Provider or Data Consumer. This certificate is deployed locally to enable Transport Layer Security (TLS) and identification of the respective IDS participant in combination with the Dynamic Attribute Token (IDS DAT). Note that the TLS certificate mentioned here and the previously mentioned IDS identity certificate may not be the same file. On top of that, the Connector Self-Description must be correct and valid, which is ensured by requesting and receiving a Dynamic Attribute Token from the Identity Provider (section 4.1). The token is a signed attestation that the security-critical information that the Connector states about itself has been verified and is actually true. The token is presented with each subsequent outgoing communication message of the Connector, so that the communicating Connectors also have a means to verify the trustworthiness of their communication partners at any time. It is important to understand that the DAT only supports the claims that are actually contained in the token itself. Additional attributes or descriptions that are only part of the Self-Description files, for instance Contract Offers, licenses, or endpoint descriptions, are not verified by any IDS Identity Provider.
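The distinction between attested and unattested attributes can be sketched as follows. The sketch assumes the DAT signature has already been verified and that `dat_claims` is the decoded token payload. The attribute names and values in the example are illustrative assumptions, not a normative claim set.

```python
def split_by_attestation(self_description, dat_claims):
    """Separate attributes backed by DAT claims from those that are not.

    Assumes the DAT signature was already verified and that dat_claims is
    the decoded token payload.
    """
    # Attested: the token contains the same attribute with the same value.
    attested = {k: v for k, v in self_description.items()
                if k in dat_claims and dat_claims[k] == v}
    # Everything else is only self-declared, including values that
    # contradict the token.
    unverified = {k: v for k, v in self_description.items() if k not in attested}
    return attested, unverified


# Illustrative values only; names are assumptions for this sketch.
claims = {"securityProfile": "base", "referringConnector": "https://provider.example"}
description = {
    "securityProfile": "base",
    "referringConnector": "https://provider.example",
    "contractOffer": "offer-42",
    "license": "CC-BY-4.0",
}
attested, unverified = split_by_attestation(description, claims)
```

Here `contractOffer` and `license` end up in `unverified`: they appear only in the Self-Description, so a receiving Connector would have to establish trust in them by other means.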
"This certificate is deployed locally to enable Transport Layer Security (TLS) and identification of the respective IDS participant...". That sounds like there is ALWAYS ONE certificate that does both, TLS AND identification of the IDS participant. But in the next sentence it is stated that there can be an SSL certificate and an IDS identity certificate.
Maybe this should be split up: "To enable secure communication an SSL certificate must be installed..." and then "To make the connector identify itself to others an IDS identity certificate must be installed..."
![Onboarding process](../../media/image22.png)
The following paragraphs describe each step of the onboarding process in more detail.
### ACQUIRE IDENTITY
Any organization that wants to operate a connector in order to exchange data in the International Data Spaces as a Data Provider or Data Consumer needs to acquire a unique identity in the form of a certificate (TODO: need to define participant identity further). In addition, to deploy and run a connector as part of the ecosystem, each connector needs to be provided with an identity certificate, issued by an accredited Certificate Authority. This certificate enables them to establish secure and trusted connections to other IDS participants' connectors (see section 3.1).
**//TODO** insert link to Business Layer, add link to connector identity
Any organization that wants to operate an IDS Connector in order to exchange data in the International Data Spaces as a Data Provider or Data Consumer needs to acquire a unique identity in the form of a digital certificate. This certificate enables them to establish secure and trusted connections to other IDS participants (see section 3.1). Please note that this identity certificate is by default not the same as the one necessary to encrypt the communication channel itself. Even though both may use the same standard (X.509), the purposes are different and therefore different certificate files can be used. It might even be a best practice to keep them separate to reduce the risk of identity theft, even though the IDS itself does not determine how to proceed.
"This certificate enables them to establish secure and trusted connections to other IDS participants" is maybe misleading, because the next sentence states that this certificate is not used to create secured SSL connections. So maybe a better description of this certificate would be "This certificate is used in communication with others to prove the identity of this organization and is therefore called the IDS identity certificate."
This PR is my first approach for #15 and #59, therefore I assign it to the creators of the tickets @ssteinbuss and @HeinrichPet