
SQL: Connector sends out 50 contract offers max. #1515

Closed
DominikPinsel opened this issue Jun 22, 2022 · 12 comments · Fixed by #1539
Labels
bug Something isn't working

Comments

@DominikPinsel
Contributor

Bug Report

Describe the Bug

When a connector requests a catalog from another connector, it only receives the first 50 contract offers, even if the provider has many more offers available.

Expected Behavior

  • all contract offers can be accessed by another connector

Observed Behavior

  • another connector can only access the first 50 contract offers

Steps to Reproduce

Steps to reproduce the behavior:

  1. Setup Connector with SQL persistence
  2. Setup the data so that the connector would generate more than 50 offers
  3. Request catalog with another connector
    -> catalog contains only 50 offers

Context Information

  • Requires SQL Extensions on offer provider side


Dominik Pinsel [email protected], Mercedes-Benz Tech Innovation GmbH, legal info/Impressum

@DominikPinsel DominikPinsel added the bug Something isn't working label Jun 22, 2022
@paullatzelsperger
Member

paullatzelsperger commented Jun 22, 2022

Verified with current upstream main and the current Postgres implementation that this does indeed work:

  • two connectors both running as native java process
  • each generates and publishes 1000 assets
  • invoke curl -H "x-api-key:<APIKEY>" "http://localhost:8182/api/v1/data/catalog?providerUrl=http://localhost:9292/api/v1/ids/data"
  • pipe to jq ".contractOffers | length" returns 1000
  • also verified this with the debugger.
  • postgres was running in docker

Since your bug description is rather sparse, I can only speculate as to how data was retrieved, what endpoints were called and with what parameters, so the only idea that comes to mind is that most Data Management API calls by default have a page size of 50, unless specified otherwise.
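To make the page-size point concrete, here is a minimal sketch of how a default limit of 50 caps a result set unless the caller overrides it. This is a hypothetical simplified model for illustration, not the actual EDC `QuerySpec` class.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Simplified model of a paged query: many Data Management API calls
// default to a page size of 50 unless limit/offset are specified.
// (Hypothetical sketch; not the actual EDC QuerySpec implementation.)
class PagedQuery {
    static final int DEFAULT_LIMIT = 50;
    final int offset;
    final int limit;

    PagedQuery() { this(0, DEFAULT_LIMIT); }
    PagedQuery(int offset, int limit) { this.offset = offset; this.limit = limit; }

    <T> List<T> apply(List<T> all) {
        return all.stream().skip(offset).limit(limit).collect(Collectors.toList());
    }
}

public class PageSizeDemo {
    public static void main(String[] args) {
        List<Integer> offers = IntStream.range(0, 1000).boxed().collect(Collectors.toList());
        // Default query: only the first 50 of 1000 offers come back.
        System.out.println(new PagedQuery().apply(offers).size());                     // 50
        // Explicit limit: the full collection comes back.
        System.out.println(new PagedQuery(0, Integer.MAX_VALUE).apply(offers).size()); // 1000
    }
}
```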

@DominikPinsel
Contributor Author

> Verified with current upstream main and the current Postgres implementation that this does indeed work:
>
> * two connectors both running as native java process
> * each generates and publishes 1000 assets
> * invoke `curl -H "x-api-key:<APIKEY>" "http://localhost:8182/api/v1/data/catalog?providerUrl=http://localhost:9292/api/v1/ids/data"`
> * pipe to `jq ".contractOffers | length"` returns 1000
> * also verified this with the debugger.
> * postgres was running in docker
>
> Since your bug description is rather sparse, I can only speculate as to how data was retrieved, what endpoints were called and with what parameters, so the only idea that comes to mind is that most Data Management API calls by default have a page size of 50, unless specified otherwise.

Thanks for trying this out. From your description I assume you did not use a control plane with SQL storage?
My current guess is that a default query limit of 50 contract definitions is causing the problem.

@paullatzelsperger
Member

> * postgres was running in docker

I did.
Again, without telling me what REST calls were made, I'm reading the tea leaves.

@DominikPinsel

This comment was marked as abuse.

@paullatzelsperger
Member

Obviously, that is not what I was referring to. I was referring to the exact call with which the catalog request was made.
Based on the path for policies it looks like you're not using an old version, so I will close this unless you can reproduce the issue with current upstream main and provide the proper information.

@DominikPinsel
Contributor Author

> Obviously, that is not what I was referring to. I was referring to the exact call with which the catalog request was made.
> Based on the path for policies it looks like you're not using an old version, so I will close this unless you can reproduce the issue with current upstream main and provide the proper information.

There are not that many ways to request a catalog. But here you go

`curl -G -X GET ${connectorUrl}/data/catalog -H "X-Api-Key: password" --data-urlencode "providerUrl=${targetConnectorUrl}/api/v1/ids/data" --header "Content-Type: application/json" -s`

@ndr-brt
Member

ndr-brt commented Jun 24, 2022

This issue seems to be located in the `SqlContractDefinitionStore`, where the `findAll` method initializes a `QuerySpec.none()` that by default has `limit = 50`, while, for example, the `SqlAssetIndex` in its `queryAssets` method defines `limit = Integer.MAX_VALUE`.
I think we need a consistent approach. I don't know whether paginated responses are handled in the IDS protocol or whether there's a defined limit on the response size, but I think something needs to be done here, so I will reopen this issue.
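The discrepancy between the two stores can be sketched side by side. This is a simplified illustration of the behavior described above, not the real EDC store classes:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Side-by-side sketch of the two defaults (simplified, hypothetical):
// one store effectively caps results at 50, the other does not.
public class DefaultLimitDemo {
    // Mirrors SqlContractDefinitionStore.findAll(): QuerySpec.none() -> limit 50
    static List<Integer> findAllContractDefinitions(Stream<Integer> rows) {
        return rows.limit(50).collect(Collectors.toList());
    }

    // Mirrors SqlAssetIndex.queryAssets(): limit = Integer.MAX_VALUE
    static List<Integer> queryAssets(Stream<Integer> rows) {
        return rows.limit(Integer.MAX_VALUE).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // With 1000 rows in the database, only the contract definition
        // query is capped, so the catalog is built from 50 definitions.
        System.out.println(findAllContractDefinitions(IntStream.range(0, 1000).boxed()).size()); // 50
        System.out.println(queryAssets(IntStream.range(0, 1000).boxed()).size());                // 1000
    }
}
```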

@ndr-brt ndr-brt reopened this Jun 24, 2022
@paullatzelsperger
Member

paullatzelsperger commented Jun 24, 2022

@ndr-brt you're right. The quick fix would be to change the SQL query to not use LIMIT and OFFSET.

Eventually IDS requests will all be pageable, but the above-mentioned mitigation would be an easy win. However, we might still run into I/O errors when assembling the response takes too long or the body size is too large:

SEVERE: An I/O error has occurred while writing a response message entity to the container output stream.

[edit]: the reason it worked for me previously is that due to a typo in my build file the in-memory contract def store was indeed used.
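A minimal sketch of the quick fix mentioned above: only append LIMIT/OFFSET when the caller explicitly asked for a page, instead of always applying a default of 50. The helper and table name here are hypothetical, chosen for illustration; this is not actual EDC code.

```java
// Hedged sketch: build the SELECT statement without LIMIT/OFFSET unless
// a page was explicitly requested. (Hypothetical helper and table name.)
public class QueryBuilder {
    static String selectContractDefinitions(Integer limit, Integer offset) {
        StringBuilder sql = new StringBuilder("SELECT * FROM edc_contract_definitions");
        if (limit != null) {
            sql.append(" LIMIT ").append(limit);
        }
        if (offset != null) {
            sql.append(" OFFSET ").append(offset);
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // No paging requested: the full result set is returned.
        System.out.println(selectContractDefinitions(null, null));
        // Explicit paging: LIMIT/OFFSET are appended.
        System.out.println(selectContractDefinitions(50, 0));
    }
}
```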

@domreuter
Contributor

I am using the in-memory stores and experience the same behaviour. There could also be a limit in the database queries, but I think it is more a thing related to the API. The "catalog" call is not able to use pagination right now, unlike all the other endpoints.

@ndr-brt @paullatzelsperger Could it be that QuerySpec also influences the catalog call in general?

@paullatzelsperger
Member

paullatzelsperger commented Jun 24, 2022

@domreuter don't think so, this was specific to the SQL implementation. InMemory really is just a map, not an in-mem database (such as Derby or H2).

In order to have "real" pagination for the /catalog endpoint, we'll need pagination in the IDS portion as well, otherwise we're not gaining anything.

@domreuter
Contributor

Ok, but is there a reason why the catalog call does not honor the query parameters? I experienced exactly the same problems you described earlier. Timeouts and a growing response body will cause problems once we talk about a realistic number of contract offers.

So we have a problem with the IDS calls, which do not support pagination, on the one side (Connector <-> Connector), and with the catalog API on the other side (Connector Data Management API <-> e.g. EDC managing application).

Same problem, different locations; at least from the API perspective that should not be a big deal.

Putting this here as it is somewhat related, but maybe worth a separate discussion?

@paullatzelsperger
Member

paullatzelsperger commented Jun 24, 2022

@domreuter my point exactly. All the /catalog endpoint does is forward the request over IDS to the provider connector. Unless IDS supports pagination, there is no point in adding pagination to the REST API, because it would not change the IDS communication: paginating the request would still cause IDS to transfer the full collection of offers, only to "chop off" part of it later.

We have that issue on our roadmap and will probably work on it in Milestone 6.
