Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Whole system search with _type parameter and global params should be much faster #3232

Closed
lmsurpre opened this issue Jan 21, 2022 · 4 comments
Assignees
Labels
bug Something isn't working performance performance

Comments

@lmsurpre
Copy link
Member

lmsurpre commented Jan 21, 2022

Describe the bug
a whole-system search like this:

runQueryTest(Resource.class, "_type", "Basic,EvidenceVariable,ServiceRequest")

results in queries like these.

count:

      SELECT COUNT(*) AS CNT 
        FROM LOGICAL_RESOURCES AS LR0
       WHERE LR0.IS_DELETED = 'N'
         AND LR0.RESOURCE_TYPE_ID IN (8,52,126)
         AND EXISTS (
      SELECT 1 
        FROM LOGICAL_RESOURCES AS LR1
       WHERE LR1.LOGICAL_RESOURCE_ID = LR0.LOGICAL_RESOURCE_ID)

fetch ids:

      SELECT LR0.RESOURCE_TYPE_ID, LR0.LOGICAL_RESOURCE_ID 
        FROM LOGICAL_RESOURCES AS LR0
       WHERE LR0.IS_DELETED = 'N'
         AND LR0.RESOURCE_TYPE_ID IN (8,52,126)
         AND EXISTS (
      SELECT 1 
        FROM LOGICAL_RESOURCES AS LR1
       WHERE LR1.LOGICAL_RESOURCE_ID = LR0.LOGICAL_RESOURCE_ID)
    ORDER BY LR0.LOGICAL_RESOURCE_ID
 FETCH FIRST 10 ROWS ONLY

fetch resources:

      SELECT RESOURCE_ID, LOGICAL_RESOURCE_ID, VERSION_ID, LAST_UPDATED, IS_DELETED, DATA, LOGICAL_ID, RESOURCE_PAYLOAD_KEY, RESOURCE_TYPE_ID 
        FROM (
      SELECT R.RESOURCE_ID, R.LOGICAL_RESOURCE_ID, R.VERSION_ID, R.LAST_UPDATED, R.IS_DELETED, R.DATA, LR.LOGICAL_ID, R.RESOURCE_PAYLOAD_KEY, LR.RESOURCE_TYPE_ID AS RESOURCE_TYPE_ID 
        FROM (
      SELECT LR.LOGICAL_RESOURCE_ID, LR.LOGICAL_ID, LR.CURRENT_RESOURCE_ID, 8 AS RESOURCE_TYPE_ID 
        FROM Basic_LOGICAL_RESOURCES AS LR
       WHERE LR.IS_DELETED = 'N'
         AND LR.LOGICAL_RESOURCE_ID IN (30002)) AS LR 
  INNER JOIN Basic_RESOURCES AS R ON LR.CURRENT_RESOURCE_ID = R.RESOURCE_ID) AS COMBINED_RESULTS
    ORDER BY COMBINED_RESULTS.LOGICAL_RESOURCE_ID

The fetch ids query in particular is very slow when you have a lot of resources.
This is because the inner select grabs every row in the table and then the outer select filters it down to the (possibly very small) subset of resources.

Environment
Which version of IBM FHIR Server?

To Reproduce
Steps to reproduce the behavior:

  1. load a lot of resources
  2. do a whole-system search with a _type parameter

Expected behavior
The search should return in less than 2 seconds.

Additional context
In our testing, adding AND LR0.RESOURCE_TYPE_ID IN (8,52,126) to the inner select takes this query from >20 seconds down to ~100 ms.

@lmsurpre lmsurpre added bug Something isn't working performance performance labels Jan 21, 2022
@lmsurpre lmsurpre self-assigned this Jan 21, 2022
lmsurpre added a commit that referenced this issue Jan 21, 2022
lmsurpre added a commit that referenced this issue Jan 21, 2022
@lmsurpre
Copy link
Member Author

for QA: ensure coverage for this type of whole-system search is in our test performance regression suite (fhir-meter). ensure the performance is at least as good (hopefully much better)

@d0roppe
Copy link
Collaborator

d0roppe commented Feb 3, 2022

ran this on 4.10.2 and it generated the following SQL

Feb 3 07:25:26 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE Got the connection for [default/default]. Took 4 ms
Feb 3 07:25:26 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE bind marker count: 0
Feb 3 07:25:26 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE      query string: 
      SELECT COUNT(*) AS CNT 
        FROM LOGICAL_RESOURCES AS LR0
       WHERE LR0.IS_DELETED = 'N'
         AND LR0.RESOURCE_TYPE_ID IN (104,54,19)
         AND EXISTS (
      SELECT 1 
        FROM LOGICAL_RESOURCES AS LR1
       WHERE LR1.LOGICAL_RESOURCE_ID = LR0.LOGICAL_RESOURCE_ID)
Feb 3 07:25:27 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE No current transaction. Starting new transaction on current thread.
Feb 3 07:25:27 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE Getting connection for tenantId/dsId: [default/default]...
Feb 3 07:25:28 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE Configuring new connection in this transaction. Key='com.ibm.fhir.persistence.jdbc.connection.FHIRDbTenantDatasourceConnectionStrategy/default/default'
Feb 3 07:25:28 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE Got the connection for [default/default]. Took 217 ms
Feb 3 07:25:28 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server WARNING Deployment update required: database schema version [24] is newer than code schema version [22]
Feb 3 07:25:28 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE Committing transaction on current thread...
Feb 3 07:25:33 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE Successfully retrieved count; count=2731253 [took 6695.140562 ms]
Feb 3 07:25:33 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE bind marker count: 0
Feb 3 07:25:33 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE      query string: 
      SELECT LR0.RESOURCE_TYPE_ID, LR0.LOGICAL_RESOURCE_ID 
        FROM LOGICAL_RESOURCES AS LR0
       WHERE LR0.IS_DELETED = 'N'
         AND LR0.RESOURCE_TYPE_ID IN (104,54,19)
         AND EXISTS (
      SELECT 1 
        FROM LOGICAL_RESOURCES AS LR1
       WHERE LR1.LOGICAL_RESOURCE_ID = LR0.LOGICAL_RESOURCE_ID)
    ORDER BY LR0.LOGICAL_RESOURCE_ID
 LIMIT 10
Feb 3 07:25:33 test-ibm-fhir-server-979968b87-pfjlk ibm-fhir-server FINE Successfully retrieved logical resource Ids [took 16.777616 ms]

Then Ran on latest and this was the SQL

Feb 3 07:36:31 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Got the connection for [default/default]. Took 5 ms
Feb 3 07:36:31 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE bind marker count: 0
Feb 3 07:36:31 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE      query string: 
      SELECT COUNT(*) AS CNT 
        FROM LOGICAL_RESOURCES AS LR0
       WHERE LR0.IS_DELETED = 'N'
         AND LR0.RESOURCE_TYPE_ID IN (104,54,19)
         AND EXISTS (
      SELECT 1 
        FROM LOGICAL_RESOURCES AS LR1
       WHERE LR1.LOGICAL_RESOURCE_ID = LR0.LOGICAL_RESOURCE_ID
         AND LR0.RESOURCE_TYPE_ID IN (104,54,19))
Feb 3 07:36:39 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE No current transaction. Starting new transaction on current thread.
Feb 3 07:36:39 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Getting connection for tenantId/dsId: [default/default]...
Feb 3 07:36:39 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Configuring new connection in this transaction. Key='com.ibm.fhir.persistence.jdbc.connection.FHIRDbTenantDatasourceConnectionStrategy/default/default'
Feb 3 07:36:39 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Got the connection for [default/default]. Took 137 ms
Feb 3 07:36:39 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Committing transaction on current thread...
Feb 3 07:36:49 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE No current transaction. Starting new transaction on current thread.
Feb 3 07:36:49 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Getting connection for tenantId/dsId: [default/default]...
Feb 3 07:36:49 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Configuring new connection in this transaction. Key='com.ibm.fhir.persistence.jdbc.connection.FHIRDbTenantDatasourceConnectionStrategy/default/default'
Feb 3 07:36:49 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Got the connection for [default/default]. Took 7 ms
Feb 3 07:36:49 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Committing transaction on current thread...
Feb 3 07:36:59 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE No current transaction. Starting new transaction on current thread.
Feb 3 07:36:59 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Getting connection for tenantId/dsId: [default/default]...
Feb 3 07:36:59 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Configuring new connection in this transaction. Key='com.ibm.fhir.persistence.jdbc.connection.FHIRDbTenantDatasourceConnectionStrategy/default/default'
Feb 3 07:36:59 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Got the connection for [default/default]. Took 7 ms
Feb 3 07:36:59 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Committing transaction on current thread...
Feb 3 07:37:04 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Successfully retrieved count; count=2731253 [took 32743.452897 ms]
Feb 3 07:37:04 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Successfully retrieved count; count=2731253 [took 82535.136897 ms]
Feb 3 07:37:04 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE bind marker count: 0
Feb 3 07:37:04 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE bind marker count: 0
Feb 3 07:37:04 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE      query string: 
      SELECT LR0.RESOURCE_TYPE_ID, LR0.LOGICAL_RESOURCE_ID 
        FROM LOGICAL_RESOURCES AS LR0
       WHERE LR0.IS_DELETED = 'N'
         AND LR0.RESOURCE_TYPE_ID IN (104,54,19)
         AND EXISTS (
      SELECT 1 
        FROM LOGICAL_RESOURCES AS LR1
       WHERE LR1.LOGICAL_RESOURCE_ID = LR0.LOGICAL_RESOURCE_ID
         AND LR0.RESOURCE_TYPE_ID IN (104,54,19))
    ORDER BY LR0.LOGICAL_RESOURCE_ID
 LIMIT 10
Feb 3 07:37:04 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE      query string: 
      SELECT LR0.RESOURCE_TYPE_ID, LR0.LOGICAL_RESOURCE_ID 
        FROM LOGICAL_RESOURCES AS LR0
       WHERE LR0.IS_DELETED = 'N'
         AND LR0.RESOURCE_TYPE_ID IN (104,54,19)
         AND EXISTS (
      SELECT 1 
        FROM LOGICAL_RESOURCES AS LR1
       WHERE LR1.LOGICAL_RESOURCE_ID = LR0.LOGICAL_RESOURCE_ID
         AND LR0.RESOURCE_TYPE_ID IN (104,54,19))
    ORDER BY LR0.LOGICAL_RESOURCE_ID
 LIMIT 10
Feb 3 07:37:04 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Successfully retrieved logical resource Ids [took 28.130812 ms]
Feb 3 07:37:04 test-ibm-fhir-server-765f94f767-nl9bm ibm-fhir-server FINE Successfully retrieved logical resource Ids [took 23.54674 ms]

Reopening issue

@lmsurpre
Copy link
Member Author

the interesting thing about the findings from dag is that its actually the count query that is being slow here. when robin and I were looking at the initial issue, it was the "fetch ids" step (the 2nd query out of the 3) that was being slow

lmsurpre added a commit that referenced this issue Feb 10, 2022
Previously, we were adding this extra IN clause for the _type parameter
in all cases.
In the case of a whole-system search *count* query, this was slowing
things down a bit. We still add the IN clause to the query where it
seemed to help, but now we avoid that for the count query.

Signed-off-by: Lee Surprenant <[email protected]>
lmsurpre added a commit that referenced this issue Feb 10, 2022
Previously, we were adding this extra IN clause for the _type parameter
in all cases.
In the case of a whole-system search *count* query, this was slowing
things down a bit. We still add the IN clause to the query where it
seemed to help, but now we avoid that for the count query.

Signed-off-by: Lee Surprenant <[email protected]>
lmsurpre added a commit that referenced this issue Feb 10, 2022
Previously, we were adding this extra IN clause for the _type parameter
in all cases.
In the case of a whole-system search *count* query, this was slowing
things down a bit. We still add the IN clause to the query where it
seemed to help, but now we avoid that for the count query.

Signed-off-by: Lee Surprenant <[email protected]>
lmsurpre added a commit that referenced this issue Feb 10, 2022
Previously, we were adding this extra IN clause for the _type parameter
in all cases.
In the case of a whole-system search *count* query, this was slowing
things down a bit. We still add the IN clause to the query where it
seemed to help, but now we avoid that for the count query.

Signed-off-by: Lee Surprenant <[email protected]>
lmsurpre added a commit that referenced this issue Feb 15, 2022
issue #3232- add the extra IN clause more selectively
@d0roppe
Copy link
Collaborator

d0roppe commented Feb 25, 2022

The query now runs 20% faster than it did before on a DB with over 15 million resources. Closing issue.

@d0roppe d0roppe closed this as completed Feb 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working performance performance
Projects
None yet
Development

No branches or pull requests

2 participants