[backend] performance issues with list runs API #9780

Closed
deepk2u opened this issue Jul 25, 2023 · 3 comments

@deepk2u
Contributor

deepk2u commented Jul 25, 2023

Environment

  • How did you deploy Kubeflow Pipelines (KFP)?
    full kubeflow deployment using manifests repo
  • KFP version:
    sdk-2.0.0b7
  • KFP SDK version:
    sdk-2.0.0b7

Steps to reproduce

  1. Create more than 200k runs in at least one namespace.
  2. Other namespaces can have any number of runs.
  3. Click the Runs tab in the Kubeflow UI; the request starts to time out.

I did some digging and found that when the number of runs is very high, the select query takes more than a minute.

select * from run_details where Namespace = 'namespace-with-200k-runs' limit 1;
This query took 1 minute 46 seconds.

I also tried querying through the Experiments tab, where the experiment ID is passed, and that query still performs as expected:
select * from run_details where ExperimentUUID = 'a0dd8afa-d481-4c83-b2c2-31ef4b3d12ec' limit 1;
This query takes milliseconds.
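
The difference between the two queries can be checked with the query planner. The following is a minimal sketch, not the real KFP schema: it uses Python's built-in sqlite3 with an in-memory table that mimics only a few columns of run_details and the existing experimentuuid_createatinsec index. The column types, the `query_plan` helper, and the sample values are assumptions, and SQLite's EXPLAIN QUERY PLAN stands in for MySQL's EXPLAIN.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's step descriptions for a query (hypothetical helper)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # column 3 holds the human-readable detail


conn = sqlite3.connect(":memory:")
# Simplified stand-in for run_details; column types are assumed.
conn.execute(
    """
    CREATE TABLE run_details (
        UUID TEXT PRIMARY KEY,
        ExperimentUUID TEXT NOT NULL,
        Namespace TEXT NOT NULL,
        CreatedAtInSec INTEGER NOT NULL
    )
    """
)
# Mirrors the index that already exists according to "show indexes".
conn.execute(
    "CREATE INDEX experimentuuid_createatinsec "
    "ON run_details (ExperimentUUID, CreatedAtInSec)"
)

by_namespace = query_plan(
    conn, "SELECT * FROM run_details WHERE Namespace = ? LIMIT 1", ("n1",)
)
by_experiment = query_plan(
    conn,
    "SELECT * FROM run_details WHERE ExperimentUUID = ? LIMIT 1",
    ("some-experiment",),
)

print("Namespace filter:     ", by_namespace)
print("ExperimentUUID filter:", by_experiment)
```

In this sketch the Namespace filter is reported as a table scan, while the ExperimentUUID filter is reported as an index search. On the real database, running EXPLAIN on the two statements above in the mysql client is the way to confirm the same thing.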

On a side note, 200k runs is a very high number, so I checked where they came from. Someone had created a few Recurring Runs with a cron schedule that ran every second. That was a mistake. I feel we should show a warning below the cron schedule box when the schedule fires every minute or every second, or administrators should have a mechanism to disallow these kinds of cron schedules.
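
The kind of check such a warning would need can be sketched as follows. This is an illustration only, not KFP code: it assumes a six-field cron expression with a leading seconds field (a five-field expression is treated as firing at second 0), handles only numeric fields with `*`, `?`, ranges, lists, and steps, and the names `expand_field`, `runs_per_hour`, `schedule_warning`, and the threshold of 60 runs per hour are all hypothetical.

```python
def expand_field(field, lo, hi):
    """Expand one numeric cron field into the set of values it matches."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step_n = int(step) if step else 1
        if rng in ("*", "?"):
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = int(rng)
            # "5/15" means "start at 5, then every 15"; a bare "5" is one value.
            end = hi if step else start
        values.update(range(start, end + 1, step_n))
    return values


def runs_per_hour(expr):
    """Upper bound on firings within any hour in which the schedule is active."""
    fields = expr.split()
    if len(fields) == 5:
        fields = ["0"] + fields  # no seconds field: assume second 0
    if len(fields) != 6:
        raise ValueError(f"expected 5 or 6 cron fields, got {len(fields)}")
    seconds = expand_field(fields[0], 0, 59)
    minutes = expand_field(fields[1], 0, 59)
    return len(seconds) * len(minutes)


def schedule_warning(expr, max_per_hour=60):
    """Return a warning string for very frequent schedules, else None."""
    rate = runs_per_hour(expr)
    if rate >= max_per_hour:
        return f"schedule may create up to {rate} runs per hour"
    return None


print(schedule_warning("* * * * * *"))   # every second
print(schedule_warning("0 * * * * *"))   # every minute
print(schedule_warning("0 0 9 * * *"))   # once a day at 09:00
```

A UI could show the returned string below the cron schedule box, and a server-side check could reject the schedule outright when an administrator has configured a limit.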

Expected result

Listing runs should not time out, irrespective of how much data we have in the database.

Materials and Reference

Below is a snapshot of the data we have in the run_details table.

mysql> select count(*) from run_details;
+----------+
| count(*) |
+----------+
|   227025 |
+----------+
1 row in set (1.91 sec)

mysql> SELECT Namespace,COUNT(*) as count FROM run_details GROUP BY Namespace ORDER BY count DESC;
+-----------+--------+
| Namespace | count  |
+-----------+--------+
| n1        | 219437 |
| n2        |   2032 |
| n3        |   1478 |
| n4        |   1090 |
| n5        |    384 |
| n6        |    367 |
| n7        |    285 |
| n8        |    283 |
.....
56 rows in set (1 min 45.73 sec)

To fix this query, I manually created an index on the Namespace column, and everything seems to be running fine for now.

mysql> show indexes from run_details;
+-------------+------------+-------------------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table       | Non_unique | Key_name                                  | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+-------------------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| run_details |          0 | PRIMARY                                   |            1 | UUID            | A         |      108032 |     NULL | NULL   |      | BTREE      |         |               |
| run_details |          1 | experimentuuid_createatinsec              |            1 | ExperimentUUID  | A         |         148 |     NULL | NULL   |      | BTREE      |         |               |
| run_details |          1 | experimentuuid_createatinsec              |            2 | CreatedAtInSec  | A         |      112366 |     NULL | NULL   |      | BTREE      |         |               |
| run_details |          1 | experimentuuid_conditions_finishedatinsec |            1 | ExperimentUUID  | A         |         150 |     NULL | NULL   |      | BTREE      |         |               |
| run_details |          1 | experimentuuid_conditions_finishedatinsec |            2 | Conditions      | A         |         254 |     NULL | NULL   |      | BTREE      |         |               |
| run_details |          1 | experimentuuid_conditions_finishedatinsec |            3 | FinishedAtInSec | A         |      112366 |     NULL | NULL   | YES  | BTREE      |         |               |
+-------------+------------+-------------------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
6 rows in set (0.00 sec)

mysql> CREATE INDEX namespace ON run_details (Namespace);
Query OK, 0 rows affected (8 min 52.07 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show indexes from run_details;
+-------------+------------+-------------------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table       | Non_unique | Key_name                                  | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+-------------------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| run_details |          0 | PRIMARY                                   |            1 | UUID            | A         |      108032 |     NULL | NULL   |      | BTREE      |         |               |
| run_details |          1 | experimentuuid_createatinsec              |            1 | ExperimentUUID  | A         |         148 |     NULL | NULL   |      | BTREE      |         |               |
| run_details |          1 | experimentuuid_createatinsec              |            2 | CreatedAtInSec  | A         |      112367 |     NULL | NULL   |      | BTREE      |         |               |
| run_details |          1 | experimentuuid_conditions_finishedatinsec |            1 | ExperimentUUID  | A         |         150 |     NULL | NULL   |      | BTREE      |         |               |
| run_details |          1 | experimentuuid_conditions_finishedatinsec |            2 | Conditions      | A         |         254 |     NULL | NULL   |      | BTREE      |         |               |
| run_details |          1 | experimentuuid_conditions_finishedatinsec |            3 | FinishedAtInSec | A         |      112367 |     NULL | NULL   | YES  | BTREE      |         |               |
| run_details |          1 | namespace                                 |            1 | Namespace       | A         |          55 |     NULL | NULL   |      | BTREE      |         |               |
+-------------+------------+-------------------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
7 rows in set (0.00 sec)
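
If the index were created automatically rather than by hand, the creation step would need to be safe to run more than once. The following is a minimal sketch of that idea under stated assumptions: it uses Python's built-in sqlite3 and an in-memory stand-in for run_details (simplified columns, assumed types), and `ensure_index` is a hypothetical helper, not a KFP function. It is not the change made in the backend.

```python
import sqlite3


def ensure_index(conn, table, index_name, columns):
    """Create the index only if it is missing; return True if it was created.

    Table, index, and column names are interpolated into SQL, so they must
    come from trusted code, never from user input.
    """
    existing = {row[1] for row in conn.execute(f"PRAGMA index_list({table})")}
    if index_name in existing:
        return False
    conn.execute(f"CREATE INDEX {index_name} ON {table} ({', '.join(columns)})")
    return True


conn = sqlite3.connect(":memory:")
# Simplified stand-in for run_details; column types are assumed.
conn.execute(
    """
    CREATE TABLE run_details (
        UUID TEXT PRIMARY KEY,
        ExperimentUUID TEXT NOT NULL,
        Namespace TEXT NOT NULL,
        CreatedAtInSec INTEGER NOT NULL
    )
    """
)

first = ensure_index(conn, "run_details", "namespace", ["Namespace"])
second = ensure_index(conn, "run_details", "namespace", ["Namespace"])

# After the index exists, the planner should search it instead of scanning.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM run_details WHERE Namespace = ? LIMIT 1",
    ("n1",),
).fetchall()
plan_detail = plan[0][3]

print(first, second)
print(plan_detail)
```

The first call creates the index and the second call is a no-op, which is the property a startup migration needs. On a large MySQL table the creation itself can still be slow, as the 8 min 52 sec above shows.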


Impacted by this bug? Give it a 👍.

@Linchin
Contributor

Linchin commented Jul 27, 2023

Hi @deepk2u, thank you for bringing this issue up and offering a solution! If you would like to create a PR to fix this problem, that would be great.

@tam0201

tam0201 commented Aug 1, 2023

We are also impacted by this issue; most of our namespaces contain more than 200k runs, and now we cannot view any runs in the UI. Hope there will be a fix soon!

@deepk2u
Contributor Author

deepk2u commented Aug 1, 2023

@Linchin @tam0201 I have tried to fix it in my PR #9806. Please take a look when you have time.

zijianjoy pushed a commit to zijianjoy/pipelines that referenced this issue Aug 10, 2023
…beflow#9806)

* Update client_manager.go

* Update client_manager.go
chensun pushed a commit that referenced this issue Aug 17, 2023
* Update client_manager.go

* Update client_manager.go
stijntratsaertit pushed a commit to stijntratsaertit/kfp that referenced this issue Feb 16, 2024
…beflow#9806)

* Update client_manager.go

* Update client_manager.go