[frontend] Latency when listing runs #10778
Comments
I saw your comment in #10230, but I believe those are different issues. Although the old issue says it's a frontend issue, we discovered that it was a database issue, and by creating the indexes we were able to improve performance. Now the challenge is to figure out what's going on in the frontend that makes so many repetitive requests. Thank you for describing the steps to reproduce; I'll follow them and see if there's something we can do to propose a workaround until we have the actual fix.
Thank you, @rimolive. We will continue investigating workarounds on our end as well.
Found in the KFP Pipeline logs. Seems very relevant. Continuing to investigate but want to share now in case it's helpful for anyone else who is debugging:
I think we just need to adjust the QPS and burst when instantiating the K8s client for the KFP API. Will validate the fix and update here.
After investigating more thoroughly with @tarat44, it seems like modifying the QPS is probably not the optimal solution. We're going to try updating the list runs page to make a single call to https://www.kubeflow.org/docs/components/pipelines/v1/reference/api/kubeflow-pipeline-api-spec/#operation--apis-v1beta1-experiments-get instead of individual calls for each experiment ID, and then parse the larger response for reduced latency. Please let us know if this seems reasonable. Thanks!
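For illustration, a minimal sketch of that approach, assuming a plain fetch against the v1beta1 list endpoint; the helper and its name are hypothetical rather than the actual KFP frontend code, and the response fields follow the v1beta1 spec linked above:

```typescript
// Hypothetical helper: one call to the list endpoint, then index the
// experiments by ID so each run row can resolve its experiment locally
// instead of issuing its own GET.
interface Experiment {
  id: string;
  name: string;
}

async function fetchExperimentMap(baseUrl: string): Promise<Map<string, Experiment>> {
  const response = await fetch(`${baseUrl}/apis/v1beta1/experiments?page_size=100`);
  if (!response.ok) {
    throw new Error(`Listing experiments failed with status ${response.status}`);
  }
  const body = await response.json();
  const experiments: Experiment[] = body.experiments ?? [];

  const byId = new Map<string, Experiment>();
  for (const experiment of experiments) {
    byId.set(experiment.id, experiment);
  }
  return byId;
}
```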
@droctothorpe that seems reasonable. Are there any other places making repeated requests that could be either cached or combined? Also, I guess we could have the backend cache the result of the subject access review for a few seconds.
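On the backend caching idea: the KFP API server is written in Go, so the snippet below is only a language-agnostic sketch (shown in TypeScript for consistency with the other examples here) of what a short-lived cache for an authorization decision could look like; the key shape and the five-second TTL are assumptions, not anything from the KFP codebase:

```typescript
// Illustrative time-bounded cache: identical access checks made within the
// TTL reuse the previous decision instead of running a new review.
interface CachedDecision {
  allowed: boolean;
  expiresAt: number; // epoch milliseconds
}

const decisionCache = new Map<string, CachedDecision>();
const TTL_MS = 5_000; // "a few seconds", per the suggestion above

function cachedAccessReview(key: string, check: () => boolean): boolean {
  const now = Date.now();
  const hit = decisionCache.get(key);
  if (hit && hit.expiresAt > now) {
    return hit.allowed;
  }
  const allowed = check();
  decisionCache.set(key, { allowed, expiresAt: now + TTL_MS });
  return allowed;
}

// Example (hypothetical key shape combining user, verb, and namespace):
// cachedAccessReview(`${user}/${verb}/${namespace}`, runSubjectAccessReview);
```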
Thanks, @thesuperzapper. Just submitted a preliminary PR. Still have to update tests, but waiting for design review before doing so. Going through the frontend code, it definitely seemed like there may be other opportunities to reduce latency. Here is another example where …
Environment
Deployment: using the AWS helm chart.
KFP version: 2.0.5
Steps to reproduce
List runs in the UI and observe the repeated requests to
https://<domain>/pipeline/apis/v2beta1/experiments/<experiment id>.

Expected result
In an ideal world, the frontend would cache the experiment and not have to make the same request 10 times. In addition, even without caching, 10 experiment API requests probably shouldn't result in this much latency. Also, it might be a red herring, but the experiment requests all include a Cache-Control: no-cache header. cc @owmasch
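As a rough sketch of that caching behavior (hypothetical code, not the actual frontend), the per-experiment requests could be deduplicated by sharing one in-flight promise per experiment ID, so ten run rows that reference the same experiment result in a single GET:

```typescript
// Hypothetical request cache keyed by experiment ID. Concurrent callers share
// the same in-flight promise, so repeated lookups collapse into one request.
const experimentCache = new Map<string, Promise<unknown>>();

function getExperimentCached(baseUrl: string, experimentId: string): Promise<unknown> {
  const cached = experimentCache.get(experimentId);
  if (cached) {
    return cached;
  }
  const request = fetch(`${baseUrl}/apis/v2beta1/experiments/${experimentId}`)
    .then((res) => {
      if (!res.ok) {
        throw new Error(`Fetching experiment ${experimentId} failed: ${res.status}`);
      }
      return res.json();
    })
    .catch((err) => {
      // Drop failed entries so a transient error is not cached forever.
      experimentCache.delete(experimentId);
      throw err;
    });
  experimentCache.set(experimentId, request);
  return request;
}
```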
Impacted by this bug? Give it a 👍.