Optimize tsh db ls performance #14092
I wonder if we can cover this flow somehow in the tsh tests.
For example, we could use our own dialer with a custom timeout and inject it in a tsh test to simulate latency between the tsh client and the Teleport proxy.
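A minimal sketch of the dialer idea from this comment, assuming the client accepts some injectable dialer. The `ContextDialer` interface, `slowDialer`, `dialFunc` and `measureDial` are hypothetical names for illustration, not Teleport APIs; the inner dialer is an in-memory `net.Pipe` so the example needs no network.

```go
package main

import (
	"context"
	"net"
	"time"
)

// ContextDialer is a hypothetical injection seam; the real tsh client
// may expose a different hook for swapping the dialer in tests.
type ContextDialer interface {
	DialContext(ctx context.Context, network, addr string) (net.Conn, error)
}

// dialFunc adapts a plain function to ContextDialer.
type dialFunc func(ctx context.Context, network, addr string) (net.Conn, error)

func (f dialFunc) DialContext(ctx context.Context, network, addr string) (net.Conn, error) {
	return f(ctx, network, addr)
}

// slowDialer waits for a fixed latency before delegating to the inner
// dialer, and gives up early if the context is cancelled or times out.
type slowDialer struct {
	inner   ContextDialer
	latency time.Duration
}

func (d slowDialer) DialContext(ctx context.Context, network, addr string) (net.Conn, error) {
	timer := time.NewTimer(d.latency)
	defer timer.Stop()
	select {
	case <-timer.C:
		return d.inner.DialContext(ctx, network, addr)
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// measureDial dials once through slowDialer and reports the elapsed
// time in milliseconds together with any dial error.
func measureDial(latencyMs, timeoutMs int) (int64, error) {
	// In-memory inner dialer: returns one end of a net.Pipe.
	inner := dialFunc(func(ctx context.Context, network, addr string) (net.Conn, error) {
		client, server := net.Pipe()
		server.Close()
		return client, nil
	})
	d := slowDialer{inner: inner, latency: time.Duration(latencyMs) * time.Millisecond}

	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()

	start := time.Now()
	conn, err := d.DialContext(ctx, "tcp", "proxy.example.invalid:443")
	if err == nil {
		conn.Close()
	}
	return time.Since(start).Milliseconds(), err
}
```

A test could wrap the production dialer the same way and assert that `tsh db ls` finishes within a budget that is only reachable when connections are reused rather than re-dialed.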
Have we considered having the allowed database users retrieved and returned as part of ListDatabases instead, to avoid the extra round trips to also fetch the role sets?
Side note: we don't seem to care about showing the list of database users when we output databases in JSON format, so perhaps we could skip the extra fetches in that case?
additional question: do we absolutely need all the changes to parallelize |
Additional point raised by @rosstimothy: we should be careful about parallelizing connections. A cluster with a lot of leaf clusters can cause a massive number of outbound connections when doing listDatabasesAllClusters.
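One common way to address this concern is to cap the number of fetches in flight with a buffered channel used as a semaphore. The sketch below is an assumption about how such a cap could look, not what this PR implements; `forEachBounded` and `peakConcurrency` are hypothetical names, and the fake fetch just sleeps.

```go
package main

import (
	"sync"
	"sync/atomic"
	"time"
)

// forEachBounded calls fn once per item with at most limit calls
// running at the same time, and returns the first error observed.
func forEachBounded(items []string, limit int, fn func(string) error) error {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit)
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for _, item := range items {
		sem <- struct{}{} // blocks while limit calls are already running
		wg.Add(1)
		go func(item string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when fn returns
			if err := fn(item); err != nil {
				once.Do(func() { firstErr = err })
			}
		}(item)
	}
	wg.Wait()
	return firstErr
}

// peakConcurrency runs n fake fetches under the given limit and
// reports the highest number that were in flight simultaneously.
func peakConcurrency(n, limit int) int64 {
	items := make([]string, n)
	var inFlight, peak int64
	_ = forEachBounded(items, limit, func(string) error {
		cur := atomic.AddInt64(&inFlight, 1)
		for {
			old := atomic.LoadInt64(&peak)
			if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
				break
			}
		}
		time.Sleep(5 * time.Millisecond) // stand-in for a network call
		atomic.AddInt64(&inFlight, -1)
		return nil
	})
	return atomic.LoadInt64(&peak)
}
```

With a cap like this, the number of simultaneous outbound connections stays bounded by `limit` no matter how many leaf clusters exist.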
Nice, I have left one comment, but apart from that the fix looks good to me.
Looks good to me. We probably want to consider adding some tracing bits to the new stuff added here. It might also be worthwhile to capture traces before and after this change. That would really show the performance gains and also help identify any areas that could still be improved upon.
@rosstimothy traceability for the new changes works out of the box. Here is the comparison between the old code and this change, tested against my cluster in AWS.
Before this change: [trace screenshot]
With this change: [trace screenshot]
Really love the tracing!
Looks good! I think that |
fix #14075
Changes:
- GetCurrentUserRoles: streams all user roles
- tsh db ls: reuses the proxy client
- tsh db ls --all: runs the fetch in parallel per profile
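The per-profile fan-out in the last item can be sketched roughly as follows. This is an illustration under assumed names (`profileResult`, `listAllProfiles`, and the `list` callback are hypothetical), not the code from this PR. Each goroutine writes to its own slice index, so the output keeps the profile order without extra locking.

```go
package main

import "sync"

// profileResult holds the outcome of listing databases for one profile.
type profileResult struct {
	Profile   string
	Databases []string
	Err       error
}

// listAllProfiles calls list for every profile concurrently and returns
// the results in the same order as the input profiles.
func listAllProfiles(profiles []string, list func(profile string) ([]string, error)) []profileResult {
	results := make([]profileResult, len(profiles))
	var wg sync.WaitGroup
	for i, p := range profiles {
		wg.Add(1)
		go func(i int, p string) {
			defer wg.Done()
			dbs, err := list(p)
			// Each goroutine owns exactly one index, so no mutex is needed.
			results[i] = profileResult{Profile: p, Databases: dbs, Err: err}
		}(i, p)
	}
	wg.Wait()
	return results
}
```

Keeping the error per profile, rather than failing the whole call, lets the caller still print databases for the profiles that succeeded.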