Adding worker autoscaling support with KEDA #277

Open
wants to merge 2 commits into
base: main

Conversation

sdaberdaku
Member

No description provided.

@cla-bot cla-bot bot added the cla-signed label Dec 13, 2024
@sdaberdaku
Member Author

For some reason, Trino 446 does not expose the trino_execution_resourcegroups_InternalResourceGroup_RunningQueries JMX metric, although versions 435 and 467 do. My KEDA test starts with 0 worker replicas. A TPCH query is then launched on the coordinator, which raises this value from 0 to 1 and triggers the creation of a worker pod. Unfortunately, queries submitted while no workers are available enter the "WAITING_FOR_RESOURCES" state, which is neither blocked nor queued.

@nineinchnick I am open to suggestions for a better metric or testing strategy.

@sdaberdaku sdaberdaku force-pushed the feature/add-keda-support branch 2 times, most recently from 35dc778 to b9dcd3b Compare December 14, 2024 18:20
@sdaberdaku sdaberdaku marked this pull request as ready for review December 14, 2024 18:31
@sdaberdaku sdaberdaku force-pushed the feature/add-keda-support branch 2 times, most recently from 52c79e9 to 757de90 Compare December 15, 2024 13:00
@sdaberdaku
Member Author

I found the trino_execution_ClusterSizeMonitor_RequiredWorkers metric that works with all versions of Trino and allows us to test scaling up from 0 workers.
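For illustration, a minimal sketch of a KEDA ScaledObject that scales up from 0 workers on this metric might look like the following. The object name, the worker Deployment name, the Prometheus address, the query aggregation, and the replica limits are all assumptions made for the example, not what the chart renders.

```yaml
# Sketch only: names, address, and limits below are hypothetical.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: trino-worker                  # hypothetical object name
spec:
  scaleTargetRef:
    name: trino-worker                # hypothetical worker Deployment name
  minReplicaCount: 0                  # allow scaling down to zero workers
  maxReplicaCount: 5                  # illustrative upper bound
  triggers:
    - type: prometheus
      metadata:
        # Assumed in-cluster Prometheus endpoint.
        serverAddress: http://prometheus.monitoring.svc:9090
        # Assumed aggregation; the metric name is the one mentioned above.
        query: max(trino_execution_ClusterSizeMonitor_RequiredWorkers)
        # Scaler metadata values are strings.
        threshold: "1"
```

With a threshold of "1", a metric value of 1 asks for one worker replica, which matches the scale-from-zero test described above.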

I am also wondering whether the chart should support creating TriggerAuthentication objects to cover cases where Prometheus requires authentication. This is not mandatory, since one can always create the object outside the chart and reference it in the ScaledObject trigger. What is blocking me is how to implement the mapping between triggers and TriggerAuthentication objects: one TriggerAuthentication can be referenced by multiple triggers, and naming needs to be deterministic, since the trigger references the TriggerAuthentication object by name. I could "inject" the {{ .Release.Name }} prefix into these object names, but I don't like that very much.
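To make the naming concern concrete, here is a sketch of a TriggerAuthentication created outside the chart and referenced by name from a trigger. The object name, Secret name, Secret key, and Prometheus address are hypothetical, and bearer-token authentication is only one of the modes the Prometheus scaler accepts.

```yaml
# Sketch only: created outside the chart; all names are hypothetical.
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: trino-prometheus-auth         # triggers must reference exactly this name
spec:
  secretTargetRef:
    - parameter: bearerToken          # parameter consumed by the Prometheus scaler
      name: prometheus-credentials    # hypothetical Secret in the same namespace
      key: token                      # hypothetical key inside that Secret
---
# Fragment of a ScaledObject's spec, showing only the trigger side.
triggers:
  - type: prometheus
    metadata:
      serverAddress: https://prometheus.example.com   # assumed endpoint
      query: max(trino_execution_ClusterSizeMonitor_RequiredWorkers)
      threshold: "1"
      authModes: "bearer"
    authenticationRef:
      name: trino-prometheus-auth     # several triggers may point at the same object
```

Because the reference is by name only, any prefix the chart adds to the TriggerAuthentication name would have to be applied identically to every `authenticationRef` that uses it.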

Member

@nineinchnick nineinchnick left a comment

First pass, I haven't yet checked all the new properties.

@@ -114,6 +114,70 @@ server:
# selectPolicy: Max
# ```

# -- Configure [KEDA](https://keda.sh/) for workers.
Member

We should document that this is mutually exclusive with server.autoscaling and, if possible, help users choose between the two when they are just starting with autoscaling and are not familiar with either option.

Member Author

This makes total sense. I will improve the documentation on this.

Member Author

I tried to clarify this in the documentation. I also added a warning in NOTES.txt to indicate that KEDA takes precedence over the HPA if both are enabled.
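As a rough illustration of the intended usage, a values override that picks KEDA and leaves the HPA off could look like this. The `server.keda` key and its `enabled` field are assumptions about how the new section is named, since the diff hunk shown here does not include the key itself.

```yaml
# Sketch only: `server.keda.enabled` is an assumed key name.
server:
  autoscaling:
    enabled: false    # HPA-based autoscaling left off
  keda:               # hypothetical key for the new KEDA section
    enabled: true     # enable only one of the two mechanisms
```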

tests/trino/test-values.yaml (conversation resolved)
@sdaberdaku sdaberdaku force-pushed the feature/add-keda-support branch from 757de90 to d20836f Compare December 21, 2024 16:19

2 participants