Proposal
Describe the Feature
I would like to request functionality in the KEDA HTTP Add-on that allows asynchronous “post-response” tasks to complete before a pod is scaled down. This is similar to the “waitUntil()” concept present in some application frameworks (e.g., Vercel Functions), but implemented at the infrastructure/auto-scaling layer through custom metrics.
Context
Currently, in some serverless platforms and application frameworks, an API can return a response immediately while scheduling asynchronous tasks (e.g., logging, analytics, cache updates) to run in the background. In order to prevent these tasks from being terminated prematurely (e.g., when a pod is about to spin down), some form of coordination is needed so that the auto-scaler knows there are still in-flight tasks.
In the KEDA HTTP Add-on world, we can’t literally provide a “waitUntil()” function, because that would require application-level code. Instead, the application could expose a Prometheus (or similar) metric that indicates outstanding background tasks, and the Add-on could respect it. Only when this metric reaches zero would a pod be considered safe to scale down.
Proposed Approach
The application itself tracks how many “post-response” tasks are currently in flight.
It serves a Prometheus metric (e.g., via an endpoint like /metrics) indicating that count.
The KEDA HTTP Add-on is configured to allow scale-in (scaling down to zero or removing pods) only when this metric is zero.
Prometheus Metric Example
Below is a simple example of how you might expose this metric in a Go application (the same idea can be used in any language). You could name it something like “myapp_background_tasks_in_flight”:
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

var tasksInFlight int64

func main() {
	http.HandleFunc("/do-something", func(w http.ResponseWriter, r *http.Request) {
		// Do your normal request handling here
		w.Write([]byte("OK\n"))

		// Start a background task
		atomic.AddInt64(&tasksInFlight, 1)
		go func() {
			defer atomic.AddInt64(&tasksInFlight, -1)
			// ... do some logging, analytics, etc.
		}()
	})

	// Expose metrics
	http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		metricFmt := "# HELP myapp_background_tasks_in_flight Count of background tasks.\n"
		metricFmt += "# TYPE myapp_background_tasks_in_flight gauge\n"
		metricFmt += fmt.Sprintf("myapp_background_tasks_in_flight %d\n", atomic.LoadInt64(&tasksInFlight))
		w.Write([]byte(metricFmt))
	})

	http.ListenAndServe(":8080", nil)
}
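For a production service, the same gauge could be exposed with the official Prometheus Go client instead of hand-formatted text. A minimal sketch, assuming the github.com/prometheus/client_golang module is available (the metric name and endpoints match the example above):

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Gauge registered with the default registry; promhttp serves it on /metrics.
var tasksInFlight = promauto.NewGauge(prometheus.GaugeOpts{
	Name: "myapp_background_tasks_in_flight",
	Help: "Count of background tasks.",
})

func main() {
	http.HandleFunc("/do-something", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("OK\n"))

		// Increment before the goroutine starts so the gauge never
		// under-reports in-flight work.
		tasksInFlight.Inc()
		go func() {
			defer tasksInFlight.Dec()
			// ... do some logging, analytics, etc.
		}()
	})

	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}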
When there are three background tasks running, the /metrics endpoint might show:
# HELP myapp_background_tasks_in_flight Count of background tasks.
# TYPE myapp_background_tasks_in_flight gauge
myapp_background_tasks_in_flight 3
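For KEDA (or anything else) to act on this gauge, Prometheus must actually scrape the pod; how that is wired up depends on the cluster. As one hypothetical sketch, a Deployment using the common prometheus.io pod annotations (honored by many scrape configs, though Prometheus Operator setups typically use ServiceMonitors instead) might look like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp                          # hypothetical name, matching the examples above
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
      annotations:
        prometheus.io/scrape: "true"   # assumes the scrape config honors these annotations
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: myapp
          image: myapp:latest          # placeholder image
          ports:
            - containerPort: 8080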
KEDA HTTP Add-on Configuration Sketch
If the KEDA HTTP Add-on supported a configuration option for this (e.g., a scaleDownMetric rule with a mustBeZero flag, as sketched below), it might look like:
apiVersion: keda.sh/v1alpha1
kind: HTTPScaledObject
metadata:
  name: myapp-http-scaler
spec:
  host: "example.com"
  rules:
    - name: myapp
      scaleDownMetric:
        metricName: "myapp_background_tasks_in_flight"
        mustBeZero: true
      # ... other standard config, placeholders, etc.
      # ...
• “metricName: myapp_background_tasks_in_flight” references the Prometheus metric.
• “mustBeZero: true” means: do not scale down if its current value is > 0.
This would ensure that no pods are terminated by scale-in while background tasks are still in progress.
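Until such a parameter exists, the mechanics can be approximated with KEDA’s existing Prometheus scaler, though only imperfectly: KEDA does not allow two ScaledObjects to target the same workload, so this cannot simply sit alongside an HTTPScaledObject for the same Deployment, which is part of the motivation for building it into the Add-on. A sketch, assuming the gauge above is scraped into a Prometheus reachable at the address shown:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: myapp-background-tasks
spec:
  scaleTargetRef:
    name: myapp                                               # hypothetical Deployment name
  minReplicaCount: 0
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090  # assumed Prometheus address
        query: sum(myapp_background_tasks_in_flight)
        threshold: "1"

With this trigger, KEDA keeps at least one replica while the query result is non-zero and only allows scale-to-zero once it drops back to 0.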
Use-Case
Logging: Avoid truncated logs by ensuring the logging process in the background finishes.
Analytics: Send analytics data asynchronously and reliably, even in bursty traffic environments.
Cache Updates: Update and invalidate caches asynchronously without risking partial updates if the pod shuts down too soon.
Benefits
Improved Performance: Responses are sent immediately, while heavier tasks happen post-response.
Efficient Resource Usage: Pods remain alive only while tasks are still in flight; no guesswork.
Better Developer Experience: Infrastructure “knows” not to kill pods while there are unfinished tasks.
Conclusion
By adding a way to respect a “tasks in flight” metric, the KEDA HTTP Add-on would let developers perform post-response tasks without risking termination during critical background work. It avoids implementing an application-specific “waitUntil()” method and cleanly leverages existing Prometheus monitoring. This feature would fill a role similar to waitUntil() in other environments, ensuring that asynchronous chores complete before a pod is scaled down.
Thank you for considering this request!
Is this a feature you are interested in implementing yourself?
No
Anything else?