Keda GCP PubSub Triggering Unnecessary ScaledJobs After Ack #5613

Closed
emirsaidh opened this issue Mar 19, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@emirsaidh

Report

I have long-running ScaledJobs, each lasting approximately 10 minutes. The trigger configuration is shown below; it scales one job for each message in the Pub/Sub subscription. KEDA polls the subscription every 30 seconds. After a job processes its message and sends an acknowledgment (ack) to Pub/Sub, the UnackedMessageNumber is not updated in real time: it takes almost 2 minutes to reflect the ack. Consequently, KEDA scales a new job at each 30-second poll during that window (3-4 unnecessary jobs).

  - type: gcp-pubsub
    metadata:
      mode: "SubscriptionSize" # Optional - Default is SubscriptionSize - SubscriptionSize or OldestUnackedMessageAge
      value: "1.0"
      subscriptionName: my-sub

Expected Behavior

  • Upon receiving a message from PubSub, Keda should scale one job to process the message.
  • After processing and acknowledging the message, Keda should wait for the next message before scaling another job.

Actual Behavior

  • Keda checks the PubSub queue every 30 seconds, irrespective of the message processing time.
  • Due to the delayed update of UnackedMessageNumber in PubSub (approximately after 2 minutes), Keda incorrectly scales new jobs every 30 seconds, resulting in unnecessary job scaling (3-4 times).
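
The "3-4 times" figure is consistent with simple arithmetic on the numbers reported above. The following is a rough sketch, not KEDA's actual logic: it assumes one job is created at every poll that still sees the stale count, and that each unnecessary job finishes before the next poll. The function name and parameters are hypothetical.

```python
def spurious_polls(lag_s: float, polling_interval_s: float,
                   first_poll_offset_s: float = 0.0) -> int:
    """Count the polls that land while the metric is still stale.

    lag_s: seconds after the ack during which the old count is still reported
           (assumed, roughly 100-120 s per the report).
    polling_interval_s: KEDA polling interval in seconds.
    first_poll_offset_s: seconds between the ack and the first poll after it.
    """
    count = 0
    t = first_poll_offset_s
    while t < lag_s:
        count += 1  # this poll still sees the already-acked message
        t += polling_interval_s
    return count


print(spurious_polls(120, 30))       # 4 polls at 0, 30, 60, 90 s
print(spurious_polls(100, 30, 15))   # 3 polls at 15, 45, 75 s
print(spurious_polls(120, 120, 60))  # 1 poll when the interval matches the lag
```

Under these assumptions, a 30-second interval gives 3-4 stale polls, while an interval close to the lag gives at most one.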

Steps to Reproduce the Problem

1- Set up a PubSub subscription.
2- Configure Keda to scale jobs based on SubscriptionSize.
3- Ensure that the ScaledJobs have a runtime of a few minutes.
4- Start sending messages to the PubSub queue.
5- Monitor the behavior of Keda as it scales jobs based on the messages received.
6- Observe that Keda incorrectly scales new jobs every 30 seconds, even though the ack has already been sent to PubSub.

Logs from KEDA operator

example

KEDA Version

2.11.2

Kubernetes Version

1.27

Platform

Other

Scaler Details

GCP PubSub

Anything else?

No response

@emirsaidh emirsaidh added the bug Something isn't working label Mar 19, 2024
@JorTurFer
Member

Hello @emirsaidh,
Thanks for reporting! The problem here is that Pub/Sub still reports the messages for some time after you ack them, and KEDA doesn't inspect the messages themselves, so it cannot tell whether a reported message is one that has already been processed. In this case, I'd suggest 2 options to mitigate the gap:

  • Increase the pollingInterval to 2 minutes to reduce the impact of this (but note that the longer interval also delays the creation of jobs for new messages)
  • Change your job's code to exit if there isn't any message in the queue. With this approach, the jobs that were created only because of the Pub/Sub propagation lag finish right away and can be removed

I know that neither option is ideal, but the problem is that the backend (Stackdriver API) still reports the message.
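
As an illustration of the second option, here is a minimal sketch of a job entry point that exits early when there is nothing to pull. It is not taken from KEDA or from the Pub/Sub client library: `run_job`, `pull_one`, `process`, and `ack` are hypothetical names, and the in-memory list stands in for a real subscription so that the example is self-contained.

```python
from typing import Callable, List, Optional


def run_job(
    pull_one: Callable[[], Optional[str]],
    process: Callable[[str], None],
    ack: Callable[[str], None],
) -> bool:
    """Process at most one message; return False if none was available."""
    message = pull_one()
    if message is None:
        # Job was probably created because of the stale metric: finish at once.
        return False
    process(message)  # the long-running work
    ack(message)      # acknowledge only after successful processing
    return True


# Stand-in for a subscription holding a single message (assumption for the demo).
queue: List[str] = ["msg-1"]
processed: List[str] = []


def pull_from_list() -> Optional[str]:
    return queue.pop(0) if queue else None


first = run_job(pull_from_list, processed.append, lambda m: None)   # real work
second = run_job(pull_from_list, processed.append, lambda m: None)  # empty queue
```

In a real job, `pull_one` would wrap a pull against the subscription, and the process would exit with a success code when `run_job` returns False so that the job completes quickly.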

@emirsaidh
Author

Unfortunately, neither solution works for us, but thanks a lot for the response and your time @JorTurFer. I understand that the issue lies with Pub/Sub rather than with KEDA.

@github-project-automation github-project-automation bot moved this from To Triage to Ready To Ship in Roadmap - KEDA Core Apr 16, 2024