From 700b603ae5048742747de807267ab84ae140a602 Mon Sep 17 00:00:00 2001
From: Pete F
Date: Fri, 12 Jul 2024 15:29:52 +0100
Subject: [PATCH] Recommendations for multi-instance apps

---
 scheduled-jobs.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/scheduled-jobs.md b/scheduled-jobs.md
index de65a07..b2ce5ee 100644
--- a/scheduled-jobs.md
+++ b/scheduled-jobs.md
@@ -18,6 +18,27 @@ Prefer [systemd timers](https://askubuntu.com/a/1051208/17211) (configured in `/
 
 See also https://opensource.com/article/20/7/systemd-timers.
 
+
+### Multi-instance apps (e.g. EC2 in an auto-scaling group)
+
+`crontab` and similar solutions won't always be suitable for multi-instance apps. For example, if you have more than one instance but only need to run the scheduled task once (common if the task has side effects, like sending email).
+
+In this case, if the task can be triggered via a request to a HTTPS endpoint then the app's load balancer can ensure that at most one instance of the app receives the request and runs the task.
+
+Scheduling the request from outside the app itself can be done in multiple ways (e.g. a scheduled Lambda), but one good solution is to use EventBridge rules with the [API destination integration](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-api-destinations.html#eb-create-rule-api-destination-target). This allows EventBridge rules to make authenticated calls to a specified endpoint on a schedule.
+
+#### Pros
+* All of the moving pieces required to create the schedule and trigger can be configured as part of your infrastructure, e.g. via the AWS CDK. So there's
+no need to write business logic or maintain a Lambda with the dependencies required to make the HTTP requests.
+* No need to hand roll scheduling and retry logic because this is baked in to the EventBridge framework.
+
+#### Cons
+* EventBridge requests to an API destination endpoint have a maximum timeout of 5 seconds. So if your task takes more than 5 seconds, and you need the
+caller to be aware of its outcome (e.g. to enable retries on failure) this approach won't be suitable.
+* The AWS constructs require a fair amount of boilerplate CDK code, but there are examples in the Guardian estate that could be used as a basis, e.g. [in the crosswords status checker](https://github.com/guardian/crosswordv2/blob/126acf8c6cf88dcc2edc0e851df5b2d0bbe8685b/cdk/lib/scheduled-status-check.ts).
+
+
+
 ### AWS Lambda
 
 An [`AWS::Events::Rule`](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-events-rule.html) can invoke an AWS Lambda at regular intervals.
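
For the multi-instance recommendation added in this patch, an EventBridge rule with an API destination target could be sketched in CDK roughly as follows. This is a minimal sketch assuming `aws-cdk-lib` v2, not the code from the linked crosswords example. The stack name, construct IDs, endpoint URL, header name, secret name, schedule and retry values are all hypothetical placeholders.

```typescript
import { App, Duration, SecretValue, Stack, StackProps } from 'aws-cdk-lib';
import * as events from 'aws-cdk-lib/aws-events';
import * as targets from 'aws-cdk-lib/aws-events-targets';
import { Construct } from 'constructs';

// Hypothetical stack: every name, URL and interval below is a placeholder.
class ScheduledTaskStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // The connection holds the credentials EventBridge attaches to each request.
    const connection = new events.Connection(this, 'TaskConnection', {
      authorization: events.Authorization.apiKey(
        'x-api-key', // assumed header name that the app checks
        SecretValue.secretsManager('scheduled-task-api-key'), // assumed secret name
      ),
    });

    // The endpoint sits behind the app's load balancer, so each request
    // is routed to a single instance.
    const destination = new events.ApiDestination(this, 'TaskDestination', {
      connection,
      endpoint: 'https://app.example.com/tasks/send-digest', // assumed URL
      httpMethod: events.HttpMethod.POST,
    });

    // The rule fires on a schedule and calls the API destination.
    new events.Rule(this, 'TaskSchedule', {
      schedule: events.Schedule.rate(Duration.hours(1)),
      targets: [
        new targets.ApiDestination(destination, {
          retryAttempts: 3, // illustrative value
          maxEventAge: Duration.minutes(10), // illustrative value
        }),
      ],
    });
  }
}

const app = new App();
new ScheduledTaskStack(app, 'ScheduledTaskStack');
app.synth();
```

The retry settings on the target only help if the endpoint responds within the 5-second timeout noted under Cons.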