Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow reading of secrets to environment variable for containerised lambdas #2785

Merged
merged 11 commits into from
Dec 16, 2024

Conversation

kenoir
Copy link
Contributor

@kenoir kenoir commented Dec 11, 2024

What does this change?

Adds a lambda extension to read variables from secrets manager, this is required as there is not a drop in replacement for the mechanism used in ECS where AWS will convert secrets manager references to secrets for us.

This approach uses the Lambda Extension API, which is available to containerised Lambdas by adding files to /opt/extensions/.

For simplicity (by comparison to including some other language runtime to run extension code and/or providing a binary to package), we use a bash script extension, based on the example here..

How it works:

  • In terraform the pipeline_lambda module uses the existing map of environment variable names to secret names to generate a map of environment variable names to secret names prefixed secret:, in addition we also provision permissions for the lambda to access those secrets.
  • bash_secrets_extension.sh looks for environment variables values that have been passed to the lambda that have been prefixed with secret: and retrieves them from AWS Secrets Manager
  • The extension creates a file in the containerised environment called /tmp/config that will persist between invocations, the file uses the Typesafe Config format (HOCON).
  • Inside the Scala application we layer global, application and the config loaded from /tmp/config to resolve a final set of configuration in LambdaConfiguration, allowing the values from /tmp/config to substitute for environment variables where they don't already exist.

How to test

  • Run the relation_embedder lambda using RIE and docker-compose with a valid .env file.
  • Deploy this change, and terraform update to a non-production pipeline, observe the lambda reading the secrets.
Screen.Recording.2024-12-13.at.08.53.20.mov

How can we measure success?

This lambda, and all other ones that require secrets have a mechanism to do so that require no application or significant terraform changes.

Have we considered potential risks?

Yes, this change involves handling and providing secrets to our services. We must take care not to log or reveal these in code or application logs.

This change intends to keep secrets encrypted at rest within secrets manager until they are required by a lambda invocation. This follows an AWS recommended pattern, that unfortunately we can't take direct advantage of as we are using containerised lambdas, but follows the same mechanism.

@kenoir kenoir requested a review from a team as a code owner December 11, 2024 14:04
@kenoir kenoir marked this pull request as draft December 11, 2024 14:04
@kenoir kenoir mentioned this pull request Dec 11, 2024
@kenoir kenoir changed the title Add lambda extension to read variables from secrets manager Allow reading of secrets to environment variable for containerised lambdas Dec 11, 2024
- index_batch_size=100
- index_flush_interval_seconds=60
env_file:
- .env
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .env file contains the remaining pipeline specific variables, we could have a script to generate a file like this:

es_apikey=secret:elasticsearch/pipeline_storage_2024-11-18/relation_embedder/api_key
es_host=secret:elasticsearch/pipeline_storage_2024-11-18/public_host
es_denormalised_index=works-denormalised-2024-11-18
es_merged_index=works-merged-2024-11-18
es_protocol=https
es_port=443

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Script added, as part of run_local.sh

@kenoir kenoir force-pushed the rk/lambda-var-relation-embedder branch from 3767805 to b84acee Compare December 11, 2024 15:18

set -euo pipefail

OWN_FILENAME="$(basename $0)"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For use in other container images, this should be in a shared location eventually.

EXTENSION_ID=$(grep -Fi Lambda-Extension-Identifier "$HEADERS" | tr -d '[:space:]' | cut -d: -f2)
echo "[${LAMBDA_EXTENSION_NAME}] Registration response: ${RESPONSE} with EXTENSION_ID $(grep -Fi Lambda-Extension-Identifier "$HEADERS" | tr -d '[:space:]' | cut -d: -f2)"

# Event processing
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This event loop seems to be necessary to keep the Lambda running happily. The extension is invoked as a separate process from the main lambda, but the extension API will fail the start-up if we exit before the lifetime of the lambda container itself.

There may be scope to simplify this!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would sleeping for 15 minutes work here, or does the extension have to explicitly listen for the shutdown event to avoid keeping the Lambda awake?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I want to double check my assumptions here too. I'll check.

Copy link
Contributor Author

@kenoir kenoir Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some experiments, removing the while loop here results in the following error at lambda initialization:
Screenshot 2024-12-12 at 15 23 22

It looks like exiting early results in stopping the runtime interface emulator (RIE) from invoking the main lambda.

Would sleeping for 15 minutes work here

Putting a sleep 900 instead of the loop, does allow the lambda to be invoked and terminated (though the logs mutter about a forced stop). However a 2nd invocation then fails with the following error:

Screenshot 2024-12-12 at 15 29 14

So I think that messes with some logic around invocation in the RIE in a way that may be reproduced in an in-situ invocation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I expect it starts when a lambda is first switched on and keeps going until there's nothing left to do, rather than running for each invocation. This allows warm starts to benefit the most from that warmth.

@kenoir kenoir force-pushed the rk/lambda-var-relation-embedder branch from 7906444 to dde62df Compare December 12, 2024 15:06

export PIPELINE_DATE=$1

PROJECT_NAME="relation_embedder"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern is intended for re-use, in. combination with use of template.env and local.docker-compose.yml files. The intention is to work towards running any of the lambda services using the same script / pattern from the root with one script and a docker-compose.yml for all the services.

@kenoir kenoir requested a review from paul-butcher December 13, 2024 09:01
@kenoir kenoir marked this pull request as ready for review December 13, 2024 09:01
# Build the docker image
docker compose -f local.docker-compose.yml \
build lambda

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be tempted to call this a build script and end it here (or have a build and a run script).

Although recompiling with SBT is reasonably efficient, it still takes about 15 seconds to reach this point when there's nothing to do.

(not a blocker)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a --skip-build flag to do this optionally.

@kenoir kenoir self-assigned this Dec 13, 2024
val config: Config
}

trait LambdaConfiguration extends Configuration {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of thing makes me happy.

Could we even go one step further and hide val config from the Lambda, instead translating the relevant config to a case class or trait so that all the string-based references to properties are in one place?

I suspect it has too many tendrils to do cleanly as part of this change, but might be a worthy ticket to raise for a future improvement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like:

// This is all shared
trait ApplicationConfig {}
case class MyApplicationConfig(someValue: String, someOtherValue: Int) extends ApplicationConfig

trait ConfigurationBuilder[C, T <: ApplicationConfig] {
  protected val rawConfig: C

  def build(rawConfig: C): T
  def config: T = build(rawConfig)
}

trait TypesafeConfigurable[T <: ApplicationConfig] extends ConfigurationBuilder[Config, T] {
    def build(rawConfig: Config): T
}

trait LambdaConfigurable extends TypesafeConfigurable[MyApplicationConfig] {
  private val defaultResolveFromFile: String = "/tmp/config"
  private val defaultApplicationConfig: String = "application.conf"

  private val lambdaConfigFile: File =
    new File(defaultResolveFromFile)

  protected val baseConfig: Config =
    ConfigFactory.load()

  protected val applicationConfig: Config =
    ConfigFactory.parseResources(defaultApplicationConfig)

  protected val lambdaConfig: Config = if (lambdaConfigFile.exists()) {
    ConfigFactory.parseFile(lambdaConfigFile)
  } else {
    ConfigFactory.empty()
  }

  lazy val rawConfig = lambdaConfig
    .withFallback(applicationConfig)
    .withFallback(baseConfig)
    .resolve()
}

// Then in an app do
trait MyAppConfigurable extends LambdaConfigurable {
  def build(rawConfig: Config): MyApplicationConfig = {
    MyApplicationConfig(
      someValue = rawConfig.getString("someValue"),
      someOtherValue = rawConfig.getInt("someOtherValue")
    )
  }
}

// And in LambdaMain
object LambdaMain
    extends RequestHandler[SQSEvent, String]
    with Logging
    with MyAppConfigurable {

      // config: MyApplicationConfig is available in this scope

}

Yes, I do prefer having one place where we extract the config from typesafe. I might have a go at this in a future PR 👍

@kenoir kenoir merged commit afb237f into main Dec 16, 2024
7 checks passed
@kenoir kenoir deleted the rk/lambda-var-relation-embedder branch December 16, 2024 08:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Archive
Development

Successfully merging this pull request may close these issues.

3 participants