chore(ci): migrate E2E tests to CDK CLI and off Docker #1501
Conversation
Authoring experience change

We no longer need the per-feature boilerplate shown below.

BEFORE

```python
from pathlib import Path

from tests.e2e.utils.data_builder import build_service_name
from tests.e2e.utils.infrastructure import BaseInfrastructure


class TracerStack(BaseInfrastructure):
    # Maintenance: Tracer doesn't support dynamic service injection (tracer.py L310)
    # we could move after handler response or adopt env vars usage in e2e tests
    SERVICE_NAME: str = build_service_name()
    FEATURE_NAME = "tracer"

    def __init__(self, handlers_dir: Path, feature_name: str = FEATURE_NAME, layer_arn: str = "") -> None:
        super().__init__(feature_name, handlers_dir, layer_arn)

    def create_resources(self) -> None:
        env_vars = {"POWERTOOLS_SERVICE_NAME": self.SERVICE_NAME}
        self.create_lambda_functions(function_props={"environment": env_vars})
```

AFTER

```python
from tests.e2e.utils.infrastructure import BaseInfrastructure


class TracerStack(BaseInfrastructure):
    def create_resources(self) -> None:
        self.create_lambda_functions()
```
Lambda Layer improvements

We no longer deploy a separate stack for the Lambda Layer and no longer need Docker. Instead, we use a local abstraction for the pip install, with extensibility for future changes (e.g., boto3 removal). Our new Layer abstraction only builds if the source code has changed, improving the authoring experience for E2E speed: time to hit the first breakpoint is now ~80 seconds, compared to 160s with Docker and no caching.

```python
layer_build = LocalLambdaPowertoolsLayer().build()

layer = LayerVersion(
    self.stack,
    "aws-lambda-powertools-e2e-test",
    layer_version_name="aws-lambda-powertools-e2e-test",
    compatible_runtimes=[
        Runtime.PYTHON_3_7,
        Runtime.PYTHON_3_8,
        Runtime.PYTHON_3_9,
    ],
    code=Code.from_asset(path=layer_build),
)
```
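The change-detection logic itself isn't shown in this comment. As a rough illustration only — the hashing approach below is an assumption, not the actual `LocalLambdaPowertoolsLayer` implementation — a layer builder could hash the package source and skip rebuilding when nothing changed (the `layer_build.diff` file that appears later in `cdk.out` hints at this idea):

```python
# Hypothetical sketch of "build only if source changed" (not the actual implementation).
import hashlib
from pathlib import Path

SOURCE_DIR = Path("aws_lambda_powertools")    # package to bundle into the layer
BUILD_DIR = Path("cdk.out/layer_build")       # where the built layer assets go
DIFF_FILE = Path("cdk.out/layer_build.diff")  # stores the checksum of the last build


def source_checksum(source: Path = SOURCE_DIR) -> str:
    """Hash all Python files in the package so any change triggers a rebuild."""
    digest = hashlib.sha256()
    for file in sorted(source.rglob("*.py")):
        digest.update(file.read_bytes())
    return digest.hexdigest()


def build_layer_if_changed() -> Path:
    """Rebuild the layer only when the source checksum differs from the cached one."""
    checksum = source_checksum()
    if DIFF_FILE.exists() and DIFF_FILE.read_text() == checksum and BUILD_DIR.exists():
        return BUILD_DIR  # cache hit: reuse previous build

    BUILD_DIR.mkdir(parents=True, exist_ok=True)
    # ... run the pip install / packaging steps into BUILD_DIR here ...
    DIFF_FILE.write_text(checksum)
    return BUILD_DIR
```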
Lessons learned
We've tried the following approaches before deciding we had to move to CDK CLI:
When migrating to CDK CLI, we learned of new side effects we had to work around in order to support our parallel deployment requirement:
To amortize the performance cost of bringing in the CDK CLI (Node init, CLI init, process spawning), we were incentivized to improve the Lambda Layer and build it locally instead of with Docker. However, we also hit issues along the way. Some of the intermittent errors we saw while trying to make parallel deployments work with the CDK CLI:

```
❌  testV39-metrics-cae198e3-e4d1-413c-9d8d-a2a408b731ab failed: InvalidChangeSetStatus: Cannot delete ChangeSet in status CREATE_IN_PROGRESS
    at Request.extractError (/Users/lessa/.npm/_npx/e72b144743208263/node_modules/aws-sdk/lib/protocol/query.js:50:29)
    at Request.callListeners (/Users/lessa/.npm/_npx/e72b144743208263/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
    at Request.emit (/Users/lessa/.npm/_npx/e72b144743208263/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
    at Request.emit (/Users/lessa/.npm/_npx/e72b144743208263/node_modules/aws-sdk/lib/request.js:686:14)
    at Request.transition (/Users/lessa/.npm/_npx/e72b144743208263/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/Users/lessa/.npm/_npx/e72b144743208263/node_modules/aws-sdk/lib/state_machine.js:14:12)
    at /Users/lessa/.npm/_npx/e72b144743208263/node_modules/aws-sdk/lib/state_machine.js:26:10
    at Request.<anonymous> (/Users/lessa/.npm/_npx/e72b144743208263/node_modules/aws-sdk/lib/request.js:38:9)
    at Request.<anonymous> (/Users/lessa/.npm/_npx/e72b144743208263/node_modules/aws-sdk/lib/request.js:688:12)
    at Request.callListeners (/Users/lessa/.npm/_npx/e72b144743208263/node_modules/aws-sdk/lib/sequential_executor.js:116:18) {
  code: 'InvalidChangeSetStatus',
  time: 2022-09-03T18:38:32.058Z,
```

```
jsii.errors.JSIIError: EEXIST: file already exists, mkdir '/Users/lessa/DEV/aws-lambda-powertools-python/cdk.out/asset.da705b5776c4e58b759b0a7027cebe7cae58195baada0a1281389eeeaa0eedb5'
```

```
❌  Deployment failed: Error: Stack Deployments Failed: AlreadyExistsException: ChangeSet cdk-deploy-change-set cannot be created due to a mismatch with existing attribute Description
    at deployStacks (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-cdk/lib/deploy.ts:61:11)
    at CdkToolkit.deploy (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-cdk/lib/cdk-toolkit.ts:312:7)
    at initCommandLine (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-cdk/lib/cli.ts:349:12)

Stack Deployments Failed: AlreadyExistsException: ChangeSet cdk-deploy-change-set cannot be created due to a mismatch with existing attribute Description
```

```
❌  testV39-event-handlers-dc05b551-0baa-43d9-944d-1b5f703d0016 failed: InvalidChangeSetStatus: Cannot delete ChangeSet in status CREATE_IN_PROGRESS
    at Request.extractError (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-sdk/lib/protocol/query.js:50:29)
    at Request.callListeners (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
    at Request.emit (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
    at Request.emit (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-sdk/lib/request.js:686:14)
    at Request.transition (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-sdk/lib/state_machine.js:14:12)
    at /Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-sdk/lib/state_machine.js:26:10
    at Request.<anonymous> (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-sdk/lib/request.js:38:9)
    at Request.<anonymous> (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-sdk/lib/request.js:688:12)
    at Request.callListeners (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-sdk/lib/sequential_executor.js:116:18) {
  code: 'InvalidChangeSetStatus',
  time: 2022-09-03T19:52:55.176Z,
  requestId: '6c56e040-5ee4-4f1c-b821-fa844a6ca90f',
  statusCode: 400,
  retryable: false,
  retryDelay: 386.4977584554941
}

❌  Deployment failed: Error: Stack Deployments Failed: InvalidChangeSetStatus: Cannot delete ChangeSet in status CREATE_IN_PROGRESS
    at deployStacks (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-cdk/lib/deploy.ts:61:11)
    at CdkToolkit.deploy (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-cdk/lib/cdk-toolkit.ts:312:7)
    at initCommandLine (/Users/lessa/.nvm/versions/node/v14.17.2/lib/node_modules/aws-cdk/lib/cli.ts:349:12)
```
Adding an example of a top-level AWS CDK application that defines the E2E test cloud infrastructure, plus how to synthesize, deploy, and destroy it.

```python
import aws_cdk as cdk

import constants
from deployment import Tracer, Logger

app = cdk.App()

if app.node.try_get_context("tracer") == "true":
    Tracer(app, "Tracer", env=constants.DEV_ENV)

if app.node.try_get_context("logger") == "true":
    Logger(app, "Logger", env=constants.DEV_ENV)

app.synth()
```

```bash
# Synth
#
# stacks=<get stack names to synth based on selected test groups>
# synth_command="cdk synth"
# foreach stack in stacks:
#   synth_command += " --context ${stack}=true"
# $(synth_command)

# Deploy
#
# number_of_stacks=<get number of stacks from previous step>
# cdk deploy --app cdk.out --all --require-approval never --concurrency ${number_of_stacks} --outputs-file outputs.json

# Destroy
#
# cdk destroy --app cdk.out --all --force
```
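For illustration only — the helper names and paths below are assumptions, not the PR's actual code — the same pseudo-commands above could be driven from Python with `subprocess`:

```python
# Hypothetical driver for the synth/deploy/destroy flow above (not the PR's actual code).
import subprocess
from typing import List


def synth(features: List[str]) -> None:
    """Synthesize only the stacks for the selected test groups via CDK context flags."""
    command = ["cdk", "synth"]
    for feature in features:
        command += ["--context", f"{feature}=true"]
    subprocess.run(command, check=True)


def deploy(number_of_stacks: int) -> None:
    """Deploy all synthesized stacks in parallel and capture their outputs."""
    subprocess.run(
        [
            "cdk", "deploy", "--app", "cdk.out", "--all",
            "--require-approval", "never",
            "--concurrency", str(number_of_stacks),
            "--outputs-file", "outputs.json",
        ],
        check=True,
    )


def destroy() -> None:
    """Tear down every stack previously synthesized into cdk.out."""
    subprocess.run(["cdk", "destroy", "--app", "cdk.out", "--all", "--force"], check=True)


if __name__ == "__main__":
    selected = ["tracer", "logger"]
    synth(selected)
    deploy(number_of_stacks=len(selected))
    destroy()
```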
Extended the E2E framework documentation to include the new CDK CLI parallelization, how a maintainer/contributor would add a new E2E test altogether (infra + tests), and the internals behind it. Adding it as a new comment, as it might be hard to follow in the GitHub PR file-diff UI.
E2E framework

Structure

Our E2E framework relies on Pytest fixtures to coordinate infrastructure and test parallelization - see Test Parallelization and CDK CLI Parallelization.

tests/e2e structure

```
.
├── __init__.py
├── conftest.py            # builds Lambda Layer once
├── logger
│   ├── __init__.py
│   ├── conftest.py        # deploys LoggerStack
│   ├── handlers
│   │   └── basic_handler.py
│   ├── infrastructure.py  # LoggerStack definition
│   └── test_logger.py
├── metrics
│   ├── __init__.py
│   ├── conftest.py        # deploys MetricsStack
│   ├── handlers
│   │   ├── basic_handler.py
│   │   └── cold_start.py
│   ├── infrastructure.py  # MetricsStack definition
│   └── test_metrics.py
├── tracer
│   ├── __init__.py
│   ├── conftest.py        # deploys TracerStack
│   ├── handlers
│   │   ├── async_capture.py
│   │   └── basic_handler.py
│   ├── infrastructure.py  # TracerStack definition
│   └── test_tracer.py
└── utils
    ├── __init__.py
    ├── data_builder       # build_service_name(), build_add_dimensions_input, etc.
    ├── data_fetcher       # get_traces(), get_logs(), get_lambda_response(), etc.
    ├── infrastructure.py  # base infrastructure like deploy logic, etc.
```
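To make the utils layout concrete, a data_builder helper such as `build_service_name()` might look roughly like the sketch below. This is a hypothetical illustration, not the repository's actual helper.

```python
# Hypothetical sketch of a data_builder helper (illustration only, not the real utils code).
import uuid


def build_service_name() -> str:
    """Generate a unique service name so concurrent E2E runs don't collide."""
    return f"test-powertools-service-{uuid.uuid4().hex[:8]}"
```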
Mechanics

This allows us to benefit from test and deployment parallelization, use IDE step-through debugging for a single test, and run one, a subset, or all tests while deploying only their related infrastructure, without any custom configuration.

```mermaid
classDiagram
    class InfrastructureProvider {
        <<interface>>
        +deploy() Dict
        +delete()
        +create_resources()
        +create_lambda_functions() Dict~Functions~
    }

    class BaseInfrastructure {
        +deploy() Dict
        +delete()
        +create_lambda_functions() Dict~Functions~
        +add_cfn_output()
    }

    class TracerStack {
        +create_resources()
    }

    class LoggerStack {
        +create_resources()
    }

    class MetricsStack {
        +create_resources()
    }

    class EventHandlerStack {
        +create_resources()
    }

    InfrastructureProvider <|-- BaseInfrastructure : implement
    BaseInfrastructure <|-- TracerStack : inherit
    BaseInfrastructure <|-- LoggerStack : inherit
    BaseInfrastructure <|-- MetricsStack : inherit
    BaseInfrastructure <|-- EventHandlerStack : inherit
```
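To ground the class diagram, here is a minimal Python sketch of how the hierarchy could be expressed. The method names match the diagram, but the signatures and bodies are assumptions rather than the PR's actual implementation.

```python
# Minimal sketch matching the class diagram above; signatures/bodies are assumed, not actual code.
from abc import ABC, abstractmethod
from typing import Dict, Optional


class BaseInfrastructure(ABC):
    """Owns the CDK app/stack lifecycle shared by every feature stack."""

    def deploy(self) -> Dict[str, str]:
        """Synthesize and deploy the stack, returning CloudFormation Outputs."""
        ...

    def delete(self) -> None:
        """Destroy the deployed stack."""
        ...

    def create_lambda_functions(self, function_props: Optional[Dict] = None) -> Dict:
        """Create one Lambda function per handler file, applying optional props."""
        ...

    def add_cfn_output(self, name: str, value: str) -> None:
        """Expose a value as a CloudFormation Output for tests to consume."""
        ...

    @abstractmethod
    def create_resources(self) -> None:
        """Each feature stack declares its own resources here."""


class TracerStack(BaseInfrastructure):
    def create_resources(self) -> None:
        self.create_lambda_functions()
```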
Authoring a new feature E2E test

Imagine you're going to create an E2E test for the Event Handler feature for the first time. Keep the following mental model when reading:

```mermaid
graph LR
    A["1. Define infrastructure"]-->B["2. Deploy/Delete infrastructure"]-->C["3. Access Stack outputs"]
```

1. Define infrastructure

We use CDK as our Infrastructure as Code tool of choice. Before you start using CDK, you'd take the following steps:
```python
class EventHandlerStack(BaseInfrastructure):
    def create_resources(self):
        functions = self.create_lambda_functions()
        self._create_alb(function=functions["AlbHandler"])
        ...

    def _create_alb(self, function: Function):
        vpc = ec2.Vpc.from_lookup(
            self.stack,
            "VPC",
            is_default=True,
            region=self.region,
        )

        alb = elbv2.ApplicationLoadBalancer(self.stack, "ALB", vpc=vpc, internet_facing=True)
        CfnOutput(self.stack, "ALBDnsName", value=alb.load_balancer_dns_name)
        ...
```
And a matching Lambda function handler:

```python
from aws_lambda_powertools.event_handler import ALBResolver, Response, content_types

app = ALBResolver()


@app.get("/todos")
def hello():
    return Response(
        status_code=200,
        content_type=content_types.TEXT_PLAIN,
        body="Hello world",
        cookies=["CookieMonster", "MonsterCookie"],
        headers={"Foo": ["bar", "zbr"]},
    )


def lambda_handler(event, context):
    return app.resolve(event, context)
```

2. Deploy/Delete infrastructure when tests run

We need to create a Pytest fixture for our new feature. This will instruct Pytest to deploy our infrastructure when our tests start, and delete it when they complete, whether tests are successful or not. Note that this file will not need any modification in the future.
```python
import pytest

from tests.e2e.event_handler.infrastructure import EventHandlerStack


@pytest.fixture(autouse=True, scope="module")
def infrastructure():
    """Setup and teardown logic for E2E test infrastructure

    Yields
    ------
    Dict[str, str]
        CloudFormation Outputs from deployed infrastructure
    """
    stack = EventHandlerStack()
    try:
        yield stack.deploy()
    finally:
        stack.delete()
```

3. Access stack outputs for E2E tests

Within our tests, we should now have access to the deployed infrastructure's CloudFormation Outputs. We can access any Stack Output using Pytest dependency injection.
```python
@pytest.fixture
def alb_basic_listener_endpoint(infrastructure: dict) -> str:
    dns_name = infrastructure.get("ALBDnsName")
    port = infrastructure.get("ALBBasicListenerPort", "")
    return f"http://{dns_name}:{port}"


def test_alb_headers_serializer(alb_basic_listener_endpoint):
    # GIVEN
    url = f"{alb_basic_listener_endpoint}/todos"
    ...
```

Internals

Test runner parallelization

Besides speed, we parallelize our end-to-end tests to ease asserting async side effects, which may take a while per test too, e.g., traces becoming available. The following diagram demonstrates the process we take every time you run the E2E test suite:

```mermaid
graph TD
A[make e2e test] -->Spawn{"Split and group tests <br>by feature and CPU"}
Spawn -->|Worker0| Worker0_Start["Load tests"]
Spawn -->|Worker1| Worker1_Start["Load tests"]
Spawn -->|WorkerN| WorkerN_Start["Load tests"]
Worker0_Start -->|Wait| LambdaLayer["Lambda Layer build"]
Worker1_Start -->|Wait| LambdaLayer["Lambda Layer build"]
WorkerN_Start -->|Wait| LambdaLayer["Lambda Layer build"]
LambdaLayer -->|Worker0| Worker0_Deploy["Launch feature stack"]
LambdaLayer -->|Worker1| Worker1_Deploy["Launch feature stack"]
LambdaLayer -->|WorkerN| WorkerN_Deploy["Launch feature stack"]
Worker0_Deploy -->|Worker0| Worker0_Tests["Run tests"]
Worker1_Deploy -->|Worker1| Worker1_Tests["Run tests"]
WorkerN_Deploy -->|WorkerN| WorkerN_Tests["Run tests"]
Worker0_Tests --> ResultCollection
Worker1_Tests --> ResultCollection
WorkerN_Tests --> ResultCollection
ResultCollection{"Wait for workers<br/>Collect test results"}
ResultCollection --> TestEnd["Report results"]
ResultCollection --> DeployEnd["Delete Stacks"]
```
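Because async side effects (e.g., traces) lag behind the Lambda invocation, the data_fetcher helpers typically need some form of polling. The sketch below is a hypothetical illustration of that idea, not the actual `get_traces()` implementation.

```python
# Hypothetical polling helper for eventually-consistent assertions (not the real data_fetcher code).
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    predicate: Callable[[T], bool],
    timeout_seconds: int = 120,
    interval_seconds: int = 5,
) -> T:
    """Call `fetch` repeatedly until `predicate` passes or the timeout is reached."""
    deadline = time.time() + timeout_seconds
    while True:
        result = fetch()
        if predicate(result):
            return result
        if time.time() > deadline:
            raise TimeoutError("Async side effect did not materialize in time")
        time.sleep(interval_seconds)


# Usage idea: wait until X-Ray returns at least one trace for our service
# traces = poll_until(lambda: get_traces(service_name), lambda t: len(t) > 0)
```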
CDK CLI parallelization

For the CDK CLI to work with independent CDK Apps, we specify an output directory when synthesizing our stack, and deploy from said output directory.

```mermaid
flowchart TD
subgraph "Deploying distinct CDK Apps"
EventHandlerInfra["Event Handler CDK App"] --> EventHandlerSynth
TracerInfra["Tracer CDK App"] --> TracerSynth
EventHandlerSynth["cdk synth --out cdk.out/event_handler"] --> EventHandlerDeploy["cdk deploy --app cdk.out/event_handler"]
TracerSynth["cdk synth --out cdk.out/tracer"] --> TracerDeploy["cdk deploy --app cdk.out/tracer"]
end
```
We create the typical CDK app for each feature:

```python
from tests.e2e.event_handler.infrastructure import EventHandlerStack
stack = EventHandlerStack()
stack.create_resources()
stack.app.synth()
```

When we run E2E tests for a single feature or all of them, our cdk.out output directory looks like this:

```
total 8
drwxr-xr-x 18 lessa staff 576B Sep 6 15:38 event-handler
drwxr-xr-x 3 lessa staff 96B Sep 6 15:08 layer_build
-rw-r--r-- 1 lessa staff 32B Sep 6 15:08 layer_build.diff
drwxr-xr-x 18 lessa staff 576B Sep 6 15:38 logger
drwxr-xr-x 18 lessa staff 576B Sep 6 15:38 metrics
drwxr-xr-x 22 lessa staff 704B Sep 9 10:52 tracer
```

```mermaid
classDiagram
class CdkOutDirectory {
feature_name/
layer_build/
layer_build.diff
}
class EventHandler {
manifest.json
stack_outputs.json
cdk_app_V39.py
asset.uuid/
...
}
class StackOutputsJson {
BasicHandlerArn: str
ALBDnsName: str
...
}
CdkOutDirectory <|-- EventHandler : feature_name/
StackOutputsJson <|-- EventHandler
```
Together, all of this allows us to use Pytest like we would for any project, and to use the CDK CLI and its context methods (e.g., from_lookup).
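As an illustration of how the stack_outputs.json shown above could feed the infrastructure fixture, a deploy step might parse the CDK outputs file and return a flat dict of outputs. This is a hypothetical sketch, not the actual BaseInfrastructure.deploy() implementation.

```python
# Hypothetical sketch: turn a CDK --outputs-file into the dict our tests consume.
import json
from pathlib import Path
from typing import Dict


def read_stack_outputs(outputs_file: Path, stack_name: str) -> Dict[str, str]:
    """CDK writes {stack_name: {OutputName: value, ...}}; flatten it for one stack."""
    all_outputs = json.loads(outputs_file.read_text())
    return all_outputs.get(stack_name, {})


# Usage idea inside deploy():
# outputs = read_stack_outputs(Path("cdk.out/event_handler/stack_outputs.json"), "EventHandler")
# outputs["ALBDnsName"], outputs["BasicHandlerArn"], ...
```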
Ready for review while I'm away next week - @rubenfonseca @leandrodamascena. This should delight Ruben with CDK.
Codecov Report

Base: 99.42% // Head: 99.42% // No change to project coverage 👍

Additional details and impacted files

```
@@           Coverage Diff           @@
##               v2    #1501   +/-   ##
=======================================
  Coverage   99.42%   99.42%
=======================================
  Files         128      128
  Lines        5745     5745
  Branches      670      671    +1
=======================================
  Hits         5712     5712
  Misses         18       18
  Partials       15       15
```
This is just a pre-review comment to show my appreciation for this awesome work. Congratulations guys, it looks amazing.
THANK YOU SO MUCH FOR THIS, what a great evolution from the previous code! Just left some comments and questions, but it looks very mergeable to me :)
Unfortunately, because of our changes in
@heitorlessa targeting v2, and fix the e2e tracer specs. Only problem remaining is the cdk.context.json. Already had to checkout the file multiple times before committing, because I didn't want to add my AWS account data on top of the file. The file will only get bigger as more maintainers run the tests. I think we should try to remove it for now and see how it goes.
Agree, let's remove. I think the "best practice" from CDK only makes sense for projects owned by a given team that use the same account, not in our case as an OSS lib.
Issue number: #1500
Summary
This PR moves our E2E deployment details to the CDK CLI and off Docker for the Lambda Layer build. As a bonus, it simplifies the E2E authoring experience, and removes our previous drop-in replacement for Asset bundling in favour of CDK CLI synth.
We can now use from_lookup methods. I've also made a minor modification in Tracer to move the annotation for Service, to allow updating the service name at runtime and ease E2E tests.

Changes

feature_name and handlers_dir at init to reduce authoring boilerplate. It will also be the basis to support reuse in other scenarios (integ test).
)User experience
See comments due to verbosity.
Acknowledgment
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.