Maintenance: Refactor E2E mechanism for extensibility, ease test writing, and unblock integ tests #1435
Comments
@mploski it turns out the PR is much smaller than I thought: ~800 LOC added, ~300 deleted. Given that you're mostly familiar with it, I'm inclined to make a single PR instead of 3 smaller ones (which would need more coordination). I'll write a proper description of the changes and send a PR tomorrow for review before you go on holidays.
Final result with the new parallelization and the Lambda Layer being built once. The last step is to enable it at the CI level and measure whether the retry/jitter numbers are sufficient for the random hardware we might get.
graph TD
A[make e2e test] -->Spawn{"Split and group tests <br>by feature and CPU"}
Spawn -->|Worker0| Worker0_Start["Load tests"]
Spawn -->|Worker1| Worker1_Start["Load tests"]
Spawn -->|WorkerN| WorkerN_Start["Load tests"]
Worker0_Start -->|Wait| LambdaLayerStack["Lambda Layer Stack Deployment"]
Worker1_Start -->|Wait| LambdaLayerStack["Lambda Layer Stack Deployment"]
WorkerN_Start -->|Wait| LambdaLayerStack["Lambda Layer Stack Deployment"]
LambdaLayerStack -->|Worker0| Worker0_Deploy["Launch feature stack"]
LambdaLayerStack -->|Worker1| Worker1_Deploy["Launch feature stack"]
LambdaLayerStack -->|WorkerN| WorkerN_Deploy["Launch feature stack"]
Worker0_Deploy -->|Worker0| Worker0_Tests["Run tests"]
Worker1_Deploy -->|Worker1| Worker1_Tests["Run tests"]
WorkerN_Deploy -->|WorkerN| WorkerN_Tests["Run tests"]
Worker0_Tests --> ResultCollection
Worker1_Tests --> ResultCollection
WorkerN_Tests --> ResultCollection
ResultCollection{"Wait for workers<br/>Collect test results"}
ResultCollection --> TestEnd["Report results"]
ResultCollection --> DeployEnd["Delete Stacks"]
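The flow in the graph can be sketched roughly as follows. This is a minimal illustration only, not the project's implementation: it uses threads from the standard library as stand-ins for test workers, and every name in it (`FEATURES`, `get_or_build_layer`, `run_feature`, the placeholder layer identifier) is hypothetical. It shows the key idea that each worker waits on a shared Lambda Layer that is built exactly once, then handles its own feature stack, and that stacks are deleted after results are collected.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical feature groups; one worker per group in this sketch.
FEATURES = ["metrics", "logger", "tracer"]

_layer_lock = threading.Lock()
_layer_arn = None
layer_builds = 0       # how many times the shared layer was actually built
deleted_stacks = []    # stacks removed after results are collected


def get_or_build_layer():
    """Build the shared layer once; other workers block, then reuse it."""
    global _layer_arn, layer_builds
    with _layer_lock:
        if _layer_arn is None:
            layer_builds += 1
            _layer_arn = "layer-arn-placeholder"  # stand-in for a real deployment
        return _layer_arn


def run_feature(feature):
    """One worker: wait for the layer, launch the feature stack, run tests."""
    layer = get_or_build_layer()
    stack = f"{feature}-stack"  # stand-in for launching the feature stack
    passed = True               # stand-in for running the feature's tests
    return {"feature": feature, "layer": layer, "stack": stack, "passed": passed}


def run_all(features):
    # Wait for all workers and collect their results.
    with ThreadPoolExecutor(max_workers=len(features)) as pool:
        results = list(pool.map(run_feature, features))
    # Delete every stack once results are in.
    for result in results:
        deleted_stacks.append(result["stack"])
    return results


results = run_all(FEATURES)
```

In a real test run the workers would more likely be separate processes, so the "build once" guard would need a cross-process mechanism such as a file lock rather than `threading.Lock`; that detail is an assumption and is not shown in the comment above.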
This is now released in version 1.28.0!
Summary
We defined the E2E mechanism in #1226 and recently implemented a POC. This issue tracks the extensibility work required to allow E2E tests to own their infrastructure. This includes making it easier for each E2E test to control the payload it uses to invoke its Lambda function.
Tasks
Why is this needed?
When trying to increase E2E coverage, we identified a gap in the mechanism: it doesn't allow each feature group (e.g., Metrics) to customize its own infrastructure.
This made it difficult to add E2E tests for Idempotency, where we wanted to (1) create a DynamoDB table and reference it in a Lambda function environment variable, and (2) send different payloads to test idempotency results.
Once this is complete, we can resume increasing coverage for other features, including defining test strategies (when to use which), and resume integration tests.
Which area does this relate to?
Tests
Solution
This is what it looks like on my fork using the new refactoring (one or more PRs depending on final size).
Metrics infrastructure
tests/e2e/metrics/infrastructure.py fully controls what infrastructure Metrics should have. It keeps the original mechanism of creating Lambda functions from the handlers directory, but makes it more explicit. It also allows overriding any CDK Lambda function prop.
Metrics infrastructure parallelization
tests/e2e/metrics/conftest.py handles infrastructure deployment so tests can run in parallel after deployment. It now also handles stack deletion in case of failures.
Metrics test including cold start
tests/e2e/metrics/test_metrics.py now uses CDK Stack outputs as separate fixtures for each function ARN. It creates a standard for outputs so that any function becomes two outputs in PascalCase: Name and NameArn. Additional helpers were added to make explicit tests easier to create. The main focus is ease of writing tests and allowing any new or existing contributor to understand what's going on.
Functions are fully isolated, which allows us to safely parallelize tests: before ~175s, and now ~94s even with an additional Lambda function (ColdStart).
Acknowledgment
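Referring back to the output convention described under "Metrics test including cold start" (each function becomes two PascalCase outputs, Name and NameArn), the mapping could look roughly like this. It is a sketch under assumptions: the helper names `to_pascal_case` and `output_names` are hypothetical, and it assumes handler file names are written in snake_case, which the issue does not state.

```python
from typing import Tuple


def to_pascal_case(name: str) -> str:
    """Convert a snake_case name such as 'cold_start' to PascalCase."""
    return "".join(part.capitalize() for part in name.split("_") if part)


def output_names(handler_name: str) -> Tuple[str, str]:
    """Return the two stack output keys for a function: Name and NameArn."""
    base = to_pascal_case(handler_name)
    return base, f"{base}Arn"


print(output_names("cold_start"))  # ('ColdStart', 'ColdStartArn')
```

A test fixture could then look up each key in the deployed stack's outputs, so adding a new handler file would not require any extra wiring beyond following the naming rule.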