
[AWS SageMaker] Add CodeBuild Steps #3668

Merged: 23 commits, May 4, 2020
Changes from 1 commit
Remove TTY from docker run
Nicholas Thomson committed Apr 30, 2020
commit aed84654fd4b40bfefd8c095da8fba090f2367b8
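The commit drops the `-it` flags because CodeBuild runs build commands non-interactively: no terminal is attached to stdin, and `docker run -t` in that situation fails with "the input device is not a TTY". A minimal sketch of the distinction, assuming a POSIX shell; the `tty_flags` variable and container name are illustrative, not from the buildspec:

```shell
# CodeBuild executes commands without a terminal: stdin is not a TTY.
# `[ -t 0 ]` tests whether file descriptor 0 (stdin) is a terminal.
if [ -t 0 ]; then
  tty_flags="-it"   # local interactive shell: a TTY is available
else
  tty_flags=""      # CI job: requesting a TTY would make `docker run` fail
fi
echo "docker run ${tty_flags:+$tty_flags }--name test-container ..."
```

Since the build output is only captured to logs anyway, nothing is lost by never requesting a TTY in the buildspec.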
@@ -7,7 +7,7 @@ phases:

       # Run the container and copy the results to /tmp
       # Passes all host environment variables through to the container
-      - docker run -it --name integration-test-container $(env | cut -f1 -d= | sed 's/^/-e /') amazon/integration-test-image
+      - docker run --name integration-test-container $(env | cut -f1 -d= | sed 's/^/-e /') amazon/integration-test-image
Contributor:

Design questions:

  1. Why do we need to run a container inside a container here?
  • We could have a CodeBuild step that builds all the images needed for the pipeline (release/unit/integ-test) first, so that each subsequent step can use those directly as the base image of its CodeBuild job.
    This matters especially for integration test and deploy: the test container needs IAM permissions, and possibly access to resources outside the CodeBuild job, so we could avoid passing them explicitly here.
    We would also only maintain one image per job, rather than a base image plus a step-specific image.
  2. Placement of these CodeBuild spec files
  • Why do we need a separate codebuild directory? Can we add the specs into their respective directories, and create a release directory if needed?

Contributor Author (@RedbackThomson, May 1, 2020):

* We could have a CodeBuild step that builds all the images needed for the pipeline (release/unit/integ-test) first, so that each subsequent step can use those directly as the base image of its CodeBuild job.
  This matters especially for integration test and deploy: the test container needs IAM permissions, and possibly access to resources outside the CodeBuild job, so we could avoid passing them explicitly here.
  We would also only maintain one image per job, rather than a base image plus a step-specific image.

I worry that this would add complexity to the build process and make it more prone to failure. We would need another CodeBuild project and an ECR repository to build and host the test images, and our integration test project would need to point to a specific tag in that repository. Developers would also need to know to run the build step before the integration test step on every run, rather than simply running one step. I don't think it would save time either: spinning up the CodeBuild run, building the container, uploading it to ECR, and then finally running the integration test CodeBuild project would surely take longer than building and running it in one step.

My ideal world would be this:

  • The developer of a new integration test can edit a .env file and then build and run the integration test Dockerfile. Everything would be created automatically (except, perhaps, data), and all of the tools needed for testing would be included in that Docker container.
  • All the pipeline needs to do is build and run that same container. The scripts would be written so that we could programmatically override any values (role names, bucket names, etc.) to work with either system.

The current implementation of CodeBuild uses the standard:3.0 base image, which includes Docker and Python if we need them. I have already included in this CodeBuild spec the ability for the outer CodeBuild container to pass all of its environment variables through to the inner (integration test) container. This lets us pass any values from our CDK infrastructure into the CodeBuild container, and the integration test container will be able to access them, including role permissions.

Contributor Author:

* Why do we need a separate codebuild directory? Can we add the specs into their respective directories, and create a release directory if needed?

It is fairly common for projects to have a "codebuild" directory, though I've seen both approaches. Because each of these CodeBuild steps does roughly the same thing (build and run a container), I think they should stay close together so we remember to update them all. They also don't really have anything to do with the actual unit or integration tests themselves; I wouldn't expect anyone other than us to care about them.

- docker cp integration-test-container:/app/integration_tests/log /tmp/results.xml
- docker rm -f integration-test-container
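The `$(env | cut -f1 -d= | sed 's/^/-e /')` substitution in the buildspec builds one `-e NAME` flag per host variable; `docker run -e NAME` with no `=value` copies that variable's current value from the host environment into the container. A sketch of the expansion, with an illustrative variable name:

```shell
# Illustrative variable; any exported name would appear the same way.
export AWS_REGION=us-west-2

# env prints NAME=value pairs, cut -f1 -d= keeps only the names,
# and sed prepends "-e " so each name becomes a docker env flag.
env | cut -f1 -d= | sed 's/^/-e /' | grep -x -- '-e AWS_REGION'
# → -e AWS_REGION
```

Note this relies on the shell word-splitting the unquoted substitution, and on no variable values containing newlines (a multi-line value would leak extra lines into the flag list).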

2 changes: 1 addition & 1 deletion components/aws/sagemaker/codebuild/unit-test.buildspec.yml
@@ -7,7 +7,7 @@ phases:

       # Run the container and copy the results to /tmp
       # Passes all host environment variables through to the container
-      - docker run -it --name unit-test-container $(env | cut -f1 -d= | sed 's/^/-e /') amazon/unit-test-image
+      - docker run --name unit-test-container $(env | cut -f1 -d= | sed 's/^/-e /') amazon/unit-test-image
- docker cp unit-test-container:/app/unit_tests/log /tmp/results.xml
- docker rm -f unit-test-container
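For context, the hunk header `@@ -7,7 +7,7 @@ phases:` places these commands inside a buildspec's phases. A minimal sketch of how the changed file might be laid out, assuming the buildspec version 0.2 schema; everything outside the commands shown in the diff is reconstructed for illustration and may differ from the actual file:

```yaml
version: 0.2
phases:
  build:
    commands:
      # Build the image, then run it with all host env vars forwarded
      - docker build -t amazon/unit-test-image .
      - docker run --name unit-test-container $(env | cut -f1 -d= | sed 's/^/-e /') amazon/unit-test-image
      # Copy the test results out of the stopped container, then clean up
      - docker cp unit-test-container:/app/unit_tests/log /tmp/results.xml
      - docker rm -f unit-test-container
```

Note that `docker cp` works on a stopped container, which is why the results can be extracted after the test run exits and before `docker rm -f` removes the container.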