-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use the same DockerContext and Dockerfile paths as the CDK is using #4074
Use the same DockerContext and Dockerfile paths as the CDK is using #4074
Conversation
return { | ||
SAM_METADATA_DOCKERFILE_KEY: dockerfile, | ||
SAM_METADATA_DOCKER_CONTEXT_KEY: dockerfile_context, | ||
SAM_METADATA_DOCKERFILE_KEY: str(dockerfile_path), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit concerned about this change since this will break the current code behaviour, might need to investigate on the current usage volume to decide if we can change this part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might need to investigate on the current usage volume to decide if we can change this part.
Sounds like a good move 👍
this change... will break the current code behaviour
As noted, I think it's good to investigate this.
From our own investigation...
I would not describe this as breaking
any current behavior.
It does change
the current behavior.
But a change
is only a breaking
if it causes something to stop working.
So - would this change
cause any currently working code to break
?
We do not think it will.
To illustrate...
Say your CDK code looks like this:
code = aws_lambda.DockerImageCode.from_image_asset(
directory="/Users/john/PycharmProjects/project-root/application/runtime/",
file="Dockerfile",
)
docker_lambda = aws_lambda.DockerImageFunction(code=code)
This works with the current code.
And this will still work after this PR is merged.
Currently - the SAM CLI only works with CDK code where file
is just the Dockerfile
and directory
is the full path to the Dockerfile
.
In contrast, CDK code like this...
code = aws_lambda.DockerImageCode.from_image_asset(
directory="/Users/john/PycharmProjects/project-root",
file="./application/runtime/Dockerfile",
)
docker_lambda = aws_lambda.DockerImageFunction(code=code)
...does not currently work with the SAM CLI. But it will work when this PR is merged.
An important question would be –
Why was the code written like this to begin with?
In the PR where this code was introduced – this does not seem to have been discussed in the comments.
What this code behaviour specified in your internal tools e.g. Jira? If so, what were the reasons given?
The goal of this feature is to be compatible with the CDK.
But currently, the code is not supporting an important capability of the CDK - the ability to control the directory that assets are built from.
Both cdk deploy
and sam build
work fine with metadata like:
DockerContext ="/Users/john/PycharmProjects/project-root",
Dockerfile ="./application/runtime/Dockerfile",
So it is not clear to us why the code in the PR was been written to ensure sam cli
always works with metadata like:
DockerContext ="/Users/john/PycharmProjects/project-root/application/runtime",
Dockerfile ="Dockerfile",
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it is not clear to us why the code in the PR was been written to ensure sam cli always works with metadata like
This is how SAM CLI handles it. So it was written to match SAM CLI expectations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jfuss apologies for the slow reply, I was on vacation.
So I had a look at the SAM CLI code to see how DockerContext
and Dockerfile
metadata are used by the SAM CLI.
I found 4 other usages.
Just one usage is relevant to this PR:
In this code block in samcli/lib/build/app_builder.py
We read the metadata properties:
dockerfile = cast(str, metadata.get("Dockerfile"))
docker_context = cast(str, metadata.get("DockerContext"))
Then we make docker_context
into an absolute path:
docker_context_dir = pathlib.Path(self._base_dir, docker_context).resolve()
Then we call docker build
using the python docker client, passing the properties as arguments:
build_args = {
"path": str(docker_context_dir),
"dockerfile": dockerfile,
}
build_logs = self._docker_client.api.build(**build_args)
So in summary -
Dockerfile
is passed without modification to_docker_client.api.build
as thedockerfile
argumentDockerContext
is converted to an absolute path, and passed to_docker_client.api.build
as thepath
argument
Here is the documentation for '_docker_client.api.build'
And those arguments are:
path (str) – Path to the directory containing the Dockerfile
dockerfile (str) – path within the build context to the Dockerfile
In my opinion, the documentation for path
is a little bit ambiguous.
Does this mean that path
is:
- the immediate parent directory containing the Dockerfile?
- a directory that ultimately contains the dockerfile, which might be nested in child folder(s)?
I would completely understand if something thinks it means 1.
- this is what I thought at first.
But if we look at the documentation for dockerfile
- it says path within the build context to the Dockerfile
Because dockerfile
can be a path within the build context
- it can be a path (potentially via subfolders) to the Dockerfile
.
So in fact, 2.
is the correct interpretation.
This interpretation is reinforced by the docker cli
documentation for docker build
Usage is:
docker build [OPTIONS] PATH
Where PATH
will be DockerContext
in the SAM CLI.
And the option --file
will be Dockerfile
in the SAM CLI.
In the docker cli
documentation, they refer to something called the build context
.
The build context
is very similar to PATH
(AKA DockerContext
).
The build context
is a copy of the folder specified in PATH
(DockerContext
), excluding files specified in .dockerignore
.
So --file
(AKA Dockerfile
) is a path to a docker file within the build context
(usually a relative path is used).
There is no requirement for the docker file to be in the root of the build context
.
I.E. --file
(Dockerfile
) can be a path via subfolders to the docker file.
Here's the documentation for --file
.
The documentation has examples where --file
is a path via subfolders to the docker file like:
docker build --file dockerfiles/Dockerfile.debug .
docker build --file dockerfiles/Dockerfile.prod .
In the example above, PATH
is .
- the current directory.
And there's a quote in this section of the docs.
By default the docker build command will look for a Dockerfile at the root of the build context.
The -f, --file, option lets you specify the path to an alternative file to use instead.
This is useful in cases where the same set of files are used for multiple builds.
The use case where the same set of files are used for multiple builds
is exactly what this PR is trying to solve.
This is the use case described in the PR description.
This use-case is supported by the AWS CDK
.
This use-case is also - in fact - already supported by the SAM CLI
.
The AWS CDK
docs support this use-case explicitly:
directory (str) – the directory from which the asset must be created.
file (Optional[str]) – Path to the Dockerfile (relative to the directory). Default: ‘Dockerfile’
The SAM CLI
developer guide docs do not support this use-case explicitly:
Dockerfile - The name of the Dockerfile associated with the Lambda function.
DockerContext - The location of the Dockerfile.
The SAM CLI
developer guide sound like interpretation 1.
of the documentation for '_docker_client.api.build'.
But in fact, the SAM CLI
code works perfectly well if Dockerfile
is a path via subfolders to the docker file.
And I've personally used this 'undocumented feature' in multiple SAM
projects I have worked on.
In other words -
What seems to have happened is...
- Someone thought that in
SAM
projects,Dockerfile
must be a direct child ofDockerContext
- SAM developer guide documented this limitation
- When
SAM
was integrated withCDK
, somebody wrote code to enforce this limitation
The code to enforce this supposed limitation means SAM
is not fully compatible with CDK
.
So this PR is removing that code, to improve support for the CDK
.
Also - it seems likely the supposed limitation never in fact existed, and it's always been possible in SAM
for Dockerfile
to be a path via subfolders to the docker file.
So I've created a PR to update the SAM CLI
developer guide, so people will be aware of this capability of the SAM CLI
.
I've also created a PR to update the documentation for the Docker Python Client to be more clear.
It may be possible that unclear documentation in docker-py
is the original source of the belief that SAM CLI
requires Dockerfile
to be a direct child of DockerContext
.
A further note -
Above there is a link do documentation for the docker cli
.
The python docker SDK doesn't use the docker cli
, it uses the docker api
.
So I checked the docker API documentation for build
It takes an argument:
dockerfile - Path within the build context to the Dockerfile.
It also takes a tar archive which contains the build context
- documented as:
Build an image from a tar archive with a Dockerfile in it.
The Dockerfile specifies how the image is built from the tar archive.
It is typically in the archive's root, but can be at a different path or have a different name by specifying the dockerfile parameter.
I.e. dockerfile
'is typically in the archive's root' - but it can be a path to a dockerfile in a subfolder.
And I checked the source code of the docker python sdk, to see how it's interacting with the docker API.
Here's the code where it creates the build context tar archive
.
This is a copy of the folder specified in path
(AKA DockerContext
), excluding files specified in .dockerignore
.
And on this line it passes Dockerfile
to the docker API as dockerfile
.
For reference, here are the 3 other usages of DockerContext
metadata, which are not relevant to this PR:
- In samcli/commands/build/build_context.py - If it's a docker lambda, confirm these metadata properties are defined.
- In samcli/lib/utils/resource_trigger.py - trigger a code refresh if this metadata is changed the the developer
- In samcli/lib/providers/sam_function_provider.py - converting
DockerContext
to be an absolute path if necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @i18n-tribe for your contribution and your very detailed explanation. I think you have a point regarding this change, I think we miss understood the DockerContext
and DockerFile
properties and we assumed that the DockerFile
should be a file in the DockerContext
directory path. I did some investigations to make sure that this change will not break any customer. I think it will work as expected, and fix this bug.
my only request is to add some integration testing to cover this case
…e_paths_as_the_cdk_is_using
@i18n-tribe I am still trying to grasp what is being changed and how that impacts the CLI. At the very least, we need integration tests added to ensure this behaves as expected. I would have to check to see if we have integ tests that cover the current integrations, but if we don't have those, we need them added. This ensures we are not breaking current customers and allows us to better regression test going further. |
Thanks for your contribution @i18n-tribe. Until you will find sometime and get back to this PR, I am marking it as draft. |
@jfuss perhaps it could be productive if we arrange a quick video call to discuss? I can completely understand you not yet finding this clear. It's quite a subtle issue, and it takes quite a lot of words to explain in writing. But I have a feeling if we discuss it briefly in realtime, we will quickly be able to align on a common understanding. If you have Calendly or a similar app I could use to book a meeting at a convenient time for you, I'd be very happy to do so. We could potentially also include @sriram-mv who originally wrote the code to support docker lambdas using SAM. In the meantime, I have written a systematic analysis of how the From this analysis, and from my personal experience working on multiple SAM projects, I believe the |
@mndeveci sorry for slow reply to @jfuss 's comments, I was on vacation. I'd be more than happy to help move this forward when you have some time - by discussing on a brief call, and if you like by helping to write the integration tests. I think a call would be really helpful as a next step, because we do not yet have a shared understanding of what this PR is doing. Once we have a shared understanding of the PR, it will be easier to prioritise this PR accurately - at an appropriate level of priority. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Could you please add some integration testing to cover this case.
…ubdirectories. This means images can share code more easily.
d0f6b8f
to
ec8d936
Compare
Thanks @moelasmar , I have added a new commit with integration testing for the case where All tests are passing in CI. I'm not sure if this new test class is running in CI, but I have manually run it locally with the following command, and it's passing 👍
|
…e_paths_as_the_cdk_is_using
Thanks @i18n-tribe for your update. Can I ask you to update the integration test case to be a CDK application and add it here or at least use a synthesized template, as the sample test case you added not really testing the metadata normalizer change you introduce. |
@moelasmar sorry for the slow reply, work has been busy. I agree the first tests I added do not test the CDK integration, and it would be good to test CDK integration. Thanks for linking to the location of the CDK integration tests. I have now added some new tests which hopefully cover this. I think the first tests I added also add value, because they demonstrate the more general point that SAM does in fact support Thus - the first tests I added mean that this PR to update the SAM developer guide can be merged. |
…ile` is a path to a docker file via subdirectories.
752ac7f
to
178ddb9
Compare
@moelasmar - all the integration tests are passing on my machine 👍 Ran with command |
…e_paths_as_the_cdk_is_using
Thanks @i18n-tribe for your contribution |
…e_paths_as_the_cdk_is_using
…e_paths_as_the_cdk_is_using
…e_paths_as_the_cdk_is_using
…e_paths_as_the_cdk_is_using
…e_paths_as_the_cdk_is_using
…e_paths_as_the_cdk_is_using
…e_paths_as_the_cdk_is_using
Hello Folks :) Any estimative on when this gonna be merged? I'm having problems with my current setup where the dockerfile resides outside the root folder and the build fails because it cannot reach shared code that resides in another folder and I need to specify the docker context. |
…e_paths_as_the_cdk_is_using
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and thanks for your contribution!
…e_paths_as_the_cdk_is_using
Hey @i18n-tribe, can you please checking the failing tests? - https://ci.appveyor.com/project/AWSSAMCLI/aws-sam-cli/builds/45273404 |
99af471
to
b36eaa0
Compare
Fixed in this commit @hawflau 👍 |
Which issue(s) does this change fix?
#4059
Why is this change necessary?
When defining an
aws_lambda.DockerImageFunction
construct using the CDK, the docker image is configured with code like this:When the CDK project is built, the CDK uses
directory
as theDockerContext
andfile
as theDockerfile
.See the documentation:
This means
/Users/john/PycharmProjects/project-root
is theDockerContext
and the
Dockerfile
is a path to a docker file from within theDockerContext
- i.e../application/runtime/Dockerfile
This capability of the CDK is useful because...
Say you have a folder structure like:
There are 2 apps in the repo - in folders
application
andanother_application
Each app contains a lambda.
The 2 lambdas share some common code, which is in a folder called
shared_code
In each lambda's
Dockerfile
, we need to copy theshared_code
module into the container.Therefore, when we are building each lambda's container, we need to set the
DockerContext
to be the root folder of the repo/project-root
, because this means/project-root/shared_code
will be accessible when we build the container.The
DockerContext
cannot be the folder that containsDockerfile
. I.e. it cannot beproject-root/application/runtime
- because the/project-root/shared_code
folder would not be readable while the container is being built.This example works fine in the CDK.
However - when we wish to test locally using the
sam-cli
- it does not work.The reason it does not work with the
sam-cli
is - thesam-cli
code is not using the sameDockerContext
andDockerfile
paths as the CDK. Thesam-cli
code is currently ensuring thatDockerContext
is always the folder that containsDockerfile
, irrespective of howDockerContext
has been configured in theaws_lambda.DockerImageFunction
CDK construct.So in this example -
DockerContext
is/Users/john/PycharmProjects/project-root/application/runtime
Side-by-side comparison:
DockerContext
/Users/john/PycharmProjects/project-root
/Users/john/PycharmProjects/project-root/application/runtime
Dockerfile
./application/runtime/Dockerfile
Dockerfile
Side Note:
This example would be the same with a single app in the repo.
This example is the same in any case where there are 2 lambdas that share some code, but don't share a Dockerfile.
How does it address the issue?
The SAM CLI will now use the same
DockerContext
andDockerfile
paths as the CDK is using.What side effects does this change have?
No effect at all on production code.
It could in theory effect local testing in some cases.
However – we cannot think of a scenario where local testing is currently working, but will stop working because of this change.
If there is any code that is currently working that will stop working because of this change – the currently working code must be relying on in some kind of hack to work around the issue this PR is fixing.
We wrote our own hack - it looks like this:
So we are manually setting
DockerContext
andDockerfile
to be the same as the CDK.And by setting
SamNormalized=True
metadata, we are able to bypass the currentsam-cli
code that is settingDockerContext
andDockerfile
metadata to be different to the CDK - which would overwrite what we had manually set.sam build
works fine with our hack.Our code (that includes the hack) will still work after the change in this PR is merged. So if a team has done a similar hack to us, their ability to test locally will not break when this PR is merged.
Mandatory Checklist
PRs will only be reviewed after checklist is complete
make pr
passesmake update-reproducible-reqs
if dependencies were changedBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.