Optimise upload of deployment artifacts #8666

Open · medikoo opened this issue Dec 23, 2020 · 49 comments

@medikoo (Contributor) commented Dec 23, 2020

Use case description

Somewhat related to #8499

Currently, on service deployment, we generate and upload all artifacts to the Serverless deployment bucket, even if they remain unchanged compared to the previous deployment.

This is highly inefficient, as in many cases the upload and the related resource updates take a significant part of the deploy process (I haven't investigated how AWS treats the case where the same zip file (with the same hash) is provided for a Lambda from a different location (different URI), but I assume it is still treated unconditionally as a code update).

While, to maintain the (locally) stateless nature, we need to generate artifacts for all resources locally on each service deployment, once that is done we can compare them against the deployed ones to check whether their hashes changed, and on that basis avoid unnecessary uploads.

Proposed solution

Note: This is based on the implementation idea presented in the @remi00 PR, which seems to provide us with the means to introduce this improvement transparently (without a need for additional flags or breaking changes). It additionally ensures that, when relying on the sls package and sls deploy --package steps separately, we do not accidentally produce an erroneous deploy.

Change the location where artifacts are stored in the S3 bucket to a common folder that holds artifacts from all deployments, named after their MD5 hash. That will make it easy to confirm whether a given artifact has already been uploaded.

I propose to store them in the <deployment-prefix>/<service>/<stage>/code-artifacts folder.

In the packaging step:

  • When configuring the Lambda artifact location in the CF template, internally resolve the hash for the given artifact and return a name dedicated to the S3 bucket. Additionally, store the resolved hash name in a map, which should be persisted in serverless-state.json, so that at the deployment step we do not need to extract the generated hash names from the generated CF template, as that can be problematic (see the sketch after this list).
  • Ensure that the hashes we calculate for Lambda versioning rely on the same hashing logic, and that we do not calculate the hash for the same file twice.
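
A minimal sketch of the hash-based naming from the first point above (the helper name and the map shape are illustrative, not the framework's actual internals; the artifactsHashNamesMap name follows the proposal below):

```js
'use strict';

const crypto = require('crypto');
const fs = require('fs');

// Illustrative helper: resolve the MD5 hash of a locally generated artifact and
// derive its S3 key under the proposed <deployment-prefix>/<service>/<stage>/code-artifacts
// folder. The resolved name is also recorded in a map that would be persisted in
// serverless-state.json.
function resolveHashedS3Key(artifactPath, { deploymentPrefix, service, stage }, artifactsHashNamesMap) {
  const hash = crypto.createHash('md5').update(fs.readFileSync(artifactPath)).digest('hex');
  const s3Key = `${deploymentPrefix}/${service}/${stage}/code-artifacts/${hash}.zip`;
  artifactsHashNamesMap[artifactPath] = s3Key;
  return s3Key;
}

// Hypothetical usage while generating the CF template:
// const artifactsHashNamesMap = {};
// template.Resources[lambdaLogicalId].Properties.Code.S3Key =
//   resolveHashedS3Key('.serverless/my-function.zip',
//     { deploymentPrefix: 'serverless', service: 'my-service', stage: 'dev' },
//     artifactsHashNamesMap);
```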

In the deployment step:

  • Resolve artifact S3 location paths from the hash map stored in the serverless-state.json file. For convenience, ideally the hash map is assigned to serverless.getProvider('aws').artifactsHashNamesMap in the context of extendedValidate, where the actual serverless-state.json is read.
  • On cleanup of old versions, we should deduce from the CF templates of the versions that stay which code artifacts should remain in the S3 bucket, and on that basis remove everything found in the code-artifacts folder that is not referenced by the kept CF templates.
@pgrzesik (Contributor) commented Jan 7, 2021

One thing to note: we need to remember to update the logic that checks whether we should try to redeploy (https://github.com/serverless/serverless/blob/master/lib/plugins/aws/deploy/lib/checkForChanges.js#L156-L173), which currently relies on checking the functions' funcLastModifiedDate. If we don't update all functions on a deployment, this check will always be true, as non-modified functions will have older modification dates in the configuration. It currently affects Docker images as well.

@remi00 (Contributor) commented Jul 7, 2021

I can confirm that AWS CloudFormation only cares about the Code parameter when checking whether a Lambda has to be redeployed. And taking care not to change it on subsequent redeploys of the stack indeed brings a significant improvement in deployment time.

The question is how to implement the solution. I've seen that the package command currently performs a similar kind of check for layers. However, it uses CloudFormation outputs to store the hashes of the layer packages, which is risky and not future-proof due to the CFN quota of 200 output items.

Solution proposal [1]: There is already a mechanism for checking whether deployment is required at all on the deploy command. It could be reworked so that it carries the information individually per Lambda artifact instead of as a single global flag (shouldNotDeploy). The hash of the function can be retrieved through AWS.Cloudformation.getStackResources() and then AWS.Lambda.getFunction(). The metadata about which Lambdas changed would later be handled accordingly within uploadArtifacts and others.

Solution proposal [2]: probably more future-proof, though wider in scope - as suggested by @medikoo, with a change to how the artifacts are stored within the deployment bucket. The approach inspired by AWS SAM and aws cloudformation package could be reasonable: with an artifact naming pattern of <stage>/<function-name>/<sha-hash>.zip it would be really easy to detect whether there is any change - just S3.headObject() to know whether a given artifact file is already uploaded. The rest of the solution would be similar to proposal [1].
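
A minimal sketch of the headObject-based existence check described in proposal [2] (AWS SDK for JavaScript v2; bucket and key values are placeholders):

```js
'use strict';

const AWS = require('aws-sdk');

const s3 = new AWS.S3();

// Returns true when an object with the given key already exists in the bucket,
// meaning the artifact upload (and the resulting code update) can be skipped.
async function artifactAlreadyUploaded(bucket, key) {
  try {
    await s3.headObject({ Bucket: bucket, Key: key }).promise();
    return true;
  } catch (error) {
    if (error.code === 'NotFound' || error.statusCode === 404) return false;
    throw error; // e.g. permission problems should surface, not be treated as "missing"
  }
}

// Hypothetical usage with the naming pattern from the proposal:
// await artifactAlreadyUploaded('my-deployment-bucket', 'dev/my-function/<sha-hash>.zip');
```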

@medikoo (Contributor, Author) commented Jul 8, 2021

@remi00 many thanks for that insight. It's great to have confirmed that AWS CF simply cares about Code, and that matching file hashes do not trigger any optimization on its side.

I also believe that [2] is the way to go. I've updated the main description with the solution proposal. It's not a detailed implementation spec, but it outlines what should change in the internal logic to get that optimization applied and working.

We welcome a PR on that!

If the solution description is not good enough, I may attempt to prepare a more detailed implementation spec on how it should be tackled.

@remi00 (Contributor) commented Jul 26, 2021

I created a PoC based on serverless 2.51.2 that you can find 🚧 here, focusing on a configuration setup which, in overview, includes the following plugins:

  - serverless-webpack
  - serverless-plugin-git-variables
  - serverless-prune-plugin
  - serverless-dependson-plugin
  - serverless-provisioned-concurrency-autoscaling
  - serverless-plugin-ifelse

There are a few findings worth addressing before making the mechanism production-ready.

Non-deterministic ZIP hashes

This is a struggle with the extra metadata (namely the last-modified timestamp) that ZIP includes in archives. (Re-)packaging creates new zip files with different timestamps and a different file entry ordering (at least in the setup with the webpack plugin). Even though it might be considered a smell or a risk, it looks like a change to the hash calculation algorithm has to be made so that only the contents are taken into account. This implies scanning the ZIP archive's file entries. In the prototype I took advantage of the crc32 property of each file entry specified in the ZIP spec to avoid a pure binary scan, but it's TBD whether that is a premature optimization.
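
For reference, a rough sketch of that content-only hashing idea (this is not the PoC's actual code; it assumes the third-party adm-zip package and its per-entry header.crc field):

```js
'use strict';

const crypto = require('crypto');
const AdmZip = require('adm-zip'); // assumed third-party dependency

// Hash only the entry names and their CRC-32 checksums, sorted by name, so that
// re-packaging with different timestamps or a different entry order still yields
// the same value as long as the file contents are unchanged.
function contentOnlyZipHash(zipPath) {
  const hash = crypto.createHash('md5');
  new AdmZip(zipPath)
    .getEntries()
    .map((entry) => `${entry.entryName}:${entry.header.crc}`)
    .sort()
    .forEach((line) => hash.update(line));
  return hash.digest('hex');
}
```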

Dropping the (optional) CodeSha256 property from Lambda version resources

The PoC includes changes for Lambda version resources to manage their creation in line with the rest (so that new versions are also created only when either the actual ZIP artifact contents or the configuration change). Unfortunately, because of the change described above, the CodeSha256 values that are currently added to AWS::Lambda::Version resources would differ from the one calculated by AWS Lambda. Therefore, they have to be dropped. This property is optional at the CloudFormation level, and as far as I understand its purpose, its value is questionable in the deployment model that Serverless uses. AWS::Lambda::Function and AWS::Lambda::Version are always deployed together, in the same stack, and providing CodeSha256 just prevents creating a new Lambda version in case changes were made to the Lambda code between the deployment of the two resources.

Both issues above should be gone, see further comments.

GIT variables plugin interfering

When using Lambda versioning, this new deploy mechanism will interfere with the serverless-plugin-git-variables plugin, as that plugin modifies the configuration with each commit. Users of this plugin will have to rely on other ways of tracking the connection between Git and the deployed code. Lambda versioning is already a versioning mechanism (people can rely on the Lambda version in CloudWatch log stream names), but it might be useful to e.g. allow putting Git metadata on the artifacts uploaded to S3.

And regarding the questions from @medikoo's technical proposal:

  • There is no issue with filenames based on the hash value, assuming hex encoding is used instead of base64. The length is OK too.
  • The S3 upload from the AWS SDK that Serverless uses is already a multipart upload, with all the bells and whistles for handling the upload in a bulletproof fashion.

My organization and I would ❤️ to have this improvement in Serverless, so we are greatly interested in contributing to it. It'd be great if we got aligned on the concerns highlighted above first. I am fully open to discussing them and proceeding with the work. 😎

@remi00 (Contributor) commented Jul 26, 2021

Additionally, referring to @pgrzesik's comment under #9732, this change will indeed have to take rollback into account, although it was not in the scope of the PoC.

@remi00 (Contributor) commented Jul 27, 2021

I will also add some stats for a sample large project that we tested before and after the improvements.

  • Project size: ca. 40 Lambda functions
  • The initial setup was with individual: false; with the PoC we obviously changed it to true.
  • Package step: ~2 minutes degraded to ~3 minutes, which was expected, as individual packages now have to be created.
  • Deploy step: ~12 minutes reduced to less than 3 minutes when a small change applies to 3-5 Lambda functions. It is expected to grow linearly.

The largest benefit is therefore a significant possible reduction of deployment time. However, it is not the only one; others are:

  • A better development workflow, reducing the need to use sls deploy function.
  • Reduced upload traffic, which is important for poor-connectivity scenarios (development rather than CI/CD paths).
  • Lambda versioning starts making sense - currently each deploy action triggers publishing of all Lambda functions. After the change, only modified ones will be published, as needed.

@remi00 (Contributor) commented Jul 28, 2021

I just realized that in a config without serverless-webpack, ZIP artifacts created with pure serverless have the last-modified timestamp zeroed out (a @pgrzesik change from ca. half a year ago), and serverless-webpack is going to address this issue the same way with PR serverless-heaven/serverless-webpack#911. Therefore, my main concerns are gone and the solution will be much, much simpler, without changes to the hash calculation.

@pgrzesik (Contributor) commented:
Thanks a lot for submitting your PoC @remi00 - I'm a bit busy at the moment but I will try to take a look at it tomorrow or next week (I'm not available on Friday). 🙇

@remi00 (Contributor) commented Jul 29, 2021

@pgrzesik no worries. Just take a look at this one, with a much reduced scope of basic changes.

@pgrzesik (Contributor) commented Aug 4, 2021

Thanks @remi00 - I've reviewed the PoC and I think it looks like a good start. One thing that we definitely need to keep in mind as well is the functionality that removes previous deployment artifacts from here: https://github.com/serverless/serverless/blob/master/lib/plugins/aws/deploy/lib/cleanupS3Bucket.js

How do you imagine supporting that with the proposed changes?

@remi00 (Contributor) commented Aug 5, 2021

Please review my proposal:

deploy list and rollback functionality (including cleanupS3Bucket.js)

I assume we need to keep backwards compatibility, so this area of the implementation would rather just be expanded to be aware of both the legacy (current) and the new approach. The proposal would be to adjust the way all artifacts are stored, including the storage of compiled-cloudformation-template.json, and to look up the template to find out the artifacts associated with a given deployment.

Current S3 storage structure:

Code artifacts: <prefix>/<timestamp>-<datetime>/<service-name>.zip
Custom resources package: <prefix>/<timestamp>-<datetime>/custom-resources.zip
Template file: <prefix>/<timestamp>-<datetime>/compiled-cloudformation-template.json

Proposed S3 storage structure:

Code artifacts: <prefix>/<service-name>/<hash>.zip
Custom resources package: <prefix>/custom-resources/<hash>.zip
Template file: <prefix>/compiled-cloudformation-templates/<timestamp>-<datetime>.json

Both the current and the new storage structure would be supported, so analysis of the S3 bucket and the compiled CFN template contents will allow us to provide all currently available functionality for deploy list, rollback, and S3 bucket pruning (cleanupS3Bucket).

Additional background: In the new approach, the challenge is that we won't have a discrete, single deployment timestamp in the artifact storage path. For that purpose we could adjust the format of the storage path (we have the LastModified param in S3.ListObjectsV2(), but it will differ per artifact depending on how long the deployment takes, so it's unreliable). Keeping the timestamp for CloudFormation templates will allow us to keep backwards compatibility.

Opt-in configuration flag for S3 storage structure

Should the change of S3 storage structure be configurable? To some extent it's an internal thing, but maybe some plugins or (rarely used) features still depend on it. Should we make it configurable, then?

@pgrzesik @medikoo Could you please advise on both matters? :) Thanks!

@remi00 (Contributor) commented Aug 5, 2021

An additional item concerns layers:

Updating layers storage management

Should we reuse the packaging and deploy flow for Lambda functions also for Lambda layers?

Additional background: Currently the approach to layers is slightly different. The storage structure is the same as for code artifacts (see the comment above). However, there is an optimization on upload and deploy: each layer added to the project adds three Output items to the CloudFormation template (and stack), including its artifact hash. These outputs are used solely on subsequent deploys to detect whether a re-deploy is necessary.
This mechanism would be inconsistent with the newly proposed approach. It also has the drawback of using CloudFormation Outputs (somewhat polluting them), and there is a limit of at most 200 CloudFormation outputs per stack.

medikoo added the deprecation (Deprecation proposal - breaking with next major) label on Aug 9, 2021
@medikoo (Contributor, Author) commented Aug 9, 2021

@remi00 many thanks for all the insight and the initial draft. It's all extremely valuable. You've already helped us answer some important questions we had.

I've just updated the top description with a proposal that also takes deployment versioning into account (so rollback handling etc.). I think it should already answer some of the questions you've raised above.

(@pgrzesik please also let me know what you think about the updated proposal)

Some additional points:

  1. I wouldn't treat the custom resource Lambda package as special in any way. I would just ensure that all artifacts to be uploaded (either generated by us or prepared by users) are copied to the deployment dir (<service-dir>/.serverless) under a <hash>.zip name, and provide more or less generic handling from that point.
  2. I wouldn't try to support both versioning methods at once. E.g. trying to handle a scenario where versions 10-6 are still stored the old way and versions 5-1 are stored the new way will introduce quite significant complexity and will make the implementation more blurry. Instead, as proposed above, I would just introduce an alternative way that will simply take over when the conditions are met.

Let me know what you think @remi00

Additionally, I think it's best if we first focus on the implementation spec, without writing any solution before we have full agreement on it, so we do not burn too much time on directions from which we will eventually need to revert.

Concerning plugins, I suggest taking them into account, but if there's an issue with any, ideally it is solved on the plugin side, rather than us having to give up the most optimal path.

@remi00 (Contributor) commented Aug 9, 2021

@medikoo thanks for this clarification, it's right on time :) For now I see nothing controversial about the technical proposals provided.

I like the idea of getting started right away; the technical proposals are precise enough. I have already started making updates, including unit test updates. Hopefully I will be able to provide a number of (smaller) PRs instead of doing it big-bang.

@medikoo (Contributor, Author) commented Aug 10, 2021

I've just realized that, as the changes already affect the packaging step (we need to generate hash-based paths into the CF template), we cannot really add intelligence that detects whether it's a first-time deployment and reads global state from the S3 bucket (the packaging step is by design an offline operation).

Hence I've updated the description so it doesn't cover that. It's a bit simpler now, but with the downside of leaving the user either with a deprecation notice or a requirement to add an extra property to their service config until the next major arrives. I'm not sure if we can do anything better, but any suggestions are highly welcome.

@pgrzesik (Contributor) commented:
Thanks @medikoo - the updated proposal looks great in my opinion; I have only one concern.

I wouldn't try to support both versioning methods at once. E.g. trying to handle a scenario where versions 10-6 are still stored the old way and versions 5-1 are stored the new way will introduce quite significant complexity and will make the implementation more blurry. Instead, as proposed above, I would just introduce an alternative way that will simply take over when the conditions are met.

This is a bit tricky, as switching to a new method will cause the old artifacts to stay in S3 forever, as the new logic would not consider them at all during cleanup. One option is to either support both for some time (which would probably mean indefinitely, because we don't know when someone switches the packaging approach) or introduce detailed step-by-step instructions on how to migrate to the new packaging/uploading approach. The latter has the downside of being harder to do and may cause people not to migrate to the new approach.

@remi00 (Contributor) commented Aug 10, 2021

I like the improvements as well. I actually planned not to use the global state in the first iteration anyway.

I think the idea of a per-deploy serverless-state.json with artifactNames is brilliant. It will decouple the artifact name handling from the rest and will just require us to slightly extend findAndGroupDeployments to keep support for both pruning and rollback. findAndGroupDeployments (and its surroundings) will either rely on these artifactNames or fall back to the current behavior when they are missing.

Based on that, the concern from @pgrzesik about keeping S3 dirty would be resolved - we will easily support both legacy and hash-based artifact naming. And @medikoo - I do mean easily, so no worries about a too blurry or complex implementation.

@medikoo (Contributor, Author) commented Aug 10, 2021

This is a bit tricky, as switching to a new method will cause the old artifacts to stay in S3 forever as the new logic would not consider them at all during cleanup.

@pgrzesik for the sake of simplicity, I think the approach should be that when switching to the new approach we delete all old data rather than let it stay forever (I just added that to the spec), so that it technically resets the history.

The reasoning behind it is that trying to support both methods, which theoretically may be switched numerous times (imagine versions 1 and 2 on the old method, 3, 4, 5 on the new, then 6, 7 on the old and 8 on the new, etc.), will be difficult and will significantly raise the cost of implementation (and a complex implementation is a perfect breeding ground for new bugs).

An additional reason is that I don't think it's a really crucial feature. I've just checked, and our data shows that sls rollback accounts for 0.01% of all command usage.

This usage data actually raises the question of whether we shouldn't simply drop support for rollback and make that a breaking change instead. With that, changing the storage method wouldn't have to be breaking.

@pgrzesik (Contributor) commented:
for the sake of simplicity, I think the approach should be that when switching to the new approach we delete all old data rather than let it stay forever (I just added that to the spec), so that it technically resets the history.

That's definitely a valid approach, but doesn't it introduce overhead for each deploy with this new packaging method? As we cannot be sure that there are no old artifacts left at any point in time, we will have to check on each deploy.

Additional reason is that I don't think it's a really crucial feature. I've just checked and our data shows that sls rollback usage is 0.01% of all commands usage.

I would not expect heavy usage of this feature, but I think it's quite an important one: you very rarely need it, but when you do, it's really important that it works reliably.

@medikoo (Contributor, Author) commented Aug 10, 2021

That's definitely a valid approach, but doesn't it introduce overhead for each deploy with this new packaging method? As we cannot be sure that there are no old artifacts left at any point in time, we will have to check on each deploy.

With each deploy we need to scan the S3 content anyway, so I don't think there's any overhead here (the old method already scans and removes versions older than the 10 most recent).

@medikoo (Contributor, Author) commented Aug 10, 2021

Anyway, maybe my initial judgement that supporting both methods at once will require a complex implementation is not necessarily valid. If it turns out to be reasonably easy, then we could introduce it without a deprecation, and that's always preferable (although we still have plugins which may get broken; that'll have to be confirmed).

@remi00 (Contributor) commented Aug 19, 2021

@pgrzesik so the working implementation is available to review here: grasza-consulting#1
I won't submit it to the main repo until I finish the first PR split, of course.

The highlight of how to handle deploy list, S3 cleanup after deploy, and rollback: look up the compiled CloudFormation templates to determine which artifacts are associated with a given deployment.

@medikoo (Contributor, Author) commented Aug 19, 2021

@remi00 as I investigated, no changes will be needed to rollback or deploy list handling, as they do not resolve or operate on artifacts in any way. rollback simply resolves the location of the CF template and passes it to updateState.

Are you suggesting that we should grab older CF templates from S3 during the cleanup process, and based on that we will know which artifacts should be cleaned up or not?

@pgrzesik yes, exactly

In such case, what is the purpose of storing hashes in serverless-state.json?

It's purely for local processing (serverless-state.json is not currently uploaded to the S3 bucket), so we can reliably support deployment of pre-prepared packages. As I outlined in the description, there can be a scenario where the user runs sls package with an older version and does sls deploy --package with a new version; to ensure we upload files to the right location, we need to know what strategy was used for S3 path generation, and we can deduce that from serverless-state.json.

I was thinking that we want to upload serverless-state.json for subsequent deployments, and during cleanup fetch them for older ones and remove artifacts that are not in e.g. the last 5 deploys.

There was a similar idea in a previous spec version, but we don't have to do that; we have all the information we need in the CF templates, which are already uploaded.
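
A rough sketch of that CF-template-based cleanup idea (AWS SDK v2; it only inspects Code.S3Key, so layer artifacts referenced via Content.S3Key would need analogous handling, and listing pagination is omitted):

```js
'use strict';

const AWS = require('aws-sdk');

const s3 = new AWS.S3();

// Keep only code artifacts still referenced by the CF templates of retained deployments.
async function cleanupUnreferencedArtifacts(bucket, codeArtifactsPrefix, keptTemplateKeys) {
  const referenced = new Set();
  for (const templateKey of keptTemplateKeys) {
    const { Body } = await s3.getObject({ Bucket: bucket, Key: templateKey }).promise();
    const { Resources = {} } = JSON.parse(Body.toString());
    for (const resource of Object.values(Resources)) {
      const s3Key = resource.Properties && resource.Properties.Code && resource.Properties.Code.S3Key;
      if (s3Key) referenced.add(s3Key);
    }
  }

  // Single listing pass for the sketch; a real implementation would paginate.
  const { Contents = [] } = await s3
    .listObjectsV2({ Bucket: bucket, Prefix: codeArtifactsPrefix })
    .promise();
  const toDelete = Contents.filter(({ Key }) => !referenced.has(Key)).map(({ Key }) => ({ Key }));
  if (toDelete.length) {
    await s3.deleteObjects({ Bucket: bucket, Delete: { Objects: toDelete } }).promise();
  }
}
```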

@remi00 (Contributor) commented Aug 19, 2021

Okay, in that case I assume that:

  • the current PR (Optimize upload of deployment artifacts #9828) should preferably be closed and replaced with a number of new ones (starting with a minor, purely refactoring PR and then the implementation);
  • the approach within the implementation should be slightly adjusted: instead of determining artifact paths on both the package and deploy commands using artifactsVersioningMode, we should rely on this flag just during package. And deploy should either use the artifact mapping from serverless-state.json or use the legacy scheme in case the mapping is missing.

@pgrzesik @medikoo please confirm or clarify if necessary :) Thanks

@medikoo (Contributor, Author) commented Aug 19, 2021

the current PR should preferably be closed and replaced with a number of new ones (starting with a minor, purely refactoring PR and then the implementation);

@remi00 I think you can also first create the cleanup ones, and then update the existing PR so it's up to date with master and reflects the final implementation. There's no need to close it (unless you strongly prefer to close it and start the work over in a new PR context; that's ok).

the approach within the implementation should be slightly adjusted: instead of determining artifact paths on both the package and deploy commands using artifactsVersioningMode, we should rely on this flag just during package. And deploy should either use the artifact mapping from serverless-state.json or use the legacy scheme in case the mapping is missing.

Exactly! Do you see any potential issues? If you have any other proposals let us know

@remi00 (Contributor) commented Aug 19, 2021

Exactly! Do you see any potential issues? If you have any other proposals let us know

It makes perfect sense. I will proceed with making the updates.

@mnapoli (Contributor) commented Aug 19, 2021

Could you clarify what the artifactsVersioningMode flag is? Where is it defined?

Also, following up on performance:

Package step: ~2 minutes degraded to ~3 minutes, which was expected, as individual packages now have to be created.

That's the part I would like to clarify. Is performance worse (in some scenarios) because of the PR?

@medikoo (Contributor, Author) commented Aug 19, 2021

Could you clarify what the artifactsVersioningMode flag is? Where is it defined?

@mnapoli this was part of a proposal that's no longer on the table.

That's the part I would like to clarify. Is performance worse (in some scenarios) because of the PR?

No, there shouldn't be such a case. I believe you're referring to a version that was explored by @remi00 at some point, which is no longer being considered.

In the top-level description there's an updated full implementation proposal, and currently we stick just to that.

@remi00 (Contributor) commented Aug 19, 2021

@medikoo - shouldn't artifactsVersioningMode be added? Probably as a temporary opt-in flag in the 2.x major version line, under the deploymentBucket properties, as proposed previously (see my previous comment made 1 hr ago). In future major releases the new approach could become the default. I'd mildly recommend that.

On performance: those stats were based on the legacy approach from the initial PoC, which involved significant changes to the hash calculation to make it more deterministic. We now know this doesn't have to be done (thanks to ongoing changes in the serverless-webpack plugin, which my organization uses to build the ZIP files). The final approach does not involve these changes, so there is no performance degradation in the package step.

@medikoo (Contributor, Author) commented Aug 19, 2021

@medikoo - shouldn't artifactsVersioningMode be added?

@remi00 In my understanding this improvement can be added transparently, without the need for a flag. Do you see any potential issues with doing that?

Note that technically we're fixing a design issue that affected performance. I don't think users should be bothered about that; we can simply improve performance for them without a notice.

@remi00 (Contributor) commented Aug 19, 2021

@medikoo I think you are right. It's just about thinking through possible regressions/incompatibilities with some custom configurations or plugins, though I can't assess that as reliably as you can. So, after reconsidering it, if you think it's best to just introduce it as the default behavior, that's good for me as well.

@medikoo (Contributor, Author) commented Aug 19, 2021

@medikoo I think you are right. It's just about thinking through possible regressions/incompatibilities with some custom configurations or plugins.

I'm currently in the process of deeply investigating our entire plugin ecosystem, and there are many plugins that override our packaging process by injecting different artifact generation logic. Still, they work in an area we won't touch: their end result is assigned to package.artifact, and as far as artifact upload is concerned, that's where we pick up.

I've double-checked whether there could be plugins that attempt to tweak our upload process, and just searching by the keywords upload or artifact nothing jumps out; it's also hard to find any real use case for fiddling with that (aside from providing the fix we're discussing here).

@remi00 (Contributor) commented Aug 19, 2021

Thanks a lot @medikoo! 👍 I'll proceed without any opt-in feature flag in the configuration.

@remi00 (Contributor) commented Aug 23, 2021

@medikoo One additional question about the changes regarding the structure within <deployment-prefix>/code-artifacts. Serverless allows providing the artifact property directly. That artifact property value can be arbitrary and not correspond to the resource name. The current behavior is that the filename of that artifact is used when uploading to S3. For the new scheme, for now the PR just takes the file basename (stripping the .zip extension) as the name of the parent directory of artifact files for a given resource. Shouldn't we change that and use the resource name (Lambda function name, Lambda layer name, or custom-resources) for the parent directory instead of the raw artifact filename?

I'm asking because I can imagine some edge cases (in the worst case causing either the new mechanism not to work or leading to a slightly messier S3 bucket), but on the other hand it may slightly increase the amount of changes. WDYT?

@medikoo (Contributor, Author) commented Aug 23, 2021

For the new scheme, for now the PR just takes the file basename (stripping the .zip extension) as the name of the parent directory of artifact files for a given resource. Shouldn't we change that and use the resource name (Lambda function name, Lambda layer name, or custom-resources) for the parent directory instead of the raw artifact filename?

I understand you're referring to package.artifact. Yes, users may override it with their own values, and packaging plugins do that as well. Still, in the context of this functionality I think it's irrelevant how that value originates. When we generate the CF template or upload artifacts, package.artifact just points to the location of the artifact on the user's machine.

Now, when uploading, we currently take the basename of the artifact and upload it to the <deploymentPrefix>/<service>/<stage>/<deploymentTimestamp>/<artifactBasename>.zip location.

With this optimization we will change it so it's uploaded to <deployment-prefix>/code-artifacts/<hash>.zip. Still, I actually think it should be <deploymentPrefix>/<service>/<stage>/code-artifacts/<hash>.zip. I didn't propose that originally, as by default the bucket is specific to the service and stage (so that extra nesting seemed not to be needed). However, a user may customize the bucket location via the provider.deploymentBucket.name property, and originally the additional <service>/<stage> was added to ensure there are no collisions if users decide to rely on the same bucket for different services and stages (I've just updated the spec to reflect that).

@medikoo (Contributor, Author) commented Aug 25, 2021

I've just realized one thing when reviewing the PR: the scenario where package artifacts are changed between the package and deploy steps is already broken with Lambda versioning turned on. So technically we can consider such scenarios invalid.

In light of that, there should be no issue with recalculating file hashes in the deploy step. All we need is some confirmation in serverless-state.json that the package was created with the new approach, and for that, instead of storing a map of file hashes there, let's maybe simply add an artifactsVersioningMode: 'hash' property.
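
For illustration, the marker could look roughly like this in serverless-state.json (placement and surrounding fields are assumptions, not a finalized schema):

```json
{
  "package": {
    "artifactsVersioningMode": "hash"
  }
}
```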

@remi00 what are your thoughts?

@remi00 (Contributor) commented Aug 26, 2021

I'm okay with adding artifactsVersioningMode: 'hash' to serverless-state.json, but what should it actually influence?
We can jump on a call to discuss the changes if you like, btw.

@medikoo (Contributor, Author) commented Aug 26, 2021

I'm okay with adding artifactsVersioningMode: 'hash' to serverless-state.json, but what should it actually influence?

We need confirmation that the service was packaged with a Serverless Framework version that supports hash-based naming.

Do you have some other idea for it? Or does the use case it's supposed to address seem unclear?

@remi00 (Contributor) commented Aug 26, 2021

I just want to make sure I understand why we need this confirmation. Currently, the CloudFormation template is compiled in the package step, so the destination S3 key paths are already as they should be in any case (legacy or hash-based naming). The upload happens in the deploy step, though. Maybe we shouldn't recalculate the destination paths then, but just rely on the template compiled in the package step and use the S3 paths from there?

@remi00 (Contributor) commented Aug 26, 2021

The above approach would make the deploy step much more agnostic of naming schemes (which would be used and determined just in the package step).

@medikoo (Contributor, Author) commented Aug 26, 2021

Maybe we shouldn't recalculate the destination paths then, but just rely on the template compiled in the package step and use the S3 paths from there?

Indeed, that's a very good idea.

I've also looked more closely at how deploy resolves the CF template to be uploaded to S3: in the first phase of deploy, state is restored from the package path, which ensures that the internal provider.compiledCloudFormationTemplate property reflects exactly the CF template stored in the package path, so it can be trusted out of the box (we do not need any additional action to read this template from the package path).

So yes, we can definitely do that, and I believe that further reduces the needed changes, which is even better. I've updated the spec.

@remi00 (Contributor) commented Sep 1, 2021

@medikoo As discussed within #9876, there is a challenge with resolving artifact S3 location paths from the CF template, mostly with finding out the proper name for the custom resource artifact. The main driver for doing it this way was that it would be straightforward for regular Lambda functions and layers (for each such item we would just look up the resource in provider.compiledCloudFormationTemplate, using naming.getLambdaLogicalId and naming.getLambdaLayerLogicalId, and read Properties.Code.S3Key).

In the case of the custom resource Lambda we do not have metadata like getAllFunctions() / getFunction() and getAllLayers() / getLayer(). So, to use provider.compiledCloudFormationTemplate, some probing-like mechanism would have to be implemented: we'd have to look up the different possible resource names from naming.getCustomResourceXxxxxxxxxxxFunctionLogicalId(), and if any such resource is found, take its Properties.Code.S3Key as the destination S3 path.

This destination S3 path lookup algorithm would be reconstructive, which is a smell (making it dependent on other internals to reconstruct the mapping between local paths and destination paths) - any future change to the internals of how this compiled template is built may require reflecting changes in this algorithm as well.
For this reason, the proposed solution was to store an artifactsMap within serverless-state.json. But I am fully open to other suggestions, or even to implementing the algorithm described above. Please advise.
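
For regular functions, the lookup path described above could look roughly like this (illustrative only; it is exactly this dependence on template-naming internals that the artifactsMap alternative avoids):

```js
'use strict';

// Illustrative sketch: resolve a function's destination S3 key from the compiled
// CloudFormation template via the naming helper mentioned above.
function resolveFunctionS3Key(serverless, functionName) {
  const provider = serverless.getProvider('aws');
  const logicalId = provider.naming.getLambdaLogicalId(functionName);
  const resource =
    serverless.service.provider.compiledCloudFormationTemplate.Resources[logicalId];
  return resource && resource.Properties.Code.S3Key;
}
```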

@medikoo (Contributor, Author) commented Sep 1, 2021

@remi00 I fully agree that this reconstruction logic will be leaky and prone to issues, and indeed it seems more optimal to go with artifactsMap.

Ideally, the function that resolves the hash for an artifact would automatically fill artifactsMap with the given item behind the scenes (so that handling is transparent to the functionalities that retrieve S3 keys).

I've updated the spec; I also proposed where we could store the retrieved hash map for convenient access when reading the serverless-state.json file.

@ronkorving (Contributor) commented:
Jumping in from #10353, with the express intent not to hijack the conversation... Since I didn't see this feature mentioned (and it might be quite useful here, if I'm not mistaken about what you're trying to achieve): have you considered using S3 object Metadata (a key/value dictionary) when you PUT the package object to S3? If you want to store your own MD5 hash, for example, you can pass it along when you upload the object. When you do a head-object call, you will receive that Metadata property back.

If this is irrelevant, please ignore me, but I figured I would regret it if this turned out helpful and I didn't mention it.
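
A minimal sketch of that suggestion (AWS SDK v2; the metadata key name is arbitrary):

```js
'use strict';

const AWS = require('aws-sdk');
const fs = require('fs');

const s3 = new AWS.S3();

// Attach a custom hash as user-defined object metadata when uploading...
async function uploadWithHashMetadata(bucket, key, filePath, md5Hex) {
  await s3
    .upload({
      Bucket: bucket,
      Key: key,
      Body: fs.createReadStream(filePath),
      Metadata: { 'artifact-md5': md5Hex }, // arbitrary, user-defined metadata key
    })
    .promise();
}

// ...and read it back later with a cheap head-object call instead of re-downloading.
async function readHashMetadata(bucket, key) {
  const { Metadata } = await s3.headObject({ Bucket: bucket, Key: key }).promise();
  return Metadata['artifact-md5'];
}
```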
