-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AWS x-ray support for API Gateway #5692
Add AWS x-ray support for API Gateway #5692
Conversation
while unit tests are passing an integration test im trying is not serverless.yml service: gw-tracing
provider:
name: aws
apiGateway:
stageDescription:
tracingEnabled: true
functions:
hello:
runtime: python3.6
handler: hello.handler
events:
- http: any / hello.py def handler(event, ctx):
return {
"statusCode": 200
} package.json {
"dependencies": {
"serverless": "softprops/serverless#deployment-stage-description"
}
} $ npx serverless deploy
...
Serverless Error ---------------------------------------
An error occurred: ApiGatewayDeployment1547359246920 - StageDescription cannot be specified when stage referenced by StageName already exists. I saw a discussion about a similar issue here I'll let you know what I find |
if (!_.isBoolean(tracingEnabled)) { | ||
throw new Error('REST API stage description tracingEnabled must be a boolean'); | ||
} | ||
return { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm letting tracingEnabled
be the only supported (and documented) field associated with stage descriptions for now, but having this method in place should make it straightforward and easy to add additional supported fields in the future. Something I've not yet figured out is is there is a better place to communicate structural validation prior to serverless deploy
invocations.
It seems the cloudformation error only happens after the first deploy. I've seen this error references in at least one other serverless plugin |
I'm starting to get the impression stagedescription may indeed not be supported for updates to a stage despite working on the first deploy. I'm thinking another option may be for serverless to somehow manage a stage resource independently https://docs.amazonaws.cn/en_us/AWSCloudFormation/latest/UserGuide/aws-resource-apigateway-stage.html |
@softprops thanks a lot for this PR, and especially for the super helpful comments 🙌 . I'm currently reviewing it. Is it good to go from your side? or are there any other changes or issues you need to address? Just wanna know the status of this PR. |
@eahefnawy thanks for posting back. I think this is going to have to be halted until I figure out the last issue I commented. Help on that would be welcomed. While the code checks according to cloudformation docs out the integration test I posted does not. I need to step a few levels deeper to how cloud formation is actually supposed to behave in this case. The first deploy works ( with active tracing enabled ) but any deploy afterwards it fails with the error
I've read through and tried to grok the following similar cases mapbox/deprecated-hookshot#9 (comment) https://github.com/jacob-meacham/serverless-plugin-bind-deployment-id#known-issues https://forums.aws.amazon.com/thread.jspa?threadID=230706 This kind of makes me think that in order to get this to work I'd need to change some fundamentals for how serverless manages api gatway deployment stages, potentially as a first class cloud formation resource. I'd appreciate any help I can get knowledge wise on the history of the current setup for apigw deployments. Maybe that was tried before but has since switched to its current state. The example I provided above should be a good minimal reproducible case for a local test if you have time |
@softprops hmmm yeah I see what you're saying. We are currently adding the stage implicitly via the Deployment resource. I'm hesitant to add a new CF resource for each stage as it'd increase the CF resource count and get us even closer to the 200 resources limit, which is already an issue. Don't wanna solve a problem by introducing another problem. Fixing this specific problem for a subset of our users who want active tracing might introduce a new resource limit problem for all framework users. On the other hand, it's just a single resource for each stage, which doesn't seem like a lot. hmmm. Would it be possible to only add this stage resource only if active tracing is enabled in sls.yml? If not, then we'd revert to the implicit stage definition in the Deployment resource? I'm gonna raise this up internally with the team, and let you know 😊 |
I can try that route. Another way to look at this to is that it would solve multiple problems when we figure this out. enabling tracing is one problem solved by adding the stage description, here's a list of all the other things that this would enable https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-apigateway-stage.html |
This will be really awesome to have |
Having an explicit StageDescription would help a lot, but probably require people to tear down their existing stacks and redeploy to make use it. |
@softprops we've discussed it internally, and we'd like to merge it in only if you could add the new Stage resource for people activating active tracing in the yml file, otherwise it'd work the same way for the rest of the users. As @jplock pointed out, this could be a breaking change for some users. Also, if someone is already at the 200 resource limit, they wouldn't be able to make another deployment with this core change. |
@eahefnawy I still haven't found a way to work around updates. This only seems to work the first time you deploy. And time you deploy after you'll get the You can reproduce the problem with the minimal example posted above |
I was working on adding possibility to configure Stage with some
I think the best thing to do would be indeed as point out @eahefnawy to have an explicit I'll be happy to give a hand if needed has this is a current issue we are trying to solve @CoorpAcademy |
@AdrieanKhisbe I made that suggestion then @eahefnawy added that we may need to conditionally add this if activating tracing to avoid making this a breaking change. With your use case we'd also have to add another conditional if users opt into configuring the I'll see if I can get the conditional based on xray tracing soon but id want to maybe see how well that works before adding additional conditional rules for adding a stage resource after. |
@softprops I see. I was just wondering if the explicit What do you think about this @softprops and @eahefnawy ? |
I think the hesitation to make the Stage resource default is based on cloudformation limits of 200 resources and some users complain that serverlesses implicit stack should have fewer to opt out of to make more room for application resources. |
…ment-stage-description
I did a little research on differences and what can be described Deployment.StageDescription vs Stage resources Stage Description
Stage
Interestingly it seemed like there was more properties in StageDescription than Stage itself It seems all of these fields are also properties on elements of This then begs the question if it would be misleading to be exposing stage settings on Let me know if this is thinking too far ahead, i.e. supporting stage properties other than tracingEnabled. |
Thanks for commenting @orwell1984 👍
Yes, that sounds like a good plan and seems to be the last resort for the time being to get this merged... |
I looked into the CloudFormation problems with the current implementation again and it seems like there's no way to resolve them in a non-interruptive fashion. Given that I've updated the docs and added a comment about the current limitations and the workaround of removing and re-deploying the API Gateway. Using API Gateway X-Ray Tracing is opt-in, so the user should know what he can expect. Since it's opt-in we should move forward and merge this into @dschep @eahefnawy can you take a final look into this and merge if appropriate? Thanks! |
I've added some logic which inspects the CloudFormation error message and adds some information on how to solve the problem. This way the user directly sees in the CLI that a remove and re-deployment is necessary. |
Here's another update after testing several edge cases with current implementation today: During testing I experienced an odd behavior. If one enables My hunch is that it's because we're going back from a dedicated I'm putting this PR on-hold again since it's not clear if this is a breaking change or if there's anything we can do to work around this issue. |
Knock, knock, it's me again... Just looked into this again. Apparently there's no way to fix this (as we already know). Because of this I implemented a check which will diif the old CloudFormation template with the new one and prints out helpful error messages showing that upgrades require a remove and re-deploy and downgrades might result in unexpected behavior (the user can use This logic will automatically apply to all future PRs which will also switch to use the new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just tested this again and it works fine. LGTM
And even after deploying once I need to always user the --force option??? or i get this error NOTE: Disabling API Gateway X-Ray Tracing for existing deployments might result in unexpected behavior. Please refer to our documentation for more information. |
This feature has introduced production blocking bug as well, we can no longer deploy using serverless 1.41.0 as the
I have provided the relevant portion of my serverless.yml with the cloudformation resources below, we define this to enable detailed access logging in addition to x-ray tracing and set the deployment id using plugin
|
@jonathannaguin |
@jonathanalberghini while I agree that this is annoying I have to say that we only had the best intentions when implementing this. The main reason was to provide some safeguards when people want to upgrade and use AWS X-Ray. Saying that something is "a stupid way to implement" is very harsh and insulting. That's not how you should talk to people on the internet. @exoego thanks for jumping in 👍 @gootdude thanks for providing in-depth information. I just prepared a PR where I remove the check for breaking changes. Will be out soon and after that I'll publish a patch release. |
Quick update that we've just published https://github.com/serverless/serverless/releases/tag/v1.41.1 which should fix the problem. |
What did you implement:
Closes #5564
This changes adds support for declaratively enabling active tracing for apigateway
How did you implement it:
If tracing is enabled we'll create a separate
Stage
resource. This feature is opt-in through explicit tracing configuration. When using aStage
resource we'll remove theStageName
from theDeployment
resource.We can extend the
Stage
resource configurability later on via a dedicatedserverless.yml
section inprovider
.How can we verify it:
To enable this you should only have to add
provider:apiGateway.stageDescription.enableTracing
Todos:
Is this ready for review?: YES
Is it a breaking change?: NO