-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core/cloudwatch logging #8309
Core/cloudwatch logging #8309
Conversation
Codecov Report
@@ Coverage Diff @@
## main #8309 +/- ##
==========================================
- Coverage 78.09% 77.85% -0.25%
==========================================
Files 236 239 +3
Lines 16599 16879 +280
Branches 3559 3594 +35
==========================================
+ Hits 12963 13141 +178
- Misses 3526 3627 +101
- Partials 110 111 +1
Continue to review full report at Codecov.
|
A few initial questions, presumably just for my own edification:
|
@svidgen Thanks for the questions.
|
be668cf
to
7da18bf
Compare
Added in a queue for logging, with support for batched log pushes. Also added safety checks on the size of the log + log batches, a retry in case a log fails due to the improper sequence number being attached to the request, and an interval to monitor the log queue and submit log pushes whenever there are logs present in the queue. |
Would this have the functionality to send EMF objects to CloudWatch? On failed messages will this retry, back off, and eventually stop attempting to send a message? |
I'm actually not familiar with EMF. Let me take a look at that and see if it would just plug in or if additional functionality would be required.
It depends on why the log push failed. If the log push failed because of an incorrect sequence number, then it will back off, attempt to get the correct sequence number, and then perform a retry. |
Looked into it, and it looks like EMF requires a bit of a different format than just sending up logs normally. This is functionality that should be relatively simple to implement, but would probably be best added in a subsequent PR so as to stop the size creep on this one. I am keeping a list of things to add to our logger offering once this gets merged in, and will make additional issues for them so we can track their progress. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I left some comments on the code and I have a couple more general questions (before I pull it and try it out on my dev desktop later 😄) :
- The usage of this would be something like
Logger.addPluggable(new AWSCloudWatchProvider())
? - With that, every
logger.x
call will add to the internal buffer ofAWSCloudWatchProvider
and be shipped to CloudWatch every 2 seconds? - Is there a way for customer to specify which level of logs they want to be shipped to Cloudwatch ? For example a user might just want to ship
logger.warn
orlogger.error
but not anything else.
} catch (err) { | ||
logger.error(`error during _safeUploadLogEvents: ${err}`); | ||
|
||
if (RETRY_ERROR_CODES.includes(err.name)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why this should be explicitly retried? It might also be confusing in a scenario where the initial attempt failed, but the retry succeeded, then the user will see error during _safeUploadLogEvents
but then the log actually got shipped to Cloudwatch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the explicit retry here is to handle an edge case race condition where we might attempt to push a log with an improper sequence number. The update I made to utilize a queue should ideally take care of this problem on its own...but I feel it is safest for us to handle an explicit retry on this specific case anyway. I figure it's a low risk-high reward to include it.
Thanks for the comments! I will run through them and make the changes highlighted.
Yep, you got it. I have a sample code snippet in the PR overview that shows this usage.
Every call of that instance of the logger, yes. We could reduce the time to lower than two seconds, but I chose that as a starting point in order to safely avoid hitting throttle limits on the CloudWatch APIs. But, really, even dropping it down to every 30ms or so would be fine. And every log gets published with a timestamp of the event, too.
I didn't include that, but that would be simple enough to add. The current behavior of our logger is that the default log level is |
I can't envision a scenario where our teams would use an ability like that. If I'm using |
I'll leave this as something to investigate in my tracker ticket for future ideas then. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think any of my comments were blockers — Mostly just maintainability and readability paranoia.
if (this._timer) { | ||
clearInterval(this._timer); | ||
} | ||
|
||
this._timer = setInterval(async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious: Why recreate the interval here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually just took this timer creation from another one of our Providers.
const cmd = new GetLogEventsCommand(params); | ||
const client = this._initCloudWatchLogs(); | ||
|
||
try { | ||
const credentialsOK = await this._ensureCredentials(); | ||
if (!credentialsOK) { | ||
logger.error(NO_CREDS_ERROR_STRING); | ||
throw Error; | ||
} | ||
|
||
const output = await client.send(cmd); | ||
return output; | ||
} catch (error) { | ||
logger.error(`error getting log events - ${error}`); | ||
throw error; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This basic patterns shows up quite a few times. It might be worthwhile to DRY this out, as that could really increase the SNR on some of these methods.
Thinking "aloud" here, maybe as an alternative, we add a _sendAuthenticatedCommand()
method (or something). There's even enough parity between the log messages at the header and catch
that a common function could handle that with some action labeling:
public async getLogEvents(
params: GetLogEventsCommandInput
): Promise<GetLogEventsCommandInput> {
return await this._sendAuthenticatedCommand({
command: new GetLogEventsCommand(),
log: "getting log events"
});
}
I feel like this could be DRY'd out even further. But, what do think?
this._dataTracker.logEvents = [...this._dataTracker.logEvents, ...logs]; | ||
} | ||
|
||
private async _validateLogGroupExists( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker of course, but this name doesn't read to me like a method that will do things.
How about something like simplifying the name and adding an arg that communicate intent, like:
_getLogGroup(..., createIfMissing?: boolean = true) {
...
}
Or something like _upsertLogGroup(...)
?
} | ||
} | ||
|
||
private async _validateLogStreamExists( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto some of the above comments. The naming doesn't communicate intent to create (not a blocker for me). I also suspect there are opportunities to DRY things out between this and the other _validate*
method(s).
}, 2000); | ||
} | ||
|
||
private _getDocUploadPermissibility(): boolean { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me longer to grok this than I care to admit. Could this be accurately called _canUpload
?
} | ||
} | ||
|
||
private _getBufferedBatchOfLogs(): InputLogEvent[] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this is well-covered indirectly. But, it may be worthwhile to add some explicit test-cases for this, unless I overlooked them.
@eddiekeller I see this piece of work has been reverted in #8412 . Is there any plan to incorporate this work again? |
Hey @eddiekeller 👋 Thanks for re-merging this! Excited to use it. In my testing though I can't seem to actually send any logs to cloudwatch. I am getting denied by a missing REQ:
Note: I did not add this sequenceToken to the request, reading the source it looks like you try to add it based on the previous requests but in this case the previous requests to configure the logGroups and logStreams do not contain one. RES:
Through reading up on this and your above comments it seems like a previous request should contain this sequence token, but none do. The initial POSTs for Cognito, I am configuring the logger like you specified above:
and even tried throwing a dummy sequence token in but it does not get passed through. Wondering if I am doing something wrong here, or you can give me some advice. Thanks!! |
@alexanderMontague I ran into the same issue testing out this functionality with similar config. In my case the root cause seemed to be the fact that |
Hey all - this is a great plugin thank you. Question - does this work in all regions? |
Re: my last comment - I have this up and working in us-east-1 and it works fine. Now trying to implement in ap-southeast-2 with a very similar configuration I'm getting a "AWSCloudWatch - error getting log group - Error: Region is missing" error. I've tried everything I can think of but nothing seems to work. Thoughts anyone? |
@johnsonfamily1234 , have you tried adding |
Is there any work planned to move this provider into documentation? I browsed both the Amplify TypeDocs and the Amplify Docs but do not see anything mentioned about this provider. It would be helpful to know what configuration there is without having to dig through code. |
@alexanderMontague The issue you're having with |
Can I get an explanation how to add access: Im getting an error: |
I get error |
Full Error |
This should be documented here: https://docs.amplify.aws/lib/utilities/logger/q/platform/js/ |
This pull request has been automatically locked since there hasn't been any recent activity after it was closed. Please open a new issue for related bugs. Looking for a help forum? We recommend joining the Amplify Community Discord server |
Description of changes
Issue - #3524
This PR adds an
AWSCloudWatchProvider
and expands on the functionality of theConsoleLogger
class to integrate with said provider. Users can use the new functionality to push logs to CloudWatch in their AWS account. If a user adds theAWSCloudWatchProvider
as a pluggable to theConsoleLogger
, then the logger will automatically send their logs to CloudWatch as well as print them to the console.When configuring amplify via
Amplify.configure()
, the user will just need to add in aLogging
property that contains alogGroupName
andlogStreamName
that they wish to write to. If the log group or log stream does not exist, then we will create it for them.Additionally, users will need to ensure that their accounts have the necessary permissions to interact with CloudWatch. Namely -
DescribeLogGroups
DescribeLogStreams
CreateLogGroup
CreateLogStream
PutLogEvents
Example usage -
Description of how you validated changes
Tested via unit testing and local testing with personal AWS account and sample app
Checklist
yarn test
passesBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.