Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core/cloudwatch logging #8309

Merged
merged 16 commits into from
Jun 1, 2021

Conversation

eddiekeller
Copy link
Contributor

@eddiekeller eddiekeller commented May 19, 2021

Description of changes

Issue - #3524

This PR adds an AWSCloudWatchProvider and expands on the functionality of the ConsoleLogger class to integrate with said provider. Users can use the new functionality to push logs to CloudWatch in their AWS account. If a user adds the AWSCloudWatchProvider as a pluggable to the ConsoleLogger, then the logger will automatically send their logs to CloudWatch as well as print them to the console.

When configuring amplify via Amplify.configure(), the user will just need to add in a Logging property that contains a logGroupName and logStreamName that they wish to write to. If the log group or log stream does not exist, then we will create it for them.

Additionally, users will need to ensure that their accounts have the necessary permissions to interact with CloudWatch. Namely -

DescribeLogGroups
DescribeLogStreams
CreateLogGroup
CreateLogStream
PutLogEvents

Example usage -

Amplify.configure({
  Logging: {
    logGroupName: 'plugin-1',
    logStreamName: '1-plugin',
  },
  ...awsExports
});

const logger = new Logger('logBuddy');
Amplify.register(logger);
logger.addPluggable(new AWSCloudWatchProvider());

function App() {
  return (
    <div className="App">
        <p onClick={() => { logger.warn('clicking,, huh?', Date.now()) }}>
          Edit <code>src/App.js</code> and save to reload.
        </p>
    </div>
  );
}

Description of how you validated changes

Tested via unit testing and local testing with personal AWS account and sample app

Checklist

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@eddiekeller eddiekeller requested a review from Ashish-Nanda as a code owner May 19, 2021 19:38
@eddiekeller eddiekeller removed the request for review from Ashish-Nanda May 19, 2021 21:47
@codecov-commenter
Copy link

codecov-commenter commented May 19, 2021

Codecov Report

Merging #8309 (597e48b) into main (9cc5218) will decrease coverage by 0.24%.
The diff coverage is 62.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #8309      +/-   ##
==========================================
- Coverage   78.09%   77.85%   -0.25%     
==========================================
  Files         236      239       +3     
  Lines       16599    16879     +280     
  Branches     3559     3594      +35     
==========================================
+ Hits        12963    13141     +178     
- Misses       3526     3627     +101     
- Partials      110      111       +1     
Impacted Files Coverage Δ
...ckages/core/src/Providers/AWSCloudWatchProvider.ts 58.36% <58.36%> (ø)
packages/core/src/Parser.ts 91.66% <66.66%> (-3.58%) ⬇️
packages/core/src/Logger/ConsoleLogger.ts 82.89% <72.97%> (-1.11%) ⬇️
packages/aws-amplify/src/index.ts 100.00% <100.00%> (ø)
packages/core/src/Providers/index.ts 100.00% <100.00%> (ø)
packages/core/src/Util/Constants.ts 100.00% <100.00%> (ø)
packages/core/src/Util/index.ts 100.00% <100.00%> (ø)
packages/core/src/index.ts 100.00% <100.00%> (ø)
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9cc5218...597e48b. Read the comment docs.

@svidgen
Copy link
Member

svidgen commented May 21, 2021

A few initial questions, presumably just for my own edification:

  • Do we need to handle retries for throttling exceptions? Or does the AWS SDK already handle that?
  • Is the throttling on CreateLogStream (50 TPS) a per-client limit? A per-log-group limit?
  • Is timestamp enough to ensure logs are recorded in order? Or, do we need some sort of locking or debouncing to ensure logs are always sent in the expected order?
console.log("oh no! something is broken!");
console.log("... maybe.");
console.log("OH! and remember to pick up dinner on the way home!")

@eddiekeller
Copy link
Contributor Author

A few initial questions, presumably just for my own edification:

  • Do we need to handle retries for throttling exceptions? Or does the AWS SDK already handle that?
  • Is the throttling on CreateLogStream (50 TPS) a per-client limit? A per-log-group limit?
  • Is timestamp enough to ensure logs are recorded in order? Or, do we need some sort of locking or debouncing to ensure logs are always sent in the expected order?
console.log("oh no! something is broken!");
console.log("... maybe.");
console.log("OH! and remember to pick up dinner on the way home!")

@svidgen Thanks for the questions.

  1. The AWS SDK should be handling retries and throttling already. But I can add in some functionality on our end too to try and avoid throttling situations. Shouldn't be too much extra work.
  2. It is per client - not per log group.
  3. Good question. The order that the logs get sent in doesn't have to do with the timestamp. Each request gets sent with a unique sequence token for the specific log stream. The next sequence token gets generated by CloudWatch when there is a successful log push. Timestamps are moreso used in the CloudWatch client where users can sort by the timestamp of the event. However, I suppose this doesn't necessarily factor in race conditions of asynchronous requests being initially sent in order. The solution for this is probably related to the first question you mentioned. I feel like a simple queue with a bit of management on our end would probably address both points.

@eddiekeller eddiekeller force-pushed the core/cloudwatch-logging branch from be668cf to 7da18bf Compare May 26, 2021 20:28
@eddiekeller
Copy link
Contributor Author

Added in a queue for logging, with support for batched log pushes. Also added safety checks on the size of the log + log batches, a retry in case a log fails due to the improper sequence number being attached to the request, and an interval to monitor the log queue and submit log pushes whenever there are logs present in the queue.

@iCodeForBananas
Copy link

iCodeForBananas commented May 27, 2021

Would this have the functionality to send EMF objects to CloudWatch?
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format_Specification.html
Or at least give us the ability to create a function that can create the object to feed the logger?

On failed messages will this retry, back off, and eventually stop attempting to send a message?

@eddiekeller
Copy link
Contributor Author

eddiekeller commented May 27, 2021

Would this have the functionality to send EMF objects to CloudWatch?

I'm actually not familiar with EMF. Let me take a look at that and see if it would just plug in or if additional functionality would be required.

On failed messages will this retry, back off, and eventually stop attempting to send a message?

It depends on why the log push failed. If the log push failed because of an incorrect sequence number, then it will back off, attempt to get the correct sequence number, and then perform a retry.

@eddiekeller
Copy link
Contributor Author

Would this have the functionality to send EMF objects to CloudWatch?

I'm actually not familiar with EMF. Let me take a look at that and see if it would just plug in or if additional functionality would be required.

Looked into it, and it looks like EMF requires a bit of a different format than just sending up logs normally. This is functionality that should be relatively simple to implement, but would probably be best added in a subsequent PR so as to stop the size creep on this one. I am keeping a list of things to add to our logger offering once this gets merged in, and will make additional issues for them so we can track their progress.

Copy link
Contributor

@jamesaucode jamesaucode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I left some comments on the code and I have a couple more general questions (before I pull it and try it out on my dev desktop later 😄) :

  1. The usage of this would be something like Logger.addPluggable(new AWSCloudWatchProvider())?
  2. With that, every logger.x call will add to the internal buffer of AWSCloudWatchProvider and be shipped to CloudWatch every 2 seconds?
  3. Is there a way for customer to specify which level of logs they want to be shipped to Cloudwatch ? For example a user might just want to ship logger.warn or logger.error but not anything else.

packages/core/src/Providers/AWSCloudWatchProvider.ts Outdated Show resolved Hide resolved
packages/core/src/Providers/AWSCloudWatchProvider.ts Outdated Show resolved Hide resolved
packages/core/src/Providers/AWSCloudWatchProvider.ts Outdated Show resolved Hide resolved
} catch (err) {
logger.error(`error during _safeUploadLogEvents: ${err}`);

if (RETRY_ERROR_CODES.includes(err.name)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why this should be explicitly retried? It might also be confusing in a scenario where the initial attempt failed, but the retry succeeded, then the user will see error during _safeUploadLogEvents but then the log actually got shipped to Cloudwatch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the explicit retry here is to handle an edge case race condition where we might attempt to push a log with an improper sequence number. The update I made to utilize a queue should ideally take care of this problem on its own...but I feel it is safest for us to handle an explicit retry on this specific case anyway. I figure it's a low risk-high reward to include it.

packages/core/src/Providers/AWSCloudWatchProvider.ts Outdated Show resolved Hide resolved
packages/core/src/Providers/AWSCloudWatchProvider.ts Outdated Show resolved Hide resolved
@eddiekeller
Copy link
Contributor Author

eddiekeller commented May 28, 2021

Nice work! I left some comments on the code and I have a couple more general questions (before I pull it and try it out on my dev desktop later 😄) :

Thanks for the comments! I will run through them and make the changes highlighted.

  1. The usage of this would be something like Logger.addPluggable(new AWSCloudWatchProvider())?

Yep, you got it. I have a sample code snippet in the PR overview that shows this usage.

  1. With that, every logger.x call will add to the internal buffer of AWSCloudWatchProvider and be shipped to CloudWatch every 2 seconds?

Every call of that instance of the logger, yes. We could reduce the time to lower than two seconds, but I chose that as a starting point in order to safely avoid hitting throttle limits on the CloudWatch APIs. But, really, even dropping it down to every 30ms or so would be fine. And every log gets published with a timestamp of the event, too.

  1. Is there a way for customer to specify which level of logs they want to be shipped to Cloudwatch ? For example a user might just want to ship logger.warn or logger.error but not anything else.

I didn't include that, but that would be simple enough to add. The current behavior of our logger is that the default log level is WARN, and unless the user specifies a lower log level when instantiating the logger, then only WARN and ERROR logs will run through the workflow. But I suppose it is easy enough to add an additional parameter to explicitly set the CloudWatch log level threshold too.

@iCodeForBananas
Copy link

  1. Is there a way for customer to specify which level of logs they want to be shipped to Cloudwatch ? For example a user might just want to ship logger.warn or logger.error but not anything else.

I didn't include that, but that would be simple enough to add. The current behavior of our logger is that the default log level is WARN, and unless the user specifies a lower log level when instantiating the logger, then only WARN and ERROR logs will run through the workflow. But I suppose it is easy enough to add an additional parameter to explicitly set the CloudWatch log level threshold too.

I can't envision a scenario where our teams would use an ability like that. If I'm using logger.* I'm intentionally sending those logs to CloudWatch otherwise I'd use console.*.

@eddiekeller
Copy link
Contributor Author

  1. Is there a way for customer to specify which level of logs they want to be shipped to Cloudwatch ? For example a user might just want to ship logger.warn or logger.error but not anything else.

I didn't include that, but that would be simple enough to add. The current behavior of our logger is that the default log level is WARN, and unless the user specifies a lower log level when instantiating the logger, then only WARN and ERROR logs will run through the workflow. But I suppose it is easy enough to add an additional parameter to explicitly set the CloudWatch log level threshold too.

I can't envision a scenario where our teams would use an ability like that. If I'm using logger.* I'm intentionally sending those logs to CloudWatch otherwise I'd use console.*.

I'll leave this as something to investigate in my tracker ticket for future ideas then.

Copy link
Member

@svidgen svidgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any of my comments were blockers — Mostly just maintainability and readability paranoia.

Comment on lines +510 to +514
if (this._timer) {
clearInterval(this._timer);
}

this._timer = setInterval(async () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious: Why recreate the interval here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually just took this timer creation from another one of our Providers.

Comment on lines 206 to 221
const cmd = new GetLogEventsCommand(params);
const client = this._initCloudWatchLogs();

try {
const credentialsOK = await this._ensureCredentials();
if (!credentialsOK) {
logger.error(NO_CREDS_ERROR_STRING);
throw Error;
}

const output = await client.send(cmd);
return output;
} catch (error) {
logger.error(`error getting log events - ${error}`);
throw error;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This basic patterns shows up quite a few times. It might be worthwhile to DRY this out, as that could really increase the SNR on some of these methods.

Thinking "aloud" here, maybe as an alternative, we add a _sendAuthenticatedCommand() method (or something). There's even enough parity between the log messages at the header and catch that a common function could handle that with some action labeling:

public async getLogEvents(
  params: GetLogEventsCommandInput
): Promise<GetLogEventsCommandInput> {
  return await this._sendAuthenticatedCommand({
    command: new GetLogEventsCommand(),
    log: "getting log events"
  });
}

I feel like this could be DRY'd out even further. But, what do think?

this._dataTracker.logEvents = [...this._dataTracker.logEvents, ...logs];
}

private async _validateLogGroupExists(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker of course, but this name doesn't read to me like a method that will do things.

How about something like simplifying the name and adding an arg that communicate intent, like:

_getLogGroup(..., createIfMissing?: boolean = true) {
  ...
}

Or something like _upsertLogGroup(...)?

}
}

private async _validateLogStreamExists(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto some of the above comments. The naming doesn't communicate intent to create (not a blocker for me). I also suspect there are opportunities to DRY things out between this and the other _validate* method(s).

}, 2000);
}

private _getDocUploadPermissibility(): boolean {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me longer to grok this than I care to admit. Could this be accurately called _canUpload?

}
}

private _getBufferedBatchOfLogs(): InputLogEvent[] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this is well-covered indirectly. But, it may be worthwhile to add some explicit test-cases for this, unless I overlooked them.

@eddiekeller eddiekeller merged commit 6f285c3 into aws-amplify:main Jun 1, 2021
@chemzo
Copy link

chemzo commented Jul 7, 2021

@eddiekeller I see this piece of work has been reverted in #8412 . Is there any plan to incorporate this work again?

@alexanderMontague
Copy link

alexanderMontague commented Jul 29, 2021

Hey @eddiekeller 👋 Thanks for re-merging this! Excited to use it. In my testing though I can't seem to actually send any logs to cloudwatch. I am getting denied by a missing sequenceToken:

REQ:

{
  logEvents: [{message: "[ERROR] 50:23.354 APP_LOGGER - this is an error!", timestamp: 1627530623355}],
  logGroupName: "TEST_WEB_APP",
  logStreamName: "DEVELOPMENT_LOGS",
  sequenceToken: ""
}

Note: I did not add this sequenceToken to the request, reading the source it looks like you try to add it based on the previous requests but in this case the previous requests to configure the logGroups and logStreams do not contain one.

RES:

{
  message: "1 validation error detected: Value '' at 'sequenceToken' failed to satisfy constraint: Member must have length 
  greater than or equal to 1",
  __type: "InvalidParameterException"
}

Through reading up on this and your above comments it seems like a previous request should contain this sequence token, but none do. The initial POSTs for Cognito, logGroups and logStreams all go through perfectly fine. We only get stuck on the above request ^

I am configuring the logger like you specified above:

    // configure amplify
    Amplify.configure({
      Logging: {
        logGroupName: 'TEST_WEB_APP',
        logStreamName: `${ENV}_LOGS`,
        sequenceToken: '123',
      },
      ...awsconfig,
    })

    this.amplifyLogger = new Logger('APP_LOGGER')
    Amplify.register(this.amplifyLogger)
    this.amplifyLogger.addPluggable(new AWSCloudWatchProvider())

and even tried throwing a dummy sequence token in but it does not get passed through.

Wondering if I am doing something wrong here, or you can give me some advice.

Thanks!!

@szatr
Copy link

szatr commented Aug 4, 2021

@alexanderMontague I ran into the same issue testing out this functionality with similar config. In my case the root cause seemed to be the fact that DescribeLogStreams wasn't returning a uploadSequenceToken value because the stream was empty. After adding an event to the stream manually, the sequenceToken value was populated correctly and subsequent log calls worked without issue. Perhaps the client can be updated to omit the sequenceToken parameter from the PutLogEventsCommand when it's empty.

@Leejjon
Copy link

Leejjon commented Aug 25, 2021

Does this work with SSR on Next.js? Or is this only for logging clientside log messages to CloudWatch?

I'm getting a dozen of errors:
Screenshot 2021-09-17 at 11 28 20

A full code sample of an amplify project that just does logging would be great (Next.js preferred).

@johnsonfamily1234
Copy link

Hey all - this is a great plugin thank you. Question - does this work in all regions?

@johnsonfamily1234
Copy link

Re: my last comment - I have this up and working in us-east-1 and it works fine. Now trying to implement in ap-southeast-2 with a very similar configuration I'm getting a "AWSCloudWatch - error getting log group - Error: Region is missing" error. I've tried everything I can think of but nothing seems to work. Thoughts anyone?

@slavco86
Copy link

@johnsonfamily1234 , have you tried adding region: "ap-southeast-2" to the Logging: {} config?
Worked for me, but I'm in eu-west... so might be different for you

@aaronlna
Copy link
Contributor

aaronlna commented Nov 7, 2021

Is there any work planned to move this provider into documentation? I browsed both the Amplify TypeDocs and the Amplify Docs but do not see anything mentioned about this provider. It would be helpful to know what configuration there is without having to dig through code.

@aaronlna
Copy link
Contributor

aaronlna commented Nov 7, 2021

@alexanderMontague The issue you're having with sequenceToken seems related to #9050. Are you trying to write to an empty log-stream?

@batelhi
Copy link

batelhi commented Jan 2, 2022

Can I get an explanation how to add access:
DescribeLogGroups
DescribeLogStreams
CreateLogGroup
CreateLogStream
PutLogEvents
to my amplify app?

Im getting an error:
[ERROR] 29:12.6 AWSCloudWatch - error getting log group - AccessDeniedException: User: arn:aws:sts::630051678415:assumed-role/amplify-intome-dev-232506-authRole/CognitoIdentityCredentials is not authorized to perform: logs:DescribeLogGroups on resource: arn:aws:logs:eu-west-1:630051678415:log-group::log-stream:

@jjwallaby
Copy link

I get error
AWSCloudWatch - failure during log push: InvalidSequenceTokenException: The given sequenceToken is invalid. The next expected sequenceToken is: 49623231331024656963658064413629994494964312060433793026

@jjwallaby
Copy link

Full Error
[13:24:15] [ERROR] 24:14.351 AWSCloudWatch - failure during log push: InvalidSequenceTokenException: The given sequenceToken is invalid. The next expected sequenceToken is: 49623231331024656963658064413629994494964312060433793026
at node_modules\react-native\Libraries\LogBox\LogBox.js:148:8 in registerError
at node_modules\react-native\Libraries\LogBox\LogBox.js:59:8 in errorImpl
at node_modules\react-native\Libraries\LogBox\LogBox.js:33:4 in console.error
at node_modules\expo\build\environment\react-native-logs.fx.js:27:4 in error
at node_modules@aws-amplify\core\lib-esm\Logger\ConsoleLogger.js:126:12 in prototype._log
at http://192.168.18.6:19000/node_modules%5Cexpo%5CAppEntry.bundle?platform=android&dev=true&hot=false&minify=false:154367:21 in error
at node_modules@aws-amplify\core\lib-esm\Providers\AWSCloudWatchProvider.js:386:36 in __generator$argument_1
at http://192.168.18.6:19000/node_modules%5Cexpo%5CAppEntry.bundle?platform=android&dev=true&hot=false&minify=false:475064:26 in step
at node_modules@aws-sdk\client-location\node_modules@aws-crypto\sha256-js\node_modules\tslib\tslib.js:122:34 in
at node_modules@aws-sdk\client-location\node_modules@aws-crypto\sha256-js\node_modules\tslib\tslib.js:113:43 in rejected
at node_modules\react-native\node_modules\promise\setimmediate\core.js:37:13 in tryCallOne
at node_modules\react-native\node_modules\promise\setimmediate\core.js:123:24 in setImmediate$argument_0
at node_modules\react-native\Libraries\Core\Timers\JSTimers.js:130:14 in _callTimer
at node_modules\react-native\Libraries\Core\Timers\JSTimers.js:181:14 in _callImmediatesPass
at node_modules\react-native\Libraries\Core\Timers\JSTimers.js:441:30 in callImmediates
at node_modules\react-native\Libraries\BatchedBridge\MessageQueue.js:387:6 in __callImmediates
at node_modules\react-native\Libraries\BatchedBridge\MessageQueue.js:135:6 in __guard$argument_0
at node_modules\react-native\Libraries\BatchedBridge\MessageQueue.js:364:10 in __guard
at node_modules\react-native\Libraries\BatchedBridge\MessageQueue.js:134:4 in flushedQueue

@Voyager-Two
Copy link

Voyager-Two commented Jun 30, 2022

This should be documented here: https://docs.amplify.aws/lib/utilities/logger/q/platform/js/
Surprised to find out the default logger just logs to console, and that's it.

@github-actions
Copy link

github-actions bot commented Jul 1, 2023

This pull request has been automatically locked since there hasn't been any recent activity after it was closed. Please open a new issue for related bugs.

Looking for a help forum? We recommend joining the Amplify Community Discord server amplify-help forum.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.