[Cloud Security][Telemetry] add error handling incase an individual collector fails #165918
Pinging @elastic/kibana-cloud-security-posture (Team:Cloud Security)
x-pack/plugins/cloud_security_posture/server/lib/telemetry/collectors/types.ts
x-pack/plugins/cloud_security_posture/server/lib/telemetry/collectors/register.ts
@@ -61,6 +65,13 @@ export function registerCspmUsageCollector(
getAlertsStats(collectorFetchContext.esClient, logger),
]);

const indicesStats = handleResult('Indices', results[0]);
const accountsStats = handleResult('Accounts', results[1]);
Here, if there's an error, you would do something like:
const accountsStats = handleResult('Accounts', results[1]) || []
Promise.allSettled() always resolves to a truthy value (an object), so I would never render a [].
// [
//   { status: 'fulfilled', value: 33 },
//   { status: 'fulfilled', value: 66 },
//   { status: 'fulfilled', value: 99 },
//   { status: 'rejected', reason: Error: an error }
// ]
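To make the shape of those results concrete, here is a minimal TypeScript sketch. The `settleSamples` and `describeResults` helpers and the sample values are illustrative assumptions, not code from this PR. It shows that every entry returned by `Promise.allSettled()` is an object carrying a `status`, whether the underlying promise fulfilled or rejected, so any fallback has to be applied to the extracted value rather than to the entry itself.

```typescript
// Illustrative only: sample promises, not the PR's collectors.
const settleSamples = async (): Promise<Array<PromiseSettledResult<number>>> => {
  const tasks: Array<Promise<number>> = [
    Promise.resolve(33),
    Promise.reject(new Error('an error')),
  ];
  return Promise.allSettled(tasks);
};

// Each entry is an object: { status: 'fulfilled', value } or
// { status: 'rejected', reason }. Both shapes are truthy.
const describeResults = async (): Promise<string[]> => {
  const results = await settleSamples();
  return results.map((r) =>
    r.status === 'fulfilled'
      ? `fulfilled:${r.value}`
      : `rejected:${(r.reason as Error).message}`
  );
};
```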
installationStats,
alertsStats,
] = await Promise.all([
const handleResult = <T>(taskName: string, result: PromiseSettledResult<T>) => {
nit: we are not handling the error here (I don't see logging as handling it), so I'd suggest a name that describes what actually happens on the happy path, like extractPromiseValue
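A rough sketch of what a helper with the suggested name might look like. The body and the `MinimalLogger` interface are assumptions for illustration (a stand-in for Kibana's logger), since the thread only shows the function signature.

```typescript
// Hypothetical stand-in for the logger passed around in the plugin.
interface MinimalLogger {
  error: (message: string) => void;
}

// Returns the fulfilled value, or logs the rejection and returns undefined.
// The failure is recorded but not recovered from, hence "extract", not "handle".
const extractPromiseValue = <T>(
  taskName: string,
  result: PromiseSettledResult<T>,
  logger: MinimalLogger
): T | undefined => {
  if (result.status === 'fulfilled') {
    return result.value;
  }
  logger.error(`${taskName} task failed: ${String(result.reason)}`);
  return undefined;
};
```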
@@ -8,7 +8,7 @@
import { CspStatusCode } from '../../../../common/types';

export interface CspmUsage {
indices: CspmIndicesStats;
indices: CspmIndicesStats | never[];
From what I've read about never, it doesn't appear to be relevant here.
My reference: https://www.typescriptlang.org/docs/handbook/basic-types.html#never
Okay, good point. I mistakenly thought we wanted to handle the error with an empty state, so applying never fixed the type error, but I'll change the default value to undefined.
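A small sketch of the alternative being agreed on here, typing a failed collector as `undefined` instead of `never[]`. The `IndicesStatsSketch` and `UsageSketch` shapes are invented for illustration and do not match the real `CspmIndicesStats` or `CspmUsage`.

```typescript
// Simplified stand-in; the real CspmIndicesStats has different fields.
interface IndicesStatsSketch {
  docCount: number;
}

// A collector that failed is represented as undefined rather than an empty array.
interface UsageSketch {
  indices: IndicesStatsSketch | undefined;
}

// Consumers narrow on undefined before reading the stats.
const summarize = (usage: UsageSketch): string =>
  usage.indices === undefined
    ? 'indices: unavailable'
    : `indices: ${usage.indices.docCount} docs`;
```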
LGTM - added a nit refactoring suggestion
installationStats,
alertsStats,
] = await Promise.all([
const getPromiseValue = <T>(
nit: you could improve it a bit by not tangling yourself with results[X]. Change getPromiseValue to receive a promise and await it.
That way, when forming the allSettled request, the name of each request sits right next to it, and the getPromiseValue name also describes its logic better.
Maybe awaitPromiseSafe describes it even better: it is safe because it won't throw any errors.
const awaitPromiseSafe = async <T>(
  taskName: CloudSecurityUsageCollectorType,
  promise: Promise<T>
) => {
  try {
    const val = await promise;
    return val;
  } catch (error) {
    logger.error(`${taskName} task failed: ${error.message}`);
    logger.error(error.stack);
    return undefined;
  }
};
const esClient = collectorFetchContext.esClient;
const soClient = collectorFetchContext.soClient;
const [
  indices,
  // eslint-disable-next-line @typescript-eslint/naming-convention
  accounts_stats,
  // eslint-disable-next-line @typescript-eslint/naming-convention
  resources_stats,
  // eslint-disable-next-line @typescript-eslint/naming-convention
  rules_stats,
  // eslint-disable-next-line @typescript-eslint/naming-convention
  installation_stats,
  // eslint-disable-next-line @typescript-eslint/naming-convention
  alerts_stats,
] = await Promise.all([
  awaitPromiseSafe('Indices', getIndicesStats(esClient, soClient, coreServices, logger)),
  awaitPromiseSafe('Accounts', getAccountsStats(esClient, logger)),
  awaitPromiseSafe('Resources', getResourcesStats(esClient, logger)),
  awaitPromiseSafe('Rules', getRulesStats(esClient, logger)),
  awaitPromiseSafe(
    'Installation',
    getInstallationStats(esClient, soClient, coreServices, logger)
  ),
  awaitPromiseSafe('Alerts', getAlertsStats(esClient, logger)),
]);
return {
  indices,
  accounts_stats,
  resources_stats,
  rules_stats,
  installation_stats,
  alerts_stats,
};
and we can replace allSettled with all, because we wrap each promise in a handler that keeps it safe
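A self-contained sketch of why `Promise.all` is enough once each promise is wrapped: the wrapper converts a rejection into `undefined`, so `Promise.all` never sees a rejected promise. The `failureLog` array and the two sample tasks are placeholders assumed for illustration, standing in for the plugin's logger and collectors.

```typescript
// Placeholder for the plugin's logger; collects failure messages instead.
const failureLog: string[] = [];

// Never rejects: a failed task resolves to undefined after being recorded.
const awaitPromiseSafe = async <T>(
  taskName: string,
  promise: Promise<T>
): Promise<T | undefined> => {
  try {
    return await promise;
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    failureLog.push(`${taskName} task failed: ${message}`);
    return undefined;
  }
};

// Promise.all is safe here because every element is already wrapped.
const collect = async () => {
  const [indices, accounts] = await Promise.all([
    awaitPromiseSafe('Indices', Promise.resolve({ count: 2 })),
    awaitPromiseSafe('Accounts', Promise.reject<number>(new Error('ES unavailable'))),
  ]);
  return { indices, accounts };
};
```

Because the task name is passed alongside the promise, each result keeps its label without indexing into a results array.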
…m:Omolola-Akinleye/kibana into fix_telemetry_bug_usage_collector_failure
💚 Build Succeeded
@@ -7,6 +7,14 @@

import { CspStatusCode } from '../../../../common/types';

export type CloudSecurityUsageCollectorType =
💯
Summary
The Kibana cloud security usage collector payload fails when one feature collector fails. I applied error handling so that an individual feature collector can fail gracefully with a default value. We are now using Promise.allSettled instead of Promise.all, so that one failing collector no longer causes all collectors to fail.
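The difference described in the summary can be sketched as follows. The two collectors are placeholders assumed for illustration, not the plugin's real functions.

```typescript
// Placeholder collectors for illustration only.
const okCollector = async (): Promise<number> => 1;
const failingCollector = async (): Promise<number> => {
  throw new Error('collector failed');
};

// Promise.all rejects as soon as any collector rejects, so nothing is reported.
const withAll = async (): Promise<string> => {
  try {
    await Promise.all([okCollector(), failingCollector()]);
    return 'payload sent';
  } catch {
    return 'whole payload lost';
  }
};

// Promise.allSettled waits for every collector and reports each outcome,
// so the successful collector's value survives.
const withAllSettled = async (): Promise<Array<number | undefined>> => {
  const results = await Promise.allSettled([okCollector(), failingCollector()]);
  return results.map((r) => (r.status === 'fulfilled' ? r.value : undefined));
};
```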