Adds new SavedObjectsRepository error type for 404 that do not originate from Elasticsearch responses #107301
Comments to reviewers.
const notFoundError = this.createGenericNotFoundError(type, id);
return this.decorateEsUnavailableError(
  new Error(`${notFoundError.message}`),
  `x-elastic-product not present or not recognized`
Copied self-review comment:
I'm not sure this is the best text string to use as a descriptor for the "missing" header. I took inspiration from the client's team but made it a lot more specific. We need to get Product's input on wording if we choose to follow this implementation.
public static isNotFoundEsUnavailableError(error: Error | DecoratedError) {
  return (
    isSavedObjectsClientError(error) &&
    error[code] === CODE_ES_UNAVAILABLE &&
Copied self-review comment:
Specifically set a status code of 503 to indicate an ES availability error. This overrides the 404 we would otherwise have thrown.
  throw SavedObjectsErrorHelpers.createGenericNotFoundError(type, id);
} else {
  // throw if we can't verify the response is from Elasticsearch
  throw SavedObjectsErrorHelpers.createGenericNotFoundEsUnavailableError(type, id);
Self-review:
createGenericNotFoundEsUnavailableError will return a 503 error that adds the message from a NotFoundError to the EsUnavailableError. The decorated error that's thrown will assert true to a check for isEsUnavailableError, since the latter returns a boolean for the statusCode being 503.
If a saved object type and id are provided, the error will have the form:
{
message: 'x-elastic-product not present or not recognized: Saved object [foo/bar] not found',
statusCode: 503
}
Otherwise, the error will have the form:
{
message: 'x-elastic-product not present or not recognized: Not Found',
statusCode: 503
}
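To make the two message shapes above concrete, here is a minimal, self-contained sketch of how a reason prefix could be combined with the not-found message. It is not the Kibana implementation: the `decorate` body, the string value of `CODE_ES_UNAVAILABLE`, the plain `code` property and the `createGenericNotFoundMessage` helper are all assumptions made for illustration.

```typescript
// Hypothetical stand-ins for the helpers discussed in this PR.
// Assumption: a plain string property is used for the error code here.
const CODE_ES_UNAVAILABLE = 'SavedObjectsClient/esUnavailable';

interface DecoratedError extends Error {
  code: string;
  statusCode: number;
}

// Prefix the message with the reason (if any), then attach code and status.
function decorate(
  error: Error,
  code: string,
  statusCode: number,
  reason?: string
): DecoratedError {
  const decorated = error as DecoratedError;
  if (reason) {
    decorated.message = `${reason}: ${decorated.message}`;
  }
  decorated.code = code;
  decorated.statusCode = statusCode;
  return decorated;
}

// Hypothetical helper: builds the generic not-found message.
function createGenericNotFoundMessage(type?: string, id?: string): string {
  return type && id ? `Saved object [${type}/${id}] not found` : 'Not Found';
}

function createGenericNotFoundEsUnavailableError(
  type?: string,
  id?: string
): DecoratedError {
  return decorate(
    new Error(createGenericNotFoundMessage(type, id)),
    CODE_ES_UNAVAILABLE,
    503,
    'x-elastic-product not present or not recognized'
  );
}

// Simplified guard: checks only the code attached by decorate().
function isEsUnavailableError(error: Error | DecoratedError): boolean {
  return (error as Partial<DecoratedError>).code === CODE_ES_UNAVAILABLE;
}
```

Under these assumptions, calling `createGenericNotFoundEsUnavailableError('foo', 'bar')` yields the first message form with a 503 status, and calling it without arguments yields the second.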
optional nit: I had to read the createGenericNotFoundEsUnavailableError implementation to understand what error it will create. Maybe we should pass an error reason to decorateEsUnavailableError explicitly?
decorateEsUnavailableError(error, `Saved object [${type}/${id}] not found`);
// throw if we can't verify the response is from Elasticsearch
Will we add such a helper for every possible error response?
@joshdover which question did you give the thumbs up for in #107301 (comment)? i.e.
Maybe we should pass an error reason to decorateEsUnavailableError explicitly?
or
Will we add such a helper for every possible error response?
I'm not sure which way we want to go eventually, as long as we provide enough info to easily figure out the root cause of an error when it's thrown.
Maybe we should pass an error reason to decorateEsUnavailableError explicitly?
We would still need to provide an error to pass to decorateEsUnavailableError:
public static decorateEsUnavailableError(error: Error, reason?: string) {
  return decorate(error, CODE_ES_UNAVAILABLE, 503, reason);
}
which is what we're doing in createGenericNotFoundEsUnavailableError.
Code comments.
@elasticmachine merge upstream
LGTM! Awesome work! I've tested this locally by simulating an ES restart while Kibana was still running. As a result, all scheduled rules continued running after Kibana got back to a healthy state.
This PR should resolve this Alerting issue with no action needed from our side.
@mshustov regarding your comment about using this header before our version check uses it:
In addition to @TinaHeiligers's note that this check only applies to 404, it also seems quite possible that we will have #105557 fixed for the 7.14.1 release, which makes me less concerned about this. Though maybe we should consider delaying a backport until that change is also backported.
@joshdover do you mean the 7.14.1 backport or both backports to 7.14.1 and 7.x (7.15)?
ack: going to review tomorrow (04 Aug.)
  );
}

public static isNotFoundEsUnavailableError(error: Error | DecoratedError) {
Do we need this API? If we want plugins to treat this error as any other ES not found issue, I think it'd be ok to only include the factory function createGenericNotFoundEsUnavailableError so that plugins don't waste time trying to figure out how they should handle this error in a different way.
I'm actually using the API within the repository to verify error types here. It also ensures we stay consistent with repository error type checks, such as isNotFoundError, with the additional check on the error reason.
isNotFoundEsUnavailableError isn't an error itself, it returns a boolean.
Right, but as a plugin developer, I'd want to handle each error exposed on SavedObjectsErrorHelpers.is* in order to make my code bulletproof. I'd expect each of these is*Error methods to correspond to only one error type and not overlap, but that is no longer the case, since both isEsUnavailableError and isNotFoundEsUnavailableError could return true on the same error. I'd also expect each error type to correspond to something different I could do to handle it; however, the underlying problem for both of these errors is the same, and there's really nothing different I can/should do as a developer for the isNotFoundEsUnavailableError case vs the isEsUnavailable one.
Are you planning to add similar new error guards for each method as part of #107343? Personally, I think we should only expose isEsUnavailable for all of these situations to signal to plugins that they should handle this error in the same way for every method.
Personally, I think we should only expose isEsUnavailable for all of these situations to signal to plugins that they should handle this error in the same way for every method.
That makes sense. I've removed isNotFoundEsUnavailableError.
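The outcome of this thread, a single guard for every "Elasticsearch unavailable" situation, can be sketched from a plugin's point of view as follows. This is only an illustration: the guards below are simplified stand-ins that look at a `statusCode` property, and the `Handling` type and `chooseHandling` function are hypothetical names, not part of the Saved Objects API.

```typescript
// Simplified stand-in for a decorated saved objects error.
interface SavedObjectsLikeError {
  statusCode?: number;
}

// Assumption: the guards are reduced to status code checks for this sketch.
const isNotFound = (e: SavedObjectsLikeError): boolean => e.statusCode === 404;
const isEsUnavailable = (e: SavedObjectsLikeError): boolean => e.statusCode === 503;

type Handling = 'treat-as-missing' | 'retry-later' | 'rethrow';

// One branch covers every ES availability problem, including a 404 whose
// origin could not be verified, so no separate guard is needed for it.
function chooseHandling(error: SavedObjectsLikeError): Handling {
  if (isEsUnavailable(error)) return 'retry-later';
  if (isNotFound(error)) return 'treat-as-missing';
  return 'rethrow';
}
```

With non-overlapping guards, each error maps to exactly one branch, which is the property the reviewer asked for.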
elasticsearchClientMock.createSuccessTransportRequestPromise(
  { found: false },
  undefined,
  { 'x-elastic-product': 'Elasticsearch' }
Maybe we should include this header as the default in elasticsearchClientMock.createSuccessTransportRequestPromise() (or even better might be in the createApiResponse function that this function calls) and only override it when needed for these tests. That would better represent the expected typical conditions in our mocks rather than the exceptional case.
I'd prefer to handle that in the follow up issue (that I'm working on right now anyway), as it's going to become the typical conditions when we add the checks to the rest of the SO repository methods.
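The suggestion to default the header in the mock could look roughly like the sketch below. It does not use the real `elasticsearchClientMock`; `createMockApiResponse`, `MockApiResponse`, `MockHeaders` and `DEFAULT_HEADERS` are hypothetical names chosen for this example.

```typescript
// Header values follow the shape suggested later in this review.
type MockHeaders = Record<string, string | string[] | undefined>;

interface MockApiResponse<T> {
  body: T;
  statusCode: number;
  headers: MockHeaders;
}

// Typical condition: the response identifies itself as Elasticsearch.
const DEFAULT_HEADERS: MockHeaders = { 'x-elastic-product': 'Elasticsearch' };

// Hypothetical factory: callers override headers only for exceptional cases.
function createMockApiResponse<T>(
  body: T,
  { statusCode = 200 }: { statusCode?: number } = {},
  headers: MockHeaders = {}
): MockApiResponse<T> {
  return { body, statusCode, headers: { ...DEFAULT_HEADERS, ...headers } };
}

// Typical case: no header argument needed.
const typical = createMockApiResponse({ found: false });

// Exceptional case: simulate a response that did not come from Elasticsearch.
const unverified = createMockApiResponse(
  { found: false },
  { statusCode: 404 },
  { 'x-elastic-product': undefined }
);
```

Tests for the exceptional path then stand out, because they are the only ones that pass a header override.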
.then((res) => {
  const indexNotFound = res.statusCode === 404;
  const esServerSupported = isSupportedEsServer(res.headers);
  // check if we have the elasticsearch header when doc.found is not true (false, undefined or null) and if we do, ensure it is Elasticsearchq
- // check if we have the elasticsearch header when doc.found is not true (false, undefined or null) and if we do, ensure it is Elasticsearchq
+ // check if we have the elasticsearch header when index is not found and if we do, ensure it is Elasticsearch
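The check that this comment describes can be sketched in isolation as below. It is a simplified illustration, not the repository code: the body of `isSupportedEsServer` is an assumption, and `ResponseHeaders`, `NotFoundOutcome` and `classifyResponse` are hypothetical names.

```typescript
type ResponseHeaders = Record<string, string | string[] | undefined>;

// Assumption: a response counts as coming from Elasticsearch when the
// x-elastic-product header is exactly 'Elasticsearch'.
function isSupportedEsServer(headers: ResponseHeaders | null | undefined): boolean {
  return !!headers && headers['x-elastic-product'] === 'Elasticsearch';
}

type NotFoundOutcome = 'ok' | 'not-found' | 'es-unavailable';

// Only 404 responses are inspected: a verified 404 is a real "not found",
// an unverified one is reported as an ES availability problem (503).
function classifyResponse(
  statusCode: number,
  headers: ResponseHeaders | null | undefined
): NotFoundOutcome {
  if (statusCode !== 404) return 'ok';
  return isSupportedEsServer(headers) ? 'not-found' : 'es-unavailable';
}
```

For example, a 404 returned by a proxy without the header would be classified as `'es-unavailable'` rather than as a missing index or document.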
I just mean delaying the backport to 7.14.1
@@ -141,9 +141,10 @@ export type MockedTransportRequestPromise<T> = TransportRequestPromise<T> & {

 const createSuccessTransportRequestPromise = <T>(
   body: T,
-  { statusCode = 200 }: { statusCode?: number } = {}
+  { statusCode = 200 }: { statusCode?: number } = {},
+  headers?: Record<string, any>
@joshdover thanks for pointing the type issue out!
nit:
headers?: Record<string, string | string[]>
see
kibana/src/core/server/http/router/headers.ts
Lines 40 to 41 in b20e9c2
export type Headers = { [header in KnownHeaders]?: string | string[] | undefined } & {
  [header: string]: string | string[] | undefined;
jenkins test this please
@elasticmachine merge upstream
Summary of PR status: @dover @mshustov @rudolf I’ve created the following issues to follow up on this work:
Please let me know if I’ve missed something or if there are remaining issues to tackle before moving forward with the PR. I’m hoping we can get a fix in before 7.15 FF but, depending on priorities, I don't know if there's enough time to get through all the follow-up issues before FF.
💚 Build Succeeded
…nate from Elasticsearch responses (elastic#107301) Co-authored-by: Kibana Machine <[email protected]>
💚 Backport successful
This backport PR will be merged automatically after passing CI.
…nate from Elasticsearch responses (#107301) (#108037) Co-authored-by: Kibana Machine <[email protected]> Co-authored-by: Christiane (Tina) Heiligers <[email protected]>
@TinaHeiligers Was this backported to 7.14? The tag says 7.14 but I can't find a backport PR. Wondering because we are seeing a saved object not found error for a rule in 7.14.1 in a cluster that seems to have some proxy connectivity issues.
Summary
Resolves #102353.
Replaces #107104.
There may be cases where a 404 can be returned during calls to the Saved Objects Repository for reasons other than the saved object not being found.
The Saved Objects service should distinguish between a saved object not being found and other, unknown reasons that do not originate from responses to an Elasticsearch call.
This PR adds an explicit 503 NotFoundEsUnavailable Error that will wrap 404 errors from get, update and delete requests when the response does not contain the x-elastic-product: Elasticsearch header.
As a follow up to this initial implementation, we need to:
Checklist
Delete any items that are not applicable to this PR.
Risk Matrix
x-elastic-product header.
For maintainers