Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[APM] Add anomaly detection API tests + fixes #73120

Merged
merged 7 commits into from
Jul 30, 2020

Conversation

sorenlouv
Copy link
Member

@sorenlouv sorenlouv commented Jul 23, 2020

  • Add anomaly detection API tests (read/write/no access users)
  • Remove error codes (not needed, since we can just set correct error messages)
  • use Boom errors where appropriate
  • only fetches anomaly detection jobs when available to user (Anomaly detection link)
  • requires correct ML read/write privileges to access get/create routes

@sorenlouv sorenlouv requested a review from a team as a code owner July 23, 2020 19:07
@botelastic botelastic bot added the Team:APM - DEPRECATED Use Team:obs-ux-infra_services. label Jul 23, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/apm-ui (Team:apm)

@sorenlouv sorenlouv added release_note:skip Skip the PR/issue when compiling release notes v7.10.0 v7.9.0 labels Jul 23, 2020
indices: [
{ names: ['observability-annotations'], privileges: ['read', 'view_index_metadata'] },
],
},
Copy link
Member Author

@sorenlouv sorenlouv Jul 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These privileges are now part of the apm_user role.

text: MLErrorMessages[res.errorCode],
});
return false;
}
Copy link
Member Author

@sorenlouv sorenlouv Jul 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got rid of the errorCode in the response. Now the server side will throw the error and the client should simply read the error message (error.message).

@sorenlouv sorenlouv force-pushed the add-anomaly-detection-api-tests branch 2 times, most recently from de82143 to a32d817 Compare July 24, 2020 23:17
</EuiButtonEmpty>

{canGetJobs && hasValidLicense ? (
<MissingJobsAlert environment={environment} />
Copy link
Member Author

@sorenlouv sorenlouv Jul 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only show ML warnings if the user can act on them. Which means: the user should have read privileges to ML and a valid license

@@ -29,16 +29,12 @@ export async function createAnomalyDetectionJobs(
const { ml, indices } = setup;

if (!ml) {
throw new AnomalyDetectionError(ErrorCode.ML_NOT_AVAILABLE);
throw Boom.serverUnavailable(ML_ERRORS.ML_NOT_AVAILABLE);
Copy link
Member Author

@sorenlouv sorenlouv Jul 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Boom errors over the custom AnomalyDetectionError (which I removed everywhere)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if a 503 is appropriate here. 503 implies that it is a temporary error, and should be retried. Maybe a 500 makes more sense?

@@ -12,7 +15,12 @@ export async function hasLegacyJobs(setup: Setup) {
const { ml } = setup;

if (!ml) {
return false;
throw Boom.serverUnavailable(ML_ERRORS.ML_NOT_AVAILABLE);
Copy link
Member Author

@sorenlouv sorenlouv Jul 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throwing early instead of returning a default value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See other comment, we could consider a 500 instead of a 503.


// get ML anomaly detection jobs for each environment
export const anomalyDetectionJobsRoute = createRoute(() => ({
method: 'GET',
path: '/api/apm/settings/anomaly-detection',
options: {
tags: ['access:apm', 'access:ml:canGetJobs'],
},
Copy link
Member Author

@sorenlouv sorenlouv Jul 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Require ML read privileges to access route

@@ -65,7 +52,7 @@ export const createAnomalyDetectionJobsRoute = createRoute(() => ({
method: 'POST',
path: '/api/apm/settings/anomaly-detection/jobs',
options: {
tags: ['access:apm', 'access:apm_write'],
tags: ['access:apm', 'access:apm_write', 'access:ml:canCreateJob'],
Copy link
Member Author

@sorenlouv sorenlouv Jul 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

require ML write privileges to access route

@sorenlouv sorenlouv force-pushed the add-anomaly-detection-api-tests branch from d16f3e8 to d7735c4 Compare July 26, 2020 22:14
});

expect(toolTipAnchor).not.toBeInTheDocument();
expect(toolTipText).toBe(undefined);
});
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the tests became very clean and easy to read. It did come at the cost of making renderTooltipAnchor slightly more complex. All in all I think it's worth it. LMK what you think

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good compromise. It's unfortunate that we can't encapsulate all this logic into the original showAlert function, since that was simple to use & test. But then we'd be making extra API calls that might not be necessary, so this works well.

@sorenlouv sorenlouv force-pushed the add-anomaly-detection-api-tests branch from d7735c4 to ba7fac4 Compare July 26, 2020 22:17
@sorenlouv sorenlouv force-pushed the add-anomaly-detection-api-tests branch from ba7fac4 to 744373f Compare July 27, 2020 05:54
@dgieselaar dgieselaar self-requested a review July 27, 2020 13:55
it('should return true when there are no jobs', () => {
const result = showAlert(dataWithoutJobs, undefined);
expect(result).toBe(true);
it('when no jobs exists for the selected environment - it should show warning', async () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use describe('when no jobs exists for the selected environment', () => { it('shows a warning', () => >..

smith
smith previously approved these changes Jul 27, 2020
@dgieselaar
Copy link
Member

@sqren I've got some feedback pending, give me a couple hours tomorrow before merging

@smith smith self-requested a review July 27, 2020 20:38
@smith smith dismissed their stale review July 27, 2020 20:38

Waiting on other reviewers.

@ogupte ogupte changed the title [APM] Add anomaly detection API tests [APM] Add anomaly detection API and UI unit tests Jul 27, 2020
@ogupte ogupte changed the title [APM] Add anomaly detection API and UI unit tests [APM] Add anomaly detection API tests + fixes Jul 27, 2020
await createJobs(['production', 'staging']);

const { body } = await getJobs();
expect(body.hasLegacyJobs).to.be(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we don't have any legacy jobs in the test data, we can't test that part of the response here. But i think it's ok to do this in a unit test since it's only testing for a particular pattern in the a job id.

Copy link
Member

@dgieselaar dgieselaar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll submit the feedback that I have for now - I'm waiting on ML to work on the obs test clusters to complete my review.

@@ -29,16 +29,12 @@ export async function createAnomalyDetectionJobs(
const { ml, indices } = setup;

if (!ml) {
throw new AnomalyDetectionError(ErrorCode.ML_NOT_AVAILABLE);
throw Boom.serverUnavailable(ML_ERRORS.ML_NOT_AVAILABLE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if a 503 is appropriate here. 503 implies that it is a temporary error, and should be retried. Maybe a 500 makes more sense?

if (!ml) {
return [];
throw Boom.serverUnavailable(ML_ERRORS.ML_NOT_AVAILABLE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See other comment, we could consider a 500 instead of a 503.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 501 is probably better than 500 or 503. From the RFC:

10.5.1 500 Internal Server Error
The server encountered an unexpected condition which prevented it from fulfilling the request.

10.5.2 501 Not Implemented
The server does not support the functionality required to fulfill the request. This is the appropriate response when the server does not recognize the request method and is not capable of supporting it for any resource.

@@ -12,7 +15,12 @@ export async function hasLegacyJobs(setup: Setup) {
const { ml } = setup;

if (!ml) {
return false;
throw Boom.serverUnavailable(ML_ERRORS.ML_NOT_AVAILABLE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See other comment, we could consider a 500 instead of a 503.

@@ -31,29 +35,14 @@ export async function getServiceAnomalies({
const { ml, start, end } = setup;

if (!ml) {
logger.warn('Anomaly detection plugin is not available.');
return DEFAULT_ANOMALIES;
throw Boom.serverUnavailable(ML_ERRORS.ML_NOT_AVAILABLE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we perhaps handle this in the route, and use a helper function that will either throw these errors or refine { ml?: ... } to { ml: ... }?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea but I'm going to leave these as-is to avoid too many changes this late. If you really think they're needed mark this PR with changes requested and I can get them in.

* you may not use this file except in compliance with the Elastic License.
*/

/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think that this test might be enough? i.e., if this fails I think it's safe to assume that other users (read, no access) won't have access either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right but I think I'll just leave these for now and err on the side of having too many tests.

return {
errorCode: mlErrorCode,
};
throw e;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need the log before this line if the route handler will throw anyway?

Comment on lines 42 to 44
context.logger.warn(
`Error while retrieving ML jobs: "${e.name}:${e.message}"`
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we log if the route throws?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow what you're saying here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the comment below I think you mean let's remove the try/catch and logging since we'll get the error anyway if it fails.

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

async chunks size

id value diff baseline
apm 3.7MB -721.0B 3.7MB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@smith smith merged commit aa68e3b into elastic:master Jul 30, 2020
kibanamachine pushed a commit that referenced this pull request Jul 30, 2020
@kibanamachine
Copy link
Contributor

💔 Backport was not successful

The PR was attempted backported to the following branches:

  • 7.x
  • ❌ 7.9: Commit could not be cherrypicked due to conflicts

smith pushed a commit to smith/kibana that referenced this pull request Jul 30, 2020
Co-authored-by: Nathan L Smith <[email protected]>
# Conflicts:
#	x-pack/plugins/apm/public/components/app/Settings/anomaly_detection/jobs_list.tsx
smith added a commit that referenced this pull request Jul 30, 2020
Co-authored-by: Nathan L Smith <[email protected]>
Co-authored-by: Søren Louv-Jansen <[email protected]>
@sorenlouv sorenlouv deleted the add-anomaly-detection-api-tests branch August 15, 2020 13:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_note:skip Skip the PR/issue when compiling release notes Team:APM - DEPRECATED Use Team:obs-ux-infra_services. v7.9.0 v7.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants