Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Product check from @elastic/elasticsearch-js #107663

Merged
merged 27 commits into from
Aug 26, 2021

Conversation

afharo
Copy link
Member

@afharo afharo commented Aug 4, 2021

Summary

Resolves #105557.

In #107642 we disabled the Product Check while we made sure Kibana was ready for it (and solved some bugs we noticed in the Product-Check implementation). This PR enables the feature back and adds tests to confirm that the new Product check from @elastic/elasticsearch-js does not break Kibana's behaviour.

Checklist

For maintainers

@afharo afharo added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v8.0.0 release_note:skip Skip the PR/issue when compiling release notes v7.15.0 v7.14.1 labels Aug 4, 2021
@afharo afharo force-pushed the test-es-client-product-check-behaviour branch from 1b6bb04 to fa3d76f Compare August 6, 2021 08:48
@afharo afharo force-pushed the test-es-client-product-check-behaviour branch from fa3d76f to 0e21c2c Compare August 13, 2021 14:37
@afharo
Copy link
Member Author

afharo commented Aug 17, 2021

Blocked by #107536

@afharo afharo force-pushed the test-es-client-product-check-behaviour branch from 0e21c2c to 02a9a4b Compare August 18, 2021 15:48
@mistic mistic added v7.16.0 and removed v7.15.0 labels Aug 18, 2021
@afharo afharo force-pushed the test-es-client-product-check-behaviour branch from 02a9a4b to c9ccff1 Compare August 19, 2021 13:05
@afharo afharo changed the title Test Product check in @elastic/elasticsearch-js Enable Product check from @elastic/elasticsearch-js Aug 19, 2021
@afharo afharo removed the v7.14.1 label Aug 19, 2021
@afharo afharo marked this pull request as ready for review August 19, 2021 15:27
@afharo afharo requested review from a team as code owners August 19, 2021 15:27
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@afharo afharo added the Team:Operations Team label for Operations Team label Aug 19, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-operations (Team:Operations)

@afharo afharo added the auto-backport Deprecated - use backport:version if exact versions are needed label Aug 19, 2021
@@ -55,3 +63,67 @@ describe('elasticsearch clients', () => {
expect(resp.headers!.warning).toMatch('system indices');
});
});

function createFakeElasticsearchServer() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I can see we rely on a product check to be performed during Elasticsearch version compatibility check step. Do you think it might make sense to have an explicit product check step during setup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, you cannot force a product check. The only thing that you can do is to make a random request (other than GET /), and the client will run it internally.

That said, I agree that we could run GET /_nodes like we do in the version-check polling, as the first thing during startup, and even crash Kibana if we agree to do so in the discussion #107663 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented the method isValidConnection.

await kibanaServer.preboot();
await kibanaServer.setup();
kibanaServer.start();
await fakeServer.firstRequestReceived$.pipe(first()).toPromise(); // Waiting for the first ES request instead of the start to complete, because start can't never complete due to waiting for ES to be ready
Copy link
Contributor

@mshustov mshustov Aug 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So Kibana doesn't stop, but running indefinitely until the problem is resolved? Is it okay from the product point of view? IMO it's okay since a normal ES server might start later, just want to confirm it aligns with the product vision @thesmallestduck

Copy link
Member Author

@afharo afharo Aug 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, AFAIK, once the client marks the connection as Product-invalid, it internally never checks again. It means that even when the ES server goes back to normal.

It might make sense to crash Kibana because of that. Although, we might need a retry logic to handle any possible errors other than the product check.

As an additional thought: we don't stop when Kibana's credentials or the ES host config are wrong.

I wonder if we should discuss this "Crash during startup on specific ES config errors" on a follow-up issue 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed about this sentence in the original issue:

Kibana should handle the incompatibility check exception thrown by @elastic/elasticsearch to log an error and stop Kibana gracefully.

I'll implement the shutdown logic 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@afharo I added this requirement, but I didn't discuss it with the product management. Let's wait for @thesmallestduck feedback

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @mshustov , a shutdown with a log message seems preferable to me. It is important to give an indication of why Kibana isn't functioning. An indefinite hang without a log message when connecting to an incorrect ES version seems unpleasant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deal! I'll trigger Kibana's shutdown when we see the ProductCheckError. FYI, we already log the error. But I'll make sure that it's more explicit.

// eslint-disable-next-line
const resp = await supertest(kibanaServer['server']['http']['prebootServer']['server']!.listener) // Mind that we are using the prebootServer at this point because the migration gets hanging, while waiting for ES to be correct
.get('/api/status')
.expect(503);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's turned out you cannot test it locally in the dev mode since basePathProxy blocks all the calls to the http preboot server 😞

if (msg === 'SERVER_LISTENING') {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, this case is supposed to be handled by

const isSetupOnHold = preboot.isSetupOnHold();
if (process.send && isSetupOnHold) {
process.send(['SERVER_LISTENING']);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, this case is supposed to be handled by

@pgayvallet Right now, this condition is always false because no plugins call holdSetupUntilResolved.

return send(
200,
{
name: 'es-bin',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why didn't this test contain this object before?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

es_bin was not covering the GET / requests. It'd just returned 404 because of the default use case. This means, that the client would never pass the Product Check and all the kbn-es tests would fail.

We can either disable the product check for these tests, or handle the specific request to pass it and unblock the client from that state.

@afharo
Copy link
Member Author

afharo commented Aug 24, 2021

Fixed them with 31e6fdc (#107663):
I had to add an optional config to skip the validation of the connection so the tests wouldn't keep hanging due to ES not being started in those. It already skipped such ES connection hold previously performed by SOs by not initializing the plugins, so adding a new config property made sense to me.

@afharo afharo requested a review from a team as a code owner August 24, 2021 19:09
@afharo afharo requested review from a team August 24, 2021 19:09
@afgomez afgomez self-requested a review August 25, 2021 12:56
root = kbnTestServer.createRoot({
migrations: { skip: true },
plugins: { initialize: false },
elasticsearch: { skipStartupConnectionCheck: true },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fancy adding another config value just for testing but mock the service might over-complicate the testing and diverge from the production setup even more.
@afharo Would you mind tagging config values used for testing purposes? It might help us to get rid of them in the future. Maybe we should allow using them in the dev mode only? (if they aren't used in functional tests which are run in production mode)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! I had to do it only when running from source (dist === false) because integration tests run with dev: false.

@jbudz FYI: I removed the config from kibana-docker because it is now only allowed when running from source.

// and this config is solely introduced to allow some of the integration tests to run without an ES server.
schema.contextRef('dist'),
true,
schema.boolean({
Copy link
Contributor

@mshustov mshustov Aug 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can use schena.never() instead

emsUrl: schema.conditional(
schema.siblingRef('proxyElasticMapsServiceInMaps'),
true,
schema.never(),
schema.string({ defaultValue: '' })
),

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense! but then the types will be skipStartupConnectionCheck?: boolean, meaning that we'll have undefined | false | true, it doesn't change much of the behaviour/logic, but do we want those 3 states?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, we don't. We can normalize them in ElasticsearchConfig:

this.skipStartupConnectionCheck = Boolean(rawConfig.skipStartupConnectionCheck)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it! The error message is not as self-explanatory, though:

"[skipStartupConnectionCheck]: a value wasn't expected to be present"

as opposed to the current

"[skipStartupConnectionCheck]: \"skipStartupConnectionCheck\" can only be set to true when running from source to allow integration tests to run without an ES server"

If you think it's fine, I'm happy to change to schema.never() :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think it's fine, I'm happy to change to

Up to you

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave it as is for now then, for consistency with other keys in the same Elasticsearch config (i.e.: ignoreVersionMismatch).

I've created #110239 to follow up and change them all if we think schema.never() is a better approach.

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@afharo afharo merged commit bb8ee0c into elastic:master Aug 26, 2021
@afharo afharo deleted the test-es-client-product-check-behaviour branch August 26, 2021 14:36
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request Aug 26, 2021
@kibanamachine
Copy link
Contributor

💚 Backport successful

Status Branch Result
7.x

This backport PR will be merged automatically after passing CI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed release_note:skip Skip the PR/issue when compiling release notes Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Team:Operations Team label for Operations Team v7.16.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gracefully handle elasticsearch client version check
9 participants