Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Platform] Status endpoint incorrectly returns 200 before Kibana is ready #66326

Closed
sorenlouv opened this issue May 12, 2020 · 11 comments
Closed
Labels
Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@sorenlouv
Copy link
Member

sorenlouv commented May 12, 2020

I'm trying to programatically determine when Kibana is ready to serve requests and therefore calling the status api /api/status.

Expected behaviour
Until Kibana is up and ready to receive requests I would expect /api/status to return something other than 200.

Actual behaviour
/api/status initially returns 503

At some point the optimization step is complete which will be logged as:

[info][optimize] Optimization success in 2.83 seconds

At this point /api/status will return 200. However opening Kibana in the browser will still return an error message: "Elastic did not load properly. Check the server output for more information."

image

After about 60 seconds Kibana will output:

[success][@kbn/optimizer] 88 bundles compiled successfully after 297.9 sec, watching for changes

Kibana will now be truly ready to serve requests

Reproduce
This only seems to happen in a bare repository without any cached artefacts - at all! The easiest way to get to this state is by wiping all build files find {x-pack,src} -type d -name "target" | xargs rm -rf .

Cold start Kibana:

find {x-pack,src} -type d -name "target" | xargs rm -rf  && yarn kbn bootstrap &&   node ./scripts/kibana --no-base-path --dev --elasticsearch.username admin --elasticsearch.password changeme --elasticsearch.hosts http://localhost:9200

Poll the status endpoint while Kibana is starting:

for run in {1..1000}; do curl -s -o /dev/null -w "%{http_code}\n" -u admin:changeme localhost:5601/api/status; sleep 0.5; done

Note: you should be able to remove the credentials if you are connecting to an ES without security enabled

@sorenlouv sorenlouv added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Team:Operations Team label for Operations Team labels May 12, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-platform (Team:Platform)

@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-operations (Team:Operations)

@tylersmalley
Copy link
Contributor

Sounds like we're not awaiting the new platform plugins being built.

@jbudz
Copy link
Member

jbudz commented May 12, 2020

Related #46984, #6145

@sorenlouv
Copy link
Member Author

Are there any suggested workarounds until this is fixed? We are unable to run our Cypress tests suite on CI because of this but maybe someone has a bright idea.

@mshustov
Copy link
Contributor

Are there any suggested workarounds until this is fixed? We are unable to run our Cypress tests suite on CI because of this but maybe someone has a bright idea.

@sorenlouv
Copy link
Member Author

Thanks @restrry !

run tests again pre-built version of Kibana

Can we (easily) do this if we want to run the tests on every push to a PR?

use the same logic as FTR: to parse logs and wait for http server running

@v1v WDYT about this? Would you be able to look into the manual parse step?

@v1v
Copy link
Member

v1v commented May 18, 2020

@v1v WDYT about this? Would you be able to look into the manual parse step?

Sure thing, I can give a go

v1v added a commit to v1v/kibana that referenced this issue May 25, 2020
@v1v
Copy link
Member

v1v commented May 25, 2020

4dea197 could be the potential workaround in our end:

grep -m 1 "http server running" <(tail -f -n +1 kibana.log)

But I don't see the above wait to be 100% silver bullet, the E2E fails anyway since even the Kibana is up and running.

In other words, we have included two validations for running the E2E tests when kibana is up and running:

echo "Note: you need to start Kibana manually. Find the instructions at the top."
yarn wait-on -i 500 -w 500 http-get://admin:changeme@localhost:$KIBANA_PORT/api/status > /dev/null

## Workaround to wait for the http server running
## See: https://github.com/elastic/kibana/issues/66326
if [ -e kibana.log ] ; then
    grep -m 1 "http server running" <(tail -f -n +1 kibana.log)
    grep -m 1 "http server running" kibana.log. ## duplicated for some other reasons
    echo "\n✅ Kibana server running...\n"
fi

echo "\n✅ Setup completed successfully. Running tests...\n"

The log output looks like the below one:

[2020-05-25T10:47:59.742Z] + echo '\nWaiting for Kibana to start...'
[2020-05-25T10:47:59.742Z] \nWaiting for Kibana to start...
[2020-05-25T10:47:59.742Z] + echo 'Note: you need to start Kibana manually. Find the instructions at the top.'
[2020-05-25T10:47:59.742Z] Note: you need to start Kibana manually. Find the instructions at the top.
[2020-05-25T10:47:59.742Z] + yarn wait-on -i 500 -w 500 http-get://admin:changeme@localhost:5701/api/status
[2020-05-25T10:52:06.623Z] + '[' -e kibana.log ']'
[2020-05-25T10:52:06.624Z] + grep -m 1 'http server running' /dev/fd/63
[2020-05-25T10:52:06.624Z] ++ tail -f -n +1 kibana.log
[2020-05-25T10:52:06.624Z] {"type":"log","@timestamp":"2020-05-25T10:51:55+00:00","tags":["info","http","server","Kibana"],"pid":5240,"message":"http server running at http://localhost:5701"}
[2020-05-25T10:52:06.624Z] + grep -m 1 'http server running' kibana.log
[2020-05-25T10:52:06.624Z] {"type":"log","@timestamp":"2020-05-25T10:51:55+00:00","tags":["info","http","server","Kibana"],"pid":5240,"message":"http server running at http://localhost:5701"}
[2020-05-25T10:52:06.624Z] + echo '\n✅ Kibana server running...\n'
[2020-05-25T10:52:06.624Z] \n✅ Kibana server running...\n
[2020-05-25T10:52:06.624Z] + echo '\n✅ Setup completed successfully. Running tests...\n'

Probably too verbose, but there are three important timestamps:

  • [2020-05-25T10:47:59.742Z] + yarn wait-on -i 500 -w 500 http-get://admin:changeme@localhost:5701/api/status
  • [2020-05-25T10:52:06.623Z] + '[' -e kibana.log ']'
  • [2020-05-25T10:52:06.624Z] {"type":"log","@timestamp":"2020-05-25T10:51:55+00:00","tags":["info","http","server","Kibana"],"pid":5240,"message":"http server running at http://localhost:5701"}

In other words, the yarn wait-on api/status is equally the same that grep "http server running", and I'm afraid the bundles are not yet available :)

So the retry/sleep seems to be approach to run to bypass this particular issue in our end.

@sorenlouv
Copy link
Member Author

sorenlouv commented May 25, 2020

@v1v Thanks for looking into this!
Let's move the discussion about the e2e specific problems to a separate issue to not clutter this thread.

Edit: Oh, I misread the first part. I got the impression that we can reliably determine when Kibana is ready to service requests. But it sounds like that is still not trivial, even if we parse the logs?

So the retry/sleep seems to be approach to run to bypass this particular issue in our end.

We can go down this road if really necessary but would obviously be much better if we can find a way to programatically determine when Kibana is ready.

What about waiting for the string bundles compiled successfully after?

@pgayvallet
Copy link
Contributor

Fixed by #159768 for production usages. Not sure about the dev-environment use case given the dev-server is on front of Kibana, but given the operation team tag was removed from the issue, I'll assume we're only talking about production use cases here and close the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc
Projects
None yet
Development

No branches or pull requests

7 participants