Better health checks and monitoring for solr service #930

Closed
3 tasks
adborden opened this issue Sep 3, 2019 · 1 comment
Labels
component/catalog Related to catalog component playbooks/roles component/inventory Inventory playbooks/roles

Comments

@adborden
Contributor

adborden commented Sep 3, 2019

User Story

As an operator, I want to know when Solr is returning a high number of errors that might impact the health of Catalog and Inventory so that I am not waiting for Catalog or Inventory to go down before being able to act on the Solr issue.

Details

Solr serves both Catalog and Inventory and is a major backing service for CKAN. If Solr is unavailable or returning errors, Catalog and Inventory are effectively down. Currently, we only learn that Solr is misbehaving when we observe an elevated number of errors in Catalog or Inventory, which might surface as an Uptrends "down" alert.

We currently use New Relic to monitor that the Solr service is running and that the host is up.

Because of how we shard traffic to Solr, a single Solr instance having issues could easily go unnoticed, or show up only as intermittent errors in Catalog and Inventory.
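One way to surface a single bad instance is to probe each Solr host directly instead of relying on errors seen through Catalog or Inventory. The following is a minimal sketch, not our actual monitoring: it assumes each host exposes Solr's ping request handler at `/solr/<core>/admin/ping`, and the host names, core name, and the helper names `ping_url`, `fetch`, and `check_hosts` are hypothetical.

```python
import json
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def ping_url(host: str, core: str) -> str:
    """Build the ping URL for one host (assumes the ping handler is enabled)."""
    return f"http://{host}/solr/{core}/admin/ping?wt=json"


def fetch(url: str, timeout: float = 5.0) -> str:
    """Fetch a URL and return the body as text; raises on network/HTTP errors."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def check_hosts(
    hosts: Iterable[str],
    core: str,
    fetcher: Callable[[str], str] = fetch,
) -> Dict[str, bool]:
    """Return {host: healthy} by pinging every host individually."""
    results: Dict[str, bool] = {}
    for host in hosts:
        try:
            body = json.loads(fetcher(ping_url(host, core)))
            # A healthy ping response reports a top-level status of "OK".
            results[host] = isinstance(body, dict) and body.get("status") == "OK"
        except (OSError, ValueError):
            # Connection failures, HTTP errors, and unparsable bodies
            # all count as unhealthy.
            results[host] = False
    return results
```

Calling `check_hosts(["solr1:8983", "solr2:8983"], "ckan")` (hypothetical hosts and core) yields a per-host result, so one failing instance is visible even when the others still answer.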

Acceptance Criteria

  • Production issues with Solr result in a Critical alert
  • Intermittent errors trigger an alert
  • Issues with a single Solr host trigger an alert
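The criteria above could map onto alert levels roughly as follows. This is only an illustrative sketch: the thresholds (`warn_rate`, `crit_rate`), the `HostStats` type, and the `classify` function are assumptions, and real values would need tuning against observed traffic.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class HostStats:
    """Request and error counts for one Solr host over a time window."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def classify(
    stats: Dict[str, HostStats],
    warn_rate: float = 0.01,   # hypothetical per-host threshold
    crit_rate: float = 0.25,   # hypothetical fleet-wide threshold
) -> str:
    """Return "critical", "warning", or "ok" for the given window."""
    total_requests = sum(s.requests for s in stats.values())
    total_errors = sum(s.errors for s in stats.values())
    overall = total_errors / total_requests if total_requests else 0.0

    # Fleet-wide failure: a production issue, so raise a Critical alert.
    if overall >= crit_rate:
        return "critical"
    # Any single host over the lower threshold: covers both a bad host
    # and intermittent errors that sharding would otherwise hide.
    if any(s.error_rate >= warn_rate for s in stats.values()):
        return "warning"
    return "ok"
```

Evaluating per-host rates separately from the overall rate is what keeps one failing instance from being averaged away by the healthy ones.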
@adborden adborden added component/catalog Related to catalog component playbooks/roles component/inventory Inventory playbooks/roles labels Feb 8, 2020
@nickumia-reisys
Contributor
