This repository has been archived by the owner on Apr 19, 2021. It is now read-only.

Disable Disk-based Shard Allocation #70

Open
wants to merge 1 commit into master

Conversation

Reedtechno

Disk space that is allocated to Elasticsearch is controlled by securityonion.conf and curator. If disk-based shard allocation is enabled, it leads to disk watermark errors and indices being locked as read-only when disk usage hits 90%. Since Security Onion configures each Elasticsearch instance as a single-node cluster, the indices can never be moved to another node. This results in data being lost and never ingested into Elasticsearch. This only becomes a problem on larger disks, where users want to utilize more than 90% of their capacity.

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/6.7/disk-allocator.html
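
For context, disk-based shard allocation is controlled by a single dynamic setting, `cluster.routing.allocation.disk.threshold_enabled`. A sketch of disabling it at runtime, assuming Elasticsearch listens on localhost:9200 (the same key could equally be set to `false` in elasticsearch.yml):

```sh
# Disable disk-based shard allocation cluster-wide (persists across restarts)
curl -XPUT -H 'Content-Type: application/json' \
  http://localhost:9200/_cluster/settings \
  -d '{"persistent": {"cluster.routing.allocation.disk.threshold_enabled": false}}'
```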

The documentation currently mentions this error but does not address why it happens or how to prevent it. It is also missing the last recovery step: a curl command that sets "index.blocks.read_only_allow_delete": null on the affected indices.
Documentation page: https://github.com/Security-Onion-Solutions/security-onion/wiki/Logstash
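
As a sketch of that missing recovery step (assuming Elasticsearch listens on localhost:9200; `_all` applies the change to every index):

```sh
# Clear the read-only block that the flood-stage watermark places on indices
curl -XPUT -H 'Content-Type: application/json' \
  http://localhost:9200/_all/_settings \
  -d '{"index.blocks.read_only_allow_delete": null}'
```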

@dougburks

Hi @Reedtechno ,

Thanks for the PR. Per our discussion today, it's probably best to keep disk-based shard allocation enabled because we really don't want to let the partition hit 100% disk usage as that might cause other (larger) problems.

It's worth noting that Setup defaults LOG_SIZE_LIMIT to 50% of your disk space. Depending on the options chosen during Setup, it may ask you if you want to change that default. Perhaps we just need to add a note to that screen reminding the user that the value should be less than 90%.
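
For illustration, a hypothetical securityonion.conf entry (the value here is made up; assuming LOG_SIZE_LIMIT is expressed in gigabytes, the 50% default on a 1TB disk would look roughly like this):

```sh
# /etc/nsm/securityonion.conf -- hypothetical value for a 1TB disk
LOG_SIZE_LIMIT=500
```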

Thoughts?

@Reedtechno
Author

Hey @dougburks ,

That reasoning makes sense to me. What do you think about setting the high watermark as an actual size rather than a percentage, then? If a user is running a storage node with 1TB of storage, the current default setup will not allow them to store more than 899GB of data in Elasticsearch. Would something like the settings below be a one-size-fits-most solution?

"cluster.routing.allocation.disk.watermark.low": "50gb", "cluster.routing.allocation.disk.watermark.high": "15gb", "cluster.routing.allocation.disk.watermark.flood_stage": "10gb",

I feel like the 90% threshold is hard to plan for during initial setup because usable space is also affected by system files and anything else on the disk. Stating in the docs or in Setup that things will break if available disk space falls below X would make this clearer while still protecting against filling up the disk.
