This repository has been archived by the owner on Apr 19, 2021. It is now read-only.

Disable Disk-based Shard Allocation #70

Open. Reedtechno wants to merge 1 commit into master.


Reedtechno

Disk space that is allocated to Elasticsearch is controlled by securityonion.conf and curator. If Disk-based Shard Allocation is enabled, it leads to disk watermark errors and indices being locked as read-only when disk usage hits 90%. Since Security Onion configures each Elasticsearch instance as a single-node cluster, the index can never be moved to another node. As a result, new data is not ingested into Elasticsearch and is lost. This only becomes a problem with larger disks, when users want to use more than 90% of their disk.

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/6.7/disk-allocator.html

The documentation currently mentions this error but does not address why it happens or how to prevent it. It is also missing the last recovery step: a curl command that sets "index.blocks.read_only_allow_delete": null on the affected indices.
Documentation page: https://github.com/Security-Onion-Solutions/security-onion/wiki/Logstash
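For illustration, the missing recovery step could be sketched in Python as below. This is a minimal sketch, not the command from the Security Onion docs: the base URL `http://localhost:9200`, the index pattern `logstash-*`, and the helper name `build_unblock_request` are all assumptions. It only builds the request that resets `index.blocks.read_only_allow_delete` to null through the index settings endpoint; sending it is left to the caller.

```python
import json
import urllib.request


def build_unblock_request(index_pattern, base_url="http://localhost:9200"):
    """Build a PUT request that clears the read-only block on matching indices.

    base_url and index_pattern are assumptions for this sketch; adjust them
    to the local Elasticsearch instance and the affected indices.
    """
    # JSON null resets the setting, which removes the block.
    body = json.dumps({"index.blocks.read_only_allow_delete": None})
    return urllib.request.Request(
        url=f"{base_url}/{index_pattern}/_settings",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = build_unblock_request("logstash-*")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it (requires a reachable Elasticsearch instance):
    # urllib.request.urlopen(req)
```

The block only comes off for good once enough disk space has been freed; otherwise Elasticsearch will apply it again.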

Disk space that is allocated to Elasticsearch is controlled by securityonion.conf and curator. If Disk-based Shard Allocation is enabled, it leads to disk watermark errors and indices being locked as read-only. Since Security Onion configures each Elasticsearch instance as a single-node cluster, the index can never be moved to another node. As a result, new data is not ingested into Elasticsearch and is lost.
@dougburks

Hi @Reedtechno ,

Thanks for the PR. Per our discussion today, it's probably best to keep disk-based shard allocation enabled because we really don't want to let the partition hit 100% disk usage as that might cause other (larger) problems.

It's worth noting that Setup defaults LOG_SIZE_LIMIT to 50% of your disk space. Depending on the options chosen during Setup, it may ask you if you want to change that default. Perhaps we just need to add a note to that screen reminding the user that the value should be less than 90%.

Thoughts?

@Reedtechno
Author

Hey @dougburks ,

That reasoning makes sense to me. What do you think about setting the high watermark as an actual size rather than a percentage, then? If a user is running a storage node with 1TB of storage, the default setup right now will not allow them to store more than 899GB of data in Elasticsearch. Would something like the settings below be a one-size-fits-most solution?

"cluster.routing.allocation.disk.watermark.low": "50gb",
"cluster.routing.allocation.disk.watermark.high": "15gb",
"cluster.routing.allocation.disk.watermark.flood_stage": "10gb",
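As a rough sketch of how these three values could be packaged for the cluster settings API, the Python below builds the request body and checks the ordering. When watermarks are given as byte values they describe the free space that must remain, so low has to be the largest and flood_stage the smallest. The helper names and the choice of the `persistent` scope are assumptions for illustration, and the parser only handles the `gb` suffix used in the proposal.

```python
import json

PREFIX = "cluster.routing.allocation.disk.watermark."
ORDER = ("low", "high", "flood_stage")


def gb(value):
    """Parse a size such as '50gb' into a float (only 'gb' is handled here)."""
    if not value.lower().endswith("gb"):
        raise ValueError(f"expected a value ending in 'gb', got {value!r}")
    return float(value[:-2])


def build_watermark_settings(low, high, flood_stage, scope="persistent"):
    """Return a cluster settings body with absolute (free-space) watermarks."""
    # Absolute values are free space, so the thresholds must shrink in order.
    if not gb(low) > gb(high) > gb(flood_stage):
        raise ValueError("free-space watermarks must satisfy low > high > flood_stage")
    values = {"low": low, "high": high, "flood_stage": flood_stage}
    return {scope: {PREFIX + name: values[name] for name in ORDER}}


if __name__ == "__main__":
    body = build_watermark_settings("50gb", "15gb", "10gb")
    print(json.dumps(body, indent=2))
```

The resulting body would be sent with a PUT to `/_cluster/settings`. Percentage and byte values cannot be mixed across these three settings, which is why all of them are given as sizes.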

I feel like the 90% disk space is hard to plan for during initial setup because that is also impacted by system files and any other stuff on the disk. Being able to say in docs or setup that stuff will break if available disk space falls below X could clear it up a little and also still provide the protection from filling up the disk.
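To make the trade-off concrete, here is a small arithmetic sketch comparing how much of a disk can fill before the high watermark is reached under each style of setting. The 1000 GB figure stands in for the 1TB example, and the function names are hypothetical. Both numbers are total disk usage, so system files and anything else on the partition count against them.

```python
def usable_gb_percent(disk_gb, high_percent):
    """Disk usage (GB) at which a percentage-based high watermark is reached."""
    return disk_gb * high_percent / 100


def usable_gb_absolute(disk_gb, min_free_gb):
    """Disk usage (GB) at which a free-space high watermark is reached."""
    return disk_gb - min_free_gb


if __name__ == "__main__":
    disk_gb = 1000  # assumed size of the 1TB storage node
    print("90% watermark: ", usable_gb_percent(disk_gb, 90), "GB")
    print("15gb watermark:", usable_gb_absolute(disk_gb, 15), "GB")
```

Under these assumptions the percentage setting stops at 900 GB of usage while the 15gb setting allows 985 GB, and the gap grows with disk size.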
