ES performance optimization #5
I'm using Elasticsearch 1.5, and it works perfectly most of the time, but every day at the same time it goes crazy: CPU usage climbs to ~70% when the average is around 3-5%. These are powerful servers with 32GB reserved for Lucene, swap is locked (mlockall), and clearing the cache doesn't solve the problem (it doesn't bring the heap memory down).
Settings:
3 servers (nodes), 32 cores and 128GB RAM each
2 buckets (indices): one with ~18 million documents (this one doesn't receive updates very often, mostly just new docs being indexed); the other has around 7-8 million documents, but we constantly bombard it with updates, searches, deletes, and indexing as well
The best distribution for our structure was to have only 1 shard per node with no replicas. We can afford to have a percentage of the data offline for a few seconds; it comes back as soon as the server is online again, and this process is fast since nothing needs to be relocated. Previously we had 3 shards with 1 replica, but the issue mentioned above occurred as well, so it's easy to figure out that the problem is not related to the distribution.
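Concretely, the index creation would look something like this (the index name my_index is a placeholder; only the shard/replica counts come from our setup, and with 3 shards and 0 replicas across 3 nodes each node holds exactly one shard):

```
curl -XPUT 'http://localhost:9200/my_index' -d '{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 0
  }
}'
```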
Things that I have already tried:
Merging: I tried using the Optimize API to take some load off the scheduled merges, but the merging process generates a lot of disk R/W while not substantially affecting memory or CPU load.
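For reference, the Optimize API call I mean is the 1.x endpoint below (my_index is a placeholder; max_num_segments=1 forces a full merge down to a single segment, which is the heaviest variant):

```
curl -XPOST 'http://localhost:9200/my_index/_optimize?max_num_segments=1'
```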
Flushing: I tried flushing with both long and short intervals, and the results were the same; nothing changed. Flushing directly affects the merging process, and as mentioned above, merging doesn't take that much CPU or memory.
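For completeness, varying the interval and forcing a flush look roughly like this in 1.x (my_index and the 30m value are placeholders):

```
curl -XPUT 'http://localhost:9200/my_index/_settings' -d '{
  "index.translog.flush_threshold_period": "30m"
}'
curl -XPOST 'http://localhost:9200/my_index/_flush'
```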
Managing the cache: clearing it manually, but that doesn't seem to bring the CPU load back to a normal state, not even for a moment.
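The manual clear I'm referring to is the clear cache API (it can also be scoped to a single index, e.g. /my_index/_cache/clear):

```
curl -XPOST 'http://localhost:9200/_cache/clear'
```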
Here are most of the elasticsearch.yml configs:
# Force all memory to be locked, forcing the JVM to never swap
bootstrap.mlockall: true

# Threadpool Settings
# Search pool
threadpool.search.type: fixed
threadpool.search.size: 20
threadpool.search.queue_size: 200

# Bulk pool
threadpool.bulk.type: fixed
threadpool.bulk.size: 60
threadpool.bulk.queue_size: 3000

# Index pool
threadpool.index.type: fixed
threadpool.index.size: 20
threadpool.index.queue_size: 1000

# Indices settings
indices.memory.index_buffer_size: 30%
indices.memory.min_shard_index_buffer_size: 12mb
indices.memory.min_index_buffer_size: 96mb

# Cache Sizes
indices.fielddata.cache.size: 30%
#indices.fielddata.cache.expire: 6h  # will be deprecated & devs recommend not to use it
indices.cache.filter.size: 30%
#indices.cache.filter.expire: 6h  # will be deprecated & devs recommend not to use it

# Indexing Settings for Writes
index.refresh_interval: 30s
#index.translog.flush_threshold_ops: 50000
#index.translog.flush_threshold_size: 1024mb
index.translog.flush_threshold_period: 5m
index.merge.scheduler.max_thread_count: 1
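To double-check which of these settings each node actually picked up, the nodes info API can be queried (1.x endpoint):

```
curl -XGET 'http://localhost:9200/_nodes/settings?pretty'
```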
Here are the node stats when the server is in a normal state:
node_stats_normal.txt
And the node stats during the problem:
node_stats.txt
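Stats like these can be pulled with the node stats API; during a spike, the hot threads API is also worth capturing, since it shows exactly what is burning CPU (both endpoints exist in 1.x):

```
curl -XGET 'http://localhost:9200/_nodes/stats?pretty' > node_stats.txt
curl -XGET 'http://localhost:9200/_nodes/hot_threads'
```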
I would appreciate any help or discussion that can point me in the right direction to get rid of this behavior.
Thanks in advance.
Regards,
Daniel
Originally posted by @ACV2 in elastic/elasticsearch#4288 (comment)