[BUG] High CPU usage with flat_object field type #12030

Closed
waqarsky opened this issue Jan 26, 2024 · 7 comments

@waqarsky

waqarsky commented Jan 26, 2024

Describe the bug

3-node cluster without dedicated master nodes.

  "version" : {
    "distribution" : "opensearch",
    "number" : "2.8.0",
    "build_type" : "tar",
    "build_hash" : "db90a415ff2fd428b4f7b3f800a51dc229287cb4",
    "build_date" : "2023-06-03T06:24:11.331474107Z",
    "build_snapshot" : false,
    "lucene_version" : "9.6.0",
    "minimum_wire_compatibility_version" : "7.10.0",
    "minimum_index_compatibility_version" : "7.0.0"
  }

Mapping template snippet:

"hashicorp": {
    "properties": {
        "vault": {
            "properties": {
                "audit": {
                    "properties": {
                        "request": {
                            "properties": {
                                "headers": {
                                    "type": "flat_object"
                                },
                                "data": {
                                    "type": "flat_object"
                                },

hot_threads output

   100.5% (502.3ms out of 500ms) cpu usage by thread 'opensearch[{{HOSTNAME REDACTED}}][write][T#13]'
     10/10 snapshots sharing following 42 elements
       app//org.opensearch.index.mapper.ParseContext$Document.getFields(ParseContext.java:146)
       app//org.opensearch.index.mapper.FlatObjectFieldMapper.parseValueAddFields(FlatObjectFieldMapper.java:631)
       app//org.opensearch.index.mapper.FlatObjectFieldMapper.parseCreateField(FlatObjectFieldMapper.java:561)
       app//org.opensearch.index.mapper.FieldMapper.parse(FieldMapper.java:270)
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:522)
       app//org.opensearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:540)
       app//org.opensearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:442)
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:414)
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:519)
       app//org.opensearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:540)
       app//org.opensearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:442)
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:414)
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:519)
       app//org.opensearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:540)
       app//org.opensearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:442)
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:414)
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:519)
       app//org.opensearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:540)
       app//org.opensearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:442)
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:414)
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:519)
       app//org.opensearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:540)
       app//org.opensearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:442)
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:414)
       app//org.opensearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:136)
       app//org.opensearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:91)
       app//org.opensearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:253)
       app//org.opensearch.index.shard.IndexShard.prepareIndex(IndexShard.java:1029)
       app//org.opensearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:986)
       app//org.opensearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:909)
       app//org.opensearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:621)
       app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:467)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
       app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:531)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:412)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:124)
       app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:223)
       app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
       java.base@[version]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@[version]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@[version]/java.lang.Thread.run(Thread.java:833)
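
When a hot_threads report covers many threads, it can help to tally which thread pools are busy. The following is a minimal Python sketch, assuming the header line format shown in the output above; the function name `hot_pools` and the `min_pct` threshold are illustrative and not part of OpenSearch.

```python
import re
from collections import Counter

# Matches the per-thread header line of a hot_threads report, e.g.
# "100.5% (502.3ms out of 500ms) cpu usage by thread 'opensearch[node][write][T#13]'"
HEADER = re.compile(
    r"(?P<pct>\d+(?:\.\d+)?)% \([^)]*\) cpu usage by thread '(?P<thread>[^']+)'"
)
# Thread names are assumed to end in "[<pool>][T#<n>]"; capture <pool>.
POOL = re.compile(r"\]\[(?P<pool>[^\]\[]+)\]\[T#\d+\]$")


def hot_pools(report: str, min_pct: float = 50.0) -> Counter:
    """Count threads at or above min_pct CPU, grouped by thread pool name."""
    counts = Counter()
    for line in report.splitlines():
        match = HEADER.search(line)
        if not match or float(match.group("pct")) < min_pct:
            continue
        pool = POOL.search(match.group("thread"))
        # Fall back to the full thread name if it does not follow the pool pattern.
        counts[pool.group("pool") if pool else match.group("thread")] += 1
    return counts
```

Run against the report above, this would attribute the hot thread to the `write` pool, which matches the bulk indexing frames in the stack trace.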

All 3 nodes show 100% CPU usage, each with the same hot_threads report. The data coming into OpenSearch has some fields mapped as "flat_object".
The issue stops when the mapping is changed to, e.g., keyword: CPU falls from 100% back to ~3% with the same rate of incoming events and searches.
Visualised across several days.

Changing the mapping back to flat_object confirmed the theory: CPU shot back to 100% once the index rolled over and started using flat_object again.

One other very important thing to note: while this high CPU was occurring, a number of duplicate documents were noticed in OpenSearch. They were identical except for their IDs.
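
To check for duplicates like these, one option is to group documents by a hash of their content and list the groups that contain more than one ID. The sketch below is a rough illustration, assuming hits shaped like search response entries with `_id` and `_source`; the function name `find_duplicates` is made up for this example, and it says nothing about why the duplicates appeared.

```python
import hashlib
import json
from collections import defaultdict


def find_duplicates(hits):
    """Return lists of _id values whose _source content is identical."""
    groups = defaultdict(list)
    for hit in hits:
        # Canonical JSON (sorted keys) so that key order does not change the hash.
        canonical = json.dumps(hit["_source"], sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        groups[digest].append(hit["_id"])
    # Keep only content that appears under more than one ID.
    return [ids for ids in groups.values() if len(ids) > 1]
```

For a large index the hits would need to be paged through rather than loaded at once; that part is left out here.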

Related component

Indexing:Performance

To Reproduce

Use flat_object like in snippet above -> CPU 100%
Change flat_object to keyword -> CPU 3%
Change back to flat_object -> CPU 100%
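
For anyone trying to reproduce the steps above, the two mapping variants can be generated from the same field path so that only the leaf type differs. This is a hedged Python sketch: the field path comes from the mapping snippet in this issue, while the `build_mappings` helper and the create-index body shape (`{"mappings": ...}`) are assumptions for illustration, since the original setup used an index template that is only partly shown.

```python
import json

# Object path and leaf fields taken from the mapping snippet in this issue.
PATH = ["hashicorp", "vault", "audit", "request"]
LEAVES = ["headers", "data"]


def build_mappings(leaf_type):
    """Build a create-index style body where every leaf field uses leaf_type."""
    node = {"properties": {name: {"type": leaf_type} for name in LEAVES}}
    # Wrap from the innermost object outwards.
    for name in reversed(PATH):
        node = {"properties": {name: node}}
    return {"mappings": node}


if __name__ == "__main__":
    for leaf_type in ("flat_object", "keyword"):
        print(json.dumps(build_mappings(leaf_type), indent=2))
```

Indexing the same documents into an index created with each variant should make the CPU difference comparable on a test cluster.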

Expected behavior

Not high CPU

Additional Details

No response

waqarsky added the bug and untriaged labels Jan 26, 2024
@reta
Collaborator

reta commented Jan 26, 2024

Ah ... this is so sad: the problem was reported on the forum [1], but the user never created an issue for it, so it got lost. Thank you @waqarsky for filing this one.

[1] https://forum.opensearch.org/t/high-cpu-usage-with-several-write-threads-pools-in-queue/15642

@mgodwan
Member

mgodwan commented Jan 27, 2024

Wondering whether this still occurs after the change made as part of #7835.

@waqarsky
Author

@reta @mgodwan Upgraded to 2.11.1, CPU seems to have gone down after a full restart. Keeping an eye on the performance, will update if anything changes.

1 - were the duplicate documents part of this bug and how so?
2 - Seeing this error below on Dashboards after the upgrade
[image: screenshot of the Dashboards error, not preserved]

@reta
Collaborator

reta commented Jan 30, 2024

> @reta @mgodwan Upgraded to 2.11.1, CPU seems to have gone down after a full restart. Keeping an eye on the performance, will update if anything changes.

👍

> 1 - were the duplicate documents part of this bug and how so?

I am not aware of this side effect

> 2 - Seeing this error below on Dashboards after the upgrade

A few more details (server logs?) would help, thank you.

@waqarsky
Author

@reta Nothing in the logs that is related, however have found this #11491

@reta
Collaborator

reta commented Jan 31, 2024

> @reta Nothing in the logs that is related, however have found this #11491

Thank you @waqarsky, so it should be fixed in 2.12.0 (due in Feb).

@peternied
Member

[Triage - attendees 1 2 3 4 5 6 7 8]
@waqarsky Thanks for filing, it looks like this issue has been addressed. Please reopen if you see it in 2.12+
