ReadableKVStateBase.readCache is not properly synchronized #10127

Closed
OlegMazurov opened this issue Nov 27, 2023 · 1 comment · Fixed by #11112

Labels
Modularization Issues or PRs related to modularization P1 High priority issue, which must be completed in the milestone otherwise the release is at risk.

Comments

@OlegMazurov
Contributor

Description

The following exception is thrown by a node when under heavy load:

2023-11-27 20:33:31.143 ERROR 133  PreHandleWorkflowImpl - Unexpected error while pre handling a transaction!
java.lang.ClassCastException: class java.util.HashMap$Node cannot be cast to class java.util.HashMap$TreeNode (java.util.HashMap$Node and java.util.HashMap$TreeNode are in module java.base of loader 'bootstrap')
        at java.base/java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1994) ~[?:?]
        at java.base/java.util.HashMap$TreeNode.treeify(HashMap.java:2110) ~[?:?]
        at java.base/java.util.HashMap.treeifyBin(HashMap.java:778) ~[?:?]
        at java.base/java.util.HashMap.putVal(HashMap.java:650) ~[?:?]
        at java.base/java.util.HashMap.put(HashMap.java:618) ~[?:?]
        at com.hedera.node.app.spi.state.ReadableKVStateBase.markRead(ReadableKVStateBase.java:124) ~[app-spi-0.45.0-SNAPSHOT.jar:?]
        at com.hedera.node.app.spi.state.ReadableKVStateBase.get(ReadableKVStateBase.java:70) ~[app-spi-0.45.0-SNAPSHOT.jar:?]
        at com.hedera.node.app.service.token.impl.ReadableAccountStoreImpl.getAccountLeaf(ReadableAccountStoreImpl.java:132) ~[app-service-token-impl-0.45.0-SNAPSHOT.jar:?]
        at com.hedera.node.app.service.token.impl.ReadableAccountStoreImpl.getAccountById(ReadableAccountStoreImpl.java:104) ~[app-service-token-impl-0.45.0-SNAPSHOT.jar:?]
        at com.hedera.node.app.workflows.prehandle.PreHandleWorkflowImpl.preHandleTransaction(PreHandleWorkflowImpl.java:174) ~[HederaNode.jar:?]
        at com.hedera.node.app.workflows.prehandle.PreHandleWorkflowImpl.lambda$preHandle$0(PreHandleWorkflowImpl.java:128) ~[HederaNode.jar:?]
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) [?:?]
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1708) [?:?]
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) [?:?]
        at java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291) [?:?]
        at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:754) [?:?]
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387) [?:?]
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312) [?:?]
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843) [?:?]
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808) [?:?]
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188) [?:?]

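The cause is unsynchronized calls to markRead(), which calls readCache.put() on a plain HashMap, from multiple threads. The stack trace shows these calls arriving from ForkJoinPool worker threads during pre-handle, and HashMap is not safe for concurrent modification.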
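
To illustrate the call pattern, here is a minimal sketch, not the actual ReadableKVStateBase code. The class ReadCacheSketch and the method recordInParallel are hypothetical names invented for this example. Only markRead and readCache come from the stack trace, and their signatures here are assumptions. The sketch backs the cache with a thread-safe map so that concurrent markRead() calls from a parallel stream are safe. Replacing it with a plain HashMap would reintroduce the kind of race reported above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// Hypothetical stand-in for a read cache shared by several pre-handle threads.
class ReadCacheSketch {
    // Assumption: a thread-safe map. A plain HashMap here is what allows the race.
    private final Map<Integer, String> readCache = new ConcurrentHashMap<>();

    // Mirrors the markRead() -> readCache.put() path seen in the stack trace.
    void markRead(int key, String value) {
        readCache.put(key, value);
    }

    int size() {
        return readCache.size();
    }

    // Records `count` distinct keys from a parallel stream, which runs on
    // ForkJoinPool worker threads, and returns the resulting cache size.
    static int recordInParallel(int count) {
        ReadCacheSketch cache = new ReadCacheSketch();
        IntStream.range(0, count)
                .parallel()
                .forEach(i -> cache.markRead(i, "value-" + i));
        return cache.size();
    }

    public static void main(String[] args) {
        // Every distinct key is retained because the map is thread-safe.
        System.out.println(recordInParallel(100_000)); // prints 100000
    }
}
```

With an unsynchronized HashMap the same loop has no guaranteed outcome: entries can be lost, or the internal bin structure can be corrupted, which is consistent with the ClassCastException inside treeify() above.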

Steps to reproduce

The exception was observed when running the NftTransferLoadTest benchmark (during the token creation phase).

Additional context

No response

Hedera network

other

Version

develop

Operating system

Linux

@OlegMazurov OlegMazurov added the Modularization Issues or PRs related to modularization label Nov 27, 2023
@github-project-automation github-project-automation bot moved this to 📋 Backlog in Services Team Nov 27, 2023
@rbair23 rbair23 added the P1 High priority issue, which must be completed in the milestone otherwise the release is at risk. label Nov 28, 2023
@netopyr netopyr moved this from 📋 Backlog to 🏃🏻 Sprint Backlog in Services Team Jan 4, 2024
@OlegMazurov
Contributor Author

#10965 fixes the functional problem, but at a heavy performance cost: it uses Collections.synchronizedMap(), which introduces a heavily contended lock. ConcurrentHashMap should have been used instead.
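
The two options can be compared with a small sketch. MapChoiceSketch and fill are hypothetical names for this example and are not taken from #10965 or #11112. Both maps give correct results under concurrent writes. The difference is that Collections.synchronizedMap() guards every call with a single mutex, while ConcurrentHashMap is designed for concurrent updates and does not lock the whole table on each call.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical comparison harness; not code from the repository.
class MapChoiceSketch {
    // Writes threads * perThread distinct keys into `map` from `threads`
    // worker threads and returns the final size of the map.
    static int fill(Map<Integer, Integer> map, int threads, int perThread)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // disjoint key range per thread
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("writers did not finish in time");
        }
        return map.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // One lock for the whole map: correct, but all writers serialize on it.
        Map<Integer, Integer> coarse = Collections.synchronizedMap(new HashMap<>());
        // Built for concurrent access: writers to different bins rarely block each other.
        Map<Integer, Integer> fine = new ConcurrentHashMap<>();

        System.out.println(fill(coarse, 8, 50_000)); // prints 400000
        System.out.println(fill(fine, 8, 50_000));   // prints 400000
    }
}
```

The sketch checks correctness only. Measuring the contention difference would need a proper benchmark harness such as the load test mentioned in the issue.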

@povolev15 povolev15 self-assigned this Jan 22, 2024
@povolev15 povolev15 moved this from 🏃🏻 Sprint Backlog to 👷🏼‍♀️ In Progress in Services Team Jan 22, 2024
@povolev15 povolev15 moved this from 👷🏼‍♀️ In Progress to 👀 In Review in Services Team Jan 23, 2024
@github-project-automation github-project-automation bot moved this from 👀 In Review to ✅ Done in Services Team Jan 23, 2024