RSK stuck in handling debug_traceTransaction RPC calls #1692
Comments
It would be nice to know which Java classes we should enable debug-level logging for, to get more information about this in the logs.
Hi guys! I see that you're setting some tracing options on your request: { "disableStorage": true, "disableMemory": false, "disableStack": false, "fullStorage": false }
For now, these options are ignored, but we have a PR that will be merged in the near future, in which we start supporting the following tracing options:
Besides that, I will run a couple of tests on my side and keep you posted on my findings. Thanks for reporting this!
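For reference, a minimal sketch of how a request carrying those tracing options could be assembled. The original request body is not preserved in this thread, so the `[tx_hash, options]` parameter layout is an assumption based on the usual `debug_traceTransaction` convention, and `build_trace_request` and `TX_HASH` are hypothetical names used only for illustration.

```python
import json


def build_trace_request(tx_hash, options, request_id=1):
    """Return a JSON-RPC 2.0 body for a debug_traceTransaction call.

    Assumption: params are [transaction hash, options object].
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "debug_traceTransaction",
        "params": [tx_hash, options],
    }


# Tracing options exactly as quoted in the comment above.
options = {
    "disableStorage": True,
    "disableMemory": False,
    "disableStack": False,
    "fullStorage": False,
}

# Placeholder hash (all zeros); replace it with a real transaction hash.
TX_HASH = "0x" + "00" * 32

body = json.dumps(build_trace_request(TX_HASH, options))
print(body)
```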
Hi guys! I have a couple of questions regarding the kind of processing you do to help me reproduce the problem:
Thanks a lot in advance!
Hi! Many thanks for your reply!
To reproduce this issue, it is enough to repeat the same request. It stops responding after a few hours or days. I couldn't find any pattern here.
No, the same transactions. It just continues indefinitely (HTTP timeout error).
We have tried repeating the same request once every 30 seconds and once per second; in either case the RSK node gets stuck after some time. RSK version: v3.1.0
Hi nagarev,
maybe today :) ?
Hi guys! Sorry for the delay! Yes, I've been running different tests and I've found a couple of insights regarding this, but I'm trying to figure it out completely before drawing any conclusions. In the meantime, have you performed any other tests regarding this?
No, just the one described above. Thanks.
Hi @nagarev!
Hi guys! Sorry for the long delay! I couldn't reproduce the issue, but I've done some research; let me show you my results and conclusions. First of all, I checked the response size to see how it changes depending on the filters being set (yes, now we support some filters for the
These are the results:
So you can see that storage is the main contributor to the response size (97.6% of it). Knowing this, I ran some stress tests on the node (basically, I sent this request alongside other really heavy ones over and over, and let it run for a couple of hours) but couldn't observe any hang-ups (I was not using the filters). Maybe you can try to reproduce it again using the new version (which will use the filters you're setting) and see what happens. I don't think I understood the last problem you reported, regarding getBlockNumber: does this method start to fail when the other one is stuck?
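The kind of size breakdown described above (storage accounting for 97.6% of the response) could be measured along the following lines. This is only a sketch: it assumes the trace response contains a `structLogs` list whose entries may carry `storage`, `memory` and `stack` fields, which this thread does not confirm for RSK, and `field_size_share` is a hypothetical helper, not part of rskj.

```python
import json
from collections import Counter


def field_size_share(trace, fields=("storage", "memory", "stack")):
    """Return {field: fraction of the serialized response taken by that field}.

    Assumption: trace["structLogs"] is a list of per-step dicts.
    """
    total = len(json.dumps(trace))  # size of the whole response as JSON text
    sizes = Counter()
    for step in trace.get("structLogs", []):
        for field in fields:
            if field in step:
                sizes[field] += len(json.dumps(step[field]))
    return {field: sizes[field] / total for field in fields}


# Synthetic example data, not taken from a real node.
sample = {
    "structLogs": [
        {"pc": 0, "stack": [], "storage": {"a" * 64: "b" * 64}},
        {"pc": 1, "stack": ["0x1"], "storage": {"a" * 64: "c" * 64}},
    ]
}
for name, share in field_size_share(sample).items():
    print(f"{name}: {share:.1%}")
```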
The RSK node gets stuck while processing a debug_traceTransaction RPC call. Example of request:
This request works fine for some time (from a few hours to a few days), but after that, requests hang while waiting for a response from the RSK node on port 4444 (RPC). Each stuck request leaves an unclosed TCP session on the 4444 socket.
output of
netstat -anp | grep :4444
During this time all other requests work fine (eth_blockNumber, web3_clientVersion, etc.).
But when the number of open connections in the CLOSE_WAIT state reaches 28, the RSK node stops processing any RPC requests:
So it looks like the RSK node stops closing sockets properly while processing the debug_traceTransaction RPC function, which at some point results in a fully nonfunctional RPC interface.
After a restart, the RSK node is able to process RPC calls again.
With TRACE logs enabled for the "execute,blockexecutor,web3,jsonrpc" loggers, we collected logs for a whole day and found that sometimes, while processing debug_traceTransaction, the node gets stuck before the trace log messages "Go transaction 390" and "Finalize transaction 390".
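To watch the CLOSE_WAIT count approach the limit before the RPC stops answering, the netstat output can be tallied per TCP state. The sketch below assumes the usual Linux `netstat -anp` column layout (protocol, Recv-Q, Send-Q, local address, foreign address, state); `count_states` is a hypothetical helper and the sample text is synthetic, not the output captured for this issue.

```python
from collections import Counter


def count_states(netstat_output, port=4444):
    """Count TCP connection states for sockets whose local port is `port`."""
    counts = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        # Skip headers, UDP and unix-socket lines.
        if len(parts) < 6 or not parts[0].startswith("tcp"):
            continue
        local_address, state = parts[3], parts[5]
        if local_address.rsplit(":", 1)[-1] == str(port):
            counts[state] += 1
    return counts


# Synthetic sample in the assumed `netstat -anp` layout.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp6       0      0 :::4444                 :::*                    LISTEN      1234/java
tcp6     129      0 10.0.0.5:4444           10.0.0.9:51234          CLOSE_WAIT  1234/java
tcp6     129      0 10.0.0.5:4444           10.0.0.9:51240          CLOSE_WAIT  1234/java
tcp        0      0 10.0.0.5:22             10.0.0.9:40000          ESTABLISHED 900/sshd
"""

print(dict(count_states(SAMPLE)))
```

On a live host the text could come from `subprocess.run(["netstat", "-anp"], capture_output=True, text=True).stdout`, with an alert raised once `counts["CLOSE_WAIT"]` gets close to the value of 28 reported above.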
Log file for 2022-01-11 attached:
rsklog.tar.gz
So the last trace is here:
rskj/rskj-core/src/main/java/org/ethereum/core/TransactionExecutor.java
Line 415 in 58ce04c
Then the RPC call is closed by a timeout on the client side, and the socket keeps the connection on port 4444 (RPC) in the CLOSE_WAIT TCP state.
Reproducing this issue is quite easy, but it takes some time: either repeatedly executing the debug_traceTransaction RPC call or using the Debugger in Remix.
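A rough sketch of such a reproduction loop follows: it repeats one call at a fixed interval and stops at the first timeout or connection error. The helper names, the URL, and the timeout value are assumptions for illustration; only port 4444 and the 1-second and 30-second intervals come from this thread.

```python
import json
import time
import urllib.request


def rpc_call(url, payload, timeout):
    """POST a JSON-RPC payload and return the decoded response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


def repeat_until_stuck(call, interval, max_calls=None, sleep=time.sleep):
    """Run call() every `interval` seconds until it fails.

    Returns (successful_calls, error). `error` is None when max_calls was
    reached without a failure. OSError covers socket timeouts as well as
    urllib's URLError (for example, connection refused).
    """
    done = 0
    while max_calls is None or done < max_calls:
        try:
            call()
        except OSError as exc:
            return done, exc
        done += 1
        sleep(interval)
    return done, None
```

For example, `repeat_until_stuck(lambda: rpc_call("http://localhost:4444", payload, timeout=60), interval=30)` would keep sending `payload` (the debug_traceTransaction request body) every 30 seconds and report how many calls succeeded before the first hang; the host name and the 60-second timeout here are placeholders.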
Platform: AWS EC2
OS: Ubuntu 18.04, kernel 5.4.0-1059-aws
Instance type: m5.xlarge
RSK Network: Mainnet
RSK node configs:
logback.xml.txt
node.conf.txt
Thanks in advance and lemme know if I can assist somehow.