@adschwartz fysa

I have had an idea today which might or might not work well. Is this idea something that could be done?
Running the following test on an 8-node cluster:

```json
{
"participants": [
{
"el_client_type": "geth",
"el_client_image": "ethpandaops/geth:master",
"cl_client_type": "teku",
"cl_client_image": "consensys/teku:develop",
"bn_max_cpu": 3000,
"bn_max_mem": 4096,
"count": 3
},
{
"el_client_type": "geth",
"el_client_image": "ethpandaops/geth:master",
"cl_client_type": "lighthouse",
"cl_client_image": "sigp/lighthouse:latest",
"bn_max_cpu": 3000,
"bn_max_mem": 4096,
"v_max_cpu": 3000,
"v_max_mem": 2048,
"count": 1
},
{
"el_client_type": "geth",
"el_client_image": "ethpandaops/geth:master",
"cl_client_type": "lodestar",
"cl_client_image": "chainsafe/lodestar:latest",
"bn_max_cpu": 3000,
"bn_max_mem": 4096,
"v_max_cpu": 3000,
"v_max_mem": 2048,
"count": 1
},
{
"el_client_type": "geth",
"el_client_image": "ethpandaops/geth:master",
"cl_client_type": "prysm",
"cl_client_image": "prysmaticlabs/prysm-beacon-chain:latest,prysmaticlabs/prysm-validator:latest",
"beacon_extra_params": ["--grpc-max-msg-size=18388608"],
"validator_extra_params": ["--grpc-max-msg-size=18388608"],
"bn_max_cpu": 3000,
"bn_max_mem": 4096,
"v_max_cpu": 3000,
"v_max_mem": 2048,
"count": 1
},
{
"el_client_type": "geth",
"el_client_image": "ethpandaops/geth:master",
"cl_client_type": "nimbus",
"cl_client_image": "statusim/nimbus-eth2:amd64-latest",
"v_max_cpu": 3000,
"v_max_mem": 2048,
"count": 1
}
],
"network_params": {
"network_id": "3151908",
"deposit_contract_address": "0x4242424242424242424242424242424242424242",
"seconds_per_slot": 12,
"slots_per_epoch": 32,
"genesis_delay": 1800,
"capella_fork_epoch": 2,
"deneb_fork_epoch": 1000,
"num_validator_keys_per_node": 10000,
"preregistered_validator_keys_mnemonic": "giant issue aisle success illegal bike spike question tent bar rely arctic volcano long crawl hungry vocal artwork sniff fantasy very lucky have athlete"
},
"launch_additional_services": true,
"wait_for_finalization": false,
"wait_for_verifications": false,
"verifications_epoch_limit": 5,
"global_client_log_level": "info"
}
```

Results in the following error:

```
Command returned with exit code '0' and the following output:
--------------------
starting at time.struct_time(tm_year=2023, tm_mon=7, tm_mday=31, tm_hour=11, tm_min=5, tm_sec=39, tm_wday=0, tm_yday=212, tm_isdst=0)
executing eth2-val-tools keystores --insecure --prysm-pass password --out-loc /node-3-keystores --source-mnemonic "giant issue aisle success illegal bike spike question tent bar rely arctic volcano long crawl hungry vocal artwork sniff fantasy very lucky have athlete" --source-min 30000 --source-max 40000
Error occurred while executing: eth2-val-tools keystores --insecure --prysm-pass password --out-loc /node-3-keystores --source-mnemonic "giant issue aisle success illegal bike spike question tent bar rely arctic volcano long crawl hungry vocal artwork sniff fantasy very lucky have athlete" --source-min 30000 --source-max 40000
Error output:
runtime: program exceeds 10000-thread limit
fatal error: thread exhaustion
runtime stack:
runtime.throw({0x73a5fa?, 0x47dd80?})
/usr/local/go/src/runtime/panic.go:1047 +0x5d fp=0x7fbf95511cb8 sp=0x7fbf95511c88 pc=0x451e1d
runtime.checkmcount()
/usr/local/go/src/runtime/proc.go:766 +0x8c fp=0x7fbf95511ce0 sp=0x7fbf95511cb8 pc=0x455b2c
runtime.mReserveID()
/usr/local/go/src/runtime/proc.go:782 +0x36 fp=0x7fbf95511d08 sp=0x7fbf95511ce0 pc=0x455b76
runtime.startm(0xc00002cf00, 0x0)
/usr/local/go/src/runtime/proc.go:2318 +0x92 fp=0x7fbf95511d50 sp=0x7fbf95511d08 pc=0x4589d2
runtime.handoffp(0xffffffff?)
/usr/local/go/src/runtime/proc.go:2361 +0x2ee fp=0x7fbf95511d78 sp=0x7fbf95511d50 pc=0x458eee
runtime.retake(0x1cc3f15dd4)
/usr/local/go/src/runtime/proc.go:5351 +0x1d5 fp=0x7fbf95511db8 sp=0x7fbf95511d78 pc=0x45fc75
runtime.sysmon()
/usr/local/go/src/runtime/proc.go:5259 +0x325 fp=0x7fbf95511e28 sp=0x7fbf95511db8 pc=0x45f9a5
runtime.mstart1()
/usr/local/go/src/runtime/proc.go:1426 +0x93 fp=0x7fbf95511e50 sp=0x7fbf95511e28 pc=0x457313
runtime.mstart0()
/usr/local/go/src/runtime/proc.go:1383 +0x79 fp=0x7fbf95511e80 sp=0x7fbf95511e50 pc=0x457259
runtime.mstart()
/usr/local/go/src/runtime/asm_amd64.s:390 +0x5 fp=0x7fbf95511e88 sp=0x7fbf95511e80 pc=0x47dd85
```
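For context on the numbers in that failing command: with `num_validator_keys_per_node` set to 10000, each node is assigned a contiguous slice of the mnemonic's key space, which is why node 3 gets `--source-min 30000 --source-max 40000`. A minimal sketch of that arithmetic (illustrative only, not the package's actual code; the `key_range` helper name is made up):

```python
# Illustrative only: how per-node ranges like --source-min 30000 --source-max 40000
# fall out of num_validator_keys_per_node; not the package's actual code.
NUM_VALIDATOR_KEYS_PER_NODE = 10000

def key_range(node_index: int, keys_per_node: int = NUM_VALIDATOR_KEYS_PER_NODE):
    """Return the (source_min, source_max) pair eth2-val-tools would be given."""
    start = node_index * keys_per_node   # keystore start index for this node
    stop = start + keys_per_node         # keystore stop index
    return start, stop

print(key_range(3))  # (30000, 40000), matching the node-3 command above
```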
Fix the keystore_stop_index (6db2e07)
It looks like the secret files are no longer populated on the pods. |
Interesting. I only tried Docker; haven't tried this on k8s. I'm away till Wednesday (back Thursday), but I really like your multi-node key generation idea. That scales pretty well and we can reduce the time down to minutes, especially if we can do one container per node for which we need to generate keystores.
Have implemented #82 (a container per keystore generator). I had closed this in favor of that.
This PR shows how we can possibly multi-thread key generation. It is badly written and needs to be rewritten; this is just a proof of concept.
Running 10 nodes with 10000 keys per node, this takes about 8 minutes and 23 seconds to generate the keys, compared to the previous implementation which took about 55 seconds per node (9 minutes 10 seconds for all 10). The improvement isn't as dramatic as I would have hoped. Tried again with the previous way of doing 10x10000; it took 8 minutes and 30 seconds. Perhaps this is because Python threads give IO parallelism rather than CPU parallelism; will try multiprocessing. Multiprocessing (latest commit) took 8 minutes 17 seconds. With 5 cores assigned, whether I run a single load, multithreading, or multiprocessing, I constantly hit 500% CPU; this is a compute-heavy operation. I wonder how much I can really squeeze out by parallelizing things.
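For reference, here is a minimal sketch of the multiprocessing approach described above, assuming the work is parallelized by giving each worker process one node's key range and shelling out to `eth2-val-tools` for it (the pool size, output paths, and the `generate_range` helper are illustrative assumptions, not the PR's actual script; `eth2-val-tools` must be on the PATH):

```python
# Sketch: generate each node's keystores in a separate process, bounded by a small pool.
import subprocess
from concurrent.futures import ProcessPoolExecutor

MNEMONIC = "giant issue aisle success ..."  # truncated here; the full mnemonic is in the config above
KEYS_PER_NODE = 10000
NUM_NODES = 10
MAX_WORKERS = 5  # analogous to max_concurrent_threads in the PR's script

def generate_range(node_index: int) -> int:
    """Shell out to eth2-val-tools for one node's slice of the key space."""
    start = node_index * KEYS_PER_NODE
    stop = start + KEYS_PER_NODE
    cmd = [
        "eth2-val-tools", "keystores", "--insecure",
        "--prysm-pass", "password",
        "--out-loc", f"/node-{node_index}-keystores",
        "--source-mnemonic", MNEMONIC,
        "--source-min", str(start),
        "--source-max", str(stop),
    ]
    return subprocess.run(cmd, check=True).returncode

if __name__ == "__main__":
    # The heavy lifting happens inside the external eth2-val-tools processes, so the
    # Python-level pool mainly bounds how many of them run at once; this is also why
    # threads vs. processes makes little difference once the assigned CPUs are saturated.
    with ProcessPoolExecutor(max_workers=MAX_WORKERS) as pool:
        list(pool.map(generate_range, range(NUM_NODES)))
```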
These numbers are from my M1 Pro (32 GB), where I have given Docker 5 CPUs and 11.65 GB of memory. Note I have several other applications running, but I would expect a similar improvement on your end. Perhaps we can even add `min_cpu` && `min_memory` to the configuration when we launch this on K8s.

Other things being considered: generating a bunch of keys up front and allowing a user to upload a few files artifacts, then pass the artifact ids and skip key generation altogether. This will have some limitations. Currently the following are configurable:

- `mnemonic` - if we have a cache this won't be configurable
- `prysm-pass` - if we have a cache this won't be configurable
- `num_keys_per_node` - currently this is configurable; if we make it static, we can't configure this. We could have different bundles stored somewhere, though, or have a "bring your own keystore" API.

@barnabasbusa Can you give this a spin and see if you get any improvements on your k8s workflow? Further, see how much you can play with `max_concurrent_threads = 10` on line 44 of the Python script to see how far you can get. When I tried running all 50 nodes (10000 keys each) together, that failed. Perhaps your k8s boxes are bigger and you can try a higher count.

Imagining the bring-your-own-bundle API even more, I can imagine it looking something like the following.
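A purely hypothetical illustration of what such a participant entry could look like, shown as a Python/Starlark-style dict; the `keystore_bundle_artifact_id` field is invented for this sketch and does not exist in the package:

```python
# Hypothetical "bring your own keystore bundle" participant entry (not a real option today).
participant_with_prebuilt_keys = {
    "el_client_type": "geth",
    "cl_client_type": "lighthouse",
    # ID of a files artifact, uploaded ahead of time, that already contains the
    # generated keystores, so key generation can be skipped entirely.
    "keystore_bundle_artifact_id": "my-prebuilt-keystores",  # hypothetical field
    "count": 1,
}
```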
How does this sound?
Relevant issue #78
To try out this branch, run:

```
kurtosis run github.com/kurtosis-tech/eth-network-package@gyani/speed-up
```
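If the network parameters from the top of this thread are needed as well, they can presumably be passed as the package's JSON args in the usual Kurtosis fashion, e.g. `kurtosis run github.com/kurtosis-tech/eth-network-package@gyani/speed-up "$(cat params.json)"`, where `params.json` is a hypothetical file holding that config.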