Automatic SSH pooling/multiplexing/... configuration? #215
-
I am not (yet) sure we can really query SSH internals; we may need to scale this empirically. Probably, we need two dynamic variables:
I have tried to study the actual behaviour when opening many connections via SSH, using:

```python
import asyncio
import logging
import sys

import asyncssh

# Route all log output (including asyncssh's DEBUG messages) to stdout.
root_logger = logging.getLogger()
root_logger.setLevel(logging.DEBUG)
asyncssh.set_log_level(logging.DEBUG)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.DEBUG)
handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
root_logger.addHandler(handler)


async def run_client():
    # One separate SSH connection per command.
    async with asyncssh.connect(host='hpc.login.node', username='service-account',
                                client_keys='/var/lib/cobald/.ssh/id_ed25519') as conn:
        try:
            result = await conn.run('scancel 9999999', check=True, input=None)
        except asyncssh.ChannelOpenError:
            print("Something is bad...")
            return
        print(result.stdout, end='')


async def main():
    # Open 29 connections concurrently.
    await asyncio.gather(*(run_client() for _ in range(29)))


try:
    asyncio.run(main())
except (OSError, asyncssh.Error) as exc:
    sys.exit('SSH connection failed: ' + str(exc))
```
Trying the same thing with channels multiplexed over a single shared connection, I can use:

```python
import asyncio
import logging
import sys

import asyncssh

# Same logging setup as above.
root_logger = logging.getLogger()
root_logger.setLevel(logging.DEBUG)
asyncssh.set_log_level(logging.DEBUG)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.DEBUG)
handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
root_logger.addHandler(handler)


async def run_cmd(conn):
    # One channel per command, all multiplexed over the shared connection.
    try:
        result = await conn.run('scancel 9999999', check=True, input=None)
    except asyncssh.ChannelOpenError:
        print("Something is bad...")
        return
    print(result.stdout, end='')


async def main():
    conn = await asyncssh.connect(host='hpc.login.node', username='service-account',
                                  client_keys='/var/lib/cobald/.ssh/id_ed25519')
    # Run 29 commands concurrently over the single connection.
    await asyncio.gather(*(run_cmd(conn) for _ in range(29)))


try:
    asyncio.run(main())
except (OSError, asyncssh.Error) as exc:
    sys.exit('SSH connection failed: ' + str(exc))
```

This reliably breaks after about 10 channels, failing with a channel-open error. So at least we have two "signals" which may show resource exhaustion, or also just connection errors in general, at least in the first case. We may need to dig deeper to find potential internals of SSH which may help us to judge this.
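Given that the failure shows up only once too many channels are in flight, one empirical mitigation is to cap concurrent channels per connection with a semaphore. The following is a minimal sketch, not the actual executor code: a plain `asyncio` simulation where a short sleep stands in for `conn.run()`, and the cap of 8 is an assumed value chosen below the observed ~10-channel failure point (OpenSSH's default `MaxSessions` is 10):

```python
import asyncio

# Assumed cap on concurrent channels per connection; picked below the
# observed ~10-channel failure threshold to leave some headroom.
MAX_CHANNELS = 8


async def run_cmd(semaphore, cmd):
    # Acquire a slot before opening a channel, so at most MAX_CHANNELS
    # commands are in flight on the shared connection at any time.
    async with semaphore:
        # Stand-in for `await conn.run(cmd)`; the sleep simulates latency.
        await asyncio.sleep(0.01)
        return f"done: {cmd}"


async def main():
    semaphore = asyncio.Semaphore(MAX_CHANNELS)
    tasks = [run_cmd(semaphore, f"scancel {i}") for i in range(29)]
    return await asyncio.gather(*tasks)


results = asyncio.run(main())
print(len(results))
```

With the semaphore in place, all 29 commands still complete; they are merely serialized into batches of at most `MAX_CHANNELS` instead of all racing for channels at once.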
-
Based on @olifre's and my tests, I propose the following two-step roadmap:
-
SSH commands currently re-use the same connection, and we just hope that one connection with multiplexing is enough. Depending on the situation, though, we might need to pool several connections, limit multiplexing, or even limit command frequency. Since proper configuration of all this is likely complex and limited by knowledge of the setup, it would be useful if the SSH executor could configure itself automatically.
Which parameters would be useful for us? Which information can we query from ssh itself, and which must we discover ourselves (and how)?
See also #144 on pooling of multiple connections and #145 on multiplexing over one connection.
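The combination of pooling (#144) and bounded multiplexing (#145) could be prototyped along these lines. This is a hypothetical sketch, not the executor's actual API: `SSHPool` and `FakeConnection` are made-up names, `FakeConnection` stands in for an asyncssh connection, and the pool size (3) and per-connection channel cap (8) are exactly the kind of dynamic variables that auto-configuration would have to tune:

```python
import asyncio
import itertools


class SSHPool:
    """Round-robin pool of connections, each with a bounded channel count."""

    def __init__(self, connections, max_channels_per_connection):
        # Pair each connection with its own channel-limiting semaphore.
        self._slots = [
            (conn, asyncio.Semaphore(max_channels_per_connection))
            for conn in connections
        ]
        self._next = itertools.cycle(range(len(self._slots)))

    async def run(self, cmd):
        # Pick the next connection round-robin, then wait for a free
        # channel slot on it before running the command.
        conn, semaphore = self._slots[next(self._next)]
        async with semaphore:
            return await conn.run(cmd)


class FakeConnection:
    # Simulated connection; real code would wrap an asyncssh connection.
    async def run(self, cmd):
        await asyncio.sleep(0.01)
        return f"ok: {cmd}"


async def main():
    pool = SSHPool([FakeConnection() for _ in range(3)],
                   max_channels_per_connection=8)
    return await asyncio.gather(*(pool.run(f"scancel {i}") for i in range(29)))


results = asyncio.run(main())
print(len(results))
```

With 3 connections and a cap of 8 channels each, at most 24 commands run concurrently; the rest queue on the semaphores instead of triggering channel-open failures.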