list session crash #32

Open
luigi-calori opened this issue Mar 12, 2020 · 2 comments

@luigi-calori

When there is a session on a login node that is no longer available or is hanging, the client either hangs after:

2020-03-11 17:43:59,714 - INFO - Logging...

2020-03-11 17:43:59,815 - INFO - On rcm.galileo.cineca.it run:
module load rcm; python $RCM_HOME/bin/server/rcm_new_server.py --command=config --build_platform='{"platform": "linux_64bit_ubuntu18", "version": "v0.1.1-79-gd868ad9", "checksum": "e29fd23030dcc0ecaccaacd169e89b7b", "client_info": {"screen_width": 1920, "screen_height": 1080}}'

or fails with an error like:

2020-03-11 15:39:00,260 - INFO - Welcome to RCM!

2020-03-11 15:39:05,768 - INFO - Logging...

2020-03-11 15:39:05,774 - INFO - On login.galileo.cineca.it run:
module load rcm; python $RCM_HOME/bin/server/rcm_new_server.py --command=config --build_platform='{"platform": "linux_64bit_ubuntu18", "version": "v0.1.1-2-g1a329c0", "checksum": "139c861df6b53bd2e0040a8d35663573", "client_info": {"screen_width": 1920, "screen_height": 1080}}'

2020-03-11 15:39:11,602 - INFO - Logged as clatini0 to login.galileo.cineca.it

2020-03-11 15:39:11,603 - INFO - Checking if a new client version is available...

2020-03-11 15:39:11,784 - INFO - The client is up-to-date

2020-03-11 15:39:11,785 - INFO - On login.galileo.cineca.it run:
module load rcm; python $RCM_HOME/bin/server/rcm_new_server.py --command=loginlist --subnet='130.186.17'

2020-03-11 15:39:15,201 - INFO - On login03.galileo.cineca.it run:
module load rcm; python $RCM_HOME/bin/server/rcm_new_server.py --command=list --subnet='130.186.17'

2020-03-11 15:39:25,212 - ERROR - Failed to reload the display sessions

2020-03-11 15:39:25,214 - ERROR - timed out

2020-03-11 15:39:25,214 - ERROR - Exception occurred
Traceback (most recent call last):
  File "client/logic/manager.py", line 117, in prex
  File "site-packages/paramiko/client.py", line 343, in connect
  File "site-packages/paramiko/util.py", line 280, in retry_on_signal
  File "site-packages/paramiko/client.py", line 343, in
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "client/gui/thread.py", line 78, in run
  File "client/logic/manager.py", line 162, in list
  File "client/logic/rcm_protocol_client.py", line 52, in wrapper
  File "client/logic/manager.py", line 124, in prex
RuntimeError: timed out

@lferraro

The same problem occurred when the HPC3 cluster was decommissioned. I had probably created a session on it, perhaps one that was already closed, but its files were still present in the same .rcm directory because $HOME is shared between the clusters.

2020-03-17 12:21:10,979 - INFO - Welcome to RCM!

2020-03-17 12:21:46,485 - INFO - Logging...

2020-03-17 12:21:46,523 - INFO - On login06-hpc4.eni.cineca.it run:
module load rcm; python $RCM_HOME/bin/server/rcm_new_server.py --command=config --build_platform='{"platform": "win32_64bit", "version": "v0.1.1-2-g1a329c0", "checksum": "79fc7ac538d174ffeb31c3602bdda42a", "client_info": {"screen_width": 1920, "screen_height": 1080}}'

2020-03-17 12:21:47,192 - INFO - Logged as cibo13 to login06-hpc4.eni.cineca.it

2020-03-17 12:21:47,196 - INFO - Checking if a new client version is available...

2020-03-17 12:21:47,263 - INFO - The client is up-to-date

2020-03-17 12:21:47,264 - INFO - On login06-hpc4.eni.cineca.it run:
module load rcm; python $RCM_HOME/bin/server/rcm_new_server.py --command=loginlist --subnet='130.186.14'

2020-03-17 12:21:48,026 - INFO - On login06-hpc3.eni.cineca.it run:
module load rcm; python $RCM_HOME/bin/server/rcm_new_server.py --command=list --subnet='130.186.14'

2020-03-17 12:21:48,539 - ERROR - Failed to reload the display sessions

2020-03-17 12:21:48,540 - ERROR - Authentication failed.

2020-03-17 12:21:48,540 - ERROR - Exception occurred
Traceback (most recent call last):
  File "client\logic\manager.py", line 117, in prex
  File "site-packages\paramiko\client.py", line 437, in connect
  File "site-packages\paramiko\client.py", line 749, in _auth
  File "site-packages\paramiko\client.py", line 736, in _auth
  File "site-packages\paramiko\transport.py", line 1436, in auth_password
  File "site-packages\paramiko\auth_handler.py", line 236, in wait_for_response
paramiko.ssh_exception.AuthenticationException: Authentication failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "client\gui\thread.py", line 78, in run
  File "client\logic\manager.py", line 162, in list
  File "client\logic\rcm_protocol_client.py", line 52, in wrapper
  File "client\logic\manager.py", line 124, in prex
RuntimeError: Authentication failed.
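
In both tracebacks the whole session listing is aborted because a single login node cannot be reached (a timeout in one case, an authentication failure in the other), and the failure surfaces as a RuntimeError. One possible direction for a fix is to query each node independently and collect failures instead of propagating them. The sketch below only illustrates that idea: `list_sessions_tolerant` and the `list_on_host` callable are hypothetical names and do not come from the RCM code base.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def list_sessions_tolerant(
    hosts: Iterable[str],
    list_on_host: Callable[[str], List[str]],
) -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Query every host and keep going when one of them fails.

    `list_on_host` is a hypothetical stand-in for whatever runs the
    `--command=list` call on a single login node.
    Returns (sessions per reachable host, error message per failed host).
    """
    sessions: Dict[str, List[str]] = {}
    failures: Dict[str, str] = {}
    for host in hosts:
        try:
            sessions[host] = list_on_host(host)
        except (OSError, RuntimeError) as exc:
            # socket.timeout is an OSError subclass; the tracebacks show the
            # client re-raising connection problems as RuntimeError.
            failures[host] = str(exc)
    return sessions, failures
```

With something like this, the GUI could still show the sessions on the reachable nodes and report the unreachable ones as a warning rather than failing the whole reload.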

@lferraro

WORKAROUND: until this bug is fixed, you can delete the information for the session that is no longer reachable (only that session, not everything). It is stored in the $HOME/.rcm folder (Unix) or in C:/Users//.rcm (Windows).
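
To find which entries to remove, a small dry-run helper can list the files under the .rcm folder that mention the unreachable node. This is only a sketch: the layout of the .rcm folder is not described in this thread, so it assumes the session files are readable as text and contain the node's hostname either in the file name or in the contents. `find_stale_session_files` is a hypothetical name. It only reports candidates and deletes nothing.

```python
from pathlib import Path
from typing import List


def find_stale_session_files(rcm_dir: Path, hostname: str) -> List[Path]:
    """Return files under rcm_dir whose name or text mentions hostname.

    Assumption: session files are text and reference the login node by name.
    Nothing is deleted here; review the result before removing anything.
    """
    matches: List[Path] = []
    for path in sorted(rcm_dir.rglob("*")):
        if not path.is_file():
            continue
        if hostname in path.name:
            matches.append(path)
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            # Unreadable file: skip it rather than abort the scan.
            continue
        if hostname in text:
            matches.append(path)
    return matches
```

For example, `find_stale_session_files(Path.home() / ".rcm", "login06-hpc3.eni.cineca.it")` would list candidates for the decommissioned node seen in the log above.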
