2i2c jupyterhub setups come with two systems for handling inactivity, both set up by default. In this issue I'm summarizing what I think can be used to update the docs we provide about server culling and kernel culling.
A jupyter kernel culling system
When a jupyter notebook file is opened, a "kernel" is started. The kernel retains state (variables' values etc.) based on the code that has run in it (executed notebook cells). The kernel culling system terminates kernels that have been "idle" for one hour or more.
In practice this means that if you run a long running job from a jupyter notebook and want to retain the kernel's state after the notebook execution completes, then the kernel culling system should be disabled.
Disabling it
Individual users can disable it by providing a ~/.jupyter/jupyter_server_config.json file that sets the same MappingKernelManager.cull_idle_timeout value of 0 as the hub-wide configuration below.
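A minimal sketch of one way to create such a file, assuming it should carry the same MappingKernelManager setting as the hub-wide configuration (a cull_idle_timeout of 0, which disables kernel culling). The helper name write_kernel_culling_config is hypothetical, and it merges the setting into an existing file rather than overwriting it:

```python
import json
from pathlib import Path

# Assumed file contents, mirroring the hub-wide setting:
# a cull_idle_timeout of 0 disables kernel culling.
KERNEL_CULLING_DISABLED = {"MappingKernelManager": {"cull_idle_timeout": 0}}


def write_kernel_culling_config(config_path: Path) -> dict:
    """Merge the kernel culling setting into config_path and return the result."""
    existing = {}
    if config_path.exists():
        existing = json.loads(config_path.read_text())

    # Keep any unrelated settings the user already has in the file.
    merged = dict(existing)
    section = dict(merged.get("MappingKernelManager", {}))
    section.update(KERNEL_CULLING_DISABLED["MappingKernelManager"])
    merged["MappingKernelManager"] = section

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(merged, indent=2) + "\n")
    return merged


# Example call, to be run inside the user server:
# write_kernel_culling_config(Path.home() / ".jupyter" / "jupyter_server_config.json")
```

The setting is read when the jupyter server starts, so the user server likely needs a restart before the change takes effect.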
To disable it for an entire hub, a 2i2c engineer can re-configure the file we inject in user servers via the basehub chart:
jupyterhub:
  singleuser:
    extraFiles:
      # ensure kernel culling is disabled so in-memory state of a long running
      # job is retained after it completes
      jupyter_server_config.json:
        data:
          MappingKernelManager:
            cull_idle_timeout: 0
A jupyter server culling system
When a user server is started by jupyterhub, it registers to get information about "activity" from the user server. If the user server hasn't been accessed via the network recently (for example by a user's browser), and the server reports no activity in the last hour, then it's shut down.
A big drawback of this system is that it fails to recognize all activity. For example, if a user starts a user server, runs a command in a terminal, and comes back a week later to check on it, the server could have been terminated due to a lack of perceived activity. Something was running in a terminal, but it likely didn't register as server activity to this system. Not even a busy kernel registers as server activity by itself, only if it for example writes a status message regularly.
A big upside of this system is that it helps protect users from forgetting to shut down a powerful server, which can be costly.
I suggest three strategies to protect long running jobs:
We disable the server culling system for everyone
We increase the inactivity duration from 1 hour to 24 hours or more
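To illustrate the drawback and the effect of a longer inactivity duration, here is a simplified model of the culling rule described above. It is a sketch and not the actual culler implementation: should_cull and the timestamps are hypothetical, and the rule is reduced to comparing the last activity jupyterhub knows about against a timeout.

```python
from datetime import datetime, timedelta, timezone


def should_cull(last_activity: datetime, now: datetime, timeout: timedelta) -> bool:
    """Return True if the server has looked inactive for longer than timeout."""
    return now - last_activity > timeout


# A user starts a job in a terminal and closes the browser. Nothing updates
# the last activity seen by jupyterhub while the job keeps running.
started = datetime(2023, 5, 23, 9, 0, tzinfo=timezone.utc)
two_hours_later = started + timedelta(hours=2)
next_week = started + timedelta(days=7)

one_hour = timedelta(hours=1)
one_day = timedelta(hours=24)

print(should_cull(started, two_hours_later, one_hour))  # True: culled mid-job
print(should_cull(started, two_hours_later, one_day))   # False: survives with 24 hours
print(should_cull(started, next_week, one_day))         # True: a week-long job is still culled
print(int(one_day.total_seconds()))                     # 86400
```

In this model, raising the duration to 24 hours protects jobs that finish within a day but not the week-long terminal job from the example above. The last line shows the 24 hours expressed in seconds, assuming the chart option involved is jupyterhub.cull.timeout, which is given in seconds.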
Note that you can track user servers' activity as understood by jupyterhub, and their status, by visiting https://jupyter.quantifiedcarbon.com/hub/admin. If the server culling system is disabled, it may be relevant to check in there from time to time to avoid having a large server running without a user attending to it.
Disabling it
It's a basehub chart configuration of the dependency chart jupyterhub:
jupyterhub:
  # ensure user server culling is disabled so servers that look inactive
  # (includes busy kernels that emit nothing while computing) don't get shut down
  cull:
    enabled: false
consideRatio
changed the title
Refine docs here, and upstream, about server and kernel culling
Refine docs in this repo and upstream about server and kernel culling
May 23, 2023