Releases: rackslab/Slurm-web
v4.0.0
Added
- Support Slurm 24.11 and Slurm REST API v0.0.40 (#366 → #400).
- agent:
- Return RacksDB infrastructure name and a boolean to indicate if metrics feature is enabled in `/info` endpoint, in addition to the cluster name.
- Add optional `/metrics` endpoint with various Slurm metrics in OpenMetrics format designed to be scraped by Prometheus or compatible (#274).
- Add possibility to query metrics from Prometheus database with `/v<version>/metrics/<metric>` endpoint.
- Add possibility to filter jobs which are allocated a specific node with `node` query parameter on `/v<version>/jobs` endpoint.
- gateway:
- Return RacksDB infrastructure name and boolean metrics feature flag of every cluster in `/clusters` endpoint.
- Return optional markdown login service message as rendered HTML page with `/messages/login` endpoint.
- Proxy metrics requests to agent through `/api/agents/<cluster>/metrics/<metric>` endpoint.
- frontend:
- Request RacksDB with the infrastructure name provided by the gateway (#348).
- Display time limit of running jobs in job details page (#352).
- Display service message below login form if defined (#253).
- Add dependency on charts.js and luxon adapter to draw charts with timeseries metrics.
- Display charts of resources (nodes/cores) status and jobs queue in dashboard page based on metrics from Prometheus (#275).
- Display list of jobs which have resources allocated on the node in node details page (#292).
- Display hash near all jobs fields in job details page to generate link to highlight specific field (#251).
- Represent terminated jobs with colored bullet in job status badge, using green for completed (i.e. successful) jobs, red for failed jobs and dark orange for timeout jobs (#354).
- conf:
- Add `racksdb` > `infrastructure` parameter for the agent.
- Add `metrics` > `enabled` parameter for the agent.
- Add `metrics` > `restrict` parameter for the agent.
- Add `metrics` > `host` parameter for the agent.
- Add `metrics` > `job` parameter for the agent.
- Add `ui` > `templates`, `message_template`, `message_login` parameters for the gateway.
- Select `alloc_cpus` and `alloc_idle_cpus` nodes fields on `slurmrestd` `/slurm/*/nodes` and `/slurm/*/node/<node>` endpoints.
- Select `nodes` jobs field on `slurmrestd` `/slurm/*/jobs` endpoint.
- Introduce service message template.
- show-conf: Introduce `slurm-web-show-conf` utility to dump current configuration settings of gateway and agent components with their origin, which can either be configuration definition file or site override (#349).
- docs:
- Add manpage for `slurm-web-show-conf` command.
- Add metrics feature configuration documentation page.
- Mention metrics optional feature in quickstart guide.
- Mention metrics export and charts feature in overview page.
- Mention possible Prometheus integration in architecture page.
- Mention login service message feature in overview page.
- Mention jobs badges to visualize job status in overview page.
- Add page to document Service Messages configuration.
- Mention support of Fedora 41.
- pkgs:
- Introduce `gateway` Python extra package.
- Add requirement on markdown external library for `gateway` extra package.
- Add dependency on prometheus-client for the agent.
- Add direct dependency on ClusterShell for the agent.
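
As an illustration of the agent endpoints listed above, the following Python sketch builds request URLs for the `/v<version>/jobs` endpoint with the `node` query parameter and for the `/v<version>/metrics/<metric>` endpoint. It is a minimal sketch and not code from Slurm-web: the helper names, the base URL, the version string and the metric name are all placeholders chosen for the example.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def jobs_url(base: str, version: str, node: Optional[str] = None) -> str:
    """Build the URL of the agent jobs endpoint (hypothetical helper).

    When `node` is given, it is passed as the `node` query parameter so
    that only jobs allocated on that node are requested.
    """
    url = f"{base.rstrip('/')}/v{version}/jobs"
    if node is not None:
        # urlencode() escapes characters that are not safe in a query string.
        url += "?" + urlencode({"node": node})
    return url


def metric_url(base: str, version: str, metric: str) -> str:
    """Build the URL of the agent metrics endpoint (hypothetical helper)."""
    # quote() with safe="" also escapes "/" so the metric stays one path segment.
    return f"{base.rstrip('/')}/v{version}/metrics/{quote(metric, safe='')}"


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(jobs_url("http://agent.example:8080", "X.Y.Z", node="cn1"))
    print(metric_url("http://agent.example:8080", "X.Y.Z", "some_metric"))
```

The functions only assemble strings, so they can be combined with any HTTP client. Whether authentication is required depends on the deployment and is not covered here.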
Changed
- agent: Bump minimal required Slurm version from 23.02.0 to 23.11.0.
- gateway: Change error message when unable to parse agent info fields.
- docs:
- Update configuration reference documentation.
- Update dashboard screenshot in overview page with example of resource chart.
- Replace mention of Slurm REST API version v0.0.39 by v0.0.40.
- Mention requirement of Slurm >= 23.11 and dropped support of Slurm 23.02.
- conf:
- Convert `[cache]` > `password` agent parameter from string to password type.
- Convert `[ldap]` > `bind_password` gateway parameter from string to password type.
- Bump `[slurmrestd]` > `version` default value from `0.0.39` to `0.0.40` in agent configuration for compatibility with Slurm 24.11.
- pkgs:
- Add requirement on RFL.core >= 1.1.0.
- Add requirement on RFL.settings >= 1.1.1.
Fixed
- agent:
- Fix retrieval of terminated jobs only available in accounting service with an option to ignore 404 for specific slurmrestd requests.
- Fix compatibility issue with Requests >= 2.32.2 (#350).
- Return HTTP/404 not found with meaningful error message when requesting a nonexistent node.
- gateway:
- Catch generic `requests.exceptions.RequestException` when retrieving information from agents to avoid `AttributeError` with more specific exceptions on old versions of Requests library (#391).
- Catch `JSONDecodeError` from simplejson external library and json standard library module not managed by Requests < 2.27.
- frontend:
- Notifications not visible when browser is not at the top (#367).
- Update dependencies to fix CVE-2024-45812 and CVE-2024-45811 (vite), CVE-2024-47068 (rollup), CVE-2024-21538 (cross-spawn).
Removed
- Support of Slurm 23.02 and Slurm REST API v0.0.39.
- conf:
- Remove unused `required` from default selected jobs field on `slurmrestd` `/slurm/*/jobs` endpoint.
- Remove unused `state_reason` from default selected job field on `slurmrestd` `/slurm/*/job/<id>` endpoint.
- docs: Remove mention of Fedora 39 support.
v3.2.0
Added
- gateway: Support custom LDAP user primary group attribute and group object classes (#342).
- agent: Retrieve Slurm version from `slurmrestd` REST API and return value in response of `stats` endpoint.
- frontend: Display Slurm version in clusters list (#314).
- ldap-check: Support custom LDAP user primary group attribute and group object classes (#342).
- conf:
- Add `ldap` > `user_primary_group_attribute` parameter for the gateway.
- Add `ldap` > `group_object_classes` parameter for the gateway.
- Add `cache` > `version` parameter for the agent.
- docs:
- Add link to related GitHub issue for `slurmrestd` TCP/IP socket limitation in architecture page.
- Add warning about the pure documentation purpose of complete examples of gateway/agent configuration files.
- Explain that the `[slurmrestd]` > `version` agent setting is intended mainly for developers and should not be changed.
- Mention Slurm accounting is required in quickstart guide (#341).
Changed
- agent: Check Slurm version returned from `slurmrestd` against hard-coded minimal version and log error if not greater or equal.
- frontend: Add intermediate cluster list width to 80% on large screens, before going down to 60% on even larger screens.
- pkgs: Add requirement on RFL.core and RFL.authentication >= 1.0.3.
- docs:
- Update configuration reference documentation.
- Update screenshots with latest UI changes.
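
The agent version check described in this Changed list can be sketched as follows. This is an illustrative Python sketch, not the agent's actual code: the function names are hypothetical, the minimal version is passed as a parameter, and the version string is assumed to be plain dotted integers (a suffix such as a release candidate tag would need extra handling).

```python
import logging
from typing import Tuple

logger = logging.getLogger(__name__)


def parse_version(version: str) -> Tuple[int, ...]:
    """Convert a dotted version string such as "23.11.0" into a tuple of ints.

    Assumes every component is an integer (assumption of this sketch).
    """
    return tuple(int(part) for part in version.split("."))


def check_slurm_version(reported: str, minimal: str) -> bool:
    """Return True if `reported` is greater than or equal to `minimal`.

    An error is logged when the reported version is too old, mirroring the
    behavior described in the changelog entry.
    """
    # Tuples compare component by component, so (23, 11, 0) > (23, 2, 0),
    # which a plain string comparison would get wrong.
    if parse_version(reported) < parse_version(minimal):
        logger.error(
            "Slurm version %s is older than minimal required version %s",
            reported,
            minimal,
        )
        return False
    return True
```

Comparing integer tuples rather than raw strings avoids ordering mistakes between components such as `02` and `11`.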
Fixed
- backend: handle rfl.settings.errors.SettingsSiteLoaderError when loading gateway and agent site configuration (#317).
- gateway: Respond with HTTP/501 and JSON error when requesting users with authentication disabled.
- agent:
- Translate HTTP/404 from `slurmrestd` into agent JSON error that can be interpreted by frontend and emit clear error message in logs (#321).
- Detect responses from `slurmrestd` not formatted in JSON, translate them into JSON error for frontend and emit clear error message in logs (#333).
- Detect absence of warnings key in `slurmrestd` responses and emit warning log instead of crashing (#316).
- genjwt: fix portability to Python < 3.8 in debug message.
- ldap-check: fix usage of `user_name_attribute` configuration parameter (#340).
- frontend:
- Support node names without digits in expand/fold logic (#328).
- Update dependencies to fix CVE-2024-39338 (axios), CVE-2024-6783 (vue-template-compiler) and CVE-2024-4067 (micromatch).
- Display empty list of users/account with light gray cross instead of dot in reservations page (#336).
- Hide users disclosure from jobs filters panel when authentication is disabled (#330).
- docs:
- Mention requirement of `SLURMRESTD_SECURITY=disable_user_check` environment variable in `slurmrestd` service drop-in configuration override (#320).
- Fix protocols section in architecture page to mention Slurm internal authentication mechanism (with `sackd`) and clarify that `munge` is not involved between Slurm-web agent and `slurmrestd`.
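
The agent fixes in this release for non-JSON `slurmrestd` responses and for a missing warnings key can be illustrated with the sketch below. It is a simplified assumption of how such handling might look, not the agent's implementation: the function name and the shape of the returned error dictionary are hypothetical.

```python
import json
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


def decode_slurmrestd_body(body: str) -> Dict[str, Any]:
    """Decode a response body, tolerating non-JSON content and a missing
    `warnings` key (hypothetical helper, for illustration only)."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # Non-JSON content is turned into a JSON-serializable error
        # instead of letting the exception propagate.
        logger.error("Response is not formatted in JSON")
        return {"error": "response is not formatted in JSON"}
    if not isinstance(data, dict):
        logger.error("Response is not a JSON object")
        return {"error": "response is not a JSON object"}
    # dict.get() avoids a KeyError when the key is absent.
    if data.get("warnings") is None:
        logger.warning("Key warnings is absent from response")
    return data
```

Returning a dictionary in every case lets the caller serialize the result for the frontend without special-casing failures.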
🙏 Special thanks @rseaman2016 @Talavig @attssystem @digdilem @parapar @c-vinet @satishdotpatel for their feedback and contributions!
v3.1.0
Added
- frontend:
- gateway:
- agent: Add `cpus` and `node_count` fields as provided by `slurmrestd` in jobs list responses.
- docs: Add full gateway and agent configuration files examples.
- conf:
- Add `ldap` > `user_name_attribute` parameter for the gateway.
- Add `ui` > `hide_denied` parameter for the gateway.
Changed
- frontend:
- Use server icon instead of cpu chip icon to represent nodes in clusters list and clusters pop over menu.
- Merge account column with user column in jobs list; the account is now displayed in parentheses.
- docs: Update configuration reference documentation.
- pkgs:
- Add requirement on RFL >= 1.0.2.
- Add requirement on aiohttp.
Fixed
- frontend:
- gateway: Add possibility to configure custom LDAP user name attribute as an alternative to `uid` (#305).
- conf: Add documentation clarifications and examples in agent and gateway configuration definitions.
- docs:
- Typo in slurmrestd service name in quick start guide.
- Use consistent URL format for `curl` commands on `slurmrestd` Unix sockets.
- Fix agent and gateway configuration file extension in configuration files reference documentation.
- Fix some agent and gateway configuration file extension typos in quickstart guide.
- Invert initial setup and JWT signing key sections in quickstart to satisfy `slurm-web-gen-jwt-key` configuration requirement.
v3.0.0
v2.4.0
Software changes:
- Re-organise the code to support standard setup.py
- Port to Python 3
- Fix handling of job exclusive field in dashboard
- Introduce RPM el8 packaging
Debian specific changes:
- Introduce slurm-web-common package to install slurm-web Python module files common to all slurm-web Python packages, making the slurm-web binary package a meta-package that installs all Slurm-web components.
- Bump d/compat to 9
v2.2.1
v2.2.0
- REST API:
- Honor Slurm PrivateData settings for jobs and reservations (#149)
- Use GET instead of POST for most routes. The optional authentication token is now given in a new Authorization HTTP header (#63).
- Remove password from token (#64)
- Handle LDAP SERVER_DOWN exception (#107)
- Fix PySLURM call to job find_id() following API change introduced with PySLURM >= 16.05 (#138).
- Remove use of join in partitions view following API change introduced with PySLURM >= 16.05 (#140).
- Slurm-web REST API now depends on PySLURM >= 16.05
- Dashboard:
- Make path to top-left corner logo configurable (#102)
- Show full name at the top right corner (#110)
- Show real cluster name instead of local (#61)
- Show TRES instead of nodelist in jobs view (#89)
- Show node down/drain reason (#90)
- Add optional extra customizable col in jobs view (#65)
- Significantly reduce margins between racks (#126)
- Fix empty jobs view lock (#136)
- Fix serial authentication failures (#137)
- Factorize dashboard error management (#124,#150)
- Add WCKey to dashboard jobs view (#141)
- Fix global logout on one cluster auth fail (#158)
- Disable caching effect on dashboard conf files (#99,#159)
- Adding sinfo endpoint to slurmrestapi (#145) (thanks to @alexxxxx)
- Integration:
- Tests:
- Introduce a programmable testing environment with mocks and fake data sources.
- Doc: