Releases: dell/omnia
Omnia 1.7
What’s New in this release:
- Refresh for XE9680 w/ AMD MI300X accelerators & PowerSwitch Z9864F-based network architecture
- Pre-enablement for XE9680 w/ Intel Gaudi 3 accelerators
- NVIDIA container toolkit for NVIDIA accelerators
- Installation of Kubernetes stack v1.29
- Sample playbook for a pre-trained Generative AI model - Llama 3.1
- CSI drivers for Kubernetes to access PowerScale storage
- Internal OpenLDAP server configuration as a proxy server
- Corporate proxy on RHEL, Rocky Linux, and Ubuntu clusters
- Omnia execution within a virtual environment w/ Python 3.11 and Ansible 9.5.1
- Setting OS Kernel command-line parameters using server_spec_update utility
- Revamped Omnia documentation featuring OS-specific install guides, deployment-flow diagram, and other enhancements
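As an illustration of the virtual-environment item above, here is a minimal Python sketch of a pre-flight check. It is not part of Omnia: the helper names `in_virtualenv` and `matches_python` are hypothetical, and treating Python 3.11 as an exact major.minor match is an assumption made for this example. Only the version number comes from these notes.

```python
import sys

# Interpreter version named in the release notes above (major, minor).
EXPECTED_PYTHON = (3, 11)


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def matches_python(version=None, expected=EXPECTED_PYTHON) -> bool:
    """Return True when `version` (default: the running interpreter) has the expected major.minor."""
    if version is None:
        version = sys.version_info
    # Hypothetical policy: compare only the major and minor components.
    return tuple(version[:2]) == tuple(expected)


print("virtual environment:", in_virtualenv())
print("Python 3.11:", matches_python())
```

A check like this could run before invoking any playbook, so that a mismatched interpreter is reported early rather than partway through a deployment.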
Omnia 1.6.1
This patch release focuses on fixing the following issue:
- The dependent package ‘libssl1.1_1.1.1f-1ubuntu2.22_amd64’ required by Omnia 1.6 is no longer available for Ubuntu 22.04 OS.
Note:
- With Omnia 1.6.1, new cluster deployments will encounter a TLS CA certificate error with OpenLDAP due to changes in the dependent package ‘openldaptoolbox’. To resolve this, we recommend using Omnia 1.7 for new cluster deployments with OpenLDAP.
- A critical security vulnerability in the cryptography software used by Omnia versions 1.6.1 and earlier has been resolved in Omnia 1.7 by updating the cryptography software to version 44.0.0. We recommend that users upgrade to Omnia 1.7.
Omnia 1.6
This release has been deprecated because the dependent package ‘libssl1.1_1.1.1f-1ubuntu2.22_amd64’ is no longer available for Ubuntu 22.04.
Note: Before running local repo in an Omnia 1.6 production environment on Ubuntu 22.04, apply the fix by following the Omnia 1.6.1 upgrade flow.
Omnia has been enhanced to offer:
- Hardware Enablement
  - Enablement for AI workloads on XE9680 with AMD MI300X GPUs
- OS enablement
- Enablement for AI
  - Install GPU device plugin for Kubernetes
    - GPU device plugin for AMD
    - GPU device plugin for NVIDIA
- Additional Features
  - One-off utility to add or remove a node
  - HPC/AI cluster inventory partitioning
    - CPU inventory
    - AMD GPU inventory
    - NVIDIA GPU inventory
Omnia 1.5.1
This patch release focuses on the following changes:
- Installation of Kubernetes 1.16 and 1.19 is deprecated.
- Spark Operator support is deprecated.
- Omnia now installs Kubernetes 1.26.
Kubeflow is not supported on v1.5.1 due to the Kubernetes upgrade.
Omnia 1.4.3.1
This release is focused on supporting the following features:
- Hardware Support: Intel E810 NIC, ConnectX-5/6 NICs.
- Omnia GitHub now hosts a “genesis” image with this functionality baked in for initial bootup.
- Host aliasing for Scheduler and IPA authentication.
- Login and Manager Node access from both public and private NICs.
- Validation check enhancements:
  - Rearranged to occur as early as possible.
  - Checks are isolated when running smaller playbooks.
- Added a Benchmark Install Guide: OneAPI for Intel, MPI AOCC HPL for AMD.
Omnia 1.5
** This release is now deprecated. Kubernetes v1.16 and v1.19 are no longer available for deployment. **
This release is focused on supporting the following features:
- Expanded telemetry collection to cover regular, health check, and GPU metrics.
- Rsyslog: Added ability to aggregate logs via xCAT’s syslog.
- Integration of apptainer for containerized HPC benchmark execution.
- Optimized installation of Visualization Dashboard and Log Aggregator Tool.
Omnia 1.4.3
This release is focused on supporting the following features:
- XE9640, R760xa, and R760xd2 are now supported as control planes or target nodes with NVIDIA H100 accelerators.
- Added ability for split port configuration on NVIDIA Quantum-2-based QM9700 (NVIDIA InfiniBand NDR400 switches).
- Extended password-less SSH support for multiple user configuration in a single execution.
- Input mapping files and inventory files now support commented entries for customized playbook execution.
- NFS share is now available for hosting user home directories within the cluster.
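To illustrate the commented-entries item above, here is a minimal Python sketch of skipping commented rows in a CSV-style mapping file. It does not reflect Omnia's actual parser or file schema: the `#` comment prefix, the column names, the sample addresses, and the helper `load_active_rows` are all assumptions made for this example.

```python
import csv
import io


def load_active_rows(text: str, comment_prefix: str = "#"):
    """Parse CSV text, ignoring blank lines and lines that start with the comment prefix."""
    kept = [
        line
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith(comment_prefix)
    ]
    # The first surviving line is treated as the header row.
    return list(csv.DictReader(io.StringIO("\n".join(kept))))


# Hypothetical mapping file: the second node is commented out.
sample = """MAC,HOSTNAME,IP
aa:bb:cc:dd:ee:01,node001,10.5.0.101
#aa:bb:cc:dd:ee:02,node002,10.5.0.102
aa:bb:cc:dd:ee:03,node003,10.5.0.103
"""

rows = load_active_rows(sample)
for row in rows:
    print(row["HOSTNAME"], row["IP"])
```

Commenting out a row this way lets a node be excluded from a single playbook run without deleting its entry from the file.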
Omnia 1.4.2
This release is focused on supporting the following features:
- XE9680, R760, R7625, R6615, and R7615 are now supported as control planes or target nodes.
- Added ability for switch-based discovery of remote servers and PXE provisioning.
- An active Red Hat subscription is no longer required on the control plane and the compute nodes. Users can configure and use local RHEL repositories.
- IP ranges can be defined for assignment to remote nodes when discovered via the switch.
Omnia 1.4.1
This release is focused on supporting the following features:
- R660, R6625 and C6620 platforms are now supported as control planes or target nodes.
- One touch provisioning now allows for OFED installation, NVIDIA CUDA-toolkit installation along with iDRAC and InfiniBand IP configuration on target nodes.
- Potential servers can now be discovered via iDRAC.
- Servers can be provisioned automatically without manual intervention for booting/PXE settings.
- Target node provisioning status can now be checked on the control plane by viewing the OmniaDB.
- Omnia clusters can be configured with passwordless SSH for seamless execution of HPC jobs run by non-root users.
- Accelerator drivers can be installed on Rocky target nodes in addition to RHEL.
Omnia 1.4.0.1
Bugfix patch release that addresses the broken Singularity installation.