Omnia 1.2

@sujit-jadhav sujit-jadhav released this 21 Apr 15:02
· 3778 commits to devel since this release
176bb71

This release focuses on new features for Infrastructure, Security, and Telemetry & Visualizations.

Infrastructure

  • Omnia supports the full Rocky 8.5 OS on the control plane.
  • Omnia supports Ansible 2.12 (ansible-core) with Python 3.6.
  • All packages required to enable the HPC/AI cluster are deployed as a pod on the control plane.
  • Omnia now installs Grafana as a single pane of glass for viewing logs, metrics, and telemetry visualizations.
  • Compute nodes can be provisioned via PXE and iDRAC.
  • Omnia supports multiple operating systems on the cluster, including Rocky 8.5 and openSUSE Leap 15.3.
  • Omnia can deploy compute nodes with a single NIC.
  • All cluster metrics can be viewed using Grafana on the control plane (as opposed to checking the manager node on each cluster).
  • AWX node inventory now displays service tags with the relevant operating system.

Security

  • Omnia adheres to most of the requirements of NIST 800-53 and NIST 800-171 guidelines on the control plane and login node.
  • Omnia has extended the FreeIPA feature to provide authentication and authorization on Rocky nodes.
  • Omnia uses [389ds](https://directory.fedoraproject.org/) to provide authentication and authorization on Leap nodes.
  • Email alerts have been added for login failures.
  • Administrators can restrict users or hosts from accessing the control plane and login node over SSH.
  • Malicious or unwanted network software access can be restricted by the administrator.
  • Administrators can restrict the idle time allowed in an SSH session.
  • Omnia installs AppArmor to restrict program access on Leap nodes.
  • Access to audit logs is secured.
  • Program execution on the control plane and login node is logged using the snoopy tool.
  • User activity on the control plane and login node is monitored using the psacct/acct tools installed by Omnia.

Telemetry & Visualizations

  • Omnia fetches key performance indicators from the iDRACs present in the cluster.
  • Omnia also supports fetching performance indicators from the nodes in the cluster while Slurm jobs are running.
  • The telemetry data is plotted on Grafana to provide better visualization capabilities.
  • Four visualization plugins are supported to present and analyze iDRAC and Slurm data:
    • Parallel Coordinate
    • Spiral
    • Sankey
    • Stream-net (a.k.a. Power Map)

In addition to the above features, changes have been made to enhance the performance of Omnia.