Releases: SchedMD/slurm-gcp
6.3.1
5.10.1
- Add maintenance_interval
Full Changelog: 5.10.0...5.10.1
6.3.0
- Upgrade installed Slurm to 23.02.7
- Fix deprecation warning in google_secret_manager_secret.
- Fix TPU delete_node API return message.
Full Changelog: 6.2.0...6.3.0
5.10.0
- Upgrade Slurm to 23.02.7
- Fix slurmsync on reconfig when removing nodes.
Full Changelog: 5.9.1...5.10.0
6.2.0
- Reverse logic in valid_placement_nodes
- Add slurm_gcp_plugin support.
- Add reservation affinity to nodesets via reservation_name option.
- Change TPU node conf based on tpu version instead of TPU model.
- Add support for TPUv4
- Upgrade installed Slurm to 23.02.5
Full Changelog: 6.1.2...6.2.0
5.9.1
- Use reservation placement policy if placement is enabled and a reservation is specified.
Full Changelog: 5.9.0...5.9.1
5.9.0
- Remove spurious log message on resume, referring to "Reservation name".
- Support A3 VMs in compact placement policies.
- Migrate from network overrides in bulkInsert to honoring instance templates.
- Add additional_networks support to instance template and partition_nodes.
- Support Tier 1 networking in instance templates.
- Reverse logic in valid_placement_nodes
Full Changelog: 5.8.0...5.9.0
6.1.2
- Fix accelerator-optimized machine type SMT handling.
- Prefix user-visible errors with their source.
- Fix accelerator-optimized machine type socket handling.
- Only compare config.yaml blob to cache file.
- Fix login nodes appearing as compute nodes in Slurm output.
- Add enable_debug_logging and extra_logging_flags to terraform.
- Only attempt static node resume when node is powered down.
- Fix CUDA on Ubuntu by installing CUDA via runfile alongside NVIDIA driver from signed repo.
- Fix conf generation issue on reconfiguration.
Full Changelog: 6.1.1...6.1.2
5.8.0
- Fix login nodes not reconfiguring when enable_reconfigure=true.
- Do not temporarily disable partitions during reconfigure process.
- Fix login nodes appearing as compute nodes in Slurm output.
- Only attempt static node resume when node is powered down.
- Fix CUDA on Ubuntu by installing CUDA via runfile alongside NVIDIA driver from signed repo.
- Fix conf generation issue on reconfiguration.
Full Changelog: 5.7.6...5.8.0
5.7.6
- Prefix user-visible errors with their source.
- Fix accelerator-optimized machine type SMT handling.
- Fix accelerator-optimized machine type socket handling.
Full Changelog: 5.7.5...5.7.6