[Package Request] - CUDA Compatibility #316
Comments
We are working on NVIDIA support. It will not be via the RHEL 7 packages, however. We'll update you when we have something and will post some documentation/instructions. Soon hopefully :)
Hey @ozbenh, are there any updates on this? I have also been having problems installing GPU drivers on Amazon Linux 2023.
The dependencies for the Fedora 35 RPMs should be resolved by now, so those should install. Still working with NVIDIA to get Amazon Linux specific builds.
Just stumbled upon this hurdle in our environment too. It would be great to see this installed/included by default into the AL2023 images so we don't need to think when picking instance types. I can confirm the |
### Description of changes

Used new utility `os-resources.py` introduced as part of aws#2328 to create new resources for alinux2023, starting from redhat8 resources.

Relevant changes to the code:
* os_type --> Replaced rhel with alinux
* Tried to fix CloudWatch agent setup by changing `platform_url_component` to point to the same of rhel (not sure if it's correct)
* Copied network setup templates from redhat folders
* Added alinux2023 to pcluster_dcv_connect.sh script

Removed redhat_on_docker condition from:
* stunnel
* system_authentication
* efa

TODO:
* efa -> check efa_supported? condition and log messages
* lustre -> check version condition and log messages
* install_packages --> Removed postgresql packages
* Enable repository needed by hwloc-devel blas-devel libedit-devel and glibc-static packages

### Tests

* Added Alinux2023 to ec2 kitchen configuration files. Copied from rhel8 with minor changes:
  * AMI name prefix took from EC2 Amazon Linux official AMI
  * I called the suite `alinux-2023`, with an "-" in the name to avoid having alinux2 as prefix of alinux-2023 and be able to distinguish them on Inspec runs.
  * Created new `pre_converge` hook to install libxcrypt-compat package, required to install Chef, leveraging the work done with aws#2342
  * I had to pass a custom `provisioner/download_url` for cinc because package for AL2023 is not available in the default path.
* The validated resources are:
  * nvidia_driver
  * arm_Pl
  * c_states
  * stunnel
  * build_tools
  * chrony
  * modules
  * munge
  * dns_domain (install only)
  * jwt_dependencies
  * nfs (install only)
  * raid (install only)
  * system_authentication (install only)
  * efs (install only)

TODO:
* Add Alinux2023 to GitHub actions
* Add new os to kitchen.docker.yml config file (search for `kernel_release` version from an EC2 instance)
* Fix Inspec and ChefSpec tests conditions

### References

* https://hub.docker.com/_/amazonlinux
* https://github.com/test-kitchen/kitchen-ec2/tree/main/lib/kitchen/driver/aws/standard_platform

Known issues/FE:
* amazonlinux/amazon-linux-2023#47
* amazonlinux/amazon-linux-2023#146
* amazonlinux/amazon-linux-2023#168
* amazonlinux/amazon-linux-2023#309
* amazonlinux/amazon-linux-2023#316

Signed-off-by: Enrico Usai <[email protected]>
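The reason for the hyphen in the `alinux-2023` suite name can be illustrated with a small sketch. It assumes suites are selected by name prefix, which is how the commit message describes the problem; `suites_matching` is a hypothetical helper for illustration and is not part of Test Kitchen or InSpec.

```python
def suites_matching(prefix: str, suites: list[str]) -> list[str]:
    """Return the suite names that start with the given prefix."""
    return [name for name in suites if name.startswith(prefix)]


# Without the hyphen, selecting "alinux2" also picks up the 2023 suite.
ambiguous = ["alinux2", "alinux2023"]

# With the hyphen, "alinux2" is no longer a prefix of the 2023 suite.
distinct = ["alinux2", "alinux-2023"]

print(suites_matching("alinux2", ambiguous))  # ['alinux2', 'alinux2023']
print(suites_matching("alinux2", distinct))   # ['alinux2']
```

With the hyphenated name, a run that targets `alinux2` no longer selects the AL2023 suite by accident.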
Related: aws/containers-roadmap#2072
The following seems to work. I didn't include nccl as we don't use it. You may not need sudo/rsync/iproute/iputils/cifs-utils/shadow-utils/openssl-devel/etc for your application. Based heavily on https://gitlab.com/nvidia/container-images/cuda/-/blob/master/dist/11.7.1/centos7/
We are working with NVIDIA on a better long-term solution, including container support. In the meantime, the Fedora 37 packages appear to work as well as the 35 ones after a quick smoke test.
Going to centralize everything on #12 |
This seems to work on G5 instances, based on @bryantbiggs' code.
The `nvidia-smi` command should return the GPUs. If you need the NVIDIA container runtime (for example, for ECS tasks), you also need to execute:
Tested with an Ubuntu container.
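The "`nvidia-smi` should return the GPUs" check can be scripted. This is a minimal sketch and not the commenter's commands: it assumes `nvidia-smi` is on `PATH` once the driver is installed, and it relies on `nvidia-smi -L`, which prints one line per GPU beginning with `GPU <index>:`. The function names are hypothetical.

```python
import shutil
import subprocess


def parse_gpu_list(output: str) -> list[str]:
    """Extract the per-GPU lines from `nvidia-smi -L` output."""
    return [
        line.strip()
        for line in output.splitlines()
        if line.strip().startswith("GPU ")
    ]


def list_gpus() -> list[str]:
    """Return the GPUs reported by nvidia-smi, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return []  # e.g. the driver is installed but not loaded
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    print(f"{len(gpus)} GPU(s) visible")
    for gpu in gpus:
        print(gpu)
```

An empty result on a GPU instance suggests the driver is missing or its kernel module did not load.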
NVIDIA has added AL2023 support to the CUDA repo. The install article is "How do I install NVIDIA GPU driver, CUDA toolkit and optionally NVIDIA Container Toolkit in Amazon Linux 2023 (AL2023)?". Here is my install script
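Before running any AL2023-specific install steps, a script may want to confirm that the host really is AL2023. The following is a minimal sketch of such a guard, not the commenter's install script: it assumes `/etc/os-release` reports `ID="amzn"` and `VERSION_ID="2023"` on AL2023, and the function names are hypothetical.

```python
from pathlib import Path


def parse_os_release(text: str) -> dict[str, str]:
    """Parse KEY=value lines in os-release format into a dict."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_al2023(fields: dict[str, str]) -> bool:
    """True if the parsed fields identify Amazon Linux 2023 (assumed values)."""
    return fields.get("ID") == "amzn" and fields.get("VERSION_ID") == "2023"


if __name__ == "__main__":
    path = Path("/etc/os-release")
    fields = parse_os_release(path.read_text()) if path.exists() else {}
    print("AL2023" if is_al2023(fields) else "not AL2023")
```

Checking both fields keeps Amazon Linux 2, which shares the `amzn` ID, from matching.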
[AL2023] Use systemd-resolved instead of dhclient on Alinux 2023
Signed-off-by: Hanwen <[email protected]>
### Description of changes Used new utility `os-resources.py` introduced as part of aws#2328 to create new resources for alinux2023, starting from redhat8 resources. Relevant changes to the code: * os_type --> Replaced rhel with alinux * Tried to fix CloudWatch agent setup by changing `platform_url_component` to point to the same of rhel (not sure if it's correct) * Copied network setup templates from redhat folders * Added alinux2023 to pcluster_dcv_connect.sh script Removed redhat_on_docker condition from: * stunnel * system_authentication * efa TODO: * efa -> check efa_supported? condition and log messages * lustre -> check version condition and log messages * install_packages --> Removed postgresql packages * Enable repository needed by hwloc-devel blas-devel libedit-devel and glibc-static packages ### Tests * Added Alinux2023 to ec2 kitchen configuration files. Copied from rhel8 with minor changes: * AMI name prefix took from EC2 Amazon Linux official AMI * I called the suite `alinux-2023`, with an "-" in the name to avoid having alinux2 as prefix of alinux-2023 and be able to distinguish them on Inspec runs. * Created new `pre_converge` hook to install libxcrypt-compat package, required to install Chef, leveraging the work done with aws#2342 * I had to pass a custom `provisioner/download_url` for cinc because package for AL2023 is not available in the default path. 
* The validated resources are: * nvidia_driver * arm_Pl * c_states * stunnel * build_tools * chrony * modules * munge * dns_domain (install only) * jwt_dependencies * nfs (install only) * raid (install only) * system_authentication (install only) * efs (install only) TODO: * Add Alinux2023 to GitHub actions * Add new os to kitchen.docker.yml config file (search for `kernel_release` version from an EC2 instance) * Fix Inspec and ChefSpec tests conditions ### References * https://hub.docker.com/_/amazonlinux * https://github.com/test-kitchen/kitchen-ec2/tree/main/lib/kitchen/driver/aws/standard_platform Known issues/FE: * amazonlinux/amazon-linux-2023#47 * amazonlinux/amazon-linux-2023#146 * amazonlinux/amazon-linux-2023#168 * amazonlinux/amazon-linux-2023#309 * amazonlinux/amazon-linux-2023#316 Signed-off-by: Enrico Usai <[email protected]> [AL2023] Use systemd-resolved instead of dhclient on Alinux 2023 Signed-off-by: Hanwen <[email protected]>
### Description of changes Used new utility `os-resources.py` introduced as part of aws#2328 to create new resources for alinux2023, starting from redhat8 resources. Relevant changes to the code: * os_type --> Replaced rhel with alinux * Tried to fix CloudWatch agent setup by changing `platform_url_component` to point to the same of rhel (not sure if it's correct) * Copied network setup templates from redhat folders * Added alinux2023 to pcluster_dcv_connect.sh script Removed redhat_on_docker condition from: * stunnel * system_authentication * efa TODO: * efa -> check efa_supported? condition and log messages * lustre -> check version condition and log messages * install_packages --> Removed postgresql packages * Enable repository needed by hwloc-devel blas-devel libedit-devel and glibc-static packages ### Tests * Added Alinux2023 to ec2 kitchen configuration files. Copied from rhel8 with minor changes: * AMI name prefix took from EC2 Amazon Linux official AMI * I called the suite `alinux-2023`, with an "-" in the name to avoid having alinux2 as prefix of alinux-2023 and be able to distinguish them on Inspec runs. * Created new `pre_converge` hook to install libxcrypt-compat package, required to install Chef, leveraging the work done with aws#2342 * I had to pass a custom `provisioner/download_url` for cinc because package for AL2023 is not available in the default path. 
* The validated resources are: * nvidia_driver * arm_Pl * c_states * stunnel * build_tools * chrony * modules * munge * dns_domain (install only) * jwt_dependencies * nfs (install only) * raid (install only) * system_authentication (install only) * efs (install only) TODO: * Add Alinux2023 to GitHub actions * Add new os to kitchen.docker.yml config file (search for `kernel_release` version from an EC2 instance) * Fix Inspec and ChefSpec tests conditions ### References * https://hub.docker.com/_/amazonlinux * https://github.com/test-kitchen/kitchen-ec2/tree/main/lib/kitchen/driver/aws/standard_platform Known issues/FE: * amazonlinux/amazon-linux-2023#47 * amazonlinux/amazon-linux-2023#146 * amazonlinux/amazon-linux-2023#168 * amazonlinux/amazon-linux-2023#309 * amazonlinux/amazon-linux-2023#316 Signed-off-by: Enrico Usai <[email protected]> [AL2023] Use systemd-resolved instead of dhclient on Alinux 2023 Signed-off-by: Hanwen <[email protected]>
### Description of changes Used new utility `os-resources.py` introduced as part of aws#2328 to create new resources for alinux2023, starting from redhat8 resources. Relevant changes to the code: * os_type --> Replaced rhel with alinux * Tried to fix CloudWatch agent setup by changing `platform_url_component` to point to the same of rhel (not sure if it's correct) * Copied network setup templates from redhat folders * Added alinux2023 to pcluster_dcv_connect.sh script Removed redhat_on_docker condition from: * stunnel * system_authentication * efa TODO: * efa -> check efa_supported? condition and log messages * lustre -> check version condition and log messages * install_packages --> Removed postgresql packages * Enable repository needed by hwloc-devel blas-devel libedit-devel and glibc-static packages ### Tests * Added Alinux2023 to ec2 kitchen configuration files. Copied from rhel8 with minor changes: * AMI name prefix took from EC2 Amazon Linux official AMI * I called the suite `alinux-2023`, with an "-" in the name to avoid having alinux2 as prefix of alinux-2023 and be able to distinguish them on Inspec runs. * Created new `pre_converge` hook to install libxcrypt-compat package, required to install Chef, leveraging the work done with aws#2342 * I had to pass a custom `provisioner/download_url` for cinc because package for AL2023 is not available in the default path. 
* The validated resources are: * nvidia_driver * arm_Pl * c_states * stunnel * build_tools * chrony * modules * munge * dns_domain (install only) * jwt_dependencies * nfs (install only) * raid (install only) * system_authentication (install only) * efs (install only) TODO: * Add Alinux2023 to GitHub actions * Add new os to kitchen.docker.yml config file (search for `kernel_release` version from an EC2 instance) * Fix Inspec and ChefSpec tests conditions ### References * https://hub.docker.com/_/amazonlinux * https://github.com/test-kitchen/kitchen-ec2/tree/main/lib/kitchen/driver/aws/standard_platform Known issues/FE: * amazonlinux/amazon-linux-2023#47 * amazonlinux/amazon-linux-2023#146 * amazonlinux/amazon-linux-2023#168 * amazonlinux/amazon-linux-2023#309 * amazonlinux/amazon-linux-2023#316 Signed-off-by: Enrico Usai <[email protected]> [AL2023] Use systemd-resolved instead of dhclient on Alinux 2023 Signed-off-by: Hanwen <[email protected]>
Got stopped here with this error
This seems to be because I am on ARM. Following these instructions worked. Install script:
|
What package is missing from Amazon Linux 2023? Please describe and include package name.
I can't currently install CUDA: cuda-repo-rhel7-11-4-local-11.4.0_470.42.01-1 wants libtirpc-0.2.4, but AL2023 only has 1.3.3-0, which provides libtirpc.so.3 and not libtirpc.so.1.
Tried building/installing tirpc 0.2.4 from source and it gripes:
auth_time.c:46:10: fatal error: rpcsvc/nis.h: No such file or directory
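To make the mismatch above concrete: a package built against `libtirpc.so.1` cannot be satisfied by a system that only ships `libtirpc.so.3`, because the number after `.so.` is the library's ABI major version. The following is only a minimal sketch of that comparison, not how `rpm` or `dnf` actually resolve dependencies. The helper names `soname_major` and `missing_sonames` are hypothetical, and the input values are the ones quoted in this report.

```python
import re


def soname_major(soname: str) -> tuple[str, int]:
    """Split a versioned soname such as 'libtirpc.so.3' into ('libtirpc', 3)."""
    match = re.fullmatch(r"(?P<name>.+)\.so\.(?P<major>\d+)", soname)
    if match is None:
        raise ValueError(f"not a versioned soname: {soname!r}")
    return match["name"], int(match["major"])


def missing_sonames(required: list[str], provided: list[str]) -> list[str]:
    """Return the required sonames that none of the provided sonames satisfy.

    Hypothetical helper: a requirement counts as satisfied only when both the
    library name and the major version match exactly.
    """
    available = {soname_major(s) for s in provided}
    return [s for s in required if soname_major(s) not in available]


# Values from the report: the RHEL 7 CUDA repo package wants libtirpc.so.1,
# while libtirpc 1.3.3-0 on AL2023 provides libtirpc.so.3.
print(missing_sonames(["libtirpc.so.1"], ["libtirpc.so.3"]))
# prints ['libtirpc.so.1']
```

On a real system the `provided` list would come from the package manager or the dynamic linker cache rather than being typed in by hand.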
Is this an update to existing package or new package request?
Downgrade, actually.
Is this package available in Amazon Linux 2? If it is available via external sources such as EPEL, please specify.
Yep, available. Believe it's installed by default.
Any additional information you'd like to include. (use-cases, etc)
Is CUDA intended to work on 2023 yet? Our security guys want us to upgrade from 2 if possible.