AMD GPU on Linux doesn't work with Mesa's OpenCL drivers #239

Open
obj-obj opened this issue May 22, 2024 · 25 comments

Comments

@obj-obj

obj-obj commented May 22, 2024

When trying to use Mesa's implementation of OpenCL, the GPU is greyed out in the devices list, like so:
[image]

I can also see

"gpu:45:00:00": {"vendor": 4098, "device": 29679, "type": "amd", "supported": false, "description": "Navi 23 [Radeon RX 6650XT]"}

being printed out in the log. There is a similar line in gpus.json:

{"vendor": 4098, "device": 29679, "type": 1, "species": 8, "description": "Navi 23 [Radeon RX 6650XT]"}

Could this simply be a case of something setting the type field incorrectly (and FAH thus thinking the GPU is unsupported)?
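For reference, the numeric fields in those two lines are easier to compare against other tools once converted to hex. A minimal sketch, assuming the `vendor` and `device` values are decimal PCI IDs and that the `gpu:45:00:00` key encodes bus, device, and function in decimal (that layout is a guess, and the helper names are hypothetical):

```python
import json

# Log entry copied from the client log quoted above.
LOG_ENTRY = (
    '{"vendor": 4098, "device": 29679, "type": "amd", '
    '"supported": false, "description": "Navi 23 [Radeon RX 6650XT]"}'
)


def pci_ids(entry: dict) -> str:
    """Format decimal vendor/device IDs as a hex 'vvvv:dddd' pair."""
    return f"{entry['vendor']:04x}:{entry['device']:04x}"


def gpu_key_to_pci_slot(key: str) -> str:
    """Convert 'gpu:BUS:DEV:FUNC' to hex 'bb:dd.f'.

    Assumption: the three numeric fields are decimal bus, device, function.
    """
    _, bus, dev, func = key.split(":")
    return f"{int(bus):02x}:{int(dev):02x}.{int(func):x}"


entry = json.loads(LOG_ENTRY)
print(pci_ids(entry))                       # 1002:73ef
print(gpu_key_to_pci_slot("gpu:45:00:00"))  # 2d:00.0
```

0x1002 matches the `Device Vendor ID` that clinfo reports for this card further down, and under the decimal assumption `2d:00.0` lines up with the `Device PCI bus info (KHR)` value in the rusticl output for the same GPU.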

@jcoffland
Member

jcoffland commented May 23, 2024

The GPU is not supported without OpenCL. What happens if you run clinfo? This should tell you whether OpenCL is properly installed.

@obj-obj
Author

obj-obj commented May 23, 2024

The GPU is not supported without OpenCL. What happens if you run clinfo? This should tell you whether OpenCL is properly installed.

I will have to get home to send it, but I've checked and everything seems to be correct. Geekbench's OpenCL benchmark also completes with no issues.

@jcoffland
Member

If the client says it has no OpenCL support for the GPU, it means that it was either not able to access OpenCL at all or not able to match the GPU it found in the system to a device in OpenCL. The latter can occur if the driver does not supply PCI bus information. You may need to install AMD's drivers to get the GPU to work with F@H.
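To illustrate the second failure mode, here is a rough sketch of matching by PCI address. It is not the client's actual code: the `SystemGPU` and `CLDevice` records and the `match_gpus` helper are hypothetical, and it assumes each OpenCL device either reports a (bus, device, function) triple or reports nothing.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

PCIAddr = Tuple[int, int, int]  # (bus, device, function)


@dataclass
class SystemGPU:
    description: str
    pci: PCIAddr


@dataclass
class CLDevice:
    name: str
    pci: Optional[PCIAddr]  # None when the driver exposes no PCI info


def match_gpus(gpus, cl_devices):
    """Return {description: supported} by joining on the PCI address."""
    by_addr = {d.pci: d for d in cl_devices if d.pci is not None}
    return {g.description: g.pci in by_addr for g in gpus}


# Bus 45 is taken from the "gpu:45:00:00" log key; reading it as a
# decimal bus number is an assumption.
gpus = [SystemGPU("Navi 23 [Radeon RX 6650XT]", (45, 0, 0))]
no_pci_info = [CLDevice("AMD Radeon RX 6650 XT", None)]
with_pci_info = [CLDevice("AMD Radeon RX 6650 XT", (45, 0, 0))]

print(match_gpus(gpus, no_pci_info))    # GPU stays unmatched
print(match_gpus(gpus, with_pci_info))  # GPU is matched
```

With a join like this, a perfectly functional OpenCL device still ends up unmatched whenever the driver gives no address to join on.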

@jcoffland changed the title from "Doesn't work with Mesa's OpenCL implementation" to "AMD GPU on Linux doesn't work with Mesa's OpenCL drivers" on May 23, 2024
@obj-obj
Author

obj-obj commented May 23, 2024

If the client says it has no OpenCL support for the GPU, it means that it was either not able to access OpenCL at all or not able to match the GPU it found in the system to a device in OpenCL. The latter can occur if the driver does not supply PCI bus information. You may need to install AMD's drivers to get the GPU to work with F@H.

Yeah, I'm using AMD's ROCm driver to fold currently, and it works fine (as long as I delete the libraries bundled with the core so that it uses the system libraries; I'll make a separate issue for that later). I just made this issue to try to find out why Mesa's implementation doesn't work.

@kbernhagen
Contributor

Someone else had Mesa drivers and their GPU was disabled. clinfo indicated that FP64 was not supported by Mesa's OpenCL.

@obj-obj
Author

obj-obj commented May 23, 2024

Someone else had Mesa drivers and their GPU was disabled. clinfo indicated that FP64 was not supported by Mesa's OpenCL.

OK, I'll make sure to send my clinfo output here when I get home in about an hour.

@obj-obj
Author

obj-obj commented May 24, 2024

Forgot to send it, sorry. This is using Mesa's clover driver, as the rusticl driver doesn't seem to have FP64 support.

Still greyed out in Folding@home though, with the same message in the log as before. Could it be because clover seems to implement only OpenCL 1.1?

Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 24.0.7-arch1.3.1
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD Radeon RX 6650 XT (radeonsi, navi23, LLVM 17.0.6, DRM 3.57, 6.9.0-273-tkg-eevdf-llvm)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 24.0.7-arch1.3.1
  Device Numeric Version                          0x401000 (1.1.0)
  Driver Version                                  24.0.7-arch1.3.1
  Device OpenCL C Version                         OpenCL C 1.1 
  Device OpenCL C Numeric Version                 0x401000 (1.1.0)
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Max compute units                               32
  Max clock frequency                             2725MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple (kernel)     <getWGsizes:1980: create kernel : error -46>
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 0        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              8589934592 (8GiB)
  Error Correction support                        No
  Max memory allocation                           2147483648 (2GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       32768 bits (4096 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               65536 (64KiB)
  Max number of constant args                     16
  Max constant buffer size                        67108864 (64MiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    ILs with version                              SPIR-V                                                           0x400000 (1.0.0)
  Built-in kernels with version                   (n/a)
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_extended_versioning
  Device Extensions with Version                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_khr_int64_base_atomics                                        0x400000 (1.0.0)
                                                  cl_khr_int64_extended_atomics                                    0x400000 (1.0.0)
                                                  cl_khr_fp64                                                      0x400000 (1.0.0)
                                                  cl_khr_extended_versioning                                       0x400000 (1.0.0)

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Clover
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [MESA]
  clCreateContext(NULL, ...) [default]            Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD Radeon RX 6650 XT (radeonsi, navi23, LLVM 17.0.6, DRM 3.57, 6.9.0-273-tkg-eevdf-llvm)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD Radeon RX 6650 XT (radeonsi, navi23, LLVM 17.0.6, DRM 3.57, 6.9.0-273-tkg-eevdf-llvm)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD Radeon RX 6650 XT (radeonsi, navi23, LLVM 17.0.6, DRM 3.57, 6.9.0-273-tkg-eevdf-llvm)

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.3.2
  ICD loader Profile                              OpenCL 3.0

@kbernhagen
Contributor

Yes. OpenCL 1.2 or later is required.
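The `Device Numeric Version` value in the clinfo output can be decoded to make this check explicit. A small sketch, assuming the packed layout used by `cl_khr_extended_versioning` (10 bits major, 10 bits minor, 12 bits patch); the `meets_minimum` helper is hypothetical and is not how the client itself performs the check:

```python
def decode_cl_version(packed: int) -> tuple:
    """Unpack an OpenCL numeric version into (major, minor, patch)."""
    major = packed >> 22
    minor = (packed >> 12) & 0x3FF
    patch = packed & 0xFFF
    return (major, minor, patch)


def meets_minimum(packed: int, minimum=(1, 2, 0)) -> bool:
    """True if the packed version is at least `minimum`."""
    return decode_cl_version(packed) >= minimum


# 0x401000 is the value Clover reports in the output above.
print(decode_cl_version(0x401000), meets_minimum(0x401000))  # (1, 1, 0) False
# 0xC00000 is what an OpenCL 3.0 device would report.
print(decode_cl_version(0xC00000), meets_minimum(0xC00000))  # (3, 0, 0) True
```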

@6rill2000

There has been a flag to enable FP64 with Rusticl since Mesa 23.2-devel, but I don't have the hardware to test it: https://www.phoronix.com/news/Rusticl-OpenCL-FP64-Doubles
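As a rough sketch of how that flag might be applied: Rusticl is configured through environment variables, and to my knowledge `RUSTICL_ENABLE` selects the Gallium driver while `RUSTICL_FEATURES=fp64` opts into double precision. Treat both names and the `radeonsi` value as assumptions to verify against the Mesa documentation for your version; the `rusticl_env` helper is hypothetical.

```python
import os
import shutil
import subprocess


def rusticl_env(driver: str, features=("fp64",), base=None) -> dict:
    """Return a copy of the environment with the Rusticl variables set."""
    env = dict(os.environ if base is None else base)
    env["RUSTICL_ENABLE"] = driver                    # assumed variable name
    env["RUSTICL_FEATURES"] = ",".join(features)      # assumed variable name
    return env


if __name__ == "__main__":
    env = rusticl_env("radeonsi")
    if shutil.which("clinfo"):
        # Run clinfo under the modified environment and look for FP64.
        out = subprocess.run(
            ["clinfo"], env=env, capture_output=True, text=True
        ).stdout
        print("cl_khr_fp64 reported:", "cl_khr_fp64" in out)
    else:
        print("clinfo not found on PATH")
```

The same environment would have to be in effect for the F@H client process as well, otherwise clinfo and the client would see different devices.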

@jcoffland
Member

Lack of OpenCL v1.2 and FP64 support are both problems that would prevent the GPU from getting an assignment on F@H. However, I believe the real reason the client marks the GPU as unsupported is because the driver does not support either the cl_khr_pci_bus_info extension or the CL_DEVICE_TOPOLOGY_AMD option to clGetDeviceInfo(). One of these is needed to get the PCI bus information which is necessary for GPU identification.
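The extension side of those requirements can be checked mechanically against a clinfo `Device Extensions` line. A minimal sketch, not the client's logic: the `missing_requirements` helper is hypothetical, it only looks at extension names, and it does not cover the `CL_DEVICE_TOPOLOGY_AMD` alternative or the OpenCL version requirement.

```python
REQUIRED = ("cl_khr_fp64", "cl_khr_pci_bus_info")

# Device Extensions line from the Clover clinfo output earlier in the thread.
CLOVER_EXTENSIONS = (
    "cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics "
    "cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics "
    "cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics "
    "cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_extended_versioning"
)


def missing_requirements(extensions: str, required=REQUIRED) -> list:
    """Return the required extension names absent from the given list."""
    present = set(extensions.split())
    return [name for name in required if name not in present]


print(missing_requirements(CLOVER_EXTENSIONS))  # ['cl_khr_pci_bus_info']
```

For Clover this reports FP64 as present but the PCI bus extension as missing, which is consistent with the explanation above.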

@obj-obj
Author

obj-obj commented May 24, 2024

There has been a flag to enable FP64 with Rusticl since Mesa 23.2-devel, but I don't have the hardware to test it: https://www.phoronix.com/news/Rusticl-OpenCL-FP64-Doubles

Will test this when I'm back home, and send the clinfo output for the rusticl driver. rusticl seems to be a more complete implementation anyway, so hopefully it has cl_khr_pci_bus_info implemented.

Edit: Tested it on the integrated graphics in my laptop, and FP64 support does show up. If you look at the clinfo output, cl_khr_pci_bus_info is also in here:

Number of platforms                               1
  Platform Name                                   rusticl
  Platform Vendor                                 Mesa/X.org
  Platform Version                                OpenCL 3.0 
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_byte_addressable_store cl_khr_create_command_queue cl_khr_expect_assume cl_khr_extended_versioning cl_khr_icd cl_khr_il_program cl_khr_spirv_no_integer_wrap_decoration
  Platform Extensions with Version                cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_create_command_queue                                      0x400000 (1.0.0)
                                                  cl_khr_expect_assume                                             0x400000 (1.0.0)
                                                  cl_khr_extended_versioning                                       0x400000 (1.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_il_program                                                0x400000 (1.0.0)
                                                  cl_khr_spirv_no_integer_wrap_decoration                          0x400000 (1.0.0)
  Platform Numeric Version                        0xc00000 (3.0.0)
  Platform Extensions function suffix             MESA
  Platform Host timer resolution                  1ns

  Platform Name                                   rusticl
Number of devices                                 1
  Device Name                                     Mesa Intel(R) Xe Graphics (TGL GT2)
  Device Vendor                                   Intel
  Device Vendor ID                                0x8086
  Device Version                                  OpenCL 3.0 
  Device UUID                                     8680499a-0100-0000-0002-000000000000
  Driver UUID                                     028e935d-7ef8-d8dd-8696-8d2aaf2058fd
  Valid Device LUID                               No
  Device LUID                                     0000-000000000000
  Device Node Mask                                0
  Device Numeric Version                          0xc00000 (3.0.0)
  Driver Version                                  24.0.7-arch1.3.1
  Device OpenCL C Version                         OpenCL C 1.2 
  Device OpenCL C Numeric Version                 0x402000 (1.2.0)
  Device OpenCL C all versions                    OpenCL C                                                         0xc00000 (3.0.0)
                                                  OpenCL C                                                         0x402000 (1.2.0)
                                                  OpenCL C                                                         0x401000 (1.1.0)
                                                  OpenCL C                                                         0x400000 (1.0.0)
  Device OpenCL C features                        __opencl_c_integer_dot_product_input_4x8bit_packed               0x800000 (2.0.0)
                                                  __opencl_c_integer_dot_product_input_4x8bit                      0x800000 (2.0.0)
                                                  __opencl_c_fp64                                                  0x400000 (1.0.0)
                                                  __opencl_c_int64                                                 0x400000 (1.0.0)
                                                  __opencl_c_images                                                0x400000 (1.0.0)
                                                  __opencl_c_3d_image_writes                                       0x400000 (1.0.0)
                                                  __opencl_c_subgroups                                             0x400000 (1.0.0)
  Latest conformance test passed                  v2022-04-22-00
  Device Type                                     GPU
  Device PCI bus info (KHR)                       PCI-E, 0000:00:02.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               5
  Max clock frequency                             400MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             1024
  Preferred work group size multiple (device)     32
  Preferred work group size multiple (kernel)     32
  Max sub-groups per work group                   64
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               Yes
  Address bits                                    64, Little-Endian
  Global memory size                              1073741824 (1024MiB)
  Error Correction support                        No
  Max memory allocation                           1073741824 (1024MiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 No
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Atomic memory capabilities                      relaxed, work-group scope
  Atomic fence capabilities                       relaxed, acquire/release, work-group scope
  Max size for global variable                    0
  Preferred total size of global vars             0
  Global Memory cache type                        None
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   1 bytes
    Pitch alignment for 2D image buffers          1 pixels
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                64
    Max number of read/write image args           0
  Pipe support                                    No
  Max number of pipe args                         0
  Max active pipe reservations                    0
  Max pipe packet size                            0
  Local memory type                               Global
  Local memory size                               65536 (64KiB)
  Max number of constant args                     16
  Max constant buffer size                        67108864 (64MiB)
  Generic address space support                   No
  Max size of kernel argument                     4096 (4KiB)
  Queue properties (on host)                      
    Out-of-order execution                        No
    Profiling                                     Yes
  Device enqueue capabilities                     (n/a)
  Queue properties (on device)                    
    Out-of-order execution                        No
    Profiling                                     No
    Preferred size                                0
    Max size                                      0
  Max queues on device                            0
  Max events on device                            0
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      53ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Non-uniform work-groups                       No
    Work-group collective functions               No
    Sub-group independent forward progress        No
    IL version                                    SPIR-V_1.0 SPIR-V_1.1 SPIR-V_1.2 SPIR-V_1.3 SPIR-V_1.4
    ILs with version                              SPIR-V                                                           0x400000 (1.0.0)
                                                  SPIR-V                                                           0x401000 (1.1.0)
                                                  SPIR-V                                                           0x402000 (1.2.0)
                                                  SPIR-V                                                           0x403000 (1.3.0)
                                                  SPIR-V                                                           0x404000 (1.4.0)
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                (n/a)
  Built-in kernels with version                   (n/a)
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_create_command_queue cl_khr_expect_assume cl_khr_extended_versioning cl_khr_icd cl_khr_il_program cl_khr_spirv_no_integer_wrap_decoration cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_integer_dot_product cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_gl_sharing cl_khr_image2d_from_buffer cl_khr_3d_image_writes cl_khr_pci_bus_info cl_khr_device_uuid cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative
  Device Extensions with Version                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_create_command_queue                                      0x400000 (1.0.0)
                                                  cl_khr_expect_assume                                             0x400000 (1.0.0)
                                                  cl_khr_extended_versioning                                       0x400000 (1.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_il_program                                                0x400000 (1.0.0)
                                                  cl_khr_spirv_no_integer_wrap_decoration                          0x400000 (1.0.0)
                                                  cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_integer_dot_product                                       0x800000 (2.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_khr_fp64                                                      0x400000 (1.0.0)
                                                  cl_khr_gl_sharing                                                0x400000 (1.0.0)
                                                  cl_khr_image2d_from_buffer                                       0x400000 (1.0.0)
                                                  cl_khr_3d_image_writes                                           0x400000 (1.0.0)
                                                  cl_khr_pci_bus_info                                              0x400000 (1.0.0)
                                                  cl_khr_device_uuid                                               0x400000 (1.0.0)
                                                  cl_khr_subgroup_shuffle                                          0x400000 (1.0.0)
                                                  cl_khr_subgroup_shuffle_relative                                 0x400000 (1.0.0)

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  rusticl
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [MESA]
  clCreateContext(NULL, ...) [default]            Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 rusticl
    Device Name                                   Mesa Intel(R) Xe Graphics (TGL GT2)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 rusticl
    Device Name                                   Mesa Intel(R) Xe Graphics (TGL GT2)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 rusticl
    Device Name                                   Mesa Intel(R) Xe Graphics (TGL GT2)

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.3.2
  ICD loader Profile                              OpenCL 3.0
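The `Device PCI bus info (KHR)` line in the output above is the piece needed for matching a device to a PCI slot. A small sketch of pulling it out of clinfo text; `parse_pci_bus_info` is a hypothetical helper and assumes the `domain:bus:device.function` hex format exactly as clinfo prints it here:

```python
import re
from typing import NamedTuple, Optional


class PCIBusInfo(NamedTuple):
    domain: int
    bus: int
    device: int
    function: int


_PCI_RE = re.compile(
    r"Device PCI bus info \(KHR\)\s+PCI-E,\s*"
    r"([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])"
)


def parse_pci_bus_info(clinfo_text: str) -> Optional[PCIBusInfo]:
    """Return the first PCI address found in clinfo output, or None."""
    m = _PCI_RE.search(clinfo_text)
    if m is None:
        return None
    return PCIBusInfo(*(int(part, 16) for part in m.groups()))


sample = "  Device PCI bus info (KHR)                       PCI-E, 0000:00:02.0\n"
print(parse_pci_bus_info(sample))
print(parse_pci_bus_info("  Device Type                                     GPU\n"))
```

A `None` result corresponds to the Clover case, where clinfo prints no such line at all.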

@obj-obj
Author

obj-obj commented May 24, 2024

The GPU is still disabled in F@H. New clinfo output:

Number of platforms                               1
  Platform Name                                   rusticl
  Platform Vendor                                 Mesa/X.org
  Platform Version                                OpenCL 3.0 
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_byte_addressable_store cl_khr_create_command_queue cl_khr_expect_assume cl_khr_extended_versioning cl_khr_icd cl_khr_il_program cl_khr_spirv_no_integer_wrap_decoration
  Platform Extensions with Version                cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_create_command_queue                                      0x400000 (1.0.0)
                                                  cl_khr_expect_assume                                             0x400000 (1.0.0)
                                                  cl_khr_extended_versioning                                       0x400000 (1.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_il_program                                                0x400000 (1.0.0)
                                                  cl_khr_spirv_no_integer_wrap_decoration                          0x400000 (1.0.0)
  Platform Numeric Version                        0xc00000 (3.0.0)
  Platform Extensions function suffix             MESA
  Platform Host timer resolution                  1ns

  Platform Name                                   rusticl
Number of devices                                 1
  Device Name                                     AMD Radeon RX 6650 XT (radeonsi, navi23, LLVM 17.0.6, DRM 3.57, 6.9.0-273-tkg-eevdf-llvm)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 3.0 
  Device UUID                                     00000000-2d00-0000-0000-000000000000
  Driver UUID                                     414d442d-4d45-5341-2d44-525600000000
  Valid Device LUID                               No
  Device LUID                                     0000-000000000000
  Device Node Mask                                0
  Device Numeric Version                          0xc00000 (3.0.0)
  Driver Version                                  24.0.7-arch1.3.1
  Device OpenCL C Version                         OpenCL C 1.2 
  Device OpenCL C Numeric Version                 0x402000 (1.2.0)
  Device OpenCL C all versions                    OpenCL C                                                         0xc00000 (3.0.0)
                                                  OpenCL C                                                         0x402000 (1.2.0)
                                                  OpenCL C                                                         0x401000 (1.1.0)
                                                  OpenCL C                                                         0x400000 (1.0.0)
  Device OpenCL C features                        __opencl_c_integer_dot_product_input_4x8bit_packed               0x800000 (2.0.0)
                                                  __opencl_c_integer_dot_product_input_4x8bit                      0x800000 (2.0.0)
                                                  __opencl_c_fp64                                                  0x400000 (1.0.0)
                                                  __opencl_c_int64                                                 0x400000 (1.0.0)
                                                  __opencl_c_images                                                0x400000 (1.0.0)
                                                  __opencl_c_3d_image_writes                                       0x400000 (1.0.0)
                                                  __opencl_c_subgroups                                             0x400000 (1.0.0)
  Latest conformance test passed                  v0000-01-01-00
  Device Type                                     GPU
  Device PCI bus info (KHR)                       PCI-E, 0000:2d:00.0
  Device Profile                                  EMBEDDED_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               32
  Max clock frequency                             2725MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             1024
  Preferred work group size multiple (device)     64
  Preferred work group size multiple (kernel)     64
  Max sub-groups per work group                   32
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              8589934592 (8GiB)
  Error Correction support                        No
  Max memory allocation                           2147483648 (2GiB)
  Unified memory for Host and Device              No
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 No
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Atomic memory capabilities                      relaxed, work-group scope
  Atomic fence capabilities                       relaxed, acquire/release, work-group scope
  Max size for global variable                    0
  Preferred total size of global vars             0
  Global Memory cache type                        None
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            268435455 pixels
    Max 1D or 2D image array size                 8192 images
    Base address alignment for 2D image buffers   0 bytes
    Pitch alignment for 2D image buffers          0 pixels
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             8192x8192x8192 pixels
    Max number of read image args                 32
    Max number of write image args                16
    Max number of read/write image args           0
  Pipe support                                    No
  Max number of pipe args                         0
  Max active pipe reservations                    0
  Max pipe packet size                            0
  Local memory type                               Global
  Local memory size                               65536 (64KiB)
  Max number of constant args                     16
  Max constant buffer size                        67108864 (64MiB)
  Generic address space support                   No
  Max size of kernel argument                     4096 (4KiB)
  Queue properties (on host)                      
    Out-of-order execution                        No
    Profiling                                     Yes
  Device enqueue capabilities                     (n/a)
  Queue properties (on device)                    
    Out-of-order execution                        No
    Profiling                                     No
    Preferred size                                0
    Max size                                      0
  Max queues on device                            0
  Max events on device                            0
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      10ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Non-uniform work-groups                       No
    Work-group collective functions               No
    Sub-group independent forward progress        No
    IL version                                    SPIR-V_1.0 SPIR-V_1.1 SPIR-V_1.2 SPIR-V_1.3 SPIR-V_1.4
    ILs with version                              SPIR-V                                                           0x400000 (1.0.0)
                                                  SPIR-V                                                           0x401000 (1.1.0)
                                                  SPIR-V                                                           0x402000 (1.2.0)
                                                  SPIR-V                                                           0x403000 (1.3.0)
                                                  SPIR-V                                                           0x404000 (1.4.0)
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                (n/a)
  Built-in kernels with version                   (n/a)
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_create_command_queue cl_khr_expect_assume cl_khr_extended_versioning cl_khr_icd cl_khr_il_program cl_khr_spirv_no_integer_wrap_decoration cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_integer_dot_product cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_gl_sharing cles_khr_int64 cl_khr_3d_image_writes cl_khr_pci_bus_info cl_khr_device_uuid cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative
  Device Extensions with Version                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_create_command_queue                                      0x400000 (1.0.0)
                                                  cl_khr_expect_assume                                             0x400000 (1.0.0)
                                                  cl_khr_extended_versioning                                       0x400000 (1.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_il_program                                                0x400000 (1.0.0)
                                                  cl_khr_spirv_no_integer_wrap_decoration                          0x400000 (1.0.0)
                                                  cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_integer_dot_product                                       0x800000 (2.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_khr_fp64                                                      0x400000 (1.0.0)
                                                  cl_khr_gl_sharing                                                0x400000 (1.0.0)
                                                  cles_khr_int64                                                   0x400000 (1.0.0)
                                                  cl_khr_3d_image_writes                                           0x400000 (1.0.0)
                                                  cl_khr_pci_bus_info                                              0x400000 (1.0.0)
                                                  cl_khr_device_uuid                                               0x400000 (1.0.0)
                                                  cl_khr_subgroup_shuffle                                          0x400000 (1.0.0)
                                                  cl_khr_subgroup_shuffle_relative                                 0x400000 (1.0.0)

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  rusticl
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [MESA]
  clCreateContext(NULL, ...) [default]            Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 rusticl
    Device Name                                   AMD Radeon RX 6650 XT (radeonsi, navi23, LLVM 17.0.6, DRM 3.57, 6.9.0-273-tkg-eevdf-llvm)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 rusticl
    Device Name                                   AMD Radeon RX 6650 XT (radeonsi, navi23, LLVM 17.0.6, DRM 3.57, 6.9.0-273-tkg-eevdf-llvm)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 rusticl
    Device Name                                   AMD Radeon RX 6650 XT (radeonsi, navi23, LLVM 17.0.6, DRM 3.57, 6.9.0-273-tkg-eevdf-llvm)

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.3.2
  ICD loader Profile                              OpenCL 3.0
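
For anyone reading the hex values in the dump above: the numeric versions (for example `0xc00000 (3.0.0)`) use the packed `cl_version` layout from the OpenCL 3.0 headers, which is 10 bits major, 10 bits minor, and 12 bits patch. A minimal sketch of the decoding follows; the function name is made up for illustration.

```python
def decode_cl_version(value: int) -> tuple[int, int, int]:
    """Split a packed cl_version into (major, minor, patch).

    Layout: top 10 bits major, next 10 bits minor, low 12 bits patch.
    """
    major = value >> 22
    minor = (value >> 12) & 0x3FF
    patch = value & 0xFFF
    return major, minor, patch


# Values copied from the clinfo output above.
for raw in (0xC00000, 0x402000, 0x800000):
    print(hex(raw), decode_cl_version(raw))
```

This reproduces the dotted versions clinfo prints next to each hex value, so the device reports OpenCL 3.0 while the OpenCL C version is 1.2.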

@jcoffland
Member

Did you restart fah-client? The above OpenCL profile should work. It supports both OpenCL version 3.0 and the cl_khr_pci_bus_info extension. Please post the top of the client log.
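
The two properties named here (OpenCL 3.0 and the `cl_khr_pci_bus_info` extension) can be pulled out of saved clinfo text with a short script. This is only a sketch: the function name, the returned keys, and the trimmed `SAMPLE` text are illustrative, and it is an assumption that these are the relevant checks, since the client's actual detection code is not shown in this thread.

```python
import re

# Trimmed excerpt of the clinfo output posted above.
SAMPLE = """\
  Device Name                                     AMD Radeon RX 6650 XT
  Device Version                                  OpenCL 3.0 
  Device PCI bus info (KHR)                       PCI-E, 0000:2d:00.0
  Device Extensions                               cl_khr_icd cl_khr_fp64 cl_khr_pci_bus_info cl_khr_device_uuid
"""


def summarize(clinfo_text: str) -> dict:
    """Extract the OpenCL version, PCI address, and extension flags."""
    version = re.search(r"^\s*Device Version\s+OpenCL (\d+)\.(\d+)", clinfo_text, re.M)
    # Two or more spaces after the label skips "Device Extensions with Version".
    exts = re.search(r"^\s*Device Extensions\s{2,}(.+)$", clinfo_text, re.M)
    pci = re.search(r"^\s*Device PCI bus info \(KHR\)\s+PCI-E, (\S+)", clinfo_text, re.M)
    ext_set = set(exts.group(1).split()) if exts else set()
    return {
        "opencl_version": (int(version.group(1)), int(version.group(2))) if version else None,
        "pci_address": pci.group(1) if pci else None,
        "has_pci_bus_info": "cl_khr_pci_bus_info" in ext_set,
        "has_fp64": "cl_khr_fp64" in ext_set,
    }


info = summarize(SAMPLE)
print(info)
```

To run it against a real system, save the output of `clinfo` to a file and pass the file contents to `summarize`.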

@jcoffland added the bug label (Something isn't working) on May 25, 2024
@obj-obj
Author

obj-obj commented May 25, 2024

> Did you restart fah-client? The above OpenCL profile should work. It supports both OpenCL version 3.0 and the cl_khr_pci_bus_info extension. Please post the top of the client log.

Yeah, I did. I’m not at home right now (will be back in 2 days), and I’ll post it then.

@obj-obj
Author

obj-obj commented May 30, 2024

*********************** Log Started 2024-05-30T05:02:59Z ***********************
05:02:59:I1:*********************** Folding@home Client ***********************
05:02:59:I1:    Version: 8.3.5
05:02:59:I1:     Author: Joseph Coffland <[email protected]>
05:02:59:I1:        Org: foldingathome.org
05:02:59:I1:  Copyright: 2023-2024, foldingathome.org
05:02:59:I1:   Homepage: https://foldingathome.org/
05:02:59:I1:    License: GPL-3.0-or-later
05:02:59:I1:        URL: https://beta.foldingathome.org/
05:02:59:I1:       Date: Feb 14 2024
05:02:59:I1:       Time: 14:09:03
05:02:59:I1:   Revision: 652c05f093c7d9542ec3d2effdfa693c80a77e8d
05:02:59:I1:     Branch: master
05:02:59:I1:   Compiler: GNU 8.3.0
05:02:59:I1:    Options: -faligned-new -std=c++17 -fsigned-char -ffunction-sections
05:02:59:I1:             -fdata-sections -O3 -funroll-loops -fno-pie
05:02:59:I1:   Platform: linux 4.19.0-26-cloud-amd64
05:02:59:I1:       Bits: 64
05:02:59:I1:       Mode: Release
05:02:59:I1:       Args: --log=/var/log/fah-client/log.txt
05:02:59:I1:             --log-rotate-dir=/var/log/fah-client/
05:02:59:I1:****************************** CBang ******************************
05:02:59:I1:    Version: 1.7.2
05:02:59:I1:     Author: Joseph Coffland <[email protected]>
05:02:59:I1:        Org: Cauldron Development LLC
05:02:59:I1:  Copyright: Cauldron Development LLC, 2003-2024
05:02:59:I1:   Homepage: https://cauldrondevelopment.com/
05:02:59:I1:    License: LGPL-2.1-or-later
05:02:59:I1:       Date: Feb 14 2024
05:02:59:I1:       Time: 13:50:56
05:02:59:I1:   Revision: 8f6145ef68f2c46b23df4487130e8f7e4fc8c757
05:02:59:I1:     Branch: master
05:02:59:I1:   Compiler: GNU 8.3.0
05:02:59:I1:    Options: -faligned-new -std=c++17 -fsigned-char -ffunction-sections
05:02:59:I1:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
05:02:59:I1:   Platform: linux 4.19.0-26-cloud-amd64
05:02:59:I1:       Bits: 64
05:02:59:I1:       Mode: Release
05:02:59:I1:***************************** System ******************************
05:02:59:I1:        CPU: AMD Ryzen 7 3700X 8-Core Processor
05:02:59:I1:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
05:02:59:I1:       CPUs: 16
05:02:59:I1:     Memory: 78.47GiB
05:02:59:I1:Free Memory: 31.98GiB
05:02:59:I1: OS Version: 6.9
05:02:59:I1:Has Battery: false
05:02:59:I1: On Battery: false
05:02:59:I1:   Hostname: ArchDesktop
05:02:59:I1: UTC Offset: -7
05:02:59:I1:        PID: 222509
05:02:59:I1:        CWD: /var/lib/fah-client
05:02:59:I1:       Exec: /usr/bin/fah-client
05:02:59:I1:*******************************************************************
05:02:59:I2:<config/>
05:02:59:I1:Opening Database
05:02:59:I1:F@H ID = cyx9boeePHKbWSJNUBrGnCtk9932D556GujBquMbKvw
05:02:59:I3:Loading default group
05:02:59:I3:Loading default resource group
05:02:59:I1:Listening for HTTP on 127.0.0.1:7396
05:02:59:I3:Loaded 0 wus.
05:02:59:E :CUDA not supported: Failed to open dynamic library 'libcuda.so': libcuda.so: cannot open shared object file: No such file or directory
05:02:59:I3:gpus = {
05:02:59:I3:  "gpu:45:00:00": {"vendor": 4098, "device": 29679, "type": "amd", "supported": false, "description": "Navi 23 [Radeon RX 6650XT]"}
05:02:59:I3:}
05:02:59:I3:Connecting to node1.foldingathome.org:443
05:02:59:I1:OUT1:> GET wss://node1.foldingathome.org/ws/client HTTP/1.1
05:02:59:I1:OUT1:< HTTP/1.1 101 HTTP_SWITCHING_PROTOCOLS
05:02:59:I1:Logging into node account
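
A note on reading the `gpus` entry in this log: the numbers are decimal. Vendor 4098 is 0x1002 (the same vendor ID clinfo reports), device 29679 is 0x73ef, and the `45` in the key `gpu:45:00:00` is 0x2d, which lines up with the PCI address `0000:2d:00.0` from clinfo. The sketch below does that conversion. The zero-padded decimal `gpu:BB:DD:FF` layout is inferred from this single log line, so treat it as an assumption, and the function name is made up.

```python
def pci_to_gpu_key(pci_address: str) -> str:
    """Convert 'domain:bus:device.function' (hex) to a 'gpu:BB:DD:FF' key.

    The decimal, zero-padded key format is an assumption based on the
    one entry in the log above, not on documented client behavior.
    """
    _domain, bus, dev_fn = pci_address.split(":")
    device, function = dev_fn.split(".")
    return "gpu:{:02d}:{:02d}:{:02d}".format(
        int(bus, 16), int(device, 16), int(function, 16)
    )


# Values taken from the clinfo output and the client log in this thread.
print(pci_to_gpu_key("0000:2d:00.0"))  # gpu:45:00:00
print(f"{4098:#06x}", f"{29679:#06x}")  # 0x1002 0x73ef
```

So the client and OpenCL appear to agree on which PCI slot the card is in, even though the log still shows `"supported": false`.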

@jcoffland
Member

Is this still not supported? I see Navi 23 [Radeon RX 6650XT] as supported in my gpus.json. I think as long as you have at least OpenCL support it should work. Please try the latest alpha v8.4.8. https://foldingathome.org/alpha/

@muziqaz
Contributor

muziqaz commented Nov 20, 2024

I wonder if that is really sorted now. I saw similar issues with other distributed computing projects, where Mesa drivers are highly problematic.

@jcoffland removed the bug label (Something isn't working) on Nov 21, 2024
@kbernhagen
Contributor

I believe the Mesa drivers still have emulated FP64

@Davilink

Davilink commented Dec 22, 2024

OK, I think I made it work. First, I added this to /lib/systemd/system/fah-client.service:

Environment="RUSTICL_ENABLE=radeonsi"
Environment="RUSTICL_FEATURES=fp16,fp64"
Environment="OCL_ICD_VENDORS=/etc/OpenCL/vendors/rusticl.icd"
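
Side note for anyone repeating this: files under /lib/systemd/system belong to the package and can be overwritten on upgrade, so the same variables are usually placed in a drop-in instead. Below is a small sketch that renders such a drop-in. The helper name is made up, and the target path /etc/systemd/system/fah-client.service.d/override.conf is the conventional location rather than something confirmed in this thread.

```python
def render_dropin(env: dict[str, str]) -> str:
    """Render a systemd drop-in that sets environment variables for a service."""
    lines = ["[Service]"]
    for name, value in env.items():
        lines.append(f'Environment="{name}={value}"')
    return "\n".join(lines) + "\n"


# Same variables as in the comment above.
dropin = render_dropin({
    "RUSTICL_ENABLE": "radeonsi",
    "RUSTICL_FEATURES": "fp16,fp64",
    "OCL_ICD_VENDORS": "/etc/OpenCL/vendors/rusticl.icd",
})
print(dropin, end="")
```

After saving the printed text as the drop-in file, `systemctl daemon-reload` followed by `systemctl restart fah-client` should make the service pick it up.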

image

But then I had to compile the client myself, because I had to comment out some parts of the code. I don't know why that was needed (maybe this could help the investigation); I'm neither a C++ developer nor an OpenCL developer.

In the cbang project:

[three images attached here are not reproduced]

@muziqaz
Contributor

muziqaz commented Dec 22, 2024

Any chance of performance comparison numbers between AMD's OpenCL and Mesa? If there is no difference, this would be great, though AMD's OpenCL on Linux is currently quite broken performance-wise :(

@Davilink

Sadly, rusticl doesn't seem to be stable enough: my computer rebooted after 20 minutes (around 2.5~3.0%). I didn't have this problem when I was using the amdgpu OpenCL driver.

@muziqaz
Contributor

muziqaz commented Dec 22, 2024

Is there a specific reason this line doesn't mention fp32?
Environment="RUSTICL_FEATURES=fp16,fp64"
FAH does not use fp16.

@Davilink

> Is there a specific reason this line doesn't mention fp32?
> Environment="RUSTICL_FEATURES=fp16,fp64"
> FAH is not using fp16

There is no fp32 feature flag in the list:
image
https://docs.mesa3d.org/envvars.html#envvar-RUSTICL_ENABLE

So I changed Environment="RUSTICL_FEATURES=fp16,fp64" to Environment="RUSTICL_FEATURES=fp64". At first it seemed to be better (I reached ~1 hour, so ~6%), but sadly I still got the same outcome: the PC rebooted. I wonder if my graphics card is hitting a heat limit or if it's something else.

@muziqaz
Contributor

muziqaz commented Dec 23, 2024

Heat would have been an issue with the AMD drivers as well.
It is possible that Mesa compute triggers something within the system (the GPU) that crashes the PC.

@obj-obj
Author

obj-obj commented Dec 24, 2024

> > Is there a specific reason this line doesn't mention fp32? Environment="RUSTICL_FEATURES=fp16,fp64" FAH is not using fp16
>
> There is no fp32 feature flag in the list: https://docs.mesa3d.org/envvars.html#envvar-RUSTICL_ENABLE
>
> So I changed Environment="RUSTICL_FEATURES=fp16,fp64" to Environment="RUSTICL_FEATURES=fp64". At first it seemed to be better (I reached ~1 hour, so ~6%), but sadly I still got the same outcome: the PC rebooted. I wonder if my graphics card is hitting a heat limit or if it's something else.

I think fp32 might be enabled by default
