Add GB20x Support to NvAPI_GPU_GetArchInfo (follow up) #240

Merged: 6 commits, Jan 10, 2025
1 change: 1 addition & 0 deletions README.md
@@ -86,6 +86,7 @@ The following environment variables tweak DXVK-NVAPI's runtime behavior:
- `TU100` (Turing)
- `GA100` (Ampere)
- `AD100` (Ada)
- `GB200` (Blackwell)
- `DXVK_NVAPI_LOG_LEVEL` set to `info` prints log statements. The default behavior omits any logging. Please file an issue if using log level `info` creates log spam. Setting the level to `trace` logs every entry point's enter and exit, which has a severe effect on performance. All other log levels are interpreted as `none`.
- `DXVK_NVAPI_LOG_PATH` enables file logging in addition to console output and sets the path where the log file `nvapi.log`/`nvapi64.log`/`nvofapi64.log` is written. Log statements are appended to an existing file. Please remove this file once in a while to prevent excessive growth. This requires `DXVK_NVAPI_LOG_LEVEL` set to `info` or `trace`.

2 changes: 1 addition & 1 deletion external/Vulkan-Headers
Submodule Vulkan-Headers updated 38 files
+46 −6 .github/workflows/ci.yml
+4 −4 .reuse/dep5
+2 −0 BUILD.gn
+37 −9 CMakeLists.txt
+108 −0 Makefile.release
+392 −0 include/vk_video/vulkan_video_codec_av1std.h
+109 −0 include/vk_video/vulkan_video_codec_av1std_decode.h
+252 −22 include/vulkan/vulkan.cppm
+1,278 −651 include/vulkan/vulkan.hpp
+748 −71 include/vulkan/vulkan_core.h
+1,124 −960 include/vulkan/vulkan_enums.hpp
+490 −170 include/vulkan/vulkan_extension_inspection.hpp
+9 −9 include/vulkan/vulkan_format_traits.hpp
+4,413 −1,256 include/vulkan/vulkan_funcs.hpp
+1,996 −449 include/vulkan/vulkan_handles.hpp
+836 −126 include/vulkan/vulkan_hash.hpp
+4 −3 include/vulkan/vulkan_hpp_macros.hpp
+6 −6 include/vulkan/vulkan_metal.h
+1,304 −676 include/vulkan/vulkan_raii.hpp
+55 −2 include/vulkan/vulkan_shared.hpp
+483 −44 include/vulkan/vulkan_static_assertions.hpp
+18,572 −13,433 include/vulkan/vulkan_structs.hpp
+177 −54 include/vulkan/vulkan_to_string.hpp
+1,015 −1 include/vulkan/vulkan_video.hpp
+1 −1 registry/apiconventions.py
+14 −9 registry/cgenerator.py
+46 −11 registry/generator.py
+5 −4 registry/parse_dependency.py
+153 −62 registry/profiles/VP_KHR_roadmap.json
+45 −4 registry/reg.py
+30 −3 registry/spec_tools/conventions.py
+1 −0 registry/spec_tools/util.py
+2 −2 registry/stripAPI.py
+6,411 −6,866 registry/validusage.json
+485 −0 registry/video.xml
+1,781 −593 registry/vk.xml
+4 −3 registry/vkconventions.py
+1 −1 tests/CMakeLists.txt
21 changes: 21 additions & 0 deletions src/nvapi/nvapi_adapter.cpp
@@ -66,6 +66,18 @@ namespace dxvk {
deviceProperties2.pNext = &m_vkFragmentShadingRateProperties;
}

if (IsVkDeviceExtensionSupported(VK_KHR_COMPUTE_SHADER_DERIVATIVES_EXTENSION_NAME)) {
m_vkComputeShaderDerivativesProperties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_COMPUTE_SHADER_DERIVATIVES_PROPERTIES_KHR;
m_vkComputeShaderDerivativesProperties.pNext = deviceProperties2.pNext;
deviceProperties2.pNext = &m_vkComputeShaderDerivativesProperties;
}

if (IsVkDeviceExtensionSupported(VK_NV_CUDA_KERNEL_LAUNCH_EXTENSION_NAME)) {
m_vkCudaKernelLaunchProperties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_CUDA_KERNEL_LAUNCH_PROPERTIES_NV;
m_vkCudaKernelLaunchProperties.pNext = deviceProperties2.pNext;
deviceProperties2.pNext = &m_vkCudaKernelLaunchProperties;
}

m_vkIdProperties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES;
m_vkIdProperties.pNext = deviceProperties2.pNext;
deviceProperties2.pNext = &m_vkIdProperties;
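The hunk above links each optional property structure in front of whatever is already attached to `deviceProperties2`. A minimal sketch of that prepend pattern follows, assuming hypothetical stand-in types (`ChainHeader`, `Properties2`, `DerivativesProps`, `CudaLaunchProps`) and made-up `sType` values instead of the real Vulkan structures:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the sType/pNext header that extensible Vulkan
// structures start with; real code uses the types from vulkan_core.h.
struct ChainHeader {
    uint32_t sType;
    ChainHeader* pNext;
};

// Hypothetical property structures; the sType values 1..3 are arbitrary.
struct Properties2 {
    ChainHeader header{1, nullptr};
};
struct DerivativesProps {
    ChainHeader header{2, nullptr};
    bool meshAndTaskShaderDerivatives = false;
};
struct CudaLaunchProps {
    ChainHeader header{3, nullptr};
    uint32_t computeCapabilityMajor = 0;
    uint32_t computeCapabilityMinor = 0;
};

// Prepend `node` to the chain hanging off `head`, mirroring the two
// assignments used in the diff: the node inherits the old chain, then the
// head points at the node.
inline void Prepend(ChainHeader& head, ChainHeader& node) {
    node.pNext = head.pNext;
    head.pNext = &node;
}

// Collect the sType of every structure reachable from `head`, in chain order.
inline std::vector<uint32_t> ChainTypes(const ChainHeader& head) {
    std::vector<uint32_t> types;
    for (const ChainHeader* p = head.pNext; p != nullptr; p = p->pNext)
        types.push_back(p->sType);
    return types;
}
```

Because every structure is prepended, the one linked last ends up first in the chain. A structure that is never linked (extension not supported) is simply not visited and keeps its zero-initialized members.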
@@ -232,6 +244,11 @@ namespace dxvk {
return NV_GPU_ARCHITECTURE_GP100;
}

// GB20x supports mesh+task derivatives
if (IsVkDeviceExtensionSupported(VK_KHR_COMPUTE_SHADER_DERIVATIVES_EXTENSION_NAME)
&& m_vkComputeShaderDerivativesProperties.meshAndTaskShaderDerivatives)
return NV_GPU_ARCHITECTURE_GB200;

// In lieu of a more idiomatic Vulkan-based solution, check the PCI
// DeviceID to determine if an Ada card is present
if (m_vkProperties.deviceID >= 0x2600)
@@ -276,6 +293,10 @@ namespace dxvk {
return NV_GPU_ARCHITECTURE_GK100;
}

std::pair<uint32_t, uint32_t> NvapiAdapter::GetComputeCapability() const {
return std::make_pair(m_vkCudaKernelLaunchProperties.computeCapabilityMajor, m_vkCudaKernelLaunchProperties.computeCapabilityMinor);
}

bool NvapiAdapter::IsVkDeviceExtensionSupported(const std::string& name) const {
return m_vkExtensions.find(name) != m_vkExtensions.end();
}
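The ordering in `GetArchitectureId` matters: the GB20x check runs before the PCI device ID threshold, presumably because a Blackwell card would otherwise also satisfy the `>= 0x2600` Ada check. A rough sketch of that ordering, assuming a hypothetical `DeviceInfo` struct and a reduced `Arch` enum that are not part of the PR:

```cpp
#include <cstdint>
#include <set>
#include <string>

// Hypothetical, reduced set of results; the real code returns
// NV_GPU_ARCHITECTURE_ID values and handles older generations too.
enum class Arch { Older, AD100, GB200 };

// Hypothetical snapshot of what the adapter queried from Vulkan.
struct DeviceInfo {
    std::set<std::string> extensions;
    bool meshAndTaskShaderDerivatives = false;
    uint32_t deviceId = 0;
};

inline Arch DetectArch(const DeviceInfo& device) {
    // The property is only meaningful when the extension is supported,
    // so both conditions are checked, as in the diff.
    if (device.extensions.count("VK_KHR_compute_shader_derivatives") != 0
        && device.meshAndTaskShaderDerivatives)
        return Arch::GB200;

    // Fallback from the existing code: PCI device ID threshold for Ada.
    if (device.deviceId >= 0x2600)
        return Arch::AD100;

    return Arch::Older;
}
```

With this ordering, a device that reports the extension but not `meshAndTaskShaderDerivatives` still falls through to the device ID check.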
3 changes: 3 additions & 0 deletions src/nvapi/nvapi_adapter.h
@@ -38,6 +38,7 @@ namespace dxvk {
[[nodiscard]] uint32_t GetBoardId() const;
[[nodiscard]] std::optional<LUID> GetLuid() const;
[[nodiscard]] NV_GPU_ARCHITECTURE_ID GetArchitectureId() const;
[[nodiscard]] std::pair<uint32_t, uint32_t> GetComputeCapability() const;
[[nodiscard]] bool IsVkDeviceExtensionSupported(const std::string& name) const;
[[nodiscard]] const MemoryInfo& GetMemoryInfo() const;
[[nodiscard]] MemoryBudgetInfo GetCurrentMemoryBudgetInfo() const;
@@ -67,6 +68,8 @@ namespace dxvk {
VkPhysicalDevicePCIBusInfoPropertiesEXT m_vkPciBusProperties{};
VkPhysicalDeviceDriverPropertiesKHR m_vkDriverProperties{};
VkPhysicalDeviceFragmentShadingRatePropertiesKHR m_vkFragmentShadingRateProperties{};
VkPhysicalDeviceComputeShaderDerivativesPropertiesKHR m_vkComputeShaderDerivativesProperties{};
VkPhysicalDeviceCudaKernelLaunchPropertiesNV m_vkCudaKernelLaunchProperties{};
uint32_t m_vkDriverVersion{};
uint32_t m_dxgiVendorId{};
uint32_t m_dxgiDeviceId{};
44 changes: 8 additions & 36 deletions src/nvapi_d3d12.cpp
@@ -202,51 +202,23 @@ extern "C" {
if (luid.has_value())
adapter = nvapiAdapterRegistry->FindAdapter(luid.value());

- if (adapter == nullptr || (!adapter->HasNvProprietaryDriver() && !adapter->HasNvkDriver()))
+ if (adapter == nullptr)
return Ok(str::format(n, " (sm_0)"));

+ // From https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/ and https://en.wikipedia.org/wiki/CUDA#GPUs_supported
+ // Note: One might think that SM here is D3D12 Shader Model, in fact it is the "Streaming Multiprocessor" architecture version
+ // Values are valid for Turing and newer only, due to VK_NV_cuda_kernel_launch not being supported by earlier generations
+ auto computeCapability = adapter->GetComputeCapability();
+ pGraphicsCaps->majorSMVersion = computeCapability.first;
+ pGraphicsCaps->minorSMVersion = computeCapability.second;
+
// Might be related to VK_NV_scissor_exclusive (which isn't used by VKD3D-Proton), but unknown in the context of D3D12
// pGraphicsCaps->bExclusiveScissorRectsSupported = adapter->IsVkDeviceExtensionSupported(VK_NV_SCISSOR_EXCLUSIVE_EXTENSION_NAME);

// Note that adapter->IsVkDeviceExtensionSupported returns the extensions supported by DXVK, not by VKD3D-Proton,
// so we might be wrong here in case of an old VKD3D-Proton version or when VKD3D_DISABLE_EXTENSIONS is in use
pGraphicsCaps->bVariablePixelRateShadingSupported = adapter->IsVkDeviceExtensionSupported(VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME);

- // From https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
- // Note: One might think that SM here is D3D12 Shader Model, in fact it is the "Streaming Multiprocessor" architecture name
- switch (adapter->GetArchitectureId()) {
- case NV_GPU_ARCHITECTURE_AD100:
- pGraphicsCaps->majorSMVersion = 8;
- pGraphicsCaps->minorSMVersion = 9;
- break;
- case NV_GPU_ARCHITECTURE_GA100:
- pGraphicsCaps->majorSMVersion = 8;
- pGraphicsCaps->minorSMVersion = 6; // Take the risk that no one uses an NVIDIA A100 with this implementation
- break;
- case NV_GPU_ARCHITECTURE_TU100:
- pGraphicsCaps->majorSMVersion = 7;
- pGraphicsCaps->minorSMVersion = 5;
- break;
- case NV_GPU_ARCHITECTURE_GV100:
- pGraphicsCaps->majorSMVersion = 7;
- pGraphicsCaps->minorSMVersion = 0;
- break;
- case NV_GPU_ARCHITECTURE_GP100:
- pGraphicsCaps->majorSMVersion = 6;
- pGraphicsCaps->minorSMVersion = 0;
- break;
- case NV_GPU_ARCHITECTURE_GM200:
- pGraphicsCaps->majorSMVersion = 5;
- pGraphicsCaps->minorSMVersion = 2;
- break;
- case NV_GPU_ARCHITECTURE_GM000:
- pGraphicsCaps->majorSMVersion = 5;
- pGraphicsCaps->minorSMVersion = 0;
- break;
- default:
- break;
- }
-
return Ok(str::format(n, " (sm_", pGraphicsCaps->majorSMVersion, pGraphicsCaps->minorSMVersion, ")"));
}
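After this change the SM version comes from the queried compute capability rather than from a per-architecture table. The sketch below shows how such a major/minor pair turns into the `sm_XY` suffix that ends up in the log line. `FormatSm` is a hypothetical helper using `std::ostringstream`, not the project's `str::format`:

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: concatenates major and minor without a separator,
// the same shape as the " (sm_", major, minor, ")" arguments in the diff.
inline std::string FormatSm(std::pair<uint32_t, uint32_t> computeCapability) {
    std::ostringstream out;
    out << "sm_" << computeCapability.first << computeCapability.second;
    return out.str();
}
```

A zero-initialized property structure, as left behind when `VK_NV_cuda_kernel_launch` is not supported, would format as `sm_00` under this sketch.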

3 changes: 3 additions & 0 deletions src/nvapi_gpu.cpp
@@ -579,6 +579,9 @@ extern "C" {
// usage.
NV_GPU_ARCH_IMPLEMENTATION_ID implementationId;
switch (architectureId) {
case NV_GPU_ARCHITECTURE_GB200:
implementationId = NV_GPU_ARCH_IMPLEMENTATION_GB202;
break;
case NV_GPU_ARCHITECTURE_AD100:
implementationId = NV_GPU_ARCH_IMPLEMENTATION_AD102;
break;
4 changes: 4 additions & 0 deletions src/nvapi_private.h
@@ -52,6 +52,10 @@
#undef __NVAPI_EMPTY_SAL
#include "nvapi_lite_salend.h"

// TODO: Remove those once they are part of NVAPI headers
#define NV_GPU_ARCHITECTURE_GB200 (NV_GPU_ARCHITECTURE_ID)0x000001B0
#define NV_GPU_ARCH_IMPLEMENTATION_GB202 (NV_GPU_ARCH_IMPLEMENTATION_ID)0x00000002

#ifdef __GNUC__
#pragma GCC diagnostic pop
#endif // __GNUC__
1 change: 1 addition & 0 deletions src/util/util_env.cpp
@@ -201,6 +201,7 @@ namespace dxvk::env {
CHECK_ARCH(TU100)
CHECK_ARCH(GA100)
CHECK_ARCH(AD100)
CHECK_ARCH(GB200)
#undef CHECK_ARCH

if (override) {
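The definition of `CHECK_ARCH` is not part of this hunk. One plausible shape for such a macro is a string comparison against the stringified architecture name, sketched below. The `Arch` enum, the `ParseArchOverride` function, and the macro body are all assumptions for illustration and may differ from the actual code in `util_env.cpp`:

```cpp
#include <optional>
#include <string>

// Hypothetical, reduced enum; the real code works with
// NV_GPU_ARCHITECTURE_ID constants.
enum class Arch { TU100, GA100, AD100, GB200 };

// Hypothetical parser for an architecture override string.
inline std::optional<Arch> ParseArchOverride(const std::string& value) {
// Assumed macro body: compare against the stringified name and return the
// matching enumerator.
#define CHECK_ARCH(name) \
    if (value == #name)  \
        return Arch::name;

    CHECK_ARCH(TU100)
    CHECK_ARCH(GA100)
    CHECK_ARCH(AD100)
    CHECK_ARCH(GB200)
#undef CHECK_ARCH

    // Unknown names yield no override.
    return std::nullopt;
}
```

Under this shape, supporting a new architecture in the override only needs one extra `CHECK_ARCH` line, which matches the size of the change in this hunk.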
40 changes: 21 additions & 19 deletions tests/nvapi_d3d12.cpp
@@ -229,40 +229,42 @@ TEST_CASE("D3D12 methods succeed", "[.d3d12]") {
struct Data {
VkDriverId driverId;
uint32_t deviceId;
- std::string extensionName;
+ std::set<std::string> extensionNames;
uint16_t expectedMajorSMVersion;
uint16_t expectedMinorSMVersion;
bool variablePixelRateShadingSupported;
};
auto args = GENERATE(
- Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2600, VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME, 8, 9, true},
- Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME, 8, 6, true},
- Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, VK_KHR_FRAGMENT_SHADER_BARYCENTRIC_EXTENSION_NAME, 7, 5, false},
- Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, VK_NVX_IMAGE_VIEW_HANDLE_EXTENSION_NAME, 7, 0, false},
- Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, VK_NV_CLIP_SPACE_W_SCALING_EXTENSION_NAME, 6, 0, false},
- Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, VK_NV_VIEWPORT_ARRAY2_EXTENSION_NAME, 5, 2, false},
- Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, VK_EXT_SHADER_IMAGE_ATOMIC_INT64_EXTENSION_NAME, 5, 0, false},
- Data{VK_DRIVER_ID_MESA_NVK, 0x2600, VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME, 8, 9, true},
- Data{VK_DRIVER_ID_AMD_OPEN_SOURCE, 0x2000, VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME, 0, 0, false},
- Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, "ext", 0, 0, false});
+ Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, {VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME, VK_NV_CUDA_KERNEL_LAUNCH_EXTENSION_NAME}, 8, 6, true},
+ Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, {VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME}, 0, 0, true},
+ Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, {}, 0, 0, false},
+ Data{VK_DRIVER_ID_MESA_NVK, 0x2600, {VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME}, 0, 0, true},
+ Data{VK_DRIVER_ID_MESA_NVK, 0x2600, {}, 0, 0, false},
+ Data{VK_DRIVER_ID_AMD_OPEN_SOURCE, 0x2000, {}, 0, 0, false},
+ Data{VK_DRIVER_ID_NVIDIA_PROPRIETARY, 0x2000, {}, 0, 0, false});

::SetEnvironmentVariableA("DXVK_NVAPI_ALLOW_OTHER_DRIVERS", "1");

+ auto extensionNames = std::set<std::string>{VK_KHR_DRIVER_PROPERTIES_EXTENSION_NAME};
+ extensionNames.insert(args.extensionNames.begin(), args.extensionNames.end());
+
luid.HighPart = 0x00000002;
luid.LowPart = 0x00000001;

ALLOW_CALL(*vk, GetDeviceExtensions(_, _))
- .RETURN(std::set<std::string>{VK_KHR_DRIVER_PROPERTIES_EXTENSION_NAME, args.extensionName});
+ .RETURN(extensionNames);
ALLOW_CALL(*vk, GetPhysicalDeviceProperties2(_, _, _))
.SIDE_EFFECT(
ConfigureGetPhysicalDeviceProperties2(_3,
- [&args, &luid](auto props, auto idProps, auto pciBusInfoProps, auto driverProps, auto fragmentShadingRateProps) {
- memcpy(&idProps->deviceLUID, &luid, sizeof(luid));
- idProps->deviceLUIDValid = VK_TRUE;
- driverProps->driverID = args.driverId;
- props->deviceID = args.deviceId;
- if (args.extensionName == VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME)
- fragmentShadingRateProps->primitiveFragmentShadingRateWithMultipleViewports = VK_TRUE;
+ [&args, &luid](auto vkProps) {
+ memcpy(&vkProps.idProps->deviceLUID, &luid, sizeof(luid));
+ vkProps.idProps->deviceLUIDValid = VK_TRUE;
+ vkProps.driverProps->driverID = args.driverId;
+ vkProps.props->deviceID = args.deviceId;
+ if (args.extensionNames.contains(VK_NV_CUDA_KERNEL_LAUNCH_EXTENSION_NAME)) {
+ vkProps.cudaKernelLaunchProperties->computeCapabilityMajor = 8;
+ vkProps.cudaKernelLaunchProperties->computeCapabilityMinor = 6;
+ }
}));

SetupResourceFactory(std::move(dxgiFactory), std::move(vk), std::move(nvml), std::move(lfx));