Change log for October 11, 2024 Vulkan 1.3.298 spec update:

Public Issues

  * Add location order to the definition of from-reads in the
    <<memory-model-acyclicity, Acyclicity>> section (public PR 2402).

Internal Issues

  * Add VK_KHR_cooperative_matrix to the <<memory-model-cooperative-matrix,
    Cooperative Matrix Memory Access>> section and allow multiple
    invocations to do the load (internal MR 6833).
  * Fix VkIndirectCommandsPushConstantTokenEXT::pPushConstant XML for all
    relevant union `selection` values (internal MR 6906).
  * Add missing limittypes to
    VkPhysicalDeviceDeviceGeneratedCommandsPropertiesEXT in XML (internal MR
    6907).
  * Fix missing exception for VK_IMAGE_CREATE_EXTENDED_USAGE_BIT in the
    <<video-profile-compatibility, Video Profile Compatibility>> section
    (internal MR 6908).
  * Add missing `const` to
    VkGeneratedCommandsMemoryRequirementsInfoEXT::pNext in XML (internal MR
    6912).

New Extensions

  * VK_AMDX_shader_enqueue (provisional extension updated to V2 API) (public
    PR 2442).

KhronosGroup · Oct 11, 2024 · 05d5444
1 parent 2ff3b67
commit 05d5444
Showing 13 changed files with 166 additions and 92 deletions.
diff --git a/ChangeLog.adoc b/ChangeLog.adoc
@@ -14,6 +14,37 @@ appears frequently in the change log.
 
 '''
 
+Change log for October 11, 2024 Vulkan 1.3.298 spec update:
+
+Public Issues
+
+  * Add location order to the definition of from-reads in the
+    <<memory-model-acyclicity, Acyclicity>> section (public PR 2402).
+
+Internal Issues
+
+  * Add VK_KHR_cooperative_matrix to the <<memory-model-cooperative-matrix,
+    Cooperative Matrix Memory Access>> section and allow multiple
+    invocations to do the load (internal MR 6833).
+  * Fix VkIndirectCommandsPushConstantTokenEXT::pPushConstant XML for all
+    relevant union `selection` values (internal MR 6906).
+  * Add missing limittypes to
+    VkPhysicalDeviceDeviceGeneratedCommandsPropertiesEXT in XML (internal MR
+    6907).
+  * Fix missing exception for VK_IMAGE_CREATE_EXTENDED_USAGE_BIT in the
+    <<video-profile-compatibility, Video Profile Compatibility>> section
+    (internal MR 6908).
+  * Add missing `const` to
+    VkGeneratedCommandsMemoryRequirementsInfoEXT::pNext in XML (internal MR
+    6912).
+
+New Extensions
+
+  * VK_AMDX_shader_enqueue (provisional extension updated to V2 API) (public
+    PR 2442).
+
+'''
+
 Change log for October 4, 2024 Vulkan 1.3.297 spec update:
 
 Public Issues

diff --git a/Makefile b/Makefile
@@ -139,7 +139,7 @@ VERBOSE =
 # ADOCOPTS options for asciidoc->HTML5 output
 
 NOTEOPTS     = -a editing-notes -a implementation-guide
-PATCHVERSION = 297
+PATCHVERSION = 298
 BASEOPTS     =
 
 ifneq (,$(findstring VKSC_VERSION_1_0,$(VERSIONS)))

diff --git a/appendices/VK_AMDX_shader_enqueue.adoc b/appendices/VK_AMDX_shader_enqueue.adoc
@@ -30,8 +30,8 @@ between revisions, and before final release.*
 
 === Description
 
-This extension adds the ability for developers to enqueue mesh
-and compute shader workgroups from other compute shaders.
+This extension adds the ability for developers to enqueue mesh and compute
+shader workgroups from other compute shaders.
 
 include::{generated}/interfaces/VK_AMDX_shader_enqueue.adoc[]
 

diff --git a/appendices/memorymodel.adoc b/appendices/memorymodel.adoc
@@ -1016,8 +1016,9 @@ value written by the first operation.
 _From-reads_ is a relation between operations, where the first operation is
 a read, the second operation is a write, and the first operation reads a
 value written earlier than the second operation in the second operation's
-scoped modification order or location order (or the first operation reads from
-the initial value, and the second operation is any write to the same locations).
+scoped modification order or location order (or the first operation reads
+from the initial value, and the second operation is any write to the same
+locations).
 
 Then the implementation must: guarantee that no cycles exist in the union of
 the following relations:
@@ -1181,13 +1182,20 @@ with a scope of code:Workgroup, then X is location-ordered before Y, and if
 X is a write and Y is a read then X is visible-to Y.
 
 
-ifdef::VK_NV_cooperative_matrix[]
+ifdef::VK_NV_cooperative_matrix,VK_KHR_cooperative_matrix[]
 [[memory-model-cooperative-matrix]]
 == Cooperative Matrix Memory Access
 
-For each dynamic instance of a cooperative matrix load or store instruction
-(code:OpCooperativeMatrixLoadNV or code:OpCooperativeMatrixStoreNV), a
-single implementation-dependent invocation within the instance of the
-matrix's scope performs a non-atomic load or store (respectively) to each
-memory location that is defined to be accessed by the instruction.
-endif::VK_NV_cooperative_matrix[]
+For each dynamic instance of a cooperative matrix load instruction
+(code:OpCooperativeMatrixLoadKHR
+ifdef::VK_NV_cooperative_matrix[, code:OpCooperativeMatrixLoadNV]
+), some implementation-dependent invocation(s) within the instance of the
+matrix's scope perform a non-atomic load from each memory location that is
+defined to be accessed by the instruction.
+
+For each memory location accessed by a dynamic instance of a cooperative
+matrix store instruction (code:OpCooperativeMatrixStoreKHR
+ifdef::VK_NV_cooperative_matrix[, code:OpCooperativeMatrixStoreNV]
+), a single implementation-dependent invocation within the instance of the
+matrix's scope performs a non-atomic store to that memory location.
+endif::VK_NV_cooperative_matrix,VK_KHR_cooperative_matrix[]
diff --git a/appendices/spirvenv.adoc b/appendices/spirvenv.adoc
@@ -2170,14 +2170,14 @@ ifdef::VK_AMDX_shader_enqueue[]
     The code:ShaderEnqueueAMDX capability must: only be used in shaders with
     the code:GLCompute
 ifdef::VK_EXT_mesh_shader[]
-	or code:MeshEXT
+    or code:MeshEXT
 endif::VK_EXT_mesh_shader[]
-	execution model
+    execution model
   * [[VUID-{refpage}-NodePayloadAMDX-09192]]
     Variables in the code:NodePayloadAMDX storage class must: only be
     declared in the code:GLCompute
 ifdef::VK_EXT_mesh_shader[]
-	or code:MeshEXT
+    or code:MeshEXT
 endif::VK_EXT_mesh_shader[]
     execution model
   * [[VUID-{refpage}-maxExecutionGraphShaderPayloadSize-09193]]

diff --git a/chapters/commonvalidity/dispatch_graph_common.adoc b/chapters/commonvalidity/dispatch_graph_common.adoc
@@ -9,9 +9,11 @@ include::{chapters}/commonvalidity/draw_dispatch_common.adoc[]
     pname:commandBuffer must: not be a protected command buffer
   * [[VUID-{refpage}-commandBuffer-09182]]
     pname:commandBuffer must: be a primary command buffer
-  * pname:scratch must: be the device address of an allocated memory range
+  * [[VUID-{refpage}-scratch-10192]]
+    pname:scratch must: be the device address of an allocated memory range
     at least as large as pname:scratchSize
-  * pname:scratchSize must: be greater than or equal to
+  * [[VUID-{refpage}-scratchSize-10193]]
+    pname:scratchSize must: be greater than or equal to
     slink:VkExecutionGraphPipelineScratchSizeAMDX::pname:minSize returned by
     flink:vkGetExecutionGraphPipelineScratchSizeAMDX for the currently bound
     execution graph pipeline
@@ -22,10 +24,12 @@ ifdef::VK_KHR_maintenance5[]
     or ename:VK_BUFFER_USAGE_2_EXECUTION_GRAPH_SCRATCH_BIT_AMDX
 endif::VK_KHR_maintenance5[]
     flag
-  * The device memory range [pname:scratch,pname:scratch + pname:scratchSize]
-    must: have been initialized with flink:vkCmdInitializeGraphScratchMemoryAMDX
-    using the currently bound execution graph pipeline, and not modified after
-    that by anything other than another execution graph dispatch command
+  * [[VUID-{refpage}-scratch-10194]]
+    The device memory range [pname:scratch,pname:scratch +
+    pname:scratchSize] must: have been initialized with
+    flink:vkCmdInitializeGraphScratchMemoryAMDX using the currently bound
+    execution graph pipeline, and not modified after that by anything other
+    than another execution graph dispatch command
   * [[VUID-{refpage}-maxComputeWorkGroupCount-09186]]
     Execution of this command must: not cause a node to be dispatched with a
     larger number of workgroups than that specified by either a
@@ -41,7 +45,8 @@ endif::VK_KHR_maintenance5[]
     specified by the max number of payloads for that decoration.
     This requirement applies to each code:NodeMaxPayloadsAMDX decoration
     separately
-  * If the currently bound execution graph pipeline includes draw nodes,
+  * [[VUID-{refpage}-None-10195]]
+    If the currently bound execution graph pipeline includes draw nodes,
     this command must: be called within a render pass instance that is
-	compatible with the graphics pipeline used to create each of those nodes
+    compatible with the graphics pipeline used to create each of those nodes
 // Common Valid Usage
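
The two scratch rules that gain VUIDs in this file (scratch-10192 and scratchSize-10193) can be sketched as a host-side check. This is a minimal illustration, not part of the Vulkan API: `scratch_args_valid` and all of its parameters are hypothetical names, device addresses are modelled as plain 64-bit integers, and the allocation bounds are assumed to be tracked by the application.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a Vulkan entry point): checks that a scratch
 * range satisfies the two rules quoted above, given the base address and
 * size of the allocation the application believes it points into. */
static bool scratch_args_valid(uint64_t alloc_base, uint64_t alloc_size,
                               uint64_t scratch, uint64_t scratch_size,
                               uint64_t min_size)
{
    /* scratchSize must be >= the minSize reported for the bound pipeline. */
    if (scratch_size < min_size)
        return false;

    /* scratch must point into the allocation ... */
    if (scratch < alloc_base)
        return false;
    uint64_t offset = scratch - alloc_base;
    if (offset > alloc_size)
        return false;

    /* ... and the bytes remaining after it must cover scratchSize.
     * Written as a subtraction so the comparison cannot overflow. */
    return scratch_size <= alloc_size - offset;
}
```

A check like this only covers the size and range rules; whether the memory was initialized with vkCmdInitializeGraphScratchMemoryAMDX (scratch-10194) has to be tracked separately.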
diff --git a/chapters/executiongraphs.adoc b/chapters/executiongraphs.adoc
@@ -147,18 +147,21 @@ include::{chapters}/commonvalidity/compute_graph_pipeline_create_info_common.ado
     sname:VkPhysicalDeviceLimits::pname:maxPerStageResources
   * [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-pLibraryInfo-09133]]
     If pname:pLibraryInfo is not `NULL`, each element of
-    pname:pLibraryInfo->pLibraries must: be either a compute pipeline,
-    an execution graph pipeline, or a graphics pipeline
-  * If pname:pLibraryInfo is not `NULL`, each element of
-    pname:pLibraryInfo->pLibraries that is a compute pipeline
-    or a graphics pipeline must: have been created with
+    pname:pLibraryInfo->pLibraries must: be either a compute pipeline, an
+    execution graph pipeline, or a graphics pipeline
+  * [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-pLibraryInfo-10181]]
+    If pname:pLibraryInfo is not `NULL`, each element of
+    pname:pLibraryInfo->pLibraries that is a compute pipeline or a graphics
+    pipeline must: have been created with
     ename:VK_PIPELINE_CREATE_2_EXECUTION_GRAPH_BIT_AMDX set
-  * If the <<features-shaderMeshEnqueue,pname:shaderMeshEnqueue>> feature
-    is not enabled, and pname:pLibraryInfo->pLibraries is not `NULL`,
+  * [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-shaderMeshEnqueue-10182]]
+    If the <<features-shaderMeshEnqueue,pname:shaderMeshEnqueue>> feature is
+    not enabled, and pname:pLibraryInfo->pLibraries is not `NULL`,
     pname:pLibraryInfo->pLibraries must: not contain any graphics pipelines
 ifdef::VK_EXT_graphics_pipeline_library[]
-  * Any element of pname:pLibraryInfo->pLibraries identifying a
-    graphics pipeline must: have been created with
+  * [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-pLibraryInfo-10183]]
+    Any element of pname:pLibraryInfo->pLibraries identifying a graphics
+    pipeline must: have been created with
     <<pipelines-graphics-subsets-complete, all possible state subsets>>
 endif::VK_EXT_graphics_pipeline_library[]
   * [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-None-09134]]
@@ -183,10 +186,12 @@ endif::VK_EXT_graphics_pipeline_library[]
     matches the shader name of any other node in the graph, the size of the
     output payload must: match the size of the input payload in the matching
     node
-  * If pname:flags does not include ename:VK_PIPELINE_CREATE_LIBRARY_BIT_KHR,
-    and an output payload declared in any shader in the pipeline does not
-    have a code:PayloadNodeSparseArrayAMDX decoration, there must: be a node
-    in the graph corresponding to every index from 0 to its
+  * [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-flags-10184]]
+    If pname:flags does not include
+    ename:VK_PIPELINE_CREATE_LIBRARY_BIT_KHR, and an output payload declared
+    in any shader in the pipeline does not have a
+    code:PayloadNodeSparseArrayAMDX decoration, there must: be a node in the
+    graph corresponding to every index from 0 to its
     code:PayloadNodeArraySizeAMDX decoration
 ****
 
@@ -255,7 +260,8 @@ graph, call:
 
 include::{generated}/api/protos/vkGetExecutionGraphPipelineNodeIndexAMDX.adoc[]
 
-  * pname:device is the logical device that pname:executionGraph was created on.
+  * pname:device is the logical device that pname:executionGraph was created
+    on.
   * pname:executionGraph is the execution graph pipeline to query the
     internal node index for.
   * pname:pNodeInfo is a pointer to a
@@ -297,7 +303,8 @@ To query the scratch space required to dispatch an execution graph, call:
 
 include::{generated}/api/protos/vkGetExecutionGraphPipelineScratchSizeAMDX.adoc[]
 
-  * pname:device is the logical device that pname:executionGraph was created on.
+  * pname:device is the logical device that pname:executionGraph was created
+    on.
   * pname:executionGraph is the execution graph pipeline to query the
     scratch space for.
   * pname:pSizeInfo is a pointer to a
@@ -325,14 +332,14 @@ include::{generated}/api/structs/VkExecutionGraphPipelineScratchSizeAMDX.adoc[]
     dispatching the queried execution graph.
   * pname:maxSize indicates the maximum scratch space that can be used for
     dispatching the queried execution graph.
-  * pname:sizeGranularity indicates the granularity at which the scratch space can be
-    increased from pname:minSize.
+  * pname:sizeGranularity indicates the granularity at which the scratch
+    space can be increased from pname:minSize.
 
 Applications can: use any amount of scratch memory greater than
-pname:minSize for dispatching a graph, however only the values equal to pname:minSize
-+ an integer multiple of pname:sizeGranularity will be used.
-Greater values may: result in higher performance, up to pname:maxSize which indicates the most memory
-that an implementation can use effectively.
+pname:minSize for dispatching a graph, however only the values equal to
+pname:minSize + an integer multiple of pname:sizeGranularity will be used.
+Greater values may: result in higher performance, up to pname:maxSize which
+indicates the most memory that an implementation can use effectively.
 
 include::{generated}/validity/structs/VkExecutionGraphPipelineScratchSizeAMDX.adoc[]
 --
@@ -350,7 +357,8 @@ include::{generated}/api/protos/vkCmdInitializeGraphScratchMemoryAMDX.adoc[]
   * pname:executionGraph is the execution graph pipeline to initialize the
     scratch memory for.
   * pname:scratch is the address of scratch memory to be initialized.
-  * pname:scratchSize is a range in bytes of scratch memory to be initialized.
+  * pname:scratchSize is a range in bytes of scratch memory to be
+    initialized.
 
 This command must: be called before using pname:scratch to dispatch the
 currently bound execution graph pipeline.
@@ -371,9 +379,11 @@ against it.
 
 .Valid Usage
 ****
-  * pname:scratch must: be the device address of an allocated memory range
+  * [[VUID-vkCmdInitializeGraphScratchMemoryAMDX-scratch-10185]]
+    pname:scratch must: be the device address of an allocated memory range
     at least as large as pname:scratchSize
-  * pname:scratchSize must: be greater than or equal to
+  * [[VUID-vkCmdInitializeGraphScratchMemoryAMDX-scratchSize-10186]]
+    pname:scratchSize must: be greater than or equal to
     slink:VkExecutionGraphPipelineScratchSizeAMDX::pname:minSize returned by
     flink:vkGetExecutionGraphPipelineScratchSizeAMDX for the currently bound
     execution graph pipeline
@@ -414,8 +424,8 @@ any way against each other once they are dispatched.
 There are no rasterization order guarantees between separately dispatched
 graphics nodes, though individual primitives within a single dispatch do
 adhere to rasterization order.
-Draw calls executed before or after the execution graph also execute relative to
-each graphics node with respect to rasterization order.
+Draw calls executed before or after the execution graph also execute
+relative to each graphics node with respect to rasterization order.
 
 For this command, all device/host pointers in substructures are treated as
 host pointers and read only during host execution of this command.
@@ -489,8 +499,8 @@ any way against each other once they are dispatched.
 There are no rasterization order guarantees between separately dispatched
 graphics nodes, though individual primitives within a single dispatch do
 adhere to rasterization order.
-Draw calls executed before or after the execution graph also execute relative to
-each graphics node with respect to rasterization order.
+Draw calls executed before or after the execution graph also execute
+relative to each graphics node with respect to rasterization order.
 
 For this command, all device/host pointers in substructures are treated as
 device pointers and read during device execution of this command.
@@ -788,7 +798,8 @@ ifdef::VK_EXT_mesh_shader[]
 
 Graphics pipelines added as nodes to an execution graph are executed in a
 manner similar to a flink:vkCmdDrawMeshTasksIndirectEXT, using the same
-payloads as compute shaders, but capturing some state from the command buffer.
+payloads as compute shaders, but capturing some state from the command
+buffer.
 
 [[executiongraphs-meshnodes-statecapture]]
 When an execution graph dispatch is recorded into a command buffer, it
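
The reflowed paragraph on pname:minSize, pname:maxSize, and pname:sizeGranularity describes which scratch sizes are actually used. The sketch below shows one way an application might pick a size from a memory budget. It assumes the rule can be read as "round down to minSize plus a whole number of granularity steps, capped at maxSize"; `usable_scratch_size` and its parameter names are hypothetical and do not appear in the specification.

```c
#include <stdint.h>

/* Hypothetical helper: returns the largest value of the form
 * min_size + k * granularity (k >= 0) that fits in `available` bytes and
 * does not exceed max_size. Returns 0 when `available` is smaller than
 * min_size, i.e. the graph cannot be dispatched with that budget. */
static uint64_t usable_scratch_size(uint64_t min_size, uint64_t max_size,
                                    uint64_t granularity, uint64_t available)
{
    if (available < min_size)
        return 0;

    /* Sizes above max_size are assumed to bring no further benefit. */
    uint64_t cap = available < max_size ? available : max_size;
    if (cap < min_size)
        cap = min_size; /* defensive: treat max_size < min_size as min_size */

    /* A zero granularity is treated as "no steps available". */
    if (granularity == 0)
        return min_size;

    uint64_t steps = (cap - min_size) / granularity;
    return min_size + steps * granularity;
}
```

For example, with a minimum of 1024 bytes, a maximum of 8192, a granularity of 256, and 2000 bytes available, the helper returns 1792 (1024 plus three steps of 256). The values are made up for illustration; real ones come from vkGetExecutionGraphPipelineScratchSizeAMDX.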

diff --git a/chapters/features.adoc b/chapters/features.adoc
@@ -7408,9 +7408,9 @@ This structure describes the following feature:
 
   * [[features-shaderEnqueue]] pname:shaderEnqueue indicates whether the
     implementation supports <<executiongraphs,execution graphs>>.
-  * [[features-shaderMeshEnqueue]] pname:shaderMeshEnqueue indicates whether the
-    implementation supports
-    <<executiongraphs-meshnodes,mesh nodes in execution graphs>>.
+  * [[features-shaderMeshEnqueue]] pname:shaderMeshEnqueue indicates whether
+    the implementation supports <<executiongraphs-meshnodes,mesh nodes in
+    execution graphs>>.
 
 :refpage: VkPhysicalDeviceShaderEnqueueFeaturesAMDX
 include::{chapters}/features.adoc[tag=features]

diff --git a/chapters/limits.adoc b/chapters/limits.adoc
@@ -4735,12 +4735,12 @@ structure describe the following limits:
     non-scratch basetype:VkDeviceAddress arguments consumed by graph
     dispatch commands.
   * [[limits-maxExecutionGraphWorkgroupCount]]
-    pname:maxExecutionGraphWorkgroupCount[3] is the maximum number of
-    local workgroups that a shader can: be dispatched with in X, Y, and Z
+    pname:maxExecutionGraphWorkgroupCount[3] is the maximum number of local
+    workgroups that a shader can: be dispatched with in X, Y, and Z
     dimensions, respectively.
-  * [[limits-maxExecutionGraphWorkgroups]]
-    pname:maxExecutionGraphWorkgroups is the total number of
-    local workgroups that a shader can: be dispatched with.
+  * [[limits-maxExecutionGraphWorkgroups]] pname:maxExecutionGraphWorkgroups
+    is the total number of local workgroups that a shader can: be dispatched
+    with.
 
 :refpage: VkPhysicalDeviceShaderEnqueuePropertiesAMDX
 include::{chapters}/limits.adoc[tag=limits_desc]
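
The two limits reflowed in this hunk bound a dispatch per dimension and in total. A minimal sketch of checking both follows, assuming the "total number of local workgroups" is the product of the X, Y, and Z counts and that the total limit fits in 32 bits. `workgroup_count_within_limits` is a hypothetical helper, and the limit values passed to it would come from VkPhysicalDeviceShaderEnqueuePropertiesAMDX in a real application.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: checks a requested workgroup count against a
 * per-dimension limit (maxExecutionGraphWorkgroupCount[3]) and a total
 * limit (maxExecutionGraphWorkgroups). The total is assumed to be the
 * product of the three dimensions. */
static bool workgroup_count_within_limits(const uint32_t count[3],
                                          const uint32_t max_count[3],
                                          uint32_t max_total)
{
    uint64_t total = 1;
    for (int i = 0; i < 3; ++i) {
        if (count[i] > max_count[i])
            return false;
        /* total <= max_total < 2^32 here, so this product fits in 64 bits. */
        total *= count[i];
        if (total > max_total)
            return false;
    }
    return true;
}
```

Checking the running product after each multiplication keeps the arithmetic within 64 bits without needing a separate overflow test.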