Change log for October 11, 2024 Vulkan 1.3.298 spec update:
Public Issues

  * Add location order to the definition of from-reads in the
    <<memory-model-acyclicity, Acyclicity>> section (public PR 2402).

Internal Issues

  * Add VK_KHR_cooperative_matrix to the <<memory-model-cooperative-matrix,
    Cooperative Matrix Memory Access>> section and allow multiple
    invocations to do the load (internal MR 6833).
  * Fix VkIndirectCommandsPushConstantTokenEXT::pPushConstant XML for all
    relevant union `selection` values (internal MR 6906).
  * Add missing limittypes to
    VkPhysicalDeviceDeviceGeneratedCommandsPropertiesEXT in XML (internal MR
    6907).
  * Fix missing exception for VK_IMAGE_CREATE_EXTENDED_USAGE_BIT in the
    <<video-profile-compatibility, Video Profile Compatibility>> section
    (internal MR 6908).
  * Add missing `const` to
    VkGeneratedCommandsMemoryRequirementsInfoEXT::pNext in XML (internal MR
    6912).

New Extensions

  * VK_AMDX_shader_enqueue (provisional extension updated to V2 API) (public
    PR 2442).
oddhack committed Oct 11, 2024
1 parent 2ff3b67 commit 05d5444
Showing 13 changed files with 166 additions and 92 deletions.
31 changes: 31 additions & 0 deletions ChangeLog.adoc
@@ -14,6 +14,37 @@ appears frequently in the change log.

'''

Change log for October 11, 2024 Vulkan 1.3.298 spec update:

Public Issues

* Add location order to the definition of from-reads in the
<<memory-model-acyclicity, Acyclicity>> section (public PR 2402).

Internal Issues

* Add VK_KHR_cooperative_matrix to the <<memory-model-cooperative-matrix,
Cooperative Matrix Memory Access>> section and allow multiple
invocations to do the load (internal MR 6833).
* Fix VkIndirectCommandsPushConstantTokenEXT::pPushConstant XML for all
relevant union `selection` values (internal MR 6906).
* Add missing limittypes to
VkPhysicalDeviceDeviceGeneratedCommandsPropertiesEXT in XML (internal MR
6907).
* Fix missing exception for VK_IMAGE_CREATE_EXTENDED_USAGE_BIT in the
<<video-profile-compatibility, Video Profile Compatibility>> section
(internal MR 6908).
* Add missing `const` to
VkGeneratedCommandsMemoryRequirementsInfoEXT::pNext in XML (internal MR
6912).

New Extensions

* VK_AMDX_shader_enqueue (provisional extension updated to V2 API) (public
PR 2442).

'''

Change log for October 4, 2024 Vulkan 1.3.297 spec update:

Public Issues
2 changes: 1 addition & 1 deletion Makefile
@@ -139,7 +139,7 @@ VERBOSE =
# ADOCOPTS options for asciidoc->HTML5 output

NOTEOPTS = -a editing-notes -a implementation-guide
PATCHVERSION = 297
PATCHVERSION = 298
BASEOPTS =

ifneq (,$(findstring VKSC_VERSION_1_0,$(VERSIONS)))
4 changes: 2 additions & 2 deletions appendices/VK_AMDX_shader_enqueue.adoc
@@ -30,8 +30,8 @@ between revisions, and before final release.*

=== Description

This extension adds the ability for developers to enqueue mesh
and compute shader workgroups from other compute shaders.
This extension adds the ability for developers to enqueue mesh and compute
shader workgroups from other compute shaders.

include::{generated}/interfaces/VK_AMDX_shader_enqueue.adoc[]

26 changes: 17 additions & 9 deletions appendices/memorymodel.adoc
@@ -1016,8 +1016,9 @@ value written by the first operation.
_From-reads_ is a relation between operations, where the first operation is
a read, the second operation is a write, and the first operation reads a
value written earlier than the second operation in the second operation's
scoped modification order or location order (or the first operation reads from
the initial value, and the second operation is any write to the same locations).
scoped modification order or location order (or the first operation reads
from the initial value, and the second operation is any write to the same
locations).

Then the implementation must: guarantee that no cycles exist in the union of
the following relations:
@@ -1181,13 +1182,20 @@ with a scope of code:Workgroup, then X is location-ordered before Y, and if
X is a write and Y is a read then X is visible-to Y.


ifdef::VK_NV_cooperative_matrix[]
ifdef::VK_NV_cooperative_matrix,VK_KHR_cooperative_matrix[]
[[memory-model-cooperative-matrix]]
== Cooperative Matrix Memory Access

For each dynamic instance of a cooperative matrix load or store instruction
(code:OpCooperativeMatrixLoadNV or code:OpCooperativeMatrixStoreNV), a
single implementation-dependent invocation within the instance of the
matrix's scope performs a non-atomic load or store (respectively) to each
memory location that is defined to be accessed by the instruction.
endif::VK_NV_cooperative_matrix[]
For each dynamic instance of a cooperative matrix load instruction
(code:OpCooperativeMatrixLoadKHR
ifdef::VK_NV_cooperative_matrix[, code:OpCooperativeMatrixLoadNV]
), some implementation-dependent invocation(s) within the instance of the
matrix's scope perform a non-atomic load from each memory location that is
defined to be accessed by the instruction.

For each memory location accessed by a dynamic instance of a cooperative
matrix store instruction (code:OpCooperativeMatrixStoreKHR
ifdef::VK_NV_cooperative_matrix[, code:OpCooperativeMatrixStoreNV]
), a single implementation-dependent invocation within the instance of the
matrix's scope performs a non-atomic store to that memory location.
endif::VK_NV_cooperative_matrix,VK_KHR_cooperative_matrix[]
6 changes: 3 additions & 3 deletions appendices/spirvenv.adoc
@@ -2170,14 +2170,14 @@ ifdef::VK_AMDX_shader_enqueue[]
The code:ShaderEnqueueAMDX capability must: only be used in shaders with
the code:GLCompute
ifdef::VK_EXT_mesh_shader[]
or code:MeshEXT
or code:MeshEXT
endif::VK_EXT_mesh_shader[]
execution model
execution model
* [[VUID-{refpage}-NodePayloadAMDX-09192]]
Variables in the code:NodePayloadAMDX storage class must: only be
declared in the code:GLCompute
ifdef::VK_EXT_mesh_shader[]
or code:MeshEXT
or code:MeshEXT
endif::VK_EXT_mesh_shader[]
execution model
* [[VUID-{refpage}-maxExecutionGraphShaderPayloadSize-09193]]
21 changes: 13 additions & 8 deletions chapters/commonvalidity/dispatch_graph_common.adoc
@@ -9,9 +9,11 @@ include::{chapters}/commonvalidity/draw_dispatch_common.adoc[]
pname:commandBuffer must: not be a protected command buffer
* [[VUID-{refpage}-commandBuffer-09182]]
pname:commandBuffer must: be a primary command buffer
* pname:scratch must: be the device address of an allocated memory range
* [[VUID-{refpage}-scratch-10192]]
pname:scratch must: be the device address of an allocated memory range
at least as large as pname:scratchSize
* pname:scratchSize must: be greater than or equal to
* [[VUID-{refpage}-scratchSize-10193]]
pname:scratchSize must: be greater than or equal to
slink:VkExecutionGraphPipelineScratchSizeAMDX::pname:minSize returned by
flink:vkGetExecutionGraphPipelineScratchSizeAMDX for the currently bound
execution graph pipeline
@@ -22,10 +24,12 @@ ifdef::VK_KHR_maintenance5[]
or ename:VK_BUFFER_USAGE_2_EXECUTION_GRAPH_SCRATCH_BIT_AMDX
endif::VK_KHR_maintenance5[]
flag
* The device memory range [pname:scratch,pname:scratch + pname:scratchSize]
must: have been initialized with flink:vkCmdInitializeGraphScratchMemoryAMDX
using the currently bound execution graph pipeline, and not modified after
that by anything other than another execution graph dispatch command
* [[VUID-{refpage}-scratch-10194]]
The device memory range [pname:scratch,pname:scratch +
pname:scratchSize] must: have been initialized with
flink:vkCmdInitializeGraphScratchMemoryAMDX using the currently bound
execution graph pipeline, and not modified after that by anything other
than another execution graph dispatch command
* [[VUID-{refpage}-maxComputeWorkGroupCount-09186]]
Execution of this command must: not cause a node to be dispatched with a
larger number of workgroups than that specified by either a
@@ -41,7 +45,8 @@ endif::VK_KHR_maintenance5[]
specified by the max number of payloads for that decoration.
This requirement applies to each code:NodeMaxPayloadsAMDX decoration
separately
* If the currently bound execution graph pipeline includes draw nodes,
* [[VUID-{refpage}-None-10195]]
If the currently bound execution graph pipeline includes draw nodes,
this command must: be called within a render pass instance that is
compatible with the graphics pipeline used to create each of those nodes
compatible with the graphics pipeline used to create each of those nodes
// Common Valid Usage
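
The scratch-related requirements listed above can be pictured as a simple host-side range check. The following is only a sketch under stated assumptions: `DeviceRange` and `scratch_args_are_valid` are hypothetical names and are not part of the Vulkan API, and the caller is assumed to already know the base address and size of the allocation that backs `scratch`.

```c
#include <assert.h>  /* only needed by callers that assert on the result */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of an allocated device memory range. */
typedef struct {
    uint64_t base_address; /* first device address of the allocation */
    uint64_t size;         /* size of the allocation in bytes */
} DeviceRange;

/* Returns true when scratch_size is at least min_size and the range
 * starting at scratch with length scratch_size lies inside alloc. */
static bool scratch_args_are_valid(DeviceRange alloc, uint64_t scratch,
                                   uint64_t scratch_size, uint64_t min_size)
{
    if (scratch_size < min_size) {
        return false; /* smaller than the queried minSize */
    }
    if (scratch < alloc.base_address) {
        return false; /* starts before the allocation */
    }
    uint64_t offset = scratch - alloc.base_address;
    if (offset > alloc.size) {
        return false; /* starts past the end of the allocation */
    }
    /* Compare against the remaining bytes to avoid overflow. */
    return scratch_size <= alloc.size - offset;
}
```

A check like this does not cover the initialization requirement, which depends on command ordering rather than on addresses.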
69 changes: 40 additions & 29 deletions chapters/executiongraphs.adoc
@@ -147,18 +147,21 @@ include::{chapters}/commonvalidity/compute_graph_pipeline_create_info_common.ado
sname:VkPhysicalDeviceLimits::pname:maxPerStageResources
* [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-pLibraryInfo-09133]]
If pname:pLibraryInfo is not `NULL`, each element of
pname:pLibraryInfo->pLibraries must: be either a compute pipeline,
an execution graph pipeline, or a graphics pipeline
* If pname:pLibraryInfo is not `NULL`, each element of
pname:pLibraryInfo->pLibraries that is a compute pipeline
or a graphics pipeline must: have been created with
pname:pLibraryInfo->pLibraries must: be either a compute pipeline, an
execution graph pipeline, or a graphics pipeline
* [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-pLibraryInfo-10181]]
If pname:pLibraryInfo is not `NULL`, each element of
pname:pLibraryInfo->pLibraries that is a compute pipeline or a graphics
pipeline must: have been created with
ename:VK_PIPELINE_CREATE_2_EXECUTION_GRAPH_BIT_AMDX set
* If the <<features-shaderMeshEnqueue,pname:shaderMeshEnqueue>> feature
is not enabled, and pname:pLibraryInfo->pLibraries is not `NULL`,
* [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-shaderMeshEnqueue-10182]]
If the <<features-shaderMeshEnqueue,pname:shaderMeshEnqueue>> feature is
not enabled, and pname:pLibraryInfo->pLibraries is not `NULL`,
pname:pLibraryInfo->pLibraries must: not contain any graphics pipelines
ifdef::VK_EXT_graphics_pipeline_library[]
* Any element of pname:pLibraryInfo->pLibraries identifying a
graphics pipeline must: have been created with
* [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-pLibraryInfo-10183]]
Any element of pname:pLibraryInfo->pLibraries identifying a graphics
pipeline must: have been created with
<<pipelines-graphics-subsets-complete, all possible state subsets>>
endif::VK_EXT_graphics_pipeline_library[]
* [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-None-09134]]
@@ -183,10 +186,12 @@ endif::VK_EXT_graphics_pipeline_library[]
matches the shader name of any other node in the graph, the size of the
output payload must: match the size of the input payload in the matching
node
* If pname:flags does not include ename:VK_PIPELINE_CREATE_LIBRARY_BIT_KHR,
and an output payload declared in any shader in the pipeline does not
have a code:PayloadNodeSparseArrayAMDX decoration, there must: be a node
in the graph corresponding to every index from 0 to its
* [[VUID-VkExecutionGraphPipelineCreateInfoAMDX-flags-10184]]
If pname:flags does not include
ename:VK_PIPELINE_CREATE_LIBRARY_BIT_KHR, and an output payload declared
in any shader in the pipeline does not have a
code:PayloadNodeSparseArrayAMDX decoration, there must: be a node in the
graph corresponding to every index from 0 to its
code:PayloadNodeArraySizeAMDX decoration
****

@@ -255,7 +260,8 @@ graph, call:

include::{generated}/api/protos/vkGetExecutionGraphPipelineNodeIndexAMDX.adoc[]

* pname:device is the logical device that pname:executionGraph was created on.
* pname:device is the logical device that pname:executionGraph was created
on.
* pname:executionGraph is the execution graph pipeline to query the
internal node index for.
* pname:pNodeInfo is a pointer to a
@@ -297,7 +303,8 @@ To query the scratch space required to dispatch an execution graph, call:

include::{generated}/api/protos/vkGetExecutionGraphPipelineScratchSizeAMDX.adoc[]

* pname:device is the logical device that pname:executionGraph was created on.
* pname:device is the logical device that pname:executionGraph was created
on.
* pname:executionGraph is the execution graph pipeline to query the
scratch space for.
* pname:pSizeInfo is a pointer to a
@@ -325,14 +332,14 @@ include::{generated}/api/structs/VkExecutionGraphPipelineScratchSizeAMDX.adoc[]
dispatching the queried execution graph.
* pname:maxSize indicates the maximum scratch space that can be used for
dispatching the queried execution graph.
* pname:sizeGranularity indicates the granularity at which the scratch space can be
increased from pname:minSize.
* pname:sizeGranularity indicates the granularity at which the scratch
space can be increased from pname:minSize.

Applications can: use any amount of scratch memory greater than
pname:minSize for dispatching a graph, however only the values equal to pname:minSize
+ an integer multiple of pname:sizeGranularity will be used.
Greater values may: result in higher performance, up to pname:maxSize which indicates the most memory
that an implementation can use effectively.
pname:minSize for dispatching a graph, however only the values equal to
pname:minSize + an integer multiple of pname:sizeGranularity will be used.
Greater values may: result in higher performance, up to pname:maxSize which
indicates the most memory that an implementation can use effectively.
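
One way to read the paragraph above is as a round-down rule. The sketch below is an interpretation, not text from the specification: `ScratchSizeInfo` is a hypothetical stand-in that mirrors the three fields described here, `effective_scratch_size` is an illustrative helper, and the choice to cap the provided size at `maxSize` before rounding is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the minSize / maxSize / sizeGranularity fields. */
typedef struct {
    uint64_t min_size;
    uint64_t max_size;
    uint64_t size_granularity;
} ScratchSizeInfo;

/* Returns the portion of `provided` bytes that would be used under the
 * reading "minSize + an integer multiple of sizeGranularity", capped at
 * maxSize. Returns 0 when `provided` is below minSize (not usable). */
static uint64_t effective_scratch_size(ScratchSizeInfo info, uint64_t provided)
{
    assert(info.max_size >= info.min_size);
    if (provided < info.min_size) {
        return 0;
    }
    /* Memory beyond maxSize is assumed to bring no benefit. */
    uint64_t capped = provided < info.max_size ? provided : info.max_size;
    if (info.size_granularity == 0) {
        return info.min_size; /* no growth steps available */
    }
    uint64_t steps = (capped - info.min_size) / info.size_granularity;
    return info.min_size + steps * info.size_granularity;
}
```

With a minimum of 1024, a maximum of 4096, and a granularity of 256, providing 1500 bytes would leave 1280 bytes in use under this reading, so an application might prefer to allocate exactly on a granularity step.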

include::{generated}/validity/structs/VkExecutionGraphPipelineScratchSizeAMDX.adoc[]
--
@@ -350,7 +357,8 @@ include::{generated}/api/protos/vkCmdInitializeGraphScratchMemoryAMDX.adoc[]
* pname:executionGraph is the execution graph pipeline to initialize the
scratch memory for.
* pname:scratch is the address of scratch memory to be initialized.
* pname:scratchSize is a range in bytes of scratch memory to be initialized.
* pname:scratchSize is a range in bytes of scratch memory to be
initialized.

This command must: be called before using pname:scratch to dispatch the
currently bound execution graph pipeline.
@@ -371,9 +379,11 @@ against it.

.Valid Usage
****
* pname:scratch must: be the device address of an allocated memory range
* [[VUID-vkCmdInitializeGraphScratchMemoryAMDX-scratch-10185]]
pname:scratch must: be the device address of an allocated memory range
at least as large as pname:scratchSize
* pname:scratchSize must: be greater than or equal to
* [[VUID-vkCmdInitializeGraphScratchMemoryAMDX-scratchSize-10186]]
pname:scratchSize must: be greater than or equal to
slink:VkExecutionGraphPipelineScratchSizeAMDX::pname:minSize returned by
flink:vkGetExecutionGraphPipelineScratchSizeAMDX for the currently bound
execution graph pipeline
@@ -414,8 +424,8 @@ any way against each other once they are dispatched.
There are no rasterization order guarantees between separately dispatched
graphics nodes, though individual primitives within a single dispatch do
adhere to rasterization order.
Draw calls executed before or after the execution graph also execute relative to
each graphics node with respect to rasterization order.
Draw calls executed before or after the execution graph also execute
relative to each graphics node with respect to rasterization order.

For this command, all device/host pointers in substructures are treated as
host pointers and read only during host execution of this command.
@@ -489,8 +499,8 @@ any way against each other once they are dispatched.
There are no rasterization order guarantees between separately dispatched
graphics nodes, though individual primitives within a single dispatch do
adhere to rasterization order.
Draw calls executed before or after the execution graph also execute relative to
each graphics node with respect to rasterization order.
Draw calls executed before or after the execution graph also execute
relative to each graphics node with respect to rasterization order.

For this command, all device/host pointers in substructures are treated as
device pointers and read during device execution of this command.
@@ -788,7 +798,8 @@ ifdef::VK_EXT_mesh_shader[]

Graphics pipelines added as nodes to an execution graph are executed in a
manner similar to a flink:vkCmdDrawMeshTasksIndirectEXT, using the same
payloads as compute shaders, but capturing some state from the command buffer.
payloads as compute shaders, but capturing some state from the command
buffer.

[[executiongraphs-meshnodes-statecapture]]
When an execution graph dispatch is recorded into a command buffer, it
6 changes: 3 additions & 3 deletions chapters/features.adoc
@@ -7408,9 +7408,9 @@ This structure describes the following feature:

* [[features-shaderEnqueue]] pname:shaderEnqueue indicates whether the
implementation supports <<executiongraphs,execution graphs>>.
* [[features-shaderMeshEnqueue]] pname:shaderMeshEnqueue indicates whether the
implementation supports
<<executiongraphs-meshnodes,mesh nodes in execution graphs>>.
* [[features-shaderMeshEnqueue]] pname:shaderMeshEnqueue indicates whether
the implementation supports <<executiongraphs-meshnodes,mesh nodes in
execution graphs>>.

:refpage: VkPhysicalDeviceShaderEnqueueFeaturesAMDX
include::{chapters}/features.adoc[tag=features]
10 changes: 5 additions & 5 deletions chapters/limits.adoc
@@ -4735,12 +4735,12 @@ structure describe the following limits:
non-scratch basetype:VkDeviceAddress arguments consumed by graph
dispatch commands.
* [[limits-maxExecutionGraphWorkgroupCount]]
pname:maxExecutionGraphWorkgroupCount[3] is the maximum number of
local workgroups that a shader can: be dispatched with in X, Y, and Z
pname:maxExecutionGraphWorkgroupCount[3] is the maximum number of local
workgroups that a shader can: be dispatched with in X, Y, and Z
dimensions, respectively.
* [[limits-maxExecutionGraphWorkgroups]]
pname:maxExecutionGraphWorkgroups is the total number of
local workgroups that a shader can: be dispatched with.
* [[limits-maxExecutionGraphWorkgroups]] pname:maxExecutionGraphWorkgroups
is the total number of local workgroups that a shader can: be dispatched
with.

:refpage: VkPhysicalDeviceShaderEnqueuePropertiesAMDX
include::{chapters}/limits.adoc[tag=limits_desc]