diff --git a/util/test/demos/3rdparty/ags/LICENSE.txt b/util/test/demos/3rdparty/ags/LICENSE.txt
new file mode 100644
index 000000000..6adade017
--- /dev/null
+++ b/util/test/demos/3rdparty/ags/LICENSE.txt
@@ -0,0 +1,19 @@
+Copyright (c) 2020 Advanced Micro Devices, Inc. All rights reserved.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
diff --git a/util/test/demos/3rdparty/ags/README.md b/util/test/demos/3rdparty/ags/README.md
new file mode 100644
index 000000000..9aff86080
--- /dev/null
+++ b/util/test/demos/3rdparty/ags/README.md
@@ -0,0 +1,121 @@
+# AMD AGS SDK
+
+
+The AMD GPU Services (AGS) library provides software developers with the ability to query AMD GPU software and hardware state information that is not normally available through standard operating systems or graphics APIs. If you are serious about getting the most from your AMD GPU, then AGS can help you harness that power.
+
+AGS includes support for querying graphics driver version info, GPU hardware info, performance metrics, shader extensions and additional extensions supported in the AMD drivers for DirectX 11 and DirectX 12. AGS also provides Crossfire™ (AMD's multi-GPU rendering technology) support and Eyefinity (AMD's multi-display rendering technology) configuration info.
+
+In addition to the library itself, the AGS SDK includes several samples to demonstrate use of the library.
+
+
+
+
+
+### What's new in AGS 6.0
+Version 6.0 introduces several new shader intrinsics, namely a DX12 ray tracing hit token for RDNA2 hardware for ray tracing optimisation, ReadLaneAt and explicit float conversions. There is also a change to the initialization API to make sure the AGS dll matches the header and calling code.
+
+### What's new in AGS 5.4.2
+Version 5.4.2 reinstates the sharedMemoryInBytes field which is required when calculating the memory available on APUs.
+
+### What's new in AGS 5.4.1
+Version 5.4.1 includes x86 libs and Visual Studio 2019 support. There is also support for base vertex and base instance intrinsics as well as GetWaveSize intrinsics. The samples have been ported from premake to cmake.
+
+### What's new in AGS 5.4
+Version 5.4 adds a better description of the GPU architecture for those wishing to fine tune their games for specific code paths. In addition, there are now shader intrinsics for getting the draw index for execute indirect calls as well as support for atomic U64 ops.
+
+Radeon 7 and RDNA GPU core and memory speeds are now returned.
+
+### What's new in AGS 5.3
+Version 5.3 adds DirectX 11 deferred context support for our MultiDrawIndirect and UAV overlap extensions, along with a helper function to let your app determine if the installed driver meets your game's minimum driver version requirements. If you're a Vulkan user, you can pair that with our machine readable AMD Vulkan versions database, to get more information about the Vulkan implementation in our client driver.
+
+Lastly, there's a new FreeSync 2 gamma 2.2 mode. It uses a 10-bit (per RGB component, 2-bit alpha) swapchain, as opposed to the 16-bit (per RGB component, 16-bit alpha) swapchain needed for FreeSync 2 scRGB.
+
+### What's new in AGS 5.2
+Version 5.2 adds support for app registration in DirectX 12. App registration lets you give more information about your game or application to our driver, which can then use that (ideally unique) information to better support the game or app if we need to make driver-side changes to help things run as efficiently and correctly as possible.
+
+We also changed how you get access to extensions under DX12, requiring you to create your GPU device using `agsDriverExtensionsDX12_CreateDevice()`, instead of the normal `D3D12CreateDevice()` call you’d make to D3D.
+
+Lastly, we’ve also added support for breadcrumb markers in D3D11. Using the `agsDriverExtensionsDX11_WriteBreadcrumb()` API, you can put in place a strategy for debugging driver issues more easily. Sometimes your game or app can interact with the driver in a way that causes it to crash or TDR. The new API gives you the ability to leave markers around your D3D11 API calls, helping you narrow down exactly what interaction with the driver caused the problem.
+
+### What's new in AGS 5.1
+Version 5.1 is a partly developer-focused update to AGS 5. We've listened to feedback about how difficult it can be to integrate the binary AGS libs into your games, and while we can't open the source code to AGS to allow you to integrate it from source, we canvassed developers to figure out what pre-built binaries would be most useful to provide. So we've now added builds of the AGS binary library that are linkable with Visual Studio projects built with `/MT` and `/MD`, used to select a particular CRT.
+
+There's also a breaking change in how you get access to DX11 AMD extensions. Look at the Changelog for details on that. We've also added an application registration extension for DX11 apps. That lets you tell the driver that your game can be considered in a uniquely identifiable way, which is particularly helpful if you build on top of popular middleware like Unity or UE4 and make rendering changes.
+
+There's also support for FreeSync 2 HDR, DX12 application user markers for [Radeon GPU Profiler](https://gpuopen.com/gaming-product/radeon-gpu-profiler-rgp/), VS2017 versions of the shipping samples, and new wave-level shader intrinsics for both DX11 and DX12.
+
+See the full Changelog for more details.
+
+### What's new in AGS 5.0
+Version 5.0 is a major overhaul of the library designed to provide a much clearer view of the GPUs in the system and the displays attached to them. It also exposes the ability to query each display for HDR capabilities and put those HDR-capable displays into various HDR modes.
+
+Highlights include the following:
+* Full GPU enumeration with adapter string, device id, revision id and vendor id.
+* Per-GPU display enumeration including information on display name, resolution, and HDR capabilities.
+* Optional user-supplied memory allocator.
+* Function to set displays into HDR mode.[1](#ags-sdk-footnote1)
+* A Microsoft WACK compliant version of the library.
+* DirectX 11 shader compiler controls.[1](#ags-sdk-footnote1)
+* DirectX 11 multiview extension.[2](#ags-sdk-footnote1)
+* DirectX 11 Crossfire API updates.
+ * Now supports using the API without needing a driver profile.
+ * You can also now specify the transfer engine.
+
+### Driver extensions
+AGS exposes GCN shader extensions for both DirectX 11 and DirectX 12. It also provides access to additional extensions available in the AMD driver for DirectX 11:
+ * Quad List primitive type
+ * UAV overlap
+ * Depth bounds test
+ * Multi-draw indirect
+ * Multiview
+
+### Prerequisites
+* AMD Radeon™ GCN-based GPU (HD 7000 series or newer)
+ * Or other DirectX® 11 compatible GPU with Shader Model 5 support[3](#ags-sdk-footnote1)
+* 64-bit Windows® 7 (SP1 with the [Platform Update](https://msdn.microsoft.com/en-us/library/windows/desktop/jj863687.aspx)), Windows® 8.1, or Windows® 10
+* Visual Studio® 2013 or Visual Studio® 2015
+* Recommended driver: Radeon Software Crimson ReLive Edition 16.12.1 (driver version 16.50.2001) or later
+
+### Getting Started
+* It is recommended to take a look at the sample source code.
+ * There are three samples: ags_sample, crossfire_sample, and eyefinity_sample.
+* Visual Studio projects for VS2013 and VS2015 can be found in each sample's `build` directory.
+* Additional documentation, including API documentation and instructions on how to add AGS support to an existing project, can be found in the `ags_lib\doc` directory.
+ * There is also [online documentation](http://gpuopen-librariesandsdks.github.io/ags/).
+
+### Additional Samples
+In addition to the three samples included in this repo, there are other samples available on GitHub that use AGS:
+* [CrossfireAPI11](https://github.com/GPUOpen-LibrariesAndSDKs/CrossfireAPI11) - a larger example of using the explicit Crossfire API
+ * The CrossfireAPI11 sample also comes with an extensive guide for multi-GPU: the *AMD Crossfire guide for Direct3D® 11 applications*
+* [DepthBoundsTest11](https://github.com/GPUOpen-LibrariesAndSDKs/DepthBoundsTest11) - a sample showing how to use the depth bounds test extension
+* [Barycentrics11](https://github.com/GPUOpen-LibrariesAndSDKs/Barycentrics11) - a sample showing how to use the GCN shader extensions for DirectX 11
+* [Barycentrics12](https://github.com/GPUOpen-LibrariesAndSDKs/Barycentrics12) - a sample showing how to use the GCN shader extensions for DirectX 12
+
+### Premake
+The Visual Studio projects in each sample's `build` directory were generated with Premake. To generate the project files yourself, open a command prompt in the sample's `premake` directory (where the premake5.lua script for that sample is located, not the top-level directory where the premake5 executable is located) and execute the following command:
+
+* `..\..\premake\premake5.exe [action]`
+* For example: `..\..\premake\premake5.exe vs2015`
+
+Alternatively, to regenerate all Visual Studio files for the SDK, execute `ags_update_vs_files.bat` in the top-level `premake` directory.
+
+This version of Premake has been modified from the stock version to use the property sheet technique for the Windows SDK from this [Visual C++ Team blog post](http://blogs.msdn.com/b/vcblog/archive/2012/11/23/using-the-windows-8-sdk-with-visual-studio-2010-configuring-multiple-projects.aspx). The technique was originally described for using the Windows 8.0 SDK with Visual Studio 2010, but it applies more generally to using newer versions of the Windows SDK with older versions of Visual Studio.
+
+By default, Visual Studio 2013 projects will compile against the Windows 8.1 SDK. However, the VS2013 projects generated with this version of Premake will use the next higher SDK (i.e. the Windows 10 SDK), if the newer SDK exists on the user's machine.
+
+For Visual Studio 2015, the systemversion Premake function is used to add the `WindowsTargetPlatformVersion` element to the project file, to specify which version of the Windows SDK will be used. To change `WindowsTargetPlatformVersion` for Visual Studio 2015, change the value for `_AMD_WIN_SDK_VERSION` in `premake\amd_premake_util.lua` and regenerate the Visual Studio files.
+
+### Third-Party Software
+* DXUT is distributed under the terms of the MIT License. See `eyefinity_sample\dxut\MIT.txt`.
+* Premake is distributed under the terms of the BSD License. See `premake\LICENSE.txt`.
+
+### Attribution
+* AMD, the AMD Arrow logo, Radeon, Crossfire, and combinations thereof are either registered trademarks or trademarks of Advanced Micro Devices, Inc. in the United States and/or other countries.
+* Microsoft, DirectX, Visual Studio, and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
+
+### Notes
+1: Requires Radeon Software Crimson Edition 16.9.2 (driver version 16.40.2311) or later.
+
+2: Requires Radeon Software Crimson ReLive Edition 16.12.1 (driver version 16.50.2001) or later.
+
+3: While the AGS SDK samples will run on non-AMD hardware, they will be of limited usefulness, since the purpose of AGS is to provide convenient access to AMD-specific information and extensions.
diff --git a/util/test/demos/3rdparty/ags/ags_shader_intrinsics_dx11.hlsl.h b/util/test/demos/3rdparty/ags/ags_shader_intrinsics_dx11.hlsl.h
new file mode 100644
index 000000000..0aa7c4e4a
--- /dev/null
+++ b/util/test/demos/3rdparty/ags/ags_shader_intrinsics_dx11.hlsl.h
@@ -0,0 +1,3740 @@
+static const char *ags_shader_intrinsics_dx11_hlsl_array[] = {R"EOSHADER(
+// Copyright (c) 2020 Advanced Micro Devices, Inc. All rights reserved.
+//
+// Permission is hereby granted, free of charge, to any person obtaining a copy
+// of this software and associated documentation files (the "Software"), to deal
+// in the Software without restriction, including without limitation the rights
+// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+// copies of the Software, and to permit persons to whom the Software is
+// furnished to do so, subject to the following conditions:
+//
+// The above copyright notice and this permission notice shall be included in
+// all copies or substantial portions of the Software.
+//
+// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+// THE SOFTWARE.
+//
+
+/**
+*************************************************************************************************************
+* @file ags_shader_intrinsics_dx11.hlsl
+*
+* @brief
+* AMD D3D Shader Intrinsics API hlsl file.
+* This include file contains the shader intrinsics definitions (structures, enums, constants)
+* and HLSL shader intrinsics functions.
+*
+* @version 2.3
+*
+*************************************************************************************************************
+*/
+
+#ifndef _AMDDXEXTSHADERINTRINSICS_HLSL_
+#define _AMDDXEXTSHADERINTRINSICS_HLSL_
+
+/**
+*************************************************************************************************************
+* Definitions to construct the intrinsic instruction composed of an opcode and optional immediate data.
+*************************************************************************************************************
+*/
+#define AmdDxExtShaderIntrinsics_MagicCodeShift 28
+#define AmdDxExtShaderIntrinsics_MagicCodeMask 0xf
+#define AmdDxExtShaderIntrinsics_OpcodePhaseShift 24
+#define AmdDxExtShaderIntrinsics_OpcodePhaseMask 0x3
+#define AmdDxExtShaderIntrinsics_DataShift 8
+#define AmdDxExtShaderIntrinsics_DataMask 0xffff
+#define AmdDxExtShaderIntrinsics_OpcodeShift 0
+#define AmdDxExtShaderIntrinsics_OpcodeMask 0xff
+
+#define AmdDxExtShaderIntrinsics_MagicCode 0x5
+
+
+/**
+*************************************************************************************************************
+* Intrinsic opcodes.
+*************************************************************************************************************
+*/
+#define AmdDxExtShaderIntrinsicsOpcode_Readfirstlane 0x01
+#define AmdDxExtShaderIntrinsicsOpcode_Readlane 0x02
+#define AmdDxExtShaderIntrinsicsOpcode_LaneId 0x03
+#define AmdDxExtShaderIntrinsicsOpcode_Swizzle 0x04
+#define AmdDxExtShaderIntrinsicsOpcode_Ballot 0x05
+#define AmdDxExtShaderIntrinsicsOpcode_MBCnt 0x06
+#define AmdDxExtShaderIntrinsicsOpcode_Min3U 0x08
+#define AmdDxExtShaderIntrinsicsOpcode_Min3F 0x09
+#define AmdDxExtShaderIntrinsicsOpcode_Med3U 0x0a
+#define AmdDxExtShaderIntrinsicsOpcode_Med3F 0x0b
+#define AmdDxExtShaderIntrinsicsOpcode_Max3U 0x0c
+#define AmdDxExtShaderIntrinsicsOpcode_Max3F 0x0d
+#define AmdDxExtShaderIntrinsicsOpcode_BaryCoord 0x0e
+#define AmdDxExtShaderIntrinsicsOpcode_VtxParam 0x0f
+#define AmdDxExtShaderIntrinsicsOpCode_ViewportIndex 0x10
+#define AmdDxExtShaderIntrinsicsOpCode_RtArraySlice 0x11
+#define AmdDxExtShaderIntrinsicsOpCode_WaveReduce 0x12
+#define AmdDxExtShaderIntrinsicsOpCode_WaveScan 0x13
+#define AmdDxExtShaderIntrinsicsOpCode_Reserved1 0x14
+#define AmdDxExtShaderIntrinsicsOpCode_Reserved2 0x15
+#define AmdDxExtShaderIntrinsicsOpCode_Reserved3 0x16
+#define AmdDxExtShaderIntrinsicsOpCode_DrawIndex 0x17
+#define AmdDxExtShaderIntrinsicsOpCode_AtomicU64 0x18
+#define AmdDxExtShaderIntrinsicsOpcode_GetWaveSize 0x19
+#define AmdDxExtShaderIntrinsicsOpcode_BaseInstance 0x1a
+#define AmdDxExtShaderIntrinsicsOpcode_BaseVertex 0x1b
+
+
+/**
+*************************************************************************************************************
+* Intrinsic opcode phases.
+*************************************************************************************************************
+*/
+#define AmdDxExtShaderIntrinsicsOpcodePhase_0 0x0
+#define AmdDxExtShaderIntrinsicsOpcodePhase_1 0x1
+#define AmdDxExtShaderIntrinsicsOpcodePhase_2 0x2
+#define AmdDxExtShaderIntrinsicsOpcodePhase_3 0x3
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsicsSwizzle defines for common swizzles. Can be used as the operation parameter for
+* the AmdDxExtShaderIntrinsics_Swizzle intrinsic.
+*************************************************************************************************************
+*/
+#define AmdDxExtShaderIntrinsicsSwizzle_SwapX1 0x041f
+#define AmdDxExtShaderIntrinsicsSwizzle_SwapX2 0x081f
+#define AmdDxExtShaderIntrinsicsSwizzle_SwapX4 0x101f
+#define AmdDxExtShaderIntrinsicsSwizzle_SwapX8 0x201f
+#define AmdDxExtShaderIntrinsicsSwizzle_SwapX16 0x401f
+#define AmdDxExtShaderIntrinsicsSwizzle_ReverseX2 0x041f
+#define AmdDxExtShaderIntrinsicsSwizzle_ReverseX4 0x0c1f
+#define AmdDxExtShaderIntrinsicsSwizzle_ReverseX8 0x1c1f
+#define AmdDxExtShaderIntrinsicsSwizzle_ReverseX16 0x3c1f
+#define AmdDxExtShaderIntrinsicsSwizzle_ReverseX32 0x7c1f
+#define AmdDxExtShaderIntrinsicsSwizzle_BCastX2 0x003e
+#define AmdDxExtShaderIntrinsicsSwizzle_BCastX4 0x003c
+#define AmdDxExtShaderIntrinsicsSwizzle_BCastX8 0x0038
+#define AmdDxExtShaderIntrinsicsSwizzle_BCastX16 0x0030
+#define AmdDxExtShaderIntrinsicsSwizzle_BCastX32 0x0020
+
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsicsBarycentric defines for barycentric interpolation mode. To be used with
+* AmdDxExtShaderIntrinsicsOpcode_BaryCoord to specify the interpolation mode.
+*************************************************************************************************************
+*/
+#define AmdDxExtShaderIntrinsicsBarycentric_LinearCenter 0x1
+#define AmdDxExtShaderIntrinsicsBarycentric_LinearCentroid 0x2
+#define AmdDxExtShaderIntrinsicsBarycentric_LinearSample 0x3
+#define AmdDxExtShaderIntrinsicsBarycentric_PerspCenter 0x4
+#define AmdDxExtShaderIntrinsicsBarycentric_PerspCentroid 0x5
+#define AmdDxExtShaderIntrinsicsBarycentric_PerspSample 0x6
+#define AmdDxExtShaderIntrinsicsBarycentric_PerspPullModel 0x7
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsicsBarycentric defines for specifying vertex and parameter indices. To be used as
+* the inputs to the AmdDxExtShaderIntrinsicsOpcode_VtxParam intrinsic.
+*************************************************************************************************************
+*/
+#define AmdDxExtShaderIntrinsicsBarycentric_Vertex0 0x0
+#define AmdDxExtShaderIntrinsicsBarycentric_Vertex1 0x1
+#define AmdDxExtShaderIntrinsicsBarycentric_Vertex2 0x2
+
+#define AmdDxExtShaderIntrinsicsBarycentric_Param0 0x00
+#define AmdDxExtShaderIntrinsicsBarycentric_Param1 0x01
+#define AmdDxExtShaderIntrinsicsBarycentric_Param2 0x02
+#define AmdDxExtShaderIntrinsicsBarycentric_Param3 0x03
+#define AmdDxExtShaderIntrinsicsBarycentric_Param4 0x04
+#define AmdDxExtShaderIntrinsicsBarycentric_Param5 0x05
+#define AmdDxExtShaderIntrinsicsBarycentric_Param6 0x06
+#define AmdDxExtShaderIntrinsicsBarycentric_Param7 0x07
+#define AmdDxExtShaderIntrinsicsBarycentric_Param8 0x08
+#define AmdDxExtShaderIntrinsicsBarycentric_Param9 0x09
+#define AmdDxExtShaderIntrinsicsBarycentric_Param10 0x0a
+#define AmdDxExtShaderIntrinsicsBarycentric_Param11 0x0b
+#define AmdDxExtShaderIntrinsicsBarycentric_Param12 0x0c
+#define AmdDxExtShaderIntrinsicsBarycentric_Param13 0x0d
+#define AmdDxExtShaderIntrinsicsBarycentric_Param14 0x0e
+#define AmdDxExtShaderIntrinsicsBarycentric_Param15 0x0f
+#define AmdDxExtShaderIntrinsicsBarycentric_Param16 0x10
+#define AmdDxExtShaderIntrinsicsBarycentric_Param17 0x11
+#define AmdDxExtShaderIntrinsicsBarycentric_Param18 0x12
+#define AmdDxExtShaderIntrinsicsBarycentric_Param19 0x13
+#define AmdDxExtShaderIntrinsicsBarycentric_Param20 0x14
+#define AmdDxExtShaderIntrinsicsBarycentric_Param21 0x15
+#define AmdDxExtShaderIntrinsicsBarycentric_Param22 0x16
+#define AmdDxExtShaderIntrinsicsBarycentric_Param23 0x17
+#define AmdDxExtShaderIntrinsicsBarycentric_Param24 0x18
+#define AmdDxExtShaderIntrinsicsBarycentric_Param25 0x19
+#define AmdDxExtShaderIntrinsicsBarycentric_Param26 0x1a
+#define AmdDxExtShaderIntrinsicsBarycentric_Param27 0x1b
+#define AmdDxExtShaderIntrinsicsBarycentric_Param28 0x1c
+#define AmdDxExtShaderIntrinsicsBarycentric_Param29 0x1d
+#define AmdDxExtShaderIntrinsicsBarycentric_Param30 0x1e
+#define AmdDxExtShaderIntrinsicsBarycentric_Param31 0x1f
+
+#define AmdDxExtShaderIntrinsicsBarycentric_ComponentX 0x0
+#define AmdDxExtShaderIntrinsicsBarycentric_ComponentY 0x1
+#define AmdDxExtShaderIntrinsicsBarycentric_ComponentZ 0x2
+#define AmdDxExtShaderIntrinsicsBarycentric_ComponentW 0x3
+
+#define AmdDxExtShaderIntrinsicsBarycentric_ParamShift 0
+#define AmdDxExtShaderIntrinsicsBarycentric_ParamMask 0x1f
+#define AmdDxExtShaderIntrinsicsBarycentric_VtxShift 0x5
+#define AmdDxExtShaderIntrinsicsBarycentric_VtxMask 0x3
+#define AmdDxExtShaderIntrinsicsBarycentric_ComponentShift 0x7
+#define AmdDxExtShaderIntrinsicsBarycentric_ComponentMask 0x3
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsicsWaveOp defines for supported operations. Can be used as the parameter for the
+* AmdDxExtShaderIntrinsicsOpCode_WaveReduce and AmdDxExtShaderIntrinsicsOpCode_WaveScan intrinsics.
+*************************************************************************************************************
+*/
+#define AmdDxExtShaderIntrinsicsWaveOp_AddF 0x01
+#define AmdDxExtShaderIntrinsicsWaveOp_AddI 0x02
+#define AmdDxExtShaderIntrinsicsWaveOp_AddU 0x03
+#define AmdDxExtShaderIntrinsicsWaveOp_MulF 0x04
+#define AmdDxExtShaderIntrinsicsWaveOp_MulI 0x05
+#define AmdDxExtShaderIntrinsicsWaveOp_MulU 0x06
+#define AmdDxExtShaderIntrinsicsWaveOp_MinF 0x07
+#define AmdDxExtShaderIntrinsicsWaveOp_MinI 0x08
+#define AmdDxExtShaderIntrinsicsWaveOp_MinU 0x09
+#define AmdDxExtShaderIntrinsicsWaveOp_MaxF 0x0a
+#define AmdDxExtShaderIntrinsicsWaveOp_MaxI 0x0b
+#define AmdDxExtShaderIntrinsicsWaveOp_MaxU 0x0c
+#define AmdDxExtShaderIntrinsicsWaveOp_And 0x0d // Reduction only
+#define AmdDxExtShaderIntrinsicsWaveOp_Or 0x0e // Reduction only
+#define AmdDxExtShaderIntrinsicsWaveOp_Xor 0x0f // Reduction only
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsicsWaveOp masks and shifts for opcode and flags
+*************************************************************************************************************
+*/
+#define AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift 0
+#define AmdDxExtShaderIntrinsicsWaveOp_OpcodeMask 0xff
+#define AmdDxExtShaderIntrinsicsWaveOp_FlagShift 8
+#define AmdDxExtShaderIntrinsicsWaveOp_FlagMask 0xff
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsicsWaveOp flags for use with AmdDxExtShaderIntrinsicsOpCode_WaveScan.
+*************************************************************************************************************
+*/
+#define AmdDxExtShaderIntrinsicsWaveOp_Inclusive 0x01
+#define AmdDxExtShaderIntrinsicsWaveOp_Exclusive 0x02
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsicsAtomicOp defines for supported operations. Can be used as the parameter for the
+* AmdDxExtShaderIntrinsicsOpCode_AtomicU64 intrinsic.
+*************************************************************************************************************
+*/
+#define AmdDxExtShaderIntrinsicsAtomicOp_MinU64 0x01
+#define AmdDxExtShaderIntrinsicsAtomicOp_MaxU64 0x02
+#define AmdDxExtShaderIntrinsicsAtomicOp_AndU64 0x03
+#define AmdDxExtShaderIntrinsicsAtomicOp_OrU64 0x04
+#define AmdDxExtShaderIntrinsicsAtomicOp_XorU64 0x05
+#define AmdDxExtShaderIntrinsicsAtomicOp_AddU64 0x06
+#define AmdDxExtShaderIntrinsicsAtomicOp_XchgU64 0x07
+#define AmdDxExtShaderIntrinsicsAtomicOp_CmpXchgU64 0x08
+
+
+/**
+*************************************************************************************************************
+* Resource slots for intrinsics using imm_atomic_cmp_exch.
+*************************************************************************************************************
+*/
+#ifndef AmdDxExtShaderIntrinsicsUAVSlot
+#define AmdDxExtShaderIntrinsicsUAVSlot u7
+#endif
+
+RWByteAddressBuffer AmdDxExtShaderIntrinsicsUAV : register(AmdDxExtShaderIntrinsicsUAVSlot);
+
+/**
+*************************************************************************************************************
+* Resource and sampler slots for intrinsics using sample_l.
+*************************************************************************************************************
+*/
+#ifndef AmdDxExtShaderIntrinsicsResSlot
+#define AmdDxExtShaderIntrinsicsResSlot t127
+#endif
+
+#ifndef AmdDxExtShaderIntrinsicsSamplerSlot
+#define AmdDxExtShaderIntrinsicsSamplerSlot s15
+#endif
+
+SamplerState AmdDxExtShaderIntrinsicsSamplerState : register (AmdDxExtShaderIntrinsicsSamplerSlot);
+Texture3D AmdDxExtShaderIntrinsicsResource : register (AmdDxExtShaderIntrinsicsResSlot);
+
+/**
+*************************************************************************************************************
+* MakeAmdShaderIntrinsicsInstruction
+*
+* Creates instruction from supplied opcode and immediate data.
+* NOTE: This is an internal function and should not be called by the source HLSL shader directly.
+*
+*************************************************************************************************************
+*/
+uint MakeAmdShaderIntrinsicsInstruction(uint opcode, uint opcodePhase, uint immediateData)
+{
+ return ((AmdDxExtShaderIntrinsics_MagicCode << AmdDxExtShaderIntrinsics_MagicCodeShift) |
+ (immediateData << AmdDxExtShaderIntrinsics_DataShift) |
+ (opcodePhase << AmdDxExtShaderIntrinsics_OpcodePhaseShift) |
+ (opcode << AmdDxExtShaderIntrinsics_OpcodeShift));
+}
+
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_ReadfirstlaneF
+*
+* Returns the value of float src for the first active lane of the wavefront.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Readfirstlane) returned S_OK.
+*
+*************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_ReadfirstlaneF(float src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Readfirstlane,
+ 0, 0);
+
+ uint retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+ return asfloat(retVal);
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_ReadfirstlaneU
+*
+* Returns the value of unsigned integer src for the first active lane of the wavefront.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Readfirstlane) returned S_OK.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_ReadfirstlaneU(uint src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Readfirstlane,
+ 0, 0);
+
+ uint retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src, 0, retVal);
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_ReadlaneF
+*
+* Returns the value of float src for the lane within the wavefront specified by laneId.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Readlane) returned S_OK.
+*
+*************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_ReadlaneF(float src, uint laneId)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Readlane, 0,
+ laneId);
+
+ uint retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+ return asfloat(retVal);
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_ReadlaneU
+*
+* Returns the value of unsigned integer src for the lane within the wavefront specified by laneId.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Readlane) returned S_OK.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_ReadlaneU(uint src, uint laneId)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Readlane, 0,
+ laneId);
+
+ uint retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src, 0, retVal);
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_LaneId
+*
+* Returns the current lane id for the thread within the wavefront.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_LaneId) returned S_OK.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_LaneId()
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_LaneId, 0, 0);
+
+ uint retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_GetWaveSize
+*
+* Returns the wave size for the current shader, including active, inactive and helper lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_GetWaveSize) returned S_OK.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_GetWaveSize()
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_GetWaveSize, 0, 0);
+
+ uint retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_SwizzleF
+*
+* Generic instruction to shuffle the float src value among different lanes as specified by the
+* operation.
+* Note that the operation parameter must be specified as an immediate value, not as a value read from a variable.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Swizzle) returned S_OK.
+*
+*************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_SwizzleF(float src, uint operation)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Swizzle, 0,
+ operation);
+
+ uint retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+ return asfloat(retVal);
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_SwizzleU
+*
+* Generic instruction to shuffle the unsigned integer src value among different lanes as specified by the
+* operation.
+* Note that the operation parameter must be specified as an immediate value, not as a value read from a variable.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Swizzle) returned S_OK.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_SwizzleU(uint src, uint operation)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Swizzle, 0,
+ operation);
+
+ uint retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src, 0, retVal);
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_Ballot
+*
+* Given an input predicate, returns a bit mask indicating the lanes for which the predicate is true.
+* Inactive or non-existent lanes will always return 0. The number of existent lanes is the
+* wavefront size.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Ballot) returned S_OK.
+*
+*************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_Ballot(bool predicate)
+{
+ uint instruction;
+
+ uint retVal1;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Ballot,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0, 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, predicate, 0, retVal1);
+
+ uint retVal2;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Ballot,
+ AmdDxExtShaderIntrinsicsOpcodePhase_1, 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, predicate, 0, retVal2);
+
+ return uint2(retVal1, retVal2);
+}
+
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_BallotAny
+*
+* Convenience routine that uses Ballot and returns true if the predicate is true for any of the
+* active lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Ballot) returned S_OK.
+*
+*************************************************************************************************************
+*/
+bool AmdDxExtShaderIntrinsics_BallotAny(bool predicate)
+{
+ uint2 retVal = AmdDxExtShaderIntrinsics_Ballot(predicate);
+
+ return ((retVal.x | retVal.y) != 0 ? true : false);
+}
+
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_BallotAll
+*
+* Convenience routine that uses Ballot and returns true if the predicate is true for all of the
+* active lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Ballot) returned S_OK.
+*
+*************************************************************************************************************
+*/
+bool AmdDxExtShaderIntrinsics_BallotAll(bool predicate)
+{
+ uint2 ballot = AmdDxExtShaderIntrinsics_Ballot(predicate);
+
+ uint2 execMask = AmdDxExtShaderIntrinsics_Ballot(true);
+
+ return ((ballot.x == execMask.x) && (ballot.y == execMask.y));
+}
+
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_MBCnt
+*
+* Returns the masked bit count of the source register for this thread within all the active threads
+* within a wavefront.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_MBCnt) returned S_OK.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_MBCnt(uint2 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_MBCnt, 0, 0);
+
+ uint retVal;
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.x, src.y, retVal);
+
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_Min3F
+*
+* Returns the minimum value of the three floating point source arguments.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+*************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_Min3F(float src0, float src1, float src2)
+{
+ uint minimum;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Min3F,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, asuint(src0), asuint(src1), minimum);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Min3F,
+ AmdDxExtShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, asuint(src2), minimum, minimum);
+
+ return asfloat(minimum);
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_Min3U
+*
+* Returns the minimum value of the three unsigned integer source arguments.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_Min3U(uint src0, uint src1, uint src2)
+{
+ uint minimum;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Min3U,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, src0, src1, minimum);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Min3U,
+ AmdDxExtShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, src2, minimum, minimum);
+
+ return minimum;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_Med3F
+*
+* Returns the median value of the three floating point source arguments.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+*************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_Med3F(float src0, float src1, float src2)
+{
+ uint median;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Med3F,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, asuint(src0), asuint(src1), median);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Med3F,
+ AmdDxExtShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, asuint(src2), median, median);
+
+ return asfloat(median);
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_Med3U
+*
+* Returns the median value of the three unsigned integer source arguments.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_Med3U(uint src0, uint src1, uint src2)
+{
+ uint median;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Med3U,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, src0, src1, median);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Med3U,
+ AmdDxExtShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, src2, median, median);
+
+ return median;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_Max3F
+*
+* Returns the maximum value of the three floating point source arguments.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+*************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_Max3F(float src0, float src1, float src2)
+{
+ uint maximum;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Max3F,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, asuint(src0), asuint(src1), maximum);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Max3F,
+ AmdDxExtShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, asuint(src2), maximum, maximum);
+
+ return asfloat(maximum);
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_Max3U
+*
+* Returns the maximum value of the three unsigned integer source arguments.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_Max3U(uint src0, uint src1, uint src2)
+{
+ uint maximum;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Max3U,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, src0, src1, maximum);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_Max3U,
+ AmdDxExtShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, src2, maximum, maximum);
+
+ return maximum;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_IjBarycentricCoords
+*
+* Returns the (i, j) barycentric coordinate pair for this shader invocation with the specified
+* interpolation mode at the specified pixel location. Should not be used for "pull-model" interpolation;
+* use PullModelBarycentricCoords instead.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_BaryCoord) returned S_OK.
+*
+* Can only be used in pixel shader stages.
+*
+*************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_IjBarycentricCoords(uint interpMode)
+{
+ uint2 retVal;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_BaryCoord,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ interpMode);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, 0, 0, retVal.x);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_BaryCoord,
+ AmdDxExtShaderIntrinsicsOpcodePhase_1,
+ interpMode);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, retVal.x, 0, retVal.y);
+
+ return float2(asfloat(retVal.x), asfloat(retVal.y));
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_PullModelBarycentricCoords
+*
+* Returns the (1/W,1/I,1/J) coordinates at the pixel center which can be used for custom interpolation at
+* any location in the pixel.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_BaryCoord) returned S_OK.
+*
+* Can only be used in pixel shader stages.
+*
+*************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_PullModelBarycentricCoords()
+{
+ uint3 retVal;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_BaryCoord,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ AmdDxExtShaderIntrinsicsBarycentric_PerspPullModel);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, 0, 0, retVal.x);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_BaryCoord,
+ AmdDxExtShaderIntrinsicsOpcodePhase_1,
+ AmdDxExtShaderIntrinsicsBarycentric_PerspPullModel);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, retVal.x, 0, retVal.y);
+
+ uint instruction3 = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_BaryCoord,
+ AmdDxExtShaderIntrinsicsOpcodePhase_2,
+ AmdDxExtShaderIntrinsicsBarycentric_PerspPullModel);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction3, retVal.y, 0, retVal.z);
+
+ return float3(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z));
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_VertexParameter
+*
+* Returns the triangle's parameter information at the specified triangle vertex.
+* The vertex and parameter indices must be specified as immediate values.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_VtxParam) returned S_OK.
+*
+* Only available in pixel shader stages.
+*
+*************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_VertexParameter(uint vertexIdx, uint parameterIdx)
+{
+ uint4 retVal;
+ uint4 instruction;
+
+ instruction.x = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_VtxParam,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ ((vertexIdx << AmdDxExtShaderIntrinsicsBarycentric_VtxShift) |
+ (parameterIdx << AmdDxExtShaderIntrinsicsBarycentric_ParamShift) |
+ (AmdDxExtShaderIntrinsicsBarycentric_ComponentX << AmdDxExtShaderIntrinsicsBarycentric_ComponentShift)));
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction.x, 0, 0, retVal.x);
+
+ instruction.y = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_VtxParam,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ ((vertexIdx << AmdDxExtShaderIntrinsicsBarycentric_VtxShift) |
+ (parameterIdx << AmdDxExtShaderIntrinsicsBarycentric_ParamShift) |
+ (AmdDxExtShaderIntrinsicsBarycentric_ComponentY << AmdDxExtShaderIntrinsicsBarycentric_ComponentShift)));
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction.y, 0, 0, retVal.y);
+
+ instruction.z = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_VtxParam,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ ((vertexIdx << AmdDxExtShaderIntrinsicsBarycentric_VtxShift) |
+ (parameterIdx << AmdDxExtShaderIntrinsicsBarycentric_ParamShift) |
+ (AmdDxExtShaderIntrinsicsBarycentric_ComponentZ << AmdDxExtShaderIntrinsicsBarycentric_ComponentShift)));
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction.z, 0, 0, retVal.z);
+
+ instruction.w = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_VtxParam,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ ((vertexIdx << AmdDxExtShaderIntrinsicsBarycentric_VtxShift) |
+ (parameterIdx << AmdDxExtShaderIntrinsicsBarycentric_ParamShift) |
+ (AmdDxExtShaderIntrinsicsBarycentric_ComponentW << AmdDxExtShaderIntrinsicsBarycentric_ComponentShift)));
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction.w, 0, 0, retVal.w);
+
+ return float4(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z), asfloat(retVal.w));
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_VertexParameterComponent
+*
+* Returns the triangle's parameter information at the specified triangle vertex and component.
+* The vertex, parameter and component indices must be specified as immediate values.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_VtxParam) returned S_OK.
+*
+* Only available in pixel shader stages.
+*
+*************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_VertexParameterComponent(uint vertexIdx, uint parameterIdx, uint componentIdx)
+{
+ uint retVal;
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_VtxParam,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ ((vertexIdx << AmdDxExtShaderIntrinsicsBarycentric_VtxShift) |
+ (parameterIdx << AmdDxExtShaderIntrinsicsBarycentric_ParamShift) |
+ (componentIdx << AmdDxExtShaderIntrinsicsBarycentric_ComponentShift)));
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+
+ return asfloat(retVal);
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_GetViewportIndex
+*
+* Returns the current viewport index for replicated draws when MultiView extension is enabled (broadcast masks
+* are set).
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_MultiViewIndices) returned S_OK.
+*
+* Only available in vertex/geometry/domain shader stages.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_GetViewportIndex()
+{
+ uint retVal;
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_ViewportIndex, 0, 0);
+
+ retVal = asuint(AmdDxExtShaderIntrinsicsResource.SampleLevel(AmdDxExtShaderIntrinsicsSamplerState,
+ float3(0, 0, 0),
+ asfloat(instruction)).x);
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_GetViewportIndexPsOnly
+*
+* Returns the current viewport index for replicated draws when MultiView extension is enabled (broadcast masks
+* are set).
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_MultiViewIndices) returned S_OK.
+*
+* Only available in pixel shader stage.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_GetViewportIndexPsOnly()
+{
+ uint retVal;
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_ViewportIndex, 0, 0);
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_GetRTArraySlice
+*
+* Returns the current RT array slice for replicated draws when MultiView extension is enabled (broadcast masks
+* are set).
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_MultiViewIndices) returned S_OK.
+*
+* Only available in vertex/geometry/domain shader stages.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_GetRTArraySlice()
+{
+ uint retVal;
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_RtArraySlice, 0, 0);
+
+ retVal = asuint(AmdDxExtShaderIntrinsicsResource.SampleLevel(AmdDxExtShaderIntrinsicsSamplerState,
+ float3(0, 0, 0),
+ asfloat(instruction)).x);
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_GetRTArraySlicePsOnly
+*
+* Returns the current RT array slice for replicated draws when MultiView extension is enabled (broadcast masks
+* are set).
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_MultiViewIndices) returned S_OK.
+*
+* Only available in pixel shader stage.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_GetRTArraySlicePsOnly()
+{
+ uint retVal;
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_RtArraySlice, 0, 0);
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveReduce
+*
+* The following functions perform the specified reduction operation across a wavefront.
+*
+* They are available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+*************************************************************************************************************
+*/
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveReduce : float
+*************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WaveReduce(uint waveOp, float src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveReduce,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift));
+ uint retVal;
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+
+ return asfloat(retVal);
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveReduce : float2
+*************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WaveReduce(uint waveOp, float2 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveReduce,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift));
+ uint2 retVal;
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+
+ return float2(asfloat(retVal.x), asfloat(retVal.y));
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveReduce : float3
+*************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WaveReduce(uint waveOp, float3 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveReduce,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift));
+ uint3 retVal;
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.z), 0, retVal.z);
+
+ return float3(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z));
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveReduce : float4
+*************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WaveReduce(uint waveOp, float4 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveReduce,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift));
+ uint4 retVal;
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.z), 0, retVal.z);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.w), 0, retVal.w);
+
+ return float4(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z), asfloat(retVal.w));
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveReduce : int
+*************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WaveReduce(uint waveOp, int src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveReduce,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift));
+ uint retVal;
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveReduce : int2
+*************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WaveReduce(uint waveOp, int2 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveReduce,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift));
+ uint2 retVal;
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveReduce : int3
+*************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WaveReduce(uint waveOp, int3 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveReduce,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift));
+ uint3 retVal;
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.z), 0, retVal.z);
+
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveReduce : int4
+*************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WaveReduce(uint waveOp, int4 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveReduce,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift));
+ uint4 retVal;
+
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.z), 0, retVal.z);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.w), 0, retVal.w);
+
+ return retVal;
+}
+
+
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveScan
+*
+* The following functions perform the specified scan operation across a wavefront.
+*
+* They are available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_WaveScan) returned S_OK.
+*
+* Available in all shader stages.
+*
+*************************************************************************************************************
+*/
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveScan : float
+*************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WaveScan(uint waveOp, uint flags, float src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveScan,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdDxExtShaderIntrinsicsWaveOp_FlagShift));
+ uint retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+
+ return asfloat(retVal);
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveScan : float2
+*************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WaveScan(uint waveOp, uint flags, float2 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveScan,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdDxExtShaderIntrinsicsWaveOp_FlagShift));
+ uint2 retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+
+ return float2(asfloat(retVal.x), asfloat(retVal.y));
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveScan : float3
+*************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WaveScan(uint waveOp, uint flags, float3 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveScan,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdDxExtShaderIntrinsicsWaveOp_FlagShift));
+ uint3 retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.z), 0, retVal.z);
+
+ return float3(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z));
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveScan : float4
+*************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WaveScan(uint waveOp, uint flags, float4 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_WaveScan,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdDxExtShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdDxExtShaderIntrinsicsWaveOp_FlagShift));
+ uint4 retVal;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.z), 0, retVal.z);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.w), 0, retVal.w);
+
+ return float4(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z), asfloat(retVal.w));
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_GetDrawIndex
+*
+* Returns the 0-based draw index in an indirect draw. Always returns 0 for direct draws.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_DrawIndex) returned S_OK.
+*
+* Only available in vertex shader stage.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_GetDrawIndex()
+{
+ uint retVal;
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpCode_DrawIndex,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_GetBaseInstance
+*
+* Returns the StartInstanceLocation parameter passed to direct or indirect drawing commands.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_BaseInstance) returned S_OK.
+*
+* Only available in vertex shader stage.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_GetBaseInstance()
+{
+ uint retVal;
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_BaseInstance,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+
+ return retVal;
+}
+
+/**
+*************************************************************************************************************
+* AmdDxExtShaderIntrinsics_GetBaseVertex
+*
+* For non-indexed draw commands, returns the StartVertexLocation parameter. For indexed draw commands,
+* returns the BaseVertexLocation parameter.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsSupport_BaseVertex) returned S_OK.
+*
+* Only available in vertex shader stage.
+*
+*************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_GetBaseVertex()
+{
+ uint retVal;
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdDxExtShaderIntrinsicsOpcode_BaseVertex,
+ AmdDxExtShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_MakeAtomicInstructions
+*
+* Creates a uint4 whose x/y/z/w components contain the phase 0/1/2/3 instructions for an atomic op.
+* NOTE: This is an internal function and should not be called by the source HLSL shader directly.
+*
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_MakeAtomicInstructions(uint op)
+{
+ uint4 instructions;
+ instructions.x = MakeAmdShaderIntrinsicsInstruction(
+ AmdDxExtShaderIntrinsicsOpCode_AtomicU64, AmdDxExtShaderIntrinsicsOpcodePhase_0, op);
+ instructions.y = MakeAmdShaderIntrinsicsInstruction(
+ AmdDxExtShaderIntrinsicsOpCode_AtomicU64, AmdDxExtShaderIntrinsicsOpcodePhase_1, op);
+ instructions.z = MakeAmdShaderIntrinsicsInstruction(
+ AmdDxExtShaderIntrinsicsOpCode_AtomicU64, AmdDxExtShaderIntrinsicsOpcodePhase_2, op);
+ instructions.w = MakeAmdShaderIntrinsicsInstruction(
+ AmdDxExtShaderIntrinsicsOpCode_AtomicU64, AmdDxExtShaderIntrinsicsOpcodePhase_3, op);
+ return instructions;
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_AtomicOp
+*
+* Creates intrinsic instructions for the specified atomic op.
+* NOTE: These are internal functions and should not be called by the source HLSL shader directly.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_AtomicOp(RWByteAddressBuffer uav, uint3 address, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdDxExtShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav.Store(retVal.x, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicOp(RWTexture1D uav, uint3 address, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdDxExtShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[retVal.x] = retVal.y;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicOp(RWTexture2D uav, uint3 address, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdDxExtShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[uint2(retVal.x, retVal.x)] = retVal.y;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicOp(RWTexture3D uav, uint3 address, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdDxExtShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[uint3(retVal.x, retVal.x, retVal.x)] = retVal.y;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicOp(
+ RWByteAddressBuffer uav, uint3 address, uint2 compare_value, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdDxExtShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav.Store(retVal.x, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, compare_value.x, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.w, compare_value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicOp(
+ RWTexture1D uav, uint3 address, uint2 compare_value, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdDxExtShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[retVal.x] = retVal.y;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, compare_value.x, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.w, compare_value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicOp(
+ RWTexture2D uav, uint3 address, uint2 compare_value, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdDxExtShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[uint2(retVal.x, retVal.x)] = retVal.y;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, compare_value.x, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.w, compare_value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicOp(
+ RWTexture3D uav, uint3 address, uint2 compare_value, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdDxExtShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[uint3(retVal.x, retVal.x, retVal.x)] = retVal.y;
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, compare_value.x, retVal.y);
+ AmdDxExtShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.w, compare_value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_AtomicMinU64
+*
+* The following functions are available if CheckSupport(AmdDxExtShaderIntrinsicsOpCode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic minimum of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_AtomicMinU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_MinU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicMinU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_MinU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicMinU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_MinU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicMinU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_MinU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_AtomicMaxU64
+*
+* The following functions are available if CheckSupport(AmdDxExtShaderIntrinsicsOpCode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic maximum of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_AtomicMaxU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_MaxU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicMaxU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_MaxU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicMaxU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_MaxU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicMaxU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_MaxU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_AtomicAndU64
+*
+* The following functions are available if CheckSupport(AmdDxExtShaderIntrinsicsOpCode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic AND of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_AtomicAndU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_AndU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicAndU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_AndU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicAndU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_AndU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicAndU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_AndU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_AtomicOrU64
+*
+* The following functions are available if CheckSupport(AmdDxExtShaderIntrinsicsOpCode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic OR of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_AtomicOrU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_OrU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicOrU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_OrU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicOrU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_OrU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicOrU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_OrU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_AtomicXorU64
+*
+* The following functions are available if CheckSupport(AmdDxExtShaderIntrinsicsOpCode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic XOR of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_AtomicXorU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_XorU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicXorU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_XorU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicXorU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_XorU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicXorU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_XorU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_AtomicAddU64
+*
+* The following functions are available if CheckSupport(AmdDxExtShaderIntrinsicsOpCode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic add of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_AtomicAddU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_AddU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicAddU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_AddU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicAddU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_AddU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicAddU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_AddU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_AtomicXchgU64
+*
+* The following functions are available if CheckSupport(AmdDxExtShaderIntrinsicsOpCode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic exchange of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_AtomicXchgU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_XchgU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicXchgU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_XchgU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicXchgU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_XchgU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicXchgU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_XchgU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_AtomicCmpXchgU64
+*
+* The following functions are available if CheckSupport(AmdDxExtShaderIntrinsicsOpCode_AtomicU64) returned S_OK.
+*
+* Performs a 64-bit atomic compare of the comparison value with the UAV at address, stores value if they match,
+* and returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_AtomicCmpXchgU64(
+ RWByteAddressBuffer uav, uint address, uint2 compare_value, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_CmpXchgU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), compare_value, value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicCmpXchgU64(
+ RWTexture1D uav, uint address, uint2 compare_value, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_CmpXchgU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), compare_value, value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicCmpXchgU64(
+ RWTexture2D uav, uint2 address, uint2 compare_value, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_CmpXchgU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), compare_value, value, op);
+}
+
+uint2 AmdDxExtShaderIntrinsics_AtomicCmpXchgU64(
+ RWTexture3D uav, uint3 address, uint2 compare_value, uint2 value)
+{
+ const uint op = AmdDxExtShaderIntrinsicsAtomicOp_CmpXchgU64;
+ return AmdDxExtShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), compare_value, value, op);
+}
+
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+*
+* Performs a reduction operation across a wave and returns the result of the reduction (the sum over all threads in
+* the wave) to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WaveActiveSum(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WaveActiveSum(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WaveActiveSum(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WaveActiveSum(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WaveActiveSum(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WaveActiveSum(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WaveActiveSum(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WaveActiveSum(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WaveActiveSum(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WaveActiveSum(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WaveActiveSum(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WaveActiveSum(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_AddU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+*
+* Performs a reduction operation across a wave and returns the result of the reduction (the product over all threads
+* in the wave) to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WaveActiveProduct(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WaveActiveProduct(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WaveActiveProduct(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WaveActiveProduct(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WaveActiveProduct(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulI, src);
+}
+
+)EOSHADER",
+ R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WaveActiveProduct(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WaveActiveProduct(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WaveActiveProduct(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WaveActiveProduct(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WaveActiveProduct(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WaveActiveProduct(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WaveActiveProduct(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MulU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+*
+* Performs a reduction operation across a wave and returns the result of the reduction (minimum of all threads in
+* the wave) to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WaveActiveMin(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WaveActiveMin(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WaveActiveMin(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WaveActiveMin(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WaveActiveMin(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WaveActiveMin(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WaveActiveMin(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WaveActiveMin(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WaveActiveMin(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WaveActiveMin(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WaveActiveMin(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WaveActiveMin(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MinU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+*
+* Performs a reduction operation across a wave and returns the result of the reduction (maximum of all threads in
+* the wave) to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WaveActiveMax(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WaveActiveMax(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WaveActiveMax(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WaveActiveMax(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WaveActiveMax(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WaveActiveMax(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WaveActiveMax(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WaveActiveMax(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WaveActiveMax(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxU, src);
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WaveActiveMax(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WaveActiveMax(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WaveActiveMax(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_MaxU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitAnd
+*
+* Performs a reduction operation across a wave and returns the result of the reduction (bitwise AND of all threads in
+* the wave) to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitAnd
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WaveActiveBitAnd(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_And, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitAnd
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WaveActiveBitAnd(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_And, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitAnd
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WaveActiveBitAnd(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_And, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitAnd
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WaveActiveBitAnd(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_And, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitAnd
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WaveActiveBitAnd(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_And, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitAnd
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WaveActiveBitAnd(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_And, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitAnd
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WaveActiveBitAnd(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_And, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitAnd
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WaveActiveBitAnd(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_And, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitOr
+*
+* Performs a reduction operation across a wave and returns the result of the reduction (bitwise OR of all threads in
+* the wave) to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitOr
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WaveActiveBitOr(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Or, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitOr
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WaveActiveBitOr(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Or, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitOr
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WaveActiveBitOr(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Or, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitOr
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WaveActiveBitOr(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Or, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitOr
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WaveActiveBitOr(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Or, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitOr
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WaveActiveBitOr(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Or, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitOr
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WaveActiveBitOr(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Or, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitOr
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WaveActiveBitOr(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Or, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitXor
+*
+* Performs a reduction operation across a wave and returns the result of the reduction (bitwise XOR of all threads in
+* the wave) to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitXor
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WaveActiveBitXor(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Xor, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitXor
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WaveActiveBitXor(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Xor, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitXor
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WaveActiveBitXor(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Xor, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitXor
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WaveActiveBitXor(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Xor, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitXor
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WaveActiveBitXor(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Xor, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitXor
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WaveActiveBitXor(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Xor, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitXor
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WaveActiveBitXor(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Xor, src);
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WaveActiveBitXor
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WaveActiveBitXor(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveReduce(AmdDxExtShaderIntrinsicsWaveOp_Xor, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+*
+* Performs a prefix (exclusive) scan operation across a wave and returns the resulting sum to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveScan) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WavePrefixSum(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WavePrefixSum(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WavePrefixSum(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WavePrefixSum(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WavePrefixSum(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(
+ AmdDxExtShaderIntrinsicsWaveOp_AddI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WavePrefixSum(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WavePrefixSum(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WavePrefixSum(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WavePrefixSum(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WavePrefixSum(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WavePrefixSum(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixSum
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WavePrefixSum(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+*
+* Performs an exclusive prefix scan across a wave and returns the resulting product to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveScan) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WavePrefixProduct(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WavePrefixProduct(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WavePrefixProduct(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WavePrefixProduct(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WavePrefixProduct(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WavePrefixProduct(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WavePrefixProduct(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WavePrefixProduct(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WavePrefixProduct(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WavePrefixProduct(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WavePrefixProduct(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixProduct
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WavePrefixProduct(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+*
+* Performs an exclusive prefix scan across a wave and returns the resulting minimum value to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveScan) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WavePrefixMin(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WavePrefixMin(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WavePrefixMin(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WavePrefixMin(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WavePrefixMin(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WavePrefixMin(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WavePrefixMin(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WavePrefixMin(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WavePrefixMin(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WavePrefixMin(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WavePrefixMin(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMin
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WavePrefixMin(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+*
+* Performs an exclusive prefix scan across a wave and returns the resulting maximum value to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveScan) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WavePrefixMax(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WavePrefixMax(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WavePrefixMax(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WavePrefixMax(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxF,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WavePrefixMax(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WavePrefixMax(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WavePrefixMax(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WavePrefixMax(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxI,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WavePrefixMax(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WavePrefixMax(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WavePrefixMax(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePrefixMax
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WavePrefixMax(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxU,
+ AmdDxExtShaderIntrinsicsWaveOp_Exclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+*
+* Performs an inclusive postfix scan across a wave and returns the resulting sum to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveScan) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WavePostfixSum(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WavePostfixSum(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WavePostfixSum(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WavePostfixSum(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WavePostfixSum(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WavePostfixSum(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WavePostfixSum(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WavePostfixSum(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WavePostfixSum(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WavePostfixSum(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WavePostfixSum(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixSum
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WavePostfixSum(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_AddU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+*
+* Performs an inclusive postfix scan across a wave and returns the resulting product to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveScan) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WavePostfixProduct(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WavePostfixProduct(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WavePostfixProduct(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WavePostfixProduct(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WavePostfixProduct(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WavePostfixProduct(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WavePostfixProduct(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WavePostfixProduct(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WavePostfixProduct(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WavePostfixProduct(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WavePostfixProduct(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixProduct
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WavePostfixProduct(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MulU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+*
+* Performs an inclusive postfix scan across a wave and returns the resulting minimum value to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveScan) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WavePostfixMin(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WavePostfixMin(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WavePostfixMin(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WavePostfixMin(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WavePostfixMin(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WavePostfixMin(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WavePostfixMin(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WavePostfixMin(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WavePostfixMin(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WavePostfixMin(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WavePostfixMin(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMin
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WavePostfixMin(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MinU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+*
+* Performs a Postfix scan operation across a wave and returns the resulting maximum value to all participating lanes.
+*
+* Available if CheckSupport(AmdDxExtShaderIntrinsicsOpcode_WaveScan) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdDxExtShaderIntrinsics_WavePostfixMax(float src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+float2 AmdDxExtShaderIntrinsics_WavePostfixMax(float2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+float3 AmdDxExtShaderIntrinsics_WavePostfixMax(float3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+float4 AmdDxExtShaderIntrinsics_WavePostfixMax(float4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxF,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+int AmdDxExtShaderIntrinsics_WavePostfixMax(int src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+int2 AmdDxExtShaderIntrinsics_WavePostfixMax(int2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+int3 AmdDxExtShaderIntrinsics_WavePostfixMax(int3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+int4 AmdDxExtShaderIntrinsics_WavePostfixMax(int4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxI,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+uint AmdDxExtShaderIntrinsics_WavePostfixMax(uint src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+uint2 AmdDxExtShaderIntrinsics_WavePostfixMax(uint2 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+uint3 AmdDxExtShaderIntrinsics_WavePostfixMax(uint3 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdDxExtShaderIntrinsics_WavePostfixMax
+***********************************************************************************************************************
+*/
+uint4 AmdDxExtShaderIntrinsics_WavePostfixMax(uint4 src)
+{
+ return AmdDxExtShaderIntrinsics_WaveScan(AmdDxExtShaderIntrinsicsWaveOp_MaxU,
+ AmdDxExtShaderIntrinsicsWaveOp_Inclusive,
+ src);
+}
+
+
+#endif // _AMDDXEXTSHADERINTRINSICS_HLSL_
+)EOSHADER"};
+
+inline std::string ags_shader_intrinsics_dx11_hlsl()
+{
+ std::string ret;
+ for(size_t i = 0; i < sizeof(ags_shader_intrinsics_dx11_hlsl_array) /
+ sizeof(ags_shader_intrinsics_dx11_hlsl_array[0]);
+ i++)
+ ret += ags_shader_intrinsics_dx11_hlsl_array[i];
+ return ret;
+}
diff --git a/util/test/demos/3rdparty/ags/ags_shader_intrinsics_dx12.hlsl.h b/util/test/demos/3rdparty/ags/ags_shader_intrinsics_dx12.hlsl.h
new file mode 100644
index 000000000..08f027040
--- /dev/null
+++ b/util/test/demos/3rdparty/ags/ags_shader_intrinsics_dx12.hlsl.h
@@ -0,0 +1,4031 @@
+static const char *ags_shader_intrinsics_dx12_hlsl_array[] = {R"EOSHADER(
+//
+// Copyright (c) 2020 Advanced Micro Devices, Inc. All rights reserved.
+//
+// Permission is hereby granted, free of charge, to any person obtaining a copy
+// of this software and associated documentation files (the "Software"), to deal
+// in the Software without restriction, including without limitation the rights
+// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+// copies of the Software, and to permit persons to whom the Software is
+// furnished to do so, subject to the following conditions:
+//
+// The above copyright notice and this permission notice shall be included in
+// all copies or substantial portions of the Software.
+//
+// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+// THE SOFTWARE.
+//
+
+/**
+***********************************************************************************************************************
+* @file ags_shader_intrinsics_dx12.hlsl
+* @brief
+* AMD D3D Shader Intrinsics HLSL include file.
+* This include file contains the Shader Intrinsics definitions used in shader code by the application.
+* @note
+* This does not work with immediate values or values that the compiler determines can produce denorms.
+*
+***********************************************************************************************************************
+*/
+
+#ifndef _AMDEXTD3DSHADERINTRINICS_HLSL
+#define _AMDEXTD3DSHADERINTRINICS_HLSL
+
+// Default AMD shader intrinsics designated SpaceId.
+#define AmdExtD3DShaderIntrinsicsSpaceId space2147420894
+
+// Dummy UAV used to access shader intrinsics. Applications need to add a root signature entry for this resource in
+// order to use shader extensions. Applications may specify an alternate UAV binding by defining
+// AMD_EXT_SHADER_INTRINSIC_UAV_OVERRIDE. The application must also call IAmdExtD3DShaderIntrinsics1::SetExtensionUavBinding()
+// in order to use an alternate binding. This must be done before creating the root signature and pipeline.
+#ifdef AMD_EXT_SHADER_INTRINSIC_UAV_OVERRIDE
+RWByteAddressBuffer AmdExtD3DShaderIntrinsicsUAV : register(AMD_EXT_SHADER_INTRINSIC_UAV_OVERRIDE);
+#else
+RWByteAddressBuffer AmdExtD3DShaderIntrinsicsUAV : register(u0, AmdExtD3DShaderIntrinsicsSpaceId);
+#endif
+
+/**
+***********************************************************************************************************************
+* Definitions to construct the intrinsic instruction composed of an opcode and optional immediate data.
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsics_MagicCodeShift 28
+#define AmdExtD3DShaderIntrinsics_MagicCodeMask 0xf
+#define AmdExtD3DShaderIntrinsics_OpcodePhaseShift 24
+#define AmdExtD3DShaderIntrinsics_OpcodePhaseMask 0x3
+#define AmdExtD3DShaderIntrinsics_DataShift 8
+#define AmdExtD3DShaderIntrinsics_DataMask 0xffff
+#define AmdExtD3DShaderIntrinsics_OpcodeShift 0
+#define AmdExtD3DShaderIntrinsics_OpcodeMask 0xff
+
+#define AmdExtD3DShaderIntrinsics_MagicCode 0x5
+
+
+/**
+***********************************************************************************************************************
+* Intrinsic opcodes.
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsicsOpcode_Readfirstlane 0x01
+#define AmdExtD3DShaderIntrinsicsOpcode_Readlane 0x02
+#define AmdExtD3DShaderIntrinsicsOpcode_LaneId 0x03
+#define AmdExtD3DShaderIntrinsicsOpcode_Swizzle 0x04
+#define AmdExtD3DShaderIntrinsicsOpcode_Ballot 0x05
+#define AmdExtD3DShaderIntrinsicsOpcode_MBCnt 0x06
+#define AmdExtD3DShaderIntrinsicsOpcode_Min3U 0x07
+#define AmdExtD3DShaderIntrinsicsOpcode_Min3F 0x08
+#define AmdExtD3DShaderIntrinsicsOpcode_Med3U 0x09
+#define AmdExtD3DShaderIntrinsicsOpcode_Med3F 0x0a
+#define AmdExtD3DShaderIntrinsicsOpcode_Max3U 0x0b
+#define AmdExtD3DShaderIntrinsicsOpcode_Max3F 0x0c
+#define AmdExtD3DShaderIntrinsicsOpcode_BaryCoord 0x0d
+#define AmdExtD3DShaderIntrinsicsOpcode_VtxParam 0x0e
+#define AmdExtD3DShaderIntrinsicsOpcode_Reserved1 0x0f
+#define AmdExtD3DShaderIntrinsicsOpcode_Reserved2 0x10
+#define AmdExtD3DShaderIntrinsicsOpcode_Reserved3 0x11
+#define AmdExtD3DShaderIntrinsicsOpcode_WaveReduce 0x12
+#define AmdExtD3DShaderIntrinsicsOpcode_WaveScan 0x13
+#define AmdExtD3DShaderIntrinsicsOpcode_LoadDwAtAddr 0x14
+#define AmdExtD3DShaderIntrinsicsOpcode_DrawIndex 0x17
+#define AmdExtD3DShaderIntrinsicsOpcode_AtomicU64 0x18
+#define AmdExtD3DShaderIntrinsicsOpcode_GetWaveSize 0x19
+#define AmdExtD3DShaderIntrinsicsOpcode_BaseInstance 0x1a
+#define AmdExtD3DShaderIntrinsicsOpcode_BaseVertex 0x1b
+#define AmdExtD3DShaderIntrinsicsOpcode_FloatConversion 0x1c
+#define AmdExtD3DShaderIntrinsicsOpcode_ReadlaneAt 0x1d
+
+/**
+***********************************************************************************************************************
+* Intrinsic opcode phases.
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsicsOpcodePhase_0 0x0
+#define AmdExtD3DShaderIntrinsicsOpcodePhase_1 0x1
+#define AmdExtD3DShaderIntrinsicsOpcodePhase_2 0x2
+#define AmdExtD3DShaderIntrinsicsOpcodePhase_3 0x3
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsicsWaveOp defines for supported operations. Can be used as the parameter for the
+* AmdExtD3DShaderIntrinsicsOpcode_WaveReduce and AmdExtD3DShaderIntrinsicsOpcode_WaveScan intrinsics.
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsicsWaveOp_AddF 0x01
+#define AmdExtD3DShaderIntrinsicsWaveOp_AddI 0x02
+#define AmdExtD3DShaderIntrinsicsWaveOp_AddU 0x03
+#define AmdExtD3DShaderIntrinsicsWaveOp_MulF 0x04
+#define AmdExtD3DShaderIntrinsicsWaveOp_MulI 0x05
+#define AmdExtD3DShaderIntrinsicsWaveOp_MulU 0x06
+#define AmdExtD3DShaderIntrinsicsWaveOp_MinF 0x07
+#define AmdExtD3DShaderIntrinsicsWaveOp_MinI 0x08
+#define AmdExtD3DShaderIntrinsicsWaveOp_MinU 0x09
+#define AmdExtD3DShaderIntrinsicsWaveOp_MaxF 0x0a
+#define AmdExtD3DShaderIntrinsicsWaveOp_MaxI 0x0b
+#define AmdExtD3DShaderIntrinsicsWaveOp_MaxU 0x0c
+#define AmdExtD3DShaderIntrinsicsWaveOp_And 0x0d // Reduction only
+#define AmdExtD3DShaderIntrinsicsWaveOp_Or 0x0e // Reduction only
+#define AmdExtD3DShaderIntrinsicsWaveOp_Xor 0x0f // Reduction only
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsicsWaveOp masks and shifts for opcode and flags
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift 0
+#define AmdExtD3DShaderIntrinsicsWaveOp_OpcodeMask 0xff
+#define AmdExtD3DShaderIntrinsicsWaveOp_FlagShift 8
+#define AmdExtD3DShaderIntrinsicsWaveOp_FlagMask 0xff
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsicsWaveOp flags for use with AmdExtD3DShaderIntrinsicsOpcode_WaveScan.
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsicsWaveOp_Inclusive 0x01
+#define AmdExtD3DShaderIntrinsicsWaveOp_Exclusive 0x02
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsicsSwizzle defines for common swizzles. Can be used as the operation parameter for the
+* AmdExtD3DShaderIntrinsics_Swizzle intrinsic.
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsicsSwizzle_SwapX1 0x041f
+#define AmdExtD3DShaderIntrinsicsSwizzle_SwapX2 0x081f
+#define AmdExtD3DShaderIntrinsicsSwizzle_SwapX4 0x101f
+#define AmdExtD3DShaderIntrinsicsSwizzle_SwapX8 0x201f
+#define AmdExtD3DShaderIntrinsicsSwizzle_SwapX16 0x401f
+#define AmdExtD3DShaderIntrinsicsSwizzle_ReverseX2 0x041f
+#define AmdExtD3DShaderIntrinsicsSwizzle_ReverseX4 0x0c1f
+#define AmdExtD3DShaderIntrinsicsSwizzle_ReverseX8 0x1c1f
+#define AmdExtD3DShaderIntrinsicsSwizzle_ReverseX16 0x3c1f
+#define AmdExtD3DShaderIntrinsicsSwizzle_ReverseX32 0x7c1f
+#define AmdExtD3DShaderIntrinsicsSwizzle_BCastX2 0x003e
+#define AmdExtD3DShaderIntrinsicsSwizzle_BCastX4 0x003c
+#define AmdExtD3DShaderIntrinsicsSwizzle_BCastX8 0x0038
+#define AmdExtD3DShaderIntrinsicsSwizzle_BCastX16 0x0030
+#define AmdExtD3DShaderIntrinsicsSwizzle_BCastX32 0x0020
+
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsicsBarycentric defines for barycentric interpolation mode. To be used with
+* AmdExtD3DShaderIntrinsicsOpcode_BaryCoord to specify the interpolation mode.
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsicsBarycentric_LinearCenter 0x1
+#define AmdExtD3DShaderIntrinsicsBarycentric_LinearCentroid 0x2
+#define AmdExtD3DShaderIntrinsicsBarycentric_LinearSample 0x3
+#define AmdExtD3DShaderIntrinsicsBarycentric_PerspCenter 0x4
+#define AmdExtD3DShaderIntrinsicsBarycentric_PerspCentroid 0x5
+#define AmdExtD3DShaderIntrinsicsBarycentric_PerspSample 0x6
+#define AmdExtD3DShaderIntrinsicsBarycentric_PerspPullModel 0x7
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsicsBarycentric defines for specifying vertex and parameter indices. To be used as inputs to
+* the AmdExtD3DShaderIntrinsicsOpcode_VtxParam intrinsic.
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsicsBarycentric_Vertex0 0x0
+#define AmdExtD3DShaderIntrinsicsBarycentric_Vertex1 0x1
+#define AmdExtD3DShaderIntrinsicsBarycentric_Vertex2 0x2
+
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param0 0x00
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param1 0x01
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param2 0x02
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param3 0x03
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param4 0x04
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param5 0x05
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param6 0x06
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param7 0x07
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param8 0x08
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param9 0x09
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param10 0x0a
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param11 0x0b
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param12 0x0c
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param13 0x0d
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param14 0x0e
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param15 0x0f
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param16 0x10
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param17 0x11
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param18 0x12
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param19 0x13
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param20 0x14
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param21 0x15
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param22 0x16
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param23 0x17
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param24 0x18
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param25 0x19
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param26 0x1a
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param27 0x1b
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param28 0x1c
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param29 0x1d
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param30 0x1e
+#define AmdExtD3DShaderIntrinsicsBarycentric_Param31 0x1f
+
+#define AmdExtD3DShaderIntrinsicsBarycentric_ComponentX 0x0
+#define AmdExtD3DShaderIntrinsicsBarycentric_ComponentY 0x1
+#define AmdExtD3DShaderIntrinsicsBarycentric_ComponentZ 0x2
+#define AmdExtD3DShaderIntrinsicsBarycentric_ComponentW 0x3
+
+#define AmdExtD3DShaderIntrinsicsBarycentric_ParamShift 0
+#define AmdExtD3DShaderIntrinsicsBarycentric_ParamMask 0x1f
+#define AmdExtD3DShaderIntrinsicsBarycentric_VtxShift 0x5
+#define AmdExtD3DShaderIntrinsicsBarycentric_VtxMask 0x3
+#define AmdExtD3DShaderIntrinsicsBarycentric_ComponentShift 0x7
+#define AmdExtD3DShaderIntrinsicsBarycentric_ComponentMask 0x3
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsicsAtomic defines for supported operations. Can be used as the parameter for the
+* AmdExtD3DShaderIntrinsicsOpcode_AtomicU64 intrinsic.
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsicsAtomicOp_MinU64 0x01
+#define AmdExtD3DShaderIntrinsicsAtomicOp_MaxU64 0x02
+#define AmdExtD3DShaderIntrinsicsAtomicOp_AndU64 0x03
+#define AmdExtD3DShaderIntrinsicsAtomicOp_OrU64 0x04
+#define AmdExtD3DShaderIntrinsicsAtomicOp_XorU64 0x05
+#define AmdExtD3DShaderIntrinsicsAtomicOp_AddU64 0x06
+#define AmdExtD3DShaderIntrinsicsAtomicOp_XchgU64 0x07
+#define AmdExtD3DShaderIntrinsicsAtomicOp_CmpXchgU64 0x08
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsicsFloatConversion defines for supported rounding modes for float to float16 conversions.
+* To be used as an input to the AmdExtD3DShaderIntrinsicsOpcode_FloatConversion instruction.
+***********************************************************************************************************************
+*/
+#define AmdExtD3DShaderIntrinsicsFloatConversionOp_FToF16Near 0x01
+#define AmdExtD3DShaderIntrinsicsFloatConversionOp_FToF16NegInf 0x02
+#define AmdExtD3DShaderIntrinsicsFloatConversionOp_FToF16PlusInf 0x03
+
+
+/**
+***********************************************************************************************************************
+* MakeAmdShaderIntrinsicsInstruction
+*
+* Creates instruction from supplied opcode and immediate data.
+* NOTE: This is an internal function and should not be called by the source HLSL shader directly.
+*
+***********************************************************************************************************************
+*/
+uint MakeAmdShaderIntrinsicsInstruction(uint opcode, uint opcodePhase, uint immediateData)
+{
+ return ((AmdExtD3DShaderIntrinsics_MagicCode << AmdExtD3DShaderIntrinsics_MagicCodeShift) |
+ (immediateData << AmdExtD3DShaderIntrinsics_DataShift) |
+ (opcodePhase << AmdExtD3DShaderIntrinsics_OpcodePhaseShift) |
+ (opcode << AmdExtD3DShaderIntrinsics_OpcodeShift));
+}
+
+)EOSHADER", R"EOSHADER(
+
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_ReadfirstlaneF
+*
+* Returns the value of float src for the first active lane of the wavefront.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Readfirstlane) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_ReadfirstlaneF(float src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Readfirstlane, 0, 0);
+
+ uint retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+ return asfloat(retVal);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_ReadfirstlaneU
+*
+* Returns the value of unsigned integer src for the first active lane of the wavefront.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Readfirstlane) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_ReadfirstlaneU(uint src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Readfirstlane, 0, 0);
+
+ uint retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src, 0, retVal);
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_ReadlaneF
+*
+* Returns the value of float src for the lane within the wavefront specified by laneId.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Readlane) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_ReadlaneF(float src, uint laneId)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Readlane, 0, laneId);
+
+ uint retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+ return asfloat(retVal);
+}
+
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_ReadlaneU
+*
+* Returns the value of unsigned integer src for the lane within the wavefront specified by laneId.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Readlane) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_ReadlaneU(uint src, uint laneId)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Readlane, 0, laneId);
+
+ uint retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src, 0, retVal);
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_LaneId
+*
+* Returns the current lane id for the thread within the wavefront.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_LaneId) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_LaneId()
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_LaneId, 0, 0);
+
+ uint retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_GetWaveSize
+*
+* Returns the wave size for the current shader, including active, inactive and helper lanes.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_GetWaveSize) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_GetWaveSize()
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_GetWaveSize, 0, 0);
+
+ uint retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_SwizzleF
+*
+* Generic instruction to shuffle the float src value among different lanes as specified by the operation.
+* Note that the operation parameter must be an immediate (literal) value, not a value read from a variable.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Swizzle) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_SwizzleF(float src, uint operation)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Swizzle, 0, operation);
+
+ uint retVal;
+ //InterlockedCompareExchange(AmdExtD3DShaderIntrinsicsUAV[instruction], asuint(src), 0, retVal);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+ return asfloat(retVal);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_SwizzleU
+*
+* Generic instruction to shuffle the unsigned integer src value among different lanes as specified by the operation.
+* Note that the operation parameter must be an immediate (literal) value, not a value read from a variable.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Swizzle) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_SwizzleU(uint src, uint operation)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Swizzle, 0, operation);
+
+ uint retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src, 0, retVal);
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_Ballot
+*
+* Given an input predicate, returns a bit mask indicating the lanes for which the predicate is true.
+* Inactive or non-existent lanes will always return 0. The number of existent lanes is the wavefront size.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Ballot) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_Ballot(bool predicate)
+{
+ uint instruction;
+
+ uint retVal1;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Ballot,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0, 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, predicate, 0, retVal1);
+
+ uint retVal2;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Ballot,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_1, 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, predicate, 0, retVal2);
+
+ return uint2(retVal1, retVal2);
+}
+
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_BallotAny
+*
+* Convenience routine that uses Ballot and returns true if the predicate is true for any of the active lanes.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Ballot) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+bool AmdExtD3DShaderIntrinsics_BallotAny(bool predicate)
+{
+ uint2 retVal = AmdExtD3DShaderIntrinsics_Ballot(predicate);
+
+ return ((retVal.x | retVal.y) != 0);
+}
+
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_BallotAll
+*
+* Convenience routine that uses Ballot and returns true if the predicate is true for all of the active lanes.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Ballot) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+bool AmdExtD3DShaderIntrinsics_BallotAll(bool predicate)
+{
+ uint2 ballot = AmdExtD3DShaderIntrinsics_Ballot(predicate);
+
+ uint2 execMask = AmdExtD3DShaderIntrinsics_Ballot(true);
+
+ return ((ballot.x == execMask.x) && (ballot.y == execMask.y));
+}
+
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_MBCnt
+*
+* Returns the masked bit count of the source register for this thread within all the active threads within a
+* wavefront.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_MBCnt) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_MBCnt(uint2 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_MBCnt, 0, 0);
+
+ uint retVal;
+
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.x, src.y, retVal);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_Min3F
+*
+* Returns the minimum value of the three floating point source arguments.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_Min3F(float src0, float src1, float src2)
+{
+ uint minimum;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Min3F,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, asuint(src0), asuint(src1), minimum);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Min3F,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, asuint(src2), minimum, minimum);
+
+ return asfloat(minimum);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_Min3U
+*
+* Returns the minimum value of the three unsigned integer source arguments.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_Min3U(uint src0, uint src1, uint src2)
+{
+ uint minimum;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Min3U,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, src0, src1, minimum);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Min3U,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, src2, minimum, minimum);
+
+ return minimum;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_Med3F
+*
+* Returns the median value of the three floating point source arguments.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_Med3F(float src0, float src1, float src2)
+{
+ uint median;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Med3F,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, asuint(src0), asuint(src1), median);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Med3F,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, asuint(src2), median, median);
+
+ return asfloat(median);
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_Med3U
+*
+* Returns the median value of the three unsigned integer source arguments.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_Med3U(uint src0, uint src1, uint src2)
+{
+ uint median;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Med3U,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, src0, src1, median);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Med3U,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, src2, median, median);
+
+ return median;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_Max3F
+*
+* Returns the maximum value of the three floating point source arguments.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_Max3F(float src0, float src1, float src2)
+{
+ uint maximum;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Max3F,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, asuint(src0), asuint(src1), maximum);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Max3F,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, asuint(src2), maximum, maximum);
+
+ return asfloat(maximum);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_Max3U
+*
+* Returns the maximum value of the three unsigned integer source arguments.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_Compare3) returned S_OK.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_Max3U(uint src0, uint src1, uint src2)
+{
+ uint maximum;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Max3U,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, src0, src1, maximum);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_Max3U,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_1,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, src2, maximum, maximum);
+
+ return maximum;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_IjBarycentricCoords
+*
+* Returns the (i, j) barycentric coordinate pair for this shader invocation with the specified interpolation mode at
+* the specified pixel location. Should not be used for "pull-model" interpolation; use
+* AmdExtD3DShaderIntrinsics_PullModelBarycentricCoords instead.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_BaryCoord) returned S_OK.
+*
+* Can only be used in pixel shader stages.
+*
+***********************************************************************************************************************
+*/
+float2 AmdExtD3DShaderIntrinsics_IjBarycentricCoords(uint interpMode)
+{
+ uint2 retVal;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_BaryCoord,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ interpMode);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, 0, 0, retVal.x);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_BaryCoord,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_1,
+ interpMode);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, retVal.x, 0, retVal.y);
+
+ return float2(asfloat(retVal.x), asfloat(retVal.y));
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_PullModelBarycentricCoords
+*
+* Returns the (1/W,1/I,1/J) coordinates at the pixel center which can be used for custom interpolation at any
+* location in the pixel.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_BaryCoord) returned S_OK.
+*
+* Can only be used in pixel shader stages.
+*
+***********************************************************************************************************************
+*/
+float3 AmdExtD3DShaderIntrinsics_PullModelBarycentricCoords()
+{
+ uint3 retVal;
+
+ uint instruction1 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_BaryCoord,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ AmdExtD3DShaderIntrinsicsBarycentric_PerspPullModel);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction1, 0, 0, retVal.x);
+
+ uint instruction2 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_BaryCoord,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_1,
+ AmdExtD3DShaderIntrinsicsBarycentric_PerspPullModel);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction2, retVal.x, 0, retVal.y);
+
+ uint instruction3 = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_BaryCoord,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_2,
+ AmdExtD3DShaderIntrinsicsBarycentric_PerspPullModel);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction3, retVal.y, 0, retVal.z);
+
+ return float3(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z));
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_VertexParameter
+*
+* Returns the triangle's parameter information at the specified triangle vertex.
+* The vertex and parameter indices must be specified as immediate values.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_VtxParam) returned S_OK.
+*
+* Only available in pixel shader stages.
+*
+***********************************************************************************************************************
+*/
+float4 AmdExtD3DShaderIntrinsics_VertexParameter(uint vertexIdx, uint parameterIdx)
+{
+ uint4 retVal;
+ uint4 instruction;
+
+ instruction.x = MakeAmdShaderIntrinsicsInstruction(
+ AmdExtD3DShaderIntrinsicsOpcode_VtxParam,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ ((vertexIdx << AmdExtD3DShaderIntrinsicsBarycentric_VtxShift) |
+ (parameterIdx << AmdExtD3DShaderIntrinsicsBarycentric_ParamShift) |
+ (AmdExtD3DShaderIntrinsicsBarycentric_ComponentX << AmdExtD3DShaderIntrinsicsBarycentric_ComponentShift)));
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction.x, 0, 0, retVal.x);
+
+ instruction.y = MakeAmdShaderIntrinsicsInstruction(
+ AmdExtD3DShaderIntrinsicsOpcode_VtxParam,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ ((vertexIdx << AmdExtD3DShaderIntrinsicsBarycentric_VtxShift) |
+ (parameterIdx << AmdExtD3DShaderIntrinsicsBarycentric_ParamShift) |
+ (AmdExtD3DShaderIntrinsicsBarycentric_ComponentY << AmdExtD3DShaderIntrinsicsBarycentric_ComponentShift)));
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction.y, 0, 0, retVal.y);
+
+ instruction.z = MakeAmdShaderIntrinsicsInstruction(
+ AmdExtD3DShaderIntrinsicsOpcode_VtxParam,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ ((vertexIdx << AmdExtD3DShaderIntrinsicsBarycentric_VtxShift) |
+ (parameterIdx << AmdExtD3DShaderIntrinsicsBarycentric_ParamShift) |
+ (AmdExtD3DShaderIntrinsicsBarycentric_ComponentZ << AmdExtD3DShaderIntrinsicsBarycentric_ComponentShift)));
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction.z, 0, 0, retVal.z);
+
+ instruction.w = MakeAmdShaderIntrinsicsInstruction(
+ AmdExtD3DShaderIntrinsicsOpcode_VtxParam,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ ((vertexIdx << AmdExtD3DShaderIntrinsicsBarycentric_VtxShift) |
+ (parameterIdx << AmdExtD3DShaderIntrinsicsBarycentric_ParamShift) |
+ (AmdExtD3DShaderIntrinsicsBarycentric_ComponentW << AmdExtD3DShaderIntrinsicsBarycentric_ComponentShift)));
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction.w, 0, 0, retVal.w);
+
+ return float4(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z), asfloat(retVal.w));
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_VertexParameterComponent
+*
+* Returns the triangle's parameter information at the specified triangle vertex and component.
+* The vertex, parameter and component indices must be specified as immediate values.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_VtxParam) returned S_OK.
+*
+* Only available in pixel shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_VertexParameterComponent(uint vertexIdx, uint parameterIdx, uint componentIdx)
+{
+ uint retVal;
+ uint instruction =
+ MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_VtxParam,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ ((vertexIdx << AmdExtD3DShaderIntrinsicsBarycentric_VtxShift) |
+ (parameterIdx << AmdExtD3DShaderIntrinsicsBarycentric_ParamShift) |
+ (componentIdx << AmdExtD3DShaderIntrinsicsBarycentric_ComponentShift)));
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+
+ return asfloat(retVal);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveReduce
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_WaveReduce) returned S_OK.
+*
+* Performs a reduction operation on wavefront (thread group) data.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveReduce : float
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_WaveReduce(uint waveOp, float src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift));
+ uint retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+
+ return asfloat(retVal);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveReduce : float2
+***********************************************************************************************************************
+*/
+float2 AmdExtD3DShaderIntrinsics_WaveReduce(uint waveOp, float2 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift));
+
+ uint2 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+
+ return float2(asfloat(retVal.x), asfloat(retVal.y));
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveReduce : float3
+***********************************************************************************************************************
+*/
+float3 AmdExtD3DShaderIntrinsics_WaveReduce(uint waveOp, float3 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift));
+
+ uint3 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.z), 0, retVal.z);
+
+ return float3(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z));
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveReduce : float4
+***********************************************************************************************************************
+*/
+float4 AmdExtD3DShaderIntrinsics_WaveReduce(uint waveOp, float4 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift));
+
+ uint4 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.z), 0, retVal.z);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.w), 0, retVal.w);
+
+ return float4(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z), asfloat(retVal.w));
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveReduce : int
+***********************************************************************************************************************
+*/
+int AmdExtD3DShaderIntrinsics_WaveReduce(uint waveOp, int src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift));
+
+ int retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src, 0, retVal);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveReduce : int2
+***********************************************************************************************************************
+*/
+int2 AmdExtD3DShaderIntrinsics_WaveReduce(uint waveOp, int2 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift));
+
+ int2 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.x, 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.y, 0, retVal.y);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveReduce : int3
+***********************************************************************************************************************
+*/
+int3 AmdExtD3DShaderIntrinsics_WaveReduce(uint waveOp, int3 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift));
+
+ int3 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.x, 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.y, 0, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.z, 0, retVal.z);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveReduce : int4
+***********************************************************************************************************************
+*/
+int4 AmdExtD3DShaderIntrinsics_WaveReduce(uint waveOp, int4 src)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift));
+
+ int4 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.x, 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.y, 0, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.z, 0, retVal.z);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.w, 0, retVal.w);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveScan
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_WaveScan) returned S_OK.
+*
+* Performs a scan operation on wavefront (thread group) data.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveScan : float
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_WaveScan(uint waveOp, uint flags, float src)
+{
+ const uint waveScanOp = (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdExtD3DShaderIntrinsicsWaveOp_FlagShift);
+
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveScan,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ waveScanOp);
+ uint retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), 0, retVal);
+
+ return asfloat(retVal);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveScan : float2
+***********************************************************************************************************************
+*/
+float2 AmdExtD3DShaderIntrinsics_WaveScan(uint waveOp, uint flags, float2 src)
+{
+ const uint waveScanOp = (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdExtD3DShaderIntrinsicsWaveOp_FlagShift);
+
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveScan,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ waveScanOp);
+
+ uint2 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+
+ return float2(asfloat(retVal.x), asfloat(retVal.y));
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveScan : float3
+***********************************************************************************************************************
+*/
+float3 AmdExtD3DShaderIntrinsics_WaveScan(uint waveOp, uint flags, float3 src)
+{
+ const uint waveScanOp = (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdExtD3DShaderIntrinsicsWaveOp_FlagShift);
+
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveScan,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ waveScanOp);
+
+ uint3 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.z), 0, retVal.z);
+
+ return float3(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z));
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveScan : float4
+***********************************************************************************************************************
+*/
+float4 AmdExtD3DShaderIntrinsics_WaveScan(uint waveOp, uint flags, float4 src)
+{
+ const uint waveScanOp = (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdExtD3DShaderIntrinsicsWaveOp_FlagShift);
+
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveScan,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ waveScanOp);
+
+ uint4 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.x), 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.y), 0, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.z), 0, retVal.z);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src.w), 0, retVal.w);
+
+ return float4(asfloat(retVal.x), asfloat(retVal.y), asfloat(retVal.z), asfloat(retVal.w));
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveScan : int
+***********************************************************************************************************************
+*/
+int AmdExtD3DShaderIntrinsics_WaveScan(uint waveOp, uint flags, int src)
+{
+ const uint waveScanOp = (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdExtD3DShaderIntrinsicsWaveOp_FlagShift);
+
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveScan,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ waveScanOp);
+
+ int retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src, 0, retVal);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveScan : int2
+***********************************************************************************************************************
+*/
+int2 AmdExtD3DShaderIntrinsics_WaveScan(uint waveOp, uint flags, int2 src)
+{
+ const uint waveScanOp = (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdExtD3DShaderIntrinsicsWaveOp_FlagShift);
+
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveScan,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ waveScanOp);
+
+ int2 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.x, 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.y, 0, retVal.y);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveScan : int3
+***********************************************************************************************************************
+*/
+int3 AmdExtD3DShaderIntrinsics_WaveScan(uint waveOp, uint flags, int3 src)
+{
+ const uint waveScanOp = (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdExtD3DShaderIntrinsicsWaveOp_FlagShift);
+
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveScan,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ waveScanOp);
+
+ int3 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.x, 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.y, 0, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.z, 0, retVal.z);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveScan : int4
+***********************************************************************************************************************
+*/
+int4 AmdExtD3DShaderIntrinsics_WaveScan(uint waveOp, uint flags, int4 src)
+{
+ const uint waveScanOp = (waveOp << AmdExtD3DShaderIntrinsicsWaveOp_OpcodeShift) |
+ (flags << AmdExtD3DShaderIntrinsicsWaveOp_FlagShift);
+
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_WaveScan,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ waveScanOp);
+
+ int4 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.x, 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.y, 0, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.z, 0, retVal.z);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src.w, 0, retVal.w);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_LoadDwordAtAddr
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_LoadDwAtAddr) returned S_OK.
+*
+* Loads a DWORD from GPU memory from a given 64-bit GPU VA and 32-bit offset.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_LoadDwordAtAddr
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_LoadDwordAtAddr(uint gpuVaLoBits, uint gpuVaHiBits, uint offset)
+{
+ uint retVal;
+
+ uint instruction;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_LoadDwAtAddr,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, gpuVaLoBits, gpuVaHiBits, retVal);
+
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_LoadDwAtAddr,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_1,
+ 0);
+
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, offset, 0, retVal);
+
+ return retVal;
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_LoadDwordAtAddrx2
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_LoadDwordAtAddrx2(uint gpuVaLoBits, uint gpuVaHiBits, uint offset)
+{
+ uint2 retVal;
+
+ retVal.x = AmdExtD3DShaderIntrinsics_LoadDwordAtAddr(gpuVaLoBits, gpuVaHiBits, offset);
+ retVal.y = AmdExtD3DShaderIntrinsics_LoadDwordAtAddr(gpuVaLoBits, gpuVaHiBits, offset + 0x4);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_LoadDwordAtAddrx4
+***********************************************************************************************************************
+*/
+uint4 AmdExtD3DShaderIntrinsics_LoadDwordAtAddrx4(uint gpuVaLoBits, uint gpuVaHiBits, uint offset)
+{
+ uint4 retVal;
+
+ retVal.x = AmdExtD3DShaderIntrinsics_LoadDwordAtAddr(gpuVaLoBits, gpuVaHiBits, offset);
+ retVal.y = AmdExtD3DShaderIntrinsics_LoadDwordAtAddr(gpuVaLoBits, gpuVaHiBits, offset + 0x4);
+ retVal.z = AmdExtD3DShaderIntrinsics_LoadDwordAtAddr(gpuVaLoBits, gpuVaHiBits, offset + 0x8);
+ retVal.w = AmdExtD3DShaderIntrinsics_LoadDwordAtAddr(gpuVaLoBits, gpuVaHiBits, offset + 0xC);
+
+ return retVal;
+}
+
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_GetDrawIndex
+*
+* The following function is available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_GetDrawIndex) returned S_OK.
+*
+* Returns the 0-based draw index in an indirect draw. Always returns 0 for direct draws.
+*
+* Available in vertex shader stage only.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_GetDrawIndex()
+{
+ uint retVal;
+
+ uint instruction;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_DrawIndex,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_GetBaseInstance
+*
+* The following function is available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_BaseInstance) returned S_OK.
+*
+* Returns the StartInstanceLocation parameter passed to direct or indirect drawing commands.
+*
+* Available in vertex shader stage only.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_GetBaseInstance()
+{
+ uint retVal;
+
+ uint instruction;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_BaseInstance,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_GetBaseVertex
+*
+* The following function is available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_BaseVertex) returned S_OK.
+*
+* For non-indexed draw commands, returns the StartVertexLocation parameter. For indexed draw commands, returns the
+* BaseVertexLocation parameter.
+*
+* Available in vertex shader stage only.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_GetBaseVertex()
+{
+ uint retVal;
+
+ uint instruction;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_BaseVertex,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, 0, 0, retVal);
+
+ return retVal;
+}
+
+
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_ReadlaneAt : uint
+*
+* The following function is available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_ReadlaneAt) returned S_OK.
+*
+* Returns the value of the source for the given lane index within the specified wave. The lane index
+* can be non-uniform across the wave.
+*
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_ReadlaneAt(uint src, uint laneId)
+{
+ uint retVal;
+
+ uint instruction;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_ReadlaneAt,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, src, laneId, retVal);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_ReadlaneAt : int
+***********************************************************************************************************************
+*/
+int AmdExtD3DShaderIntrinsics_ReadlaneAt(int src, uint laneId)
+{
+ uint retVal;
+
+ uint instruction;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_ReadlaneAt,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), laneId, retVal);
+
+ return asint(retVal);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_ReadlaneAt : float
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_ReadlaneAt(float src, uint laneId)
+{
+ uint retVal;
+
+ uint instruction;
+ instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_ReadlaneAt,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ 0);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(src), laneId, retVal);
+
+ return asfloat(retVal);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_ConvertF32toF16
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsSupport_FloatConversion) returned
+* S_OK.
+*
+*   Converts 32-bit floating point numbers into 16-bit floating point numbers using a specified rounding mode.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_ConvertF32toF16 - helper to convert f32 to f16 number
+***********************************************************************************************************************
+*/
+uint3 AmdExtD3DShaderIntrinsics_ConvertF32toF16(in uint convOp, in float3 val)
+{
+ uint instruction = MakeAmdShaderIntrinsicsInstruction(AmdExtD3DShaderIntrinsicsOpcode_FloatConversion,
+ AmdExtD3DShaderIntrinsicsOpcodePhase_0,
+ convOp);
+
+ uint3 retVal;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(val.x), 0, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(val.y), 0, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instruction, asuint(val.z), 0, retVal.z);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_ConvertF32toF16Near - convert f32 to f16 number using nearest rounding mode
+***********************************************************************************************************************
+*/
+uint3 AmdExtD3DShaderIntrinsics_ConvertF32toF16Near(in float3 inVec)
+{
+ return AmdExtD3DShaderIntrinsics_ConvertF32toF16(AmdExtD3DShaderIntrinsicsFloatConversionOp_FToF16Near, inVec);
+}
+
+/**
+***********************************************************************************************************************
+*   AmdExtD3DShaderIntrinsics_ConvertF32toF16NegInf - convert f32 to f16 number using -inf rounding mode
+***********************************************************************************************************************
+*/
+uint3 AmdExtD3DShaderIntrinsics_ConvertF32toF16NegInf(in float3 inVec)
+{
+ return AmdExtD3DShaderIntrinsics_ConvertF32toF16(AmdExtD3DShaderIntrinsicsFloatConversionOp_FToF16NegInf, inVec);
+}
+
+/**
+***********************************************************************************************************************
+*   AmdExtD3DShaderIntrinsics_ConvertF32toF16PosInf - convert f32 to f16 number using +inf rounding mode
+***********************************************************************************************************************
+*/
+uint3 AmdExtD3DShaderIntrinsics_ConvertF32toF16PosInf(in float3 inVec)
+{
+ return AmdExtD3DShaderIntrinsics_ConvertF32toF16(AmdExtD3DShaderIntrinsicsFloatConversionOp_FToF16PlusInf, inVec);
+}
+
+
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_MakeAtomicInstructions
+*
+* Creates uint4 with x/y/z/w components containing phase 0/1/2/3 for atomic instructions.
+* NOTE: This is an internal function and should not be called by the source HLSL shader directly.
+*
+***********************************************************************************************************************
+*/
+uint4 AmdExtD3DShaderIntrinsics_MakeAtomicInstructions(uint op)
+{
+ uint4 instructions;
+ instructions.x = MakeAmdShaderIntrinsicsInstruction(
+ AmdExtD3DShaderIntrinsicsOpcode_AtomicU64, AmdExtD3DShaderIntrinsicsOpcodePhase_0, op);
+ instructions.y = MakeAmdShaderIntrinsicsInstruction(
+ AmdExtD3DShaderIntrinsicsOpcode_AtomicU64, AmdExtD3DShaderIntrinsicsOpcodePhase_1, op);
+ instructions.z = MakeAmdShaderIntrinsicsInstruction(
+ AmdExtD3DShaderIntrinsicsOpcode_AtomicU64, AmdExtD3DShaderIntrinsicsOpcodePhase_2, op);
+ instructions.w = MakeAmdShaderIntrinsicsInstruction(
+ AmdExtD3DShaderIntrinsicsOpcode_AtomicU64, AmdExtD3DShaderIntrinsicsOpcodePhase_3, op);
+ return instructions;
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_AtomicOp
+*
+*   Creates intrinsic instructions for the specified atomic op.
+* NOTE: These are internal functions and should not be called by the source HLSL shader directly.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_AtomicOp(RWByteAddressBuffer uav, uint3 address, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdExtD3DShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav.Store(retVal.x, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicOp(RWTexture1D uav, uint3 address, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdExtD3DShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[retVal.x] = retVal.y;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicOp(RWTexture2D uav, uint3 address, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdExtD3DShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[uint2(retVal.x, retVal.x)] = retVal.y;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicOp(RWTexture3D uav, uint3 address, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdExtD3DShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[uint3(retVal.x, retVal.x, retVal.x)] = retVal.y;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicOp(
+ RWByteAddressBuffer uav, uint3 address, uint2 compare_value, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdExtD3DShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav.Store(retVal.x, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, compare_value.x, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.w, compare_value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicOp(
+ RWTexture1D uav, uint3 address, uint2 compare_value, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdExtD3DShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[retVal.x] = retVal.y;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, compare_value.x, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.w, compare_value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicOp(
+ RWTexture2D uav, uint3 address, uint2 compare_value, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdExtD3DShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[uint2(retVal.x, retVal.x)] = retVal.y;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, compare_value.x, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.w, compare_value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicOp(
+ RWTexture3D uav, uint3 address, uint2 compare_value, uint2 value, uint op)
+{
+ uint2 retVal;
+
+ const uint4 instructions = AmdExtD3DShaderIntrinsics_MakeAtomicInstructions(op);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.x, address.x, address.y, retVal.x);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.y, address.z, value.x, retVal.y);
+ uav[uint3(retVal.x, retVal.x, retVal.x)] = retVal.y;
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.z, value.y, compare_value.x, retVal.y);
+ AmdExtD3DShaderIntrinsicsUAV.InterlockedCompareExchange(instructions.w, compare_value.y, retVal.y, retVal.y);
+
+ return retVal;
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_AtomicMinU64
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic minimum of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_AtomicMinU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_MinU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicMinU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_MinU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicMinU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_MinU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicMinU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_MinU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_AtomicMaxU64
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic maximum of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_AtomicMaxU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_MaxU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicMaxU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_MaxU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicMaxU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_MaxU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicMaxU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_MaxU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_AtomicAndU64
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic AND of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_AtomicAndU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_AndU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicAndU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_AndU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicAndU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_AndU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicAndU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_AndU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_AtomicOrU64
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic OR of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_AtomicOrU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_OrU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicOrU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_OrU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicOrU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_OrU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicOrU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_OrU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_AtomicXorU64
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic XOR of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_AtomicXorU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_XorU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicXorU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_XorU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicXorU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_XorU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicXorU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_XorU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+)EOSHADER", R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_AtomicAddU64
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic add of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_AtomicAddU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_AddU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicAddU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_AddU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicAddU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_AddU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicAddU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_AddU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_AtomicXchgU64
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_AtomicU64) returned S_OK.
+*
+* Performs 64-bit atomic exchange of value with the UAV at address, returns the original value.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_AtomicXchgU64(RWByteAddressBuffer uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_XchgU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicXchgU64(RWTexture1D uav, uint address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_XchgU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicXchgU64(RWTexture2D uav, uint2 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_XchgU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicXchgU64(RWTexture3D uav, uint3 address, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_XchgU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), value, op);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_AtomicCmpXchgU64
+*
+* The following functions are available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_AtomicU64) returned S_OK.
+*
+* Atomically compares the 64-bit compare_value with the contents of the UAV at address, stores value if they match,
+* and returns the original contents.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_AtomicCmpXchgU64(
+ RWByteAddressBuffer uav, uint address, uint2 compare_value, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_CmpXchgU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), compare_value, value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicCmpXchgU64(
+ RWTexture1D<uint2> uav, uint address, uint2 compare_value, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_CmpXchgU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address, 0, 0), compare_value, value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicCmpXchgU64(
+ RWTexture2D<uint2> uav, uint2 address, uint2 compare_value, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_CmpXchgU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, 0), compare_value, value, op);
+}
+
+uint2 AmdExtD3DShaderIntrinsics_AtomicCmpXchgU64(
+ RWTexture3D<uint2> uav, uint3 address, uint2 compare_value, uint2 value)
+{
+ const uint op = AmdExtD3DShaderIntrinsicsAtomicOp_CmpXchgU64;
+ return AmdExtD3DShaderIntrinsics_AtomicOp(uav, uint3(address.x, address.y, address.z), compare_value, value, op);
+}
+
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+*
+* Performs a reduction across a wave and returns the result (the sum of src over all threads in the wave) to all
+* participating lanes.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_WaveActiveSum(float src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+float2 AmdExtD3DShaderIntrinsics_WaveActiveSum(float2 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+float3 AmdExtD3DShaderIntrinsics_WaveActiveSum(float3 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+float4 AmdExtD3DShaderIntrinsics_WaveActiveSum(float4 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+int AmdExtD3DShaderIntrinsics_WaveActiveSum(int src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+int2 AmdExtD3DShaderIntrinsics_WaveActiveSum(int2 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+int3 AmdExtD3DShaderIntrinsics_WaveActiveSum(int3 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+int4 AmdExtD3DShaderIntrinsics_WaveActiveSum(int4 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_WaveActiveSum(uint src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_WaveActiveSum(uint2 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+uint3 AmdExtD3DShaderIntrinsics_WaveActiveSum(uint3 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveSum
+***********************************************************************************************************************
+*/
+uint4 AmdExtD3DShaderIntrinsics_WaveActiveSum(uint4 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_AddU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+*
+* Performs a reduction across a wave and returns the result (the product of src over all threads in the wave) to all
+* participating lanes.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_WaveActiveProduct(float src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+float2 AmdExtD3DShaderIntrinsics_WaveActiveProduct(float2 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+float3 AmdExtD3DShaderIntrinsics_WaveActiveProduct(float3 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+float4 AmdExtD3DShaderIntrinsics_WaveActiveProduct(float4 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+int AmdExtD3DShaderIntrinsics_WaveActiveProduct(int src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+int2 AmdExtD3DShaderIntrinsics_WaveActiveProduct(int2 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulI, src);
+}
+
+)EOSHADER",
+ R"EOSHADER(
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+int3 AmdExtD3DShaderIntrinsics_WaveActiveProduct(int3 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+int4 AmdExtD3DShaderIntrinsics_WaveActiveProduct(int4 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_WaveActiveProduct(uint src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_WaveActiveProduct(uint2 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+uint3 AmdExtD3DShaderIntrinsics_WaveActiveProduct(uint3 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveProduct
+***********************************************************************************************************************
+*/
+uint4 AmdExtD3DShaderIntrinsics_WaveActiveProduct(uint4 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MulU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+*
+* Performs a reduction across a wave and returns the result (the minimum of src over all threads in the wave) to all
+* participating lanes.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_WaveActiveMin(float src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+float2 AmdExtD3DShaderIntrinsics_WaveActiveMin(float2 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+float3 AmdExtD3DShaderIntrinsics_WaveActiveMin(float3 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+float4 AmdExtD3DShaderIntrinsics_WaveActiveMin(float4 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+int AmdExtD3DShaderIntrinsics_WaveActiveMin(int src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+int2 AmdExtD3DShaderIntrinsics_WaveActiveMin(int2 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+int3 AmdExtD3DShaderIntrinsics_WaveActiveMin(int3 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+int4 AmdExtD3DShaderIntrinsics_WaveActiveMin(int4 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinI, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+uint AmdExtD3DShaderIntrinsics_WaveActiveMin(uint src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+uint2 AmdExtD3DShaderIntrinsics_WaveActiveMin(uint2 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+uint3 AmdExtD3DShaderIntrinsics_WaveActiveMin(uint3 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMin
+***********************************************************************************************************************
+*/
+uint4 AmdExtD3DShaderIntrinsics_WaveActiveMin(uint4 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MinU, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMax
+*
+* Performs a reduction across a wave and returns the result (the maximum of src over all threads in the wave) to all
+* participating lanes.
+*
+* Available if CheckSupport(AmdExtD3DShaderIntrinsicsOpcode_WaveReduce) returned S_OK.
+*
+* Available in all shader stages.
+*
+***********************************************************************************************************************
+*/
+float AmdExtD3DShaderIntrinsics_WaveActiveMax(float src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MaxF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+float2 AmdExtD3DShaderIntrinsics_WaveActiveMax(float2 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MaxF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+float3 AmdExtD3DShaderIntrinsics_WaveActiveMax(float3 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MaxF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMax
+***********************************************************************************************************************
+*/
+float4 AmdExtD3DShaderIntrinsics_WaveActiveMax(float4 src)
+{
+ return AmdExtD3DShaderIntrinsics_WaveReduce(AmdExtD3DShaderIntrinsicsWaveOp_MaxF, src);
+}
+
+/**
+***********************************************************************************************************************
+* AmdExtD3DShaderIntrinsics_WaveActiveMax