Add a test of AMD shader operations

This commit is contained in:
baldurk
2020-11-04 17:48:49 +00:00
parent 6352786d31
commit 8be0da2ce3
13 changed files with 9787 additions and 23 deletions
+19
@@ -0,0 +1,19 @@
Copyright (c) 2020 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
+121
@@ -0,0 +1,121 @@
# AMD AGS SDK
![AMD AGS SDK](http://gpuopen-librariesandsdks.github.io/media/amd_logo_black.png)
The AMD GPU Services (AGS) library provides software developers with the ability to query AMD GPU software and hardware state information that is not normally available through standard operating systems or graphics APIs. If you are serious about getting the most from your AMD GPU, then AGS can help you harness that power.
AGS includes support for querying graphics driver version info, GPU hardware info, performance metrics, shader extensions and additional extensions supported in the AMD drivers for DirectX 11 and DirectX 12. AGS also provides Crossfire™ (AMD's multi-GPU rendering technology) support and Eyefinity (AMD's multi-display rendering technology) configuration info.
In addition to the library itself, the AGS SDK includes several samples to demonstrate use of the library.
<div>
<a href="https://github.com/GPUOpen-LibrariesAndSDKs/AGS_SDK/releases/latest/"><img src="http://gpuopen-librariesandsdks.github.io/media/latest-release-button.svg" alt="Latest release" title="Latest release"></a>
</div>
### What's new in AGS 6.0
Version 6.0 introduces several new shader intrinsics, namely a DX12 ray tracing hit token for RDNA2 hardware for ray tracing optimisation, ReadLaneAt and explicit float conversions. There is also a change to the initialization API to make sure the AGS dll matches the header and calling code.
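The version check in the new initialization API can be pictured as packing the header's version triple into a single integer that the DLL can compare against its own. The following is a minimal sketch, assuming the layout used by the `AGS_MAKE_VERSION` macro in `amd_ags.h` is `major << 22 | minor << 12 | patch`; verify this against the header you ship. The Python helper names are hypothetical and not part of AGS.

```python
def make_version(major: int, minor: int, patch: int) -> int:
    """Pack a version triple into one integer.

    Assumed layout (check amd_ags.h): major << 22 | minor << 12 | patch.
    """
    return (major << 22) | (minor << 12) | patch


def split_version(packed: int) -> tuple:
    """Undo make_version under the same assumed layout."""
    return (packed >> 22, (packed >> 12) & 0x3FF, packed & 0xFFF)


# The calling code passes the version it was compiled against; a DLL built
# from a different header would see a different integer and can refuse to start.
header_version = make_version(6, 0, 0)
```

Passing a single packed integer keeps the check cheap and lets the library reject a mismatched caller before any other state is created.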
### What's new in AGS 5.4.2
Version 5.4.2 reinstates the sharedMemoryInBytes field which is required when calculating the memory available on APUs.
### What's new in AGS 5.4.1
Version 5.4.1 includes x86 libs and Visual Studio 2019 support. There is also support for base vertex and base instance intrinsics as well as GetWaveSize intrinsics. The samples have been ported from premake to cmake.
### What's new in AGS 5.4
Version 5.4 adds a better description of the GPU architecture for those wishing to fine tune their games for specific code paths. In addition, there are now shader intrinsics for getting the draw index for execute indirect calls as well as support for atomic U64 ops.
Radeon 7 and RDNA GPU core and memory speeds are now returned.
### What's new in AGS 5.3
Version 5.3 adds DirectX 11 deferred context support for our MultiDrawIndirect and UAV overlap extensions, along with a helper function to let your app determine if the installed driver meets your game's minimum driver version requirements. If you're a Vulkan user, you can pair that with our machine readable AMD Vulkan versions database, to get more information about the Vulkan implementation in our client driver.
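The idea behind a minimum driver version check can be sketched as a comparison of dotted version strings. This is only an illustration of the comparison, not the AGS helper's actual implementation; `parse_version` and `meets_minimum` are hypothetical names, and the version strings are taken from the notes at the end of this README.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '16.50.2001' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed: str, required: str) -> bool:
    """Compare component by component, numerically rather than as strings."""
    return parse_version(installed) >= parse_version(required)


ok = meets_minimum("16.50.2001", "16.40.2311")
too_old = meets_minimum("16.9.2", "16.12.1")
```

Comparing integers matters here: as plain strings, `"16.9.2"` would sort after `"16.12.1"` and an outdated driver would pass the check.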
Lastly, there's a new FreeSync 2 gamma 2.2 mode. It uses a 10-bit (per RGB component, 2-bit alpha) swapchain, as opposed to the 16-bit (per RGB component, 16-bit alpha) swapchain needed for FreeSync 2 scRGB.
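To make the difference between the two swapchain formats concrete, the arithmetic can be worked through in a few lines. This is a rough sketch that only counts bits per pixel for one backbuffer; the 1920x1080 resolution and the helper names are assumptions chosen for illustration.

```python
def bits_per_pixel(rgb_bits: int, alpha_bits: int) -> int:
    """Three colour components plus one alpha component."""
    return 3 * rgb_bits + alpha_bits


def backbuffer_bytes(width: int, height: int, bpp: int) -> int:
    """Size of a single backbuffer, ignoring any driver padding or alignment."""
    return width * height * bpp // 8


gamma22_bpp = bits_per_pixel(10, 2)   # 10:10:10:2 swapchain
scrgb_bpp = bits_per_pixel(16, 16)    # 16:16:16:16 swapchain

gamma22_size = backbuffer_bytes(1920, 1080, gamma22_bpp)
scrgb_size = backbuffer_bytes(1920, 1080, scrgb_bpp)
```

Under these assumptions the gamma 2.2 mode needs half the memory and bandwidth per backbuffer compared with scRGB.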
### What's new in AGS 5.2
Version 5.2 adds support for app registration in DirectX 12. App registration lets you give more information about your game or application to our driver, which can then use that (ideally unique) information to better support the game or app if we need to make driver-side changes to help things run as efficiently and correctly as possible.
We also changed how you get access to extensions under DX12, requiring you to create your GPU device using `agsDriverExtensionsDX12_CreateDevice()`, instead of the normal `D3D12CreateDevice()` call you'd make to D3D.
Lastly, we've also added support for breadcrumb markers in D3D11. Using the `agsDriverExtensionsDX11_WriteBreadcrumb()` API, you can put in place a strategy for debugging driver issues more easily. Sometimes your game or app can interact with the driver in a way that causes it to crash or TDR. The new API gives you the ability to leave markers around your D3D11 API calls, helping you narrow down exactly what interaction with the driver caused the problem.
### What's new in AGS 5.1
Version 5.1 is a partly developer-focused update to AGS 5. We've listened to feedback about how difficult it can be to integrate the binary AGS libs into your games, and while we can't open the source code to AGS to allow you to integrate it from source, we canvassed developers to figure out what pre-built binaries would be most useful to provide. So we've now added builds of the AGS binary library that are linkable with Visual Studio projects built with `/MT` and `/MD`, used to select a particular CRT.
There's also a breaking change in how you get access to DX11 AMD extensions. Look at the Changelog for details on that. We've also added an application registration extension for DX11 apps. That lets you tell the driver that your game can be considered in a uniquely identifiable way, which is particularly helpful if you build on top of popular middleware like Unity or UE4 and make rendering changes.
There's also support for FreeSync 2 HDR, DX12 application user markers for [Radeon GPU Profiler](https://gpuopen.com/gaming-product/radeon-gpu-profiler-rgp/), VS2017 versions of the shipping samples, and new wave-level shader intrinsics for both DX11 and DX12.
See the full Changelog for more details.
### What's new in AGS 5.0
Version 5.0 is a major overhaul of the library designed to provide a much clearer view of the GPUs in the system and the displays attached to them. It also exposes the ability to query each display for HDR capabilities and put those HDR-capable displays into various HDR modes.
Highlights include the following:
* Full GPU enumeration with adapter string, device id, revision id and vendor id.
* Per-GPU display enumeration including information on display name, resolution, and HDR capabilities.
* Optional user-supplied memory allocator.
* Function to set displays into HDR mode.<sup>[1](#ags-sdk-footnote1)</sup>
* A Microsoft WACK compliant version of the library.
* DirectX 11 shader compiler controls.<sup>[1](#ags-sdk-footnote1)</sup>
* DirectX 11 multiview extension.<sup>[2](#ags-sdk-footnote1)</sup>
* DirectX 11 Crossfire API updates.
* Now supports using the API without needing a driver profile.
* You can also now specify the transfer engine.
### Driver extensions
AGS exposes GCN shader extensions for both DirectX 11 and DirectX 12. It also provides access to additional extensions available in the AMD driver for DirectX 11:
* Quad List primitive type
* UAV overlap
* Depth bounds test
* Multi-draw indirect
* Multiview
### Prerequisites
* AMD Radeon&trade; GCN-based GPU (HD 7000 series or newer)
* Or other DirectX&reg; 11 compatible GPU with Shader Model 5 support<sup>[3](#ags-sdk-footnote1)</sup>
* 64-bit Windows&reg; 7 (SP1 with the [Platform Update](https://msdn.microsoft.com/en-us/library/windows/desktop/jj863687.aspx)), Windows&reg; 8.1, or Windows&reg; 10
* Visual Studio&reg; 2013 or Visual Studio&reg; 2015
* Recommended driver: Radeon Software Crimson ReLive Edition 16.12.1 (driver version 16.50.2001) or later
### Getting Started
* It is recommended to take a look at the sample source code.
* There are three samples: ags_sample, crossfire_sample, and eyefinity_sample.
* Visual Studio projects for VS2013 and VS2015 can be found in each sample's `build` directory.
* Additional documentation, including API documentation and instructions on how to add AGS support to an existing project, can be found in the `ags_lib\doc` directory.
* There is also [online documentation](http://gpuopen-librariesandsdks.github.io/ags/).
### Additional Samples
In addition to the three samples included in this repo, there are other samples available on GitHub that use AGS:
* [CrossfireAPI11](https://github.com/GPUOpen-LibrariesAndSDKs/CrossfireAPI11) - a larger example of using the explicit Crossfire API
* The CrossfireAPI11 sample also comes with an extensive guide for multi-GPU: the *AMD Crossfire guide for Direct3D&reg; 11 applications*
* [DepthBoundsTest11](https://github.com/GPUOpen-LibrariesAndSDKs/DepthBoundsTest11) - a sample showing how to use the depth bounds test extension
* [Barycentrics11](https://github.com/GPUOpen-LibrariesAndSDKs/Barycentrics11) - a sample showing how to use the GCN shader extensions for DirectX 11
* [Barycentrics12](https://github.com/GPUOpen-LibrariesAndSDKs/Barycentrics12) - a sample showing how to use the GCN shader extensions for DirectX 12
### Premake
The Visual Studio projects in each sample's `build` directory were generated with Premake. To generate the project files yourself, open a command prompt in the sample's `premake` directory (where the premake5.lua script for that sample is located, not the top-level directory where the premake5 executable is located) and execute the following command:
* `..\..\premake\premake5.exe [action]`
* For example: `..\..\premake\premake5.exe vs2015`
Alternatively, to regenerate all Visual Studio files for the SDK, execute `ags_update_vs_files.bat` in the top-level `premake` directory.
This version of Premake has been modified from the stock version to use the property sheet technique for the Windows SDK from this [Visual C++ Team blog post](http://blogs.msdn.com/b/vcblog/archive/2012/11/23/using-the-windows-8-sdk-with-visual-studio-2010-configuring-multiple-projects.aspx). The technique was originally described for using the Windows 8.0 SDK with Visual Studio 2010, but it applies more generally to using newer versions of the Windows SDK with older versions of Visual Studio.
By default, Visual Studio 2013 projects will compile against the Windows 8.1 SDK. However, the VS2013 projects generated with this version of Premake will use the next higher SDK (i.e. the Windows 10 SDK), if the newer SDK exists on the user's machine.
For Visual Studio 2015, the systemversion Premake function is used to add the `WindowsTargetPlatformVersion` element to the project file, to specify which version of the Windows SDK will be used. To change `WindowsTargetPlatformVersion` for Visual Studio 2015, change the value for `_AMD_WIN_SDK_VERSION` in `premake\amd_premake_util.lua` and regenerate the Visual Studio files.
### Third-Party Software
* DXUT is distributed under the terms of the MIT License. See `eyefinity_sample\dxut\MIT.txt`.
* Premake is distributed under the terms of the BSD License. See `premake\LICENSE.txt`.
### Attribution
* AMD, the AMD Arrow logo, Radeon, Crossfire, and combinations thereof are either registered trademarks or trademarks of Advanced Micro Devices, Inc. in the United States and/or other countries.
* Microsoft, DirectX, Visual Studio, and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
### Notes
<a name="ags-sdk-footnote1">1</a>: Requires Radeon Software Crimson Edition 16.9.2 (driver version 16.40.2311) or later.
<a name="ags-sdk-footnote1">2</a>: Requires Radeon Software Crimson ReLive Edition 16.12.1 (driver version 16.50.2001) or later.
<a name="ags-sdk-footnote1">3</a>: While the AGS SDK samples will run on non-AMD hardware, they will be of limited usefulness, since the purpose of AGS is to provide convenient access to AMD-specific information and extensions.
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,378 @@
/******************************************************************************
* The MIT License (MIT)
*
* Copyright (c) 2019-2020 Baldur Karlsson
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
******************************************************************************/
#include "d3d11_test.h"
#include "3rdparty/ags/ags_shader_intrinsics_dx11.hlsl.h"
#include "3rdparty/ags/amd_ags.h"
RD_TEST(D3D11_AMD_Shader_Extensions, D3D11GraphicsTest)
{
static constexpr const char *Description = "Tests using AMD shader extensions.";
AGS_INITIALIZE dyn_agsInitialize = NULL;
AGS_DEINITIALIZE dyn_agsDeInitialize = NULL;
AGS_DRIVEREXTENSIONSDX11_CREATEDEVICE dyn_agsDriverExtensionsDX11_CreateDevice = NULL;
AGS_DRIVEREXTENSIONSDX11_DESTROYDEVICE dyn_agsDriverExtensionsDX11_DestroyDevice = NULL;
AGSContext *ags = NULL;
std::string BaryCentricPixel = R"EOSHADER(
float4 main() : SV_Target0
{
float2 bary = AmdDxExtShaderIntrinsics_IjBarycentricCoords( AmdDxExtShaderIntrinsicsBarycentric_LinearCenter );
float3 bary3 = float3(bary.x, bary.y, 1.0 - (bary.x + bary.y));
if(bary3.x > bary3.y && bary3.x > bary3.z)
return float4(1.0f, 0.0f, 0.0f, 1.0f);
else if(bary3.y > bary3.x && bary3.y > bary3.z)
return float4(0.0f, 1.0f, 0.0f, 1.0f);
else
return float4(0.0f, 0.0f, 1.0f, 1.0f);
}
)EOSHADER";
std::string MaxCompute = R"EOSHADER(
RWByteAddressBuffer inUAV : register(u0);
RWByteAddressBuffer outUAV : register(u1);
[numthreads(256, 1, 1)]
void main(uint3 threadID : SV_DispatchThreadID)
{
// read input from source
uint2 input;
input.x = inUAV.Load(threadID.x * 8);
input.y = inUAV.Load(threadID.x * 8 + 4);
AmdDxExtShaderIntrinsics_AtomicMaxU64(outUAV, 0, input);
}
)EOSHADER";
void Prepare(int argc, char **argv)
{
D3D11GraphicsTest::Prepare(argc, argv);
if(!Avail.empty())
return;
std::string agsname = sizeof(void *) == 8 ? "amd_ags_x64.dll" : "amd_ags_x86.dll";
HMODULE agsLib = LoadLibraryA(agsname.c_str());
// try in local plugins folder
if(!agsLib)
{
agsLib = LoadLibraryA(
("../../" + std::string(sizeof(void *) == 8 ? "plugins-win64/" : "plugins-win32/") +
"amd/ags/" + agsname)
.c_str());
}
if(!agsLib)
{
// try in plugins folder next to renderdoc.dll
HMODULE rdocmod = GetModuleHandleA("renderdoc.dll");
char path[MAX_PATH + 1] = {};
if(rdocmod)
{
GetModuleFileNameA(rdocmod, path, MAX_PATH);
std::string tmp = path;
tmp.resize(tmp.size() - (sizeof("/renderdoc.dll") - 1));
agsLib = LoadLibraryA((tmp + "/plugins/amd/ags/" + agsname).c_str());
}
}
if(!agsLib)
{
Avail = "Couldn't load AGS dll";
return;
}
dyn_agsInitialize = (AGS_INITIALIZE)GetProcAddress(agsLib, "agsInitialize");
dyn_agsDeInitialize = (AGS_DEINITIALIZE)GetProcAddress(agsLib, "agsDeInitialize");
dyn_agsDriverExtensionsDX11_CreateDevice = (AGS_DRIVEREXTENSIONSDX11_CREATEDEVICE)GetProcAddress(
agsLib, "agsDriverExtensionsDX11_CreateDevice");
dyn_agsDriverExtensionsDX11_DestroyDevice =
(AGS_DRIVEREXTENSIONSDX11_DESTROYDEVICE)GetProcAddress(
agsLib, "agsDriverExtensionsDX11_DestroyDevice");
if(!dyn_agsInitialize || !dyn_agsDeInitialize || !dyn_agsDriverExtensionsDX11_CreateDevice ||
!dyn_agsDriverExtensionsDX11_DestroyDevice)
{
Avail = "AGS didn't have all necessary entry points - too old?";
return;
}
AGSReturnCode agsret = dyn_agsInitialize(
AGS_MAKE_VERSION(AMD_AGS_VERSION_MAJOR, AMD_AGS_VERSION_MINOR, AMD_AGS_VERSION_PATCH), NULL,
&ags, NULL);
if(agsret != AGS_SUCCESS || ags == NULL)
{
Avail = "AGS couldn't initialise";
return;
}
ID3D11Device *ags_devHandle = NULL;
ID3D11DeviceContext *ags_ctxHandle = NULL;
CreateExtendedDevice(&ags_devHandle, &ags_ctxHandle);
// once we've checked that we can create an extension device on an adapter, we can release it
// and return ready to run
if(ags_devHandle)
{
Avail = "";
unsigned int dummy = 0;
dyn_agsDriverExtensionsDX11_DestroyDevice(ags, ags_devHandle, &dummy, ags_ctxHandle, &dummy);
return;
}
Avail = "AGS couldn't create device on any selected adapter.";
}
void CreateExtendedDevice(ID3D11Device * *ags_devHandle, ID3D11DeviceContext * *ags_ctxHandle)
{
std::vector<IDXGIAdapterPtr> adapters = GetAdapters();
for(IDXGIAdapterPtr a : adapters)
{
AGSDX11DeviceCreationParams devCreate = {};
AGSDX11ExtensionParams extCreate = {};
AGSDX11ReturnedParams ret = {};
devCreate.pAdapter = a.GetInterfacePtr();
devCreate.DriverType = D3D_DRIVER_TYPE_UNKNOWN;
devCreate.Flags = createFlags | (debugDevice ? D3D11_CREATE_DEVICE_DEBUG : 0);
devCreate.pFeatureLevels = &feature_level;
devCreate.FeatureLevels = 1;
devCreate.SDKVersion = D3D11_SDK_VERSION;
extCreate.uavSlot = 7;
extCreate.crossfireMode = AGS_CROSSFIRE_MODE_DISABLE;
extCreate.pAppName = L"RenderDoc demos";
extCreate.pEngineName = L"RenderDoc demos";
AGSReturnCode agsret =
dyn_agsDriverExtensionsDX11_CreateDevice(ags, &devCreate, &extCreate, &ret);
if(agsret == AGS_SUCCESS && ret.pDevice)
{
// don't accept devices that don't support the intrinsics we want
if(!ret.extensionsSupported.intrinsics16 || !ret.extensionsSupported.intrinsics19)
{
unsigned int dummy = 0;
dyn_agsDriverExtensionsDX11_DestroyDevice(ags, ret.pDevice, &dummy, ret.pImmediateContext,
&dummy);
}
else
{
*ags_devHandle = ret.pDevice;
*ags_ctxHandle = ret.pImmediateContext;
return;
}
}
}
}
int main()
{
// initialise, create window, create device, etc
if(!Init())
return 3;
// release the old device and swapchain
dev = NULL;
ctx = NULL;
dev1 = NULL;
dev2 = NULL;
dev3 = NULL;
dev4 = NULL;
dev5 = NULL;
ctx1 = NULL;
ctx2 = NULL;
ctx3 = NULL;
ctx4 = NULL;
annot = NULL;
swapBlitVS = NULL;
swapBlitPS = NULL;
// and swapchain & related
swap = NULL;
bbTex = NULL;
bbRTV = NULL;
// we don't use these directly, just copy them into the real device so we know that ags is the
// one to destroy the last reference to the device
ID3D11Device *ags_devHandle = NULL;
ID3D11DeviceContext *ags_ctxHandle = NULL;
CreateExtendedDevice(&ags_devHandle, &ags_ctxHandle);
dev = ags_devHandle;
ctx = ags_ctxHandle;
annot = ctx;
// create the swapchain on the new AGS-extended device
DXGI_SWAP_CHAIN_DESC swapDesc = MakeSwapchainDesc(mainWindow);
HRESULT hr = fact->CreateSwapChain(dev, &swapDesc, &swap);
if(FAILED(hr))
{
TEST_ERROR("Couldn't create swapchain");
return 4;
}
hr = swap->GetBuffer(0, __uuidof(ID3D11Texture2D), (void **)&bbTex);
if(FAILED(hr))
{
TEST_ERROR("Couldn't get swapchain backbuffer");
return 4;
}
hr = dev->CreateRenderTargetView(bbTex, NULL, &bbRTV);
if(FAILED(hr))
{
TEST_ERROR("Couldn't create swapchain RTV");
return 4;
}
std::string ags_header = ags_shader_intrinsics_dx11_hlsl();
ID3DBlobPtr vsblob = Compile(D3DDefaultVertex, "main", "vs_4_0");
// can't skip optimising and still have the extensions work, sadly
ID3DBlobPtr psblob = Compile(ags_header + BaryCentricPixel, "main", "ps_5_0", false);
ID3DBlobPtr csblob = Compile(ags_header + MaxCompute, "main", "cs_5_0", false);
CreateDefaultInputLayout(vsblob);
ID3D11VertexShaderPtr vs = CreateVS(vsblob);
ID3D11PixelShaderPtr ps = CreatePS(psblob);
ID3D11ComputeShaderPtr cs = CreateCS(csblob);
SetDebugName(cs, "cs");
ID3D11BufferPtr vb = MakeBuffer().Vertex().Data(DefaultTri);
// make a simple texture so that the structured data includes texture initial states
ID3D11Texture2DPtr fltTex = MakeTexture(DXGI_FORMAT_R32G32B32A32_FLOAT, 4, 4).RTV();
ID3D11RenderTargetViewPtr fltRT = MakeRTV(fltTex);
const int numInputValues = 16384;
std::vector<uint64_t> values;
values.resize(numInputValues);
uint64_t cpuMax = 0;
for(uint64_t &v : values)
{
v = 0;
for(uint32_t byte = 0; byte < 8; byte++)
{
uint64_t b = (rand() & 0xff0) >> 4;
v |= b << (byte * 8);
}
cpuMax = std::max(v, cpuMax);
}
ID3D11BufferPtr inBuf = MakeBuffer()
.UAV()
.ByteAddressed()
.Data(values.data())
.Size(UINT(sizeof(uint64_t) * values.size()));
ID3D11BufferPtr outBuf = MakeBuffer().UAV().ByteAddressed().Size(32);
SetDebugName(outBuf, "outBuf");
ID3D11UnorderedAccessViewPtr inUAV = MakeUAV(inBuf).Format(DXGI_FORMAT_R32_TYPELESS);
ID3D11UnorderedAccessViewPtr outUAV = MakeUAV(outBuf).Format(DXGI_FORMAT_R32_TYPELESS);
while(Running())
{
ctx->ClearState();
uint32_t zero[4] = {};
ctx->ClearUnorderedAccessViewUint(outUAV, zero);
ClearRenderTargetView(bbRTV, {0.2f, 0.2f, 0.2f, 1.0f});
ClearRenderTargetView(fltRT, {0.2f, 0.2f, 0.2f, 1.0f});
IASetVertexBuffer(vb, sizeof(DefaultA2V), 0);
ctx->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
ctx->IASetInputLayout(defaultLayout);
ctx->VSSetShader(vs, NULL, 0);
ctx->PSSetShader(ps, NULL, 0);
RSSetViewport({0.0f, 0.0f, (float)screenWidth, (float)screenHeight, 0.0f, 1.0f});
ctx->OMSetRenderTargets(1, &bbRTV.GetInterfacePtr(), NULL);
ctx->Draw(3, 0);
ctx->CSSetShader(cs, NULL, 0);
ctx->CSSetUnorderedAccessViews(0, 1, &inUAV.GetInterfacePtr(), NULL);
ctx->CSSetUnorderedAccessViews(1, 1, &outUAV.GetInterfacePtr(), NULL);
ctx->Dispatch(numInputValues / 256, 1, 1);
ctx->Flush();
std::vector<byte> output = GetBufferData(outBuf, 0, 8);
uint64_t gpuMax = 0;
memcpy(&gpuMax, output.data(), sizeof(uint64_t));
setMarker("cpuMax: " + std::to_string(cpuMax));
setMarker("gpuMax: " + std::to_string(gpuMax));
Present();
}
dev = NULL;
ctx = NULL;
annot = NULL;
unsigned int dummy = 0;
dyn_agsDriverExtensionsDX11_DestroyDevice(ags, ags_devHandle, &dummy, ags_ctxHandle, &dummy);
dyn_agsDeInitialize(ags);
return 0;
}
};
REGISTER_TEST();
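
For anyone reviewing the compute half of this test, a rough Python model of what it is expected to produce may help. It is a sketch under stated assumptions, not a description of the hardware: it assumes `input.x` carries the low 32 bits and `input.y` the high 32 bits of each value (inferred from the test comparing against a CPU-side `uint64_t` maximum), and it uses Python's `random` module rather than `rand()`. The function names are hypothetical.

```python
import random
import struct


def make_values(count: int, rng: random.Random) -> list:
    """Build 64-bit values one random byte at a time, like the C++ loop."""
    values = []
    for _ in range(count):
        v = 0
        for byte in range(8):
            v |= rng.randrange(256) << (byte * 8)
        values.append(v)
    return values


def gpu_style_max(buffer: bytes) -> int:
    """Model the shader: each 'thread' loads two 32-bit words at
    thread_id * 8 and thread_id * 8 + 4, then max-merges them into a
    single 64-bit result (assumed word order: low, then high)."""
    result = 0
    for thread_id in range(len(buffer) // 8):
        lo, hi = struct.unpack_from("<II", buffer, thread_id * 8)
        result = max(result, (hi << 32) | lo)
    return result


values = make_values(16384, random.Random(0))
buffer = struct.pack("<%dQ" % len(values), *values)

cpu_max = max(values)
gpu_max = gpu_style_max(buffer)
```

If the word order assumption holds, `cpu_max` and `gpu_max` agree, which is the same property the test records through its `cpuMax` and `gpuMax` markers.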
+36 -20
@@ -163,20 +163,7 @@ bool D3D11GraphicsTest::Init(IDXGIAdapterPtr pAdapter)
mainWindow = win;
DXGI_SWAP_CHAIN_DESC swapDesc;
ZeroMemory(&swapDesc, sizeof(swapDesc));
swapDesc.BufferCount = backbufferCount;
swapDesc.BufferDesc.Format = backbufferFmt;
swapDesc.BufferDesc.Width = screenWidth;
swapDesc.BufferDesc.Height = screenHeight;
swapDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT | DXGI_USAGE_SHADER_INPUT;
swapDesc.SampleDesc.Count = backbufferMSAA;
swapDesc.SampleDesc.Quality = 0;
swapDesc.OutputWindow = win->wnd;
swapDesc.Windowed = TRUE;
swapDesc.SwapEffect = DXGI_SWAP_EFFECT_DISCARD;
swapDesc.Flags = 0;
DXGI_SWAP_CHAIN_DESC swapDesc = MakeSwapchainDesc(win);
hr = CreateDevice(pAdapter, &swapDesc, features, flags);
@@ -210,11 +197,35 @@ bool D3D11GraphicsTest::Init(IDXGIAdapterPtr pAdapter)
return true;
}
DXGI_SWAP_CHAIN_DESC D3D11GraphicsTest::MakeSwapchainDesc(GraphicsWindow *win)
{
DXGI_SWAP_CHAIN_DESC swapDesc = {};
swapDesc.BufferCount = backbufferCount;
swapDesc.BufferDesc.Format = backbufferFmt;
swapDesc.BufferDesc.Width = screenWidth;
swapDesc.BufferDesc.Height = screenHeight;
swapDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT | DXGI_USAGE_SHADER_INPUT;
swapDesc.SampleDesc.Count = backbufferMSAA;
swapDesc.SampleDesc.Quality = 0;
swapDesc.OutputWindow = ((Win32Window *)win)->wnd;
swapDesc.Windowed = TRUE;
swapDesc.SwapEffect = DXGI_SWAP_EFFECT_DISCARD;
swapDesc.Flags = 0;
return swapDesc;
}
GraphicsWindow *D3D11GraphicsTest::MakeWindow(int width, int height, const char *title)
{
return new Win32Window(width, height, title);
}
std::vector<IDXGIAdapterPtr> D3D11GraphicsTest::GetAdapters()
{
return adapters;
}
HRESULT D3D11GraphicsTest::CreateDevice(IDXGIAdapterPtr adapterToTry, DXGI_SWAP_CHAIN_DESC *swapDesc,
D3D_FEATURE_LEVEL *features, UINT flags)
{
@@ -597,16 +608,21 @@ void D3D11GraphicsTest::SetStencilRef(UINT ref)
ctx->OMSetDepthStencilState(state, ref);
}
ID3DBlobPtr D3D11GraphicsTest::Compile(std::string src, std::string entry, std::string profile)
ID3DBlobPtr D3D11GraphicsTest::Compile(std::string src, std::string entry, std::string profile,
bool skipoptimise)
{
ID3DBlobPtr blob = NULL;
ID3DBlobPtr error = NULL;
HRESULT hr =
dyn_D3DCompile(src.c_str(), src.length(), "", NULL, NULL, entry.c_str(), profile.c_str(),
D3DCOMPILE_WARNINGS_ARE_ERRORS | D3DCOMPILE_DEBUG |
D3DCOMPILE_SKIP_OPTIMIZATION | D3DCOMPILE_OPTIMIZATION_LEVEL0,
0, &blob, &error);
UINT flags = D3DCOMPILE_WARNINGS_ARE_ERRORS | D3DCOMPILE_DEBUG;
if(skipoptimise)
flags |= D3DCOMPILE_SKIP_OPTIMIZATION | D3DCOMPILE_OPTIMIZATION_LEVEL0;
else
flags |= D3DCOMPILE_OPTIMIZATION_LEVEL0;
HRESULT hr = dyn_D3DCompile(src.c_str(), src.length(), "", NULL, NULL, entry.c_str(),
profile.c_str(), flags, 0, &blob, &error);
if(FAILED(hr))
{
+6 -1
@@ -47,8 +47,12 @@ struct D3D11GraphicsTest : public GraphicsTest
void Prepare(int argc, char **argv);
bool Init(IDXGIAdapterPtr pAdapter = NULL);
void Shutdown();
GraphicsWindow *MakeWindow(int width, int height, const char *title);
DXGI_SWAP_CHAIN_DESC MakeSwapchainDesc(GraphicsWindow *win);
std::vector<IDXGIAdapterPtr> GetAdapters();
HRESULT CreateDevice(IDXGIAdapterPtr adapterToTry, DXGI_SWAP_CHAIN_DESC *swapDesc,
D3D_FEATURE_LEVEL *features, UINT flags);
@@ -70,7 +74,8 @@ struct D3D11GraphicsTest : public GraphicsTest
BufUAVType = 0xf00,
};
ID3DBlobPtr Compile(std::string src, std::string entry, std::string profile);
ID3DBlobPtr Compile(std::string src, std::string entry, std::string profile,
bool skipoptimise = true);
void Strip(ID3DBlobPtr &ptr);
void WriteBlob(std::string name, ID3DBlobPtr blob, bool compress);
+4
@@ -121,6 +121,7 @@
<ItemGroup>
<ClCompile Include="3rdparty\fmt\format.cc" />
<ClCompile Include="d3d11\d3d11_1_many_uavs.cpp" />
<ClCompile Include="d3d11\d3d11_amd_shader_extensions.cpp" />
<ClCompile Include="d3d11\d3d11_array_interpolator.cpp" />
<ClCompile Include="d3d11\d3d11_binding_hazards.cpp" />
<ClCompile Include="d3d11\d3d11_buffer_truncation.cpp" />
@@ -321,6 +322,9 @@
<ClCompile Include="win32\win32_window.cpp" />
</ItemGroup>
<ItemGroup>
<ClInclude Include="3rdparty\ags\ags_shader_intrinsics_dx11.hlsl.h" />
<ClInclude Include="3rdparty\ags\ags_shader_intrinsics_dx12.hlsl.h" />
<ClInclude Include="3rdparty\ags\amd_ags.h" />
<ClInclude Include="3rdparty\fmt\core.h" />
<ClInclude Include="3rdparty\fmt\format-inl.h" />
<ClInclude Include="3rdparty\fmt\format.h" />
+15
@@ -565,6 +565,9 @@
<ClCompile Include="d3d12\d3d12_list_alloc_tests.cpp">
<Filter>D3D12\demos</Filter>
</ClCompile>
<ClCompile Include="d3d11\d3d11_amd_shader_extensions.cpp">
<Filter>D3D11\demos</Filter>
</ClCompile>
</ItemGroup>
<ItemGroup>
<Filter Include="D3D11">
@@ -630,6 +633,9 @@
<Filter Include="3rdparty\fmt">
<UniqueIdentifier>{47ec0443-2084-47f0-be07-2a633a0c22e4}</UniqueIdentifier>
</Filter>
<Filter Include="3rdparty\ags">
<UniqueIdentifier>{329344bd-312a-4cd6-b618-aadeb4eb13cb}</UniqueIdentifier>
</Filter>
</ItemGroup>
<ItemGroup>
<ClInclude Include="d3d11\d3d11_helpers.h">
@@ -811,5 +817,14 @@
<ClInclude Include="3rdparty\fmt\format-inl.h">
<Filter>3rdparty\fmt</Filter>
</ClInclude>
<ClInclude Include="3rdparty\ags\ags_shader_intrinsics_dx11.hlsl.h">
<Filter>3rdparty\ags</Filter>
</ClInclude>
<ClInclude Include="3rdparty\ags\ags_shader_intrinsics_dx12.hlsl.h">
<Filter>3rdparty\ags</Filter>
</ClInclude>
<ClInclude Include="3rdparty\ags\amd_ags.h">
<Filter>3rdparty\ags</Filter>
</ClInclude>
</ItemGroup>
</Project>
+2 -2
@@ -214,8 +214,8 @@ def unpack_data(fmt: rd.ResourceFormat, data: bytes, data_offset: int):
format_chars = {
# 012345678
rd.CompType.UInt: "xBHxIxxxL",
rd.CompType.SInt: "xbhxixxxl",
rd.CompType.UInt: "xBHxIxxxQ",
rd.CompType.SInt: "xbhxixxxq",
rd.CompType.Float: "xxexfxxxd", # only 2, 4 and 8 are valid
}
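
The reason for swapping `L`/`l` for `Q`/`q` at index 8 of these strings can be checked directly with the `struct` module. The sketch below uses standard-size mode (the `<` prefix), where the widths are fixed; whether `unpack_data` itself uses standard or native mode is not visible in this hunk, so treat that part as an assumption.

```python
import struct

# In standard-size mode each format code has a fixed width on every platform.
sizes = {code: struct.calcsize("<" + code) for code in "BHILQ"}

# A value that does not fit in 32 bits survives a round trip through "Q".
value = 2 ** 40 + 5
packed = struct.pack("<Q", value)
unpacked = struct.unpack("<Q", packed)[0]
```

In standard-size mode `L` is 4 bytes, and in native mode its size follows the platform's C `unsigned long`, so only `Q` reliably describes an 8-byte component.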
+10
@@ -486,6 +486,16 @@ class TestCase:
return None
def get_resource_by_name(self, name: str):
resources = self.controller.GetResources()
for r in resources:
r: rd.ResourceDescription
if r.name == name:
return r
return None
def get_last_draw(self):
last_draw: rd.DrawcallDescription = self.controller.GetDrawcalls()[-1]
@@ -0,0 +1,92 @@
import renderdoc as rd
import rdtest
import struct
class D3D11_AMD_Shader_Extensions(rdtest.TestCase):
demos_test_name = 'D3D11_AMD_Shader_Extensions'
def check_capture(self):
draw = self.get_last_draw()
self.controller.SetFrameEvent(draw.eventId, False)
# Should have barycentrics showing the closest vertex for each pixel in the triangle
# Without relying on barycentric order, ensure that the three pixels are red, green, and blue
pixels = []
picked: rd.PixelValue = self.controller.PickPixel(draw.copyDestination, 125, 215, rd.Subresource(),
rd.CompType.UNorm)
pixels.append(picked.floatValue[0:4])
picked: rd.PixelValue = self.controller.PickPixel(draw.copyDestination, 200, 85, rd.Subresource(),
rd.CompType.UNorm)
pixels.append(picked.floatValue[0:4])
picked: rd.PixelValue = self.controller.PickPixel(draw.copyDestination, 285, 215, rd.Subresource(),
rd.CompType.UNorm)
pixels.append(picked.floatValue[0:4])
if (not [1.0, 0.0, 0.0, 1.0] in pixels) or (not [0.0, 1.0, 0.0, 1.0] in pixels) or (
not [0.0, 0.0, 1.0, 1.0] in pixels):
raise rdtest.TestFailureException("Expected red, green and blue in picked pixels. Got {}".format(pixels))
rdtest.log.success("Picked barycentric values are as expected")
# find the cpuMax and gpuMax draws
cpuMax = self.find_draw("cpuMax")
gpuMax = self.find_draw("gpuMax")
# The values should be identical
cpuMax = int(cpuMax.name[8:])
gpuMax = int(gpuMax.name[8:])
if cpuMax != gpuMax or cpuMax == 0:
raise rdtest.TestFailureException(
"captured cpuMax and gpuMax are not equal and positive: {} vs {}".format(cpuMax, gpuMax))
rdtest.log.success("recorded cpuMax and gpuMax are as expected")
outBuf = self.get_resource_by_name("outBuf")
data = self.controller.GetBufferData(outBuf.resourceId, 0, 8)
replayedGpuMax = struct.unpack("Q", data)[0]
if replayedGpuMax != gpuMax:
raise rdtest.TestFailureException(
"captured gpuMax and replayed gpuMax are not equal: {} vs {}".format(gpuMax, replayedGpuMax))
rdtest.log.success("replayed gpuMax is as expected")
cs = self.get_resource_by_name("cs")
pipe = rd.ResourceId()
refl: rd.ShaderReflection = self.controller.GetShader(pipe, cs.resourceId,
rd.ShaderEntryPoint("main", rd.ShaderStage.Compute))
disasm = self.controller.DisassembleShader(pipe, refl, "")
if "amd_u64_atomic" not in disasm:
raise rdtest.TestFailureException(
"Didn't find expected AMD opcode in disassembly: {}".format(disasm))
rdtest.log.success("compute shader disassembly is as expected")
if refl.debugInfo.debuggable:
self.controller.SetFrameEvent(self.find_draw("Dispatch").eventId, False)
trace: rd.ShaderDebugTrace = self.controller.DebugThread([0, 0, 0], [0, 0, 0])
if trace.debugger is None:
self.controller.FreeTrace(trace)
raise rdtest.TestFailureException("Couldn't debug compute shader")
cycles, variables = self.process_trace(trace)
if cycles < 3:
raise rdtest.TestFailureException("Compute shader has too few cycles {}".format(cycles))
else:
raise rdtest.TestFailureException(
"Compute shader is listed as non-debuggable: {}".format(refl.debugInfo.debugStatus))
rdtest.log.success("compute shader debugged successfully")
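
Two of the checks in this script are simple enough to try outside RenderDoc. The sketch below shows an order-independent colour check and the marker parsing that the `name[8:]` slices rely on (`"cpuMax: "` and `"gpuMax: "` are both 8 characters long). The helper names are hypothetical and the pixel values are made-up inputs, not captured results.

```python
RED = [1.0, 0.0, 0.0, 1.0]
GREEN = [0.0, 1.0, 0.0, 1.0]
BLUE = [0.0, 0.0, 1.0, 1.0]


def has_all_colours(pixels: list) -> bool:
    """True when red, green and blue each appear, in any order."""
    return all(colour in pixels for colour in (RED, GREEN, BLUE))


def parse_marker(name: str, prefix: str) -> int:
    """Extract the integer that follows a marker prefix such as 'cpuMax: '."""
    if not name.startswith(prefix):
        raise ValueError("unexpected marker name: " + name)
    return int(name[len(prefix):])


sample_pixels = [BLUE, RED, GREEN]
cpu_max = parse_marker("cpuMax: 18446744073709551615", "cpuMax: ")
```

Python integers are arbitrary precision, so a full 64-bit maximum parses without truncation.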