* For the most part we implement this as a thin pass-through layer. Where we
care about things (image barriers for layout transitions and queue
submissions) we do two different things:
- For image barriers, we "downcast" to plain VkImageMemoryBarrier. Currently
the only things unique to VkImageMemoryBarrier2KHR are the extra access
flags and pipeline stages, which we don't care about. This keeps a lot of
code from having to either handle two paths or handle the new path and then
do lots of conversions back to VkImageMemoryBarrier when running on older
drivers.
- For queue submissions we do the opposite. We promote old VkSubmitInfo to
VkSubmitInfo2KHR and process that in a common function, then if necessary
we decay back to VkSubmitInfo before sending to the driver.
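
A rough sketch of the two conversion directions described above. It does not
use the real Vulkan headers: Barrier1/Barrier2, Submit1/Submit2, SemaphoreOp,
kAllCommands and the function names are all hypothetical stand-ins, cut down to
the fields that matter, and it assumes the old access bits live in the low 32
bits of the new 64-bit masks. The actual code also has to deal with pNext
chains and the rest of the struct members.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for VkImageMemoryBarrier / VkImageMemoryBarrier2KHR.
struct Barrier1
{
  uint32_t srcAccessMask = 0, dstAccessMask = 0;
  int oldLayout = 0, newLayout = 0;
  uint64_t image = 0;
};

struct Barrier2
{
  uint64_t srcStageMask = 0, srcAccessMask = 0;
  uint64_t dstStageMask = 0, dstAccessMask = 0;
  int oldLayout = 0, newLayout = 0;
  uint64_t image = 0;
};

// "Downcast": keep the layouts and image, drop the stage masks, and keep only
// the access bits that fit in the old 32-bit field.
Barrier1 Downcast(const Barrier2 &in)
{
  Barrier1 out;
  out.srcAccessMask = uint32_t(in.srcAccessMask & 0xffffffffULL);
  out.dstAccessMask = uint32_t(in.dstAccessMask & 0xffffffffULL);
  out.oldLayout = in.oldLayout;
  out.newLayout = in.newLayout;
  out.image = in.image;
  return out;
}

// Hypothetical stand-ins for VkSubmitInfo / VkSubmitInfo2KHR.
struct Submit1
{
  std::vector<uint64_t> waitSemaphores;
  std::vector<uint32_t> waitStages;    // parallel to waitSemaphores
  std::vector<uint64_t> commandBuffers;
  std::vector<uint64_t> signalSemaphores;
};

struct SemaphoreOp
{
  uint64_t semaphore;
  uint64_t stageMask;
};

struct Submit2
{
  std::vector<SemaphoreOp> waits;
  std::vector<uint64_t> commandBuffers;
  std::vector<SemaphoreOp> signals;
};

// Placeholder stage mask given to promoted signal operations (assumed value).
constexpr uint64_t kAllCommands = 0x10000;

// Promote the old form so one common function can process every submission.
Submit2 Promote(const Submit1 &in)
{
  Submit2 out;
  for(size_t i = 0; i < in.waitSemaphores.size(); i++)
    out.waits.push_back({in.waitSemaphores[i], uint64_t(in.waitStages[i])});
  out.commandBuffers = in.commandBuffers;
  for(uint64_t sem : in.signalSemaphores)
    out.signals.push_back({sem, kAllCommands});
  return out;
}

// Decay back to the old form when the driver only accepts that.
Submit1 Decay(const Submit2 &in)
{
  Submit1 out;
  for(const SemaphoreOp &w : in.waits)
  {
    out.waitSemaphores.push_back(w.semaphore);
    out.waitStages.push_back(uint32_t(w.stageMask & 0xffffffffULL));
  }
  out.commandBuffers = in.commandBuffers;
  for(const SemaphoreOp &s : in.signals)
    out.signalSemaphores.push_back(s.semaphore);
  return out;
}
```

The asymmetry is deliberate: barriers lose nothing we care about when narrowed,
while submissions are easier to process in the wider form and only narrowed at
the last moment.
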
* Note the sync2 structs aren't properly sorted in vk_next_chains.cpp because
I'm about to implement it and I'm too lazy to insert them in the right places
only to go find them all again to remove them.
* It seems like some drivers don't free some memory from the resident set until
a certain point, meaning the peak memory is higher after a couple of captures
then goes down. This isn't a true leak and us checking the entire process's
working set size is a very poor litmus test, so bumping this value is fine.
* When displaying mip 0 of a texture at less than 100% zoom we use linear
sampling, but we don't want to linear sample across slices. Adding a
half-pixel offset in z ensures we sample precisely on the slice itself.
* We force FetchUAV/FetchSRV to always return something. The result can be
empty if the resource is unbound, which gives us proper OOB behaviour for it.
The D3D12 implementation still prints an error message for registers that
don't correspond to the root signature at all, but for D3D11 we silently
allow it.
* These map more naturally to Python tuples and are easier to wrap in and out.
* We also tidy up the FloatVecVal etc and standardise the members of
ShaderValue.
* Debugging shaders with UAV access could (and likely will) have side effects
on those UAV contents, so when ContinueDebug needs to fetch UAV contents we
should replay back to before the draw to fetch them.
* This replay only happens once per ContinueDebug, and UAV contents are cached
globally so each UAV that's accessed is only fetched once. The impact is low.
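
A minimal sketch of the caching behaviour described in the last two points,
assuming a cache keyed by register space and index. UAVSlot, UAVCache,
BeginContinue and the two callbacks are hypothetical names for illustration and
not the actual debugger interfaces.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical key identifying a UAV binding.
struct UAVSlot
{
  uint32_t space;
  uint32_t reg;
  bool operator<(const UAVSlot &o) const
  {
    return std::tie(space, reg) < std::tie(o.space, o.reg);
  }
};

class UAVCache
{
public:
  using ReplayFn = std::function<void()>;
  using FetchFn = std::function<std::vector<uint8_t>(const UAVSlot &)>;

  UAVCache(ReplayFn replayToBeforeDraw, FetchFn fetch)
      : m_Replay(std::move(replayToBeforeDraw)), m_Fetch(std::move(fetch))
  {
  }

  // Call at the start of each ContinueDebug so a replay may happen again.
  void BeginContinue() { m_Replayed = false; }

  // Return cached contents, fetching them on first access only.
  const std::vector<uint8_t> &Get(const UAVSlot &slot)
  {
    auto it = m_Contents.find(slot);
    if(it != m_Contents.end())
      return it->second;

    // Replay to before the draw at most once per ContinueDebug, so the
    // contents are read before the debugged shader's writes can affect them.
    if(!m_Replayed)
    {
      m_Replay();
      m_Replayed = true;
    }

    return m_Contents.emplace(slot, m_Fetch(slot)).first->second;
  }

private:
  ReplayFn m_Replay;
  FetchFn m_Fetch;
  bool m_Replayed = false;
  std::map<UAVSlot, std::vector<uint8_t>> m_Contents;    // lives across steps
};
```

In this sketch a second access to the same UAV costs nothing, and a batch of
first-time accesses within one ContinueDebug shares a single replay.
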