* We also fix a number of issues that could cause incorrect formats to be
generated.
* Test cases added for D3D11/GL/Vulkan to test different struct types. These
aren't automated at the moment because most of the code they're testing is in
the UI itself.
* On replay on macOS we use NSOpenGLContext so we can render to windows.
* We have two windowing systems on mac - one for Metal compatible outputs and
one for OpenGL compatible outputs.
* Fortunately the extension doesn't add any functionality that can't be achieved
through the old bindings, it's just better decoupled.
* What we do is set up our own VAO attrib/binding tracking, and translate all
the functions (old and new style) into the equivalent modifications of that
state, then each time a change happens we flush out the attribs and bindings
using the old attrib functions.
* We also intercept queries to the new bindings and return the right values,
so even loading a capture that uses ARB_vertex_attrib_binding works as
expected (all the translation to old bindings happens under the emulation
layer).
* Split up old uber-cbuffers used for unrelated shaders into shader-specific
cbuffers (DebugPixelCBufferData -> TexDisplayPSCBuffer / CheckerboardCBuffer /
MeshPixelCBuffer).
* Split up HLSL files so not everything is lumped into 'debugdisplay.hlsl' but
has separate files as appropriate.
* Use #include in GLSL and HLSL to better organise shader code together rather
than relying on lumping files together on the C++ side.
* Renamed files like 'debugcbuffers.h' to 'hlsl_cbuffers.h'.
* Trimmed out some extensions/cruft that isn't needed in the GLSL side since we
no longer use separable shaders.
* Combine some similar shaders, like Outline/Checkerboard, into a central
place.
* If we have a renderpass that we stop replaying at subpass 0, we want to
pretend that the image is preserved as it was at that point - layout and all.
However since we're replaying with the original renderpass any subpass
transitions and finalLayout transitions will take effect. When we end any
active RP during vkEndCommandBuffer, we then undo any of these implicit
transitions to put the image back as it was.
* If a library is loaded as RTLD_DEEPBIND it might be linked directly against
libGL, in which case we need to plt hook the functions we care about as well
as just dlopen.
* When a library libBlah.so is loaded, it might be linked to libGL.so. If so, we
need to go populate our original functions and perform any callbacks when
libGL.so gets loaded as a dependency. This is true even if the library isn't
loaded as LOCAL.
When `Intervals<T>::update` was called with `finish=UINT64_MAX`, the loop
condition `i->start() < finish` would remain true for every interval, until `i`
went past the last interval, at which point `i->start()` (and all the other
accesses to `i` within the loop) have undefined behaviour.
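
A minimal sketch of the guarded loop condition, assuming a simplified stand-in for `Intervals<T>` built on `std::map` (the `IntervalMap` alias and `countIntervalsInRange` are hypothetical names for illustration, not the real implementation). The point is only that the end-of-container check comes before any dereference:

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for Intervals<T>: maps each interval's start to a value.
using IntervalMap = std::map<uint64_t, int>;

// Counts the intervals whose start lies in [start, finish).
// Testing `it != m.end()` first means the iterator is never dereferenced after
// it has moved past the last interval, even when finish == UINT64_MAX.
inline int countIntervalsInRange(const IntervalMap &m, uint64_t start, uint64_t finish)
{
  assert(start <= finish);
  int count = 0;
  for(auto it = m.lower_bound(start); it != m.end() && it->first < finish; ++it)
    count++;
  return count;
}
```

With `finish=UINT64_MAX` every remaining interval satisfies the bound check, so the loop terminates only because of the `it != m.end()` test.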
The current frame reference tracking handles each `VkDeviceMemory` as a single
resource: e.g. if one region of the memory is read, and a disjoint region of the
memory is written, this is treated exactly the same way as if the read and write
were on the same region of the memory.
The new behaviour tracks the frame references for ranges of device memory; this
will be used to improve the detection of device memory that does not require
initialization or resetting.
The `FrameRefType` associated with the entire memory resource (e.g. through
`MarkResourceFrameReferenced`) should now keep an up-to-date maximum of the
`FrameRefType`s of all of the intervals within that memory. Maximum is used here
because `FrameRefType`s with stronger init/reset requirements have larger
values.
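
As a rough illustration of the "maximum over intervals" rule, assuming a cut-down `FrameRefType` with made-up numeric values (the real enum has its own members and values; only the ordering by strength is taken from the text above, and `maxRefType` is a hypothetical helper):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumed ordering: larger values carry stronger init/reset requirements.
enum class FrameRefType
{
  None = 0,
  InitOnce = 1,
  Reset = 2,
};

// Returns the strongest requirement among the per-interval reference types.
// An empty list yields None, i.e. no requirement at all.
inline FrameRefType maxRefType(const std::vector<FrameRefType> &intervalRefs)
{
  FrameRefType result = FrameRefType::None;
  for(FrameRefType r : intervalRefs)
    result = std::max(result, r);
  return result;
}
```

If any one interval needs `Reset`, the whole-resource value is `Reset`, regardless of how weak the other intervals are.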
Previously, members were accessed as methods of the iterator (e.g.
`iter.start()`); now this becomes `iter->start()`.
This also adds const iterators for `Intervals<T>`, which allows the use of
`const Intervals<T> &`.
The bounds for intervals have been renamed `start` (inclusive lower bound) and
`finish` (exclusive upper bound), to avoid confusion with `begin` and `end`,
used for iterators.
Finally, the behaviour of the old `setValue` method was very confusing, and it
has been replaced by three methods with behaviour that is much easier to
understand: `setValue`, `split` and `mergeLeft`.
This value is similar to `None`, in that the resource need not be initialized
for correct replay; however, a resource with `Clear` type might be modified in
the frame, whereas a resource with `None` type is never modified in the frame.
This distinction may be helpful in initializing the irrelevant initial contents
to some known value (e.g. 0), so that users inspecting these resources don't see
unexpected values (such as a value written later in the frame). `None` resources
need only set these dummy initial contents once, whereas `Clear` resources need
to reset them every frame.
The accumulated `FrameRefType` of the accesses to a resource within a frame
determines how the resource needs to be initialized. The possibilities are:
* `None`: The initial value of the resource is never used within the frame,
either because it is never read, or because it is only read after being
cleared.
* `InitOnce`: The resource needs to be initialized before replaying the frame
for the first time, but does not ever need to be reset; this occurs when a
resource is read within the frame, but not written within the frame.
* `Reset`: The resource needs to be reset each time before replaying the frame.
This occurs when the resource is read and later written within the frame.
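
The three cases above can be sketched as a small classifier over an ordered list of accesses. This is only an illustration under stated assumptions: the `Access` enum and `classify` function are hypothetical, the enum is cut down to the three values listed, and cases the description does not cover (such as a partial write before the first read) are handled by assumption rather than by anything the source states.

```cpp
#include <cassert>
#include <vector>

// Hypothetical access kinds; Clear means the whole resource is overwritten.
enum class Access { Read, Write, Clear };
enum class FrameRefType { None, InitOnce, Reset };

inline FrameRefType classify(const std::vector<Access> &accesses)
{
  bool cleared = false;           // fully overwritten before the initial value was read
  bool readInitial = false;       // some read observed the initial value
  bool writtenAfterRead = false;  // modified after the initial value was read

  for(Access a : accesses)
  {
    switch(a)
    {
      case Access::Read:
        if(!cleared)
          readInitial = true;
        break;
      case Access::Clear:
        if(readInitial)
          writtenAfterRead = true;
        else
          cleared = true;
        break;
      case Access::Write:
        if(readInitial)
          writtenAfterRead = true;
        break;
    }
  }

  if(!readInitial)
    return FrameRefType::None;
  return writtenAfterRead ? FrameRefType::Reset : FrameRefType::InitOnce;
}
```

For instance, a clear followed by a read classifies as `None`, a lone read as `InitOnce`, and a read followed by a write as `Reset`.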
* This makes post-mortem debugging easier. It's only active while a remote
server connection is running, so e.g. it will miss crashes of the remote
server itself. This could be improved by adding a margin where it still
checks for messages after the connection is dropped, perhaps on a separate
thread.
* The definition is specifically using chained min/max which return the other
argument if one is NaN, so a NaN being saturated is guaranteed to return 0.
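
A minimal CPU-side sketch of the same idea, assuming the chain is ordered as `min(max(x, 0), 1)` (the order matters for the NaN case, and the exact form used in the actual definition is not shown here). `std::fmax` and `std::fmin` return the non-NaN argument when exactly one argument is NaN:

```cpp
#include <cassert>
#include <cmath>

// Clamps x to [0, 1]. For NaN input, fmax(NaN, 0) yields 0 because fmax
// returns the non-NaN argument, and fmin(0, 1) then yields 0.
inline float saturate(float x)
{
  return std::fmin(std::fmax(x, 0.0f), 1.0f);
}
```

Reversing the chain to `max(min(x, 1), 0)` would map NaN to 1 instead, which is why the ordering is called out as an assumption.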
* On D3D the implementation was severely complicated by two parallel paths, one
of which 'flattened' vec4s out into an easy-to-consume format for debugging.
Instead we now do this as a post-process transformation in the debugging itself.
* Similarly D3D's implementation used DXBC data directly to fetch data, when all
of the information needed is in the ShaderConstant arrays and used by Vulkan.
This allows us to share the implementation.
* Only GL still needs its own implementation, since it needs to query the API
for offsets etc: these can change unpredictably, and GL can't easily produce
standard reflection data. We do share the 'read from buffer data'
implementation though.
* For D3D11 byte offsets are always uint32 aligned, but for other APIs that's
not guaranteed. Storing a byte offset is strictly more expressive and a lot
simpler to reason about.
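
To illustrate why a plain byte offset is easy to work with, here is a small sketch of reading a value at an arbitrary, possibly unaligned, byte offset. The helper name `readU32` is hypothetical and is not the shared implementation referred to above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Reads a uint32_t from `data` starting at `byteOffset`, which need not be
// 4-byte aligned. Returns false if the read would run past the end of the buffer.
inline bool readU32(const std::vector<uint8_t> &data, size_t byteOffset, uint32_t &out)
{
  if(byteOffset > data.size() || data.size() - byteOffset < sizeof(out))
    return false;
  // memcpy avoids the undefined behaviour of dereferencing a misaligned pointer.
  std::memcpy(&out, data.data() + byteOffset, sizeof(out));
  return true;
}
```

An offset stored in uint32 units could not express the odd offset used in a call such as `readU32(data, 1, value)`.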