* This is needed because flushing persistent maps and using shadow
memory must be processed before the shadow memory is deallocated at
the end of the frame.
* The problem is we need to keep current state for both graphics and
compute because while setting a root signature invalidates bindings,
setting compute doesn't invalidate graphics and vice-versa.
* However, it can be invalid to re-apply everything in one go if we're
only drawing and not dispatching and some of the compute elements are
invalid.
* Rather than trying to figure out which one *should* be valid, we just
skip invalid binds that don't have matching heaps.
* Threaded recording means the recording chunks are all interleaved in
the log so we may be cracking multiple command lists at the same time
and so need multiple allocators. Since the original capture obeyed the
rule of only recording to one list from an allocator at once, making a
cracked allocator for each also obeys that rule.
* Renaming the renderdoc and renderdoccmd projects to something else
(say 'selfhost' and 'selfhostcmd') allows them to be used to capture
renderdocui and the replay that's happening.
* This is only supported on development builds and might break down,
but it's handy to have as an easy-to-enable option.
* There are also a couple of handy Python functions exposed:
renderdoc.StaticExports.StartSelfHostCapture(string dllname)
renderdoc.StaticExports.EndSelfHostCapture(string dllname)
which can be used to start and stop the capture around e.g. a shader
debug operation or a pixel history operation or something similar.
* If the application incorrectly calls WSACleanup() before WSAStartup()
then, instead of just returning an error for invalid use, it will kill
all of the sockets we created from our early WSAStartup().
* To fix this, we hook those functions and track the process-wide
refcount, to prevent it from ever going to 0 until we shut down our own.
* This was found on GTA 5 in particular, but other applications may do
this too.
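
The refcount tracking described above can be sketched roughly as follows.
This is a minimal model and not the actual hook code: the class and method
names are hypothetical, it does not call into Winsock, and a real hook
would also need locking because the count is process-wide. AppCleanup()
returning false stands for "do not forward this call to the real
WSACleanup(), report an error to the caller instead".

```cpp
#include <cassert>

// Hypothetical model of the process-wide Winsock reference count.
// It only decides whether a hooked call may be forwarded to the real API.
class WinsockRefTracker
{
public:
  // Our own early WSAStartup(): takes a reference the application must
  // never be able to release.
  void OwnStartup()
  {
    assert(!m_ownRef);
    m_ownRef = true;
    ++m_count;
  }

  // Our own shutdown: only now may the count be allowed to reach 0.
  void OwnShutdown()
  {
    assert(m_ownRef);
    m_ownRef = false;
    --m_count;
  }

  // Hooked WSAStartup() called by the application.
  void AppStartup() { ++m_count; }

  // Hooked WSACleanup() called by the application. Returns true if the
  // call is balanced and can be forwarded, false if it would release the
  // reference we are holding for our own sockets.
  bool AppCleanup()
  {
    const int floor = m_ownRef ? 1 : 0;
    if(m_count <= floor)
      return false;
    --m_count;
    return true;
  }

  int Count() const { return m_count; }

private:
  int m_count = 0;
  bool m_ownRef = false;
};
```

With this model, an application that calls cleanup before startup is
simply refused, and the count stays at 1 until OwnShutdown() runs.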
* Still no plans to support these, but at least this will allow them to
be queried for. This might happen if e.g. an extension loader or old
bit of code just loads all functions unconditionally and might error
if one of these functions isn't found, even if it has no intention of
actually calling it.
* If we're only compressing 2D blocks the compressed block depth can be
0 and that's totally valid. We can't unconditionally divide through,
so sanitise inputs to make sure calculations are sensible.
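
As an illustration of the kind of sanitising meant here, the sketch below
clamps a zero block dimension to 1 before dividing. The struct and function
names are hypothetical and the size formula is the generic "blocks times
bytes per block" calculation, not necessarily the exact code path that was
fixed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical description of a compressed block: its footprint in texels
// and its size in bytes. depth may be 0 when only 2D blocks are in use.
struct BlockDims
{
  uint32_t width, height, depth, bytes;
};

// Number of blocks needed to cover 'value' texels, rounding up.
// A block dimension of 0 is treated as 1 so we never divide by zero.
inline uint32_t BlocksCovering(uint32_t value, uint32_t block)
{
  block = std::max(block, 1u);
  return value / block + (value % block != 0 ? 1u : 0u);
}

// Byte size of a w x h x d region stored in the given block layout.
// An image depth of 0 is also treated as a single slice.
inline size_t CompressedSize(uint32_t w, uint32_t h, uint32_t d, const BlockDims &b)
{
  d = std::max(d, 1u);
  return size_t(BlocksCovering(w, b.width)) * BlocksCovering(h, b.height) *
         BlocksCovering(d, b.depth) * b.bytes;
}
```

For example, a 16x16 image with 4x4x0 blocks of 8 bytes comes out as
4 * 4 * 1 * 8 = 128 bytes instead of triggering a division by zero.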
* During capture, if there was no unpack buffer, we made sure that any
unpack state was applied during serialisation (as this makes sure that
we have a small set of data to serialise and don't have to serialise
loads of padding because of SKIP_ROWS or a large ROW_LENGTH).
* The flipside, though, is that on replay the pixel unpack state might
still be set to some configuration that we've already applied.
So instead we reset it back to identity before replaying any SubImage
call.
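
To show why the reset matters, here is a small sketch of applying unpack
state at capture time. It is an illustration only: the struct and function
are hypothetical, it models just ROW_LENGTH, SKIP_ROWS and SKIP_PIXELS for
a 2D image, and it assumes an unpack alignment of 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical subset of pixel unpack state. A default-constructed value
// is the identity state: tightly packed rows and nothing skipped.
struct UnpackState
{
  uint32_t rowLength = 0;    // 0 means "use the image width"
  uint32_t skipRows = 0;
  uint32_t skipPixels = 0;
};

// Copy the width x height region selected by 's' out of 'src' into a
// tightly packed buffer, so that only the referenced pixels are stored.
std::vector<uint8_t> PackTightly(const uint8_t *src, uint32_t width, uint32_t height,
                                 uint32_t bytesPerPixel, const UnpackState &s)
{
  const size_t rowBytes = size_t(width) * bytesPerPixel;
  std::vector<uint8_t> out(rowBytes * height);
  if(out.empty())
    return out;

  const size_t srcStride = size_t(s.rowLength ? s.rowLength : width) * bytesPerPixel;
  const uint8_t *first = src + s.skipRows * srcStride + size_t(s.skipPixels) * bytesPerPixel;

  for(uint32_t y = 0; y < height; y++)
    memcpy(out.data() + y * rowBytes, first + y * srcStride, rowBytes);

  return out;
}
```

The packed buffer is only interpreted correctly under the identity state.
If the replay left the captured ROW_LENGTH or skip values bound, they would
effectively be applied a second time to data that no longer has any
padding, which is why the state is reset before the SubImage call.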