Notes
======
- Create a (hopefully) cross-backend performance statistics abstraction as part of FetchFrameInfo. This currently collects statistics about constant buffer binds, sampler binds, resource binds, client and server style resource updates (e.g. Map and UpdateSubresource), index and vertex buffer binds, and draws and dispatches. In my captures this covers approximately half of all API traffic. The rest is often shader sets, and then the usual RS, OM, etc. state, none of which is currently tracked. During READING state parsing on the wrapped device context, record statistics about the calls and store them into the current frame info. We inspect objects occasionally to get things like their type for recording. It may be useful to expand this in the future to check bind types, etc. On the GL/Vulkan backends the stats data is never initialized and we display the same statistics data as before.
- Add a new statistics pane in the UI. A variety of shim arbitration/marshalling is provided to get the statistics data across the managed/native boundary. Removed the old statistics menu item. Currently the statistics pane is just a text log of various gathered data, with ASCII art (woo!) histograms. However, in the future we would like to have image-based histograms as well as support for gathering statistics on a mask of stages, etc.
- Remove 'diagnostic' events (e.g. push/pop/marker) from the API call count as they distort the API:draw call ratio.
- Add _First and _Count to ShaderResourceType and ShaderStageType for weak enumeration (weak due to int cast requirement).
- Provide Log2Floor utility functions for 32-bit and 64-bit types. These require a platform-specific clz, which is provided for Windows/VS and Linux/gcc/clang.
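
A minimal sketch of how Log2Floor can sit on top of a per-compiler clz, for illustration only. The overload set, the -1 return for a zero input, and the 32-bit MSVC fallback are assumptions made here, not a description of the actual code in this change.

```cpp
#include <cstdint>

#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Returns floor(log2(value)), i.e. the index of the highest set bit.
// Returning -1 for zero is an assumption of this sketch.
inline int Log2Floor(uint32_t value)
{
  if(value == 0)
    return -1;
#if defined(_MSC_VER)
  unsigned long index = 0;
  _BitScanReverse(&index, value);
  return (int)index;
#else
  // gcc/clang: count leading zeros, undefined for 0 (handled above)
  return 31 - __builtin_clz(value);
#endif
}

inline int Log2Floor(uint64_t value)
{
  if(value == 0)
    return -1;
#if defined(_MSC_VER) && defined(_M_X64)
  unsigned long index = 0;
  _BitScanReverse64(&index, value);
  return (int)index;
#elif defined(_MSC_VER)
  // 32-bit MSVC has no 64-bit scan, so check the high half first
  uint32_t high = (uint32_t)(value >> 32);
  if(high != 0)
    return 32 + Log2Floor(high);
  return Log2Floor((uint32_t)value);
#else
  return 63 - __builtin_clzll(value);
#endif
}
```

Callers need to pass an exactly typed uint32_t or uint64_t, otherwise the overloads can be ambiguous.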
TODO
======
- UI toolkit based histogram, ability to set stages enabled for statistics, etc.
- Revisit necessity of gathering statistics across frames--apparently there is no multi-frame capture in a single log anymore?
- Revisit min/max/etc. gathering on the data--is there some way to improve this to not be so mechanical? Perhaps with field enumeration? However if moving to C++/Qt in future, possibly not a good time investment.
* I don't like the fact that it doesn't "just work" but this is mostly
limited by design decisions on the side of the vulkan loader.
* There is no good way with the loader to say 'please also include this
layer in the enumerated lists'. There is an all-or-nothing override of
layer searching, but that would break any layers the application might
actually rely on.
* On balance I've decided to go with this method, as it's a one-off
interruption for the user (unless someone is constantly switching
between installs).
* Rather than explicitly having vkQueueSubmit as a node with command
buffer children, and those with the contents, we now inline everything
and just add labels at beginning and end.
* Also tweaked the fake pass algorithm slightly to handle labels being
present, and to merge adjacent command buffers that are doing the same
pass a bit more aggressively.
* This will handle the new vulkan binding model with multiple descriptor
sets and arrays of objects in each binding. It also makes a few
tidy-ups and improvements to the presentation of other APIs, e.g. it
will now show thumbnails for vertex and other stages.
* Also the PipelineStateViewer will set its current 'sticky' API type
to the common pipeline state, so that when a log isn't loaded we can
still get API-specific properties that match the last API used.
* For now, we just assume that cbuffers are tightly packed according to
D3D11 rules (matrices, structs, float3/4 are all float4 aligned), and
once final SPIR-V is generated everything should have explicit
offsets, strides, and sizes.
* Next step is to display VS and other stage inputs on the input panel.
* Also need to tidy up the fetching of highest mip/array slice etc to
use the same codepath.
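
A rough sketch of the cbuffer packing assumption described above, for illustration only. The Member struct and AssumeD3D11Offsets are hypothetical names, and this follows the simplification stated above (matrices, structs and float3/4 always start on a float4 boundary) rather than the complete HLSL packing rules. The one extra rule assumed here is that a smaller member never straddles a 16-byte register.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one cbuffer member.
struct Member
{
  uint32_t byteSize;    // size in bytes, assumed to be non-zero
  bool float4Aligned;   // true for matrices, structs, float3 and float4
};

inline uint32_t AlignUp16(uint32_t offset)
{
  return (offset + 15u) & ~15u;
}

// Assigns a byte offset to each member in declaration order.
inline std::vector<uint32_t> AssumeD3D11Offsets(const std::vector<Member> &members)
{
  std::vector<uint32_t> offsets;
  uint32_t cursor = 0;

  for(const Member &m : members)
  {
    uint32_t firstRegister = cursor / 16;
    uint32_t lastRegister = (cursor + m.byteSize - 1) / 16;

    // Start a new float4 register if the type demands it, or if the
    // member would otherwise cross a 16-byte boundary.
    if(m.float4Aligned || firstRegister != lastRegister)
      cursor = AlignUp16(cursor);

    offsets.push_back(cursor);
    cursor += m.byteSize;
  }

  return offsets;
}
```

Once explicit offsets, strides and sizes are available from the SPIR-V, a guess like this would no longer be needed.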
* This seems way more reliable and smaller than shipping a compiled .dll
* For some reason I didn't find this method before (I only knew that
you could ship the loose files which wasn't a good solution).
* Tweaked flycam a bit too, but not much.
* Refactored the API/C# side camera classes to avoid exposing a ton of
stuff just to do relative rotations in the arcball via quaternions.
* The arcball lookat position can also be dragged with alt-click or
middle click.
* Also supports using other elements as the position, not just the
magically-selected "POSITION" element.
* The option will enable monospaced fonts for all data displays, like
the list of events, API calls, etc as well as pipeline displays, entry
of filename/directory in the capture window and many other places.
Pure UI labelling etc mostly still stays as a serif font.
* A few sizes of controls were tweaked (like headers in the pipeline
windows) so that they didn't just barely overflow with the larger
font.
* While looking at this, it became obvious that buffer viewers and
constant buffer viewers should always display in monospaced regardless,
so that has been changed.
* Errors like syntax and runtime errors in python are thrown as
exceptions. So when we invoke onto the renderer thread to do some
work, we need to be able to catch those exceptions, otherwise the whole
program dies. So for the duration of the execute, temporarily switch
the thread into an exception-catching mode; any exception caught is
then rethrown on the invoker's thread.
* Note that BeginInvoke shouldn't be used by python since the callback
might happen after the execution has finished (there's no way to wait
at the moment).
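
The catch-and-rethrow idea for blocking invokes can be sketched as below. This is only an analogy in standard C++, not the actual UI code: BlockInvokeAndRethrow is a hypothetical name, and a short-lived std::thread stands in for the long-running renderer thread.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <thread>

// Runs 'work' on another thread and waits for it. Any exception thrown
// by 'work' is captured there and rethrown on the calling thread.
inline void BlockInvokeAndRethrow(const std::function<void()> &work)
{
  std::exception_ptr captured;

  std::thread worker([&]() {
    try
    {
      work();
    }
    catch(...)
    {
      // Don't let the exception escape the worker thread, which would
      // terminate the program. Keep it for the invoker instead.
      captured = std::current_exception();
    }
  });

  // Blocking wait, so 'captured' is safe to read afterwards.
  worker.join();

  if(captured)
    std::rethrow_exception(captured);
}
```

Because the invoker waits for completion, the exception always has somewhere to land. A fire-and-forget call in the style of BeginInvoke has no such point, which matches the caveat above.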
* The UAV array is provided as we expect it - with UAVs from 0 onwards
even if their 'slots' are 4, 5, 6 etc or whatever. UAVStartSlot is
the slot of the 0th UAV.
* In D3D11 the RTs and UAVs have the same namespace, but GL has separate
framebuffer attachments and images. This change splits up the concepts
so that the D3D11 case is a special case, in a way, and so that we can
handle any generic mix of the two nicely.
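
The relationship between the UAV array and UAVStartSlot described above can be sketched as follows. The helper names are hypothetical and only illustrate the index arithmetic.

```cpp
#include <cstdint>

// Array index 0 corresponds to slot UAVStartSlot, index 1 to the next
// slot, and so on.
inline uint32_t SlotForUAVIndex(uint32_t uavStartSlot, uint32_t arrayIndex)
{
  return uavStartSlot + arrayIndex;
}

// Reverse lookup: returns false if 'slot' is not covered by the array.
inline bool UAVIndexForSlot(uint32_t uavStartSlot, uint32_t numUAVs, uint32_t slot,
                            uint32_t &arrayIndex)
{
  if(slot < uavStartSlot || slot - uavStartSlot >= numUAVs)
    return false;

  arrayIndex = slot - uavStartSlot;
  return true;
}
```

For example, with UAVStartSlot of 4 and three UAVs, array entries 0, 1 and 2 are the UAVs bound at slots 4, 5 and 6.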