Add half.hpp 2.1.0 from http://half.sourceforge.net/

2026-05-04 09:00:44 +00:00 · 2020-07-30 12:04:09 +01:00
parent ab28d102cd
commit d9121a7f90
7 changed files with 4925 additions and 0 deletions
@@ -116,6 +116,10 @@ The following libraries and components are incorporated into RenderDoc, listed h

  Used to intercept nvapi calls on D3D11/D3D12.

+* `half.hpp <http://half.sourceforge.net/>`_ - Copyright (c) 2012-2019 Christian Rau. Distributed under the MIT License.
+
+  Used to emulate half-precision operations in shader debugging.
+
 Thanks
 ------

@@ -0,0 +1,21 @@
+The MIT License
+
+Copyright (c) 2012-2019 Christian Rau
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
@@ -0,0 +1,317 @@
+HALF-PRECISION FLOATING-POINT LIBRARY (Version 2.1.0)
+-----------------------------------------------------
+
+This is a C++ header-only library to provide an IEEE 754 conformant 16-bit 
+half-precision floating-point type along with corresponding arithmetic 
+operators, type conversions and common mathematical functions. It aims for both 
+efficiency and ease of use, trying to accurately mimic the behaviour of the 
+built-in floating-point types at the best performance possible.
+
+
+INSTALLATION AND REQUIREMENTS
+-----------------------------
+
+Conveniently, the library consists of just a single header file containing all 
+the functionality, which can be directly included by your projects, without the 
+neccessity to build anything or link to anything.
+
+Whereas this library is fully C++98-compatible, it can profit from certain 
+C++11 features. Support for those features is checked automatically at compile 
+(or rather preprocessing) time, but can be explicitly enabled or disabled by 
+predefining the corresponding preprocessor symbols to either 1 or 0 yourself 
+before including half.hpp. This is useful when the automatic detection fails 
+(for more exotic implementations) or when a feature should be explicitly 
+disabled:
+
+  - 'long long' integer type for mathematical functions returning 'long long' 
+    results (enabled for VC++ 2003 and icc 11.1 and newer, gcc and clang, 
+    overridable with 'HALF_ENABLE_CPP11_LONG_LONG').
+
+  - Static assertions for extended compile-time checks (enabled for VC++ 2010, 
+    gcc 4.3, clang 2.9, icc 11.1 and newer, overridable with 
+    'HALF_ENABLE_CPP11_STATIC_ASSERT').
+
+  - Generalized constant expressions (enabled for VC++ 2015, gcc 4.6, clang 3.1, 
+    icc 14.0 and newer, overridable with 'HALF_ENABLE_CPP11_CONSTEXPR').
+
+  - noexcept exception specifications (enabled for VC++ 2015, gcc 4.6, 
+    clang 3.0, icc 14.0 and newer, overridable with 'HALF_ENABLE_CPP11_NOEXCEPT').
+
+  - User-defined literals for half-precision literals to work (enabled for 
+    VC++ 2015, gcc 4.7, clang 3.1, icc 15.0 and newer, overridable with 
+    'HALF_ENABLE_CPP11_USER_LITERALS').
+
+  - Thread-local storage for per-thread floating-point exception flags (enabled 
+    for VC++ 2015, gcc 4.8, clang 3.3, icc 15.0 and newer, overridable with 
+    'HALF_ENABLE_CPP11_THREAD_LOCAL').
+
+  - Type traits and template meta-programming features from <type_traits> 
+    (enabled for VC++ 2010, libstdc++ 4.3, libc++ and newer, overridable with 
+    'HALF_ENABLE_CPP11_TYPE_TRAITS').
+
+  - Special integer types from <cstdint> (enabled for VC++ 2010, libstdc++ 4.3, 
+    libc++ and newer, overridable with 'HALF_ENABLE_CPP11_CSTDINT').
+
+  - Certain C++11 single-precision mathematical functions from <cmath> for 
+    floating-point classification during conversions from higher precision types 
+    (enabled for VC++ 2013, libstdc++ 4.3, libc++ and newer, overridable with 
+    'HALF_ENABLE_CPP11_CMATH').
+
+  - Floating-point environment control from <cfenv> for possible exception 
+    propagation to the built-in floating-point platform (enabled for VC++ 2013, 
+    libstdc++ 4.3, libc++ and newer, overridable with 'HALF_ENABLE_CPP11_CFENV').
+
+  - Hash functor 'std::hash' from <functional> (enabled for VC++ 2010, 
+    libstdc++ 4.3, libc++ and newer, overridable with 'HALF_ENABLE_CPP11_HASH').
+
+The library has been tested successfully with Visual C++ 2005-2015, gcc 4-8 
+and clang 3-8 on 32- and 64-bit x86 systems. Please contact me if you have any 
+problems, suggestions or even just success testing it on other platforms.
+
+
+DOCUMENTATION
+-------------
+
+What follows are some general words about the usage of the library and its 
+implementation. For a complete documentation of its interface consult the 
+corresponding website http://half.sourceforge.net. You may also generate the 
+complete developer documentation from the library's only include file's doxygen 
+comments, but this is more relevant to developers rather than mere users.
+
+BASIC USAGE
+
+To make use of the library just include its only header file half.hpp, which 
+defines all half-precision functionality inside the 'half_float' namespace. The 
+actual 16-bit half-precision data type is represented by the 'half' type, which 
+uses the standard IEEE representation with 1 sign bit, 5 exponent bits and 11 
+mantissa bits (including the hidden bit) and supports all types of special 
+values, like subnormal values, infinity and NaNs. This type behaves like the 
+built-in floating-point types as much as possible, supporting the usual 
+arithmetic, comparison and streaming operators, which makes its use pretty 
+straight-forward:
+
+    using half_float::half;
+    half a(3.4), b(5);
+    half c = a * b;
+    c += 3;
+    if(c > a)
+        std::cout << c << std::endl;
+
+Additionally the 'half_float' namespace also defines half-precision versions 
+for all mathematical functions of the C++ standard library, which can be used 
+directly through ADL:
+
+    half a(-3.14159);
+    half s = sin(abs(a));
+    long l = lround(s);
+
+You may also specify explicit half-precision literals, since the library 
+provides a user-defined literal inside the 'half_float::literal' namespace, 
+which you just need to import (assuming support for C++11 user-defined literals):
+
+    using namespace half_float::literal;
+    half x = 1.0_h;
+
+Furthermore the library provides proper specializations for 
+'std::numeric_limits', defining various implementation properties, and 
+'std::hash' for hashing half-precision numbers (assuming support for C++11 
+'std::hash'). Similar to the corresponding preprocessor symbols from <cmath> 
+the library also defines the 'HUGE_VALH' constant and maybe the 'FP_FAST_FMAH' 
+symbol.
+
+CONVERSIONS AND ROUNDING
+
+The half is explicitly constructible/convertible from a single-precision float 
+argument. Thus it is also explicitly constructible/convertible from any type 
+implicitly convertible to float, but constructing it from types like double or 
+int will involve the usual warnings arising when implicitly converting those to 
+float because of the lost precision. On the one hand those warnings are 
+intentional, because converting those types to half neccessarily also reduces 
+precision. But on the other hand they are raised for explicit conversions from 
+those types, when the user knows what he is doing. So if those warnings keep 
+bugging you, then you won't get around first explicitly converting to float 
+before converting to half, or use the 'half_cast' described below. In addition 
+you can also directly assign float values to halfs.
+
+In contrast to the float-to-half conversion, which reduces precision, the 
+conversion from half to float (and thus to any other type implicitly 
+convertible from float) is implicit, because all values represetable with 
+half-precision are also representable with single-precision. This way the 
+half-to-float conversion behaves similar to the builtin float-to-double 
+conversion and all arithmetic expressions involving both half-precision and 
+single-precision arguments will be of single-precision type. This way you can 
+also directly use the mathematical functions of the C++ standard library, 
+though in this case you will invoke the single-precision versions which will 
+also return single-precision values, which is (even if maybe performing the 
+exact same computation, see below) not as conceptually clean when working in a 
+half-precision environment.
+
+The default rounding mode for conversions between half and more precise types 
+as well as for rounding results of arithmetic operations and mathematical 
+functions rounds to the nearest representable value. But by predefining the 
+'HALF_ROUND_STYLE' preprocessor symbol this default can be overridden with one 
+of the other standard rounding modes using their respective constants or the 
+equivalent values of 'std::float_round_style' (it can even be synchronized with 
+the built-in single-precision implementation by defining it to 
+'std::numeric_limits<float>::round_style'):
+
+  - 'std::round_indeterminate' (-1) for the fastest rounding.
+
+  - 'std::round_toward_zero' (0) for rounding toward zero.
+
+  - 'std::round_to_nearest' (1) for rounding to the nearest value (default).
+
+  - 'std::round_toward_infinity' (2) for rounding toward positive infinity.
+
+  - 'std::round_toward_neg_infinity' (3) for rounding toward negative infinity.
+
+In addition to changing the overall default rounding mode one can also use the 
+'half_cast'. This converts between half and any built-in arithmetic type using 
+a configurable rounding mode (or the default rounding mode if none is 
+specified). In addition to a configurable rounding mode, 'half_cast' has 
+another big difference to a mere 'static_cast': Any conversions are performed 
+directly using the given rounding mode, without any intermediate conversion 
+to/from 'float'. This is especially relevant for conversions to integer types, 
+which don't necessarily truncate anymore. But also for conversions from 
+'double' or 'long double' this may produce more precise results than a 
+pre-conversion to 'float' using the single-precision implementation's current 
+rounding mode would.
+
+    half a = half_cast<half>(4.2);
+    half b = half_cast<half,std::numeric_limits<float>::round_style>(4.2f);
+    assert( half_cast<int, std::round_to_nearest>( 0.7_h )     == 1 );
+    assert( half_cast<half,std::round_toward_zero>( 4097 )     == 4096.0_h );
+    assert( half_cast<half,std::round_toward_infinity>( 4097 ) == 4100.0_h );
+    assert( half_cast<half,std::round_toward_infinity>( std::numeric_limits<double>::min() ) > 0.0_h );
+
+ACCURACY AND PERFORMANCE
+
+From version 2.0 onward the library is implemented without employing the 
+underlying floating-point implementation of the system (except for conversions, 
+of course), providing an entirely self-contained half-precision implementation 
+with results independent from the system's existing single- or double-precision 
+implementation and its rounding behaviour.
+
+As to accuracy, many of the operators and functions provided by this library 
+are exact to rounding for all rounding modes, i.e. the error to the exact 
+result is at most 0.5 ULP (unit in the last place) for rounding to nearest and 
+less than 1 ULP for all other rounding modes. This holds for all the operations 
+required by the IEEE 754 standard and many more. Specifically the following 
+functions might exhibit a deviation from the correctly rounded exact result by 
+1 ULP for a select few input values: 'expm1', 'log1p', 'pow', 'atan2', 'erf', 
+'erfc', 'lgamma', 'tgamma' (for more details see the documentation of the 
+individual functions). All other functions and operators are always exact to 
+rounding or independent of the rounding mode altogether.
+
+The increased IEEE-conformance and cleanliness of this implementation comes 
+with a certain performance cost compared to doing computations and mathematical 
+functions in hardware-accelerated single-precision. On average and depending on 
+the platform, the arithemtic operators are about 75% as fast and the 
+mathematical functions about 33-50% as fast as performing the corresponding 
+operations in single-precision and converting between the inputs and outputs. 
+However, directly computing with half-precision values is a rather rare 
+use-case and usually using actual 'float' values for all computations and 
+temproraries and using 'half's only for storage is the recommended way. But 
+nevertheless the goal of this library was to provide a complete and 
+conceptually clean IEEE-confromant half-precision implementation and in the few 
+cases when you do need to compute directly in half-precision you do so for a 
+reason and want accurate results.
+
+If necessary, this internal implementation can be overridden by predefining the 
+'HALF_ARITHMETIC_TYPE' preprocessor symbol to one of the built-in 
+floating-point types ('float', 'double' or 'long double'), which will cause the 
+library to use this type for computing arithmetic operations and mathematical 
+functions (if available). However, due to using the platform's floating-point 
+implementation (and its rounding behaviour) internally, this might cause 
+results to deviate from the specified half-precision rounding mode. It will of 
+course also inhibit the automatic exception detection described below.
+
+The conversion operations between half-precision and single-precision types can 
+also make use of the F16C extension for x86 processors by using the 
+corresponding compiler intrinsics from <immintrin.h>. Support for this is 
+checked at compile-time by looking for the '__F16C__' macro which at least gcc 
+and clang define based on the target platform. It can also be enabled manually 
+by predefining the 'HALF_ENABLE_F16C_INTRINSICS' preprocessor symbol to 1, or 0 
+for explicitly disabling it. However, this will directly use the corresponding 
+intrinsics for conversion without checking if they are available at runtime 
+(possibly crashing if they are not), so make sure they are supported on the 
+target platform before enabling this.
+
+EXCEPTION HANDLING
+
+The half-precision implementation supports all 5 required floating-point 
+exceptions from the IEEE standard to indicate erroneous inputs or inexact 
+results during operations. These are represented by exception flags which 
+actually use the same values as the corresponding 'FE_...' flags defined in 
+C++11's <cfenv> header if supported, specifically:
+
+  - 'FE_INVALID' for invalid inputs to an operation.
+  - 'FE_DIVBYZERO' for finite inputs producing infinite results.
+  - 'FE_OVERFLOW' if a result is too large to represent finitely.
+  - 'FE_UNDERFLOW' for a subnormal or zero result after rounding.
+  - 'FE_INEXACT' if a result needed rounding to be representable.
+  - 'FE_ALL_EXCEPT' as a convenient OR of all possible exception flags.
+
+The internal exception flag state will start with all flags cleared and is 
+maintained per thread if C++11 thread-local storage is supported, otherwise it 
+will be maintained globally and will theoretically NOT be thread-safe (while 
+practically being as thread-safe as a simple integer variable can be). These 
+flags can be managed explicitly using the library's error handling functions, 
+which again try to mimic the built-in functions for handling floating-point 
+exceptions from <cfenv>. You can clear them with 'feclearexcept' (which is the 
+only way a flag can be cleared), test them with 'fetestexcept', explicitly 
+raise errors with 'feraiseexcept' and save and restore their state using 
+'fegetexceptflag' and 'fesetexceptflag'. You can also throw corresponding C++ 
+exceptions based on the current flag state using 'fethrowexcept'.
+
+However, any automatic exception detection and handling during half-precision 
+operations and functions is DISABLED by default, since it comes with a minor 
+performance overhead due to runtime checks, and reacting to IEEE floating-point 
+exceptions is rarely ever needed in application code. But the library fully 
+supports IEEE-conformant detection of floating-point exceptions and various 
+ways for handling them, which can be enabled by pre-defining the corresponding 
+preprocessor symbols to 1. They can be enabled individually or all at once and 
+they will be processed in the order they are listed here:
+
+  - 'HALF_ERRHANDLING_FLAGS' sets the internal exception flags described above 
+    whenever the corresponding exception occurs.
+  - 'HALF_ERRHANDLING_ERRNO' sets the value of 'errno' from <cerrno> similar to 
+    the behaviour of the built-in floating-point types when 'MATH_ERRNO' is used.
+  - 'HALF_ERRHANDLING_FENV' will propagate exceptions to the built-in 
+    floating-point implementation using 'std::feraiseexcept' if support for 
+    C++11 floating-point control is enabled. However, this does not synchronize 
+    exceptions: neither will clearing  propagate nor will it work in reverse.
+  - 'HALF_ERRHANDLING_THROW_...' can be defined to a string literal which will 
+    be used as description message for a C++ exception that is thrown whenever 
+    a 'FE_...' exception occurs, similar to the behaviour of 'fethrowexcept'.
+
+If any of the above error handling is activated, non-quiet operations on 
+half-precision values will also raise a 'FE_INVALID' exception whenever 
+they encounter a signaling NaN value, in addition to transforming the value 
+into a quiet NaN. If error handling is disabled, signaling NaNs will be 
+treated like quiet NaNs (while still getting explicitly quieted if propagated 
+to the result). There can also be additional treatment of overflow and 
+underflow errors after they have been processed as above, which is ENABLED by 
+default (but of course only takes effect if any other exception handling is 
+activated) unless overridden by pre-defining the corresponding preprocessor 
+symbol to 0:
+
+  - 'HALF_ERRHANDLING_OVERFLOW_TO_INEXACT' will cause overflow errors to also 
+    raise a 'FE_INEXACT' exception.
+  - 'HALF_ERRHANDLING_UNDERFLOW_TO_INEXACT' will cause underflow errors to also 
+    raise a 'FE_INEXACT' exception. This will also slightly change the 
+    behaviour of the underflow exception, which will ONLY be raised if the 
+    result is actually inexact due to underflow. If this is disabled, underflow 
+    exceptions will be raised for ANY (possibly exact) subnormal result.
+
+
+CREDITS AND CONTACT
+-------------------
+
+This library is developed by CHRISTIAN RAU and released under the MIT License 
+(see LICENSE.txt). If you have any questions or problems with it, feel free to 
+contact me at rauy@users.sourceforge.net.
+
+Additional credit goes to JEROEN VAN DER ZIJP for his paper on "Fast Half Float 
+Conversions", whose algorithms have been used in the library for converting 
+between half-precision and single-precision values.
@@ -122,6 +122,7 @@
    <ClInclude Include="3rdparty\compressonator\CMP_Core.h" />
    <ClInclude Include="3rdparty\compressonator\cmp_math_vec4.h" />
    <ClInclude Include="3rdparty\compressonator\Common_Def.h" />
+    <ClInclude Include="3rdparty\half\half.hpp" />
    <ClInclude Include="3rdparty\interceptor-lib\lib\AArch64\target_aarch64.h" />
    <ClInclude Include="3rdparty\interceptor-lib\lib\ARM\target_arm.h" />
    <ClInclude Include="3rdparty\interceptor-lib\lib\code_generator.h" />
@@ -142,6 +142,9 @@
    <Filter Include="3rdparty\compressonator">
      <UniqueIdentifier>{3d212b9b-bf42-40db-84a3-b94e064fd281}</UniqueIdentifier>
    </Filter>
+    <Filter Include="3rdparty\half">
+      <UniqueIdentifier>{1b49d2aa-1cfb-4f0a-8fcf-1065af55ea2b}</UniqueIdentifier>
+    </Filter>
  </ItemGroup>
  <ItemGroup>
    <ClInclude Include="maths\camera.h">
@@ -543,6 +546,9 @@
    <ClInclude Include="3rdparty\compressonator\Common_Def.h">
      <Filter>3rdparty\compressonator</Filter>
    </ClInclude>
+    <ClInclude Include="3rdparty\half\half.hpp">
+      <Filter>3rdparty\half</Filter>
+    </ClInclude>
    <ClInclude Include="api\replay\rdcflatmap.h">
      <Filter>API\Replay</Filter>
    </ClInclude>
@@ -38,5 +38,6 @@ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLI
 {\field{\*\fldinst{HYPERLINK "{\pntext\f1\'B7\tab}https://github.com/Microsoft/microsoft-pdb/"}}{\fldrslt{\ul\cf1 https://github.com/Microsoft/microsoft-pdb}}}\f0\fs22\line Microsoft PDB Information distributed under the MIT License. Copyright (c) 2015 Microsoft Corporation.\par
 {\field{\*\fldinst{HYPERLINK "{\pntext\f1\'B7\tab}https://github.com/GPUOpen-Tools/compressonator/"}}{\fldrslt{\ul\cf1 https://github.com/GPUOpen-Tools/compressonator}}}\f0\fs22\line Compressonator distributed under the MIT License. Copyright (c) 2018 Advanced Micro Devices, Inc. Copyright (c) 2004-2006 ATI Technologies Inc. Distributed under the MIT License.\par
 {\field{\*\fldinst{HYPERLINK "{\pntext\f1\'B7\tab}http://download.nvidia.com/XFree86/nvapi-open-source-sdk/"}}{\fldrslt{\ul\cf1 http://download.nvidia.com/XFree86/nvapi-open-source-sdk}}}\f0\fs22\line nvapi open source SDK distributed under the MIT License. Copyright (c) 2019, NVIDIA CORPORATION.\par
+{\field{\*\fldinst{HYPERLINK "{\pntext\f1\'B7\tab}http://half.sourceforge.net/"}}{\fldrslt{\ul\cf1 http://half.sourceforge.net/}}}\f0\fs22\line half.hpp distributed under the MIT License. Copyright (c) 2012-2019 Christian Rau.\par
 }