## f32, u32, and const

Some time ago, I wrote “floats, bits, and constant expressions” about converting floating point number into its representative ones and zeros as a C++ constant expression – constructing the IEEE 754 representation without being able to examine the bits directly.

I’ve been playing around with Rust recently, and rewrote that conversion code as a bit of a learning exercise for myself, with a thoroughly contrived set of constraints: using integer and single-precision floating point math, at compile time, without unsafe blocks, while using as few unstable features as possible.

I’ve included the listing below, for your bemusement and/or head-shaking, and you can play with the code in the Rust Playground and rust.godbolt.org

Continue reading “f32, u32, and const”

## Modern C++ Randomness

So I did a little digging to satisfy my own curiosity about the “modern C++” version, and have learned a few things that I didn’t know previously…

(this is a manual unrolled twitter thread that starts here, with slight modifications)

Nearly all of this I gleaned from the invaluable and . Comments about implementation refer specifically to the gcc-8.1 C++ standard library, examined using Compiler Explorer and the -E command line option.

std::random_device is a platform-specific source of entropy.

std: mt19937 is a parameterized typedef of std::mersenne_twister_engine

specifically:
std::mersenne_twister_engine<uint_fast32_t, 32, 624, 397, 31, 0x9908b0df, 11, 0xffffffff, 7, 0x9d2c5680, 15, 0xefc60000, 18, 1812433253>
(What do those number mean? I don’t know.)

And std::uniform_int_distribution produces uniformly distributed random numbers over a specified range, from a provided generator.

The default constructor for std::random_device takes an implementation-defined argument, with a default value.

The meaning of the argument is implementation-defined – but the type is not: std::string. (I’m not sure why a dynamically modifiable string object was the right choice to be the configuration parameter for an entropy generator.)

There are out-of-line private functions for much of this implementation of std::random_device. The constructor that calls the out-of-line init function is itself inline – so the construction and destruction of the default std::string param is also generated inline.

Also, peeking inside std::random_generator, there is a union with two members:

void* _M_file, which I guess would be used to store a file handle for /dev/urandom or similar.

std::mt19937 _M_mt, which is a … parameterized std::mersenne_twister_engine object.

So it seems reasonable to me that if you can’t get entropy* from outside your program, generate your own approximation. It looks like it is possible that the entropy for the std::mersenne_twister_engine will be provided by a std::mersenne_twister_engine.

Unlike std::random_device, which has its implementation out of line, std::mersenne_twister_engine‘s implementation seems to be all inline. It is unclear what benefits this brings, but it results in a few hundred additional instructions generated.

And then there’s std::uniform_int_distribution, which seems mostly unsurprising. It is again fully inline, which (from a cursory eyeballing) may allow a sufficiently insightful compiler to avoid a couple of branches and function calls.

The code that got me started on this was presented in jest – but (std::random_device + std::mt19937 + std::uniform_int_distribution) is a commonly recommended pattern for generating random numbers using these modern C++ library features.

My takeaways:
std::random_device is potentially very expensive to use – and doesn’t provide strong cross-platform guarantees about the randomness it provides. It is configured with an std::string – the meaning of which is platform dependent. I am not compelled to use this type.

std::mt19937 adds a sizeable chunk of codegen via its inline implementation – and there are better options than Mersenne Twister.

Bottom line: I’m probably going to stick with rand(), and if I need something a little fancier,  or one of the other suggestions provided as replies to the twitter thread.

## Watch as the OS rewrites my buggy program.

I didn’t know that SetErrorMode(SEM_NOALIGNMENTFAULTEXCEPT) was a thing, until I wrote a bad test that wouldn’t crash.

Digging into it, I found that a movaps instruction was being rewritten as movups, which was a thoroughly confusing thing to see.

The one clue I had was that a fault due to an unaligned load had been observed in non-test code, but did not reproduce when written as a test using the google-test framework. A short hunt later (including a failed attempt at writing a small repro case), I found an explanation: google test suppresses this class of failure.

The code below will successfully demonstrate the behavior, printing out the SIMD load instruction before and after calling the function with an unaligned pointer.

[Gist]

## A little bit of floating point in a memory allocator — Part 2: The floating point

[Previously]

This post contains the same material as this thread of tweets, with a few minor edits.

In IEEE754, floating point numbers are represented like this:

±2ⁿⁿⁿ×1.sss…

nnn is the exponent, which is floor(log2(size)) — which happens to be the fl value computed by TLSF.

sss… is the significand fraction: the part that follows the decimal point, which happens to be sl.

And so to calculate fl and sl, all we need to do is convert size to a floating point value (on recent x86 hardware, that’s a single instruction). Then we can extract the exponent, and the upper bits of the fractional part, and we’re all done :D

That can be implemented like this:

```double sf = (int64_t)size;
uint64_t sfi;
memcpy(&sfi, &sf, 8);
fl = (sfi >> 52) - (1023 + 7);
sl = (sfi >> 47) & 31;
```

There’s some subtleties (there always is). I’ll break it down…

```double sf = (int64_t)size;
```

Convert size to a double, with an explicit cast. size has type size_t, but using TLSF from github.com/mattconte/tlsf, the largest supported allocation on 64bit architecture is 2^32 bytes – comfortably less than the precision provided by the double type. If you need your TLSF allocator to allocate chunks bigger than 2^53, this isn’t the technique for you :)

I first tried using float (not double), which can provide correct results — but only if the rounding mode happens to be set correctly. double is easier.

The cast to (int64_t) results in better codegen on x86: without it, the compiler will generate a full 64bit unsigned conversion, and there is no single instruction for that.

The cast tells the compiler to (in effect) consider the bits of size as if they were a two’s complement signed value — and there is an SSE instruction to handle that case (cvtsi2sdq or similar). Again, with the implementation we’re using size can’t be that big, so this will do the Right Thing.

```uint64_t sfi;
memcpy(&sfi, &sf, 8);
```

Copy the 8 bytes of the double into an unsigned integer variable. There are a lot of ways that C/C++ programmers copy bits from floating point to integer – some of them are well defined :) memcpy() does what we want, and any moderately respectable compiler knows how to select decent instructions to implement it.

Now we have floating point bits in an integer register, consisting of one sign bit (always zero for this, because size is always positive), eleven exponent bits (offset by 1023), and 52 bits of significant fraction. All we need to do is extract those, and we’re done :)

```fl = (sfi >> 52) - (1023 + 7);
```

Extract the exponent: shift it down (ignoring the always-zero sign bit), subtract the offset (1023), and that 7 we saw earlier, at the same time.

```sl = (sfi >> 47) & 31;
```

Extract the five most significant bits of the fraction – we do need to mask out the exponent.

And, just like that*, we have mapping_insert(), implemented in terms of integer -> floating point conversion.

* Actual code (rather than fragments) may be included in a later post…

## A little bit of floating point in a memory allocator — Part 1: Background

This post contains the same material as this thread of tweets, with a few minor edits.

Over my holiday break at the end of 2017, I took a look into the TLSF (Two Level Segregated Fit) memory allocator to better understand how it works. I’ve made use of this allocator and have been impressed by its real world performance, but never really done a deep dive to properly understand it.

The mapping_insert() function is a key part of the allocator implementation, and caught my eye. Here’s how that function is described in the paper A constant-time dynamic storage allocator for real-time systems:

I’ll be honest: from that description, I never developed a clear picture in my mind of what that function does.

(Reading it now, it seems reasonably clear – but I can say that only after I spent quite a bit of time using other methods to develop my understanding)

Something that helped me a lot was by looking at the implementation of that function from github.com/mattconte/tlsf/.  There’s a bunch of long-named macro constants in there, and a few extra implementation details. If you collapse those it looks something like this:

```void mapping_insert(size_t size, int* fli, int* sli)
{
int fl, sl;
if (size < 256)
{
fl = 0;
sl = (int)size / 8;
}
else
{
fl = fls(size);
sl = (int)(size >> (fl - 5)) ^ 0x20;
fl -= 7;
}
*fli = fl;
*sli = sl;
}
```

It’s a pretty simple function (it really is). But I still failed to *see* the pattern of results that would be produced in my mind’s eye.

I went so far as to make a giant spreadsheet of all the intermediate values for a range of inputs, to paint myself a picture of the effect of each step :) That helped immensely.

Breaking it down…

There are two cases handled in the function: one for when size is below a certain threshold, and on for when it is larger. The first is straightforward, and accounts for a small number of possible input values. The large size case is more interesting.

The function computes two values: fl and sl, the first and second level indices for a lookup table. For the large case, fl (where fl is “first level”) is computed via fls(size) (where fls is short for “find last set” – similar names, just to keep you on your toes).

fls() returns the index of the largest bit set, counting from the least significant slbit, which is the index of the largest power of two. In the words of the paper:

“the instruction fls can be used to compute the ⌊log2(x)⌋ function”

Which is, in C-like syntax: floor(log2(x))

And there’s that “fl -= 7” at the end. That will show up again later.

For the large case, the computation of sl has a few steps:

sl = (size >> (fl – 5)) ^ 0x20;

Depending on shift down size by some amount (based on fl), and mask out the sixth bit?

(Aside: The CellBE programmer in me is flinching at that variable shift)

It took me a while (longer than I would have liked…) to realize that this
size >> (fl – 5) is shifting size to generate a number that has exactly six significant bits, at the least significant end of the register (bits 5 thru 0).

Because fl is the index of the most significant bit, after this shift, bit 5 will always be 1 – and that “^ 0x20” will unset it, leaving the result as a value between 0 and 31 (inclusive).

So here’s where floating point comes into it, and the cute thing I saw: another way to compute fl and sl is to convert size into an IEEE754 floating point number, and extract the exponent, and most significant bits of the mantissa. I’ll cover that in the next part, here.

## Aside: Over-engineered Min() [C++, variadic templates, constexpr, fold left]

Q: Given a function `constexpr int Min(int a, int b)`, construct a function `constexpr int Min(Args... args)` that returns the minimum of all the provided args. Fail to justify your over-engineering.

A: Rename `Min(int, int)` as `MinImpl(int, int)` or stick it in a namespace. Overloading the function is not only unnecessary, it gets in the way of the implementation.

```constexpr int MinImpl(int a, int b)
{
return a < b ? a : b;
}```

Implement a `constexpr` fold left function. If we can use it for `Min()`, we should be able to do the same for `Max()`, and other similar functions. Should we be able to find any (#prematuregeneralization).

```template<typename ArgA, typename ArgB, typename Func>
constexpr auto foldl(Func func, ArgA a, ArgB b)
{
return func(a, b);
}

template<typename ArgA, typename ArgB,
typename Func, typename ...Args>
constexpr auto foldl(Func func, ArgA a, ArgB b, Args... args)
{
return foldl(func, func(a, b), args...);
}```

Combine the two.

```template<typename ...Args>
constexpr auto Min(Args... args)
{
return foldl(MinImpl, args...);
}```

Add the bare minimum amount of testing for a constexpr function: slap a `static_assert()` on it.

`static_assert(Min(6, 4, 5, 3, 9) == 3), "Nope");`

I did so with Visual Studio 2015 Update 2. It did not object.

Addendum: Some discussion with @nlguillemot and @DrPizza led to this attempt to do something similar with a C++17/C++1z fold-expression:

```#include <limits.h>

constexpr int MinImpl1(int a, int b) { return a < b ? a : b; }

constexpr void MinImpl2(int* m, int a, int b) { *m = a < b ? a : b; }

template<typename ...Args>
constexpr int Min(Args... args)
{
int m = INT_MAX;
// a binary expression in an operand of a fold-expression
// is not allowed, so this won't compile:
//((m = MinImpl1(m, args), ...);

// But this does:
(MinImpl2(/*out*/&m, m, args), ...);
return m;
}

int main()
{
static_assert(Min(3,4,5) == 3, "nope");
}```

This compiles with a gcc-6 pre-release snapshot.

Update: Here’s a further updated version, based on a refinement by @dotstdy.

## floats, bits, and constant expressions

Can you access the bits that represent an IEEE754 single precision float in a C++14 constant expression (constexpr)?

(Why would you want to do that? Maybe you want to run a fast inverse square root at compile time. Or maybe you want to do something that is actually useful. I wanted to know if it could be done.)

For context: this article is based on experiences using gcc-5.3.0 and clang-3.7.1 with -std=c++14 -march=native on a Sandy Bridge Intel i7. Where I reference sections from the C++ standard, I’m referring to the November 2014 draft.

Before going further, I’ll quote 5.20.6 from the standard:

Since this International Standard imposes no restrictions on the accuracy of floating-point operations, it is unspecified whether the evaluation of a floating-point expression during translation yields the same result as the evaluation of the same expression (or the same operations on the same values) during program execution.88

88) Nonetheless, implementations are encouraged to provide consistent results, irrespective of whether the evaluation was performed during translation and/or during program execution.

In this post, I document things that worked (and didn’t work) for me. You may have a different experience.

### Methods of conversion that won’t work

(Error text from g++-5.3.0)

You can’t access the bits of a float via a typecast pointer [which is undefined behavior, and covered by 5.20.2.5]:

```  constexpr uint32_t bits_cast(float f)
{
return *(uint32_t*)&f; // [2]
}
```
```error:
accessing value of 'f' through a 'uint32_t {aka unsigned int}' glvalue
in a constant expression```

You can’t convert it via a reinterpret cast [5.20.2.13]

```constexpr uint32_t bits_reinterpret_cast(float f)
{
const unsigned char* cf = reinterpret_cast<const unsigned char*>(&f);
// endianness notwithstanding
return (cf[3] << 24) | (cf[2] << 16) | (cf[1] << 8) | cf[0];
}
```
```error:
'*(cf + 3u)' is not a constant expression
```

(gcc reports an error with the memory access, but does not object to the `reinterpret_cast`. clang produces a specific error for the cast.)

You can’t convert it through a union [gcc, for example, permits this for non-constant expressions, but the standard forbids it in 5.20.2.8]:

```constexpr uint32_t bits_union(float f)
{
union Convert {
uint32_t u;
float f;
constexpr Convert(float f_) : f(f_) {}
};
return Convert(f).u;
}
```
```error:
initialized 'bits_union(float)::Convert::f' member
in constant expression
```

You can’t use `memcpy()` [5.20.2.2]:

```constexpr uint32_t bits_memcpy(float f)
{
uint32_t u = 0;
memcpy(&u, &f, sizeof f);
return u;
}
```
```error:
'memcpy(((void*)(&u)), ((const void*)(&f)), 4ul)'
is not a constant expression
```

And you can’t define a constexpr `memcpy()`-like function that is capable of the task [5.20.2.11]:

```constexpr void* memcpy(void* dest, const void* src, size_t n)
{
char* d = (char*)dest;
const char* s = (const char*)src;
while(n-- > 0)
*d++ = *s++;
return dest;
}

constexpr uint32_t bits_memcpy(float f)
{
uint32_t u = 0;
memcpy(&u, &f, sizeof f);
return u;
}
```
```error:
accessing value of 'u' through a 'char' glvalue
in a constant expression
```

So what can you do?

### Floating point operations in constant expressions

For `constexpr float f = 2.0f, g = 2.0f` the following operations are available [as they are not ruled out by anything I can see in 5.20]:

• Comparison of floating point values e.g.
`static_assert(f == g, "not equal");`
• Floating point arithmetic operations e.g.
`static_assert(f * 2.0f == 4.0f, "arithmetic failed");`
• Casts from float to integral value, often with well-defined semantics e.g.
`constexpr int i = (int)2.0f; static_assert(i == 2, "conversion failed");`

So I wrote a function (`uint32_t bits(float)`) that will return the binary representation of an IEEE754 single precision float. The full function is at the end of this post. I’ll go through the various steps required to produce (my best approximation of) the desired result.

### 1. Zero

When `bits()` is passed the value zero, we want this behavior:

`static_assert(bits(0.0f) == 0x00000000);`

And we can have it:

```  if (f == 0.0f)
return 0;
```

### 2. Negative zero

In IEEE754 land, negative zero is a thing. Ideally, we’d like this behavior:

`static_assert(bits(-0.0f) == 0x80000000)`

But the check for zero also matches negative zero. Negative zero is not something that the C++ standard has anything to say about, given that IEEE754 is an implementation choice [3.9.1.8: “The value representation of floating-point types is implementation defined”]. My compilers treat negative zero the same as zero for all comparisons and arithmetic operations. As such, `bits()` returns the wrong value when considering negative zero, returning `0x00000000` rather than the desired `0x80000000`.

I did look into other methods for detecting negative zero in C++, without finding something that would work in a constant expression. I have seen divide by zero used as a way to detect negative zero (resulting in ±infinity, depending on the sign of the zero), but that doesn’t compile in a constant expression:

`constexpr float r = 1.0f / -0.0f;`
```error: '(1.0e+0f / -0.0f)' is not a constant expression
```

and divide by zero is explicitly named as undefined behavior in 5.6.4, and so by 5.20.2.5 is unusable in a constant expression.

Result: negative zero becomes positive zero.

### 3. Infinity

We want this:

`static_assert(bits(INFINITY) == 0x7f800000);`

And this:

```  else if (f == INFINITY)
return 0x7f800000;
```

works as expected.

### 4. Negative Infinity

Same idea, different sign:

`static_assert(bits(-INFINITY) == 0xff800000);`
```  else if (f == -INFINITY)
return 0xff800000;
```

Also works.

### 5. NaNs

There’s no way to generate arbitrary NaN constants in a constant expression that I can see (not least because casting bits to floats isn’t possible in a constant expression, either), so it seems impossible to get this right in general.

In practice, maybe this is good enough:

`static_assert(bits(NAN) == 0x7fc00000);`

NaN values can be anywhere in the range of `0x7f800001 -- 0x7fffffff` and `0xff800001 -- 0xffffffff`. I have no idea as to the specific values that are seen in practice, nor what they mean. `0x7fc00000` shows up in `/usr/include/bits/nan.h` on the system I’m using to write this, so — right or wrong — I’ve chosen that as the reference value.

It is possible to detect a NaN value in a constant expression, but not its payload. (At least that I’ve been able to find). So there’s this:

```  else if (f != f) // NaN
return 0x7fc00000; // This is my NaN...
```

Which means that of the 2*(223-1) possible NaNs, one will be handled correctly (in this case, `0x7fc00000`). For the other 16,777,213 values, the wrong value will be returned (in this case, `0x7fc00000`).

So… partial success? NaNs are correctly detected, but the bits for only one NaN value will be returned correctly.

(On the other hand, the probability that it will ever matter could be stored as a denormalized float)

### 6. Normalized Values

```// pseudo-code
static_assert(bits({  0x1p-126f, ...,  0x1.ffff7p127})
== { 0x00800000, ...,  0x7f7fffff});
static_assert(bits({ -0x1p-126f, ..., -0x1.ffff7p127})
== { 0x80800000, ...,  0xff7fffff});
```

[That `0x1pnnnf` format happens to be a convenient way to represent exact values that can be stored as binary floating point numbers]

It is possible to detect and correctly construct bits for every normalized value. It does requires a little care to avoid truncation and undefined behavior. I wrote a few different implementations — the one that I describe here requires relatively little code, and doesn’t perform terribly [0].

The first step is to find and clear the sign bit. This simplifies subsequent steps.

``` bool sign = f < 0.0f;
float abs_f = sign ? -f : f;
```

Now we have `abs_f` — it’s positive, non-zero, non-infinite, and not a NaN.

What happens when a float is cast to an integral type?

```uint64_t i = (uint64_t)f;
```

The value of `f` will be stored in `i`, according to the following rules:

• The value will be rounded towards zero which, for positive values, means truncation of any fractional part.
• If the value in `f` is too large to be represented as a `uint64_t` (i.e. `f` > 264-1) the result is undefined.

If truncation takes place, data is lost. If the number is too large, the result is (probably) meaningless.

For our conversion function, if we can scale `abs_f` into a range where it is not larger than (264-1), and it has no fractional part, we have access to an exact representation of the bits that make up the float. We just need to keep track of the amount of scaling being done.

Single precision IEEE 754 floating point numbers have, at most, (23+1) bits of precision (23 in the significand, 1 implicit). This means that we can scale down large numbers and scale up small numbers into the required range.

Multiplying by powers of two change only the exponent of the float, and leave the significand unmodified. As such, we can arbitrarily scale a float by a power of two and — so long as we don’t over- or under-flow the float — we will not lose any of the bits in the significand.

For the sake of simplicity (believe it or not [1]), my approach is to scale `abs_f` in steps of 241 so that (`abs_f` ≥ 287) like so:

```  int exponent = 254;

while(abs_f < 0x1p87f)
{
abs_f *= 0x1p41f;
exponent -= 41;
}
```

If `abs_f` ≥ 287, the least significant bit of `abs_f`, if set, is 2(87-23)==264.

Next, `abs_f` is scaled back down by 264 (which adds no fractional part as the least significant bit is 264) and converted to an unsigned 64 bit integer.

```  uint64_t a = (uint64_t)(abs_f * 0x1p-64f);
```

All of the bits of `abs_f` are now present in `a`, without overflow or truncation. All that is needed now is to determine where they are:

```  int lz = count_leading_zeroes(a);
```

```  exponent -= lz;
```

and construct the result:

```  uint32_t significand = (a << (lz + 1)) >> (64 - 23); // [3]
return (sign << 31) | (exponent << 23) | significand;
```

With this, we have correct results for every normalized float.

### 7. Denormalized Values

```// pseudo-code
static_assert(bits({  0x1.0p-149f, ...,  0x1.ffff7p-127f})
== {  0x00000001,  ...,  0x007fffff});
static_assert(bits({ -0x1.0p-149f, ..., -0x1.ffff7p-127f})
== {  0x80000001,  ...,  0x807fffff});
```

The final detail is denormalized values. Handling of normalized values as presented so far fails because denormals will have additional leading zeroes. They are fairly easy to account for:

```   if (exponent <= 0)
{
exponent = 0;
lz = 8 - 1;
}
```

To attempt to demystify that `lz = 8 - 1` a little: there are 8 leading bits that aren’t part of the significand of a denormalized single precision float after the repeated 2-41 scaling that has taken place. There is also no leading 1 bit that is present in all normalized numbers (which is accounted for in the calculation of `significand` above as `(lz + 1)`). So the leading zero count (`lz`) is set to account for the 8 bits of offset to the start of the denormalized significand, minus the one that the subsequent calculation assumes it needs to skip over.

And that’s it. All the possible values of a float are accounted for.

(Side note: If you’re compiling with -ffast-math, passing denormalized numbers to `bits()` will return invalid results. That’s -ffast-math for you. With gcc or clang, you could add an `#ifdef __FAST_MATH__` around the test for negative exponent.)

### Conclusion

You can indeed obtain the bit representation of a floating point number at compile time. Mostly. Negative zero is wrong, NaNs are detected but otherwise not accurately converted.

The whole deal:

```// Based on code from
// https://graphics.stanford.edu/~seander/bithacks.html
{
constexpr char bit_position[64] = {
0,  1,  2,  7,  3, 13,  8, 19,  4, 25, 14, 28,  9, 34, 20, 40,
5, 17, 26, 38, 15, 46, 29, 48, 10, 31, 35, 54, 21, 50, 41, 57,
63,  6, 12, 18, 24, 27, 33, 39, 16, 37, 45, 47, 30, 53, 49, 56,
62, 11, 23, 32, 36, 44, 52, 55, 61, 22, 43, 51, 60, 42, 59, 58 };

v |= v >> 1; // first round down to one less than a power of 2
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
v |= v >> 32;
v = (v >> 1) + 1;

return 63 - bit_position[(v * 0x0218a392cd3d5dbf)>>58]; // [3]
}

constexpr uint32_t bits(float f)
{
if (f == 0.0f)
return 0; // also matches -0.0f and gives wrong result
else if (f == INFINITY)
return 0x7f800000;
else if (f == -INFINITY)
return 0xff800000;
else if (f != f) // NaN
return 0x7fc00000; // This is my NaN...

bool sign = f < 0.0f;
float abs_f = sign ? -f : f;

int exponent = 254;

while(abs_f < 0x1p87f)
{
abs_f *= 0x1p41f;
exponent -= 41;
}

uint64_t a = (uint64_t)(abs_f * 0x1p-64f);
exponent -= lz;

if (exponent <= 0)
{
exponent = 0;
lz = 8 - 1;
}

uint32_t significand = (a << (lz + 1)) >> (64 - 23); // [3]
return (sign << 31) | (exponent << 23) | significand;
}
```

[0] Why does runtime performance matter? Because that’s how I tested the conversion function while implementing it. I was applying Bruce Dawson’s advice for testing floats and the quicker I found out that I’d broken the conversion the better. For the implementation described in this post, it takes about 97 seconds to test all four billion float values on my laptop — half that time if I wasn’t testing negative numbers (which are unlikely to cause problems due to the way I handle the sign bit). The implementation I’ve described in this post is not the fastest solution to the problem, but it is relatively compact, and well behaved in the face of `-ffast-math`.

Admission buried in a footnote: I have not validated correct behavior of this code for every floating point number in actual compile-time constant expressions. Compile-time evaluation of four billion invocations of `bits()` takes more time than I’ve been willing to invest so far.

[1] It is conceptually simpler to multiply `abs_f` by two (or one half) until the result is exactly positioned so that no leading zero count is required after the cast — at least, that was what I did in my first attempt. The approach described here was found to be significantly faster. I have no doubt that better-performing constant-expression-friendly approaches exist.

[2] Update 2016-03-28: Thanks to satbyy for pointing out the missing ampersand — it was lost sometime after copying the code into the article.

[3] Update 2016-03-28: Thanks to louiswins for pointing out additional code errors.

## Another another C++11 ‘countof’

My earlier post received this comment which is a pretty neat little improvement over the one from g-truc.net.

Here it is, with one further tweak:

```template<typename T, std::size_t N>
constexpr
std::integral_constant<std::size_t, N> countof(T const (&)[N]) noexcept
{
return {};
}

#define COUNTOF(...) decltype(countof(__VA_ARGS__))::value```

The change I’ve made to pfultz2’s version is to use `::value` rather than `{}` after `decltype` in the macro.

This makes the type of the result `std::size_t` not `std::integral_constant`, so it can be used in va_arg settings without triggering compiler or static analysis warnings.

It also has the advantage of not triggering extra warnings in VS2015U1 (this issue).

## Another C++11 ‘countof’

Note: There’s an update here.

Read “Better array ‘countof’ implementation with C++ 11” for context. Specifically, it presents Listing 5 as an implementation of `countof()` using C++11 constexpr:

• ```template<typename T, std::size_t N>
constexpr std::size_t countof(T const (&)[N]) noexcept
{
return N;
}```

But this falls short. Just a little.

There are arguments that could be passed to a naive `sizeof(a)/sizeof(a[0])` macro that will cause the above to fail to compile.

Consider:

```struct S
{
int a[4];
};

void f(S* s)
{
constexpr size_t s_a_count = countof(s->a);
int b[s_a_count];
// do things...
}```

This does not compile. `s` is not constant, and `countof()` is a constexpr function whose result is needed at compile time, and so expects a constexpr-friendly argument. Even though it is not used.

Errors from this kind of thing can look like this from clang-3.7.0:

```error: constexpr variable 's_a_count' must be initialized by a
constant expression
note:  read of non-constexpr variable 's' is not allowed in a
constant expression```

or this from Visual Studio 2015 Update 1:

`error: C2131: expression did not evaluate to a constant`

(Aside: At the time of writing, the error C2131 seems to be undocumented for VS2015. But Visual Studio 6.0 had an error with the same number)

Here’s a C++11 version of `countof()` that will give the correct result for `countof(s->a)` above:

```#include <type_traits>

template<typename Tin>
constexpr std::size_t countof()
{
using T = typename std::remove_reference<Tin>::type;
static_assert(std::is_array<T>::value,
"countof() requires an array argument");
static_assert(std::extent<T>::value > 0,  // [0]
"zero- or unknown-size array");
return std::extent<T>::value;
}

#define countof(a) countof<decltype(a)>()```

Some of the details:

Adding a `countof()` macro allows use of `decltype()` in the caller’s context, which provides the type of the member array of a non-const object at compile time.

`std::remove_reference` is needed to get the array type from the result of `decltype()`. Without it, `std::is_array` and `std::extent` produce false and zero, respectively.

The first static assert ensures that `countof()` is being called on an actual array. The upside over failed template instantiation or specialization is that you can write your own human-readable, slightly more context aware error message (better than mine).

The second static assert validates that the array size is known, and is greater than zero. Without it, `countof<int[]>()` will return zero (which will be wrong) without error. And zero-sized arrays will also result in zero — in practice they rarely actually contain zero elements. This isn’t a function for finding the size of those arrays.

And then `std::extent<T>::value` produces the actual count of the elements of the array.

If replacing an existing `sizeof`-based macro with a constexpr `countof()` alternate, Visual Studio 2015 Update 1 will trigger warnings in certain cases where there previously were no warnings.
```warning C4267: conversion from 'size_t' to 'int', possible loss of data