constexpr int Min(int a, int b)
, construct a function constexpr int Min(Args... args)
that returns the minimum of all the provided args. Fail to justify your over-engineering.
A: Rename Min(int, int)
as MinImpl(int, int)
or stick it in a namespace. Overloading the function is not only unnecessary, it gets in the way of the implementation.
constexpr int MinImpl(int a, int b) { return a < b ? a : b; }
Implement a constexpr
fold left function. If we can use it for Min()
, we should be able to do the same for Max()
, and other similar functions. Should we be able to find any (#prematuregeneralization).
template<typename ArgA, typename ArgB, typename Func> constexpr auto foldl(Func func, ArgA a, ArgB b) { return func(a, b); } template<typename ArgA, typename ArgB, typename Func, typename ...Args> constexpr auto foldl(Func func, ArgA a, ArgB b, Args... args) { return foldl(func, func(a, b), args...); }
Combine the two.
template<typename ...Args> constexpr auto Min(Args... args) { return foldl(MinImpl, args...); }
Add the bare minimum amount of testing for a constexpr function: slap a static_assert()
on it.
static_assert(Min(6, 4, 5, 3, 9) == 3, "Nope");
I did so with Visual Studio 2015 Update 2. It did not object.
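As a rough illustration of the "same for Max()" remark, here is a sketch that reuses the foldl from above unchanged. MaxImpl is a hypothetical counterpart to MinImpl that does not appear in the original post, and C++14 is assumed for the deduced return types.

```cpp
// foldl exactly as given above (C++14: deduced return types).
template<typename ArgA, typename ArgB, typename Func>
constexpr auto foldl(Func func, ArgA a, ArgB b)
{
    return func(a, b);
}

template<typename ArgA, typename ArgB, typename Func, typename ...Args>
constexpr auto foldl(Func func, ArgA a, ArgB b, Args... args)
{
    // Combine the first two values, then fold the result into the rest.
    return foldl(func, func(a, b), args...);
}

// Hypothetical counterpart to MinImpl (not part of the original post).
constexpr int MaxImpl(int a, int b) { return a > b ? a : b; }

template<typename ...Args>
constexpr auto Max(Args... args)
{
    return foldl(MaxImpl, args...);
}

static_assert(Max(6, 4, 5, 3, 9) == 9, "Nope");
```

Only the two-argument helper changes; the fold itself is untouched, which is the whole point of the generalization.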
Addendum: Some discussion with @nlguillemot and @DrPizza led to this attempt to do something similar with a C++17/C++1z fold-expression:
#include <limits.h>

constexpr int MinImpl1(int a, int b) { return a < b ? a : b; }

constexpr void MinImpl2(int* m, int a, int b) { *m = a < b ? a : b; }

template<typename ...Args>
constexpr int Min(Args... args)
{
  int m = INT_MAX;
  // a binary expression in an operand of a fold-expression
  // is not allowed, so this won't compile:
  //(m = MinImpl1(m, args), ...);
  // But this does:
  (MinImpl2(/*out*/&m, m, args), ...);
  return m;
}

int main()
{
  static_assert(Min(3,4,5) == 3, "nope");
}
This compiles with a gcc-6 pre-release snapshot.
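For comparison, a sketch of the assignment form with one extra pair of parentheses, assuming a compiler with finished C++17 fold-expression support. The name MinFold is made up here to keep it apart from the Min() above. Wrapping the assignment in parentheses turns it into a primary expression, which the fold-expression grammar accepts as an operand.

```cpp
#include <climits>

constexpr int MinImpl1(int a, int b) { return a < b ? a : b; }

// Hypothetical name; same idea as Min() above, without the out-parameter helper.
template<typename ...Args>
constexpr int MinFold(Args... args)
{
    int m = INT_MAX;
    // Unary right fold over the comma operator. Each operand is the
    // parenthesized assignment (m = MinImpl1(m, arg)).
    ((m = MinImpl1(m, args)), ...);
    return m;
}

static_assert(MinFold(3, 4, 5) == 3, "nope");
```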
Update: Here’s a further updated version, based on a refinement by @dotstdy.
(Why would you want to do that? Maybe you want to run a fast inverse square root at compile time. Or maybe you want to do something that is actually useful. I wanted to know if it could be done.)
For context: this article is based on experiences using gcc-5.3.0 and clang-3.7.1 with -std=c++14 -march=native on a Sandy Bridge Intel i7. Where I reference sections from the C++ standard, I’m referring to the November 2014 draft.
Before going further, I’ll quote 5.20.6 from the standard:
Since this International Standard imposes no restrictions on the accuracy of floating-point operations, it is unspecified whether the evaluation of a floating-point expression during translation yields the same result as the evaluation of the same expression (or the same operations on the same values) during program execution.^{88}
88) Nonetheless, implementations are encouraged to provide consistent results, irrespective of whether the evaluation was performed during translation and/or during program execution.
In this post, I document things that worked (and didn’t work) for me. You may have a different experience.
(Error text from g++-5.3.0)
You can’t access the bits of a float via a typecast pointer [which is undefined behavior, and covered by 5.20.2.5]:
constexpr uint32_t bits_cast(float f)
{
  return *(uint32_t*)&f; // [2]
}
error: accessing value of 'f' through a 'uint32_t {aka unsigned int}' glvalue in a constant expression
You can’t convert it via a reinterpret cast [5.20.2.13]
constexpr uint32_t bits_reinterpret_cast(float f)
{
  const unsigned char* cf = reinterpret_cast<const unsigned char*>(&f);
  // endianness notwithstanding
  return (cf[3] << 24) | (cf[2] << 16) | (cf[1] << 8) | cf[0];
}
error: '*(cf + 3u)' is not a constant expression
(gcc reports an error with the memory access, but does not object to the reinterpret_cast
. clang produces a specific error for the cast.)
You can’t convert it through a union [gcc, for example, permits this for non-constant expressions, but the standard forbids it in 5.20.2.8]:
constexpr uint32_t bits_union(float f) { union Convert { uint32_t u; float f; constexpr Convert(float f_) : f(f_) {} }; return Convert(f).u; }
error: accessing 'bits_union(float)::Convert::u' member instead of initialized 'bits_union(float)::Convert::f' member in constant expression
You can’t use memcpy()
[5.20.2.2]:
constexpr uint32_t bits_memcpy(float f) { uint32_t u = 0; memcpy(&u, &f, sizeof f); return u; }
error: 'memcpy(((void*)(&u)), ((const void*)(&f)), 4ul)' is not a constant expression
And you can’t define a constexpr memcpy()
-like function that is capable of the task [5.20.2.11]:
constexpr void* memcpy(void* dest, const void* src, size_t n) { char* d = (char*)dest; const char* s = (const char*)src; while(n-- > 0) *d++ = *s++; return dest; } constexpr uint32_t bits_memcpy(float f) { uint32_t u = 0; memcpy(&u, &f, sizeof f); return u; }
error: accessing value of 'u' through a 'char' glvalue in a constant expression
So what can you do?
For constexpr float f = 2.0f, g = 2.0f
the following operations are available [as they are not ruled out by anything I can see in 5.20]:
static_assert(f == g, "not equal");
static_assert(f * 2.0f == 4.0f, "arithmetic failed");
constexpr int i = (int)2.0f; static_assert(i == 2, "conversion failed");
So I wrote a function (uint32_t bits(float)
) that will return the binary representation of an IEEE754 single precision float. The full function is at the end of this post. I’ll go through the various steps required to produce (my best approximation of) the desired result.
When bits()
is passed the value zero, we want this behavior:
static_assert(bits(0.0f) == 0x00000000);
And we can have it:
if (f == 0.0f) return 0;
Nothing difficult about that.
In IEEE754 land, negative zero is a thing. Ideally, we’d like this behavior:
static_assert(bits(-0.0f) == 0x80000000);
But the check for zero also matches negative zero. Negative zero is not something that the C++ standard has anything to say about, given that IEEE754 is an implementation choice [3.9.1.8: “The value representation of floating-point types is implementation defined”]. My compilers treat negative zero the same as zero for all comparisons and arithmetic operations. As such, bits()
returns the wrong value when considering negative zero, returning 0x00000000
rather than the desired 0x80000000
.
I did look into other methods for detecting negative zero in C++, without finding something that would work in a constant expression. I have seen divide by zero used as a way to detect negative zero (resulting in ±infinity, depending on the sign of the zero), but that doesn’t compile in a constant expression:
constexpr float r = 1.0f / -0.0f;
error: '(1.0e+0f / -0.0f)' is not a constant expression
and divide by zero is explicitly named as undefined behavior in 5.6.4, and so by 5.20.2.5 is unusable in a constant expression.
Result: negative zero becomes positive zero.
We want this:
static_assert(bits(INFINITY) == 0x7f800000);
And this:
else if (f == INFINITY) return 0x7f800000;
works as expected.
Same idea, different sign:
static_assert(bits(-INFINITY) == 0xff800000);
else if (f == -INFINITY) return 0xff800000;
Also works.
There’s no way to generate arbitrary NaN constants in a constant expression that I can see (not least because casting bits to floats isn’t possible in a constant expression, either), so it seems impossible to get this right in general.
In practice, maybe this is good enough:
static_assert(bits(NAN) == 0x7fc00000);
NaN values can be anywhere in the range of 0x7f800001 -- 0x7fffffff
and 0xff800001 -- 0xffffffff
. I have no idea as to the specific values that are seen in practice, nor what they mean. 0x7fc00000
shows up in /usr/include/bits/nan.h
on the system I’m using to write this, so — right or wrong — I’ve chosen that as the reference value.
It is possible to detect a NaN value in a constant expression, but not its payload. (At least that I’ve been able to find). So there’s this:
else if (f != f) // NaN
  return 0x7fc00000; // This is my NaN...
Which means that of the 2*(2^{23}-1) possible NaNs, one will be handled correctly (in this case, 0x7fc00000
). For the other 16,777,213 values, the wrong value will be returned (in this case, 0x7fc00000
).
So… partial success? NaNs are correctly detected, but the bits for only one NaN value will be returned correctly.
(On the other hand, the probability that it will ever matter could be stored as a denormalized float)
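A small sketch of the detection step on its own. It uses std::numeric_limits<float>::quiet_NaN() rather than the NAN macro (a substitution made here, since quiet_NaN() is declared constexpr by the standard), and is_nan is a hypothetical helper name.

```cpp
#include <limits>

// NaN is the only float value that compares unequal to itself.
constexpr bool is_nan(float f) { return f != f; }

constexpr float qnan = std::numeric_limits<float>::quiet_NaN();

static_assert(is_nan(qnan), "NaN detected");
static_assert(is_nan(-qnan), "flipping the sign still leaves a NaN");
static_assert(!is_nan(1.0f), "ordinary values are not NaN");
static_assert(!is_nan(std::numeric_limits<float>::infinity()), "infinity is not NaN");
```

The comparison says nothing about the sign or payload, which is why every NaN ends up mapped to the same bit pattern.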
// pseudo-code
static_assert(bits({ 0x1p-126f, ..., 0x1.fffffep127f}) == { 0x00800000, ..., 0x7f7fffff});
static_assert(bits({ -0x1p-126f, ..., -0x1.fffffep127f}) == { 0x80800000, ..., 0xff7fffff});
[That 0x1pnnnf
format happens to be a convenient way to represent exact values that can be stored as binary floating point numbers]
It is possible to detect and correctly construct bits for every normalized value. It does require a little care to avoid truncation and undefined behavior. I wrote a few different implementations; the one that I describe here requires relatively little code, and doesn’t perform terribly [0].
The first step is to find and clear the sign bit. This simplifies subsequent steps.
bool sign = f < 0.0f; float abs_f = sign ? -f : f;
Now we have abs_f
— it’s positive, non-zero, non-infinite, and not a NaN.
What happens when a float is cast to an integral type?
uint64_t i = (uint64_t)f;
The value of f will be stored in i, according to the following rules:
- If f has a fractional part, it is truncated (discarded).
- If f is too large to be represented as a uint64_t (i.e. f > 2^{64}-1) the result is undefined.
If truncation takes place, data is lost. If the number is too large, the result is (probably) meaningless.
For our conversion function, if we can scale abs_f
into a range where it is not larger than (2^{64}-1), and it has no fractional part, we have access to an exact representation of the bits that make up the float. We just need to keep track of the amount of scaling being done.
Single precision IEEE 754 floating point numbers have, at most, (23+1) bits of precision (23 in the significand, 1 implicit). This means that we can scale down large numbers and scale up small numbers into the required range.
Multiplying by a power of two changes only the exponent of the float, and leaves the significand unmodified. As such, we can arbitrarily scale a float by a power of two and, so long as we don’t over- or under-flow the float, we will not lose any of the bits in the significand.
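To make the power-of-two claim concrete, here is a minimal sketch that scales a value up by 2^{41} and back down again at compile time. scale_round_trip is a hypothetical helper, and hexadecimal float literals are assumed to be accepted by the compiler (they are standard from C++17, and a common extension before that).

```cpp
// Scale up by 2^41 and straight back down. Only the exponent changes on the
// way, so the original value comes back bit for bit (provided nothing
// overflows or underflows in between).
constexpr float scale_round_trip(float f)
{
    return (f * 0x1p41f) * 0x1p-41f;
}

// All 24 significand bits set: the worst case for losing precision.
static_assert(scale_round_trip(0x1.fffffep0f) == 0x1.fffffep0f, "significand preserved");
static_assert(scale_round_trip(0x1.000002p-20f) == 0x1.000002p-20f, "lowest bit preserved");
static_assert(0x1.8p0f * 0x1p41f == 0x1.8p41f, "only the exponent moved");
```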
For the sake of simplicity (believe it or not [1]), my approach is to scale abs_f
in steps of 2^{41} so that (abs_f
≥ 2^{87}) like so:
int exponent = 254; while(abs_f < 0x1p87f) { abs_f *= 0x1p41f; exponent -= 41; }
If abs_f
≥ 2^{87}, the least significant bit of abs_f
, if set, is 2^{(87-23)} == 2^{64}.
Next, abs_f
is scaled back down by 2^{64} (which adds no fractional part as the least significant bit is 2^{64}) and converted to an unsigned 64 bit integer.
uint64_t a = (uint64_t)(abs_f * 0x1p-64f);
All of the bits of abs_f
are now present in a
, without overflow or truncation. All that is needed now is to determine where they are:
int lz = count_leading_zeroes(a);
adjust the exponent accordingly:
exponent -= lz;
and construct the result:
uint32_t significand = (a << (lz + 1)) >> (64 - 23); // [3]
return (sign << 31) | (exponent << 23) | significand;
With this, we have correct results for every normalized float.
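The steps above can be pulled together into a cut-down sketch that handles only positive, normalized inputs. bits_normal and clz64 are hypothetical names introduced here. clz64 is a plain loop standing in for count_leading_zeroes(), chosen for readability rather than speed, and C++14 constexpr rules are assumed.

```cpp
#include <cstdint>

// Loop-based stand-in for count_leading_zeroes(); v must be nonzero.
constexpr int clz64(std::uint64_t v)
{
    int n = 0;
    while (!(v & (std::uint64_t(1) << 63))) { v <<= 1; ++n; }
    return n;
}

// Positive, normalized inputs only: no zero, infinity, NaN or denormals.
constexpr std::uint32_t bits_normal(float abs_f)
{
    int exponent = 254;
    while (abs_f < 0x1p87f) { abs_f *= 0x1p41f; exponent -= 41; }
    // abs_f is now in [2^87, 2^128), so this fits and has no fractional part.
    std::uint64_t a = (std::uint64_t)(abs_f * 0x1p-64f);
    int lz = clz64(a);
    exponent -= lz;
    // Shift out the leading zeroes and the implicit 1, keep the top 23 bits.
    std::uint32_t significand = (std::uint32_t)((a << (lz + 1)) >> (64 - 23));
    return ((std::uint32_t)exponent << 23) | significand;
}

static_assert(bits_normal(1.0f) == 0x3f800000, "1.0f");
static_assert(bits_normal(1.5f) == 0x3fc00000, "1.5f");
static_assert(bits_normal(0x1p-126f) == 0x00800000, "smallest normalized value");
static_assert(bits_normal(0x1.fffffep127f) == 0x7f7fffff, "largest finite value");
```

Tracing 1.0f by hand: three scaling steps leave abs_f at 2^{123} and exponent at 131, a becomes 2^{59}, lz is 4, and the final exponent is 127 with an empty significand.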
// pseudo-code
static_assert(bits({ 0x1.0p-149f, ..., 0x1.fffffcp-127f}) == { 0x00000001, ..., 0x007fffff});
static_assert(bits({ -0x1.0p-149f, ..., -0x1.fffffcp-127f}) == { 0x80000001, ..., 0x807fffff});
The final detail is denormalized values. Handling of normalized values as presented so far fails because denormals will have additional leading zeroes. They are fairly easy to account for:
if (exponent <= 0) { exponent = 0; lz = 8 - 1; }
To attempt to demystify that lz = 8 - 1 a little: there are 8 leading bits that aren’t part of the significand of a denormalized single precision float after the repeated 2^{41} scaling that has taken place. There is also no implicit leading 1 bit of the kind that is present in all normalized numbers (and which is accounted for in the calculation of significand above as (lz + 1)). So the leading zero count (lz) is set to account for the 8 bits of offset to the start of the denormalized significand, minus the one that the subsequent calculation assumes it needs to skip over.
And that’s it. All the possible values of a float are accounted for.
(Side note: If you’re compiling with -ffast-math, passing denormalized numbers to bits()
will return invalid results. That’s -ffast-math for you. With gcc or clang, you could add an #ifdef __FAST_MATH__
around the test for negative exponent.)
You can indeed obtain the bit representation of a floating point number at compile time. Mostly. Negative zero is wrong, NaNs are detected but otherwise not accurately converted.
Enjoy your compile-time bit-twiddling!
The whole deal:
#include <math.h>   // INFINITY
#include <stdint.h> // uint32_t, uint64_t

// Based on code from
// https://graphics.stanford.edu/~seander/bithacks.html
constexpr int count_leading_zeroes(uint64_t v)
{
  constexpr char bit_position[64] = {
     0,  1,  2,  7,  3, 13,  8, 19,  4, 25, 14, 28,  9, 34, 20, 40,
     5, 17, 26, 38, 15, 46, 29, 48, 10, 31, 35, 54, 21, 50, 41, 57,
    63,  6, 12, 18, 24, 27, 33, 39, 16, 37, 45, 47, 30, 53, 49, 56,
    62, 11, 23, 32, 36, 44, 52, 55, 61, 22, 43, 51, 60, 42, 59, 58 };

  v |= v >> 1; // first round down to one less than a power of 2
  v |= v >> 2;
  v |= v >> 4;
  v |= v >> 8;
  v |= v >> 16;
  v |= v >> 32;
  v = (v >> 1) + 1;

  return 63 - bit_position[(v * 0x0218a392cd3d5dbf)>>58]; // [3]
}

constexpr uint32_t bits(float f)
{
  if (f == 0.0f)
    return 0; // also matches -0.0f and gives wrong result
  else if (f == INFINITY)
    return 0x7f800000;
  else if (f == -INFINITY)
    return 0xff800000;
  else if (f != f) // NaN
    return 0x7fc00000; // This is my NaN...

  bool sign = f < 0.0f;
  float abs_f = sign ? -f : f;

  int exponent = 254;

  while(abs_f < 0x1p87f)
  {
    abs_f *= 0x1p41f;
    exponent -= 41;
  }

  uint64_t a = (uint64_t)(abs_f * 0x1p-64f);
  int lz = count_leading_zeroes(a);
  exponent -= lz;

  if (exponent <= 0)
  {
    exponent = 0;
    lz = 8 - 1;
  }

  uint32_t significand = (a << (lz + 1)) >> (64 - 23); // [3]
  return (sign << 31) | (exponent << 23) | significand;
}
[0] Why does runtime performance matter? Because that’s how I tested the conversion function while implementing it. I was applying Bruce Dawson’s advice for testing floats and the quicker I found out that I’d broken the conversion the better. For the implementation described in this post, it takes about 97 seconds to test all four billion float values on my laptop — half that time if I wasn’t testing negative numbers (which are unlikely to cause problems due to the way I handle the sign bit). The implementation I’ve described in this post is not the fastest solution to the problem, but it is relatively compact, and well behaved in the face of -ffast-math
.
Admission buried in a footnote: I have not validated correct behavior of this code for every floating point number in actual compile-time constant expressions. Compile-time evaluation of four billion invocations of bits()
takes more time than I’ve been willing to invest so far.
[1] It is conceptually simpler to multiply abs_f
by two (or one half) until the result is exactly positioned so that no leading zero count is required after the cast — at least, that was what I did in my first attempt. The approach described here was found to be significantly faster. I have no doubt that better-performing constant-expression-friendly approaches exist.
[2] Update 2016-03-28: Thanks to satbyy for pointing out the missing ampersand — it was lost sometime after copying the code into the article.
[3] Update 2016-03-28: Thanks to louiswins for pointing out additional code errors.
Here it is, with one further tweak:
template<typename T, std::size_t N>
constexpr
std::integral_constant<std::size_t, N> countof(T const (&)[N]) noexcept
{
return {};
}
#define COUNTOF(...) decltype(countof(__VA_ARGS__))::value
The change I’ve made to pfultz2’s version is to use ::value
rather than {}
after decltype
in the macro.
This makes the type of the result std::size_t
not std::integral_constant
, so it can be used in va_arg settings without triggering compiler or static analysis warnings.
It also has the advantage of not triggering extra warnings in VS2015U1 (this issue).
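A short usage sketch, assuming the countof() and COUNTOF definitions above. The struct S and the function count_members are made up for illustration. Because decltype() does not evaluate its operand, the non-constant pointer is never read.

```cpp
#include <cstddef>
#include <type_traits>

template<typename T, std::size_t N>
constexpr
std::integral_constant<std::size_t, N> countof(T const (&)[N]) noexcept
{
    return {};
}

#define COUNTOF(...) decltype(countof(__VA_ARGS__))::value

// Hypothetical example types, not from the original post.
struct S { int a[4]; };

inline std::size_t count_members(S* s)
{
    // s is a runtime pointer, but only the type of s->a is inspected.
    constexpr std::size_t n = COUNTOF(s->a);
    static_assert(n == 4, "expected four elements");

    // The result is a plain std::size_t, not an integral_constant.
    static_assert(std::is_same<decltype(n), const std::size_t>::value,
                  "plain std::size_t");
    return n;
}
```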
Read “Better array ‘countof’ implementation with C++ 11” for context. Specifically, it presents Listing 5 as an implementation of countof()
using C++11 constexpr:
template<typename T, std::size_t N>
constexpr std::size_t countof(T const (&)[N]) noexcept
{
return N;
}
But this falls short. Just a little.
There are arguments that could be passed to a naive sizeof(a)/sizeof(a[0])
macro that will cause the above to fail to compile.
Consider:
struct S
{
int a[4];
};
void f(S* s)
{
constexpr size_t s_a_count = countof(s->a);
int b[s_a_count];
// do things...
}
This does not compile. s
is not constant, and countof()
is a constexpr function whose result is needed at compile time, and so expects a constexpr-friendly argument. Even though it is not used.
Errors from this kind of thing can look like this from clang-3.7.0:
error: constexpr variable 's_a_count' must be initialized by a constant expression note: read of non-constexpr variable 's' is not allowed in a constant expression
or this from Visual Studio 2015 Update 1:
error: C2131: expression did not evaluate to a constant
(Aside: At the time of writing, the error C2131 seems to be undocumented for VS2015. But Visual Studio 6.0 had an error with the same number)
Here’s a C++11 version of countof()
that will give the correct result for countof(s->a)
above:
#include <type_traits>
template<typename Tin>
constexpr std::size_t countof()
{
using T = typename std::remove_reference<Tin>::type;
static_assert(std::is_array<T>::value,
"countof() requires an array argument");
static_assert(std::extent<T>::value > 0, // [0]
"zero- or unknown-size array");
return std::extent<T>::value;
}
#define countof(a) countof<decltype(a)>()
Some of the details:
Adding a countof()
macro allows use of decltype()
in the caller’s context, which provides the type of the member array of a non-const object at compile time.
std::remove_reference
is needed to get the array type from the result of decltype()
. Without it, std::is_array
and std::extent
produce false and zero, respectively.
The first static assert ensures that countof()
is being called on an actual array. The upside over failed template instantiation or specialization is that you can write your own human-readable, slightly more context aware error message (better than mine).
The second static assert validates that the array size is known, and is greater than zero. Without it, countof<int[]>()
will return zero (which will be wrong) without error. And zero-sized arrays will also result in zero — in practice they rarely actually contain zero elements. This isn’t a function for finding the size of those arrays.
And then std::extent<T>::value
produces the actual count of the elements of the array.
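To see the pieces working together, here is a usage sketch built on the definition above. The function use_countof and the local array m are hypothetical. The m[0] case shows why std::remove_reference is there: decltype(m[0]) is a reference to an array, not an array.

```cpp
#include <cstddef>
#include <type_traits>

template<typename Tin>
constexpr std::size_t countof()
{
    using T = typename std::remove_reference<Tin>::type;
    static_assert(std::is_array<T>::value,
                  "countof() requires an array argument");
    static_assert(std::extent<T>::value > 0,
                  "zero- or unknown-size array");
    return std::extent<T>::value;
}

#define countof(a) countof<decltype(a)>()

struct S { int a[4]; };

// Hypothetical caller, mirroring f(S*) from earlier in the post.
inline void use_countof(S* s)
{
    constexpr std::size_t s_a_count = countof(s->a);
    static_assert(s_a_count == 4, "member array of a non-constant object");

    int m[2][3] = {};
    static_assert(countof(m) == 2, "outer dimension");
    // decltype(m[0]) is int(&)[3]; remove_reference strips the reference.
    static_assert(countof(m[0]) == 3, "inner dimension");
    (void)s;
    (void)m;
}
```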
Addendum:
If replacing an existing sizeof
-based macro with a constexpr countof()
alternate, Visual Studio 2015 Update 1 will trigger warnings in certain cases where there previously were no warnings.
warning C4267: conversion from 'size_t' to 'int', possible loss of data
It is unfortunate to have to add explicit casts when the safety of such operations is able to be determined by the compiler. I have optimistically submitted this as an issue at connect.microsoft.com.
[0] Typo fix thanks to this commenter
In the process of upgrading Visual Studio 2012 to Visual Studio 2015, I encountered some brand new link errors that looked something like this:
error LNK2001: unresolved external symbol
"public: __cdecl FooData::FooData(struct FooData& const &)"
It’s not a new error in VS2015 — VS2012 can certainly produce it. I mean “new” in the sense that there were no problems linking this code when using the older compiler.
The struct in question looks vaguely like this:
struct FooData
{
int m_Bar;
volatile int m_Baz;
};
The problem is m_Baz. In C++14, the language was changed to say that structs are not trivially constructible if they have non-static volatile members. And that, I think, is why there’s no default copy constructor being generated. I can’t quote chapter and verse to back up that assertion, though.
[Update: Actually… maybe not? I’m beginning to wonder if VS2015 is doing the wrong thing here.]
But the fix is simple: add a copy constructor. And then, when the program fails to compile, declare a default constructor (because, of course, declaring a copy constructor means that no default constructor is implicitly declared).
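A minimal sketch of what that fix might look like for the FooData struct shown above. The member-wise copy is an assumption about the intended behavior, since the original post does not show its constructor bodies.

```cpp
#include <type_traits>

struct FooData
{
    int m_Bar;
    volatile int m_Baz;

    // Declaring any constructor stops the compiler from implicitly
    // declaring a default constructor, so ask for it back explicitly.
    FooData() = default;

    // Hand-written member-wise copy (reads other.m_Baz exactly once).
    FooData(const FooData& other)
        : m_Bar(other.m_Bar), m_Baz(other.m_Baz) {}
};

static_assert(std::is_default_constructible<FooData>::value,
              "default constructor is available again");
static_assert(std::is_copy_constructible<FooData>::value,
              "copy constructor is user-provided");
```

One side effect worth knowing about: with a user-provided constructor, FooData is no longer an aggregate, so any code that brace-initializes its members directly will need updating.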
I found developing an understanding of exactly what was happening, and why, to be the more difficult problem. Initially that was because the compiler gave no indication that there was a problem at all, and willingly generated calls to a copy constructor that couldn’t possibly exist. Deeper than that, I’m still trying to piece together my own understanding of exactly why (and how) this change was made to the standard.
This one’s pretty easy to answer with this information from the C++ standard:
The type of an integer literal is the first of the corresponding list in Table 6 in which its value can be represented.
0xffffffff is a hexadecimal constant, it’s too big to be represented in a (signed) int, so — by the terms of the standard — the type of 0xffffffff is unsigned int.
Furthermore, each of these hexadecimal literals will have a different type:
0x7fffffff // int
0xffffffff // unsigned int
0x1ffffffff // long int (or long long int)
0x1ffffffffu // unsigned long int (or unsigned long long int)
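These types can be checked directly. The sketch below assumes a 32 bit int and a C++11 compiler that follows the standard's literal rules; same() is a hypothetical shorthand for std::is_same.

```cpp
#include <type_traits>

// Hypothetical shorthand for std::is_same<T, U>::value.
template<typename T, typename U>
constexpr bool same() { return std::is_same<T, U>::value; }

static_assert(sizeof(int) == 4, "this sketch assumes a 32 bit int");

static_assert(same<decltype(0x7fffffff), int>(), "fits in int");
static_assert(same<decltype(0xffffffff), unsigned int>(),
              "too big for int, so unsigned int is next in the list");
static_assert(same<decltype(0xffffffffu), unsigned int>(),
              "the suffix changes nothing here");

// Which 64 bit type is picked depends on the width of long.
static_assert(same<decltype(0x1ffffffff), long>() ||
              same<decltype(0x1ffffffff), long long>(),
              "signed, wider than int");
static_assert(same<decltype(0x1ffffffffu), unsigned long>() ||
              same<decltype(0x1ffffffffu), unsigned long long>(),
              "unsigned, wider than int");

static_assert(0xffffffff == 0xffffffffu, "same value, too");
```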
But to answer the original question, there is no difference between 0xffffffff and 0xffffffffu apart from this:
@twoscomplement One is a commonly used curse when the compiler screws up.
— Colin Riley (@domipheus) January 30, 2015
warning C4146: unary minus operator applied to unsigned type, result still unsigned
I saw this warning recently.
“Aha!” I thought. “A common source of errors, able to strike down the unsuspecting programmer. Thank you crafters of Visual C++ compiler warnings, tirelessly laboring to uncover wrong assumptions and naively written code.”
“What?” I exclaimed. “Of course the result is still unsigned. That’s how the language is designed, and that’s what I wanted!”
Nevertheless, I read the documentation for the warning to see if there was anything I could glean from it — particularly to see if I could find sufficient reason to not just #pragma disable it.
This is what you can find in the documentation:
Unsigned types can hold only non-negative values, so unary minus (negation) does not usually make sense when applied to an unsigned type. Both the operand and the result are non-negative.
Negation of an unsigned value may not make sense if you don’t know what it means — it is well defined. Regardless, this is a level 2 warning. It is designed to catch common mistakes and misunderstandings and notify the programmer to have them look more closely. It may be an entirely reasonable thing to warn about.
The documentation continues with some rationale:
Practically, this occurs when the programmer is trying to express the minimum integer value, which is -2147483648. This value cannot be written as -2147483648 because the expression is processed in two stages:
- The number 2147483648 is evaluated. Because it is greater than the maximum integer value of 2147483647, the type of 2147483648 is not int, but unsigned int.
- Unary minus is applied to the value, with an unsigned result, which also happens to be 2147483648.
The first point is wrong. Wrong for a standards-conformant C++ implementation, anyway. The second would be accurate if the first was accurate (because 2^{32} - 2^{31} == 2^{31}).
Here’s what the most recent draft of the C++ standard says about the integer literal types:
The type of an integer literal is the first of the corresponding list in Table 6 in which its value can be represented.
2147483648 is a decimal constant with no suffix. When using VC++ with its 32 bit long int type, the first of the corresponding list in which its value can be represented is the 64 bit long long int. An unsigned type is never an option.
Unary minus should then be applied to long long int 2147483648, which should result in long long int -2147483648. There’s nothing unsigned in this process.
Use of the result should behave in an unsurprising way, too — long long int -2147483648 can be assigned to a variable of type int and nothing unexpected will happen. The type can be converted without affecting the value.
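The reasoning above can be written down as a compile-time check. This is a sketch that assumes a 32 bit int and a compiler that follows the C++11 literal rules; a compiler that types the literal as unsigned int would be expected to reject these assertions. The function name is made up for illustration.

```cpp
#include <type_traits>

static_assert(sizeof(int) == 4, "this sketch assumes a 32 bit int");

// Decimal and unsuffixed: only int, long int and long long int are candidates.
static_assert(std::is_signed<decltype(2147483648)>::value,
              "the literal has a signed type");
static_assert(std::is_signed<decltype(-2147483648)>::value,
              "and so does its negation");

// Hypothetical constexpr version of the comparison discussed in the warning text.
constexpr bool greater_than_most_negative_int(int i)
{
    return i > -2147483648;
}

static_assert(greater_than_most_negative_int(-100), "-100 is greater");
static_assert(greater_than_most_negative_int(1), "1 is greater");
```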
According to the standard, the rationale is flawed, and the warning seems pointless to me.
So I tried compiling the example program from the documentation to see what would happen.
// C4146.cpp
// compile with: /W2
#include <stdio.h>
void check(int i)
{
if (i > -2147483648) // C4146
printf_s("%d is greater than the most negative int\n", i);
}
int main()
{
check(-100);
check(1);
}
The documentation predicts the following outcome:
The expected second line, 1 is greater than the most negative int, is not printed because ((unsigned int)1) > 2147483648 is false.
If I build the program with gcc 4.9.2, both lines print.
If I build the program with Visual C++ 2012, or even 2015 Preview, only one line is printed (as was predicted).
So there is legitimacy to this warning: this is an area in which Visual C++ is not compliant with the standard.
Maybe it’s because the standard has changed? I looked at the earliest version of the text available in the cplusplus github repo dating from late 2011, and that has the same rules as quoted above.
I went back further and found copies of the standard from 2003 and 1998, both of which state:
The type of an integer literal depends on its form, value, and suffix. If it is decimal and has no suffix, it has the first of these types in which its value can be represented: int, long int; if the value cannot be represented as a long int, the behavior is undefined.
So it’s a detail that was previously undefined, which means that the compiler is permitted to do whatever it wants. In this case, we’ll get a warning, but only if the programmer has asked for it using option /W2.
The documentation is accurate, and Visual C++ hasn’t kept up with changes in the standard. This shouldn’t be surprising.
Update: long long int was added to the standard as part of C++11. It appears that VC++ has had long long support since at least Visual Studio .NET 2003.
This investigation arose from my reading of Visual C++ documentation in the context of what I knew of a recent draft of the C++ standard. It turns out that these two things are less connected than I had assumed. Unsurprisingly, the Visual C++ documentation describes Visual C++, not the standard.
While it would be nice if deviations from the standard were clearly marked in the documentation, and even nicer if the Visual C++ compiler was consistent with the ISO standard, the reality is that they are not and it is not.
One should always pay close attention to context, which happens to apply as much when reading about the C++ language as it does when writing C++ code.
It doesn’t seem like it should be difficult to answer. It’s only three characters: -, 1, and u. And, knowing a little bit about C++, it seems like that’ll be (-1) negative one with that u making ((-1)u) an unsigned int. Right?
To be more specific, on an architecture where int is a 32 bit type, and negative numbers are represented using two’s complement (i.e. just about all of them), negative one has the binary value 11111111111111111111111111111111. And converting that to unsigned int should … still be those same thirty two ones. Shouldn’t it?
I can test that hypothesis! Here’s a program that will answer the question once and for all:
#include <stdio.h>
#include <type_traits>
int main()
{
static_assert(std::is_unsigned<decltype(-1u)>::value,
"actually not unsigned");
printf("-1u is %zu bytes, with the value %#08x\n ",
sizeof -1u, -1u);
}
Compile and run it like this:
g++ -std=c++11 minus_one_u.cpp -o minus_one_u && minus_one_u
If I do that, I see the following output:
-1u is 4 bytes, with the value 0xffffffff
I’m using -std=c++11 to be able to use std::is_unsigned, decltype and static_assert which combine to assure me that (-1u) is actually unsigned as the program wouldn’t have compiled if that wasn’t the case. And the output shows the result I had hoped for: it’s a four byte value, containing 0xffffffff (which is the same as that string of thirty two ones I was looking for).
I have now proven that -1u means “convert -1 to an unsigned int.” Yay me!
It just so happened that I was reading about integer literals in a recent draft of the ISO C++ standard. Here’s the part of the standard that describes the format of decimal integer literals:
2.14.2 Integer literals
1 An integer literal is a sequence of digits that has no period or exponent part, with optional separating single quotes that are ignored when determining its value. An integer literal may have a prefix that specifies its base and a suffix that specifies its type. The lexically first digit of the sequence of digits is the most significant. A decimal integer literal (base ten) begins with a digit other than 0 and consists of a sequence of decimal digits.
Can you see where it describes negative integer literals?
I can’t see where it describes negative integer literals.
Oh.
I thought -1u was ((-1)u). I was wrong. Integer literals do not work that way.
Obviously -1u didn’t just stop producing an unsigned int with the value 0xffffffff (the program proved it!!1), but the reason it has that value is not the reason I thought.
So, what is -1u?
The standard says that 1u is an integer literal. So now I need to work out exactly what that - is doing. What does it mean to negate 1u? Back to the standard I go.
5.3.1 Unary operators
8 The operand of the unary - operator shall have arithmetic or unscoped enumeration type and the result is the negation of its operand. Integral promotion is performed on integral or enumeration operands. The negative of an unsigned quantity is computed by subtracting its value from 2^{n}, where n is the number of bits in the promoted operand. The type of the result is the type of the promoted operand.
I feel like I’m getting closer to some real answers.
So there’s a numerical operation to apply to this thing. But first, this:
Integral promotion is performed on integral or enumeration operands.
I have an integral operand (1u), so integral promotion must be performed. Here is the part of the standard that deals with that:
4.5 Integral promotions
1 A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
I’m going to cut a corner here: integer literals are prvalues, but I couldn’t find a place in the standard that explicitly declares this to be the case. It does seem pretty clear from 3.10 that they can’t be anything else. This page gives a good rundown on C++ value categories, and does state that integer literals are prvalues, so let’s go with that.
If 1u is a prvalue, and its type is unsigned int, I can collapse the standard text a little:
4.5 Integral promotions (prvalue edition)
A value of an integer type whose integer conversion rank (4.13) is less than the rank of int …
and I’m going to stop right there. Conversion rank what now? To 4.13!
4.13 Integer conversion rank
1 Every integer type has an integer conversion rank defined as follows:
Then a list of ten different rules, including this one:
— The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type.
Without knowing more about conversion ranks, this rule gives me enough information to determine what 4.5 means for unsigned int values: unsigned int has the same rank as int. So I can rewrite 4.5 one more time like this:
4.5 Integral promotions (unsigned int edition)
1 [This space intentionally left blank]
Integral promotion of an unsigned int value doesn’t change a thing.
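The effect of promotion can be observed with unary +, which performs the same integral promotions as unary -. This sketch assumes short is narrower than int, and unary_plus_gives_int is a hypothetical helper.

```cpp
#include <type_traits>

// True when applying unary + to a T yields an int.
template<typename T>
constexpr bool unary_plus_gives_int()
{
    return std::is_same<decltype(+T()), int>::value;
}

static_assert(sizeof(short) < sizeof(int),
              "this sketch assumes short is narrower than int");

// Types with a rank below int are promoted...
static_assert(unary_plus_gives_int<unsigned short>(), "unsigned short -> int");
static_assert(unary_plus_gives_int<unsigned char>(), "unsigned char -> int");

// ...but unsigned int has the same rank as int and is left alone.
static_assert(!unary_plus_gives_int<unsigned int>(), "unsigned int stays put");
static_assert(std::is_same<decltype(-1u), unsigned int>::value,
              "so -1u is still an unsigned int");

// Contrast: a promoted operand is negated as a signed int.
static_assert(-static_cast<unsigned short>(1) == -1,
              "promoted to int before negation");
```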
Now I can rewrite 5.3.1 with the knowledge that 1u requires no integral promotion:
5.3.1 Unary operators (unsigned int operand edition)
8 The [result of] the unary - operator … is the negation of its operand. The negative of an unsigned quantity is computed by subtracting its value from 2^{n}, where n is the number of bits in the promoted operand. The type of the result is the type of the operand.
And, at long last, I get to do the negating. For an unsigned value that means:
[subtract] its value from 2^{n}, where n is the number of bits in the promoted operand.
My unsigned int has 32 bits, so that would be 2^{32} - 1. Which in hexadecimal looks something like this:
  0x100000000
- 0x000000001
  -----------
  0x0ffffffff
But that leading zero I’ve left on the result goes away because
The type of the result is the type of the (promoted) operand.
And I am now certain that I know how -1u becomes an unsigned int with the value 0xffffffff. In fact, it’s not even dependent on having a platform that uses two’s complement — nothing in the conversion relies on that.
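The rule from 5.3.1 can be spelled out by hand and compared against the built-in operator. A sketch, assuming a 32 bit unsigned int; negate_by_hand is a hypothetical helper.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

static_assert(std::numeric_limits<unsigned int>::digits == 32,
              "this sketch assumes a 32 bit unsigned int");

// Subtract the value from 2^32 in a wider type, then convert back to
// unsigned int, which keeps only the low 32 bits.
constexpr unsigned int negate_by_hand(unsigned int x)
{
    return static_cast<unsigned int>((std::uint64_t(1) << 32) - x);
}

static_assert(std::is_same<decltype(-1u), unsigned int>::value,
              "the result type is the operand type");
static_assert(-1u == 0xffffffffu, "2^32 - 1");
static_assert(negate_by_hand(1u) == -1u, "matches the built-in operator");
static_assert(negate_by_hand(0u) == -0u, "2^32 - 0 wraps around to 0");
static_assert(negate_by_hand(0x80000000u) == -0x80000000u, "2^32 - 2^31 == 2^31");
```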
For -1u? I don’t see this ever causing actual problems. There are situations that arise from the way that C++ integer literals are defined that can cause surprises (i.e. bugs) for the unsuspecting programmer.
There is a particular case described in the documentation for Visual C++ compiler warning C4146. I think the rationale for that warning is wrong (or, at least, imprecise), though not because of something I’ve covered in this article. As I’ve already written far too many words about these three characters, I’ll keep that discussion for some time in the future.
Completing what I started here, I’ve charted the numbers from Christophe’s data for C++11, C++11 Concurrency, C++14 and C++17.
The data is taken entirely from the linked pdf with one exception: N3664 is a clarification that permits optimization, not a requirement for compliance. Compilers that do not perform this optimization are no less compliant with C++14. I’ve recomputed the percentages for all compiler versions to take this into account.
In addition to the references from the previous post, the approval date of C++14 was taken from http://en.wikipedia.org/wiki/C++14
A few days ago, Christophe Riccio tweeted a link to a pdf that shows the level of support for “Modern C++” standards in four C++ compilers: Visual C++, GCC, Clang, and ICC.
One of the things I wanted to see was not just how support had advanced between versions of each compiler, but how compilers had changed relative to one another over time. I extracted the numbers for C++11 from Christophe’s document, found the release dates for each compiler, and created a chart that puts it all together.
It’s interesting to see how far behind Clang starts in comparison to the others, and that it ends up in a close dance with GCC on the way to full C++11 support. It also highlights how disappointing VC++ has been in terms of language feature advancement — particularly when VS2010 was ahead of Clang and ICC for C++11 features.
Creating the chart also served as an opportunity to play around with data visualization using Bokeh. As such, you can click on the chart above and you’ll see a version that you can zoom, pan, and resize (which is only a small part of what Bokeh offers). I intend to write about my experiences with Bokeh at a later date.
Release dates for each compiler were taken from the following pages:
The date used to mark the approval of the C++11 standard is taken from http://en.wikipedia.org/wiki/C++11