Category: programming

C++14 and volatile implicity

[Update 2016-03-07: It appears that this was a bug in VS2015, and has been fixed in Update 2 RC]

In the process of upgrading Visual Studio 2012 to Visual Studio 2015, I encountered some brand new link errors that looked something like this:

error LNK2001: unresolved external symbol 
"public: __cdecl FooData::FooData(struct FooData& const &)"

It’s not a new error in VS2015 — VS2012 can certainly produce it. I mean “new” in the sense that there were no problems linking this code when using the older compiler.

The struct in question looks vaguely like this:

struct FooData
{
  int m_Bar;
  volatile int m_Baz;
};

The problem is m_Baz. In C++14, the language was changed to say that structs are not trivially constructible if they have non-static volatile members. And that, I think, is why there’s no default copy constructor being generated. I can’t quote chapter and verse to back up that assertion, though.

[Update: Actually… maybe not? I’m beginning to wonder if VS2015 is doing the wrong thing here.]

But the fix is simple: add a copy constructor. And then, when the program fails to compile, declare a default constructor (because, of course, adding a copy constructor causes the implicit default constructor to be marked as deleted).

I found that developing an understanding of exactly what was happening and why to be the more difficult problem. Initially because the the compiler gave no indication that there was a problem at all, and willingly generated calls to a copy constructor that couldn’t possibly exist. Deeper than that, I’m still trying to piece together my own understanding of exactly why (and how) this change was made to the standard.

What’s the difference between 0xffffffff and 0xffffffffu?

In C++, what is the difference between 0xffffffff and 0xffffffffu?

This one’s pretty easy to answer with this information from the C++ standard:

The type of an integer literal is the first of the corresponding list in Table 6 in which its value can be represented.

0xffffffff is a hexadecimal constant, it’s too big to be represented in a (signed) int, so — by the terms of the standard — the type of 0xffffffff is unsigned int.

Furthermore, each of these hexadecimal literals will have a different type:

0x7fffffff   // int
0xffffffff   // unsigned int
0x1ffffffff  // long int (or long long int)
0x1ffffffffu // unsigned long int (or unsigned long long int)

But to answer the original question, there is no difference between 0xffffffff and 0xffffffffu apart from this:

@twoscomplement One is a commonly used curse when the compiler screws up.

— Colin Riley (@domipheus) January 30, 2015

Standards vs Compilers: Warning C4146

warning C4146: unary minus operator applied to unsigned type, result still unsigned

I saw this warning recently.

“Aha!” I thought. “A common source of errors, able to strike down the unsuspecting programmer. Thank you crafters of Visual C++ compiler warnings, tirelessly laboring to uncover wrong assumptions and naively written code.”

“What?” I exclaimed. “Of course the result is still unsigned. That’s how the language is designed, and that’s what I wanted!”

Nevertheless, I read the documentation for the warning to see if there was anything I could glean from it — particularly to see if I could find sufficient reason to not just #pragma disable it.

This is what you can find in the documentation:

Unsigned types can hold only non-negative values, so unary minus (negation) does not usually make sense when applied to an unsigned type. Both the operand and the result are non-negative.

Negation of an unsigned value may not make sense if you don’t know what it means — it is well defined. Regardless, this is a level 2 warning. It is designed to catch common mistakes and misunderstandings and notify the programmer to have them look more closely. It may be an entirely reasonable thing to warn about.

The documentation continues with some rationale:

Practically, this occurs when the programmer is trying to express the minimum integer value, which is -2147483648. This value cannot be written as -2147483648 because the expression is processed in two stages:

The number 2147483648 is evaluated. Because it is greater than the maximum integer value of 2147483647, the type of 2147483648 is not int, but unsigned int.

Unary minus is applied to the value, with an unsigned result, which also happens to be 2147483648.

The first point is wrong. Wrong for a standards-conformant C++ implementation, anyway. The second would be accurate if the first was accurate (because 2³²– 2³¹== 2³¹)

Here’s what the most recent draft of the C++ standard says about the integer literal types:

The type of an integer literal is the first of the corresponding list in Table 6 in which its value can be represented.

2147483648 is a decimal constant with no suffix. When using VC++ with it’s 32 bit long int type, the first of the corresponding list in which its value can be represented is the 64 bit long long int. An unsigned type is never an option.

Unary minus should then be applied to long long int 2147483648, which should result in long long int -2147483648. There’s nothing unsigned in this process

Use of the result should behave in an unsurprising way, too — long long int -2147483648 can be assigned to a variable of type int and nothing unexpected will happen. The type can be converted without affecting the value.

According to the standard, the rationale is flawed, and the warning seems pointless to me.

In theory, there’s no difference between theory and practise

So I tried compiling the example program from the documentation to see what would happen.

// C4146.cpp
// compile with: /W2
#include <stdio.h>

void check(int i)
{
  if (i > -2147483648) // C4146
    printf_s("%d is greater than the most negative int\n", i);
}

int main()
{
  check(-100);
  check(1);
}

The documentation predicts the following outcome:

The expected second line, 1 is greater than the most negative int, is not printed because ((unsigned int)1) > 2147483648 is false.

If I build the program with gcc 4.9.2, both lines print.

If I build the program with Visual C++ 2012, or even 2015 Preview, only one line is printed (as was predicted).

So there is legitimacy to this warning — this is an area that Visual C++ is not compliant with the standard.

Maybe it’s because the standard has changed? I looked at the earliest version of the text available in the cplusplus github repo dating from late 2011, and that has the same rules as quoted above.

I went back further and found copies of the standard from 2003 and 1998, both of which state:

The type of an integer literal depends on its form, value, and suffix. If it is decimal and has no suffix, it has the first of these types in which its value can be represented: int, long int; if the value cannot be represented as a long int, the behavior is undefined.

So it’s a detail that was previously undefined, which means that the compiler is permitted to do whatever it wants. In this case, we’ll get a warning, but only if the programmer has asked for it using option /W2.

The documentation is accurate, and Visual C++ hasn’t kept up with changes in the standard. This shouldn’t be surprising.

Update: long long int was added to the standard as part of C++11. It appears that VC++ has had long long support since at least Visual Studio .NET 2003

So what?

This investigation arose from my reading of Visual C++ documentation in the context of what I knew of a recent draft of the C++ standard. It turns out that these two things are less connected than I had assumed. Unsurprisingly, the Visual C++ documentation describes Visual C++, not the standard.

While it would be nice if deviations from the standard were clearly marked in the documentation, and even nicer if the Visual C++ compiler was consistent with the ISO standard, the reality is that they are not and it is not.

One should always pay close attention to context, which happens to apply as much when reading about the C++ language as it does when writing C++ code.

What is -1u?

In C++, what exactly is -1u?

It doesn’t seem like it should be difficult to answer — it’s only three characters: –, 1, and u. And, knowing a little bit about C++, it seems like that’ll be (-1) negative one with that u making ((-1)u) an unsigned int. Right?

To be more specific, on an architecture where int is a 32 bit type, and negative numbers are represented using two’s complement (i.e. just about all of them), negative one has the binary value 11111111111111111111111111111111. And converting that to unsigned int should … still be those same thirty two ones. Shouldn’t it?

I can test that hypothesis! Here’s a program that will answer the question once and for all:

#include <stdio.h>
#include <type_traits>

int main()
{
 static_assert(std::is_unsigned<decltype(-1u)>::value, 
               "actually not unsigned");
 printf("-1u is %zu bytes, with the value %#08x\n ", 
        sizeof -1u, -1u);
}

Compile and run it like this:

g++ -std=c++11 minus_one_u.cpp -o minus_one_u && minus_one_u

If I do that, I see the following output:

-1u is 4 bytes, with the value 0xffffffff

I’m using -std=c++11 to be able to use std::is_unsigned, decltype and static_assert which combine to assure me that (-1u) is actually unsigned as the program wouldn’t have compiled if that wasn’t the case. And the output shows the result I had hoped for: it’s a four byte value, containing 0xffffffff (which is the same as that string of thirty two ones I was looking for).

I have now proven that -1u means “convert -1 to an unsigned int.” Yay me!

Not so much.

It just so happened that I was reading about integer literals in a recent draft of the ISO C++ standard. Here’s the part of the standard that describes the format of decimal integer literals:

2.14.2 Integer literals
1 An integer literal is a sequence of digits that has no period or exponent part, with optional separating single quotes that are ignored when determining its value. An integer literal may have a prefix that specifies its base and a suffix that specifies its type. The lexically first digit of the sequence of digits is the most significant. A decimal integer literal (base ten) begins with a digit other than 0 and consists of a sequence of decimal digits.

Can you see where it describes negative integer literals?

I can’t see where it describes negative integer literals.

Oh.

I though -1u was ((-1)u). I was wrong. Integer literals do not work that way.

Obviously -1u didn’t just stop producing an unsigned int with the value 0xffffffff (the program proved it!!1), but the reason it has that value is not the reason I thought.

So, what is -1u?

The standard says that 1u is an integer literal. So now I need to work out exactly what that – is doing. What does it mean to negate 1u? Back to the standard I go.

5.3.1 Unary operators
8 The operand of the unary – operator shall have arithmetic or unscoped enumeration type and the result is the negation of its operand. Integral promotion is performed on integral or enumeration operands. The negative of an unsigned quantity is computed by subtracting its value from 2ⁿ, where n is the number of bits in the promoted operand. The type of the result is the type of the promoted operand.

I feel like I’m getting closer to some real answers.

So there’s a numerical operation to apply to this thing. But first, this:

Integral promotion is performed on integral or enumeration operands.

Believe me when I tell you that this section changes nothing and you should skip it.

I have an integral operand (1u), so integral promotion must be performed. Here is the part of the standard that deals with that:

4.5 Integral promotions
1 A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.

I’m going to cut a corner here: integer literals are prvalues, but I couldn’t find a place in the standard that explicitly declares this to be the case. It does seem pretty clear from 3.10 that they can’t be anything else. This page gives a good rundown on C++ value categories, and does state that integer literals are prvalues, so let’s go with that.

If 1u is a prvalue, and its type is unsigned int, I can collapse the standard text a little:

4.5 Integral promotions (prvalue edition)
A value of an integer type whose integer conversion rank (4.13) is less than the rank of int …

and I’m going to stop right there. Conversion rank what now? To 4.13!

4.13 Integer conversion rank
1 Every integer type has an integer conversion rank defined as follows:

Then a list of ten different rules, including this one:

— The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type.

Without knowing more about conversion ranks, this rule gives me enough information to determine what 4.5 means for unsigned int values: unsigned int has the same rank as int. So I can rewrite 4.5 one more time like this:

4.5 Integral promotions (unsigned int edition)
1 [This space intentionally left blank]

Integral promotion of an unsigned int value doesn’t change a thing.

Where was I?

Now I can rewrite 5.3.1 with the knowledge that 1u requires no integral promotion:

5.3.1 Unary operators (unsigned int operand edition)
8 The [result of] the unary – operator … is the negation of its operand. The negative of an unsigned quantity is computed by subtracting its value from 2ⁿ, where n is the number of bits in the promoted operand. The type of the result is the type of the operand.

And, at long last, I get to do the negating. For an unsigned value that means:

[subtract] its value from 2ⁿ, where n is the number of bits in the promoted operand.

My unsigned int has 32 bits, so that would be 2³² – 1. Which in hexadecimal looks something like this:

  0x100000000
- 0x000000001
  0x0ffffffff

But that leading zero I’ve left on the result goes away because

The type of the result is the type of the (promoted) operand.

And I am now certain that I know how -1u becomes an unsigned int with the value 0xffffffff. In fact, it’s not even dependent on having a platform that uses two’s complement — nothing in the conversion relies on that.

But… when could this possibly ever matter?

For -1u? I don’t see this ever causing actual problems. There are situations that arise from the way that C++ integer literals are defined that can cause surprises (i.e. bugs) for the unsuspecting programmer.

There is a particular case described in the documentation for Visual C++ compiler warning C4146, but I think the rationale for that warning is wrong (or, at least, imprecise), but not because of something I’ve covered in this article. As I’ve already written far too many words about these three characters, I’ll keep that discussion for some time in the future.

The Growth of Modern C++ Support

Completing what I started here, I’ve charted the numbers from Christophe’s data for C++11, C++11 Concurrency, C++14 and C++17.

The data is taken entirely from the linked pdf with one exception: N3664 is a clarification that permits optimization, not a requirement for compliance. Compilers that do not perform this optimization are no less compliant with C++14. I’ve recomputed the percentages for all compiler versions to take this into account.

In addition to the references from the previous post, the approval date of C++14 was taken from http://en.wikipedia.org/wiki/C++14

The Growth of C++11 Support

Update: This chart has been updated and I’ve added charts for C++11 Concurrency, C++14, and C++17 here.

A few days ago, Christophe Riccio tweeted a link to a pdf that shows the level of support for “Modern C++” standards in four C++ compilers: Visual C++, GCC, Clang, and ICC.

One of the things I wanted to see was not just how support had advanced between versions of each compiler, but how compilers had changed relative to one another over time. I extracted the numbers for C++11 from Christophe’s document, found the release dates for each compiler, and created a chart that puts it all together.

It’s interesting to see how far behind Clang starts in comparison to the others, and that it ends up in a close dance with GCC on the way to full C++11 support. It also highlights how disappointing VC++ has been in terms of language feature advancement — particularly when VS2010 was ahead of Clang and ICC for C++11 features.

Creating the chart also served as an opportunity to play around with data visualization using Bokeh. As such, you can click on the chart above and you’ll see a version that you can zoom, pan, and resize (which is only a small part of what Bokeh offers). I intend to write about my experiences with Bokeh at a later date.

Release dates for each compiler were taken from the following pages:

Visual Studio: http://en.wikipedia.org/wiki/Microsoft_Visual_Studio
GCC: https://gcc.gnu.org/releases.html
Clang: http://llvm.org/releases/
ICC: http://en.wikipedia.org/wiki/Intel_C++_Compiler

The date used to mark the approval of the C++11 standard is taken from http://en.wikipedia.org/wiki/C++11

Learn more about caches & library implementations. Use fewer linked lists.

CppCon 2014 talk by Chandler Carruth:
“Efficiency with Algorithms, Performance with Data Structure”

It starts slow, but there’s plenty of good things in this one:

Don’t use linked lists [0]
std::map is terrible
Computers are made of memory hierarchies [1]
[If you’re going to use it, ] know how the standard library works [2]
std::unordered_map has a ridiculously long name and it isn’t a good hash map

[0] The thing about linked lists is that they’re at their worst when there is no spatial locality between the elements. Because of the way that std::list allocates elements, this will almost always be the case.

Linked lists can be useful for gathering data, but they’re expensive to traverse and modify. If you need to traverse the list more than once, consider using the first traversal to repack the list into a vector.

[1] The presentation does give some insight into the relative cost of cached memory accesses, and it would be remiss of me to not link to Tony Albrecht’s interactive cache performance visualizer. And std::vector is recommended over std::list for the spatial benefits (elements known to be near other elements, which is good). But the value of ordered accesses was overlooked in the presentation.

Put your data in a vector and process it in order from one end to the other. Any decent CPU will detect the ordered access and prefetch the data into the cache for you, and less time will be spent waiting for that data. It’s like magic.

[2] The presentation included two very specific examples regarding the inner workings of the standard library, one for std::vector, and one for std::unordered_list. It is right and good that you understand the workings of any library that you are using, but there were more general principles that were imho overlooked:

For std::vector and std::list (and not exclusive to them): JIT memory allocation is terrible — use what you know about the problem to minimize and manage memory allocations. To quote from Andreas Fredrikssons’ fine rant (which you should read):
Never make memory management decisions at the lowest level of a system.
Always explicitly keep a copy of intermediate results that are used more than once. Don’t trust the compiler to optimize for you — more to the point, understand why compilers legitimately can’t optimize away your lazy copypasta. (Mike Acton’s CppCon keynote has some material that covers this)

Unquestionably bad

Question 5:

Consider the following 6 data structures:

Stack

Queue

Hash table

Doubly-linked list

Binary search tree

Directed acyclic graph

Using these as the subject matter, construct 6 really good puns.

Answers:

After receiving a range of questions from different sources, I was unsure which to answer first — I was stack as to where to begin. And so because this was the last question that I received, it became the first that I answered.

—

Don’t get me wrong — I did appreciate the question. The capacity of my gratitude is, theoretically, unbounded. Thanqueue.

—

We have a cuckoo aviary. I keyp a record of each birth in a hatch table.

—

I noticed that I was leaning to one side. I spoke to a physician about it — he told me I was overweight because I was eating too much bread. My list, it seems, is linked to my dough-belly.

—

On a school trip to a pickle factory, my daughter went missing. I was able to climb the brinery search tree and spot her, though it took longer than I had hoped due to my poor balance.

—

While out walking, I deflected a cyclist’s gaffe, knocking him aside as he rode the wrong way down a one-way street. I looked down my nose at him and gave a topological snort to help him on his way.

The reader may decide whether the answers satisfy the requirements of the question.

Bugs of the day, 2013-04-11

Don’t delete code that does things

I got carried away while deleting some code that didn’t do anything and accidentally removed some parts that did do things — function calls with side effects that I didn’t fully understand. I make the same kind of change later to a different part of the codebase and got it right there. I didn’t think to go back and look for the possibility of this problem in my earlier change.

This one got checked in and caused problems for others later. It would have been caught by better testing on my part. (I’m convinced that I did test this and saw no problems, so either I didn’t actually test a build with the change or I my testing wasn’t really testing of the code affected)

Unravel the whole garment

In a related deletion, I managed to break another piece of code connected to one that I was modifying. I broke one of the assumptions of the other piece of code, which was far less obvious to me because they interacted via a mechanism that I was not familiar with.

Again, this would have been caught by better testing on my part.

My apologies to those affected. Clearly, I suck.

(On the upside, I’ve now learned a little bit more about another part of the codebase. This is a good thing.)

Bug review, 2013-04-10

A fairly quiet two weeks compared to the one before. The main reasons being a week of illness (I find fever and an ability to concentrate to be mutually exclusive :\) and a change in work focus to a time of information gathering and planning for a different project, which has resulted in less opportunity for writing new bugs.

Summary of the weeks

Wednesday 2013-03-27

An optimistic beginning to the new week: no bugs to write about, so I posted a photograph.

Thursday 2013-03-28

An example of using the compiler to help identify particular code patterns that I wanted to eliminate.

Friday 2013-03-29

An actual bug, and worse than a runtime bug: checked-in build breaker. Shameful.

Tuesday 2013-04-02

Implicitly finding bugs in other people’s code

Sunday 2013-04-07

Not at all bug related, but a point of curiosity that (thanks to a lot of input from others) turned into a interesting list of studios that have been around for more than twenty years.

Trends

Not a many bugs in the list this time, so doesn’t seem like there’s a lot to say about these two weeks — except that the rate of bugs does go down when you’re not writing any code…