Some notes on memcpy

Recently I encountered a bug due to memcpy() being used like this:

    memcpy( elem, elem + 1, n * sizeof(Element) );

What’s the problem?

The prototype for memcpy is defined something like:

    void* memcpy(void* dest, const void* source, size_t size);

memcpy is wonderfully simple – use it to copy size bytes from source to dest. But the definition of memcpy from the standard goes something like this:

“The memcpy() function shall copy n bytes from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.” [source]

If you notice the example at the top of the email, the copy is from elem + 1 to elem with a size of (potentially) more than one Element. It’s quite possible that objects overlap. And they did.

The magic word in the definition is undefined. When you do something that is undefined in C or C++, anything can happen. The compiler can format your hard drive, email a resignation letter to your manager, there could be nose goblins. Or your data could become corrupt. Relying on undefined behavior is a bad idea.

Nose goblins notwithstanding, what can actually go wrong for memcpy? If memcpy is implemented as walking forward through the source range and writing to the destination, the above case should be OK, right?

Well, maybe.

But if it starts at the end of the source array and works backwards, the destination range will end up with corrupt data.

Probably.

Even if memcpy walks forward, data could be corrupted if the source has a lower address than the destination (and they overlap). There are plenty of ways to write a fast memcpy implementation that can corrupt data for overlapping ranges. And implementations that will cause problems exist and are widely used: e.g. http://lwn.net/Articles/414467/ Even better are implementations that will step forward through memory for small values of size, and backwards for larger ones.

Conclusion: don’t use memcpy for overlapping ranges. Instead, use memmove which is well defined if the ranges overlap:

“The memmove() function shall copy n bytes from the object pointed to by s2 into the object pointed to by s1. Copying takes place as if the n bytes from the object pointed to by s2 are first copied into a temporary array of n bytes that does not overlap the objects pointed to by s1 and s2, and then the n bytes from the temporary array are copied into the object pointed to by s1.” — [source]

(With thanks to Andreas for his help in finding the source of the bug that inspired this piece)

One thought on “Some notes on memcpy”

AMD64 is faster copying from high to low, EM64T from low to high. So this isn’t of academic interest. Flick the right code generation flags on the right processor and you’ll encounter it.

What’s the problem?

One thought on “Some notes on memcpy”

Leave a Reply