Is signed integer overflow still undefined behavior in C++?

c++ c++11 undefined-behavior integer-overflow

As we know, signed integer overflow is undefined behavior. But there is something interesting in C++11 cstdint documentation:

signed integer type with width of exactly 8, 16, 32 and 64 bits respectively with no padding bits and using 2's complement for negative values (provided only if the implementation directly supports the type)

_{See link}

And here is my question: since the standard says explicitly that for int8_t, int16_t, int32_t and int64_t negative numbers are 2's complement, is still overflow of these types an undefined behavior?

Edit I checked C++11 and C11 Standards and here is what I found:

C++11, §18.4.1:

The header defines all functions, types, and macros the same as 7.20 in the C standard.

C11, §7.20.1.1:

The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes such a signed integer type with a width of exactly 8 bits.

Never forget that the only primary documentation for C++ is the standard. Everything else, even a wiki like CppReference, is a secondary source. That doesn't mean it's wrong; just not entirely reliable.

I would expect it to be UB, there is no exemption for these types in C, I don't see why C++ would add one.

possible duplicate of Do C99 signed integer types defined in stdint.h exhibit well-defined behaviour in case of an overflow?

I'm a bit confused: where is the verb in the sentence "signed integer type with width of exactly 8, 16, 32 and 64 bits respectively with no padding bits and using 2's complement for negative values (provided only if the implementation directly supports the type)?" Is a bit missing? What does it mean?

C++11 is based on C99, not C11. But this is not important anyway

Andy Prowl

is still overflow of these types an undefined behavior?

Yes. Per Paragraph 5/4 of the C++11 Standard (regarding any expression in general):

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined. [...]

The fact that a two's complement representation is used for those signed types does not mean that arithmetic modulo 2^n is used when evaluating expressions of those types.

Concerning unsigned arithmetic, on the other hand, the Standard explicitly specifies that (Paragraph 3.9.1/4):

Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer

This means that the result of an unsigned arithmetic operation is always "mathematically defined", and the result is always within the representable range; therefore, 5/4 does not apply. Footnote 46 explains this:

46) This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.

This paragraph would also imply that unsigned overflow is undefined, which is not.

@Archie: Not really, since unsigned values are defined modulo the unsigned range.

@Archie: I tried to clarify, but basically you got the answer from LightnessRacesinOrbit

It actually does not matter if unsigned overflow is defined or not if it cannot occur due to modulo calculation...

There are unsigned operations whose result isn't "mathematically defined" - particularly division by zero - so perhaps your wording isn't quite what you meant in that sentence. ITYM that when the result is mathematically defined, then it's also defined in C++.

ecatmur

Just because a type is defined to use 2s complement representation, it doesn't follow that arithmetic overflow in that type becomes defined.

The undefined behaviour of signed arithmetic overflow is used to enable optimisations; for example, the compiler can assume that if a > b then a + 1 > b also; this doesn't hold in unsigned arithmetic where the second check would need to be carried out because of the possibility that a + 1 might wrap around to 0. Also, some platforms can generate a trap signal on arithmetic overflow (see e.g. http://www.gnu.org/software/libc/manual/html_node/Program-Error-Signals.html); the standard continues to allow this to occur.

It may be worth noting that many people "worry" more about the possibilities of traps, but compiler assumptions are actually more insidious (one of the reasons I wish there were a category between Implementation-Defined and Undefined Behavior--unlike implementation-defined behavior which requires particular implementations to do something in consistent documented fashion, I'd like an "implementation-constrained" behavior which would require implementations to specify everything that could happen as a consequence of something (the specs could explicitly include Undefined Behavior, but...

...implementations would be encouraged to be more specific when practical). On a hardware where two's-complement numbers would "wrap" naturally, there's no sensible reason for code which wants a wrapped-integer result to execute many instructions trying to perform without integer overflow a computation which the hardware could do in just one or two instructions.

@supercat In fact, the code which wants wrapped result can (on 2's complement CPUs) just cast operands to corresponding unsigned types and perform the operation (and then cast back, getting implementation-defined value): this works for addition, subtraction and multiplication. The only problem is with division, modulo and such functions as abs. For those operations when it works it doesn't require more instructions than with signed arthmetic.

@Ruslan: In cases where code needs a precisely-wrapped result, a cast to unsigned would be ugly but not necessarily generate extra code. A bigger issue would be with code that needs to quickly identify "potentially interesting" candidates, which will spend most of its time rejecting uninteresting candidates. If one gives a compiler the freedom to arbitrarily keep or discard extra precision with signed integer values, but requires that a cast back to an integer type truncate any such precision, that will enable most of the useful optimizations that would be achieved by making overflow UB, ...

...but would allow code that needs precise wrapping to use one cast rather than two (e.g. (int)(x+y)>z would compare a wrapped result), and would also allow programmers to write x+y>z in cases where it would be acceptable for code to yield 0 or 1 in case of overflow provided it has no other side-effect. If either 0 or 1 would be an equally acceptable result, letting the programmer write that rather than either (long)x+y>z or (int)((unsigned)x+y)>z would allow compilers to select whichever of the latter functions was cheaper in any given context [each would be cheaper in some cases].

Community

I would bet so.

From the standard documentation (pg.4 and 5):

1.3.24 undefined behavior behavior for which this International Standard imposes no requirements [ Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed.-- end note]

Is signed integer overflow still undefined behavior in C++?

Follow WeChat

Want to stay one step ahead of the latest teleworks?

相似问题

Platform

Support

Links

Contact US