| author | Craig Jennings <c@cjennings.net> | 2024-04-07 13:41:34 -0500 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2024-04-07 13:41:34 -0500 |
| commit | 754bbf7a25a8dda49b5d08ef0d0443bbf5af0e36 (patch) | |
| tree | f1190704f78f04a2b0b4c977d20fe96a828377f1 /devdocs/c/language%2Fbehavior.html | |
new repository
Diffstat (limited to 'devdocs/c/language%2Fbehavior.html')
| -rw-r--r-- | devdocs/c/language%2Fbehavior.html | 166 |
1 files changed, 166 insertions, 0 deletions
diff --git a/devdocs/c/language%2Fbehavior.html b/devdocs/c/language%2Fbehavior.html new file mode 100644 index 00000000..0a883ffc --- /dev/null +++ b/devdocs/c/language%2Fbehavior.html @@ -0,0 +1,166 @@ + <h1 id="firstHeading" class="firstHeading">Undefined behavior</h1> <p>The C language standard precisely specifies the <a href="as_if" title="c/language/as if">observable behavior</a> of C language programs, except for the ones in the following categories:</p> +<ul> +<li> <i>undefined behavior</i> - there are no restrictions on the behavior of the program. Examples of undefined behavior are memory accesses outside of array bounds, signed integer overflow, null pointer dereference, modification of the same scalar <a href="eval_order" title="c/language/eval order">more than once</a> in an expression without sequence points, access to an object through a pointer of a different type, etc. Compilers are not required to diagnose undefined behavior (although many simple situations are diagnosed), and the compiled program is not required to do anything meaningful. </li> +<li> <i>unspecified behavior</i> - two or more behaviors are permitted and the implementation is not required to document the effects of each behavior. For example, <a href="eval_order" title="c/language/eval order">order of evaluation</a>, whether identical <a href="string_literal" title="c/language/string literal">string literals</a> are distinct, etc. Each unspecified behavior results in one of a set of valid results and may produce a different result when repeated in the same program. </li> +<li> <i>implementation-defined behavior</i> - unspecified behavior where each implementation documents how the choice is made. For example, number of bits in a byte, or whether signed integer right shift is arithmetic or logical. </li> +<li> <i>locale-specific behavior</i> - implementation-defined behavior that depends on the <a href="../locale/setlocale" title="c/locale/setlocale">currently chosen locale</a>. 
For example, whether <code><a href="../string/byte/islower" title="c/string/byte/islower">islower</a></code> returns true for any character other than the 26 lowercase Latin letters. </li> +</ul> <p>(Note: <a href="conformance" title="c/language/conformance">Strictly conforming</a> programs do not depend on any unspecified, undefined, or implementation-defined behavior)</p> +<p>Compilers are required to issue diagnostic messages (either errors or warnings) for any program that violates any C syntax rule or semantic constraint, even if its behavior is specified as undefined or implementation-defined or if the compiler provides a language extension that allows it to accept such a program. Diagnostics for undefined behavior are not otherwise required.</p> +<h3 id="UB_and_optimization"> UB and optimization</h3> <p>Because correct C programs are free of undefined behavior, compilers may produce unexpected results when a program that actually has UB is compiled with optimization enabled:</p> +<p>For example,</p> +<h4 id="Signed_overflow"> Signed overflow</h4> <div class="c source-c"><pre data-language="c">int foo(int x) +{ + return x + 1 > x; // either true or UB due to signed overflow +}</pre></div> <p>may be compiled as (<a rel="nofollow" class="external text" href="https://godbolt.org/z/9dh7b71TK">demo</a>)</p> +<div class="c source-c"><pre data-language="c">foo: + mov eax, 1 + ret</pre></div> <h4 id="Access_out_of_bounds"> Access out of bounds</h4> <div class="c source-c"><pre data-language="c">int table[4] = {0}; +int exists_in_table(int v) +{ + // return 1 in one of the first 4 iterations or UB due to out-of-bounds access + for (int i = 0; i <= 4; i++) + if (table[i] == v) + return 1; + return 0; +}</pre></div> <p>May be compiled as (<a rel="nofollow" class="external text" href="https://godbolt.org/z/48bn19Tsb">demo</a>)</p> +<div class="c source-c"><pre data-language="c">exists_in_table: + mov eax, 1 + ret</pre></div> <h4 id="Uninitialized_scalar"> 
Uninitialized scalar</h4> <div class="c source-c"><pre data-language="c">_Bool p; // uninitialized local variable +if (p) // UB access to uninitialized scalar + puts("p is true"); +if (!p) // UB access to uninitialized scalar + puts("p is false");</pre></div> <p>May produce the following output (observed with an older version of gcc):</p> +<div class="c source-c"><pre data-language="c">p is true +p is false</pre></div> <div class="c source-c"><pre data-language="c">size_t f(int x) +{ + size_t a; + if (x) // either x nonzero or UB + a = 42; + return a; +}</pre></div> <p>May be compiled as (<a rel="nofollow" class="external text" href="https://godbolt.org/z/9nz6EMPTG">demo</a>)</p> +<div class="c source-c"><pre data-language="c">f: + mov eax, 42 + ret</pre></div> <h4 id="Invalid_scalar"> Invalid scalar</h4> <div class="c source-c"><pre data-language="c">int f(void) +{ + _Bool b = 0; + unsigned char* p = (unsigned char*)&b; + *p = 10; + // reading from b is now UB + return b == 0; +}</pre></div> <p>May be compiled as (<a rel="nofollow" class="external text" href="https://godbolt.org/z/rjx77bjoh">demo</a>)</p> +<div class="c source-c"><pre data-language="c">f: + mov eax, 11 + ret</pre></div> <h4 id="Null_pointer_dereference"> Null pointer dereference</h4> <div class="c source-c"><pre data-language="c">int foo(int* p) +{ + int x = *p; + if (!p) + return x; // Either UB above or this branch is never taken + else + return 0; +} + +int bar() +{ + int* p = NULL; + return *p; // Unconditional UB +}</pre></div> <p>may be compiled as (<a rel="nofollow" class="external text" href="https://godbolt.org/z/8jnjMjcPz">demo</a>)</p> +<div class="c source-c"><pre data-language="c">foo: + xor eax, eax + ret +bar: + ret</pre></div> <h4 id="Access_to_pointer_passed_to_realloc"> Access to pointer passed to <code><a href="../memory/realloc" title="c/memory/realloc">realloc</a></code> +</h4> <div class="t-example"> +<p>Choose clang to observe the output shown</p> +<div class="c 
source-c"><pre data-language="c">#include <stdio.h> +#include <stdlib.h> + +int main(void) +{ + int *p = (int*)malloc(sizeof(int)); + int *q = (int*)realloc(p, sizeof(int)); + *p = 1; // UB access to a pointer that was passed to realloc + *q = 2; + if (p == q) // UB access to a pointer that was passed to realloc + printf("%d%d\n", *p, *q); +}</pre></div> <p>Possible output:</p> +<div class="text source-text"><pre data-language="c">12</pre></div> </div> <h4 id="Infinite_loop_without_side-effects"> Infinite loop without side-effects</h4> <div class="t-example"> +<p>Choose clang to observe the output shown</p> +<div class="c source-c"><pre data-language="c">#include <stdio.h> + +int fermat() +{ + const int MAX = 1000; + // Endless loop with no side effects is UB + for (int a = 1, b = 1, c = 1; 1;) + { + if (((a * a * a) == ((b * b * b) + (c * c * c)))) + return 1; + ++a; + if (a > MAX) + { + a = 1; + ++b; + } + if (b > MAX) + { + b = 1; + ++c; + } + if (c > MAX) + c = 1; + } + return 0; +} + +int main(void) +{ + if (fermat()) + puts("Fermat's Last Theorem has been disproved."); + else + puts("Fermat's Last Theorem has not been disproved."); +}</pre></div> <p>Possible output:</p> +<div class="text source-text"><pre data-language="c">Fermat's Last Theorem has been disproved.</pre></div> </div> <h3 id="References"> References</h3> <ul> +<li> C23 standard (ISO/IEC 9899:2023): </li> +<ul> +<li> 3.4 Behavior (p: TBD) </li> +<li> 4 Conformance (p: TBD) </li> +</ul> +<li> C17 standard (ISO/IEC 9899:2018): </li> +<ul> +<li> 3.4 Behavior (p: 3-4) </li> +<li> 4 Conformance (p: 8) </li> +</ul> +<li> C11 standard (ISO/IEC 9899:2011): </li> +<ul> +<li> 3.4 Behavior (p: 3-4) </li> +<li> 4/2 Undefined behavior (p: 8) </li> +</ul> +<li> C99 standard (ISO/IEC 9899:1999): </li> +<ul> +<li> 3.4 Behavior (p: 3-4) </li> +<li> 4/2 Undefined behavior (p: 7) </li> +</ul> +<li> C89/C90 standard (ISO/IEC 9899:1990): </li> +<ul><li> 1.6 DEFINITIONS OF TERMS </li></ul> +</ul> <h3 
id="External_links"> External links</h3> <table> <tr style="vertical-align:top;"> <td>1. </td> <td> +<a rel="nofollow" class="external text" href="https://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html">What Every C Programmer Should Know About Undefined Behavior #1/3</a> </td> +</tr> <tr style="vertical-align:top;"> <td>2. </td> <td> +<a rel="nofollow" class="external text" href="https://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html">What Every C Programmer Should Know About Undefined Behavior #2/3</a> </td> +</tr> <tr style="vertical-align:top;"> <td>3. </td> <td> +<a rel="nofollow" class="external text" href="https://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html">What Every C Programmer Should Know About Undefined Behavior #3/3</a> </td> +</tr> <tr style="vertical-align:top;"> <td>4. </td> <td> +<a rel="nofollow" class="external text" href="https://blogs.msdn.com/b/oldnewthing/archive/2014/06/27/10537746.aspx">Undefined behavior can result in time travel (among other things, but time travel is the funkiest)</a> </td> +</tr> <tr style="vertical-align:top;"> <td>5. </td> <td> +<a rel="nofollow" class="external text" href="https://www.cs.utah.edu/~regehr/papers/overflow12.pdf">Understanding Integer Overflow in C/C++</a> </td> +</tr> <tr style="vertical-align:top;"> <td>6. </td> <td> +<a rel="nofollow" class="external text" href="https://web.archive.org/web/20201108094235/https://kukuruku.co/post/undefined-behavior-and-fermats-last-theorem/">Undefined Behavior and Fermat’s Last Theorem</a> </td> +</tr> <tr style="vertical-align:top;"> <td>7. 
</td> <td> +<a rel="nofollow" class="external text" href="https://lwn.net/Articles/342330/">Fun with NULL pointers, part 1</a> (local exploit in Linux 2.6.30 caused by UB due to null pointer dereference) </td> +</tr> +</table> <h3 id="See_also"> See also</h3> <table class="t-dsc-begin"> <tr class="t-dsc"> <td colspan="2"> <span><a href="https://en.cppreference.com/w/cpp/language/ub" title="cpp/language/ub">C++ documentation</a></span> for <span class=""><span>Undefined behavior</span></span> </td> +</tr> </table> <div class="_attribution"> + <p class="_attribution-p"> + © cppreference.com<br>Licensed under the Creative Commons Attribution-ShareAlike Unported License v3.0.<br> + <a href="https://en.cppreference.com/w/c/language/behavior" class="_attribution-link">https://en.cppreference.com/w/c/language/behavior</a> + </p> +</div> |
