summaryrefslogtreecommitdiff
path: root/devdocs/c/language%2Fescape.html
diff options
context:
space:
mode:
Diffstat (limited to 'devdocs/c/language%2Fescape.html')
-rw-r--r--devdocs/c/language%2Fescape.html82
1 files changed, 82 insertions, 0 deletions
diff --git a/devdocs/c/language%2Fescape.html b/devdocs/c/language%2Fescape.html
new file mode 100644
index 00000000..84167c53
--- /dev/null
+++ b/devdocs/c/language%2Fescape.html
@@ -0,0 +1,82 @@
+ <h1 id="firstHeading" class="firstHeading">Escape sequences</h1> <p>Escape sequences are used to represent certain special characters within <a href="string_literal" title="c/language/string literal">string literals</a> and <a href="character_constant" title="c/language/character constant">character constants</a>.</p>
+<p>The following escape sequences are available. ISO C requires a diagnostic if the backslash is followed by any character not listed here:</p>
+<table class="wikitable"> <tr> <th> Escape<br> sequence </th> <th> Description </th> <th> Representation </th>
+</tr> <tr> <th colspan="3"> Simple escape sequences </th>
+</tr> <tr> <td> <code>\'</code> </td> <td> single quote </td> <td> byte <code>0x27</code> in ASCII encoding </td>
+</tr> <tr> <td> <code>\"</code> </td> <td> double quote </td> <td> byte <code>0x22</code> in ASCII encoding </td>
+</tr> <tr> <td> <code>\?</code> </td> <td> question mark </td> <td> byte <code>0x3f</code> in ASCII encoding </td>
+</tr> <tr> <td> <code>\\</code> </td> <td> backslash </td> <td> byte <code>0x5c</code> in ASCII encoding </td>
+</tr> <tr> <td> <code>\a</code> </td> <td> audible bell </td> <td> byte <code>0x07</code> in ASCII encoding </td>
+</tr> <tr> <td> <code>\b</code> </td> <td> backspace </td> <td> byte <code>0x08</code> in ASCII encoding </td>
+</tr> <tr> <td> <code>\f</code> </td> <td> form feed - new page </td> <td> byte <code>0x0c</code> in ASCII encoding </td>
+</tr> <tr> <td> <code>\n</code> </td> <td> line feed - new line </td> <td> byte <code>0x0a</code> in ASCII encoding </td>
+</tr> <tr> <td> <code>\r</code> </td> <td> carriage return </td> <td> byte <code>0x0d</code> in ASCII encoding </td>
+</tr> <tr> <td> <code>\t</code> </td> <td> horizontal tab </td> <td> byte <code>0x09</code> in ASCII encoding </td>
+</tr> <tr> <td> <code>\v</code> </td> <td> vertical tab </td> <td> byte <code>0x0b</code> in ASCII encoding </td>
+</tr> <tr> <th colspan="3"> Numeric escape sequences </th>
+</tr> <tr> <td> <code>\<i>nnn</i></code> </td> <td> arbitrary octal value </td> <td> code unit <code><i>nnn</i></code> </td>
+</tr> <tr> <td> <code>\x<i>n...</i></code> </td> <td> arbitrary hexadecimal value </td> <td> code unit <code><i>n...</i></code> (arbitrary number of hexadecimal digits) </td>
+</tr> <tr> <th colspan="3"> Universal character names </th>
+</tr> <tr> <td> <code>\u<i>nnnn</i></code> <span class="t-mark-rev t-since-c99">(since C99)</span> </td> <td> <a href="https://en.wikipedia.org/wiki/Unicode" class="extiw" title="enwiki:Unicode"> Unicode</a> value in allowed range;<br>may result in several code units </td> <td> code point <code>U+<i>nnnn</i></code> </td>
+</tr> <tr> <td> <code>\U<i>nnnnnnnn</i></code> <span class="t-mark-rev t-since-c99">(since C99)</span> </td> <td> <a href="https://en.wikipedia.org/wiki/Unicode" class="extiw" title="enwiki:Unicode"> Unicode</a> value in allowed range;<br>may result in several code units </td> <td> code point <code>U+<i>nnnnnnnn</i></code> </td>
+</tr>
+</table> <table class="t-rev-begin"> <tr class="t-rev t-since-c99">
+<td> <h3 id="Range_of_universal_character_names"> Range of universal character names</h3> <p>If a universal character name corresponds to a code point that is not <code>0x24</code> (<span class="st0">'$'</span>), <code>0x40</code> (<span class="st0">'@'</span>), nor <code>0x60</code> (<span class="st0">'`'</span>) and less than <code>0xA0</code>, or a surrogate code point (the range <code>0xD800-0xDFFF</code>, inclusive)<span class="t-rev-inl t-since-c23"><span>, or greater than <code>0x10FFFF</code>, i.e. not a Unicode code point</span><span><span class="t-mark-rev t-since-c23">(since C23)</span></span></span>, the program is ill-formed. In other words, members of <a href="translation_phases" title="c/language/translation phases">basic source character set</a> and control characters (in ranges <code>0x0-0x1F</code> and <code>0x7F-0x9F</code>) cannot be expressed in universal character names.</p>
+</td> <td><span class="t-mark-rev t-since-c99">(since C99)</span></td>
+</tr> </table> <h3 id="Notes"> Notes</h3> <p><code>\0</code> is the most commonly used octal escape sequence, because it represents the terminating null character in null-terminated strings.</p>
+<p>The new-line character <code>\n</code> has special meaning when used in <a href="../io" title="c/io">text mode I/O</a>: it is converted to the OS-specific newline byte or byte sequence.</p>
+<p>Octal escape sequences have a length limit of three octal digits but terminate at the first character that is not a valid octal digit if encountered sooner.</p>
+<p>Hexadecimal escape sequences have no length limit and terminate at the first character that is not a valid hexadecimal digit. If the value represented by a single hexadecimal escape sequence does not fit the range of values represented by the character type used in this string literal or character constant (<span class="kw4">char</span><span class="t-rev-inl t-since-c23"><span>, char8_t</span><span><span class="t-mark-rev t-since-c23">(since C23)</span></span></span><span class="t-rev-inl t-since-c11"><span>, char16_t, char32_t</span><span><span class="t-mark-rev t-since-c11">(since C11)</span></span></span>, or <span class="kw4">wchar_t</span>), the result is unspecified.</p>
+<table class="t-rev-begin"> <tr class="t-rev t-since-c99">
+<td> <p>A universal character name in a narrow string literal <span class="t-rev-inl t-since-c11"><span>or a 16-bit string literal</span><span><span class="t-mark-rev t-since-c11">(since C11)</span></span></span> may map to more than one code unit, e.g. <code>\U0001f34c</code> is 4 <span class="kw4">char</span> code units in UTF-8 (<code>\xF0\x9F\x8D\x8C</code>)<span class="t-rev-inl t-since-c11"><span> and 2 char16_t code units in UTF-16 (<code>\xD83C\xDF4C</code>)</span><span><span class="t-mark-rev t-since-c11">(since C11)</span></span></span>.</p>
+</td> <td><span class="t-mark-rev t-since-c99">(since C99)</span></td>
+</tr> </table> <table class="t-rev-begin"> <tr class="t-rev t-since-c99 t-until-c23">
+<td> <p>A universal character name corresponding to a code pointer greater than <code>0x10FFFF</code> (which is undefined in ISO/ISC 10646) can be used in <a href="character_constant" title="c/language/character constant">character constants</a> and <a href="string_literal" title="c/language/string literal">string literals</a>. Such usage is not allowed in C++20.</p>
+</td> <td>
+<span class="t-mark-rev t-since-c99">(since C99)</span><br><span class="t-mark-rev t-until-c23">(until C23)</span>
+</td>
+</tr> </table> <table class="t-rev-begin"> <tr class="t-rev t-until-c23">
+<td> <p>The question mark escape sequence <code>\?</code> is used to prevent <a href="operator_alternative" title="c/language/operator alternative">trigraphs</a> from being interpreted inside string literals: a string such as <code>"??/"</code> is compiled as <code>"\"</code>, but if the second question mark is escaped, as in <code>"?\?/"</code>, it becomes <code>"??/"</code></p>
+</td> <td><span class="t-mark-rev t-until-c23">(until C23)</span></td>
+</tr> </table> <h3 id="Example"> Example</h3> <div class="t-example"> <div class="c source-c"><pre data-language="c">#include &lt;stdio.h&gt;
+
+int main(void)
+{
+ printf("This\nis\na\ntest\n\nShe said, \"How are you?\"\n");
+}</pre></div> <p>Output:</p>
+<div class="text source-text"><pre data-language="c">This
+is
+a
+test
+
+She said, "How are you?"</pre></div> </div> <h3 id="References"> References</h3> <ul>
+<li> C17 standard (ISO/IEC 9899:2018): </li>
+<ul>
+<li> 5.2.2 Character display semantics (p: 18-19) </li>
+<li> 6.4.3 Universal Character names (p: 44) </li>
+<li> 6.4.4.4 Character constants (p: 48-50) </li>
+</ul>
+<li> C11 standard (ISO/IEC 9899:2011): </li>
+<ul>
+<li> 5.2.2 Character display semantics (p: 24-25) </li>
+<li> 6.4.3 Universal Character names (p: 61) </li>
+<li> 6.4.4.4 Character constants (p: 67-70) </li>
+</ul>
+<li> C99 standard (ISO/IEC 9899:1999): </li>
+<ul>
+<li> 5.2.2 Character display semantics (p: 19-20) </li>
+<li> 6.4.3 Universal Character names (p: 53) </li>
+<li> 6.4.4.4 Character constants (p: 59-61) </li>
+</ul>
+<li> C89/C90 standard (ISO/IEC 9899:1990): </li>
+<ul>
+<li> 2.2.2 Character display semantics </li>
+<li> 3.1.3.4 Character constants </li>
+</ul>
+</ul> <h3 id="See_also"> See also</h3> <ul><li> <a href="ascii" title="c/language/ascii">ASCII chart</a> </li></ul> <table class="t-dsc-begin"> <tr class="t-dsc"> <td colspan="2"> <span><a href="https://en.cppreference.com/w/cpp/language/escape" title="cpp/language/escape">C++ documentation</a></span> for <span class=""><span>Escape sequences</span></span> </td>
+</tr> </table> <div class="_attribution">
+ <p class="_attribution-p">
+ &copy; cppreference.com<br>Licensed under the Creative Commons Attribution-ShareAlike Unported License v3.0.<br>
+ <a href="https://en.cppreference.com/w/c/language/escape" class="_attribution-link">https://en.cppreference.com/w/c/language/escape</a>
+ </p>
+</div>