authorCraig Jennings <c@cjennings.net>2024-04-07 13:41:34 -0500
committerCraig Jennings <c@cjennings.net>2024-04-07 13:41:34 -0500
commit754bbf7a25a8dda49b5d08ef0d0443bbf5af0e36 (patch)
treef1190704f78f04a2b0b4c977d20fe96a828377f1 /devdocs/c/language%2Fcharacter_constant.html
new repository
Diffstat (limited to 'devdocs/c/language%2Fcharacter_constant.html')
-rw-r--r--devdocs/c/language%2Fcharacter_constant.html106
1 files changed, 106 insertions, 0 deletions
diff --git a/devdocs/c/language%2Fcharacter_constant.html b/devdocs/c/language%2Fcharacter_constant.html
new file mode 100644
index 00000000..acf4a443
--- /dev/null
+++ b/devdocs/c/language%2Fcharacter_constant.html
@@ -0,0 +1,106 @@
+ <h1 id="firstHeading" class="firstHeading">Character constant</h1> <h3 id="Syntax"> Syntax</h3> <table class="t-sdsc-begin"> <tr class="t-sdsc"> <td> <code>'</code><span class="t-spar">c-char</span> <code>'</code> </td> <td> (1) </td> <td class="t-sdsc-nopad"> </td>
+</tr> <tr class="t-sdsc"> <td> <code>u8'</code><span class="t-spar">c-char</span> <code>'</code> </td> <td> (2) </td> <td> <span class="t-mark-rev t-since-c23">(since C23)</span> </td>
+</tr> <tr class="t-sdsc"> <td> <code>u'</code><span class="t-spar">c-char</span> <code>'</code> </td> <td> (3) </td> <td> <span class="t-mark-rev t-since-c11">(since C11)</span> </td>
+</tr> <tr class="t-sdsc"> <td> <code>U'</code><span class="t-spar">c-char</span> <code>'</code> </td> <td> (4) </td> <td> <span class="t-mark-rev t-since-c11">(since C11)</span> </td>
+</tr> <tr class="t-sdsc"> <td> <code>L'</code><span class="t-spar">c-char</span> <code>'</code> </td> <td> (5) </td> <td class="t-sdsc-nopad"> </td>
+</tr> <tr class="t-sdsc"> <td> <code>'</code><span class="t-spar">c-char-sequence</span> <code>'</code> </td> <td> (6) </td> <td class="t-sdsc-nopad"> </td>
+</tr> <tr class="t-sdsc"> <td> <code>L'</code><span class="t-spar">c-char-sequence</span> <code>'</code> </td> <td> (7) </td> <td class="t-sdsc-nopad"> </td>
+</tr> <tr class="t-sdsc"> <td> <code>u'</code><span class="t-spar">c-char-sequence</span> <code>'</code> </td> <td> (8) </td> <td> <span class="t-mark-rev t-since-c11">(since C11)</span><span class="t-mark-rev t-until-c23">(removed in C23)</span> </td>
+</tr> <tr class="t-sdsc"> <td> <code>U'</code><span class="t-spar">c-char-sequence</span> <code>'</code> </td> <td> (9) </td> <td> <span class="t-mark-rev t-since-c11">(since C11)</span><span class="t-mark-rev t-until-c23">(removed in C23)</span> </td>
+</tr>
+</table> <p>where</p>
+<ul>
+<li> <span class="t-spar">c-char</span> is either </li>
+<ul>
+<li> any character from the basic source character set other than single-quote (<code>'</code>), backslash (<code>\</code>), or the newline character. </li>
+<li> an escape sequence: one of the special character escapes <code>\'</code> <code>\"</code> <code>\?</code> <code>\\</code> <code>\a</code> <code>\b</code> <code>\f</code> <code>\n</code> <code>\r</code> <code>\t</code> <code>\v</code>, hex escapes <code>\x...</code> or octal escapes <code>\...</code> as defined in <a href="escape" title="c/language/escape">escape sequences</a>. </li>
+</ul>
+</ul> <table class="t-rev-begin"> <tr class="t-rev t-since-c99">
+<td> <ul><li>universal character name, <code>\u...</code> or <code>\U...</code> as defined in <a href="escape" title="c/language/escape">escape sequences</a>. </li></ul> </td> <td><span class="t-mark-rev t-since-c99">(since C99)</span></td>
+</tr> </table> <ul><li> <span class="t-spar">c-char-sequence</span> is a sequence of two or more <span class="t-spar">c-char</span>s. </li></ul> <div class="t-li1">
+<span class="t-li">1)</span> single-byte integer character constant, e.g. <code>'a'</code> or <code>'\n'</code> or <code>'\13'</code>. Such a constant has type <code>int</code> and a value equal to the representation of <span class="t-spar">c-char</span> in the execution character set as a value of type <code>char</code> mapped to <code>int</code>. If <span class="t-spar">c-char</span> is not representable as a single byte in the execution character set, the value is implementation-defined.</div> <div class="t-li1">
+<span class="t-li">2)</span> UTF-8 character constant, e.g. <code>u8'a'</code>. Such a constant has type <code>char8_t</code> and a value equal to the ISO 10646 code point value of <span class="t-spar">c-char</span>, provided that the code point value is representable with a single UTF-8 code unit (that is, <span class="t-spar">c-char</span> is in the range 0x0-0x7F, inclusive). If <span class="t-spar">c-char</span> is not representable with a single UTF-8 code unit, the program is ill-formed.</div> <table class="t-rev-begin"> <tr class="t-rev t-until-c23">
+<td> <span class="t-li">3)</span> 16-bit wide character constant, e.g. <code>u'貓'</code>, but not <code>u'🍌'</code> (<code>u'\U0001f34c'</code>). Such a constant has type <code>char16_t</code> and a value equal to the value of <span class="t-spar">c-char</span> in the 16-bit encoding produced by <code><a href="../string/multibyte/mbrtoc16" title="c/string/multibyte/mbrtoc16">mbrtoc16</a></code> (normally UTF-16). If <span class="t-spar">c-char</span> is not representable or maps to more than one 16-bit character, the value is implementation-defined. <span class="t-li">4)</span> 32-bit wide character constant, e.g. <code>U'貓'</code> or <code>U'🍌'</code>. Such a constant has type <code>char32_t</code> and a value equal to the value of <span class="t-spar">c-char</span> in the 32-bit encoding produced by <code><a href="../string/multibyte/mbrtoc32" title="c/string/multibyte/mbrtoc32">mbrtoc32</a></code> (normally UTF-32). If <span class="t-spar">c-char</span> is not representable or maps to more than one 32-bit character, the value is implementation-defined. </td> <td><span class="t-mark-rev t-until-c23">(until C23)</span></td>
+</tr> <tr class="t-rev t-since-c23">
+<td> <span class="t-li">3)</span> UTF-16 character constant, e.g. <code>u'貓'</code>, but not <code>u'🍌'</code> (<code>u'\U0001f34c'</code>). Such a constant has type <code>char16_t</code> and a value equal to the ISO 10646 code point value of <span class="t-spar">c-char</span>, provided that the code point value is representable with a single UTF-16 code unit (that is, <span class="t-spar">c-char</span> is in the range 0x0-0xD7FF or 0xE000-0xFFFF, inclusive). If <span class="t-spar">c-char</span> is not representable with a single UTF-16 code unit, the program is ill-formed. <span class="t-li">4)</span> UTF-32 character constant, e.g. <code>U'貓'</code> or <code>U'🍌'</code>. Such a constant has type <code>char32_t</code> and a value equal to the ISO 10646 code point value of <span class="t-spar">c-char</span>, provided that the code point value is representable with a single UTF-32 code unit (that is, <span class="t-spar">c-char</span> is in the range 0x0-0xD7FF or 0xE000-0x10FFFF, inclusive). If <span class="t-spar">c-char</span> is not representable with a single UTF-32 code unit, the program is ill-formed. </td> <td><span class="t-mark-rev t-since-c23">(since C23)</span></td>
+</tr> </table> <div class="t-li1">
+<span class="t-li">5)</span> wide character constant, e.g. <code>L'β'</code> or <code>L'貓'</code>. Such a constant has type <code>wchar_t</code> and a value equal to the value of <span class="t-spar">c-char</span> in the execution wide character set (that is, the value that would be produced by <code><a href="../string/multibyte/mbtowc" title="c/string/multibyte/mbtowc">mbtowc</a></code>). If <span class="t-spar">c-char</span> is not representable or maps to more than one wide character (e.g. a non-BMP value on Windows where <code>wchar_t</code> is 16-bit), the value is implementation-defined.</div> <div class="t-li1">
+<span class="t-li">6)</span> multicharacter constant, e.g. <code>'AB'</code>, has type <code>int</code> and an implementation-defined value.</div> <div class="t-li1">
+<span class="t-li">7)</span> wide multicharacter constant, e.g. <code>L'AB'</code>, has type <code>wchar_t</code> and an implementation-defined value.</div> <div class="t-li1">
+<span class="t-li">8)</span> 16-bit multicharacter constant, e.g. <code>u'CD'</code>, has type <code>char16_t</code> and an implementation-defined value.</div> <div class="t-li1">
+<span class="t-li">9)</span> 32-bit multicharacter constant, e.g. <code>U'XY'</code>, has type <code>char32_t</code> and an implementation-defined value.</div> <h3 id="Notes"> Notes</h3> <p>Multicharacter constants were inherited by C from the B programming language. Although not specified by the C standard, most compilers (MSVC is a notable exception) implement multicharacter constants as specified in B: the value of each char in the constant initializes a successive byte of the resulting integer, in big-endian zero-padded right-adjusted order, e.g. the value of <code>'\1'</code> is <code>0x00000001</code> and the value of <code>'\1\2\3\4'</code> is <code>0x01020304</code>.</p>
+<p>In C++, encodable ordinary character literals have type <code>char</code>, rather than <code>int</code>.</p>
+<p>Unlike <a href="integer_constant" title="c/language/integer constant">integer constants</a>, a character constant may have a negative value if <code>char</code> is signed: on such implementations <code>'\xFF'</code> is an <code>int</code> with the value <code>-1</code>.</p>
+<p>When used in a controlling expression of <a href="../preprocessor/conditional" title="c/preprocessor/conditional"><code> #if</code></a> or <a href="../preprocessor/conditional" title="c/preprocessor/conditional"><code> #elif</code></a>, character constants may be interpreted in terms of the source character set, the execution character set, or some other implementation-defined character set.</p>
+<p>16/32-bit multicharacter constants are not widely supported and were removed in C23. Some common implementations (e.g. clang) do not accept them at all.</p>
+<h3 id="Example"> Example</h3> <div class="t-example"> <div class="c source-c"><pre data-language="c">#include &lt;stddef.h&gt;
+#include &lt;stdio.h&gt;
+#include &lt;uchar.h&gt;
+
+int main(void)
+{
+    printf("constant value \n");
+    printf("-------- ----------\n");
+
+    // integer character constants
+    int c1='a'; printf("'a':\t %#010x\n", c1);
+    int c2='🍌'; printf("'🍌':\t %#010x\n\n", c2); // implementation-defined
+
+    // multicharacter constant
+    int c3='ab'; printf("'ab':\t %#010x\n\n", c3); // implementation-defined
+
+    // 16-bit wide character constants
+    char16_t uc1 = u'a'; printf("'a':\t %#010x\n", (int)uc1);
+    char16_t uc2 = u'¢'; printf("'¢':\t %#010x\n", (int)uc2);
+    char16_t uc3 = u'猫'; printf("'猫':\t %#010x\n", (int)uc3);
+    // implementation-defined until C23 (🍌 maps to two 16-bit characters);
+    // ill-formed since C23
+    char16_t uc4 = u'🍌'; printf("'🍌':\t %#010x\n\n", (int)uc4);
+
+    // 32-bit wide character constants
+    char32_t Uc1 = U'a'; printf("'a':\t %#010x\n", (int)Uc1);
+    char32_t Uc2 = U'¢'; printf("'¢':\t %#010x\n", (int)Uc2);
+    char32_t Uc3 = U'猫'; printf("'猫':\t %#010x\n", (int)Uc3);
+    char32_t Uc4 = U'🍌'; printf("'🍌':\t %#010x\n\n", (int)Uc4);
+
+    // wide character constants
+    wchar_t wc1 = L'a'; printf("'a':\t %#010x\n", (int)wc1);
+    wchar_t wc2 = L'¢'; printf("'¢':\t %#010x\n", (int)wc2);
+    wchar_t wc3 = L'猫'; printf("'猫':\t %#010x\n", (int)wc3);
+    wchar_t wc4 = L'🍌'; printf("'🍌':\t %#010x\n\n", (int)wc4);
+}</pre></div> <p>Possible output:</p>
+<div class="text source-text"><pre data-language="c">constant value
+-------- ----------
+'a': 0x00000061
+'🍌': 0xf09f8d8c
+
+'ab': 0x00006162
+
+'a': 0x00000061
+'¢': 0x000000a2
+'猫': 0x0000732b
+'🍌': 0x0000df4c
+
+'a': 0x00000061
+'¢': 0x000000a2
+'猫': 0x0000732b
+'🍌': 0x0001f34c
+
+'a': 0x00000061
+'¢': 0x000000a2
+'猫': 0x0000732b
+'🍌': 0x0001f34c</pre></div> </div> <h3 id="References"> References</h3> <ul>
+<li> C17 standard (ISO/IEC 9899:2018): </li>
+<ul><li> 6.4.4.4 Character constants (p: 48-50) </li></ul>
+<li> C11 standard (ISO/IEC 9899:2011): </li>
+<ul><li> 6.4.4.4 Character constants (p: 67-70) </li></ul>
+<li> C99 standard (ISO/IEC 9899:1999): </li>
+<ul><li> 6.4.4.4 Character constants (p: 59-61) </li></ul>
+<li> C89/C90 standard (ISO/IEC 9899:1990): </li>
+<ul><li> 3.1.3.4 Character constants </li></ul>
+</ul> <h3 id="See_also"> See also</h3> <table class="t-dsc-begin"> <tr class="t-dsc"> <td colspan="2"> <span><a href="https://en.cppreference.com/w/cpp/language/character_literal" title="cpp/language/character literal">C++ documentation</a></span> for <span class=""><span>Character literal</span></span> </td>
+</tr> </table> <div class="_attribution">
+ <p class="_attribution-p">
+ &copy; cppreference.com<br>Licensed under the Creative Commons Attribution-ShareAlike Unported License v3.0.<br>
+ <a href="https://en.cppreference.com/w/c/language/character_constant" class="_attribution-link">https://en.cppreference.com/w/c/language/character_constant</a>
+ </p>
+</div>