<h1 id="firstHeading" class="firstHeading">Character constant</h1> <h3 id="Syntax"> Syntax</h3> <table class="t-sdsc-begin"> <tr class="t-sdsc"> <td> <code>'</code><span class="t-spar">c-char</span> <code>'</code> </td> <td> (1) </td> <td class="t-sdsc-nopad"> </td>
</tr> <tr class="t-sdsc"> <td> <code>u8'</code><span class="t-spar">c-char</span> <code>'</code> </td> <td> (2) </td> <td> <span class="t-mark-rev t-since-c23">(since C23)</span> </td>
</tr> <tr class="t-sdsc"> <td> <code>u'</code><span class="t-spar">c-char</span> <code>'</code> </td> <td> (3) </td> <td> <span class="t-mark-rev t-since-c11">(since C11)</span> </td>
</tr> <tr class="t-sdsc"> <td> <code>U'</code><span class="t-spar">c-char</span> <code>'</code> </td> <td> (4) </td> <td> <span class="t-mark-rev t-since-c11">(since C11)</span> </td>
</tr> <tr class="t-sdsc"> <td> <code>L'</code><span class="t-spar">c-char</span> <code>'</code> </td> <td> (5) </td> <td class="t-sdsc-nopad"> </td>
</tr> <tr class="t-sdsc"> <td> <code>'</code><span class="t-spar">c-char-sequence</span> <code>'</code> </td> <td> (6) </td> <td class="t-sdsc-nopad"> </td>
</tr> <tr class="t-sdsc"> <td> <code>L'</code><span class="t-spar">c-char-sequence</span> <code>'</code> </td> <td> (7) </td> <td class="t-sdsc-nopad"> </td>
</tr> <tr class="t-sdsc"> <td> <code>u'</code><span class="t-spar">c-char-sequence</span> <code>'</code> </td> <td> (8) </td> <td> <span class="t-mark-rev t-since-c11">(since C11)</span><span class="t-mark-rev t-until-c23">(removed in C23)</span> </td>
</tr> <tr class="t-sdsc"> <td> <code>U'</code><span class="t-spar">c-char-sequence</span> <code>'</code> </td> <td> (9) </td> <td> <span class="t-mark-rev t-since-c11">(since C11)</span><span class="t-mark-rev t-until-c23">(removed in C23)</span> </td>
</tr>
</table> <p>where</p>
<ul>
<li> <span class="t-spar">c-char</span> is either </li>
<ul>
<li> a character from the basic source character set minus single-quote (<code>'</code>), backslash (<code>\</code>), or the newline character. </li>
<li> escape sequence: one of special character escapes <code>\'</code> <code>\"</code> <code>\?</code> <code>\\</code> <code>\a</code> <code>\b</code> <code>\f</code> <code>\n</code> <code>\r</code> <code>\t</code> <code>\v</code>, hex escapes <code>\x...</code> or octal escapes <code>\...</code> as defined in <a href="escape" title="c/language/escape">escape sequences</a>. </li>
</ul>
</ul> <table class="t-rev-begin"> <tr class="t-rev t-since-c99">
<td> <ul><li>universal character name, <code>\u...</code> or <code>\U...</code> as defined in <a href="escape" title="c/language/escape">escape sequences</a>. </li></ul> </td> <td><span class="t-mark-rev t-since-c99">(since C99)</span></td>
</tr> </table> <ul><li> <span class="t-spar">c-char-sequence</span> is a sequence of two or more <span class="t-spar">c-char</span>s. </li></ul> <div class="t-li1">
<span class="t-li">1)</span> single-byte integer character constant, e.g. <code>'a'</code> or <code>'\n'</code> or <code>'\13'</code>. Such a constant has type <code>int</code> and a value equal to the representation of <span class="t-spar">c-char</span> in the execution character set as a value of type <code>char</code> mapped to <code>int</code>. If <span class="t-spar">c-char</span> is not representable as a single byte in the execution character set, the value is implementation-defined.</div> <div class="t-li1">
<span class="t-li">2)</span> UTF-8 character constant, e.g. <code>u8'a'</code>. Such a constant has type <code>char8_t</code> and a value equal to the ISO 10646 code point value of <span class="t-spar">c-char</span>, provided that the code point value is representable with a single UTF-8 code unit (that is, <span class="t-spar">c-char</span> is in the range 0x0-0x7F, inclusive). If <span class="t-spar">c-char</span> is not representable with a single UTF-8 code unit, the program is ill-formed.</div> <table class="t-rev-begin"> <tr class="t-rev t-until-c23">
<td> <span class="t-li">3)</span> 16-bit wide character constant, e.g. <code>u'貓'</code>, but not <code>u'🍌'</code> (<code>u'\U0001f34c'</code>). Such a constant has type <code>char16_t</code> and a value equal to the value of <span class="t-spar">c-char</span> in the 16-bit encoding produced by <code><a href="../string/multibyte/mbrtoc16" title="c/string/multibyte/mbrtoc16">mbrtoc16</a></code> (normally UTF-16). If <span class="t-spar">c-char</span> is not representable or maps to more than one 16-bit character, the value is implementation-defined. <span class="t-li">4)</span> 32-bit wide character constant, e.g. <code>U'貓'</code> or <code>U'🍌'</code>. Such a constant has type <code>char32_t</code> and a value equal to the value of <span class="t-spar">c-char</span> in the 32-bit encoding produced by <code><a href="../string/multibyte/mbrtoc32" title="c/string/multibyte/mbrtoc32">mbrtoc32</a></code> (normally UTF-32). If <span class="t-spar">c-char</span> is not representable or maps to more than one 32-bit character, the value is implementation-defined. </td> <td><span class="t-mark-rev t-until-c23">(until C23)</span></td>
</tr> <tr class="t-rev t-since-c23">
<td> <span class="t-li">3)</span> UTF-16 character constant, e.g. <code>u'貓'</code>, but not <code>u'🍌'</code> (<code>u'\U0001f34c'</code>). Such a constant has type <code>char16_t</code> and a value equal to the ISO 10646 code point value of <span class="t-spar">c-char</span>, provided that the code point value is representable with a single UTF-16 code unit (that is, <span class="t-spar">c-char</span> is in the range 0x0-0xD7FF or 0xE000-0xFFFF, inclusive). If <span class="t-spar">c-char</span> is not representable with a single UTF-16 code unit, the program is ill-formed. <span class="t-li">4)</span> UTF-32 character constant, e.g. <code>U'貓'</code> or <code>U'🍌'</code>. Such a constant has type <code>char32_t</code> and a value equal to the ISO 10646 code point value of <span class="t-spar">c-char</span>, provided that the code point value is representable with a single UTF-32 code unit (that is, <span class="t-spar">c-char</span> is in the range 0x0-0xD7FF or 0xE000-0x10FFFF, inclusive). If <span class="t-spar">c-char</span> is not representable with a single UTF-32 code unit, the program is ill-formed. </td> <td><span class="t-mark-rev t-since-c23">(since C23)</span></td>
</tr> </table> <div class="t-li1">
<span class="t-li">5)</span> wide character constant, e.g. <code>L'β'</code> or <code>L'貓'</code>. Such a constant has type <code>wchar_t</code> and a value equal to the value of <span class="t-spar">c-char</span> in the execution wide character set (that is, the value that would be produced by <code><a href="../string/multibyte/mbtowc" title="c/string/multibyte/mbtowc">mbtowc</a></code>). If <span class="t-spar">c-char</span> is not representable or maps to more than one wide character (e.g. a non-BMP value on Windows where <code>wchar_t</code> is 16-bit), the value is implementation-defined.</div> <div class="t-li1">
<span class="t-li">6)</span> multicharacter constant, e.g. <code>'AB'</code>, has type <code>int</code> and an implementation-defined value.</div> <div class="t-li1">
<span class="t-li">7)</span> wide multicharacter constant, e.g. <code>L'AB'</code>, has type <code>wchar_t</code> and an implementation-defined value.</div> <div class="t-li1">
<span class="t-li">8)</span> 16-bit multicharacter constant, e.g. <code>u'CD'</code>, has type <code>char16_t</code> and an implementation-defined value.</div> <div class="t-li1">
<span class="t-li">9)</span> 32-bit multicharacter constant, e.g. <code>U'XY'</code>, has type <code>char32_t</code> and an implementation-defined value.</div> <h3 id="Notes"> Notes</h3> <p>Multicharacter constants were inherited by C from the B programming language. Although not specified by the C standard, most compilers (MSVC is a notable exception) implement multicharacter constants as specified in B: the value of each char in the constant initializes successive bytes of the resulting integer, in big-endian zero-padded right-adjusted order, e.g. the value of <code>'\1'</code> is <code>0x00000001</code> and the value of <code>'\1\2\3\4'</code> is <code>0x01020304</code>.</p>
<p>In C++, encodable ordinary character literals have type <code>char</code>, rather than <code>int</code>.</p>
<p>Unlike <a href="integer_constant" title="c/language/integer constant">integer constants</a>, a character constant may have a negative value if <code>char</code> is signed: on such implementations <code>'\xFF'</code> is an <code>int</code> with the value <code>-1</code>.</p>
<p>When used in a controlling expression of <a href="../preprocessor/conditional" title="c/preprocessor/conditional"><code> #if</code></a> or <a href="../preprocessor/conditional" title="c/preprocessor/conditional"><code> #elif</code></a>, character constants may be interpreted in terms of the source character set, the execution character set, or some other implementation-defined character set.</p>
<p>16/32-bit multicharacter constants are not widely supported and were removed in C23. Some common implementations (e.g. clang) do not accept them at all.</p>
<h3 id="Example"> Example</h3> <div class="t-example"> <div class="c source-c"><pre data-language="c">#include &lt;stddef.h&gt;
#include &lt;stdio.h&gt;
#include &lt;uchar.h&gt;

int main(void)
{
    printf("constant value \n");
    printf("-------- ----------\n");

    // integer character constants
    int c1 = 'a'; printf("'a':\t %#010x\n", c1);
    int c2 = '🍌'; printf("'🍌':\t %#010x\n\n", c2); // implementation-defined

    // multicharacter constant
    int c3 = 'ab'; printf("'ab':\t %#010x\n\n", c3); // implementation-defined

    // 16-bit wide character constants
    char16_t uc1 = u'a'; printf("'a':\t %#010x\n", (int)uc1);
    char16_t uc2 = u'¢'; printf("'¢':\t %#010x\n", (int)uc2);
    char16_t uc3 = u'猫'; printf("'猫':\t %#010x\n", (int)uc3);
    // implementation-defined until C23 (🍌 maps to two 16-bit characters),
    // ill-formed since C23
    char16_t uc4 = u'🍌'; printf("'🍌':\t %#010x\n\n", (int)uc4);

    // 32-bit wide character constants
    char32_t Uc1 = U'a'; printf("'a':\t %#010x\n", (int)Uc1);
    char32_t Uc2 = U'¢'; printf("'¢':\t %#010x\n", (int)Uc2);
    char32_t Uc3 = U'猫'; printf("'猫':\t %#010x\n", (int)Uc3);
    char32_t Uc4 = U'🍌'; printf("'🍌':\t %#010x\n\n", (int)Uc4);

    // wide character constants
    wchar_t wc1 = L'a'; printf("'a':\t %#010x\n", (int)wc1);
    wchar_t wc2 = L'¢'; printf("'¢':\t %#010x\n", (int)wc2);
    wchar_t wc3 = L'猫'; printf("'猫':\t %#010x\n", (int)wc3);
    wchar_t wc4 = L'🍌'; printf("'🍌':\t %#010x\n\n", (int)wc4);
}</pre></div> <p>Possible output:</p>
<div class="text source-text"><pre data-language="c">constant value
-------- ----------
'a':     0x00000061
'🍌':     0xf09f8d8c

'ab':    0x00006162

'a':     0x00000061
'¢':     0x000000a2
'猫':     0x0000732b
'🍌':     0x0000df4c

'a':     0x00000061
'¢':     0x000000a2
'猫':     0x0000732b
'🍌':     0x0001f34c

'a':     0x00000061
'¢':     0x000000a2
'猫':     0x0000732b
'🍌':     0x0001f34c</pre></div> </div> <h3 id="References"> References</h3> <ul>
<li> C17 standard (ISO/IEC 9899:2018): </li>
<ul><li> 6.4.4.4 Character constants (p: 48-50) </li></ul>
<li> C11 standard (ISO/IEC 9899:2011): </li>
<ul><li> 6.4.4.4 Character constants (p: 67-70) </li></ul>
<li> C99 standard (ISO/IEC 9899:1999): </li>
<ul><li> 6.4.4.4 Character constants (p: 59-61) </li></ul>
<li> C89/C90 standard (ISO/IEC 9899:1990): </li>
<ul><li> 3.1.3.4 Character constants </li></ul>
</ul> <h3 id="See_also"> See also</h3> <table class="t-dsc-begin"> <tr class="t-dsc"> <td colspan="2"> <span><a href="https://en.cppreference.com/w/cpp/language/character_literal" title="cpp/language/character literal">C++ documentation</a></span> for <span class=""><span>Character literal</span></span> </td>
</tr> </table> <div class="_attribution">
<p class="_attribution-p">
© cppreference.com<br>Licensed under the Creative Commons Attribution-ShareAlike Unported License v3.0.<br>
<a href="https://en.cppreference.com/w/c/language/character_constant" class="_attribution-link">https://en.cppreference.com/w/c/language/character_constant</a>
</p>
</div>