<h1 id="firstHeading" class="firstHeading">mbrtowc</h1> <table class="t-dcl-begin"> <tr class="t-dsc-header"> <th> Defined in header <code><wchar.h></code> </th> <th> </th> <th> </th> </tr> <tr class="t-dcl"> <td> <pre data-language="c">size_t mbrtowc( wchar_t* pwc, const char* s, size_t n, mbstate_t* ps );</pre>
</td> <td class="t-dcl-nopad"> </td> <td> <span class="t-mark-rev t-since-c95">(since C95)</span> </td> </tr> <tr class="t-dcl"> <td> <pre data-language="c">size_t mbrtowc( wchar_t *restrict pwc, const char *restrict s, size_t n,
mbstate_t *restrict ps );</pre>
</td> <td class="t-dcl-nopad"> </td> <td> <span class="t-mark-rev t-since-c99">(since C99)</span> </td> </tr> </table> <p>Converts a narrow multibyte character to its wide character representation.</p>
<p>If <code>s</code> is not a null pointer, inspects at most <code>n</code> bytes of the multibyte character string, beginning with the byte pointed to by <code>s</code>, to determine the number of bytes necessary to complete the next multibyte character (including any shift sequences, and taking into account the current multibyte conversion state <code>*ps</code>). If the function determines that the next multibyte character in <code>s</code> is complete and valid, it converts that character to the corresponding wide character and stores it in <code>*pwc</code> (if <code>pwc</code> is not a null pointer).</p>
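<p>As a minimal sketch of how <code>n</code> and <code>*ps</code> interact, the hypothetical helper below (<code>count_chars_bytewise</code> is an illustrative name, not part of any library) feeds the input to <code>mbrtowc</code> one byte at a time. It assumes the encoding of the currently active C locale and relies on the <code>(size_t)-2</code> and <code>(size_t)-1</code> results listed under Return value.</p>

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts the multibyte characters in the first len
   bytes of s while handing mbrtowc a single byte per call.
   Returns (size_t)-1 on an encoding error or if the input ends in the
   middle of a character. */
size_t count_chars_bytewise(const char *s, size_t len)
{
    mbstate_t state;
    memset(&state, 0, sizeof state); /* initial conversion state */
    size_t count = 0;

    for (size_t i = 0; i < len; ++i) {
        /* pwc is NULL: the converted character itself is not needed here */
        size_t rc = mbrtowc(NULL, s + i, 1, &state);
        if (rc == (size_t)-1)
            return (size_t)-1;       /* invalid byte sequence */
        if (rc == (size_t)-2)
            continue;                /* byte absorbed into state, character incomplete */
        ++count;                     /* rc is 0 or 1: this byte completed a character */
    }

    /* A non-initial state here means the input stopped mid-character. */
    return mbsinit(&state) ? count : (size_t)-1;
}
```

<p>In a UTF-8 locale, a call such as <code>count_chars_bytewise("z\xc3\x9f", 3)</code> would be expected to report two characters, because the two bytes of <code>ß</code> are joined through the state object rather than through a larger <code>n</code>.</p>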
<p>If <code>s</code> is a null pointer, the values of <code>n</code> and <code>pwc</code> are ignored and the call is equivalent to <code>mbrtowc<span class="br0">(</span><a href="http://en.cppreference.com/w/c/types/NULL"><span class="kw103">NULL</span></a>, <span class="st0">""</span>, <span class="nu0">1</span>, ps<span class="br0">)</span></code>.</p>
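<p>The equivalence can be checked with a small sketch. The helper name <code>null_s_matches_empty_string</code> is hypothetical, and both state objects are assumed to start in the initial conversion state. If <code>*ps</code> held a partially read character instead, the same call may report an encoding error, since a null byte does not complete that character.</p>

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if a call with a null s behaves like
   mbrtowc(NULL, "", 1, ps) when starting from the initial state. */
int null_s_matches_empty_string(void)
{
    mbstate_t a, b;
    memset(&a, 0, sizeof a);
    memset(&b, 0, sizeof b);

    size_t ra = mbrtowc(NULL, NULL, 0, &a); /* s is null: pwc and n are ignored */
    size_t rb = mbrtowc(NULL, "", 1, &b);   /* the documented equivalent call */

    /* Both calls convert the null character, so both return 0 and leave
       their state object in the initial conversion state. */
    return ra == rb && ra == 0 && mbsinit(&a) && mbsinit(&b);
}
```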
<p>If the wide character produced is the null character, the conversion state stored in <code>*ps</code> is the initial shift state.</p>
<p>If the environment macro <code>__STDC_ISO_10646__</code> is defined, the values of type <code>wchar_t</code> are the same as the short identifiers of the characters in the Unicode required set (typically UTF-32 encoding); otherwise, the encoding of <code>wchar_t</code> values is implementation-defined. In any case, the multibyte character encoding used by this function is specified by the currently active C locale.</p>
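<p>A program that depends on <code>wchar_t</code> holding Unicode code points can test the macro at compile time. The sketch below uses a hypothetical helper, <code>iso10646_version</code>, which returns the macro's value when it is defined and <code>0</code> otherwise.</p>

```c
#include <wchar.h>

/* Hypothetical helper: returns the value of __STDC_ISO_10646__ (an integer
   of the form yyyymmL) if the implementation defines it, or 0 if the
   encoding of wchar_t is implementation-defined. */
long iso10646_version(void)
{
#ifdef __STDC_ISO_10646__
    return __STDC_ISO_10646__;
#else
    return 0;
#endif
}
```

<p>A nonzero result suggests that values written to <code>*pwc</code> can be compared directly against Unicode code points such as <code>0xdf</code>. A zero result means no such assumption is safe.</p>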
<h3 id="Parameters"> Parameters</h3> <table class="t-par-begin"> <tr class="t-par"> <td> pwc </td> <td> - </td> <td> pointer to the location where the resulting wide character will be written </td>
</tr> <tr class="t-par"> <td> s </td> <td> - </td> <td> pointer to the multibyte character string used as input </td>
</tr> <tr class="t-par"> <td> n </td> <td> - </td> <td> limit on the number of bytes in s that can be examined </td>
</tr> <tr class="t-par"> <td> ps </td> <td> - </td> <td> pointer to the conversion state used when interpreting the multibyte character string </td>
</tr>
</table> <h3 id="Return_value"> Return value</h3> <p>The first of the following that applies:</p>
<ul>
<li> <code>0</code> if the character converted from <code>s</code> (and stored in <code>*pwc</code> if <code>pwc</code> is non-null) was the null character </li>
<li> the number of bytes <code>[1...n]</code> of the multibyte character successfully converted from <code>s</code> </li>
<li> <code><span class="br0">(</span><a href="http://en.cppreference.com/w/c/types/size_t"><span class="kw100">size_t</span></a><span class="br0">)</span><span class="sy2">-</span><span class="nu0">2</span></code> if the next <code>n</code> bytes constitute an incomplete, but so far valid, multibyte character. Nothing is written to <code>*pwc</code>. </li>
<li> <code><span class="br0">(</span><a href="http://en.cppreference.com/w/c/types/size_t"><span class="kw100">size_t</span></a><span class="br0">)</span><span class="sy2">-</span><span class="nu0">1</span></code> if an encoding error occurs. Nothing is written to <code>*pwc</code>, the value <code><a href="../../error/errno_macros" title="c/error/errno macros">EILSEQ</a></code> is stored in <code><a href="../../error/errno" title="c/error/errno">errno</a></code>, and the value of <code>*ps</code> is left unspecified. </li>
</ul> <h3 id="Example"> Example</h3> <div class="t-example"> <div class="c source-c"><pre data-language="c">#include <stdio.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

int main(void)
{
    setlocale(LC_ALL, "en_US.utf8");
    mbstate_t state;
    memset(&state, 0, sizeof state); // zeroed object: initial conversion state
    char in[] = u8"z\u00df\u6c34\U0001F34C"; // or u8"zß水🍌"
    size_t in_sz = sizeof in / sizeof *in;

    printf("Processing %zu UTF-8 code units: [ ", in_sz);
    for(size_t n = 0; n < in_sz; ++n) printf("%#x ", (unsigned char)in[n]);
    puts("]");

    wchar_t out[in_sz];
    char *p_in = in, *end = in + in_sz;
    wchar_t *p_out = out;
    size_t rc; // mbrtowc returns size_t, not int
    while((rc = mbrtowc(p_out, p_in, end - p_in, &state)) != 0)
    {
        if(rc == (size_t)-1 || rc == (size_t)-2)
            return 1; // encoding error or truncated input
        p_in += rc;
        p_out += 1;
    }

    size_t out_sz = p_out - out + 1; // +1 for the terminating null wide character
    printf("into %zu wchar_t units: [ ", out_sz);
    for(size_t x = 0; x < out_sz; ++x) printf("%#x ", (unsigned)out[x]);
    puts("]");
}</pre></div> <p>Output:</p>
<div class="text source-text"><pre data-language="c">Processing 11 UTF-8 code units: [ 0x7a 0xc3 0x9f 0xe6 0xb0 0xb4 0xf0 0x9f 0x8d 0x8c 0 ]
into 5 wchar_t units: [ 0x7a 0xdf 0x6c34 0x1f34c 0 ]</pre></div> </div> <h3 id="References"> References</h3> <ul>
<li> C11 standard (ISO/IEC 9899:2011): </li>
<ul><li> 7.29.6.3.2 The mbrtowc function (p: 443) </li></ul>
<li> C99 standard (ISO/IEC 9899:1999): </li>
<ul><li> 7.24.6.3.2 The mbrtowc function (p: 389) </li></ul>
</ul> <h3 id="See_also"> See also</h3> <table class="t-dsc-begin"> <tr class="t-dsc"> <td> <div><a href="mbtowc" title="c/string/multibyte/mbtowc"> <span class="t-lines"><span>mbtowc</span></span></a></div> </td> <td> converts the next multibyte character to wide character <br> <span class="t-mark">(function)</span> </td>
</tr> <tr class="t-dsc"> <td> <div><a href="wcrtomb" title="c/string/multibyte/wcrtomb"> <span class="t-lines"><span>wcrtomb</span><span>wcrtomb_s</span></span></a></div>
<div><span class="t-lines"><span><span class="t-mark-rev t-since-c95">(C95)</span></span><span><span class="t-mark-rev t-since-c11">(C11)</span></span></span></div> </td> <td> converts a wide character to its multibyte representation, given state <br> <span class="t-mark">(function)</span> </td>
</tr> <tr class="t-dsc"> <td colspan="2"> <span><a href="https://en.cppreference.com/w/cpp/string/multibyte/mbrtowc" title="cpp/string/multibyte/mbrtowc">C++ documentation</a></span> for <code>mbrtowc</code> </td>
</tr> </table> <div class="_attribution">
<p class="_attribution-p">
© cppreference.com<br>Licensed under the Creative Commons Attribution-ShareAlike Unported License v3.0.<br>
<a href="https://en.cppreference.com/w/c/string/multibyte/mbrtowc" class="_attribution-link">https://en.cppreference.com/w/c/string/multibyte/mbrtowc</a>
</p>
</div>