summaryrefslogtreecommitdiff
path: root/devdocs/c/string%2Fmultibyte%2Fmblen.html
diff options
context:
space:
mode:
Diffstat (limited to 'devdocs/c/string%2Fmultibyte%2Fmblen.html')
-rw-r--r--devdocs/c/string%2Fmultibyte%2Fmblen.html73
1 files changed, 73 insertions, 0 deletions
diff --git a/devdocs/c/string%2Fmultibyte%2Fmblen.html b/devdocs/c/string%2Fmultibyte%2Fmblen.html
new file mode 100644
index 00000000..46facf2a
--- /dev/null
+++ b/devdocs/c/string%2Fmultibyte%2Fmblen.html
@@ -0,0 +1,73 @@
+ <h1 id="firstHeading" class="firstHeading">mblen</h1> <table class="t-dcl-begin"> <tr class="t-dsc-header"> <th> Defined in header <code>&lt;stdlib.h&gt;</code> </th> <th> </th> <th> </th> </tr> <tr class="t-dcl"> <td class="t-dcl-nopad"> <pre data-language="c">int mblen( const char* s, size_t n );</pre>
+</td> <td class="t-dcl-nopad"> </td> <td class="t-dcl-nopad"> </td> </tr> </table> <p>Determines the size, in bytes, of the multibyte character whose first byte is pointed to by <code>s</code>.</p>
+<p>If <code>s</code> is a null pointer, <span class="t-rev-inl t-until-c23"><span>resets the global conversion state and</span><span><span class="t-mark-rev t-until-c23">(until C23)</span></span></span> determined whether shift sequences are used.</p>
+<p>This function is equivalent to the call <code><a href="http://en.cppreference.com/w/c/string/multibyte/mbtowc"><span class="kw575">mbtowc</span></a><span class="br0">(</span><span class="br0">(</span><span class="kw4">wchar_t</span><span class="sy2">*</span><span class="br0">)</span><span class="nu0">0</span>, s, n<span class="br0">)</span></code>, except that conversion state of <code><a href="mbtowc" title="c/string/multibyte/mbtowc">mbtowc</a></code> is unaffected.</p>
+<h3 id="Parameters"> Parameters</h3> <table class="t-par-begin"> <tr class="t-par"> <td> s </td> <td> - </td> <td> pointer to the multibyte character </td>
+</tr> <tr class="t-par"> <td> n </td> <td> - </td> <td> limit on the number of bytes in s that can be examined </td>
+</tr>
+</table> <h3 id="Return_value"> Return value</h3> <p>If <code>s</code> is not a null pointer, returns the number of bytes that are contained in the multibyte character or <code>-1</code> if the first bytes pointed to by <code>s</code> do not form a valid multibyte character or <code>​0​</code> if <code>s</code> is pointing at the null charcter <code>'\0'</code>.</p>
+<p>If <code>s</code> is a null pointer, <span class="t-rev-inl t-until-c23"><span>resets its internal conversion state to represent the initial shift state and</span><span><span class="t-mark-rev t-until-c23">(until C23)</span></span></span> returns <code>​0​</code> if the current multibyte encoding is not state-dependent (does not use shift sequences) or a non-zero value if the current multibyte encoding is state-dependent (uses shift sequences).</p>
+<h3 id="Notes"> Notes</h3> <table class="t-rev-begin"> <tr class="t-rev t-until-c23">
+<td> <p>Each call to <code>mblen</code> updates the internal global conversion state (a static object of type <code><a href="mbstate_t" title="c/string/multibyte/mbstate t">mbstate_t</a></code>, only known to this function). If the multibyte encoding uses shift states, care must be taken to avoid backtracking or multiple scans. In any case, multiple threads should not call <code>mblen</code> without synchronization: <code><a href="mbrlen" title="c/string/multibyte/mbrlen">mbrlen</a></code> may be used instead.</p>
+</td> <td><span class="t-mark-rev t-until-c23">(until C23)</span></td>
+</tr> <tr class="t-rev t-since-c23">
+<td> <p><code>mblen</code> is not allowed to have an internal state.</p>
+</td> <td><span class="t-mark-rev t-since-c23">(since C23)</span></td>
+</tr> </table> <h3 id="Example"> Example</h3> <div class="t-example"> <div class="c source-c"><pre data-language="c">#include &lt;locale.h&gt;
+#include &lt;stdio.h&gt;
+#include &lt;stdlib.h&gt;
+#include &lt;string.h&gt;
+
+// the number of characters in a multibyte string is the sum of mblen()'s
+// note: the simpler approach is mbstowcs(NULL, str, sz)
+size_t strlen_mb(const char* ptr)
+{
+ size_t result = 0;
+ const char* end = ptr + strlen(ptr);
+ mblen(NULL, 0); // reset the conversion state
+ while(ptr &lt; end) {
+ int next = mblen(ptr, end - ptr);
+ if (next == -1) {
+ perror("strlen_mb");
+ break;
+ }
+ ptr += next;
+ ++result;
+ }
+ return result;
+}
+
+void dump_bytes(const char* str)
+{
+ for (const char* end = str + strlen(str); str != end; ++str)
+ printf("%02X ", (unsigned char)str[0]);
+ printf("\n");
+}
+
+int main(void)
+{
+ setlocale(LC_ALL, "en_US.utf8");
+ const char* str = "z\u00df\u6c34\U0001f34c";
+ printf("The string \"%s\" consists of %zu characters, but %zu bytes: ",
+ str, strlen_mb(str), strlen(str));
+ dump_bytes(str);
+}</pre></div> <p>Possible output:</p>
+<div class="text source-text"><pre data-language="c">The string "zß水🍌" consists of 4 characters, but 10 bytes: 7A C3 9F E6 B0 B4 F0 9F 8D 8C</pre></div> </div> <h3 id="References"> References</h3> <ul>
+<li> C17 standard (ISO/IEC 9899:2018): </li>
+<ul><li> 7.22.7.1 The mblen function (p: 260) </li></ul>
+<li> C11 standard (ISO/IEC 9899:2011): </li>
+<ul><li> 7.22.7.1 The mblen function (p: 357) </li></ul>
+<li> C99 standard (ISO/IEC 9899:1999): </li>
+<ul><li> 7.20.7.1 The mblen function (p: 321) </li></ul>
+<li> C89/C90 standard (ISO/IEC 9899:1990): </li>
+<ul><li> 4.10.7.1 The mblen function </li></ul>
+</ul> <h3 id="See_also"> See also</h3> <table class="t-dsc-begin"> <tr class="t-dsc"> <td> <div><a href="mbtowc" title="c/string/multibyte/mbtowc"> <span class="t-lines"><span>mbtowc</span></span></a></div> </td> <td> converts the next multibyte character to wide character <br> <span class="t-mark">(function)</span> </td>
+</tr> <tr class="t-dsc"> <td> <div><a href="mbrlen" title="c/string/multibyte/mbrlen"> <span class="t-lines"><span>mbrlen</span></span></a></div>
+<div><span class="t-lines"><span><span class="t-mark-rev t-since-c95">(C95)</span></span></span></div> </td> <td> returns the number of bytes in the next multibyte character, given state <br> <span class="t-mark">(function)</span> </td>
+</tr> <tr class="t-dsc"> <td colspan="2"> <span><a href="https://en.cppreference.com/w/cpp/string/multibyte/mblen" title="cpp/string/multibyte/mblen">C++ documentation</a></span> for <code>mblen</code> </td>
+</tr> </table> <div class="_attribution">
+ <p class="_attribution-p">
+ &copy; cppreference.com<br>Licensed under the Creative Commons Attribution-ShareAlike Unported License v3.0.<br>
+ <a href="https://en.cppreference.com/w/c/string/multibyte/mblen" class="_attribution-link">https://en.cppreference.com/w/c/string/multibyte/mblen</a>
+ </p>
+</div>