| author | Craig Jennings <c@cjennings.net> | 2025-08-14 22:58:58 -0500 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2025-08-14 22:58:58 -0500 |
| commit | 82ba818ff456bcd6d56a06226e3f27e98fbb55c3 (patch) | |
| tree | 158cfc17b2f644a10f063cb546752cfaae12c97f /devdocs/elisp/non_002dascii-in-strings.html | |
| parent | 9278ddd4ea1a8b1a4c1edaa8894516e3f48d245b (diff) | |
removing all downloaded devdocs files
Diffstat (limited to 'devdocs/elisp/non_002dascii-in-strings.html')
| -rw-r--r-- | devdocs/elisp/non_002dascii-in-strings.html | 6 |
1 files changed, 0 insertions, 6 deletions
diff --git a/devdocs/elisp/non_002dascii-in-strings.html b/devdocs/elisp/non_002dascii-in-strings.html deleted file mode 100644 index e7f1aa7ba..000000000 --- a/devdocs/elisp/non_002dascii-in-strings.html +++ /dev/null @@ -1,6 +0,0 @@ - <h4 class="subsubsection">Non-ASCII Characters in Strings</h4> <p>There are two text representations for non-<acronym>ASCII</acronym> characters in Emacs strings: multibyte and unibyte (see <a href="text-representations">Text Representations</a>). Roughly speaking, unibyte strings store raw bytes, while multibyte strings store human-readable text. Each character in a unibyte string is a byte, i.e., its value is between 0 and 255. By contrast, each character in a multibyte string may have a value between 0 and 4194303 (see <a href="character-type">Character Type</a>). In both cases, characters above 127 are non-<acronym>ASCII</acronym>. </p> <p>You can include a non-<acronym>ASCII</acronym> character in a string constant by writing it literally. If the string constant is read from a multibyte source, such as a multibyte buffer or string, or a file that would be visited as multibyte, then Emacs reads each non-<acronym>ASCII</acronym> character as a multibyte character and automatically makes the string a multibyte string. If the string constant is read from a unibyte source, then Emacs reads the non-<acronym>ASCII</acronym> character as unibyte, and makes the string unibyte. </p> <p>Instead of writing a character literally into a multibyte string, you can write it as its character code using an escape sequence. See <a href="general-escape-syntax">General Escape Syntax</a>, for details about escape sequences. </p> <p>If you use any Unicode-style escape sequence ‘<samp>\uNNNN</samp>’ or ‘<samp>\U00NNNNNN</samp>’ in a string constant (even for an <acronym>ASCII</acronym> character), Emacs automatically assumes that it is multibyte. 
</p> <p>You can also use hexadecimal escape sequences (‘<samp>\x<var>n</var></samp>’) and octal escape sequences (‘<samp>\<var>n</var></samp>’) in string constants. <strong>But beware:</strong> If a string constant contains hexadecimal or octal escape sequences, and these escape sequences all specify unibyte characters (i.e., less than 256), and there are no other literal non-<acronym>ASCII</acronym> characters or Unicode-style escape sequences in the string, then Emacs automatically assumes that it is a unibyte string. That is to say, it assumes that all non-<acronym>ASCII</acronym> characters occurring in the string are 8-bit raw bytes. </p> <p>In hexadecimal and octal escape sequences, the escaped character code may contain a variable number of digits, so the first subsequent character which is not a valid hexadecimal or octal digit terminates the escape sequence. If the next character in a string could be interpreted as a hexadecimal or octal digit, write ‘<samp>\ </samp>’ (backslash and space) to terminate the escape sequence. For example, ‘<samp>\xe0\ </samp>’ represents one character, ‘<samp>a</samp>’ with grave accent. ‘<samp>\ </samp>’ in a string constant is just like backslash-newline; it does not contribute any character to the string, but it does terminate any preceding hex escape. </p><div class="_attribution"> - <p class="_attribution-p"> - Copyright © 1990-1996, 1998-2022 Free Software Foundation, Inc. <br>Licensed under the GNU GPL license.<br> - <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Non_002dASCII-in-Strings.html" class="_attribution-link">https://www.gnu.org/software/emacs/manual/html_node/elisp/Non_002dASCII-in-Strings.html</a> - </p> -</div> |
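For reference, the escape-sequence rules described in the removed page can be checked with a few forms. This is a minimal sketch, not part of the commit or of the deleted file; it assumes the forms are evaluated in a running Emacs (for example with `M-:`), and the particular strings were chosen here only for illustration.

```elisp
;; Illustrative sketch only; evaluate each form in Emacs (e.g. with M-:).
;; Every string below uses escapes rather than literal non-ASCII text,
;; so the unibyte/multibyte result is decided by the escape rules alone.

;; Hex and octal escapes below 256, with nothing else non-ASCII in the
;; string: the reader produces a unibyte string holding a raw byte.
(multibyte-string-p "\xe0")        ; nil
(multibyte-string-p "\340")        ; nil (octal 340 = hex e0)

;; A Unicode-style escape forces a multibyte string, even for ASCII.
(multibyte-string-p "\u00e0")      ; t
(multibyte-string-p "\u0061")      ; t

;; "\ " terminates the hex escape and adds no character of its own.
(length "\xe0\ ")                  ; 1
(length "\xe0\ bc")                ; 3 (0xE0, then "b" and "c")

;; Without the terminator, "b" and "c" are consumed as hex digits.
(length "\xe0bc")                  ; 1 (the single code #xe0bc)
```

The last two forms show why the backslash-space terminator matters: the same visible text yields either three characters or one, depending on where the hexadecimal escape ends.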
