authorCraig Jennings <c@cjennings.net>2024-04-07 13:41:34 -0500
committerCraig Jennings <c@cjennings.net>2024-04-07 13:41:34 -0500
commit754bbf7a25a8dda49b5d08ef0d0443bbf5af0e36 (patch)
treef1190704f78f04a2b0b4c977d20fe96a828377f1 /devdocs/elisp/character-codes.html
new repository
Diffstat (limited to 'devdocs/elisp/character-codes.html')
-rw-r--r--devdocs/elisp/character-codes.html38
1 files changed, 38 insertions, 0 deletions
diff --git a/devdocs/elisp/character-codes.html b/devdocs/elisp/character-codes.html
new file mode 100644
index 00000000..86d3647e
--- /dev/null
+++ b/devdocs/elisp/character-codes.html
@@ -0,0 +1,38 @@
+ <h3 class="section">Character Codes</h3> <p>The unibyte and multibyte text representations use different character codes. The valid character codes for unibyte representation range from 0 to <code>#xFF</code> (255), the values that can fit in one byte. The valid character codes for multibyte representation range from 0 to <code>#x3FFFFF</code> (4194303). In this code space, values 0 through <code>#x7F</code> (127) are for <acronym>ASCII</acronym> characters, and values <code>#x80</code> (128) through <code>#x3FFF7F</code> (4194175) are for non-<acronym>ASCII</acronym> characters. </p> <p>The Emacs character code space is a superset of the Unicode code space. Values 0 through <code>#x10FFFF</code> (1114111) correspond to Unicode characters of the same codepoint; values <code>#x110000</code> (1114112) through <code>#x3FFF7F</code> (4194175) represent characters that are not unified with Unicode; and values <code>#x3FFF80</code> (4194176) through <code>#x3FFFFF</code> (4194303) represent eight-bit raw bytes. </p> <dl> <dt id="characterp">Function: <strong>characterp</strong> <em>charcode</em>
+</dt> <dd>
+<p>This function returns <code>t</code> if <var>charcode</var> is a valid character code, and <code>nil</code> otherwise. </p> <div class="example"> <pre class="example">(characterp 65)
+ ⇒ t
+</pre>
+<pre class="example">(characterp 4194303)
+ ⇒ t
+</pre>
+<pre class="example">(characterp 4194304)
+ ⇒ nil
+</pre>
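+<pre class="example">;; Further edge cases, as an illustrative sketch (inputs not from the manual):
+;; a negative integer lies outside the valid range of character codes.
+(characterp -1)
+ ⇒ nil
+</pre>
+<pre class="example">;; A character literal is simply its integer code.
+(characterp ?A)
+ ⇒ t
+</pre>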
+</div> </dd>
+</dl> <dl> <dt id="max-char">Function: <strong>max-char</strong>
+</dt> <dd>
+<p>This function returns the largest value that a valid character codepoint can have. </p> <div class="example"> <pre class="example">(characterp (max-char))
+ ⇒ t
+</pre>
+<pre class="example">(characterp (1+ (max-char)))
+ ⇒ nil
+</pre>
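+<pre class="example">;; Illustrative check, assuming the multibyte range stated earlier
+;; (0 through #x3FFFFF): the largest valid code is that upper bound.
+(= (max-char) #x3FFFFF)
+ ⇒ t
+</pre>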
+</div> </dd>
+</dl> <dl> <dt id="char-from-name">Function: <strong>char-from-name</strong> <em>string &amp;optional ignore-case</em>
+</dt> <dd>
+<p>This function returns the character whose Unicode name is <var>string</var>. If <var>ignore-case</var> is non-<code>nil</code>, case is ignored in <var>string</var>. This function returns <code>nil</code> if <var>string</var> does not name a character. </p> <div class="example"> <pre class="example">;; U+03A3
+(= (char-from-name "GREEK CAPITAL LETTER SIGMA") #x03A3)
+ ⇒ t
+</pre>
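+<pre class="example">;; Illustrative sketch of the other two cases described above; the
+;; lowercase spelling and the made-up name are assumptions for the example.
+(= (char-from-name "greek capital letter sigma" t) #x03A3)
+ ⇒ t
+</pre>
+<pre class="example">(char-from-name "NO SUCH CHARACTER NAME")
+ ⇒ nil
+</pre>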
+</div> </dd>
+</dl> <dl> <dt id="get-byte">Function: <strong>get-byte</strong> <em>&amp;optional pos string</em>
+</dt> <dd>
+<p>This function returns the byte at character position <var>pos</var> in the current buffer. If the current buffer is unibyte, this is literally the byte at that position. If the buffer is multibyte, byte values of <acronym>ASCII</acronym> characters are the same as character codepoints, whereas eight-bit raw bytes are converted to their 8-bit codes. The function signals an error if the character at <var>pos</var> is any other non-<acronym>ASCII</acronym> character. </p> <p>If the optional argument <var>string</var> is given, the byte is taken from that string instead of the current buffer. </p>
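+<p>As a minimal sketch of the cases above (the sample strings are illustrative assumptions, and positions within a string are taken to be zero-based indices): </p> <div class="example"> <pre class="example">;; ASCII: the byte value equals the character code.
+(get-byte 0 "A")
+ ⇒ 65
+</pre>
+<pre class="example">;; An eight-bit raw byte in a multibyte string yields its 8-bit code.
+(get-byte 0 (string-to-multibyte "\xff"))
+ ⇒ 255
+</pre>
+<pre class="example">;; Any other non-ASCII character signals an error, caught here.
+(condition-case nil
+    (get-byte 0 (string #xE9))
+  (error 'signaled))
+ ⇒ signaled
+</pre>
+</div>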
+</dd>
+</dl><div class="_attribution">
+ <p class="_attribution-p">
+ Copyright &copy; 1990-1996, 1998-2022 Free Software Foundation, Inc. <br>Licensed under the GNU GPL license.<br>
+ <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Character-Codes.html" class="_attribution-link">https://www.gnu.org/software/emacs/manual/html_node/elisp/Character-Codes.html</a>
+ </p>
+</div>