diff options
Diffstat (limited to 'devdocs/python~3.12/library%2Femail.charset.html')
| -rw-r--r-- | devdocs/python~3.12/library%2Femail.charset.html | 60 |
1 files changed, 60 insertions, 0 deletions
diff --git a/devdocs/python~3.12/library%2Femail.charset.html b/devdocs/python~3.12/library%2Femail.charset.html new file mode 100644 index 00000000..486a267e --- /dev/null +++ b/devdocs/python~3.12/library%2Femail.charset.html @@ -0,0 +1,60 @@ + <span id="email-charset-representing-character-sets"></span><h1>email.charset: Representing character sets</h1> <p><strong>Source code:</strong> <a class="reference external" href="https://github.com/python/cpython/tree/3.12/Lib/email/charset.py">Lib/email/charset.py</a></p> <p>This module is part of the legacy (<code>Compat32</code>) email API. In the new API only the aliases table is used.</p> <p>The remaining text in this section is the original documentation of the module.</p> <p>This module provides a class <a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code>Charset</code></a> for representing character sets and character set conversions in email messages, as well as a character set registry and several convenience methods for manipulating this registry. Instances of <a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code>Charset</code></a> are used in several other modules within the <a class="reference internal" href="email#module-email" title="email: Package supporting the parsing, manipulating, and generating email messages."><code>email</code></a> package.</p> <p>Import this class from the <a class="reference internal" href="#module-email.charset" title="email.charset: Character Sets"><code>email.charset</code></a> module.</p> <dl class="py class"> <dt class="sig sig-object py" id="email.charset.Charset"> +<code>class email.charset.Charset(input_charset=DEFAULT_CHARSET)</code> </dt> <dd> +<p>Map character sets to their email properties.</p> <p>This class provides information about the requirements imposed on email for a specific character set. It also provides convenience routines for converting between character sets, given the availability of the applicable codecs. Given a character set, it will do its best to provide information on how to use that character set in an email message in an RFC-compliant way.</p> <p>Certain character sets must be encoded with quoted-printable or base64 when used in email headers or bodies. Certain character sets must be converted outright, and are not allowed in email.</p> <p>Optional <em>input_charset</em> is as described below; it is always coerced to lower case. After being alias normalized it is also used as a lookup into the registry of character sets to find out the header encoding, body encoding, and output conversion codec to be used for the character set. For example, if <em>input_charset</em> is <code>iso-8859-1</code>, then headers and bodies will be encoded using quoted-printable and no output conversion codec is necessary. If <em>input_charset</em> is <code>euc-jp</code>, then headers will be encoded with base64, bodies will not be encoded, but output text will be converted from the <code>euc-jp</code> character set to the <code>iso-2022-jp</code> character set.</p> <p><a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code>Charset</code></a> instances have the following data attributes:</p> <dl class="py attribute"> <dt class="sig sig-object py" id="email.charset.Charset.input_charset"> +<code>input_charset</code> </dt> <dd> +<p>The initial character set specified. Common aliases are converted to their <em>official</em> email names (e.g. <code>latin_1</code> is converted to <code>iso-8859-1</code>). Defaults to 7-bit <code>us-ascii</code>.</p> </dd> +</dl> <dl class="py attribute"> <dt class="sig sig-object py" id="email.charset.Charset.header_encoding"> +<code>header_encoding</code> </dt> <dd> +<p>If the character set must be encoded before it can be used in an email header, this attribute will be set to <code>charset.QP</code> (for quoted-printable), <code>charset.BASE64</code> (for base64 encoding), or <code>charset.SHORTEST</code> for the shortest of QP or BASE64 encoding. Otherwise, it will be <code>None</code>.</p> </dd> +</dl> <dl class="py attribute"> <dt class="sig sig-object py" id="email.charset.Charset.body_encoding"> +<code>body_encoding</code> </dt> <dd> +<p>Same as <em>header_encoding</em>, but describes the encoding for the mail message’s body, which indeed may be different than the header encoding. <code>charset.SHORTEST</code> is not allowed for <em>body_encoding</em>.</p> </dd> +</dl> <dl class="py attribute"> <dt class="sig sig-object py" id="email.charset.Charset.output_charset"> +<code>output_charset</code> </dt> <dd> +<p>Some character sets must be converted before they can be used in email headers or bodies. If the <em>input_charset</em> is one of them, this attribute will contain the name of the character set output will be converted to. Otherwise, it will be <code>None</code>.</p> </dd> +</dl> <dl class="py attribute"> <dt class="sig sig-object py" id="email.charset.Charset.input_codec"> +<code>input_codec</code> </dt> <dd> +<p>The name of the Python codec used to convert the <em>input_charset</em> to Unicode. If no conversion codec is necessary, this attribute will be <code>None</code>.</p> </dd> +</dl> <dl class="py attribute"> <dt class="sig sig-object py" id="email.charset.Charset.output_codec"> +<code>output_codec</code> </dt> <dd> +<p>The name of the Python codec used to convert Unicode to the <em>output_charset</em>. If no conversion codec is necessary, this attribute will have the same value as the <em>input_codec</em>.</p> </dd> +</dl> <p><a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code>Charset</code></a> instances also have the following methods:</p> <dl class="py method"> <dt class="sig sig-object py" id="email.charset.Charset.get_body_encoding"> +<code>get_body_encoding()</code> </dt> <dd> +<p>Return the content transfer encoding used for body encoding.</p> <p>This is either the string <code>quoted-printable</code> or <code>base64</code> depending on the encoding used, or it is a function, in which case you should call the function with a single argument, the Message object being encoded. The function should then set the <em class="mailheader">Content-Transfer-Encoding</em> header itself to whatever is appropriate.</p> <p>Returns the string <code>quoted-printable</code> if <em>body_encoding</em> is <code>QP</code>, returns the string <code>base64</code> if <em>body_encoding</em> is <code>BASE64</code>, and returns the string <code>7bit</code> otherwise.</p> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="email.charset.Charset.get_output_charset"> +<code>get_output_charset()</code> </dt> <dd> +<p>Return the output character set.</p> <p>This is the <em>output_charset</em> attribute if that is not <code>None</code>, otherwise it is <em>input_charset</em>.</p> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="email.charset.Charset.header_encode"> +<code>header_encode(string)</code> </dt> <dd> +<p>Header-encode the string <em>string</em>.</p> <p>The type of encoding (base64 or quoted-printable) will be based on the <em>header_encoding</em> attribute.</p> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="email.charset.Charset.header_encode_lines"> +<code>header_encode_lines(string, maxlengths)</code> </dt> <dd> +<p>Header-encode a <em>string</em> by converting it first to bytes.</p> <p>This is similar to <a class="reference internal" href="#email.charset.Charset.header_encode" title="email.charset.Charset.header_encode"><code>header_encode()</code></a> except that the string is fit into maximum line lengths as given by the argument <em>maxlengths</em>, which must be an iterator: each element returned from this iterator will provide the next maximum line length.</p> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="email.charset.Charset.body_encode"> +<code>body_encode(string)</code> </dt> <dd> +<p>Body-encode the string <em>string</em>.</p> <p>The type of encoding (base64 or quoted-printable) will be based on the <em>body_encoding</em> attribute.</p> </dd> +</dl> <p>The <a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code>Charset</code></a> class also provides a number of methods to support standard operations and built-in functions.</p> <dl class="py method"> <dt class="sig sig-object py" id="email.charset.Charset.__str__"> +<code>__str__()</code> </dt> <dd> +<p>Returns <em>input_charset</em> as a string coerced to lower case. <code>__repr__()</code> is an alias for <code>__str__()</code>.</p> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="email.charset.Charset.__eq__"> +<code>__eq__(other)</code> </dt> <dd> +<p>This method allows you to compare two <a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code>Charset</code></a> instances for equality.</p> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="email.charset.Charset.__ne__"> +<code>__ne__(other)</code> </dt> <dd> +<p>This method allows you to compare two <a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code>Charset</code></a> instances for inequality.</p> </dd> +</dl> </dd> +</dl> <p>The <a class="reference internal" href="#module-email.charset" title="email.charset: Character Sets"><code>email.charset</code></a> module also provides the following functions for adding new entries to the global character set, alias, and codec registries:</p> <dl class="py function"> <dt class="sig sig-object py" id="email.charset.add_charset"> +<code>email.charset.add_charset(charset, header_enc=None, body_enc=None, output_charset=None)</code> </dt> <dd> +<p>Add character properties to the global registry.</p> <p><em>charset</em> is the input character set, and must be the canonical name of a character set.</p> <p>Optional <em>header_enc</em> and <em>body_enc</em> is either <code>charset.QP</code> for quoted-printable, <code>charset.BASE64</code> for base64 encoding, <code>charset.SHORTEST</code> for the shortest of quoted-printable or base64 encoding, or <code>None</code> for no encoding. <code>SHORTEST</code> is only valid for <em>header_enc</em>. The default is <code>None</code> for no encoding.</p> <p>Optional <em>output_charset</em> is the character set that the output should be in. Conversions will proceed from input charset, to Unicode, to the output charset when the method <code>Charset.convert()</code> is called. The default is to output in the same character set as the input.</p> <p>Both <em>input_charset</em> and <em>output_charset</em> must have Unicode codec entries in the module’s character set-to-codec mapping; use <a class="reference internal" href="#email.charset.add_codec" title="email.charset.add_codec"><code>add_codec()</code></a> to add codecs the module does not know about. See the <a class="reference internal" href="codecs#module-codecs" title="codecs: Encode and decode data and streams."><code>codecs</code></a> module’s documentation for more information.</p> <p>The global character set registry is kept in the module global dictionary <code>CHARSETS</code>.</p> </dd> +</dl> <dl class="py function"> <dt class="sig sig-object py" id="email.charset.add_alias"> +<code>email.charset.add_alias(alias, canonical)</code> </dt> <dd> +<p>Add a character set alias. <em>alias</em> is the alias name, e.g. <code>latin-1</code>. <em>canonical</em> is the character set’s canonical name, e.g. <code>iso-8859-1</code>.</p> <p>The global charset alias registry is kept in the module global dictionary <code>ALIASES</code>.</p> </dd> +</dl> <dl class="py function"> <dt class="sig sig-object py" id="email.charset.add_codec"> +<code>email.charset.add_codec(charset, codecname)</code> </dt> <dd> +<p>Add a codec that map characters in the given character set to and from Unicode.</p> <p><em>charset</em> is the canonical name of a character set. <em>codecname</em> is the name of a Python codec, as appropriate for the second argument to the <a class="reference internal" href="stdtypes#str" title="str"><code>str</code></a>’s <a class="reference internal" href="stdtypes#str.encode" title="str.encode"><code>encode()</code></a> method.</p> </dd> +</dl> <div class="_attribution"> + <p class="_attribution-p"> + © 2001–2023 Python Software Foundation<br>Licensed under the PSF License.<br> + <a href="https://docs.python.org/3.12/library/email.charset.html" class="_attribution-link">https://docs.python.org/3.12/library/email.charset.html</a> + </p> +</div> |
