diff options
| author | Craig Jennings <c@cjennings.net> | 2024-04-07 13:41:34 -0500 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2024-04-07 13:41:34 -0500 |
| commit | 754bbf7a25a8dda49b5d08ef0d0443bbf5af0e36 (patch) | |
| tree | f1190704f78f04a2b0b4c977d20fe96a828377f1 /devdocs/python~3.12/library%2Fbz2.html | |
new repository
Diffstat (limited to 'devdocs/python~3.12/library%2Fbz2.html')
| -rw-r--r-- | devdocs/python~3.12/library%2Fbz2.html | 112 |
1 files changed, 112 insertions, 0 deletions
diff --git a/devdocs/python~3.12/library%2Fbz2.html b/devdocs/python~3.12/library%2Fbz2.html new file mode 100644 index 00000000..5fe62e77 --- /dev/null +++ b/devdocs/python~3.12/library%2Fbz2.html @@ -0,0 +1,112 @@ + <span id="bz2-support-for-bzip2-compression"></span><h1>bz2 — Support for bzip2 compression</h1> <p><strong>Source code:</strong> <a class="reference external" href="https://github.com/python/cpython/tree/3.12/Lib/bz2.py">Lib/bz2.py</a></p> <p>This module provides a comprehensive interface for compressing and decompressing data using the bzip2 compression algorithm.</p> <p>The <a class="reference internal" href="#module-bz2" title="bz2: Interfaces for bzip2 compression and decompression."><code>bz2</code></a> module contains:</p> <ul class="simple"> <li>The <a class="reference internal" href="#bz2.open" title="bz2.open"><code>open()</code></a> function and <a class="reference internal" href="#bz2.BZ2File" title="bz2.BZ2File"><code>BZ2File</code></a> class for reading and writing compressed files.</li> <li>The <a class="reference internal" href="#bz2.BZ2Compressor" title="bz2.BZ2Compressor"><code>BZ2Compressor</code></a> and <a class="reference internal" href="#bz2.BZ2Decompressor" title="bz2.BZ2Decompressor"><code>BZ2Decompressor</code></a> classes for incremental (de)compression.</li> <li>The <a class="reference internal" href="#bz2.compress" title="bz2.compress"><code>compress()</code></a> and <a class="reference internal" href="#bz2.decompress" title="bz2.decompress"><code>decompress()</code></a> functions for one-shot (de)compression.</li> </ul> <section id="de-compression-of-files"> <h2>(De)compression of files</h2> <dl class="py function"> <dt class="sig sig-object py" id="bz2.open"> +<code>bz2.open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)</code> </dt> <dd> +<p>Open a bzip2-compressed file in binary or text mode, returning a <a class="reference internal" href="../glossary#term-file-object"><span class="xref std std-term">file object</span></a>.</p> <p>As with the constructor for <a class="reference internal" href="#bz2.BZ2File" title="bz2.BZ2File"><code>BZ2File</code></a>, the <em>filename</em> argument can be an actual filename (a <a class="reference internal" href="stdtypes#str" title="str"><code>str</code></a> or <a class="reference internal" href="stdtypes#bytes" title="bytes"><code>bytes</code></a> object), or an existing file object to read from or write to.</p> <p>The <em>mode</em> argument can be any of <code>'r'</code>, <code>'rb'</code>, <code>'w'</code>, <code>'wb'</code>, <code>'x'</code>, <code>'xb'</code>, <code>'a'</code> or <code>'ab'</code> for binary mode, or <code>'rt'</code>, <code>'wt'</code>, <code>'xt'</code>, or <code>'at'</code> for text mode. The default is <code>'rb'</code>.</p> <p>The <em>compresslevel</em> argument is an integer from 1 to 9, as for the <a class="reference internal" href="#bz2.BZ2File" title="bz2.BZ2File"><code>BZ2File</code></a> constructor.</p> <p>For binary mode, this function is equivalent to the <a class="reference internal" href="#bz2.BZ2File" title="bz2.BZ2File"><code>BZ2File</code></a> constructor: <code>BZ2File(filename, mode, compresslevel=compresslevel)</code>. In this case, the <em>encoding</em>, <em>errors</em> and <em>newline</em> arguments must not be provided.</p> <p>For text mode, a <a class="reference internal" href="#bz2.BZ2File" title="bz2.BZ2File"><code>BZ2File</code></a> object is created, and wrapped in an <a class="reference internal" href="io#io.TextIOWrapper" title="io.TextIOWrapper"><code>io.TextIOWrapper</code></a> instance with the specified encoding, error handling behavior, and line ending(s).</p> <div class="versionadded"> <p><span class="versionmodified added">New in version 3.3.</span></p> </div> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.4: </span>The <code>'x'</code> (exclusive creation) mode was added.</p> </div> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.6: </span>Accepts a <a class="reference internal" href="../glossary#term-path-like-object"><span class="xref std std-term">path-like object</span></a>.</p> </div> </dd> +</dl> <dl class="py class"> <dt class="sig sig-object py" id="bz2.BZ2File"> +<code>class bz2.BZ2File(filename, mode='r', *, compresslevel=9)</code> </dt> <dd> +<p>Open a bzip2-compressed file in binary mode.</p> <p>If <em>filename</em> is a <a class="reference internal" href="stdtypes#str" title="str"><code>str</code></a> or <a class="reference internal" href="stdtypes#bytes" title="bytes"><code>bytes</code></a> object, open the named file directly. Otherwise, <em>filename</em> should be a <a class="reference internal" href="../glossary#term-file-object"><span class="xref std std-term">file object</span></a>, which will be used to read or write the compressed data.</p> <p>The <em>mode</em> argument can be either <code>'r'</code> for reading (default), <code>'w'</code> for overwriting, <code>'x'</code> for exclusive creation, or <code>'a'</code> for appending. These can equivalently be given as <code>'rb'</code>, <code>'wb'</code>, <code>'xb'</code> and <code>'ab'</code> respectively.</p> <p>If <em>filename</em> is a file object (rather than an actual file name), a mode of <code>'w'</code> does not truncate the file, and is instead equivalent to <code>'a'</code>.</p> <p>If <em>mode</em> is <code>'w'</code> or <code>'a'</code>, <em>compresslevel</em> can be an integer between <code>1</code> and <code>9</code> specifying the level of compression: <code>1</code> produces the least compression, and <code>9</code> (default) produces the most compression.</p> <p>If <em>mode</em> is <code>'r'</code>, the input file may be the concatenation of multiple compressed streams.</p> <p><a class="reference internal" href="#bz2.BZ2File" title="bz2.BZ2File"><code>BZ2File</code></a> provides all of the members specified by the <a class="reference internal" href="io#io.BufferedIOBase" title="io.BufferedIOBase"><code>io.BufferedIOBase</code></a>, except for <a class="reference internal" href="io#io.BufferedIOBase.detach" title="io.BufferedIOBase.detach"><code>detach()</code></a> and <a class="reference internal" href="io#io.IOBase.truncate" title="io.IOBase.truncate"><code>truncate()</code></a>. Iteration and the <a class="reference internal" href="../reference/compound_stmts#with"><code>with</code></a> statement are supported.</p> <p><a class="reference internal" href="#bz2.BZ2File" title="bz2.BZ2File"><code>BZ2File</code></a> also provides the following methods:</p> <dl class="py method"> <dt class="sig sig-object py" id="bz2.BZ2File.peek"> +<code>peek([n])</code> </dt> <dd> +<p>Return buffered data without advancing the file position. At least one byte of data will be returned (unless at EOF). The exact number of bytes returned is unspecified.</p> <div class="admonition note"> <p class="admonition-title">Note</p> <p>While calling <a class="reference internal" href="#bz2.BZ2File.peek" title="bz2.BZ2File.peek"><code>peek()</code></a> does not change the file position of the <a class="reference internal" href="#bz2.BZ2File" title="bz2.BZ2File"><code>BZ2File</code></a>, it may change the position of the underlying file object (e.g. if the <a class="reference internal" href="#bz2.BZ2File" title="bz2.BZ2File"><code>BZ2File</code></a> was constructed by passing a file object for <em>filename</em>).</p> </div> <div class="versionadded"> <p><span class="versionmodified added">New in version 3.3.</span></p> </div> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="bz2.BZ2File.fileno"> +<code>fileno()</code> </dt> <dd> +<p>Return the file descriptor for the underlying file.</p> <div class="versionadded"> <p><span class="versionmodified added">New in version 3.3.</span></p> </div> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="bz2.BZ2File.readable"> +<code>readable()</code> </dt> <dd> +<p>Return whether the file was opened for reading.</p> <div class="versionadded"> <p><span class="versionmodified added">New in version 3.3.</span></p> </div> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="bz2.BZ2File.seekable"> +<code>seekable()</code> </dt> <dd> +<p>Return whether the file supports seeking.</p> <div class="versionadded"> <p><span class="versionmodified added">New in version 3.3.</span></p> </div> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="bz2.BZ2File.writable"> +<code>writable()</code> </dt> <dd> +<p>Return whether the file was opened for writing.</p> <div class="versionadded"> <p><span class="versionmodified added">New in version 3.3.</span></p> </div> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="bz2.BZ2File.read1"> +<code>read1(size=- 1)</code> </dt> <dd> +<p>Read up to <em>size</em> uncompressed bytes, while trying to avoid making multiple reads from the underlying stream. Reads up to a buffer’s worth of data if size is negative.</p> <p>Returns <code>b''</code> if the file is at EOF.</p> <div class="versionadded"> <p><span class="versionmodified added">New in version 3.3.</span></p> </div> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="bz2.BZ2File.readinto"> +<code>readinto(b)</code> </dt> <dd> +<p>Read bytes into <em>b</em>.</p> <p>Returns the number of bytes read (0 for EOF).</p> <div class="versionadded"> <p><span class="versionmodified added">New in version 3.3.</span></p> </div> </dd> +</dl> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.1: </span>Support for the <a class="reference internal" href="../reference/compound_stmts#with"><code>with</code></a> statement was added.</p> </div> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.3: </span>Support was added for <em>filename</em> being a <a class="reference internal" href="../glossary#term-file-object"><span class="xref std std-term">file object</span></a> instead of an actual filename.</p> </div> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.3: </span>The <code>'a'</code> (append) mode was added, along with support for reading multi-stream files.</p> </div> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.4: </span>The <code>'x'</code> (exclusive creation) mode was added.</p> </div> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.5: </span>The <a class="reference internal" href="io#io.BufferedIOBase.read" title="io.BufferedIOBase.read"><code>read()</code></a> method now accepts an argument of <code>None</code>.</p> </div> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.6: </span>Accepts a <a class="reference internal" href="../glossary#term-path-like-object"><span class="xref std std-term">path-like object</span></a>.</p> </div> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.9: </span>The <em>buffering</em> parameter has been removed. It was ignored and deprecated since Python 3.0. Pass an open file object to control how the file is opened.</p> <p>The <em>compresslevel</em> parameter became keyword-only.</p> </div> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.10: </span>This class is thread unsafe in the face of multiple simultaneous readers or writers, just like its equivalent classes in <a class="reference internal" href="gzip#module-gzip" title="gzip: Interfaces for gzip compression and decompression using file objects."><code>gzip</code></a> and <a class="reference internal" href="lzma#module-lzma" title="lzma: A Python wrapper for the liblzma compression library."><code>lzma</code></a> have always been.</p> </div> </dd> +</dl> </section> <section id="incremental-de-compression"> <h2>Incremental (de)compression</h2> <dl class="py class"> <dt class="sig sig-object py" id="bz2.BZ2Compressor"> +<code>class bz2.BZ2Compressor(compresslevel=9)</code> </dt> <dd> +<p>Create a new compressor object. This object may be used to compress data incrementally. For one-shot compression, use the <a class="reference internal" href="#bz2.compress" title="bz2.compress"><code>compress()</code></a> function instead.</p> <p><em>compresslevel</em>, if given, must be an integer between <code>1</code> and <code>9</code>. The default is <code>9</code>.</p> <dl class="py method"> <dt class="sig sig-object py" id="bz2.BZ2Compressor.compress"> +<code>compress(data)</code> </dt> <dd> +<p>Provide data to the compressor object. Returns a chunk of compressed data if possible, or an empty byte string otherwise.</p> <p>When you have finished providing data to the compressor, call the <a class="reference internal" href="#bz2.BZ2Compressor.flush" title="bz2.BZ2Compressor.flush"><code>flush()</code></a> method to finish the compression process.</p> </dd> +</dl> <dl class="py method"> <dt class="sig sig-object py" id="bz2.BZ2Compressor.flush"> +<code>flush()</code> </dt> <dd> +<p>Finish the compression process. Returns the compressed data left in internal buffers.</p> <p>The compressor object may not be used after this method has been called.</p> </dd> +</dl> </dd> +</dl> <dl class="py class"> <dt class="sig sig-object py" id="bz2.BZ2Decompressor"> +<code>class bz2.BZ2Decompressor</code> </dt> <dd> +<p>Create a new decompressor object. This object may be used to decompress data incrementally. For one-shot compression, use the <a class="reference internal" href="#bz2.decompress" title="bz2.decompress"><code>decompress()</code></a> function instead.</p> <div class="admonition note"> <p class="admonition-title">Note</p> <p>This class does not transparently handle inputs containing multiple compressed streams, unlike <a class="reference internal" href="#bz2.decompress" title="bz2.decompress"><code>decompress()</code></a> and <a class="reference internal" href="#bz2.BZ2File" title="bz2.BZ2File"><code>BZ2File</code></a>. If you need to decompress a multi-stream input with <a class="reference internal" href="#bz2.BZ2Decompressor" title="bz2.BZ2Decompressor"><code>BZ2Decompressor</code></a>, you must use a new decompressor for each stream.</p> </div> <dl class="py method"> <dt class="sig sig-object py" id="bz2.BZ2Decompressor.decompress"> +<code>decompress(data, max_length=- 1)</code> </dt> <dd> +<p>Decompress <em>data</em> (a <a class="reference internal" href="../glossary#term-bytes-like-object"><span class="xref std std-term">bytes-like object</span></a>), returning uncompressed data as bytes. Some of <em>data</em> may be buffered internally, for use in later calls to <a class="reference internal" href="#bz2.decompress" title="bz2.decompress"><code>decompress()</code></a>. The returned data should be concatenated with the output of any previous calls to <a class="reference internal" href="#bz2.decompress" title="bz2.decompress"><code>decompress()</code></a>.</p> <p>If <em>max_length</em> is nonnegative, returns at most <em>max_length</em> bytes of decompressed data. If this limit is reached and further output can be produced, the <a class="reference internal" href="#bz2.BZ2Decompressor.needs_input" title="bz2.BZ2Decompressor.needs_input"><code>needs_input</code></a> attribute will be set to <code>False</code>. In this case, the next call to <a class="reference internal" href="#bz2.BZ2Decompressor.decompress" title="bz2.BZ2Decompressor.decompress"><code>decompress()</code></a> may provide <em>data</em> as <code>b''</code> to obtain more of the output.</p> <p>If all of the input data was decompressed and returned (either because this was less than <em>max_length</em> bytes, or because <em>max_length</em> was negative), the <a class="reference internal" href="#bz2.BZ2Decompressor.needs_input" title="bz2.BZ2Decompressor.needs_input"><code>needs_input</code></a> attribute will be set to <code>True</code>.</p> <p>Attempting to decompress data after the end of stream is reached raises an <a class="reference internal" href="exceptions#EOFError" title="EOFError"><code>EOFError</code></a>. Any data found after the end of the stream is ignored and saved in the <a class="reference internal" href="#bz2.BZ2Decompressor.unused_data" title="bz2.BZ2Decompressor.unused_data"><code>unused_data</code></a> attribute.</p> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.5: </span>Added the <em>max_length</em> parameter.</p> </div> </dd> +</dl> <dl class="py attribute"> <dt class="sig sig-object py" id="bz2.BZ2Decompressor.eof"> +<code>eof</code> </dt> <dd> +<p><code>True</code> if the end-of-stream marker has been reached.</p> <div class="versionadded"> <p><span class="versionmodified added">New in version 3.3.</span></p> </div> </dd> +</dl> <dl class="py attribute"> <dt class="sig sig-object py" id="bz2.BZ2Decompressor.unused_data"> +<code>unused_data</code> </dt> <dd> +<p>Data found after the end of the compressed stream.</p> <p>If this attribute is accessed before the end of the stream has been reached, its value will be <code>b''</code>.</p> </dd> +</dl> <dl class="py attribute"> <dt class="sig sig-object py" id="bz2.BZ2Decompressor.needs_input"> +<code>needs_input</code> </dt> <dd> +<p><code>False</code> if the <a class="reference internal" href="#bz2.BZ2Decompressor.decompress" title="bz2.BZ2Decompressor.decompress"><code>decompress()</code></a> method can provide more decompressed data before requiring new uncompressed input.</p> <div class="versionadded"> <p><span class="versionmodified added">New in version 3.5.</span></p> </div> </dd> +</dl> </dd> +</dl> </section> <section id="one-shot-de-compression"> <h2>One-shot (de)compression</h2> <dl class="py function"> <dt class="sig sig-object py" id="bz2.compress"> +<code>bz2.compress(data, compresslevel=9)</code> </dt> <dd> +<p>Compress <em>data</em>, a <a class="reference internal" href="../glossary#term-bytes-like-object"><span class="xref std std-term">bytes-like object</span></a>.</p> <p><em>compresslevel</em>, if given, must be an integer between <code>1</code> and <code>9</code>. The default is <code>9</code>.</p> <p>For incremental compression, use a <a class="reference internal" href="#bz2.BZ2Compressor" title="bz2.BZ2Compressor"><code>BZ2Compressor</code></a> instead.</p> </dd> +</dl> <dl class="py function"> <dt class="sig sig-object py" id="bz2.decompress"> +<code>bz2.decompress(data)</code> </dt> <dd> +<p>Decompress <em>data</em>, a <a class="reference internal" href="../glossary#term-bytes-like-object"><span class="xref std std-term">bytes-like object</span></a>.</p> <p>If <em>data</em> is the concatenation of multiple compressed streams, decompress all of the streams.</p> <p>For incremental decompression, use a <a class="reference internal" href="#bz2.BZ2Decompressor" title="bz2.BZ2Decompressor"><code>BZ2Decompressor</code></a> instead.</p> <div class="versionchanged"> <p><span class="versionmodified changed">Changed in version 3.3: </span>Support for multi-stream inputs was added.</p> </div> </dd> +</dl> </section> <section id="examples-of-usage"> <span id="bz2-usage-examples"></span><h2>Examples of usage</h2> <p>Below are some examples of typical usage of the <a class="reference internal" href="#module-bz2" title="bz2: Interfaces for bzip2 compression and decompression."><code>bz2</code></a> module.</p> <p>Using <a class="reference internal" href="#bz2.compress" title="bz2.compress"><code>compress()</code></a> and <a class="reference internal" href="#bz2.decompress" title="bz2.decompress"><code>decompress()</code></a> to demonstrate round-trip compression:</p> <pre data-language="python">>>> import bz2 +>>> data = b"""\ +... Donec rhoncus quis sapien sit amet molestie. Fusce scelerisque vel augue +... nec ullamcorper. Nam rutrum pretium placerat. Aliquam vel tristique lorem, +... sit amet cursus ante. In interdum laoreet mi, sit amet ultrices purus +... pulvinar a. Nam gravida euismod magna, non varius justo tincidunt feugiat. +... Aliquam pharetra lacus non risus vehicula rutrum. Maecenas aliquam leo +... felis. Pellentesque semper nunc sit amet nibh ullamcorper, ac elementum +... dolor luctus. Curabitur lacinia mi ornare consectetur vestibulum.""" +>>> c = bz2.compress(data) +>>> len(data) / len(c) # Data compression ratio +1.513595166163142 +>>> d = bz2.decompress(c) +>>> data == d # Check equality to original object after round-trip +True +</pre> <p>Using <a class="reference internal" href="#bz2.BZ2Compressor" title="bz2.BZ2Compressor"><code>BZ2Compressor</code></a> for incremental compression:</p> <pre data-language="python">>>> import bz2 +>>> def gen_data(chunks=10, chunksize=1000): +... """Yield incremental blocks of chunksize bytes.""" +... for _ in range(chunks): +... yield b"z" * chunksize +... +>>> comp = bz2.BZ2Compressor() +>>> out = b"" +>>> for chunk in gen_data(): +... # Provide data to the compressor object +... out = out + comp.compress(chunk) +... +>>> # Finish the compression process. Call this once you have +>>> # finished providing data to the compressor. +>>> out = out + comp.flush() +</pre> <p>The example above uses a very “nonrandom” stream of data (a stream of <code>b"z"</code> chunks). Random data tends to compress poorly, while ordered, repetitive data usually yields a high compression ratio.</p> <p>Writing and reading a bzip2-compressed file in binary mode:</p> <pre data-language="python">>>> import bz2 +>>> data = b"""\ +... Donec rhoncus quis sapien sit amet molestie. Fusce scelerisque vel augue +... nec ullamcorper. Nam rutrum pretium placerat. Aliquam vel tristique lorem, +... sit amet cursus ante. In interdum laoreet mi, sit amet ultrices purus +... pulvinar a. Nam gravida euismod magna, non varius justo tincidunt feugiat. +... Aliquam pharetra lacus non risus vehicula rutrum. Maecenas aliquam leo +... felis. Pellentesque semper nunc sit amet nibh ullamcorper, ac elementum +... dolor luctus. Curabitur lacinia mi ornare consectetur vestibulum.""" +>>> with bz2.open("myfile.bz2", "wb") as f: +... # Write compressed data to file +... unused = f.write(data) +... +>>> with bz2.open("myfile.bz2", "rb") as f: +... # Decompress data from file +... content = f.read() +... +>>> content == data # Check equality to original object after round-trip +True +</pre> </section> <div class="_attribution"> + <p class="_attribution-p"> + © 2001–2023 Python Software Foundation<br>Licensed under the PSF License.<br> + <a href="https://docs.python.org/3.12/library/bz2.html" class="_attribution-link">https://docs.python.org/3.12/library/bz2.html</a> + </p> +</div> |
