bz2 — Support for bzip2 compression

Source code: Lib/bz2.py


This module provides a comprehensive interface for compressing and decompressing data using the bzip2 compression algorithm.

The bz2 module contains:

This is an optional module. If it is missing from your copy of CPython, look for documentation from your distributor (that is, whoever provided Python to you). If you are the distributor, see Requirements for optional modules.

(De)compression of files

bz2.open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)

Open a bzip2-compressed file in binary or text mode, returning a file object.

As with the constructor for BZ2File, the filename argument can be an actual filename (a str or bytes object), or an existing file object to read from or write to.

The mode argument can be any of 'r', 'rb', 'w', 'wb', 'x', 'xb', 'a' or 'ab' for binary mode, or 'rt', 'wt', 'xt', or 'at' for text mode. The default is 'rb'.

The compresslevel argument is an integer from 1 to 9, as for the BZ2File constructor.

For binary mode, this function is equivalent to the BZ2File constructor: BZ2File(filename, mode, compresslevel=compresslevel). In this case, the encoding, errors and newline arguments must not be provided.

For text mode, a BZ2File object is created, and wrapped in an io.TextIOWrapper instance with the specified encoding, error handling behavior, and line ending(s).

Added in version 3.3.

Changed in version 3.4: The 'x' (exclusive creation) mode was added.

Changed in version 3.6: Accepts a path-like object.

class bz2.BZ2File(filename, mode='r', *, compresslevel=9)

Open a bzip2-compressed file in binary mode.

If filename is a str or bytes object, open the named file directly. Otherwise, filename should be a file object, which will be used to read or write the compressed data.

The mode argument can be either 'r' for reading (default), 'w' for overwriting, 'x' for exclusive creation, or 'a' for appending. These can equivalently be given as 'rb', 'wb', 'xb' and 'ab' respectively.

If filename is a file object (rather than an actual file name), a mode of 'w' does not truncate the file, and is instead equivalent to 'a'.

If mode is 'w' or 'a', compresslevel can be an integer between 1 and 9 specifying the level of compression: 1 produces the least compression, and 9 (default) produces the most compression.

If mode is 'r', the input file may be the concatenation of multiple compressed streams.

BZ2File provides all of the members specified by the io.BufferedIOBase, except for detach() and truncate(). Iteration and the with statement are supported.

BZ2File also provides the following methods and attributes:

peek([n])

Return buffered data without advancing the file position. At least one byte of data will be returned (unless at EOF). The exact number of bytes returned is unspecified.

Note

While calling peek() does not change the file position of the BZ2File, it may change the position of the underlying file object (e.g. if the BZ2File was constructed by passing a file object for filename).

Added in version 3.3.

fileno()

Return the file descriptor for the underlying file.

Added in version 3.3.

readable()

Return whether the file was opened for reading.

Added in version 3.3.

seekable()

Return whether the file supports seeking.

Added in version 3.3.

writable()

Return whether the file was opened for writing.

Added in version 3.3.

read1(size=-1)

Read up to size uncompressed bytes, while trying to avoid making multiple reads from the underlying stream. Reads up to a buffer’s worth of data if size is negative.

Returns b'' if the file is at EOF.

Added in version 3.3.

readinto(b)

Read bytes into b.

Returns the number of bytes read (0 for EOF).

Added in version 3.3.

mode

'rb' for reading and 'wb' for writing.

Added in version 3.13.

name

The bzip2 file name. Equivalent to the name attribute of the underlying file object.

Added in version 3.13.

Changed in version 3.1: Support for the with statement was added.

Changed in version 3.3: Support was added for filename being a file object instead of an actual filename.

The 'a' (append) mode was added, along with support for reading multi-stream files.

Changed in version 3.4: The 'x' (exclusive creation) mode was added.

Changed in version 3.5: The read() method now accepts an argument of None.

Changed in version 3.6: Accepts a path-like object.

Changed in version 3.9: The buffering parameter has been removed. It was ignored and deprecated since Python 3.0. Pass an open file object to control how the file is opened.

The compresslevel parameter became keyword-only.

Changed in version 3.10: This class is thread unsafe in the face of multiple simultaneous readers or writers, just like its equivalent classes in gzip and lzma have always been.

Incremental (de)compression

class bz2.BZ2Compressor(compresslevel=9)

Create a new compressor object. This object may be used to compress data incrementally. For one-shot compression, use the