16.2. io — Core tools for working with streams

Source code: Lib/io.py


16.2.1. Overview

The io module provides Python’s main facilities for dealing with various types of I/O. There are three main types of I/O: text I/O, binary I/O and raw I/O. These are generic categories, and various backing stores can be used for each of them. A concrete object belonging to any of these categories is called a file object. Other common terms are stream and file-like object.

Independent of its category, each concrete stream object will also have various capabilities: it can be read-only, write-only, or read-write. It can also allow arbitrary random access (seeking forwards or backwards to any location), or only sequential access (for example in the case of a socket or pipe).

All streams are careful about the type of data you give to them. For example giving a str object to the write() method of a binary stream will raise a TypeError. So will giving a bytes object to the write() method of a text stream.

Changed in version 3.3: Operations that used to raise IOError now raise OSError, since IOError is now an alias of OSError.

16.2.1.1. Text I/O

Text I/O expects and produces str objects. This means that whenever the backing store is natively made of bytes (such as in the case of a file), encoding and decoding of data is made transparently as well as optional translation of platform-specific newline characters.

The easiest way to create a text stream is with open(), optionally specifying an encoding:

f = open("myfile.txt", "r", encoding="utf-8")

In-memory text streams are also available as StringIO objects:

f = io.StringIO("some initial text data")

The text stream API is described in detail in the documentation of TextIOBase.

16.2.1.2. Binary I/O

Binary I/O (also called buffered I/O) expects bytes-like objects and produces bytes objects. No encoding, decoding, or newline translation is performed. This category of streams can be used for all kinds of non-text data, and also when manual control over the handling of text data is desired.

The easiest way to create a binary stream is with open() with 'b' in the mode string:

f = open("myfile.jpg", "rb")

In-memory binary streams are also available as BytesIO objects:

f = io.BytesIO(b"some initial binary data: \x00\x01")

The binary stream API is described in detail in the docs of BufferedIOBase.

Other library modules may provide additional ways to create text or binary streams. See socket.socket.makefile() for example.

16.2.1.3. Raw I/O

Raw I/O (also called unbuffered I/O) is generally used as a low-level building-block for binary and text streams; it is rarely useful to directly manipulate a raw stream from user code. Nevertheless, you can create a raw stream by opening a file in binary mode with buffering disabled:

f = open("myfile.jpg", "rb", buffering=0)

The raw stream API is described in detail in the docs of RawIOBase.

16.2.2. High-level Module Interface

io.DEFAULT_BUFFER_SIZE

An int containing the default buffer size used by the module’s buffered I/O classes. open() uses the file’s blksize (as obtained by os.stat()) if possible.

io.open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

This is an alias for the builtin open() function.

exception io.BlockingIOError

This is a compatibility alias for the builtin BlockingIOError exception.

exception io.UnsupportedOperation

An exception inheriting OSError and ValueError that is raised when an unsupported operation is called on a stream.

16.2.2.1. In-memory streams

It is also possible to use a str or bytes-like object as a file for both reading and writing. For strings StringIO can be used like a file opened in text mode. BytesIO can be used like a file opened in binary mode. Both provide full read-write capabilities with random access.

See also

sys

contains the standard IO streams: sys.stdin, sys.stdout, and sys.stderr.

16.2.3. Class hierarchy

The implementation of I/O streams is organized as a hierarchy of classes. First abstract base classes (ABCs), which are used to specify the various categories of streams, then concrete classes providing the standard stream implementations.

Note

The abstract base classes also provide default implementations of some methods in order to help implementation of concrete stream classes. For example, BufferedIOBase provides unoptimized implementations of readinto() and readline().

At the top of the I/O hierarchy is the abstract base class IOBase. It defines the basic interface to a stream. Note, however, that there is no separation between reading and writing to streams; implementations are allowed to raise UnsupportedOperation if they do not support a given operation.

The RawIOBase ABC extends IOBase. It deals with the reading and writing of bytes to a stream. FileIO subclasses RawIOBase to provide an interface to files in the machine’s file system.

The BufferedIOBase ABC deals with buffering on a raw byte stream (RawIOBase). Its subclasses, BufferedWriter, BufferedReader, and BufferedRWPair buffer streams that are readable, writable, and both readable and writable. BufferedRandom provides a buffered interface to random access streams. Another BufferedIOBase subclass, BytesIO, is a stream of in-memory bytes.

The TextIOBase ABC, another subclass of IOBase, deals with streams whose bytes represent text, and handles encoding and decoding to and from strings. TextIOWrapper, which extends it, is a buffered text interface to a buffered raw stream (BufferedIOBase). Finally, StringIO is an in-memory stream for text.

Argument names are not part of the specification, and only the arguments of open() are intended to be used as keyword arguments.

The following table summarizes the ABCs provided by the io module:

ABC

Inherits

Stub Methods

Mixin Methods and Properties

IOBase

fileno, seek, and truncate

close, closed, __enter__, __exit__, flush, isatty, __iter__, __next__, readable, readline, readlines, seekable, tell, writable, and writelines

RawIOBase

IOBase

readinto and write

Inherited IOBase methods, read, and readall

BufferedIOBase

IOBase

detach, read, read1, and write

Inherited IOBase methods, readinto, and readinto1

TextIOBase

IOBase

detach, read, readline, and write

Inherited IOBase methods, encoding, errors, and newlines

16.2.3.1. I/O Base Classes

class io.IOBase

The abstract base class for all I/O classes, acting on streams of bytes. There is no public constructor.

This class provides empty abstract implementations for many methods that derived classes can override selectively; the default implementations represent a file that cannot be read, written or seeked.

Even though IOBase does not declare read(), readinto(), or write() because their signatures will vary, implementations and clients should consider those methods part of the interface. Also, implementations may raise a ValueError (or UnsupportedOperation) when operations they do not support are called.

The basic type used for binary data read from or written to a file is bytes. Other bytes-like objects are accepted as method arguments too. In some cases, such as readinto(), a writable object such as bytearray is required. Text I/O classes work with str data.

Note that calling any method (even inquiries) on a closed stream is undefined. Implementations may raise ValueError in this case.

IOBase (and its subclasses) supports the iterator protocol, meaning that an IOBase object can be iterated over yielding the lines in a stream. Lines are defined slightly differently depending on whether the stream is a binary stream (yielding bytes), or a text stream (yielding character strings). See readline() below.

IOBase is also a context manager and therefore supports the with statement. In this example, file is closed after the with statement’s suite is finished—even if an exception occurs:

with open('spam.txt', 'w') as file:
    file.write('Spam and eggs!')

IOBase provides these data attributes and methods:

close()

Flush and close this stream. This method has no effect if the file is already closed. Once the file is closed, any operation on the file (e.g. reading or writing) will raise a ValueError.

As a convenience, it is allowed to call this method more than once; only the first call, however, will have an effect.

closed

True if the stream is closed.

fileno()

Return the underlying file descriptor (an integer) of the stream if it exists. An OSError is raised if the IO object does not use a file descriptor.

flush()

Flush the write buffers of the stream if applicable. This does nothing for read-only and non-blocking streams.

isatty()

Return True if the stream is interactive (i.e., connected to a terminal/tty device).

readable()

Return True if the stream can be read from. If False, read() will raise OSError.

readline(size=-1)

Read and return one line from the stream. If size is specified, at most size bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.

readlines(hint=-1)

Read and return a list of lines from the stream. hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.

Note that it’s already possible to iterate on file objects using for line in file: ... without calling file.readlines().

seek(offset[, whence])

Change the stream position to the given byte offset. offset is interpreted relative to the position indicated by whence. The default value for whence is SEEK_SET. Values for whence are:

  • SEEK_SET or 0 – start of the stream (the default); offset should be zero or positive

  • SEEK_CUR or 1 – current stream position; offset may be negative

  • SEEK_END or 2 – end of the stream; offset is usually negative

Return the new absolute position.

New in version 3.1: The SEEK_* constants.

New in version 3.3: Some operating systems could support additional values, like os.SEEK_HOLE or os.SEEK_DATA. The valid values for a file could depend on it being open in text or binary mode.

seekable()

Return True if the stream supports random access. If False, seek(), tell() and truncate() will raise OSError.

tell()

Return the current stream position.

truncate(size=None)

Resize the stream to the given size in bytes (or the current position if size is not specified). The current stream position isn’t changed. This resizing can extend or reduce the current file size. In case of extension, the contents of the new file area depend on the platform (on most systems, additional bytes are zero-filled). The new file size is returned.

Changed in version 3.5: Windows will now zero-fill files when extending.

writable()

Return True if the stream supports writing. If False, write() and truncate() will raise OSError.

writelines(lines)

Write a list of lines to the stream. Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.

__del__()

Prepare for object destruction. IOBase provides a default implementation of this method that calls the instance’s close() method.

class io.RawIOBase

Base class for raw binary I/O. It inherits IOBase. There is no public constructor.

Raw binary I/O typically provides low-level access to an underlying OS device or API, and does not try to encapsulate it in high-level primitives (this is left to Buffered I/O and Text I/O, described later in this page).

In addition to the attributes and methods from IOBase, RawIOBase provides the following methods:

read(size=-1)

Read up to size bytes from the object and return them. As a convenience, if size is unspecified or -1, all bytes until EOF are returned. Otherwise, only one system call is ever made. Fewer than size bytes may be returned if the operating system call returns fewer than size bytes.

If 0 bytes are returned, and size was not 0, this indicates end of file. If the object is in non-blocking mode and no bytes are available, None is returned.

The default implementation defers to readall() and readinto().

readall()

Read and return all the bytes from the stream until EOF, using multiple calls to the stream if necessary.

readinto(b)

Read bytes into a pre-allocated, writable bytes-like object b, and return the number of bytes read. If the object is in non-blocking mode and no bytes are available, None is returned.

write(b)

Write the given bytes-like object, b, to the underlying raw stream, and return the number of bytes written. This can be less than the length of b in bytes, depending on specifics of the underlying raw stream, and especially if it is in non-blocking mode. None is returned if the raw stream is set not to block and no single byte could be readily written to it. The caller may release or mutate b after this method returns, so the implementation should only access b during the method call.

class io.BufferedIOBase

Base class for binary streams that support some kind of buffering. It inherits IOBase. There is no public constructor.

The main difference with RawIOBase is that methods read(), readinto() and write() will try (respectively) to read as much input as requested or to consume all given output, at the expense of making perhaps more than one system call.

In addition, those methods can raise BlockingIOError if the underlying raw stream is in non-blocking mode and cannot take or give enough data; unlike their RawIOBase counterparts, they will never return None.

Besides, the