The Java I/O stream library is an important part of everyday programming. The stream API is rich, with interfaces, classes, and methods to support almost every programmer’s needs. In trying to meet every need, the library has grown into a large collection, extended more recently by the NIO.2 (New I/O version 2) APIs. It is easy to get lost among the stream implementations, especially for a beginner. This article offers some pointers to streamline your understanding of the I/O stream APIs in Java.
An Idea of Java I/O Stream
Stream literally means continuous flow, and an I/O stream in Java refers to the flow of bytes between an input source and an output destination. The source or destination can be anything that contains, generates, or consumes data. For example, it may be a peripheral device, a network socket, a memory structure like an array, a disk file, or another program. After all, bytes are bytes; reading data sent from a server over a network stream is no different from reading a local file. The same holds for writing data. The intriguing part of Java I/O is its approach, which is very different from how I/O is handled in C or C++. Although the data type may vary along with the I/O endpoints, the fundamental approach of the methods in output and input streams is the same throughout the Java APIs. There will always be a read method for the input stream and a write method for the output stream.
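As a minimal sketch of this uniformity, the following copy routine depends only on read and write, so it works for any pair of endpoints. The class name StreamCopy and the in-memory endpoints are assumptions chosen for illustration; a socket or file stream could be passed in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the Java API.
class StreamCopy {
    // Copies every byte from in to out and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) { // read returns -1 at end of stream
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory endpoints stand in for a file, socket, or device.
        InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(in, out);
        System.out.println(copied + " bytes: "
                + new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```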
After the stream object is created, we can almost ignore the intricacies of how the I/O processing is actually carried out. For example, we can chain filter streams to either an output stream or an input stream and modify the data during a read or write operation. The modification may apply encryption or compression, or simply provide methods to convert data into other formats.
The readers and writers, for example, can be chained to an input or output stream to work with characters rather than bytes. Readers and writers can handle a variety of character encodings, including multibyte encodings of Unicode such as UTF-8.
Thus, a lot goes on behind the scenes, even in a seemingly simple I/O flow from one end to another. Implementing all of this from scratch is by no means simple and requires extensive coding. The Java stream APIs handle these complexities, leaving developers free to concentrate on their actual work rather than on the intricacies of I/O processing. One just needs to understand the right use of the API interfaces, classes, and methods and let them handle the details.
Note: There is a separate package, called java.util.stream. Although java.util.stream is conceptually similar to java.io streams, its implementation is different and it serves a different purpose altogether. Introduced in JDK 8, java.util.stream is closely associated with lambda expressions and has little to do with the type of I/O stream we are talking about. Let’s not confuse one with the other (at least for now).
The Java IO Stream API Library
The classes defined in the java.io package implement input/output streams, File, and serialization. File is not exactly a stream, but stream operations are the means of reading and writing file contents. File itself deals with file system manipulation, such as creating, renaming, and deleting files, inspecting their properties and permissions, navigating subdirectories, and so forth. Serialization, on the other hand, is the process of converting Java objects into a byte stream so that they can be persisted locally or sent to a remote machine. A complete treatment is out of the scope of this article; here we focus only on the I/O streaming part. The base classes for I/O streaming are the abstract classes InputStream and OutputStream, which are extended by other classes to add functionality. They can be categorized intuitively as follows.
Figure 1: The Java IO Stream API Library
Byte Stream
Byte Stream classes are mainly used to handle byte-oriented I/O. They are not restricted to any particular data type, though, and can be used with objects as well as raw binary data. The data is translated into 8-bit bytes for I/O operations. This makes byte stream classes suitable for I/O operations where the specific data type does not matter and the data can be handled in binary form. Byte Stream classes are mainly used in network I/O, such as sockets, in binary file operations, and so on. There are many Byte Stream classes in the library; all of them extend the abstract class InputStream for input streaming or OutputStream for output streaming. Examples of concrete byte stream classes are:
public class FileInputStream extends InputStream
public class FileOutputStream extends OutputStream
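A short sketch of these two classes in use follows: it writes a few bytes to a file and reads them back. The class name FileByteStreamDemo and the temporary file prefix are hypothetical, chosen only for this illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical demo class, not part of the Java API.
class FileByteStreamDemo {
    // Writes the given bytes to a file, then reads them back.
    static byte[] roundTrip(File file, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(data);
        }
        byte[] result = new byte[(int) file.length()];
        try (FileInputStream in = new FileInputStream(file)) {
            int offset = 0;
            int n;
            // read may return fewer bytes than requested, so loop until full
            while (offset < result.length
                    && (n = in.read(result, offset, result.length - offset)) != -1) {
                offset += n;
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        File temp = File.createTempFile("byte-stream-demo", ".bin"); // assumed file name
        temp.deleteOnExit();
        byte[] back = roundTrip(temp, new byte[] {0x4A, 0x61, 0x76, 0x61});
        System.out.println(back.length + " bytes read back");
    }
}
```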
Character Stream
Character Stream deals with Unicode characters rather than bytes. Sometimes the character set used locally is a different, non-Unicode one. Character I/O automatically translates between the local character set and Unicode during I/O operations without extensive intervention by the programmer. Using Character Stream keeps the application ready for future internationalization even if it currently uses a local character set such as ASCII. The character stream classes make that transition possible with very little recoding. Character stream classes are derived from the abstract classes Reader and Writer. For example, the character stream classes that handle the translation of bytes to characters and vice versa are:
public class InputStreamReader extends Reader
public class OutputStreamWriter extends Writer
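To illustrate the translation, here is a small sketch that decodes bytes in one character set and re-encodes them in another. The class name TranscodeDemo and the choice of ISO-8859-1 as the "local" character set are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the Java API.
class TranscodeDemo {
    // Decodes bytes from one charset into chars, then encodes them into another.
    static byte[] transcode(byte[] input, Charset from, Charset to) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(input), from);
             Writer writer = new OutputStreamWriter(sink, to)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } // closing the writer flushes the encoded bytes into sink
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "caf\u00e9" is one byte per character in ISO-8859-1
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(latin1.length + " bytes in, " + utf8.length + " bytes out");
    }
}
```

The byte count grows because the accented character needs two bytes in UTF-8, while the characters seen by the program stay the same.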
Buffered Stream
Sometimes, the data needs to be buffered in between I/O operations. For example, an I/O operation may trigger a slow operation like a disk access or some network activity. These expensive operations can bring down the overall performance of the application. To reduce this cost, the Java platform implements buffered I/O streams (a buffer is a memory area). On invocation of an input operation, the data is first read from the buffer. Only when the buffer is empty is a native API called to fetch more content from the I/O device. Calling a native API is expensive, but if the data is found in the buffer, the read is quick and efficient. Buffered streams are particularly suitable for I/O that deals with large amounts of data.
public class BufferedInputStream extends FilterInputStream
public class BufferedOutputStream extends FilterOutputStream
public class BufferedReader extends Reader
public class BufferedWriter extends Writer
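The effect of buffering can be made visible with a rough sketch. CountingSource below is a hypothetical stream that counts how often it is asked for data, standing in for an expensive device or native call; the same byte-at-a-time loop is then run with and without a BufferedInputStream in front of it.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of the Java API.
class BufferingDemo {
    // Counts every request made to the underlying source.
    static final class CountingSource extends InputStream {
        private final byte[] data;
        private int pos;
        int calls;

        CountingSource(byte[] data) { this.data = data; }

        @Override public int read() {
            calls++;
            return pos < data.length ? (data[pos++] & 0xFF) : -1;
        }

        @Override public int read(byte[] b, int off, int len) {
            calls++;
            if (pos >= data.length) return -1;
            int n = Math.min(len, data.length - pos);
            System.arraycopy(data, pos, b, off, n);
            pos += n;
            return n;
        }
    }

    // Reads the stream one byte at a time and returns the number of bytes seen.
    static int drain(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) count++;
        return count;
    }

    public static void main(String[] args) throws IOException {
        CountingSource raw = new CountingSource(new byte[10_000]);
        drain(raw); // every read() goes straight to the source

        CountingSource wrapped = new CountingSource(new byte[10_000]);
        drain(new BufferedInputStream(wrapped)); // most reads are served from the buffer

        System.out.println("unbuffered source calls: " + raw.calls);
        System.out.println("buffered source calls:   " + wrapped.calls);
    }
}
```

The exact number of calls in the buffered case depends on the buffer size, but it should be far smaller than in the unbuffered case.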
Data Stream
Data streams are particularly suitable for reading and writing primitive data to and from streams. The values can be of the primitive types int, long, float, double, byte, short, boolean, and char, as well as String. The implementation classes for data I/O are DataInputStream and DataOutputStream, which implement the DataInput and DataOutput interfaces and extend FilterInputStream and FilterOutputStream, respectively.
public class DataOutputStream extends FilterOutputStream implements DataOutput
public class DataInputStream extends FilterInputStream implements DataInput
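As a brief sketch, the following writes an int, a double, and a String with DataOutputStream and reads them back in the same order with DataInputStream. The class name DataStreamDemo and the record fields (id, price, name) are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class, not part of the Java API.
class DataStreamDemo {
    // Encodes the three values into a compact binary form.
    static byte[] encode(int id, double price, String name) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(id);       // 4 bytes
            out.writeDouble(price); // 8 bytes
            out.writeUTF(name);     // 2-byte length prefix plus encoded characters
        }
        return bytes.toByteArray();
    }

    // Values must be read back in exactly the order they were written.
    static String decode(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            double price = in.readDouble();
            String name = in.readUTF();
            return id + " " + price + " " + name;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] encoded = encode(7, 2.5, "pen");
        System.out.println(encoded.length + " bytes -> " + decode(encoded));
    }
}
```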
Object Stream
As the name suggests, Object Stream deals with Java objects. That means, instead of dealing with primitive values like Data Stream objects, Object Stream performs I/O operations on objects. Primitive values are atomic, whereas Java objects are composite by nature. The primary interfaces for Object Stream are ObjectInput and ObjectOutput, which are basically an extension of the DataInput and DataOutput interfaces, respectively. The implementation classes for Object Stream are as follows.
public class ObjectInputStream extends InputStream implements ObjectInput, ObjectStreamConstants
public class ObjectOutputStream extends OutputStream implements ObjectOutput, ObjectStreamConstants
Object Stream is closely associated with Serialization. The ObjectStreamConstants interface defines the constants that are written into the serialization stream for this purpose.
Refer to the Java documentation for specific examples of each stream type.
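A minimal sketch of an object round trip follows. The Point class is a hypothetical value type invented for the example; the only requirement shown is that it implements Serializable. The helper startsWithMagic checks that the stream begins with ObjectStreamConstants.STREAM_MAGIC, one of the constants mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;

// Hypothetical demo class, not part of the Java API.
class ObjectStreamDemo {
    // Hypothetical value type; it must implement Serializable to be written.
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // True if the first two bytes match the serialization stream header constant.
    static boolean startsWithMagic(byte[] data) {
        int header = ((data[0] & 0xFF) << 8) | (data[1] & 0xFF);
        return header == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF);
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        byte[] data = serialize(new Point(3, 4));
        Point copy = (Point) deserialize(data);
        System.out.println("magic header: " + startsWithMagic(data)
                + ", point: " + copy.x + "," + copy.y);
    }
}
```

Deserializing data from untrusted sources is risky in general; this sketch only reads back bytes it has just written itself.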
Following is a rudimentary hierarchy of Java IO classes.
Figure 2: A rudimentary hierarchy of Java IO classes
Note: Since version 1.4, the core API has included another I/O system, called NIO (New I/O), in the java.nio package, which took Java I/O handling features one step further. It supports channel-based, buffer-oriented I/O operations. Version 1.7 enhanced this library considerably with new file handling and file system support, mainly in the java.nio.file package; since then, it has been called NIO.2. These packages are not meant to replace java.io; rather, they complement it with finer, newer capabilities.
Input Streams
Input stream classes are derived from the abstract class java.io.InputStream. The basic operations of this class are as follows:
abstract int read()
int read(byte[] b)
int read(byte[] b, int off, int len)
int available()
void close()
void mark(int readlimit)
boolean markSupported()
void reset()
long skip(long n)
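Most of these methods are straightforward, but mark and reset are worth a small sketch. The hypothetical helper peek below looks at the next byte without consuming it, after checking markSupported; ByteArrayInputStream is used because it supports marking.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of the Java API.
class MarkResetDemo {
    // Returns the next byte (or -1 at end of stream) without consuming it.
    static int peek(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IOException("mark/reset not supported by " + in.getClass().getName());
        }
        in.mark(1);        // remember the current position
        int b = in.read(); // consume one byte
        in.reset();        // rewind to the marked position
        return b;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30});
        int peeked = peek(in);
        int first = in.read();      // same byte again, because peek did not consume it
        long skipped = in.skip(1);  // jump over the second byte
        System.out.println("peeked " + peeked + ", read " + first
                + ", skipped " + skipped + ", available " + in.available());
    }
}
```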
Output Streams
All output stream classes are the extension of the abstract class java.io.OutputStream. It contains the following variety of operations:
void write(byte[] b)
void write(byte[] b, int off, int len)
abstract void write(int b)
void flush()
void close()
It may seem overwhelming at the beginning, but observe that no matter which extension classes you use, you’ll end up using these methods for I/O streaming. For example, ByteArrayOutputStream is a direct extension of the OutputStream class; you use these same methods to write into a growable byte array. Similarly, FileOutputStream writes to a file, but internally it uses native code because a file is a product of the file system, and how it is actually maintained depends completely on the underlying platform. For example, Windows has a different file system than Linux.
Observe that both OutputStream and InputStream provide a raw implementation of these methods. They are not concerned with the data formats we want to use. The extension classes are more specific in this matter. It may happen that the supplied extension classes are also insufficient for our needs. In such a situation, we can write our own stream classes. Remember, the InputStream and OutputStream classes are abstract, so they can be extended to create a customized class that gives a new meaning to the read and write operations. This is the power of polymorphism.
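As a sketch of such a customized class, the hypothetical UpperCaseOutputStream below extends OutputStream and gives write a new meaning: ASCII lowercase letters are converted to uppercase before being passed on to a target stream. Only the abstract write(int) has to be implemented; the inherited array-based write methods call it for each byte.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical custom stream, not part of the Java API.
class UpperCaseOutputStream extends OutputStream {
    private final OutputStream target;

    UpperCaseOutputStream(OutputStream target) {
        this.target = target;
    }

    // The only abstract method of OutputStream; the inherited write(byte[])
    // and write(byte[], int, int) call it once per byte.
    @Override
    public void write(int b) throws IOException {
        int c = b & 0xFF;
        target.write(c >= 'a' && c <= 'z' ? c - 32 : c);
    }

    @Override
    public void flush() throws IOException {
        target.flush();
    }

    @Override
    public void close() throws IOException {
        target.close();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new UpperCaseOutputStream(sink)) {
            out.write("Hello, io!".getBytes(StandardCharsets.US_ASCII));
        }
        System.out.println(new String(sink.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

Code that writes to this class only sees an OutputStream, which is the polymorphism the paragraph above refers to. A production version would likely also override write(byte[], int, int) for efficiency.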
Filter streams, such as PushbackInputStream and the other subclasses of FilterInputStream and FilterOutputStream, show what such customized implementations look like. They can be chained so that each stream processes the data it receives from the one before it in the chain. For example, an encrypted, compressed network stream can be wrapped in a BufferedInputStream, decrypted through a CipherInputStream, decompressed through a GZIPInputStream, and finally passed to an InputStreamReader to recover the actual text.
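The chaining idea can be sketched with a shorter chain. The example below leaves out the encryption step (a CipherInputStream would need a key and cipher setup) and uses an in-memory byte array in place of a network stream; the class name FilterChainDemo is hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the Java API.
class FilterChainDemo {
    // chars -> UTF-8 bytes -> gzip -> byte array
    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(
                new GZIPOutputStream(bytes), StandardCharsets.UTF_8)) {
            writer.write(text);
        } // closing the writer finishes the gzip stream
        return bytes.toByteArray();
    }

    // byte array -> buffer -> gunzip -> UTF-8 decode -> chars
    static String decompress(byte[] gzipped) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new BufferedInputStream(new ByteArrayInputStream(gzipped))),
                StandardCharsets.UTF_8))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = compress("a stream of text, a stream of text, a stream of text");
        System.out.println(packed.length + " compressed bytes -> " + decompress(packed));
    }
}
```

Each layer only knows about the stream directly beneath it, so a layer can be added or removed without touching the rest of the chain.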
Refer to the Java API documentation for specific details on the classes and methods discussed above.
Conclusion
The underlying principles of stream classes are undoubtedly complex, but the interface exposed through the Java API is simple enough that the underlying details can largely be ignored. Focus on these four classes: InputStream, OutputStream, Reader, and Writer. This will help you get a grip on the APIs initially; then use a top-down approach to learn their extensions. I believe this is the key to streamlining your understanding of Java I/O streams. Happy learning!
References
- Elliotte R. Harold, Java Network Programming, O’Reilly, 2nd Edition
- H. M. Deitel, P. J. Deitel, Java How to Program, Pearson, 6th Edition
- Herbert Schildt, Java The Complete Reference, Oracle, 9th Edition
- https://docs.oracle.com/javase/tutorial/essential/io/index.html
- Java API documentation