Understanding Byte Streams and Character Streams in Java
Java performs I/O operations through an abstraction called a stream. Java defines two basic types of stream: byte streams and character streams. The byte stream classes provide a convenient means of handling input and output of bytes, and the character stream classes do the same for characters. This article elaborates on these two concepts of stream handling in Java.
Stream Overview
From the perspective of Java, “stream” essentially refers to an abstraction that is used to produce and consume a flow of sequential information. The flow of information can be the result of input or output operations performed on any physical device linked to the Java I/O subsystem. The actual linked devices may vary, such as a local storage device or a network, but the underlying principle remains the same. Java streams support a variety of devices, such as a keyboard, network socket, or disk file. Hence, they provide a convenient way to handle I/O operations in the same way regardless of the type of device the stream is linked to. The stream classes are bundled in the java.io package.
Note that Java 8 introduced a different kind of stream, bundled in the java.util.stream package, which processes sequences of elements such as collections rather than performing I/O. Let’s not delve into that here; instead, let’s focus on the basic stream classes provided in the java.io package. If you are still interested, refer to the article “Working with Java Stream API” for a glimpse of its uses.
Byte Streams and Character Streams
There are two types of streams in Java: byte and character. When an I/O stream manages 8-bit bytes of raw binary data, it is called a byte stream. When an I/O stream manages 16-bit Unicode characters (Java's char type), it is called a character stream. Unicode is a character set, that is, a mapping in which each character corresponds to a specific numeric value. Typically, every programming language adopts a particular character set to represent and manage its use of characters. Apart from Unicode, another commonly used character set is ASCII, which was standardized in the United States by ANSI and its predecessors; ISO/IEC 646 is its international counterpart. At the inception of Java (version 1.0), there were no character streams; therefore, all I/O operations were byte oriented. The character streams were introduced later (version 1.1). Note that the idea of character and byte streams should not be confused with low-level I/O operations; at the lowest level everything is bits and bytes. The point of byte and character streams is to provide a convenient and efficient way to handle data streams in Java.
What’s the Difference?
As mentioned earlier, the difference is simply one of convenience. Some streams are inherently byte oriented and some are character oriented, so it is convenient to handle them with the appropriate classes and methods defined in the java.io package. For example, FileInputStream is meant for reading a raw stream of bytes, such as image data. Similarly, a FileOutputStream object may be used to write a raw byte stream. For reading and writing files that hold character-oriented data, FileReader and FileWriter may be used, respectively. These classes provide specific methods to manipulate the appropriate stream data.
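To make the distinction concrete, here is a minimal sketch that reads the same in-memory data once as bytes and once as characters. The class name ByteVsChar, the helper methods, and the sample text are illustrative assumptions, and UTF-8 is chosen explicitly as the encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original article.
public class ByteVsChar {

    // Counts how many bytes a byte stream delivers for the given data.
    public static int countBytes(byte[] data) throws IOException {
        int count = 0;
        try (InputStream in = new ByteArrayInputStream(data)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    // Counts how many chars a character stream delivers when the
    // same bytes are decoded as UTF-8.
    public static int countChars(byte[] data) throws IOException {
        int count = 0;
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // \u00e9 (e with acute accent) takes two bytes in UTF-8 but is one Java char.
        byte[] data = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes: " + countBytes(data)); // prints "bytes: 5"
        System.out.println("chars: " + countChars(data)); // prints "chars: 4"
    }
}
```

The byte stream sees five raw values, while the character stream decodes them into four characters, which is why text is usually easier to handle through a Reader or Writer.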
The Byte Stream Classes
At the top of the byte stream class hierarchy, there are two abstract classes: InputStream for byte-oriented input and OutputStream for byte-oriented output operations. The hierarchical layout is as follows:
- InputStream: Top level abstract class for byte-oriented input stream.
- ByteArrayInputStream: An instance of this class contains an internal buffer to read bytes stream.
- FilterInputStream: An instance of this class contains some other input stream as a basic source of data for further manipulation.
- BufferedInputStream: This enables a FilterInputStream instance to make use of a buffer for input data.
- DataInputStream: An instance of this class enables reading primitive Java types from an underlying input stream in a machine-independent manner.
- LineNumberInputStream: An instance of this class aids in keeping track of the current line number of the input stream. This class is deprecated; LineNumberReader is the character-stream replacement.
- PushbackInputStream: This provides the ability to push back, or “unread,” a data byte after reading it.
- OutputStream: Top level abstract class for byte-oriented output stream.
- ByteArrayOutputStream: An instance of this class contains an internal buffer to which a stream of bytes is written.
- FilterOutputStream: An instance of this class contains some other output stream as the underlying sink, to which it passes data after further manipulation.
- BufferedOutputStream: This enables a FilterOutputStream instance to make use of a buffer for output data.
- DataOutputStream: An instance of this class enables writing primitive Java types to an underlying output stream in a machine-independent manner.
- PrintStream: This empowers the OutputStream objects with the ability to print representations of various data values conveniently.
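The filter classes listed above are meant to be wrapped around another stream. As a rough sketch, the following wraps a DataOutputStream around a ByteArrayOutputStream and reads the values back through a DataInputStream. The class name FilterChainDemo and the encode and decode helpers are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class; not part of the original article.
public class FilterChainDemo {

    // Writes an int and a double into an in-memory buffer.
    public static byte[] encode(int id, double price) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(id);       // 4 bytes
            out.writeDouble(price); // 8 bytes
        }
        return buffer.toByteArray();
    }

    // Reads the values back in the same order in which they were written.
    public static String decode(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int id = in.readInt();
            double price = in.readDouble();
            return id + ":" + price;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = encode(42, 9.5);
        System.out.println(bytes.length + " bytes"); // prints "12 bytes"
        System.out.println(decode(bytes));           // prints "42:9.5"
    }
}
```

The same DataOutputStream could be wrapped around a FileOutputStream or a socket stream instead; only the constructor argument would change.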
The Character Stream Classes
At the top of the character stream class hierarchy, there are two abstract classes: Reader for character-oriented input and Writer for character-oriented output operations. The hierarchical layout is as follows:
- Reader: Top-level abstract class for reading character streams.
- BufferedReader: Provides an in-between buffer for efficiency while reading text from character input stream.
- LineNumberReader: Uses a buffered character input stream that keeps track of line numbers.
- PushbackReader: This enables a character to be pushed back into the stream after reading.
- FileReader: An instance of this class is used for reading character files.
- Writer: Top-level abstract class for writing character streams.
- BufferedWriter: Provides an in-between buffer for efficiency while writing text to a character output stream.
- CharArrayWriter: Implements an auto-increasing character buffer that may be used as a writer.
- FilterWriter: Abstract class for writing filtered character streams.
- OutputStreamWriter: An instance of this class provides a bridge between character streams and byte streams. Characters are encoded into bytes using a specified character set.
- FileWriter: An instance of this class is used for writing character files.
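The bridge role of OutputStreamWriter, and of its input counterpart InputStreamReader, can be sketched as follows. The class name CharBridgeDemo and its helpers are illustrative assumptions, as is the choice of UTF-8; the buffered classes are layered on top for efficiency.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original article.
public class CharBridgeDemo {

    // Encodes the given lines into UTF-8 bytes through a Writer chain.
    public static byte[] writeLines(String... lines) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(bytes, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                writer.write(line);
                writer.write('\n');
            }
        } // closing the writer flushes its buffer into the byte stream
        return bytes.toByteArray();
    }

    // Decodes UTF-8 bytes back into lines through a Reader chain.
    public static List<String> readLines(byte[] data) throws IOException {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeLines("alpha", "beta");
        System.out.println(readLines(data)); // prints "[alpha, beta]"
    }
}
```

Naming the charset explicitly keeps the result independent of the platform default encoding.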
Predefined Streams
Java provides three predefined stream objects: in, out, and err, defined in the System class of the java.lang package. The out object refers to the standard output stream or console. The in object refers to standard input, which is the keyboard. And, the err object refers to a standard error, which again is nothing but the console. As should be obvious, they may be redirected to any other compatible I/O devices, because System.in is nothing but an object of InputStream, and System.out and System.err are objects of the PrintStream class. So, they basically work on a byte-oriented stream although we can use them for reading and writing characters to and from the console.
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class TestStream {
    public static void main(String[] args) throws Exception {
        // Wrap the byte-oriented System.in in a character-oriented reader.
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("Enter your name: ");
        String name = br.readLine();
        // readLine() returns null at end of input, so check for that first.
        if (name == null || name.isEmpty())
            System.err.println("Name cannot be empty");
        else
            System.out.println("Hi! " + name);
    }
}
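Redirection of the predefined streams can be sketched with System.setOut, which replaces the PrintStream behind System.out. The class name RedirectDemo and the capture helper below are hypothetical; the sketch sends output to an in-memory buffer and then restores the original stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical demo class; not part of the original article.
public class RedirectDemo {

    // Runs the task while System.out is temporarily redirected
    // to an in-memory buffer, then returns what was printed.
    public static String capture(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream replacement = new PrintStream(buffer, true)) {
            System.setOut(replacement);
            task.run();
        } finally {
            // Always restore the console stream, even if the task fails.
            System.setOut(original);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String captured = capture(() -> System.out.print("hello"));
        System.out.println("captured: " + captured); // prints "captured: hello"
    }
}
```

System.setIn and System.setErr work the same way for the other two predefined streams.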
Conclusion
A Java I/O stream acts as a uniform wrapper around a data source or destination, such as a file, and operates according to the corresponding I/O constructs. In many cases, character-oriented stream classes and byte-oriented stream classes function in a very similar fashion, but that does not mean that they are interchangeable. The stream classes defined in the java.io package are rather simple and unsophisticated, but they do serve their purpose. The new streams introduced with Java 8, on the other hand, are more sophisticated and have numerous interesting uses.
Byte Streams
Programs use byte streams to perform input and output of 8-bit bytes. All byte stream classes are descended from InputStream and OutputStream.
There are many byte stream classes. To demonstrate how byte streams work, we’ll focus on the file I/O byte streams, FileInputStream and FileOutputStream. Other kinds of byte streams are used in much the same way; they differ mainly in the way they are constructed.
Using Byte Streams
We’ll explore FileInputStream and FileOutputStream by examining an example program named CopyBytes, which uses byte streams to copy xanadu.txt, one byte at a time.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class CopyBytes {
    public static void main(String[] args) throws IOException {
        FileInputStream in = null;
        FileOutputStream out = null;
        try {
            in = new FileInputStream("xanadu.txt");
            out = new FileOutputStream("outagain.txt");
            int c;
            while ((c = in.read()) != -1) {
                out.write(c);
            }
        } finally {
            if (in != null) {
                in.close();
            }
            if (out != null) {
                out.close();
            }
        }
    }
}
CopyBytes spends most of its time in a simple loop that reads the input stream and writes the output stream, one byte at a time, as shown in the following figure.
Simple byte stream input and output.
Always Close Streams
Closing a stream when it’s no longer needed is very important, so important that CopyBytes uses a finally block to guarantee that both streams will be closed even if an error occurs. This practice helps avoid serious resource leaks.
One possible error is that CopyBytes was unable to open one or both files. When that happens, the stream variable corresponding to the file never changes from its initial null value. That’s why CopyBytes makes sure that each stream variable contains an object reference before invoking close.
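On Java 7 and later, the same guarantee can be expressed with a try-with-resources statement, which closes each declared stream automatically and only if it was opened. The following is a minimal sketch rather than the tutorial's own code; the class name CopyBytesWithResources and the copy helper are assumptions made for illustration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical variant of CopyBytes; not part of the original tutorial.
public class CopyBytesWithResources {

    // Copies source to target one byte at a time and returns the byte count.
    public static long copy(String source, String target) throws IOException {
        long copied = 0;
        // Both streams are closed automatically, in reverse order of
        // declaration, whether the loop completes or throws.
        try (FileInputStream in = new FileInputStream(source);
             FileOutputStream out = new FileOutputStream(target)) {
            int c;
            while ((c = in.read()) != -1) {
                out.write(c);
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copy("xanadu.txt", "outagain.txt") + " bytes copied");
    }
}
```

No null checks are needed here, because a resource that failed to open is never registered for closing.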
When Not to Use Byte Streams
CopyBytes seems like a normal program, but it actually represents a kind of low-level I/O that you should avoid. Since xanadu.txt contains character data, the best approach is to use character streams, as discussed in the next section. There are also streams for more complicated data types. Byte streams should only be used for the most primitive I/O.
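A character-stream counterpart of CopyBytes might look like the following sketch. The class name CopyCharacters and the copy helper are illustrative assumptions. Note that the FileReader(String) and FileWriter(String) constructors use the platform default charset, so this form is only appropriate when that charset matches the file.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

// Hypothetical demo class; a sketch, not the tutorial's own listing.
public class CopyCharacters {

    // Copies source to target one character at a time and returns the char count.
    public static int copy(String source, String target) throws IOException {
        int copied = 0;
        try (FileReader reader = new FileReader(source);
             FileWriter writer = new FileWriter(target)) {
            int c;
            // read() returns one char as an int (0 to 65535), or -1 at end of stream.
            while ((c = reader.read()) != -1) {
                writer.write(c);
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copy("xanadu.txt", "characteroutput.txt") + " chars copied");
    }
}
```

The loop has the same shape as in CopyBytes; the difference is that the int variable now holds a character value rather than a byte value.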
So why talk about byte streams? Because all other stream types are built on byte streams.
I/O Streams
An I/O Stream represents an input source or an output destination. A stream can represent many different kinds of sources and destinations, including disk files, devices, other programs, and memory arrays.
Streams support many different kinds of data, including simple bytes, primitive data types, localized characters, and objects. Some streams simply pass on data; others manipulate and transform the data in useful ways.
No matter how they work internally, all streams present the same simple model to programs that use them: A stream is a sequence of data. A program uses an input stream to read data from a source, one item at a time:
Reading information into a program.
A program uses an output stream to write data to a destination, one item at a time:
Writing information from a program.
In this lesson, we’ll see streams that can handle all kinds of data, from primitive values to advanced objects.
The data source and data destination pictured above can be anything that holds, generates, or consumes data. Obviously this includes disk files, but a source or destination can also be another program, a peripheral device, a network socket, or an array.
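The uniform model can be illustrated with a method that accepts any InputStream and does not care where the data comes from. This is only a sketch: the class name StreamModelDemo and the sumBytes helper are hypothetical, and an in-memory array stands in for the source.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of the original tutorial.
public class StreamModelDemo {

    // Consumes any input stream one item (byte) at a time and sums the values.
    // The caller decides whether the source is a file, a socket, or an array.
    public static int sumBytes(InputStream in) throws IOException {
        int total = 0;
        int b;
        while ((b = in.read()) != -1) { // read() yields 0..255, or -1 at end
            total += b;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3})) {
            System.out.println(sumBytes(in)); // prints "6"
        }
    }
}
```

Passing a FileInputStream or a socket's input stream to sumBytes would work without any change to the method itself.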
In the next section, we’ll use the most basic kind of streams, byte streams, to demonstrate the common operations of Stream I/O. For sample input, we’ll use the example file xanadu.txt, which contains the following verse:
In Xanadu did Kubla Khan
A stately pleasure-dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.