Monday, January 4, 2010

Chapter 20. Readers and Writers












You're probably going to experience a little déjà vu in this chapter. The java.io.Writer class is modeled on the java.io.OutputStream class. The java.io.Reader class is modeled on the java.io.InputStream class. The names and signatures of the methods of the Reader and Writer classes are similar (sometimes identical) to the names and signatures of the methods of the InputStream and OutputStream classes. The patterns these classes follow are similar as well. Filtered input and output streams are chained to other streams in their constructors. Filtered readers and writers are chained to other readers and writers in their constructors. InputStream and OutputStream are abstract superclasses that identify common functionality in their concrete subclasses. Likewise, Reader and Writer are abstract superclasses that identify common functionality in their concrete subclasses. The difference is that input and output streams are fundamentally byte-based, while readers and writers are fundamentally character-based. Where an input stream reads a byte, a reader reads a character. Where an output stream writes a byte, a writer writes a character.
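The parallel between the two hierarchies can be seen in a minimal sketch like the one below. The class name ByteVersusChar and its helper methods are invented for illustration; only the java.io classes and their read() methods come from the library. The sketch assumes Java 7 or later for StandardCharsets.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the chapter's own code.
public class ByteVersusChar {

    // InputStream.read() returns one byte (0-255) per call, or -1 at end of stream.
    public static int countBytes(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Reader.read() has the same shape, but each call returns one char (0-65535).
    public static int countChars(Reader in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String text = "d\u00E9j\u00E0 vu"; // "deja vu" with its two accented letters
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        // Each accented letter takes two bytes in UTF-8 but is a single char.
        System.out.println(countBytes(new ByteArrayInputStream(utf8))); // prints 9
        System.out.println(countChars(new StringReader(text)));         // prints 7
    }
}
```

The two loops are identical apart from the type of the argument, which is the déjà vu the paragraph describes. Only the unit being counted differs.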


While bytes are a more or less universal concept, characters are not. As you learned in the last chapter, the same character can be encoded differently in different character sets, and different character sets include different characters. Characters can even have different sizes in different character sets. For example, ASCII and Latin-1 use 1-byte characters. UTF-8 is a variable-width encoding in which a character occupies between one and four bytes.
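A short sketch can make these size differences concrete. The class EncodedWidths and its two helpers are hypothetical names chosen for this example. String.getBytes(Charset) and CharsetEncoder.canEncode(char) are standard library calls, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for comparing encodings.
public class EncodedWidths {

    // Number of bytes the string occupies once encoded in the given charset.
    public static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    // Whether the charset contains the character at all.
    public static boolean canEncode(char c, Charset charset) {
        return charset.newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9";       // U+00E9, e with acute accent
        String euro = "\u20AC";         // U+20AC, euro sign
        String emoji = "\uD83D\uDE00";  // U+1F600, stored as a surrogate pair

        System.out.println(encodedLength("A", StandardCharsets.US_ASCII));      // prints 1
        System.out.println(encodedLength(eAcute, StandardCharsets.ISO_8859_1)); // prints 1
        System.out.println(encodedLength(eAcute, StandardCharsets.UTF_8));      // prints 2
        System.out.println(encodedLength(euro, StandardCharsets.UTF_8));        // prints 3
        System.out.println(encodedLength(emoji, StandardCharsets.UTF_8));       // prints 4

        // Different character sets include different characters.
        System.out.println(canEncode('\u00E9', StandardCharsets.US_ASCII));   // prints false
        System.out.println(canEncode('\u00E9', StandardCharsets.ISO_8859_1)); // prints true
        System.out.println(canEncode('\u20AC', StandardCharsets.ISO_8859_1)); // prints false
    }
}
```

Note that getBytes silently substitutes a replacement byte for characters the charset cannot represent, so checking canEncode first is the safer way to detect a missing character.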


A language that supports international text must separate the reading and writing of raw bytes from the reading and writing of characters. Classes that read characters must be able to parse a variety of character encodings, not just ASCII, and translate them into the language's native character set. Classes that write characters must be able to translate the language's native character set into a variety of formats and write those. In Java, this task is performed by the InputStreamReader and OutputStreamWriter classes.
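As a rough illustration of that division of labor, the sketch below converts text from one encoding to another by chaining an InputStreamReader to a byte source and an OutputStreamWriter to a byte sink. The class name TranscodeSketch and the transcode method are assumptions made for this example. The constructors that take a Charset are part of java.io, and the try-with-resources statement assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing bytes -> chars -> bytes.
public class TranscodeSketch {

    // Decodes the input bytes using 'from', then re-encodes the chars using 'to'.
    public static byte[] transcode(byte[] input, Charset from, Charset to) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(input), from);
             Writer writer = new OutputStreamWriter(sink, to)) {
            int c;
            // The reader hands back Java chars; the writer turns them into bytes again.
            while ((c = reader.read()) != -1) {
                writer.write(c);
            }
        } // closing the writer flushes any buffered bytes into the sink
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] latin1 = {(byte) 0xE9}; // U+00E9 encoded in Latin-1
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(utf8.length); // prints 2
    }
}
```

Neither the loop nor the caller needs to know how wide a character is in either encoding. That knowledge is confined to the two bridge classes.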











