o7planning

Java InputStreamReader Tutorial with Examples

  1. InputStreamReader
  2. UTF-16 InputStreamReader
  3. UTF-8 InputStreamReader

1. InputStreamReader

InputStreamReader is a subclass of Reader that acts as a bridge from byte streams to character streams: it reads bytes from an InputStream and decodes them into characters using a given charset. In other words, it lets you convert an InputStream into a Reader.
Tip: To remember how to convert an "InputStream" into a "Reader", just concatenate the two words to form "InputStreamReader".
InputStreamReader constructors
InputStreamReader(InputStream in)

InputStreamReader(InputStream in, String charsetName)

InputStreamReader(InputStream in, Charset cs)

InputStreamReader(InputStream in, CharsetDecoder dec)
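As a rough sketch of how these constructors are typically used, the example below wraps an in-memory ByteArrayInputStream (an assumption made here so that the example needs no network access) and passes a Charset to the third constructor. The class name ConstructorSketch and the helper readAll are hypothetical. Note that the single-argument constructor falls back to the default charset, so naming the charset explicitly is usually the safer choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConstructorSketch {

    // Hypothetical helper: decodes all bytes with the given charset and returns the text.
    public static String readAll(byte[] bytes, Charset cs) throws IOException {
        InputStream in = new ByteArrayInputStream(bytes);
        // try-with-resources closes the reader, which also closes the wrapped stream.
        try (Reader reader = new InputStreamReader(in, cs)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) { // Read each character.
                sb.append((char) c);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample text written with Unicode escapes so the source file stays ASCII-only.
        String text = "JP\u65E5\u672C-\u516B\u6D32";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(bytes, StandardCharsets.UTF_8).equals(text));
    }
}
```

Using a Charset constant from StandardCharsets instead of a charset name string avoids the UnsupportedEncodingException that the String-based constructor can throw.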

2. UTF-16 InputStreamReader

UTF-16 is a fairly common encoding for Chinese or Japanese text. In this example, we will analyze how InputStreamReader reads UTF-16 text.
First, take a look at the Japanese text file below, which is encoded in UTF-16.
utf16-file-with-bom.txt
JP日本-八洲
Full code of example:
InputStreamReader_UTF16_Ex1.java
package org.o7planning.inputstreamreader.ex;

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;

public class InputStreamReader_UTF16_Ex1 {

    // A file in UTF-16.
    private static final String fileURL = "https://s3.o7planning.com/txt/utf16-file-with-bom.txt";
    
    public static void main(String[] args) throws MalformedURLException, IOException {
        System.out.println(" --- Characters in Character Stream (InputStreamReader) ---");
        readAs_UTF16_Character_Stream();

        System.out.println();
        System.out.println(" --- Bytes in UTF-16 file ---");
        readAs_Binary_Stream();
    }

    private static void readAs_UTF16_Character_Stream() throws MalformedURLException, IOException {

        // try-with-resources closes the reader (and the wrapped stream) even if read() throws.
        try (InputStream is = new URL(fileURL).openStream();
             InputStreamReader isr = new InputStreamReader(is, "UTF-16")) {

            int charCode;
            while ((charCode = isr.read()) != -1) { // Read each character.
                System.out.println((char) charCode + "  " + charCode);
            }
        }
    }

    private static void readAs_Binary_Stream() throws MalformedURLException, IOException {

        // try-with-resources closes the stream even if read() throws.
        try (InputStream is = new URL(fileURL).openStream()) {

            int byteValue;
            while ((byteValue = is.read()) != -1) { // Read each byte.
                System.out.println((char) byteValue + "  " + byteValue);
            }
        }
    }
}
Output:
 --- Characters in Character Stream (InputStreamReader) ---
J  74
P  80
日  26085
本  26412
-  45
八  20843
洲  27954

 --- Bytes in UTF-16 file ---
þ  254
ÿ  255
  0
J  74
  0
P  80
e  101
å  229
g  103
,  44
  0
-  45
Q  81
k  107
m  109
2  50
Create an InputStreamReader object with UTF-16 encoding and wrap an InputStream object:
String url = "https://s3.o7planning.com/txt/utf16-file-with-bom.txt";

InputStream is = new URL(url).openStream();
InputStreamReader isr = new InputStreamReader(is, "UTF-16");
The byte listing above shows the bytes in the UTF-16 file. The first two bytes (254, 255) are the byte order mark (BOM): they mark the beginning of a UTF-16 text and indicate its byte order.
An InputStreamReader created with the "UTF-16" charset reads these first 2 bytes to determine the byte order of the text, and then joins each pair of consecutive bytes to form one character. For example, the bytes 101 and 229 form the character 日 (101 × 256 + 229 = 26085).
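The BOM handling described here can be tried out without downloading the file. The sketch below is only an illustration: it feeds a few hand-written bytes (BOM, then 'J', then 日) through an InputStreamReader, once with the "UTF-16" charset and once with "UTF-16BE". The class name Utf16BomSketch and the helper charCodes are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf16BomSketch {

    // Hypothetical helper: decodes the bytes and returns the char codes separated by spaces.
    public static String charCodes(byte[] bytes, Charset cs) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), cs)) {
            int c;
            while ((c = reader.read()) != -1) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // BOM (FE FF), then 'J' (00 4A) and U+65E5 (65 E5) in big-endian order.
        byte[] withBom = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x4A, 0x65, (byte) 0xE5};

        // "UTF-16" consumes the BOM and uses it to pick the byte order.
        System.out.println(charCodes(withBom, StandardCharsets.UTF_16));   // 74 26085

        // "UTF-16BE" has a fixed byte order, so the BOM is decoded as the character U+FEFF.
        System.out.println(charCodes(withBom, StandardCharsets.UTF_16BE)); // 65279 74 26085
    }
}
```

This is why the character output of the example contains no entry for the BOM: with the "UTF-16" charset, the two marker bytes are used for byte-order detection and are not returned as a character.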

3. UTF-8 InputStreamReader

UTF-8 is the most widely used encoding in the world. It can encode every Unicode character, including Chinese and Japanese characters. Now we will analyze how InputStreamReader reads UTF-8 text.
First, take a look at the Japanese text file below, which is encoded in UTF-8:
utf8-file-without-bom.txt
JP日本-八洲
Full code of example:
InputStreamReader_UTF8_Ex1.java
package org.o7planning.inputstreamreader.ex;

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;

public class InputStreamReader_UTF8_Ex1 {

    // A file with UTF-8 encoding (And without BOM (Byte Order Mark)).
    private static final String fileURL = "https://s3.o7planning.com/txt/utf8-file-without-bom.txt";
    
    public static void main(String[] args) throws MalformedURLException, IOException {
        System.out.println(" --- Characters in Character Stream (InputStreamReader) ---");
        readAs_UTF8_Character_Stream();

        System.out.println();
        System.out.println(" --- Bytes in UTF-8 file ---");
        readAs_Binary_Stream();
    }

    private static void readAs_UTF8_Character_Stream() throws MalformedURLException, IOException {
        
        // try-with-resources closes the reader (and the wrapped stream) even if read() throws.
        try (InputStream is = new URL(fileURL).openStream();
             InputStreamReader isr = new InputStreamReader(is, "UTF-8")) {

            int charCode;
            while ((charCode = isr.read()) != -1) { // Read each character.
                System.out.println((char) charCode + "  " + charCode);
            }
        }
    }

    private static void readAs_Binary_Stream() throws MalformedURLException, IOException {

        // try-with-resources closes the stream even if read() throws.
        try (InputStream is = new URL(fileURL).openStream()) {

            int byteValue;
            while ((byteValue = is.read()) != -1) { // Read each byte.
                System.out.println((char) byteValue + "  " + byteValue);
            }
        }
    }
}
Output:
 --- Characters in Character Stream (InputStreamReader) ---
J  74
P  80
日  26085
本  26412
-  45
八  20843
洲  27954

 --- Bytes in UTF-8 file ---
J  74
P  80
æ  230
—  151
¥  165
æ  230
œ  156
¬  172
-  45
å  229
…  133
«  171
æ  230
´  180
²  178
Create an InputStreamReader object with UTF-8 encoding and wrap an InputStream object:
String url = "https://s3.o7planning.com/txt/utf8-file-without-bom.txt";

InputStream is = new URL(url).openStream();
InputStreamReader isr = new InputStreamReader(is, "UTF-8");
The byte listing above shows the bytes in the UTF-8 file.
UTF-8 is a variable-length encoding, which makes decoding it more involved than in the UTF-16 example. It takes 1, 2, 3 or 4 bytes to store a character, depending on the character's code point:
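| Number of bytes | From              | To                   | Byte 1   | Byte 2   | Byte 3   | Byte 4   |
|-----------------|-------------------|----------------------|----------|----------|----------|----------|
| 1               | U+0000 (0)        | U+007F (127)         | 0xxxxxxx |          |          |          |
| 2               | U+0080 (128)      | U+07FF (2047)        | 110xxxxx | 10xxxxxx |          |          |
| 3               | U+0800 (2048)     | U+FFFF (65535)       | 1110xxxx | 10xxxxxx | 10xxxxxx |          |
| 4               | U+10000 (65536)   | U+10FFFF (1114111)   | 11110xxx | 10xxxxxx | 10xxxxxx | 10xxxxxx |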
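The ranges in the table can be expressed as a small lookup. The following is a minimal sketch, not part of the original example: the class name Utf8LengthSketch and the helper utf8Length are hypothetical, and the sketch does not treat surrogate code points (U+D800 to U+DFFF) specially. The main method compares the result with what String.getBytes reports for a few sample code points.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthSketch {

    // Hypothetical helper: number of UTF-8 bytes for a code point, following the ranges in the table.
    public static int utf8Length(int codePoint) {
        if (codePoint < 0 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("Not a Unicode code point: " + codePoint);
        }
        if (codePoint <= 0x7F) {
            return 1; // 0xxxxxxx
        }
        if (codePoint <= 0x7FF) {
            return 2; // 110xxxxx 10xxxxxx
        }
        if (codePoint <= 0xFFFF) {
            return 3; // 1110xxxx 10xxxxxx 10xxxxxx
        }
        return 4;     // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    }

    public static void main(String[] args) {
        // 'J', U+00E9, U+65E5 (the first Japanese character in the file) and U+1F600.
        int[] samples = {'J', 0x00E9, 0x65E5, 0x1F600};
        for (int cp : samples) {
            String s = new String(Character.toChars(cp));
            int actual = s.getBytes(StandardCharsets.UTF_8).length;
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase()
                    + ": table says " + utf8Length(cp) + ", getBytes reports " + actual);
        }
    }
}
```

For the sample file this matches the byte listing: the ASCII characters J, P and - take 1 byte each, and each of the four Japanese characters takes 3 bytes, for 15 bytes in total.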
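For example, the three UTF-8 bytes 230, 151, 165 (binary 11100110 10010111 10100101) from the listing above carry the payload bits 0110 010111 100101. The UTF-8 InputStreamReader joins them into 0110010111100101, which is the single 2-byte Java character 日 (U+65E5, 26085).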
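To make the bit manipulation concrete, here is a hand-written sketch of that one step. It is only an illustration of the 3-byte pattern 1110xxxx 10xxxxxx 10xxxxxx, not how InputStreamReader is actually implemented, and it skips the extra validation a real decoder performs (for example rejecting overlong encodings). The class name Utf8ThreeByteSketch and the helper decode3 are hypothetical.

```java
public class Utf8ThreeByteSketch {

    // Hypothetical helper: combines one 3-byte UTF-8 sequence into a single Java char.
    // Each argument is an unsigned byte value (0..255), as returned by InputStream.read().
    public static char decode3(int b1, int b2, int b3) {
        // Check the marker bits: 1110xxxx, 10xxxxxx, 10xxxxxx.
        if ((b1 & 0xF0) != 0xE0 || (b2 & 0xC0) != 0x80 || (b3 & 0xC0) != 0x80) {
            throw new IllegalArgumentException("Not a 3-byte UTF-8 sequence");
        }
        int code = ((b1 & 0x0F) << 12)  // 4 payload bits from byte 1
                 | ((b2 & 0x3F) << 6)   // 6 payload bits from byte 2
                 | (b3 & 0x3F);         // 6 payload bits from byte 3
        return (char) code;
    }

    public static void main(String[] args) {
        // The first three non-ASCII bytes of the sample file: 230, 151, 165.
        char c = decode3(230, 151, 165);
        System.out.println((int) c); // 26085
    }
}
```

The same helper applied to the next three bytes of the file (230, 156, 172) yields 26412, the second Japanese character in the output.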
