org.apache.commons.io.input
Class BOMInputStream

java.lang.Object
  extended by java.io.InputStream
      extended by java.io.FilterInputStream
          extended by org.apache.commons.io.input.ProxyInputStream
              extended by org.apache.commons.io.input.BOMInputStream
All Implemented Interfaces:
Closeable

public class BOMInputStream
extends ProxyInputStream

This class wraps a stream that may begin with an encoded ByteOrderMark. It detects these bytes and, if required, automatically skips them so that the byte following the BOM is returned as the first byte of the stream. The ByteOrderMark implementation has the following pre-defined BOMs:

UTF-8 - ByteOrderMark.UTF_8
UTF-16BE - ByteOrderMark.UTF_16BE
UTF-16LE - ByteOrderMark.UTF_16LE
UTF-32BE - ByteOrderMark.UTF_32BE
UTF-32LE - ByteOrderMark.UTF_32LE

Example 1 - Detect and exclude a UTF-8 BOM

 BOMInputStream bomIn = new BOMInputStream(in);
 if (bomIn.hasBOM()) {
     // has a UTF-8 BOM
 }
 

Example 2 - Detect a UTF-8 BOM (but don't exclude it)

 boolean include = true;
 BOMInputStream bomIn = new BOMInputStream(in, include);
 if (bomIn.hasBOM()) {
     // has a UTF-8 BOM
 }
 

Example 3 - Detect Multiple BOMs

 BOMInputStream bomIn = new BOMInputStream(in, 
   ByteOrderMark.UTF_16LE, ByteOrderMark.UTF_16BE,
   ByteOrderMark.UTF_32LE, ByteOrderMark.UTF_32BE
   );
 if (!bomIn.hasBOM()) {
     // No BOM found
 } else if (bomIn.hasBOM(ByteOrderMark.UTF_16LE)) {
     // has a UTF-16LE BOM
 } else if (bomIn.hasBOM(ByteOrderMark.UTF_16BE)) {
     // has a UTF-16BE BOM
 } else if (bomIn.hasBOM(ByteOrderMark.UTF_32LE)) {
     // has a UTF-32LE BOM
 } else if (bomIn.hasBOM(ByteOrderMark.UTF_32BE)) {
     // has a UTF-32BE BOM
 }
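
The BOM constants used in the examples correspond to fixed byte signatures at the start of a stream. The following is a minimal stand-alone sketch using only the JDK, not the library's implementation; the class BomSignatures and its detect method are hypothetical names introduced for illustration. Longer signatures are checked first, because the UTF-32LE signature begins with the same two bytes as the UTF-16LE one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BomSignatures {

    // Byte signatures, ordered longest first so UTF-32LE is tested before UTF-16LE.
    private static final Map<String, int[]> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put("UTF-32BE", new int[] {0x00, 0x00, 0xFE, 0xFF});
        SIGNATURES.put("UTF-32LE", new int[] {0xFF, 0xFE, 0x00, 0x00});
        SIGNATURES.put("UTF-8",    new int[] {0xEF, 0xBB, 0xBF});
        SIGNATURES.put("UTF-16BE", new int[] {0xFE, 0xFF});
        SIGNATURES.put("UTF-16LE", new int[] {0xFF, 0xFE});
    }

    /** Returns the charset name of the BOM at the start of data, or null if none matches. */
    static String detect(byte[] data) {
        for (Map.Entry<String, int[]> entry : SIGNATURES.entrySet()) {
            int[] sig = entry.getValue();
            if (data.length < sig.length) {
                continue; // too short to contain this signature
            }
            boolean match = true;
            for (int i = 0; i < sig.length; i++) {
                // Mask to compare as unsigned byte values.
                if ((data[i] & 0xFF) != sig[i]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(detect(utf8));
    }
}
```

The ordering matters only when several BOMs are checked together, as in Example 3.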
 

Since:
2.0
Version:
$Id: BOMInputStream.java 1346400 2012-06-05 14:48:01Z ggregory $
See Also:
ByteOrderMark, Wikipedia - Byte Order Mark

Field Summary
 
Fields inherited from class java.io.FilterInputStream
in
 
Constructor Summary
BOMInputStream(InputStream delegate)
          Constructs a new BOM InputStream that excludes a ByteOrderMark.UTF_8 BOM.
BOMInputStream(InputStream delegate, boolean include)
          Constructs a new BOM InputStream that detects a ByteOrderMark.UTF_8 BOM and optionally includes it.
BOMInputStream(InputStream delegate, boolean include, ByteOrderMark... boms)
          Constructs a new BOM InputStream that detects the specified BOMs and optionally includes them.
BOMInputStream(InputStream delegate, ByteOrderMark... boms)
          Constructs a new BOM InputStream that excludes the specified BOMs.
 
Method Summary
 ByteOrderMark getBOM()
          Returns the BOM (Byte Order Mark).
 String getBOMCharsetName()
          Returns the BOM charset name, as given by ByteOrderMark.getCharsetName().
 boolean hasBOM()
          Indicates whether the stream contains one of the specified BOMs.
 boolean hasBOM(ByteOrderMark bom)
          Indicates whether the stream contains the specified BOM.
 void mark(int readlimit)
          Invokes the delegate's mark(int) method.
 int read()
          Invokes the delegate's read() method, detecting and optionally skipping BOM.
 int read(byte[] buf)
          Invokes the delegate's read(byte[]) method, detecting and optionally skipping BOM.
 int read(byte[] buf, int off, int len)
          Invokes the delegate's read(byte[], int, int) method, detecting and optionally skipping BOM.
 void reset()
          Invokes the delegate's reset() method.
 long skip(long n)
          Invokes the delegate's skip(long) method, detecting and optionally skipping BOM.
 
Methods inherited from class org.apache.commons.io.input.ProxyInputStream
afterRead, available, beforeRead, close, handleIOException, markSupported
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BOMInputStream

public BOMInputStream(InputStream delegate)
Constructs a new BOM InputStream that excludes a ByteOrderMark.UTF_8 BOM.

Parameters:
delegate - the InputStream to delegate to

BOMInputStream

public BOMInputStream(InputStream delegate,
                      boolean include)
Constructs a new BOM InputStream that detects a ByteOrderMark.UTF_8 BOM and optionally includes it.

Parameters:
delegate - the InputStream to delegate to
include - true to include the UTF-8 BOM or false to exclude it

BOMInputStream

public BOMInputStream(InputStream delegate,
                      ByteOrderMark... boms)
Constructs a new BOM InputStream that excludes the specified BOMs.

Parameters:
delegate - the InputStream to delegate to
boms - The BOMs to detect and exclude

BOMInputStream

public BOMInputStream(InputStream delegate,
                      boolean include,
                      ByteOrderMark... boms)
Constructs a new BOM InputStream that detects the specified BOMs and optionally includes them.

Parameters:
delegate - the InputStream to delegate to
include - true to include the specified BOMs or false to exclude them
boms - The BOMs to detect and optionally exclude
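
The effect of the include flag can be illustrated with a small JDK-only sketch. This is not the library's implementation: Utf8BomSketch and wrap are hypothetical names, and the sketch handles only the UTF-8 BOM, using a PushbackInputStream to put bytes back when they should remain visible to the caller.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

final class Utf8BomSketch {

    private static final int[] UTF8_BOM = {0xEF, 0xBB, 0xBF};

    /**
     * Wraps in so that a leading UTF-8 BOM is either kept (include == true)
     * or dropped (include == false). Hypothetical helper, not the library API.
     */
    static InputStream wrap(InputStream in, boolean include) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, UTF8_BOM.length);
        byte[] head = new byte[UTF8_BOM.length];

        // Read up to three bytes; a single read may return fewer than requested.
        int n = 0;
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }

        boolean hasBom = n == UTF8_BOM.length;
        for (int i = 0; hasBom && i < n; i++) {
            hasBom = (head[i] & 0xFF) == UTF8_BOM[i];
        }

        // Put the bytes back unless they are a BOM that should be excluded.
        if (n > 0 && (include || !hasBom)) {
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'A'};
        // Excluding the BOM: the first byte read is the one after the BOM.
        System.out.println(wrap(new ByteArrayInputStream(data), false).read());
        // Including the BOM: the first byte read is the first BOM byte.
        System.out.println(wrap(new ByteArrayInputStream(data), true).read());
    }
}
```
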
Method Detail

hasBOM

public boolean hasBOM()
               throws IOException
Indicates whether the stream contains one of the specified BOMs.

Returns:
true if the stream has one of the specified BOMs, otherwise false
Throws:
IOException - if an error occurs while reading the first bytes of the stream

hasBOM

public boolean hasBOM(ByteOrderMark bom)
               throws IOException
Indicates whether the stream contains the specified BOM.

Parameters:
bom - The BOM to check for
Returns:
true if the stream has the specified BOM, otherwise false
Throws:
IllegalArgumentException - if the BOM is not one that the stream is configured to detect
IOException - if an error occurs while reading the first bytes of the stream

getBOM

public ByteOrderMark getBOM()
                     throws IOException
Returns the BOM (Byte Order Mark).

Returns:
the BOM, or null if none was found
Throws:
IOException - if an error occurs while reading the first bytes of the stream

getBOMCharsetName

public String getBOMCharsetName()
                         throws IOException
Returns the BOM charset name, as given by ByteOrderMark.getCharsetName().

Returns:
the BOM charset name, or null if no BOM was found
Throws:
IOException - if an error occurs while reading the first bytes of the stream
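
A charset name reported for a BOM is typically used to pick the decoder for the rest of the stream. The sketch below uses only the JDK and assumes the BOM has already been removed from the stream; CharsetFromBomSketch, decode, and the fallback parameter are hypothetical names, and bomCharsetName merely stands in for a value such as the one this method returns.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetFromBomSketch {

    /**
     * Decodes body, which is assumed to have had its BOM already removed.
     * A null bomCharsetName means no BOM was found, so fallback is used.
     */
    static String decode(InputStream body, String bomCharsetName, Charset fallback)
            throws IOException {
        Charset charset = bomCharsetName != null ? Charset.forName(bomCharsetName) : fallback;
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(body, charset)) {
            char[] buf = new char[256];
            int n;
            while ((n = reader.read(buf)) != -1) {
                text.append(buf, 0, n);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "hi" encoded as UTF-16LE, with the BOM already skipped.
        byte[] utf16le = {'h', 0, 'i', 0};
        System.out.println(decode(new ByteArrayInputStream(utf16le), "UTF-16LE", StandardCharsets.UTF_8));
    }
}
```

Choosing the fallback charset for the no-BOM case is an application decision; UTF-8 in the usage line is only an example.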

read

public int read()
         throws IOException
Invokes the delegate's read() method, detecting and optionally skipping BOM.

Overrides:
read in class ProxyInputStream
Returns:
the byte read (excluding BOM), or -1 at the end of the stream
Throws:
IOException - if an I/O error occurs

read

public int read(byte[] buf,
                int off,
                int len)
         throws IOException
Invokes the delegate's read(byte[], int, int) method, detecting and optionally skipping BOM.

Overrides:
read in class ProxyInputStream
Parameters:
buf - the buffer to read the bytes into
off - The start offset
len - The number of bytes to read (excluding BOM)
Returns:
the number of bytes read, or -1 at the end of the stream
Throws:
IOException - if an I/O error occurs

read

public int read(byte[] buf)
         throws IOException
Invokes the delegate's read(byte[]) method, detecting and optionally skipping BOM.

Overrides:
read in class ProxyInputStream
Parameters:
buf - the buffer to read the bytes into
Returns:
the number of bytes read (excluding BOM), or -1 at the end of the stream
Throws:
IOException - if an I/O error occurs

mark

public void mark(int readlimit)
Invokes the delegate's mark(int) method.

Overrides:
mark in class ProxyInputStream
Parameters:
readlimit - read ahead limit

reset

public void reset()
           throws IOException
Invokes the delegate's reset() method.

Overrides:
reset in class ProxyInputStream
Throws:
IOException - if an I/O error occurs
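
Because mark and reset are forwarded to the wrapped stream, the usual java.io contract applies: reset rewinds to the last marked position, provided the delegate supports marking. The JDK-only sketch below shows that contract with a ByteArrayInputStream standing in for the delegate; MarkResetSketch and peek are hypothetical names, and how marking interacts with BOM bytes in this class is not covered here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class MarkResetSketch {

    /** Reads one byte, then rewinds so the same byte can be read again. */
    static int peek(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IOException("mark/reset not supported by this stream");
        }
        in.mark(1);        // remember the current position; allow 1 byte of read-ahead
        int b = in.read(); // consume one byte
        in.reset();        // return to the marked position
        return b;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {'a', 'b'});
        int peeked = peek(in);
        int read = in.read();
        System.out.println(peeked == read);
    }
}
```
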

skip

public long skip(long n)
          throws IOException
Invokes the delegate's skip(long) method, detecting and optionally skipping BOM.

Overrides:
skip in class ProxyInputStream
Parameters:
n - the number of bytes to skip
Returns:
the number of bytes skipped, or -1 at the end of the stream
Throws:
IOException - if an I/O error occurs


Copyright © 2002-2012 The Apache Software Foundation. All Rights Reserved.