Apache POI – Reading Excel sheet using Java

POI, Apache POI – Java API to access Microsoft format files. POI (Poor Obfuscation Implementation) API is a fantastic way for Java programmers to access Microsoft document formats. The POI project consists of APIs for manipulating various file formats based upon Microsoft’s OLE 2 Compound Document format using pure Java. In short, you can read and write MS Excel files using Java. An alternate way of generating a spreadsheet is via the Cocoon serializer.

also read:

HSSF is the POI Project’s pure Java implementation of the Excel ’97(-2002) file format. We will see how to read the data from a Excel sheet and display in console using java code in this article.

Features of HSSF

  • HSSF provides a way to read spreadsheets create, modify, read and write XLS.
  • Eventmodel api for efficient read-only access.
  • It provides full usermodel api for creating, reading and modifying XLS files.

Where to get the POI API?

You can download the POI API from http://poi.apache.org/

Apache POI – Terminology

Before getting in to HSSF, we will see some of the POI-Terminologys

  • POIFS (Poor Obfuscation Implementation File System): Java APIs for reading and writing OLE (Object Linking and Embedding) 2 compound document formats.
  • HSSF (Horrible Spreadsheet Format): Java API to read Microsoft Excel.
  • HDF (Horrible Document Format): Java API to read and write Microsoft Word 97.
  • HPSF (Horrible Property Set Format): Java API for reading property sets using (only) Java.

Reading data from Excel format file and displaying to console

Let us assume we have the following excel file (test.xls) with us.

Now let us see how to read through the rows and cells and get the data and display in the console.

Apache POI – Code Sample

The following java program reads a excel file and displays the data to the console.

/*
* POIExcelReader.java
*
* Created on 7 October, 2007, 9:05 PM
*/

package com.ms.util;

//~--- non-JDK imports --------------------------------------------------------

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRichTextString;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

//~--- JDK imports ------------------------------------------------------------

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

import java.util.Iterator;

/**
* This java program is used to read the data from a Excel file and display them
* on the console output.
*
* @author dhanago
*/
public class POIExcelReader
{

/** Creates a new instance of POIExcelReader */
public POIExcelReader ()
{}

/**
* This method is used to display the Excel content to command line.
*
* @param xlsPath
*/
@SuppressWarnings ("unchecked")
public void displayFromExcel (String xlsPath)
{
InputStream inputStream = null;

try
{
inputStream = new FileInputStream (xlsPath);
}
catch (FileNotFoundException e)
{
System.out.println ("File not found in the specified path.");
e.printStackTrace ();
}

POIFSFileSystem fileSystem = null;

try
{
fileSystem = new POIFSFileSystem (inputStream);

HSSFWorkbook      workBook = new HSSFWorkbook (fileSystem);
HSSFSheet         sheet    = workBook.getSheetAt (0);
Iterator rows     = sheet.rowIterator ();

while (rows.hasNext ())
{
HSSFRow row = rows.next ();

// display row number in the console.
System.out.println ("Row No.: " + row.getRowNum ());

// once get a row its time to iterate through cells.
Iterator cells = row.cellIterator ();

while (cells.hasNext ())
{
HSSFCell cell = cells.next ();

System.out.println ("Cell No.: " + cell.getCellNum ());

/*
 * Now we will get the cell type and display the values
 * accordingly.
 */
switch (cell.getCellType ())
{
	case HSSFCell.CELL_TYPE_NUMERIC :
	{

		// cell type numeric.
		System.out.println ("Numeric value: " + cell.getNumericCellValue ());

		break;
	}

	case HSSFCell.CELL_TYPE_STRING :
	{

		// cell type string.
		HSSFRichTextString richTextString = cell.getRichStringCellValue ();

		System.out.println ("String value: " + richTextString.getString ());

		break;
	}

	default :
	{

		// types other than String and Numeric.
		System.out.println ("Type not supported.");

		break;
	}
}
}
}
}
catch (IOException e)
{
e.printStackTrace ();
}
}

/**
* The main executable method to test displayFromExcel method.
*
* @param args
*/
public static void main (String[] args)
{
POIExcelReader poiExample = new POIExcelReader ();
String         xlsPath    = "c://test//test.xls";

poiExample.displayFromExcel (xlsPath);
}
}

Code walk through

  • POIFSFileSystem is the main class for POIFS system. POIFSFileSystem manages the entire life cycle for the file system.
  • POIFSFileSystem has a constructor which can take a InputStream as the parameter. Here in the above code we have created a FileInputStream and assigned to the InputStream. This inputStream object is passed to the POIFSFileSystem constructor and POIFSFileSystem object is created.
  • After creating POIFSFileSystem object, HSSFWorkbook object has to be created.
  • HSSFWorkbook object is a high level representation of a workbook. This is the first object most users will construct whether they are reading or writing a workbook. It is also the top level object for creating new sheets/etc.
  • HSSFWorkbook object is created using POIFSFileSystem object. By using the constructor HSSFWorkbook(POIFSFileSystem fs).
  • Once we get the HSSFWorkbook object it is very easy to get the Sheet. HSSFWorkbook has a method getSheetAt(int index) Get the HSSFSheet object at the given index.
  • Note that index starts at zero. In our code we have used like “workBook.getSheetAt (0)” means we are intrested in first sheet.
  • The above method will give us a HSSFSheet object.
  • HSSFSheet object has got a method called “rowIterator()”. This will give us all the rows in a Iterator and is of type HSSFRow
  • By Iterating this in a while loop we can get each and every for and cells in them.
  • HSSFRow has a method called “cellIterator()” This will also return a Iterator consisting of type HSSFCell.
  • By Iterating this we will get individual HSSFCell objects.
  • By getting the HSSFCell we can get the cell type by using the method “getCellType()”
  • By finding the cell type we can use the appropriate method to get the values as shown in the code above. Then it is up to the programmers requirement to use the values got accordingly. Here we have simply displayed it in the console.

Output

Row No.: 0
Cell No.: 0
String value: Name
Cell No.: 1
String value: Age
Cell No.: 2
String value: URL
Row No.: 1
Cell No.: 0
String value: Muthukumar Dhanagopal
Cell No.: 1
Numeric value: 33.0
Cell No.: 2
String value: http://javawave.blogspot.com
Row No.: 2
Cell No.: 0
String value: Krish
Cell No.: 1
Numeric value: 27.0
Cell No.: 2
String value: http://javabeat.net
Row No.: 3
Cell No.: 0
String value: Sri Hariharan Muthukumar
Cell No.: 1
Numeric value: 3.0
Cell No.: 2
String value: http://dhanago.blogspot.com

Other operations you can do with HSSF

  • to create a new workbook
  • to create a sheet
  • to create cells
  • to create date cells
  • Working with different types of cells
  • Aligning cells
  • Working with borders
  • Fills and color
  • Merging cells
  • Working with fonts
  • Custom colors
  • Reading and writing
  • Use newlines in cells.
  • Create user defined data formats
  • Fit Sheet to One Page
  • Set print area for a sheet
  • Set page numbers on the footer of a sheet
  • Shift rows
  • Set a sheet as selected
  • Set the zoom magnification for a sheet
  • Create split and freeze panes
  • Repeating rows and columns
  • Headers and Footers
  • Drawing Shapes
  • Styling Shapes
  • Shapes and Graphics2d
  • Outlining
  • Images
  • Named Ranges and Named Cells
  • How to set cell comments
  • How to adjust column width to fit the contents

Summary

Through this article i have just given the much needed push (Even eagles need a push). It is up to the programmers or developers who wanna proceed further to know more about POI and its API. Now a days there are many applications which are there in real time which uses Excel sheets and need to be read from Java and vice verse. POI is a very helpful, fantastic and easy tool helping Java programmers in achieving this.

Comments

comments

About Krishna Srinivasan

He is Founder and Chief Editor of JavaBeat. He has more than 8+ years of experience on developing Web applications. He writes about Spring, DOJO, JSF, Hibernate and many other emerging technologies in this blog.

Comments

  1. Sravan says:

    Nice tutorial…thank you very much!

  2. In above example, numeric values were read and printed as 33.0 instead can it be only 33 (as mentioned in excel sheet?)

  3. package com.read.xlsx;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.IOException;

    import org.apache.poi.ss.usermodel.Cell;
    import org.apache.poi.ss.usermodel.DateUtil;
    import org.apache.poi.ss.usermodel.Row;
    import org.apache.poi.ss.usermodel.Sheet;
    import org.apache.poi.ss.usermodel.Workbook;
    import org.apache.poi.ss.usermodel.WorkbookFactory;
    import org.apache.poi.ss.util.CellReference;
    import org.apache.poi.xssf.usermodel.XSSFCell;
    public class Readxls {

    public static void main(String[] args) throws Exception, FileNotFoundException, IOException {
    String filename = “C:\Documents and Settings\asarpal2\Desktop\Resource Master\Report 1 .xlsx”;

    Workbook workbook = WorkbookFactory.create(new FileInputStream(filename)); // or sample.xls
    System.out.println(“Number Of Sheets” + workbook.getNumberOfSheets());
    Sheet sheet = workbook.getSheetAt(0);
    System.out.println(“Number Of Rows:” + sheet.getLastRowNum());

    for (Row row : sheet) {
    for (Cell cell : row) {

    CellReference cellRef = new CellReference(row.getRowNum(), cell.getColumnIndex());
    System.out.print(cellRef.formatAsString());

    System.out.print(” – “);

    switch (cell.getCellType()) {
    case Cell.CELL_TYPE_STRING:
    System.out.println(cell.getRichStringCellValue().getString());
    break;
    case Cell.CELL_TYPE_NUMERIC:
    if (DateUtil.isCellDateFormatted(cell)) {
    System.out.println(cell.getDateCellValue());
    } else {
    System.out.println(cell.getNumericCellValue());
    }
    break;
    case Cell.CELL_TYPE_BOOLEAN:
    System.out.println(cell.getBooleanCellValue());
    break;
    case Cell.CELL_TYPE_FORMULA:
    System.out.println(cell.getCellFormula());
    break;
    default:
    System.out.println();
    }
    }
    }
    /* for (int i=row.getFirstCellNum();i<=row.getLastCellNum();i++){
    if( null!=row.getCell(i)){
    if(row.getCell(i).getCellType()==0){
    System.out.println("Cell Value "+i+" : " + row.getCell(i).getNumericCellValue());
    }if(row.getCell(i).getCellType()==1){
    System.out.println("Cell Value "+i+" : " + row.getCell(i).getStringCellValue());
    System.out.println("Cell Value "+i+" : " + row.getCell(i).getDateCellValue());
    }
    }
    }*/
    }

    }

  4. how would i use this to a form using javascript specifically?

  5. i have a database with fields,(duration,activity,task,date and comments),is there a way that after uploading an excel file values are automatically inserted to the fields of the database, i mean i know that there is a way but how?

  6. I use Fillo, IT is an Excel API for Java and you can query xls & xlsx files. Now, it supports SELECT and UPDATE queries with or without WHERE clause.

    http://www.codoid.com/products/view/2/29

  7. I use Aspose.Cells for Java for managing my excel files and Aspose components for other MS office files as well. You can create, read, modify and convert excel files to many other formats by using this API:

    http://www.aspose.com/java/excel-component.aspx

  8. Jayashree says:

    Trying to read the .Xslx file using java, unable to find the code.
    Please help me on this.
    I am trying to run the above code, but im getting the error, plz find below.

    Exception in thread “main” org.apache.poi.POIXMLException: java.lang.reflect.InvocationTargetException
    at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:62)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:404)
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:155)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.(XSSFWorkbook.java:186)
    at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:79)
    at forms.Readxls.main(Readxls.java:23)
    Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:60)
    … 5 more
    Caused by: java.lang.NoClassDefFoundError: javax/xml/stream/XMLStreamException
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Unknown Source)
    at java.lang.Class.getMethod0(Unknown Source)
    at java.lang.Class.getMethod(Unknown Source)
    at org.apache.xmlbeans.XmlBeans.buildMethod(XmlBeans.java:174)
    at org.apache.xmlbeans.XmlBeans.buildNodeMethod(XmlBeans.java:195)
    at org.apache.xmlbeans.XmlBeans.buildNodeToCursorMethod(XmlBeans.java:232)
    at org.apache.xmlbeans.XmlBeans.(XmlBeans.java:131)
    at org.openxmlformats.schemas.drawingml.x2006.main.ThemeDocument$Factory.parse(Unknown Source)
    at org.apache.poi.xssf.model.ThemesTable.(ThemesTable.java:44)
    … 10 more

  9. Rk sujit says:

    i we use this in web application

  10. Rk sujit says:

    how we can use this in web application??

Speak Your Mind

*