Click here to Skip to main content
15,887,214 members
Articles / Productivity Apps and Services / Microsoft Office / Microsoft Excel
Tip/Trick

Read Excel using NPOI

Rate me:
Please Sign up or sign in to vote.
4.59/5 (9 votes)
5 Nov 2015CPOL1 min read 50.8K   16   1
Get Excel cell values as string

Introduction

The C# code snippet demonstrates how to use the open-source NPOI library to read cell values in Microsoft Excel files.

References: NPOI - https://npoi.codeplex.com/

Background

Excel cell can contain values of different types. Also, values may be evaluated from formulae. We need to cater to these cases when reading the cell values.

Using the Code

I write an abstract class ExcelFileReader providing basic functions to read values from an Excel file.

C#
//
// ExcelFileReader
//
public abstract class ExcelFileReader
{
    protected IWorkbook workbook;
    protected DataFormatter dataFormatter;
    protected IFormulaEvaluator formulaEvaluator;

    //
    // Initialize from a stream of Excel file
    //
    protected void InitializeMembers(Stream excelFileStream)
    {
        this.workbook = WorkbookFactory.Create(excelFileStream);
        if (this.workbook != null)
        {
            this.dataFormatter = new DataFormatter(CultureInfo.InvariantCulture);
            this.formulaEvaluator = WorkbookFactory.CreateFormulaEvaluator(this.workbook);
        }
    }

    //
    // Get formatted value as string from the specified cell
    //
    protected string GetFormattedValue(ICell cell)
    {
        string returnValue = string.Empty;
        if (cell != null)
        {
            try
            {
                // Get evaluated and formatted cell value
                returnValue = this.dataFormatter.FormatCellValue(cell, this.formulaEvaluator);
            }
            catch
            {
                // When failed in evaluating the formula, use stored values instead...
                // and set cell value for reference from formulae in other cells...
                if (cell.CellType == CellType.FORMULA)
                {
                    switch (cell.CachedFormulaResultType)
                    {
                        case CellType.STRING:
                            returnValue = cell.StringCellValue;
                            cell.SetCellValue(cell.StringCellValue);
                            break;
                        case CellType.NUMERIC:
                            returnValue = dataFormatter.FormatRawCellContents
                            (cell.NumericCellValue, 0,  cell.CellStyle.GetDataFormatString())
                            cell.SetCellValue(cell.NumericCellValue);
                            break;
                        case CellType.BOOLEAN:
                            returnValue = cell.BooleanCellValue.ToString();
                            cell.SetCellValue(cell.BooleanCellValue);
                            break;
                        default:
                            break;
                    }
                }
            }
        }

        return (returnValue ?? string.Empty).Trim();
    }

    //
    // Get unformatted value as string from the specified cell
    //
    protected string GetUnformattedValue(ICell cell)
    {
        string returnValue = string.Empty;
        if (cell != null)
        {
            try
            {
                // Get evaluated cell value
                returnValue = (cell.CellType == CellType.NUMERIC || 
                (cell.CellType == CellType.FORMULA && 
                cell.CachedFormulaResultType == CellType.NUMERIC)) ?
                    formulaEvaluator.EvaluateInCell(cell).NumericCellValue.ToString() :
                    this.dataFormatter.FormatCellValue(cell, this.formulaEvaluator);
            }
            catch
            {
                // When failed in evaluating the formula, use stored values instead...
                // and set cell value for reference from formulae in other cells...
                if (cell.CellType == CellType.FORMULA)
                {
                    switch (cell.CachedFormulaResultType)
                    {
                        case CellType.STRING:
                            returnValue = cell.StringCellValue;
                            cell.SetCellValue(cell.StringCellValue);
                            break;
                        case CellType.NUMERIC:
                            returnValue = cell.NumericCellValue.ToString();
                            cell.SetCellValue(cell.NumericCellValue);
                            break;
                        case CellType.BOOLEAN:
                            returnValue = cell.BooleanCellValue.ToString();
                            cell.SetCellValue(cell.BooleanCellValue);
                            break;
                        default:
                            break;
                    }
                }
            }
        }

        return (returnValue ?? string.Empty).Trim();
    }
}

You have to implement this abstract ExcelFileReader, which gets the ISheet object from this.workbook and manipulate its IRow object to get the cell values.

Points of Interest

The try-catch block in the function GetFormattedValue() handles the failure in evaluating the cell formula (says with external reference) by taking the cached value (if any) instead. The cell value is then set to the cached value by calling the SetCellValue() function for any formulae in other cells which reference to this cell. For numeric value, formatted value will be obtained by calling DataFormatter.FormatRawCellContents() function.

Similarily, the function GetUnformattedValue() gets cell value as string without any format applied. It identities numeric cell, which format may be applied, and gets the unformatted value from the ICell.NumericCellValue property.

The CellType enum does not have any item like DATE for date/time value, which are stored as number in Excel. So, there is no direct way to identity the cell is storing a date/time value. I think one way is to see its format by refering to ICell.CellStyle.DataFormat property or ICell.CellStyle.GetDataFormatString() function.

History

  • 6th November, 2015 - First publication

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
Hong Kong Hong Kong
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
Questionreading Password protected workbook Pin
Ravi Sant17-Jul-17 4:44
Ravi Sant17-Jul-17 4:44 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.