Saturday, May 24, 2014

Writing Excel Files using EPPLus

In my last post I introduced the EPPLus library and showed how to use it to read Microsoft Excel XLSX files. In this post I will show how we can use it to write these files. Writing is pretty similar to reading but there are a lot more options you are likely to use when writing. Here is the basic code you need to create an XLSX file:

using (var package = new ExcelPackage())
{
var workbook = package.Workbook;
var worksheet = workbook.Worksheets.Add("Sheet1");
var cell = worksheet.Cells["A1"];

cell.Value = 100;
package.SaveAs(new System.IO.FileInfo(@"c:\temp\demoOut.xlsx"));
}

We start out by creating an ExcelPackage object. Since we are going to be creating a new file, we don’t pass a FileInfo object to the constructor like we did when reading an existing file. Next we add a new worksheet to the Worksheets collection of the package. Now that we have a worksheet we can access the cells just like we did when reading, in this case we have a variable called ‘cell’ which points to cell A1. Next we put a value in the cell, in this case the number 100. Finally we need to save the file, we do this by calling the SaveAs function on the package. Note, just like when we open a file to read, SaveAs does NOT take a file name as a parameter, but a FileInfo object instead.

Beyond setting cell values, EPPLus also allows you to change the appearance of the sheet. To change the appearance of a single cell we use the Style property on the cell. For example here is how we would change the font properties on a cell using the cell variable we created above.

cell.Style.Font.Name = "Arial";
cell.Style.Font.Bold = true;
cell.Style.Font.Size = 12;

We can change the background color of a cell like this:

cell.Style.Fill.PatternType = OfficeOpenXml.Style.ExcelFillStyle.Solid;
cell.Style.Fill.BackgroundColor.SetColor(System.Drawing.Color.Yellow);

First we have to set the PatternType for the background, in this case we will make it solid. Setting the PatternType is required before you can change the color, the second line will throw an exception if the PatternType is not set. In the second line we set the color. If you are doing this from a console application you will need to add a reference to System.Drawing to be able to use System.Drawing.Color.

Another common Excel formatting task is to create borders around a cell. Here is how we would set a thick border around an entire cell:

cell.Style.Border.BorderAround(OfficeOpenXml.Style.ExcelBorderStyle.Thick);

You can also set each part of the border individually by using the Bottom, Top, Left, and Right properties of the Border. Each of these has a Color and Style property to set the appearance of the border.

So far we have been setting the style on a single cell, but we can also use these properties on a group of cells. For example this code will set the color and border on the block of cells between B1 and C10:

var cells = worksheet.Cells["B1:C10"];
cells.Style.Fill.PatternType = OfficeOpenXml.Style.ExcelFillStyle.Solid;
cells.Style.Fill.BackgroundColor.SetColor(System.Drawing.Color.Blue);                cells.Style.Border.BorderAround(OfficeOpenXml.Style.ExcelBorderStyle.MediumDashed);

You will notice that each cell is set to blue, but the border is put around the whole block, not around each cell.

Besides setting properties on cells you can also work with the properties on columns and rows. For example here is how to set the width of a column:
worksheet.Column(1).Width = 100;

It’s important to remember that columns and rows start numbering at one no zero. EPPLus will allow you to use column and row index 0, but you will get an error when you try to open the workbook in Excel. To hide a column you can do this:

worksheet.Column(1).Hidden = true;

There are also similar properties for rows:

worksheet.Row(1).Height = 20;
worksheet.Row(1).Hidden = true;

These are some of the basic tasks you will do when creating an Excel file, but EPPLus lets you do almost anything with a workbook that you can do directly in Excel.

Saturday, May 10, 2014

Working with XLSX files

It’s not unusual when writing applications, especially business applications, then you will have a need to work with Microsoft Excel files. If you are working with .XLSX format files there is an excellent open source library called EPPlus that makes it’s easy to both read and write these files. In my next few posts I want to show how to use this library to handle various tasks. I will start with a discussion of how to use EPPlus to read from existing XLSX files.
First you will need to download the library. The easiest way to do this is to open your Visual Studio Project and in the Package Manager Console type:
Install-Package EPPlus
For the purposes of these posts I will be using version 3.1.3. At the time of this writing Version 4.0 is in the works but it is currently Beta, so I am going to stick with the older stable version for now.
I am going to be doing this demo in a Console application but you could also use the library in a Winforms or even a web application. First you need to import the following namespace:

using OfficeOpenXml;

Now let’s say that I know I have a number in cell B2 of the spreadsheet, here is the code to read that value:

static void Main(string[] args)
{
  using (var package = new ExcelPackage(new System.IO.FileInfo(@"c:\temp\demo.xlsx"))) {
       var workbook = package.Workbook;
       var worksheet = workbook.Worksheets[1];

       double data = (double)worksheet.Cells["B2"].Value;
       Console.WriteLine(data);
  }

  Console.ReadLine();
}

First we need to open the Excel file we want to read. We do this by creating a new ExcelPackage object. ExcelPackage won’t take a filename as a parameter so you have to either pass in an existing FileStream or, as I did in this case, a FileInfo object. I have put everything in a using block to be sure the ExcelPackage gets disposed properly.

Once we have the package open, we get the Workbook object. Workbook contains a collection of all the Worksheets in the file. In this case we will assume there is only one worksheet so we will get the first one into worksheet variable. Note that the Worksheet starts at 1 not 0. Now that we have the worksheet we want to work with we can access any cell by using the Cells property. Here we are getting the Value in cell B2 which will return an object type that in this case we case to a double. In a real application you will want to do some error checking to be sure that cell really contains the type you are expecting.

Another way you can access cells is to use their column and row coordinate like this:

double data = (double)worksheet.Cells[2,2].Value;

Again, note that the rows and columns starting numbering at 1 not 0. This method is useful when you need to iterate over a series of cells. You can also iterate over a specific range of cells like this:


var range = worksheet.Cells["B2:B5"];
foreach (var cell in range)
{
    Console.WriteLine(c.Value);
}

Up until this point we have been using the Value property of the cell which get the raw value. There is also a property called Text which returns what would be displayed in the cell as a string. For example if a cell contained the value .5 and it was formatted as a percentage, Value would return the double value .5, but Text would return the string “50%”. You can also access the formula in a cell by using the Formula property which will return a string containing the formula, or an empty string if the cell does not have a formula.

One final way of querying cells is to use a LINQ query:

var query = (from c in worksheet.Cells["B2:B13"]
                  where c.Value is double
                  && (double)c.Value > 7
                  select c);
foreach (var c in query) Console.WriteLine(c.Address + " " + c.Value.ToString());

This piece of code searches for all cells in the range of B2 to B13 that have a value greater than seven and then prints that address (e.g. B2) and the value. Note that in the where clause we first check if the value is a double in case the cell contains something that can’t be cast to a double.