Make it easy: Convert CSV files to XML with LINQ

19 Jan 2009 · CPOL · 2 min read
A small routine to convert a CSV file to a well formatted XML document using LINQ.

Introduction

Recently, I developed a Windows application that received information via a CSV file. I needed to query the information to extract values and run a series of statistical calculations, and the CSV format was clearly not the ideal way to do that.

I searched the Internet briefly, but I couldn't find any free code to do the job. So I decided to write it myself, and to my surprise, I discovered a simple way to do it using LINQ to XML.

In this article, I present the code and a small console program to test it. You can use it as you wish. Enjoy it!

Background

The method described here converts a CSV file with any number of rows and fields to a well-formatted XML file.

CSV restriction: the first row of the CSV file must contain the field names, as in the following example:

Name, Surname, Country, Job, Cabin
Garcia, Jose, Cuba,Software Developer,345A
Lenon,Tim,USA,SoftwareDeveloper,444
Rusell, Anthony, UK,Web Designer,345
Wolf, Werner, Germany,Linux IT,234

and the routine converts the file to the following XML document:

XML
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<root>
  <row>
    <var name="Name" value="Garcia" />
    <var name=" Surname" value=" Jose" />
    <var name=" Country" value=" Cuba" />
    <var name=" Job" value="Software Developer" />
    <var name=" Cabin" value="345A" />
  </row>
  <row>
    ...
  </row>
</root>

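That is a flat XML file in which each field becomes a var element whose name attribute holds the field name and whose value attribute holds the field value. This structure is repeated for each row. Note that the routine does not trim the fields, so any spaces after a separator in the CSV file are kept in the attribute values.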
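
Since the goal was to query the data, here is a minimal sketch of how this row/var layout might be queried with LINQ to XML. The helper name ValuesOf and the inline sample document are assumptions made for illustration; they are not part of the article's code. The sketch also assumes every var element carries both attributes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

internal static class QueryExample
{
    // Hypothetical helper: returns the value of one field for every row.
    // Names and values are trimmed because the converter keeps the spaces
    // that follow a separator in the CSV file.
    public static List<string> ValuesOf(XDocument doc, string field)
    {
        var query =
            from row in doc.Root.Elements("row")
            from v in row.Elements("var")
            where ((string)v.Attribute("name")).Trim() == field
            select ((string)v.Attribute("value")).Trim();
        return query.ToList();
    }

    private static void Main()
    {
        // Illustrative document in the same layout as the converter's output.
        XDocument doc = XDocument.Parse(
            @"<root>
                <row>
                  <var name=""Name"" value=""Garcia"" />
                  <var name="" Country"" value="" Cuba"" />
                </row>
                <row>
                  <var name=""Name"" value=""Wolf"" />
                  <var name="" Country"" value="" Germany"" />
                </row>
              </root>");

        foreach (string country in ValuesOf(doc, "Country"))
        {
            Console.WriteLine(country);
        }
    }
}
```

The same pattern extends to grouping or aggregating: once the values are in a sequence, the usual LINQ operators apply.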

Using the code

The code is simple. Unlike the baroque constructions that the DOM needs to build an XML structure, LINQ to XML makes the process surprisingly easy. We use XDocument, the class that LINQ to XML uses to represent an XML document.

You can see it in the following code:

C#
using System;
using System.Xml.Linq;

namespace jagg.CsvToXml
{
   public static class ConversorCsvXml 
   {
        /// <summary>
        /// Converts the input from CSV format to XML.
        /// </summary>
        /// <param name="csvString">
        /// CSV string to convert
        /// </param>
        /// <param name="separatorField">
        /// separator used by the CSV file
        /// </param>
        /// <returns>
        /// XDocument with the created XML
        /// </returns>
        public static XDocument ConvertCsvToXML(string csvString, string[] separatorField)

        {
            //split the rows (accept both Windows and Unix line endings)
            var sep = new[] {"\r\n", "\n"};
            string[] rows = csvString.Split(sep, StringSplitOptions.RemoveEmptyEntries);
            //Create the declaration
            var xsurvey = new XDocument(
                new XDeclaration("1.0", "UTF-8", "yes"));
            var xroot = new XElement("root"); //Create the root
            //Create each row; rows[0] holds the field names, so start at 1
            for (int i = 1; i < rows.Length; i++)
            {
                xroot.Add(rowCreator(rows[i], rows[0], separatorField));
            }
            xsurvey.Add(xroot);
            return xsurvey;
        }

        /// <summary>
        /// Private. Takes a CSV line and converts it to a row node
        /// with one var child per field, holding the field name and value as attributes.
        /// </summary>
        /// <param name="row">CSV row to process</param>
        /// <param name="firstRow">First row with the field names</param>
        /// <param name="separatorField">separator strings used in the CSV fields</param>
        /// <returns>The row element</returns>
        private static XElement rowCreator(string row, 
                       string firstRow, string[] separatorField)
        {

            string[] temp = row.Split(separatorField, StringSplitOptions.None);
            string[] names = firstRow.Split(separatorField, StringSplitOptions.None);
            var xrow = new XElement("row");
            for (int i = 0; i < temp.Length ; i++)
            {
                //Create the element var and Attributes with the field name and value
                var xvar = new XElement("var",
                                        new XAttribute("name", names[i]),
                                        new XAttribute("value", temp[i]));
                xrow.Add(xvar);
            }
            return xrow;
        }
    }
}

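To use the class, call the ConvertCsvToXML method with the CSV text and an array of separator strings. The class is static, so you don't need to instantiate it. Keep in mind that the routine splits each line on the separator only: quoted fields that contain the separator are not supported, and a row with more fields than the header row raises an IndexOutOfRangeException.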
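
Because each field is passed through unchanged, the spaces after the commas in the sample CSV end up in the attribute names and values. If that is unwanted, one option is to trim the fields while building the row. The following is only a sketch under that assumption: RowCreatorTrimmed is a hypothetical variant, not the article's rowCreator, and it relies on Enumerable.Zip, which requires .NET 4.0 or later.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

internal static class TrimmedRowExample
{
    // Hypothetical variant of rowCreator: trims names and values, and pairs
    // them with Zip, which stops at the shorter sequence, so a row longer
    // than the header cannot index past it (the extra fields are dropped).
    public static XElement RowCreatorTrimmed(
        string row, string firstRow, string[] separatorField)
    {
        string[] values = row.Split(separatorField, StringSplitOptions.None);
        string[] names = firstRow.Split(separatorField, StringSplitOptions.None);

        return new XElement("row",
            names.Zip(values, (name, value) =>
                new XElement("var",
                    new XAttribute("name", name.Trim()),
                    new XAttribute("value", value.Trim()))));
    }

    private static void Main()
    {
        XElement xrow = RowCreatorTrimmed(
            "Garcia, Jose, Cuba,Software Developer,345A",
            "Name, Surname, Country, Job, Cabin",
            new[] { "," });
        Console.WriteLine(xrow);
    }
}
```

Dropping extra fields silently is a design choice; depending on the data, throwing a descriptive exception for a malformed row may be preferable.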

Here is a small test program that converts our CSV example to XML:

C#
using System;
using System.IO;
using System.Xml.Linq;
using jagg.CsvToXml;

namespace TestCsvToXml
{
    internal class Program
    {
        /// <summary>
        /// Simple test conversion
        /// </summary>
        private static void Main()
        {
            string csv = File.ReadAllText("csvexample.csv");
            XDocument doc = ConversorCsvXml.ConvertCsvToXML(csv, new[] {","});
            doc.Save("outputxml.xml");
            Console.WriteLine(doc.Declaration);
            foreach (XElement c in doc.Elements())
            {
                Console.WriteLine(c);
            }
            Console.ReadLine();
        }
    }
}

This code stores the result in the outputxml.xml file and shows the resulting XML in the console:

[Screenshot: ConvertCsvToXml/ConversorCSVXml.jpg]

Points of interest

This class shows how much we can simplify our programs using LINQ to XML. If you have worked with the DOM, you can compare the two models and see how much simpler it is to build an XML document with LINQ to XML.
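
To make the comparison concrete, here is a rough sketch that builds the same single-row document both ways. The class and method names (DomVersusLinq, BuildWithDom, BuildWithLinq) are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

internal static class DomVersusLinq
{
    // DOM: every node is created through the owning document and then
    // attached to its parent in a separate step.
    public static XmlDocument BuildWithDom(string name, string value)
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement("root");
        doc.AppendChild(root);

        XmlElement row = doc.CreateElement("row");
        root.AppendChild(row);

        XmlElement xvar = doc.CreateElement("var");
        xvar.SetAttribute("name", name);
        xvar.SetAttribute("value", value);
        row.AppendChild(xvar);
        return doc;
    }

    // LINQ to XML: the nesting of the constructor calls mirrors the
    // nesting of the resulting XML (functional construction).
    public static XDocument BuildWithLinq(string name, string value)
    {
        return new XDocument(
            new XElement("root",
                new XElement("row",
                    new XElement("var",
                        new XAttribute("name", name),
                        new XAttribute("value", value)))));
    }

    private static void Main()
    {
        Console.WriteLine(BuildWithDom("Name", "Garcia").OuterXml);
        Console.WriteLine(BuildWithLinq("Name", "Garcia"));
    }
}
```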

History

  • 19.01.2009 - First version.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)