Introduction
Doing the bookkeeping and reporting for our app, I had to import a lot of different CSV files, as:
- Google merchant reports
- Apple reports
- Paypal stuff
- Bank account exports
Creating CSV record entities, reading and converting data and setting the properties of the entities was quite boring work to do. So I wanted a simple solution for operating with the data of a CSV file in a typesafe way.
Using the Code
The idea was to create a class describing a line of a CSV file. With the attribute "CsvField
", the name of the CSV file column is bound to the property.
public class MyCsvRecord{
[CsvField("Order ID")]
public String Orderid{get;set;}
[CsvField("Merchant Order Number")]
public double Merchantordernumber{get;set;}
[CsvField("Order Creation Date",DateFormat="yyyy-MM-dd")]
public DateTime Ordercreationdate{get;set;}
public static List<mycsvrecord> LoadFromFile(String file){
return CsvParser.ParseCsvAuto<MyCsvrecord>(file, false, ',', Encoding.UTF8);
}
}
Currently, the following datatypes are supported:
String
int
double
DateTime
If DateTime
is used, the additional property DateFormat
of CsvField
should be specified (ParseExact String
).
For other types, you can specify an IValueConverter
:
...
[CsvField("FieldName", ValueConverter=typeof(MyConverter))]
public bool IsEur{get;set;}
...
class MyConverter : IValueConverter{
public object FromString(string value)
{
return value.ToUpper() == "EUR";
}
}
Using the Code Generator
The code also includes a generator (CsvParser.Generator
) creating the source for a CSV record class from a given CSV file.
The datatypes of the properties are guessed by investigating the data of the given CSV file.
Details of the used rules for guessing the datatype can be found in:
private static String GuessType(String dataValue, out String dateFormat){
dateFormat = null;
if (dataValue.Trim()=="") return "String";
int numericCharCount = 0;
int minusCount = 0;
int dotCount = 0;
int commaCount = 0;
int colonCount = 0;
int blankCount = 0;
int noneNumeriCharCount = 0;
foreach(var ch in dataValue){
if ('0' <= ch && ch <= '9') ++numericCharCount;
else if (ch=='.') ++dotCount;
else if (ch==',') ++commaCount;
else if (ch=='-') ++minusCount;
else if (ch==':') ++colonCount;
else if (ch==' ') ++blankCount;
else{
++noneNumeriCharCount;
}
}
if (noneNumeriCharCount > 0) return "String";
if ((minusCount + commaCount + dotCount + colonCount) == 0) return "int";
if (colonCount==0 && (commaCount+dotCount)==1) return "double";
if (commaCount==0){
if (colonCount==0){
if (minusCount == 2 && dotCount == 0) {
dateFormat = "yyyy-MM-dd";
return "DateTime";
}
if (minusCount == 0 && dotCount == 2) {
dateFormat = "dd.MM.yyyy";
return "DateTime";
}
}
if (colonCount==2){
if (minusCount == 2 && dotCount == 0) {
dateFormat = "yyyy-MM-dd HH:mm:ss";
return "DateTime";
}
if (minusCount == 0 && dotCount == 2) {
dateFormat = "dd.MM.yyyy HH:mm:ss";
return "DateTime";
}
}
}
return "String";
}
History
- 8th February, 2014: Initial version
- 9th February, 2014: Uploaded a new version of the source adding support for parsing double values with thousands separator
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.