Removing extra comma in csv while loading data to a datatable provided some conditions.

Question

1.00/5 (1 vote)

See more:

I have a situation in which I am loading data from csv to SQL using DataTable. One challenge with the data known initially was that it may have an extra comma in one of the fields which needs to be removed. This is done perfectly as shown in the code below using if block.

Now recently this app failed since the csv file contained more fields with a comma. Is there a way we can use while block here to test if still oStreamDataValues[22] != "" then rest of the fields stay as it is and the concatenation happens in columnName[12] considering column[8] is either fixed or is already fixed. The idea is to run the while loop till columnName[22] is blank and then put that as 'Y' to mark that this row had anomalies and load the corrected data in DataTable to upload to SQL.

What I have tried:

C#

private static void ReadCsvFileData()
        {
            var destinationFileNameFullPath2 = ConfigurationManager.AppSettings["destinationFileNameFullPath2"]; //full path and name of the csv file
            var oStreamReader = new StreamReader(destinationFileNameFullPath2);
            var oDataTable = new DataTable();
            var rowCount = 0;
            string[] columnNames = null; // for column names
            while (!oStreamReader.EndOfStream)
            {
                var oStreamRowData = oStreamReader.ReadLine().Trim();
                if (oStreamRowData.Length > 0)
                {
                    var oStreamDataValues = oStreamRowData.Split(',');
                    if (rowCount == 0)
                    {
                        rowCount = 1;
                        columnNames = oStreamDataValues;
                        foreach (var csvHeader in columnNames)
                        {
                            var oDataColumn = new DataColumn(csvHeader.ToUpper(), typeof(string));
                            oDataColumn.DefaultValue = string.Empty;
                            oDataTable.Columns.Add(oDataColumn);
                        }
                    }
                    else
                    {
                        var oDataRow = oDataTable.NewRow();
                        for (var i = 0; i < columnNames.Length; i++)
                        {
                            if (oStreamDataValues[22] != "")
                            {
                                oDataRow[columnNames[0]] = oStreamDataValues[0] == null ? string.Empty : oStreamDataValues[0].Trim();
                                oDataRow[columnNames[1]] = oStreamDataValues[1] == null ? string.Empty : oStreamDataValues[1].Trim();
                                oDataRow[columnNames[2]] = oStreamDataValues[2] == null ? string.Empty : oStreamDataValues[2].Trim();
                                oDataRow[columnNames[3]] = oStreamDataValues[3] == null ? string.Empty : oStreamDataValues[3].Trim();
                                oDataRow[columnNames[4]] = oStreamDataValues[4] == null ? string.Empty : oStreamDataValues[4].Trim();
                                oDataRow[columnNames[5]] = oStreamDataValues[5] == null ? string.Empty : oStreamDataValues[5].Trim();
                                oDataRow[columnNames[6]] = oStreamDataValues[6] == null ? string.Empty : oStreamDataValues[6].Trim();
                                oDataRow[columnNames[7]] = oStreamDataValues[7] == null ? string.Empty : oStreamDataValues[7].Trim();
                                oDataRow[columnNames[8]] = oStreamDataValues[9] == null ? oStreamDataValues[8].Trim() : oStreamDataValues[8].Trim() + ' ' + oStreamDataValues[9].Trim();
                                oDataRow[columnNames[9]] = oStreamDataValues[10] == null ? string.Empty : oStreamDataValues[10].Trim();
                                oDataRow[columnNames[10]] = oStreamDataValues[11] == null ? string.Empty : oStreamDataValues[11].Trim();
                                oDataRow[columnNames[11]] = oStreamDataValues[12] == null ? string.Empty : oStreamDataValues[12].Trim();
                                oDataRow[columnNames[12]] = oStreamDataValues[13] == null ? string.Empty : oStreamDataValues[13].Trim();
                                oDataRow[columnNames[13]] = oStreamDataValues[14] == null ? string.Empty : oStreamDataValues[14].Trim();
                                oDataRow[columnNames[14]] = oStreamDataValues[15] == null ? string.Empty : oStreamDataValues[15].Trim();
                                oDataRow[columnNames[15]] = oStreamDataValues[16] == null ? string.Empty : oStreamDataValues[16].Trim();
                                oDataRow[columnNames[16]] = oStreamDataValues[17] == null ? string.Empty : oStreamDataValues[17].Trim();
                                oDataRow[columnNames[17]] = oStreamDataValues[18] == null ? string.Empty : oStreamDataValues[18].Trim();
                                oDataRow[columnNames[18]] = oStreamDataValues[19] == null ? string.Empty : oStreamDataValues[19].Trim();
                                oDataRow[columnNames[19]] = oStreamDataValues[20] == null ? string.Empty : oStreamDataValues[20].Trim();
                                oDataRow[columnNames[20]] = oStreamDataValues[21] == null ? string.Empty : oStreamDataValues[21].Trim();
                                oDataRow[columnNames[21]] = oStreamDataValues[22] == null ? string.Empty : oStreamDataValues[22].Trim();
                                oDataRow[columnNames[22]] = 'Y';
                            }
                            else
                            {
                                oDataRow[columnNames[i]] = oStreamDataValues[i] == null ? string.Empty : oStreamDataValues[i].Trim();
                            }
                        }
                        //Creating DataTable
                        oDataTable.Rows.Add(oDataRow);
                        
                    }
                }
            }
            oStreamReader.Close();
            oStreamReader.Dispose();
//SQLBulkUpload code goes here.
}

Posted 26-Dec-16 6:20am

DinoOnCodeProject

Updated 27-Dec-16 1:30am

Maciej Los

v2

Add a Solution

2 solutions

Add a Solution

Add your solution here

Treat my content as plain text, not as HTML

Preview 0

…

Existing Members

Sign in to your account

...or Join us

Download, Vote, Comment, Publish.

Your Email
Password
Forgot your password?

Your Email
This email is in use. Do you need your password?
Optional Password

I have read and agree to the Terms of Service and Privacy Policy
Please subscribe me to the CodeProject newsletters

When answering a question please:

Read the question carefully.
Understand that English isn't everyone's first language so be lenient of bad spelling and grammar.
If a question is poorly phrased then either ask for clarification, ignore it, or edit the question and fix the problem. Insults are not welcome.
Don't tell someone to read the manual. Chances are they have and don't get it. Provide an answer or move on to the next question.

Let's work to help developers, not make them feel stupid.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Maciej Los · Answer 1 · 2016-12-27T00:53:00

I'd suggest to use ADO.NET (OleDb) instead of custom csv file reader. ADO.NET (OleDb)[^] allows you to read delimited text files into datatable object. Using schema.ini file you'll be able to define set of columns to read.

Please see:
Accessing Microsoft Office Data from .NET Applications[^]
Much ADO About Text Files[^]
Schema.ini File (Text File Driver)[^]
How to: Add a Schema Definition to a Text File Data Source[^]
Using OleDb to Import Text Files (tab, CSV, custom)[^]
Read Text File Specific Columns[^]

#realJSOP · Answer 2 · 2016-12-27T01:30:00

Solution 2

Look at this article - CSV File Parser[^]

If a line comes back "malformed" you can handle that in the code that calls the parsing engine.

Posted 27-Dec-16 1:30am

#realJSOP