Click here to Skip to main content
15,887,683 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
Sub NoNonsenseCSV2DataTable(CSVfile As String)
'A,B,C,D,E
'00001,4,1,2,3560
'00002,4,12,1,2000
'00003,1,1,2,4500
'00004,4,12,1,2538.63
'00005,1,1,2,3400
'00006,1,1,2,2996.48
Dim dTable As New DataTable
Using reader As New StreamReader(CSVfile)
    '_______________________________________________________________________
    Dim CSVheader As String = reader.ReadLine
    Dim Cols = (From s In CSVheader.Split(",") Select s).ToList()
    Dim setTblColumns = (From s In Cols Select dTable.Columns.Add(s, GetType(String))).ToList
    'TEST Dim getTblColumns As List(Of String) = (From c As DataColumn In dTable.Columns Select c.ColumnName).ToList()
    '_______________________________________________________________________
    Dim ReadToEnd As String = reader.ReadToEnd()
    Dim Rows = (From s In ReadToEnd.Split(vbLf) Select s).ToList()
    ' up to now everything goes well
    Dim getTblRows = (From z In Rows Select z.Split(",")).ToList()
    ' getting DataTable rows, OK

    'xxx Dim x As DataRow
    'xxx For Each s As String In Rows
    'xxx    x = dTable.NewRow()
    'xxx    x.ItemArray = s.Split(","c)
    'xxx    dTable.Rows.Add(x)
    'xxx Next
    '_______________________________________________________________________
End Using

End Sub


What I have tried:

The procedure works 100% but I want to change the last part, so I would get an advantage of LINQ: nice looking code and better performance. Thanks for help in advance
ATeDe
Dim x As DataRow
For Each s As String In Rows
    x = dTable.NewRow()
    x.ItemArray = s.Split(","c)
    dTable.Rows.Add(x)
Next
Posted
Updated 14-Mar-18 11:21am

1 solution

Quote:
I would get an advantage of LINQ: nice looking code and better performance.

I'm not sure that my solution meets your criteria, but...

I would change NoNonsenseCSV2DataTable(CSVfile As String) procedure into function. See:
VB.NET
Public Function NoNonsenseCSV2DataTable(CSVfile As String) As DataTable
	Dim lines() As String  = File.ReadAllLines(CSVfile)
	
        'create DataTable
	Dim dt As DataTable = New DataTable()
	'get column names from first line and create an array of DataColumn
	Dim cols As DataColumn() = lines.Take(1) _
		.SelectMany(Function(x) x.Split(New String(){","}, StringSplitOptions.RemoveEmptyEntries)) _
		.Select(Function(x) New DataColumn(x, Type.GetType("System.String"))) _
		.ToArray()
	'add columns into DataTable
	dt.Columns.AddRange(cols)	
	
	'get rows - start from 2 line
	dt = lines.Skip(1) _
		.Select(Function(x) dt.LoadDataRow(x.Split(New String(){","}, StringSplitOptions.RemoveEmptyEntries), False)) _
		.ToArray() _
		.CopyToDataTable()
	'return DataTable
	Return dt

End Function


Usage:
VB.NET
Sub Main
    Dim sFileName As String = "FullFileNameOfCsvFile.csv"
    Dim myData As DataTable = NoNonsenseCSV2DataTable(sFileName)
    'myData is ready to use!

End Sub


For further details, please see:
File.ReadAllLines Method (String) (System.IO)[^]
DataTable.LoadDataRow Method (Object[], Boolean) (System.Data)[^]
Function Statement (Visual Basic) | Microsoft Docs[^]
Sub Statement (Visual Basic) | Microsoft Docs[^]
 
Share this answer
 
v2
Comments
ATeDe 14-Mar-18 20:17pm    
Thanks Maciej for your solution, which obviously works well, despite small 'hiccup' in the beginning, as my VB.NET compiler (VS2017) does not like 1st line of code:

Dim lines() As String = File.ReadAllLines(CSVfile).ToList() complaining
Error BC30311 Value of type 'List(Of String)' cannot be converted to 'String()'

After changing to
Dim lines() As String = File.ReadAllLines(CSVfile) the code executes smoothly

Another think to notice is that, it takes approx 10% longer to convert my CSV file (10 columns, 131745 records)

Thank you!


Original code elapsed time: 3490.86 Milliseconds
->Unformatted table: Count= 131745
A,B,C,D,E,F,G,H,I,J
2014,26,00001,1,01,900001,150,150,0,1412
2014,26,00001,1,01,905111,3500,3500,0,1412
2014,26,00001,1,01,907213,60,0,60,1412
2014,26,00002,1,01,900001,250,250,0,1827
..................

New routine elapsed time: 3763.66 Milliseconds
->Unformatted table: Count= 131745
A,B,C,D,E,F,G,H,I,J
2014,26,00001,1,01,900001,150,150,0,1412
2014,26,00001,1,01,905111,3500,3500,0,1412
2014,26,00001,1,01,907213,60,0,60,1412
2014,26,00002,1,01,900001,250,250,0,1827
..................
Maciej Los 15-Mar-18 2:24am    
Sorry, it's my fault. My first idea was to get lines into array of string, but later i decided to get lines into list. I forgot to change the type of variable declaration. Just rRemove ToList() method and it should be work. Note: i wroted it directly from my head.
Cheers,
Maciej
Karthik_Mahalingam 14-Mar-18 23:40pm    
5
Maciej Los 15-Mar-18 2:24am    
Thank you, Karthik.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900