Hi All,

I want to split a huge 2 GB CSV file into multiple files.
The split has to be based on the number of records, for example:
records 1-100000: file1
records 100001-200000: file2
records 200001-300000: file3

and so on until the end of the records.


Thanks for your help...!!

thanks
Venkat.
Updated 1-Feb-22 18:46pm
Comments
TRK3 10-May-12 19:10pm    
What have you tried? What specifically do you not understand or need help with?

The program you want to write is one of the simplest programs one could attempt. I'd almost expect it to be the second program you learn to write right after "hello world".

We're more than happy to help you learn, but you need to do a little study yourself and make an attempt at it. Then if there is something specific you don't understand or couldn't find in the documentation, we'll be happy to help with that.

You can read the lines using File.ReadLines(string path), which returns a lazily evaluated IEnumerable<string> rather than loading the whole file into an array. Then work out how many lines go into each file, and call File.WriteAllLines(string, string[]) once per output file. You can use LINQ to select the lines:

C#
var x = File.ReadLines(@"C:\temp\connections.xml");
File.WriteAllLines(@"C:\temp\connections2.xml", x.Take(10).ToArray());


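For the second and later files, call Skip before Take to pass over the lines already written. Note that each File.ReadLines enumeration starts again from the top of the file, so every Skip re-reads the earlier lines.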
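The Skip and Take idea from this answer could be wrapped in a loop roughly as follows. This is only a sketch: the class name SkipTakeSplitter, the outputPrefix parameter, and the "prefix + number + .csv" naming scheme are assumptions made for illustration, and it treats every line as one record (no header row, no quoted fields containing line breaks).

```csharp
using System;
using System.IO;
using System.Linq;

public static class SkipTakeSplitter
{
    // Writes up to linesPerFile lines to each output file and
    // returns the number of files created.
    // outputPrefix is a hypothetical naming scheme: prefix + 1-based index + ".csv".
    public static int Split(string inputPath, string outputPrefix, int linesPerFile)
    {
        if (linesPerFile <= 0)
            throw new ArgumentOutOfRangeException(nameof(linesPerFile));

        int fileIndex = 0;
        while (true)
        {
            // File.ReadLines is lazy, but each enumeration starts at line 1,
            // so Skip re-reads everything written on earlier passes.
            string[] chunk = File.ReadLines(inputPath)
                                 .Skip(fileIndex * linesPerFile)
                                 .Take(linesPerFile)
                                 .ToArray();
            if (chunk.Length == 0)
                break;

            fileIndex++;
            File.WriteAllLines(outputPrefix + fileIndex + ".csv", chunk);
        }
        return fileIndex;
    }
}
```

A call such as SkipTakeSplitter.Split(@"C:\temp\big.csv", @"C:\temp\file", 100000) would produce file1.csv, file2.csv and so on (the paths are made up). Only one chunk is held in memory at a time, but the input is re-read from the start for every output file, so the total reading work grows with the number of files.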
 
You can also achieve this with a PowerShell script. See the following link for a script and the steps to split a large file according to a size or line limit:
Split a large csv file into multiple csv files according to the size in powershell - Stack Overflow
Then see how to run a PowerShell script from C#:
Invoking PowerShell Script with Arguments from C# - Stack Overflow
 
Comments
Richard Deeming 2-Feb-22 4:53am    
Why would you want to use C# to run Powershell to do something you can just as easily do directly in C#?

Hopefully the OP managed to solve their problem some time in the ten years since they posted this question.
M Imran Ansari 2-Feb-22 5:18am    
According to the problem statement, the OP is working in C#. Secondly, why do we get old questions in the Active pane?
Richard Deeming 2-Feb-22 5:24am    
So as I said, why offer a Powershell solution for a C# question, when C# is perfectly capable of solving the problem?

Old questions get dragged up into the "active" list when someone posts a new solution to them - usually when a spammer tries to hide their spam link in an old thread. Those spam solutions quickly get reported and removed, so the only clue that it wasn't you who resurrected this thread is the two missing solution numbers.
M Imran Ansari 2-Feb-22 5:35am    
Agreed. But when working with big files we sometimes run into memory issues, such as an OutOfMemory exception while processing the file on a low-resource machine (I have experienced this myself and solved it with PS scripts), and in that case an alternative approach is needed.
Thanks for the detailed clarification on the second point.
Richard Deeming 2-Feb-22 5:39am    
You'd run out of memory if you tried to load the whole file into memory. But if you stream the file, either using a StreamReader, or using the File.ReadLines method (instead of File.ReadAllLines or File.ReadAllText), you can easily split a huge file into smaller chunks with hardly any memory overhead.
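A single-pass version of the streaming approach described in the comment above might look like the sketch below. The names StreamingSplitter and outputPrefix, and the numbered ".csv" file names, are hypothetical. It assumes one record per line, so a CSV with quoted fields that contain line breaks, or a header row that must be repeated in every part, would need extra handling.

```csharp
using System;
using System.IO;

public static class StreamingSplitter
{
    // Reads the input once, line by line, and starts a new output file
    // every linesPerFile lines. Returns the number of files created.
    public static int Split(string inputPath, string outputPrefix, int linesPerFile)
    {
        if (linesPerFile <= 0)
            throw new ArgumentOutOfRangeException(nameof(linesPerFile));

        int fileIndex = 0;
        int linesInCurrent = 0;
        StreamWriter writer = null;
        try
        {
            using (var reader = new StreamReader(inputPath))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    // Open the next output file only when there is a line to put in it.
                    if (writer == null)
                    {
                        fileIndex++;
                        writer = new StreamWriter(outputPrefix + fileIndex + ".csv");
                    }

                    writer.WriteLine(line);
                    linesInCurrent++;

                    // The current file is full: close it and reset the counter.
                    if (linesInCurrent == linesPerFile)
                    {
                        writer.Dispose();
                        writer = null;
                        linesInCurrent = 0;
                    }
                }
            }
        }
        finally
        {
            // Close the last, possibly partial, output file.
            writer?.Dispose();
        }
        return fileIndex;
    }
}
```

Only the current line is held in memory, apart from the reader's and writer's internal buffers, so memory use stays roughly constant regardless of the input size.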

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)