Click here to Skip to main content
15,888,293 members
Please Sign up or sign in to vote.
4.00/5 (1 vote)
See more:
Hello! I need to read from txt file about 1 billion rows and after manipulation put them in sql server db.

1) When I create a parser and try to store them in collection (List<> or another) I got outofmemoryexception. What kind of collection I should use to read and store successfully this amount of data?
2) How it'll be better to insert data directly to DB? I tried bulkinsert, but it seems to slow.

Thanks.
Posted

First of all I am a little skeptical that a text file (or any flat file) will have one billion records. But assuming that that's just a vague word used to mean a really large file, then you should not try and read the entire data and then write it to the database. Read it in chunks (say 3000 records at a time) and then keep writing to the database.
 
Share this answer
 
Comments
Olivier Levrey 22-Feb-11 9:50am    
Voted 5.
Nish Nishant 22-Feb-11 10:02am    
Thanks.
Sergey Alexandrovich Kryukov 22-Feb-11 12:01pm    
Yes, 5, what else?
--SA
Yusuf 22-Feb-11 12:40pm    
VG +5

Shhh! He has the master list of either the Chinese or Indian population. I know some people are sneaking from the government and not included in the list. :-)
Espen Harlinn 24-Feb-11 3:42am    
Good answer, my 5 - I'd try about 50000-100000 records at the time, using bulk insert
If you have it available, I would look into using SSIS


It will allow you to manipulate your data and store in the DB. Also if it is a repeatable task, you can schedule it too. It could be just as quick to set up an SSIS package as it will take care of chunking/commits/rollbacks etc.
 
Share this answer
 
 
Share this answer
 
Comments
Espen Harlinn 24-Feb-11 3:43am    
Good links :)
I've dealt with data this size before. Why do you need all of it loaded to analyze and parse it? As mentioned earlier, do it in chunks. Also, bulk insert isn't that slow, do you have your indexes disabled while inserting? How many columns of what type of data are you talking about?
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900