Click here to Skip to main content
15,891,981 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
In this moment i have a large csv file which has 11 MB and contains much of data which are inserted to... the end of excel file. So it will be about 1 million rows in 10 columns. Now I'd like to write a c# code which imports this file very faster.

What i've done?

First, i have written code which imports all of data from csv file. Second, i can insert into mysql database to "order_status" table. These codes will be in section "What have you tried?"

But when I insert 11 MB of large file it imports much time that will be about 10 minutes. And in half time of way while compiling this code process memory has about... 5 GB!

Now I'm wondering that how can i speed importing data from large csv file. "MysqlBulkLoader" will be enough? Maybe should rewrite importing code in other way? Any ideas? Thx for any help.

What I have tried:

1) Importing data from csv file:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.Collections;
using System.Data.OleDb;
using System.IO;
using System.Configuration;
using MySql.Data.MySqlClient;

namespace ControlDataBase
{
    public partial class Import_data_mysql : Form
    {
        public Import_data_mysql()
        {
            InitializeComponent();
        }
        New_Tables frm2 = (New_Tables)Application.OpenForms["New_Tables"];

        private DataTable ImportFile()
        {
            DataTable imported_data = new DataTable();

            OpenFileDialog ofd = new OpenFileDialog();
            ofd.Title = "Open csv file";
            ofd.DefaultExt = "*.csv";
            ofd.Filter = "Documents (*.csv)|*.csv";
            ofd.ShowDialog();

            FileInfo fi = new FileInfo(ofd.FileName);
            string FileName1 = ofd.FileName;
            string excel = fi.FullName;

            using(StreamReader sr = new StreamReader(excel))
            {
                string header = sr.ReadLine();
                if (string.IsNullOrEmpty(header))
                {
                    MessageBox.Show("Not found or loaded not correct file.");
                    return null;
                }

                string[] header_columns = header.Split(';');
                foreach(string header_column in header_columns)
                {
                    imported_data.Columns.Add(header_column);
                }

                while (!sr.EndOfStream)
                {
                    string line = sr.ReadLine();

                    if (string.IsNullOrEmpty(line)) continue;

                    string[] fields = line.Split(';');
                    DataRow imported_row = imported_data.NewRow();

                    for (int i = 0; i < fields.Count(); i++)
                    {
                        imported_row[i] = fields[i];
                    }

                    imported_data.Rows.Add(imported_row);
                }
            }
            return imported_data;
        }


2) Inserting data into mysql database:

private void save_status_to_database(DataTable imported_data)
       {
           string connect = "datasource=localhost;port=3306;username=root;password=;CharSet=utf8mb4";
           using (MySqlConnection conn = new MySqlConnection(connect))
           {
               conn.Open();
               foreach (DataRow importRow in importowane_dane.Rows)
               {
                   string query5 = @INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT,
                   AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE,
                   projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders
                   INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME
                   AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER
                   AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME
                   AND projects.PROJECT_NAME = @PROJECT_NAME"

                   MySqlCommand cmd = new MySqlCommand(query5, conn);

                   cmd.Parameters.AddWithValue("@ID_WORKER", importRow["ID_WORKER"]);
                   cmd.Parameters.AddWithValue("@FNAME", importRow["FNAME"]);
                   cmd.Parameters.AddWithValue("@LNAME", importRow["LNAME"]);
                   cmd.Parameters.AddWithValue("@DESC_ORDER", importRow["DESC_ORDER"]);
                   cmd.Parameters.AddWithValue("@ORDER_NUMBER", importRow["ORDER_NUMBER"]);
                   cmd.Parameters.AddWithValue("@MODULES_NAME", importRow["NAME"]);
                   cmd.Parameters.AddWithValue("@PROJECT_NAME", importRow["PROJECT_NAME"]);
                   cmd.Parameters.AddWithValue("@AMOUNT_OF_PRODUCTS", importRow["AMOUNT_OF_PRODUCTS"]);
                   cmd.Parameters.AddWithValue("@BEGIN_DATE", importRow["BEGIN_DATE"]);
                   cmd.Parameters.AddWithValue("@END_DATE", importRow["END_DATE"]);

                   cmd.ExecuteNonQuery();
               }
               conn.Close();
           }
           MessageBox.Show("Imported to database.");
       }


3) I've tried by running in x64 mode and adding <runtime> <gcallowverylargeobjects enabled="true"> in App.config. But it hasn't helped yet.
Posted
Updated 15-Jul-19 19:33pm
v2

One of the reasons it's so slow and consumes so much memory is because you're reading the entire file into a DataTable object. That'll take forever to do and consume a ton of memory.

Don't. Just process each record as it's read and save it to the database. You don't need the DataTable or the code to use it at all.
 
Share this answer
 
My choices would be either the MySqlBulkLoader you suggested or an SSIS package; depending on whichever is faster.
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900