Click here to Skip to main content
15,886,689 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
I am really getting clueless here. I import ~2million rows into my azure sql database. I create temp table where I put my values, when I use merge technique to insert only rows that do not have duplicates. My code and scripts are below.

What I have tried:

public async Task BulkImportWithoutDuplicates(DataTable reader)
        {
            var tableName = "##tempImport";
 
            using (var connection = new SqlConnection(sqlCOnn.ConnectionString))
            {
                using (SqlCommand command = new SqlCommand("", sqlCOnn))
                {
                    try
                    {
                        sqlCOnn.Open();
                        //Creating temp table on database
                        command.CommandText = Scripts.GetTempTableScript();
                        command.ExecuteNonQuery();

                        //Bulk insert into temp table
                        using (SqlBulkCopy b = new SqlBulkCopy(conString, SqlBulkCopyOptions.TableLock))
                        {
                            b.BulkCopyTimeout = 0;
                            b.BatchSize = reader.Rows.Count;
                            b.DestinationTableName = tableName;
                            await b.WriteToServerAsync(reader);
                            b.Close();
                        }

                        // Updating destination table, and dropping temp table
                        command.CommandText = Scripts.GetMergeScript();
                        var rows = command.ExecuteNonQuery();
                    }
                    catch (Exception ex)
                    {
                        // Handle exception properly
                    }
                    finally
                    {
                        connection.Close();
                    }
                }
            }
        }




public static string GetTempTableScript()
        {
            return $@" 
            IF OBJECT_ID('tempdb.dbo.##tempImport', 'U') IS NOT NULL
            BEGIN
            DROP TABLE ##tempImport;
            END
    
              CREATE TABLE ##tempImport ( ... all the columns);";
        }
        
        public static string GetMergeScript()
        {
            return $@"MERGE
                INTO dbo.Data AS target
                USING ##tempImport AS source
                ON (source.TransactionId = target.TransactionId AND source.UserId = target.UserId)
               
                WHEN NOT MATCHED THEN
                    INSERT (Start, Spend, UserId, Product, Shop, ClientId, UploadDataId, UniqueId, TransactionId, q, cq, c2)
                        VALUES (source.Start, source.Spend, source.UserId, source.Product, source.Shop, 
                source.ClientId, source.UploadDataId, source.UniqueId, source.TransactionId, source.q, source.c1, source.c2);

               ";
        }

I really do not get why it takes ages until it finishes. I waited for 24minutes until it was added to temporary table alone. I was reading this article and it seems that it shouldn't take long. https://www.adathedev.co.uk/2011/01/sqlbulkcopy-to-sql-server-in-parallel.html?m=1

What I am doing wrong here? How Can I improve the speed?
Posted
Updated 4-Jun-20 13:39pm

1 solution

Put all you data on the server before getting fancy. Without a straight load as a bench mark, you're just stumbling in the dark.
 
Share this answer
 
Comments
Member 12885549 5-Jun-20 3:52am    
I'm not sure I got your point

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900