Click here to Skip to main content
15,905,616 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
Hi,
I have to insert bulk quantity of data to sql server database
I have the data in DataTable, I need to validate most of the cell values in each row, then save to database.
It takes long time to complete it and UI hangs. and even status text is not updating in UI.
So I tried it in different ways by executing in different threads also the result was same.
At last I tried parallel execution(By spliting the content of table into five seperate table and excecuted parallely) as like given below, but not found any improvement. can any one suggest me any other good solution?? to speed up my validation process of whole document by using parallel or multi thread or any other solution??
C#
int validatedRowCount = 0;
int interval = 0;
interval = totalCount / 5;
ObservableCollection<Myobject>[] arrayObj= new ObservableCollection<Myobject>[5];
DataTable[] arrayOfDatatable = new DataTable[5];
arrayOfDatatable [0] = MyTable.AsEnumerable().Skip(0).Take(interval).CopyToDataTable();
arrayOfDatatable [1] = MyTable.AsEnumerable().Skip(interval).Take(2*interval).CopyToDataTable();
...
arrayOfDatatable [2] = MyTable.AsEnumerable().Skip(4 * interval).Take(totalRowCountOfMyTable).CopyToDataTable();

System.Threading.Tasks.Parallel.Invoke(

() =>
{ arrayObj[0] = blExeM.Validate(arrayOfDatatable[0], contries, ref result, selectedContries);
}
() =>{ arrayObj[1] = blExeM.Validate(arrayOfDatatable[1], contries, ref result, selectedContries);
}
.........
() =>{ arrayObj[4] = blExeM.Validate(arrayOfDatatable[4], contries, ref result, selectedContries);
}

contries : its list<object>
selectedContries: dictionary
result: string
after validating its returning the validated records in a collection..

Is it a good method or what is the best solution to handle this case?? Any help greatly appreciated...If possible please provide piece of code..
Posted
Updated 12-Jun-15 9:26am
v2
Comments
virusstorm 12-Jun-15 14:01pm    
Simple answer, don't use a DataTable. The DataTable object was not designed to do what you are doing. I'm having a hard time following the code you posted, so I'm going to make a few assumptions here in hopes it helps you.

First, I'm assuming that every record in the table represents an entity. My first action would be to create an object that represents that entity. I would then write a validation method for the entity so that it can verify it is in the proper state. You can have it set a flag so you know it is or is not clean. From there, you can use the .NET parallel libraries to to attempt to speed up processing. Please not that this library attempts to scale the operation based on the system's resources. A system with a single core won't see an performance increase like that of a quad core. Even then, depending on the operations, you aren't guaranteed a performance increase.

Before getting into the solution, Datatable or datarow are not thread safe. If you are doing any data modification to data in a datarow in your validation using parallel tasks, you will face issues.

I would suggest to use Tasks instead of BackgroundWorker. First let us try to understand how Parallel.Invoke works.
From MSDN for Parallel.Invoke:

Quote:
No guarantees are made about the order in which the operations execute or whether they execute in parallel. This method does not return until each of the provided operations has completed, regardless of whether completion occurs due to normal or exceptional termination.


Quote:
Note that with Invoke, you simply express which actions you want to run concurrently, and the runtime handles all thread scheduling details, including scaling automatically to the number of cores on the host computer.


So, what happens in your code is, you are splitting the datatble of 50000 records into 5 datatables of 10000 each. There will be 5 parallel tasks, each having 10000 records to validate, sequentially. This is the reason why you did not see any improvement in performance.

The better way to do this will be, to process number of records equal to the number of cores in your host machine. Max can be 2 times number of cores.

Example: if your machine has 4 cores, you can process 8 records max at a time to get the best of the performance.

The next question will be, Do you have to create 50000/8 = 6250 datattables before calling Parallel.Invoke? Yes because this has to be done on way or the other and No because you do not have to do this before calling Parallel.Invoke.

Here is some sample code to show how it can be done, but not the only way. This code is not tested.
C#
long recordsProcessed = 0;
int interval = 8; //assuming 4 cores. You can also use 4.
While(recordsProcessed <= totalCount)
{
  arrayObj[0] = MyTable.AsEnumerable().Skip(0).Take(interval).CopyToDataTable();
  recordsProcessed += 8;
  //Invoke with 8 parallel tasks each taking a datarow to process

}

Once you do this, your Validate method should be change to take DataRow instead of a datatable.
blExeM.Validate(dataRowToValidate, contries, ref result, selectedContries);


You can also use Parallel.ForEach or PLINQ instead of Parallel.Invoke. Search for these terms if you want to learn more.

Finally, to make the UI responsive, the method that calls the Validate method has to invoked asynchronously.
 
Share this answer
 
v2
Comments
Sadique KT 13-Jun-15 3:39am    
Thanks for the answer.. Improved the performance...not too much , even though something not bad...
Mathi Mani 13-Jun-15 3:46am    
Glad it helped in some way :)
In my experience, I recommended you to use BackgroundWorker to do one processing bar, it's a bit user friendly.
here is one reference. Hope can help on your issue.
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900