picking many files at a time using Multi threading

Question

Hi all,

I'm new to multithreading.
I have an application that picks up a single file, decodes it, and then creates its result file.
The same process is repeated, one file at a time, for 100 files.

I want to convert the code into a multithreaded application
that picks 10 files, processes them in parallel using multithreading, creates their output, and then picks the next 10 files.

Please help me with how to use multithreading in this application.

Regards

Posted 31-Jul-12 20:38pm

prog786

Comments

Sandeep Mewara 1-Aug-12 3:00am

And where are you stuck? What have you tried so far? Update your question with that.

ThatsAlok 1-Aug-12 3:20am

Rather than doing that with multithreading, use multitasking, i.e. run ten processes instead of ten threads!

pasztorpisti 1-Aug-12 4:11am

This is wise if the startup time of the process is minimal compared to running time.

Philip Stuyck 1-Aug-12 4:18am

Why? A process is more expensive than a thread on most platforms.

pasztorpisti 1-Aug-12 4:33am

Sometimes it's better to use processes, especially if the startup time is only a fraction of the whole runtime. It's not just a practical way to avoid multithreading; it can have other benefits too. I've seen a real-world case where a tool had to use a terribly buggy third-party static library that leaked and misbehaved badly. In that case it was much better to perform 100 fresh startups of the same program than to run 100 tasks on the library in the same process.

2 solutions

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Philip Stuyck · Answer 1 · 2012-07-31T21:22:00

You are going about it the wrong way.
It is not a good idea to slice the 100 files into chunks of 10.
What you need to do is create 10 threads that all run the same thread function.
You also need a list of files, say stored in an STL vector or an array of some kind. Then you need an index that is initialised to -1. This counter needs to be protected by a semaphore because all 10 threads are going to use it.
In the thread function you start like this pseudocode:

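while(1){
    lock(countersemaphore)
    inc(counter)
    myindex = counter    // take a private copy while the lock is still held
    unlock(countersemaphore)
    // test and use the private copy: another thread may already have
    // incremented the shared counter again after the unlock
    if (myindex<100)
        process_file(filearray[myindex])
    else break;
}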

Of course, how to create a thread depends on the platform you are using, and the same goes for how to create semaphores. But the idea of what you want to do is here.

Note that when a thread has finished processing a file, it simply takes the next file. So this is not running in chunks of 10. Running in chunks would mean that if one big file takes longer, the other 9 threads sit idle waiting for it. And it is actually more difficult to code it that way too.

You will most likely need to wait for all threads to complete in the main execution. This is kinda platform specific too.
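
As a rough illustration of the shared-counter loop described in this answer, here is a minimal sketch using the portable C++11 standard library (std::thread and std::mutex) instead of platform-specific threads and semaphores. The names process_all and process_file are hypothetical, and the decode step is passed in as a callback because the original application's code is not shown. The counter starts at 0 and is read before it is incremented, which is equivalent to starting at -1 and incrementing first.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: runs `process_file` (the caller's decode step) on every
// entry of `files`, using `thread_count` workers that share one counter.
void process_all(const std::vector<std::string>& files,
                 std::size_t thread_count,
                 const std::function<void(const std::string&)>& process_file)
{
    assert(thread_count > 0);

    std::size_t next = 0;    // index of the next unclaimed file
    std::mutex next_mutex;   // protects `next`

    auto worker = [&]() {
        for (;;) {
            std::size_t mine;
            {
                // Claim an index under the lock and keep a private copy of it.
                std::lock_guard<std::mutex> guard(next_mutex);
                mine = next++;
            }
            if (mine >= files.size())
                break;                   // no files left, this worker exits
            process_file(files[mine]);   // the slow work runs outside the lock
        }
    };

    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < thread_count; ++i)
        workers.emplace_back(worker);

    // The main thread waits for all workers to complete.
    for (std::thread& t : workers)
        t.join();
}
```

A call such as process_all(files, 10, decode_one), where decode_one is whatever function currently handles a single file, would keep 10 workers busy until the list is exhausted. The callback must be safe to run on several threads at once.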

pasztorpisti · Answer 2 · 2012-07-31T22:06:00

The whole thing reminds me of a typical pattern I regularly use. First, your program should start by composing the list of files by popping up a file open dialog. You might let the user do this multiple times to add files from different locations. When you are finished and have your 100 or whatever number of files, you can do this:
Create a list of the files (array, vector, whatever). This list is read-only. Initialize an integer index to -1. Start, let's say, 10 threads running the same code, which does the following: it calls the InterlockedIncrement() WinAPI function or the __sync_add_and_fetch() gcc builtin on the index. This increments the index and returns the incremented value. The thread uses the returned index to pick an item from the read-only array for processing, and when the processing is done it repeats this pattern until the returned index is greater than or equal to the size of the array. When that happens, the thread terminates. The main thread does nothing but wait for the worker threads to exit.

I think this kind of threading is much easier to implement without errors, especially for a threading newbie, not to mention that it's usually faster than locking/unlocking a mutex.
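
The lock-free index pattern above might look roughly like the following sketch. It deliberately swaps in the portable C++11 std::atomic type in place of InterlockedIncrement() or __sync_add_and_fetch(), so it is not the exact API named in this answer. The names process_all_atomic and process_file are hypothetical. Note that fetch_add returns the value before the increment, so the index starts at 0 here rather than -1.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: same job as a mutex-protected counter, but each worker
// claims its index with a single atomic operation.
void process_all_atomic(const std::vector<std::string>& files,
                        std::size_t thread_count,
                        const std::function<void(const std::string&)>& process_file)
{
    assert(thread_count > 0);

    std::atomic<std::size_t> next{0};   // shared index, starts at the first file

    auto worker = [&]() {
        for (;;) {
            // Atomically take the current index and advance it by one.
            const std::size_t mine = next.fetch_add(1);
            if (mine >= files.size())
                break;                   // index ran past the end, worker exits
            process_file(files[mine]);   // `files` is only read, never modified
        }
    };

    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < thread_count; ++i)
        workers.emplace_back(worker);

    // The main thread does nothing but wait for the workers.
    for (std::thread& t : workers)
        t.join();
}
```

Because the list is read-only and every index is handed out exactly once, the workers need no other synchronization among themselves, provided the callback itself touches no shared state.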

Further room for optimization: Let's say you have 1 million items and only 10 threads. Then you can speed up this kind of multithreading by putting more than one item (let's say 100) into each slot of your array, thus increasing the size of a single job. This speeds things up because you need far fewer thread synchronization calls to InterlockedIncrement() or other kinds of locks, which are usually considered very slow operations.
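
One way to get the same effect without restructuring the array is to let each atomic operation claim a whole range of consecutive indices. This is a variation on the slot idea above, not the exact layout it describes. The sketch below assumes C++11 std::atomic, and the names process_all_batched, batch_size and process_item are hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: each synchronization call hands a worker up to
// `batch_size` consecutive items instead of a single one.
void process_all_batched(const std::vector<std::string>& items,
                         std::size_t thread_count,
                         std::size_t batch_size,
                         const std::function<void(const std::string&)>& process_item)
{
    assert(thread_count > 0 && batch_size > 0);

    std::atomic<std::size_t> next{0};   // first index of the next unclaimed batch

    auto worker = [&]() {
        for (;;) {
            // One atomic operation reserves the range [begin, begin + batch_size).
            const std::size_t begin = next.fetch_add(batch_size);
            if (begin >= items.size())
                break;
            // The last batch may be shorter than batch_size.
            const std::size_t end = std::min(begin + batch_size, items.size());
            for (std::size_t i = begin; i < end; ++i)
                process_item(items[i]);
        }
    };

    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < thread_count; ++i)
        workers.emplace_back(worker);
    for (std::thread& t : workers)
        t.join();
}
```

With 1 million items and a batch size of 100, the workers perform roughly 10,000 atomic increments instead of 1 million. The trade-off is coarser load balancing: a batch that happens to contain slow items can leave one thread working while the others have already finished.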

For multithreaded bulk processing on a fixed prepared list consider using this technique.

Note: InterlockedIncrement() on Windows expects a pointer to a LONG, so you might need a cast if your index is another 32-bit integer type.
