Hi all,

I'm using a G.729 decoder in Visual C++, and it decodes slowly. I simplified the code by making it object-oriented, but the processing time for large files did not decrease. Does anyone have ideas for reducing the file decoding time? For example, a 5 GB input file takes 5 hours to process.

The decoder works on buffers: it takes 80 bits of input at a time and produces 10 bytes of output to write to the file.


Regards

Quote:
I simplified the code by making it object-oriented, but the processing time for large files did not decrease
Simplified code is a prerequisite for readability and reliability, not for speed.

To speed up the code you should probably go in the opposite direction, that is, go lower level (e.g. consider the interaction with the underlying OS in more detail, prefer raw access to resources instead of powerful abstractions, and so on) and focus your C++ code on speed optimization.
I cannot help you more without seeing the actual code.
 
 
Are you using your own decoder?

If so, I suggest using an existing one, such as the one in the FFmpeg library.

Otherwise tune the optimisation options of your compiler for the decoder source files. Some tips:

  • Avoid memory allocations inside loops. Allocate outside the loop, and use stack variables for small buffers with a known (maximum) size.
  • Avoid using global variables inside loops. Use a local copy instead.

  • Tell the compiler the minimum processor generation that must be supported.

  • When using floating-point operations, tell the compiler to use vector instructions (SSE, AVX) where possible (if the code only needs to run on modern CPUs).

  • Learn about loop unrolling and check if it can be used.

  • If you know the final file size in advance, create the file with that size and rewind.
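
As a rough illustration of the first tip, here is a minimal sketch that allocates the output once and reuses a single stack buffer for every frame. The sizes assume standard G.729 framing (10-byte frames, 80 PCM samples per frame); `decode_frame` and `decode_all` are hypothetical names, and `decode_frame` is only a placeholder for whatever the real decoder call is.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kFrameBytes = 10;       // one G.729 frame (80 bits)
constexpr std::size_t kSamplesPerFrame = 80;  // 10 ms of 8 kHz PCM

// Hypothetical stand-in for the real decoder call. It only copies bytes so
// that the sketch compiles and runs; it does not decode G.729.
inline void decode_frame(const std::uint8_t* in, std::int16_t* out) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < kSamplesPerFrame; ++i)
        out[i] = static_cast<std::int16_t>(in[i % kFrameBytes]);
}

// Decodes every whole frame in `input`; a trailing partial frame is ignored.
std::vector<std::int16_t> decode_all(const std::vector<std::uint8_t>& input) {
    const std::size_t frames = input.size() / kFrameBytes;

    std::vector<std::int16_t> pcm;
    pcm.reserve(frames * kSamplesPerFrame);  // single heap allocation, up front

    // Small fixed-size scratch buffer on the stack, reused for every frame.
    std::array<std::int16_t, kSamplesPerFrame> scratch{};

    for (std::size_t f = 0; f < frames; ++f) {
        decode_frame(input.data() + f * kFrameBytes, scratch.data());
        pcm.insert(pcm.end(), scratch.begin(), scratch.end());  // no reallocation
    }
    return pcm;
}
```

The point of the sketch is only that nothing inside the loop calls `new`, `malloc`, or grows a container.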


Further tips can't be given without seeing the source.
 
 
I'd do the following things in order until it's fast enough:

- Switch the optimiser on - an unoptimised build can be a factor of 10 slower in some cases. Have a quick fiddle with the compiler settings to make sure it's being aggressive enough.

- Check the algorithm the decoder's using first to see if there's a better one you can use. If it's only processing 10 bytes at a time there may be an opportunity to process more than one lump at a time during each pass of the algorithm [1]. Also look at using someone else's library to do the job - I usually assume that other people know better than I about how to implement stuff so why not borrow or buy from the best? [2]

- Make sure you're using the right structures for the job and not doing something daft like regenerating intermediate structures during each pass of the algorithm. e.g. don't create a new G729Decompressor object each time around the loop.

- Make sure you're using buffered file I/O and aren't doing anything daft like closing the output file after every read or write. Use the largest amount of memory you can get hold of to buffer the data in and out. If this speeds up the processing a lot it's probably I/O bound so a faster storage device might also help.

- Once you've got a massive buffer try multi-threading - one thread per processor and give each thread a subset of your input buffer to process in one fell swoop.

- Make sure you're not doing something the processor caches don't like. Minimise the working set for the algorithm and try to make sure that all the data being accessed at one time is close together in memory. Getting it on the same memory page is a good way of making sure you don't get cache collisions. Using automatic variables is a good way of getting all your data on one page. Dynamic memory allocation is a good way to screw it up. Calls through tables of function pointers can also mess up caching as they're often not near the data used by an algorithm so (and I hate to say this) minimise using virtual functions unless it really warps your algorithm.
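
Two of the points above (build the decoder once, and move data in large buffered chunks) can be sketched together as follows. This is only a sketch under assumptions: the class name `G729Decompressor` is taken from the list above, but its interface and the dummy "decoding" are invented here, as are `decode_file` and the `frames_per_chunk` parameter. The sizes assume standard G.729 framing (10-byte frames, 80 PCM samples per frame).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

constexpr std::size_t kFrameBytes = 10;       // one G.729 frame (80 bits)
constexpr std::size_t kSamplesPerFrame = 80;  // 10 ms of 8 kHz PCM

// Hypothetical decoder with an invented interface. The body just copies
// bytes so the sketch runs; a real decoder would go here.
class G729Decompressor {
public:
    void decode(const std::uint8_t* frame, std::int16_t* pcm) {
        for (std::size_t i = 0; i < kSamplesPerFrame; ++i)
            pcm[i] = static_cast<std::int16_t>(frame[i % kFrameBytes]);
    }
};

// Decodes in_path to out_path, moving frames_per_chunk frames per I/O call.
// Returns the number of frames decoded; a trailing partial frame is ignored.
std::size_t decode_file(const std::string& in_path, const std::string& out_path,
                        std::size_t frames_per_chunk = 65536) {
    assert(frames_per_chunk > 0);
    std::ifstream in(in_path, std::ios::binary);
    std::ofstream out(out_path, std::ios::binary);  // opened once, closed once
    if (!in || !out) return 0;

    // Buffers and decoder are created once, outside the loop.
    std::vector<std::uint8_t> in_buf(frames_per_chunk * kFrameBytes);
    std::vector<std::int16_t> out_buf(frames_per_chunk * kSamplesPerFrame);
    G729Decompressor decoder;

    std::size_t total = 0;
    for (;;) {
        in.read(reinterpret_cast<char*>(in_buf.data()),
                static_cast<std::streamsize>(in_buf.size()));
        const std::size_t frames =
            static_cast<std::size_t>(in.gcount()) / kFrameBytes;
        if (frames == 0) break;

        for (std::size_t f = 0; f < frames; ++f)
            decoder.decode(in_buf.data() + f * kFrameBytes,
                           out_buf.data() + f * kSamplesPerFrame);

        out.write(reinterpret_cast<const char*>(out_buf.data()),
                  static_cast<std::streamsize>(frames * kSamplesPerFrame *
                                               sizeof(std::int16_t)));
        total += frames;
        if (!in) break;  // a short read means end of file
    }
    return total;
}
```

With the default chunk size each read pulls in 640 KB and each write pushes out about 10 MB; the right value depends on how much memory is available, so treat it as a knob to experiment with.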

If you try that lot and you haven't got any faster come back and ask again, but that should take you a while to work through.
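
For the multi-threading step in the list above, a minimal sketch might look like the following. It splits one in-memory buffer into contiguous ranges of frames and hands each range to its own `std::thread`. `decode_frame` and `decode_parallel` are hypothetical names, and `decode_frame` is a stateless placeholder. A real G.729 decoder carries state from frame to frame, so each thread would need its own decoder instance and the first frames after each split point may not match a sequential decode; the approach fits best when the input is really a set of independent recordings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

constexpr std::size_t kFrameBytes = 10;       // one G.729 frame (80 bits)
constexpr std::size_t kSamplesPerFrame = 80;  // 10 ms of 8 kHz PCM

// Hypothetical, stateless stand-in for the real decoder call.
inline void decode_frame(const std::uint8_t* in, std::int16_t* out) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < kSamplesPerFrame; ++i)
        out[i] = static_cast<std::int16_t>(in[i % kFrameBytes]);
}

// Decodes `input` using up to `workers` threads, each writing to its own
// disjoint slice of the output, so no locking is needed.
std::vector<std::int16_t> decode_parallel(
        const std::vector<std::uint8_t>& input,
        unsigned workers = std::thread::hardware_concurrency()) {
    if (workers == 0) workers = 1;  // hardware_concurrency() may return 0

    const std::size_t frames = input.size() / kFrameBytes;
    std::vector<std::int16_t> pcm(frames * kSamplesPerFrame);
    const std::size_t per_worker = (frames + workers - 1) / workers;

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t first = std::min(frames, w * per_worker);
        const std::size_t last = std::min(frames, first + per_worker);
        if (first == last) break;  // nothing left to hand out
        pool.emplace_back([&input, &pcm, first, last] {
            for (std::size_t f = first; f < last; ++f)
                decode_frame(input.data() + f * kFrameBytes,
                             pcm.data() + f * kSamplesPerFrame);
        });
    }
    for (std::thread& t : pool) t.join();
    return pcm;
}
```

Whether this helps depends on the earlier steps: if the job is I/O bound, extra threads will mostly wait on the disk.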

[1] There probably isn't one, as G.729 is meant to operate on 10-byte frames, but this is good advice generally. There was also no real incentive to make decoding fast, as the codec was meant for real-time use: one frame every 10 ms isn't a lot to keep up with on a modern processor (5 GB is a metric f***-ton of G.729 data - years of conversation), so the implementation you're using might be rather appalling.

[2] Buying 3rd party code isn't an option when you're working for a cheapskate or are a student.
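
To put rough numbers on footnote [1]: at 10 bytes per 10 ms frame, G.729 runs at 1000 bytes per second of audio. The sketch below works out how much non-stop audio a file holds and how many times faster than real time a given decode ran. The function names are made up for illustration, and "5 GB" is assumed to mean 5 × 10^9 bytes of raw frames with no container overhead.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kFrameBytes = 10;   // one G.729 frame
constexpr std::uint64_t kFrameMillis = 10;  // audio covered by one frame

// Seconds of audio contained in file_bytes of raw G.729 data.
constexpr std::uint64_t audio_seconds(std::uint64_t file_bytes) {
    return file_bytes / kFrameBytes * kFrameMillis / 1000;
}

// How many times faster than real time a decode of that file ran.
inline double realtime_factor(std::uint64_t file_bytes, double decode_seconds) {
    assert(decode_seconds > 0.0);
    return static_cast<double>(audio_seconds(file_bytes)) / decode_seconds;
}
```

Under those assumptions, `audio_seconds(5000000000ULL)` is 5,000,000 seconds, which is about 1,389 hours or roughly 58 days of continuous audio, and `realtime_factor(5000000000ULL, 5 * 3600.0)` comes out at about 278. That gives a baseline to measure any of the changes above against.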
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


