Hi,

I'm trying to process live stock market data and insert/update a database with the results. I'm using the producer-consumer queue design pattern, which I have threaded.

Some of the calculations are VERY intensive and are degrading the performance of the database. I can't seem to figure out how to go about processing the data and inserting/updating the database with the results.

Can someone please give me advice on how to go about setting this up properly?

Thanks,
-Donald
Comments
Albin Abel 28-Apr-11 16:03pm    
Good question. My 5
Nish Nishant 28-Apr-11 16:07pm    
My 5 too.
AspDotNetDev 28-Apr-11 16:10pm    
Take it step by step. Give us a specific example of something that is too slow. In general, make sure you have the right indexes and use the query plan to figure out problem areas.
Monjurul Habib 28-Apr-11 18:49pm    
my 5.

This is a very general idea I am throwing in. If, and it's an important "if", part of the slowness is due to the managed code, you may want to move some of the more intensive calculations into a fast library written in C++. You could call into it via COM or C++/CLI (among other options).
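As a rough sketch of one such option, here is what a P/Invoke call from C# into a native library might look like. The library name FastCalc.dll and the export ComputeIndicator are made up for illustration; substitute whatever your C++ side actually exports:

using System;
using System.Runtime.InteropServices;

// Hypothetical native export, assumed to be declared on the C++ side as:
//   extern "C" __declspec(dllexport) double ComputeIndicator(const double* prices, int count);
static class NativeCalc
{
    [DllImport("FastCalc.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern double ComputeIndicator(double[] prices, int count);
}

class PInvokeDemo
{
    static void Main()
    {
        double[] ticks = { 101.2, 101.5, 100.9, 102.1 };
        // The managed array marshals as a pointer to its first element.
        Console.WriteLine(NativeCalc.ComputeIndicator(ticks, ticks.Length));
    }
}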
 
 
Comments
Albin Abel 28-Apr-11 16:16pm    
My 5, good alternative
Nish Nishant 28-Apr-11 16:23pm    
Thanks (comment threading is all messed up)
Nish Nishant 28-Apr-11 16:23pm    
Thank you, Albin.
Sergey Alexandrovich Kryukov 28-Apr-11 17:01pm    
Makes sense, a 5.
What do you think about my idea? Something tells me it can be more effective. It depends on those calculations and on the rest of the architecture and business logic, though.
Please see my answer.
--SA
Nish Nishant 28-Apr-11 17:03pm    
Already saw it, voted 5 too. Up to the OP to think of these approaches though.
I can see that your heavy calculation part could compromise the total throughput of the system, but I don't see why it has to degrade the performance of the database. What is the bottleneck: the calculations themselves, or the additional transactions for intermediate results? If the transactions are the bottleneck, you need to cache the data. I cannot believe you can do correct calculations against an ever-changing database anyway.
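To illustrate what caching could mean here, the sketch below buffers computed results in memory and writes them in one transaction per batch instead of one transaction per result. The table and column names are hypothetical:

using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Data.SqlClient;

// Sketch: buffer computed results and flush them in a single transaction
// per batch, so each calculation does not pay for its own round trip.
class BatchedWriter
{
    private readonly ConcurrentQueue<KeyValuePair<string, double>> buffer =
        new ConcurrentQueue<KeyValuePair<string, double>>();

    public void Enqueue(string symbol, double value)
    {
        buffer.Enqueue(new KeyValuePair<string, double>(symbol, value));
    }

    // Call periodically from a dedicated writer thread.
    public void Flush(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var tx = conn.BeginTransaction())
            {
                KeyValuePair<string, double> item;
                while (buffer.TryDequeue(out item))
                {
                    using (var cmd = conn.CreateCommand())
                    {
                        cmd.Transaction = tx;
                        // "Quotes", "Indicator", "Symbol" are made-up names.
                        cmd.CommandText =
                            "UPDATE Quotes SET Indicator = @v WHERE Symbol = @s";
                        cmd.Parameters.AddWithValue("@v", item.Value);
                        cmd.Parameters.AddWithValue("@s", item.Key);
                        cmd.ExecuteNonQuery();
                    }
                }
                tx.Commit();
            }
        }
    }
}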

If you are already developing the consumer/producer queue approach, you can more or less easily move a big part of the processing onto another machine. I would suggest you dedicate a separate tier just to your calculation part. It can run on a separate machine and increase parallelism.
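If the tiers end up on different machines, one option (among many) for handing work to the calculation tier is a message queue such as MSMQ; the machine name and queue path below are hypothetical:

using System.Messaging; // add a reference to System.Messaging.dll

// Sketch: the feed-processing tier sends raw ticks to a queue hosted on
// the calculation machine. "calcserver" and the queue name are made up.
class CalcDispatcher
{
    private readonly MessageQueue queue =
        new MessageQueue(@"FormatName:DIRECT=OS:calcserver\private$\ticks");

    public void Dispatch(string symbol, double price)
    {
        // Uses the default XmlMessageFormatter; strings serialize cleanly.
        queue.Send(string.Format("{0}|{1}", symbol, price));
    }
}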

—SA
 
Comments
Nish Nishant 28-Apr-11 17:00pm    
Voted 5!
Sergey Alexandrovich Kryukov 28-Apr-11 17:01pm    
Thank you, Nishant.
How could you be so fast?
--SA
Nish Nishant 28-Apr-11 17:02pm    
:-)
yesotaso 28-Apr-11 17:17pm    
Voted 5. I was thinking the same: "Intense calculation <-?-> degraded database performance" :) Anyway, observing a producer filling a bottomless buffer, or a consumer eating endless data, may show where the performance problem lies.
Sergey Alexandrovich Kryukov 28-Apr-11 20:57pm    
Thank you very much. Agree with you.
Actually, observing/profiling how much CPU is used by each tier is not enough. A workflow can be badly unbalanced, which defeats parallelism. I guess you're describing a case like that.
--SA
Since I have done similar things at university, I think I know where your problem is.
For instance, I did some testing (C#) on just a few hundred thousand datasets on a SQL developer machine. The performance was damn slow compared with a Perl solution using in-memory and simple file-based storage.

I remember one weekend my multithreaded app was blocking the whole multicore system and the university backbone. This Perl program I wrote some time ago was fetching stock data from servers around the world, comparing terabytes of data again and again, extracting, filtering, completing, and extrapolating data, and even processing some images for visualization. One thing I can tell you is that a well-designed program with no database at all, interpreted by a well-chosen script engine like Perl (which is known for its fast parsing capability), can easily outperform a precompiled, high-level managed-code application. It comes down to choosing the right tools for a certain task.

From my current point of view, for this kind of application (high data volume, high access rates, complex operations - I call it "hidaco" - and in my case image processing), a standard database-programming approach is a NO-GO! Personally, I think database performance is well overestimated. Though financial matters are most often pushed into transactional models for reliability, that is a fatal choice when it comes to performance.

My approach was to reduce database activity to the minimum (meaning zero; I wrote my own storage). For you, that means doing some caching and maybe building a kind of database of your own, or better, consider using an in-memory database (see Google). Since recursive computations like neural networks and AI (as in AForge or OpenCV) are far more intense than (well-defined and deterministic) financial math, computation is (IMHO) not your bottleneck, nor is managed code. Any SQL can become a bottleneck very easily. Try at least two in-memory databases (see "in-memory database" on Wikipedia for a list). If your performance increases, redesign your SQL statements to get the maximum out of them. I bet it will increase tremendously, but if it does not, take the C++ route (use an external financial math library with a C# wrapper) for performance testing.
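As a sketch of what "reduce database activity to the minimum" can look like in C#, keep a rolling window of ticks per symbol in memory and let the calculations read from that instead of from SQL. The window size and class names are just for illustration:

using System.Collections.Concurrent;
using System.Collections.Generic;

// Sketch: recent ticks live in memory so calculations never touch the
// database; persist snapshots separately, on your own schedule.
class TickCache
{
    private readonly ConcurrentDictionary<string, Queue<double>> history =
        new ConcurrentDictionary<string, Queue<double>>();
    private const int Window = 1000; // hypothetical rolling-window size

    public void Add(string symbol, double price)
    {
        var q = history.GetOrAdd(symbol, s => new Queue<double>());
        lock (q)
        {
            q.Enqueue(price);
            if (q.Count > Window) q.Dequeue(); // drop the oldest tick
        }
    }

    public double[] Snapshot(string symbol)
    {
        Queue<double> q;
        if (!history.TryGetValue(symbol, out q)) return new double[0];
        lock (q) { return q.ToArray(); }
    }
}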

Another approach would be to expand your SQL Server / database capabilities. There is a YouTube video out there about YouTube's own scaling problems during its different periods of growth - just a hint, but it takes me to the last point ;-)

One last word on common pitfalls. I assume that processing live stock data means fetching data over some kind of network!? Please be aware of any limits on connection handling, starting with the maximum number of simultaneous connections/sockets/ports, bandwidth issues, packet/session timeouts, and misconfiguration (even on the physical side -> network), and whatever else may come.
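On the connection-handling side specifically, .NET has a couple of knobs worth checking before you suspect the network itself; the values below are illustrative, not recommendations:

using System.Net;

// Sketch: raise the default outbound connection cap and set explicit
// timeouts before opening feed connections.
class FeedSetup
{
    static void Configure()
    {
        // .NET defaults to 2 concurrent HTTP connections per host.
        ServicePointManager.DefaultConnectionLimit = 50;

        var request = (HttpWebRequest)WebRequest.Create(
            "http://example.com/quotes");    // placeholder URL
        request.Timeout = 5000;              // connect/response, in ms
        request.ReadWriteTimeout = 5000;     // guards against stalled streams
    }
}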

And last but not least, let us know.
 