Hello,

I use SQL Server 2008 R2. One of my tables contains more than 10,000,000 records, with a data size of around 4.2 GB. This table contains data about inventories and their properties.
A SELECT * query on this table takes 27 minutes to retrieve the data.

I am using this table in inner joins with another table that contains the daily transactions. The goal is to find the average usage on historical data, the consumption per item and the trend; hence, for business reasons, I cannot use filters on the query (although when filters are used, the query returns data in less time).

The table has a clustered index on the primary key as well as non-clustered indexes on the columns used in the inner joins, and I have also added additional indexes as suggested by the SQL execution plan.

My question is: is there any way by which we can load large data (greater than 3 GB) within a minute? Should I partition my table? It would ultimately still select the same data.
(Sorry for the dumb question; I do understand indexes and query optimization, but my problem is with huge data.)


My machine is a dual core with 8 GB RAM.
Table stats are as follows (using sp_spaceused):
rows:10157697
reserved: 4268752 KB
data:4026432 KB
index_size:240720 KB


Thanks for taking the time to read my question.
Comments
DamithSL 22-May-14 1:52am    
Calculate the consumption per item for the last day and keep it in a separate table; you can trigger this as a job at the start of each day.
When you need the consumption, you only have to calculate it for the current day using a WHERE condition, which will definitely take less time. Finally, combine the stored previous days' consumption with the newly calculated current-day consumption to get the end result.
Just an idea :-)
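A minimal sketch of that idea, assuming hypothetical table and column names (ItemConsumptionSummary, DailyTransaction, ItemId, Quantity and TransactionDate are placeholders, not names from the question):
SQL
-- Hypothetical summary table: one pre-aggregated row per item per day
CREATE TABLE dbo.ItemConsumptionSummary
(
    ItemId        INT            NOT NULL,
    SummaryDate   DATE           NOT NULL,
    TotalQuantity DECIMAL(18, 4) NOT NULL,
    CONSTRAINT PK_ItemConsumptionSummary PRIMARY KEY (ItemId, SummaryDate)
);

-- Nightly job: aggregate only yesterday's transactions into the summary table
INSERT INTO dbo.ItemConsumptionSummary (ItemId, SummaryDate, TotalQuantity)
SELECT t.ItemId,
       CAST(t.TransactionDate AS DATE),
       SUM(t.Quantity)
FROM dbo.DailyTransaction t
WHERE t.TransactionDate >= CAST(DATEADD(DAY, -1, GETDATE()) AS DATE)
  AND t.TransactionDate <  CAST(GETDATE() AS DATE)
GROUP BY t.ItemId, CAST(t.TransactionDate AS DATE);

-- Reporting then reads the small summary table instead of the 10M-row transaction table
SELECT ItemId, AVG(TotalQuantity) AS AvgDailyUsage
FROM dbo.ItemConsumptionSummary
GROUP BY ItemId;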
Amol_B 22-May-14 1:55am    
Thanks DamithSL, but it has to take historical data in account. To Determine average use within period of 2 to 3 years and determine trend.

There are a few things you can do (I deal with historical data of over 560,000,000 records so far).
1. Partition your table, probably on some date column. That can help SQL Server scan index pages only for a specific time frame (see the sketch after this list).
2. Choose your clustered index carefully. What is special about a clustered index is that all of the data is stored on the leaf pages of the index, so no additional lookup is needed to read it. So if you do a lot of joins using a defined sequence of columns, make that sequence into the clustered index (but remember to keep the index as small as possible).
3. Create an index for every foreign key you use on the table.
4. Add more columns to the index pages (using CREATE INDEX ... INCLUDE). That can help you retrieve the minimal (and most important) data without even reading a single data page (see the sketch after this list).
5. Consider doing your query in steps. That will reduce the complexity of the execution plan of your SQL query and can save time...
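A minimal sketch of points 1 and 4, assuming hypothetical object and column names (DailyTransaction, ItemId, Quantity and TransactionDate are placeholders; the actual schema is not shown in the question):
SQL
-- Point 1: partition the transaction table by year (hypothetical boundary values).
-- An existing table is moved onto the scheme by rebuilding its clustered index
-- ON psTransactionYear(TransactionDate).
CREATE PARTITION FUNCTION pfTransactionYear (DATETIME)
AS RANGE RIGHT FOR VALUES ('2012-01-01', '2013-01-01', '2014-01-01');

CREATE PARTITION SCHEME psTransactionYear
AS PARTITION pfTransactionYear ALL TO ([PRIMARY]);

-- Point 4: covering index - the join/filter columns are the keys and the
-- columns the query reads are INCLUDEd, so the query can be answered
-- from index pages alone, without touching the data pages
CREATE NONCLUSTERED INDEX IX_DailyTransaction_ItemId_Date
ON dbo.DailyTransaction (ItemId, TransactionDate)
INCLUDE (Quantity);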

[some more]
Another idea I just had (I haven't used it myself) is to run daily summaries that create the necessary usage data for you. I believe that in most cases the average does not have to be up to date to the last second...
 
Comments
Maciej Los 22-May-14 3:18am    
5ed! It's a very good answer, even though in my opinion the main issue is somewhere else ;)
Please, read my answer.
Kornfeld Eliyahu Peter 22-May-14 3:24am    
Thank you...
I agree - it smells like a pure design problem. The OP does not know what he really wants, so he has no real solution...
Maciej Los 22-May-14 3:32am    
Holy words!
Cheers,
Maciej
Jörgen Andersson 22-May-14 3:54am    
Points 2 and 3 are incompatible if you want performance; see the link in my answer.
The loss of the performance gain from a clustered index can be compensated for by using a covering index, at the sacrifice of storage space.
Kornfeld Eliyahu Peter 22-May-14 4:38am    
I didn't get you - why are an index on a foreign key and a clustered index incompatible?
Amol_B wrote:
My question is: is there any way by which we can load large data (greater than 3 GB) within a minute?
(...)
A SELECT * query on this table takes 27 minutes to retrieve the data.


No, there is no method to load 10 million records within one minute, even if you add indexes, partition your table, etc.

The main sin of database programmers is to use
SQL
SELECT *
FROM TableName

Never use it!

Use:
SQL
SELECT <field_list>
FROM TableName

or even better:
SQL
SELECT <field_list>
FROM TableName
WHERE <condition>
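For example (a sketch only; DailyTransaction, ItemId, Quantity and TransactionDate are hypothetical names, since the actual schema is not shown in the question):
SQL
-- Read only the columns the report needs, restricted to the period of interest
SELECT t.ItemId, t.TransactionDate, t.Quantity
FROM dbo.DailyTransaction t
WHERE t.TransactionDate >= DATEADD(YEAR, -3, GETDATE());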


Please, read about Query performance[^] and database performance, etc. Use the search box at the top-right corner of this site.
 
Comments
Kornfeld Eliyahu Peter 22-May-14 3:05am    
+5! SELECT * is a design problem, and for that no index or partition can help the OP!
Maciej Los 22-May-14 3:19am    
Thank you ;)
[no name] 24-May-14 20:24pm    
But I like "select * from x" without a WHERE, because it shows me how much power the server has ;)

Therefore, because we are not on the same line on this matter, I vote only 5 for your answer!
Maciej Los 25-May-14 4:26am    
Thank you, Bruno ;)
I'm seeing two errors here.
Firstly, average usage and consumption per item are aggregations of data.
Filtering and aggregation are what the database excels at, so why on earth do you want to move the whole bunch of data off the database? Do your aggregation in the query (a sketch follows below).
We have a database where several tables have more than 200 million rows, and an aggregate on a pivot of 6 million rows is done in less than 6 seconds, while the result set is only a few kB.
Shuffling data takes linear time, and moving 4.2 GB of data is not a process you can optimize without upgrading your hardware.
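A sketch of what aggregating in the query could look like for this case, with hypothetical names (DailyTransaction, InventoryItem, ItemId, Quantity and TransactionDate are assumptions, not the OP's actual schema):
SQL
-- Only the aggregated result crosses the wire, not the 4.2 GB of raw rows.
-- Quantity is assumed to be a decimal column.
SELECT i.ItemId,
       YEAR(t.TransactionDate)  AS UsageYear,
       MONTH(t.TransactionDate) AS UsageMonth,
       AVG(t.Quantity)          AS AvgUsage,
       SUM(t.Quantity)          AS TotalConsumption
FROM dbo.DailyTransaction t
INNER JOIN dbo.InventoryItem i ON i.ItemId = t.ItemId
GROUP BY i.ItemId, YEAR(t.TransactionDate), MONTH(t.TransactionDate);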

Secondly, never use a clustered primary key if you have secondary indexes. I'm not going to explain why, as Marcus Winand does it[^] so much better.

<edit>Also consider using an indexed view; most aggregates can be used in an indexed view. The exceptions I can think of at the moment are Min() and Max().</edit>
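A minimal sketch of such an indexed view, again with hypothetical names, and assuming Quantity is declared NOT NULL; SQL Server requires SCHEMABINDING and COUNT_BIG(*) in an indexed view, so the average is derived from the materialized sum and count at query time:
SQL
CREATE VIEW dbo.vItemMonthlyUsage
WITH SCHEMABINDING
AS
SELECT t.ItemId,
       YEAR(t.TransactionDate)  AS UsageYear,
       MONTH(t.TransactionDate) AS UsageMonth,
       SUM(t.Quantity)          AS TotalQuantity,  -- Quantity assumed NOT NULL
       COUNT_BIG(*)             AS RowCnt
FROM dbo.DailyTransaction t
GROUP BY t.ItemId, YEAR(t.TransactionDate), MONTH(t.TransactionDate);
GO

-- Materializes the aggregate; the optimizer (or a NOEXPAND hint) can then
-- answer usage queries from this small structure instead of the base table
CREATE UNIQUE CLUSTERED INDEX IX_vItemMonthlyUsage
ON dbo.vItemMonthlyUsage (ItemId, UsageYear, UsageMonth);
GO

SELECT ItemId, UsageYear, UsageMonth,
       CAST(TotalQuantity AS DECIMAL(18, 4)) / RowCnt AS AvgUsage
FROM dbo.vItemMonthlyUsage WITH (NOEXPAND);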
 
Comments
Maciej Los 25-May-14 4:27am    
+5!
Well, I agree with Peter, even though the goal of your problem is:
Goal is to find the average usage on historical data, the consumption per item and the trend; hence I cannot use filters on the query for business purposes.

You can create a clustered indexed view based on the item in the transaction table.
That will ultimately avoid the full scan of the transaction data.
Hopefully it will reduce your code too.
 
