Hello,

I use SQL Server 2008 R2. One of my tables contains more than 10,000,000 records, with a data size of around 4.2 GB. This table contains data about inventories and their properties.
A SELECT * query on this table takes 27 minutes to retrieve the data.

I am using this table in inner joins with another table that contains the daily transactions. The goal is to find the average usage on historical data, the consumption per item and the trend; hence, for business reasons, I cannot use filters on the query (although when filters are used, the query returns data in less time).

The table has a clustered index on the primary key as well as non-clustered indexes on the columns used in the inner joins, and I have also added additional indexes as suggested by the SQL execution plan.

My question is: is there any way by which we can load large data (greater than 3 GB) within a minute? Should I partition my table? It would ultimately still select the same data.
(Sorry for the dumb question; I do understand indexes and query optimization, but my problem is with huge data.)


My machine is a dual core with 8 GB RAM.
Table stats are as follows (using sp_spaceused):
rows:10157697
reserved: 4268752 KB
data:4026432 KB
index_size:240720 KB


Thanks for taking the time to read my question.
Comments
DamithSL 22-May-14 1:52am    
Calculate the consumption per item for the last day and keep it in a separate table; you can trigger this as a job at the start of each day.
When you need the consumption, you only have to calculate it for the current day using a WHERE condition, which will definitely take less time. Finally, combine the stored previous days' consumption with the newly calculated current-day consumption to get the end result.
Just an idea :-)
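A minimal sketch of that idea, assuming hypothetical table and column names (ItemConsumptionSummary, DailyTransaction, ItemId, Quantity and TransactionDate are placeholders, not names from the question):
SQL
-- Hypothetical summary table: one pre-aggregated row per item per day
CREATE TABLE dbo.ItemConsumptionSummary
(
    ItemId        INT            NOT NULL,
    SummaryDate   DATE           NOT NULL,
    TotalQuantity DECIMAL(18, 4) NOT NULL,
    CONSTRAINT PK_ItemConsumptionSummary PRIMARY KEY (ItemId, SummaryDate)
);

-- Nightly job: aggregate only yesterday's transactions into the summary table
INSERT INTO dbo.ItemConsumptionSummary (ItemId, SummaryDate, TotalQuantity)
SELECT t.ItemId,
       CAST(t.TransactionDate AS DATE),
       SUM(t.Quantity)
FROM dbo.DailyTransaction t
WHERE t.TransactionDate >= CAST(DATEADD(DAY, -1, GETDATE()) AS DATE)
  AND t.TransactionDate <  CAST(GETDATE() AS DATE)
GROUP BY t.ItemId, CAST(t.TransactionDate AS DATE);

-- Reporting then reads the small summary table instead of the 10M-row transaction table
SELECT ItemId, AVG(TotalQuantity) AS AvgDailyUsage
FROM dbo.ItemConsumptionSummary
GROUP BY ItemId;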
Amol_B 22-May-14 1:55am    
Thanks DamithSL, but it has to take historical data in account. To Determine average use within period of 2 to 3 years and determine trend.

There are a few things you can do (I deal with historical data of over 560,000,000 records so far).
1. Partition your table, probably on some date column. That can help SQL Server scan index pages only for a specific time frame (see the sketch after this list).
2. Choose your clustered index carefully. What is special about a clustered index is that all of the data is stored on the leaf pages of the index, so no additional lookup is needed to read it. So if you do a lot of joins using a defined sequence of columns, make that sequence into the clustered index (but remember to keep the index as small as possible).
3. Create an index for every foreign key you use on the table.
4. Add more columns to the index pages (using CREATE INDEX ... INCLUDE). That can help you retrieve the minimal (and most important) data without even reading a single data page (see the sketch after this list).
5. Consider doing your query in steps. That will reduce the complexity of the execution plan of your SQL query and can save time...
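A minimal sketch of points 1 and 4, assuming hypothetical object and column names (DailyTransaction, ItemId, Quantity and TransactionDate are placeholders; the actual schema is not shown in the question):
SQL
-- Point 1: partition the transaction table by year (hypothetical boundary values).
-- An existing table is moved onto the scheme by rebuilding its clustered index
-- ON psTransactionYear(TransactionDate).
CREATE PARTITION FUNCTION pfTransactionYear (DATETIME)
AS RANGE RIGHT FOR VALUES ('2012-01-01', '2013-01-01', '2014-01-01');

CREATE PARTITION SCHEME psTransactionYear
AS PARTITION pfTransactionYear ALL TO ([PRIMARY]);

-- Point 4: covering index - the join/filter columns are the keys and the
-- columns the query reads are INCLUDEd, so the query can be answered
-- from index pages alone, without touching the data pages
CREATE NONCLUSTERED INDEX IX_DailyTransaction_ItemId_Date
ON dbo.DailyTransaction (ItemId, TransactionDate)
INCLUDE (Quantity);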

[some more]
Another idea I just had (I haven't used it myself) is to run daily summaries that create the necessary usage data for you. I believe that in most cases the average does not have to be up to date to the last second...
 
Comments
Maciej Los 22-May-14 3:18am    
5ed! It's a very good answer, even though in my opinion the main issue is somewhere else ;)
Please, read my answer.
Kornfeld Eliyahu Peter 22-May-14 3:24am    
Thank you...
I agree - it smells like a pure design problem. The OP does not know what he really wants, so he has no real solution...
Maciej Los 22-May-14 3:32am    
Holy words!
Cheers,
Maciej
Jörgen Andersson 22-May-14 3:54am    
Points 2 and 3 are incompatible if you want performance; see the link in my answer.
The loss of the performance gain from a clustered index can be compensated for by using a covering index, at the sacrifice of storage space.
Kornfeld Eliyahu Peter 22-May-14 4:38am    
I didn't get you - why are an index on a foreign key and a clustered index incompatible?
Amol_B wrote:
My question is: is there any way by which we can load large data (greater than 3 GB) within a minute?
(...)
A SELECT * query on this table takes 27 minutes to retrieve the data.


No, there is no method to load 10 million records within one minute, even if you add indexes, partition your table, etc.

The main sin of database programmers is to use
SQL
SELECT *
FROM TableName

Never use it!

Use:
SQL
SELECT <field_list>
FROM TableName

or even better:
SQL
SELECT <field_list>
FROM TableName
WHERE <condition>
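For example (a sketch only; DailyTransaction, ItemId, Quantity and TransactionDate are hypothetical names, since the actual schema is not shown in the question):
SQL
-- Read only the columns the report needs, restricted to the period of interest
SELECT t.ItemId, t.TransactionDate, t.Quantity
FROM dbo.DailyTransaction t
WHERE t.TransactionDate >= DATEADD(YEAR, -3, GETDATE());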


Please, read about Query performance[^] and database performance, etc. Use the search box at the top-right corner of this site.
 
Comments
Kornfeld Eliyahu Peter 22-May-14 3:05am    
+5! SELECT * is a design problem, and for that no index or partition can help the OP!
Maciej Los 22-May-14 3:19am    
Thank you ;)
[no name] 24-May-14 20:24pm    
But I like "select * from x" without a WHERE, because it shows me how much power the server has ;)

Therefore, because we are not on the same line on this matter, I vote only 5 for your answer!
Maciej Los 25-May-14 4:26am    
Thank you, Bruno ;)
I'm seeing two errors here.
Firstly, average usage and consumption per item are aggregations of data.
Filtering and aggregation are what the database excels at, so why on earth do you want to move the whole bunch of data off the database? Do your aggregation in the query (a sketch follows below).
We have a database where several tables have more than 200 million rows, and an aggregate on a pivot of 6 million rows is done in less than 6 seconds, while the result set is only a few kB.
Shuffling data takes linear time, and moving 4.2 GB of data is not a process you can optimize without upgrading your hardware.
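A sketch of what aggregating in the query could look like for this case, with hypothetical names (DailyTransaction, InventoryItem, ItemId, Quantity and TransactionDate are assumptions, not the OP's actual schema):
SQL
-- Only the aggregated result crosses the wire, not the 4.2 GB of raw rows.
-- Quantity is assumed to be a decimal column.
SELECT i.ItemId,
       YEAR(t.TransactionDate)  AS UsageYear,
       MONTH(t.TransactionDate) AS UsageMonth,
       AVG(t.Quantity)          AS AvgUsage,
       SUM(t.Quantity)          AS TotalConsumption
FROM dbo.DailyTransaction t
INNER JOIN dbo.InventoryItem i ON i.ItemId = t.ItemId
GROUP BY i.ItemId, YEAR(t.TransactionDate), MONTH(t.TransactionDate);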

Secondly, never use a clustered primary key if you have secondary indexes. I'm not going to explain why, as Marcus Winand does it[^] so much better.

<edit>Also consider using an indexed view; most aggregates can be used in an indexed view. The exceptions I can think of at the moment are Min() and Max().</edit>
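A minimal sketch of such an indexed view, again with hypothetical names, and assuming Quantity is declared NOT NULL; SQL Server requires SCHEMABINDING and COUNT_BIG(*) in an indexed view, so the average is derived from the materialized sum and count at query time:
SQL
CREATE VIEW dbo.vItemMonthlyUsage
WITH SCHEMABINDING
AS
SELECT t.ItemId,
       YEAR(t.TransactionDate)  AS UsageYear,
       MONTH(t.TransactionDate) AS UsageMonth,
       SUM(t.Quantity)          AS TotalQuantity,  -- Quantity assumed NOT NULL
       COUNT_BIG(*)             AS RowCnt
FROM dbo.DailyTransaction t
GROUP BY t.ItemId, YEAR(t.TransactionDate), MONTH(t.TransactionDate);
GO

-- Materializes the aggregate; the optimizer (or a NOEXPAND hint) can then
-- answer usage queries from this small structure instead of the base table
CREATE UNIQUE CLUSTERED INDEX IX_vItemMonthlyUsage
ON dbo.vItemMonthlyUsage (ItemId, UsageYear, UsageMonth);
GO

SELECT ItemId, UsageYear, UsageMonth,
       CAST(TotalQuantity AS DECIMAL(18, 4)) / RowCnt AS AvgUsage
FROM dbo.vItemMonthlyUsage WITH (NOEXPAND);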
 
Comments
Maciej Los 25-May-14 4:27am    
+5!
Well, I agree with Peter, even though the goal of your problem is:
Goal is to find the average usage on historical data, the consumption per item and the trend; hence I cannot use filters on the query for business purposes.

You can create a clustered indexed view based on the item in the transaction table.
That will ultimately avoid the full scan of the transaction data.
Hopefully it will reduce your code too.
 
