I am updating a column in one table by joining to another table. There is no index or key on either table. One table (TRN_FINAL_FCM_DETAILS_MONTHLY) has 100,000 records and the other (TRN_PRE_FCM_DETAILS) has 400,000 records. Consider the following query:
SQL
UPDATE A 
SET A.CHANGE_NOCHANGE = @CHANGED 
FROM TRN_PRE_FCM_DETAILS A, TRN_FINAL_FCM_DETAILS_MONTHLY B 
WHERE A.SITEID = B.SITEID 
  AND A.SHARINGCOUNT <> B.SHARINGCOUNT 
  AND A.REQUESTREF = B.REQUESTREF 
  AND A.BILLMONTH = @CURR_BILLMNTH 
  AND B.BILLMONTH = @PREV_BILLMNTH 
  AND A.BILLYEAR = @CURR_BILLYR 
  AND B.BILLYEAR = @PREV_BILLYR

But this query takes two hours to execute. How can I optimize its performance? I am new to performance tuning. Any help would be appreciated, as I need a solution as soon as possible. Please help me, guys.

What I have tried:

I tried indexing both tables, but I don't know much about indexing. I tried nonclustered indexes on the columns used in the join conditions.
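For illustration only, such indexes might look something like this (the index names and column order are guesses based on the query, not the actual indexes used):
SQL
CREATE NONCLUSTERED INDEX IX_PRE_FCM_JOIN
    ON TRN_PRE_FCM_DETAILS (BILLMONTH, BILLYEAR, SITEID, REQUESTREF)
    INCLUDE (SHARINGCOUNT);  -- covers the <> comparison without a key lookup

CREATE NONCLUSTERED INDEX IX_FINAL_FCM_JOIN
    ON TRN_FINAL_FCM_DETAILS_MONTHLY (BILLMONTH, BILLYEAR, SITEID, REQUESTREF)
    INCLUDE (SHARINGCOUNT);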
Comments
Tomas Takac 29-Mar-16 4:27am    
Update your question with the indexes you added to both tables. BTW I would recommend you use the ANSI join syntax.
Rahul_Pandit 29-Mar-16 5:01am    
What is an ANSI join? Please provide the syntax or an example. Is it better for performance?
Andy Lanng 29-Mar-16 5:07am    
You need to hit the "Reply" button, otherwise Tomas will not know you have commented.
To turn your join into an ANSI join, replace the comma with "INNER JOIN" and change the WHERE to "ON". To find out more about the advantages of ANSI joins, google it.
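A generic before/after (table names are placeholders, not the asker's tables):
SQL
-- Old-style comma join:
SELECT * FROM TableA A, TableB B WHERE A.ID = B.ID AND A.COL = 1

-- Equivalent ANSI join:
SELECT * FROM TableA A INNER JOIN TableB B ON A.ID = B.ID WHERE A.COL = 1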
Rahul_Pandit 30-Mar-16 3:07am    
I have already followed that approach, but it doesn't work.
CHill60 29-Mar-16 6:22am    
The line AND A.SHARINGCOUNT <> B.SHARINGCOUNT probably won't help either.

Use "MERGE" Statement
https://www.mssqltips.com/sqlservertip/1704/using-merge-in-sql-server-to-insert-update-and-delete-at-the-same-time/
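A minimal sketch of what a MERGE version of the question's query might look like (assuming the same tables and variables; note that MERGE requires SQL Server 2008 or later, and it raises an error if more than one source row matches the same target row):
SQL
MERGE TRN_PRE_FCM_DETAILS AS A
USING (SELECT SITEID, REQUESTREF, SHARINGCOUNT
       FROM TRN_FINAL_FCM_DETAILS_MONTHLY
       WHERE BILLMONTH = @PREV_BILLMNTH
         AND BILLYEAR = @PREV_BILLYR) AS B
   ON A.SITEID = B.SITEID
  AND A.REQUESTREF = B.REQUESTREF
WHEN MATCHED AND A.BILLMONTH = @CURR_BILLMNTH
             AND A.BILLYEAR = @CURR_BILLYR
             AND A.SHARINGCOUNT <> B.SHARINGCOUNT THEN
    UPDATE SET CHANGE_NOCHANGE = @CHANGED;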
 
 
Comments
Rahul_Pandit 30-Mar-16 2:13am    
MERGE is not allowed per our DBA team's security policy. Any other solution? I am using SQL Server 2005.
The order and type of your conditions do matter.

The non-ANSI join you are using is very inefficient because it attempts to join everything from table A to everything from table B. Even when it only looks at indexes in the first instance, those indexes remain very large for very large tables and joins. Changing it to an ANSI join will help, but only if your join conditions take advantage of it.

If the first condition is A.col <> B.col, the join still pairs everything but one row from table A with everything but one row from table B, so it barely improves the situation at all.

Try switching the order of the conditions so the "=" comparisons come first. That will immensely reduce the number of comparisons the "<>" has to make.

The four conditions that compare a column to a variable can stay where they are, in the WHERE clause, because moving them into the join would do little, if anything, to improve the query's efficiency.

This would be the result:
SQL
UPDATE A 
SET A.CHANGE_NOCHANGE = @CHANGED 
FROM TRN_PRE_FCM_DETAILS A
INNER JOIN TRN_FINAL_FCM_DETAILS_MONTHLY B 
    ON A.SITEID = B.SITEID 
   AND A.REQUESTREF = B.REQUESTREF 
   AND A.SHARINGCOUNT <> B.SHARINGCOUNT 
WHERE A.BILLMONTH = @CURR_BILLMNTH 
  AND B.BILLMONTH = @PREV_BILLMNTH 
  AND A.BILLYEAR = @CURR_BILLYR 
  AND B.BILLYEAR = @PREV_BILLYR


Secondly, you could reduce the query time further by applying the WHERE clause before the join; this shrinks the set of rows the join has to work through. You can do that with a CTE. Depending on how your data is set up and the data types involved, this could make things much better or much worse:

SQL
With preCalcCTE as (
 Select * from TRN_FINAL_FCM_DETAILS_MONTHLY
 where BILLMONTH = @PREV_BILLMNTH
 and BILLYEAR = @PREV_BILLYR)
UPDATE A 
SET A.CHANGE_NOCHANGE = @CHANGED 
FROM TRN_PRE_FCM_DETAILS A
INNER JOIN preCalcCTE B 
    ON A.SITEID = B.SITEID 
   AND A.REQUESTREF = B.REQUESTREF 
   AND A.SHARINGCOUNT <> B.SHARINGCOUNT 
WHERE A.BILLMONTH = @CURR_BILLMNTH 
  AND A.BILLYEAR = @CURR_BILLYR


Finally, an int-to-int comparison is much more efficient than a varchar-to-varchar one. An int is always a fixed few bytes, whereas a varchar can be much larger and its length can vary. Always put the cheaper data type comparisons before the more expensive ones.
Indexes can help, but the larger the data, the longer inserts and updates take, because the index must be updated too. From time to time it is a good idea to check whether your indexes have become fragmented; you can rebuild them to make them efficient again.

Check index fragmentation:
SQL
SELECT OBJECT_NAME(ind.OBJECT_ID) AS TableName, 
ind.name AS IndexName, indexstats.index_type_desc AS IndexType, 
indexstats.avg_fragmentation_in_percent 
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) indexstats 
INNER JOIN sys.indexes ind  
ON ind.object_id = indexstats.object_id 
AND ind.index_id = indexstats.index_id 
-- Common guidance: REORGANIZE when fragmentation is between 5% and 30%, REBUILD when over 30%
--WHERE indexstats.avg_fragmentation_in_percent > 30 
ORDER BY indexstats.avg_fragmentation_in_percent DESC


There are a couple of ways to reorganize or rebuild indexes, e.g. DBCC DBREINDEX (Transact-SQL)[^] or ALTER INDEX.
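For example, with ALTER INDEX (the index name here is hypothetical):
SQL
-- Lighter-weight defragmentation, for moderate fragmentation:
ALTER INDEX IX_PRE_FCM_JOIN ON TRN_PRE_FCM_DETAILS REORGANIZE;

-- Full rebuild, for heavy fragmentation:
ALTER INDEX IX_PRE_FCM_JOIN ON TRN_PRE_FCM_DETAILS REBUILD;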


NB:
Most of the time (but not always) SQL Server decides on your behalf which parts of the query run in which order. The first time you run any query, the engine compiles an execution plan, calculates the most efficient way to run it, and caches the result. As your database grows, that cached plan may no longer be the most efficient one. You can clear the plan cache so SQL Server recompiles any and all queries the next time they are run.
Clearing Cache for SQL Server Performance Testing[^]
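On a test server (never in production), for example:
SQL
-- Drop all cached execution plans so every query recompiles on next run:
DBCC FREEPROCCACHE;

-- Also empty the data cache, for cold-cache performance testing:
DBCC DROPCLEANBUFFERS;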

Consider for a redesign or for future projects:
I write my databases so that common text (such as ref codes) is stored in a table of its own with an ID, and each table that uses the ref code references that ID. That makes indexing and comparison MUCH more efficient.
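A hypothetical sketch of that design, using REQUESTREF as an example (names and types are illustrative):
SQL
CREATE TABLE REF_REQUEST (
    REQUESTID  INT IDENTITY(1,1) PRIMARY KEY,
    REQUESTREF VARCHAR(50) NOT NULL UNIQUE  -- the shared text lives here once
);

-- Detail tables then store the small INT key instead of repeating the text:
-- ALTER TABLE TRN_PRE_FCM_DETAILS ADD REQUESTID INT
--     REFERENCES REF_REQUEST (REQUESTID);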

I once had to redesign a huge database (that was pretty much upside-down) by doing this. It took 6 months to put it together and redesign the application that used it. Even my very best estimates put the data transfer as reducing the data size to 5%, but it would also have taken over 9,000 hours to complete (that's over a year!). We decided to archive the old system and only transfer the data with no history. It still took 2 days.
 
 
Comments
Andy Lanng 30-Mar-16 4:25am    
It told me I'd failed to post this. Imagine if I sat there hitting "Submit" over and over ^_^
Rahul_Pandit 30-Mar-16 4:58am    
Thanks for the reply. I will try your solution.
Andy Lanng 30-Mar-16 5:00am    
I know there's a lot there but I'll be around all day if you need help ^_^
Rahul_Pandit 31-Mar-16 5:22am    
I tried your ANSI-join approach, but it is still taking a long time.
Andy Lanng 31-Mar-16 5:41am    
It is the most optimised version of your query. The reason it is still slow must come down to the type and amount of data you are comparing. I would suggest a redesign, but that is quite an investment.
