I am updating a column in one table by joining to another table. There is no index or key on either table. One table (TRN_FINAL_FCM_DETAILS_MONTHLY) has 100,000 records and the other (TRN_PRE_FCM_DETAILS) has 400,000 records. Consider the following query:
SQL
UPDATE A 
SET A.CHANGE_NOCHANGE = @CHANGED 
FROM TRN_PRE_FCM_DETAILS A, TRN_FINAL_FCM_DETAILS_MONTHLY B 
WHERE A.SITEID = B.SITEID 
  AND A.SHARINGCOUNT <> B.SHARINGCOUNT 
  AND A.REQUESTREF = B.REQUESTREF 
  AND A.BILLMONTH = @CURR_BILLMNTH 
  AND B.BILLMONTH = @PREV_BILLMNTH 
  AND A.BILLYEAR = @CURR_BILLYR 
  AND B.BILLYEAR = @PREV_BILLYR

But this query takes two hours to execute. How can I optimize its performance? I am new to performance tuning. Any help would be appreciated, as I need a solution as soon as possible. Please help me, guys.

What I have tried:

I tried indexing both tables, but I don't know much about indexing. I tried nonclustered indexes on the columns used in the join conditions.
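For illustration only, such indexes might look something like this (the index names and column order are guesses based on the query, not the actual indexes used):
SQL
CREATE NONCLUSTERED INDEX IX_PRE_FCM_JOIN
    ON TRN_PRE_FCM_DETAILS (BILLMONTH, BILLYEAR, SITEID, REQUESTREF)
    INCLUDE (SHARINGCOUNT);  -- covers the <> comparison without a key lookup

CREATE NONCLUSTERED INDEX IX_FINAL_FCM_JOIN
    ON TRN_FINAL_FCM_DETAILS_MONTHLY (BILLMONTH, BILLYEAR, SITEID, REQUESTREF)
    INCLUDE (SHARINGCOUNT);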
Comments
Tomas Takac 29-Mar-16 4:27am    
Update your question with the indexes you added to both tables. BTW I would recommend you use the ANSI join syntax.
Rahul_Pandit 29-Mar-16 5:01am    
What is an ANSI join? Please provide the syntax or an example. Is it better for performance?
Andy Lanng 29-Mar-16 5:07am    
You need to hit the "Reply" button, otherwise Tomas will not know you have commented.
To turn your join into an ANSI join, replace the comma with "INNER JOIN" and change the WHERE to "ON". To find out more about the advantages of ANSI joins, google it.
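A generic before/after (table names are placeholders, not the asker's tables):
SQL
-- Old-style comma join:
SELECT * FROM TableA A, TableB B WHERE A.ID = B.ID AND A.COL = 1

-- Equivalent ANSI join:
SELECT * FROM TableA A INNER JOIN TableB B ON A.ID = B.ID WHERE A.COL = 1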
Rahul_Pandit 30-Mar-16 3:07am    
I have already followed that approach, but it doesn't work.
CHill60 29-Mar-16 6:22am    
The line AND A.SHARINGCOUNT <> B.SHARINGCOUNT probably won't help either.

Use "MERGE" Statement
https://www.mssqltips.com/sqlservertip/1704/using-merge-in-sql-server-to-insert-update-and-delete-at-the-same-time/
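A minimal sketch of what a MERGE version of the question's query might look like (assuming the same tables and variables; note that MERGE requires SQL Server 2008 or later, and it raises an error if more than one source row matches the same target row):
SQL
MERGE TRN_PRE_FCM_DETAILS AS A
USING (SELECT SITEID, REQUESTREF, SHARINGCOUNT
       FROM TRN_FINAL_FCM_DETAILS_MONTHLY
       WHERE BILLMONTH = @PREV_BILLMNTH
         AND BILLYEAR = @PREV_BILLYR) AS B
   ON A.SITEID = B.SITEID
  AND A.REQUESTREF = B.REQUESTREF
WHEN MATCHED AND A.BILLMONTH = @CURR_BILLMNTH
             AND A.BILLYEAR = @CURR_BILLYR
             AND A.SHARINGCOUNT <> B.SHARINGCOUNT THEN
    UPDATE SET CHANGE_NOCHANGE = @CHANGED;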
 
 
Comments
Rahul_Pandit 30-Mar-16 2:13am    
MERGE is not allowed per our DBA team's security policy. Any other solution? I am using SQL Server 2005.
The order and type of your conditions do matter.

The non-ANSI join you are using is very inefficient because it attempts to join everything from table A to everything from table B. Even when it only looks at indexes in the first instance, those indexes remain very large for very large tables and joins. Changing it to an ANSI join will help, but only if your join conditions take advantage of it.

If the first condition is A.col <> B.col, the join still pairs everything but one row from table A with everything but one row from table B, so it barely improves the situation at all.

Try switching the order of the conditions so the "=" comparisons come first. That will immensely reduce the number of comparisons the "<>" has to make.

The four conditions that compare a column to a variable can stay where they are, in the WHERE clause, because moving them into the join would do little, if anything, to improve the query's efficiency.

This would be the result:
SQL
UPDATE A 
SET A.CHANGE_NOCHANGE = @CHANGED 
FROM TRN_PRE_FCM_DETAILS A
INNER JOIN TRN_FINAL_FCM_DETAILS_MONTHLY B 
    ON A.SITEID = B.SITEID 
   AND A.REQUESTREF = B.REQUESTREF 
   AND A.SHARINGCOUNT <> B.SHARINGCOUNT 
WHERE A.BILLMONTH = @CURR_BILLMNTH 
  AND B.BILLMONTH = @PREV_BILLMNTH 
  AND A.BILLYEAR = @CURR_BILLYR 
  AND B.BILLYEAR = @PREV_BILLYR


Secondly, you could reduce the query time further by applying the WHERE clause before the join; this shrinks the set of rows the join has to work through. You can do that with a CTE. Depending on how your data is set up and the data types involved, this could make things much better or much worse:

SQL
With preCalcCTE as (
 Select * from TRN_FINAL_FCM_DETAILS_MONTHLY
 where BILLMONTH = @PREV_BILLMNTH
 and BILLYEAR = @PREV_BILLYR)
UPDATE A 
SET A.CHANGE_NOCHANGE = @CHANGED 
FROM TRN_PRE_FCM_DETAILS A
INNER JOIN preCalcCTE B 
    ON A.SITEID = B.SITEID 
   AND A.REQUESTREF = B.REQUESTREF 
   AND A.SHARINGCOUNT <> B.SHARINGCOUNT 
WHERE A.BILLMONTH = @CURR_BILLMNTH 
  AND A.BILLYEAR = @CURR_BILLYR


Finally, an int-to-int comparison is much more efficient than a varchar-to-varchar one. An int is always a fixed few bytes, whereas a varchar can be much larger and its length can vary. Always put the cheaper data type comparisons before the more expensive ones.
Indexes can help, but the larger the data, the longer inserts and updates take, because the index must be updated too. From time to time it is a good idea to check whether your indexes have become fragmented; you can rebuild them to make them efficient again.

Check index fragmentation:
SQL
SELECT OBJECT_NAME(ind.OBJECT_ID) AS TableName, 
ind.name AS IndexName, indexstats.index_type_desc AS IndexType, 
indexstats.avg_fragmentation_in_percent 
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) indexstats 
INNER JOIN sys.indexes ind  
ON ind.object_id = indexstats.object_id 
AND ind.index_id = indexstats.index_id 
-- Common guidance: REORGANIZE when fragmentation is between 5% and 30%, REBUILD when over 30%
--WHERE indexstats.avg_fragmentation_in_percent > 30 
ORDER BY indexstats.avg_fragmentation_in_percent DESC


There are a couple of ways to reorganize or rebuild indexes, e.g. DBCC DBREINDEX (Transact-SQL)[^] or ALTER INDEX.
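For example, with ALTER INDEX (the index name here is hypothetical):
SQL
-- Lighter-weight defragmentation, for moderate fragmentation:
ALTER INDEX IX_PRE_FCM_JOIN ON TRN_PRE_FCM_DETAILS REORGANIZE;

-- Full rebuild, for heavy fragmentation:
ALTER INDEX IX_PRE_FCM_JOIN ON TRN_PRE_FCM_DETAILS REBUILD;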


NB:
Most of the time (but not always) SQL Server decides on your behalf which parts of the query run in which order. The first time you run any query, the engine compiles an execution plan, calculates the most efficient way to run it, and caches the result. As your database grows, that cached plan may no longer be the most efficient one. You can clear the plan cache so SQL Server recompiles any and all queries the next time they are run.
Clearing Cache for SQL Server Performance Testing[^]
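On a test server (never in production), for example:
SQL
-- Drop all cached execution plans so every query recompiles on next run:
DBCC FREEPROCCACHE;

-- Also empty the data cache, for cold-cache performance testing:
DBCC DROPCLEANBUFFERS;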

Consider for a redesign or for future projects:
I write my databases so that common text (such as ref codes) is stored in a table of its own with an ID, and each table that uses the ref code references that ID. That makes indexing and comparison MUCH more efficient.
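A hypothetical sketch of that design, using REQUESTREF as an example (names and types are illustrative):
SQL
CREATE TABLE REF_REQUEST (
    REQUESTID  INT IDENTITY(1,1) PRIMARY KEY,
    REQUESTREF VARCHAR(50) NOT NULL UNIQUE  -- the shared text lives here once
);

-- Detail tables then store the small INT key instead of repeating the text:
-- ALTER TABLE TRN_PRE_FCM_DETAILS ADD REQUESTID INT
--     REFERENCES REF_REQUEST (REQUESTID);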

I once had to redesign a huge database (that was pretty much upside-down) by doing this. It took 6 months to put it together and redesign the application that used it. Even my very best estimates put the data transfer as reducing the data size to 5%, but it would also have taken over 9,000 hours to complete (that's over a year!). We decided to archive the old system and only transfer the data with no history. It still took 2 days.
 
 
Comments
Andy Lanng 30-Mar-16 4:25am    
It told me I'd failed to post this. Imagine if I sat there hitting "Submit" over and over ^_^
Rahul_Pandit 30-Mar-16 4:58am    
Thanks for the reply. I will try your solution.
Andy Lanng 30-Mar-16 5:00am    
I know there's a lot there but I'll be around all day if you need help ^_^
Rahul_Pandit 31-Mar-16 5:22am    
I tried your ANSI-join approach, but it is still taking a long time.
Andy Lanng 31-Mar-16 5:41am    
It is the most optimised version of your query. The reason it is still slow must come down to the type and amount of data you are comparing. I would suggest a redesign, but that is quite an investment.
