Click here to Skip to main content
15,881,715 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
Hi new to this site, hoping someone might be able to help me with this SQL question.

So I have a table called T_EE_EVALUATIONS that has duplicate records within in it. The table has the following structure below. I need to find the duplicate address where the house_id is not the same as it's duplicate address. Neither ID's are P keys. I tried this but it doesn't really work the way I want it to work.
SQL
Select E.eval_id , E.house_id , E.ClientCity, E.ClientAddr, E.Houseregion, T.eval_id , T.house_id, T.ClientCity , T.ClientAddr , T.Houseregion
From T_EE_EVALUATIONS E INNER JOIN
(Select eval_id , house_id,ClientCity, ClientAddr, Houseregion, COUNT(*) AS CountOf
FROM T_EE_EVALUATIONS GROUP BY eval_id , house_id,ClientCity, ClientAddr, Houseregion-- HAVING COUNT(*)>1
) T
ON E.ClientCity || E.ClientAddr || E.Houseregion = T.ClientCity || T.ClientAddr || T.Houseregion
eval_id  house_id  ClientCity  ClientAddr      Houseregion
------------------------------------------------------------
12345    56455     Quebec      34 Anyplace     Quebec
12345    23765     Quebec      34 Anyplace     Quebec
12345    94213     Quebec      34 Anyplace     Quebec
13456    23456     Russell     214 mystreet    Ontario
13456    33456     Russell     214 mystreet    Ontario
14526    14567     Quispamsis  456 Thatstreet  New Brunswick
14526    13567     Quispamsis  456 Thatstreet  New Brunswick


What I have tried:

SQL
Select E.eval_id , E.house_id , E.ClientCity, E.ClientAddr, E.Houseregion, T.eval_id , T.house_id, T.ClientCity , T.ClientAddr , T.Houseregion
From T_EE_EVALUATIONS E INNER JOIN
(Select eval_id , house_id,ClientCity, ClientAddr, Houseregion, COUNT(*) AS CountOf
FROM T_EE_EVALUATIONS GROUP BY eval_id , house_id,ClientCity, ClientAddr, Houseregion-- HAVING COUNT(*)>1
) T
ON E.ClientCity || E.ClientAddr || E.Houseregion = T.ClientCity || T.ClientAddr || T.Houseregion
Posted
Updated 14-Nov-19 0:57am
v2
Comments
ZurdoDev 13-Nov-19 10:50am    
Don't group by house_id because that is not part of the duplicate data.
greatwhite1 13-Nov-19 11:44am    
How do I include house_id? I have to have that so I can see if the house_id is different
ZurdoDev 13-Nov-19 11:48am    
One way is to use a derived table. Your derived table pulls groups on just the address and then join back to that based on the address fields.

If you want to delete all but one for each house there are simple scripts you can find online to do that.
greatwhite1 13-Nov-19 12:28pm    
No once I find the records I won't be deleting them they have to have their house_id updated to the original. From my understanding before I started here something caused some addresses to be given a new house ID , Duplicate data is normal but they should all have the same house_id. I know weird but I didn't build it, just have to live with it.
ZurdoDev 13-Nov-19 13:49pm    
1. Reply to the comment so that I am notified instead of posting a new comment.
2. Do something like this
SELECT *
FROM table t
LEFT JOIN
(
SELECT addr1, addr2, ...
FROM table
GROUP BY addr1, addr2, ...
HAVING COUNT(*) > 1
) x ON t.addr1 = x.addr1 AND t.addr2 = x.addr2, ...

1 solution

Try to get all duplicates for specific eval_id and then to join house_id:
SQL
SELECT t1.*, t2.house_id 
FROM 
(
    SELECT eval_id, ClientCity, ClientAddr, Houseregion
    FROM T_EE_EVALUATIONS 
    GROUP BY eval_id, ClientCity, ClientAddr, Houseregion
    HAVING COUNT(*)>1
) AS t1 INNER JOIN T_EE_EVALUATIONS AS t2 ON t1.eval_id = t2.eval_id AND t1.ClientCity = t2.ClientCity AND t1.ClientAddr = t2.ClientAddr AND  t1.Houseregion = t2.Houseregion
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900