Hello all, I have a DataTable which includes 3 condition columns (C) and 1 decision column (D). I want to find rows with similar condition values but a different decision value using LINQ. How can I achieve this? Thanks for the help.
For example:
SetName :C1-C2-C3-D
Set1    :1-2-2-3
Set2    :2-1-1-2
Set3    :1-2-2-2

.....
Here the result is Set1 and Set3.
Updated 14-Nov-13 2:10am
Comments
Erik Rude 13-Nov-13 7:27am    
Have you thought about GroupBy?

You have to define "similar" in terms that the computer will understand. I have no idea how you're going to do that and, to complicate things, it's going to depend on how you store this data in the database. If the database has to manipulate (change) the data on every record to make a determination, you're going to kill the query performance.
 
 
Comments
Member 12 13-Nov-13 11:56am    
First of all thanks for the replies. I know this is not an easy job. I tried many things including grouping columns but it gives nothing. I think I should write my own data row comparer but I have limited knowledge.
Dave Kreskowiak 13-Nov-13 15:53pm    
Grouping isn't going to do anything for you.

You can't even think about writing code until you define in very exact terms what you mean by "similar". What constitutes "similarity" between records? How many records are you comparing? What does the result set look like? If you can't describe the answers to these questions, and more of them, to yourself you have no hope of ever describing it to a computer, which is FAR more picky about it.
Member 12 13-Nov-13 16:50pm    
Let me be more specific. My sets contain nominal values about a disease. Condition values are the symptoms and the decision value represents whether he/she is ill or not. For example, C1 may represent headache symptoms. I thought that the best way to deal with nominal values was to convert them to numerical values. Therefore I converted them and saved them to a data table. Now, my goal is to find conflicting records and remove them from the dataset. For example: suppose that there are two patients. Both have the same symptoms but only one of them is ill (in my example these are Set1 and Set3). These are outlier values and must be removed. That's my goal.
I think this solves my problem.
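-- Inner query: collapse duplicates into distinct (C1, C2, C3, D) combinations.
-- Outer query: keep the condition triples that occur with more than one decision value.
SELECT C1, C2, C3 FROM
(SELECT C1, C2, C3, D FROM <datatable> GROUP BY C1, C2, C3, D) AS A
GROUP BY C1, C2, C3
HAVING COUNT(C1) > 1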
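Since the original question asked for LINQ, the SQL above could probably be translated to LINQ to DataSet along the following lines. This is only a minimal sketch under some assumptions that the thread does not confirm: the columns are named C1, C2, C3 and D and hold int values, there is a string column called SetName, and "similar" means exactly equal condition values. ConflictFinder, FindConflicts and BuildSample are hypothetical names used for illustration. On .NET Framework the project would also need a reference to System.Data.DataSetExtensions for AsEnumerable() and Field<T>().

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ConflictFinder
{
    // Returns every row whose (C1, C2, C3) values also appear
    // in at least one row with a different D value.
    public static List<DataRow> FindConflicts(DataTable table)
    {
        return table.AsEnumerable()
            // Anonymous types compare by value, so rows with equal
            // C1..C3 end up in the same group.
            .GroupBy(r => new
            {
                C1 = r.Field<int>("C1"),
                C2 = r.Field<int>("C2"),
                C3 = r.Field<int>("C3")
            })
            // Keep groups with more than one distinct decision value
            // (the role played by HAVING COUNT(...) > 1 in the SQL).
            .Where(g => g.Select(r => r.Field<int>("D")).Distinct().Count() > 1)
            // Flatten the conflicting groups back into rows.
            .SelectMany(g => g)
            // Materialize so the rows can be removed from the table afterwards.
            .ToList();
    }

    // Builds the three example sets from the question (assumed schema).
    public static DataTable BuildSample()
    {
        var table = new DataTable();
        table.Columns.Add("SetName", typeof(string));
        table.Columns.Add("C1", typeof(int));
        table.Columns.Add("C2", typeof(int));
        table.Columns.Add("C3", typeof(int));
        table.Columns.Add("D", typeof(int));
        table.Rows.Add("Set1", 1, 2, 2, 3);
        table.Rows.Add("Set2", 2, 1, 1, 2);
        table.Rows.Add("Set3", 1, 2, 2, 2);
        return table;
    }
}
```

For the sample data in the question, FindConflicts should return the Set1 and Set3 rows. Because the result is a materialized list, the conflicting rows can then be dropped with a plain loop such as foreach (var row in conflicts) table.Rows.Remove(row); which, for the sample, leaves only Set2 in the table.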
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


