Comparing two different values column values for matching records

Question

0.00/5 (No votes)

See more:

Hi - I have a requirement to match two different columns and delete the matching records.

say in column A, we have some 60000 records(contains duplicate) and in column B we have 900 records(unique).

Here i need to match(not row by row) 900 records (Col B) with that of 60000 records(col A) to know the matching counts (including duplicate records) in column A.

Could you please help me on this?

What I have tried:

i tried with some vlookup ,it doesnt work out

Posted 27-Mar-18 7:51am

GJSS

Updated 27-Mar-18 9:11am

Add a Solution

Comments

RedDk 27-Mar-18 14:07pm

Ok, here's a tip from a MSOffice adherent ... Excel is a spreadsheet application and Access is a database application. Because there are similarities between the two, mainly the external appearance of "cell" or "range" to "rowset" or "column"/"row", a new user becomes quickly confused.

If you are in need of an editing facility that will do calculations with digital input ... use Excel. If you are in need of relational manipulations based upon fields and rows ... use Access.

All this is clear in the MS Office documentation. The fact that you are mentioning "record" leads me to think Access would be of more use in this case.

GJSS 27-Mar-18 14:41pm

Can you assist me in using Access?

RedDk 27-Mar-18 15:33pm

Ok. Start by opening Access and attempting to use it to open your spreadsheet. This is a one-way process, by the way. There's no opening an Access database using Excel.

GJSS 5-Apr-18 11:17am

Hi - I have accomplished this task by using the subquery running in SQL view in access DB.
Thanks for your assistance.

2 solutions

Solution 1

This is the typical situation where Excel is just not the right tool.
Your problem is typical for a database like Access or any other database engine.

It is impossible to help you without exact specification of your problem and of the data.
And it is impossible to teach you database in the space of this forum, and few directions will never be enough.

Advice: learn databases or find someone to do the job.

Posted 27-Mar-18 9:09am

Patrice T

Updated 27-Mar-18 10:10am

v2

Comments

Maciej Los 27-Mar-18 16:00pm

Who downvoted your answer? Upvoted to balance unfair vote!

Patrice T 27-Mar-18 16:08pm

Thank you

Add a Solution

Add your solution here

Treat my content as plain text, not as HTML

Preview 0

…

Existing Members

Sign in to your account

...or Join us

Download, Vote, Comment, Publish.

Your Email
Password
Forgot your password?

Your Email
This email is in use. Do you need your password?
Optional Password

I have read and agree to the Terms of Service and Privacy Policy
Please subscribe me to the CodeProject newsletters

When answering a question please:

Read the question carefully.
Understand that English isn't everyone's first language so be lenient of bad spelling and grammar.
If a question is poorly phrased then either ask for clarification, ignore it, or edit the question and fix the problem. Insults are not welcome.
Don't tell someone to read the manual. Chances are they have and don't get it. Provide an answer or move on to the next question.

Let's work to help developers, not make them feel stupid.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Maciej Los · Accepted Answer · 2018-03-27T09:13:00

Solution 2

It's easy to achieve by using array formula[^]. How to: Create an array formula - Office Support[^].

I'd suggest to move unique data into - let's say - Sheet2 - column A. Then in Sheet1 (with non-unique data) - column B add the following formula (assuming that a first row is a header row):
= SUM(IF(A2=Sheet2!A$2:A$901,1,0))
Accept it by pressing combination of keys: Ctrl+Shift+Enter
Fill-down above formula till the end of data in Sheet1.

This will return the number of occurencies: 0 - none (new - unique), >=1 - exits (duplicated)

Finally, add filter to column B (higher than zero) and remove duplicated data ;)

Good luck!

Posted 27-Mar-18 9:13am

Maciej Los

Comments

Patrice T 27-Mar-18 15:33pm

The problem with those formulas is that runtime grow geometrically with data size.

Maciej Los 27-Mar-18 15:39pm

For such portion of data, a risk of unrestricted grow of RAM consumption is minimal. I worked out with much bigger data set.
;)

Patrice T 27-Mar-18 15:43pm

For such problems, I always switch to database.

Maciej Los 27-Mar-18 15:57pm

As to me - it depends... I'm constructively lazy, so i use that tool which is more-handy to me. ;)