Hi Everyone,

Can you please provide a regular expression for the requirement below?

In the application, the user provides a description. If the description contains an SSN or EIN (for example 049-90-1935, 149901935, or 14-9901935), that value should be replaced with "xxxxxxxx".

Below are the valid formats to replace.
049-90-1935
ANR049-90-1935
049-90-1935BIH
049901935
14-9901935BIN


Invalid formats:
0141234567897 Invalid. Should not be replaced. The sequence is longer than 9 digits.
0141234 Invalid. Should not be replaced. The sequence is shorter than 9 digits.
123456789abc123456789 Invalid.



(1) The data consists of 9 digits only.
(2) The data is delimited by either a SPACE or a non-digit character.
(3) Embedded space(s) separating digits should be ignored.
(4) For SSNs, the acceptable formats are
    a. xxx-xx-xxxx or
    b. xxxxxxxxx, where x represents a digit (0 through 9)
(5) For EINs, the acceptable formats are
    a. xx-xxxxxxx or
    b. xxxxxxxxx, where x represents a digit (0 through 9)


What I have tried:

I tried with the following regular expression.

<pre>string ssnPattern = @"[^0-9](?<grpA>\d{9}) |[^0-9](?<grpB>\d{3}-\d{2}-\d{4}) | [^0-9](?<grpC>\d{2}-\d{7}) ";

var matches = Regex.Matches(description, ssnPattern).Cast<Match>().ToList();
if (matches != null && matches.Count > 0)
{
    redactionText = Regex.Replace(content, ssnPattern, m =>
    {
        string val = "";
        if (m.Groups["grpA"] != null && m.Groups["grpA"].Value != "")
        {
            val = m.Value.Replace(m.Groups["grpA"].Value, req.ReplacementText);
            matchingItems.Add(m.Groups["grpA"].Value);
        }
        if (m.Groups["grpB"] != null && m.Groups["grpB"].Value != "")
        {
            val = m.Value.Replace(m.Groups["grpB"].Value, req.ReplacementText);
            matchingItems.Add(m.Groups["grpB"].Value);
        }
        if (m.Groups["grpC"] != null && m.Groups["grpC"].Value != "")
        {
            val = m.Value.Replace(m.Groups["grpC"].Value, req.ReplacementText);
            matchingItems.Add(m.Groups["grpC"].Value);
        }

        return val;
    });
}</pre>



I am unable to handle all the scenarios. Can anyone please provide a solution?
Posted
Updated 18-May-19 13:29pm

(2)The data is delimited by either a SPACE or non-digit character.
(3)Embedded space(s) separating digits should be ignored.

So how would you resolve something like "0499019 35"?
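To make the ambiguity concrete, here is a minimal sketch of what a literal reading of rule 3 implies. The class name, the method name, and the pattern itself are hypothetical illustrations, not something posted in this thread; the pattern assumes that "ignoring embedded spaces" means allowing any number of spaces between consecutive digits.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmbeddedSpaceDemo
{
    // Hypothetical pattern: nine digits, each optionally preceded by spaces (rule 3),
    // with no digit directly before or after the match (one reading of rule 2).
    private static readonly Regex SpacedNineDigits =
        new Regex(@"(?<!\d)\d(?: *\d){8}(?!\d)");

    public static bool LooksLikeNineDigits(string text)
    {
        return SpacedNineDigits.IsMatch(text);
    }

    public static void Main()
    {
        // Seven digits, a space, two digits: counted as nine digits once spaces are ignored.
        Console.WriteLine(LooksLikeNineDigits("0499019 35"));              // True
        // The same digits inside ordinary text are picked up as well.
        Console.WriteLine(LooksLikeNineDigits("Unit 0499019 35 Main St")); // True
        // Seven digits on their own are still too short.
        Console.WriteLine(LooksLikeNineDigits("0141234"));                 // False
    }
}
```

Under this reading, a space can never act as a delimiter between two digit groups, which is exactly the conflict between rules 2 and 3.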
 
Comments
suman palla 20-May-19 4:06am    
Should not be replaced. The sequence is shorter than 9 digits.
Richard MacCutchan 20-May-19 4:38am    
Rule 3 states "ignore embedded spaces in between digits". So how do you tell the difference between an embedded space and the end of a string?
suman palla 31-May-19 11:18am    
The data is delimited by either a SPACE or non-digit character.
Richard MacCutchan 31-May-19 12:50pm    
Embedded space(s) separating digits should be ignored.
So how do you tell whether it is the end of the string or an embedded space?
suman palla 20-May-19 5:11am    
Input: "123 Street123456789, 123 Street12-1234567, 123 Stree123-12-1234 123456789Abc abc1234567989 abc123456789abc"

Output: 123 Street[Replaced], 123 Street[Replaced], 123 Stree[Replaced] [Replaced]Abc abc[Replaced] abc123456789abc
Quote:
(2)The data is delimited by either a SPACE or non-digit character.

The problem is that your examples do not follow rule 2.
Quote:
(3)Embedded space(s) separating digits should be ignored.

This rule will really complicate RegEx usage.
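As a starting point, here is a minimal sketch that leaves rule 3 out and reads rule 2 as "the match must not be directly preceded or followed by a digit". The class name `SsnEinRedactor`, the method `Redact`, and the pattern are assumptions made for illustration, not a solution confirmed by the original poster.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SsnEinRedactor
{
    // Hypothetical pattern: the three layouts from rules 4 and 5
    // (xxx-xx-xxxx, xx-xxxxxxx, xxxxxxxxx), guarded by digit lookarounds
    // so that longer digit runs are not partially matched.
    private static readonly Regex Pattern = new Regex(
        @"(?<!\d)(?:\d{3}-\d{2}-\d{4}|\d{2}-\d{7}|\d{9})(?!\d)");

    public static string Redact(string description, string replacement = "xxxxxxxx")
    {
        // A MatchEvaluator keeps the replacement literal, even if it contains '$'.
        return Pattern.Replace(description, m => replacement);
    }

    public static void Main()
    {
        string[] samples =
        {
            "049-90-1935", "ANR049-90-1935", "049-90-1935BIH", "049901935",
            "14-9901935BIN", "0141234567897", "0141234", "123456789abc123456789"
        };
        foreach (string sample in samples)
        {
            Console.WriteLine(sample + " -> " + Redact(sample));
        }
    }
}
```

With this pattern the five valid samples from the question are redacted, and `0141234567897` and `0141234` are left alone. `123456789abc123456789` is redacted twice, because each nine-digit run is delimited by non-digits as rule 2 allows, even though the question lists it as invalid. That mismatch has to be settled in the requirements before any pattern can satisfy every example.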

Here are a few useful links for building and debugging regular expressions.
RegEx documentation:
perlre - perldoc.perl.org
Tools that help build and debug RegEx:
.NET Regex Tester - Regex Storm
Expresso Regular Expression Tool
RegExr: Learn, Build, & Test RegEx
Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript
This one shows the RegEx as a graph, which is really helpful for understanding what a RegEx is doing: Debuggex: Online visual regex tester. JavaScript, Python, and PCRE.
This site also shows the RegEx as a graph, but it cannot test what the RegEx matches: Regexper
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


