Hi everyone,

I have text in several formats, like the samples below.

Salary is 3.6L PA
Salary is 3.5 LPA
Salary is 30,000KPM
Salary is 30,000 KPM
Experience: 3-5years
Experience: 3+ years



I need to extract the salary value (for example 3.5 or 30,000) and the experience, or at least the minimum experience (for example 3 years). When the number is separated from the surrounding text by spaces, the experience extraction works, but the salary extraction does not. And when there is no space between the experience "3" and the "+", I can't get a result at all.

Can anyone please suggest logic for extracting both the salary and the experience?

The only condition is that "Salary" and its amount are always on the same line, and "Experience" and its value are also always on the same line.

Thanks in advance.


This is my sample code for experience.

C#
string keyword = KeyWords[i].ToLower();
index_mime = emailInLowerCase.IndexOf(keyword);
if (index_mime != -1)
{
    // The value starts after the keyword plus two separator characters (e.g. ": ").
    int valueStart = index_mime + keyword.Length + 2;
    index_Termination = EmailBodyForKeyWords.IndexOf("\r\n", valueStart);
    if (index_Termination == -1)
    {
        // The keyword is on the last line, which has no trailing line break.
        index_Termination = EmailBodyForKeyWords.Length;
    }
    string _FetchExperience = EmailBodyForKeyWords.Substring(valueStart, index_Termination - valueStart).Trim();

    // Splitting on spaces only finds numbers that stand alone as tokens,
    // so "3-5years" and "3+" yield no number at all.
    string[] ExperienceStructure = _FetchExperience.Split(' ');
    int exp1 = 0;
    int exp2 = 0;
    for (int j = 0; j < ExperienceStructure.Length; j++)
    {
        int Num;
        if (!int.TryParse(ExperienceStructure[j].Trim(), out Num))
        {
            continue;
        }

        // The first number found goes to exp1, any later one to exp2.
        if (exp1 == 0)
        {
            exp1 = Num;
        }
        else
        {
            exp2 = Num;
        }

        // Order the two values; a single number serves as both bounds.
        ExperienceFrom = Math.Min(exp1, exp2);
        ExperienceTo = Math.Max(exp1, exp2);
        if (ExperienceFrom == 0)
        {
            ExperienceFrom = ExperienceTo;
        }
        else if (ExperienceTo == 0)
        {
            ExperienceTo = ExperienceFrom;
        }
    }
}
Updated 3-Feb-11 2:52am
Comments
[no name] 3-Feb-11 8:36am    
What are you using to parse this? Showing some of the code may help.
arindamrudra 3-Feb-11 8:53am    
I have given the sample code for Experience.
arindamrudra 3-Feb-11 8:56am    
If you need the code for Salary as well, I will post that too. I checked using "isNum", but it works only when there are spaces. I want logic that also handles the cases without spaces.
sapien4u 3-Feb-11 9:18am    
You could enforce a common format for data entry so that you can track and search it easily.
arindamrudra 4-Feb-11 1:17am    
There is no common format, because forcing users to follow a particular format is not a good approach. These are optional fields.

Did you think of using regular expressions?

For salary
.*Salary.*\s(?<salary>\d+(?:,\d+)*(?:\.\d+)?).*

For experience
.*Experience.*\s(?<experience>\d+\s*(?:\+|(?:\s*-\s*\d+)?)).*

You can then easily extract the data from the match results.
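As a rough sketch of how the two patterns above might be used from C#: the class and method names here (`JobFieldParser`, `ParseSalary`, `TryParseExperience`) are made up for illustration, and the use of `RegexOptions.IgnoreCase` and invariant-culture number parsing are assumptions rather than part of the answer.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class JobFieldParser
{
    // Patterns copied from the answer; IgnoreCase is an added assumption.
    private static readonly Regex SalaryPattern = new Regex(
        @".*Salary.*\s(?<salary>\d+(?:,\d+)*(?:\.\d+)?).*",
        RegexOptions.IgnoreCase);

    private static readonly Regex ExperiencePattern = new Regex(
        @".*Experience.*\s(?<experience>\d+\s*(?:\+|(?:\s*-\s*\d+)?)).*",
        RegexOptions.IgnoreCase);

    // Returns the salary amount, or null when the line does not match.
    public static decimal? ParseSalary(string line)
    {
        Match m = SalaryPattern.Match(line);
        if (!m.Success)
        {
            return null;
        }
        // AllowThousands accepts "30,000"; InvariantCulture treats "," as the
        // group separator and "." as the decimal point.
        return decimal.Parse(
            m.Groups["salary"].Value,
            NumberStyles.AllowThousands | NumberStyles.AllowDecimalPoint,
            CultureInfo.InvariantCulture);
    }

    // Reads "3-5", "3+" or "3" from the matched group. A single number is
    // used for both bounds, as in the question's code.
    public static bool TryParseExperience(string line, out int minYears, out int maxYears)
    {
        minYears = 0;
        maxYears = 0;
        Match m = ExperiencePattern.Match(line);
        if (!m.Success)
        {
            return false;
        }
        // The group always starts with digits, so at least one number exists.
        MatchCollection numbers = Regex.Matches(m.Groups["experience"].Value, @"\d+");
        minYears = int.Parse(numbers[0].Value, CultureInfo.InvariantCulture);
        maxYears = numbers.Count > 1
            ? int.Parse(numbers[1].Value, CultureInfo.InvariantCulture)
            : minYears;
        return true;
    }
}
```

With the sample lines from the question, `ParseSalary("Salary is 30,000KPM")` should give 30000 and `TryParseExperience("Experience: 3-5years", ...)` should give 3 and 5. Because `.*` is greedy, a line with spaces around the hyphen, such as "Experience: 3 - 5 years", would capture only the last number, so the pattern may need tightening for that kind of input.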
 
 
Comments
arindamrudra 3-Feb-11 23:07pm    
Yes, I have thought about regex, but the formats vary, which is why I had not tried it. I am going to try it now. Thanks for your response.
arindamrudra 4-Feb-11 1:15am    
Using the first expression, I get this error: "parsing ".*Salary.*\s(?\d+(?:,\d+)*(?:\.\d+)?).*" - Unrecognized grouping construct".
Prerak Patel 4-Feb-11 2:40am    
Sorry, there were groups like <experience> which were not displayed in the answer. Modified the answer now.
arindamrudra 4-Feb-11 3:45am    
Thanks a lot, it's working perfectly. Great answer. Cheers...
Sergey Alexandrovich Kryukov 4-Feb-11 1:27am    
Lazy to check it. Right idea - my 5.
The whole design sucks, I have no doubt.
--SA

I didn't study the code completely.
What I found was that you are splitting the entire string on spaces.
Instead, you can take the index of "Experience:" (or of the colon), say X, and the index of "years", say Y. Then take the substring of the main string from X up to Y, and you will get the correct experience value. You can replace the hyphen, dot, or anything else in between with whatever you need.

Use similar logic for Salary: find the indexes and take the substring between them.
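The index-and-substring idea described here could look roughly like the following. This is only a sketch: `SubstringExtractor`, `Between` and `ToRange` are hypothetical names, and it assumes the line really contains both markers (for example "Experience:" and "years").

```csharp
using System;

// Hypothetical helper; the names are illustrative only.
public static class SubstringExtractor
{
    // Returns the trimmed text between the first startMarker and the next
    // endMarker, or null if either marker is missing.
    public static string Between(string line, string startMarker, string endMarker)
    {
        int x = line.IndexOf(startMarker, StringComparison.OrdinalIgnoreCase);
        if (x < 0)
        {
            return null;
        }
        x += startMarker.Length; // skip past the marker itself
        int y = line.IndexOf(endMarker, x, StringComparison.OrdinalIgnoreCase);
        if (y < 0)
        {
            return null;
        }
        return line.Substring(x, y - x).Trim();
    }

    // Turns "3-5" into {3, 5} and "3+" or "3" into {3, 3}.
    // Throws FormatException if the parts are not integers.
    public static int[] ToRange(string value)
    {
        string[] parts = value.TrimEnd('+').Split('-');
        int min = int.Parse(parts[0].Trim());
        int max = parts.Length > 1 ? int.Parse(parts[1].Trim()) : min;
        return new[] { min, max };
    }
}
```

For example, `Between("Experience: 3-5years", "Experience:", "years")` should return "3-5", which `ToRange` splits into 3 and 5. The approach depends on the end marker being spelled exactly as expected, so a line ending in "yrs" would not match.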
 
 
Comments
arindamrudra 3-Feb-11 23:17pm    
I could not understand your answer. I search for keywords such as Experience, Compensation, Salary and CTC from "KeyWords[i]", and then use Substring to get the whole line, which contains the experience in one of the different formats. My challenge is to find the integer values in that string for Experience, and the decimal values in the Salary line. Thanks for your response.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


