I have a string:

name = "ThisThis is tested by noornoor"

Is there any way to remove the repeated part of a word? For example, my string contains "ThisThis"; can I remove the second "This"?

It's not a trivial task. For instance, how can your application tell that 'oo' in 'noornoor' isn't itself a repetition that should be removed?
:)
 
You could take each word, divide it into two equal-length halves, and check the halves for equality. Of course, this won't fix things like "ThisThisThis"...
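
As a rough C++ sketch of the equal-length-parts idea, the check can be extended to try every part length that divides the word evenly, so that triple repeats are caught as well. The function name `collapseRepeats` is made up for illustration, and the comparison is case-sensitive. Note that it will also shorten real words that happen to be made of a repeated unit.

```cpp
#include <string>

// Hypothetical helper: returns the shortest prefix whose repetition
// rebuilds the whole word, or the word itself if no such prefix exists.
std::string collapseRepeats(const std::string& word) {
    const std::size_t n = word.size();
    for (std::size_t len = 1; len <= n / 2; ++len) {
        if (n % len != 0) continue;  // parts must all have the same length
        bool repeats = true;
        // Every character must equal the one exactly `len` positions earlier.
        for (std::size_t i = len; i < n && repeats; ++i) {
            repeats = (word[i] == word[i - len]);
        }
        if (repeats) return word.substr(0, len);
    }
    return word;
}
```

Calling `collapseRepeats("ThisThisThis")` gives "This", while a word with no repeated unit, such as "tested", comes back unchanged.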

Or you could take each word and, starting with the first letter, check for another instance of the current letter. You only have to check the first n/2 letters to determine if there's a problem.

Like CPallini stated, it's not a trivial task...
 
Hi TanzeelAhmed,

You can easily do this by dividing the word into two halves and comparing them. But what will you do when a real word consists of such a repeated combination of letters (like coco, dodo [bird name], tata, papa, etc.)? In such scenarios this will definitely fail.

Your logic would have to be very complex to handle such scenarios. The call is yours. If this answered your query, please "Mark As Answer".
 
To avoid misidentifying real words, you could maintain a dictionary of valid words and look each word up in it. If the word isn't in the dictionary, you could then run your automatic fix-it code on it.
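
A minimal C++ sketch of that lookup-then-fix flow might look like the following. The names `dropDoubledHalf` and `fixUnlessKnown` are hypothetical, the dictionary is assumed to be an in-memory `std::unordered_set`, and the lookup is case-sensitive, so a real application would probably normalise case first.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical fix-up step: drop the second half of an exactly doubled word.
std::string dropDoubledHalf(const std::string& word) {
    const std::size_t half = word.size() / 2;
    if (word.size() % 2 == 0 && half > 0 &&
        word.compare(0, half, word, half, half) == 0) {
        return word.substr(0, half);
    }
    return word;
}

// Words found in the dictionary are left alone; only unknown words are fixed.
std::string fixUnlessKnown(const std::string& word,
                           const std::unordered_set<std::string>& dictionary) {
    if (dictionary.count(word) > 0) return word;
    return dropDoubledHalf(word);
}
```

With a dictionary containing "dodo" and "papa", `fixUnlessKnown("dodo", dictionary)` leaves the word as it is, while `fixUnlessKnown("noornoor", dictionary)` returns "noor".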
 
Hi John,

Using a dictionary of valid words is a good idea. But he would need a huge collection of real words in that case. Giving users an option to add their own words to the dictionary from within the application would probably be useful. :cool:
 
You could approach this either with regular expressions or directly. The direct method is not at all trivial; regular expressions, however, just might do the trick.

Consider the following rather famous regex example:
\b(\w+)(\s+|$)\1
This matches a word followed by whitespace and then the same word again, i.e. doubled words such as "the the".

In your case you are looking to fix words that contain a doubled run of characters. Consider:
\b(\w{4,})\1
This regular expression finds every word that starts with a run of 4 or more word characters immediately followed by the same run again. You will need a little experimentation to get exactly what you want.

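The following code takes care of the replacement.
using System.Text.RegularExpressions;

string name = "ThisThis is tested by noornoor";
// \1 refers back to the captured group, so each doubled run is replaced by a single copy.
Regex rx = new Regex(@"\b(\w{4,})\1");
string smallName = rx.Replace(name, "$1"); // "This is tested by noor"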
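
The snippet above uses the .NET Regex class. If you are working in C++ instead, roughly the same thing can be sketched with `std::regex`, whose default ECMAScript grammar also supports `\b`, `\w` and the `\1` backreference. The function name `collapseDoubledRuns` is made up for this example.

```cpp
#include <regex>
#include <string>

// Replaces every doubled run of 4+ word characters at the start of a word
// with a single copy of that run. The input string itself is not modified.
std::string collapseDoubledRuns(const std::string& text) {
    static const std::regex doubled(R"(\b(\w{4,})\1)");
    return std::regex_replace(text, doubled, "$1");
}
```

Passing "ThisThis is tested by noornoor" to `collapseDoubledRuns` returns "This is tested by noor".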


I would add a note of caution. Until you are absolutely certain that your regex is working perfectly, do not allow permanent alteration of your data in an automated way. From personal experience, I have seen the results of trying to be a bit too clever with regular expressions. That said, if you use strings a lot, regular expressions are really nice to know.
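
One way to act on that caution is to list what the \b(\w{4,})\1 pattern would change before changing anything. The following C++ sketch (using `std::regex`, with the hypothetical name `previewCollapses`) only reports each match alongside its proposed replacement, so the results can be reviewed by hand first.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Returns (matched text, proposed replacement) pairs without altering the input.
std::vector<std::pair<std::string, std::string>>
previewCollapses(const std::string& text) {
    static const std::regex doubled(R"(\b(\w{4,})\1)");
    std::vector<std::pair<std::string, std::string>> proposals;
    for (std::sregex_iterator it(text.begin(), text.end(), doubled), end;
         it != end; ++it) {
        // str(0) is the whole doubled run, str(1) is the single captured copy.
        proposals.emplace_back(it->str(0), it->str(1));
    }
    return proposals;
}
```

For "ThisThis is tested by noornoor" this yields two proposals: "ThisThis" to "This" and "noornoor" to "noor".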

Hope this helps
Ken
 
Can you post an example of how to do this?
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


