Hi,

I have a string in which a sequence of characters is immediately followed by the same sequence, and I want the program to remove the repetition. For example, for the string 'abcdabcd' I need 'abcd'.

How can I do that?

Thanks in advance.
Updated 16-May-13 3:06am
Comments
Nani Babu 16-May-13 9:30am    
I have one query: in the above string "abcdabcd", 'abc', 'ab', and 'bc' are also repeated. On what basis do you want to pick the repeated word (length, or something else)?
pradeep manne 16-May-13 10:21am    
Hi,
thanks for the reply.
In the string "abcdabcd" I need the repeated word;
for example, from "repeatedrepeated" I need only the one word "repeated".
Walby 16-May-13 10:43am    
Hi pradeep manne, do you already know the repeated phrase you are looking for, or do you need to find all repeated words in a string? E.g. should "abcdabcd" return "ab" and "cd"?

Regex is your friend!
C#
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;
namespace ConsoleApplication16
{
  class Program
  {
    static readonly string[] Tests = { "abcdabcd", "xabcdabcd", "abcdabc", "xaaabcdabcd" };
    // (.+) captures one or more characters; \1 requires the same text to follow immediately.
    static readonly Regex FindDup = new Regex(@"(.+)\1", RegexOptions.IgnoreCase);
    static void Main(string[] args)
    {
      foreach (string t in Tests)
      {
        MatchCollection allMatches = FindDup.Matches(t);
        Trace.WriteLine(string.Format("{0}: {1}", t, allMatches.Count));
      }
    }
  }
}

Results:
abcdabcd: 1
xabcdabcd: 1
abcdabc: 0
xaaabcdabcd: 2


This will also identify what the matching strings are, in the allMatches collection.

EDIT: Added in response to Collin's comments:
Each Match in allMatches holds the information about the doubled text in its Groups property. Groups[0] contains the whole matched string (both copies), and Groups[1] contains a single copy.
If you change the loop above to:
C#
foreach (string t in Tests)
{
  MatchCollection allMatches = FindDup.Matches(t);
  Trace.WriteLine(string.Format("{0}: {1}", t, allMatches.Count));
  foreach (Match item in allMatches)
  {
    Trace.WriteLine(string.Format(@"  ""{0}"" is doubled", item.Groups[1]));
  }
}

You'll see that.
If the objective is to remove the duplication while keeping one copy, the Replace() method of the Regex will do the job:
C#
string t2 = FindDup.Replace(t, "$1");
Trace.WriteLine(string.Format(@"Final: ""{0}""", t2));

Here "$1" substitutes the single copy captured in Groups[1]. Passing string.Empty instead removes both copies of the doubled text.
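For anyone who wants to try the replacement on its own, here is a minimal sketch that keeps one copy of each doubled run. The class and method names (DoubledText, CollapseOnce) are made up for illustration and are not part of the answer above; the pattern is the same FindDup regex.

```csharp
using System.Text.RegularExpressions;

public static class DoubledText
{
    // Same pattern as in the answer: one or more characters (group 1)
    // immediately followed by an identical copy of themselves.
    private static readonly Regex FindDup = new Regex(@"(.+)\1", RegexOptions.IgnoreCase);

    // Hypothetical helper: replaces each doubled run with a single copy
    // by substituting capture group 1 ("$1") for the whole match.
    public static string CollapseOnce(string input)
    {
        return FindDup.Replace(input, "$1");
    }
}
```

With the four test strings from the answer, CollapseOnce gives "abcd", "xabcd", "abcdabc" (unchanged) and "xaabcd". Note that the last result still contains "aa": the matches are non-overlapping, so only one pair out of "aaa" gets collapsed in a single pass. If that matters, one option is to call the method again until the string stops changing.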
 
Comments
[no name] 16-May-13 13:24pm    
I am failing to comprehend how this achieves at all what the OP wants. The OP stated there is a sequence repeated and he wants what the sequence is.

Your results have nothing to do with this. While I am sure regex can provide a solution (I am not great with its black arts), you have failed to use it appropriately to get what the desired result is. Which in fact is often the failure of regex...

"If I jump up and down and chop a chicken's head off then somehow a pig will fly"
"Ahhh... I just wanted a piece of bacon".
I haven't yet voted on yours but if you can edit your post and show how you get the desired sequence string I will happily give you 5 (I felt regex would do it and be highly appropriate but as I said I am unfamiliar with the black arts)
Matt T Heffron 16-May-13 13:57pm    
I've added clarification on usage of the Regex.
[no name] 16-May-13 13:58pm    
Kewl! Have a 5 you dark wizard from beyond :-)

If the string consists of a single word repeated exactly twice, you can try the following:

C#
string str = "abcdabcd";
string temp = str.Substring(0, str.Length / 2);
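
The Substring call above takes it on trust that the input really is one word doubled. A minimal sketch of a guarded version follows, assuming you would rather get null back when the two halves differ. The names HalfSplit and FirstHalfIfDoubled are hypothetical.

```csharp
using System;

public static class HalfSplit
{
    // Hypothetical helper: returns the first half only when the string is
    // exactly two back-to-back copies of it; otherwise returns null.
    public static string FirstHalfIfDoubled(string str)
    {
        // An empty or odd-length string cannot be two equal halves.
        if (string.IsNullOrEmpty(str) || str.Length % 2 != 0)
            return null;

        int half = str.Length / 2;
        // Ordinal (case-sensitive) comparison of the first half against the second half.
        bool doubled = string.CompareOrdinal(str, 0, str, half, half) == 0;
        return doubled ? str.Substring(0, half) : null;
    }
}
```

FirstHalfIfDoubled("abcdabcd") returns "abcd", while "abcdabc" and "abcdabce" both return null.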
 
I am making some assumptions about what you are looking for (your question isn't entirely clear), but I think this does the trick.

You can build up a candidate prefix with a System.Text.StringBuilder and split the original string on it. Once every piece returned by the split is an empty string, the prefix is your repeated string and you can break out of the loop.

C#
using System;
using System.Linq; // needed for Any()

string val = "abcabcabc";
System.Text.StringBuilder sb = new System.Text.StringBuilder();
string result = string.Empty;
foreach (var c in val)
{
   sb.Append(c);
   var parsed = val.Split(new string[] {sb.ToString()}, StringSplitOptions.None);
   var stringFound = !parsed.Any(s => s != string.Empty);//All items are empty
   
   if (stringFound)
   {
      result = sb.ToString();
      break;
   }
}


After this runs, the repeated string will be in result. Note that the loop breaks at the first match because the test is also satisfied once sb contains every character of the original string. The algorithm handles any number of repetitions but assumes the string contains nothing except the repeated sequence.
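
To make the caveat above easier to check, here is the same loop wrapped in a method, as a rough sketch. The names RepeatedUnit and Find are hypothetical, and All() is used in place of the negated Any(), which is equivalent.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RepeatedUnit
{
    // Grows a prefix one character at a time and returns the first prefix
    // for which splitting the input on it leaves only empty pieces.
    public static string Find(string val)
    {
        var sb = new StringBuilder();
        foreach (char c in val)
        {
            sb.Append(c);
            string[] parsed = val.Split(new[] { sb.ToString() }, StringSplitOptions.None);
            if (parsed.All(s => s.Length == 0))
                return sb.ToString();
        }
        return string.Empty; // only reached when val is empty
    }
}
```

Find("abcabcabc") returns "abc" and Find("abcdabcd") returns "abcd". For an input with no repetition, such as "abcdabc", the whole string comes back, so comparing the length of the result with the length of the input tells you whether anything was actually repeated.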
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


