Click here to Skip to main content
15,890,282 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
Hi, I'm trying out a sample problem on strings wherein I take a string as input and find out the similarity of each of its suffix with itself.

For two strings A and B, the similarity of the strings is the length of the longest prefix common to both strings. For example, the similarity of strings “abc” and “abd” is 2, while the similarity of strings “aaa” and “aaab” is 3.

For example, if the input string is ababaa. Then its suffixes would be ababaa, babaa, abaa, baa, aa, a. The corresponding similarities would be 6, 0, 3, 0, 1, 1. These are added and printed on the console.

Now in my problem, I'm taking input as follows. First line is the number of strings whose sum of similarities (with each of it's suffixes) have to be printed. In the next lines, I enter that many number of strings. I have written code for the problem and it is running fine for normal inputs. But when i give a very large string, it is giving the following runtime exception message "Could not allocate 8192 bytes". My code is as follows

C#
using System;
using System.Collections.Generic;
using System.IO;
class Solution {
    static List<int> listOfSimilarities = new List<int>();
    static void Main(String[] args) {
        
        int t = 0;
        
        t = Convert.ToInt32(Console.ReadLine());
        
        if( t >= 1 && t <=10)
        {
            
            List<string> listOfStrings = new List<string>();
            string input = string.Empty;
            
            for(int i = 1 ; i <= t ; i++)
            {
                
                input = Console.ReadLine().ToString();
                if(input.Length < 100000)
                {
                    listOfStrings.Add(input);
                }
            }
            
            foreach(string str in listOfStrings)
            {
                List<string> listOfSuffixes = new List<string>();
                listOfSuffixes = GetSuffixes(str);
                
                List<int> listOfSimilaritiesForEachSuffix = new List<int>();
                listOfSimilaritiesForEachSuffix = GetSimilaritiesWithString(str, listOfSuffixes);
                
                int sumOfSimilarities = GetSumOfSimilarities(listOfSimilaritiesForEachSuffix);
                
                listOfSimilarities.Add(sumOfSimilarities);
                
            }
            
            foreach(int similarity in listOfSimilarities)
            {
                Console.Write(similarity + "\n");
            }
        }
    }
    
static int GetSumOfSimilarities(List<int> listOfSimilaritiesForEachSuffix)
    {
        int result = 0;
        foreach(int i in listOfSimilaritiesForEachSuffix)
        {
            result += i;
        }
        return result;
    }
    
static List<int> GetSimilaritiesWithString(string str, List<string> listOfSuffixes)
    {
        List<int> listOfSimilaritiesForEachSuffix = new List<int>();
        char[] arrayOfActualString = str.ToCharArray();
        foreach(string suffix in listOfSuffixes)
        {
            int count = 0;
            char[] suffixArray = suffix.ToCharArray();
            for(int i = 0; i < suffixArray.GetLength(0) ; i++)
            {
                if(suffixArray[i] == arrayOfActualString[i])
                {
                    count++;
                }
                else
                {
                    break;
                }
            }
            listOfSimilaritiesForEachSuffix.Add(count);
        }
        return listOfSimilaritiesForEachSuffix;
    }
    
static List<string> GetSuffixes(string str)
    {
        List<string> listOfSuffixes = new List<string>();
        for(int i = 0; i < str.Length ; i++)
        {
            listOfSuffixes.Add(str.Substring(i));
        }
        return listOfSuffixes;
    }
}


Any idea as to why this is happening?
Posted
Comments
Zoltán Zörgő 11-May-13 15:38pm    
In which code line do you get this exception?

When I run your code with 99,999 byte strings (I cheated and read it from a file) I get "out of system memory" on the first run through GetSuffixes, so start by looking at that - and what do we find?

Input string length: 99,999 bytes. "i" is 8,847, so it doesn't get near the end of the string.

Look at the code: each time round, you create and store a string which is one character shorter than the previous one. So the strings you create with Substring are:
99999 chars
99998 chars
99997 chars
...
So by the time you get the error, you have directly allocated 845,652,024 characters, each of two bytes (unicode), or 1.6 GB in string space. Add to that the Object overhead for each string, and the List space you are using to store the references, and you are surprised when it starts to have problems?

I strongly suggest that you go back a stage or two in your algorithm, and think again about exactly how you should do this.
 
Share this answer
 
Instead of storing all sub-strings (suffixes) in list object, calculate the sum of similarities for each suffix. I could not figure out your purpose of calculating the sum of similarity. You can look into KMP failure function if that helps.
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

  Print Answers RSS


CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900