Click here to Skip to main content
15,891,184 members
Please Sign up or sign in to vote.
1.00/5 (1 vote)
See more:
This implementation of strstr uses hashing, the choice of hash function below is obviously not perfect, that's why I want to improve it; the full program is the following:

C#
#include <stdlib.h>
#include <stdio.h>

const char *
strstr(const char *str1, const char *str2)
{
    unsigned hash1 = 0, hash2 = 0;
    unsigned i, j, str2_len;

    for (i = 0; str2[i] != '\0'; i++)
    {
        hash2 += (unsigned) str2[i];
        if (str1[i] == '\0')
            return NULL;
        hash1 += (unsigned) str1[i];
    }

    str2_len = i;

    for (i=0; str1[i] != '\0'; i++)
    {
        if (hash1 == hash2)
        {
            for (j=0; j < str2_len; j++)
                if (str1[i+j] != str2[j])
                    break;
            if (j == str2_len)
                return str1+i;
        }
        if (str1[i+str2_len] == '\0')
            return NULL;
        hash1 = (hash1 - ((unsigned)str1[i])) + ((unsigned)str1[i+str2_len]);
    }
}

int
main(int argc, char **argv)
{
    const char *match;

    char buffer[] = "abc abcdef";
    char sub[] = "abcdef";

    match = strstr(buffer, sub);

    if (match)
        printf("%s\n", match);
    else
        printf("NOT FOUND\n");

    return 0;
}


A better hash function would improve the run time of this algorithm (reduce the number of times str2 is compared to a substring of str1).

Any suggestions that contributes to the improvement the algorithm, It will be well received

Thanks.
Posted
Comments
PIEBALDconsult 15-Aug-15 13:55pm    
I didn't read all that, but I hope you are storing the hash, otherwise hashing costs you more rather than less.
[no name] 17-Sep-15 15:05pm    
The code computes hash2 only once.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900