FileDiff Contest Entry

3.33/5 (5 votes)

Aug 12, 2009

CPOL

Text Difference between two files

Introduction

This is a contest entry for file differences.

Using the Code

This application is pretty basic. It uses FileStream objects to perform its task.

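// ASCII decoder used to print the differing bytes as text.
ASCIIEncoding encode = new ASCIIEncoding();
// Open both input files read-only; the paths come from the command line.
FileStream fileA = File.OpenRead(args[0]);
FileStream fileB = File.OpenRead(args[1]);

int b = 0; // last byte read from file A
int l = 0; // length of the current run of differing bytes in file B

// Start both streams at the beginning.
fileA.Position = 0;
fileB.Position = 0;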

We start off by opening both files and setting the position within each to 0. The variable b holds the last byte read from file A, and the variable l holds the length of the current run of changed bytes in file B.

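while (fileA.Position < fileA.Length)
{
    b = fileA.ReadByte();

    if (fileB.Position < fileB.Length)
    {
        // Remember where this byte of B starts so the differing run can be re-read.
        long start = fileB.Position;

        if (b != fileB.ReadByte())
        {
            l = 1;

            // Scan B until a byte equal to b is found or B ends,
            // counting the differing bytes.
            while (fileB.Position < fileB.Length &&
                fileB.ReadByte() != b)
                l += 1;

            // Re-read exactly the differing run, then resume where the scan stopped.
            long resume = fileB.Position;
            byte[] s = new byte[l];
            fileB.Seek(start, SeekOrigin.Begin);
            fileB.Read(s, 0, l);
            fileB.Seek(resume, SeekOrigin.Begin);

            // fileA.Position has already advanced past b, so Pos is 1-based.
            Console.WriteLine("FileDiff Pos:{0}, Len:{1}, Str:{2}",
                              fileA.Position,
                              l,
                              encode.GetString(s));
        }
    }
}

fileA.Close();
fileB.Close();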

This is the main application loop. As you can see, it steps through file A byte by byte. When the current bytes of the two streams differ, it scans ahead in stream B for the next byte equal to the current byte of stream A, then reports the run of bytes it skipped in B.
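
The scan described above can be sketched over in-memory byte arrays, which makes it easy to try on small inputs. This is only an illustration of the idea, not the entry's code: the class DiffSketch and the method Scan are hypothetical names, and the sketch reports the zero-based offset in A rather than the stream position.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of the original entry.
public static class DiffSketch
{
    // Returns one report line per run of bytes in b that differs from a.
    public static List<string> Scan(byte[] a, byte[] b)
    {
        List<string> report = new List<string>();
        int posB = 0;

        for (int posA = 0; posA < a.Length && posB < b.Length; posA++)
        {
            if (a[posA] == b[posB])
            {
                posB++; // bytes agree: advance both sides
                continue;
            }

            // Bytes differ: scan b for the next byte equal to a[posA].
            int start = posB;
            while (posB < b.Length && b[posB] != a[posA])
                posB++;

            int len = posB - start;
            report.Add(string.Format("Pos:{0}, Len:{1}, Str:{2}",
                posA, len, Encoding.ASCII.GetString(b, start, len)));

            if (posB < b.Length)
                posB++; // step past the byte that matched
        }

        return report;
    }
}
```

For example, scanning "abcdef" against "abXYcdef" yields the single line "Pos:2, Len:2, Str:XY". Note that a substituted byte, such as "abXc" against "abYc", makes the scan run to the end of B, because the byte it is looking for never appears again.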

Points of Interest

This contest entry is written in C# for .NET v2. The source is 73 lines, including blank lines, comments, formatted code, and the timers. Of those, 28 lines are not overhead, blank lines, or comments. It uses 4,384K of memory (Private Working Set), and the EXE is 5.5K.

Run time is roughly 30 milliseconds. The output is the position in the file, the length of the diff, and its textual representation.
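
The timer code itself is not shown in the article. As a rough sketch of how such a run time might be measured, assuming System.Diagnostics.Stopwatch (available since .NET 2.0), the hypothetical helper below times a single call; TimingSketch, Work, and Measure are illustrative names, not part of the entry.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper; the entry's own timer code is not shown.
public static class TimingSketch
{
    // Parameterless delegate, declared here so the sketch also fits .NET 2.0.
    public delegate void Work();

    // Runs the work once and returns the elapsed wall-clock time in milliseconds.
    public static long Measure(Work work)
    {
        Stopwatch watch = Stopwatch.StartNew();
        work();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

A call might look like long ms = TimingSketch.Measure(delegate { /* run the diff here */ });. A single run of a few tens of milliseconds is sensitive to disk caching and JIT warm-up, so repeated runs would give a steadier figure.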