Currently I am using this code to check my filesystem for bad hashes but need to know if there is a better way to implement it so that my scanning speeds return to normal.
VB
Dim Vname As String = "Infected!!  "
Try
    With My.Computer.FileSystem
        If stop_Scan = True Then
            Exit Sub
        End If
        For Each file1 In System.IO.Directory.GetFiles(dir)
            Dim fs As New FileInfo(file1)
            ' Compute all four hashes of the current file.
            LabelX12.Text = mdsc(fs.FullName)
            LabelX11.Text = GetCRC32(fs.FullName)
            LabelX13.Text = getSHA1Hash(fs.FullName)
            LabelX14.Text = getSHA512(fs.FullName)
            ' Each branch re-reads a whole signature file from disk for every scanned file.
            If My.Computer.FileSystem.ReadAllText("Master MD5 SIG.txt").Contains(LabelX12.Text) Then
                CheckedComboBoxEdit1.Properties.Items.Add(Vname + fs.FullName)
                CheckedComboBoxEdit1.Text = "!!Attention Infection(s) Found!!"
                lblVirus.Text = "(Virus) " & CheckedComboBoxEdit1.Properties.Items.Count
            ElseIf My.Computer.FileSystem.ReadAllText("MASTER CRC 32 .txt").Contains(LabelX11.Text) Then
                CheckedComboBoxEdit1.Properties.Items.Add(Vname + fs.FullName)
                CheckedComboBoxEdit1.Text = "!!Attention Infection(s) Found!!"
                lblVirus.Text = "(Virus) " & CheckedComboBoxEdit1.Properties.Items.Count
            ElseIf My.Computer.FileSystem.ReadAllText("SHA1Sig.txt").Contains(LabelX13.Text) Then
                CheckedComboBoxEdit1.Properties.Items.Add(Vname + fs.FullName)
                CheckedComboBoxEdit1.Text = "!!Attention Infection(s) Found!!"
                lblVirus.Text = "(Virus) " & CheckedComboBoxEdit1.Properties.Items.Count
            ElseIf My.Computer.FileSystem.ReadAllText("SHA512Sig.txt").Contains(LabelX14.Text) Then
                CheckedComboBoxEdit1.Properties.Items.Add(Vname + fs.FullName)
                CheckedComboBoxEdit1.Text = "!!Attention Infection(s) Found!!"
                lblVirus.Text = "(Virus) " & CheckedComboBoxEdit1.Properties.Items.Count
            End If
            ' The posted snippet ends here; the closing Next, End With, and the rest of the Try block were not included.

If anyone could tell me how to shorten this or make it more efficient, it would greatly help!

Thank you in advance
Posted
Updated 23-Jul-11 17:53pm

So, you're reading your entire hash file tables on every file you're "scanning"? No wonder it's so slow.

You load the hash tables ONCE and keep all that data in an internal table, probably sorted by hash code to make lookups faster.
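A minimal sketch of the "load once" idea, under a few assumptions that are not in the original post: each signature file holds one hash per line, and an exact match of a whole line is what counts as a hit (the posted code does a substring match on the whole file text instead). It also swaps the sorted table suggested above for a `HashSet(Of String)`, which gives the same in-memory lookup without sorting. `LoadSignatures` and the sample hash strings are made up for illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Module SignatureCacheSketch
    ' Hypothetical helper (not from the original post): reads a signature file
    ' once into a set. Assumes one hash per line.
    Function LoadSignatures(filePath As String) As HashSet(Of String)
        Dim signatures As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)
        For Each line As String In File.ReadLines(filePath)
            Dim hash As String = line.Trim()
            If hash.Length > 0 Then signatures.Add(hash)
        Next
        Return signatures
    End Function

    Sub Main()
        ' Stand-in for a real signature file such as "Master MD5 SIG.txt".
        Dim sigFile As String = Path.GetTempFileName()
        File.WriteAllLines(sigFile, New String() {"0123456789abcdef0123456789abcdef", "fedcba9876543210fedcba9876543210"})

        ' Load ONCE, before the scan loop starts.
        Dim md5Signatures As HashSet(Of String) = LoadSignatures(sigFile)

        ' Inside the scan loop, each check is an in-memory lookup with no disk access.
        Console.WriteLine(md5Signatures.Contains("FEDCBA9876543210FEDCBA9876543210")) ' True (comparison ignores case)
        Console.WriteLine(md5Signatures.Contains("00000000000000000000000000000000")) ' False

        File.Delete(sigFile)
    End Sub
End Module
```

In the posted scanner, the four sets would be built once before the `For Each` loop, and each `ReadAllText(...).Contains(...)` call would become a `Contains` call on the matching set.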
 
Comments
Dale 2012 24-Jul-11 0:22am    
You answered my previous question, and that is the reason I want to know what will speed up my scanning times. I feel I have some work ahead of me to figure out how to add the table you're talking about and how to load it ONCE. Where and how can I create this table?
Dave Kreskowiak 24-Jul-11 9:13am    
What's in your files? All you have to do is create a structure that represents a single line of your data, then create a collection of those instances. When you read the data, line by line, you parse each line, create an instance of your data structure, and add it to your collection.

Once the data is in the collection and sorted, looking for a single value is done by a binary search, which is extremely fast.

...IF you represent your data correctly.
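As a rough sketch of the structure-plus-collection approach described in this comment: the file layout is not stated anywhere in the thread, so the `hash,name` line format below is purely an assumption, and `SignatureEntry`, `ParseLine`, `BuildTable` and `IsKnown` are hypothetical names. The table is sorted once after loading so that `List(Of T).BinarySearch` can be used for every lookup.

```vb
Imports System
Imports System.Collections.Generic

Module SortedSignatureTableSketch
    ' Hypothetical record for one line of a signature file.
    ' Assumed line format (not confirmed by the post): "hash,name".
    Structure SignatureEntry
        Implements IComparable(Of SignatureEntry)

        Public Hash As String
        Public Name As String

        ' Entries are ordered by hash so the list can be sorted and binary-searched.
        Public Function CompareTo(other As SignatureEntry) As Integer _
            Implements IComparable(Of SignatureEntry).CompareTo
            Return String.CompareOrdinal(Hash, other.Hash)
        End Function
    End Structure

    Function ParseLine(line As String) As SignatureEntry
        Dim parts() As String = line.Split(","c)
        Return New SignatureEntry With {
            .Hash = parts(0).Trim().ToLowerInvariant(),
            .Name = If(parts.Length > 1, parts(1).Trim(), "")}
    End Function

    ' Parse every non-empty line, then sort ONCE so lookups can use binary search.
    Function BuildTable(lines As IEnumerable(Of String)) As List(Of SignatureEntry)
        Dim table As New List(Of SignatureEntry)
        For Each line As String In lines
            If line.Trim().Length > 0 Then table.Add(ParseLine(line))
        Next
        table.Sort()
        Return table
    End Function

    Function IsKnown(table As List(Of SignatureEntry), hash As String) As Boolean
        Dim probe As New SignatureEntry With {.Hash = hash.Trim().ToLowerInvariant(), .Name = ""}
        ' BinarySearch returns a non-negative index only when a match exists.
        Return table.BinarySearch(probe) >= 0
    End Function

    Sub Main()
        ' Made-up sample lines standing in for the contents of a signature file.
        Dim lines() As String = {"fedcba9876543210fedcba9876543210,Sample.B", "0123456789abcdef0123456789abcdef,Sample.A"}
        Dim table As List(Of SignatureEntry) = BuildTable(lines)

        Console.WriteLine(IsKnown(table, "0123456789ABCDEF0123456789ABCDEF")) ' True
        Console.WriteLine(IsKnown(table, "ffffffffffffffffffffffffffffffff")) ' False
    End Sub
End Module
```

With real data, `BuildTable(File.ReadLines(path))` would be called once per signature file at startup, and only `IsKnown` would run inside the scan loop.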
1. The problem may be in the .ReadAllText method, which is not optimized for this use: it reads the entire file into memory on every call. You may try the StreamReader.ReadLine method instead. Take a look at this discussion. The code proposed there runs fast even on very large text files (I tested it).

2. It may be useful to look at the open-source code of the ClamWin antivirus and to read the article about it: "Hash-AV: Fast Virus Signature Scanning" from Stanford.edu.
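The first suggestion could look roughly like the sketch below. It is not the code from the linked discussion, which is not reproduced in this thread; `FileContainsHash` and the sample hashes are hypothetical. It reads the signature file one line at a time with `StreamReader.ReadLine` and stops at the first match, so the whole file is never held in memory at once.

```vb
Imports System
Imports System.IO

Module LineByLineSearchSketch
    ' Hypothetical helper: True if any line of the file contains the given hash.
    Function FileContainsHash(filePath As String, hash As String) As Boolean
        Using reader As New StreamReader(filePath)
            Dim line As String = reader.ReadLine()
            ' ReadLine returns Nothing at the end of the file.
            While line IsNot Nothing
                If line.IndexOf(hash, StringComparison.OrdinalIgnoreCase) >= 0 Then
                    Return True ' Stop early; the rest of the file is never read.
                End If
                line = reader.ReadLine()
            End While
        End Using
        Return False
    End Function

    Sub Main()
        ' Stand-in for a real signature file; the hashes are made up.
        Dim sigFile As String = Path.GetTempFileName()
        File.WriteAllLines(sigFile, New String() {"0123456789abcdef0123456789abcdef", "fedcba9876543210fedcba9876543210"})

        Console.WriteLine(FileContainsHash(sigFile, "fedcba9876543210fedcba9876543210")) ' True
        Console.WriteLine(FileContainsHash(sigFile, "00000000000000000000000000000000")) ' False

        File.Delete(sigFile)
    End Sub
End Module
```

Note that this still opens and reads the signature file for every scanned file; it only lowers the cost of each read.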
 
Comments
Dave Kreskowiak 24-Jul-11 9:14am    
...and it can be done even faster if he doesn't read the files at all every time he searches for a value.
Sergey Chepurin 24-Jul-11 10:07am    
I agree, but this is not an optimization of the existing code. You propose that a new algorithm be implemented and even want it to "represent data correctly".
Dave Kreskowiak 24-Jul-11 15:15pm    
Optimizing does not mean keeping the existing code. By definition, it means doing something differently to achieve the same end result. You cannot optimize the code without changing it.
Dave Kreskowiak 24-Jul-11 15:21pm    
You know what. With your philosophy, you would tweak this and make it run a little faster. I'd scrap it and rewrite from scratch and make it run a LOT faster.

Optimizing a pile of crap just results in a slightly better smelling pile of crap.
Sergey Chepurin 24-Jul-11 15:57pm    
He will thank you, I guess. By the way, this is not "philosophy"; it simply answers the question correctly.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


