Sir, I want to find duplicate files using C++. To achieve this,
I iterate over all files present in the drive, get each file's size, and then look for duplicate keys using a map.
So I've created a map in which the key is the size of the file and the value is the path of the file. Here is my member function:
bool duplicateFinder::processDrive(const wchar_t* sDir)
{
    map<int, wchar_t*> duplicate;
    map<int, wchar_t*>::iterator iterate;
    WIN32_FIND_DATA fdFile;
    HANDLE hFind = NULL;
    wchar_t sPath[2048];
    wsprintf(sPath, L"%s\\*.*", sDir);
    if ((hFind = FindFirstFile(sPath, &fdFile)) == INVALID_HANDLE_VALUE)
    {
        wprintf(L"Path not found: [%s]\n", sDir);
        return false;
    }
    do
    {
        if (wcscmp(fdFile.cFileName, L".") != 0
            && wcscmp(fdFile.cFileName, L"..") != 0)
        {
            wsprintf(sPath, L"%s\\%s", sDir, fdFile.cFileName);
            if (fdFile.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
            {
                wprintf(L"Directory: %s\n", sPath);
                processDrive(sPath);
            }
            else
            {
                char** arr;
                char* hash = new char[MAX_PATH];
                memset(hash, 0, MAX_PATH);
                int correction;
                correction = wcstombs(hash, sPath, MAX_PATH);
                iterate = duplicate.find(getFileSize(hash));
                if (iterate != duplicate.end())
                {
                    cout << "\n\n FOUND THE VALUE " << iterate->second;
                }
                else
                {
                    duplicate.insert(pair<int, wchar_t*>(getFileSize(hash), sPath));
                }
            }
        }
    } while (FindNextFile(hFind, &fdFile));
    FindClose(hFind);
    return isDuplcateFound;
}
In the above code, whenever a file is found, its size is calculated using the getFileSize function. (This is not a Win32 API function. It is a user-defined native C++ function that returns the size of the file in bytes, e.g. 4278.) Before inserting into the map, the presence of the key is checked using the map's "find" function. If the key is not present, it is inserted into the map. If the key is found, the path stored in the map should be displayed through the "iterate" iterator.
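The find-then-insert step described above can be sketched in portable C++ without any Win32 calls. This is only a minimal sketch under assumptions: the names SizeIndex and recordFile are hypothetical and do not appear in my program, and the paths are stored as std::wstring copies instead of wchar_t* pointers. Each size maps to a list of paths, because more than two files can share one size.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical type: maps a file size in bytes to every path seen with that size.
using SizeIndex = std::map<std::uintmax_t, std::vector<std::wstring>>;

// Hypothetical helper: records `path` under `size` and returns the paths that
// were already stored with the same size (empty if this size is new).
// The returned paths are only duplicate candidates; equal size does not
// prove equal content, so a hash or byte comparison is still needed.
std::vector<std::wstring> recordFile(SizeIndex& index,
                                     std::uintmax_t size,
                                     const std::wstring& path)
{
    assert(!path.empty());                        // a path must be supplied
    std::vector<std::wstring>& bucket = index[size];
    std::vector<std::wstring> earlier = bucket;   // copy of candidates seen so far
    bucket.push_back(path);                       // std::wstring stores its own copy
    return earlier;
}
```

For the index to see files from every directory, it would have to be shared across the recursive calls (for example as a class member or a reference parameter) instead of being created inside each call.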
But whenever a duplicate file is found, the output is an address like this:
FOUND THE VALUE A012556
I don't know why this happens. I tried iterate->first and the size of the file was displayed correctly, but those two files are not duplicates.
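A possible explanation, offered as an assumption and not a confirmed diagnosis: narrow std::cout has no text overload for wchar_t*, so before C++20 it prints the pointer value (since C++20 that overload is deleted and the line does not compile), whereas std::wcout prints the characters. Separately, the map stores a pointer to sPath, a buffer that is overwritten for every file, so every stored value would refer to whatever path was written last. The sketch below illustrates only the second point. The names Result, fill, and demoReusedBuffer are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical result type: what each map yields for key 1 at the end.
struct Result {
    std::wstring viaPointer;
    std::wstring viaCopy;
};

// Hypothetical helper: overwrites the buffer with `s` plus a terminator.
template <std::size_t N>
void fill(wchar_t (&buf)[N], const std::wstring& s)
{
    assert(s.size() < N);                 // leave room for the terminator
    std::copy(s.begin(), s.end(), buf);
    buf[s.size()] = L'\0';
}

Result demoReusedBuffer()
{
    wchar_t buffer[64];                        // plays the role of sPath
    std::map<int, const wchar_t*> byPointer;   // stores only the address
    std::map<int, std::wstring> byCopy;        // stores its own copy of the text

    fill(buffer, L"C:\\first.txt");
    byPointer[1] = buffer;
    byCopy[1] = buffer;

    fill(buffer, L"C:\\second.txt");           // the next file reuses the buffer

    // byPointer[1] still points at `buffer`, so it now reads the second path.
    return { byPointer[1], byCopy[1] };
}
```

In this sketch the pointer-based map reports the second path for the first file, while the std::wstring map keeps the first path. Two files of equal size are also not necessarily duplicates, which is why the hash comparison is still required as the next step.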
Kindly help me sir with this.
Thank you for your time sir.
What I have tried:
1. Before using maps, I first made sure that the program iterates over all files present in the drive by printing the names of the files.
2. Then I made sure that the "getFileSize" function works correctly by printing the sizes of all files present in the drive.
3. After that, I tested whether the MD5 hash is computed correctly by printing the hashes of all files in the directory. [Since I'm getting errors when adding the sizes of files to the map, I didn't develop this any further, because hashing is the next step once I find files of the same size.]
After several tries and modifications, the above three worked fine. Then I moved on to the next step, which is adding the values to the map.
I referred to my school notes [this is not my homework] and then the internet about adding values to a map, and found that I did it correctly, so I'm unsure of the cause of the error.