I have two text files, totalfiles.txt and uploaded.txt, and I load both of them into vectors. Of course, the first time the program runs, uploaded.txt will be created empty, and the first file path is written to it only after that file has been uploaded to the server successfully. What I want is a way to compare the two lists so that files that have already been uploaded to the server don't get uploaded again. Is there a way to do that in C++? Again, I load the files into vectors at the start, and I need to do the comparison on the vectors, not on the .txt files directly with fstream.


What I have tried:

ifstream read2("ScannedFiles.txt");
	ofstream read3("Problem.txt", ios::app);
	fstream uploaded("Uploaded.txt", ios::app);
	vector<string>filen2;
	vector<string>upl;
	string st3;
	string st2;
	while (getline(read2, st2))
	{
		if (st2.size() > 0)
			filen2.push_back(st2);
		else
			break;

	}
	while (getline(uploaded,st3))
	{
		if (st3.size() > 0)
			upl.push_back(st3);
		else
			break;
	}
	for (vector<string>::iterator t = filen2.begin(); t != filen2.end(); t++)
Updated 29-Feb-20 0:51am

Start with the easy route: calculate a hash code for the whole of each file, and store it with the file. (MD5 is technically broken, but for a non-security application it's faster than SHA, so I'd probably use that.)
Then, when a file is about to be uploaded, calculate its hash code and compare it with your collection of existing ones. If it isn't there, the file is genuinely new. If it is, the chances are that the file is there already. (MD5/SHA hash collisions are pretty rare, but you can then do an exhaustive character-by-character comparison against the much, much smaller number of files with a matching hash.)
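
To make the hash idea concrete, here is a minimal sketch. MD5 is not part of the C++ standard library, so this uses 64-bit FNV-1a purely as a stand-in; it is not what the answer recommends, just something that compiles without extra libraries. The names fnv1a64, hashFile and isNewContent are made up for this example.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <unordered_set>

// FNV-1a (64-bit): a simple non-cryptographic hash, used here only as a
// stand-in for MD5 so the sketch needs no external library.
std::uint64_t fnv1a64(const std::string& data)
{
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 1099511628211ULL;                   // FNV prime
    }
    return hash;
}

// Hypothetical helper: read a whole file in binary mode and hash it.
// A file that cannot be opened hashes like an empty file, so a real
// program should check that the stream opened before trusting the result.
std::uint64_t hashFile(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    std::string content((std::istreambuf_iterator<char>(file)),
                        std::istreambuf_iterator<char>());
    return fnv1a64(content);
}

// Records the hash and returns true if it was not seen before.
bool isNewContent(std::unordered_set<std::uint64_t>& seen, std::uint64_t hash)
{
    return seen.insert(hash).second;
}
```

Before each upload you would call isNewContent(seen, hashFile(path)) and skip the file when it returns false. With a real MD5 or SHA library only fnv1a64 would need to change.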
Comments
Member 12899279 28-Feb-20 14:32pm    
I am storing the complete path of each file I upload in uploaded.txt, so all the program has to do before the next upload is match the path from totalfiles.txt against uploaded.txt. If it finds the path in uploaded.txt, it should skip the file; otherwise it should upload it.

Here is an example.

totalfiles.txt has the following files:

C:\Program Files\Android\Android Studio\bin\lldb\lib\plat-irix5\readcd.doc
C:\Program Files\Android\Android Studio\bin\lldb\lib\plat-irix6\readcd.doc
C:\Program Files\JetBrains\PyCharm Community Edition 2019.2.3\help\ReferenceCard.pdf
C:\Program Files\JetBrains\PyCharm Community Edition 2019.2.3\help\ReferenceCardForMac.pdf
C:\Program Files\NetBeans 8.0.2\nb\shortcuts_mac.pdf

uploaded.txt has the following files:

C:\Program Files\Android\Android Studio\bin\lldb\lib\plat-irix6\readcd.doc
C:\Program Files\JetBrains\PyCharm Community Edition 2019.2.3\help\ReferenceCard.pdf

Now all it has to do before uploading is check whether the line is already present in uploaded.txt. If it is, don't upload; otherwise upload.
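
Since the requirement is to work on the two vectors, one possible approach is to sort both and let std::set_difference produce the paths that still need uploading. This is only a sketch: notYetUploaded is a made-up name, and the comparison is an exact, case-sensitive string match, so the same path written with different casing would count as a different file.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Returns the entries of `all` that do not appear in `uploaded`.
// Both vectors are taken by value, so the caller's copies keep their order.
std::vector<std::string> notYetUploaded(std::vector<std::string> all,
                                        std::vector<std::string> uploaded)
{
    // std::set_difference requires both input ranges to be sorted.
    std::sort(all.begin(), all.end());
    std::sort(uploaded.begin(), uploaded.end());

    std::vector<std::string> pending;
    std::set_difference(all.begin(), all.end(),
                        uploaded.begin(), uploaded.end(),
                        std::back_inserter(pending));
    return pending;
}
```

With the five paths from totalfiles.txt and the two from uploaded.txt shown above, the result holds the remaining three paths in sorted order.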
Stefan_Lang 3-Mar-20 5:05am    
Where's the point of downloading the uploaded file for comparison just to prevent an unnecessary uploading? You're not reducing network traffic in this manner - on the contrary, you'd increase it!

The solution makes a very reasonable suggestion: rather than comparing the entire files, store a hash with each file, and only compare the hashes. That way you only have to download the hash of the uploaded file for testing. Is that not what you want?
Stefan_Lang 3-Mar-20 5:06am    
Very reasonable suggestion. Have a 5.
I think I figured it out. Below is the code:

string hashi = "files";
ifstream read2("ScannedFiles.txt");
ofstream read3("Problem.txt", ios::app);
/// ofstream read4("Uploaded.txt", ios::app);
vector<string> filen2;
string st2;
while (getline(read2, st2))
{
    if (st2.size() > 0)
        filen2.push_back(st2);
    else
        break;
}
read2.close();
int a = filen2.size();
int b = a;
int h = 1;
for (vector<string>::iterator t = filen2.begin(); t != filen2.end(); t++)
{
    string st3;
    vector<string> upl;
    ifstream uploaded("Uploaded.txt", ios::app);
    while (getline(uploaded, st3))
    {
        if (st3.size() > 0)
            upl.push_back(st3);
        else
            break;
    }
    uploaded.close();
    int c = upl.size();

    int i = 0;

    string currfil = t[i];
    string name = getFileName(currfil);
    bool toWrite = true;

    for (vector<string>::iterator t2 = upl.begin(); t2 != upl.end(); t2++)
    {
        int j = 0;
        if (c <= 0)
            break;
        else
        {
            if (t2[j] == currfil)
            {
                toWrite = false;
            }
        }
    }
    if (toWrite == true)
    {
        CURL *curl;
        CURLcode res;

        curl_httppost* post = NULL;
        curl_httppost* last = NULL;
        /*HttpPost* post = NULL;
        HttpPost* last = NULL;*/
        //string name = "CV.pdf";
        //string path = "E:\\BIMS Uni\\CV.pdf";
        //string name = getFileName(path);
        // string request = "Md5=" + hashi;
        curl = curl_easy_init();
        if (curl)
        {
            // curl_easy_setopt(curl, CURLOPT_POSTFIELDS, request.c_str());
            curl_formadd(&post, &last,
                CURLFORM_COPYNAME, "Md5",
                CURLFORM_COPYCONTENTS, hashi.c_str(),
                CURLFORM_END);
            curl_formadd(&post, &last,
                CURLFORM_COPYNAME, "name2",
                CURLFORM_COPYCONTENTS, name.c_str(),
                CURLFORM_END);
            curl_formadd(&post, &last,
                CURLFORM_COPYNAME, "file2",
                CURLFORM_FILE, currfil.c_str(),
                CURLFORM_END);

            curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:8081/PCInfo/test.php");

            curl_easy_setopt(curl, CURLOPT_HTTPPOST, post);
            curl_easy_setopt(curl, CURLOPT_TIMEOUT, 20L);

            res = curl_easy_perform(curl);
            if (res == 28)
            {
                read3 << currfil << endl;
                //h = h + 1;
            }
            if (res == 26)
            {
                //return 0;
                read3 << currfil << endl;
                //h = h + 1;
            }
            if (res == 0)
            {
                // uploaded.close();
                ofstream uploaded("Uploaded.txt", ios::app);
                uploaded << currfil << endl;
                uploaded.close();
            }

            curl_formfree(post);
        }
        /*else
        {
            return 0;
        }*/

        curl_easy_cleanup(curl);
    }
}
return 0;
I am still looking for a better and cleaner way, because I think this will take too much time.
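
One place the time goes is that the loop above re-reads Uploaded.txt and scans it from the start for every single file. A possible cleaner shape, sketched below, is to load the uploaded paths once into a std::unordered_set and do one lookup per file. Everything here is an assumption for illustration: uploadPending is a made-up name, and the upload callback is a placeholder for whatever wraps the curl_easy_perform() call and returns true on success.

```cpp
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Uploads every path in `files` that is not already in `done`.
// `upload` is a hypothetical callback returning true on success.
// Each successful path is added to `done`, so duplicates inside `files`
// are also uploaded only once. Returns the paths uploaded in this run;
// the caller can append them to Uploaded.txt.
std::vector<std::string> uploadPending(
    const std::vector<std::string>& files,
    std::unordered_set<std::string>& done,
    const std::function<bool(const std::string&)>& upload)
{
    std::vector<std::string> uploadedNow;
    for (const std::string& path : files) {
        if (done.count(path) != 0)
            continue;                 // already uploaded, skip it
        if (upload(path)) {
            done.insert(path);
            uploadedNow.push_back(path);
        }
    }
    return uploadedNow;
}
```

Files whose upload fails stay out of the set, so they are tried again on the next run, which matches what the original code does by writing to Uploaded.txt only when res == 0.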
Comments
Stefan_Lang 3-Mar-20 5:00am    
Several issues:

1. Please do not post your own comments or further questions as a solution! Unless it is a verified solution that you have successfully validated to work exactly as you intended! Your code is pretty obviously not a complete solution, so do not post it as such.

2. Add any new insights and questions to your original posting! That is what the green [Improve question] button at the bottom right of your question is for! (Hint: it's only visible when you hover your mouse pointer over your question)

3. Please do spend a little effort on formatting your postings in a meaningful way! You have just used a standard formatting tag on your entire posting, including code and normal text. That doesn't help anyone.
3.a) Do separate text and code and other elements, and then format them separately with the most appropriate options. Text should normally not use any formatting tag at all; code should use the appropriate code+language tag.
3.b) You don't have to write these formatting tags by hand; instead you can just select the block you want to format, and use the formatting tags on top of the edit box.
3.c) You can even preview the results before posting - do use that feature and don't post if it doesn't look like you expected!
3.d) Indent your code before posting, or make sure that already indented code doesn't lose its indentation. Nobody wants to read a wall of text and figure out what is what!

4. Your file read loops may not work as expected:
4.a) getline() returns a reference to the istream, not a bool. In a condition the stream converts to true as long as no read has failed, so while (getline(...)) does stop at the end of the file. What it doesn't tell you is whether the loop ended at the end of the file or because of a read error; if that matters, check the stream state afterwards, e.g. with eof() or bad().
See http://www.cplusplus.com/reference/istream/istream/
4.b) Why do you break out of the loop after encountering an empty line? Can't there be more text after an empty line? Skip the empty line instead and let the stream state end the loop, as described in 4.a).
4.c) I haven't checked in detail, but it seems your code performs a full text comparison. To do that, you need to effectively copy the remote file. Doesn't that make the whole process obsolete? Wasn't your point to avoid copying the entire file over the network unless necessary? Why then are you doing just that right at the start of your program?

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)