C / C++ / MFC

 
Answer: Re: Need To Create a crawler/spider in vc++
Chandrasekharan P, 18-Feb-09 21:06
General: Re: Need To Create a crawler/spider in vc++
CPallini, 18-Feb-09 21:19
General: Re: Need To Create a crawler/spider in vc++
Ash_VCPP, 18-Feb-09 22:14
General: Re: Need To Create a crawler/spider in vc++
CPallini, 18-Feb-09 22:29
General: Re: Need To Create a crawler/spider in vc++
Ash_VCPP, 19-Feb-09 0:18
General: Re: Need To Create a crawler/spider in vc++
Ash_VCPP, 18-Feb-09 22:05
Answer: Re: Need To Create a crawler/spider in vc++
_AnsHUMAN_, 18-Feb-09 21:32
Answer: Re: Need To Create a crawler/spider in vc++
Iain Clarke, Warrior Programmer, 18-Feb-09 22:58
As you may have seen from the responses, it's not a very good question.

1/ You haven't actually asked a question - you've just told us you have work to do. While we are, of course, very happy for you, there's not much to answer.

2/ You've got quite a big challenge, especially if you're starting from scratch.

3/ You can break it down into several smaller challenges: handling delays and timeouts, getting HTTP pages, parsing them into links, etc.
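
To make the "parsing them into links" part concrete, here is a minimal sketch in standard C++. ExtractLinks and ToLower are hypothetical helper names, not part of any library, and the scan is deliberately naive: it only picks up quoted href values inside <a ...> tags and is no substitute for a real HTML parser.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Returns a lower-cased copy so that tag and attribute names match in any case.
static std::string ToLower(const std::string& s)
{
    std::string out(s);
    for (char& c : out)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return out;
}

// Hypothetical helper: collects the quoted href values of <a ...> tags.
// Unquoted attributes, comments and scripts are not handled.
std::vector<std::string> ExtractLinks(const std::string& html)
{
    std::vector<std::string> links;
    const std::string lower = ToLower(html);  // same length as html, so offsets line up
    std::string::size_type pos = 0;

    while ((pos = lower.find("<a", pos)) != std::string::npos)
    {
        // Require whitespace after "<a" so that <abbr>, <article> etc. are skipped.
        if (pos + 2 >= lower.size() ||
            !std::isspace(static_cast<unsigned char>(lower[pos + 2])))
        {
            pos += 2;
            continue;
        }

        const std::string::size_type tagEnd = lower.find('>', pos);
        if (tagEnd == std::string::npos)
            break;  // unterminated tag: stop scanning

        const std::string::size_type href = lower.find("href=", pos);
        if (href != std::string::npos && href < tagEnd)
        {
            const std::string::size_type quotePos = href + 5;  // just past "href="
            if (quotePos < tagEnd && (html[quotePos] == '"' || html[quotePos] == '\''))
            {
                const char quote = html[quotePos];
                const std::string::size_type close = html.find(quote, quotePos + 1);
                if (close != std::string::npos && close < tagEnd)
                    links.push_back(html.substr(quotePos + 1, close - quotePos - 1));
            }
        }
        pos = tagEnd + 1;
    }
    return links;
}
```

A crawler would typically resolve each returned value against the URL of the page it came from, then queue the ones it has not visited yet.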

I've attached below some code I wrote years ago. It grabs one page from a fixed URL every hour or so - an early RSS reader, essentially. The WinInet function names in it may help you with your search terms.

There are other articles on CodeProject about grabbing information from web pages. John Simmons wrote one recently that scrapes information from a CodeProject page.

Good luck with your task!

Iain.

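// Link with wininet.lib. Built as ANSI (multi-byte), since the WinInet calls are given narrow strings.
#include <windows.h>
#include <wininet.h>
#include <string>
#include <vector>

// Shared with the UI thread and defined elsewhere; the types are inferred from how they are used below.
extern HANDLE	g_hEventStop;	// Signalled to ask this thread to exit.
extern CRITICAL_SECTION	g_CS_Updates;	// Guards g_UpdateArray.
extern std::vector<std::string>	g_UpdateArray;	// Assumed container type.

DWORD WINAPI UpdatePageThread ( LPVOID lpParameter )
{
	HWND hWnd = (HWND)lpParameter;	// Window to notify when new data arrives.

	DWORD dw, dwDelay = 100;	// The first fetch happens almost immediately.
	HINTERNET	hInternet, hIConnect, hIRequest;
	BOOL	bSuccess;
	DWORD	dwStatus, dwSize, dwIndex;

	LPCSTR	AcceptTypes [] = { "text/*", NULL };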

	// Set up the query.
	hInternet	= NULL;
	hIConnect	= NULL;
	hIRequest	= NULL;
	hInternet = ::InternetOpen ("OC UK Notify", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);

	if (hInternet)
		hIConnect = ::InternetConnect (hInternet, "www.overclock-uk.net", INTERNET_DEFAULT_HTTP_PORT, "user", "pass", INTERNET_SERVICE_HTTP, 0, 1);
	if (hIConnect)
	{
		hIRequest = ::HttpOpenRequest (hIConnect, NULL, "update.ocuk", NULL, NULL, (const char **)AcceptTypes,
			INTERNET_FLAG_NO_CACHE_WRITE | INTERNET_FLAG_NO_COOKIES | INTERNET_FLAG_NO_UI | INTERNET_FLAG_RELOAD | INTERNET_FLAG_NO_AUTH,
			1);
	}

	if (!hIRequest) // Raise an error?
		return 1;

	char	buf [4096];
	std::string	Page;

	while (1)
	{
		dw = WaitForSingleObject (g_hEventStop, dwDelay);
		if (dw != WAIT_TIMEOUT)
			break;
//		dwDelay = 30000; // Wait 30 seconds before we try again.
		dwDelay = 90 * 60000; // 90 minutes between fetches.

		bSuccess = ::HttpSendRequest (hIRequest, NULL, 0, NULL, 0);
		if (!bSuccess)
			continue; // Try again in a while.

		dwSize = sizeof (DWORD);
		dwIndex = 0;
		bSuccess = ::HttpQueryInfo (hIRequest, HTTP_QUERY_STATUS_CODE | HTTP_QUERY_FLAG_NUMBER, &dwStatus, &dwSize, &dwIndex);
		if (!bSuccess)
			continue;
		dwStatus /= 100; // Just get the 2XX part.
		if (dwStatus != 2)
			continue;

		Page.erase ();

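		// Read the response body in chunks until the end of the data.
		while (1)
		{
			dwSize = 0;
			bSuccess = ::InternetReadFile (hIRequest, buf, sizeof (buf), &dwSize);
			if (!bSuccess)	// Check for failure first; dwSize is not meaningful then.
				break;
			if (dwSize == 0)	// Success with zero bytes read means end of data.
				break;
			Page.append (buf, dwSize);
		}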


		// We now have the page.
		// Process it...
//		Vec.clear ();
		std::string::size_type nFind = 0;	// size_type, so comparisons with npos are exact.
		std::string Temp;

		EnterCriticalSection (&g_CS_Updates);

		g_UpdateArray.clear ();

		// Rebuild the list: each <p>...</p> pair on the page becomes one entry.
		while (1)
		{
			nFind = Page.find ("<p>");
			if (nFind == std::string::npos)
				break;

			Page.erase (0, nFind + 3);	// Drop everything up to and including "<p>".

			nFind = Page.find ("</p>");
			if (nFind == std::string::npos)
				break;

			Temp = Page.substr (0, nFind);	// The paragraph text.

			Page.erase (0, nFind + 4);	// Drop the text and its "</p>".

			g_UpdateArray.push_back (Temp);
		}

		LeaveCriticalSection (&g_CS_Updates);

		PostMessage (hWnd, WM_USER + 1, 0, 0);
	}

	if (hIRequest)	::InternetCloseHandle (hIRequest);
	if (hIConnect)	::InternetCloseHandle (hIConnect);
	if (hInternet)	::InternetCloseHandle (hInternet);


	return 0;
}
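
The parsing step in the thread above can be tried out without WinInet if it is pulled into a function of its own. The following is only a sketch under that assumption: ExtractParagraphs is a hypothetical name, it works on a const string with a moving offset instead of erasing from the page, and like the loop above it only matches the lower-case tags exactly.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the text between each "<p>" and the next "</p>".
// Stops at the first "<p>" that has no closing tag.
std::vector<std::string> ExtractParagraphs(const std::string& page)
{
    std::vector<std::string> items;
    const std::string open = "<p>";
    const std::string close = "</p>";
    std::string::size_type pos = 0;

    while (true)
    {
        const std::string::size_type start = page.find(open, pos);
        if (start == std::string::npos)
            break;

        const std::string::size_type textBegin = start + open.size();
        const std::string::size_type end = page.find(close, textBegin);
        if (end == std::string::npos)
            break;

        items.push_back(page.substr(textBegin, end - textBegin));
        pos = end + close.size();  // continue after the closing tag
    }
    return items;
}
```

With something like this, the thread would only need to hold the critical section while it swaps the result into the shared array, rather than for the whole parse.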


Codeproject MVP for C++, I can't believe it's for my lounge posts...

General: Re: Need To Create a crawler/spider in vc++
Ash_VCPP, 19-Feb-09 0:15
General: Re: Need To Create a crawler/spider in vc++
Iain Clarke, Warrior Programmer, 19-Feb-09 0:24
Answer: Re: Need To Create a crawler/spider in vc++
Sandeep Saini SRE, 19-Feb-09 18:58
General: Re: Need To Create a crawler/spider in vc++
Ash_VCPP, 19-Feb-09 21:05
Question: Re: Need To Create a crawler/spider in vc++
David Crow, 20-Feb-09 2:20
Answer: Re: Need To Create a crawler/spider in vc++
Ash_VCPP, 20-Feb-09 2:26
Question: Re: Need To Create a crawler/spider in vc++
David Crow, 20-Feb-09 2:27
Answer: Re: Need To Create a crawler/spider in vc++
Ash_VCPP, 20-Feb-09 2:30
Question: Re: Need To Create a crawler/spider in vc++
David Crow, 20-Feb-09 2:34
Answer: Re: Need To Create a crawler/spider in vc++
Ash_VCPP, 20-Feb-09 2:43
