Hi
I have a question about website scraping.
I can scrape a site that does not require a login.
But I can't scrape a page that requires a login.
When I scrape the site that requires a login, I only get the login page back, but I need a page that is only available after logging in.
Can you please help me scrape a site that is protected by a login?


Thanks in Advance

What I have tried:

C#
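using System;
using System.IO;
using System.Net;
using System.Text;

// NOTE: formUrl is the URL the form POSTs to, not the URL of the page that shows the form
// (you can find it in the "action" attribute of the HTML <form> tag).
string formUrl = "site url";
string formParams = string.Format("email_address={0}&password={1}",
	Uri.EscapeDataString("mail id"), Uri.EscapeDataString("Password"));

// One CookieContainer shared by both requests stores every cookie the server sets
// during login, including cookies set on a redirect response. Copying the raw
// Set-Cookie header into a Cookie header also sends attributes such as "path" and
// "expires" back to the server, which can break the session.
CookieContainer cookies = new CookieContainer();

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(formUrl);
req.CookieContainer = cookies;
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.UTF8.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
	os.Write(bytes, 0, bytes.Length);
}
// Complete the login request; the container now holds the session cookies.
using (WebResponse resp = req.GetResponse()) { }

// Request the protected page with the same cookies.
string pageSource;
string getUrl = "url";
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(getUrl);
getRequest.CookieContainer = cookies;
using (WebResponse getResponse = getRequest.GetResponse())
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream()))
{
	pageSource = sr.ReadToEnd();
}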
Posted
Updated 30-May-17 0:21am
v3
Comments
F-ES Sitecore 30-May-17 5:28am    
The solution depends on how they are implementing their security. Maybe one of the reasons they implemented security is to stop people like you stealing their content?
premkumar.r 30-May-17 5:36am    
I have a user name and password for this specific site, so how does that count as stealing?
Please let me know.
F-ES Sitecore 30-May-17 6:30am    
I have a passcard for my company's office but that doesn't mean they're fine with me taking their laptops.
premkumar.r 30-May-17 6:09am    
Can anyone please help me find a solution?
Richard MacCutchan 30-May-17 6:32am    
If you have been given a login to the site then you just need to complete the login page first. If you do not have an authorised login then there is no way to get information from the site.
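
For an authorised login, "completing the login page first" could look roughly like the sketch below. It is not a definitive implementation: it assumes a plain HTML form with quoted attribute values, that the site only needs cookies plus any hidden fields (such as an anti-forgery token), and that the field names are email_address and password as in the question. LoginScraper and its method names are made up for illustration, and all three URLs are placeholders to be filled in. A site that logs in through JavaScript or a CAPTCHA will not work this way.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

public static class LoginScraper
{
    // Collects name/value pairs from <input type="hidden"> tags.
    // Assumption: attribute values are quoted; use an HTML parser for anything less regular.
    public static Dictionary<string, string> ExtractHiddenFields(string html)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match input in Regex.Matches(html, @"<input\b[^>]*>", RegexOptions.IgnoreCase))
        {
            string tag = input.Value;
            if (!Regex.IsMatch(tag, @"\stype\s*=\s*[""']hidden[""']", RegexOptions.IgnoreCase))
                continue;
            Match name = Regex.Match(tag, @"\sname\s*=\s*[""']([^""']*)[""']", RegexOptions.IgnoreCase);
            Match value = Regex.Match(tag, @"\svalue\s*=\s*[""']([^""']*)[""']", RegexOptions.IgnoreCase);
            if (name.Success)
                fields[name.Groups[1].Value] = WebUtility.HtmlDecode(value.Groups[1].Value);
        }
        return fields;
    }

    // URL-encodes each pair and joins them into an application/x-www-form-urlencoded body.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(
            f => WebUtility.UrlEncode(f.Key) + "=" + WebUtility.UrlEncode(f.Value)));
    }

    // Logs in and returns the HTML of a protected page. All URLs are placeholders.
    public static string FetchProtectedPage(string loginPageUrl, string formActionUrl,
        string protectedUrl, string email, string password)
    {
        var cookies = new CookieContainer(); // shared by all three requests

        // 1. GET the login page to pick up the initial cookies and hidden fields.
        string loginHtml = Send(loginPageUrl, cookies, null);

        // 2. POST the credentials together with the hidden fields.
        Dictionary<string, string> fields = ExtractHiddenFields(loginHtml);
        fields["email_address"] = email; // field names taken from the question
        fields["password"] = password;
        Send(formActionUrl, cookies, BuildFormBody(fields));

        // 3. GET the protected page using the cookies set during login.
        return Send(protectedUrl, cookies, null);
    }

    // Sends a GET (formBody == null) or a form POST and returns the response body.
    private static string Send(string url, CookieContainer cookies, string formBody)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = cookies;
        if (formBody != null)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(formBody);
            request.Method = "POST";
            request.ContentType = "application/x-www-form-urlencoded";
            request.ContentLength = bytes.Length;
            using (Stream body = request.GetRequestStream())
                body.Write(bytes, 0, bytes.Length);
        }
        using (WebResponse response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            return reader.ReadToEnd();
    }
}
```

Calling LoginScraper.FetchProtectedPage with the login page URL, the form's action URL, the URL of the page you need, and your own credentials returns that page's HTML if the login succeeded. If you still get the login page back, compare the POST body with what the browser sends (the network tab of the browser's developer tools shows it) and check for fields this sketch does not pick up. On newer .NET versions, HttpClient with an HttpClientHandler that has a CookieContainer assigned plays the same role as HttpWebRequest here.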

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


