Hi,

Can someone help me figure out how to log in to a page using HttpWebRequest and subsequently scrape a page? The code I am using doesn't seem to work.


C#
HttpWebRequest request;
HttpWebResponse response;
CookieContainer cookies;


string url = string.Format("http://control.shaboshabo.com/login-action.php ?username={0}&password={1}", "xxx", "yyy");
request = (HttpWebRequest)WebRequest.Create(url);
request.AllowAutoRedirect = true;
request.Method = "POST";
request.CookieContainer = new CookieContainer();
response = (HttpWebResponse)request.GetResponse();
if (response.StatusCode != HttpStatusCode.Found)
{
    //ToDo: if the page wasn't found raise Exception

    //instead of this textmessage

    Console.WriteLine("Something Wrong");
    response.Close();
    request.KeepAlive = false;
    return;
}
cookies = request.CookieContainer;
response.Close();
request = (HttpWebRequest)WebRequest.Create("http://control.shaboshabo.com/controlpanel.php");
request.AllowAutoRedirect = false;
request.CookieContainer = cookies;
response = (HttpWebResponse)request.GetResponse();
//string markup = null;
var rStream = response.GetResponseStream();
string markup;

using (var sr = new StreamReader(rStream))
{
    markup = sr.ReadToEnd();
}
//using (Stream s = response.GetResponseStream())
//{
//    StreamReader sr = new StreamReader(s);

//    while (!sr.EndOfStream)
//    {
//        markup = sr.ReadToEnd();
//    }
//    sr.Close();
//}

Console.WriteLine(markup);
Console.ReadLine();
Updated 27-Nov-11 3:00am
Comments
Richard MacCutchan 27-Nov-11 9:02am    
"The code I am using doesn't seem to work"
If you expect people to help you, then you need to explain exactly what is not working, what results you expect against what you see, etc.
[no name] 27-Nov-11 9:09am    
I am trying to log in and subsequently scrape a page, but I am redirected to the login page every time I run the code.
Richard MacCutchan 27-Nov-11 10:41am    
See my answer below.

A few points that may help you resolve your problem.

1) Remove the extra space between login-action.php and ? in your URL.
C#
string url = string.Format("http://control.shaboshabo.com/login-action.php?username={0}&password={1}", "xxx", "yyy");


2) If I type the URL below into a browser, it shows the message "Wrong username or password".

http://control.shaboshabo.com/login-action.php?username=xxx&password=yyy

That means the PHP website does check credentials passed in the query string, but the ones supplied are wrong. Try passing valid credentials.

3) Use GET as the method instead of POST, since you are passing the details in the URL (query string) and are not actually posting any data.
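
Points 1 and 3 could be combined in a small helper. The following is only a minimal sketch: the names LoginUrlSketch, BuildLoginUrl and CreateLoginRequest are hypothetical and not part of the original post, and escaping the credentials with Uri.EscapeDataString is an added precaution that the points above do not mention.

```csharp
using System;
using System.Net;

static class LoginUrlSketch
{
    // Hypothetical helper: builds the login URL without stray spaces
    // and escapes the credentials so characters such as '&' or ' '
    // cannot break the query string.
    public static string BuildLoginUrl(string baseUrl, string username, string password)
    {
        return string.Format("{0}?username={1}&password={2}",
            baseUrl.Trim(),
            Uri.EscapeDataString(username),
            Uri.EscapeDataString(password));
    }

    // Hypothetical helper: creates a GET request, because the data
    // travels in the query string and no request body is posted.
    public static HttpWebRequest CreateLoginRequest(string url, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.CookieContainer = cookies; // reuse the same container for later requests
        return request;
    }
}
```

For example, BuildLoginUrl("http://control.shaboshabo.com/login-action.php", "xxx", "yyy") produces the corrected URL shown in point 1.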

Updated: I tested your code against the first URL and it returns the status "OK". Change your condition as below.
C#
if (response.StatusCode != HttpStatusCode.OK)
{
    //ToDo: if the page wasn't found raise Exception

    //instead of this textmessage

    Console.WriteLine("Something Wrong");
    response.Close();
    request.KeepAlive = false;
    return;
}


HttpStatusCode.OK indicates that your HTTP request succeeded and the page was reachable. It does not necessarily mean that the PHP site has authenticated those credentials successfully. For that, you will have to pass valid credentials.

Have a look at the link below for more information on HttpStatusCode.
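
Because a status of OK says nothing about authentication, one option is to inspect the returned markup as well. The sketch below assumes that the failure page contains the text "Wrong username or password" reported in point 2; that marker, and the names LoginCheckSketch and LooksLikeFailedLogin, are assumptions for illustration only.

```csharp
using System;

static class LoginCheckSketch
{
    // Assumption: the site reports a failed login with this message,
    // as observed in a browser in point 2 above.
    const string FailureMarker = "Wrong username or password";

    // Hypothetical helper: returns true when the response body
    // contains the failure message (case-insensitive match).
    public static bool LooksLikeFailedLogin(string responseBody)
    {
        return responseBody != null
            && responseBody.IndexOf(FailureMarker, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

The body read with StreamReader.ReadToEnd() after the login request could be passed to LooksLikeFailedLogin before the second page is requested.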

http://msdn.microsoft.com/en-us/library/system.net.httpstatuscode.aspx
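
As a quick illustration of the two values discussed here: HttpStatusCode.OK is HTTP 200 and HttpStatusCode.Found is HTTP 302. When AllowAutoRedirect is true, HttpWebRequest follows redirects itself, so the caller normally sees the status of the final page rather than 302. The sketch below uses a hypothetical helper named Describe to show the mapping.

```csharp
using System;
using System.Net;

static class StatusCodeSketch
{
    // Hypothetical helper: turns a status code into a short explanation.
    public static string Describe(HttpStatusCode status)
    {
        switch (status)
        {
            case HttpStatusCode.OK:
                return "200 OK: the request succeeded (login may still have failed)";
            case HttpStatusCode.Found:
                return "302 Found: the server answered with a redirect";
            default:
                // Numeric value followed by the enum name.
                return ((int)status) + " " + status;
        }
    }
}
```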
 
Comments
[no name] 27-Nov-11 10:17am    
I made the changes you suggested, but because the status code is not equal to Found, the program terminates.
RaisKazi 27-Nov-11 10:41am    
Please refer to my updated answer.
[no name] 27-Nov-11 13:24pm    
It runs OK, but the markup variable, which is supposed to contain the markup after reading, is empty.

I cannot answer your question. However, I recently needed to do some web scraping myself and used the WebBrowser class, which seems to work OK. I did have to spend quite a long time with my browser debugger to figure out how to get to different pieces of information in the returned pages, but it does work reasonably well.
 
Comments
[no name] 27-Nov-11 13:25pm    
Where is the link you are suggesting?
Richard MacCutchan 28-Nov-11 4:19am    
My apologies, a slip of the fingers perhaps, fixed now.
Sergey Alexandrovich Kryukov 27-Nov-11 17:31pm    
Please fix the link.
--SA
Richard MacCutchan 28-Nov-11 4:20am    
Fingers and brain out of sync :(

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


