What is an Efficient Way to Crawl Through Web Sites?
I have two different pieces of code. Which one would be better, and in what respect?
Or is there an even better way to crawl through web pages?

Code 1:
// needs: using System.Net;
private static string GetWebTest1(string url)
{
    // WebClient: a single call downloads the whole page body as a string
    using (WebClient client = new WebClient())
    {
        return client.DownloadString(url);
    }
}


In comparison with:

Code 2:
// needs: using System.IO; using System.Net;
private static string GetWebTest2(string url)
{
    // HttpWebRequest gives more control (headers, timeouts, proxy, credentials, ...)
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

    // dispose the response, stream and reader when finished
    using (WebResponse response = request.GetResponse())
    using (Stream stream = response.GetResponseStream())
    using (StreamReader reader = new StreamReader(stream))
    {
        return reader.ReadToEnd();
    }
}


Which one is better, and in what respect (time consumption, error handling, etc.)?

You have more control over what's happening with method #2. That's the one I would choose. However, don't forget to add the requisite try/catch/finally block to handle any exceptions that might occur.
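
For example, a minimal sketch of that error handling around method #2 (the variable names, the WebException-only catch and the null result on failure are just illustrative choices, not the only way to do it):

// needs: using System; using System.Net;
string html = null;
try
{
    html = GetWebTest2(url);   // method #2 from the question
}
catch (WebException ex)
{
    // network-level failures: DNS errors, timeouts, HTTP 4xx/5xx, ...
    Console.WriteLine("Failed to fetch " + url + ": " + ex.Message);
}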
 
Comments
jpratik 17-Jun-10 12:26pm    
In method #2 I might have more control, but my need is: which one is faster?
I need to run this code again and again with different URLs, some of which are fetched from the downloaded web pages themselves.
For example, the current page contains 'n' links, each of which is passed back to the same function, and the process continues for a large number of pages.
In that case, which method would save more time?
I will be adding try/catch for exception handling.
#realJSOP 17-Jun-10 13:37pm    
Speed differences are negligible because your bottleneck is the speed of your internet connection. If it were me, I'd pick the way that gives me the most control over what's happening.
jpratik 22-Jun-10 1:29am    
OK, thanks for your advice.
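
To illustrate the repeated-crawl scenario described in the comments above (download a page, pull out its links, feed those links back into the same download function), a rough breadth-first sketch could look like the following. The Crawl name, the maxPages limit and the very naive href regex are illustrative assumptions; GetWebTest1 is the method from the question and could equally be GetWebTest2:

// needs: using System.Collections.Generic; using System.Net;
//        using System.Text.RegularExpressions;
private static void Crawl(string startUrl, int maxPages)
{
    HashSet<string> visited = new HashSet<string>();   // avoid downloading the same URL twice
    Queue<string> pending = new Queue<string>();
    pending.Enqueue(startUrl);

    while (pending.Count > 0 && visited.Count < maxPages)
    {
        string url = pending.Dequeue();
        if (!visited.Add(url))
            continue;                                   // already processed

        string html;
        try
        {
            html = GetWebTest1(url);                    // or GetWebTest2 - the crawl logic is the same
        }
        catch (WebException)
        {
            continue;                                   // skip pages that fail to download
        }

        // very naive link extraction: absolute http(s) links only,
        // relative URLs are ignored for brevity
        foreach (Match m in Regex.Matches(html, "href=\"(https?://[^\"]+)\"", RegexOptions.IgnoreCase))
        {
            pending.Enqueue(m.Groups[1].Value);
        }
    }
}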
Hi friend,

In my opinion Code 2 is better, because the reader makes it easy to collect all the links that reside in the page.

To collect the links from the downloaded HTML, use a regular expression with the Regex and Match classes in .NET (System.Text.RegularExpressions).
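
A minimal sketch of that idea might look like this (the helper name and the deliberately simple href pattern are illustrative only, and will miss relative or single-quoted links):

// needs: using System.Collections.Generic; using System.Text.RegularExpressions;
private static List<string> ExtractLinks(string html)
{
    List<string> links = new List<string>();

    // match href="..." attributes with a simple pattern
    foreach (Match m in Regex.Matches(html, "href\\s*=\\s*\"([^\"]+)\"", RegexOptions.IgnoreCase))
    {
        links.Add(m.Groups[1].Value);
    }

    return links;
}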

However, if you are going purely for performance, Code 1 is the best.

Please accept this answer if you are satisfied.

Thanks,

Mahesh Patel
 
