Hi everyone!

I have a problem I need help with. I want to use WebClient in ASP.NET to extract a blog post's title, image, content and published date, the way Facebook does.

So far I have only been able to retrieve the page title and the first image on the page. Any help would be appreciated.

What I have tried:

Below is my code:
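        ' At the top of the code-behind file:
        Imports System.Net
        Imports System.Text.RegularExpressions

        ' Download the raw HTML of the page; Using disposes the WebClient afterwards.
        Dim source As String = ""
        Using wb As New WebClient()
            source = wb.DownloadString("http://www.9lessons.info/2010/06/facebook-like-extracting-url-data-with.html")
        End Using
        ' Page title: the text between <title> and </title>.
        Dim title As String = Regex.Match(source, "<title\b[^>]*>\s*(?<Title>[\s\S]*?)</title>", RegexOptions.IgnoreCase).Groups("Title").Value
        ' First image: accept a src value in either single or double quotes.
        Dim img As String = Regex.Match(source, "<img\b[^>]*?\bsrc\s*=\s*[""']([^""']+)[""']", RegexOptions.IgnoreCase).Groups(1).Value
        ' Decode entities such as &amp; before showing the title.
        Label1.Text = WebUtility.HtmlDecode(title).Trim()
        imgg.ImageUrl = img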


Any response would be highly appreciated.
Thanks in advance.
Updated 19-Apr-16 21:27pm
Comments
Sinisa Hajnal 11-Apr-16 4:00am    
You're on the right track. Now find the pattern that matches the content of the site (it will probably be different for each site). If you are reading a well-known site (such as FB), you don't need to write the parsing yourself; they offer a rich API to get the data you need.
felix pascal 11-Apr-16 13:50pm    
Okay, thanks for your kind response. Could you please point me to a reference where I can find such an API? Thanks.
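
One way to follow the "find the pattern" suggestion above without writing a separate pattern for every site is to look for Open Graph meta tags, which Facebook reads when it builds a link preview. The sketch below is only an illustration under assumptions: it assumes the target page actually publishes tags such as og:title, og:image, og:description or article:published_time (many blogs do not), that the property/name attribute appears before content inside the tag, and it runs against hypothetical sample markup rather than a downloaded page. The module and function names (OpenGraphSketch, GetMeta) are made up for this example.

```vb
Imports System
Imports System.Net
Imports System.Text.RegularExpressions

Module OpenGraphSketch
    ' Returns the content attribute of the first <meta> tag whose property or
    ' name attribute equals key, or an empty string when no such tag is found.
    ' Assumption: property/name comes before content inside the tag.
    Function GetMeta(html As String, key As String) As String
        Dim pattern As String =
            "<meta\b[^>]*?\b(?:property|name)\s*=\s*[""']" & Regex.Escape(key) & "[""']" &
            "[^>]*?\bcontent\s*=\s*(?<q>[""'])(?<v>.*?)\k<q>"
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase)
        ' Decode entities such as &amp; in the attribute value.
        Return If(m.Success, WebUtility.HtmlDecode(m.Groups("v").Value), "")
    End Function

    Sub Main()
        ' Hypothetical sample markup; in the page code this string would come
        ' from WebClient.DownloadString as in the question.
        Dim html As String =
            "<head><meta property=""og:title"" content=""Sample post"" />" &
            "<meta property=""og:image"" content=""http://example.com/a.png"" />" &
            "<meta property=""article:published_time"" content=""2016-04-11"" /></head>"

        Console.WriteLine(GetMeta(html, "og:title"))
        Console.WriteLine(GetMeta(html, "og:image"))
        Console.WriteLine(GetMeta(html, "article:published_time"))
        ' A tag that is absent yields an empty string, so callers can fall back
        ' to the <title> and first <img> approach from the question.
        Console.WriteLine(GetMeta(html, "og:description") = "")
    End Sub
End Module
```

If a page has none of these tags, the regular expressions from the question remain the fallback. For anything beyond simple tags, an HTML parser is usually more robust than regular expressions, since attribute order and quoting vary between sites.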

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


