C#
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.Net;
using System.IO;

namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        private String webText;

        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            // Dispose the client, stream, and reader deterministically;
            // disposing the reader also closes the underlying stream.
            using (WebClient web = new WebClient())
            using (Stream stream = web.OpenRead("https://de.wikipedia.org"))
            using (StreamReader reader = new StreamReader(stream))
            {
                webText = reader.ReadToEnd();
            }

            richTextBox1.Text = webText;
        }
    }
}


What I have tried:

This code works well for displaying the whole page source.

But I would like to go through the source using the GetElementById function.
Apparently this function is only available on the HtmlDocument type, and I couldn't find a way to convert the string I get back from my stream into an HtmlDocument.

Is there a way to convert a string into an HtmlDocument?
Or, instead of writing the source into a string, can I create an HtmlDocument in the first place?

Thanks
Posted
Updated 15-May-17 4:55am
Comments
F-ES Sitecore 15-May-17 10:33am    
It's unlikely anyone here is going to help you steal another site's content.
Arimatas 15-May-17 10:44am    
I had no idea that that's a bad thing. I just wanted to automate some copy-paste work for personal use :X

1 solution

I wrote (and currently use) these classes to allow me to consume data from a web site.
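
On the narrower question of getting an HtmlDocument from a string: HtmlDocument has no public constructor, so one common workaround (not the classes mentioned above, just a minimal sketch) is to hand the string to a WebBrowser control and read its Document once loading has finished. The class name HtmlStringLoader and the method name Load are made up for this illustration, and the code assumes it runs on an STA thread with a Windows Forms message loop, as it would inside a button click handler.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper for illustration; not part of any library.
public static class HtmlStringLoader
{
    // Loads an HTML string into an off-screen WebBrowser control and passes the
    // parsed HtmlDocument to the callback once the control reports completion.
    // Assumption: called on an STA thread that is pumping Windows messages.
    // The returned WebBrowser should be disposed by the caller when done.
    public static WebBrowser Load(string html, Action<HtmlDocument> onReady)
    {
        var browser = new WebBrowser { ScriptErrorsSuppressed = true };

        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (sender, e) =>
        {
            // Unsubscribe first so the callback runs at most once.
            browser.DocumentCompleted -= handler;
            onReady(browser.Document);
        };
        browser.DocumentCompleted += handler;

        // Setting DocumentText starts an asynchronous load of the string.
        browser.DocumentText = html;
        return browser;
    }
}
```

In the form from the question, this could be called as HtmlStringLoader.Load(webText, doc => { ... }) after the download, using doc.GetElementById("some-id") inside the callback. GetElementById returns null when no element has that id, so the result should be checked before reading InnerText.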

However, please ensure that you have permission from the owners of the website to scrape data for use by your app.

You may also want to consider simply using Wikipedia's API.  For example:

API call to retrieve information about CodeProject in JSON
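
As a rough sketch of what such a call could look like from C#: the code below builds a MediaWiki Action API URL that asks for the plain-text introduction of a page as JSON and downloads the raw response. The exact query behind the link above is not shown here, so the parameter choice (action=query with prop=extracts), the class name WikipediaApi, and the User-Agent string are all assumptions for illustration.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper for illustration; not an official client.
public static class WikipediaApi
{
    // Builds a query URL for the plain-text intro of one page, returned as JSON.
    // The title is URL-escaped so spaces and special characters are safe.
    public static string BuildExtractUrl(string title, string languageCode = "en")
    {
        return "https://" + languageCode + ".wikipedia.org/w/api.php"
             + "?action=query&prop=extracts&exintro=1&explaintext=1&format=json"
             + "&titles=" + Uri.EscapeDataString(title);
    }

    // Downloads the raw JSON response as a string.
    public static string DownloadJson(string url)
    {
        using (var web = new WebClient())
        {
            web.Encoding = Encoding.UTF8;
            // Placeholder User-Agent; replace with something identifying your app.
            web.Headers[HttpRequestHeader.UserAgent] = "MyDemoApp/0.1 (you@example.com)";
            return web.DownloadString(url);
        }
    }
}
```

For example, WikipediaApi.DownloadJson(WikipediaApi.BuildExtractUrl("CodeProject")) would return a JSON string that can be handed to whichever JSON parser you prefer, with no HTML parsing needed.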

/ravi
 
Comments
Arimatas 15-May-17 14:57pm    
Thanks, I'll give this a try.
And you're right, I didn't think about an existing API.
I'll take a look into this too.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900