I need to get the HTML from a website so that I can parse it and use regex to detect patterns, such as links ("href" attributes). Is parsing even worth it?

What I have tried:

I have tried using the socket module (https://docs.python.org/3/library/socket.html) to fetch the HTML, but I am struggling.
Updated 22-Aug-21 2:53am

You can do it, but you would be better off using an HTML scraping package such as Beautiful Soup, which will do most of the "donkey work" for you. See "Build a Web Scraper With Python – Real Python".
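To make the "use a parser rather than regex" advice concrete, here is a minimal sketch that uses only the standard library (html.parser and urllib) in place of Beautiful Soup, so it runs without installing anything. The names LinkCollector, fetch_html and extract_links are made up for this illustration, and the UTF-8 fallback is an assumption about the page encoding.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (illustrative name)."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def fetch_html(url, timeout=10):
    """Download a page over HTTP(S); no raw sockets needed.

    Assumption: falls back to UTF-8 when the server declares no charset.
    """
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html, base_url=""):
    """Return every link target found in the given HTML text."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    sample = '<p><a href="/docs">Docs</a> <a href="https://example.com/x">X</a></p>'
    print(extract_links(sample, base_url="https://example.com"))
```

For a live page you would call extract_links(fetch_html(url), base_url=url). With Beautiful Soup installed, the equivalent of the collector class is roughly BeautifulSoup(html, "html.parser").find_all("a", href=True); it also tends to cope better with badly broken markup.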
 
 
I already answered this question at https://www.codeproject.com/Answers/5310985/How-do-I-make-a-HTML-parser-from-scratch-Python#answer1. Please do not repost the question; the answer to your problem remains the same.
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


