I need to scrape a website that uses AJAX. The website searches for listings, but uses AJAX to return the results.

The results are returned progressively, not all at once.

How do I wait, using HttpWebRequest, for the search to finish before returning the HTML result?

Thanks
Comments
Sergey Alexandrovich Kryukov 12-Sep-11 22:04pm    
Yes, would not be easy, I guess...
--SA
BillWoodruff 13-Sep-11 11:37am    
Why not get started by examining some of the many "web scraping" articles and posts on CP: http://www.codeproject.com/search.aspx?q=web+scraping&doctypeid=1%3b2%3b3&sort=createddesc
AspDotNetDev 23-Sep-11 16:14pm    
Do any of them deal with AJAX? Usually, they just deal with a simple web request to get HTML.

I don't believe there is a way to get it by waiting. HttpWebRequest only fetches the initial HTML and does not run the page's JavaScript. Because AJAX is used, the page is more than likely adding the results to the DOM from script, which you can't "wait" for, because this happens AFTER the OK response is sent back to your HttpWebRequest. However...

I would try to find WHERE in the JavaScript of the page you are trying to scrape it goes and gets its data. There has to be some sort of URL that the AJAX method POSTs to. Find that, and then request THAT URL directly (for a certain search term), and then you'll have your information. It will probably be in XML, JSON or something like that rather than HTML, but at least it'll be something.
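
As a rough illustration of that idea, here is a minimal sketch in Python (the same pattern applies with HttpWebRequest). Everything specific in it is an assumption made up for the example: the endpoint URL, the form fields `q` and `offset`, and a JSON reply shaped like `{"items": [...], "done": true}`. The real names have to be read from the page's script or from the browser's network tab. Because the results arrive progressively, the sketch keeps asking for the next batch until the reply says the search has finished.

```python
import json
import time
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace with the URL the page's script actually calls.
SEARCH_URL = "https://example.com/ajax/search"


def build_request(term, offset=0):
    """Build the POST the page's JavaScript would send (field names are assumed)."""
    data = urllib.parse.urlencode({"q": term, "offset": offset}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=data,  # supplying a body makes urllib send a POST
        headers={"X-Requested-With": "XMLHttpRequest"},
    )


def fetch_batch(term, offset):
    """Send one request and decode the JSON body of the reply."""
    with urllib.request.urlopen(build_request(term, offset), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def collect_results(term, fetch=fetch_batch, delay=1.0, max_requests=50):
    """Request batches until the (assumed) 'done' flag is set or the cap is hit."""
    results = []
    for _ in range(max_requests):
        batch = fetch(term, len(results))
        results.extend(batch.get("items", []))
        if batch.get("done", True):  # treat a missing flag as "finished"
            break
        time.sleep(delay)  # be polite between requests
    return results
```

`collect_results("flats")` would then return the accumulated items. The `fetch` parameter exists only so the loop can be exercised without touching the network.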
 
 
I like John's idea, but if that doesn't work (perhaps because of some security feature) you can load the page in a WebBrowser control, wait a few seconds, then inject some JavaScript into the page (via the WebBrowser) that crawls the DOM.
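
A fixed wait of a few seconds can be too short or too long when results trickle in. One possible refinement, sketched here in Python under stated assumptions, is to poll until the number of result elements stops changing. `read_count` is a hypothetical callable standing in for however you read that number from the loaded page (for example, a script evaluated through the browser control); it is not part of any real API.

```python
import time


def wait_until_stable(read_count, interval=0.5, stable_polls=3, timeout=30.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll read_count() until it returns the same value stable_polls times in a row.

    Returns the final value, or raises TimeoutError if the count is still
    changing when the timeout expires.
    """
    deadline = clock() + timeout
    last = read_count()
    unchanged = 0
    while unchanged < stable_polls:
        if clock() >= deadline:
            raise TimeoutError("result count did not settle before the timeout")
        sleep(interval)
        current = read_count()
        if current == last:
            unchanged += 1
        else:
            # New results arrived, so start counting quiet polls again.
            last, unchanged = current, 0
    return last
```

Once the count has settled, the DOM can be crawled as described above. A page that pauses between batches for longer than `interval * stable_polls` would still fool this check, so those values need tuning per site.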
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


