Hi all,
I want to get all the links from a website using MapReduce. For example, for
codeproject.com
the links would be:
http://www.codeproject.com/script/Forums/List.aspx
http://www.codeproject.com/Questions/ask.aspx
....
How can I do it?
Is it possible?
Thanks.
Comments
Richard MacCutchan 30-Mar-15 3:40am    
You can't do it, unless the site provides a page that contains all the links. And most sites will not wish to allow it.
[no name] 30-Mar-15 3:54am    
Some sites do provide a page that contains the links, and I found one, but I can't work out how to implement this in MapReduce. I'm new to MapReduce.
I have installed Eclipse Map/Reduce.

1 solution

There are two options:
1. Crawl the site: load the site's home page, follow every link on that page, load those pages, follow every link on them, and so on, skipping pages you have already visited.
2. If the site publishes a sitemap using the Sitemaps protocol (not every site does), read the URLs from it: http://www.sitemaps.org/index.html
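As a rough illustration of how either option could feed a map/reduce job, here is a minimal Python sketch using only the standard library. It is not Hadoop code: `map_links`, `reduce_links`, `run_job` and `sitemap_urls` are hypothetical names made up for this example, and `run_job` is only a local stand-in for the grouping a real framework would do. It assumes the pages have already been downloaded and are passed in as `(url, html)` pairs.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
import xml.etree.ElementTree as ET


class _LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def map_links(page_url, html):
    """Map step (hypothetical name): emit (absolute_link, 1) per link on one page."""
    parser = _LinkParser()
    parser.feed(html)
    for href in parser.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if absolute.startswith(("http://", "https://")):
            yield absolute, 1


def reduce_links(link, counts):
    """Reduce step (hypothetical name): one record per distinct link with its count."""
    return link, sum(counts)


def run_job(pages):
    """Local stand-in for the framework: map, group by key, then reduce."""
    grouped = defaultdict(list)
    for page_url, html in pages:
        for link, one in map_links(page_url, html):
            grouped[link].append(one)
    return dict(reduce_links(link, counts) for link, counts in sorted(grouped.items()))


def sitemap_urls(xml_text):
    """Option 2: read the <loc> entries from a sitemaps.org-style sitemap document."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", ns) if loc.text]
```

For a crawl, the keys returned by `run_job` that have not been fetched yet would become the input of the next round. With a sitemap, `sitemap_urls` gives the list of pages directly. Under Hadoop, the map and reduce functions would be written as a Mapper and a Reducer instead, with downloading handled separately.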
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


