I am using Solr 5.0 and Nutch 1.10 with Cygwin on Windows Server 2008 R2. I am issuing the command:

bin/crawl -D urls/ bin/urls crawl/ 2

As far as I know, 2 is the number of crawl rounds. When I execute this command and read the crawldb, I receive only 127 URLs, which is far fewer than expected, and the crawl does not go any deeper. I then pass the data to Solr with:

bin/nutch solrindex http://127.0.0.1:8983/solr/thetest crawl/crawldb -linkdb crawl/linkdb crawl/segments/*

and when I then perform a search I get only 20 URLs in total. Can anyone help? I need to crawl to a deeper depth.
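For context, the number of URLs a Nutch crawl retains is often capped by its default filters and limits rather than by the round count. A hedged sketch of a `conf/nutch-site.xml` override (the property names exist in stock Nutch; the values here are illustrative, not recommendations):

```xml
<configuration>
  <!-- If true, Nutch ignores links that leave the seed hosts,
       which sharply limits crawl breadth. -->
  <property>
    <name>db.ignore.external.links</name>
    <value>false</value>
  </property>
  <!-- Maximum outlinks kept per page (stock default is 100); raising it
       lets link-heavy pages contribute more URLs to the crawldb. -->
  <property>
    <name>db.max.outlinks.per.page</name>
    <value>200</value>
  </property>
</configuration>
```

Also worth checking: `conf/regex-urlfilter.txt` may be excluding URLs, and `bin/nutch readdb crawl/crawldb -stats` shows how many URLs were discovered versus actually fetched.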

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
