Hi Everyone,

We are facing some issues with our search:

After a full crawl, all documents are crawled and available in the system.

After some number of incremental crawls, either some documents get deleted from the crawl index, or some properties (metadata) of a document, such as its title, get deleted.

If we run a full crawl again, everything goes back to normal, but after a while the same issues reappear.

We have already tried normal full crawls, but the issue always comes back after some time.

Troubleshooting findings:

After going through the crawl logs, we noticed that the number of deletes is far higher in the production environment than in the lower environment where we do our troubleshooting. The lower environment has almost the same content, and we did not find any such issues there.

At a high level, our search setup is as follows:

A BCS crawl is implemented for crawling the external document repository, DSpace.
DSpace exposes a set of REST URLs, which the BCS models use for incremental and full crawls.
We use the change log approach for incremental crawls (implemented with Microsoft's help).
Incremental crawls are scheduled to run every 5 hours.
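
As an illustration only, the change log approach can be sketched roughly as follows. This is not the actual BCS model or the DSpace REST API: the `ChangeLogEntry` type, the `ids_since` and `incremental_crawl` functions, and the sample IDs are all hypothetical. The sketch assumes the repository can report which item IDs were changed and which were deleted after the last crawl time, which is the idea behind the BCS `ChangedIdEnumerator` and `DeletedIdEnumerator` stereotypes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeLogEntry:
    """One hypothetical change log record from the external repository."""
    item_id: str
    action: str          # "changed" or "deleted"
    timestamp: datetime


def ids_since(log, last_crawl, action):
    """Return sorted distinct IDs with the given action logged after last_crawl."""
    return sorted(
        {e.item_id for e in log if e.action == action and e.timestamp > last_crawl}
    )


def incremental_crawl(index, log, last_crawl):
    """Apply changes, then deletes, to the index (a set of item IDs).

    Simplification: an item that is deleted and later re-created within the
    same window is not handled specially here.
    """
    changed = ids_since(log, last_crawl, "changed")
    deleted = ids_since(log, last_crawl, "deleted")
    index.update(changed)            # re-crawl changed or new items
    index.difference_update(deleted)  # drop items reported as deleted
    return changed, deleted


# Usage with made-up data
t0 = datetime(2018, 3, 20)
log = [
    ChangeLogEntry("doc-1", "changed", t0 + timedelta(hours=1)),
    ChangeLogEntry("doc-2", "deleted", t0 + timedelta(hours=2)),
    ChangeLogEntry("doc-3", "changed", t0 - timedelta(hours=1)),  # before last crawl
]
index = {"doc-2", "doc-4"}
changed, deleted = incremental_crawl(index, log, last_crawl=t0)
print(changed, deleted, sorted(index))
```

In a sketch like this, the set of deletes depends entirely on what the repository reports for the time window, so the delete list returned for each incremental crawl is the natural thing to compare between environments.
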
Thanks and Regards,

Athar Faridi

Posted
Updated 20-Mar-18 20:45pm

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


