I'd like to use IFilters to extract text from many file types.
This works in general, but I cannot call the default PDF IFilter that ships with Windows 8 and Windows Server 2012:
glcndFilter.dll
Calling it gives me an error.
Filtdump.exe also fails on it:
Failed to CoCreate ILoadFilter instance, hr == 0x80040154
FILTDUMP failed, hr == 0x80040154
At the same time, my application can call every other filter.
How can I call this one?
I want to open PDF, DOC, TXT and other files in my C# application and extract plain text from them using IFilter.
I can already do this for every other file type; the only problem is with PDF.
The reason is that Microsoft provides its own PDF IFilter, glcndFilter.dll, for Windows 8 and Windows Server 2012, and I cannot call this filter (see above).
However, SQL Server uses this same filter on the same machine.
I used this very useful code:
https://github.com/Sicos1977/IFilterTextReader