Hi,

I have been assigned the task of reading text from a .jpg image file.

After searching, I learned that [Tessnet2] can help convert a .jpg to .txt, but the [Tessnet2] sample has not helped me much.

Kindly help me by showing the right path to achieving OCR with the best accuracy.

Thank you.
Posted
Updated 13-Jun-12 23:44pm

If you are running Windows 7 or Server 2008 R2, the optional Windows component TIFF IFilter has OCR capabilities. I wrote my OCR option around it and it works really well. Best of all, it is free on either of these systems. You may want to investigate it as a possibility.
 
 
Hi Muthukumar,
This question is solved here -
C# OCR (How to Read a single character from image)
 
 
Although Tesseract is one of the more accurate free OCR engines, it was rather inaccurate when I last tried it a couple of years ago. We ran into similar problems with the other open source OCR libraries we tried, and wound up using LEADTOOLS, which gave faster and more accurate results.

You can see an example in the following article:
Minimum OCR demo
 
No solution can guarantee that it will read every JPG file, because most OCR readers rely on preloaded font formats: the text in the JPG must be in one of the supported font families to be recognised.

Apart from that, any of the solutions suggested by others can be used.

I used a few of them a long time ago, and one problem you might face is concatenating the text from an image in which text is printed at various positions.
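
To illustrate the concatenation problem, here is a minimal sketch of one way to put recognised words back into reading order. It assumes your OCR engine reports a bounding box for each word; the `Word` class, its fields, and the `tolerance` parameter are hypothetical names made up for this example, not part of any particular OCR library. Words whose vertical centres are close together are treated as one line, and each line is then read left to right.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """Hypothetical OCR result: recognised text plus its bounding box in pixels."""
    text: str
    left: int
    top: int
    height: int


def centre_y(word: Word) -> float:
    """Vertical centre of the word's bounding box."""
    return word.top + word.height / 2


def words_to_text(words: List[Word], tolerance: float = 0.5) -> str:
    """Join words top to bottom, then left to right within each line.

    A word joins the current line when its vertical centre lies within
    `tolerance` * height of the first word on that line.
    """
    lines: List[List[Word]] = []
    for word in sorted(words, key=centre_y):
        if lines:
            first = lines[-1][0]
            if abs(centre_y(word) - centre_y(first)) <= tolerance * first.height:
                lines[-1].append(word)
                continue
        lines.append([word])  # too far below the current line: start a new one
    return "\n".join(
        " ".join(w.text for w in sorted(line, key=lambda w: w.left))
        for line in lines
    )


# Made-up sample data: four words reported out of order by the engine.
sample = [
    Word("world", 120, 12, 20),
    Word("Hello", 10, 10, 20),
    Word("line", 80, 52, 20),
    Word("Second", 10, 50, 20),
]
print(words_to_text(sample))
```

This simple grouping only works for roughly horizontal, single-column text; skewed scans or multi-column layouts would need a more careful approach.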

Thanks
Rushikesh Joshi
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


