I have been working on extracting text from images, specifically text in seven-segment fonts, using .NET. Unfortunately, my attempts with popular libraries such as Tesseract, IronOcr and several others have been unsuccessful; they seem to work well only with normal English fonts.

Despite many efforts, I'm facing challenges in accurately extracting text from images with seven-segment fonts.


Link to image dataset folder:
https://drive.google.com/drive/folders/1b4S-UQbxaXZPbDfOkTiC1m0qTIt6fdb7

What I have tried:

1. Tesseract (limited to normal English fonts; unable to recognize seven-segment characters)
2. IronOcr (similar limitations; not suitable for seven-segment fonts)
3. Leadtools
4. Pretrained models
5. Custom-trained models
6. Some MATLAB and Python projects from the internet
7. Some free OCR API providers
8. Image processing techniques, including:
   Cropping and zooming to the text region.
   Applying grayscale, black-and-white, and binarization filters.
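
For reference, the binarization step mentioned above can be sketched without any imaging library. This is a minimal illustration, not what was actually run for this question: it assumes the image is already a grid of 0-255 grayscale values, it uses Otsu's method to pick the threshold (my choice for the sketch), and the names `otsu_threshold` and `binarize` are hypothetical. The logic is plain loops, so it should port to C# with little change.

```python
def otsu_threshold(gray):
    """Return the 0-255 threshold that maximizes between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1

    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue              # no dark pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break                 # every pixel is already in the dark class
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray, lit_is_bright=True):
    """Map each pixel to 1 (lit segment) or 0 (background).

    Assumption: lit segments are brighter than the background, as on an
    LED display. Pass lit_is_bright=False for dark-on-light LCD panels.
    """
    t = otsu_threshold(gray)
    if lit_is_bright:
        return [[1 if v > t else 0 for v in row] for row in gray]
    return [[1 if v <= t else 0 for v in row] for row in gray]
```

A global threshold like this tends to struggle with glare or uneven lighting, which may be one reason binarization alone did not help; a locally adaptive threshold is the usual next thing to try.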
Updated 30-Jan-24 1:05am
Comments
[no name] 30-Jan-24 13:57pm    
If you can't find a solution, you'll obviously have to create your own using "AI" and "machine learning" (pattern recognition).
Maciej Los 31-Jan-24 13:53pm    
Agreed. And I believe it could be a path through torment for a person who has never worked with AI.
[no name] 31-Jan-24 14:09pm    
Have to know where to start. Except with the "new" (AI) searches, I can never find what I used to; I can only go by my own (offline) archives. It's just a bunch of images of the characters in different "warps", i.e. it goes beyond "nice neat characters".
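
As a possible starting point for the "create your own" pattern recognition suggested in the comments, here is a minimal sketch of one classical, non-ML approach: sample seven fixed regions of a digit cell and look up which segments are lit. Everything here is an assumption for illustration: the input is an upright, binarized, fixed-size cell holding one digit (1 = lit), the `REGIONS` box coordinates are made-up fractions that would need tuning against real images, and `lit_segments`, `decode_digit` and `render_digit` are hypothetical names. It does not handle the warped characters mentioned above; those would need deskewing first.

```python
# Segment names follow the common a-g convention:
# a = top, b = top right, c = bottom right, d = bottom,
# e = bottom left, f = top left, g = middle.
SEGMENTS_TO_DIGIT = {
    frozenset("abcdef"): "0",
    frozenset("bc"): "1",
    frozenset("abdeg"): "2",
    frozenset("abcdg"): "3",
    frozenset("bcfg"): "4",
    frozenset("acdfg"): "5",
    frozenset("acdefg"): "6",
    frozenset("abc"): "7",
    frozenset("abcdefg"): "8",
    frozenset("abcdfg"): "9",
}

# Hypothetical sampling boxes as fractions of the cell: (x0, y0, x1, y1).
REGIONS = {
    "a": (0.25, 0.00, 0.75, 0.15),
    "b": (0.80, 0.10, 1.00, 0.45),
    "c": (0.80, 0.55, 1.00, 0.90),
    "d": (0.25, 0.85, 0.75, 1.00),
    "e": (0.00, 0.55, 0.20, 0.90),
    "f": (0.00, 0.10, 0.20, 0.45),
    "g": (0.25, 0.45, 0.75, 0.55),
}


def lit_segments(cell, min_fill=0.5):
    """Return the set of segments whose box is at least min_fill lit."""
    h, w = len(cell), len(cell[0])
    lit = set()
    for name, (x0, y0, x1, y1) in REGIONS.items():
        # Guarantee at least one pixel per box, even for tiny cells.
        xs = range(int(x0 * w), max(int(x0 * w) + 1, int(x1 * w)))
        ys = range(int(y0 * h), max(int(y0 * h) + 1, int(y1 * h)))
        on = sum(cell[y][x] for y in ys for x in xs)
        if on / (len(xs) * len(ys)) >= min_fill:
            lit.add(name)
    return frozenset(lit)


def decode_digit(cell):
    """Map a binarized digit cell to '0'-'9', or '?' if nothing matches."""
    return SEGMENTS_TO_DIGIT.get(lit_segments(cell), "?")


def render_digit(digit, w=20, h=40):
    """Draw a synthetic cell for testing by filling the same boxes."""
    wanted = next(k for k, v in SEGMENTS_TO_DIGIT.items() if v == digit)
    cell = [[0] * w for _ in range(h)]
    for name in wanted:
        x0, y0, x1, y1 = REGIONS[name]
        for y in range(int(y0 * h), int(y1 * h)):
            for x in range(int(x0 * w), int(x1 * w)):
                cell[y][x] = 1
    return cell
```

The round trip `decode_digit(render_digit("5"))` only shows that the lookup is self-consistent. On real photos the hard part is finding and straightening each digit cell before this step, and the fill threshold will likely need adjusting per display.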

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


