How to segment characters using projection (histogram profile)

Question

Hi, I'm doing a research project on OCR and I need to segment characters using horizontal and vertical histogram profiles (projection profiles). This is the code I have tried, but I haven't been able to segment the lines of a document by cropping the image at the positions where the histogram bin value is zero. Please help me with this problem. Thanks.

C++

#include "stdafx.h"
#include <iostream>
#include <fstream>
#include <opencv\cv.h>
#include <opencv\cxcore.h>
#include <opencv\highgui.h>


int _tmain(int argc, _TCHAR* argv[])
{
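	IplImage *img = cvLoadImage("new.jpg");
	if (!img) {
		// cvLoadImage returns NULL when the file is missing or unreadable
		printf("Could not load new.jpg \n");
		return -1;
	}
	CvSize imgSize =cvGetSize(img);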

	//Gray scale
	IplImage *gray=cvCreateImage(cvSize(img->width,img->height),8,1);
	cvCvtColor(img,gray,CV_BGR2GRAY); // cvLoadImage returns channels in BGR order

	//binary
	IplImage *binary=cvCreateImage(cvSize(img->width,img->height),8,1);
	cvThreshold(gray, binary, 5, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);

	double pixel;
	int count=0;
	int height=binary->height;
	int width=binary->width;

	int *HorizontalHistogram = new int[height];
	for(int i = 0; i < height; i++)
    {
        HorizontalHistogram[i] = 0;
    }

	//Line segmentation
	printf("Horizontal Bin Values \n");
	for(int j=0;j<(binary->height);j++){
		count=0;
		for(int i=0;i<(binary->width);i++){
			pixel=cvGetReal2D(binary,j,i);
			if( pixel==0 ){
				HorizontalHistogram[j]++;
				count++;	
			}	
		}
		printf("%d \n", count);
	}
	
	int Hhist_w = height; int Hhist_h = 300;
	int Vhist_w = width; int Vhist_h = 300;
	float range[] = {0,255};
	float *ranges[] = {range};
	int Hhist_size = {binary->height};
	int Vhist_size = {binary->width};
	float min_value,max_value = 0;
	IplImage *histImage1 = cvCreateImage(cvSize(height,300),8,1);
	IplImage *histImage2 = cvCreateImage(cvSize(width,300),8,1);
	cvSet(histImage1,cvScalarAll(255),0);
	cvSet(histImage2,cvScalarAll(255),0);
	CvHistogram *hist = cvCreateHist(1,&Hhist_size,CV_HIST_ARRAY,ranges,1);
	int bin_w1 = cvRound((double)histImage1->width/Hhist_size);
	int bin_w2 = cvRound((double)histImage2->width/Vhist_size);

	for(int i = 0; i < height; i++)
    {
		cvLine(histImage1, cvPoint(bin_w1*(i), Hhist_h),
                              cvPoint(bin_w1*(i), Hhist_h - HorizontalHistogram[i]),
             cvScalar(0,0,0), 1, 8, 0);
    }

	cvNamedWindow("Image:");
	cvShowImage("Image:", img);
	cvNamedWindow("Binary:");
	cvShowImage("Binary:", binary);
	cvNamedWindow("HorizontalHistogram:");
	cvShowImage("HorizontalHistogram:", histImage1);

	cvWaitKey(0);

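	cvDestroyWindow("Image:");
	cvReleaseImage(&img);
	cvReleaseImage(&gray);
	cvDestroyWindow("Binary:");
	cvReleaseImage(&binary);
	cvDestroyWindow("HorizontalHistogram:");
	cvReleaseImage(&histImage1);
	cvReleaseImage(&histImage2);
	cvReleaseHist(&hist);
	delete[] HorizontalHistogram;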

	return 0;
}

Posted 19-Mar-13 7:41am

123ezone

Updated 19-Mar-13 7:47am

Jochen Arndt

v2

Comments

nv3 19-Mar-13 15:58pm

Could you explain a little more what the actual problem is, please? Does the histogram that you output show the line separations? Can you tell by looking at the binary image whether the binarization has delivered a reasonable result?

What I can see is that you don't do any rotational correction of the image. If it is just slightly rotated, you won't see deep depressions in the histogram for the line separations. Instead the histogram will look more or less homogeneous. Is that the case?

Just to mention on the side, your program has a lot of memory leaks and could use some improvements in other places as well. But we can get to that as soon as you have solved the major problem you are stuck on.

123ezone 20-Mar-13 4:38am

The output histogram is generated by scanning the image horizontally, and the places where the histogram is zero are the places where I should segment. Then I can segment the lines.
Binarization, image enhancement and rotation are done by another group member, and I have implemented this assuming the input image is already enhanced.
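
One possible way to scan the histogram for those zero bins, as a minimal sketch rather than the code that was eventually used in this thread. It assumes the counts from `HorizontalHistogram` have been copied into a `std::vector<int>`; `findRuns` and `minLength` are illustrative names, not part of OpenCV or the original program.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: returns [start, end) row ranges where the profile is
// non-zero, i.e. the text lines that lie between the zero-valued gaps.
std::vector<std::pair<int, int>> findRuns(const std::vector<int>& profile,
                                          int minLength = 1)
{
    assert(minLength >= 1);
    std::vector<std::pair<int, int>> runs;
    int start = -1;  // -1 means "currently inside a gap"
    const int n = static_cast<int>(profile.size());
    // Iterate one step past the end so a run touching the last row is closed.
    for (int y = 0; y <= n; ++y) {
        const bool ink = (y < n) && (profile[y] > 0);
        if (ink && start < 0) {
            start = y;                        // a run begins
        } else if (!ink && start >= 0) {
            if (y - start >= minLength)       // a run ends; skip tiny specks
                runs.emplace_back(start, y);
            start = -1;
        }
    }
    return runs;
}
```

Each returned pair gives the first row of a text line and the row just past its last one. Raising `minLength` is one simple way to ignore isolated noise rows; whether that is needed depends on how clean the binarized input is.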

nv3 20-Mar-13 4:45am

So what exactly is your problem then? Doesn't the histogram show the line gaps? Or is the histogram ok, and you simply don't know how to implement the segmentation?

123ezone 20-Mar-13 4:51am

The histogram shows the line gaps, but I have no idea how to segment the lines at those places and crop them into a separate set of images.

nv3 20-Mar-13 5:29am

So if you detect two neighboring gaps at y=15 and y=35, then create a new image with height 20 and copy the contents of your binarized image to this new image.

What you want to do with all those line image strips is a question of the interface between your function and the other system components. You could for example return an array of such line images to your caller.

If your task is to write the main program, you could for example process those line images one at a time in a loop and run your OCR engine on each one.
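
The copy step described above could look roughly like this. It is a sketch on a plain row-major buffer so that it stays self-contained; `GrayImage` and `cropRows` are hypothetical names introduced only for illustration. With the `IplImage` API used in the question, a similar result can probably be had with `cvSetImageROI`, `cvCopy` and `cvResetImageROI`, but that variant is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal image type: 8-bit, single channel, row-major,
// with no padding between rows.
struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<unsigned char> pixels;  // size == width * height
};

// Copies rows [top, bottom) of src into a new image of height bottom - top.
GrayImage cropRows(const GrayImage& src, int top, int bottom)
{
    assert(0 <= top && top < bottom && bottom <= src.height);
    GrayImage strip;
    strip.width = src.width;
    strip.height = bottom - top;
    // Rows are contiguous, so the strip is one contiguous block of pixels.
    const std::ptrdiff_t first = static_cast<std::ptrdiff_t>(top) * src.width;
    const std::ptrdiff_t last = static_cast<std::ptrdiff_t>(bottom) * src.width;
    strip.pixels.assign(src.pixels.begin() + first, src.pixels.begin() + last);
    return strip;
}
```

For the gaps at y=15 and y=35 mentioned above, `cropRows(image, 15, 35)` would give a strip of height 20. Calling it once per detected line and collecting the results in a `std::vector<GrayImage>` is one way to return the array of line images to the caller.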

123ezone 20-Mar-13 10:41am

Thank you, I have succeeded in segmenting the lines. Now I'm trying to segment the characters in each of those line images. Thank you very much.
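
For the character step, the same idea can be applied column-wise to each line strip. The sketch below is an assumption about how that might be done, not code posted in this thread: `columnProfile` and `characterColumns` are illustrative names, the strip is taken to be a row-major 8-bit buffer, and ink is assumed to have the value 0, as in the question's binary image.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Counts ink pixels (value 0) in every column of a row-major line strip.
std::vector<int> columnProfile(const std::vector<unsigned char>& strip,
                               int width, int height)
{
    assert(width >= 0 && height >= 0);
    assert(strip.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> profile(width, 0);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (strip[static_cast<std::size_t>(y) * width + x] == 0)
                ++profile[x];
    return profile;
}

// Returns [left, right) column ranges whose profile is non-zero; each range
// is a candidate character (or a group of touching characters).
std::vector<std::pair<int, int>> characterColumns(const std::vector<int>& profile)
{
    std::vector<std::pair<int, int>> cols;
    int left = -1;  // -1 means "currently inside a gap"
    const int n = static_cast<int>(profile.size());
    for (int x = 0; x <= n; ++x) {
        const bool ink = (x < n) && (profile[x] > 0);
        if (ink && left < 0) {
            left = x;
        } else if (!ink && left >= 0) {
            cols.emplace_back(left, x);
            left = -1;
        }
    }
    return cols;
}
```

Characters that touch or overlap horizontally leave no empty column between them, so a plain zero-bin split like this will return them as a single range.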

nv3 20-Mar-13 10:49am

You are welcome. I'll write a solution with a few lines of comments, so we can consider the case closed.

Member 11124690 26-Nov-15 1:19am

Please post the code for line segmentation. Thank You

FEPY 13-Jul-13 4:14am

Hi, I tried to segment characters using vertical and horizontal projection, but I could not get it to work.

Could you please post the complete code for character segmentation using vertical and horizontal projection?

Thanks!
Best regards!

Member 11612026 16-Apr-15 15:15pm

Thanks a lot, it is very useful. Can you please give some sample code showing how to scan that histogram and get the separate lines?

1 solution

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

nv3 · Accepted Answer · 2013-03-20T04:52:00

Solution 1

The problem was actually that the OP was uncertain about how to extract the line images from the main image after having calculated the positions of the line gaps. The case was resolved through the discussion in the comment section (see above).

Posted 20-Mar-13 4:52am

nv3

Comments

Jochen Arndt 20-Mar-13 11:05am

5ed for your efforts, and congratulations on reaching platinum authority.

nv3 20-Mar-13 11:14am

Thanks Jochen, that is very kind of you.

Member 11124690 26-Nov-15 1:16am

Hi 123ezone, can you please post the code for line and character segmentation and complete the code. Thanks.