Hi,
I developed a C# application that converts a WAV file to text. I created the WAV file with the SAPI TTS application tool, using a Microsoft voice, because I expected the recognizer to handle that voice most accurately. Even so, the result is not accurate: words are recognized wrongly, for example "meeting" as "needing" and "cute" as "dubed".
My code is below.
C#
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.IO;
using SpeechLib;
namespace using_wav
{
    public partial class Form1 : Form
    {
        private SpeechLib.ISpeechRecoContext wavRecoContext = null;
        private SpeechLib.SpFileStream InputWAV = null;
        private SpeechLib.ISpeechRecoGrammar Grammar = null;
        private String _WAVFile = null;
        private string strData = "No recording yet";
        private String _lastRecognized = "";

        public Form1()
        {
            InitializeComponent();
        }
        private void button2_Click(object sender, EventArgs e)
        {
           Close();
        }
        private void button1_Click(object sender, EventArgs e)
        {
            // Ask the user for the WAV file; stop if the dialog is cancelled.
            // (FileName is an empty string, not null, when nothing is chosen.)
            OpenFileDialog dialog = new OpenFileDialog();
            dialog.Title = "Select a Speech file";
            if (dialog.ShowDialog() != DialogResult.OK) return;
            _WAVFile = dialog.FileName;
            if (string.IsNullOrEmpty(_WAVFile)) return;
            // In-process recognizer, so the audio input can be redirected to a file.
            wavRecoContext = new SpeechLib.SpInProcRecoContext();
            ((SpInProcRecoContext)wavRecoContext).Recognition += new _ISpeechRecoContextEvents_RecognitionEventHandler(wavRecoContext_Recognition);
            ((SpInProcRecoContext)wavRecoContext).EndStream += new _ISpeechRecoContextEvents_EndStreamEventHandler(wavRecoContext_EndStream);
            // Load the general dictation grammar.
            Grammar = wavRecoContext.CreateGrammar(2);
            Grammar.DictationLoad("", SpeechLoadOption.SLOStatic);
            // Feed the WAV file to the recognizer instead of the microphone.
            InputWAV = new SpFileStream();
            InputWAV.Open(_WAVFile, SpeechStreamFileMode.SSFMOpenForRead, false);
            wavRecoContext.Recognizer.AudioInputStream = InputWAV;
            Grammar.DictationSetState(SpeechRuleState.SGDSActive);
        }
        private void wavRecoContext_Recognition(int StreamNumber, object StreamPosition, SpeechRecognitionType RecognitionType, ISpeechRecoResult Result)
        {
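            // Full text of the recognized phrase (all elements, with display replacements).
            strData = Result.PhraseInfo.GetText(0, -1, true);
            // Keep the previous phrase, then show the new one (each phrase overwrites the last).
            _lastRecognized = textBox1.Text;
            textBox1.Text = strData;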
        }
        private void wavRecoContext_EndStream(int StreamNumber, object StreamPosition, bool f)
        {
            // End of the WAV file: stop dictation and release the file handle.
            Grammar.DictationSetState(SpeechRuleState.SGDSInactive);
            InputWAV.Close();
        }
    }
}

Is there any fault in this code?
Thanks in advance.

[edit]Code block added - OriginalGriff[/edit]
Updated 11-Feb-12 3:12am

What made you think that using a synthesized voice was going to make recognition more accurate? I'd think it would be just the opposite, as synthesized speech doesn't always say every word properly.
 
Comments
Sergey Alexandrovich Kryukov 11-Feb-12 17:02pm    
Agree, my 5, but I also added a couple of practical recommendations based on my experience, please see.
--SA
psgviscom 11-Feb-12 21:33pm    
But it is synthesizing the words correctly, and it is SAPI's own pronunciation. Why can't it recognize the words correctly?
Sergey Alexandrovich Kryukov 11-Feb-12 22:44pm    
Isn't this obvious? Comparing synthesis with recognition is pointless. Don't you see they have virtually nothing in common?
--SA
Dave Kreskowiak 12-Feb-12 9:10am    
Because the sound it listens to is not compared to the sounds it uses to create words. The two algorithms and the data they use are completely separate from each other. They may as well be in separate libraries.
Dave Kreskowiak 12-Feb-12 9:11am    
Look. Even the best speech recognition engine is not going to get every word correct. Dragon is about the best there is and not even it has a 100% success rate.
First of all, you don't need to use SAPI directly for this simple task. You could simply use the namespace System.Speech.Recognition in the assembly "System.Speech", available in the GAC. It is bundled with the (freely distributed) .NET Framework runtime package and is easy to use.
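
As a rough illustration of that suggestion, here is a minimal console sketch that transcribes a WAV file with System.Speech.Recognition instead of the SpeechLib interop. It assumes a reference to the System.Speech assembly and an installed Windows desktop recognizer for the current culture; the class name and the default file path are made up for the example. It will not make dictation itself more accurate, it only removes the COM plumbing.

```csharp
using System;
using System.Speech.Recognition;   // requires a reference to System.Speech.dll

static class WavDictationSketch
{
    static void Main(string[] args)
    {
        // Hypothetical default path; pass the real WAV file as the first argument.
        string wavPath = args.Length > 0 ? args[0] : @"C:\temp\speech.wav";

        using (var engine = new SpeechRecognitionEngine())
        {
            // Free-form dictation, comparable to DictationLoad in the SAPI code.
            engine.LoadGrammar(new DictationGrammar());

            // Read audio from the file instead of the microphone.
            engine.SetInputToWaveFile(wavPath);

            // Recognize() returns one phrase per call and null when nothing
            // more is recognized (for a file, typically the end of input).
            RecognitionResult result;
            while ((result = engine.Recognize()) != null)
            {
                Console.WriteLine("{0} (confidence {1:F2})",
                                  result.Text, result.Confidence);
            }
        }
    }
}
```

Printing the confidence value next to each phrase makes it easier to see which words the engine was unsure about.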

The recognition quality is just lower than what you expected. For better results, use only small grammars with clearly distinct expressions. Pronounce more clearly. :-)
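
To show what a small grammar with clearly distinct expressions could look like, here is a sketch using Choices and GrammarBuilder from System.Speech.Recognition. The three phrases, the class name and the default file path are invented for the example; the real list would come from whatever the application needs to understand. It also assumes the installed recognizer matches the current culture.

```csharp
using System;
using System.Speech.Recognition;   // requires a reference to System.Speech.dll

static class SmallGrammarSketch
{
    static void Main(string[] args)
    {
        // Hypothetical default path; pass the real WAV file as the first argument.
        string wavPath = args.Length > 0 ? args[0] : @"C:\temp\command.wav";

        // Only these phrases can be recognized (example phrases, not from the question).
        var phrases = new Choices(
            "schedule a meeting",
            "cancel the meeting",
            "show my calendar");
        var grammar = new Grammar(new GrammarBuilder(phrases));

        using (var engine = new SpeechRecognitionEngine())
        {
            engine.LoadGrammar(grammar);          // no dictation grammar loaded
            engine.SetInputToWaveFile(wavPath);

            RecognitionResult result = engine.Recognize();
            if (result == null)
                Console.WriteLine("Nothing in the file matched the grammar.");
            else
                Console.WriteLine("{0} (confidence {1:F2})",
                                  result.Text, result.Confidence);
        }
    }
}
```

With a closed list like this the engine only has to choose among a few candidates, so a confusion such as "meeting" versus "needing" cannot occur unless both words are in the grammar.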

—SA
 
This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


