How do I constrain 'watson-HTML5-speech-recognition' to recognize only numbers

Question

Hi! I'm developing a speech recognition app using 'watson-html5-speech-recognition'. How can I constrain the recognition context so that it recognizes only numbers? That is, the recognized word should be a number (e.g. 101, 40, 76, 4, etc.) and not its spelled-out form (one hundred and one, forty, seventy six, four, etc.).

I'm working on a math app, where the result should always be a number, never text. Thanks.

What I have tried:

I tried using JavaScript to parse the recognized text after recognition (checking it with isNumber()), but I think it would be better to constrain the spoken words to numbers before recognition rather than after.
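For reference, the after-recognition check described above could look roughly like the sketch below. It assumes the recognizer hands back a plain transcript string; `parseRecognizedNumber` is a hypothetical helper name, not part of 'watson-html5-speech-recognition'.

```javascript
// Hypothetical helper (not part of any library): returns a number when the
// transcript consists only of a numeric literal, otherwise null.
function parseRecognizedNumber(transcript) {
  // Trim whitespace and drop thousands separators such as "1,000".
  const cleaned = String(transcript).trim().replace(/,/g, "");
  // Accept an optional sign, digits, and an optional decimal part only.
  if (!/^[+-]?\d+(\.\d+)?$/.test(cleaned)) {
    return null;
  }
  return Number(cleaned);
}

console.log(parseRecognizedNumber("101"));   // 101
console.log(parseRecognizedNumber("forty")); // null
```

A `null` result tells the app to reject the input or ask the user to repeat it, so a non-numeric transcript never reaches the math logic.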

Posted 18-May-19 3:44am

Real-One

Updated 18-May-19 5:31am

1 solution

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

phil.o · Answer 1 · 2019-05-18T05:31:00

Solution 1

Speech recognition converts spoken words to a string, and I do not think it is possible, or advisable, to tamper with the speech recognition engine to change that. Moreover, nothing can be recognized as a number until the audio signal has first been parsed into an actual string.

Real-One wrote:
I think it will be better to constrain the spoken words to only numbers before recognizing and not after.

You may have to reconsider this opinion :)
Kindly.
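
As a rough illustration of handling this after recognition, the sketch below converts a spelled-out transcript such as "one hundred and one" into a number. It is only a sketch under stated assumptions: `wordsToNumber` is a hypothetical helper, it covers English cardinals up to the millions, and it does not check that the words form a grammatically valid number.

```javascript
// Lookup tables for English number words (Maps avoid inherited-key surprises).
const UNITS = new Map([
  ["zero", 0], ["one", 1], ["two", 2], ["three", 3], ["four", 4],
  ["five", 5], ["six", 6], ["seven", 7], ["eight", 8], ["nine", 9],
  ["ten", 10], ["eleven", 11], ["twelve", 12], ["thirteen", 13],
  ["fourteen", 14], ["fifteen", 15], ["sixteen", 16], ["seventeen", 17],
  ["eighteen", 18], ["nineteen", 19],
  ["twenty", 20], ["thirty", 30], ["forty", 40], ["fifty", 50],
  ["sixty", 60], ["seventy", 70], ["eighty", 80], ["ninety", 90],
]);
const SCALES = new Map([["thousand", 1000], ["million", 1000000]]);

// Hypothetical helper: returns a number, or null if any word is not a number word.
function wordsToNumber(text) {
  const tokens = String(text)
    .toLowerCase()
    .replace(/-/g, " ")            // "seventy-six" -> "seventy six"
    .split(/\s+/)
    .filter((t) => t !== "" && t !== "and");
  if (tokens.length === 0) return null;

  let total = 0;   // sum of completed thousand/million groups
  let current = 0; // value of the group being built
  for (const token of tokens) {
    if (UNITS.has(token)) {
      current += UNITS.get(token);
    } else if (token === "hundred") {
      current = (current || 1) * 100;
    } else if (SCALES.has(token)) {
      total += (current || 1) * SCALES.get(token);
      current = 0;
    } else {
      return null; // not a number word, e.g. "free"
    }
  }
  return total + current;
}

console.log(wordsToNumber("one hundred and one")); // 101
console.log(wordsToNumber("seventy six"));         // 76
```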

Posted 18-May-19 5:31am

phil.o

Comments

Real-One 19-May-19 1:02am

Thanks for your response phil.o.

But how do I achieve this? For instance, when I say 'five' it transcribes to '5' (which is fine), but when I say 'three' it often transcribes to 'free' (which is not what I want). Is there a workaround for this?

phil.o 19-May-19 4:03am

I do not know. It could be an issue in the recognition engine itself, a mispronunciation by the user, or a combination of both. For this kind of issue, you may get better answers on the developers' forum itself.
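
One possible application-side workaround for the 'three' / 'free' case, sketched under the assumption that the app expects only numbers: remap known mis-hearings before parsing the transcript. Only the 'free' entry comes from this thread; the other entries and the name `fixNumberHomophones` are illustrative assumptions, not part of any library.

```javascript
// Words an engine may return in place of a number word.
// Only "free" -> "three" is reported in this thread; the rest are assumed examples.
const NUMBER_HOMOPHONES = new Map([
  ["free", "three"],
  ["tree", "three"],
  ["to", "two"],
  ["too", "two"],
  ["for", "four"],
  ["ate", "eight"],
  ["won", "one"],
]);

// Hypothetical helper: replaces whole words that match a known mis-hearing.
function fixNumberHomophones(transcript) {
  return String(transcript)
    .toLowerCase()
    .split(/\s+/)
    .filter((token) => token !== "")
    .map((token) =>
      NUMBER_HOMOPHONES.has(token) ? NUMBER_HOMOPHONES.get(token) : token
    )
    .join(" ");
}

console.log(fixNumberHomophones("free")); // three
```

This is only safe because every input is supposed to be a number; in free-form dictation the same remapping would corrupt ordinary words such as "to" and "for".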