
Text to Speech with Google Translate

10 May 2021, CPOL, 2 min read
Some options available for text-to-speech

Learning to count in Japanese can be a bit unintuitive, especially when you get to the larger numbers (above 10,000). I thought it’d be cool to put together a little webapp that lets the user enter a number and uses text-to-speech to read it aloud, showing the student how to pronounce it.

It turns out that there are actually a couple of options when it comes to Text-to-speech.

1. (Undocumented) Google Translate API

This appears to be an undocumented little gem that’s mentioned on a few blogs. It’s free, anonymous and incredibly easy to set up.

JavaScript
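// Encode the text first so the URL can stay on a single line;
// a line break inside the template literal would leak whitespace into the query.
var text = encodeURIComponent('konnichiwa');
var source = `https://translate.google.com/translate_tts?tl=ja-JP&q=${text}&client=tw-ob`;
var audio = new Audio(source);
audio.play();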

This undocumented API endpoint can be accessed via JavaScript, or an HTML5 audio tag. There are a couple of requirements to get it to work, though:

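  1. You need to suppress the Referer header, either with a <meta name="referrer" content="no-referrer"> in your document head, or with rel="noreferrer" on a link to the audio URL (rel is a link attribute, so the <audio> element itself does not accept it).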
  2. You need to specify a client in the request. The value tw-ob gets thrown around on Stack Overflow and some other blogs, but it looks like any value will do the trick.
  3. You can specify the language using the tl parameter. This needs to be a supported language code, which you can find in the Google Cloud documentation.
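
The requirements above can be folded into a small helper. This is only a minimal sketch: the names `buildTtsUrl` and `speak` are made up for illustration, and because the endpoint is undocumented, the parameters are just the ones described in this article and may stop working at any time.

```javascript
// Build the URL for the undocumented translate_tts endpoint.
// buildTtsUrl is a hypothetical helper name, not part of any Google API.
function buildTtsUrl(text, lang = 'ja-JP', client = 'tw-ob') {
  // URLSearchParams takes care of encoding the text for us.
  const params = new URLSearchParams({ tl: lang, q: text, client: client });
  return 'https://translate.google.com/translate_tts?' + params.toString();
}

// Play the audio in a browser. The page still needs the no-referrer
// policy from requirement 1 for the request to succeed.
function speak(text, lang) {
  const audio = new Audio(buildTtsUrl(text, lang));
  return audio.play(); // returns a Promise in modern browsers
}
```

In a page, calling `speak('konnichiwa')` from a click handler should be enough, since most browsers only allow audio playback after a user gesture.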

The Google Cloud documentation mentioned above brings us to our second option.

2. Google Cloud API

It turns out that Google does in fact have an official API for text-to-speech, which is much more comprehensive than the undocumented one. The service is free for the first 4 million characters per month for standard-quality voices, after which it’s charged at $4 USD per 1 million characters. The free allowance sounds like quite a lot, but if you get a troublesome user that tries to process a PhD thesis using your service over and over, this could quickly get very expensive.
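
To get a feel for the numbers, here is a rough cost estimate. It is a sketch under two assumptions: the free allowance is 4 million characters per month, and the $4 USD rate applies per million characters beyond that. Both constants and the function name `estimateMonthlyCostUsd` are illustrative, and pricing may have changed, so check the current pricing page.

```javascript
// Rough monthly cost estimate for standard-quality voices.
// Both constants are assumptions based on the figures quoted in this article.
const FREE_CHARS_PER_MONTH = 4000000;
const USD_PER_MILLION_CHARS = 4;

function estimateMonthlyCostUsd(charsThisMonth) {
  // Only characters beyond the free allowance are billable.
  const billable = Math.max(0, charsThisMonth - FREE_CHARS_PER_MONTH);
  return (billable / 1000000) * USD_PER_MILLION_CHARS;
}
```

Under these assumptions, `estimateMonthlyCostUsd(3000000)` returns 0 and `estimateMonthlyCostUsd(5000000)` returns 4.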

The client libraries for this service don’t seem to include a browser option, but you could, of course, use the REST API directly, which is documented at https://cloud.google.com/text-to-speech/docs/reference/rest/v1/text/synthesize
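
A direct REST call might look roughly like the sketch below. The helper names `buildSynthesizeBody` and `synthesize` are hypothetical, `apiKey` is a placeholder for your own credential, and authenticating with an API key in the query string is an assumption about how your project is set up.

```javascript
// Endpoint of the v1 text:synthesize REST method.
const SYNTHESIZE_URL = 'https://texttospeech.googleapis.com/v1/text:synthesize';

// Build the JSON body for a synthesize request.
// buildSynthesizeBody is a hypothetical helper name.
function buildSynthesizeBody(text, languageCode = 'ja-JP') {
  return {
    input: { text: text },
    voice: { languageCode: languageCode },
    audioConfig: { audioEncoding: 'MP3' },
  };
}

// Send the request and return a data: URL that an Audio element can play.
// apiKey is a placeholder for your own credential.
async function synthesize(text, apiKey, languageCode) {
  const url = SYNTHESIZE_URL + '?key=' + encodeURIComponent(apiKey);
  const response = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildSynthesizeBody(text, languageCode)),
  });
  if (!response.ok) {
    throw new Error('Synthesize request failed: ' + response.status);
  }
  const data = await response.json();
  // audioContent is the base64-encoded audio returned by the API.
  return 'data:audio/mpeg;base64,' + data.audioContent;
}
```

Keep in mind that a key embedded in browser code is visible to every visitor, so for anything beyond a quick experiment it is probably safer to make this call from a small backend.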

3. ResponsiveVoice

There are a whole bunch of third-party TTS tools out there, but one interesting option that I found was ResponsiveVoice. They offer a super-simple JavaScript library that appears to be designed specifically for the browser, and they offer a free licence for non-commercial use. For hobby, not-for-profit or educational purposes, this sounds like it could be a good choice.
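
As a rough idea of what using it could look like, here is a small wrapper around the library's `responsiveVoice.speak` call. The wrapper name `speakWithResponsiveVoice` is hypothetical, and the voice name 'Japanese Female' is an assumption that should be checked against their voice list. It also assumes their script has already been loaded on the page as described in their documentation.

```javascript
// Speak text with ResponsiveVoice if its script has been loaded.
// speakWithResponsiveVoice is a hypothetical wrapper; the voice name
// 'Japanese Female' is an assumption, so check it against their voice list.
function speakWithResponsiveVoice(text, voice = 'Japanese Female', rv = globalThis.responsiveVoice) {
  if (!rv || typeof rv.speak !== 'function') {
    return false; // library not loaded, nothing was spoken
  }
  rv.speak(text, voice);
  return true;
}
```

Returning `false` when the library is missing makes it easy to fall back to another option instead of failing with an error.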

Which Option Did I Opt For?

Since my use of this service is mostly for fun, albeit with the desire to help out our students, I opted for the simplest solution: the undocumented Google Translate endpoint. Because it is undocumented, there is a risk that it may stop working in the future. If that happens, I’ll probably move to ResponsiveVoice. In fact, I’m pretty keen to try out their service anyway, and may migrate the solution over to it if I qualify for the non-commercial free licence. If I do, I’ll write up my experience in another article!

Anyway, that’s all from me for now. Catch ya!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
Australia
G'day guys! My name is Jason, and I'm a backend software engineer living in Sydney, Australia. I enjoy blogging, playing chess and travelling.
