News Flash


Cloud Text-to-Speech

Text-to-speech conversion powered by machine learning.

This release is the best of its kind so far. You can go to the portal, enter some text of your own, and hear it in action. The results are quite impressive.

The main element is the WaveNet algorithm, which uses deep learning to model raw speech audio directly, one sample at a time, rather than concatenating recorded speech fragments, so it sounds more natural. They have developed models for roughly 90 dialects and languages. In tests, WaveNet came out at the top of its class on the Mean Opinion Score (MOS), a subjective measure in which listeners rate speech quality on a scale of 1 to 5 and the ratings are averaged.

WaveNet: Google Assistant’s Voice Synthesizer.

This great article is somewhat math-heavy, so be ready to dust off your college-level linear algebra to fully grasp it. It is worth the read, as it reveals what is so great about Google’s implementation.

The technology is an explicit generative model, and the theory behind it is fairly complex as applied to the WaveNet-optimized voice models.
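For orientation, here is roughly how the WaveNet paper frames that generative model: the probability of a whole waveform is written as a product of per-sample conditionals, with each audio sample predicted from all the samples before it. The symbols x_t (the audio sample at time step t) and T (the number of samples) follow the paper's notation rather than anything in the article itself.

```latex
p(\mathbf{x}) = \prod_{t=1}^{T} p\left(x_t \mid x_1, \ldots, x_{t-1}\right)
```

Generating speech then means sampling one x_t at a time and feeding it back in as input for the next step, which is why the original model was slow to run.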

A TensorFlow implementation of DeepMind’s WaveNet paper

If you are interested in code that implements this with TensorFlow, by all means check out the GitHub repo.

Cloud Text-to-Speech basics

If you are planning to code something up against the APIs, read this first.
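To give a feel for what coding against the API involves, here is a minimal sketch of the JSON body for the REST `text:synthesize` method, plus a helper that decodes the base64 audio in the reply. The helper names `build_synthesize_request` and `decode_audio` are made up for illustration, and the voice name `en-US-Wavenet-D` is just an assumed example. Authentication (an API key or OAuth token) and the actual HTTP call are left out, so check the basics guide for the current details.

```python
import base64
import json

# REST endpoint of the v1 synthesize method (request is an HTTP POST).
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_request(text, language_code="en-US",
                             voice_name="en-US-Wavenet-D",
                             audio_encoding="MP3"):
    """Hypothetical helper: build the JSON body for a synthesize call.

    The default voice name is an assumption chosen for illustration.
    """
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": audio_encoding},
    }


def decode_audio(response_json):
    """Hypothetical helper: the reply carries the audio as base64 text
    in the "audioContent" field; decode it to raw bytes."""
    return base64.b64decode(response_json["audioContent"])


if __name__ == "__main__":
    body = build_synthesize_request("Hello from Cloud Text-to-Speech")
    # This is the payload you would POST to SYNTHESIZE_URL.
    print(json.dumps(body, indent=2))
```

The bytes returned by `decode_audio` can be written straight to a file such as `output.mp3` when the requested encoding is MP3.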