The new AI translation system from Google

A revolution in the field of translation may be just around the corner thanks to a new project from Google. It’s called Translatotron, the new Google translation system that uses artificial intelligence to replicate our voice at the time of translation.

Online voice translation systems were created with the aim of helping people who speak different languages communicate with one another. These translators reproduce the translation of a text in a standard, robotic voice.

Apparently set to revolutionise the field is a new Google project on artificial intelligence. We’re talking about Translatotron, an experimental system that goes a step beyond Google Translate.

Translatotron uses artificial intelligence to model the target voice on the acoustic characteristics of the source voice. This makes it possible to reproduce the voiceprint of a speaker in the translated speech.

Translatotron is based on neural networks, the technology behind the most advanced methods of machine translation. They’re called neural networks because the inspiration for the system comes from the biological neural networks that make up the human brain.


How online machine translation works


Traditional machine translation analyses a sentence and then attempts to replace the words with equivalents in the second language. In a series of processes, it substitutes the words and changes their order so that it is consistent with the target language. Yet because this approach works largely word by word and takes little account of context, the results often do not live up to expectations.
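The limits of word-for-word substitution can be seen in a toy sketch. The three-word dictionary, the single reordering rule and all the function names below are invented purely for illustration; no real translation system is this simple.

```python
# Toy illustration only: a hypothetical three-word English-to-Italian
# dictionary and one reordering rule, not a real machine translation system.
TOY_LEXICON = {"the": "il", "red": "rosso", "car": "macchina"}
ADJECTIVES = {"red"}


def substitute_words(sentence):
    """Pair each source word with its dictionary equivalent."""
    return [(word, TOY_LEXICON.get(word, word)) for word in sentence.lower().split()]


def reorder_adjectives(pairs):
    """Move each adjective after the word that follows it (Italian word order)."""
    reordered = list(pairs)
    i = 0
    while i < len(reordered) - 1:
        current_is_adjective = reordered[i][0] in ADJECTIVES
        next_is_adjective = reordered[i + 1][0] in ADJECTIVES
        if current_is_adjective and not next_is_adjective:
            reordered[i], reordered[i + 1] = reordered[i + 1], reordered[i]
            i += 2
        else:
            i += 1
    return reordered


def translate(sentence):
    """Substitute the words, then fix their order."""
    pairs = reorder_adjectives(substitute_words(sentence))
    return " ".join(target for _, target in pairs)


print(translate("the red car"))  # prints "il macchina rosso"
```

The word order is now right, but the result is still wrong: "macchina" is feminine, so the correct Italian is "la macchina rossa". Getting the article and the adjective to agree with the noun requires looking at the surrounding words, which is precisely what this kind of substitution does not do.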

Neural machine translation, by contrast, builds and trains a large neural network that can read not only a sentence but also the context in which it’s found, making a much more accurate translation possible.

The new AI translation system from Google goes one step further. Translatotron allows the spoken word from one language to be translated directly into another language without relying on an intermediate text representation in either of the two languages.


How the Translatotron system works


The translation systems in use today, including Google Translate, are based on three distinct stages:

  • machine recognition of speech in order to transform the discourse of the source language into text
  • machine translation of the written text from the source language into the target language
  • voice synthesis to produce “speech” in the target language starting from the translated text.
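The three stages above can be sketched as a simple chain of functions. This is only a schematic outline: the stub functions, their names and the fake "audio" (a list of numbers) are all assumptions made for illustration and do not correspond to any real Google API.

```python
from typing import Callable, List

Audio = List[float]  # stand-in for real audio samples


def cascade_translate(
    audio: Audio,
    recognise: Callable[[Audio], str],
    translate: Callable[[str], str],
    synthesise: Callable[[str], Audio],
) -> Audio:
    """Run the three stages in sequence; each stage sees only the previous output."""
    source_text = recognise(audio)        # stage 1: speech -> source text
    target_text = translate(source_text)  # stage 2: source text -> target text
    return synthesise(target_text)        # stage 3: target text -> speech


# Hypothetical stubs so that the sketch runs end to end.
def stub_recognise(audio: Audio) -> str:
    return "hello"  # pretend the recogniser always hears "hello"


def stub_translate(text: str) -> str:
    return {"hello": "ciao"}.get(text, text)


def stub_synthesise(text: str) -> Audio:
    return [float(ord(ch)) for ch in text]  # fake audio: one number per letter


result = cascade_translate([0.0, 0.1], stub_recognise, stub_translate, stub_synthesise)
print(result)
```

Because each stage only receives the text produced by the one before it, a recognition mistake in the first stage is passed on unchanged to the next two, and anything that text cannot capture, such as the speaker's voice, is lost along the way.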

Translatotron is a speech-to-speech method that does not require the intermediate processing of a text to produce a translation. As it involves a direct interpretation, it should lead to greater speed and fewer errors during the conversion from one language to another.

Architecture of the Translatotron model


To achieve this, the neural networks of Translatotron work together to analyse the spectrogram of the “input voice” (a representation of the frequencies it contains over time) and to produce the spectrogram of the translated content in the target language. Furthermore, with the use of the voice encoder, it is possible to keep the characteristics of the speaker’s voice when the translation is reproduced.
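To make the idea of a spectrogram more concrete, here is a minimal sketch that computes one for a synthetic tone, using only the Python standard library. The frame size, hop length, sample rate and function name are arbitrary choices for this example; the features Translatotron actually uses are not described here and are certainly more elaborate.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Split the signal into overlapping frames and measure the strength
    of each frequency in every frame (a naive short-time Fourier transform)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window softens the edges of each frame.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        spectrum = []
        for k in range(frame_size // 2 + 1):
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            spectrum.append(abs(total))
        frames.append(spectrum)
    return frames


# A pure 1000 Hz tone sampled at 8000 Hz stands in for the "input voice".
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]

spec = magnitude_spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), "frames;", "strongest frequency:", peak_bin * sample_rate / 64, "Hz")
```

Each row of the result describes a short slice of time and each column a frequency band. A speech-to-speech model of the kind described above takes a grid like this as its input and produces another such grid as its output, which is then turned back into sound.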

In addition to gaining advantages in speed and accuracy, we can get more natural translations. Translatotron enables us to retain non-verbal signals, such as tone, cadence and accent.




The company has published the research findings on the official webpage of the project, where we can also listen to the original voices used to teach the system, along with the respective translations read by voice synthesisers.

In conclusion, we should clarify that Translatotron is a project still in its experimental stage and far from any introduction to the public. However, Google has certainly demonstrated the great potential of artificial intelligence, and thanks to this AI it may soon be possible to break down language barriers completely.