Google has announced Translatotron, a "new experimental system" that translates speech directly into speech, eliminating the need for text.
"Translatotron is the first end-to-end model capable of directly translating speech from one language into another language," a post on the Google AI blog said Wednesday.
According to Google, today's translation systems have three stages: automatic speech recognition, which transcribes speech into text; machine translation, which translates this text into another language; and text-to-speech synthesis, which uses the translated text to generate speech.
Cascaded together, these stages power services such as Google Translate, but the technology giant now says its new system uses a single model, with no need for intermediate text.
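To make the contrast concrete, here is a minimal Python sketch of the two shapes described above. It is an illustration only, not Google's code: the function names, the toy lexicon, and the choice to represent audio as a list of spoken words are all assumptions made so the example runs on its own.

```python
from typing import Dict, List

# Toy stand-in for real audio: a list of spoken-word tokens (assumption).
Audio = List[str]

# Hypothetical two-word Spanish-to-English lexicon, for illustration only.
TOY_LEXICON: Dict[str, str] = {"hola": "hello", "mundo": "world"}


def recognize_speech(audio: Audio) -> str:
    """Stage 1 (speech recognition): source speech to source text."""
    return " ".join(audio)


def translate_text(text: str) -> str:
    """Stage 2 (machine translation): source text to target text."""
    return " ".join(TOY_LEXICON.get(word, word) for word in text.split())


def synthesize_speech(text: str) -> Audio:
    """Stage 3 (text-to-speech): target text to target speech."""
    return text.split()


def cascade_translate(audio: Audio) -> Audio:
    """Cascaded pipeline: each stage consumes the previous stage's output."""
    return synthesize_speech(translate_text(recognize_speech(audio)))


def direct_translate(audio: Audio) -> Audio:
    """Single-model shape: speech in, speech out, no text in between."""
    return [TOY_LEXICON.get(word, word) for word in audio]


print(cascade_translate(["hola", "mundo"]))
print(direct_translate(["hola", "mundo"]))
```

In the cascaded version, a mistake in an early stage is passed on to every later stage, which is the kind of error a single model is meant to avoid.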
"Nicknamed Translatotron, this system avoids splitting the task into several steps," Google engineers Ye Jia and Ron Weiss wrote in the Google AI blog post.
This means faster translation and fewer errors compounding from one stage to the next, according to Google.
The system takes spectrograms of the source speech as input and generates spectrograms of the translated speech. It also relies on a neural vocoder and a speaker encoder, which means the translated speech retains the voice characteristics of the original speaker.
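For readers unfamiliar with the term, a spectrogram is a time-by-frequency picture of a sound. The sketch below shows one simple way to compute one in plain Python. It is an assumption-laden illustration, not the representation Google uses: the function name `magnitude_spectrogram`, the frame sizes, and the use of an unwindowed discrete Fourier transform are all choices made here for brevity.

```python
import cmath
import math
from typing import List


def magnitude_spectrogram(
    samples: List[float], frame_size: int, hop: int
) -> List[List[float]]:
    """Return one row of DFT magnitudes per frame (time x frequency)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        row = []
        # Keep only the non-negative frequency bins: 0 .. frame_size // 2.
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            row.append(abs(coeff))
        frames.append(row)
    return frames


# A pure tone that completes exactly 4 cycles per 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spec = magnitude_spectrogram(tone, frame_size=32, hop=16)

# Index of the strongest frequency bin in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

Each row is one short slice of time and each column is one frequency band, so the test tone shows up as a single bright column. A model of the kind described above maps one such grid to another, and a vocoder then turns the output grid back into audio.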