


Similar articles
Article Id Title Prob Score
135374 TECHCRUNCH 2019-5-15:
Google’s Translatotron converts one spoken language to another, no text involved
Score: 1.000
135714 THEVERGE 2019-5-17:
Google’s prototype AI translator translates your tone as well as your words
Prob: 0.349  Score: 0.607
135583 THENEXTWEB 2019-5-16:
Google’s new AI can help you speak another language in your own voice
Prob: 0.774  Score: 0.599
135401 ENGADGET 2019-5-15:
Google's Translatotron can translate speech in the speaker's voice
Prob: 0.964  Score: 0.550
135315 VENTUREBEAT 2019-5-15:
Google’s Translatotron is an end-to-end model that mimics human voices
Prob: 0.941  Score: 0.545
135607 VENTUREBEAT 2019-5-16:
Alexa speech normalization AI reduces errors by up to 81%
Score: 0.465
135067 VENTUREBEAT 2019-5-13:
Amazon Alexa scientists retrain an English-language AI model on Japanese
Score: 0.424
135235 VENTUREBEAT 2019-5-14:
IBM’s AI performs state-of-the-art broadcast news captioning
Score: 0.422
135316 ARSTECHNICA 2019-5-15:
No, someone hasn’t cracked the code of the mysterious Voynich manuscript
Prob: 0.002  Score: 0.399
135002 THEVERGE 2019-5-13:
Use this cutting-edge AI text generator to write stories, poems, news articles, and more
Score: 0.377
135633 VENTUREBEAT 2019-5-16:
Google’s Live Transcribe is getting sound events and transcription saving
Score: 0.348
135594 THENEXTWEB 2019-5-16:
Designing products for people with disabilities has never been so important
Score: 0.344
135618 THEVERGE 2019-5-16:
Android’s Live Transcribe will let you save transcriptions and show ‘sound events’
Score: 0.330
135363 THEVERGE 2019-5-15:
AI translation boosted eBay sales more than 10 percent
Score: 0.313
135678 THEVERGE 2019-5-17:
This AI-generated Joe Rogan fake has to be heard to be believed
Score: 0.300
135527 ENGADGET 2019-5-16:
Android's Live Transcribe gets sound alerts and transcript saving
Score: 0.298
134953 VENTUREBEAT 2019-5-13:
Adding audio data helps AI navigate 3D mazes
Score: 0.298
135038 THEVERGE 2019-5-13:
How to stop Google from keeping your voice recordings
Score: 0.295
135308 ENGADGET 2019-5-15:
China is blocking Wikipedia in every language
Score: 0.290
135724 ENGADGET 2019-5-17:
I listened to a Massive Attack record remixed by a neural network
Score: 0.290
135495 VENTUREBEAT 2019-5-16:
Microsoft makes Google’s BERT NLP model better
Score: 0.285
135279 TECHCRUNCH 2019-5-14:
Google’s latest app, Rivet, uses speech processing to help kids learn to read
Score: 0.282
135032 THEVERGE 2019-5-13:
How Silicon Valley’s successes are fueled by an underclass of ‘ghost workers’
Score: 0.277
135166 TECHREPUBLIC 2019-5-14:
How to add horizontal lines to a Word 2016 document
Score: 0.276
135129 ARSTECHNICA 2019-5-14:
Mapping Notre Dame’s unique sound will be a boon to reconstruction efforts
Score: 0.272


ID: 135374

URL: https://techcrunch.com/2019/05/15/googles-translatotron-converts-one-spoken-language-to-another-no-text-involved/

Date: 2019-05-15

Google’s Translatotron converts one spoken language to another, no text involved

Every day we creep a little closer to Douglas Adams’ famous and prescient Babel fish. A new research project from Google takes spoken sentences in one language and outputs spoken words in another, but unlike most translation techniques, it uses no intermediate text, working solely with the audio. This makes it quick, but more importantly lets it more easily reflect the cadence and tone of the speaker’s voice.

Translatotron, as the project is called, is the culmination of several years of related work, though it’s still very much an experiment. Google’s researchers, and others, have been looking into the possibility of direct speech-to-speech translation for years, but only recently have those efforts borne fruit worth harvesting.

Translating speech is usually done by breaking the problem down into smaller sequential ones: turning the source speech into text (speech-to-text, or STT), turning text in one language into text in another (machine translation), and then turning the resulting text back into speech (text-to-speech, or TTS). This works quite well, really, but it isn’t perfect; each step is prone to its own types of errors, and these can compound one another.

Furthermore, it’s not really how multilingual people translate in their own heads, as testimony about their own thought processes suggests. How exactly it works is impossible to say with certainty, but few would say that they break down the text and visualize it changing to a new language, then read the new text. Human cognition is frequently a guide for how to advance machine learning algorithms.

Figure: Spectrograms of source and translated speech. The translation, let us admit, is not the best. But it sounds better!

To that end, researchers began looking into converting spectrograms (detailed frequency breakdowns of audio) of speech in one language directly to spectrograms in another. This is a very different process from the three-step one, and has its own weaknesses, but it also has advantages.

One is that, while complex, it is essentially a single-step process rather than a multi-step one, which means that, assuming you have enough processing power, Translatotron could work more quickly. But more importantly for many, the process makes it easy to retain the character of the source voice, so the translation doesn’t come out robotically, but with the tone and cadence of the original sentence.

Naturally this has a huge impact on expression, and someone who relies on translation or voice synthesis regularly will appreciate that not only what they say comes through, but how they say it. It’s hard to overstate how important this is for regular users of synthetic speech.

Related: Google’s Project Euphonia wants to make voice recognition work for people with speech impairments

The accuracy of the translation, the researchers admit, is not as good as that of the traditional systems, which have had more time to hone their accuracy. But many of the resulting translations are (at least partially) quite good, and being able to include expression is too great an advantage to pass up.

In the end, the team modestly describes their work as a starting point demonstrating the feasibility of the approach, though it’s easy to see that it is also a major step forward in an important domain.

The paper describing the new technique was published on arXiv, and you can browse samples of speech, from source to traditional translation to Translatotron, at this page. Just be aware that these are not all selected for the quality of their translation, but serve more as examples of how the system retains expression while getting the gist of the meaning.
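
To make the point about compounding errors in the traditional three-step pipeline concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that each stage fails independently of the others and uses made-up accuracy figures; neither the independence assumption nor the numbers come from Google's paper, and the function name `cascade_accuracy` is purely illustrative.

```python
from math import prod


def cascade_accuracy(stage_accuracies):
    """Chance that every stage of a cascade is correct.

    Assumption (not from the article): stages fail independently,
    so the per-stage probabilities simply multiply.
    """
    accuracies = list(stage_accuracies)
    for value in accuracies:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each accuracy must be between 0 and 1")
    return prod(accuracies)


# Hypothetical per-stage accuracies, chosen only for illustration.
stages = {
    "speech_to_text": 0.95,
    "machine_translation": 0.90,
    "text_to_speech": 0.98,
}

overall = cascade_accuracy(stages.values())
print(f"Overall cascade accuracy: {overall:.3f}")
```

Under these assumptions the cascade as a whole comes out less reliable than its weakest stage, which is one way to read the motivation for collapsing the pipeline into a single model. Real systems are messier: errors are often correlated, and a later stage can sometimes recover from an earlier mistake.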