Text-to-speech (TTS), also known as speech synthesis, is the conversion of text into human-sounding speech. The synthetic voice is generated using deep learning methods.
The TENIOS TTS service is based on the technology of AWS Voice and Google Speech, which is already used by digital assistants such as Alexa and Google Assistant. Announcements and other information are played within the phone call, which enables individual customer dialogues. TTS can be added to the routing plan with one click. TENIOS Cloud Communications integrates TTS through its own interface to the cloud services of AWS and Google.
The idea of creating human speech synthetically is not new. The first computer-based speech synthesis systems were created 60 years ago. While text-to-speech output sounded robotic and unnatural at that time, synthetic speech today sounds increasingly natural. The basis for the continuous optimization of voice control and speech recognition is artificial intelligence (AI). The processing of natural language (Natural Language Processing, NLP) is closely tied to so-called deep learning.
Deep learning is based on artificial neural networks that are built in several layers. Such systems operate in a way similar to the human brain: the machine is able to recognize, evaluate and optimize structures so that a learning process takes place. This is essential for the advancement of language technologies, for example voice assistants, because it allows the machine to respond to speech input with the correct output. AWS Voice and Google Speech are based on this technology and are already represented in the market by the voice assistants Alexa and Google Assistant.
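To make the idea of a network "built in several layers" more concrete, here is a minimal sketch in Python. It is purely illustrative and is not how TENIOS, AWS or Google implement speech synthesis. The function names (`step`, `dense`, `forward`) and all weight values are assumptions chosen for this example. The weights are set by hand so the result is easy to follow; in a real deep learning system they would be adjusted automatically during training, which is the learning process described above.

```python
from typing import List


def step(x: float) -> float:
    """Threshold activation: the neuron fires (1.0) when its input is positive."""
    return 1.0 if x > 0 else 0.0


def dense(inputs: List[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """One layer: every neuron computes a weighted sum of all inputs plus a bias."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked (hypothetical) weights, not learned ones.
# Hidden layer: neuron 0 behaves like OR, neuron 1 behaves like AND.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [-0.5, -1.5]
# Output layer: fires when OR is active but AND is not (exclusive or).
OUTPUT_W = [[1.0, -1.0]]
OUTPUT_B = [-0.5]


def forward(inputs: List[float]) -> float:
    """Pass the inputs through the hidden layer, then through the output layer."""
    hidden = dense(inputs, HIDDEN_W, HIDDEN_B)
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]


if __name__ == "__main__":
    for a in (0.0, 1.0):
        for b in (0.0, 1.0):
            print(a, b, forward([a, b]))
```

The point of the second layer is that it combines the simple patterns found by the first layer into a more complex one. Speech models follow the same layered principle, only with far more layers and with weights learned from large amounts of recorded speech.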