The idea of generating human speech synthetically is not new: the first computer-based speech synthesis systems were built some 60 years ago. While text-to-speech output sounded robotic and unnatural back then, synthetic speech today is becoming increasingly natural. The foundation for this continuous improvement of voice control and speech recognition is Artificial Intelligence (AI): Natural Language Processing (NLP) relies heavily on so-called deep learning.
Deep learning is based on artificial neural networks that are built up of several layers. The way such systems operate resembles the human brain: the machine is able to recognize, evaluate, and optimize patterns, so that a learning process takes place. This is essential for the advancement of language technologies, for example voice assistants, because it allows the machine to respond to speech input with the correct output. AWS Voice and Google Speech are based on this technology and are already represented on the market by the voice assistants Alexa and Google Assistant.
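To make the idea of a layered network and its learning process concrete, here is a minimal sketch in Python using NumPy. It is purely illustrative: the layer sizes, the toy data, and the sigmoid activation are arbitrary choices for demonstration, not taken from any real speech recognition system.

```python
# Minimal two-layer feed-forward neural network (illustrative only;
# layer sizes and training data are made up for this sketch).
import numpy as np

rng = np.random.default_rng(0)

# Two weight matrices: input -> hidden -> output layer.
W1 = rng.normal(0, 0.5, size=(4, 8))   # 4 input features, 8 hidden units
W2 = rng.normal(0, 0.5, size=(8, 1))   # 8 hidden units, 1 output

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x):
    """Pass inputs through both layers; return hidden and output activations."""
    h = sigmoid(x @ W1)   # hidden layer
    y = sigmoid(h @ W2)   # output layer
    return h, y

# Toy data: 4-dimensional inputs with a binary target.
X = rng.normal(size=(32, 4))
t = (X.sum(axis=1, keepdims=True) > 0).astype(float)

# Gradient descent: the "learning process" in which the network
# repeatedly adjusts its weights to reduce its prediction error.
lr = 0.5
for step in range(200):
    h, y = forward(X)
    err = y - t                           # prediction error
    grad_out = err * y * (1 - y)          # gradient at the output layer
    grad_W2 = h.T @ grad_out
    grad_hid = (grad_out @ W2.T) * h * (1 - h)  # backpropagated to hidden layer
    grad_W1 = X.T @ grad_hid
    W2 -= lr * grad_W2 / len(X)
    W1 -= lr * grad_W1 / len(X)

_, y = forward(X)
print("training accuracy:", ((y > 0.5) == t).mean())
```

Running the script shows the accuracy climbing well above chance as the weights are optimized, which is the same basic mechanism, at vastly larger scale and with far more layers, that underlies modern speech recognition models.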