By using the Google Speech Recognition (GSR) plugin for UniMRCP Server, IVR platforms can use the Google Cloud Speech API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.
The Google Cloud Speech API performs speech-to-text conversion powered by machine learning and provides the following main features.
Automatic Speech Recognition (ASR) based on deep-learning neural networks, for applications such as voice search or speech transcription.
Recognizes over 110 languages and variants with an extensive vocabulary.
Returns recognition results while the user is still speaking.
Speech recognition can be customized to a specific context by providing a set of words and phrases that are likely to be spoken. This is especially useful for adding custom words and names to the vocabulary and in voice-control use cases.
Handles noisy audio from many environments without requiring additional noise cancellation.
Filters inappropriate content in text results for some languages.
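The word-hint and content-filtering features above correspond to the `speechContexts` and `profanityFilter` fields of the recognition configuration in the Google Cloud Speech REST API. The following is a minimal sketch in Python of what such a configuration could look like. The helper name `build_recognition_config`, the sample phrases, and the 8 kHz default are illustrative assumptions, and the sketch does not show how the GSR plugin itself passes these values.

```python
import json


def build_recognition_config(language_code, phrases, profanity_filter=False,
                             sample_rate_hertz=8000):
    """Build a dict shaped like the Cloud Speech v1 REST RecognitionConfig.

    Hypothetical helper for illustration only; it makes no network calls.
    """
    config = {
        "encoding": "LINEAR16",
        "sampleRateHertz": sample_rate_hertz,  # assumed telephony rate
        "languageCode": language_code,
        "profanityFilter": profanity_filter,
    }
    # Word hints: phrases that are likely to be spoken in this context.
    if phrases:
        config["speechContexts"] = [{"phrases": list(phrases)}]
    return config


if __name__ == "__main__":
    config = build_recognition_config(
        "en-US", ["UniMRCP", "check my balance"], profanity_filter=True
    )
    print(json.dumps(config, indent=2))
```

Leaving `phrases` empty omits `speechContexts` entirely, so recognition falls back to the general vocabulary.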
By using the Google Dialogflow (GDF) plugin for UniMRCP Server, IVR platforms can use the Google Dialogflow API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.
The Google Dialogflow API lets you create conversational applications that are capable of natural interactions with users and are powered internally by the Google Cloud Speech API.
Dialogflow is an end-to-end development suite for building conversational applications that are capable of natural and rich interactions with users. It is powered by machine learning to recognize the intent and context of what a user says, allowing your conversational interface to provide highly efficient and accurate responses.
Dialogflow supports 20+ languages, allowing you to build a single multilingual agent.
Natural language understanding recognizes a user’s intent and extracts pre-built entities such as time, date, and numbers. You can train your agent to identify custom entity types by providing a small dataset of examples. You can also use 30+ pre-built agents as templates.
You can expand your conversational interface to recognize voice interactions with a single request sent from the client application. Powered by Google Cloud Speech, recognition is implemented in real-time streaming mode.
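To illustrate the idea of carrying both audio and recognition settings in one request, here is a minimal Python sketch of a request body shaped like the Dialogflow v2 REST `detectIntent` call. The helper name `build_detect_intent_body` and its defaults are assumptions made for illustration. The sketch uses the simpler non-streaming request shape; the real-time streaming mode mentioned above is not shown.

```python
import base64
import json


def build_detect_intent_body(audio_bytes, language_code="en-US",
                             sample_rate_hertz=8000):
    """Build a dict shaped like a Dialogflow v2 detectIntent request body.

    Hypothetical helper for illustration only; it makes no network calls.
    """
    return {
        "queryInput": {
            "audioConfig": {
                "audioEncoding": "AUDIO_ENCODING_LINEAR_16",
                "sampleRateHertz": sample_rate_hertz,  # assumed telephony rate
                "languageCode": language_code,
            }
        },
        # Raw audio is base64-encoded when carried in a JSON body.
        "inputAudio": base64.b64encode(audio_bytes).decode("ascii"),
    }


if __name__ == "__main__":
    body = build_detect_intent_body(b"\x00\x01" * 4)
    print(json.dumps(body, indent=2))
```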
By using the Google Speech Synthesis (GSS) plugin for UniMRCP Server, IVR platforms can use the Google Cloud Text-to-Speech API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.
The Google Cloud Text-to-Speech API synthesizes natural-sounding speech and provides the following main features.
Supports 32 voices in 12 languages and variants, with more to come soon.
Exclusive access to DeepMind WaveNet voices that provide the most natural-sounding speech.
Customize your speech with SSML tags that allow you to add pauses, numbers, date and time formatting, and other pronunciation instructions.
Customize your speaking rate to be up to 4x faster or slower than the normal rate.
Customize the pitch of your selected voice, up to 20 semitones more or less than the default output.
Increase the output volume by up to 16 dB or reduce it by as much as 96 dB.
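The SSML, speaking-rate, pitch, and volume controls listed above map onto the `input.ssml` and `audioConfig` fields of a Google Cloud Text-to-Speech REST synthesis request. The Python sketch below builds such a request body and checks each value against the ranges stated above. The helper names `build_synthesis_body` and `_check` and the sample SSML text are illustrative assumptions, and the sketch does not show how the GSS plugin itself forwards these settings.

```python
import json

# Ranges taken from the feature list above.
SPEAKING_RATE_RANGE = (0.25, 4.0)      # 4x slower to 4x faster
PITCH_RANGE = (-20.0, 20.0)            # semitones
VOLUME_GAIN_DB_RANGE = (-96.0, 16.0)   # decibels


def _check(name, value, bounds):
    """Return value if it lies within bounds, otherwise raise ValueError."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return value


def build_synthesis_body(ssml, language_code="en-US", speaking_rate=1.0,
                         pitch=0.0, volume_gain_db=0.0):
    """Build a dict shaped like a Text-to-Speech v1 text:synthesize body.

    Hypothetical helper for illustration only; it makes no network calls.
    """
    return {
        "input": {"ssml": ssml},
        "voice": {"languageCode": language_code},
        "audioConfig": {
            "audioEncoding": "LINEAR16",
            "speakingRate": _check("speaking_rate", speaking_rate,
                                   SPEAKING_RATE_RANGE),
            "pitch": _check("pitch", pitch, PITCH_RANGE),
            "volumeGainDb": _check("volume_gain_db", volume_gain_db,
                                   VOLUME_GAIN_DB_RANGE),
        },
    }


if __name__ == "__main__":
    # Sample SSML with a date, a pause, and a number.
    ssml = (
        "<speak>Your appointment is on "
        '<say-as interpret-as="date" format="mdy">10/12/2024</say-as>.'
        '<break time="500ms"/>'
        'Your ticket number is <say-as interpret-as="cardinal">42</say-as>.'
        "</speak>"
    )
    body = build_synthesis_body(ssml, speaking_rate=1.25, pitch=-2.0)
    print(json.dumps(body, indent=2))
```

Validating the ranges before the request is built means an out-of-range setting fails locally with a clear message instead of being rejected by the service.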