In the Google Action Console I would like to test how Google Assistant would speak some text like <speak>Hello world</speak> without having to change my action.
I've read I could "Select ‘Simulator’ from the left-hand navigation". But there is no "left-hand navigation".
Where can I find such an SSML simulator?
You can use the interactive demo on the Google Cloud Text-to-Speech product page to enter some SSML and hear how it sounds. Alternatively, you can use the visual editor Nightingale and export to SSML after assembling some audio.
I've built an agent that uses Dialogflow and fulfillment with the actions-on-google library. In a certain intent, the fulfillment sends back a SimpleResponse, a MediaResponse, and some Suggestions.
It works fine in the simulator, the audio file in the MediaResponse plays as expected.
When testing the agent in the Google Assistant app on my iPhone, however, the audio file doesn't play. Instead of a play/pause button, it shows a loading/buffering GIF. When I tap the GIF, the audio file starts playing and the play/pause button appears.
I don't think this is expected behaviour. Is this a bug on Google's end, or am I missing something here? Is there even a way to autoplay the audio file by default, without the user having to tap the icon?
As it turns out, this is a bug in Actions on Google and they are working on it.
Media responses are supported out of the box only on Android devices and Google Home devices. You must check the surface capabilities before trying to play the audio. See https://developers.google.com/actions/assistant/responses#media_responses for more options and suggestions.
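In a Node.js fulfillment with the actions-on-google library (v2), that check is done with `conv.surface.capabilities.has(...)`. As a minimal sketch, the test reduces to a membership check over the capability names in the request; the function below is a hypothetical helper, not part of the library:

```javascript
// The capability advertised by surfaces that can play a MediaResponse.
const MEDIA_CAPABILITY = 'actions.capability.MEDIA_RESPONSE_AUDIO';

// Hypothetical helper: in the actions-on-google library this is
// conv.surface.capabilities.has(MEDIA_CAPABILITY); here it is reduced
// to a plain function over the request's list of capability names.
function supportsMediaResponse(capabilityNames) {
  return capabilityNames.includes(MEDIA_CAPABILITY);
}
```

When the check fails, fall back to a plain SimpleResponse (or SSML) instead of sending the MediaResponse.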
If you are playing small audio clips in between text/speech, I would suggest using SSML instead. See the audio tag in the SSML reference: https://developers.google.com/actions/reference/ssml
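For example, a short clip can be embedded inline with the `<audio>` tag; the URL below is a placeholder, and the text inside the element is the fallback spoken on surfaces that cannot play the clip:

```xml
<speak>
  Here is the sound effect:
  <audio src="https://example.com/clip.mp3">Sorry, the clip could not be played.</audio>
  And now back to the dialog.
</speak>
```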
I want to create a simple app for use in my house that will cause my Google Home Mini device to speak a custom phrase when executed. Alternatively to have it play a custom audio file which I will prepare with the phrases.
Ideally I'm looking for something like an API that can be called with a text string that the mini then speaks out. The API call would come from a web/desktop app I will write.
Can Dialogflow do something like this? If so, any advice on where to start with the documentation (endpoints etc)? Or if not, is there anything else out there that I can use to do this?
It sounds like you're looking for the Actions on Google API.
I'm interested in using Actions and the Assistant to create dynamic dialog for a video game.
Specifically I would want players to be able to speak (literally) to characters and for the characters responses to be determined by Actions, just like the Assistant.
Is there any version of the Assistant available that can be integrated into a game? As far as I can see, they offer a lot of the building-block services to developers through the cloud, but nothing as fully featured as Google Assistant itself.
Sounds like a cool scenario. Not something Actions on Google supports directly, but if you want to experiment, you could use the Google Assistant SDK to host the Assistant in your game and respond to queries that are meant for your players.
https://developers.google.com/assistant/sdk/
Love to see what you come up with.
It pretty much comes down to which Framework you use when building your game. If you use Unity for instance, you can use API.AI's Unity SDK.
There are also a lot of other SDKs available. I don't think you really have to include the complete Google Assistant SDK, since you will most likely want to write your own responses. Some SDKs include speech recognition; for others you will need a speech recognition framework, for instance the Google Cloud Speech API.
For google home actions, can I use my own voice or someone else's voice with permission? Can I read the text responses, record them, and play them back as audio files?
Earplay is an example on Alexa:
https://www.amazon.com/gp/product/B01K8V6NSI?ie=UTF8&path=%2Fgp%2Fproduct%2FB01K8V6NSI&ref_=skillrw_dsk_si_dp&useRedirectOnSuccess=1&
A guy from Gupshup said that it is not allowed:
https://youtu.be/f-mPuEbJ-nU?t=45m13s
I didn't see where it was not allowed in the terms of service.
"the platform does not allow that" does not mean that it is legally not allowed, but that it is simply not possible.
Both Alexa and Google Assistant have a default voice which can not be changed.
When developing an Action, you can select from one of four voices (two male, two female) to use. You can't use the default Google Assistant voice. There is no technical way to use another voice.
While you can send audio files, and these audio files can contain a voice, this would be a lot of work for little benefit.
Yes, Progressive does this with their Google Action.
I am working on a project where I have to record a voice, convert it to text, match the pattern, and perform an action according to the user's command.
I am able to record the user's voice through AVAudioRecorder and perform actions. But the actions are performed no matter what the user says. I want to act only on particular words; for example, if the user says "play", then playback should start.
Please point me to any tutorial or sample code.
Thanks in advance
Most apps (including Siri) send the sound file to a remote data center to do the speech recognition, which involves some fairly heavy-duty processing. Nuance may have a commercial API.
Another option might be to try the CMU OpenEars or PocketSphinx speech libraries, which have been ported to the iPhone. Also look at VocalKit and this article on running PocketSphinx on the iPhone.
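Whichever recognition engine you pick, the "match the pattern and perform an action" step from the question is independent of iOS. A minimal sketch of that dispatch step (shown in JavaScript for brevity; the names and commands are made up for illustration, and the same shape translates directly to Objective-C or Swift):

```javascript
// Hypothetical command table: map a keyword to the action it triggers.
const commands = new Map([
  ['play', () => 'started playback'],
  ['stop', () => 'stopped playback'],
]);

// Scan the transcribed text and fire the first recognized command;
// utterances with no known keyword are ignored (return null).
function handleTranscript(transcript) {
  for (const word of transcript.toLowerCase().split(/\s+/)) {
    const action = commands.get(word);
    if (action) return action();
  }
  return null;
}
```

Feeding the recognizer's transcript into `handleTranscript` means only utterances containing a known keyword trigger anything, which is exactly the behaviour the question asks for.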