Can a Dialogflow API call play audio on my Google Home - actions-on-google

I want to create a simple app for use in my house that will cause my Google Home Mini device to speak a custom phrase when executed. Alternatively, it could play a custom audio file that I will prepare with the phrases.
Ideally I'm looking for something like an API that can be called with a text string that the mini then speaks out. The API call would come from a web/desktop app I will write.
Can Dialogflow do something like this? If so, any advice on where to start with the documentation (endpoints etc)? Or if not, is there anything else out there that I can use to do this?

It sounds like you're looking for the Actions on Google API.

Related

Google Assistant for Game

I'm interested in using Actions and the Assistant to create dynamic dialog for a video game.
Specifically, I would want players to be able to speak (literally) to characters and for the characters' responses to be determined by Actions, just like the Assistant.
Is there any version of the Assistant available that can be integrated into a game? As far as I can see, they offer a lot of the building-block services to developers through the cloud, but nothing as fully featured as Google Assistant.
Sounds like a cool scenario. Not something Actions on Google supports directly, but if you want to experiment, you could use the Google Assistant SDK to host the Assistant in your game and respond to queries that are meant for your players.
https://developers.google.com/assistant/sdk/
Love to see what you come up with.
It pretty much comes down to which framework you use when building your game. If you use Unity, for instance, you can use API.AI's Unity SDK.
There are also a lot of other SDKs provided. I don't think you really have to include the complete Google Assistant SDK, since you most likely will want to write your own responses (?). Some SDKs have speech recognition included; for others you will need a speech recognition framework, for instance the Google Cloud Speech API.

Can I use my own voice or someone's voice with permission?

For Google Home actions, can I use my own voice or someone else's voice with permission? Can I read the text responses, record them, and play them back as audio files?
Earplay is an example on Alexa:
https://www.amazon.com/gp/product/B01K8V6NSI?ie=UTF8&path=%2Fgp%2Fproduct%2FB01K8V6NSI&ref_=skillrw_dsk_si_dp&useRedirectOnSuccess=1&
A guy from Gupshup said that it is not allowed:
https://youtu.be/f-mPuEbJ-nU?t=45m13s
I didn't see where it was not allowed in the terms of service.
"the platform does not allow that" does not mean that it is legally not allowed, but that it is simply not possible.
Both Alexa and Google Assistant have a default voice which cannot be changed.
When developing an Action, you can select from one of four voices (two male, two female) to use. You can't use the default Google Assistant voice. There is no technical way to use another voice.
While you can send audio files, and these audio files can contain a voice, this would be a lot of work for little benefit.
Yes, Progressive does this with their Google Action.
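For anyone trying the audio-file route mentioned above: recorded clips are normally sent through SSML, whose audio element takes a clip URL plus fallback text that is spoken if the clip cannot be played. A minimal sketch of building such a response string follows. The function name and the example URL are made up for illustration, and the HTTPS check reflects my understanding that the clip has to be served over HTTPS.

```python
from xml.sax.saxutils import escape, quoteattr


def build_audio_ssml(audio_url: str, fallback_text: str) -> str:
    """Wrap a recorded clip in SSML, with text to speak if the clip fails to load."""
    # Assumption: the platform only accepts clips served over HTTPS.
    if not audio_url.startswith("https://"):
        raise ValueError("audio_url should be an HTTPS URL")
    # quoteattr adds the surrounding quotes and escapes the attribute value;
    # escape handles &, < and > in the fallback text.
    return (
        "<speak>"
        f"<audio src={quoteattr(audio_url)}>{escape(fallback_text)}</audio>"
        "</speak>"
    )


# Hypothetical clip URL, for illustration only.
print(build_audio_ssml("https://example.com/audio/welcome.ogg", "Welcome back!"))
```

The resulting string is what you would hand to the platform as the spoken response in place of plain text.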

Play Rdio.com song streaming

I'm trying to implement online music search and sample streaming features for an iOS app: the user should be able to search for music by song name/artist/album and play the results. Rdio.com looks promising, but I can't figure out how to do this.
The streaming api expects a key, such as
[rdio.player playSource:@"t2742133"];
but I'm having difficulty working out how to get the key "t2742133" for a given song, as there seems to be no documented method for getting a song's metadata based on its name. Could anyone experienced with Rdio.com tell me whether there's a (relatively) straightforward way to get a song's info by its name, and what the main steps would be?
I couldn't find a direct method in the iOS SDK, but for searching there's a REST API instead:
http://developer.rdio.com/docs/REST/
The responses include track ids that can be passed to iOS api streaming method.
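To illustrate the flow, here is a rough sketch of preparing a search call and pulling the track keys out of the response. It is written from memory of the REST docs, so treat the parameter names (method, query, types, count) and the response layout (status, result.results, key, name) as assumptions to check against the documentation linked above. The sample response is invented, and the OAuth signing the API requires is left out.

```python
import json
from typing import List, Tuple
from urllib.parse import urlencode


def build_search_params(query: str, count: int = 5) -> str:
    """Form-encode the body of a search call (parameter names are assumptions)."""
    return urlencode(
        {"method": "search", "query": query, "types": "Track", "count": count}
    )


def extract_track_keys(response_text: str) -> List[Tuple[str, str]]:
    """Return (key, name) pairs for the tracks in a search response."""
    payload = json.loads(response_text)
    if payload.get("status") != "ok":
        raise RuntimeError("search failed: %r" % payload.get("message"))
    results = payload.get("result", {}).get("results", [])
    # Assumption: track keys start with "t", as in the "t2742133" example.
    return [
        (item["key"], item.get("name", ""))
        for item in results
        if item.get("key", "").startswith("t")
    ]


# Invented sample response, shaped the way I assume the API answers.
sample = json.dumps(
    {
        "status": "ok",
        "result": {
            "results": [
                {"key": "t2742133", "name": "Some Song"},
                {"key": "a123", "name": "Some Album"},
            ]
        },
    }
)
print(build_search_params("Some Song"))
print(extract_track_keys(sample))
```

Each key returned this way is the string you would pass to playSource: on the iOS side.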

How to recognize the human voice by code in iphone?

I want to integrate voice detection in my iPhone app. The app should allow the user to search for a word by using their voice. But I don't know anything about voice recognition on the iPhone. Can you please suggest any ideas, tutorials or sample code for this?
You can also use the Google Chrome speech API to integrate voice recognition into your application, but there is a big problem: the API only works with FLAC-encoded files, and this encoding isn't supported natively on iOS... :/
See these two links for more information:
http://www.albertopasca.it/whiletrue/2011/09/objective-c-use-google-speech-iphone/
http://8byte8.com/blog/2012/07/voice-recognition-ios/
EDIT:
I built an application with voice recognition using the Nuance SDK, but it's not free to use. You can register for free and get a developer key that allows you to test your application for 90 days. A sample application is included; you can look at the code, and it's very easy to implement.
Good luck :)
The best approach will probably be to:
1. Record the voice on the phone.
2. Send the recording to a server that runs the speech recognition software.
3. Return something to the phone to indicate what it should do.
This approach is favorable because there are a number of open-source speech-to-text packages out there, and you are not limited by computing power on the backend.
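The server side of the steps above could look roughly like the sketch below, which uses only the Python standard library. The transcribe function is a hypothetical placeholder for whatever recognition engine you run, and the JSON reply format (action, query) is invented for illustration; the phone would POST the raw recording and read the JSON back.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def transcribe(audio: bytes) -> str:
    """Hypothetical placeholder: a real server would run a speech recognizer here."""
    return "hello" if audio else ""


def build_reply(audio: bytes, recognizer=transcribe) -> bytes:
    """Turn an uploaded recording into a JSON instruction for the phone."""
    text = recognizer(audio).strip()
    reply = {"action": "search", "query": text} if text else {"action": "none"}
    return json.dumps(reply).encode("utf-8")


class RecognizeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The request body is the raw recording sent by the phone.
        length = int(self.headers.get("Content-Length", 0))
        body = build_reply(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8000) -> HTTPServer:
    """Create the server; call serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), RecognizeHandler)
```

Running make_server().serve_forever() starts it locally. Keeping build_reply separate from the handler makes it easy to swap in a real recognizer later.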
Having said that, iOS has OpenEars, which is based on PocketSphinx. It looks promising...
Well, voice recognition is not specific to the iPhone. All you can do on the iPhone itself is record the voice. Once that's done, you can either write your own voice recognition module or find a third-party API and reuse it.
You can do a Google search for that.

How to convert a voice recorded by AVAudioRecorder into Text in objective-c?

I am working on a project where I have to record a voice, convert it into text, then match a pattern and perform an action according to the user's command.
I am able to record the user's voice with AVAudioRecorder and perform an action. But the action is performed on anything the user says. I want it to be performed only on a particular word; for example, if the user says "play", playback should start.
Please point me to any tutorial or sample code.
Thanks in advance.
Most apps (including Siri) send the sound file to a remote data center to do the speech recognition, which involves some fairly heavy-duty processing. Nuance may have a commercial API.
Another option might be to try using the CMU OpenEars or PocketSphinx speech library, which has been ported to the iPhone. Also look at VocalKit and this article on running PocketSphinx on the iPhone.
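Whichever recognizer produces the text, the "only act on a particular word" part comes down to matching whole words in the transcript. Here is a minimal sketch of that step; the command table and the action names are hypothetical, and the transcript is assumed to arrive as a plain string.

```python
import re
from typing import Optional

# Hypothetical command words mapped to hypothetical action names.
COMMANDS = {
    "play": "start_playback",
    "pause": "pause_playback",
    "stop": "stop_playback",
}


def match_command(transcript: str) -> Optional[str]:
    """Return the action for the first command word in the transcript, if any."""
    # Split into whole lowercase words so that "display" does not match "play".
    words = re.findall(r"[a-z']+", transcript.lower())
    for word in words:
        if word in COMMANDS:
            return COMMANDS[word]
    return None


print(match_command("Please play the song"))
print(match_command("Display the settings"))
```

When match_command returns None, the app simply does nothing, which avoids triggering the action on everything the user says.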