How can I play a sound in Twilio's Autopilot?

I would like to see if I can play a voice recording to replace Twilio Autopilot's robot voice. Any idea how to do this, or whether it's even possible?

Twilio developer evangelist here.
You can specify an Amazon Polly voice using StyleSheets, a declarative API for dialogue, state management, and error handling that can be used to control an Assistant's tone, language, and more.
In Node.js, you could update a StyleSheet to use a different voice like this (make sure your Twilio helper library is updated to a recent version, as this is new functionality!). Here, your Autopilot Assistant's voice would be Joanna.
// Initialize the Twilio client (assumes TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN are set)
const client = require('twilio')(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN);

client.autopilot.assistants('REPLACE-WITH-YOUR-AUTOPILOT-ASSISTANT-SID')
  .styleSheet()
  .update({
    styleSheet: {
      style_sheet: {
        voice: {
          say_voice: 'Polly.Joanna'
        }
      }
    }
  })
  .then(style_sheet => console.log(style_sheet.assistantSid));
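Once the StyleSheet is updated, any text the Assistant speaks through a say action should use that Polly voice. As a minimal sketch (the wording below is just a placeholder, not taken from the Twilio docs), a task's Actions JSON could look like this:

{
  "actions": [
    {
      "say": "Thanks for calling! This response is spoken with the voice set in the StyleSheet."
    }
  ]
}

And since the original question is about playing a recording: the Actions JSON also supports a play action that takes the URL of an audio file, so you can play pre-recorded audio instead of synthesized speech; check the Autopilot Actions documentation for the exact shape.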

Related

I have an Alexa skill. How do I make it work with Google Home?

I’ve already built an Alexa skill, and now I want to make that available on Google Home. Do I have to start from scratch or can I reuse its code for Actions on Google?
Google Assistant works similarly to Amazon Alexa, although there are a few differences.
For example, you don't create your language model inside the "Actions on Google" console. Most Google Action developers use Dialogflow (formerly API.AI), which is owned by Google and offers a deep integration. Dialogflow used to offer an import feature for Alexa interaction models, but it doesn't work anymore. Instead, you can take a look at this tutorial: Turn an Alexa Interaction Model into a Dialogflow Agent.
Although most of the work in developing voice apps is parsing JSON requests and returning JSON responses, the Actions on Google SDK works differently from the Alexa SDK for Node.js.
To help people build cross-platform voice apps with only one code base, we developed Jovo, an open-source framework that is a little closer to the Alexa SDK than to the Google Assistant SDK. So if you're considering porting your code over, take a look - I'm happy to help! You can find the repository here: https://github.com/jovotech/jovo-framework-nodejs
It is possible to manually convert your Alexa skill to work as an Assistant Action. Both a skill and an action have similar life cycles that involve accepting incoming HTTP requests and then responding with JSON payloads. The skill’s utterances and intents can be converted to an Action Package if you use the Actions SDK or can be configured in the API.ai web GUI. The skill’s handler function can be modified to use the Actions incoming JSON request format and create the expected Actions JSON response format. You should be able to reuse most of your skill’s logic.
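As a rough, hedged illustration of that mapping (the field names assume the classic Alexa custom-skill response format and a Dialogflow/API.AI v1-style webhook response; the intent name and speech text are placeholders, not from the original answer), the same piece of skill logic can feed both payload shapes:

// Shared skill logic you already have: intent name in, spoken text out.
function handleIntent(intentName) {
  if (intentName === 'HelloWorldIntent') return 'Hello from my voice app!';
  return "Sorry, I didn't get that.";
}

// Alexa custom-skill response envelope (classic JSON format).
function buildAlexaResponse(speech) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: speech },
      shouldEndSession: true
    }
  };
}

// Dialogflow (API.AI) v1-style webhook response for Actions on Google.
function buildDialogflowResponse(speech) {
  return { speech: speech, displayText: speech };
}

// Example: the same logic feeds both platforms.
const speechOut = handleIntent('HelloWorldIntent');
console.log(JSON.stringify(buildAlexaResponse(speechOut), null, 2));
console.log(JSON.stringify(buildDialogflowResponse(speechOut), null, 2));

The real work is wiring these builders into whichever request-routing layer each platform expects, but the business logic itself stays shared.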
This can be done, but it will require some work; you will not have to rewrite all of your code.
Check out this video on developing a Google Home Action using API.AI (which is recommended).
Once you have done the basics and started understanding how Google Home Actions differ from Amazon Alexa Skills, you can simply transfer your logic over. The idea of intents is very similar, but each platform has intricacies that you must learn.
When you execute an intent, your app logic will be similar in most cases. It is just the setup, deployment, and running that are different.

Does the Watson Text-to-Speech service in Bluemix work for mobile apps?

Why doesn't the Watson Text-To-Speech service on Bluemix work for mobile devices? Is this a common issue for output stream data coming from the server? Thanks!
Edit: Sorry, somebody has changed my question completely. I am talking about Text-to-Speech.
Text To Speech works in Android, and there is an SDK you can use.
http://watson-developer-cloud.github.io/java-wrapper/
For example, to get all the voices you can do:
import com.ibm.watson.developer_cloud.text_to_speech.v1.TextToSpeech;
import com.ibm.watson.developer_cloud.text_to_speech.v1.model.VoiceSet;
TextToSpeech service = new TextToSpeech();
service.setUsernameAndPassword("<username>", "<password>");
VoiceSet voices = service.getVoices();
System.out.println(voices);
where username and password are the credentials you get in Bluemix when you bind the service. You can learn more about the Text to Speech methods by looking at the javadocs here.
It was released today and I made it, so let me know if you find any issues.
The Watson Speech-To-Text service is a REST API. You will need to call the REST API from your mobile app. For more info about the REST API check out the API docs.
If you want to use Watson Text-To-Speech on iOS devices, it might be handy to use the Watson-Developer-Cloud SDK for iOS - you can check out the example on my blumarek.blogspot; simply build an app in Xcode 7.3+:
step 1. Use Carthage to get all the dependencies:
(create a file named Cartfile in the project root directory and run the command carthage update --platform iOS)
$ cat > Cartfile
# Cartfile contents
github "watson-developer-cloud/ios-sdk"
and then you need to add the frameworks to the Xcode project - check Step 3: Adding the SDK to the Xcode project on my blumareks.blogpost
step 2. add the code to call the Watson TTS and leverage AVFoundation
(AVFoundation is being deprecated):
- do not forget to add the Watson TTS service in Bluemix.net and get the credentials from it:
{
  "credentials": {
    "url": "https://stream.watsonplatform.net/text-to-speech/api",
    "username": "<service User name>",
    "password": "<password>"
  }
}
And the code is straightforward:
import UIKit
//adding Watson Text to Speech
import WatsonDeveloperCloud
//adding AVFoundation
import AVFoundation

class ViewController: UIViewController {
    @IBOutlet weak var speakText: UITextField!

    override func viewDidLoad() {...}
    override func didReceiveMemoryWarning() {...}

    @IBAction func speakButtonPressed(sender: AnyObject) {
        NSLog("speak button pressed, text to say: " + speakText.text!)
        //adding Watson service
        let service = TextToSpeech(username: "<service User name>", password: "<password>")
        service.synthesize(speakText.text!) { (data, error) in
            do {
                let audioPlayer = try AVAudioPlayer(data: data!)
                audioPlayer.prepareToPlay()
                audioPlayer.play()
                sleep(10) //the thread needs to live long enough to say your text
            } catch {
                NSLog("something went terribly wrong")
            }
        }
    }
}
It is unclear whether you are asking about speech to text or text to speech. Speech to text is covered in most of the questions above and can be referenced on the Watson site -
http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/speech-to-text.html
The Speech to Text service converts the human voice into the written word. This easy-to-use service uses machine intelligence to combine information about grammar and language structure with knowledge of the composition of the audio signal to generate a more accurate transcription. The transcription is continuously sent back to the client and retroactively updated as more speech is heard. Recognition models can be trained for different languages, as well as for specific domains.
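To make the REST nature of the service concrete, here is a rough Node.js sketch of calling the classic Speech to Text recognize endpoint with Basic auth. The hostname, path, audio format, and response fields are assumptions based on the older Bluemix username/password generation of the API, so verify them against the current API docs before relying on this:

const fs = require('fs');
const https = require('https');

// Assumed classic Bluemix endpoint and service credentials (username/password auth).
const options = {
  hostname: 'stream.watsonplatform.net',
  path: '/speech-to-text/api/v1/recognize',
  method: 'POST',
  auth: '<username>:<password>',
  headers: { 'Content-Type': 'audio/wav' }
};

const req = https.request(options, res => {
  let body = '';
  res.on('data', chunk => { body += chunk; });
  // The response JSON contains results[].alternatives[].transcript entries.
  res.on('end', () => console.log(body));
});

// Stream a local WAV file (placeholder name) as the request body.
fs.createReadStream('sample.wav').pipe(req);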
If you look at this GitHub project https://github.com/FarooqMulla/BluemixExample/tree/master (which uses the old SDK), there is an example that uses the real-time speech-to-text API, sending audio packets to Bluemix and receiving back the transcribed string in real time.
Beware: as of 1/22/16 the new Swift-based SDK is broken for this particular functionality.

Is a Siri open API currently available?

There is an app in the App Store named Touchpad; its last update, on Nov 29, included a new feature supporting "Use the Siri key on device keyboard to send text to computer". I want to know if there is any open API for Siri right now, especially in the iOS 5.1 beta, or how the app can support such a feature. Sorry, I don't have the 4S and have never tried Siri.
No, there is NO API for Siri, but if Siri is supported in your language then a microphone will be shown on the keyboard, allowing you to dictate text.
No, Siri uses a very closed protocol and/or API.
That it has been hacked is very unfortunate (but also awesome and a damn good job of reversing).
There are a number of sources which suggest that a very limited portion of the Siri API (specifically dictation) will be exposed to developers in the upcoming iOS 5.1:
http://www.freakgeeks.com/2011/19239/siri-for-third-party-applications/
http://www.sn0wbreeze.ca/ios-5-1-beta-add-siri-apis-and-battery-issues-are-still-there/
http://arstechnica.com/apple/news/2011/11/ios-51-beta-offers-developers-limited-siri-integration.ars
This is unofficial, but Applidium has cracked the Siri protocol, so you can call Siri via HTTP.
Basically, you'll need to get a device GUID (see the link below), send the raw audio (in the Speex audio codec), and you get back zipped data. Unzip the binary response (after the 4th byte), and it contains plists.
URL: https://guzzoni.apple.com/ace
HTTP method: ACE
See more details: http://www.ps3trophies.com/forums/apple/6458-siri-cracked-now-even-android-users-can-use-siri.html
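Purely as an illustration of the steps described above, and nothing more (the endpoint is Apple's private service, the device GUID and any required headers are not documented in this answer, and the whole thing is reverse engineered and can break at any time), a Node.js sketch of the request/response handling might look like this:

const fs = require('fs');
const https = require('https');
const zlib = require('zlib');

// Raw audio encoded with the Speex codec, as described above (placeholder filename).
const speexAudio = fs.readFileSync('utterance.spx');

const req = https.request({
  hostname: 'guzzoni.apple.com',
  path: '/ace',
  method: 'ACE' // non-standard HTTP method used by the cracked protocol
  // NOTE: the device GUID and any other required headers are not shown in this
  // answer; see the linked write-up for those details.
}, res => {
  const chunks = [];
  res.on('data', c => chunks.push(c));
  res.on('end', () => {
    const body = Buffer.concat(chunks);
    // Per the answer above: skip the first 4 bytes, then unzip the rest.
    zlib.unzip(body.slice(4), (err, unzipped) => {
      if (err) return console.error(err);
      // The unzipped payload contains plists; parsing them would need a
      // plist library, which is out of scope for this sketch.
      console.log('Received', unzipped.length, 'bytes of plist data');
    });
  });
});

req.write(speexAudio);
req.end();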

Consuming Web services from Apple PastryKit

I'm about to implement a web app with PastryKit, but I'd like to have it fetch data from a backend hosted on the net.
This backend is reachable via a Web service... Is there a method in PastryKit that allows me to call a web service directly and parse the response data to update my page?
Are you aware that PastryKit is an Apple project that has not been made publicly available and seems to have been used by Apple in only one project? All the information and code libraries available have been reverse engineered by people curious about what is in the package, so they might change or go away at any moment, and there is no Apple documentation or support.
Given that, the short answer to your question is that you will have to google for what is available and then hack it yourself.
You might also want to look at AdLib. It's like PastryKit but drives the documentation for the iPad, and it might be a bit more accessible as it was built later than PastryKit.
Also, it is just a guess on my part, but I suspect the data sources, since they work in Dashcode and are reasonably well documented, should work in AdLib or PastryKit, as Apple would have no reason to change the model.
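Since neither PastryKit nor AdLib documents a data-source API, one pragmatic fallback is the browser's own XMLHttpRequest from within your PastryKit page. This is only a generic sketch (the endpoint URL and the element id are placeholders, not anything PastryKit-specific):

// Plain XMLHttpRequest call; works in any WebKit page, PastryKit or not.
function fetchBackendData(url, onSuccess) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      // Assumes the web service returns JSON; parse and hand it to the caller.
      onSuccess(JSON.parse(xhr.responseText));
    }
  };
  xhr.send(null);
}

// Example usage: update an element in the page with the fetched data.
fetchBackendData('https://example.com/api/items', function (data) {
  document.getElementById('item-count').textContent = data.length + ' items';
});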

APIs for converting Voice/Audio data into text

I am working on an iPhone app in which I am storing users' voices as audio files and want to display them as text.
How can this be done? Any idea about APIs?
Thanks,
Aaryan
Have you seen CMU Sphinx?
In particular, PocketSphinx (written in C).
While it is more recognition oriented, it has been used for transcription before, so it will depend on what exactly you need.
Also, have you considered a non-native/local API, i.e. a web service you could call with your voice data, or are you adamant about a native library/API?
For example, Ribbit has a platform for these sorts of things and does support transcribing voice to text:
"How do I enable voice-to-text transcriptions?
Available as a paid service, voice-to-text transcriptions are automatically available through the Ribbit API. Please use the $25 Free signup credit to try the service."
There is one app that does this already: Jott. The way they do it is to send the file to transcribers in India! (source)
You will have to develop the voice recognition engine yourself, I'm afraid. No library that I know of can do this. Apart from that, the iPhone CPU would probably not be powerful enough.