I need to change the speech language for a specific response. I know I can change the TTS voice for the whole app, but I have not found a way to do that for a response. In this case, the supported user locales are English and German, but the text I want Google Assistant to speak is in Korean.
Interestingly, there is no problem if the user locale is German and the text is in English. However, when I tried to create a response with Korean text, there was no audio feedback.
Unfortunately, the Actions on Google platform does not support changing the language in the middle of a dialog. The case you describe (German locale, English text) is probably an exception: some languages can pronounce words from another language because those words are supported as a subset of the primary language.
One alternative you might consider here is using recorded spoken audio through SSML. This is a popular way to insert custom audio output into your app, which may make sense for your use case.
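As a rough illustration of the recorded-audio approach, here is a minimal Python sketch that builds an SSML response around an `<audio>` element. The function name and the example.com URL are placeholders, not part of any Google library; the text inside the `<audio>` element is the fallback that is used if the clip cannot be played.

```python
from xml.sax.saxutils import escape, quoteattr


def build_audio_ssml(audio_url: str, fallback_text: str) -> str:
    """Wrap a pre-recorded clip in SSML.

    fallback_text is the alternate content used if the audio cannot be played.
    """
    # quoteattr() adds the surrounding quotes and escapes characters such as '&'.
    return (
        f"<speak><audio src={quoteattr(audio_url)}>"
        f"{escape(fallback_text)}</audio></speak>"
    )


# Hypothetical URL: replace it with wherever your recorded Korean audio is hosted.
ssml = build_audio_ssml(
    "https://example.com/audio/greeting-ko.ogg",
    "Korean greeting",
)
print(ssml)
```

The resulting string would then be sent as the spoken part of the response in place of plain text.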
Related
I am building a chat application with voice recording. The user should be able to compose a text message by speaking instead of typing. This must work for languages other than English: when the user speaks in their native language, the text needs to be written in that language.
I have tried speech_to_text, speech_recognition and others, but I couldn't find a solution.
Please help.
For Google Home Actions, can I use my own voice, or someone else's voice with permission? Can I read the text responses, record them, and play them back as audio files?
Earplay is an example on Alexa:
https://www.amazon.com/gp/product/B01K8V6NSI?ie=UTF8&path=%2Fgp%2Fproduct%2FB01K8V6NSI&ref_=skillrw_dsk_si_dp&useRedirectOnSuccess=1&
A guy from Gupshup said that it is not allowed:
https://youtu.be/f-mPuEbJ-nU?t=45m13s
I didn't see where it was not allowed in the terms of service.
"the platform does not allow that" does not mean that it is legally not allowed, but that it is simply not possible.
Both Alexa and Google Assistant have a default voice which cannot be changed.
When developing an Action, you can select from one of four voices (two male, two female) to use. You can't use the default Google Assistant voice. There is no technical way to use another voice.
While you can send audio files, and these audio files can contain a voice, this would be a lot of work for little benefit.
Yes, Progressive does this with their Google Action.
We have text-to-speech features that offer a set of voices with different pitches, both male and female.
Similarly, voice recognition is available on many devices and PCs.
Is there any possibility that a system could use the voice of the user to speak instead of the built-in default voices?
Although it is theoretically possible, it is most likely impractical. There are basically two types of artificial voices: fully synthetic and sample based.
If your TTS voice is fully synthetic then it can only be influenced by certain parameters such as pitch and speed. Your best approach is to try and estimate all parameters from your input speech.
If your TTS voice is sample based then you could try to collect enough speech from the user to construct a whole new dataset. Usually you need every possible diphone, which can take a long time to collect unless you have the user utter some text specifically to collect these. Then your engine needs to be able to accept the speech parts and construct a new voice from them.
In both cases the result still won't be very convincing unless you can also mimic the user's prosody and specific pronunciation. If your TTS and recognition modules aren't developed by yourself or extensible, then you are likely out of luck, since most software doesn't allow new voices to be built at runtime.
OpenEars: the speech recognition (speech-to-text) framework for iPhone (iOS devices). I have installed the OpenEars demo app on my iPhone. It works well, but only for a list of words like GO, CHANGE, MODEL. Can we make speech recognition more generic, for real-time recognition that is not limited to a few words?
OpenEars:
http://www.politepix.com/openears/
You have to use a new language model instead of their default one.
The language model is the vocabulary that you want OpenEars to understand, in a format that its speech recognition engine can understand.
The smaller and better-adapted to your users' real usage cases the language model is, the better the accuracy.
An ideal language model for PocketsphinxController has fewer than 200 words.
You can dynamically create a new language model through the LanguageModelGenerator class.
See the details about LanguageModelGenerator and the OpenEars basic concepts here
Note:
Please post queries regarding OpenEars only in their forum
You can see more speech-to-text SDKs here
I want to integrate voice recognition in my iPhone app. The app should allow users to search for a word using their voice. But I don't know anything about voice recognition on the iPhone. Can you please suggest any ideas, tutorials or sample code for this?
You can also use the Google Chrome API to integrate voice recognition in your application, but there is a big problem: the API works only with FLAC-encoded files, and this encoding isn't supported natively on iOS... :/
You can see these 2 links for more information:
http://www.albertopasca.it/whiletrue/2011/09/objective-c-use-google-speech-iphone/
http://8byte8.com/blog/2012/07/voice-recognition-ios/
EDIT:
I built an application that includes voice recognition using the Nuance SDK, but it's not free to use. You can register for free and get a developer key that allows you to test your application for 90 days. A sample application is included; you can look at the code, and it's very easy to implement.
Good luck :)
The best approach will probably be to:
Record the voice on the phone
Send the recording to a server that runs the speech recognition software
Then return something to the phone to indicate what it should do
This approach is favorable because there are a number of open source speech-to-text packages out there and you are not limited by the phone's computing power, since the recognition runs on the backend.
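The server-side part of the steps above could be sketched roughly like this in Python. Everything here is hypothetical: the `COMMANDS` table, the `handle_recording` function and the `fake_transcribe` stand-in are illustrative names only, and a real deployment would plug an actual speech recognition engine in place of the stand-in.

```python
from typing import Callable

# Hypothetical mapping from recognized phrases to commands the phone understands.
COMMANDS = {"search": "OPEN_SEARCH", "go back": "NAVIGATE_BACK"}


def handle_recording(audio: bytes, transcribe: Callable[[bytes], str]) -> dict:
    """Transcribe an uploaded recording and tell the phone what to do."""
    if not audio:
        return {"ok": False, "error": "empty recording"}
    text = transcribe(audio).strip().lower()
    # Phrases that are not in the table map to "UNKNOWN".
    action = COMMANDS.get(text, "UNKNOWN")
    return {"ok": True, "transcript": text, "action": action}


def fake_transcribe(audio: bytes) -> str:
    # Stand-in for a real speech-to-text engine: it just decodes the bytes as text.
    return audio.decode("utf-8")


result = handle_recording(b"Search", fake_transcribe)
print(result)
```

Passing the recognizer in as a function keeps the request handling independent of whichever speech-to-text software ends up running on the server.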
Having said that, iOS has OpenEars, which is based on PocketSphinx. It looks promising...
Well, voice recognition is not specific to the iPhone. All you can do on the iPhone itself is record the voice. Once that is done, you can either code your own voice recognition module or find a third-party API and reuse it.
You can do google search on that.