MRTK TextToSpeech.SpeakSsml doesn't work when using the <voice /> element. Device: HoloLens 2 - Unity

I am using Unity + MRTK to develop an application for HoloLens 2. I am trying to use "speech styles" with the MRTK TextToSpeech.SpeakSsml method (see the MRTK API Reference). Text to speech works; however, I am unable to employ speech styles.
Example ssml:
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
<mstts:express-as style="cheerful">
Cheerful hello!
</mstts:express-as>
<break time="1s" />
<mstts:express-as style="angry">
Angry goodbye!
</mstts:express-as>
</speak>
My guess is that the default voice does not support speech styles. But if I add a voice element to use another voice (there are four available voices listed in the documentation), TextToSpeech won't work at all; a snippet of that attempt is below the list. So, I am facing two problems:
When using the SpeakSsml method instead of StartSpeaking, the selected voice (TextToSpeech.Voice) is disregarded, and I am unable to change it using the voice element.
I couldn't find documentation on which SSML elements are supported by the voices available in the MRTK TextToSpeech class.
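For reference, the failing variant with a voice element looks roughly like this (the voice name here is only an illustrative placeholder):
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
<voice name="Microsoft Zira">
<mstts:express-as style="cheerful">
Cheerful hello!
</mstts:express-as>
</voice>
</speak>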
Any ideas or useful links?
Thank you!

The TextToSpeech provided by MRTK depends on the Windows 10 SpeechSynthesizer class, so it works offline and does not support adjusting speaking styles. The mstts:express-as element is only available in the Azure Speech service; for more information, please refer to this documentation: Improve synthesis with Speech Synthesis Markup Language (SSML)
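If the speaking styles are a hard requirement, one option is to call the Azure Speech service directly from Unity instead of the offline synthesizer. Below is a minimal C# sketch, assuming the Azure Speech SDK (Microsoft.CognitiveServices.Speech) has been imported into the project; the key, region, and the en-US-JennyNeural voice are placeholders to substitute from your own Azure Speech resource:

using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

public static class AzureStyledSpeech
{
    // Sends style-annotated SSML to the Azure Speech service, which,
    // unlike the offline Windows synthesizer, understands mstts:express-as.
    public static async Task SpeakCheerfulAsync()
    {
        var config = SpeechConfig.FromSubscription("<your-key>", "<your-region>");

        string ssml =
            "<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" " +
            "xmlns:mstts=\"https://www.w3.org/2001/mstts\" xml:lang=\"en-US\">" +
            "<voice name=\"en-US-JennyNeural\">" +
            "<mstts:express-as style=\"cheerful\">Cheerful hello!</mstts:express-as>" +
            "</voice></speak>";

        using (var synthesizer = new SpeechSynthesizer(config))
        {
            // SpeakSsmlAsync performs the synthesis and plays the audio
            // through the default output device.
            var result = await synthesizer.SpeakSsmlAsync(ssml);
            if (result.Reason != ResultReason.SynthesizingAudioCompleted)
            {
                UnityEngine.Debug.LogWarning("Synthesis failed: " + result.Reason);
            }
        }
    }
}

Note this route requires network access and an Azure subscription, whereas the MRTK TextToSpeech works offline.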

Related

I do not see a Speech Input Source script in my MRTK (Mixed Reality Toolkit) in the project search bar in Unity

I am trying to implement voice commands in my Unity project, eventually to be used on the HoloLens. At the moment, I am simply trying to make a cube change colors using the Speech Input Handler script and the Speech Input Source script. I have the handler script, but I can't find the source script anywhere. How do I obtain the source script? Why do I not have it? I am using Unity 2018.4.12f1 and the Mixed Reality Toolkit. If you need additional info to help me, please ask!
In MRTK v2 and later, SpeechInputSource is no longer needed. Instead, a keyword service (e.g., Windows Speech Input Manager) must be added to the input system's data providers. Please check out the SpeechInputExample scene to understand how to use speech input.
The guide you are reading may be outdated; please read the official documentation to learn how to use the speech functionality in the latest version of MRTK.
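To react to a recognized keyword in script (e.g., to change the cube's color), something like the following works in MRTK 2.x. This is a minimal sketch; the class name and the "Change Color" keyword are placeholders, and the keyword must be registered in the input system's speech commands profile:

using Microsoft.MixedReality.Toolkit;
using Microsoft.MixedReality.Toolkit.Input;
using UnityEngine;

public class ColorChangeOnSpeech : MonoBehaviour, IMixedRealitySpeechHandler
{
    private void OnEnable()
    {
        // Register globally so the keyword is handled even without focus on this object.
        CoreServices.InputSystem?.RegisterHandler<IMixedRealitySpeechHandler>(this);
    }

    private void OnDisable()
    {
        CoreServices.InputSystem?.UnregisterHandler<IMixedRealitySpeechHandler>(this);
    }

    public void OnSpeechKeywordRecognized(SpeechEventData eventData)
    {
        // Fires for every registered keyword; filter for the one we care about.
        if (eventData.Command.Keyword == "Change Color")
        {
            GetComponent<Renderer>().material.color = Random.ColorHSV();
        }
    }
}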

Change Google Assistant voice programmatically

For most languages, the Google Assistant currently allows developers to choose from four voice types: two female and two male.
I want to know how to change it dynamically through the Node/Java library.
I have tried actions-on-google-nodejs but did not find anything relevant in it.
I know we can change it either from the Google Assistant application or from the deployment settings, but do we have any way to change it dynamically?
Yes. Although not documented (except on Stack Overflow, currently), you can use the <voice> SSML tag to change which type is used. You can further adjust the pitch to create additional variants.
You can send SSML back using the actions-on-google-nodejs library by either including a string with valid SSML or by explicitly creating a SimpleResponse and setting the ssml property.
The multivocal library includes the ability to pre-define voices as combinations of the voice and prosody tags and will let you easily define which voice is to be used for each response.
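For example, a response combining the voice and prosody tags might look like the following SSML sketch (the attribute values are illustrative; gender and variant are the attributes reported to work):

<speak>
  <voice gender="female" variant="2">
    <prosody pitch="+2st">Hello! I am one of the other voices.</prosody>
  </voice>
</speak>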

Google Actions SDK: dynamic change in voice

In the use case I am working on, I wish to change the TTS voice by passing a parameter in the conversation speech string. For example,
<speak><voice gender="male" variation="1">Hello</voice></speak>
The Actions console mentions that we can override the user's default locale and force a particular TTS voice (as above), and it does work.
The question is: how do we set the voice to, say, en-AU or en-GB via the voice tag? I tried setting it via variation, language, or name; it does not work.
Thanks.
Although SSML supports a <voice> tag with a languages attribute, this isn't one of the SSML tags that are officially supported by the Google Assistant. While there is evidence that the tag is semi-supported with the gender and variant attributes, the languages attribute isn't.
Aside from setting the region in the Actions console, there is currently no way to change which region's voice is used for your Action.

Explode feature doesn't seem to work on touch devices (Autodesk Forge Viewer)

When trying to use the slider that manages the explode feature with touch input, nothing happens. When using the same slider in Chrome with mouse input, it works.
Behind the slider we find an <input type="range">. After some reading, it seems that this HTML element works rather badly with touch input in general.
So much so that there is even a mini library trying to improve range input on mobile:
https://rangetouch.com/ (which I might try out as a workaround)
Am I the only one with this problem, or should this be addressed by Autodesk?
The Forge Viewer dev team tests these features on various touch devices and they're working fine, including the explode tool (I've just confirmed that the tool works fine on an iPhone 6S), but it's possible that your hardware/software combination is not covered in those tests. Feel free to send your specifics to forge [dot] help [at] autodesk [dot] com, and we'll forward it to the developers.

Speech to text in multiple languages

I have successfully implemented the iSpeech API (see http://www.ispeech.org/developers) in my app to convert speech to text (see the demo app in the SDK: http://www.ispeech.org/instructions/sampleprojects/iphone/IntegrationGuide.html). But unfortunately, it treats whatever we speak as English only and transcribes it to text.
What I need:
There is a "speak" button that listens to what the user says and converts it to text (that works fine for English). There is also another button that allows the user to select a language, as seen in this app screenshot (http://screencast.com/t/7vBFH565qD). When the user speaks in the selected language, the speech should be converted to text in that language. In my case, whatever we speak is taken as English only.
Thanks, all.
iSpeech also supports more languages; you can find the list with their corresponding locale identifiers here:
http://www.ispeech.org/api
To set a new language, use the "ISpeechSetLocale" method:
// Switch the recognition locale to French (France)
[_iSpeech ISpeechSetLocale:@"fr-FR"];
http://www.ispeech.org/developer/iframe?src=/iphonesdkdoc/
Why can't you use the Nuance API, which supports speech to text in multiple languages? See the following link and register there to use their iOS SDK:
http://dragonmobile.nuancemobiledeveloper.com/public/index.php?task=supportedLanguages