Flutter Speech Recognition: how to get a percentage of "similarity" between the right pronunciation of a word and the user's pronunciation

As stated in the title: I am developing an app for French kids to learn English vocabulary. I have added speech-to-text functionality using the SpeechToText package, which works really well.
However, I am hitting a wall now. One of the activities proposed to the students is simply "Listen and repeat", so that they progressively improve their pronunciation.
I thought of using the SpeechToText package for this as well, and it would work if the students pronounced the words quite well. One example: the sound "TH" is problematic for a French speaker and is very often pronounced as a "Z", so the app never really recognizes a word like "Father"; it keeps thinking the user is saying "Fazza".
Is there a way to compare the "good pronunciation" of a word to what the user says and get a percentage of "similarity"? I know we can compare strings that way.
Would anyone know of a solution for this issue? Any advice?

You can use the Speechace APIs to get the following features:
Pronunciation assessment
Fluency assessment
Spontaneous speech assessment
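A dedicated scoring API like the one above gives real phoneme-level feedback. As a much cruder fallback, the string-comparison idea from the question can be sketched with Python's stdlib difflib: compare the expected word against whatever the recognizer heard and turn the match ratio into a percentage. The function name `similarity_percent` is made up for illustration, and this only scores the recognizer's transcript, not the audio itself.

```python
# Rough "percentage of similarity" between the expected word and what
# the speech recognizer heard, using difflib's sequence-matching ratio.
from difflib import SequenceMatcher

def similarity_percent(expected: str, heard: str) -> float:
    """Return a 0-100 similarity score between two strings."""
    return 100.0 * SequenceMatcher(None, expected.lower(), heard.lower()).ratio()

print(similarity_percent("father", "fazza"))   # partial credit
print(similarity_percent("father", "father"))  # 100.0
```

A threshold (say, accept anything above 60) would then decide whether the repetition counts, though "fazza" vs "father" shows why phoneme-aware scoring is ultimately the better tool for pronunciation training.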

Related

Google Assistant Explicit Intents without App name

I would like to make my Google Assistant (Google Home & Android smartphone) a little bit smarter by adding simple small-talk intents and (last but not least) useful "OK Google, do whatever" or "OK Google, tell me when ..." intents.
For now I only own an Echo Dot with Alexa, and I really hate their conception of skills due to their strict invocations. I have read somewhere that Google is going to get around this nightmare by using implicit invocation. However, what I have done so far is not even close to good.
With implicit invocation, Google Assistant can find the correct action by searching for intents. This is good, and I can add a simple phrase that Google detects correctly. However, instead of invoking that intent, Google asks me if it should ask appname to do so.
Of course this is not really an option if we want to make digital assistants smarter, since it not only destroys any kind of smartness but also prevents us (at least me) from writing useful actions at all (because they would be annoying to develop and to use). Actions should be able to react to specific phrases and intents instead of requiring the user to name the app. As it stands, it is impossible to create simple intents like "Say goodnight" or "Ask my girlfriend when she will be here".
My question is not only whether this is currently possible, but also what we can expect regarding this problem in the future. Is there any good news? Or do we have to wait until we can help the existing assistants evolve their real power?
You can add custom trigger phrases that will open or deep-link into your skill, using query patterns in action.json:
Action.json Query Pattern (Google Doc)
But the number of patterns is limited, and I am not sure you can completely avoid Google asking things like "Should I really open it?" or "Opening it now...".
You may also still have to say "OK Google" to make it start listening at all.
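For reference, a query-pattern entry in the legacy Actions SDK action package (action.json) looked roughly like the sketch below. The intent name, conversation name, and phrases are made-up placeholders, not values from the original question:

```json
{
  "actions": [
    {
      "name": "SAY_GOODNIGHT",
      "intent": {
        "name": "com.example.intents.SayGoodnight",
        "trigger": {
          "queryPatterns": [
            "say goodnight",
            "wish me a good night"
          ]
        }
      },
      "fulfillment": {
        "conversationName": "goodnight"
      }
    }
  ]
}
```

Matching one of the queryPatterns deep-links the user into that intent, but as noted above, Google may still confirm the app name before handing over the conversation.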
Nick Felker's answer is better than mine. To expand on it a bit:
In the Google Home app on your phone tap the hamburger menu icon (three horizontal parallel lines) in the upper left, then go to "More settings", then "Shortcuts" (near the bottom), then press the little blue "+" button in the lower right to set up your custom shortcut.
Another option for extremely simple intents, "Say goodnight" for example, is to use IFTTT, which has lots of integrations out of the box, as well as the ability to pass the message along to a webhook which you could write yourself. Important caveat: IFTTT isn't "smart" itself, so that first layer of integration only does simple string matching (and I mean simple; it seems to be case-sensitive).

How to give special characters in Trivia Game, Multiple Choice?

I am using the Actions on Google Trivia Game template.
Special characters (parentheses) are not displayed in the chat window.
In Google Sheets, I have entered them in the following format:
Question: How to add an item to the end of the list
Answer1: append()
Answer2: extend()
In Google Assistant, the answers are displayed without parentheses. How do I write questions and answers with parentheses and other special characters?
This is a good one - it looks like the processor that uses what you entered removes special characters. This does seem odd when you look at the question and the suggestion chips.
However... it makes sense if you think about how people are expected to answer the question. If you run it in "speaker" mode, it won't display the suggestion chips, but users will be expected to verbally give an answer. It is pretty difficult to give an answer with parentheses - so the system automatically removes those from what is expected.

Do keywords affect Bluemix Watson speech recognition?

Watson's speech recognizer supports a list of keywords as a parameter, but I'm trying to figure out whether these keywords actually affect recognition. For example, if you were handing Watson an audio clip you knew to contain proper names that might not be properly recognized, would submitting these names as keywords increase the likelihood that Watson would properly recognize them? Do the keywords interact with the recognition itself?
Unfortunately the answer is no: the words won't get added to the vocabulary just because you added them as keywords, so they won't be found.
Dani
Looks like the answer to this is no (though I'd love to be proven wrong here): I ran a test clip that contained proper names that Watson mis-recognized, then submitted the same clip with those names as keywords, and it didn't change the transcript at all.
Not entirely conclusive, but it seems like the answer is no.
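For context on what the parameter does do: in Watson Speech to Text, keywords and keywords_threshold drive keyword spotting, i.e. flagging where a keyword was (probably) heard in the transcript, rather than biasing recognition; extending the vocabulary is instead done through language-model customization. A minimal sketch of assembling those documented recognize options (the helper name is made up for illustration):

```python
# Build the keyword-spotting options for a Watson Speech to Text
# recognize request. These options mark where keywords appear in the
# results; they do not add the words to the recognizer's vocabulary.

def keyword_options(names, threshold=0.5):
    """Return the keyword-spotting parameters for a recognize call."""
    return {
        "keywords": list(names),
        "keywords_threshold": threshold,  # minimum confidence for a match
    }

params = keyword_options(["Kubrick", "Tarkovsky"])
print(params)
```

These parameters would then be passed alongside the audio in the recognize request; the response lists keyword matches with timestamps and confidences, which is consistent with the test result above where the transcript itself was unchanged.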

Final version of my request: how to convert speech to text

I asked about converting speech to text in C++, but my question seemed to be unclear, and so was the answer I got. I have already heard that we should have the speech in a WAV file, so what do I do after that? I said I want to do it on my own, meaning without using Windows facilities and APIs; C++ libraries are allowed if they exist. I need to know how it works. Is it impossible? Would it take a million years? There is no tutorial about it on Google. At least tell me some steps to light my path: what are the topics of the things I have to do or learn?
So many things use speech-to-text conversion. For example, I can say "John" to my mobile phone and it calls John; doesn't it recognize "John"? That seems to be a little bit easy, so it should be possible to learn in less than "a million years". I need to learn how to convert just some short sentences to a string; for now, qualities like pronunciation are not important. What are the basics and where can I learn about them?
Look, I need to say one sentence to the computer, and then the computer has to choose one of the answers that I have already recorded. So I need to convert the speech to a string, to compare that string with the recorded sentences. Please, someone tell me what to do, what to search for, where to look, where to go, or who to ask.
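No answer is recorded here, but the last part of the question (match one spoken sentence against a handful of pre-recorded ones) is the classic template-matching setup: extract per-frame feature vectors (typically MFCCs) from each WAV, then compare sequences with dynamic time warping (DTW), which tolerates differences in speaking rate. A minimal DTW sketch, with plain numbers standing in for feature vectors that a real system would extract from audio:

```python
# Minimal dynamic time warping (DTW): the smallest cumulative distance
# needed to align one feature sequence to another, allowing stretching.

def dtw_distance(a, b):
    """Best alignment cost between sequences a and b."""
    INF = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a frame repeated
                                 cost[i][j - 1],      # b frame repeated
                                 cost[i - 1][j - 1])  # frames matched
    return cost[n][m]

# Choose the pre-recorded template closest to the new utterance.
templates = {"yes": [1, 5, 1], "no": [9, 9, 2]}
utterance = [1, 1, 5, 5, 1]  # same shape as "yes", just spoken more slowly
best = min(templates, key=lambda k: dtw_distance(templates[k], utterance))
print(best)  # -> yes
```

This is roughly how early small-vocabulary recognizers ("call John") worked, and it fits the stated goal of picking one of a few recorded answers. Modern open-vocabulary recognition instead uses statistical acoustic and language models (e.g. HMMs or neural networks), which is where the "million years" of effort actually lives.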

How to provide easy multilanguage text in an iPhone app?

I have some short text that I want to ship in the currently active language. What's the easiest way to do it? Example: I have the text "the cat", but when someone from Spain uses the app, he/she wants to read "el gato". Is there a standard way to do this easily with UIKit? They're pretty simple texts only. I can imagine some kind of property list, feeding it a key and a locale, and getting the appropriate text snippet out of there.
Check out NSLocalizedString.
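NSLocalizedString does exactly the key-to-locale lookup described in the question: it reads per-language Localizable.strings files bundled with the app. A sketch with a made-up key:

```
/* en.lproj/Localizable.strings */
"the cat" = "the cat";

/* es.lproj/Localizable.strings */
"the cat" = "el gato";
```

In code, NSLocalizedString(@"the cat", @"label for the cat") then returns whichever variant matches the device's active language, falling back to the key itself if no table has an entry.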