Irish place names pronounced incorrectly - actions-on-google

The speech synthesizer knows that Galway is pronounced 'gol-way', but it has no understanding of the anglicized pronunciations of other town names, such as Portlaoise, which it should pronounce 'port-leash', or Thurles, 'fur-less'.
Is there a way to correct the default pronunciation in JSON?

While Google should be doing a better job with some of these places, as Leon indicated in the comment, there will sometimes be cases where the audio just doesn't match the text and you want to coerce it yourself.
In these cases, you can use the SSML <sub> tag in your response. It takes an alias attribute containing how you would like the word pronounced, while the body of the tag contains the displayable text.
I haven't tested it (and I'm not sure I'd know how it should sound, even if I did), but SSML like the following might be what you're looking for:
<speak>
Welcome to <sub alias="furless">Thurles</sub>.
</speak>
If you're using the node.js library from Google, it should detect this as SSML and set parameters in the response accordingly. If you're writing the JSON yourself, you will need to send it in the ssml field instead of the textToSpeech field in your SimpleResponse.
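For example, if you're writing the JSON yourself, the SimpleResponse item might look something like this (a minimal sketch of just the relevant fields, not a complete webhook response):
{
    "simpleResponse": {
        "ssml": "<speak>Welcome to <sub alias=\"furless\">Thurles</sub>.</speak>",
        "displayText": "Welcome to Thurles."
    }
}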

Related

How to add multi-part Smart Health Cards to Apple Wallet and Health in Swift

I'm coding in Swift to add Smart Health Cards to Apple Wallet and Health using the guidelines here.
My code works fine with a single part JSON (i.e., shc:/5676290952432060346029243740... snipped for brevity). I am replacing the shc:/ prefix with https://redirect.health.apple.com/SMARTHealthCard/ to use with the Add to Apple Health and Wallet button as described in the link above.
The challenge is when the JSON is in multiple parts, i.e.,
shc:/1/3/567629095243206034602...
shc:/2/3/315057062436201156761...
shc:/3/3/634538347210283310097...
How can I assemble the QR code strings to add multiple vaccines to Apple Wallet and Health using the method described above, with one Add to Apple Health and Wallet button? Things I've tried include using the shc:/ prefix on all the QR codes in one long string, merging all the JSON together and prefixing the entire string with https://redirect.health.apple.com/SMARTHealthCard/, and multiple variations of the above.
I found a solution and am posting here in case anyone else needs this information. What works for me is the following format:
https://redirect.health.apple.com/SMARTHealthCard/<QR string 1><QR string 2><QR string 3>
I removed all characters up to and including the final '/' in each QR code (i.e., the whole shc:/1/3/-style prefix), so each string contains only numeric characters. I append all the strings into one big string of numeric characters in the order of their ordinal numbers (i.e., 1/3, 2/3, 3/3), then prepend the redirect URL (https://redirect.health.apple.com/SMARTHealthCard/).
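To make the steps concrete, here is a quick sketch of that assembly logic (written in Python purely for illustration; the same string operations translate directly to Swift, and the payloads below are the truncated examples from the question):
# Sketch of the assembly described above; the chunk payloads are truncated here.
chunks = [
    "shc:/1/3/567629095243206034602...",
    "shc:/2/3/315057062436201156761...",
    "shc:/3/3/634538347210283310097...",
]

# Each chunk looks like "shc:/<ordinal>/<total>/<numeric payload>".
# Sort by the ordinal, keep only the numeric payload after the last '/',
# then join the payloads and prepend the Apple Health redirect URL.
chunks.sort(key=lambda chunk: int(chunk.split("/")[1]))
payloads = [chunk.rsplit("/", 1)[-1] for chunk in chunks]

url = "https://redirect.health.apple.com/SMARTHealthCard/" + "".join(payloads)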

flutter: Can I use the speech-to-text API without pronunciation correction?

I am making an application using the Google Cloud Speech-to-Text API with Flutter.
After using the Speech-to-Text API, I feel that this API does not convert the exact pronunciation into text, but instead corrects the pronunciation and then converts it to text.
For example, if I pronounce 'opple', the text is automatically converted to 'apple'.
I want the text as 'opple'.
Is there any way to use the speech to text api without a function to correct pronunciation?
There is no option to use the Speech-to-Text API without pronunciation correction. The Speech-to-Text API tries to identify known words when it is transcribing the audio into text. Using words that don't exist, such as 'opple', 'epple', 'ipple' or 'upple', will result in words that are similar to what was said, like 'apple'. Unless you are using a different language in which any of those words exists, the API will autocorrect the pronunciation.
As a workaround, you can use the speech adaptation feature to help Speech-to-Text recognize specific words or phrases more frequently than other options that might otherwise be suggested. For example, suppose that your audio data often includes the word "weather". When Speech-to-Text encounters the word "weather", you want it to transcribe the word as "weather" more often than "whether". In this case, you can use speech adaptation to bias Speech-to-Text toward recognizing "weather".
To increase the probability that Speech-to-Text recognizes the word "weather" when it transcribes your audio data, pass "weather" in the phrases field of a SpeechContext object, and assign the SpeechContext object to the speechContexts field of the RecognitionConfig object in your request to the Speech-to-Text API. The following snippet shows part of a JSON payload that provides the word "weather" for speech adaptation. Please see this doc for more information.
"config": {
"encoding":"LINEAR16",
"sampleRateHertz": 8000,
"languageCode":"en-US",
"speechContexts": [{
"phrases": ["weather"]
}]
}
By default, speech adaptation provides a relatively small effect, especially for one-word phrases. The speech adaptation boost feature allows you to increase the recognition model bias by assigning more weight to some phrases than others, which controls the strength of the speech adaptation effect on your transcription results; that is, a higher boost value gives more importance to the specified phrases. The following snippet shows an example of a JSON payload whose RecognitionConfig object uses boost values to weight the words "fare" and "fair" differently. Also note that speech adaptation boost is a Beta feature. For more information, refer to this doc.
"config": {
"encoding":"LINEAR16",
"sampleRateHertz": 8000,
"languageCode":"en-US",
"speechContexts": [{
"phrases": ["fare"],
"boost": 18
}, {
"phrases": ["fair"],
"boost": 2
}]
}
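For reference, the "config" block shown in these fragments sits inside the body of the recognize request, alongside the audio source; a minimal complete request body (the Cloud Storage URI is a placeholder) could look like this:
{
    "config": {
        "encoding": "LINEAR16",
        "sampleRateHertz": 8000,
        "languageCode": "en-US",
        "speechContexts": [{
            "phrases": ["weather"],
            "boost": 18
        }]
    },
    "audio": {
        "uri": "gs://your-bucket/your-audio.raw"
    }
}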

Mozilla DeepSpeech STT suddenly can't spell

I am using DeepSpeech for speech-to-text. Up to 0.8.1, when I ran transcriptions like:
byte_encoding = subprocess.check_output(
    "deepspeech --model deepspeech-0.8.1-models.pbmm --scorer deepspeech-0.8.1-models.scorer --audio audio/2830-3980-0043.wav", shell=True)
transcription = byte_encoding.decode("utf-8").rstrip("\n")
I would get back results that were pretty good. But since 0.8.2, where the scorer argument was removed, my results are rife with misspellings that make me think I am now getting a character-level model where I used to get a word-level model. The errors are in a direction that suggests the model isn't correctly specified somehow.
Now when I call:
byte_encoding = subprocess.check_output(
    ['deepspeech', '--model', 'deepspeech-0.8.2-models.pbmm', '--audio', myfile])
transcription = byte_encoding.decode("utf-8").rstrip("\n")
I see errors like:
endless -> "endules"
service -> "servic"
legacy -> "legaci"
earning -> "erting"
before -> "befir"
I'm not 100% sure that it is related to removing the scorer from the API, but it is one thing I see changing between releases, and the documentation suggested accuracy improvements in particular.
Short: The scorer matches letter output from the audio to actual words. You shouldn't leave it out.
Long: If you leave out the scorer argument, you won't be able to detect real-world sentences, as the scorer matches the output from the acoustic model to words and word combinations present in the textual language model that is part of the scorer. And bear in mind that each scorer has specific lm_alpha and lm_beta values that make the search even more accurate.
The 0.8.2 version should still be able to take the scorer argument; otherwise, update to 0.9.0, which has it as well. Maybe your environment has changed in some way; I would start over in a new directory and venv.
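For example, the command from the question should work again once the scorer is passed explicitly (assuming the 0.8.2 release files follow the same naming as 0.8.1):
import subprocess

# Same subprocess approach as in the question, but with the 0.8.2 scorer passed explicitly.
byte_encoding = subprocess.check_output(
    "deepspeech --model deepspeech-0.8.2-models.pbmm "
    "--scorer deepspeech-0.8.2-models.scorer "
    "--audio audio/2830-3980-0043.wav",
    shell=True)
transcription = byte_encoding.decode("utf-8").rstrip("\n")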
Assuming you are using the Python bindings directly, you could add this to your code:
ds.enableExternalScorer(args.scorer)
ds.setScorerAlphaBeta(args.lm_alpha, args.lm_beta)
And check the example script.
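Put together, a minimal sketch of the Python-bindings approach might look like this (untested here; the model and scorer paths are placeholders, and the release notes list the lm_alpha/lm_beta values each scorer was tuned with):
import wave

import numpy as np
from deepspeech import Model

# Load the acoustic model and attach the external scorer (the language model).
ds = Model("deepspeech-0.8.2-models.pbmm")
ds.enableExternalScorer("deepspeech-0.8.2-models.scorer")
# Optionally tune the language-model weights (placeholder values shown):
# ds.setScorerAlphaBeta(0.93, 1.18)

# DeepSpeech expects 16-bit, 16 kHz mono PCM audio.
with wave.open("audio/2830-3980-0043.wav", "rb") as fin:
    audio = np.frombuffer(fin.readframes(fin.getnframes()), np.int16)

print(ds.stt(audio))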

Is it possible to use INTENT instead of STRING as List Title in Google Action?

Some Background:
I use Lists a lot for a Google Action with a NodeJS fulfillment backend. The Action is primarily voice-based. The reason for using a List is that I can encode information in each List item's key and use it later to make a decision. Another reason is that Google Assistant will try to fuzzy-match the user's input with the titles of the List's items to find the closest matching option. This is where things get a bit hard for me. Consider the following example:
{
    [JSON.stringify(SOME_OBJECT)]: {
        title: 'Yes'
    },
    [JSON.stringify(ANOTHER_OBJECT)]: {
        title: 'No'
    }
}
Now if the user says Yes or No, I can get their choice and do something with the information stored as stringified JSON in that choice's key.
But users may also say Sure, Yup, or OK, which basically mean the same thing as saying Yes. Since those words don't match Yes, Google Assistant will ignore the "Yes" option. All of these words belong to the smalltalk.confirmation.yes built-in intent, so if I could use this intent instead of hardcoding the string Yes, I would be able to capture all of the inputs that mean Yes.
I know I could do this with a Synonyms list or Confirmation intent. But they also have some problems.
Using Synonyms would require me to find every similar word. Besides, I would also need to localize these synonyms for every supported language.
With the Confirmation intent, I can't show some information to the user before asking them to choose an option. Besides, it also doesn't support encoding the options the way a List item's key does.
So, List is a good choice for me in this case.
So, is there any way to leverage the built-in intents for this purpose? What do you do in this situation?

Single barcode with Code128B and Code128C with iTextSharp

I wish to generate a barcode mixing Code128B and Code128C with the iTextSharp DLL. Do you know how to do that? I currently only know how to do it with a single code set.
For example, I wish to generate a barcode with the value 8L1 91450 883421 0550 001065
where "8L1 91450" is in Code128B and "883421 0550 001065" is in Code128C.
Thanks
Barcode128 will actually switch from B to C automatically if and when it can, but it sounds like you don't want this. For the control that you're looking for, you'll need to set your barcode's CodeType property to Barcode.CODE128_RAW and manually set the raw values.
There are a couple of posts out there that give the basic idea, but unfortunately they tend to assume too much knowledge of iText or too much knowledge of barcodes.
I'm not a barcode expert either, but the basic idea is to create a string that starts with Barcode128.START_B, then the first part of your text, then Barcode128.START_C, and then the second part. In raw mode, however, the text isn't ASCII. You can use this site to look up the character codes for various ASCII values; in code set B the raw value is the ASCII code minus 32, so instead of sending the letter L (ASCII 76) you'd send (char)44.
Hopefully this gets you started at least.
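Purely to illustrate that arithmetic (this is not iTextSharp code, and it assumes the spaces in the Code C portion are dropped, since Code C only encodes pairs of digits), a quick Python sketch of the raw symbol values for the example value would be:
# Illustration only: raw Code 128 symbol values for "8L1 91450" + "883421 0550 001065".
# Code set B: symbol value = ASCII code - 32. Code set C: symbol value = the two-digit number.
code_b_part = "8L1 91450"
code_c_part = "883421 0550 001065".replace(" ", "")  # assume spaces are dropped for Code C

raw_b = [ord(c) - 32 for c in code_b_part]
raw_c = [int(code_c_part[i:i + 2]) for i in range(0, len(code_c_part), 2)]

print(raw_b)  # [24, 44, 17, 0, 25, 17, 20, 21, 16] -- note 'L' becomes 44
print(raw_c)  # [88, 34, 21, 5, 50, 0, 10, 65]
In iTextSharp you would then build the raw string for the barcode from Barcode128.START_B, the Code B values cast to char, Barcode128.START_C, and the Code C values cast to char.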