The Actions documentation says that an action may interact in a language of its choosing, rather than in the language set by the user. There is a checkbox on the Invocation page allowing you to use a voice other than the one matching the user's language setting.
However, the choice of voices is limited to those in that language. I need to speak and, more importantly, to listen in a specific language, not the one set by the user (this is for language teaching).
This doesn't seem to be a mainstream use case, but I wonder whether there is some workaround through the API, or by uploading an action that enables settings not accessible through the Dialogflow GUI.
Speaking in another language is achievable, with some inconvenience, by using recordings, but the inability to listen is a showstopper.
Similar questions have been asked before, but new languages and features appear all the time; maybe the current Actions API supports more than is widely known?
I believe Alexa has the same limitation, as well as a shorter list of supported languages.
The selection of TTS voices is constrained by the language selected by the user, and speech recognition is likewise biased towards that language.
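For what it's worth, the "recordings" workaround mentioned in the question usually means playing pre-recorded clips via SSML. A minimal sketch, assuming the actions-on-google Node.js client library with Dialogflow fulfillment; the audio URL is a placeholder:

    import { dialogflow } from 'actions-on-google';

    const app = dialogflow();

    app.intent('Default Welcome Intent', (conv) => {
      // TTS stays in the user's locale, but the <audio> clip can be recorded
      // in any language; the text inside the tag is the fallback on failure.
      conv.ask(
        '<speak>Repeat after me: ' +
        '<audio src="https://example.com/phrase-in-target-language.mp3">' +
        'audio unavailable</audio></speak>'
      );
    });

This only covers output, of course; as noted above, recognition of the user's reply is still bound to the user's language setting.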
Related
How to make or add Watson languages other than the 13 default languages (Arabic, Brazilian Portuguese, Chinese (Simplified), Chinese (Traditional), Dutch, Czech, French, English (US), German, Italian, Japanese, Korean, Spanish)
This technique is totally unsupported and, depending on how you approach it, gives good or really poor results.
First, determine which supported language is closest to the unsupported language you want to use, in the sense that the two languages share nuances. Alternatively, use English.
Capture your questions to build the intents, then use a machine-translation tool to translate them from your language into the supported language. Train on that.
Then, at your application layer, translate the end user's input before sending it to Watson Assistant.
For the returned message, Watson Assistant already supports UTF-8, so you can write your responses in your target language.
It's important to use the same translation engine when asking questions of the system. It may turn a question into garbage, but it can still find the correct intent if similar questions were mistranslated the same way during training.
You will need to do the same with entities when training as well.
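A sketch of that application-layer flow; translateText and sendToWatsonAssistant are hypothetical stand-ins for whatever translation engine and Watson Assistant client you actually use:

    // Hypothetical helpers: wire these to your machine-translation engine and
    // to the Watson Assistant SDK of your choice.
    declare function translateText(text: string, from: string, to: string): Promise<string>;
    declare function sendToWatsonAssistant(text: string): Promise<string>;

    async function handleUserInput(userText: string): Promise<string> {
      // 1. Translate the unsupported language into the closest supported one,
      //    using the same engine that was used when training the intents.
      const translated = await translateText(userText, 'xx', 'en');

      // 2. Watson Assistant matches intents against the translated text.
      //    Responses were authored directly in the target language (UTF-8),
      //    so nothing needs to be translated on the way back.
      return sendToWatsonAssistant(translated);
    }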
The Actions SDK does not recognize any other intent from the actions.json.
I've read in this post that this is not a bug: unable to read intents
What I don't understand is why we have the option to define actions if they are not recognised by the SDK.
Is there any other way to add more intents without using Dialogflow?
That is correct; it is not a bug. The Intents listed in the actions.json file are primarily used to match the initial Intents (plural: they help identify which initial Intent to use if you have multiple ones defined). They can help shape the conversation and suggest what patterns the speech-to-text parser should look for, but they don't mandate that the parser follow them. I would venture this is intentional, to allow for flexibility in the various Natural Language Parsers.
The latter is probably why they're ultimately not used. Unlike Alexa, which requires a wide range of exact text to match for its Intent definitions, Google probably started going that route and realized it would be better to hand parsing off to other NLPs, either your own or commercial ones, which could handle the flexibility of how humans actually speak. (And then they bought one to provide as a suggested tool to use.)
So the Actions SDK has primarily become the tool to use if you do intend to hand the language parsing off to another tool. There isn't much advantage to using it over any other tool otherwise.
You're not obligated to use Dialogflow. You can use any NLP system that will accept text input for the language you need. Google also provides direct integration with Converse.AI, and I suspect that any other NLP out there will provide directions on how to integrate it with Actions.
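For illustration, a minimal sketch of that hand-off with the actions-on-google Node.js client library; myNlp is a hypothetical stand-in for your own or a commercial NLP:

    import { actionssdk } from 'actions-on-google';

    // Hypothetical NLP client; replace with your own or a commercial service.
    declare const myNlp: { detectIntent(text: string): Promise<{ reply: string }> };

    const app = actionssdk();

    app.intent('actions.intent.MAIN', (conv) => {
      conv.ask('Welcome! What would you like to do?');
    });

    // After the initial intent, each user utterance arrives here as raw text:
    // the platform does the speech-to-text and leaves the parsing to you.
    app.intent('actions.intent.TEXT', async (conv, input) => {
      const { reply } = await myNlp.detectIntent(input);
      conv.ask(reply);
    });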
This is a general question.
I am not sure whether I should post this question here; when I searched the programming section, it seemed to me that it is meant for in-depth questions on programming.
I am not a programmer; however, I would like to find out how an API is related to a plugin. Is there any difference between them?
I have tried searching the web but was not able to find an answer to my question.
Thanks in advance.
Justin
"An application programming interface (API) is a set of routines, protocols, and tools for building software and applications" (Wikipedia).
What this means is that you establish a connection to another program or service which provides you with certain functionality, like data retrieval in the case of the Twitter API, or operations and commands as with the Win32 API. Without this interface, there would be no ("easy") way to make use of the program.
"A plug-in [...] is a software component that adds a specific feature to an existing computer program" (again, Wikipedia).
This means that you have already built an application but want to enhance its functionality or appearance. For instance, you have a table on an HTML page but want to make it searchable. You could then use the jQuery DataTables plugin. In this case you take an existing piece of software and integrate it entirely into your application.
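A toy contrast of the two (the identifiers here are purely illustrative, and the DataTables call assumes jQuery and the plugin are already loaded on the page):

    declare const $: any; // jQuery, assumed loaded on the page

    // API: your code calls another program or service through its interface.
    const res = await fetch('https://api.github.com/users/octocat'); // a public web API
    const user = await res.json();
    console.log(user.name);

    // Plugin: a component dropped into an existing application to extend it,
    // e.g. jQuery DataTables making an existing HTML table searchable:
    $('#myTable').DataTable();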
I guess that, as a developer, you have a very intuitive understanding of these two and thus can distinguish them more easily. Nonetheless, I hope that my explanation made it a bit clearer for you.
If not, do you have any specific question?
When talking about the pros and cons of an API versus plugin integration, it is important to highlight that there is no right or wrong: basically, both have the same nature.
API stands for Application Programming Interface. An API basically defines how a component interacts with a system, facilitating the communication between them.
With an API, you get more integration flexibility: the merchant has total control over the integration and can make the checkout page look as desired.
"A theme or skin is a preset package containing additional or changed graphical appearance details, achieved by the use of a graphical user interface (GUI), that can be applied to specific software and websites to suit the purpose, topic, or tastes of different users to customize the look and feel of a piece of computer software or an operating system front-end GUI" (from Wikipedia).
With a plugin, you get an easy, ready-to-use integration: within a few minutes, the payment methods can be available at the checkout.
How do you deal with formal or informal speech when building an application that must have all its phrases in one of those registers?
Most frameworks will let you pick the language for things such as form validation error messages, by setting it to something like 'en-GB' or 'fr-FR'.
But if you want to change from formal to informal or vice versa, you will have to edit the language files.
I know this isn't a big issue in English, but it is in other languages where you have to pick the correct word for, say, the equivalent of "you", depending on whether the conversation is formal or informal. The same can happen with almost any word in the sentence, depending on the language.
Any thoughts?
Have you ever been told to build an application fully in formal / informal speech?
Does the user even care about this?
Informal vs Formal
The real problem with choosing the form is that it really depends on who you are speaking to. It is probably OK to use informal messages with an English-speaking user, but it could be regarded as offensive to use the same tone with, for example, a Japanese user. That is the essence of Internationalization.
How to deal with it?
I suggest picking one "tone" and using it consistently throughout the application. If it is informal (for example, if the target users are teenagers), then so be it. However, let Localization decide how to translate these messages, as they should have vast knowledge of the target market.
If you need to have both formal and informal language in one application, for example depending on the target user's age, you can think of implementing themes. Of course, a theme should customize not only the messages but also the User Interface (styles, colors). Again, if you do, let L10n decide what is good for each international market (some themes might not be applicable there).
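As a rough sketch of that theme idea (names and messages here are purely illustrative), the message catalog can be keyed by locale and register:

    type Register = 'formal' | 'informal';

    // One entry per locale, each with a formal and an informal variant.
    const messages: Record<string, Record<Register, { greeting: string }>> = {
      'de-DE': {
        formal:   { greeting: 'Guten Tag, wie können wir Ihnen helfen?' },
        informal: { greeting: 'Hallo, wie können wir dir helfen?' },
      },
      'en-GB': {
        formal:   { greeting: 'Good day. How may we help you?' },
        informal: { greeting: 'Hi! How can we help?' },
      },
    };

    function t(locale: string, register: Register, key: 'greeting'): string {
      return messages[locale][register][key];
    }

    console.log(t('de-DE', 'formal', 'greeting')); // uses "Ihnen", not "dir"

The German pair shows the "you" problem from the question: the formal variant uses "Ihnen" where the informal one uses "dir".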
Does the user even care?
Some users do, some users don't; it depends. From my experience, Asian customers (especially Japanese and Chinese) tend to care a lot. Using informal speech or bright colors might come across as rude to them.
Does anyone have experience with the Google Closure editor/WYSIWYG? I'm thinking of moving from CKEditor to the Google Closure editor. Ideally I'd love to use the Etherpad editor, but it doesn't appear that anyone has separated the editor from the rest of the app.
Anyhow, does anyone know whether the Google Closure editor supports the real-time collaborative aspects seen in Google Docs?
The Google Closure editor is a wrapper around the built-in browser editing capabilities. It is thus similar to other rich text editors like TinyMCE, CKEditor, etc. It is less feature-rich than either of those, but it's smaller and faster. The base editor is used by Gmail (most notably) and various other Google properties.
There is nothing within the public Google Closure editor to enable Google Docs style real-time collaboration. With that said, it has a plugin model which enables you to add new functionality. I would not recommend taking something like this on without a solid understanding of working with Google Closure.
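For reference, a minimal sketch of the editor and its plugin model; Closure Library is plain JavaScript loaded via goog.require, so goog is simply declared here for the sketch:

    declare const goog: any; // provided by Closure Library's base.js

    goog.require('goog.editor.Field');
    goog.require('goog.editor.plugins.BasicTextFormatter');

    // Turn the element with id="editMe" into an editable rich-text field
    // and register a basic formatting plugin on it.
    const field = new goog.editor.Field('editMe');
    field.registerPlugin(new goog.editor.plugins.BasicTextFormatter());
    field.makeEditable();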
Until recently, the editor was also used by Google Docs. However, the limitations of core browser editing technology became a barrier to innovation, so they built their own editing surface[1,2] (codenamed Kix). This editing surface is not included in Closure Library.
[1] https://drive.googleblog.com/2010/04/a-rebuilt-more-real-time-google.html
[2] https://drive.googleblog.com/2010/05/whats-different-about-new-google-docs.html
It might not last, but there is a standalone version of Kix up on GitHub:
https://github.com/benjamn/kix-standalone
EtherPad Lite is the most viable option I've seen so far:
https://github.com/ether/etherpad-lite
Personally I favor this one, because:
It's open source
You can host your own
Has few server-side dependencies (Node.js)
It has an API, so you can build your app in any language
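To illustrate the API point, a quick sketch against Etherpad Lite's HTTP API, assuming a local instance on port 9001 and the key from its APIKEY.txt:

    const BASE = 'http://localhost:9001/api/1';
    const APIKEY = 'replace-with-your-api-key'; // copied from APIKEY.txt

    // Each Etherpad Lite API call is a plain HTTP request that returns JSON
    // of the form { code, message, data }.
    async function call(method: string, params: Record<string, string>) {
      const query = new URLSearchParams({ apikey: APIKEY, ...params });
      const res = await fetch(`${BASE}/${method}?${query}`);
      return res.json();
    }

    await call('createPad', { padID: 'demo' });
    const { data } = await call('getText', { padID: 'demo' });
    console.log(data.text);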
Attempting to steal Google's work is probably not a good long-term plan. (I'm also not convinced that having the client-side libraries actually helps you, in terms of the real-time collaboration feature, which depends heavily on the server-side.)