How to detect more than one intent with IBM Watson Assistant? - ibm-cloud

Can the IBM Watson Conversation / Assistant service detect more than one intention in a single sentence?
Example of input:
play music and turn on the light
Intent 1 is #Turn_on
Intent 2 is #Play
==> the response should address both intents at once: music played and light turned on
If so, how can I do that?

Yes, Watson Assistant returns all detected intents with their associated confidence. See here for the API definition. The response returned by Watson Assistant contains an array of intents recognized in the user input, sorted in descending order of confidence.
The documentation has an example of how to deal with multiple intents and their confidence. Also be aware of the alternate_intents setting, which allows even more intents with lower confidence to be returned.
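As an illustration, here is a minimal sketch using the ibm-watson Python SDK (the v1 workspaces API). The API key, service URL, workspace ID, and the 0.2 confidence cut-off are placeholders and assumptions you would replace and tune for your own skill:

# Minimal sketch: ask Watson Assistant for alternate intents and keep the
# confident ones. Credentials, workspace ID, and the 0.2 threshold are
# placeholders, not values from the original question.
from ibm_watson import AssistantV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator("YOUR_API_KEY")
assistant = AssistantV1(version="2021-06-14", authenticator=authenticator)
assistant.set_service_url("https://api.us-south.assistant.watson.cloud.ibm.com")

response = assistant.message(
    workspace_id="YOUR_WORKSPACE_ID",
    input={"text": "play music and turn on the light"},
    alternate_intents=True,  # return more than just the single top intent
).get_result()

# Intents come back sorted by confidence, highest first.
for intent in response["intents"]:
    print(intent["intent"], intent["confidence"])

# Keep every intent above an (assumed) confidence threshold.
detected = [i["intent"] for i in response["intents"] if i["confidence"] > 0.2]
print("Acting on:", detected)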

While #data_henrik is correct about how to get the other intents, that doesn't mean the second-ranked intent actually corresponds to a second question from the user.
Take the following example graph, where we map the intents versus confidence that comes back:
Here you can clearly see that there are two intents in the person's question.
Now look at this one:
You can clearly see that there is only one intent.
So how do you solve this? There are a couple of ways.
You can check whether the first and second intents fall within a certain percentage of each other. This is the easiest to detect, but trickier to code when you need to act on two different intents. It can get messy, and you will sometimes get false positives.
At the application layer you can run K-Means on the intent results. K-Means lets you group intents into buckets, so you create two buckets (K=2), and if there is more than one intent in the first bucket, you have a compound question. I wrote about this, with a sample, on my site; a rough sketch of the idea appears after this list.
There is a new feature you can try in Beta called "Disambiguation". It allows you to flag nodes with a question to ask; then, if two likely intents are found, the assistant says "Did you mean ...?" and the user can select.
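To make the first two approaches concrete, here is a minimal application-layer sketch. scikit-learn, the 0.10 gap, and the sample intent names are my own assumptions, not part of Watson Assistant itself:

# Sketch: decide whether a Watson Assistant response contains a compound
# question, using either a simple gap check or K-Means (K=2) bucketing.
import numpy as np
from sklearn.cluster import KMeans

def close_confidences(intents, gap=0.10):
    """True if the top two intents are within `gap` of each other (assumed threshold)."""
    if len(intents) < 2:
        return False
    return intents[0]["confidence"] - intents[1]["confidence"] <= gap

def compound_by_kmeans(intents):
    """Bucket confidences into two clusters; more than one intent in the
    top bucket suggests a compound question."""
    if len(intents) < 2:
        return False
    conf = np.array([[i["confidence"]] for i in intents])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(conf)
    top_label = labels[0]  # intents arrive sorted by confidence, so index 0 is the top one
    return (labels == top_label).sum() > 1

intents = [{"intent": "Play", "confidence": 0.81},
           {"intent": "Turn_on", "confidence": 0.78},
           {"intent": "Weather", "confidence": 0.05}]
print(close_confidences(intents), compound_by_kmeans(intents))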

Is this disambiguation feature available in non-production environments, on Beta?

Related

Create custom Google Smart Home Action

I have a Google Nest Hub Max and I want to increase its capabilities for a custom need:
"Hey Google, add xyz to my work planning"
Then I want to make an HTTP call to my private server
The private server returns a text
The text is displayed in the Google Nest Hub Max screen + speak-out.
How can that be achieved?
Originally I thought this would not be difficult. I imagined a Node.js, Java, Python, or whatever framework where Google gives me the xyz text, I do my thing, and I return a simple text. And obviously, Google would handle the intent matching and only call my custom code when users say the precise phrase.
I've tried to search for how to do it online, but there is a lot of documentation everywhere. This post sums up the situation quite well, but I've never found a tutorial or hello-world example of such a thing.
Does anyone know how to do it?
For steps 2 and 3, I don't necessarily need to use a private server if I can achieve what the private server does inside the Smart Home Action code, mostly some basic Python code.
First - you're on the right track! There are a few assumptions and terminology issues in your question that we need to clear up first, but your idea is fundamentally sound:
Google uses the term "Smart Home Actions" to describe controlling IoT/smart home devices such as lights, appliances, outlets, etc. Making something that you control through the Assistant, including Smart Speakers and Smart Hubs, means building a Conversational Action.
Most Conversational Actions need to be invoked by name. So you would start your action with something like "Talk to Work Planning" or "Ask Work Planning to add XYZ". There are a limited, but growing, number of built-in intents (BIIs) to cover other verticals, but don't count on them right now.
All Actions are public. They all share an invocation name namespace and anyone can access them. You can add Account Linking or other ways to ensure a limited audience, and there are ways to have more private alpha and beta testing, but there are issues with both. (Consider this an opportunity!)
You're correct that Google will help you with parsing the Intent and getting the parameter values (the XYZ in your example) and then handing this over to your server. However, the server must be at a publicly accessible address with an HTTPS endpoint. (Google refers to this as a webhook.)
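As a rough illustration of what such a webhook can look like in Python, here is a sketch using Flask. The request and response field names follow my reading of the Actions Builder conversational webhook format, and the handler name, parameter name, and URL path are all assumptions; verify them against Google's documentation:

# Sketch of a conversational-action webhook in Flask. The JSON field names
# (handler.name, intent.params, prompt.firstSimple) reflect my reading of the
# Actions Builder webhook format; double-check them against Google's docs.
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])  # must be reachable over public HTTPS
def webhook():
    body = request.get_json()
    session_id = body["session"]["id"]
    handler = body["handler"]["name"]  # e.g. "add_to_work_planning" (assumed handler name)
    params = body.get("intent", {}).get("params", {})
    item = params.get("xyz", {}).get("resolved", "something")  # parameter name is an assumption

    # ... call your own backend here and build the reply text ...
    reply = f"Added {item} to your work planning."

    return jsonify({
        "session": {"id": session_id},
        "prompt": {
            # spoken aloud and shown on the Hub's screen
            "firstSimple": {"speech": reply, "text": reply}
        },
    })

if __name__ == "__main__":
    app.run(port=8080)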
There are a number of resources available, via Google, StackOverflow, and elsewhere:
On StackOverflow, look for the actions-on-google tag. Frequently, conversational actions are built with either dialogflow-es or, more recently, actions-builder, which each have their own tags. (And when you post your own questions, don't forget to provide code, errors, screenshots, and as much other information as you can to help us help you overcome the issues.)
Google's documentation about how to design and build conversational actions.
Google also has codelabs and sample code illustrating how to build conversational actions. The codelabs include the "hello world" examples you are probably looking for.
Most sample code uses JavaScript with node.js, since Google provides a library for it. If you want to use Python, you'll need to handle the JSON format that the Assistant sends to your webhook and expects back in response (as sketched above).
There are articles and videos written about it. For example, this series of blog posts discussing designing and developing actions outlines the steps and shows the code. And this YouTube playlist takes you through the process step-by-step (and there are other videos covering other details if you want more).

IBM Watson Assistant for non-English language - Intent is not recognized

I am working with IBM Watson Assistant for Korean and found the failure rate for detecting the correct intent is quite high. Therefore, I decided to check the language support and found an important missing feature, entity fuzzy matching:
Partial match - With partial matching, the feature automatically suggests substring-based synonyms present in the user-defined entities, and assigns a lower confidence score as compared to the exact entity match.
This results in a chatbot that is not very intelligent, for which we need to provide synonyms for each word. Check out the example below, where Watson Assistant in English can detect an intent from words that are not included in the examples at all. I tested this and found it is not possible for the Korean language.
I wonder if I have misunderstood something, or if there is a workaround for this issue that I do not know of?
By default, you start with IBM Watson Assistant and an untrained dialog. You can significantly improve the understood intents and entities by providing more examples and then using the dashboard to tag correctly understood conversations and to change incorrect intents / entities to the right ones. This is the preferred way and is just part of the regular development process which includes training the model.
Another method, this time as workaround, is to preprocess a dialog using Watson Natural Language Understanding which has Korean support, too.
BTW: I use German language for some of my bots and it requires training for some scenarios.
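As an example of that preprocessing workaround, here is a minimal sketch using the ibm-watson Python SDK to run Korean text through Natural Language Understanding before handing it to the dialog; the credentials, service URL, sample text, and the choice of features are placeholders:

# Sketch: preprocess Korean user input with Watson NLU (which supports Korean)
# to extract keywords/entities before the dialog step. Credentials and feature
# choices here are placeholders.
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_watson.natural_language_understanding_v1 import (
    Features, EntitiesOptions, KeywordsOptions)
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator("YOUR_API_KEY")
nlu = NaturalLanguageUnderstandingV1(version="2021-08-01", authenticator=authenticator)
nlu.set_service_url("https://api.us-south.natural-language-understanding.watson.cloud.ibm.com")

result = nlu.analyze(
    text="회의실 조명을 켜 줘",  # sample Korean input: "turn on the meeting room light"
    language="ko",
    features=Features(entities=EntitiesOptions(), keywords=KeywordsOptions()),
).get_result()

# Feed the extracted keywords/entities into your own logic, or pass them to
# the Assistant message as additional context.
print([kw["text"] for kw in result.get("keywords", [])])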
In addition to Henrik's answer, here are a couple of tips for creating an intent:
Provide at least five examples for each intent.
Always re-train your system
If the system does not recognize the correct intent, you can correct it. To correct the recognized intent, select the displayed intent and then select the correct intent from the list. After your correction is submitted, the system automatically retrains itself to incorporate the new data.
Remember, the Watson Assistant service scores each intent's confidence independently, not in relation to other intents.
Avoid conflicts, and resolve any that arise. Watson Assistant detects a conflict when two or more intent examples in separate intents are so similar that it is confused as to which intent to use.

Can IBM Cloud Watson recognize the same person across multiple images?

I want to create a gallery service that clusters images based on different characteristics, chief among them being faces matched across multiple images.
I've been considering the IBM Cloud for this, but I can't find a definitive yes or no answer to whether Watson supports face recognition (on top of detection) so the same person is identified across multiple images, the way AWS Rekognition and the Azure Cognitive Services Face API do.
The concrete scenario I want to implement is this: given photos A.jpg and B.jpg, Watson should be able to tell that A.jpg has a face corresponding to person X, and that B.jpg has another face that looks similar to the one in A.jpg. Ideally, it should do this automatically and give me face ID values for each detected face.
Has anyone tackled this with Watson before? Is it doable in a simple manner without much code or ML techniques on top of the vanilla Watson face detection?
I have used Watson to do basic face detection from the CLI. Are you wanting to recognize a particular individual after training on images of that individual? Could you clarify your question? Here is what I can answer: if you have a Watson API key, you can run this, for example, in a terminal:
curl -X POST --form "images_file=@path/to/image.jpg" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/detect_faces?api_key={your_api_key}&version=2016-05-20"
That will recognize the individual in the photo and give other categorical information about the person such as age, sex, etc. And if they are famous it will give further category information.

How does Microsoft LUIS filter swear words?

We've been using Microsoft's LUIS cognitive service as an ML tool for our chatbot. We've observed that whenever a swear word is entered, there is no response from the bot. I couldn't find anything about this in the documentation, except that LUIS can identify slang words.
I would also like to know if anyone knows how to customize your Chatbot's response in such a scenario?
Any help would be great. Thank you!
LUIS doesn't filter swearwords. Regarding an explanation for the lack of response from your chatbot, it would be necessary to see the code for the bot. If the user isn't in a dialog and utters a swearword, your bot should either map it to a defined intent, map it to the crowd-favorite "None" intent, or do nothing with it. To my knowledge the only time the chatbot will do nothing, is when a handler for the "None" intent isn't defined.
To handle an utterance that contains swearwords it's necessary to know the context behind it.
At certain points, the SDK you're using may block swearwords indirectly. E.g. a user saying, "#$%! yes!" to a confirm prompt may have the bot asking the user to repeat themselves with either a yes or no response.
An extremely simple and intrusive way to handle swear words in the Node SDK would be to create a bot.dialog() that activates through the use of .triggerAction(). You can use regexp so the chatbot responds to swearwords by switching to this dialog. You can also use a custom Intent Recognizer to recognize swearwords.
The 'Swear' intent needs to be implemented by hand in LUIS. I suggest separating it from the None intent in LUIS.
In the Bot, it is possible to have the same handler for None and Swear intents, or to have separate handlers and potentially different Bot behaviors for these two intents.
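To make the two-handler idea concrete, here is a small Python sketch. The intent names, handler replies, and the shape of the LUIS v3 prediction call are assumptions of mine and only illustrate the routing, not a definitive Bot Framework implementation:

# Sketch: route a LUIS top intent to handlers, with a hand-made "Swear" intent
# kept separate from "None". Endpoint, app ID, key, and intent names are
# placeholders; the v3 prediction URL shape is from memory, so verify it.
import requests

def get_top_intent(utterance: str) -> str:
    url = ("https://YOUR_RESOURCE.cognitiveservices.azure.com/"
           "luis/prediction/v3.0/apps/YOUR_APP_ID/slots/production/predict")
    resp = requests.get(url, params={"subscription-key": "YOUR_KEY", "query": utterance})
    return resp.json()["prediction"]["topIntent"]

def handle_swear(utterance: str) -> str:
    return "Let's keep it friendly. How can I help you?"

def handle_none(utterance: str) -> str:
    return "Sorry, I didn't understand that. Could you rephrase?"

def handle_order_status(utterance: str) -> str:
    return "Let me look up your order."

HANDLERS = {
    "Swear": handle_swear,        # the same handler could also be reused for "None"
    "None": handle_none,
    "OrderStatus": handle_order_status,
}

def respond(utterance: str) -> str:
    intent = get_top_intent(utterance)
    handler = HANDLERS.get(intent, handle_none)  # fall back to None handling
    return handler(utterance)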

Google turn by turn API (iPhone)

I can't find an API from Google that provides turn-by-turn directions. I just wanted to make sure whether this kind of API is even public. If not, what are my alternatives on iOS?
Thanks!
Have you tried Googling "Google Maps Directions API"? This is easy to find on Google or on the Google Maps API homepage.
http://code.google.com/apis/maps/documentation/directions/
"Each element in the steps array defines a single step of the
calculated directions. A step is the most atomic unit of a direction's
route, containing a single step describing a specific, single
instruction on the journey. E.g. "Turn left at W. 4th St." The step
not only describes the instruction but also contains distance and
duration information relating to how this step relates to the
following step. For example, a step denoted as "Merge onto I-80 West"
may contain a duration of "37 miles" and "40 minutes," indicating that
the next step is 37 miles/40 minutes from this step."
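As a quick illustration of pulling those steps out of the Directions web service, here is a minimal Python sketch; the API key and the origin/destination are placeholders, and the field names follow the documented JSON response (routes, legs, steps):

# Sketch: fetch a route from the Google Maps Directions web service and print
# the turn-by-turn steps. The API key is a placeholder.
import requests

resp = requests.get(
    "https://maps.googleapis.com/maps/api/directions/json",
    params={
        "origin": "Times Square, New York, NY",
        "destination": "Central Park, New York, NY",
        "mode": "driving",
        "key": "YOUR_API_KEY",
    },
)
data = resp.json()

for step in data["routes"][0]["legs"][0]["steps"]:
    # html_instructions contains markup such as <b>...</b>; strip it as needed.
    print(step["html_instructions"], "-",
          step["distance"]["text"], "/", step["duration"]["text"])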
When I implemented this on Android, I passed a URI-formatted string with an address to the OS, which launched the included turn-by-turn navigation (which I recall was a third-party app, but shipped with the OS). Now that I'm developing for iOS, I too would like to find a similar solution on the iPhone. As far as I can tell so far, there is no similar API for iOS (yet). I actually hope that I'm wrong here.
Hope this helps.