How to I create an #Intent which looks something like this:
How much is a #ProductType?
Whereas the #ProductType is an simple Entity which consists of:
Soft Drinks: Coke, Pepsi, Sprite, Fanta
Fruits: Apple, Banana, Watermelon
I tried adding an Intent with above settings, but it doesn't seem to work. Is such ability natively supported in IBM Watson? Or otherwise, do I need to manually handle in the Dialog, using Conditions and stuffs? Please kindly advise.
The training is based on regular language and typical sentences or phrases. So #ProductType is not what you want in the phrase, but any of the fruits or drinks.
By defining the entities, Watson Assistant later learns the connection and to identify the entities and intents.
To get started, you define the intents and entities. Both can be imported from lists. Then you add the dialog which references the different types.
This blog should give insight to all the ways to train an entity and how it is used within intents.
https://medium.com/ibm-watson/all-about-entities-dictionaries-and-patterns-with-watson-assistant-part-1-5ef7254df76b
There are a number of possible pipelines you can choose from.
1. Indirect references: this is the preferred method.
Use natural language in your intent training data. "I want to buy a pear"
Watson will automatically see the other values you have related to pear and use those as intent training as well. This will be the fastest and simplest way to manage your data
2. Direct references: this should only be used if absolutely necessary
Directly reference the entity in your intent data. "I want to buy an #pear"
Nothing is done in the UI to tell you this works, but it does. This tells Watson the entity is a very important term and will increase the weight, as well as reference all synonyms with high weight. This is more effort for you to go through your entire workspace and relabel everything this way, hence why it is not recommended unless absolutely necessary. By doing this, you also tell watson that when the system sees various fruits without the # symbol, to ignore them as entities which is not ideal
3. Contextual entities. This is highlighting them like in your screenshot.
Note the UI has been updated so there is no an annotation mode instead of just highlighting. This builds a model around the entity, and is good for things like names or locations, but not necessary for a small list of items like crayons in a box, or fruit in a store. This will ignore all of the dictionary values youve created and only look at the model. It should be used according to the blog above when the use case is ideal.
What #data_henrik answered was partially correct. But it doesn't seem like Watson Assistant "automatically" learns the preferred #Entity just by simply inputting the pure (plain-text) Examples into the #Intent. In fact, that step was required. But we still need to do one more step.
After keying in the good plain-text Examples into the #Intent, we then still need to "right click" on the text-string of the possible #Entity entry, and then choose (teach Watson) the correct #Entity name from the dropdown list appeared.
Only then Watson starts to understand such; this #Intent uses that #Entity, I suppose.
Thank you #data_henrik, and appreciate your hint.
Related
This could be a simple one that I haven't been able to find but I'm trying to exclude a single value ("girlfriend") from being picked up as an entity in a chatbot I'm building. The entity list is currently "dog, cat, pet, mum, horse" with relevant synonyms for each of those entities as well.
Watson keeps picking up "girlfriend" and matching it as an entity despite it not being in there which is stuffing up the logic in the conversation.
Is there a way to stop Watson identifying similar words in an entity list beyond what is in the list? I have tried turning off fuzzy matching but that just misses spelling mistakes.
Please note this is not an intent training issue, it is specifically asking about entity identification.
Any help appreciated.
-T-
Your question is not entirely clear, but likely you want to take a look at how to improve a skill. Because Watson Assistant is built on AI technology, a key part is about learning.
You can "teach" Watson Assistant by going back to conversations and correct wrong matches with the right ones. Watson Assistant is going to pick this up and then retrain the dialog. This should result in excluding "girlfriend".
I had a similar problem. My bot kept picking its own name as a username and I wanted it to ignore its own name even if the user typed it (e.g. Hello Robot, I am Jill) I wanted it to respond to 'Jill' and not 'Robot' but it kept missing it. I later realized the context variables I created had similar values to user names. So what I did was create a variable #bot-name and gave it only 1 value (Robot), no synonyms, no fuzzy match, no annotations. Then tried it again and the Bot recognized its own name, ignored that and picked the second name correctly as the user name. So when I repeated the sentence 'Hello Robot, I am Jill' it recognized #entity:bot-name and #entity:user-name and then responded only to the username. You can try something similar.
Its not clear how you created your entity list. If its was via contextual entities then Watson may be taking "girlfriend" as being in the same "family" as the other entities and adding to the entity list. If the entity list was hard coded, along with the synonyms, then I would guess that one of your synonyms shares some of the spelling of girlfriend, girl or friend. Which via fuzzy logic would match an entity, but with a lower confidence level.
To fix you could create a new entity list and have a condition that looks to match entity list one, but not entity list two (girlfriend).
Or you could set your condition on the entity list and entity confidence level > 0.8 - but you may then miss some spelling mistakes. (Select a confidence level thats just above that reported for girlfriend).
I can't say if this is a solution, However I'd prefer to call it a workaround as it worked for me in my case.
Non-Contextual Case:
Create a new entity and add girlfriend as a value. Thus it would never interfere with your current entity in a dialog flow.
Contextual Case:
Train an intent with examples which includes girlfriend and annotate it with new entity.
I'm getting started with Knowledge Studio and Natural Language Understanding.
I'm able to deploy a machine-learning model toNatural Language Understanding and use the API to query it.
I would know if there's a way to deploy only the pre-annotator.
I read from Knowledge Studio's documentation that
You can deploy or export a machine-learning annotator. A dictionary pre-annotator can only be used to pre-annotate documents within Watson Knowledge Studio.
Does exist a workaround to create a model that simply does the job of the pre-annotator, i.e. use dictionaries to find entities instead of the machine-learning model?
Does exist a workaround to create a model that simply does the job of the pre-annotator, i.e. use dictionaries to find entities instead of the machine-learning model?
You may need to explain this better in what you need.
WKS allows you to pre-annotate documents with dictionaries you upload. Once you have created a ML model, you can alternatively use that to annotate your training documents, and then manually correct. As you continue the amount of manual work will reduce after each model iteration.
The assumption is that you are creating a model with a reasonable amount of examples. In your model results, you will want the mention/relations to be outside or close to outside the gray area of the report.
The other interpretation of your request I took was you want to create a dictionary based model only. This is possible using the "Rule-Based Model" functionality. You would have to create the parsing rules but you just map what you want to find to the dictionary/rule.
Using this in production though is still limited. You should get a warning when you deploy these kinds of models.
It's slightly better than just a keyword search as you can map items to parts of speech.
The last point. The purpose of WKS is to create a machine learning model which will do the work in discovering new terms you haven't seen before. With the rule based engine it can only find what you explicitly tell it to find.
If all you want is just dictionary entries, then you can create a very simple string comparison solution, but you lose the linguistic features.
I am trying to match objects based on predefined user preferences. A simple example would be finding best matching vechicle.
Lets say a user 'Tom' is offered a rented vehicle for travel based on his predefined preferences. In this case, the predefined user preferences will be -
** Pre-defined user preferences for Tom:
PreferredVehicle (Make='ANY', Type='3-wheeler/4-wheeler',
Category='Sedan/Hatchback', AC/Non-AC='AC')
** while the 10 available vehicles are -
Vechile1(Make='Toyota', Type='4-wheeler', Category='Hatchback', AC/Non-AC='AC')
Vechile2(Make='Tata', Type='3-wheeler', Category='Transport', AC/Non-AC='Non-AC')
Vechile3(Make='Honda', Type='4-wheeler', Category='Sedan', AC/Non-AC='AC')
;
;
and so on upto 'Vehicle10'
All I want to do is - choose a vehicle for Tom that best matches his preferences and also probably give him choices in order, i.e. best match first.
Questions I have :
Can this be done with Mahout Taste?
If yes, can someone please point me to some example code where I can start quickly?
A recommender may not be the best tool for the job here, for a few reasons. First, I don't expect that the best answers are all that personal in this domain. If I wanted a Ford Focus, the best alternative you have is likely about the same for most every user. Second, there is not much of a discovery problem here. I'm searching for a vehicle that meets certain needs; I don't particularly want or need to find new and unknown vehicles, like I would for music. Finally you don't have much data per user; I assume most users have never rented before, and very few have even 3+ rentals.
Can you throw this data at a recommender anyway? Sure, try Mahout Taste (I'm the author). If you have the book Mahout in Action it will walk you through it. Since it's non-rating data, I can also recommend the successor project, Myrrix (http://myrrix.com) as it will be easier to set up and run. You can at least evaluate the results to see if it's anywhere near useful.
Either way, your work will just be to make a CSV file of "userID,vehicleID" pairs from your data and feed it in. Then it will give you vehicle IDs as recommendations for any user ID.
But, I imagine you will do much better to analyze what people picked when the car wasn't available, and look at the difference, and learn which attributes they are most and least likely to be sacrificed, and learn to score the alternatives that way. This is entirely feasible since this data set is small, and because you have rich item attribute data.
I am thinking of creating a web site, which lets people to rate restaurants. Since I don't have a database containing all the restaurants, this web site relies on user's inputs.
But there is a problem of this method, because people may use different word (name) to describe a same restaurant, but I don't want to create different entries inside the database, as they refer to the same restaurant.
For example, when describing KFC, somebody use the name "KFC", others may use "Kentucky Fried Chicken"
How can I make the system to automatically detect this? and give the user a list of existing items of the database.
This should quite similar to stackoverflow, which tells you "questions with similar title". But I don't know how to implement this.
You can't ... you have to create a list of the restaurant names and their "synonyms" and other possible spellings.
How can I make the system to automatically detect this?
The system doesn't know that "KFC" means "Kentucky Fried Chicken".
Make a map of synonyms, to let it know.
This should quite similar to stackoverflow, which tells you "questions with similar title"
It generally matches word-for-word. It may have an internal list of synonyms to deal with common cases.
Looking at all the examples of Operational Transformation Frameworks out there, they all seem to resolve around the transformation of changes to plain text documents. How would an OT framework be used for more complex objects?
I'm wanting to dev a real-time sticky notes style app, where people can co-create sticky notes, change their positon and text value. Would I be right in assuming that the position values wouldn't be transformed? (I mean, how would they, you can't merge them right?). However, I would want to use an OT framework to resolve conflicts with the posit-its value, correct?
I do not see any problem to use Operational Transformation to work with Complex Objects, what you need is to define what operations your OT system support and how concurrency is solved for them
For instance, if you receive two Sticky notes "coordinates move operation" from two different users from same 'client state', you need to make both states to converge, probably cancelling out second operation.
This is exactly the same behaviour with text when two users generate two updates to delete a text range that overlaps completely, (or maybe partially), the second update processed must be transformed against the previous and the resultant operation will only effectively delete a portion of the original one, (or completely cancelled with a 'no-op')
You can take a look on this nice explanation about how Google Wave Operational Transformation works and guess from this point how it should work your own implementation
See the following paper for an approach to using OT with trees if you want to go down that route:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.100.74
However, in your particular case, I would use a separate plain text OT document for each stickynote and use an existing library, eg: etherPad, to do the heavy lifting. The positions of the notes could then be broadcast on a last-committer-wins basis.
Operation Transformation is a general technique, it works for any data type. The point is you need to define your transformation functions. Also, there are some atomic attributes that you cannot merge automatically like (position and background color) those will be mostly "last-update wins" or the user solves them manually when there is a conflict.
there are some nice libs and frameworks that provide OT for complex data already out there:
ShareJS : library for Node which provides all operations on JSON objects
DerbyJS: framework for NodeJS, it uses ShareJS for OT stuff.
Open Coweb framework : Dojo foundation project for cooperative web applications using OT