Writing entity names in intents [IBM Watson] - chatbot

I have a question about the IBM Watson chatbot and intents.
For example, I have two entities and one intent.
Entity #Cybercrime:{
Phishing,
Malware,
DDoS,
Botnet
}
Entity #DDoS_types:{
Ping_Flooding,
Mail Bombing,
Syn Flooding
}
Do I have to write a "#" in the intents?
For example
What is #Cybercrime #DDoS_types
Or is it enough if I write only "What is"?

It really depends on how you are planning to use the intent. You need to provide enough example variations to remove ambiguity, so the service can reach high confidence for the intent you intended.
If you only train it with "what is", then the question "what is phishing?" could get confused with questions like "what is a good strategy against phishing?"
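To illustrate (a sketch; the utterances below are invented): intent examples are written as plain sentences, without the "#" marker, and should cover several phrasings. The "#" prefix is only used later, in dialog conditions, to refer to the intent itself.

```python
# Hypothetical training examples for a "what-is" question intent in
# Watson Assistant. Note that entity values (phishing, malware, ...)
# appear as ordinary words inside the utterances -- no "#" syntax.
what_is_examples = [
    "What is phishing?",
    "What is malware?",
    "Explain DDoS to me",
    "Tell me what a botnet is",
]

# A separate intent would cover the easily-confused phrasing:
strategy_examples = [
    "What is a good strategy against phishing?",
    "How do I defend against malware?",
]
```

The more distinct phrasings each intent has, the easier it is for the classifier to keep "what is X" apart from "what do I do about X".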

Related

Google Dialogflow: How do I know which topics and use cases are allowed and which are not?

Obviously, one can't abuse keywords, but this is clear enough. Though, it's not clear whether or not I can use Dialogflow with, say, my name as an invocation phrase. Like "ask Rick to introduce himself" - how do I know if this is allowed or not?
In your case, "Rick" is not just part of an invocation phrase - it is the name of the Action. So this goes beyond having it as a Dialogflow invocation - it goes to the invocation system for the Assistant.
While Google doesn't release the exact rules for naming, there are a few that are fairly clear:
Unless you have rights to an existing brand that is one word (see point 3), your name has to be at least two words long.
Some words don't count or aren't allowed. "My" is usually allowed in a name, but doesn't count towards the two-word minimum. "Assistant" and "personal" usually aren't allowed, since they would apply to many Actions. You can bet that "my personal assistant" isn't considered very useful.
If you're trying to use well-known brands, you need to prove you have rights to use that brand. This isn't needed for every trademarked brand, and it isn't clear which ones Google requires it for, but if you need to use it you can connect a website to prove you have the rights to use it.
Overly broad or generic names or phrases aren't allowed. (For example, "Santa Claus" wouldn't be allowed because it would be used by a lot of actions, while "my local government" would probably be rejected because it is a generic term.) (Many of these will likely turn into the "built-in intents" that you'll be able to use, but there is no guarantee and the details of this aren't fully clear yet.)
In your example, there are a lot of "Rick"s around, and it is a single word, so it is likely that this would be rejected. If you controlled rick.com, however, it is possible that it would be allowed as a connected property.

Assigning weights to intents

I'm just getting started with Watson Conversation and I ran into the following issue.
I have a general intent for #greetings (hello, hi, greetings...) and another intent for #general-issues (I have a problem, I have an issue, stuff's not working, etc.). If the user says "hello, I have a problem with your product.", it matches the #greetings intent and replies back, but in this case I want to match the #general-issues intent.
As per: https://console.bluemix.net/docs/services/conversation/dialog-overview.html#dialog-overview
I would expect the nodes at the top of the list to be matched first, so I have the #greetings node at the bottom of the dialogue tree, to give a chance for a "higher weight" node to be matched first, but it doesn't seem to work every time.
Is duplicating the #greetings examples in #general-issues the only solution here?
So, trying to help you based on my experience: you can use intents[0].confidence in your favor.
For example, in my case I created one condition with:
intents[0].confidence > 0.75
With this, Watson recognizes the intent only if the user types something very similar to the trained examples for the #greetings intent.
In my tests, this works very well.
See more about building a Complex dialog using Watson Conversation.
See more about Confidence in Conversation here.
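The same threshold idea can also be applied client-side, at the application layer. A minimal sketch (the helper name and threshold are illustrative; `intents` is the confidence-ordered list the service returns):

```python
def pick_intent(intents, threshold=0.75):
    """Return the top intent's name only when its confidence clears the
    threshold, mirroring the intents[0].confidence > 0.75 condition.
    Returning None lets the caller fall through to a fallback node."""
    if intents and intents[0]["confidence"] > threshold:
        return intents[0]["intent"]
    return None
```

With this, a weak #greetings match on "hello, I have a problem" would return None instead of firing the greeting response.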
So here are two other approaches you can take.
Treat off-topic as contamination.
When building a conversational system, it's important to know what your end users are actually saying. So collect questions from real users.
You will find that not many people say a greeting and a question together. I haven't gathered statistics across the projects I've done, but anecdotally I have not seen it happen often.
Knowing this, you can try removing off-topic/chit-chat from your intents, as it does not fully reflect the domain you want to train on.
To counter this, you can create a more detailed second workspace with off topic/chit-chat. If you do not get a good hit on the primary workspace, you can call out to the second one. You can improve this by adding chit-chat to counter examples in the primary workspace.
You can also mitigate this by simply wording your initial response to the user. For example, if your initial response is a hello, have the system also ask a question. Or have it progress the conversation where a hello becomes redundant.
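The two-workspace fallback described above can be sketched as follows. Here `ask` is a hypothetical callable `(workspace_name, text) -> (intent, confidence)` that would wrap the service's message API in practice; the workspace names and threshold are illustrative:

```python
def route_message(text, ask, threshold=0.5):
    """Query the primary (domain) workspace first; on a weak hit, fall
    back to the off-topic/chit-chat workspace."""
    intent, confidence = ask("primary", text)
    if confidence >= threshold:
        return "primary", intent
    intent, confidence = ask("chitchat", text)
    return "chitchat", intent
```

Counter-examples for chit-chat in the primary workspace push its confidence down for off-topic input, which is exactly what makes this routing reliable.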
Detect possible compounded intents.
At the moment, this is only easily possible at the application layer.
Setting alternate_intents to true will return the top 10 intents and their confidences.
Before going further: if the top intent's confidence is below 0.2, the workspace needs more training, so there is no need to proceed.
If it is above 0.2, you can plot the confidences on a graph and visually check whether the top two intents stand out. For example:
To have your application detect this, you can use the k-means algorithm (k=2) to create two buckets: relevant and irrelevant.
Once you see more than one relevant intent, you can take action to ignore the chit-chat/off-topic one.
There are more details, and sample code, here.
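As a self-contained sketch of that bucketing idea (a tiny 1-D k-means with k=2 over the returned confidences; not the linked sample code):

```python
def relevant_intents(confidences, iters=20):
    """Split intent confidences into two clusters and return the indices
    in the higher ('relevant') cluster. Two or more relevant indices
    suggest a compounded utterance (e.g. greeting + real question)."""
    lo, hi = min(confidences), max(confidences)
    for _ in range(iters):
        # Assign each confidence to the nearer of the two centroids.
        in_hi = [abs(c - hi) < abs(c - lo) for c in confidences]
        hi_vals = [c for c, h in zip(confidences, in_hi) if h]
        lo_vals = [c for c, h in zip(confidences, in_hi) if not h]
        if hi_vals:
            hi = sum(hi_vals) / len(hi_vals)
        if lo_vals:
            lo = sum(lo_vals) / len(lo_vals)
    return [i for i, h in enumerate(in_hi) if h]
```

For a response where two intents sit well above the rest, both land in the relevant bucket and the application can handle the compound accordingly.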

Does Program-O use NLP?

I am trying to make a chatbot, and I searched for some solutions and programs to help me.
Can someone tell me if Program-o uses natural language processing?
I have searched on Google but I didn't find the answer.
Program-O is basically an engine that uses recursive pattern matching over AIML to find a suitable response.
The answer given here explains NLP in AIML in a bit more detail.
The pertinent paragraph being:
If by "natural language processing" you mean what is commonly called a "learning bot," the ALICE (AIML) bot does not meet the definition. The ALICE program (whose "brain" is the AIML scripting language) is a pattern-matching program. It searches a fairly large database - usually about 40,000 entries - for a phrase or term that matches one in the input, then selects a reply from the set designated by the closest match. It neither writes to its own files or generates spontaneous output. It doesn't "learn" by itself. Any changes or new information must be hard-coded into the AIML files by the botmaster.

REST resources with parameters

Here is the problem. This is our simple domain: there are a number of questions and answers; a question may have several answers or none at all.
Example:
question : "Where can I get a pen?".
answer #1:"You can buy a pen in a shop."
answer #2:"You can borrow it from a friend."
For this situation, it's possible to use the following REST resources:
GET /questions
GET /questions/{question_id}
GET /questions/{question_id}/answers
GET /questions/{question_id}/answers/{answer_id}
However some questions may have parameters like:
"What is the distance between {location A} and {location B}?"
"What is the status of flight {flight_number}?"
What is the best way to represent such questions and answers for them as REST resources?
You can use the following links:
GET /questions/{question_id}/locationA:Zurich/locationB:Budapest/flightNumber:1234/answers
GET /questions/{question_id}/answers?locationA="Zurich"&locationB="Budapest"&flightNumber=1234
Now, I am not sure you need the question id here. If there is a limited number of question types, you can add a path for each of them.
GET /questions/distance/from:"Zurich"/to:"Budapest"
You can generate this automatically from the question title:
GET /questions/what-is-the-distance/between:"Zurich"/and:"Budapest"
To be honest, the URI structure does not really matter for REST services, because it is used only by machines, and maybe by developers to configure routing. REST clients should follow links instead of building URIs (the HATEOAS constraint). So most APIs are not REST, and most of their clients are not REST clients...
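The query-parameter variant above can be sketched as a small URL builder (the path and parameter names are illustrative, not a fixed API):

```python
from urllib.parse import urlencode

def answers_url(question_id, **params):
    """Build the /questions/{id}/answers URL, appending any question
    variables (locationA, flightNumber, ...) as query parameters.
    Parameters are sorted so the URL form is stable and cache-friendly."""
    base = f"/questions/{question_id}/answers"
    if not params:
        return base
    return base + "?" + urlencode(sorted(params.items()))
```

Keeping the variables in the query string (rather than in path segments) also means `urlencode` handles escaping of values like city names for free.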
OK, I think you could build something like this:
GET /question/{questionid}?location=a&location2=b
and
GET /question/{questionid}?number=12345
But there are going to be a lot of things to consider.
Who defines what the parameter names would be? Is it the caller who asks the question? I guess without more of an idea of the actors involved in interacting with these services it is hard to be more specific.
Sorry, this is not as much help as I hoped it would be when I started. :)
What about considering POST? As I understand it, you are getting the question, but at the same time you are creating a question by filling in the variables.
Provide the variables as a list of key-value pairs (or a dictionary) in the body.
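That POST variant might look like the following sketch, with the variables travelling as a JSON object in the body instead of in the URL (the path and field names are illustrative):

```python
import json

def build_question_request(question_id, variables):
    """Assemble a POST request for a parameterised question: the
    variables dictionary is serialized as the JSON body."""
    return {
        "method": "POST",
        "path": f"/questions/{question_id}/answers",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(variables, sort_keys=True),
    }
```

One trade-off of the POST form: the caller no longer needs to agree on a URL encoding for each parameter, but the response loses the easy cacheability of a GET.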

Advice on modeling behaviors/concepts not commonly thought to be persisted

I am looking for advice and experience related to the best way to REST-ify behaviors/concepts that are not "commonly persisted" - in other words, resource-ifying transformative or side-effect behaviors.
First, let me say that I know there's no complete test for REST-fulness - and actually I don't care. I find the CRUD-over-the-Web notion of REST very intuitive, and it works well for most of the data I'm providing access to. At the same time, I am not worried about straying from anyone's bible on how to perfectly REST-ify something. I'm in search of the best compromise between the practical and the REST-intuitive for the cases that don't fit so well. Clearly, consistency is the main goal, but intuitiveness helps ensure that.
Given that, let me dive into the details of what I'm after.
Since REST is inherently resource-oriented, it is easy to model things commonly persisted - what is less clear is how to model behaviors/concepts that are not commonly persisted, especially those which have side effects or are purely transformational.
For example, take a stackoverflow.com question. It can be Created, Updated, Read, and Deleted. Everyone can relate to that and it all makes sense under REST.
But now consider something like a translation - for example, I want to build a REST API for my service that translates an English sentence to Spanish.
There are at least two ways to address the translation scenario. I could:
Look at a translation invocation as the creation of a "translation instance" (which doesn't happen to be persisted, but could be), in which case a POST /Translation (i.e., a create) actually makes a lot of sense. This is particularly the case if the translation service requires the URL of something to be translated (since the contents of that URL can change over time).
Look at translation as really the act of querying a larger dictionary of known answers, in which case a GET /Translation (i.e., a read) might be more appropriate. This is especially tempting if the translation service requires just the text of the sentence to be translated. No one would expect the translation of a static sentence to change over time. You could also argue that the result is cacheable, which GET lends itself to.
This same dilemma can crop up for other actions which primarily have side effects (e.g., sending an SMS or an e-mail) and are less commonly associated with persistence of the data.
My approach thus far has been to essentially "order"-ify these cases, that is, to look at everything as though one were placing an order. More generally, this is just converting verbs (translate) into nouns (translation order), which would seem to be one way of REST-ifying such things.
Can anyone share a better, more intuitive, way to approach the modeling of actions/behaviors that are not commonly assumed to be persisted?
The "Would it hurt to do more than once?"-test perhaps?
In your examples - what would happen if you are asked to translate the same text twice? Not really much. Sending an sms twice is likely to be a bad thing though.
The two options you describe are valid, and you have already highlighted the scenarios where one is better than the other. The only other option I would suggest is to use POST with a "processing resource", e.g.:
POST /translator?from=en&to=fr
Content-Type: text/plain
Body: The sentence to be translated
=>
200 OK
La phrase à traduire
One nice thing about "processing resources" is that they may or may not have persistent side effects. Obviously the disadvantage is that intermediaries can't cache the response.
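Server-side, that processing resource could be sketched as below. Here `translate` stands in for the actual translation backend (a hypothetical callable `text, source, target -> text`); nothing is persisted, so repeating the request has no extra effect:

```python
def handle_translate(params, body, translate):
    """Handle POST /translator?from=...&to=...: translate the plain-text
    body and return an (HTTP status, response body) pair."""
    source, target = params.get("from"), params.get("to")
    if not source or not target:
        return 400, "missing 'from' or 'to' parameter"
    return 200, translate(body, source, target)
```

In a real service this would sit behind a web framework's POST route; the point of the sketch is that the handler is stateless, which is what makes the "processing resource" safe to call repeatedly even though it uses POST.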