identify new words as intents in rasa nlu - chatbot

I have been using rasa nlu to classify intents and entities for my chatbot. Everything works as expected (with extensive training), but for entities it seems to predict the value based on the exact position and length of the word. This is fine when the set of entity values is limited. But when the bot needs to identify a word that has a different length and has not been trained yet (for example, a new name), detection fails. Is there a way to make rasa identify entities based on the relative position of the word or, better yet, to supply a list of words that acts as a domain-specific vocabulary for the entity to match against (like a phrase list in LUIS)?
{"q":"i want to buy a Casio SX56"}
{
"project": "default",
"entities": [
{
"extractor": "ner_crf",
"confidence": 0.7043648832678735,
"end": 26,
"value": "Casio SX56",
"entity": "watch",
"start": 16
}
],
"intent": {
"confidence": 0.8835646513829762,
"name": "buy_watch"
},
"text": "i want to buy a Casio SX56",
"model": "model_20180522-165141",
"intent_ranking": [
{
"confidence": 0.8835646513829762,
"name": "buy_watch"
},
{
"confidence": 0.07072182459497935,
"name": "greet"
}
]
}
But if Casio SX56 gets replaced with Citizen M1:
{"q":"i want to buy a Citizen M1"}
{
"project": "default",
"intent": {
"confidence": 0.8710909096729019,
"name": "buy_watch"
},
"text": "i want to buy a Citizen M1",
"model": "model_20180522-165141",
"intent_ranking": [
{
"confidence": 0.8710909096729019,
"name": "buy_watch"
},
{
"confidence": 0.07355588750895545,
"name": "greet"
}
]
}
Thank you!

Make sure you have actually added training examples for each entity value before training with rasa_nlu.
For successful entity extraction, you need at least two contextual training examples.
Add an example like this to the rasa_nlu training data if the entity is not being extracted properly:
"text": "i want to buy a Citizen M1",
"model": "model_20180522-165141",
"intent_ranking": [
{
"confidence": 0.8710909096729019,
"name": "buy_watch"
},
{
"confidence": 0.07355588750895545,
"name": "greet"
}
]
Entity extraction with phrase matching does work in rasa_nlu; try it with the spacy_sklearn backend pipeline.
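To make the advice above concrete, here is a minimal sketch of annotated training examples in the rasa_nlu JSON training format ("rasa_nlu_data" / "common_examples"), where each entity is marked by character offsets. The helper name make_example and the third sentence are made up for illustration; only the first two sentences come from the question.

```python
import json


def make_example(text, intent, entity, value):
    """Build one training example, computing the entity's character offsets."""
    start = text.index(value)  # raises ValueError if the value is not in the text
    return {
        "text": text,
        "intent": intent,
        "entities": [
            {
                "start": start,
                "end": start + len(value),
                "value": value,
                "entity": entity,
            }
        ],
    }


examples = [
    make_example("i want to buy a Casio SX56", "buy_watch", "watch", "Casio SX56"),
    make_example("i want to buy a Citizen M1", "buy_watch", "watch", "Citizen M1"),
    # Hypothetical extra sentence so the new value appears in a second context.
    make_example("do you have the Citizen M1 in stock", "buy_watch", "watch", "Citizen M1"),
]

training_data = {"rasa_nlu_data": {"common_examples": examples}}
print(json.dumps(training_data, indent=2))
```

The offsets computed for the first example (start 16, end 26) match the ones in the parse result shown in the question, which is a quick way to check that the annotations line up with what the extractor reports.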

The feature I was looking for is a phrase matcher, which would allow me to add a list of possible entity values to the training model. That way, if a new name pops up, we can simply add it to the phrase list and the model would be able to identify it in all possible utterances. This is still in development, though, and should be merged into master soon: https://github.com/RasaHQ/rasa_nlu/pull/822

Related

Add files to Salesforce CMS channel folder via Connect API?

I'm developing an integration that will programmatically create product entries in Salesforce, and part of that process needs to be the addition of product images. I'm using the Connect API and am able to make a GET call to the right folder like this (I've scrambled the IDs and what not for this example):
https://example.salesforce.com/services/data/v52.0/connect/cms/delivery/channels/0591G0000000006/contents/query?folderId=9Pu1M000000fxUMSYI
That returns a payload like this:
{
"currentPageUrl": "/services/data/v52.0/connect/cms/delivery/channels/0ap1G0000000006/contents/query?page=0&pageSize=250",
"items": [
{
"contentKey": "MCZ2YVCGLNSBETNIG5P5QMIS4KNA",
"contentNodes": {
"source": {
"fileName": "PET Round.jpg",
"isExternal": false,
"mediaType": "Image",
"mimeType": "image/jpeg",
"nodeType": "MediaSource",
"referenceId": "05T0R000005MthL",
"resourceUrl": "/services/data/v52.0/connect/cms/delivery/channels/0ap1G0000000007/media/MCY2YVCGLNSBETNIG5P4QMIS4KNA/content",
"unauthenticatedUrl": "/cms/delivery/media/MCZ2YVCGLNSBETNIG5P4QMIS4KNA",
"url": "/cms/delivery/media/MCY2YVCGLNSBETNIG5P4QMIS4KNA"
},
"title": {
"nodeType": "NameField",
"value": "844333"
}
},
"contentUrlName": "844333",
"language": "en_US",
"managedContentId": "20T0R0000008U9qUAE",
"publishedDate": "2021-08-18T16:20:57.000Z",
"title": "844333",
"type": "cms_image",
"typeLabel": "Image",
"unauthenticatedUrl": "/cms/delivery/v52.0/0DB1G0000008tfOWAU/contents/20Y0R0000008y9qUAE?oid=00D0R000000OI7GUAW"
}
]
}
I am also able to retrieve images by contentKey with a GET call like this:
https://example.salesforce.com/services/data/v52.0/connect/cms/delivery/channels/0ap1G0000000007/media/MCZ2ZVCGLNSBETMIG5P4QMIS4KNA/content
Does anyone know what the endpoint for adding a file should look like and what parameters it takes? I'm having trouble finding anything for this specific scenario in the docs, but surely there's a way.
Thanks!

How to use roomHint and structureHint with smarthome actions on Google

We are currently setting up a Smarthome action, and we would like to provide roomHint on the first sync (not on request sync), as it's really tedious to set up rooms on the first sync, but it does not work.
We tried naming rooms in English and also in Italian (it's not really clear from the documentation whether there is a list of room names that we can use), but neither worked.
So can you please give us a hint on how to use the roomHint field?
Also, in the API doc we found structureHint. Does it work? The documentation for the SYNC intent does not mention this field.
Here is our SYNC intent with one device and room, we took office from the example JSON:
{
"requestId": "3582198904737125163",
"payload": {
"agentUserId": "xyz#qwertyz.com",
"devices": [
{
"id": "deviceID",
"type": "action.devices.types.LIGHT",
"traits": [
"action.devices.traits.OnOff"
],
"name": {
"name": "Lampadina",
"defaultNames": [
"Lampadina_XYZ"
],
"nicknames": [
"Lampadina"
]
},
"willReportState": false,
"customData": {
"modelType": "DEVICE"
},
"roomHint": "office"
}
]
}
}
Thanks
Unfortunately, I believe the structureHint is only in the HomeGraph API sync response.
It cannot be used in the Sync intent.
If someone can tell me I'm wrong and how to use it, you'd be a hero.

Validate referential integrity of object arrays with Joi

I'm trying to validate that the data I am returned is sensible. Validating data types is done. Now I want to validate that I've received all of the data needed to perform a task.
Here's a representative example:
{
"things": [
{
"id": "00fb60c7-520e-4228-96c7-13a1f7a82749",
"name": "Thing 1",
"url": "https://lolagons.com"
},
{
"id": "709b85a3-98be-4c02-85a5-e3f007ce4bbf",
"name": "Thing 2",
"url": "https://lolfacts.com"
}
],
"layouts": {
"sections": [
{
"id": "34f10988-bb3d-4c38-86ce-ed819cb6daee",
"name": "Section 1",
"content": [
{
"type": 2,
"id": "00fb60c7-520e-4228-96c7-13a1f7a82749" //Ref to Thing 1
}
]
}
]
}
}
So every Section references 0+ Things, and I want to validate that every id value returned in the Content of Sections also exists as an id in Things.
The docs for Object.assert(..) imply that I need a concrete reference. Even if I do the validation within Object.keys or Array.items, I can't resolve the reference at the other end.
Not that it matters, but my context is that I'm validating HTTP responses within IcedFrisby, a Frisby.js fork.
This wasn't really solvable in the way I asked (i.e. with Joi).
I solved this for my context by writing a plugin for icedfrisby (published on npm here) which uses jsonpath to fetch each id in Content and each id in Things. The plugin will then assert that all of the first set exist within the second.
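The check described above (every id referenced in a Section's content must also exist in Things) can be sketched in a few lines. This is a Python illustration of the idea only, using plain dictionary traversal instead of jsonpath; it is not the icedfrisby plugin's code, and the function name missing_references is made up for this example.

```python
def missing_references(payload):
    """Return the ids referenced in section content that are absent from things."""
    thing_ids = {thing["id"] for thing in payload.get("things", [])}
    referenced_ids = {
        item["id"]
        for section in payload.get("layouts", {}).get("sections", [])
        for item in section.get("content", [])
    }
    # Set difference: referenced ids that have no matching thing.
    return referenced_ids - thing_ids


# Sample payload shaped like the one in the question, with one dangling reference.
payload = {
    "things": [{"id": "00fb60c7-520e-4228-96c7-13a1f7a82749", "name": "Thing 1"}],
    "layouts": {
        "sections": [
            {
                "id": "34f10988-bb3d-4c38-86ce-ed819cb6daee",
                "content": [
                    {"type": 2, "id": "00fb60c7-520e-4228-96c7-13a1f7a82749"},
                    {"type": 2, "id": "does-not-exist"},
                ],
            }
        ]
    },
}

dangling = missing_references(payload)
print(dangling)
```

An empty result means the response is referentially complete; anything left in the set is an id that a Section points at but Things does not provide.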

Does the OData protocol provide a way to transform an array of objects to an array of raw values?

Is there a way to specify in an OData query that, instead of certain name/value pairs being returned, a raw array should be returned? For example, if I have an OData query that results in the following:
{
"@odata.context": "http://blah.org/MyService/$metadata#People",
"value": [
{
"Name": "Joe Smith",
"Age": 55,
"Employers": [
{
"Name": "Acme",
"StartDate": "1/1/1990"
},
{
"Name": "Enron",
"StartDate": "1/1/1995"
},
{
"Name": "Amazon",
"StartDate": "1/1/1999"
}
]
},
{
"Name": "Jane Doe",
"Age": 30,
"Employers": [
{
"Name": "Joe's Crab Shack",
"StartDate": "1/1/2007"
},
{
"Name": "TGI Fridays",
"StartDate": "1/1/2010"
}
]
}
]
}
Is there anything I can add to the query to instead get back:
{
"@odata.context": "http://blah.org/MyService/$metadata#People",
"value": [
{
"Name": "Joe Smith",
"Age": 55,
"Employers": [
[ "Acme", "1/1/1990" ],
[ "Enron", "1/1/1995" ],
[ "Amazon", "1/1/1999" ]
]
},
{
"Name": "Jane Doe",
"Age": 30,
"Employers": [
[ "Joe's Crab Shack", "1/1/2007" ],
[ "TGI Fridays", "1/1/2010" ]
]
}
]
}
While I could obviously do the transformation client side, in my use case the field names are very large compared to the data, and I would rather not transmit all those names over the wire nor spend the CPU cycles on the client doing the transformation. Before I come up with my own custom parameters to indicate that the format should be as I desire, I wanted to check if there wasn't already a standardized way to do so.
OData provides several options to control the amount of data and metadata to be included in the response.
In OData v4, you can add odata.metadata=minimal to the Accept header parameters (check the documentation here). This is the default behaviour but even with this, it will still include the field names in the response and for a good reason.
I can see why you want to send only the values without the field names, but keep in mind that this would change the semantic meaning of the response structure. It would make the response less intuitive to deal with as a JSON record on the client side.
So, to answer your question: no.
Other options to minimize the response size:
You can use the $value OData option to get the raw value of a single property.
Check this example:
services.odata.org/OData/OData.svc/Categories(1)/Products(1)/Supplier/Address/City/$value
You can also use the $select option to cherry-pick only the fields you need by selecting a subset of properties to include in the response.
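As a rough sketch of the options above, the snippet below builds the request URLs for $select and $value and the Accept header carrying odata.metadata=minimal. The service root is the one from the question; the helper names and the People('Joe Smith') key path are assumptions made for illustration, since the question does not show how entities are keyed.

```python
from urllib.parse import urlencode

# Service root taken from the question's @odata.context value.
BASE = "http://blah.org/MyService"

# Accept header asking for minimal metadata (the default in OData v4).
HEADERS = {"Accept": "application/json;odata.metadata=minimal"}


def select_url(entity_set, fields):
    """URL that returns only the listed properties of each entity."""
    # Keep '$' and ',' unescaped so the query stays readable.
    query = urlencode({"$select": ",".join(fields)}, safe="$,")
    return f"{BASE}/{entity_set}?{query}"


def raw_value_url(entity_path, prop):
    """URL that returns the raw value of a single property."""
    return f"{BASE}/{entity_path}/{prop}/$value"


print(select_url("People", ["Name", "Age"]))
# Hypothetical key path; adjust to however the service keys its entities.
print(raw_value_url("People('Joe Smith')", "Age"))
```

Neither option produces the array-of-arrays shape asked about; they only reduce how many properties, and how much metadata, come back.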

ACRCloud external meta data and IDs not returning

When making valid requests to http://ap-southeast-1.api.acrcloud.com/v1/identify I get successful responses, however both external_ids and external_metadata always come back as empty objects.
Example response:
{
"external_ids": {},
"play_offset_ms": 97480,
"external_metadata": {},
"label": "Universal Music Ltd.",
"release_date": "2012-01-01",
"album": {
"name": "The Love Club EP"
},
"title": "Royals",
"duration_ms": "190185",
"genres": [
{
"name": "Pop"
}
],
"acrid": "b748d828aba29c699f732bd660123bae",
"result_from": 3,
"artists": [
{
"name": "Lorde"
}
]
}
Anyone know why all my identifications wouldn't contain this data?
Please select the 3rd party ID integration when creating the project.