I have several examples of images I need to recognize with OCR.
I've tried to recognize them on the demo page https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/ and it works quite well. I use the "Read text in images" option, which works even better than "Read handwritten text from images".
But when I use the REST call from a script (following the example given in the documentation), the results are much worse. Some letters are misrecognized and some are missed entirely. If I run the same example from the development console https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/56f91f2e778daf14a499e1fc/console I still get the same bad results.
What can cause this difference? How can I get results as reliable as those the demo page produces?
Let me know if any additional information is required.
UPD: Since I couldn't find any solution or even an explanation of the difference, I've created a sample file (similar to the actual files) so you can have a look. The file URL is http://sfiles.herokuapp.com/sample.png
If it is used on the demo page https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/ in the "Read text in images" section, the resulting JSON is:
{
"status": "Succeeded",
"succeeded": true,
"failed": false,
"finished": true,
"recognitionResult": {
"lines": [
{
"boundingBox": [
307,
159,
385,
158,
386,
173,
308,
174
],
"text": "October 2011",
"words": [
{
"boundingBox": [
308,
160,
357,
160,
357,
174,
308,
175
],
"text": "October"
},
{
"boundingBox": [
357,
160,
387,
159,
387,
174,
357,
174
],
"text": "2011"
}
]
},
{
"boundingBox": [
426,
157,
519,
158,
519,
173,
425,
172
],
"text": "07UC14PII0244",
"words": [
{
"boundingBox": [
426,
160,
520,
159,
520,
174,
426,
174
],
"text": "07UC14PII0244"
}
]
}
]
}
}
If I use this file in the console and make the following call:
POST https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr?language=unk&detectOrientation=true HTTP/1.1
Host: westcentralus.api.cognitive.microsoft.com
Content-Type: application/json
Ocp-Apim-Subscription-Key: ••••••••••••••••••••••••••••••••
{"url":"http://sfiles.herokuapp.com/sample.png"}
I get a different result:
{
"language": "el",
"textAngle": 0.0,
"orientation": "Up",
"regions": [{
"boundingBox": "309,161,75,10",
"lines": [{
"boundingBox": "309,161,75,10",
"words": [{
"boundingBox": "309,161,46,10",
"text": "October"
}, {
"boundingBox": "358,162,26,9",
"text": "2011"
}]
}]
}, {
"boundingBox": "428,161,92,10",
"lines": [{
"boundingBox": "428,161,92,10",
"words": [{
"boundingBox": "428,161,92,10",
"text": "071_lC14P110244"
}]
}]
}]
}
As you can see, the result is totally different (even the JSON format). Does anyone know what I am doing wrong? Or am I missing something, and the "Read text in images" demo does not match the ocr method of the API?
Will be very grateful for any help.
There are two flavors of OCR in Microsoft Cognitive Services. The newer endpoint (/recognizeText) has better recognition capabilities, but currently only supports English. The older endpoint (/ocr) has broader language coverage.
Some additional details about the differences are in this post.
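A minimal Python sketch of switching to the newer endpoint, assuming the v2.0 /recognizeText operation is asynchronous (the POST returns an Operation-Location header that is then polled) and that the final JSON has the shape shown in the question. The function names, the polling interval and the westcentralus base URL are illustrative assumptions, not something confirmed by the answer.

```python
import json
import time
import urllib.request

# Assumption: same region and API version as the /ocr call in the question.
BASE = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0"


def build_recognize_request(image_url, key, mode="Printed"):
    """Build (but do not send) the POST for the newer /recognizeText endpoint."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE}/recognizeText?mode={mode}",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )


def recognize_text(image_url, key, poll_seconds=1.0, max_polls=30):
    """Submit the image, then poll the operation URL until it finishes."""
    with urllib.request.urlopen(build_recognize_request(image_url, key)) as resp:
        operation_url = resp.headers["Operation-Location"]
    poll = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(poll) as resp:
            result = json.load(resp)
        if result.get("status") in ("Succeeded", "Failed"):
            return result
        time.sleep(poll_seconds)
    raise TimeoutError("text recognition did not finish in time")


def lines_from_recognize_text(result):
    """Text lines from a /recognizeText result (recognitionResult -> lines)."""
    return [line["text"] for line in result["recognitionResult"]["lines"]]


def lines_from_ocr(result):
    """Text lines from an /ocr result (regions -> lines -> words)."""
    return [
        " ".join(word["text"] for word in line["words"])
        for region in result["regions"]
        for line in region["lines"]
    ]
```

The two helper functions at the end only flatten the two response formats from the question into plain lines, which makes it easier to compare what each endpoint recognized for the same image.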
I am exploring Kairos Facial Recognition APIs. The API /enroll is used for uploading an image to Kairos for a subject_id. I noticed that the response of enroll API contains a confidence score. The image is treated as a new image. What does this confidence mean? When you verify an image, in that case the confidence score is important. But while uploading an image, why does the API return a confidence?
I assume the API compares the image to the images uploaded before for that subject_id and returns the confidence. Is this the case, or is it something else?
API Documentation: API_docs.
Here is a sample response for reference:
{
"face_id": "f2f0f8de43e545f8aff",
"images": [
{
"attributes": {
"age": 40,
"asian": 0.13225,
"black": 0.00103,
"gender": {
"femaleConfidence": 0.00028,
"maleConfidence": 0.99972,
"type": "M"
},
"glasses": "None",
"hispanic": 0.09578,
"lips": "Together",
"other": 0.27899,
"white": 0.49195
},
"transaction": {
"confidence": 0.99932,
"eyeDistance": 30,
"face_id": "f2f0f8de43e545f8aff",
"gallery_name": "ps-recognize",
"height": 70,
"image_id": 1,
"pitch": -14,
"quality": 0.10107,
"roll": -4,
"status": "success",
"subject_id": "vinod-khanna.&**#~`%$#_=+/",
"timestamp": "1526029231708",
"topLeftX": 124,
"topLeftY": 42,
"width": 70,
"yaw": 1
}
}
]
}
Yes, this isn't clear from the documentation.
For /recognize and /verify the confidence % represents how similar the face sent in with the request is to the face being compared against.
For /detect and /enroll the confidence represents how confident the engine is that it found a face. Usually you will see 98-99 percent range for those values.
Disclosure: Kairos.com CTO
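As a rough illustration of treating the /enroll confidence as a face-detection score, the following Python sketch reads it from a response shaped like the sample in the question. The function names and the 0.98 threshold are assumptions chosen to match the "98-99 percent" remark, not values from the Kairos documentation.

```python
def detection_confidences(enroll_response):
    """Return (image_id, confidence) pairs from an /enroll response body."""
    return [
        (img["transaction"]["image_id"], img["transaction"]["confidence"])
        for img in enroll_response.get("images", [])
    ]


def weak_detections(enroll_response, threshold=0.98):
    """Image ids whose face-detection confidence is below the threshold.

    The threshold is a hypothetical cut-off, not an official recommendation.
    """
    return [
        image_id
        for image_id, confidence in detection_confidences(enroll_response)
        if confidence < threshold
    ]


# Trimmed version of the sample response from the question.
sample = {
    "face_id": "f2f0f8de43e545f8aff",
    "images": [
        {"transaction": {"confidence": 0.99932, "image_id": 1, "status": "success"}}
    ],
}
print(detection_confidences(sample))
print(weak_detections(sample))
```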
I am using Magento2 default API: /V1/carts/mine/payment-information.
The response from this API is:
{
"payment_methods": [
{
"code": "payu",
"title": "PayUMoney"
},
{
"code": "checkmo",
"title": "Check / Money order"
},
{
"code": "paytm",
"title": "Paytm PG"
}
],
"totals": {
"grand_total": 195,
"base_grand_total": 195,
"subtotal": 45,
"base_subtotal": 45,
"discount_amount": 0,
"base_discount_amount": 0,
"subtotal_with_discount": 45,
"base_subtotal_with_discount": 45,
"shipping_amount": 150,
"base_shipping_amount": 150,
"shipping_discount_amount": 0,
"base_shipping_discount_amount": 0,
"tax_amount": 0,
"base_tax_amount": 0,
"weee_tax_applied_amount": null,
"shipping_tax_amount": 0,
"base_shipping_tax_amount": 0,
"subtotal_incl_tax": 45,
"shipping_incl_tax": 150,
"base_shipping_incl_tax": 150,
"base_currency_code": "INR",
"quote_currency_code": "INR",
"items_qty": 1,
"items": [
{
"item_id": 41,
"price": 45,
"base_price": 45,
"qty": 1,
"row_total": 45,
"base_row_total": 45,
"row_total_with_discount": 0,
"tax_amount": 0,
"base_tax_amount": 0,
"tax_percent": 0,
"discount_amount": 0,
"base_discount_amount": 0,
"discount_percent": 0,
"price_incl_tax": 45,
"base_price_incl_tax": 45,
"row_total_incl_tax": 45,
"base_row_total_incl_tax": 45,
"options": "[{\"value\":\"Green\",\"label\":\"Color\"},{\"value\":\"29\",\"label\":\"Size\"}]",
"weee_tax_applied_amount": null,
"weee_tax_applied": null,
"name": "Erika Running Short"
}
],
"total_segments": [
{
"code": "subtotal",
"title": "Subtotal",
"value": 45
},
{
"code": "shipping",
"title": "Shipping & Handling (Fixed)",
"value": 150
},
{
"code": "tax",
"title": "Tax",
"value": 0,
"extension_attributes": {
"tax_grandtotal_details": []
}
},
{
"code": "grand_total",
"title": "Grand Total",
"value": 195,
"area": "footer"
}
]
}
}
I want to add an images tag inside items to display images of the items/products. But this tag is not defined in the items interface, i.e. TotalsItemInterface.php.
I replicated TotalsItemInterface in my custom module and added all the getters and setters from Totalsinterface, along with setImages and getImages. My custom API interfaces then call these methods internally to expose the images.
Is there a better or a proper "Magento 2 way" if we want to change the data displayed in the APIs?
You may use extension attributes for this. You can find more information about them in the devdocs:
https://devdocs.magento.com/guides/v2.3/extension-dev-guide/extension_attributes/adding-attributes.html
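On the client side, extension attributes show up under an "extension_attributes" key, as the tax segment in the response above already does. The Python sketch below assumes a hypothetical "images" extension attribute has been added to each totals item. That attribute name and the describe_item helper are illustrative only and do not exist in stock Magento.

```python
import json


def describe_item(item):
    """Summarise one entry of totals["items"] from the payment-information response.

    "options" arrives as a JSON string inside the JSON body, so it is decoded
    separately. "images" is a hypothetical extension attribute.
    """
    options = json.loads(item.get("options") or "[]")
    images = (item.get("extension_attributes") or {}).get("images", [])
    return {
        "name": item["name"],
        "options": {opt["label"]: opt["value"] for opt in options},
        "images": images,
    }


item = {
    "item_id": 41,
    "name": "Erika Running Short",
    "options": '[{"value":"Green","label":"Color"},{"value":"29","label":"Size"}]',
    # Hypothetical shape of the custom extension attribute.
    "extension_attributes": {"images": ["https://example.com/media/erika.jpg"]},
}
print(describe_item(item))
```

Reading the attribute defensively, with a default of an empty list, keeps the client working against stores where the custom module is not installed.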
I tried the Facebook Graph API Explorer and looked into the news feed of a person with /[userid]/feed.
In this case, the person added a new cover photo which is at least 720px wide.
But in the news feed, all I found was a picture attribute containing a very small 130px wide thumbnail only.
Is there a way to access the full - or at least a bigger - picture within the API?
Thank you.
If you check the data coming back from the /[userid]/feed API call, you'll notice "object_id": "xxx" in the response.
If you make a second API call to this, e.g. /xxx/, you will get an array of images back with different sizes you can use. E.g.:
{
"id": "000",
"created_time": "...",
"from": {
...
},
"height": 223,
"icon": "https://fbstatic-a.akamaihd.net/rsrc.php/v2/yz/r/StEh3RhPvjk.gif",
"images": [
{
"height": 480,
"source": "https://fbcdn-sphotos-b-a.akamaihd.net/hphotos-ak-xfp1/t31.0-8/1912152_448929725237759_238021965_o.jpg",
"width": 1547
},
{
"height": 320,
"source": "https://fbcdn-sphotos-b-a.akamaihd.net/hphotos-ak-xfp1/t31.0-8/p320x320/1912152_448929725237759_238021965_o.jpg",
"width": 1031
},
{
"height": 130,
"source": "https://fbcdn-sphotos-b-a.akamaihd.net/hphotos-ak-xpa1/t1.0-9/p130x130/1013838_448929725237759_238021965_n.jpg",
"width": 420
},
{
"height": 225,
"source": "https://fbcdn-sphotos-b-a.akamaihd.net/hphotos-ak-xfp1/t31.0-8/p75x225/1912152_448929725237759_238021965_o.jpg",
"width": 725
}
],
"link": "https://www.facebook.com/147587828705285/photos/a.448909598573105.1073741827.147587828705285/448929725237759/?type=1",
"picture": "https://fbcdn-sphotos-b-a.akamaihd.net/hphotos-ak-xpa1/t1.0-9/s130x130/1013838_448929725237759_238021965_n.jpg",
"source": "https://fbcdn-sphotos-b-a.akamaihd.net/hphotos-ak-xfp1/t31.0-8/q71/s720x720/1912152_448929725237759_238021965_o.jpg",
"updated_time": "2014-03-12T19:58:08+0000",
"width": 720
}
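Once the second call has returned the photo object, picking the biggest rendition is a small step. A Python sketch, assuming the "images" array has the shape shown above (the largest_image name and the example.com URLs are made up for illustration):

```python
def largest_image(photo):
    """Return the source URL of the largest entry in photo["images"].

    Falls back to the top-level "source" field if no images array is present.
    """
    images = photo.get("images") or []
    if not images:
        return photo.get("source")
    biggest = max(images, key=lambda img: img["width"] * img["height"])
    return biggest["source"]


# Same dimensions as the sample response, with shortened placeholder URLs.
photo = {
    "images": [
        {"height": 480, "width": 1547, "source": "https://example.com/o.jpg"},
        {"height": 320, "width": 1031, "source": "https://example.com/p320.jpg"},
        {"height": 130, "width": 420, "source": "https://example.com/p130.jpg"},
        {"height": 225, "width": 725, "source": "https://example.com/p225.jpg"},
    ],
    "source": "https://example.com/s720.jpg",
}
print(largest_image(photo))
```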
I'm new to MongoDB. Here's my problem: a user can have multiple avatars, but one and only one is active. Here's what a user document looks like for the moment:
{
"_id": ObjectId("515c99f7e4d8094a87e13757"),
"avatars": [{
"url": "http://example.com/img/photo1.jpg",
"width": 50,
"height": 50,
}, {
"active": true,
"url": "http://example.com/img/photo2.jpg",
"width": 50,
"height": 50,
}]
}
This solution is simple enough, but there are a few things I don't like:
1. changing the active avatar means updating two embedded documents
2. I'm not sure it will behave nicely in case of concurrent access (read_avatar+change_active_avatar or change_active+change_active) (will it?)
3. looking for the active avatar requires a sequential search
Another solution would be this:
{
"_id": ObjectId("515c99f7e4d8094a87e13757"),
"active_avatar_id": 2,
"avatars": [{
"_id": 1,
"url": "http://example.com/img/photo1.jpg",
"width": 50,
"height": 50,
}, {
"_id": 2,
"url": "http://example.com/img/photo2.jpg",
"width": 50,
"height": 50,
}]
}
This fixes problems 1&2, but not problem 3. And it adds an extra _id field in every embedded document. Plus, when I insert a new avatar I now need to know the next _id to use (unless I use an ObjectId, but then it's a 12-byte id for just a few hundred avatars at most).
So yet another solution could be this:
{
"_id": ObjectId("515c99f7e4d8094a87e13757"),
"active_avatar": {
"url": "http://example.com/img/photo2.jpg",
"width": 50,
"height": 50,
},
"extra_avatars": [{
"url": "http://example.com/img/photo1.jpg",
"width": 50,
"height": 50,
}]
}
Not much better than the first solution (it just fixes problem #3, that's it, and it's uglier).
All these solutions work, but I'm looking for "the right way" to do it. Any ideas? And what if I allow users to have multiple active avatars (whatever that would mean)?
Thanks.
Have you considered the document inside a document model?
{
"_id" : ObjectId("515c99f7e4d8094a87e13758"),
"active_avatar_id" : 2,
"avatars" : {
"1" : {
"url" : "http://example.com/img/photo1.jpg",
"width" : 50,
"height" : 50
},
"2" : {
"url" : "http://example.com/img/photo2.jpg",
"width" : 50,
"height" : 50
}
}
}
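With this layout, switching the active avatar is a single-field update and the lookup is a key access rather than a scan. The Python sketch below only builds the update documents and does the lookup on a plain dict. The helper names are made up, and passing the updates to something like pymongo's collection.update_one(filter, update) is an assumption about how they would be used.

```python
def set_active_avatar(avatar_id):
    """Update document that switches the active avatar in one atomic $set."""
    return {"$set": {"active_avatar_id": avatar_id}}


def add_avatar(avatar_id, avatar):
    """Update document that adds one avatar under avatars.<id> via dot notation."""
    return {"$set": {f"avatars.{avatar_id}": avatar}}


def active_avatar(user):
    """Direct lookup of the active avatar; keys of the sub-document are strings."""
    return user["avatars"].get(str(user["active_avatar_id"]))


user = {
    "active_avatar_id": 2,
    "avatars": {
        "1": {"url": "http://example.com/img/photo1.jpg", "width": 50, "height": 50},
        "2": {"url": "http://example.com/img/photo2.jpg", "width": 50, "height": 50},
    },
}
print(active_avatar(user)["url"])
print(add_avatar(3, {"url": "http://example.com/img/photo3.jpg", "width": 50, "height": 50}))
```

Note that active_avatar_id is stored as a number in the example document while the keys of avatars are strings, so the lookup converts it with str().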
Below is an example response I am getting for stats on an adgroup.
"connections" returns 12 and "actions" is null.
How do I get the breakdown of the connections (page likes/app installs/event responses)?
Thank you!
{
"id": "XXXXX",
"impressions": "789862",
"clicks": "292",
"spent": "3019",
"social_impressions": "109327",
"social_clicks": "26",
"social_spent": "235",
"unique_impressions": 295055,
"social_unique_impressions": 18819,
"unique_clicks": 287,
"social_unique_clicks": 25,
"actions": null,
"connections": 12,
"adgroup_id": XXX,
"campaign_id": XXX,
"start_time": "2011-08-21T00:00:00+0000",
"end_time": "2011-08-22T00:00:00+0000",
"newsfeed_position": null
}
Check out the Ad Statistics and Conversion Statistics documentation - you're probably looking for /conversions instead of /stats
Sample query: /<CAMPAIGN ID>/conversions/0/1354207892
Sample response:
{
"campaign_id": <CAMPAIGN ID>,
"values": [
{
"start_time": 0,
"end_time": 1354233600,
"conversions": [
{
"action_type": "like",
"object_id": <PAGE ID>,
"post_click_1d": 345,
"post_click_7d": 349,
"post_click_28d": 351,
"post_imp_1d": 53,
"post_imp_7d": 89,
"post_imp_28d": 104
},
{
"action_type": "link_click",
"object_id": <PAGE ID>,
"post_click_1d": 893,
"post_click_7d": 904,
"post_click_28d": 938,
"post_imp_1d": 120,
"post_imp_7d": 185,
"post_imp_28d": 235
}
]
}
]
}
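To turn a /conversions response like the one above into a per-action breakdown, something along these lines could work. It is a Python sketch that assumes the response shape shown in the sample. The conversions_by_action name and the choice of post_click_28d as the default attribution window are illustrative.

```python
from collections import defaultdict


def conversions_by_action(response, window="post_click_28d"):
    """Sum one attribution window per action_type across all time ranges."""
    totals = defaultdict(int)
    for value in response.get("values", []):
        for conversion in value.get("conversions", []):
            totals[conversion["action_type"]] += conversion.get(window, 0)
    return dict(totals)


# Trimmed version of the sample response, with placeholder ids.
response = {
    "campaign_id": 1,
    "values": [
        {
            "start_time": 0,
            "end_time": 1354233600,
            "conversions": [
                {"action_type": "like", "post_click_1d": 345, "post_click_28d": 351},
                {"action_type": "link_click", "post_click_1d": 893, "post_click_28d": 938},
            ],
        }
    ],
}
print(conversions_by_action(response))
print(conversions_by_action(response, window="post_click_1d"))
```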
In my experience, you will get a non-null result for the actions field of a stats query if you don't specify a start_time.