Artificial intelligence used for recognizing emojis on wallpaper

I want to build an app that will recognize what emojis have been used on the wallpaper.
So, for instance, this app will receive as input:
And as output it should return an array of names of the recognized emojis:
[
"Smiling Face with Sunglasses",
"Grinning Face with Smiling Eyes",
"Kissing Face with Closed Eyes"
]
I have prepared training data that consists of individual emojis.
For instance, I have rotated each emoji by 30 degrees, cut it in half, etc.
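For illustration, that kind of augmentation could be generated with Core Image along these lines (a sketch only; the function name augmentations(of:) and the exact crops are made up, not my real pipeline):

import CoreImage

// Illustrative sketch: produces rotated and half-cropped variants of a single emoji image.
func augmentations(of emoji: CIImage) -> [CIImage] {
    var variants: [CIImage] = [emoji]
    // Rotations in 30-degree steps.
    for step in 1..<12 {
        let angle = CGFloat(step) * CGFloat.pi / 6
        variants.append(emoji.transformed(by: CGAffineTransform(rotationAngle: angle)))
    }
    // Left half and bottom half of the original extent ("cut in half").
    let extent = emoji.extent
    variants.append(emoji.cropped(to: CGRect(x: extent.minX, y: extent.minY,
                                             width: extent.width / 2, height: extent.height)))
    variants.append(emoji.cropped(to: CGRect(x: extent.minX, y: extent.minY,
                                             width: extent.width, height: extent.height / 2)))
    return variants
}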
After training the model, the average precision is 0.8, which is quite nice, but it only works for one emoji per wallpaper. If I upload many types of emojis on one wallpaper, it does not recognize any objects.
My question is: why does it recognize one type of emoji per wallpaper, but if I put many types of emojis on one wallpaper, it does not work?
I am using Google ML Vision, and I have chosen Multi-Label Classification for the data set.

Related

AI used for recognizing emojis on wallpaper

I want to build an app that will recognize what emojis have been used on the wallpaper.
So, for instance, this app will receive as input:
And as output it should return an array of names of the recognized emojis:
[
"Smiling Face with Sunglasses",
"Grinning Face with Smiling Eyes",
"Kissing Face with Closed Eyes"
]
Of course, the names of these emojis will come from the file names of the training images.
For example, this file:
It will be called Grinning_Face_with_Smiling_Eyes.jpg
I would like to use AWS Rekognition Label or Google AutoML Vision, but they require a minimum of 10 images of each emoji for training.
As you know, I can only provide one image of each emoji, because there is no other option; they are in 2D ;)
Now my question is:
What should I do? How can I skip these requirements? Which service should I choose?
PS: In the real business case, instead of emojis there are book covers that the AI has to recognize. There, too, there is only one 2D image per book cover.

Feed multiple images to CoreML image classification model (swift)

I know how to use the CoreML library to train a model and use it. However, I was wondering if it's possible to feed the model more than one image in order for it to identify it with better accuracy.
The reason for this is that I'm trying to build an app that classifies histological slides; however, many of them look quite similar, so I thought maybe I could feed the model images at different magnifications in order to make the identification. Is this possible?
Thank you,
Mehdi
Yes, this is a common technique. You can give Core ML the images at different scales or use different crops from the same larger image.
A typical approach is to take 4 corner crops and 1 center crop, and also horizontally flip these, so you have 10 images total. Then feed these to Core ML as a batch. (Maybe in your case it makes sense to also vertically flip the crops.)
To get the final prediction, take the average of the predicted probabilities for all images.
Note that in order to use images at different sizes, the model must be configured to support "size flexibility". And it must also be trained on images of different sizes to get good results.
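For illustration, a rough sketch of that crop-and-average scheme (note: instead of one Core ML batch call, this sketch simply runs one Vision request per crop; the 0.875 crop ratio and the function name are arbitrary choices):

import Vision
import ImageIO
import CoreGraphics

// 4 corner crops + 1 center crop, each also horizontally mirrored, with averaged class confidences.
func averagedPrediction(for image: CGImage, using model: VNCoreMLModel) throws -> [String: Float] {
    let w = CGFloat(image.width), h = CGFloat(image.height)
    let cropSize = CGSize(width: w * 0.875, height: h * 0.875)
    let origins = [CGPoint(x: 0, y: 0),
                   CGPoint(x: w - cropSize.width, y: 0),
                   CGPoint(x: 0, y: h - cropSize.height),
                   CGPoint(x: w - cropSize.width, y: h - cropSize.height),
                   CGPoint(x: (w - cropSize.width) / 2, y: (h - cropSize.height) / 2)]
    var summed: [String: Float] = [:]
    var runs = 0
    for origin in origins {
        guard let crop = image.cropping(to: CGRect(origin: origin, size: cropSize)) else { continue }
        for orientation in [CGImagePropertyOrientation.up, .upMirrored] {   // original + horizontal flip
            let request = VNCoreMLRequest(model: model)
            request.imageCropAndScaleOption = .scaleFill    // Vision resizes the crop to the model's input size
            try VNImageRequestHandler(cgImage: crop, orientation: orientation, options: [:]).perform([request])
            guard let results = request.results as? [VNClassificationObservation] else { continue }
            for observation in results {
                summed[observation.identifier, default: 0] += observation.confidence
            }
            runs += 1
        }
    }
    // Final prediction: average of the predicted probabilities over all crops and flips.
    return summed.mapValues { $0 / Float(max(runs, 1)) }
}

If you would rather feed a true batch, you could instead wrap the crops in an MLArrayBatchProvider and call the model's predictions(from:options:) once.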

Why does the red heart emoji require two code points, but the other colored hearts require one?

It appears that the red heart emoji (❤️) "\u2764\uFE0F" requires two Unicode codepoints, specifically Heavy Black Heart followed by a Variation Selector. However, blue 💙, green 💚, yellow 💛, and purple 💜 each have their own single codepoint.
Why is red so different?
For historical reasons. Originally, there was only U+2764 HEAVY BLACK HEART which the first applications that supported Emojis decided to render as a red heart. These early applications always rendered U+2764 as Emoji. Later it was realized that this was a bad idea and the variation selectors for Emojis were standardized. When additional heart emojis were added, there was no need for another red heart, so it was omitted. Instead there's a separate black heart emoji U+1F5A4 🖤.
In theory, an application could require that the Emoji variation selector is appended to other heart code points as well. But it doesn't make much sense to render characters like PURPLE HEART as a non-Emoji. It does make a difference for HEAVY BLACK HEART, though, which is often intended to be rendered as the original, plain heavy black heart character.
HEAVY BLACK HEART was added to Unicode decades before emoji. When emoji were incorporated in Unicode 6 some already existing characters were simply reused as emoji to avoid unnecessary duplicates. Later, variation sequences were defined for characters that also map to a non-emoji character set to allow for better control over how they display. For example, U+2744 ❄ SNOWFLAKE is originally from Zapf Dingbats (I believe) but was later also made an emoji. So if you want to force the original text-style display you can use VARIATION SELECTOR-15 (resulting in ❄︎), and if you want to force the newer emoji-style display you can use VARIATION SELECTOR-16 (resulting in ❄️).
Note, however, that not many platforms actually support those variation sequences correctly at the moment. Also not all of them automatically apply the variation selectors when using the emoji keyboard. In theory ❤ and ❄ (and many other emoji) should display as text style by default without VS16, but many applications ignore that as well.
I have a list of all code points that can display differently via a variation sequence, on my website, if you're interested. The next Unicode update in June is going to add some more.
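As a small illustration in Swift (using only the code points already mentioned above):

let textHeart   = "\u{2764}\u{FE0E}"   // HEAVY BLACK HEART + VARIATION SELECTOR-15 -> text style ❤︎
let emojiHeart  = "\u{2764}\u{FE0F}"   // HEAVY BLACK HEART + VARIATION SELECTOR-16 -> emoji style ❤️
let purpleHeart = "\u{1F49C}"          // PURPLE HEART, a single code point 💜

print(textHeart.unicodeScalars.count)    // 2 – base character plus variation selector
print(emojiHeart.unicodeScalars.count)   // 2
print(purpleHeart.unicodeScalars.count)  // 1
print(emojiHeart.count)                  // 1 – Swift still counts the pair as one Character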

How does one embed a file inside of an image? iOS iPhone

There is an app on the app store called active photo (http://itunes.apple.com/us/app/active-photo/id366798464?mt=8) that allows you to embed a hidden image or .exe file inside of an image. I would like to know how to do this regarding adding images to images, kinda like sub-images in the original image.
I've been looking into metadata but no tag seems to be big enough to hold an NSData representation of the second picture.
How would one go about adding any type of file to an image, either through embedding or metadata, that would allow the image to be sent through email and/or text message and still retain the data?
Thank you.
This is known as steganography.
I would imagine the simplest way of hiding a file inside a JPEG image is just to alter its pixel data in such a way that the compression doesn't damage it but is subtle enough that an interceptor can't detect the hidden data.
I don't think it is possible with JPEG, because it's a lossy compression and you would end up corrupting the embedded file. But PNG uses a compression method similar to Deflate, which is lossless.
I have started writing a program like this. The idea was to hide bytes of data by splitting them across the least significant bits of the pixels' color channels. Let me give some examples.
An RGB-8 image represents a pixel with 3 bytes: one for red, one for green and one for blue. I store 3 bits in the red channel, 2 in the green (the human eye is more sensitive to green) and 3 in the blue, so I embed one byte per pixel. Similarly, with an RGBA-8 image I do 2-2-2-2. This of course involves some bitwise operations.
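A minimal sketch of that 3-2-3 split for a single RGB-8 pixel (function names are illustrative; a real implementation would walk the image's pixel buffer):

// Hides one data byte in the low bits of one RGB-8 pixel: 3 bits in red, 2 in green, 3 in blue.
func embed(byte: UInt8, into pixel: (r: UInt8, g: UInt8, b: UInt8)) -> (r: UInt8, g: UInt8, b: UInt8) {
    let r = (pixel.r & 0b1111_1000) | (byte >> 5)             // top 3 bits of the byte
    let g = (pixel.g & 0b1111_1100) | ((byte >> 3) & 0b11)    // next 2 bits (green is most visible)
    let b = (pixel.b & 0b1111_1000) | (byte & 0b111)          // lowest 3 bits
    return (r, g, b)
}

// Reads the byte back out of the same pixel.
func extract(from pixel: (r: UInt8, g: UInt8, b: UInt8)) -> UInt8 {
    return ((pixel.r & 0b111) << 5) | ((pixel.g & 0b11) << 3) | (pixel.b & 0b111)
}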
Things become more interesting with RGB(A)-16 images, where there are two bytes per channel. I use the entire least significant byte of every channel with minimal distortion (worst case 255 / 65535 ≈ 0.4%) and store up to 3 or 4 bytes of data per pixel. Not bad!!
Moreover, there are no complex bitwise operations in this case; a single assignment does the job.
There are a lot of possible improvements. I thought of asking the user for a password, hashing it and seeding a secure pseudo-random number generator, then no longer moving pixel by pixel but instead asking the generator for a new random index each time.
The drawback of this solution is that the more data has already been embedded, the slower it becomes, because the generator will return more and more already-occupied indices. But it is much more secure this way. To make it even safer, I thought of introducing noise data in the untouched pixels, in order to hide the positions of the true data.
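Roughly, the index-selection part could be sketched like this (illustrative only: SHA-256 from CryptoKit seeds a toy xorshift generator, which is not a cryptographically secure PRNG, so it only demonstrates the random walk over pixel indices):

import Foundation
import CryptoKit

// Password-seeded generator: the same password produces the same index sequence on both ends.
struct SeededGenerator {
    private var state: UInt64
    init(password: String) {
        let digest = SHA256.hash(data: Data(password.utf8))
        state = digest.prefix(8).reduce(UInt64(0)) { ($0 << 8) | UInt64($1) }
        if state == 0 { state = 0x9E3779B97F4A7C15 }   // xorshift state must be non-zero
    }
    mutating func next() -> UInt64 {
        state ^= state << 13
        state ^= state >> 7
        state ^= state << 17
        return state
    }
}

// Picks the next unused pixel index; gets slower as more indices are occupied,
// which is exactly the drawback mentioned above.
func nextFreeIndex(pixelCount: Int, occupied: inout Set<Int>,
                   generator: inout SeededGenerator) -> Int? {
    guard occupied.count < pixelCount else { return nil }
    while true {
        let index = Int(generator.next() % UInt64(pixelCount))
        if occupied.insert(index).inserted { return index }
    }
}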
As you can see, you can do a lot with PNG images! If you are interested, I can share the code I have written so far.

iOS Japanese handwriting input code help please

I have a series of questions about writing code for iOS that includes handwriting recognition of Japanese. I am a beginner, so be gentle and assume I am stupid ...
I'd like to present a Japanese word in hiragana (the Japanese phonetic alphabet), then have the user handwrite the appropriate kanji (Chinese character). This is then internally compared to the correct character, and the user gets feedback on whether they were correct or not.
My questions here revolve around the handwritten input.
I know that this type of input is normally possible if one uses the Chinese keyboard.
How can I implement something similar without using the keyboard itself? Are there already library functions for this (I feel there must be, since that input is available on the Chinese keyboard)?
Also, kanji aren't exactly the same as Chinese characters. There are unique characters that Japanese people invented themselves. How would I be able to include these in my handwriting recognition?
We worked on a similar exercise back at University.
The order of the strokes is well defined for kanji, and there are only 8 (?) different stroke types. Basically, each kanji is a well-ordered sequence of strokes. For example, te (hand) is the sequence "the short falling backward stroke", then twice the "left to right stroke", and finally "the long downward stroke with the little tip at the bottom". There are databases that give you this information.
Now the problem is almost reduced to identifying the correct stroke. You will still run into some ambiguities, where you have to take into consideration the spatial relation of some strokes to others.
EDIT: For stroke recognition we snapped the freehand writing to 45° angles, thus converting it into a sequence of vectors along one of these eight directions. Let's assume that direction 0 is from bottom to top, direction 1 from bottom right to top left, direction 2 from right to left, and so on counter-clockwise (CCW).
Then the first stroke of te (手) would be [23]+ (as some write it falling and some horizontal)
The second and third strokes would be 6+
and the last would be 4+[123] (as with the little tip, every writer uses a different direction)
This coarse snapping was actually enough for us to recognize kanji. Maybe there are more sophisticated ways, but this simple solution managed to recognize about 90% of kanji. The only handwriting it couldn't grasp was that of one professor, but then no human except himself could read his handwriting either.
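For illustration, the snapping step might look roughly like this (a sketch with made-up names; direction 0 is bottom to top and the indices increase counter-clockwise, as described above):

import Foundation
import CoreGraphics

// Snaps the vector from one touch point to the next onto one of 8 directions (0 = up, CCW).
func snappedDirection(from start: CGPoint, to end: CGPoint) -> Int {
    let dx = Double(end.x - start.x)
    let dy = Double(start.y - end.y)               // screen y grows downwards, so flip it
    var angle = atan2(dy, dx)                      // 0 = pointing right
    angle -= Double.pi / 2                         // rotate so 0 = pointing up
    let sector = (angle / (Double.pi / 4)).rounded()
    return (Int(sector) % 8 + 8) % 8               // wrap into 0...7
}

// Converts a sampled stroke into direction codes, collapsing runs of the same code.
func directionCodes(for points: [CGPoint]) -> [Int] {
    guard points.count > 1 else { return [] }
    var codes: [Int] = []
    for i in 1..<points.count {
        let code = snappedDirection(from: points[i - 1], to: points[i])
        if codes.last != code { codes.append(code) }
    }
    return codes
}

A kanji template is then just a list of acceptable codes per stroke, for example 手 ≈ [2 or 3], [6], [6], [4 plus a tip direction], and recognition becomes matching the user's code sequence against those templates.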
EDIT2: It is important that your user "prints" the kanji and doesn't write in calligraphy, since in calligraphy many strokes are merged into one. For example, when writing a kanji with the "rice field" radical in calligraphy, this radical morphs into something completely different, and radicals with a lot of horizontal dashes (like the "speech" radical, iu) just become one long wriggly line.