Find image within image (template matching?) - swift

I need to find the location of an image that the user provides within an image that I provide.
It is safe to assume at the time of the analysis that the user provided image is certain to be contained within the image to be compared with.
I’ve looked through Core ML and Vision image classification, and even have some experience with them; however, I am struggling to convince myself that this is the correct way to approach the problem. The way “feature values” are handled in Vision feels almost like the reverse of what I’m looking for.
My question: Is there a feature of Core ML or Vision that tackles this particular problem head on?
Other information that may be relevant:
The images cannot be assumed to match pixel for pixel, because their resolutions may differ.
They may also be provided in any shape, although it is possible to crop them to a standardised shape before analysis.
Rotation will also need to be accounted for.
The user's image will never appear in my image more than once.

Take a look at some of the feature detection and matching algorithms.
For example, you could use SIFT (scale-invariant feature transform algorithm) with RANSAC (Random sample consensus algorithm) to do exactly what you described.
If you are using OpenCV there are plenty of such algorithms which you can easily use. (FAST, Shi-Tomasi, etc.)
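To make the SIFT + RANSAC suggestion a bit more concrete, here is a minimal sketch of the RANSAC step only, in plain Python. It assumes a feature detector such as SIFT has already produced candidate point matches between the user's image and the larger image, some of them wrong. The function names (`fit_similarity`, `ransac_similarity`), the tolerance and the synthetic data are my own illustration, not part of OpenCV, Core ML or Vision. Points are stored as complex numbers, so a scale + rotation + translation can be written as q = a*p + b.

```python
import cmath
import random

def fit_similarity(p1, p2, q1, q2):
    """Solve q = a*p + b from two correspondences (points as complex numbers)."""
    a = (q2 - q1) / (p2 - p1)
    b = q1 - a * p1
    return a, b

def ransac_similarity(matches, iterations=200, tolerance=2.0, seed=0):
    """matches: list of (p, q) complex pairs, some of them outliers.
    Returns (a, b, inliers) for the candidate model with the most inliers."""
    rng = random.Random(seed)
    best = (None, None, [])
    for _ in range(iterations):
        (p1, q1), (p2, q2) = rng.sample(matches, 2)
        if p1 == p2:
            continue  # degenerate sample, cannot define a transform
        a, b = fit_similarity(p1, p2, q1, q2)
        inliers = [(p, q) for p, q in matches if abs(a * p + b - q) <= tolerance]
        if len(inliers) > len(best[2]):
            best = (a, b, inliers)
    return best

# Synthetic check: template points scaled by 1.5, rotated 30 degrees, shifted.
true_a = 1.5 * cmath.exp(1j * cmath.pi / 6)
true_b = complex(40, 25)
template_points = [complex(x, y) for x in range(0, 50, 10) for y in range(0, 50, 10)]
matches = [(p, true_a * p + true_b) for p in template_points]
# Two deliberately wrong matches, standing in for descriptor mismatches.
matches += [(complex(5, 5), complex(300, 7)), (complex(20, 35), complex(-90, 410))]

a, b, inliers = ransac_similarity(matches)
scale = abs(a)                                    # relative resolution
rotation_degrees = cmath.phase(a) * 180 / cmath.pi
```

Here `b` gives the position of the template's origin inside the larger image, while `scale` and `rotation_degrees` cover the resolution and rotation differences mentioned in the question. A real pipeline would more likely estimate a full homography, but the sample/fit/count-inliers loop is the same idea.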

I think you need something like this example in OpenCV.

Related

Which features can I use for handwritten OCR other than a downsampled binary grid of the image?

Hi, I have been searching through research papers to find out which features would be good to use in my handwritten OCR classifying neural network. I am a beginner, so I have just been taking the image of the handwritten character, putting a bounding box around it, and then resizing it into a 15x20 binary image. This means I have an input layer of 300 features. In the papers I have found on Google (most of which are quite old), the methods really vary. My accuracy is not bad with just a binary grid of the image, but I was wondering whether anyone knows of other features I could use to boost my accuracy, or could even just point me in the right direction. I would really appreciate it!
Thanks,
Zach
I haven't read any actual papers on this topic, but my advice would be to get creative. Use anything you could think of that might help the classifier identify numbers.
My first thought would be to try and identify "lines" in the image, maybe via a modified "sliding window" algorithm (sliding/rotating line?), or to try and identify a "line of best fit" to the image (to help the classifier respond to changes in italicism or writing style). Really though, if you're using a neural network, it should be picking up on these sorts of things without your manual help (that's the whole point of them!)
I would focus first on the structure and topology of your net to try to improve performance, and worry about additional features only if you cannot get satisfactory performance some other way. You could also try improving the features you already have: make sure the character is centered in the image, or try an algorithm that deskews italicised characters so that they stand vertical.
In my experience these sorts of things don't often help, but you could get lucky and run into one that improves your net :)
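As a rough illustration of the centering and deskewing ideas above, here is a small sketch that works directly on a binary grid like the 15x20 one in the question. It estimates the slant from image moments and shears the rows back; this is one common way to do it, not necessarily what the answer had in mind, and the function names are made up for the example.

```python
def center_of_mass(grid):
    """Return the (row, col) centroid of the 'on' pixels in a binary grid."""
    points = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v]
    n = len(points)
    return sum(r for r, _ in points) / n, sum(c for _, c in points) / n

def recenter(grid):
    """Shift the glyph so its centroid sits at the middle of the grid."""
    rows, cols = len(grid), len(grid[0])
    cr, cc = center_of_mass(grid)
    dr = round((rows - 1) / 2 - cr)
    dc = round((cols - 1) / 2 - cc)
    out = [[0] * cols for _ in range(rows)]
    for r, row in enumerate(grid):
        for c, v in enumerate(row):
            if v and 0 <= r + dr < rows and 0 <= c + dc < cols:
                out[r + dr][c + dc] = 1
    return out

def slant(grid):
    """Shear estimate from central moments: columns drifted per row."""
    cr, cc = center_of_mass(grid)
    mu11 = mu20 = 0.0
    for r, row in enumerate(grid):
        for c, v in enumerate(row):
            if v:
                mu11 += (r - cr) * (c - cc)
                mu20 += (r - cr) ** 2
    return mu11 / mu20 if mu20 else 0.0

def deskew(grid):
    """Undo the estimated shear by shifting each row horizontally."""
    cols = len(grid[0])
    cr, _ = center_of_mass(grid)
    s = slant(grid)
    out = [[0] * cols for _ in grid]
    for r, row in enumerate(grid):
        shift = round(-s * (r - cr))
        for c, v in enumerate(row):
            if v and 0 <= c + shift < cols:
                out[r][c + shift] = 1
    return out

# A diagonal stroke in a 5x7 grid becomes a vertical one after deskewing.
slanted = [[1 if c == r + 1 else 0 for c in range(7)] for r in range(5)]
upright = deskew(slanted)

# A stroke stuck in the left column is moved to the middle of the grid.
off_center = [[1 if (c == 0 and r < 3) else 0 for c in range(7)] for r in range(5)]
centered = recenter(off_center)
```

The same moments (plus row and column pixel counts) could also be fed to the network as extra inputs, but whether that helps is something you would have to measure.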

iOS image comparison

I am doing some research into image processing and would appreciate it if someone could point me in the right direction. I want to compare image A, a picture of a person's face, with images stored in a database (B, C, D, E, etc.), which are also pictures of faces, to see whether person A is already in the database.
Several questions:
1. How is face recognition comparison usually done? (Do you extract features, e.g. eyes/mouth, and compare them to other images?)
2. Are there prebuilt libraries that can compare images, or do I need to write my own algorithm?
3. Where can I start with this? (I would appreciate some references/reading material.)
Yes, you identify, extract and quantify various aspects of human faces, such as distance between pupils, width of mouth, percentage of head height where tip of nose is, etc.
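A toy sketch of that idea, assuming some other component has already located the facial landmarks: turn the measurements into ratios so that image size does not matter, then look for the nearest stored face. The landmark keys, the coordinates and the `max_distance` threshold are all hypothetical values chosen for the example, and real systems use far richer features.

```python
import math

def face_signature(landmarks):
    """Turn landmark coordinates into scale-independent ratios.
    landmarks: dict of (x, y) points; the key names are hypothetical."""
    def dist(a, b):
        return math.dist(landmarks[a], landmarks[b])
    pupil_gap = dist("left_pupil", "right_pupil")
    head_height = dist("head_top", "chin")
    return (
        dist("mouth_left", "mouth_right") / pupil_gap,  # mouth width vs pupil distance
        dist("head_top", "nose_tip") / head_height,     # nose position along the head
        pupil_gap / head_height,
    )

def best_match(query, database, max_distance=0.05):
    """Name of the closest stored signature, or None if nothing is close enough."""
    name, signature = min(database.items(), key=lambda kv: math.dist(query, kv[1]))
    return name if math.dist(query, signature) <= max_distance else None

def scaled(landmarks, k):
    """Simulate the same face photographed at a different size."""
    return {name: (x * k, y * k) for name, (x, y) in landmarks.items()}

person_b = {"left_pupil": (40, 50), "right_pupil": (80, 50), "nose_tip": (60, 75),
            "mouth_left": (45, 95), "mouth_right": (75, 95),
            "head_top": (60, 0), "chin": (60, 120)}
person_c = dict(person_b, mouth_left=(35, 95), mouth_right=(85, 95), nose_tip=(60, 85))
database = {"B": face_signature(person_b), "C": face_signature(person_c)}

query = face_signature(scaled(person_b, 2))
match = best_match(query, database)               # person B at twice the size
stranger = best_match((2.0, 0.5, 0.3), database)  # nothing close, so no match
```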
There is a company, Luxand, which makes software to do this, and I think they license it. Last time I looked (2009?) they didn't have an Objective-C library. They do have an app that claims to merge faces from photographs, so you can see what the offspring of any two people would look like, but it is very cheesy, with lots of hard-coded faces. (If you cross a dog with a teapot, you get the same baby face as from crossing two real faces.)
AFAIK, there is nothing in the iOS SDK that does this.
I would just Google "face recognition" and start reading. Good luck.
I would go with compiling OpenCV for the iPhone ( http://computer-vision-talks.com/2011/02/building-opencv-for-iphone-in-one-click/ ), and then implementing one of the classical approaches to face recognition, such as eigenfaces ( http://www.shervinemami.info/faceRecognition.html ).
But don't expect miracles: the accuracy will be low, and the app will be slow.
Also, when you say face recognition is difficult, doesn't the first link show how easy it is to detect faces in a picture?
The face detection from the first link only detects the face: it tells you whether there is a face in the image, which you can then pass as input to the recognition algorithm.
1. Face recognition is very difficult: you need to extract some kind of "features" and perform some measurement. iPhone hardware isn't well suited to this job.
2. Yes, you can check here
http://maniacdev.com/2011/11/tutorial-easy-face-detection-with-core-image-in-ios-5/
for a tutorial and here
http://maniacdev.com/2011/12/open-source-library-for-adding-easy-face-to-your-ios-app-with-the-free-face-com-api/
for a free web service.
3. I suggest Google Scholar (http://scholar.google.it/scholar?q=face+recognition&hl=it&btnG=Cerca&lr=), but I think that if you want to write your own algorithm you will need a lot of spare time :)

Does a free API for an augmented reality service exist?

I am currently trying to create an iPhone app that can recognize objects in an image, such as a car, bus, building, bridge or human, and label each with its object name with the help of the Internet.
Is there any free service that solves this problem? Object recognition is itself a complex task requiring digital image processing, neural networks and so on.
Can this be done via an API?
If you want to recognise planar images, the current generation of mobile AR SDKs from Metaio, Qualcomm and Layar will allow you to upload images to match against, and will perform the matching.
If you want to match freely against a set of 3D objects, e.g. a Toyota Prius or the Empire State Building, the same techniques might be applied to match against sets of images taken at different rotations. However, you might have to settle for matching just one object because of limits on how large an image database the service allows, or contact those companies for a custom solution. It may also not work very reliably, given that the state of the art is reliable matching against planar images.
If you want to recognize general classes (human, car, building), this is a very difficult problem, and I don't know of any solutions anywhere fast enough to operate online (which I assume is a requirement given you want an AR solution - is that a fair assumption?). It's been a few years since I studied CV, but at that time the most promising solution for visual classification was "bag of visual words" approaches - you might try reading up on those.
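For readers unfamiliar with the "bag of visual words" idea mentioned above, here is a toy sketch. In practice the descriptors would be something like 128-dimensional SIFT vectors and the vocabulary would be learned by clustering (typically k-means) descriptors from training images. Here both are tiny hand-made 2-D values, and the class histograms are invented purely to show the mechanics.

```python
import math
from collections import Counter

def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to a local descriptor."""
    return min(range(len(vocabulary)), key=lambda i: math.dist(descriptor, vocabulary[i]))

def bag_of_words(descriptors, vocabulary):
    """Normalised histogram of visual-word counts for one image."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total for i in range(len(vocabulary))]

def classify(histogram, class_histograms):
    """Label whose reference histogram overlaps most (histogram intersection)."""
    def overlap(other):
        return sum(min(a, b) for a, b in zip(histogram, other))
    return max(class_histograms, key=lambda label: overlap(class_histograms[label]))

# Toy data (hypothetical): three visual words and two object classes.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
class_histograms = {"car": [0.7, 0.2, 0.1], "building": [0.1, 0.2, 0.7]}

image_descriptors = [(0.1, 0.0), (0.0, 0.1), (0.9, 0.1), (0.1, -0.1), (0.0, 0.9)]
histogram = bag_of_words(image_descriptors, vocabulary)
label = classify(histogram, class_histograms)
```

The spatial layout of the features is thrown away on purpose: only how often each "word" occurs is kept, which is what makes the representation tolerant of pose and position changes. A real classifier would normally be trained on many such histograms rather than comparing against one reference per class.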
Take a look at Cortexica. Very useful for this sort of thing.
http://www.cortexica.com/
I haven't done work with mobile AR in a while, but the last time I was working on this stuff I was using Layar and starting to investigate Junaio. Those are oriented toward 3D graphics, not simply text labels, so for your use case you may be better served with OpenCV.
Note that Layar (and I believe Junaio too) works like a web app, where you put the content on your own server and give Layar the URL to link to.

Is there an imaging library that can make you look thinner?

Very odd question, I know, but this is a problem a potential client handed me today.
We assume we have a full length photo of a person. We want to generate a thinner image of that user. Obviously, one way would just be to compress the width of the image but that would result in various distortions that wouldn't be realistic.
I'd like to keep this an open-source implementation so if anybody knows of a library that can identify certain parts of the body and slim each in a way that is most realistic, I'd like to know.
This is obviously something that could be done by hand but we need a solution that works without user interaction.
You should look into seam-carving algorithms. The algorithm is very simple to implement and has many implementations online. ImageMagick seems to have it too, under the name "Liquid Rescale".
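For reference, the core of seam carving fits in a few lines. The sketch below is a simplified version on a grayscale image stored as a list of rows. The energy function is a basic gradient magnitude, chosen for brevity; the function names are my own, and this is not how ImageMagick implements it. Each call removes the cheapest connected top-to-bottom path of pixels, making the image one pixel narrower.

```python
def energy(image):
    """Gradient-magnitude energy of a grayscale image (list of rows of numbers)."""
    h, w = len(image), len(image[0])
    def px(r, c):  # clamp coordinates at the borders
        return image[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]
    return [[abs(px(r, c + 1) - px(r, c - 1)) + abs(px(r + 1, c) - px(r - 1, c))
             for c in range(w)] for r in range(h)]

def find_vertical_seam(cost):
    """Dynamic programming: cheapest top-to-bottom path, moving at most
    one column left or right per row."""
    h, w = len(cost), len(cost[0])
    total = [list(cost[0])]
    for r in range(1, h):
        prev = total[-1]
        total.append([cost[r][c] + min(prev[max(c - 1, 0):min(c + 2, w)])
                      for c in range(w)])
    # Backtrack from the cheapest cell in the last row.
    seam = [min(range(w), key=lambda c: total[-1][c])]
    for r in range(h - 1, 0, -1):
        c = seam[-1]
        candidates = range(max(c - 1, 0), min(c + 2, w))
        seam.append(min(candidates, key=lambda k: total[r - 1][k]))
    return seam[::-1]  # seam[r] is the column to drop in row r

def remove_vertical_seam(image):
    """Return a copy of the image that is one pixel narrower."""
    seam = find_vertical_seam(energy(image))
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]

# A bright two-pixel bar on a flat background: the seam goes through the
# flat area, so the bar survives the resize untouched.
image = [[10, 10, 10, 200, 200, 10] for _ in range(4)]
narrower = remove_vertical_seam(image)
```

Note that plain seam carving removes low-detail regions first, which in a portrait usually means the background rather than the person, so for slimming you would probably have to bias the energy map towards the body.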
I assume that even detecting body parts in photos is too hard a challenge for algorithms, unless the photos are all very similar (e.g. same background, same pose, etc.).
I have once played around developing algorithms for skin smoothing. I was able to detect skin areas pretty well by converting colors to the LAB space and selecting pixels similar to skin sample colors learnt with a support vector machine from various sample images. Once you have that, you could run something like a liquify-contract algorithm for slimming.
I wouldn't expect satisfying results though unless you spend huge amounts of time on this.
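To give an idea of the skin-detection step described above, here is a simplified sketch. It converts sRGB pixels to LAB and compares only the a/b (colour) channels against a few sample skin colours. The answer used a support vector machine for the decision; a plain distance threshold is substituted here to keep the example short. The sample colours and the `max_distance` value are made-up assumptions, not tuned numbers.

```python
import math

def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIE L*a*b* (D65 white point, rounded constants)."""
    def linear(u):
        u /= 255
        return u / 12.92 if u <= 0.04045 else ((u + 0.055) / 1.055) ** 2.4
    rl, gl, bl = linear(r), linear(g), linear(b)
    x = (0.4124 * rl + 0.3576 * gl + 0.1805 * bl) / 0.9505
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = (0.0193 * rl + 0.1192 * gl + 0.9505 * bl) / 1.089
    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116
    fx, fy, fz = f(x), f(y), f(z)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

def is_skin(pixel, skin_samples, max_distance=12.0):
    """True if the pixel's a/b chroma is close to any sample skin colour.
    Lightness (L) is ignored so that shading matters less."""
    _, a, b = srgb_to_lab(*pixel)
    return any(math.dist((a, b), srgb_to_lab(*s)[1:]) <= max_distance
               for s in skin_samples)

# Hypothetical sample colours; in practice they would come from labelled photos.
skin_samples = [(224, 172, 105), (198, 134, 66)]

white = srgb_to_lab(255, 255, 255)
sample_is_skin = is_skin((224, 172, 105), skin_samples)
gray_is_skin = is_skin((128, 128, 128), skin_samples)
```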

Is there a library that can do raster to vector conversion, for the iPhone?

I am trying to take an image and extract handwritten text so that it can be read easily and zoomed in on. I would like to convert the text to vector paths.
I am not aware of any libraries that would make this as painless as possible. Any help is greatly appreciated. Examples are nice too :)
Simple iPhone Image Processing (on Google Code) contains all the primitive tools you will need:
Canny edge detection
Histogram equalisation
Skeletonisation
Thresholding (adaptive and global)
Gaussian blur (used as a preprocessing step for Canny edge detection)
Brightness normalisation
Connected region extraction
Resizing (uses interpolation)
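To show how two of these primitives fit together for isolating handwritten strokes, here is a small sketch of global thresholding followed by connected region extraction. It is written from scratch for illustration and is not the API of the library mentioned above. It assumes dark ink on a light background, and the threshold level is arbitrary.

```python
from collections import deque

def threshold(image, level):
    """Global threshold: 1 where the pixel is darker than `level` (ink), else 0."""
    return [[1 if v < level else 0 for v in row] for row in image]

def connected_regions(binary):
    """Group 'on' pixels into 4-connected regions; returns a list of pixel lists."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from this unvisited ink pixel.
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

# A tiny grayscale "page" containing two separate pen strokes.
page = [
    [255, 255, 255, 255, 255],
    [255,  20,  30, 255,  40],
    [255,  25, 255, 255,  35],
    [255, 255, 255, 255, 255],
]
strokes = connected_regions(threshold(page, 128))
stroke_sizes = [len(region) for region in strokes]
```

Each region could then be skeletonised and its outline or centre line traced into a vector path, which is the part the question is really after.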
The only program I know of for the iPhone that does handwriting recognition is HWPEN. Unfortunately, it's not a library but a full application and (to make matters worse) it requires a jailbroken phone.
I fear you must either try to get the source for HWPEN or reverse engineer it to obtain the code you need.
Barring that, you may want to write your own. There are several studies on handwriting recognition that may help.