iOS image comparison algorithm to find the differences in two very similar but different photos - iphone

My use case is that I have two images (photos) that are almost identical but do have some differences throughout. They won't be perfectly aligned, but they will be pretty darn close to each other. As a rough estimate, I'd expect to find a half dozen or less differences scattered randomly throughout a pair of 640x480 photos. The size of each difference would be roughly 20x20 px but there may be some odd shaped ones in there like 200x30 or so.
The algorithm will probably need to do some kind of blurring or fuzzy comparison or use "shaders" and "filters". Unfortunately, this is nowhere near my area of expertise. I've seen some libraries in my google searches, but nothing that looks like it would run on an iPhone.
Ideally, I'm looking for a working library, tutorial, or example code that can run on iOS.

The basic comparison may be the eigenvector comparison. But there is nice tutorial for java devs, here. May be helpful.

Related

Easy and straight-forward FFT of MP3 file

I have a MP3 file and need to constantly detect and show Hz value of this playing MP3 file. A bit of googling shows that I have 2 opportunities: use FFT or use Apple Accelerate framework. Unfortunately I haven't found any easy to use sample of neither. All samples, like AurioTouch etc, need tons of code to get simple number for sample buffer. Is there any easy example for pitch detection for iOS?
For example I've found https://github.com/clindsey/pkmFFT, but it's missing some files that its' author has removed. Anything working like that?
I'm afraid not. Working with sound is generally hard, and Core Audio makes no exception. Now to the matter at hand.
FFT is an algorithm for transforming input from time domain to frequency domain. Is not necessarily linked with sound processing, you can use it for other things other than sound as well.
Accelerate is an Apple provided framework, which among many other things offer an FFT implementation. So, you actually don't have two options there, just one and its implementation.
Now, depending on what you want to do(e.g. if you favour speed over accuracy, robustness over simplicity etc) and the type of waveform you have(simple, complex, human speech, music), FFT may be not enough on its own or not even the right choice for your task. There are other options, auto-correlation, zero-crossing, cepstral analysis, maximum likelihood to mention some. But none are trivial, except for zero-crossing, which also gives you the poorest results and will fail to work with complex waveforms.
Here is a good place to start:
http://blog.bjornroche.com/2012/07/frequency-detection-using-fft-aka-pitch.html
There are also other question on SO.
However, as indicated by other answers, this is not something that can just be "magically" done. Even if you license code from someone (eg, iZotope, and z-plane both make excellent code for doing what you want to do), you still need to understand what's going on to get data in and out of their libraries.
If you need fast pitch detection go with http://www.schmittmachine.com/dywapitchtrack.html
You'll find a IOS sample code inside.
If you need FFT you should use Apple Accelerate framework.
Hope this help

Which features can i use for handwritten OCR other than a downsampled binary grid of the image?

Hi I have been searching though research papers on what features would be good for me to use in my handwritten OCR classifying neural network. I am a beginner so I have been just taking the image of the handwritten character, made a bounding box around it, and then resize it into a 15x20 binary image. So this means i have an input layer of 300 features. From the papers i have found on google (most of which are quite old) the methods really vary. My accuracy is not bad with just a binary grid of the image, but I was wondering if anyone had other features I could use to boost my accuracy. Or even just pointing me in the right direction. I would really appreciate it!
Thanks,
Zach
I haven't read any actual papers on this topic, but my advice would be to get creative. Use anything you could think of that might help the classifier identify numbers.
My first thought would be to try and identify "lines" in the image, maybe via a modified "sliding window" algorithm (sliding/rotating line?), or to try and identify a "line of best fit" to the image (to help the classifier respond to changes in italicism or writing style). Really though, if you're using a neural network, it should be picking up on these sorts of things without your manual help (that's the whole point of them!)
I would focus first on the structure and topology of your net to try and improve performance, and worry about additional features only if you cannot get satisfactory performance some other way. Also you could try improving the features you already have, make sure the character is centered in the image, maybe try an algorithm to skew italicised characters to make them vertical?
In my experience these sorts of things don't often help, but you could get lucky and run into one that improves your net :)

Is there an imaging library that can make you look thinner?

Very odd question, I know, but this is a problem a potential client handed me today.
We assume we have a full length photo of a person. We want to generate a thinner image of that user. Obviously, one way would just be to compress the width of the image but that would result in various distortions that wouldn't be realistic.
I'd like to keep this an open-source implementation so if anybody knows of a library that can identify certain parts of the body and slim each in a way that is most realistic, I'd like to know.
This is obviously something that could be done by hand but we need a solution that works without user interaction.
You should look into seam-carving algorithms. The algorithm is very simple to implement and has many such implmentations online. Seems like ImageMagick has it too - called "Liquid Rescale".
I assume that already the detection of bodyparts in photos is a challenge too hard for algorithms, unless the photos are all very similar (e.g. same background, same pose, etc.)
I have once played around developing algorithms for skin smoothing. I was able to detect skin areas pretty well by converting colors to the LAB space and selecting pixels similar to skin sample colors learnt with a support vector machine from various sample images. Once you have that, you could run something like a liquify-contract algorithm for slimming.
I wouldn't expect satisfying results though unless you spend huge amounts of time on this.

iPhone UIImage number recognition

I have a small UIImage (jpg) with a single typed number. I want to be able to read the number with some kind of pattern recognition. I'm really not sure where to start, so any help would be appreciated.
my initial idea was to compare this image with other images. For instance compare the image with that of a 1,2,3, etc until a match was found. That just seems slow and cumbersome and wondered if there was a better way to do it?
Thanks
Update - I'm trying to convert sudoku puzzles from newspaper print to interactive puzzles
No, you are right, it will be slow and cumbersome. But on the plus side you don't have to write it yourself
http://sourceforge.net/projects/opencvlibrary/
Still not exactly easy tho, and i'm not sure about licensing, so… you don't mention why you need to do this (sounds a little odd).
Maybe you can avoid it? If you know the images are numerical digits 0-9, is there another way to track which one a particalur images is, apart from the way it's pixels are arranged?
Sorry if that sounds like i'm missing the point… Maybe you could fill in a few more details?
I read this really good write-up about this exact problem here: http://sudokugrab.blogspot.com/2009/07/how-does-it-all-work.html
It doesn't have any code samples, but explains the concepts, and might be able to point you in the right direction.
The following tutorial may be right down your alley:
http://blog.damiles.com/2008/11/basic-ocr-in-opencv/
It is a simple tutorial on doing number recognition and comes with the source code also.
Additionally, you may want to do a search on OCR SDK (Optical Character Recognition Software Development Kit). You will surely find a stack of them. Commercial ones a pricey though.
I would go for a "role your own" approach along the line of the OpenCV tutorial, especially since you are only interested in numbers.
All of the best ':-)

What content have you made/seen made using procedural techniques

I was looking at some study i have to do in the future to do with procedural generation techniques and i was wondering what type of content you have:
Developed
Helped Develop
Seen implemented
Tried to develop
and what methods/techniques/procedures you used to develop it.
If you feel generous maybe you can even go into specifics of it such as data structures ad algorithms you have used to develop it.
If this needs to be put as community wiki because it is not me asking for a problem to be solved just let me know.
This is not a homework thread because it is a research unit that i'm not taking yet ;)
Introversion software, the makers of the games Defcon, Uplink and Darwinia (among others) have started working on a game about a year ago which extensively uses PCG for city generation, here is a video of their work, and you can read more about it on the development diary of the game (start from the first part at the bottom of the page!).
This immediately got me extremely interested, and seeing the potential for games I immediately started researching the technology. I have amassed a folder of 18 PDFs about the subject (research papers, SIGGRAPH presentations, etc). Here, I uploaded it for you.
The main approach is to use L-Systems, however, I never got around to understanding enough of that to make something out of this. I tried other, less successful approaches like using Voronois, recursively splitting a rectangular area into more smaller areas and shifting the boundaries a little to obtain a bit of randomness and polygon division.
The last method I had gotten from Mike's Code Blog's posts (here and here). The screenshots shown on his blog make me drool, it is my biggest programmer's dream to ever get something that looks like that. I emailed him to ask how he did it, and here is the relevant part of his reply, I'm sure he wouldn't mind me posting this here:
L-Systems is definitely one way to go, but that isn't what I'm doing. The basis of my method is polygon subdivision. I start with a simple polygon that represents the entire area of the city. Then, I split it (roughly) in half, and then split those two polygons, etc. until I get down to city-block size. At that point, the edges of all my polygons represent roads. I then use the same subdivision method to break the blocks down into building-size lots.
The devil is in the details, of course, but that is the basic method.
I for one still haven't managed to fully implement a solution of which I'm satisfied of, but it remains one of, if not my single biggest programmer's dream to ever achieve something like this.
Here are a few of the leaders in procedurally generated terrain (and to a lesser extent foliage). If you don't get a detailed answer here regarding methods and techniques, you might want to look in / ask in their forums. I have seen some discussions of techniques there.
TerraGen 2
World Builder
World Machine
Natural Graphics
Noone mentioned the demoscene that ONLY use procedural stuff?
So, go search for Werkkzeug, Kkrieger, MilkyTracker to start. Also you can visit the site pouet and see the wonder of well done procedural videos (yes, procedural videoclips! With music and graphics, all procedural!)
Allegorithmic's products are used in actual shipping titles. These guys focus on texture generation (both offline and at runtime).
They have some very pretty screenshots and demos.