Are there any algorithms available for analyzing the complexity of an image? I'm writing a Perl script that uses the system() function to launch MPlayer in the background and generate 10 to 20 screenshots for an input video file. I'd like it to discard simple images, such as a shot of the sky or a black background, and keep just the 3 with the highest complexity or the most colors. Is there a module or a separate program I can use to accomplish this? I'm guessing Image::Magick can take care of it.
See how small a JPEG-compressed copy is. JPEG works hard to remove redundancies in image information and "complex" images simply don't have as much redundancy to remove.
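A rough sketch of that idea in Python with Pillow (the function name and quality setting here are just illustrative choices):

import io
from PIL import Image

def jpeg_complexity(path, quality=75):
    # Re-encode to JPEG in memory; flat skies and black frames
    # compress to far fewer bytes than busy scenes.
    img = Image.open(path).convert("RGB")
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return buf.tell()

Rank the 10 to 20 screenshots by this score and keep the top 3.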
Great paper here on the subject. It considers the narrower problem of matching images in a military application, but it surveys the research and contains plenty of different metrics of image complexity that various authors have considered. You might need only one or two of the methods for your particular task. Check it out.
My first answer would be the JPEG method but somebody already suggested it, so my next answer would be to compute a histogram ($image->Histogram()). Just look at the number of different colors in the image. For photos (things like the sky), the more colors in an image, the more complex it is.
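If you want to prototype the color-count idea outside Perl first, a minimal Pillow equivalent might look like this (converting to RGB first is my own assumption):

from PIL import Image

def distinct_colors(path):
    # getcolors() returns None if the image has more distinct values
    # than maxcolors, so pass the full 24-bit limit.
    img = Image.open(path).convert("RGB")
    colors = img.getcolors(maxcolors=256 ** 3)
    return len(colors)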
You might consider doing an FFT and looking for high-frequency information in the images... That would give you a rough idea of complexity.
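A sketch of that in Python with NumPy (the cutoff radius defining "high frequency" is an arbitrary choice here):

import numpy as np
from PIL import Image

def high_freq_fraction(path, cutoff=0.25):
    # Share of spectral energy outside a low-frequency disc at the
    # centre of the shifted 2-D FFT; higher = more fine detail.
    px = np.asarray(Image.open(path).convert("L"), dtype=np.float64)
    power = np.abs(np.fft.fftshift(np.fft.fft2(px))) ** 2
    h, w = power.shape
    y, x = np.ogrid[:h, :w]
    r = np.hypot(y - h / 2, x - w / 2)
    low = power[r < cutoff * min(h, w) / 2].sum()
    return 1.0 - low / power.sum()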
I don't know of a ready-made library method, but there are some algorithms to measure this ...
You could try to add up the absolute values of the differences of one pixel to the next, separately per color channel. The sample image with the highest result would win, then. Still, it would be a very rough measurement...
A bit of Python instead of pseudo-code, since I don't know Perl (Pillow assumed for pixel access):

import sys
from PIL import Image

img = Image.open(sys.argv[1]).convert("RGB")
pixels = img.load()
width, height = img.size

complexity = 0
# Image coordinates start at [0, 0]; start the loops at 1 so the
# upper and left neighbours always exist.
for x in range(1, width):
    for y in range(1, height):
        r, g, b = pixels[x, y]
        ru, gu, bu = pixels[x, y - 1]  # pixel above
        rl, gl, bl = pixels[x - 1, y]  # pixel to the left
        complexity += abs(r - ru) + abs(r - rl)
        complexity += abs(g - gu) + abs(g - gl)
        complexity += abs(b - bu) + abs(b - bl)

print(complexity)
I have a situation where I have many images, and I compare them using a specific fuzz factor (say 10%), looking for images that match. Works fine.
However, I sometimes need to compare all images to all other images (e.g. 1000 images). Doing 5000+ ImageMagick compares is way too slow.
Hashing all the files and comparing the hashes 5000 times is lightning fast, but of course only works when the images are identical (no fuzz factor).
I'm wondering if there is some way to produce an ID or fingerprint - or maybe a range of IDs - where I could very quickly determine what images are close enough to each other, and then pay the ImageMagick compare cost only for those likely matches. Ideas or names of existing algorithms/approaches are very welcome.
There are quite a few image hashing algorithms out there. pHash is the one that springs to the top of my mind: http://www.phash.org/. It copes with the basic transformations one might apply to an image. If you want to be more sophisticated and roll your own, you can use a pre-trained image classifier trained on ImageNet (https://www.learnopencv.com/keras-tutorial-using-pre-trained-imagenet-models/), lop off the final layer, and use the penultimate layer as a vector. For a small number of images you can easily do a nearest-neighbor search; if you have more, you can use Annoy (https://github.com/spotify/annoy) to make the nearest-neighbor search more efficient.
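As a cheap prefilter before the expensive ImageMagick compares, the imagehash Python package (which implements pHash-style hashes) keeps this to a few lines; the file names and the distance threshold of 8 below are just placeholders:

import imagehash
from PIL import Image

h1 = imagehash.phash(Image.open("a.jpg"))
h2 = imagehash.phash(Image.open("b.jpg"))

# Subtracting two hashes gives the Hamming distance between them;
# small distances mark candidate pairs worth a full fuzzy compare.
if h1 - h2 <= 8:
    print("likely match - run ImageMagick compare on this pair")

Hash each image once up front, then only pay the ImageMagick compare cost for pairs whose hashes are close.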
I am trying to reproduce the experiments on the ai-junkie website http://www.ai-junkie.com/ann/som/som1.html to cluster/group different colors together using Self-Organizing Maps (SOM) on a larger color dataset. I use about 400 images of differing solid colors, and since they are solid colors, the color values in any color space (for example, RGB) are the same for all points in a particular image. Hence the features I use for the SOM clustering are just the 3-dimensional color value of each image.
When I run the SOM, the source code of which is obtained from http://knnl.sourceforge.net/, with 40 rows, 40 columns, and 20 iterations (epochs = 20), the result of the clustering makes no sense to me. It looks as follows:
I feel like this is just random clustering (if I can call it that), and even a k-means algorithm would give better results. Any thoughts on what could have gone wrong?
20 iterations is not enough for the SOM algorithm. Try rows*columns*500; that's the default value for the learning algorithm. On simple datasets like yours you can reduce this number, but 20 is far too small. And be patient, it's going to take a while :)
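For reference, the update rule all those iterations are driving looks like this in NumPy terms (a from-scratch sketch, not the knnl API; the learning rate and decay schedules are illustrative choices):

import numpy as np

rows, cols = 40, 40
epochs = rows * cols * 500          # the suggested iteration count
data = np.random.rand(400, 3)       # 400 solid-colour RGB samples
weights = np.random.rand(rows, cols, 3)

sigma0 = max(rows, cols) / 2.0      # initial neighbourhood radius
lr0 = 0.1                           # initial learning rate
gy, gx = np.mgrid[0:rows, 0:cols]

for t in range(epochs):
    sample = data[np.random.randint(len(data))]
    # Best matching unit: the node whose weight is closest to the sample.
    dist = np.linalg.norm(weights - sample, axis=2)
    by, bx = np.unravel_index(np.argmin(dist), dist.shape)
    # Shrink the neighbourhood radius and learning rate over time.
    sigma = sigma0 * np.exp(-t * np.log(sigma0) / epochs)
    lr = lr0 * np.exp(-t / epochs)
    # Pull the BMU and its neighbours toward the sample.
    d2 = (gy - by) ** 2 + (gx - bx) ** 2
    h = np.exp(-d2 / (2.0 * sigma * sigma))
    weights += lr * h[..., None] * (sample - weights)

With only 20 iterations the neighbourhood function barely gets a chance to organize the map, which is why the output looks random.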
It looks wrong, as you say it looks just like a random clustering.
A variety of things could have gone wrong. A few that come to mind: the number of iterations is not sufficient, the neighborhood function is not adequate, the implementation of the library you're using has some bug.
You can download the example posted on ai-junkie.com directly:
ai-junkie.com SOM Demo
Not sure what the SourceForge library is. Or are you asking for help debugging it?
I have made a similar SOM with AForge; you can have the source if you still need it. I tried with a 4x4 and a 16x16 SOM, and I needed just a few iterations (<100) to adapt. Of course, it also depends on the learning factor.
I have a 1200x1175-pixel picture, and I want to train a net (MLP or Hopfield) to learn a specific 201x111-pixel part of it, then save the weights for reuse in a new net (with the same features) that finds that specific part without retraining. Now the questions: which kind of net is useful, MLP or Hopfield? If MLP, how many hidden layers? The trainlm function is unusable because of an "out of memory" error. I converted the picture to a binary image; is that useful?
What exactly do you need the solution to do? Find an object within an image (like "Where's Waldo"?)? Will the target object always be the same size and orientation? Might it look different because of lighting changes?
If you just need to find a fixed pattern of pixels within a larger image, I suggest a straightforward correlation measure, such as normalized cross-correlation, to find it efficiently.
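If OpenCV is an option, its template matching already does this; a sketch (the file names are hypothetical):

import cv2

scene = cv2.imread("picture.png", cv2.IMREAD_GRAYSCALE)   # 1200x1175 image
patch = cv2.imread("patch.png", cv2.IMREAD_GRAYSCALE)     # 201x111 part

# Normalised cross-correlation; robust to uniform brightness changes.
result = cv2.matchTemplate(scene, patch, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)
print("best match at", max_loc, "with score", max_val)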
If you need to contend with any of the issues mentioned above, then there are two basic solutions: 1. build a model using examples of the object in different poses, scalings, etc., so that the model will recognize any of them, or 2. develop a way to normalize the patch of pixels being examined to minimize the effect of those distortions (like Hu's invariant moments). If nothing else, you'll want to perform some sort of data reduction to get the number of inputs down. Technically, you could also try a model which is invariant to rotations, etc., but I don't know how well those work. I suspect they are more temperamental than traditional approaches.
I found AdaBoost to be helpful in picking out only important bits of an image. That, and resizing the image to something very tiny (like 40x30) using a Gaussian filter will speed it up and put weight on more of an area of the photo rather than on a tiny insignificant pixel.
I have written MATLAB code for two different block matching algorithms, exhaustive search and three-step search, but I am not sure how to check whether I am getting correct results. Is there any standard way to check this, or any standard code I can run and compare my results against? I read somewhere that the JM reference software can be used, but I didn't find any way to use it.
You can always use the results produced by your algorithms to create the next frame of video and then analyze its quality by either visually inspecting it (which is rather subjective, and we like to deal in numbers) or calculating the mean square error between the produced image and the one you're trying to estimate. Mean square error of the exhaustive (extensive) search should be lower than the one three-step gives you.
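The MSE check itself is a one-liner in MATLAB, mean((A(:) - B(:)).^2); for comparison, a NumPy sketch:

import numpy as np

def mse(frame_a, frame_b):
    # Mean squared error between two equal-sized frames.
    a = frame_a.astype(np.float64)
    b = frame_b.astype(np.float64)
    return ((a - b) ** 2).mean()

Exhaustive search examines every candidate block, so its MSE should never be worse than three-step search's on the same frame pair; if it is, something is wrong.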
Well, did you try to plot it? I mean, after the block matching you have a new image, right?
A way to check whether your result is plausible is to compare sums of absolute frame differences:
A - pre_frame
B - post_frame
C - compensated frame
If sum(abs(B - C)) is lower than sum(abs(B - A)), the compensation is probably doing its job: the compensated frame should be closer to the target frame than the unmodified previous frame was.
Next time, try to specify your algorithm, and post your code so we can help you more.
Is there an efficient way to get a fingerprint of an image for duplicate detection?
That is, given an image file, say a jpg or png, I'd like to be able to quickly calculate a value that identifies the image content and is fairly resilient to other aspects of the image (eg. the image metadata) changing. If it deals with resizing that's even better.
[Update] Regarding the metadata in JPEG files: does anyone know if it's stored in a specific part of the file? I'm looking for an easy way to ignore it, e.g. can I skip the first x bytes of the file, or take x bytes from the end, to be sure I'm not hashing metadata?
Stab in the dark, if you are looking to circumvent meta-data and size related things:
Edge Detection and scale-independent comparison
Sampling and statistical analysis of grayscale/RGB values (average lum, averaged color map)
FFT and other transforms (Good article Classification of Fingerprints using FFT)
And numerous others.
Basically:
Convert JPG/PNG/GIF whatever into an RGB byte array which is independent of encoding
Use a fuzzy pattern classification method to generate a 'hash of the pattern' in the image ... not a hash of the RGB array as some suggest
Then you want a distributed method of fast hash comparison based on matching threshold on the encapsulated hash or encoding of the pattern. Erlang would be good for this :)
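A minimal sketch of the middle step, using an "average hash" as one very simple fuzzy pattern classifier (one concrete choice among many; the 8x8 size is arbitrary):

import numpy as np
from PIL import Image

def pattern_hash(path, size=8):
    # Decode to pixels (encoding-independent), shrink, and hash the
    # light/dark pattern itself rather than the raw RGB bytes.
    img = Image.open(path).convert("L").resize((size, size))
    px = np.asarray(img, dtype=np.float64)
    return (px > px.mean()).flatten()

def hamming(h1, h2):
    # Matching threshold: accept pairs whose distance is small enough.
    return int(np.count_nonzero(h1 != h2))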
Advantages are:
With some AI/training, it will spot duplicates regardless of encoding, size, aspect ratio, hue and luminance modification, and dynamic range/subsampling differences, and in some cases even perspective
Disadvantages:
Can be hard to code ... something like OpenCV might help
Probabilistic ... false positives are likely but can be reduced with neural networks and other AI
Slow unless you can encapsulate pattern qualities and distribute the search (MapReduce style)
Checkout image analysis books such as:
Pattern Classification 2ed
Image Processing Fundamentals
Image Processing - Principles and Applications
And others
If you are scaling the image, then things are simpler. If not, then you have to contend with the fact that scaling is lossy in more ways than sample reduction.
Using the byte size of the image for comparison would be suitable for many applications. Another way would be to:
Strip out the metadata.
Calculate the MD5 (or other suitable hashing algorithm) for the image.
Compare that to the MD5 (or whatever) of the potential dupe image (provided you've stripped out the metadata for that one too).
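The simplest way to "strip out the metadata" is to decode the file and hash only the pixel data; a sketch with Pillow and hashlib (the helper name is mine):

import hashlib
from PIL import Image

def content_md5(path):
    # Decoding to raw RGB bytes drops EXIF and other metadata,
    # so only the pixels influence the digest.
    img = Image.open(path).convert("RGB")
    return hashlib.md5(img.tobytes()).hexdigest()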
You could use an algorithm like SIFT (Scale Invariant Feature Transform) to determine key points in the pictures and match these.
See http://en.wikipedia.org/wiki/Scale-invariant_feature_transform
It is used e.g. when stitching images in a panorama to detect matching points in different images.
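With OpenCV this takes only a few lines; a sketch (file names hypothetical, and the 0.75 ratio-test threshold is Lowe's usual suggestion):

import cv2

img1 = cv2.imread("a.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("b.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Ratio test: keep matches clearly better than their runner-up.
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(len(good), "good matches")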
You want to perform an image hash. Since you didn't specify a particular language, I'm guessing you don't have a preference. At the very least there's a MATLAB toolbox (beta) that can do it: http://users.ece.utexas.edu/~bevans/projects/hashing/toolbox/index.html. Most of the Google results on this are research papers rather than actual libraries or tools.
The problem with MD5ing it is that MD5 is very sensitive to small changes in the input, and it sounds like you want to do something a bit "smarter."
Pretty interesting question. The fastest and easiest approach would be to calculate a CRC32 of the content byte array, but that would only work for 100% identical images. For a more intelligent comparison you would probably need some kind of fuzzy logic analysis...
I've implemented at least a trivial version of this. I transform and resize all images to a very small (fixed size) black and white thumbnail. I then compare those. It detects exact, resized, and duplicates transformed to black and white. It gets a lot of duplicates without a lot of cost.
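That approach fits in a few lines; a sketch (the thumbnail size and tolerance are tunables, not anything canonical):

import numpy as np
from PIL import Image

def tiny_gray(path, size=(16, 16)):
    # Fixed-size grayscale thumbnail as a crude fingerprint.
    img = Image.open(path).convert("L").resize(size)
    return np.asarray(img, dtype=np.int16)

def probably_duplicates(path_a, path_b, tol=10):
    # Mean per-pixel difference; catches exact, resized, and
    # colour-to-grayscale duplicates cheaply.
    return np.abs(tiny_gray(path_a) - tiny_gray(path_b)).mean() < tol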
The easiest thing to do is to do a hash (like MD5) of the image data, ignoring all other metadata. You can find many open source libraries that can decode common image formats so it's quite easy to strip metadata.
But that doesn't work when the image itself is manipulated in any way, including scaling and rotation.
To do exactly what you want, you would have to use image watermarking, but that's patented and can be expensive.
This is just an idea: possibly the low-frequency components present in the DCT of the JPEG could be used as a size-invariant identifier.
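A sketch of that idea using SciPy's DCT on a fixed-size grayscale copy (the 32x32 working size and 8x8 coefficient block are arbitrary):

import numpy as np
from scipy.fftpack import dct
from PIL import Image

def dct_fingerprint(path, keep=8):
    # Resizing to a fixed grid first makes the identifier size
    # invariant; the top-left DCT block holds the low frequencies.
    px = np.asarray(Image.open(path).convert("L").resize((32, 32)),
                    dtype=np.float64)
    coeffs = dct(dct(px, axis=0, norm="ortho"), axis=1, norm="ortho")
    return coeffs[:keep, :keep].flatten()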