Feature matching on multiple images - MATLAB

I am trying to implement feature matching on multiple images. The idea is to track some features in an image data set. I am using mexopencv in MATLAB, and the basics of the algorithm are:
1. Feature Detection using SIFT or SURF
2. Feature Description using SIFT or SURF
3. Feature matching using Flann matcher or Brute Force
4. Filtering matches using RANSAC
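For context, those four steps look roughly like this; I am sketching them here with the Computer Vision System Toolbox rather than mexopencv (the calls map one-to-one onto the steps, and the file names are placeholders):

    I1 = rgb2gray(imread('image1.jpg'));   % placeholder file names
    I2 = rgb2gray(imread('image2.jpg'));

    % 1-2. Detect and describe features (SURF in this sketch)
    pts1 = detectSURFFeatures(I1);
    pts2 = detectSURFFeatures(I2);
    [f1, vpts1] = extractFeatures(I1, pts1);
    [f2, vpts2] = extractFeatures(I2, pts2);

    % 3. Match the descriptors
    idxPairs = matchFeatures(f1, f2);
    m1 = vpts1(idxPairs(:, 1));
    m2 = vpts2(idxPairs(:, 2));

    % 4. Filter the matches with RANSAC by fitting one geometric model
    [tform, inlier2, inlier1] = estimateGeometricTransform(m2, m1, 'affine');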
My problem is the following:
Using a single object in a scene, all of the tracked features are on that object. However, when I add another object to the scene, the tracked features exist only on the new object and there are no features on the first object. Is there an explanation for why this is happening?
Image 1
Image 2
P.S.: The features shown on each image are the ones that are tracked across the whole data set (8 images).

I think I found the reason for finding features only on one object. As I mentioned in a comment, RANSAC tries to find the single best model when filtering the matches. Since there is a change in depth between the two objects, there are essentially two models to be fitted. I searched for multi-model fitting and found Sequential RANSAC and MultiRANSAC, which address exactly this. I tried sequential RANSAC with the number of models set to 2 and got a nice result.
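In case it helps someone, here is a rough sketch of the sequential variant using the Computer Vision System Toolbox (m1 and m2 are matched point sets like the ones in the sketch above; the affine model and the two-model count are just my choices): fit a model with RANSAC, keep its inliers, remove them, and fit again on what is left.

    numModels  = 2;          % number of models to fit sequentially
    remaining1 = m1;         % matched points in image 1
    remaining2 = m2;         % corresponding matched points in image 2
    inliers1 = cell(numModels, 1);
    inliers2 = cell(numModels, 1);
    tforms   = cell(numModels, 1);
    for k = 1:numModels
        % estimateGeometricTransform runs RANSAC (MSAC) internally
        [tforms{k}, in2, in1] = estimateGeometricTransform(remaining2, remaining1, 'affine');
        inliers1{k} = in1;
        inliers2{k} = in2;
        % drop the inliers of this model before fitting the next one
        keep = ~ismember(remaining1.Location, in1.Location, 'rows');
        remaining1 = remaining1(keep);
        remaining2 = remaining2(keep);
    end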

Related

Face detection (Viola-Jones) in MATLAB

So I found the cascade object detector in MATLAB that uses the Viola-Jones algorithm to detect faces. Very easy to use, and it works great!
But I have a few questions.
The Viola-Jones method has four stages:
Haar Feature Selection
Creating an Integral Image
Adaboost Training
Cascading Classifiers
In MATLAB I can use FrontalFace(CART) and FrontalFace(LBP). These are trained cascade classification models, so they would be part of stage 4, right?
But what is the difference between stage 1 and stage 4 if I use FrontalFace(CART)? Both use Haar features, it says.
Can we say that FrontalFace(CART) and FrontalFace(LBP) are two different ways of detecting faces? Can I compare those two against each other to see which one is better?
Or should I find another method to compare against Viola-Jones?
Are there other face detection methods that are easy to implement in MATLAB?
I found some on the internet (using skin color, etc.), but MATLAB is quite new to me, so those codes were a bit too complicated for me.
The main difference is that FrontalFace(CART) and FrontalFace(LBP) have been trained on different data sets. Also, from the name, I am guessing that FrontalFace(LBP) uses LBP features instead of Haar.
The original Viola-Jones algorithm used Haar features. However, it has since been extended to use other types of features. vision.CascadeObjectDetector supports Haar, LBP, and HOG features.
To compare which one is better, you would need some ground truth images, i.e. images with faces labeled by hand. I am sure you can find a benchmark data set on the web. Alternatively, you can label your own images using the trainingImageLabeler app.
Also, if you are not happy with the accuracy of the classifiers that come with vision.CascadeObjectDetector, you can train your own using the trainCascadeObjectDetector function.
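As a quick sketch, you can run both built-in models on the same image and compare the detections visually (the file name is a placeholder):

    I = imread('testFace.jpg');                                      % placeholder image

    detectorCART = vision.CascadeObjectDetector('FrontalFaceCART'); % Haar features, CART weak learners
    detectorLBP  = vision.CascadeObjectDetector('FrontalFaceLBP');  % LBP features

    bboxCART = step(detectorCART, I);
    bboxLBP  = step(detectorLBP, I);

    % overlay both sets of bounding boxes for visual inspection
    out = insertObjectAnnotation(I, 'rectangle', bboxCART, 'CART');
    out = insertObjectAnnotation(out, 'rectangle', bboxLBP, 'LBP');
    imshow(out);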

Multiscale search for HOG+SVM in Matlab

First of all, this is my first question here, so I hope I can explain it in a clear way.
My goal is to detect different classes of traffic signs in images. For that purpose I have trained binary SVMs following these steps:
First I got a database of cropped traffic signs like the one in the link below. I considered different classes (prohibition, danger, etc.) and negative images. All of them were scaled to 40x40 pixels.
http://i.imgur.com/Hm9YyZT.jpg
I trained linear SVM models for each class (one-vs-all), using HOG as the feature. Each image is described by a 1728-dimensional feature vector (I concatenate the feature vectors of the three image planes). I did cross-validation to set the parameter C, and tested on previously unseen 40x40 images, getting very accurate results (F1 score over 0.9 for all classes). I used libsvm for training and testing.
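For reference, the training step looks roughly like this (a simplified sketch using extractHOGFeatures from the Computer Vision System Toolbox, though your HOG implementation may differ; 'files' and 'fileLabels' are placeholders, and with the default 8x8 cells a 40x40 crop gives 576 values per plane, i.e. 1728 in total):

    features = zeros(numel(files), 1728);
    labels   = zeros(numel(files), 1);         % +1 for the target class, -1 otherwise
    for i = 1:numel(files)
        I = imread(files{i});                  % 40x40 RGB crop
        features(i, :) = [extractHOGFeatures(I(:, :, 1)), ...
                          extractHOGFeatures(I(:, :, 2)), ...
                          extractHOGFeatures(I(:, :, 3))];
        labels(i) = fileLabels(i);
    end
    model = svmtrain(labels, features, '-t 0 -c 1');   % linear SVM; C set by cross-validation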
Now I want to detect signs in full road images by sliding a window over different image scales. The problem I'm facing is that I couldn't find any function that does this for me (like detectMultiScale in OpenCV), and my own solution is very slow and rudimentary (I'm just doing a triple for loop, and for each scale I crop consecutive, overlapping 40x40 windows, obtain HOG features and apply svmpredict to each one).
Can someone give me a clue about a faster way to do it? I also thought about computing the HOG feature vector of the whole input image and then reordering that vector into a matrix where each row holds the features corresponding to one 40x40 window, but I couldn't find a straightforward way of doing it.
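For reference, my current (rudimentary) approach is roughly the following; the stride, the scale list, and the hogFor40x40 helper are just placeholders for what I described above:

    winSize = 40;
    step    = 10;                               % window stride in pixels
    scales  = 1.0 : -0.2 : 0.4;                 % image pyramid scales
    detections = [];
    for s = scales
        Is = imresize(I, s);                    % I is the full road image
        [h, w, ~] = size(Is);
        for r = 1 : step : h - winSize + 1
            for c = 1 : step : w - winSize + 1
                win  = Is(r:r+winSize-1, c:c+winSize-1, :);
                feat = hogFor40x40(win);                       % same HOG extraction as in training
                [pred, ~, score] = svmpredict(0, feat, model); % 0 is a dummy label
                if pred == 1
                    % map the window back to coordinates of the original image
                    detections(end+1, :) = [c/s, r/s, winSize/s, winSize/s, score]; %#ok<AGROW>
                end
            end
        end
    end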
Thanks,
I would suggest using SURF feature detection; however, I don't know if this would also be too slow for your needs.
See http://morf.lv/modules.php?name=tutorials&lasit=2 for more information on how to implement it and whether it is a viable solution for you.

RANSAC using SIFT in Computer Vision

Currently, I am doing a computer vision project. I used the SIFT implementation for MATLAB from http://www.vlfeat.org/overview/sift.html. However, it gives me two matrices, one with matches and another with distances. I don't know how to convert this information to pixel coordinates, because in the next step I have to use RANSAC to get the best matches. Would somebody please help me?
You have "matches", i.e. tentative correspondences, which means "feature with index I1 possibly corresponds to feature with index I2". So go the the list of the detected SIFT features and take coordinates of the I1 feature in image 1 and I2 in image 2.
The Computer Vision System Toolbox for MATLAB has various feature detectors and extractors, a function called matchFeatures to match the descriptors, and a function estimateGeometricTransform that uses RANSAC to do exactly what you need.
Please check out the following examples: Find Image Rotation and Scale Using Automated Feature Matching and Detect Objects in a Cluttered Scene Using Point Feature Matching

Detecting stamp (seal) imprints on digital images with SIFT

I am working on an application that should determine whether an input image contains a stamp imprint and return its location. For RGB images I am using color segmentation and doing verification (with various shape factors); for grayscale images I thought that SIFT + verification would do the job, but using SIFT only finds those stamps (in the input image) that I already have in my database.
In the ideal case it works really well, as shown in the image below.
Fig. 1.
http://i.stack.imgur.com/JHkUl.png
The problem occurs when the input image contains a stamp that does not exist in the database. The first thing I did was check whether there would be any matching keypoints if I compare a similar stamp to the one in the input image. In most cases there is not a single matching keypoint, and if there are any, they refer to other parts of the input image rather than the stamp, as shown in Fig. 2:
Fig. 2.
http://i.stack.imgur.com/coA4l.png
I also tried to find a match between the input image and a plain circle image, as the stamps are circular, but the circle image has very few keypoints, if any.
So I wonder if there is any different approach that will make SIFT a bit more useful in this exact case? I thought about creating a matrix with all descriptors and keypoints from my database and then looking for the nearest Euclidean distance between the input image and that matrix, but it probably won't work, as there are a lot of (unwanted) matching keypoints across the database (see Fig. 2).
I'm working with MATLAB and tried both the VLFeat and D. Lowe SIFT implementations.
Edit:
So I found a way to force SIFT to compute descriptors for user-defined points on an image. My test image contained a circle; the descriptors were computed and matched against the input images, including the ones in Fig. 1 and 2. This process was repeated for scales from 0 to 10. Unfortunately it didn't help either.
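For completeness, this is roughly how I computed descriptors at user-defined points with VLFeat's 'Frames' option (the file name and frame values below are just placeholders):

    I  = single(rgb2gray(imread('circle.png')));   % synthetic circle test image
    fc = [100; 100; 10; 0];                        % one frame: [x; y; scale; orientation]
    [f, d] = vl_sift(I, 'Frames', fc);
    % d holds the 128-dimensional SIFT descriptor computed at that frame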
This is only a first hint and not a full answer to the SIFT questions.
My impression is that detecting a circle by matching it against an image of a circle via SIFT is not the best approach, especially if the circle you want to detect has some unknown texture inside.
The textbook algorithm for circle detection would be the Hough transform, which is mostly used for line detection but works for any kind of shape that can be described by a small number of parameters (colleagues tell me things get nasty above 3, but a circle just has X, Y and r). There are several implementations on the MATLAB File Exchange; the link is just one example. Hough circle detection requires you to put an upper bound on the radii you want to detect, but this seems fine for your application.
From the examples you provided it looks like you should get quite far if you can detect circles reliably.
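If you have the Image Processing Toolbox, imfindcircles already wraps a circular Hough transform, so a first experiment could look roughly like this (the radius range and sensitivity are guesses you would need to tune):

    I = rgb2gray(imread('letter.png'));                % placeholder scanned document
    [centers, radii] = imfindcircles(I, [20 80], ...   % radius range in pixels
                                     'ObjectPolarity', 'dark', ...
                                     'Sensitivity', 0.9);
    imshow(I); hold on;
    viscircles(centers, radii);                        % draw the detected circles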
Actually, I do not think SIFT will solve this problem. I've been playing around with SIFT for quite some time and my conclusion is that it's really great for identifying identical patterns, but not for similar patterns.
Just have a look at the construction of the SIFT feature vector: the descriptor is composed of several histograms of gradients(!). If the patterns in your database have blob-like structures very similar to those in the stamps, then you might have a chance. But if this does not hold, then I guess you will not be very lucky.
From my point of view, you have more or less solved the problem of finding identical objects (stamps) and now want to extend this to finding similar objects. This sounds like the same thing, but in my past research I found these problems to be related but not identical.
Do you have any runtime constraints in your application? There might be other approaches but in this case, more input about possible constraints might be useful.
Update regarding constraints:
So your next task might be to detect the unknown stamps, right?
This sounds like a classification task.
In your case I would first try to find a descriptor/representation (or SVM) that classifies images into stamp/no-stamp. In order to evaluate this, set up a database with ground truth and a reasonable number of "unknown" stamps and other images, like random snapshots from the letters NOT containing stamps. This will be your test set.
Then try some descriptors/representations to calculate the distance/similarity between your images and classify your test set into the classes STAMP / NO-STAMP. When you have found a descriptor/distance measure (or SVM) that performs well in classifying, you could perform a sliding-window approach on a letter to find a stamp. The sliding-window approach is certainly not a very fast method, but it is a very easy one.
Once you have reached this point, you can tune the detection, for example based on interest point detectors... but one step at a time.

Ideas for extracting features of an object using keypoints of image

I would appreciate your help in creating a feature vector of a simple object using keypoints. For now I use the ETH-80 dataset; the objects have an almost blue background and the pictures are taken from different views. Like this:
After creating a feature vector, I want to train a neural network with this vector and use that neural network to recognize an input image of an object. I don't want to make it complex; the input images will be as simple as the training images.
I asked similar questions before, and someone suggested using the average value of a 20x20 neighborhood around each keypoint. I tried it, but it does not seem to work with the ETH-80 images because of the different views. That is why I am asking another question.
SURF or SIFT. Look for interest point detectors. A MATLAB SIFT implementation is freely available.
Update: Object Recognition from Local Scale-Invariant Features
SIFT and SURF features consist of two parts, the detector and the descriptor. The detector finds a point in some n-dimensional space (4D for SIFT); the descriptor is used to robustly describe the surroundings of those points. The latter is increasingly used for image categorization and identification in what is commonly known as the "bag of words" or "visual words" approach. In its most simple form, one collects all the descriptors from all images and clusters them, for example using k-means. Every original image then has descriptors that contribute to a number of clusters. The centroids of these clusters, i.e. the visual words, can be used as a new descriptor for the image. The VLFeat website contains a nice demo of this approach, classifying the Caltech 101 dataset:
http://www.vlfeat.org/applications/apps.html#apps.caltech-101
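A bare-bones version of that pipeline with VLFeat might look like this (trainImages and numWords are assumptions): collect SIFT descriptors from all training images, cluster them with k-means, and represent each image as a histogram of visual-word assignments, which you can then feed to your neural network or another classifier.

    numWords = 200;                                   % vocabulary size (assumption)
    allDesc  = [];
    descPerImage = cell(numel(trainImages), 1);
    for i = 1:numel(trainImages)
        I = single(rgb2gray(imread(trainImages{i})));
        [~, d] = vl_sift(I);
        descPerImage{i} = single(d);
        allDesc = [allDesc, single(d)]; %#ok<AGROW>
    end

    vocab = vl_kmeans(allDesc, numWords);             % 128 x numWords visual vocabulary

    histograms = zeros(numel(trainImages), numWords);
    for i = 1:numel(trainImages)
        d = descPerImage{i};
        [~, idx] = min(vl_alldist2(vocab, d), [], 1); % nearest visual word per descriptor
        histograms(i, :) = histc(idx, 1:numWords) / numel(idx);
    end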