Dear friends, I am currently working on a disparity algorithm that visits only a small fraction of disparity space in order to find a semi-dense disparity map. It works by growing from a small set of correspondence seeds. Before that, however, I am implementing the standard region-growing algorithm in MATLAB to understand how it works.
The first step of the baseline growing algorithm says:
Require: rectified images Il, Ir; initial correspondence seeds S; an image similarity threshold. Compute the similarity simil(s) for every seed s ∈ S.
Now I cannot understand this step. First of all, how do I calculate the initial seed points from two rectified images? Should I use the SIFT algorithm in MATLAB, or is there a better way to do it? Can anybody also give me some idea of how a region-growing-based disparity algorithm works, and whether it is better than SAD or SSD?
If you have rectified images, finding disparity is a matter of calculating costs between pixels in left and right images on the same horizontal line.
You can take a few selected points in the images (for example the ones that have high gradient or feature points coming from SIFT), set those as roots/seeds of your regions and calculate cost for a range of disparities using SAD/SSD or whatever cost function you prefer.
Then take the best disparity for a root and try it on a neighbor. If the cost for that neighbor is lower than a predefined threshold, add the neighbor to the region; otherwise move on to the next neighbor. When you cannot add any more points, the region growing is finished.
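A minimal MATLAB sketch of that loop, under some assumptions of mine (not the paper's exact algorithm): Il and Ir are rectified grayscale images, seeds is an N-by-3 array of [row, col, disparity] triples (e.g. from matched SIFT keypoints), the cost is SAD over a 5x5 window, and tau is an acceptance threshold on that cost.

    % Sketch of seed-based disparity growing. Il, Ir: rectified grayscale
    % images. seeds: N-by-3 [row, col, disparity]. tau: SAD threshold.
    function D = growDisparity(Il, Ir, seeds, tau)
        Il = double(Il); Ir = double(Ir);
        [h, w] = size(Il);
        win = 2;                        % half-size of the 5x5 match window
        D = nan(h, w);                  % semi-dense map: NaN = not assigned
        for i = 1:size(seeds, 1)        % mark the seeds themselves
            D(seeds(i, 1), seeds(i, 2)) = seeds(i, 3);
        end
        queue = seeds;                  % pixels whose neighbors we still visit
        while ~isempty(queue)
            p = queue(1, :); queue(1, :) = [];
            r = p(1); c = p(2); d = p(3);
            for nb = [r-1 c; r+1 c; r c-1; r c+1]'   % 4-connected neighbors
                nr = nb(1); nc = nb(2);
                if nr <= win || nr > h-win || nc <= win || nc > w-win ...
                        || ~isnan(D(nr, nc))
                    continue;           % out of bounds or already assigned
                end
                best = inf; bestd = d;
                for dd = d-1 : d+1      % only disparities near the parent's
                    if nc-dd <= win || nc-dd > w-win, continue; end
                    L = Il(nr-win:nr+win, nc-win:nc+win);
                    R = Ir(nr-win:nr+win, nc-dd-win:nc-dd+win);
                    cost = sum(abs(L(:) - R(:)));    % SAD cost
                    if cost < best, best = cost; bestd = dd; end
                end
                if best < tau           % similar enough: grow the region
                    D(nr, nc) = bestd;
                    queue(end+1, :) = [nr nc bestd]; %#ok<AGROW>
                end
            end
        end
    end

Trying only disparities d-1..d+1 around the parent's disparity is what keeps the search to a small fraction of disparity space.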
This is a detailed example of the process: http://arxiv.org/pdf/0812.1340.pdf
I have a greyscale image, represented by a histogram below (x and y axes are pixels, z axis is pixel intensity).
Each cluster of bars represents an object, with the local maximum fairly closely approximating the object's centroid. My goal is to find the full width at half maximum (FWHM) of each object, so I'm roughly approximating each object as a Gaussian distribution.
How can I detect each cluster individually? I understand how to mathematically calculate the FWHM, but I'm not sure how to detect each cluster based on its (roughly) Gaussian features. (E.g., in the example below I would want to detect 6 clusters. One can see a small cluster in the middle, but its amplitude is so small that I am okay with missing it.)
I appreciate any advice - and efficiency is not a major issue, so I can implement relatively expensive solutions.
To find the centers of each of these groupings you could use a type of A* search algorithm, or a similar local optimization algorithm.
It will find its way to the maximum of a grouping. The issue after that is you won't know whether you are at a local maximum (which in your scenario is likely). After your current search has topped out at the highest point and you have calculated the FWHM for that area, you could set all the nodes your A* has traversed to 0 (or mark each node as visited so it is not visited again) and start the A* algorithm again, until all nodes have been seen and all groupings found.
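If you don't need the full A* machinery, a greedy variant of the same search-and-suppress idea is easy to sketch in MATLAB. Here Z is your image as a matrix, minAmp is an assumed amplitude cutoff that lets you ignore the tiny middle cluster, and bwselect/imdilate require the Image Processing Toolbox:

    % Repeatedly grab the highest remaining peak and suppress its
    % half-maximum footprint (roughly the FWHM region of that cluster).
    function peaks = findClusters(Z, minAmp)
        Z = double(Z);
        peaks = [];                          % rows of [row, col, height]
        while true
            [h, idx] = max(Z(:));            % highest remaining point
            if h < minAmp, break; end        % nothing tall enough is left
            [r, c] = ind2sub(size(Z), idx);
            peaks(end+1, :) = [r c h];       %#ok<AGROW>
            % select the connected region around the peak above half max
            mask = bwselect(Z >= h/2, c, r);
            Z(imdilate(mask, ones(3))) = 0;  % zero out the visited region
        end
    end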
This is a follow-up question to the one below:
Second moments question
MATLAB's regionprops function estimates an ellipse from a given set of 2D points. This is done using image moments: the documentation says it uses normalized second central moments, and the formulas also follow what the Wikipedia article on image moments suggests.
Effectively, the covariance matrix of the region is calculated (in a slightly more efficient way), and then the square roots of the eigenvalues of this matrix are computed and reported as the major and minor axis lengths, with one change: they are multiplied by a factor of 4.
Why?
Essentially, covariance estimation assumes a multivariate normal distribution. However, an arbitrary image region is most likely not normally distributed; I would rather expect a factor based on the assumption that the data is uniformly distributed. So what is the justification for choosing 4?
Meanwhile I found the answer: the factor 4 yields correct results for regions with an elliptical shape. For other regions, e.g. rectangular or non-solid ones, the estimated axis lengths are incorrect, and the error varies nonlinearly with changes in the region.
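The reason the factor is exact for ellipses: for a uniform solid ellipse with semi-axis a along some direction, the variance of the coordinate along that direction is a^2/4, so the standard deviation (the square root of the eigenvalue) is a/2, and the full axis length 2a is 4 times it. A quick numeric check (the small pixel-extent correction inside regionprops makes the match approximate rather than exact):

    % Numeric check of the factor 4 on a solid ellipse with semi-axes 40, 20:
    [x, y] = meshgrid(-60:60, -60:60);
    bw = (x/40).^2 + (y/20).^2 <= 1;        % binary ellipse, full axes 80 and 40
    s  = regionprops(bw, 'MajorAxisLength', 'MinorAxisLength');
    C  = cov([x(bw) y(bw)]);                % covariance of the region's pixels
    4 * sqrt(sort(eig(C), 'descend'))'      % ~[80 40]
    [s.MajorAxisLength, s.MinorAxisLength]  % ~[80 40], up to pixel effects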
I have calculated feature vectors for all images in my dataset. I use the Euclidean distance between them, and for every query I retrieve the top 10 most similar images from the dataset. Setting a 'Threshold' value is totally new to me; please suggest some examples for selecting one.
Thanks in advance.
There are several possible answers of different degrees of difficulty:
Simple: return the 10 images with the smallest distances. If the query image is not very similar to anything in the data set, the returned images won't be very similar either, but they will still be the most similar ones available. No threshold needed.
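The simple variant is a couple of lines in MATLAB (F is an M-by-D matrix of feature vectors, q a 1-by-D query vector; both names are assumptions, and M >= 10):

    d = sqrt(sum((F - q).^2, 2));      % Euclidean distance to every image (implicit expansion, R2016b+)
    [~, order] = sort(d, 'ascend');
    top10 = order(1:10);               % indices of the 10 most similar images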
More complex: get a few people to rate pairs of images as similar or non-similar (either yes/no or on a 0-10 scale). From that you can figure out what Euclidean distance most people would call "not similar" (or would score below 5, etc.). That is your empirical threshold. (It may be different for different kinds of images, but I still think you will find a typical distance that works pretty well in most cases.)
Even more complex: cluster your images, e.g. with k-means (kNN itself is a classifier, not a clustering method). Try this for many possible numbers of clusters; measure the average cluster size, e.g. as the median of distance(feature vector, centroid of cluster) over the images in each cluster. Similarly, measure the distances between pairs of clusters. This gives you an idea of what is "close" and "not close"; ideally, though, you should use for each image the size of the cluster it is in.
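A hedged sketch of the clustering idea (kmeans is in the Statistics and Machine Learning Toolbox; F is the feature matrix as above, and the number of clusters k = 20 is an arbitrary starting guess you would vary):

    k = 20;
    [lab, C] = kmeans(F, k);                      % cluster the feature vectors
    withinDist = zeros(size(F, 1), 1);
    for i = 1:size(F, 1)                          % distance to own centroid
        withinDist(i) = norm(F(i, :) - C(lab(i), :));
    end
    threshold = median(withinDist);               % a typical "close" distance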
I am trying to implement a kNN classifier using a cross-validation approach, where I have several images of a certain character for training (e.g. 5 images) and another two for testing. I get the idea of cross-validation: simply choose the k with the lowest error value during training, and then use it with the test data to find how accurate my results are.
My question is: how do I train on the images in MATLAB to get my k value? Do I compare them and try to find mismatches, or what?!
Any help would be really appreciated.
First off, you need to define your task precisely. E.g.: given an image I in R^(M×N), we wish to classify I as an image containing faces or an image without faces.
I often work with pixel classifiers, where the task is something like: for an image I, decide whether each pixel is a face pixel or a non-face pixel.
An important part of defining the task is to form a hypothesis that can serve as the basis for training a classifier. E.g.: we believe that the distribution of pixel intensities can be used to discriminate images of faces from images not containing faces.
Then you need to select some features that describe your images. This can be done in many ways, and you should look at what other people do when they analyse the same type of images you are working with.
One widely used method in pixel classification is to use the pixel intensity values and do a multi-scale analysis of the image. The idea behind multi-scale analysis is that different structures are most evident at different levels of blurring, called scales. As an illustration, consider an image of a tree. Without blurring we notice the fine structure, such as small branches and leaves. When we blur the image we notice the trunk and major branches. This is often used as part of segmentation methods.
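As a tiny illustration (imgaussfilt requires the Image Processing Toolbox; 'tree.jpg' is a placeholder filename):

    I = im2double(imread('tree.jpg'));        % placeholder image
    sigmas = [1 4 16];                        % three scales, fine to coarse
    for i = 1:numel(sigmas)
        subplot(1, numel(sigmas), i);
        imshow(imgaussfilt(I, sigmas(i)));    % Gaussian blur at this scale
        title(sprintf('\\sigma = %d', sigmas(i)));
    end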
When you know your task and your features, you can train a classifier. If you use kNN and cross-validation to find the best k, you should split your dataset into training and test sets, and then split the training set again into train and validation sets. You then train on the reduced training set and use the validation set to decide which k is best. In the case of binary classification, e.g. face vs. non-face, the error rate is often used as the measure of performance.
Finally, you use the chosen parameters to train the classifier on the full training set and estimate its performance on the held-out test set.
A classification example: With or without milk?
As a full example, consider images of a cup of coffee taken from above, so that each shows the rim of the cup surrounding a brown disk. Further assume that all images are scaled and cropped so that the diameter of the disk and the dimensions of the image are the same for every image. To simplify the task, we convert the color images to grayscale and scale the pixel intensities to the range [0,1].
We want to train a classifier so it can distinguish coffee with milk from coffee without milk. From inspection of histograms of some of the coffee images, we see that each image has two "bumps" in the histogram that are clearly separated. We believe that these bumps correspond to foreground (coffee) and background. Now we make the hypothesis that the average intensity of the foreground can be used to distinguish between coffee+milk/coffee.
To find the foreground pixels, we observe that because the foreground/background ratio is the same in every image (by design), we can simply find the intensity threshold that yields that ratio for each image. Then we calculate the average intensity of the foreground pixels and use this value as the feature for each image.
If we have N images that we have manually labeled, we split them into training and test sets. We then calculate the average foreground intensity for each image in the training set, giving us a set of (average foreground intensity, label) pairs. We use kNN, where an image is assigned the majority class of its k closest training images, and we measure distance as the absolute difference in average foreground intensity.
We search for the optimal k with cross-validation, here a simple holdout split of the training data. We test k = {1, 3, 5} and select the k that gives the smallest prediction error on the validation set.
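A minimal MATLAB sketch of this example, assuming x is an N-by-1 vector of average foreground intensities and y an N-by-1 vector of 0/1 labels (milk / no milk), and skipping the outer test split for brevity:

    n = numel(x);
    idx = randperm(n);                              % random holdout split
    nTr = round(0.5 * n);
    tr = idx(1:nTr); va = idx(nTr+1:end);
    bestK = NaN; bestErr = inf;
    for k = [1 3 5]
        pred = zeros(numel(va), 1);
        for i = 1:numel(va)
            % k nearest training images by |intensity difference|
            [~, nn] = mink(abs(x(tr) - x(va(i))), k);   % mink needs R2017b+
            pred(i) = mode(y(tr(nn)));                  % majority vote
        end
        err = mean(pred ~= y(va));
        if err < bestErr, bestErr = err; bestK = k; end
    end
    fprintf('best k = %d (validation error %.2f)\n', bestK, bestErr);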
I have a dataset consisting of a large collection of points in three-dimensional Euclidean space. In this collection, I am trying to find the point that is nearest to the region with the highest density of points.
So my problem consists of two steps:
1: Determine where the density of the point distribution is at its highest
2: Determine which point is nearest to the point found in 1
Point 2 I can manage, but I'm not sure how to solve point 1. I know there are a lot of functions for density estimation in MATLAB, but I'm not sure which one would be the most suitable, or the most straightforward to use.
Does anyone know?
My command of statistics is a little bit rusty, but as far as I can tell, this type of problem calls for multivariate analysis. Someone suggested I use multivariate kernel density estimation, but I'm not really sure whether that's the best solution.
Density is a measure of mass per unit volume. On the assumption that your points all have the same mass then you are, I suppose, trying to measure the number of points per unit volume. So one approach is to divide your subset of Euclidean space into lots of little unit volumes (let's call them voxels like everyone does) and count how many points there are in each one. The voxel with the most points is where the density of points is at its highest. This is, of course, numerical integration of a sort. If your points were distributed according to some analytic function (and I guess they are not) you could solve the problem with pencil and paper.
You might make this approach as sophisticated as you like, perhaps initially dividing your space into 2 x 2 x 2 voxels, then choosing the voxel with most points and sub-dividing that in turn until your criteria are satisfied.
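A minimal sketch of the voxel-counting approach (P is an N-by-3 matrix of point coordinates; nBins sets the voxel resolution and is an assumed parameter):

    nBins = 20;
    edges = cell(1, 3); sub = zeros(size(P));
    for d = 1:3                                   % bin each coordinate
        edges{d} = linspace(min(P(:, d)), max(P(:, d)), nBins + 1);
        sub(:, d) = discretize(P(:, d), edges{d});
    end
    counts = accumarray(sub, 1, [nBins nBins nBins]);  % points per voxel
    [~, densest] = max(counts(:));
    [i, j, k] = ind2sub(size(counts), densest);
    centre = [mean(edges{1}(i:i+1)), ...               % centre of densest voxel
              mean(edges{2}(j:j+1)), mean(edges{3}(k:k+1))];
    [~, nearest] = min(sum((P - centre).^2, 2));       % your point 2: closest point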
I hope this will get you started on your point 1; you seem to be OK with point 2 so I'll stop now.
EDIT
It looks as if triplequad might be what you are looking for.