Removing outliers from a grey-scale image - matlab

Question
I have an image sequence representing depth information that I'd like to clean.
There are some outliers (values with intensity below 25, on a 0-255 range) which I would like to fill with an acceptable alternative (an average value local to that specific area could be a good guess).
Can someone see a simple way to do this? I've tried replacing the undesired values with NaN and applying a median filter (filter size of 10), but that made things worse; replacing them with a global average value instead improved the result.
P.S. Someone has already suggested that I use a fast wavelet reconstruction, but I would not really know where to start...
Implemented solution (so far)
The solution I implemented (before reading about inpaint_nans, suggested by tmpearce) is:
1. duplicate the original image;
2. fill the invalid pixels with a global average value;
3. blur it with a circular disk of radius 10;
4. replace the invalid values in the original image with the result of step 3;
5. run a median filter of size 10.
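img2 = img;
img2(img < .005) = mean(img(:));       % fill invalid pixels with the global mean
H = fspecial('disk',10);               % circular averaging kernel of radius 10
img3 = imfilter(img2,H,'symmetric');   % blurred copy
img4 = img;
img4(img < .3) = img3(img < .3);       % note: this threshold differs from the .005 used above
filterSize = 10;
padopt = {'zeros','indexed','symmetric'};
p = 3;                                 % assumption: p was undefined in the original snippet; 3 selects 'symmetric'
IMG = medfilt2(img4, [1 1]*filterSize, padopt{p});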

I recommend the inpaint_nans contribution from the MATLAB File Exchange - start as you've already done by replacing outliers with NaN and use the link to go from there.
From the description of the function:
Interpolate NaN elements in a 2-d array using non-NaN elements. Can
also extrapolate, as it does not use a triangulation of the data.
Inpaint_nans offers several different approaches to the interpolation,
which give tradeoffs in accuracy versus speed and memory required. All
the methods currently found in inpaint_nans are based on sparse linear
algebra and PDE discretizations. In essence, a PDE is solved to be
consistent with the information supplied.
Hooray for reusable code!
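For comparison, here is a minimal sketch of the "local average" idea from the question, using only base MATLAB. To be clear, this is not what inpaint_nans does (it solves a PDE); it is a much cruder normalized-convolution fill. The tiny synthetic image and the 5x5 window size are assumptions made for illustration; only the threshold of 25 comes from the question.

```matlab
% Synthetic depth image with two outliers (stand-in for real data).
img = 100 * ones(9, 9);
img(5, 5) = 3;
img(2, 7) = 10;

valid = img >= 25;                 % pixels we trust (threshold from the question)
kernel = ones(5);                  % 5x5 neighbourhood (arbitrary choice)

% Sum and count of valid pixels in each neighbourhood.
sumValid = conv2(img .* valid, kernel, 'same');
cntValid = conv2(double(valid), kernel, 'same');

% Local mean over valid neighbours only; guard against division by zero.
localMean = sumValid ./ max(cntValid, 1);

% Replace only the invalid pixels that have at least one valid neighbour.
filled = img;
fillable = ~valid & cntValid > 0;
filled(fillable) = localMean(fillable);
```

Because invalid pixels are excluded from both the sum and the count, the outliers do not drag the local average down, which is the problem with running a plain blur or median over the raw data.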

Use a function called roifill. You need to mess with it a little bit. I had to use imdilate because it interpolates from the boundary.
Code:
testimage = imread('BAPz5.png');
testimage = double(rgb2gray(testimage));
testimage_filt = roifill(testimage,imdilate(testimage<100,true(4)));
figure(1);
subplot(1,2,1);
imshow(testimage,[]);
subplot(1,2,2);
imshow(testimage_filt,[]);

The post is already answered, but just for the record: in [1], the author starts from a basic principle of natural shapes, namely that objects follow second-order smoothness, and suggests an in-painting method that minimizes curvature in a least-squares sense. The author also offers code. Good luck.
[1] A Category-Level 3-D Object Database: Putting the Kinect to Work (ICCV)

Related

Clustering an image using Gaussian mixture models

I want to use GMM (Gaussian mixture models) for clustering a binary image, and I also want to plot the cluster centroids on the binary image itself.
I am using this as my reference:
http://in.mathworks.com/help/stats/gaussian-mixture-models.html
This is my initial code:
I=im2double(imread('sil10001.pbm'));
K = I(:);
mu=mean(K);
sigma=std(K);
P=normpdf(K, mu, sigma);
Z = norminv(P,mu,sigma);
X = mvnrnd(mu,sigma,1110);
X=reshape(X,111,10);
scatter(X(:,1),X(:,2),10,'ko');
options = statset('Display','final');
gm = fitgmdist(X,2,'Options',options);
idx = cluster(gm,X);
cluster1 = (idx == 1);
cluster2 = (idx == 2);
scatter(X(cluster1,1),X(cluster1,2),10,'r+');
hold on
scatter(X(cluster2,1),X(cluster2,2),10,'bo');
hold off
legend('Cluster 1','Cluster 2','Location','NW')
P = posterior(gm,X);
scatter(X(cluster1,1),X(cluster1,2),10,P(cluster1,1),'+')
hold on
scatter(X(cluster2,1),X(cluster2,2),10,P(cluster2,1),'o')
hold off
legend('Cluster 1','Cluster 2','Location','NW')
clrmap = jet(80); colormap(clrmap(9:72,:))
ylabel(colorbar,'Component 1 Posterior Probability')
But the problem is that I am unable to plot the cluster centroids obtained from the GMM on the original binary image. How do I do this?
**Now suppose I have 10 such images in a sequence and I want to store their mean positions in two cell arrays. How do I do that? This is the code for my new question:**
images=load('gait2go.mat');%load the matrix file
for i=1:10
I{i}=images.result{i};
I{i}=im2double(I{i});
%determine 'white' pixels, size of image can be [M N], [M N 3] or [M N 4]
Idims=size(I{i});
whites=true(Idims(1),Idims(2));
df=I{i};
%we add up the various color channels
for colori=1:size(df,3)
whites=whites & df(:,:,colori)>0.5;
end
%choose indices of 'white' pixels as coordinates of data
[datax datay]=find(whites);
%cluster data into 10 clumps
K = 10; % number of mixtures/clusters
cInd = kmeans([datax datay], K, 'EmptyAction','singleton',...
'maxiter',1000,'start','cluster');
%get clusterwise means
meanx=zeros(K,1);
meany=zeros(K,1);
for i=1:K
meanx(i)=mean(datax(cInd==i));
meany(i)=mean(datay(cInd==i));
end
xc{i}=meanx(i);%cell array containing the position of the mean for the 10 images
xb{i}=meany(i);
figure;
gscatter(datay,-datax,cInd); %funky coordinates for plotting according to image
axis equal;
hold on;
scatter(meany,-meanx,20,'+'); %same funky coordinates
end
I am able to get the 10 images segmented, but the values of the means are not stored in the cell arrays xc and xb. They only store [] in place of the mean values.
I decided to post an answer to your question (where your question was determined by a maximum-likelihood guess:P), but I wrote an extensive introduction. Please read carefully, as I think you have difficulties understanding the methods you want to use, and you have difficulties understanding why others can't help you with your usual approach of asking questions. There are several problems with your question, both code-related and conceptual. Let's start with the latter.
The problem with the problem
You say that you want to cluster your image with Gaussian mixture modelling. While I'm generally not familiar with clustering, after a look through your reference and the wonderful SO answer you cited elsewhere (and a quick 101 from @rayryeng) I think you are on the wrong track altogether.
Gaussian mixture modelling, as its name suggests, models your data set with a mixture of Gaussian (i.e. normal) distributions. The reason for the popularity of this method is that when you do measurements of all sorts of quantities, in many cases you will find that your data is mostly distributed like a normal distribution (which is actually the reason why it's called normal). The reason behind this is the central limit theorem, which implies that the sum of reasonably independent random variables tends to be normal in many cases.
Now, clustering, on the other hand, simply means separating your data set into disjoint smaller bunches based on some criteria. The main criterion is usually (some kind of) distance, so you want to find "close lumps of data" in your larger data set. You usually need to cluster your data before performing a GMM, because it's already hard enough to find the Gaussians underlying your data without having to guess the clusters too. I'm not familiar enough with the procedures involved to tell how well GMM algorithms can work if you just let them work on your raw data (but I expect that many implementations start with a clustering step anyway).
To get closer to your question: I guess you want to do some kind of image recognition. Looking at the picture, you want to get more strongly correlated lumps. This is clustering. If you look at a picture of a zoo, you'll see, say, an elephant and a snake. Both have their distinct shapes, and they are well separated from one another. If you cluster your image (and the snake is not riding the elephant, neither did it eat it), you'll find two lumps: one lump elephant-shaped, and one lump snake-shaped. Now, it wouldn't make sense to use GMM on these data sets: elephants, and especially snakes, are not shaped like multivariate Gaussian distributions. But you don't need this in the first place, if you just want to know where the distinct animals are located in your picture.
Still staying with the example, you should make sure that you cluster your data into an appropriate number of subsets. If you try to cluster your zoo picture into 3 clusters, you might get a second, spurious snake: the nose of the elephant. With an increasing number of clusters your partitioning might make less and less sense.
Your approach
Your code doesn't give you anything reasonable, and there's a very good reason for that: it doesn't make sense from the start. Look at the beginning:
I=im2double(imread('sil10001.pbm'));
K = I(:);
mu=mean(K);
sigma=std(K);
X = mvnrnd(mu,sigma,1110);
X=reshape(X,111,10);
You read your binary image, convert it to double, then stretch it out into a vector and compute the mean and standard deviation of that vector. You basically smear your entire image into 2 values: an average intensity and a deviation. And THEN you generate 111*10 normally distributed points with these parameters, and try to do GMM on the first two sets of 111. Both of these are independently normal with the same parameters. So you probably get two overlapping Gaussians around the same mean with the same deviation.
I think the examples you found online confused you. When you do GMM, you already have your data, so no pseudo-normal numbers should be involved. But when people post examples, they also try to provide reproducible inputs (well, some of them do, nudge nudge wink wink). A simple method for this is to generate a union of simple Gaussians, which can then be fed into GMM.
So, my point is, that you don't have to generate random numbers, but have to use the image data itself as input to your procedure. And you probably just want to cluster your image, instead of actually using GMM to draw potatoes over your cluster, since you want to cluster body parts in an image about a human. Most body parts are not shaped like multivariate Gaussians (with a few distinct exceptions for men and women).
What I think you should do
If you really want to cluster your image, like in the figure you added to your question, then you should use a method like k-means. But then again, you already have a program that does that, don't you? So I don't really think I can answer the question saying "How can I cluster my image with GMM?". Instead, here's an answer to "How can I cluster my image?" with k-means, but at least there will be a piece of code here.
%set infile to what your image file will be
infile='sil10001.pbm';
%read file
I=im2double(imread(infile));
%determine 'white' pixels, size of image can be [M N], [M N 3] or [M N 4]
Idims=size(I);
whites=true(Idims(1),Idims(2));
%we add up the various color channels
for colori=1:size(I,3) %size(I,3) is 1 for a grayscale [M N] image, where Idims(3) would not exist
whites=whites & I(:,:,colori)>0.5;
end
%choose indices of 'white' pixels as coordinates of data
[datax datay]=find(whites);
%cluster data into 10 clumps
K = 10; % number of mixtures/clusters
cInd = kmeans([datax datay], K, 'EmptyAction','singleton',...
'maxiter',1000,'start','cluster');
%get clusterwise means
meanx=zeros(K,1);
meany=zeros(K,1);
for i=1:K
meanx(i)=mean(datax(cInd==i));
meany(i)=mean(datay(cInd==i));
end
figure;
gscatter(datay,-datax,cInd); %funky coordinates for plotting according to image
axis equal;
hold on;
scatter(meany,-meanx,20,'ko'); %same funky coordinates
Here's what this does. It first reads your image as double, like your code did. Then it tries to determine "white" pixels by checking that each color channel (of which there can be 1, 3 or 4) is brighter than 0.5. The input data points for the clustering are then the x and y "coordinates" (i.e. indices) of your white pixels.
Next it does the clustering via kmeans. This part of the code is loosely based on the already cited answer of Amro. I had to set a large maximal number of iterations, as the problem is ill-posed in the sense that there aren't 10 clear clusters in the picture. Then we compute the mean for each cluster, and plot the clusters with gscatter, and the means with scatter. Note that in order to have the picture facing in the right directions in a scatter plot you have to shift around the input coordinates. Alternatively you could define datax and datay correspondingly at the beginning.
And here's my output, run with the already processed figure you provided in your question:
I believe you made a simple mistake in the plot, and that's why you see just a straight line: you are plotting only the x values.
In my opinion, the second argument in the scatter command should be X(cluster1,2) or X(cluster2,2), depending on which scatter command is being used in the code.
The code can be made simpler:
%read file
I=im2double(imread('sil10340.pbm'));
%choose indices of 'white' pixels as coordinates of data
[datax datay]=find(I);
%cluster data into 10 clumps
K = 10; % number of mixtures/clusters
[cInd, c] = kmeans([datax datay], K, 'EmptyAction','singleton',...
'maxiter',1000,'start','cluster');
figure;
gscatter(datay,-datax,cInd); %funky coordinates for plotting according to image
axis equal;
hold on;
scatter(c(:,2),-c(:,1),20,'ko'); %same funky coordinates
I don't think there is any need for the loop, as c itself is returned as a 10x2 double array which contains the positions of the means.
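As for the follow-up about storing the means of several images: my reading of the loop in the question is that the inner `for i=1:K` reuses the outer loop variable `i`, so after the inner loop `i` equals K and only `xc{K}` and `xb{K}` get written (with a single scalar each), leaving the other cells as []. Below is a minimal sketch that avoids this by using a separate loop variable and the centroid output of kmeans. The synthetic two-blob images, the variable names and K = 2 are assumptions for illustration; kmeans needs the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                                   % reproducible k-means initialisation
nImages = 3;                              % stand-in for the 10 images in the question
K = 2;                                    % number of clusters per image

% Synthetic binary images, each with two well-separated white blobs.
images = cell(1, nImages);
for n = 1:nImages
    bw = false(40, 40);
    bw(5:10, 5:10) = true;
    bw(28:34, 25:32) = true;
    images{n} = bw;
end

xc = cell(1, nImages);                    % row coordinates of the K means, per image
xb = cell(1, nImages);                    % column coordinates of the K means, per image
for n = 1:nImages                         % n indexes images; nothing inside reuses it
    [datax, datay] = find(images{n});
    [~, c] = kmeans([datax, datay], K, 'EmptyAction', 'singleton', ...
                    'MaxIter', 1000, 'Replicates', 5);
    xc{n} = c(:, 1);                      % store all K means, not a single element
    xb{n} = c(:, 2);
end
```

Each cell then holds a K-by-1 vector, so `xc{3}(2)` is the row coordinate of the second cluster mean in the third image.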

Image Parameters (Standard Deviation, Mean and Entropy) of an RGB Image

I couldn't find an answer for RGB images.
How can one get the SD, mean and entropy of an RGB image using MATLAB?
From http://airccse.org/journal/ijdms/papers/4612ijdms05.pdf Table 3, it seems the author got a single value for each, so did he take the average of the RGB values?
I'd really appreciate any help.
After reading the paper, because you are dealing with colour images, you have three channels of information to access. This means that you could alter one of the channels for a colour image and it could still affect the information it's trying to portray. The author wasn't very clear on how they were obtaining just a single value to represent the overall mean and standard deviation. Quite frankly, because this paper was published in a no-name journal, I'm not surprised how they managed to get away with it. If this was attempted to be published in more well known journals (IEEE, ACM, etc.), this would probably be rejected outright due to that very ambiguity.
As I interpret this procedure, averaging all three channels doesn't make sense because you want to capture the differences over all channels. Averaging will smear that information and those differences get lost. Practically speaking, if one channel changes its intensity by 1 and you then average the three channels together, the change in the reported average would be so small that it probably would not register as a meaningful difference.
In my opinion, what you should perhaps do is treat the entire RGB image as a 1D signal, then perform the mean, standard deviation and entropy of that image. As such, given an RGB image stored in image_rgb, you can unroll the entire image into a 1D array like so:
image_1D = double(image_rgb(:));
The double casting is important because you want to maintain floating point precision when calculating the mean and standard deviation. The images will probably be of an unsigned integer type, and so this casting must be done to maintain floating point precision. If you don't do this, you may have calculations that get saturated or clamped beyond the limits of that data type and you won't get the right answer. As such, you can calculate the mean, standard deviation and entropy like so:
m = mean(image_1D);
s = std(image_1D);
e = entropy(image_rgb(:)); % pass the original integer-typed data, not the double copy
entropy is a function in MATLAB's Image Processing Toolbox that calculates the entropy of images. Note that it is called on the original data rather than on image_1D: entropy converts non-logical input to uint8 internally, which assumes that double data lies in [0, 1], so a double vector holding values in the 0-255 range would be saturated. As noted by @CitizenInsane in his answer, entropy unrolls a grayscale image into a 1D vector and applies the Shannon definition of entropy to this 1D vector. By the same token, you can do the same thing with an RGB image: it is simply unrolled into one long 1D vector.
I have no idea how the author actually did it. But what you could do is treat the image as a 1D array of length W*H*3 and then simply calculate the mean and standard deviation.
I don't know if Table 3 was obtained in the same way, but at least looking at the entropy routine in MATLAB's Image Processing Toolbox, the RGB values are vectorized into a single vector:
I = imread('rgb'); % Read RGB values
I = I(:); % Vectorization of RGB values
p = imhist(I); % Histogram
p(p == 0) = []; % remove zero entries in p
p = p ./ numel(I); % normalize p so that sum(p) is one.
E = -sum(p.*log2(p));
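If you don't have the Image Processing Toolbox, the same pooled statistics can be sketched in base MATLAB. This is only an illustration of the "unroll everything into one vector" idea discussed above, not a reconstruction of what the paper's author did; the tiny 2x2 RGB image is made up for the example, and the code assumes uint8 data.

```matlab
% Tiny synthetic 2x2 RGB image of class uint8 (stand-in for a real image).
rgb = uint8(cat(3, [0 0; 255 255], [0 255; 0 255], [10 10; 10 10]));

v = double(rgb(:));                 % pool all channels into one vector of doubles

m = mean(v);                        % pooled mean
s = std(v);                         % pooled standard deviation

% Shannon entropy from a 256-bin histogram of the pooled values.
counts = accumarray(v + 1, 1, [256 1]);   % bin k+1 counts intensity k
p = counts(counts > 0) / numel(v);        % drop empty bins, normalise to sum 1
e = -sum(p .* log2(p));
```

Here the pooled vector holds three distinct intensities, each occurring four times, so the entropy comes out as log2(3) bits.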

Compute the combined image of SVD perturbations

I know how to generate a combined image:
STEP1: I = imread('image.jpg');
STEP2: Ibw = single(im2double(I));
STEP3: [U S V] = svd(Ibw); %where U and V are the left and right odd vectors, respectively, and S the
%diagonal matrix of particular values
% calculate derived image
STEP4: P = U * power(S, i) * V'; % where i is between 1 and 2
%To compute the combined image of SVD perturbations:
STEP5: J = (single(I) + (alpha*P))/(1+alpha); % where alpha is between 0 and 1
So by integrating P into I, we get a combined image J which keeps the main information of the original image and is expected to be more robust against minor changes in expression, illumination and occlusion.
I have some questions:
1) I would like to know in detail: what is the motivation for applying Step3, and what are we perturbing here?
2) In Step3, what is meant by "particular values"?
3) Can the derived image P also be called "the perturbed image"?
Any help will be very appreciated!
This method originated from this paper that can be accessed here. Let's answer your questions in order.
If you want to know why this step is useful, you need to know a bit of theory about how the SVD works. SVD stands for Singular Value Decomposition. The SVD transforms your N-dimensional data so that the dimensions are ordered by how much variation they exhibit: the dimension with the most variation comes first, and the others follow in decreasing order (SVD experts and math purists... don't shoot me. This is how I understand the SVD). The singular values in this particular context give you a weighting of how much each dimension of your data contributes to the overall decomposition.
Therefore, by applying that particular step (P = U * power(S, i) * V';), you are giving more emphasis to the "variation" in your data so that the most important features in your image will stand out while the unimportant ones will "fade" away. This is really the only rationale that I can see behind why they're doing this.
The "particular" values are the singular values. These values are part of the S matrix and those values appear in the diagonals of the matrix.
I wouldn't call P the derived image, but an image that locates which parts of the image are more important in comparison to the rest of the image. By mixing this with the original image, the features that you should concentrate on are emphasized, while the other parts of the image that most people wouldn't pay attention to get de-emphasized in the overall result.
I would recommend you read that paper that you got this algorithm from as it explains the whole process fairly well.
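To make the effect of the step P = U * power(S, i) * V' a bit more concrete, here is a small sketch on a random matrix standing in for a grayscale image scaled to [0, 1]. The matrix size, the exponent 1.5 and alpha = 0.25 are arbitrary choices within the ranges given in the question, and the variable names are mine, not the paper's.

```matlab
rng(0);
I = rand(32, 24);                   % stand-in for a grayscale image in [0, 1]

[U, S, V] = svd(I);
sv = diag(S);                       % singular values, in decreasing order

pw = 1.5;                           % exponent between 1 and 2
P = U * power(S, pw) * V';          % element-wise power; off-diagonal zeros stay zero

% Share of the largest singular value before and after raising to the power.
wBefore = sv(1) / sum(sv);
wAfter  = sv(1)^pw / sum(sv.^pw);

alpha = 0.25;                       % mixing weight between 0 and 1
J = (I + alpha * P) / (1 + alpha);  % combined image, same scale for I and P
```

For an exponent above 1, the share of the largest singular value grows (wAfter is larger than wBefore), which is the "emphasize the dominant structure" effect described above. Note that I and P are on the same intensity scale here; mixing a 0-255 image with a P computed from a [0, 1] image would weight them very differently.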
Some more references for you
Take a look at this great tutorial on the SVD here. Also, this post may answer more questions regarding the insight of the algorithm.

MATLAB's fminsearch function

I have two images I'm trying to co-register - ie, one could be of a ball in the centre of the picture, the other is of the same ball near the edge and I'm trying to find the number of pixels I have to move the second image so that the balls would be in the same place. (I'm actually using 3D MRI brain scans, but the principle is the same).
I've written a function that will move the ball left, right, up or down by a given number of pixels as well as another function that compares the correlation of the ball-in-the-centre image with the translated ball-at-the-edge image. When the two balls are in the same place the correlation function will return 0 and a number larger than 0 for other positions.
I'm trying to use fminsearch (documentation) to find the optimal translation for the correlation function's minimum (ie, the balls being in the same place) like so:
global reference_im unknown_im;
starting_trans = [0 0 0];
trans_vector = fminsearch(@correlate_images,starting_trans)
correlate_images.m:
function r = correlate_images(translate)
global reference_im unknown_im;
new_im = move_image(unknown_im,translate(1),translate(2),translate(3));
% This bit is unimportant to the question
% but you can see how I calculate my correlation
r = 1 - corr(reshape(new_im,[],1),reshape(reference_im,[],1));
There are two problems. Firstly, fminsearch insists on passing float values for the translation vector into the correlate_images function. Is there any way to inform it that only integers are necessary? (I would save a large number of cpu cycles!)
Secondly, when I run this program the resulting trans_vector is always the same as starting_trans - I assume this is because no minimum has been found, but is there another reason it's just plain not working?
Many thanks!
EDIT
I've discovered what I think is the reason the output trans_vector is always the same as starting_trans. The fminsearch looks at the starting value, then a small increment in each direction from there, this small increment is always less than one, which means that the result from the correlation will be a perfect match (as the move_image will return the same as the input image for sub-pixel movements). I'm going to continue working on convincing matlab to only fminsearch over integer values!
First, I'd say that Matlab might not be the best tool for this problem. I'd look at Elastix, which is a pretty user-friendly wrapper around the registration functions in ITK. You get a variety of registration techniques, and the manuals for both programs do a good job of explaining the specifics of image registration.
Second, for this kind of simple translational registration, you can use the FFT. Forward transform both images, multiply one transform by the complex conjugate of the other (pointwise! That is, use A .* conj(B), not A * B, as those are different operations, and the pointwise one is what you want), and there should be a peak in the inverse transform whose offset from the origin is the translational amount you need. Numerical Recipes in C has a good explanation; here's a link to an index pdf. The speed difference between the FFT version and the direct correlation version is huge; the FFT is O(N log N), while the correlation method will be O(N * M), where M is the number of pixels in your search neighborhood. If you want to allow the entire image to be searched, then correlation becomes O(N*N), which will take much longer than the FFT version. Changing parameters from floats to integers won't solve the problem.
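A minimal sketch of the FFT approach in 2D, using base MATLAB only. It multiplies one spectrum by the complex conjugate of the other, which is what makes the product a correlation rather than a convolution. The synthetic square image and the known circular shift are assumptions for the demo; real images are not periodic, so you would normally pad or window them first.

```matlab
% Synthetic reference image: a bright square on a dark background.
ref = zeros(64, 64);
ref(20:30, 25:35) = 1;

% "Unknown" image: the same content circularly shifted by a known amount.
trueShift = [5, -7];                          % [rows, cols]
moved = circshift(ref, trueShift);

% Circular cross-correlation via the FFT.
xc = real(ifft2(fft2(moved) .* conj(fft2(ref))));

% Locate the correlation peak.
[~, idx] = max(xc(:));
[r, c] = ind2sub(size(xc), idx);

% Convert the 1-based peak position into a signed shift:
% offsets past half the image size wrap around to negative shifts.
sz = size(ref);
shift = [r, c] - 1;
wrap = shift > sz / 2;
shift(wrap) = shift(wrap) - sz(wrap);

% Undo the estimated shift to register the image.
registered = circshift(moved, -shift);
```

The same idea carries over to 3D volumes with fftn and ifftn in place of fft2 and ifft2.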
The reason the fminsearch function uses floats (if I can guess at the reasons behind the coders' decisions) is that for problems that aren't test problems (ie, spheres in a volume), you often need sub-pixel resolution to perform a correct registration. Take a look at the ITK documentation about the reasons behind this approach.
Third, I'd suggest that a good way to write this program in Matlab (if you still want to do so!) while still forcing integer correlations would be to avoid the fminsearch function, which will want to use floats. Try something like:
startXPos = -10; %these parameters dictate the size of your search neighborhood
startYPos = -10; %corresponds to M in the above explanation
endXPos = 10;
endYPos = 10;
optimalX = 0;
optimalY = 0;
maxCorrVal = -Inf; %correlations can be negative, so start below any possible value
for i=startXPos:endXPos
    for j = startYPos:endYPos
        %test the correlation of the two images here, where one image is shifted relative to the other
        %(Correlate and image2OffsetByiAndj are placeholders for your own functions)
        currCorrVal = Correlate(image1, image2OffsetByiAndj);
        if (currCorrVal > maxCorrVal)
            maxCorrVal = currCorrVal;
            optimalX = i;
            optimalY = j;
        end
    end
end
From here, you just have to write the offset function. This way, you avoid the float problem, and you're also incrementing your translation vector (I don't see any way for that vector to move in your provided functions, which probably explains your lack of movement).
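Here is one way the loop could look once the placeholders are filled in, as a sketch only: circshift stands in for the offset function and corrcoef for the correlation measure. Both are my substitutions rather than anything from the question, and circshift wraps pixels around the border, which a real move_image function may not do.

```matlab
% Synthetic reference and a copy moved by a known integer offset.
ref = zeros(40, 40);
ref(15:22, 12:20) = 1;
moved = circshift(ref, [3, -4]);

searchRange = -10:10;                 % size of the search neighbourhood
best = -Inf;                          % correlations can be negative
optimal = [0, 0];

for di = searchRange
    for dj = searchRange
        candidate = circshift(moved, [di, dj]);     % integer translation only
        cc = corrcoef(candidate(:), ref(:));        % 2x2 correlation matrix
        if cc(1, 2) > best
            best = cc(1, 2);
            optimal = [di, dj];
        end
    end
end
```

The result is the translation that has to be applied to the moved image to bring it back onto the reference, so it is the negative of the offset used to create it.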
There is a very similar demo in the Image Processing Toolbox that uses the normalized cross-correlation function normxcorr2 to perform image registration. To avoid repeating the same thing, check out the demo directly:
Registering an Image Using Normalized Cross-Correlation

How to use SIFT algorithm to compute how similar two images are?

I have used the SIFT implementation of Andrea Vedaldi, to calculate the sift descriptors of two similar images (the second image is actually a zoomed in picture of the same object from a different angle).
Now I am not able to figure out how to compare the descriptors to tell how similar the images are?
I know that this question is not answerable unless you have actually played with these sort of things before, but I thought that somebody who has done this before might know this, so I posted the question.
the little I did to generate the descriptors:
>> i=imread('p1.jpg');
>> j=imread('p2.jpg');
>> i=rgb2gray(i);
>> j=rgb2gray(j);
>> [a, b]=sift(i); % a has the frames and b has the descriptors
>> [c, d]=sift(j);
First, aren't you supposed to be using vl_sift instead of sift?
Second, you can use SIFT feature matching to find correspondences in the two images. Here's some sample code:
I = imread('p1.jpg');
J = imread('p2.jpg');
I = single(rgb2gray(I)); % Conversion to single is recommended
J = single(rgb2gray(J)); % in the documentation
[F1 D1] = vl_sift(I);
[F2 D2] = vl_sift(J);
% Where 1.5 = ratio between euclidean distance of NN2/NN1
[matches score] = vl_ubcmatch(D1,D2,1.5);
subplot(1,2,1);
imshow(uint8(I));
hold on;
plot(F1(1,matches(1,:)),F1(2,matches(1,:)),'b*');
subplot(1,2,2);
imshow(uint8(J));
hold on;
plot(F2(1,matches(2,:)),F2(2,matches(2,:)),'r*');
vl_ubcmatch() essentially does the following:
Suppose you have a point P in F1 and you want to find the "best" match in F2. One way to do that is to compare the descriptor of P in F1 to all the descriptors in D2. By compare, I mean find the Euclidean distance (or the L2-norm of the difference of the two descriptors).
Then, I find two points in F2, say U & V which have the lowest and second-lowest distance (say, Du and Dv) from P respectively.
Here's what Lowe recommended: if Dv/Du >= threshold (I used 1.5 in the sample code), then this match is acceptable; otherwise, it's ambiguously matched and is rejected as a correspondence and we don't match any point in F2 to P. Essentially, if there's a big difference between the best and second-best matches, you can expect this to be a quality match.
This is important since there's a lot of scope for ambiguous matches in an image: imagine matching points in a lake or a building with several windows, the descriptors can look very similar but the correspondence is obviously wrong.
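The ratio test described above can be sketched in a few lines of plain MATLAB. This is a brute-force illustration on made-up descriptors, not the actual implementation of vl_ubcmatch, which may differ in details such as working with squared distances. The descriptor layout (one 128-element column per keypoint) follows the vl_sift convention; everything else is an assumption for the demo.

```matlab
rng(1);
D1 = rand(128, 5);                                % 5 synthetic descriptors, one per column
% Second image: noisy copies of descriptors 3 and 1, followed by 20 unrelated ones.
D2 = [D1(:, [3 1]) + 0.01 * rand(128, 2), rand(128, 20)];

thresh = 1.5;                                     % minimum ratio second-best / best distance
matches = zeros(2, 0);                            % row 1: index in D1, row 2: index in D2

for p = 1:size(D1, 2)
    % Euclidean distance from descriptor p to every descriptor in D2.
    d = sqrt(sum(bsxfun(@minus, D2, D1(:, p)).^2, 1));
    [dSorted, order] = sort(d);
    % Accept only if the best match is clearly better than the runner-up.
    if dSorted(2) / dSorted(1) >= thresh
        matches(:, end + 1) = [p; order(1)];
    end
end
```

Only descriptors 1 and 3 have a real counterpart in D2, so only they pass the test; the others see many roughly equidistant candidates and are rejected as ambiguous.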
You can do the matching in any number of ways: you can do it yourself very easily with MATLAB, or you can speed it up by using a KD-tree or an approximate nearest neighbor search like FLANN, which has been implemented in OpenCV.
EDIT: Also, there are several kd-tree implementations in MATLAB.
You should read David Lowe's paper, which talks about how to do exactly that. It should be sufficient, if you want to compare images of the exact same object. If you want to match images of different objects of the same category (e.g. cars or airplanes) you may want to look at the Pyramid Match Kernel by Grauman and Darrell.
Try to compare each descriptor from the first image with descriptors from the second one situated in a close vicinity (using the Euclidean distance). Thus, you assign a score to each descriptor from the first image based on the degree of similarity between it and the most similar neighbor descriptor from the second image. A statistical measure (sum, mean, dispersion, mean error, etc) of all these scores gives you an estimate of how similar the images are. Experiment with different combinations of vicinity size and statistical measure to give you the best answer.
If you just want to compare a zoomed and rotated image with a known center of rotation, you can use phase correlation in log-polar coordinates. From the sharpness of the peak and the histogram of the phase correlation you can judge how close the images are. You can also use the Euclidean distance on the absolute values of the Fourier coefficients.
If you want to compare SIFT descriptors, besides the Euclidean distance you can also use a "diffuse distance": compute the descriptor at progressively coarser scales and concatenate these with the original descriptor. That way "large scale" feature similarity carries more weight.
If you want to do matching between the images, you should use vl_ubcmatch (in case you have not used it). You can interpret the output 'scores' to see how close the features are; each score is the squared Euclidean distance between the two matched feature descriptors. You can also vary the threshold between the best match and the 2nd best match as an input.