Picture after segmentation with Euclidean distance (plain distance, not squared)
Original texture picture
I'm getting the result above (picture 1) when I perform clustering using the k-means algorithm and Laws texture energy filters (with 6 cluster centroids / groups).
What are the possible ways of improving the result? As can be seen from the result, there is no clear demarcation of the textures.
Could dilation/erosion somehow be used for this? If so, please advise.
Analysing the texture using k-means causes you to disregard spatial relations between neighboring pixels: if pixels i and j are next to each other, then it is highly likely that they share the same texture.
One way of introducing such spatial information is using pair-wise energy that can be optimized using graph cuts or belief-propagation (among other things).
Suppose you have n pixels in the image and L centroids in your k-means, then
D is an L-by-n matrix where D(l,i) is the distance of pixel i to center l.
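A minimal sketch of how such a D could be built, assuming the Laws filter responses are stored one row per pixel (in the column-major order of img(:)) and the k-means centroids one row per cluster. The names F and C and the random placeholder data are illustrative assumptions, not part of the wrapper:

```matlab
% Placeholder data (assumption): n pixels, d texture-energy features each
n = 1000; d = 9; L = 6;
F = rand(n, d);          % one row of filter responses per pixel
C = rand(L, d);          % one row per k-means centroid

% Build the L-by-n data term: D(l,i) = distance of pixel i to center l
D = zeros(L, n);
for l = 1:L
    delta = bsxfun(@minus, F, C(l,:));     % n-by-d offsets from centroid l
    D(l,:) = sqrt(sum(delta.^2, 2)).';     % Euclidean distances as a row
end
```

With real data, C would be the second output of kmeans(F, L).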
If you choose to use graph cuts, you can download my wrapper (don't forget to compile it) and then, in Matlab:
>> sz = size( img ); n = numel( img ); % n is the number of pixels
>> [ii jj] = sparse_adj_matrix( sz, 1, 1 ); % define 4-connect neighbor grid
>> grid = sparse( ii, jj, 1, n, n );
>> gch = GraphCut('open', D, ones( L ) - eye(L), grid );
>> [gch ll] = GraphCut('expand', gch );
>> gch = GraphCut('close', gch );
>> ll = reshape( double(ll)+1, sz );
>> figure; imagesc(ll);colormap (rand(L,3) ); title('resulting clusters'); axis image;
You can find sparse_adj_matrix here.
For a recent implementation of many optimization algorithms, take a look at opengm package.
With respect to morphological filtering, I suggest this reference: Texture Segmentation Using Area Morphology Local Granulometries. The paper describes a morphological area opening filter, which removes grayscale components that are smaller than a given area threshold. In binary images, the local granulometric size distributions can be generated by placing a window at each pixel position and, after each opening operation, counting the number of remaining pixels within it. This results in a local size distribution, which can be normalised to a probability distribution. Differentiating this normalised size distribution gives the local pattern spectrum at the pixel, a probability density that contains textural information local to each pixel position.
Here is an example of how to use the granulometries of an image. They are basically nonlinear scale spaces that operate on the area of the grayscale components. The basic intuition is that each texture can be characterized by the spectrum of areas of its grayscale components. A simple binary area opening filter (bwareaopen) is available in Matlab.
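A rough sketch of a local granulometry for a binary image, assuming the Image Processing Toolbox is available. The area thresholds, the window size and the random test image are hypothetical choices for illustration and are not taken from the paper:

```matlab
BW = rand(64) > 0.5;             % placeholder binary image (assumption)
areas = [2 4 8 16 32];           % hypothetical area thresholds, in pixels
win = ones(15);                  % hypothetical local counting window

% Count foreground pixels in the window before and after each area opening
counts = zeros([size(BW) numel(areas) + 1]);
counts(:,:,1) = conv2(double(BW), win, 'same');
for k = 1:numel(areas)
    opened = bwareaopen(BW, areas(k));   % drop components smaller than areas(k)
    counts(:,:,k+1) = conv2(double(opened), win, 'same');
end

% Normalised local size distribution (fraction of pixels removed so far)
sizeDist = 1 - bsxfun(@rdivide, counts, max(counts(:,:,1), 1));

% Local pattern spectrum: discrete derivative along the area axis
patternSpectrum = diff(sizeDist, 1, 3);

% One feature vector per pixel, e.g. as input to kmeans
features = reshape(patternSpectrum, [], numel(areas));
```

Each row of features could then replace (or be appended to) the Laws texture energies before clustering.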
I am trying to convert an RGB image to grayscale and then cluster it using the kmeans function of MATLAB.
Here is my code:
he = imread('tumor2.jpg');
%convert into a grayscale image
ab=rgb2gray(he);
nrows = size(ab,1);
ncols = size(ab,2);
%convert the image into a column vector
ab = reshape(ab,nrows*ncols,1);
%nColors=no of clusters
nColors = 3;
%cluster_idx is a n x 1 vector where cluster_idx(i) is the index of cluster assigned to ith pixel
[cluster_idx, cluster_center ,cluster_sum] = kmeans(ab,nColors,'distance','sqEuclidean','Replicates',1,'EmptyAction','drop' );
figure;
%converting vector into a matrix of dimensions equal to that of original
%image dimensions (nrows x ncols)
pixel_labels = reshape(cluster_idx,nrows,ncols);
pixel_labels
imshow(pixel_labels,[]), title('image labeled by cluster index');
problems
1) The output image is always a plain white image.
I tried the solution given in the link below, but in that case the output is a plain gray image.
find the solution tried here
2) When I execute my code a second time, execution does not proceed beyond the kmeans function (it looks like an infinite loop there), hence there is no output in this case.
k-means is known to fall into local minima when it is used for colour segmentation. This means that it often won't find the number of clusters you want, because the minimization does not reach the best solution (that's why lots of people use other types of segmentation, such as level sets or simple region growing).
An option is to increase the number of Replicates (the number of times kmeans restarts from a new initialization, keeping the best result). At the moment you are setting it to 1, but you could try 3 or 4, and it may reach the solution that way.
In this question the accepted answer recommends using a version of the k-means algorithm specifically created for image segmentation. I haven't tried it myself, but I think it's worth a shot.
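A minimal sketch of the call with more replicates, assuming the Statistics Toolbox kmeans. The synthetic intensity data stands in for the reshaped grayscale image, and the cast to double is my own addition, since kmeans is documented for single or double input:

```matlab
% Placeholder data (assumption): three intensity groups instead of the real image
rng(0);
ab = double([randi([0 40], 300, 1); ...
             randi([100 140], 300, 1); ...
             randi([210 255], 300, 1)]);
nColors = 3;

% Restart k-means 4 times and keep the run with the lowest total distance
[cluster_idx, cluster_center] = kmeans(ab, nColors, ...
    'Distance', 'sqeuclidean', 'Replicates', 4);
```

For the real image, ab would be double(reshape(rgb2gray(he), [], 1)).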
Link to FEX
I have 2 images, im1 and im2, shown below. The im2 picture is the same as im1; the only difference between them is the colors. im1 has RGB ranges of (0-255, 0-255, 0-255) for each color channel, while im2 has RGB ranges of (201-255, 126-255, 140-255). My exercise is to reverse the added effects so that I can restore im2 to im1 as closely as I can. I have 2 thoughts in mind. The first is to match their histograms so that they both have the same colors. I tried it using histeq, but it restores only a portion of the image. Is there any way to change im2's histogram to be exactly the same as im1's? The second approach was just to copy each pixel value from im1 to im2, but this is wrong since it doesn't restore the original image state. Are there any suggestions for restoring the image?
@sepdek below pretty much suggested the method that @NKN alluded to, but I will provide another approach. One more alternative I can suggest is to perform a colour correction based on a least mean squared solution. The idea is that we can assume that transforming a pixel from im2 to im1 requires a linear combination of weights. In other words, given an RGB pixel from the corrupted image (im2) whose red, green and blue components are shaped into a 3 x 1 vector, there exists some linear transformation that gives its equivalent pixel in the clean image (im1). That is, we have this relationship:
[R_im1]       [R_im2]
[G_im1] = A * [G_im2]
[B_im1]       [B_im2]

Y = A * X
A in this case would be a 3 x 3 matrix. This is essentially performing a matrix multiplication to get your output corrected pixel. The input RGB pixel from im2 would be X and the output RGB pixel from im1 would be Y. We can extend this to as many pixels as we want, where pairs of pixels from im1 and im2 would establish columns along Y and X. In general, this would further extend X and Y to 3 x N matrices. To find the matrix A, you would find the least mean squared error solution. I won't get into it, but finding the optimal matrix A requires the pseudo-inverse. In our case here, A would thus equal:
A = Y * X^T * (X * X^T)^(-1)
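As a quick sanity check of the least-squares solution A = Y * X^T * (X * X^T)^(-1), here is a small sketch on synthetic data. The mixing matrix A_true, the noise level and the sample count are made-up values for illustration only:

```matlab
rng(0);
A_true = [0.90 0.10 0.00; ...
          0.05 0.80 0.15; ...
          0.00 0.20 0.80];               % hypothetical colour mixing matrix
X = 255 * rand(3, 500);                  % 500 synthetic "corrupted" RGB columns
Y = A_true * X + 0.5 * randn(3, 500);    % matching "clean" columns, mild noise

A_est  = Y * X.' / (X * X.');            % least-squares estimate via mrdivide
A_pinv = Y * pinv(X);                    % same estimate via the pseudo-inverse

err = max(abs(A_est(:) - A_true(:)));    % small when the linear model holds
```

When the relationship between the two images really is linear, A_est lands close to A_true. The further the real mapping is from linear, the larger the residual that remains.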
Once you find this matrix A, you would need to take each pixel in your image, shape it so that it becomes a 3 x 1 vector, then multiply A with this vector like the approach above. One thing you're probably asking yourself is what kinds of pixels do I need to grab from both images to make the above approach work? One guideline you must adhere to is that you need to make sure that you're sampling from the same spatial location between the two images. As such, if we were to grab a pixel at... say... row 4, column 9, you need to make sure that both pixels from im1 and im2 come from this same row and same column, and they are placed in the same corresponding columns in X and Y.
Another small caveat with this approach is that you need to be sure that you sample a lot of pixels in the image to get a good solution, and you also need to make sure the spread of your sampling is over the entire image. If we localize the sampling to be within a small area, then you're not getting a good enough distribution of the colours and so the output will not look very nice. It's up to you on how many pixels you choose for the problem, but from experience, you get to a point where the output starts to plateau and you don't see any difference. For demonstration purposes, I chose 2000 pixels in random positions throughout the image.
As such, this is what the code would look like. I use randperm to generate a random permutation from 1 to M where M is the total number of pixels in the image. These generate linear indices so that we can sample from the images and construct our matrices. We then apply the above equation to find A, then take each pixel and apply a matrix multiplication with A to get the output. Without further ado:
close all;
clear all;
im1 = imread('http://i.stack.imgur.com/GtgHU.jpg');
im2 = imread('http://i.stack.imgur.com/wHW50.jpg');
rng(123); %// Set seed for reproducibility
num_colours = 2000;
ind = randperm(numel(im1) / size(im1,3), num_colours);
%// Grab colours from original image
red_out = im1(:,:,1);
green_out = im1(:,:,2);
blue_out = im1(:,:,3);
%// Grab colours from corrupted image
red_in = im2(:,:,1);
green_in = im2(:,:,2);
blue_in = im2(:,:,3);
%// Create 3 x N matrices
X = double([red_in(ind); green_in(ind); blue_in(ind)]);
Y = double([red_out(ind); green_out(ind); blue_out(ind)]);
%// Find A
A = Y*(X.')/(X*X.');
%// Cast im2 to double for precision
im2_double = double(im2);
%// Apply matrix multiplication
out = cast(reshape((A*reshape(permute(im2_double, [3 1 2]), 3, [])).', ...
[size(im2_double,1) size(im2_double,2), 3]), class(im2));
Let's go through this code slowly. I am reading your images directly from StackOverflow. After, I use rng to set the seed so that you can reproduce the same results on your end. Setting the seed is useful because it allows you to reproduce the random pixel selection that I did. We generate those linear indices, then create our 3 x N matrices for both im1 and im2. Finding A is exactly how I described, but you're probably not used to the mrdivide or / operator. mrdivide solves the linear system defined by the matrix on the right side of the operator, which is equivalent to multiplying whatever is on the left side by the inverse of the right side. This is a more efficient and more accurate way of doing the calculation than computing the inverse of the right side separately and then multiplying it with the left side. In fact, MATLAB will give you a warning stating to avoid calculating the inverse separately and that you should use the divide operators instead. Next, I cast im2 to double to ensure precision as A will most likely be floating point valued, then go through the multiplication of each pixel with A to compute the result. That last line of code looks pretty intimidating, but if you want to figure out how I derived it, I used the same idea to create vintage style photos, which also require a matrix multiplication much like this approach, and you can read up about it here: How do I create vintage images in MATLAB? . out stores our final image. After running this code and showing what out looks like, this is what we get:
Now, the output looks completely scrambled, but the colour distribution more or less mimics what the input original image looks like. I have a few explanations on why this is the case:
There is quantization noise. If you take a look at the final image, there is various white spotting all over. This is probably due to the quantization error that is introduced when compressing your image. Pixels that should map to the same colours between the images will have slight variations due to quantization which gives us that spotting
There is more than one colour from im2 that maps to im1. If there is more than one colour from im2 that maps to im1, it is impossible for a linear multiplication with the matrix A to generate more than one kind of colour for im1 given a single pixel in im2. Instead, the least mean-squared solution will try to generate a colour that minimizes the error and give you the best colour possible. This is probably why the face and other fine details of the image are obscured.
The image is noisy. Your im2 is not completely clean. I can also see various spots of salt and pepper noise across all of the channels. One bad thing about this method is that if your image is subject to noise, then this method will not faithfully reconstruct the original image properly. Your image can only be corrupted by a wrong mapping of colours. Should there be any other type of image noise introduced, then this method will definitely not work as you are trying to reconstruct the original image based on a noisy image. There are pixels in the noisy image that were never present in the original image, so you'll have no luck getting it back to the way it was before!
If you want to take a look at the histograms of each channel between the original image and the output image, this is what we get:
The code I used to generate the above figure was:
names = {'Red', 'Green', 'Blue'};
figure;
for idx = 1 : 3
subplot(3,2,2*idx - 1);
imhist(im1(:,:,idx));
title([names{idx} ': Image 1']);
end
for idx = 1 : 3
subplot(3,2,2*idx);
imhist(out(:,:,idx));
title([names{idx} ': Output']);
end
The left side shows the red, green and blue histograms for the original image while the right side shows the same histograms for the reconstructed image. You can see that the general shape more or less mimics the original image, but there are some spikes throughout - most likely attributed to quantization noise and the non-unique mapping between colours of both images.
All in all, this is the best that I could do, but I think that was the whole point of the exercise.... to show that it isn't possible.
For more information on how to perform colour correction, check out Richard Alan Peters' II Digital Image Processing slides on colour correction. This was what I started with, and the derivation of how to calculate A can be found in his slides. Perhaps you can use some of what he talks about in your future work.
Good luck!
It seems that you need a scaling function to map the values of im2 to the values of im1.
This is fairly simple and you could write a scaling function to have it available for any such case.
A basic scaling mapping would work as follows:
out_value = min_output + (in_value - min_input) * (outrange / inrange)
given that there is an input value in_value within a range of values inrange=max_input-min_input, and that the mapping results in an output value out_value within a range outrange=max_output-min_output. We also need to take into account the minimum input and output range bounds (min_input and min_output) to have a correct mapping.
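As a worked example of the formula, take the red channel ranges from the question (201-255 in im2, 0-255 in im1). The three sample input values are my own picks for illustration:

```matlab
min_input  = 201; max_input  = 255;   % red channel range of im2 (from the question)
min_output = 0;   max_output = 255;   % red channel range of im1

inrange  = max_input  - min_input;    % 54
outrange = max_output - min_output;   % 255

in_value  = [201 228 255];            % lowest, middle and highest input values
out_value = min_output + (in_value - min_input) * (outrange / inrange);
% out_value is approximately [0 127.5 255]
```

The lowest input maps to the lowest output, the highest to the highest, and everything in between is stretched linearly.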
See for example the following code for a scaling function:
%
% scale the values of a matrix using a set of limits
% possible ways to use:
% y = scale( x, in_range, out_range) --> ex. y = scale( x, [8 230], [0 255])
% y = scale( x, out_range) --> ex. y = scale( x, [0 1])
%
function y = scale( x, varargin );
if nargin<2,
error([upper(mfilename),':: Syntax: y=',mfilename,'(x[,in_range],out_range)']);
end;
if nargin==2,
inrange=[min(x(:)) max(x(:))]; % compute the limits of the input variable
outrange=varargin{1}; % get the output limits from the arguments
else
inrange=varargin{1}; % get the input limits from the arguments
outrange=varargin{2}; % get the output limits from the arguments
end;
if diff(inrange)==0, % degenerate input range (constant matrix or scalar)
% just do an element-wise clipping (also correct when x is a matrix)
y = min(max(x, outrange(1)), outrange(2));
else
% actually scale the data
% using: out = min_output + (x-min_input) * (outrange / inrange)
y = outrange(1) + (x-inrange(1))*abs(diff(outrange))/abs(diff(inrange));
end;
This function gets a matrix of values and scales them to a desired range.
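In your case it could be used as follows (variable img is the scaled im2):
img = zeros(size(im2)); % preallocate the result as double
for i=1:size(im1,3), % for each of the input/output image channels
output_range = double([min(min(im1(:,:,i))) max(max(im1(:,:,i)))]);
% cast to double first: uint8 arithmetic would saturate inside scale()
img(:,:,i) = scale( double(im2(:,:,i)), output_range);
end;
img = uint8(img); % convert back to the class of the input images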
This way im2 is scaled to the range of values of im1 one channel at a time. Output variable img should be the desired one.
I have the following code in MATLAB:
I=imread(image);
h=fspecial('gaussian',si,sigma);
I=im2double(I);
I=imfilter(I,h,'conv');
figure,imagesc(I),impixelinfo,title('Original Image after Convolving with gaussian'),colormap('gray');
How can I define and apply a Gaussian filter to an image without imfilter, fspecial and conv2?
It's really unfortunate that you can't use some of the built-in methods from the Image Processing Toolbox to help you do this task. However, we can still do what you're asking, though it will be a bit more difficult. I'm still going to use some functions from the IPT to help us do what you're asking. Also, I'm going to assume that your image is grayscale. I'll leave it to you if you want to do this for colour images.
Create Gaussian Mask
What you can do is create a grid of 2D spatial co-ordinates using meshgrid that is the same size as the Gaussian filter mask you are creating. I'm going to assume that N is odd to make my life easier. This will allow for the spatial co-ordinates to be symmetric all around the mask.
If you recall, the 2D Gaussian can be defined as:
G(x, y) = 1 / (2*pi*sigma^2) * exp(-(x^2 + y^2) / (2*sigma^2))
The scaling factor in front of the exponential is primarily concerned with ensuring that the area underneath the Gaussian is 1. We will deal with this normalization in another way, where we generate the Gaussian coefficients without the scaling factor, then simply sum up all of the coefficients in the mask and divide every element by this sum to ensure a unit area.
Assuming that you want to create a N x N filter, and with a given standard deviation sigma, the code would look something like this, with h representing your Gaussian filter.
%// Generate horizontal and vertical co-ordinates, where
%// the origin is in the middle
ind = -floor(N/2) : floor(N/2);
[X Y] = meshgrid(ind, ind);
%// Create Gaussian Mask
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
%// Normalize so that total area (sum of all weights) is 1
h = h / sum(h(:));
If you check this with fspecial, for odd values of N, you'll see that the masks match.
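The comparison with fspecial can be sketched as follows, assuming the Image Processing Toolbox is available for the check itself. N = 5 and sigma = 2 are example values:

```matlab
N = 5; sigma = 2;                        % example mask size and standard deviation

% Hand-built mask, same steps as in the code above
ind = -floor(N/2) : floor(N/2);
[X, Y] = meshgrid(ind, ind);
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
h = h / sum(h(:));

% Reference mask from the toolbox, used only for verification
h_ref = fspecial('gaussian', [N N], sigma);

max_diff = max(abs(h(:) - h_ref(:)));    % largest element-wise difference
```

For odd N, max_diff should be at the level of floating-point round-off.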
Filter the image
The basic idea behind filtering an image is that, for each pixel in your input image, you take a pixel neighbourhood surrounding this pixel that is the same size as your Gaussian mask. You perform an element-by-element multiplication of this pixel neighbourhood with the Gaussian mask and sum up all of the elements. The resultant sum is what the output pixel would be at the corresponding spatial location in the output image. I'm going to use im2col, which takes pixel neighbourhoods and turns them into columns. im2col stacks these columns into a matrix where each column represents one pixel neighbourhood.
What we can do next is take our Gaussian mask and convert this into a column vector. Next, we would take this column vector, and replicate this for as many columns as we have from the result of im2col to create... let's call this a Gaussian matrix for a lack of a better term. With this Gaussian matrix, we will do an element-by-element multiplication with this matrix and with the output of im2col. Once we do this, we can sum over all of the rows for each column. The best way to do this element-by-element multiplication is through bsxfun, and I'll show you how to use it soon.
The result of this will be your filtered image, but it will be a single vector. You would need to reshape this vector back into matrix form with col2im to get our filtered image. However, a slight problem with this approach is that it doesn't filter pixels where the spatial mask extends beyond the dimensions of the image. As such, you'll actually need to pad the border of your image with zeroes so that we can properly do our filter. We can do this with padarray.
Therefore, our code will look something like this, going with your variables you have defined above:
N = 5; %// Define size of Gaussian mask
sigma = 2; %// Define sigma here
%// Generate Gaussian mask
ind = -floor(N/2) : floor(N/2);
[X Y] = meshgrid(ind, ind);
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
h = h / sum(h(:));
%// Convert filter into a column vector
h = h(:);
%// Filter our image
I = imread(image);
I = im2double(I);
I_pad = padarray(I, [floor(N/2) floor(N/2)]);
C = im2col(I_pad, [N N], 'sliding');
C_filter = sum(bsxfun(@times, C, h), 1);
out = col2im(C_filter, [N N], size(I_pad), 'sliding');
out contains the filtered image after applying a Gaussian filtering mask to your input image I. As an example, let's say N = 9, sigma = 4. Let's also use cameraman.tif that is an image that's part of the MATLAB system path. By using the above parameters, as well as the image, this is the input and output image we get:
I am trying to apply the graph cut method to my segmentation task. I found some example code at
Graph_Cut_Demo.
Part of the code is shown below:
img = im2double( imread([ImageDir 'cat.jpg']) );
[ny,nx,nc] = size(img);
d = reshape( img, ny*nx, nc );
k = 2; % number of clusters
[l0 c] = kmeans( d, k );
l0 = reshape( l0, ny, nx );
% For each class, the data term Dc measures the distance of
% each pixel value to the class prototype. For simplicity, standard
% Euclidean distance is used. Mahalanobis distance (weighted by class
% covariances) might improve the results in some cases. Note that the
% image intensity values are in the [0,1] interval, which provides
% normalization.
Dc = zeros( ny, nx, k );
for i = 1:k
dif = d - repmat( c(i,:), ny*nx,1 );
Dc(:,:,i) = reshape( sum(dif.^2,2), ny, nx );
end
It seems that the method uses k-means clustering to initialise the graph and get the data term Dc. However, I don't understand how they calculate this data term. Why do they use
dif = d - repmat( c(i,:), ny*nx,1 );
In the comments they say the data term Dc measures the distance of each pixel value to the class prototype. What is the class prototype, and why can it be determined from the k-means labels?
Another implementation, Graph_Cut_Demo2, uses
% calculate the data cost per cluster center
Dc = zeros([sz(1:2) k],'single');
for ci=1:k
% use covariance matrix per cluster
icv = inv(cov(d(l0==ci,:)));
dif = d- repmat(c(ci,:), [size(d,1) 1]);
% data cost is minus log likelihood of the pixel to belong to each
% cluster according to its RGB value
Dc(:,:,ci) = reshape(sum((dif*icv).*dif./2,2),sz(1:2));
end
This confused me a lot. Why do they calculate the covariance matrix, and how do they form the data term using the minus log-likelihood? Are any papers or descriptions available for these implementations?
Thanks a lot.
Both graph-cut segmentation examples are strongly related. The authors of Image Processing, Analysis, and Machine Vision: A MATLAB Companion book (first example) used the graph cut wrapper code of Shai Bagon (with the author's permission naturally) - the second example.
So, what is the data term anyway?
The data term represents how likely each pixel, independently, is to belong to each label. This is why log-likelihood terms are used.
More concretely, in these examples you try to segment the image into k segments based on their colors.
You assume there are only k dominant colors in the image (not a very practical assumption, but sufficient for educational purposes).
Using k-means you try to find what these k colors are. The output of k-means is k centers in RGB space - that is, k "representative" colors.
The likelihood of each pixel belonging to any of the k centers decreases with the distance (in color space) of the pixel from the representative k-th center: the larger the distance, the less likely the pixel is to belong to the k-th center, and the higher the unary energy penalty one must "pay" to assign this pixel to the k-th cluster.
The second example takes this notion one step further and assumes that the k clusters may have different densities in color space, modeling this second-order behavior using a covariance matrix for each cluster.
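To make the link between the covariance matrix and the "minus log likelihood" explicit, here is the standard Gaussian identity. The notation is mine, not the demos': x is a pixel's colour vector, \mu_c and \Sigma_c are the mean and covariance of cluster c, and d is the number of colour channels.

```latex
-\log \mathcal{N}(x \mid \mu_c, \Sigma_c)
  = \tfrac{1}{2}\,(x - \mu_c)^{\top} \Sigma_c^{-1} (x - \mu_c)
  + \tfrac{1}{2}\,\log\det \Sigma_c
  + \tfrac{d}{2}\,\log 2\pi
```

The line Dc(:,:,ci) = reshape(sum((dif*icv).*dif./2,2),sz(1:2)) in the second demo computes only the first term (half the squared Mahalanobis distance); as far as I can tell from the snippet, the log-determinant and the constant are dropped. With \Sigma_c equal to the identity, that first term reduces to half the squared Euclidean distance, which is the data term of the first demo up to the factor 1/2.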
In practice, one uses a more sophisticated color model for each segment, usually a mixture of Gaussians. You can read about it in the seminal paper of GrabCut (section 3).
PS,
Next time you can email Shai Bagon directly and ask.
I have an image and my aim is to binarize the image. I have filtered the image with a low pass Gaussian filter and have computed the intensity histogram of the image.
I now want to perform smoothing of the histogram so that I can obtain the threshold for binarization. I used a low pass filter but it did not work. This is the filter I used.
h = fspecial('gaussian', [8 8],2);
Can anyone help me with this? What is the process with respect to smoothing of a histogram?
imhist(Ig);
Thanks a lot for all your help.
I've been working on a very similar problem recently, trying to compute a threshold in order to exclude noisy background pixels from MRI data prior to performing other computations on the images. What I did was fit a spline to the histogram to smooth it while maintaining an accurate fit of the shape. I used the splinefit package from the file exchange to perform the fitting. I computed a histogram for a stack of images treated together, but it should work similarly for an individual image. I also happened to use a logarithmic transformation of my histogram data, but that may or may not be a useful step for your application.
[my_histogram, xvals] = hist(reshape(image_volume, 1, []), number_of_bins);
my_log_hist = log(my_histogram);
my_log_hist(~isfinite(my_log_hist)) = 0; % Get rid of -Inf values that arise from empty bins (log of zero = -Inf)
figure(1), plot(xvals, my_log_hist, 'b');
hold on
breaks = linspace(0, max_pixel_intensity, numberofbreaks);
xx = linspace(0, max_pixel_intensity, max_pixel_intensity+1);
pp = splinefit(xvals, my_log_hist, breaks, 'r');
plot(xx, ppval(pp, xx), 'r');
Note that the spline is differentiable and you can use ppdiff to get the derivative, which is useful for finding maxima and minima to help pick an appropriate threshold. The numberofbreaks is set to a relatively low number so that the spline will smooth the histogram. I used linspace in the example to pick the breaks, but if you know that some portion of the histogram exhibits much greater curvature than elsewhere, you'd want to have more breaks in that region and fewer elsewhere in order to accurately capture the shape of the histogram.
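A minimal sketch of picking the threshold at the valley of a smoothed histogram. To keep it self-contained it uses a synthetic two-peak curve instead of ppval(pp, xx), and a finite difference instead of ppdiff; both substitutions are assumptions for illustration:

```matlab
xx = 0:255;                                    % intensity axis
% Placeholder for the smoothed histogram ppval(pp, xx): two Gaussian bumps
yy = exp(-(xx - 60).^2 / (2*15^2)) + 0.6 * exp(-(xx - 180).^2 / (2*25^2));

d = diff(yy);                                  % finite-difference slope
% A local minimum is where the slope changes from negative to non-negative
idx = find(d(1:end-1) < 0 & d(2:end) >= 0) + 1;

threshold = xx(idx(1));                        % first valley between the peaks
```

With the real spline, yy would come from ppval(pp, xx) and the rest stays the same.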
To smooth the histogram you need to use a 1-D filter. This is easily done using the filter function. Here is an example:
I = imread('pout.tif');
h = imhist(I);
smooth_h = filter(normpdf(-4:4, 0,1),1,h);
Of course you can use any smoothing function you choose. A moving average over 8 bins, for example, would simply be ones(1,8)/8.
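One thing to keep in mind is that filter is causal, so a 9-tap kernel delays the smoothed histogram by 4 bins, which would shift any threshold read from it. A small sketch of the difference, using conv with the 'same' option as a centred alternative; the single-spike histogram and the hand-built kernel (equivalent to the normalised normpdf(-4:4, 0, 1)) are placeholders of mine:

```matlab
h = zeros(256, 1);
h(101) = 1;                            % placeholder histogram: one spike at bin 101

k = exp(-(-4:4).^2 / 2);               % unit-variance Gaussian kernel, 9 taps
k = k / sum(k);                        % normalise so the total count is preserved

smooth_filter = filter(k, 1, h);       % causal: output is delayed by 4 bins
smooth_conv   = conv(h, k(:), 'same'); % centred: no shift

[~, peak_filter] = max(smooth_filter); % 105
[~, peak_conv]   = max(smooth_conv);   % 101
```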
Since your goal here is just to find the threshold to binarize an image you could just use the graythresh function which uses Otsu's method.
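A short sketch of that route, assuming the Image Processing Toolbox and its bundled pout.tif image. The comparison against im2double(I) is just one way to apply the threshold:

```matlab
I = imread('pout.tif');          % sample image shipped with the toolbox
level = graythresh(I);           % Otsu threshold, normalised to [0, 1]
BW = im2double(I) > level;       % binarize: true where brighter than the threshold
```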