MATLAB: draw centroids - matlab

My main question is given a feature centroid, how can I draw it in MATLAB?
In more detail, I have an NxNx3 image (an RGB image) of which I take 4x4 blocks and compute a 6-dimensional feature vector for each block. I store these feature vectors in an Mx6 matrix on which I run kmeans function and obtain the centroids in a kx6 matrix, where k is the number of clusters and 6 is the number of features for each block.
How can I draw these center clusters in my image in order to visualize if the algorithm is performing the way I wish it to perform? Or if anyone has any other way/suggestions on how I can visualize the centroids on my image, I'd greatly appreciate it.

Here's one way you can visualize the clusters:
As you described, first I extract the blocks, compute the feature vector for each, and cluster this features matrix.
Next we can visualize the clusters assigned to each block. Note that I am assuming that the 4x4 blocks are distinct, this is important so that we can map the blocks to their location back in the original image.
Finally, in order to display the cluster centroids on the image, I simply find the closest block to each cluster and display it as a representative of that cluster.
Here's a complete example to show the above idea (in your case, you would want to replace the function that computes the features of each block by your own implementation; I am simply taking the min/max/mean/median/Q1/Q3 as my feature vector for each 4x4 block):
%# params
NUM_CLUSTERS = 3;
BLOCK_SIZE = 4;
featureFunc = #(X) [min(X); max(X); mean(X); prctile(X, [25 50 75])];
%# read image
I = imread('peppers.png');
I = double( rgb2gray(I) );
%# extract blocks as column
J = im2col(I, [BLOCK_SIZE BLOCK_SIZE], 'distinct'); %# 16-by-NumBlocks
%# compute features for each block
JJ = featureFunc(J)'; %'# NumBlocks-by-6
%# cluster blocks according to the features extracted
[clustIDX, ~, ~, Dist] = kmeans(JJ, NUM_CLUSTERS);
%# display the cluster index assigned for each block as an image
cc = reshape(clustIDX, ceil(size(I)/BLOCK_SIZE));
RGB = label2rgb(cc);
imshow(RGB), hold on
%# find and display the closest block to each cluster
[~,idx] = min(Dist);
[r c] = ind2sub(ceil(size(I)/BLOCK_SIZE), idx);
for i=1:NUM_CLUSTERS
text(c(i)+2, r(i), num2str(i), 'fontsize',20)
end
plot(c, r, 'k.', 'markersize',30)
legend('Centroids')

The centroids do not correspond to coordinates in the image, but to coordinates in the feature space. There is two ways you can test how well kmeans performed. For both ways, you want to fist associate the points with their closest cluster. You get this information from the first output of kmeans.
(1) You can visualize the clustering result by reducing the 6-dimensional space to 2 or 3-dimensional space and then plotting the differently classified coordinates in different colors.
Assuming that the feature vectors are collected in an array called featureArray, and that you asked for nClusters clusters, you'd do the plot as follows using mdscale to transform the data to, say, 3D space:
%# kmeans clustering
[idx,centroids6D] = kmeans(featureArray,nClusters);
%# find the dissimilarity between features in the array for mdscale.
%# Add the cluster centroids to the points, so that they get transformed by mdscale as well.
%# I assume that you use Euclidean distance.
dissimilarities = pdist([featureArray;centroids6D]);
%# transform onto 3D space
transformedCoords = mdscale(dissimilarities,3);
%# create colormap with nClusters colors
cmap = hsv(nClusters);
%# loop to plot
figure
hold on,
for c = 1:nClusters
%# plot the coordinates
currentIdx = find(idx==c);
plot3(transformedCoords(currentIdx,1),transformedCoords(currentIdx,2),...
transformedCoords(currentIdx,3),'.','Color',cmap(c,:));
%# plot the cluster centroid with a black-edged square
plot3(transformedCoords(1:end-nClusters+c,1),transformedCoords(1:end-nClusters+c,2),...
transformedCoords(1:end-nClusters+c,3),'s','MarkerFaceColor',cmap(c,:),...
MarkerEdgeColor','k');
end
(2) You can, alternatively, create a pseudo-colored image that shows you what part of the image belongs to which cluster
Assuming that you have nRows by nCols blocks, you write
%# kmeans clustering
[idx,centroids6D] = kmeans(featureArray,nClusters);
%# create image
img = reshape(idx,nRows,nCols);
%# create colormap
cmap = hsv(nClusters);
%# show the image and color according to clusters
figure
imshow(img,[])
colormap(cmap)

Related

plot an mdscaled matrix of points

I have a 151-by-200 matrix, let's say D, where rows are patients and cols are some features.
After using pdist and mdscale I obtain a 151-by-3 matrix. I also used some consensus clustering algorithm and obtained the partitioning of the patients.
Now I would like to plot the distribution of the mdscaled matrix of points, each one with a shape and a colour (where shapes indicates classes and colors clusters ) as in the image below.
Can you give me an hint on how to do it? Thank you.
Here is an example how to plot grouped 3D points:
% data and clusters
load fisheriris
X = meas(:,1:3);
[L,~,Y] = unique(species);
% colors and markers of each group
colors = hsv(numel(L));
markers = 'osdv^x*+.><ph';
% plot
for i=1:numel(L)
ind = (Y == i);
h(i) = line(X(ind,1), X(ind,2), X(ind,3), ...
'LineStyle','none', 'LineWidth',1, 'MarkerSize',8, ...
'Marker',markers(i), 'Color',colors(i,:), ...
'MarkerEdgeColor','k', 'MarkerFaceColor',colors(i,:));
end
view(-150,30), axis vis3d, grid on, box on
legend(h, L, 'Location','northeast')
title('MultiDimensional Scaling')
xlabel('dim1'), ylabel('dim2'), zlabel('dim3')
You can customize the shapes and colors to match your criteria ("shapes indicates classes and colors clusters").

multiple matlab contour plots with one level

I have a number of 2d probability mass functions from 2 categories. I am trying to plot the contours to visualise them (for example at their half height, but doesn't really matter).
I don't want to use contourf to plot directly because I want to control the fill colour and opacity. So I am using contourc to generate xy coordinates, and am then using fill with these xy coordinates.
The problem is that the xy coordinates from the contourc function have strange numbers in them which cause the following strange vertices to be plotted.
At first I thought it was the odd contourmatrix format, but I don't think it is this as I am only asking for one value from contourc. For example...
contourmatrix = contourc(x, y, Z, [val, val]);
h = fill(contourmatrix(1,:), contourmatrix(2,:), 'r');
Does anyone know why the contourmatrix has these odd values in them when I am only asking for one contour?
UPDATE:
My problem seems might be a failure mode of contourc when the input 2D matrix is not 'smooth'. My source data is a large set of (x,y) points. Then I create a 2D matrix with some hist2d function. But when this is noisy the problem is exaggerated...
But when I use a 2d kernel density function to result in a much smoother 2D function, the problem is lessened...
The full process is
a) I have a set of (x,y) points which form samples from a distribution
b) I convert this into a 2D pmf
c) create a contourmatrix using contourc
d) plot using fill
Your graphic glitches are because of the way you use the data from the ContourMatrix. Even if you specify only one isolevel, this can result in several distinct filled area. So the ContourMatrix may contain data for several shapes.
simple example:
isolevel = 2 ;
[X,Y,Z] = peaks ;
[C,h] = contourf(X,Y,Z,[isolevel,isolevel]);
Produces:
Note that even if you specified only one isolevel to be drawn, this will result in 2 patches (2 shapes). Each has its own definition but they are both embedded in the ContourMatrix, so you have to parse it if you want to extract each shape coordinates individually.
To prove the point, if I simply throw the full contour matrix to the patch function (the fill function will create patch objects anyway so I prefer to use the low level function when practical). I get the same glitch lines as you do:
xc = X(1,:) ;
yc = Y(:,1) ;
c = contourc(xc,yc,Z,[isolevel,isolevel]);
hold on
hp = patch(c(1,1:end),c(2,1:end),'r','LineWidth',2) ;
produces the same kind of glitches that you have:
Now if you properly extract each shape coordinates without including the definition column, you get the proper shapes. The example below is one way to extract and draw each shape for inspiration but they are many ways to do it differently. You can certainly compact the code a lot but here I detailed the operations for clarity.
The key is to read and understand how the ContourMatrix is build.
parsed = false ;
iShape = 1 ;
while ~parsed
%// get coordinates for each isolevel profile
level = c(1,1) ; %// current isolevel
nPoints = c(2,1) ; %// number of coordinate points for this shape
idx = 2:nPoints+1 ; %// prepare the column indices of this shape coordinates
xp = c(1,idx) ; %// retrieve shape x-values
yp = c(2,idx) ; %// retrieve shape y-values
hp(iShape) = patch(xp,yp,'y','FaceAlpha',0.5) ; %// generate path object and save handle for future shape control.
if size(c,2) > (nPoints+1)
%// There is another shape to draw
c(:,1:nPoints+1) = [] ; %// remove processed points from the contour matrix
iShape = iShape+1 ; %// increment shape counter
else
%// we are done => exit while loop
parsed = true ;
end
end
grid on
This will produce:

How to apply Difference of Gaussian(DoG) approach to extract pores in fingerprint image in MATLAB?

I am working on a fingerprint recognition system, I need to extract pores which are present in the ridges.During my research I read that a blob detection technique can separate the pores from the ridges.However when I am applying a Gaussian Filter, it is smoothing the image. Also the DoG results in another similar image. If this is the right approach then where am I going wrong?
You actually need to apply a gaussian filter with 2 different sets of parameters, then subtract the filters and perform a convolution of the input image with that new filter, i.e. the difference of gaussians.
Here is an example with the coins.png demo image...The code is commented; don't hesitate to play with the parameters to see how that affects the output:
clear
clc
Im = imread('coins.png');
[r,c,~] = size(Im);
%// Initialize 3d array containing 3 images.
FilteredIm = zeros(r,c,3);
%// Try with 3 different kernel sizes
GaussKernel = [7 11 15];
figure;
subplot(2,2,1)
imshow(Im);
title('Original image');
TitleText = cell(1,numel(GaussKernel));
%// Apply Gaussian filter with 3 different kernel sizes/sigma values.
for k = 1:numel(GaussKernel)
%// Kernel sizes change but sigma values stay the same for
%// simplicity...you can play around with them of course.
GaussFilt1 = fspecial('Gaussian', GaussKernel(k), 12);
GaussFilt2 = fspecial('Gaussian', GaussKernel(k), 4);
%// Subtract the filters
DiffGauss = GaussFilt1 - GaussFilt2;
%// Perform convoluton to get filtered image
FilteredIm(:,:,k) = conv2(double(Im), DiffGauss, 'same');
%// Display
subplot(2,2,k+1)
imshow(FilteredIm(:,:,k));
TitleText{k} = sprintf('Filtered- Kernel of %i',GaussKernel(k));
title(TitleText{k});
end
Output:
Now for your application you probably need to play around with the kernel size and sigma parameters provided to fspecial. Good luck!

Finding 2D area defined by contour lines in Matlab

I am having difficulty with calculating 2D area of contours produced from a Kernel Density Estimation (KDE) in Matlab. I have three variables:
X and Y = meshgrid which variable 'density' is computed over (256x256)
density = density computed from the KDE (256x256)
I run the code
contour(X,Y,density,10)
This produces the plot that is attached. For each of the 10 contour levels I would like to calculate the area. I have done this in some other platforms such as R but am having trouble figuring out the correct method / syntax in Matlab.
C = contourc(density)
I believe the above line would store all of the values of the contours allowing me to calculate the areas but I do not fully understand how these values are stored nor how to get them properly.
This little script will help you. Its general for contour. Probably working for contour3 and contourf as well, with adjustments of course.
[X,Y,Z] = peaks; %example data
% specify certain levels
clevels = [1 2 3];
C = contour(X,Y,Z,clevels);
xdata = C(1,:); %not really useful, in most cases delimters are not clear
ydata = C(2,:); %therefore further steps to determine the actual curves:
%find curves
n(1) = 1; %n: indices where the certain curves start
d(1) = ydata(1); %d: distance to the next index
ii = 1;
while true
n(ii+1) = n(ii)+d(ii)+1; %calculate index of next startpoint
if n(ii+1) > numel(xdata) %breaking condition
n(end) = []; %delete breaking point
break
end
d(ii+1) = ydata(n(ii+1)); %get next distance
ii = ii+1;
end
%which contourlevel to calculate?
value = 2; %must be member of clevels
sel = find(ismember(xdata(n),value));
idx = n(sel); %indices belonging to choice
L = ydata( n(sel) ); %length of curve array
% calculate area and plot all contours of the same level
for ii = 1:numel(idx)
x{ii} = xdata(idx(ii)+1:idx(ii)+L(ii));
y{ii} = ydata(idx(ii)+1:idx(ii)+L(ii));
figure(ii)
patch(x{ii},y{ii},'red'); %just for displaying purposes
%partial areas of all contours of the same plot
areas(ii) = polyarea(x{ii},y{ii});
end
% calculate total area of all contours of same level
totalarea = sum(areas)
Example: peaks (by Matlab)
Level value=2 are the green contours, the first loop gets all contour lines and the second loop calculates the area of all green polygons. Finally sum it up.
If you want to get all total areas of all levels I'd rather write some little functions, than using another loop. You could also consider, to plot just the level you want for each calculation. This way the contourmatrix would be much easier and you could simplify the process. If you don't have multiple shapes, I'd just specify the level with a scalar and use contour to get C for only this level, delete the first value of xdata and ydata and directly calculate the area with polyarea
Here is a similar question I posted regarding the usage of Matlab contour(...) function.
The main ideas is to properly manipulate the return variable. In your example
c = contour(X,Y,density,10)
the variable c can be returned and used for any calculation over the isolines, including area.

Measuring weighted mean length from an electrophoresis gel image

Background:
My question relates to extracting feature from an electrophoresis gel (see below). In this gel, DNA is loaded from the top and allowed to migrate under a voltage gradient. The gel has sieves so smaller molecules migrate further than longer molecules resulting in the separation of DNA by length. So higher up the molecule, the longer it is.
Question:
In this image there are 9 lanes each with separate source of DNA. I am interested in measuring the mean location (value on the y axis) of each lane.
I am really new to image processing, but I do know MATLAB and I can get by with R with some difficulty. I would really appreciate it if someone can show me how to go about finding the mean of each lane.
Here's my try. It requires that the gels are nice (i.e. straight lanes and the gel should not be rotated), but should otherwise work fairly generically. Note that there are two image-size-dependent parameters that will need to be adjusted to make this work on images of different size.
%# first size-dependent parameter: should be about 1/4th-1/5th
%# of the lane width in pixels.
minFilterWidth = 10;
%# second size-dependent parameter for filtering the
%# lane profiles
gaussWidth = 5;
%# read the image, normalize to 0...1
img = imread('http://img823.imageshack.us/img823/588/gele.png');
img = rgb2gray(img);
img = double(img)/255;
%# Otsu thresholding to (roughly) find lanes
thMsk = img < graythresh(img);
%# count the mask-pixels in each columns. Due to
%# lane separation, there will be fewer pixels
%# between lanes
cts = sum(thMsk,1);
%# widen the local minima, so that we get a nice
%# separation between lanes
ctsEroded = imerode(cts,ones(1,minFilterWidth));
%# use imregionalmin to identify the separation
%# between lanes. Invert to get a positive mask
laneMsk = ~repmat(imregionalmin(ctsEroded),size(img,1),1);
Image with lanes that will be used for analysis
%# for each lane, create an averaged profile
lblMsk = bwlabel(laneMsk);
nLanes = max(lblMsk(:));
profiles = zeros(size(img,1),nLanes);
midLane = zeros(1,nLanes);
for i = 1:nLanes
profiles(:,i) = mean(img.*(lblMsk==i),2);
midLane(:,i) = mean(find(lblMsk(1,:)==i));
end
%# Gauss-filter the profiles (each column is an
%# averaged intensity profile
G = exp(-(-gaussWidth*5:gaussWidth*5).^2/(2*gaussWidth^2));
G=G./sum(G);
profiles = imfilter(profiles,G','replicate'); %'
%# find the minima
[~,idx] = min(profiles,[],1);
%# plot
figure,imshow(img,[])
hold on, plot(midLane,idx,'.r')
Here's my stab at a simple template for an interactive way to do this:
% Load image
img = imread('gel.png');
img = rgb2gray(img);
% Identify lanes
imshow(img)
[x,y] = ginput;
% Invert image
img = max(img(:)) - img;
% Subtract background
[xn,yn] = ginput(1);
noise = img((yn-2):(yn+2), (xn-2):(xn+2));
noise = mean(noise(:));
img = img - noise;
% Calculate means
means = (1:size(img,1)) * double(img(:,round(x))) ./ sum(double(img(:,round(x))), 1);
% Plot
hold on
plot(x, means, 'r.')
The first thing to do to is convert your RGB image to grayscale:
gr = rgb2gray(imread('gelk.png'));
Then, take a look at the image intensity histogram using imhist. Notice anything funny about it? Use imcontrast(imshow(gr)) to pull up the contrast adjustment tool. I found that eliminating the weird stuff after the major intensity peak was beneficial.
The image processing task itself can be divided into several steps.
Separate each lane
Identify ('segment') the band in each lane
Calculate the location of the bands
Step 1 can be done "by hand," if the lane widths are guaranteed. If not, the line detection offered by the Hough transform is probably the way to go. The documentation on the Image Processing Toolbox has a really nice tutorial on this topic. My code recapitulates that tutorial with better parameters for your image. I only spent a few minutes with them, I'm sure you can improve the results by tuning the parameters further.
Step 2 can be done in a few ways. The easiest technique to use is Otsu's method for thresholding grayscale images. This method works by determining a threshold that minimizes the intra-class variance, or, equivalently, maximizes the inter-class variance. Otsu's method is present in MATLAB as the graythresh function. If Otsu's method isn't working well you can try multi-level Otsu or a number of other histogram based threshold determination methods.
Step 3 can be done as you suggest, by calculating the mean y value of the segmented band pixels. This is what my code does, though I've restricted the check to just the center column of each lane, in case the separation was off. I'm worried that the result may not be as good as calculating the band centroid and using its location.
Here is my solution:
function [locations, lanesBW, lanes, cols] = segmentGel(gr)
%%# Detect lane boundaries
unsharp = fspecial('unsharp'); %# Sharpening filter
I = imfilter(gr,unsharp); %# Apply filter
bw = edge(I,'canny',[0.01 0.3],0.5); %# Canny edges with parameters
[H,T,R] = hough(bw); %# Hough transform of edges
P = houghpeaks(H,20,'threshold',ceil(0.5*max(H(:)))); %# Find peaks of Hough transform
lines = houghlines(bw,T,R,P,'FillGap',30,'MinLength',20); %# Use peaks to identify lines
%%# Plot detected lines above image, for quality control
max_len = 0;
imshow(I);
hold on;
for k = 1:length(lines)
xy = [lines(k).point1; lines(k).point2];
plot(xy(:,1),xy(:,2),'LineWidth',2,'Color','green');
%# Plot beginnings and ends of lines
plot(xy(1,1),xy(1,2),'x','LineWidth',2,'Color','yellow');
plot(xy(2,1),xy(2,2),'x','LineWidth',2,'Color','red');
%# Determine the endpoints of the longest line segment
len = norm(lines(k).point1 - lines(k).point2);
if ( len > max_len)
max_len = len;
end
end
hold off;
%%# Use first endpoint of each line to separate lanes
cols = zeros(length(lines),1);
for k = 1:length(lines)
cols(k) = lines(k).point1(1);
end
cols = sort(cols); %# The lines are in no particular order
lanes = cell(length(cols)-1,1);
for k = 2:length(cols)
lanes{k-1} = im2double( gr(:,cols(k-1):cols(k)) ); %# im2double for compatibility with greythresh
end
otsu = cellfun(#graythresh,lanes); %# Calculate threshold for each lane
lanesBW = cell(size(lanes));
for k = 1:length(lanes)
lanesBW{k} = lanes{k} < otsu(k); %# Apply thresholds
end
%%# Use segmented bands to determine migration distance
locations = zeros(size(lanesBW));
for k = 1:length(lanesBW)
width = size(lanesBW{k},2);
[y,~] = find(lanesBW{k}(:,round(width/2))); %# Only use center of lane
locations(k) = mean(y);
end
I suggest you carefully examine not only each output value, but the results from each step of the function, before using it for actual research purposes. In order to get really good results, you will have to read a bit about Hough transforms, Canny edge detection and Otsu's method, and then tune the parameters. You may also have to alter how the lanes are split; this code assumes that there will be lines detected on either side of the image.
Let me add another implementation similar in concept to that of #JohnColby's, only without the manual user-interaction:
%# read image
I = rgb2gray(imread('gele.png'));
%# middle position of each lane
%# (assuming lanes are somewhat evenly spread and of similar width)
x = linspace(1,size(I,2),10);
x = round( (x(1:end-1)+x(2:end))./2 );
%# compute the mean value across those columns
m = mean(I(:,x));
%# find the y-indices of the mean values
[~,idx] = min( bsxfun(#minus, double(I(:,x)), m) );
%# show the result
figure(1)
imshow(I, 'InitialMagnification',100, 'Border','tight')
hold on, plot(x, idx, ...
'Color','r', 'LineStyle','none', 'Marker','.', 'MarkerSize',10)
and applied on the smaller image: