I have the following code, which first reads a dataset and then applies k-means clustering to it. There are three clusters. I want to replace the manually written per-cluster code with a single for loop that plots each cluster and calculates the mean of the first column of every cluster. How can I do that?
Dataset
0.119349659383,2765187888.188327790000,-50.272277924288,0.000010124208
0.119639999551,2780553879.583636760000,-45.173332876699,0.000015075661
0.119899673836,2765356033.223678110000,-50.327888424563,0.000010123978
0.120209965074,2780981089.939126490000,-45.152589356947,0.000015059274
0.120449679454,2765635512.158593650000,-50.363949423158,0.000010131346
dataset= readmatrix('newdata.txt');
clust = zeros(size(dataset,1),5);
for i=1:5
clust(:,i) = kmeans(dataset,i,'emptyaction','singleton',...
'replicate',5);
figure;
[silh4,h] = silhouette(dataset,clust(:,i));
end
eva = evalclusters(dataset,clust,'silhouette');
K=eva.OptimalK;
[idx,C,sumdist] = kmeans(dataset,3,'Display','final','Replicates',5);
figure
gscatter(dataset(:,1),dataset(:,2),idx,'bgm')
hold on
plot(C(:,1),C(:,2),'kx')
legend('Cluster 1','Cluster 2','Cluster 3','Cluster Centroid')
%% This code to be using For Loop
dataset_idx=zeros(size(dataset,1));
dataset_idx=dataset(:,:);
dataset_idx(:,5)=idx;
cluster1 = dataset_idx(dataset_idx(:,5) == 1,:);
cluster2 = dataset_idx(dataset_idx(:,5) == 2,:);
cluster3 = dataset_idx(dataset_idx(:,5) == 3,:);
figure;
scatter(cluster1(:,1),cluster1(:,2))
legend('Cluster 1')
title('Cluster 1')
figure;
scatter(cluster2(:,1),cluster2(:,2))
legend('Cluster 2')
title('Cluster 2')
figure;
scatter(cluster3(:,1),cluster3(:,2))
legend('Cluster 3')
title('Cluster 3')
%% This code to be using For Loop Instead of manually written so much lines
T=cluster1(:,1);
DeltaT = diff(T);
Mcluster1Timeseries = mean(DeltaT);
formatSpec = 'Mean DeltaT of Cluster 1 is %4e ';
fprintf(formatSpec,Mcluster1Timeseries)
Mcluster1Frequncy = mean(cluster1(:,2));
formatSpec = 'Mean Frequncy of Cluster 1 is %4e ';
fprintf(formatSpec,Mcluster1Frequncy)
Mcluster1Amplitude = max(cluster1(:,3));
formatSpec = 'Max Amplitude of Cluster 1 is %4.4f ';
fprintf(formatSpec,Mcluster1Amplitude)
Mcluster1PW = mean(cluster1(:,4));
formatSpec = 'Mean Pulse Width of Cluster 1 is %4e ';
fprintf(formatSpec,Mcluster1PW)
T2=cluster2(:,1);
DeltaT2 = diff(T2);
Mcluster2Timeseries = mean(DeltaT2);
formatSpec = 'Mean DeltaT of Cluster 2 is %4e ';
fprintf(formatSpec,Mcluster2Timeseries)
Mcluster2Frequncy = mean(cluster2(:,2));
formatSpec = 'Mean Frequncy of Cluster 2 is %4e ';
fprintf(formatSpec,Mcluster2Frequncy)
Mcluster2Amplitude = max(cluster2(:,3));
formatSpec = 'Max Amplitude of Cluster 2 is %4.4f ';
fprintf(formatSpec,Mcluster2Amplitude)
Mcluster2PW = mean(cluster2(:,4));
formatSpec = 'Mean Pulse Width of Cluster 2 is %4e ';
fprintf(formatSpec,Mcluster2PW)
T3=cluster3(:,1);
DeltaT3 = diff(T3);
Mcluster3Timeseries = mean(DeltaT3);
formatSpec = 'Mean DeltaT of Cluster 3 is %4e ';
fprintf(formatSpec,Mcluster3Timeseries)
Mcluster3Frequncy = mean(cluster3(:,2));
formatSpec = 'Mean Frequncy of Cluster 3 is %4e ';
fprintf(formatSpec,Mcluster3Frequncy)
Mcluster3Amplitude = max(cluster3(:,3));
formatSpec = 'Max Amplitude of Cluster 3 is %4.4f ';
fprintf(formatSpec,Mcluster3Amplitude)
Mcluster3PW = mean(cluster3(:,4));
formatSpec = 'Mean Pulse Width of Cluster 3 is %4e ';
fprintf(formatSpec,Mcluster3PW)
Below I've used a cell array, though in this case you could also use a 3-dimensional array if you're so inclined. To get the "Cluster 1/2/3" labels to match, I used a bit more string formatting.
Here's what I came up with.
dataset= readmatrix('newdata.txt');
clust = zeros(size(dataset,1),5);
for i=1:5
clust(:,i) = kmeans(dataset,i,'emptyaction','singleton',...
'replicate',5);
figure;
[silh4,h] = silhouette(dataset,clust(:,i));
end
eva = evalclusters(dataset,clust,'silhouette');
K=eva.OptimalK;
[idx,C,sumdist] = kmeans(dataset,3,'Display','final','Replicates',5);
figure
gscatter(dataset(:,1),dataset(:,2),idx,'bgm')
hold on
plot(C(:,1),C(:,2),'kx')
legend('Cluster 1','Cluster 2','Cluster 3','Cluster Centroid')
% Append the cluster labels as a fifth column (no preallocation needed)
dataset_idx = [dataset, idx];
clusters = cell(3,1);
for i = 1:3
clusters{i} = dataset_idx(dataset_idx(:,5) == i,:);
figure;
scatter(clusters{i}(:,1),clusters{i}(:,2))
legend(sprintf('Cluster %d',i))
title(sprintf('Cluster %d',i))
end
for i = 1:3
T = clusters{i}(:,1);
DeltaT = diff(T);
MclusterTimeseries = mean(DeltaT);
formatSpec = 'Mean DeltaT of Cluster %d is %4e ';
fprintf(formatSpec,i,MclusterTimeseries)
MclusterFrequncy = mean(clusters{i}(:,2));
formatSpec = 'Mean Frequncy of Cluster %d is %4e ';
fprintf(formatSpec,i,MclusterFrequncy)
MclusterAmplitude = max(clusters{i}(:,3));
formatSpec = 'Max Amplitude of Cluster %d is %4.4f ';
fprintf(formatSpec,i,MclusterAmplitude)
MclusterPW = mean(clusters{i}(:,4));
formatSpec = 'Mean Pulse Width of Cluster %d is %4e ';
fprintf(formatSpec,i,MclusterPW)
end
Here is a further modification so that the command window output is more readable.
dataset= readmatrix('newdata.txt');
clust = zeros(size(dataset,1),5);
for i=1:5
clust(:,i) = kmeans(dataset,i,'emptyaction','singleton',...
'replicate',5);
figure;
[silh4,h] = silhouette(dataset,clust(:,i));
end
eva = evalclusters(dataset,clust,'silhouette');
K=eva.OptimalK;
[idx,C,sumdist] = kmeans(dataset,3,'Display','final','Replicates',5);
figure
gscatter(dataset(:,1),dataset(:,2),idx,'bgm')
hold on
plot(C(:,1),C(:,2),'kx')
legend('Cluster 1','Cluster 2','Cluster 3','Cluster Centroid')
% Append the cluster labels as a fifth column (no preallocation needed)
dataset_idx = [dataset, idx];
clusters = cell(3,1);
for i = 1:3
clusters{i} = dataset_idx(dataset_idx(:,5) == i,:);
figure;
scatter(clusters{i}(:,1),clusters{i}(:,2))
legend(sprintf('Cluster %d',i))
title(sprintf('Cluster %d',i))
end
for i = 1:3
T = clusters{i}(:,1);
fprintf('\nCLUSTER %d:\n',i)
DeltaT = diff(T);
MclusterTimeseries = mean(DeltaT);
formatSpec = 'Mean DeltaT of Cluster %d is %4e\n';
fprintf(formatSpec,i,MclusterTimeseries)
MclusterFrequncy = mean(clusters{i}(:,2));
formatSpec = 'Mean Frequncy of Cluster %d is %4e\n';
fprintf(formatSpec,i,MclusterFrequncy)
MclusterAmplitude = max(clusters{i}(:,3));
formatSpec = 'Max Amplitude of Cluster %d is %4.4f\n';
fprintf(formatSpec,i,MclusterAmplitude)
MclusterPW = mean(clusters{i}(:,4));
formatSpec = 'Mean Pulse Width of Cluster %d is %4e\n';
fprintf(formatSpec,i,MclusterPW)
end
Output text for the second of these scripts:
Replicate 1, 1 iterations, total sum of distances = 1.05391e+11.
Replicate 2, 1 iterations, total sum of distances = 1.02249e+11.
Replicate 3, 1 iterations, total sum of distances = 1.05391e+11.
Replicate 4, 1 iterations, total sum of distances = 1.02249e+11.
Replicate 5, 1 iterations, total sum of distances = 1.30309e+11.
Best total sum of distances = 1.02249e+11
CLUSTER 1:
Mean DeltaT of Cluster 1 is 5.500100e-04
Mean Frequncy of Cluster 1 is 2.765393e+09
Max Amplitude of Cluster 1 is -50.2723
Mean Pulse Width of Cluster 1 is 1.012651e-05
CLUSTER 2:
Mean DeltaT of Cluster 2 is NaN
Mean Frequncy of Cluster 2 is 2.780981e+09
Max Amplitude of Cluster 2 is -45.1526
Mean Pulse Width of Cluster 2 is 1.505927e-05
CLUSTER 3:
Mean DeltaT of Cluster 3 is NaN
Mean Frequncy of Cluster 3 is 2.780554e+09
Max Amplitude of Cluster 3 is -45.1733
Mean Pulse Width of Cluster 3 is 1.507566e-05
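The NaN values for clusters 2 and 3 appear because, with the five-row sample, each of those clusters holds a single point: diff of a one-element vector is empty, and mean of an empty array is NaN. A minimal sketch of one way to guard against this while also keeping the statistics in a matrix is shown here. The hard-coded dataset_idx (rounded sample values with made-up labels) and the name stats are assumptions for illustration, not part of the original answer.

```matlab
% Hypothetical stand-in for the clustered data: columns are
% time, frequency, amplitude, pulse width, cluster label.
dataset_idx = [0.1193 2.7652e9 -50.27 1.01e-5 1;
               0.1196 2.7806e9 -45.17 1.51e-5 3;
               0.1199 2.7654e9 -50.33 1.01e-5 1;
               0.1202 2.7810e9 -45.15 1.51e-5 2;
               0.1204 2.7656e9 -50.36 1.01e-5 1];
K = 3;
% One row per cluster: mean DeltaT, mean frequency, max amplitude, mean pulse width
stats = nan(K,4);
for i = 1:K
    rows = dataset_idx(dataset_idx(:,5) == i, 1:4);
    if isempty(rows)
        continue                              % empty cluster: leave the row as NaN
    end
    if size(rows,1) >= 2
        stats(i,1) = mean(diff(rows(:,1)));   % needs at least two time stamps
    end
    stats(i,2) = mean(rows(:,2));
    stats(i,3) = max(rows(:,3));
    stats(i,4) = mean(rows(:,4));
    if isnan(stats(i,1))
        fprintf('Cluster %d: only %d point(s), DeltaT undefined\n', i, size(rows,1));
    else
        fprintf('Cluster %d: mean DeltaT = %e\n', i, stats(i,1));
    end
end
```

Keeping the results in stats rather than in separately named variables means the same loop works unchanged if K is later taken from eva.OptimalK.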
I am trying to plot clusters with the MATLAB function kmeans but am getting far too many centroids and have no idea why. Here is my code:
rng(1);
wv_prop = [min_pts(:) slope(:)];
if (isempty(wv_prop)==0)
[idx,C] = kmeans(wv_prop,2);
subplot(3,2,5);
plot(wv_prop(idx==1,1),wv_prop(idx==1,2),'b.','MarkerSize',12);
hold on
plot(wv_prop(idx==2,1),wv_prop(idx==2,2),'r.','MarkerSize',12);
plot(C(:,1),C(:,2),'kx',...
'MarkerSize',15,'LineWidth',3)
Here is an example of the data I use:
wv_prop:
-7.50904246127179e-05 2.52737793199461e-05
-7.64715493632322e-05 -29.2845021783221
-8.16630514296111e-05 -15.5896244315076
-8.60516901697005e-05 3.87325886247646e-05
-9.07390060961131e-05 4.06844795948271e-05
-7.93980060844007e-05 3.72806601486833e-05
-8.08420950480078e-05 3.81372062193057e-05
-8.53045358845788e-05 4.00072285969318e-05
-7.07712622172574e-05 3.55502071296987e-05
-8.02846575361635e-05 3.91085777803079e-05
-8.82904795076420e-05 4.21557386394776e-05
-8.32088783242009e-05 4.08103587885502e-05
-8.17564769131708e-05 4.06201592898485e-05
-8.88574631122910e-05 4.31980154605407e-05
-9.55496137235401e-05 4.55119867638717e-05
-7.11241881995855e-05 3.72772062250438e-05
-8.20641318582800e-05 6.09118479264444e-05
-7.92369664739745e-05 5.86246041439769e-05
-7.61219361068837e-05 5.57318660221894e-05
-8.52918510230295e-05 5.84710267850959e-05
-8.99668387994064e-05 5.84558301867090e-05
-9.62926333243702e-05 5.87762601336998e-05
-7.87678776488358e-05 4.67111894400931e-05
-7.53525297201741e-05 4.13207831828739e-05
-7.71766983561651e-05 3.82625914011195e-05
-9.03499693359608e-05 4.06874790212135e-05
-7.59387077492098e-05 2.92390401569819e-05
-7.97649576465785e-05 32.1683359898974
-8.06408560217508e-05 1.55409105433306e-05
-8.10515208048491e-05 1.31180389653758e-05
-7.70540121076476e-05 9.43353748786386e-06
-7.24001267378072e-05 5.78599898248438e-06
-8.93350436455590e-05 9.61034087028361e-06
-7.97722332494743e-05 4.89104076311932e-06
-8.40022599007737e-05 5.06726288587479e-06
-7.89655937936233e-05 2.44686642783556e-06
-8.58007004774045e-05 4.06628163987085e-06
-7.68775819259902e-05 1.06146142996962e-06
-7.05769224846652e-05 -2.25666633700963e-06
-7.73022200637920e-05 1.34546072255262e-06
-7.65784897728499e-05 1.62917829786978e-06
-7.41548367397790e-05 1.46536230997079e-06
-9.17371298592096e-05 1.17025036839378e-05
-7.35354500231489e-05 4.43710161064086e-06
function [] = Select_Figs(filename,startblock,endblock,startclust,endclust,animal,day)
%Select_Figs - Plots average waveforms, standard deviation, difference over time,
%fitted peak location histogram, mean squared error, k-mean clustered peak location and slope,
%and raw waveforms across selected blocks and clusters,
%saves to folder Selected-Figures-animal-date
%
%Select_Figs(filename,startblock,endblock,startclust,endclust,animal,date)
%
%filename - Sort.mat(e.g. = 'Sort.mat')
%
%startblock- first block (e.g. = 7)
%
%endblock - last block (e.g. = 12)
%
%startclust - first cluster (e.g. = 5)
%
%endclust - last cluster (e.g. = 10)
%
%animal - animal number (e.g. = 12)
%
%date - start of experiment (e.g. = 101617)
%
%Function called by User_Sort.m
Sort = filename;
addpath(pwd);
%Get Sort file
foldername = sprintf('Selected-Figures-%s-%s',animal,day); %Creates dynamic folder name to store figures
mkdir(foldername); %Makes directory
cd(fullfile(foldername)); %Cd to new directory
tvec = 0:.013653333:(.013653333*97); %Time vector
t = tvec(2:end);
for clust = startclust:endclust %Loops through all clusters
fig = cell(1,endblock); %Preallocate # of figures
name = sprintf('Idx_%d',clust); %Individual cluster name
fig{clust} = figure('Visible', 'off'); %Turns figure visibility off
for block = startblock:endblock %Loop through all blocks
wvfrms_avg =Sort.(name)(block).avg;
wvfrms_avg_scaled = (wvfrms_avg*10^6);
wvfrms_std =Sort.(name)(block).standdev;
min_ind = wvfrms_avg_scaled == min(wvfrms_avg_scaled);
min_loc = t(min_ind);
[~,io] = findpeaks(wvfrms_avg_scaled);
leftmin = io<find(wvfrms_avg_scaled==min(wvfrms_avg_scaled));
leftmin = leftmin(leftmin~=0);
rightmin = io>find(wvfrms_avg_scaled==min(wvfrms_avg_scaled));
rightmin = rightmin(rightmin~=0);
if (isempty(wvfrms_avg_scaled)==0)
subplot(3,2,1);
if (isnan(wvfrms_avg_scaled)==0)&((-30<min(wvfrms_avg_scaled))||(min_loc>0.55)||(min_loc<0.3)||(length(io(leftmin))>2)||(length(io(rightmin))>2))
plot(tvec(1:end-1),wvfrms_avg_scaled,'r');
else
plot(tvec(1:end-1),wvfrms_avg_scaled,'b');
end
end
new_wv = wvfrms_avg_scaled(40:end);
[~,locs_scaled] = findpeaks(new_wv);
if isempty(locs_scaled)==1
ind_scaled = max(new_wv);
else
ind_scaled = locs_scaled(1);
end
x1_scaled = new_wv(find(min(wvfrms_avg_scaled)));
y1_scaled = min(wvfrms_avg_scaled);
x2_scaled = ind_scaled;
y2_scaled = new_wv(find(ind_scaled));
slope_scaled= (y2_scaled-y1_scaled)./(x2_scaled-x1_scaled);
if (isnan(wvfrms_avg_scaled)==0)
if ((-30<min(wvfrms_avg_scaled)))
lab = sprintf('Time (ms) \n Peak exceeds amplitude range (%s)',num2str(min(wvfrms_avg_scaled)));
xlabel(lab,'FontSize',8);
ylabel('Mean Voltage (\muV)','FontSize',8);
title('Average Waveform','FontSize',8);
elseif ((min_loc>0.55)||(min_loc<0.3))
lab = sprintf('Time (ms) \n Peak location exceeds range (Time = %s)',num2str(min_loc));
xlabel(lab,'FontSize',8);
ylabel('Mean Voltage (\muV)','FontSize',8);
title('Average Waveform','FontSize',8);
elseif (length(io(leftmin))>2)||(length(io(rightmin))>2)
lab = sprintf('Time (ms) \n Peak limit exceeded (# = %s) Peak = %s',num2str(length(io)),num2str(min(wvfrms_avg_scaled)));
xlabel(lab,'FontSize',8);
ylabel('Mean Voltage (\muV)','FontSize',8);
title('Average Waveform','FontSize',8);
else
lab = sprintf('Time (ms) \n Peak = %s Slope = %s',num2str(min(wvfrms_avg_scaled)),num2str(slope_scaled));
xlabel(lab,'FontSize',8)
ylabel('Mean Voltage (\muV)','FontSize',8);
title('Average Waveform','FontSize',8);
end
end
if (isempty(wvfrms_std)==0&isempty(wvfrms_avg)==0)
subplot(3,2,2);
errorbar(t,wvfrms_avg,wvfrms_std); %Plots errorbars
end
wvfrms_num_text = sprintf(['Time (ms) \n # Waveforms: ' num2str(size(Sort.(name)(block).block,2))]);
xlabel(wvfrms_num_text,'FontSize',8);
ylabel('Mean Voltage (V)','FontSize',8);
title('Average Waveform + STD','FontSize',8);
wvfrms = Sort.(name)(block).block;
for i = 1:size(wvfrms,1)
if isempty(wvfrms)==0
min_pts = min(wvfrms,[],2); %Adds array of min wvfrm points to matrix
slope = zeros(1,size(wvfrms,1));
new = wvfrms(i,:);
new_cut = new(40:end);
[~,locs] = findpeaks(new_cut);
if isempty(locs)==1
ind = max(new_cut);
else
ind = locs(1);
end
x1 = new(find(min_pts(i)));
y1 = min_pts(i);
x2 = ind;
y2 = new(find(ind));
slope(i) = (y2-y1)./(x2-x1);
else
slope(i) = 0;
end
end
bins = 100;
hist_val = (min_pts(:)*10^6);
if isempty(hist_val)==0
%Convert matrix of min points to array and into microvolts
subplot(3,2,3);
histogram(hist_val,bins);
ylabel('Count','FontSize',8);
title('Waveform Peaks','FontSize',8);
cnt = histcounts(hist_val,bins); %Returns bin counts
line_fit = zeros(1,length(cnt)); %Preallocates vector to hold line to fit histogram
for i = 3:length(line_fit)-3
if (cnt(i)<mean(cnt)) %If bin count is less than mean, take mean of 3
cnt(i)=mean([cnt(i-1) cnt(i+1)]); %consecutive bins, set as bin count
end
if (mean([cnt(i-2) cnt(i-1) cnt(i) cnt(i+1) cnt(i+2)])>=mean(cnt)) %If mean of 5 consecutive bins
line_fit(i-1) = (max([cnt(i-2) cnt(i-1) cnt(i) cnt(i+1) cnt(i+2)]));%exceeds bin count, set max,
end %add to line fit vector
end
line_fit(line_fit<=mean(cnt)) = min(cnt)+1; %Set line_fit values less than mean
x = linspace(min(hist_val),max(hist_val),length(line_fit)); %X axis (min - max point of vals)
hold on
plot(x,line_fit,'k','LineWidth',1.5);
assignin('base','hist_val',hist_val);
if (isempty(hist_val)==0)
gm = fitgmdist(hist_val,2,'RegularizationValue',0.1);
warning('off','stats:gmdistribution:FailedToConverge');
comp1 = gm.ComponentProportion(1)*100;
comp2 = gm.ComponentProportion(2)*100;
mean1 = gm.mu(1);
mean2 = gm.mu(2);
hist_leg = sprintf('\\muV \n Component 1 = %0.2f%% Component 2 = %0.2f%% \n Mean 1 = %0.2f Mean 2 = %0.2f',comp1,comp2,mean1,mean2);
xlabel(hist_leg,'FontSize',8);
end
hold off
else
subplot(3,2,3);
hist_val = 0;
plot(hist_val);
end
hist_val = (slope(:)*10^3);
if isempty(hist_val)==0
subplot(3,2,4);
histogram(hist_val,bins);
ylabel('Count');
cnt = histcounts(hist_val,bins); %Returns bin counts
line_fit = zeros(1,length(cnt)); %Preallocates vector to hold line to fit histogram
for i = 3:length(line_fit)-3
if (cnt(i)<mean(cnt)) %If bin count is less than mean, take mean of 3
cnt(i)=mean([cnt(i-1) cnt(i+1)]); %consecutive bins, set as bin count
end
if (mean([cnt(i-2) cnt(i-1) cnt(i) cnt(i+1) cnt(i+2)])>=mean(cnt)) %If mean of 5 consecutive bins
line_fit(i-1) = (max([cnt(i-2) cnt(i-1) cnt(i) cnt(i+1) cnt(i+2)])); %exceeds bin count, set max,
end %add to line fit vector
end
line_fit(line_fit<=mean(cnt)) = min(cnt)+1; %Set line_fit values less than mean
x = linspace(min(hist_val),max(hist_val),length(line_fit)); %X axis (min - max point of vals)
hold on
plot(x,line_fit,'k','LineWidth',1.5);
gm = fitgmdist(hist_val,2,'RegularizationValue',0.1);
warning('off','stats:gmdistribution:FailedToConverge');
comp1 = gm.ComponentProportion(1)*100;
comp2 = gm.ComponentProportion(2)*100;
mean1 = gm.mu(1);
mean2 = gm.mu(2);
title('Waveform Slope','FontSize',8);
hist_leg = sprintf('Slope (m) \n Component 1 = %0.2f%% Component 2 = %0.2f%% \n Mean 1 = %0.2f Mean 2 = %0.2f',comp1,comp2,mean1,mean2);
xlabel(hist_leg,'FontSize',8);
hold off
else
subplot(3,2,4);
hist_val = 0;
plot(hist_val);
end
rng(1);
wv_prop = [min_pts(:) slope(:)];
if (isempty(wv_prop)==0)
[idx,C] = kmeans(wv_prop,2);
subplot(3,2,5);
plot(wv_prop(idx==1,1),wv_prop(idx==1,2),'b.','MarkerSize',12);
hold on
plot(wv_prop(idx==2,1),wv_prop(idx==2,2),'r.','MarkerSize',12);
plot(C(:,1),C(:,2),'kx',...
'MarkerSize',15,'LineWidth',3)
title('Clustered Peak and Slope','FontSize',8);
fig_about = sprintf('BL%s - Cluster %s Block %s', animal,num2str(clust),num2str(block));
figtitle(fig_about);
else
subplot(3,2,5);
wv_prop = 0;
plot(wv_prop);
end
if isempty(wvfrms)==0
[vals] = align_wvs(wvfrms);
if (~isempty(vals))
subplot(3,2,6);
plot(t,vals);
title('Raw Waveforms','FontSize',8);
end
else
subplot(3,2,6);
w = 0;
plot(w);
end
print(fig{clust},['Cluster-' num2str(clust) ' Block-' num2str(block)],'-dpng');
end
end
disp('Done');
end
I am doing a project on plant disease detection. I need to extract the diseased parts from images of leaves, but I'm not able to separate the diseased regions accurately using k-means. Specifically, the rest of the leaf is still visible in the image that should contain only the segmented diseased parts. Here are the original image and the image after extracting the diseased parts: [original image] [image after separating diseased parts]
Here is the code I have written
b=imread('12.jpeg');
G=fspecial('gaussian',[200 250],1);
Ig=imfilter(b,G,'same');
figure,imshow(Ig);
conversionform = makecform('srgb2lab');
lab_img = applycform(Ig,conversionform);
figure,imshow(lab_img);
ab = double(lab_img(:,:,2:3));
nrows = size(ab,1);
ncols = size(ab,2);
ab = reshape(ab,nrows*ncols,2);
nColors = 2;
[cluster_idx, cluster_center] = kmeans(ab,nColors,'distance','sqEuclidean', ...,
'Replicates',3);
pixel_labels = reshape(cluster_idx,nrows,ncols);
figure, imshow(pixel_labels,[]), title('image labeled by cluster index');
segmented_images = cell(1,3);
rgb_label = repmat(pixel_labels,[1 1 3]);
for k = 1:nColors
color = lab_img;
color(rgb_label ~= k) = 0;
segmented_images{k} = color;
end
figure, imshow(segmented_images{1}), title('objects in cluster 1');
figure, imshow(segmented_images{2}), title('objects in cluster 2');
e=segmented_images{1};
figure,imshow(e);
conversionform = makecform('lab2srgb');
new_image=applycform(e,conversionform);
figure,imshow(new_image);
I want to extract only the diseased regions using K means clustering. I would be grateful if someone could help me with this. I am using matlab 2009a.
Here is corrected code that will do what you expect:
function segmented_img = leaf_segmentation( original_img, nclusters )
original_img = im2double(original_img);
smoothed_img = imgaussfilt(original_img,1);
conversionform = makecform('srgb2lab');
lab_img = applycform(smoothed_img,conversionform);
ab_img = lab_img(:,:,2:3);
[nrows,ncols,~] = size(ab_img);
ab_img = reshape(ab_img,nrows*ncols,2);
cluster_idx = kmeans(ab_img,nclusters,'distance','sqEuclidean','Replicates',3);
cluster_img = reshape(cluster_idx,nrows,ncols);
%figure, imagesc(cluster_img), title('Clustering results');
segmented_img = cell(1,nclusters);
for k = 1:nclusters
segmented_img{k} = bsxfun( @times, original_img, cluster_img == k );
end
end
You can call it and visualise the results like so:
segmented = leaf_segmentation( original, 3 );
figure;
subplot(1,3,1), imshow(segmented{1}), title('Cluster 1');
subplot(1,3,2), imshow(segmented{2}), title('Cluster 2');
subplot(1,3,3), imshow(segmented{3}), title('Cluster 3');
Note that the order of the clusters may vary. You can order them a posteriori knowing that the leaf should be mostly green/yellow, and that the background should be mostly black.
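One way to do that ordering is sketched here, under the assumption that you also request the cluster centres from kmeans (its second output) and that they are in (a*, b*) space, where a more negative a* means greener. The centre values and labels are made up for illustration, and new_label is a hypothetical name.

```matlab
% Hypothetical cluster centres in (a*, b*) space, one row per cluster
centers = [ 20  30;    % brownish (assumed diseased tissue)
           -40  35;    % green (assumed healthy leaf)
             0   0];   % neutral (assumed background)
idx = [1 2 2 3 1 3 2]';            % hypothetical labels as returned by kmeans

[~, order] = sort(centers(:,1));   % ascending a*: greenest cluster first
new_label = zeros(size(order));
new_label(order) = 1:numel(order); % new_label(k) is the new number of old cluster k
new_idx = new_label(idx);          % relabelled pixels
new_centers = centers(order,:);    % centres in the new order
```

After this step cluster 1 is always the greenest one, so the subplot titles stay meaningful from run to run.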
I'm doing image segmentation with a self-organizing map. The image should be segmented into 3 clusters. Sample image: [sample image]
I have written the MATLAB code below:
clear;
clc;
i=imread('DataSet/3.jpg');
I = imresize(i,0.5);
cform = makecform('srgb2lab');
lab_I = applycform(I,cform);
ab = double(lab_I(:,:,2:3));
nrows = size(ab,1);
ncols = size(ab,2);
ab = reshape(ab,nrows*ncols,2);
a = ab(:,1);
b = ab(:,2);
normA = (a-min(a(:))) ./ (max(a(:))-min(a(:)));
normB = (b-min(b(:))) ./ (max(b(:))-min(b(:)));
ab = [normA normB];
newnRows = size(ab,1);
newnCols = size(ab,2);
cluster = 3;
% Max number of iteration
N = 90;
% initial learning rate
eta = 0.3;
% exponential decay rate of the learning rate
etadecay = 0.2;
%random weight
w = rand(2,cluster);
%initial D
D = zeros(1,cluster);
% initial cluster index
clusterindex = zeros(newnRows,1);
% start
for t = 1:N
for data = 1 : newnRows
for c = 1 : cluster
D(c) = sqrt(((w(1,c)-ab(data,1))^2) + ((w(2,c)-ab(data,2))^2));
end
%find best matching unit
[~, bmuindex] = min(D);
clusterindex(data)=bmuindex;
%update weight
oldW = w(:,bmuindex);
new = oldW + eta * (reshape(ab(data,:),2,1)-oldW);
w(:,bmuindex) = new;
end
% update learning rate
eta= etadecay * eta;
end
%Label every pixel in the image using the results from the SOM
pixel_labels = reshape(clusterindex,nrows,ncols);
%Create Images that Segment the I Image by Color.
segmented_images = cell(1,3);
rgb_label = repmat(pixel_labels,[1 1 3]);
for k = 1:cluster
color = I;
color(rgb_label ~= k) = 0;
segmented_images{k} = color;
end
figure,imshow(segmented_images{1}), title('objects in cluster 1');
figure,imshow(segmented_images{2}), title('objects in cluster 2');
figure,imshow(segmented_images{3}), title('objects in cluster 3');
After running the code, there is no segmentation result. MATLAB shows 3 figures: figure 1 shows the full image, while figures 2 and 3 are blank.
Can anyone help me revise my code? Is something in it wrong?
new = oldW + eta * (reshape(ab(data,:),2,1)-oldW);
This line looks suspicious to me. Why are you subtracting the old weights here? I don't think that makes sense there; try removing oldW from that expression and check your results again.
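For comparison when testing that change: the textbook winner-take-all update is w ← w + η(x − w), which does keep the subtraction, because the difference x − w is what points the weight toward the sample. A minimal sketch on made-up 2-D values (w, x and the distance variables are illustrative, not taken from the question's data) lets you check the effect of either variant:

```matlab
w   = [0.2; 0.8];   % hypothetical weight vector of the winning unit
x   = [0.6; 0.4];   % hypothetical (normalised) input sample
eta = 0.3;          % learning rate, same value as in the question

w_new = w + eta * (x - w);      % move the winner a fraction eta toward x

dist_before = norm(x - w);
dist_after  = norm(x - w_new);  % equals (1 - eta) * dist_before
fprintf('distance before: %.4f, after: %.4f\n', dist_before, dist_after);
```

With this form the winner ends up closer to the sample for any eta between 0 and 1; dropping the subtraction gives w + eta*x instead, which grows the weight rather than pulling it toward x.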
In cwt() I can specify which wavelet function to use. How does that impact the speed of cwt()?
Here is a benchmark, which I run with the -singleCompThread option when starting MATLAB to force it to use a single computational thread. cwt() was passed a 1,000,000-sample signal and asked to compute scales 1 to 10. My CPU is an i7-3610QM.
Code used:
clear all
%% Benchmark parameters
results_file_name = 'results_scale1-10.csv';
number_of_random_runs = 10;
scales = 1:10;
number_of_random_samples = 1000000;
%% Construct a cell array containing all the wavelet names
wavelet_haar_names = {'haar'};
wavelet_db_names = {'db1'; 'db2'; 'db3'; 'db4'; 'db5'; 'db6'; 'db7'; 'db8'; 'db9'; 'db10'};
wavelet_sym_names = {'sym2'; 'sym3'; 'sym4'; 'sym5'; 'sym6'; 'sym7'; 'sym8'};
wavelet_coif_names = {'coif1'; 'coif2'; 'coif3'; 'coif4'; 'coif5'};
wavelet_bior_names = {'bior1.1'; 'bior1.3'; 'bior1.5'; 'bior2.2'; 'bior2.4'; 'bior2.6'; 'bior2.8'; 'bior3.1'; 'bior3.3'; 'bior3.5'; 'bior3.7'; 'bior3.9'; 'bior4.4'; 'bior5.5'; 'bior6.8'};
wavelet_rbior_names = {'rbio1.1'; 'rbio1.3'; 'rbio1.5'; 'rbio2.2'; 'rbio2.4'; 'rbio2.6'; 'rbio2.8'; 'rbio3.1'; 'rbio3.3'; 'rbio3.5'; 'rbio3.7'; 'rbio3.9'; 'rbio4.4'; 'rbio5.5'; 'rbio6.8'};
wavelet_meyer_names = {'meyr'};
wavelet_dmeyer_names = {'dmey'};
wavelet_gaus_names = {'gaus1'; 'gaus2'; 'gaus3'; 'gaus4'; 'gaus5'; 'gaus6'; 'gaus7'; 'gaus8'};
wavelet_mexh_names = {'mexh'};
wavelet_morl_names = {'morl'};
wavelet_cgau_names = {'cgau1'; 'cgau2'; 'cgau3'; 'cgau4'; 'cgau5'};
wavelet_shan_names = {'shan1-1.5'; 'shan1-1'; 'shan1-0.5'; 'shan1-0.1'; 'shan2-3'};
wavelet_fbsp_names = {'fbsp1-1-1.5'; 'fbsp1-1-1'; 'fbsp1-1-0.5'; 'fbsp2-1-1'; 'fbsp2-1-0.5'; 'fbsp2-1-0.1'};
wavelet_cmor_names = {'cmor1-1.5'; 'cmor1-1'; 'cmor1-0.5'; 'cmor1-1'; 'cmor1-0.5'; 'cmor1-0.1'};
% Concatenate all wavelet names into a single cell array
wavelet_categories_names = who('wavelet*names');
wavelet_names = {};
for wavelet_categories_number=1:size(wavelet_categories_names,1)
temp = wavelet_categories_names(wavelet_categories_number);
temp = eval(temp{1});
wavelet_names = vertcat(wavelet_names, temp);
end
%% Prepare data
random_signal = rand(number_of_random_runs,number_of_random_samples);
%% Run benchmarks
result_file_ID = fopen(results_file_name, 'w');
for wavelet_number = 1:size(wavelet_names,1)
wavelet_name = wavelet_names(wavelet_number,:)
% Compute wavelet on a random signal
tic
for run = 1:number_of_random_runs
cwt(random_signal(run, :),scales,wavelet_name{1});
end
run_time_random_test = toc
fprintf(result_file_ID, '%s,', wavelet_name{1})
fprintf(result_file_ID, '%d\n', run_time_random_test)
end
size(wavelet_names,1)
fclose(result_file_ID);
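The loop above times all runs of a wavelet together with a single tic/toc pair. A possible variation, sketched here, records each run separately so that a median can be reported, which is less sensitive to one slow run. The conv call is only a stand-in workload (an assumption so the sketch runs without the Wavelet Toolbox), and the signal is smaller than the one used in the benchmark.

```matlab
number_of_runs = 5;
signal = rand(1, 100000);           % smaller than the original 1,000,000 samples
kernel = ones(1, 32) / 32;          % stand-in workload, not a wavelet
workload = @(x) conv(x, kernel, 'same');   % replace with @(x) cwt(x, scales, wname)

run_times = zeros(number_of_runs, 1);
for run = 1:number_of_runs
    t_start = tic;                  % per-run timer handle
    workload(signal);
    run_times(run) = toc(t_start);  % elapsed seconds for this run only
end
fprintf('median %.4g s, min %.4g s, max %.4g s\n', ...
        median(run_times), min(run_times), max(run_times));
```

Passing the handle returned by tic to toc keeps the measurement independent of any other tic call made in between.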
If you want to see the impact of the choice of the scale:
Code used:
clear all
%% Benchmark parameters
results_file_name = 'results_sym2_change_scale.csv';
number_of_random_runs = 10;
scales = 1:10;
number_of_random_samples = 10000000;
% wavelet_names = {'sym2', 'sym3'}%, 'sym4'};
output_directory = 'output';
wavelet_names = get_all_wavelet_names();
%% Prepare data
random_signal = rand(number_of_random_runs,number_of_random_samples);
%% Prepare result folder
if ~exist(output_directory, 'dir')
mkdir(output_directory);
end
%% Run benchmarks
result_file_ID = fopen(results_file_name, 'w');
for wavelet_number = 1:size(wavelet_names,1)
wavelet_name = wavelet_names{wavelet_number}
if wavelet_number > 1
fprintf(result_file_ID, '%s\n', '');
end
fprintf(result_file_ID, '%s', wavelet_name)
run_time_random_test_scales = zeros(size(scales,2),1);
for scale_number = 1:size(scales,2)
scale = scales(scale_number);
% Compute wavelet on a random signal
tic
for run = 1:number_of_random_runs
cwt(random_signal(run, :),scale,wavelet_name);
end
run_time_random_test = toc
fprintf(result_file_ID, ',%d', run_time_random_test)
run_time_random_test_scales(scale_number) = run_time_random_test;
end
figure
bar(run_time_random_test_scales)
title(['Run time on random signal for ' wavelet_name])
xlabel('Scale')
ylabel('Run time (seconds)')
save_figure( fullfile(output_directory, ['run_time_random_test_' wavelet_name]) )
close all
end
size(wavelet_names,1)
fclose(result_file_ID);
With 3 functions:
get_all_wavelet_names.m:
function [ wavelet_names ] = get_all_wavelet_names( )
%GET_ALL_WAVELET_NAMES Get a list of available wavelet functions
%% Construct a cell array containing all the wavelet names
wavelet_haar_names = {'haar'};
wavelet_db_names = {'db1'; 'db2'; 'db3'; 'db4'; 'db5'; 'db6'; 'db7'; 'db8'; 'db9'; 'db10'};
wavelet_sym_names = {'sym2'; 'sym3'; 'sym4'; 'sym5'; 'sym6'; 'sym7'; 'sym8'};
wavelet_coif_names = {'coif1'; 'coif2'; 'coif3'; 'coif4'; 'coif5'};
wavelet_bior_names = {'bior1.1'; 'bior1.3'; 'bior1.5'; 'bior2.2'; 'bior2.4'; 'bior2.6'; 'bior2.8'; 'bior3.1'; 'bior3.3'; 'bior3.5'; 'bior3.7'; 'bior3.9'; 'bior4.4'; 'bior5.5'; 'bior6.8'};
wavelet_rbior_names = {'rbio1.1'; 'rbio1.3'; 'rbio1.5'; 'rbio2.2'; 'rbio2.4'; 'rbio2.6'; 'rbio2.8'; 'rbio3.1'; 'rbio3.3'; 'rbio3.5'; 'rbio3.7'; 'rbio3.9'; 'rbio4.4'; 'rbio5.5'; 'rbio6.8'};
wavelet_meyer_names = {'meyr'};
wavelet_dmeyer_names = {'dmey'};
wavelet_gaus_names = {'gaus1'; 'gaus2'; 'gaus3'; 'gaus4'; 'gaus5'; 'gaus6'; 'gaus7'; 'gaus8'};
wavelet_mexh_names = {'mexh'};
wavelet_morl_names = {'morl'};
wavelet_cgau_names = {'cgau1'; 'cgau2'; 'cgau3'; 'cgau4'; 'cgau5'};
wavelet_shan_names = {'shan1-1.5'; 'shan1-1'; 'shan1-0.5'; 'shan1-0.1'; 'shan2-3'};
wavelet_fbsp_names = {'fbsp1-1-1.5'; 'fbsp1-1-1'; 'fbsp1-1-0.5'; 'fbsp2-1-1'; 'fbsp2-1-0.5'; 'fbsp2-1-0.1'};
wavelet_cmor_names = {'cmor1-1.5'; 'cmor1-1'; 'cmor1-0.5'; 'cmor1-1'; 'cmor1-0.5'; 'cmor1-0.1'};
% Concatenate all wavelet names into a single cell array
wavelet_categories_names = who('wavelet*names');
wavelet_names = {};
for wavelet_categories_number=1:size(wavelet_categories_names,1)
temp = wavelet_categories_names(wavelet_categories_number);
temp = eval(temp{1});
wavelet_names = vertcat(wavelet_names, temp);
end
end
save_figure.m:
function [ ] = save_figure( output_graph_filename )
% Record a figure as PNG, PDF and fig files
% Create the folder if it doesn't exist already.
[pathstr, name, ext] = fileparts(output_graph_filename);
if ~exist(pathstr, 'dir')
mkdir(pathstr);
end
h = gcf;
set(0,'defaultAxesFontSize',18) % http://www.mathworks.com/support/solutions/en/data/1-8XOW94/index.html?solution=1-8XOW94
boldify(h);
print('-dpng','-r600', [output_graph_filename '.png']);
print(h,[output_graph_filename '.pdf'],'-dpdf','-r600')
saveas(gcf,[output_graph_filename '.fig'], 'fig')
end
and boldify.m:
function boldify(h,g)
%BOLDIFY Make lines and text bold for standard viewgraph style.
% BOLDIFY boldifies the lines and text of the current figure.
% BOLDIFY(H) applies to the graphics handle H.
%
% BOLDIFY(X,Y) specifies an X by Y inch graph of the current
% figure. If text labels have their 'UserData' data property
% set to 'slope = ...', then the 'Rotation' property is set to
% account for changes in the graph's aspect ratio. The
% default is MATLAB's default.
% S. T. Smith
% The name of this function does not represent an endorsement by the author
% of the egregious grammatical trend of verbing nouns.
if nargin < 1, h = gcf;, end
% Set (and get) the default MATLAB paper size and position
set(gcf,'PaperPosition','default');
units = get(gcf,'PaperUnits');
set(gcf,'PaperUnits','inches');
fsize = get(gcf,'PaperPosition');
fsize = fsize(3:4); % Figure size (X" x Y") on paper.
psize = get(gcf,'PaperSize');
if nargin == 2 % User specified graph size
fsize = [h,g];
h = gcf;
end
% Set the paper position of the current figure
set(gcf,'PaperPosition', ...
[(psize(1)-fsize(1))/2 (psize(2)-fsize(2))/2 fsize(1) fsize(2)]);
fsize = get(gcf,'PaperPosition');
fsize = fsize(3:4); % Graph size (X" x Y") on paper.
set(gcf,'PaperUnits',units); % Back to original
% Get the normalized axis position of the current axes
units = get(gca,'Units');
set(gca,'Units','normalized');
asize = get(gca,'Position');
asize = asize(3:4);
set(gca,'Units',units);
ha = get(h,'Children');
for i=1:length(ha)
% if get(ha(i),'Type') == 'axes'
% changed by B. A. Miller
if strcmp(get(ha(i), 'Type'), 'axes') == 1
units = get(ha(i),'Units');
set(ha(i),'Units','normalized');
asize = get(ha(i),'Position'); % Axes Position (normalized)
asize = asize(3:4);
set(ha(i),'Units',units);
[m,j] = max(asize); j = j(1);
scale = 1/(asize(j)*fsize(j)); % scale*inches -> normalized units
set(ha(i),'FontWeight','Bold');
set(ha(i),'LineWidth',2);
[m,k] = min(asize); k = k(1);
if asize(k)*fsize(k) > 1/2
set(ha(i),'TickLength',[1/8 1.5*1/8]*scale); % Gives 1/8" ticks
else
set(ha(i),'TickLength',[3/32 1.5*3/32]*scale); % Gives 3/32" ticks
end
set(get(ha(i),'XLabel'),'FontSize',18); % 14-pt labels
set(get(ha(i),'XLabel'),'FontWeight','Bold');
set(get(ha(i),'XLabel'),'VerticalAlignment','top');
set(get(ha(i),'YLabel'),'FontSize',18); % 14-pt labels
set(get(ha(i),'YLabel'),'FontWeight','Bold');
%set(get(ha(i),'YLabel'),'VerticalAlignment','baseline');
set(get(ha(i),'Title'),'FontSize',18); % 16-pt titles
set(get(ha(i),'Title'),'FontWeight','Bold');
% set(get(ha(i), 'FontSize',20, 'XTick',[]));
end
hc = get(ha(i),'Children');
for j=1:length(hc)
chtype = get(hc(j),'Type');
if strncmp(chtype,'text',4)
set(hc(j),'FontSize',17); % 17-pt descriptive labels
set(hc(j),'FontWeight','Bold');
ud = get(hc(j),'UserData'); % User data
if length(ud) > 8
if strncmp(ud,'slope = ',8) % Account for change in actual slope
slope = sscanf(ud,'slope = %g');
slope = slope*(fsize(2)/fsize(1))/(asize(2)/asize(1));
set(hc(j),'Rotation',atan(slope)/pi*180);
end
end
elseif strncmp(chtype,'line',4)
set(hc(j),'LineWidth',2);
end
end
end
Bonus: correlation between all wavelets, computed on a random signal of 1,000,000 samples at each of the first 10 scales:
Code used:
%% PRE-REQUISITE: You need to download http://www.mathworks.com/matlabcentral/fileexchange/24253-customizable-heat-maps , which gives the function heatmap()
%% Benchmark parameters
scales = 1:10;
number_of_random_samples = 1000000;
% wavelet_names = {'sym2'; 'sym3'; 'sym4'; 'sym5'; 'sym6'; 'sym7'; 'sym8'};
% wavelet_names = {'cgau1'; 'cgau2'; 'cgau3'; 'cgau4'; 'cgau5'};
wavelet_names = {'db2'; 'sym2'};
OUTPUT_FOLDER = 'output_corr';
% wavelet_names = get_all_wavelet_names(); % WARNING: remove all complex wavelets first (viz. cgau1, shan, fbsp and cmor); otherwise heatmap() will choke on the complex values.
%% Prepare data
random_signal = rand(1,number_of_random_samples);
results = zeros(size(wavelet_names,1), number_of_random_samples);
%% Prepare result folder
if ~exist(OUTPUT_FOLDER, 'dir')
mkdir(OUTPUT_FOLDER);
end
%% Run benchmarks
for scale_number = 1:size(scales,2)
scale = scales(scale_number);
for wavelet_number = 1:size(wavelet_names,1)
wavelet_name = wavelet_names{wavelet_number}
% Compute wavelet on a random signal
run = 1;
results(wavelet_number, :) = cwt(random_signal(run, :),scale,wavelet_name);
if wavelet_number == 999
break
end
end
correlation_results = corrcoef(results')
heatmap(correlation_results, [], [], '%0.2f', 'MinColorValue', -1.0, 'MaxColorValue', 1.0, 'Colormap', 'jet',...
'Colorbar', true, 'ColorLevels', 64, 'UseFigureColormap', false);
title(['Correlation matrix for scale ' num2str(scale)]);
xlabel(['Wavelet 1 to ' num2str(size(wavelet_names,1)) ' for scale ' num2str(scale)]);
ylabel(['Wavelet 1 to ' num2str(size(wavelet_names,1)) ' for scale ' num2str(scale)]);
snapnow
print('-dpng','-r600',fullfile(OUTPUT_FOLDER, ['scalecorr' num2str(scale) '.png']))
end
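A note on the transpose in corrcoef(results'): corrcoef treats each column of its input as one variable and each row as one observation, while results stores one wavelet per row. A minimal sketch with made-up data (the three rows below are stand-ins for wavelet outputs, not real cwt results) shows the shape and values to expect:

```matlab
% Three fake "wavelet outputs", one per row, 1000 samples each.
rng(0);                                   % fixed seed, for repeatability only
base = randn(1, 1000);
results = [base; 2*base + 1; randn(1, 1000)];

% Transpose so that each wavelet becomes a column (a variable).
R = corrcoef(results');                   % 3 x 3 correlation matrix

disp(size(R));                            % one row/column per wavelet
disp(R(1,2));                             % rows 1 and 2 are linearly related
```

Without the transpose, corrcoef would instead correlate the 1000 sample positions against each other and return a 1000 x 1000 matrix, which is why the full benchmark passes results' rather than results.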
Correlation for each wavelet between different scales (1 to 100):
Code used:
%% PRE-REQUISITE: You need to download http://www.mathworks.com/matlabcentral/fileexchange/24253-customizable-heat-maps , which gives the function heatmap()
%% Benchmark parameters
scales = 1:100;
number_of_random_samples = 1000000;
% wavelet_name = 'gaus2';
% wavelet_names = {'sym2', 'sym3'}%, 'sym4'};
OUTPUT_FOLDER = 'output_corr';
wavelet_names = get_all_wavelet_names(); % WARNING: remove all complex wavelets first (viz. cgau1, shan, fbsp and cmor); otherwise heatmap() will choke on the complex values.
%% Prepare data
random_signal = rand(1,number_of_random_samples);
results = zeros(size(scales,2), number_of_random_samples);
%% Prepare result folder
if ~exist(OUTPUT_FOLDER, 'dir')
mkdir(OUTPUT_FOLDER);
end
%% Run benchmarks
for wavelet_number = 1:size(wavelet_names,1)
wavelet_name = wavelet_names{wavelet_number}
run_time_random_test_scales = zeros(size(scales,2),1);
for scale_number = 1:size(scales,2)
scale = scales(scale_number);
run = 1;
% Compute wavelet on a random signal
results(scale_number, :) = cwt(random_signal(run, :),scale,wavelet_name);
end
correlation_results = corrcoef(results')
heatmap(correlation_results, [], [], '%0.2f', 'MinColorValue', -1.0, 'MaxColorValue', 1.0, 'Colormap', 'jet',...
'Colorbar', true, 'ColorLevels', 64, 'UseFigureColormap', false);
title(['Correlation matrix for wavelet ' wavelet_name]);
xlabel(['Scales 1 to ' num2str(max(scales)) ' for wavelet ' wavelet_name]);
ylabel(['Scales 1 to ' num2str(max(scales)) ' for wavelet ' wavelet_name]);
snapnow
print('-dpng','-r600',fullfile(OUTPUT_FOLDER, [wavelet_name '_scalecorr_scale1to' num2str(max(scales)) '.png']))
end
I have a numeric dataset that I want to cluster with a non-parametric algorithm, i.e. without specifying the number of clusters as an input. I am using the code below, which I found on the MathWorks File Exchange and which implements the Mean Shift algorithm. However, I don't know how to adapt my data to this code, as my dataset has dimensions 516 x 19.
function [clustCent,data2cluster,cluster2dataCell] =MeanShiftCluster(dataPts,bandWidth,plotFlag)
%perform MeanShift Clustering of data using a flat kernel
%
% ---INPUT---
% dataPts - input data, (numDim x numPts)
% bandWidth - is bandwidth parameter (scalar)
% plotFlag - display output if 2 or 3 D (logical)
% ---OUTPUT---
% clustCent - is locations of cluster centers (numDim x numClust)
% data2cluster - for every data point which cluster it belongs to (numPts)
% cluster2dataCell - for every cluster which points are in it (numClust)
%
% Bryan Feldman 02/24/06
% MeanShift first appears in
% K. Fukunaga and L.D. Hostetler, "The Estimation of the Gradient of a
% Density Function, with Applications in Pattern Recognition"
%*** Check input ****
if nargin < 2
error('no bandwidth specified')
end
if nargin < 3
plotFlag = false; % plotting is off unless explicitly requested
end
%**** Initialize stuff ***
%[numPts,numDim] = size(dataPts);
[numDim,numPts] = size(dataPts);
numClust = 0;
bandSq = bandWidth^2;
initPtInds = 1:numPts;
maxPos = max(dataPts,[],2); %biggest size in each dimension
minPos = min(dataPts,[],2); %smallest size in each dimension
boundBox = maxPos-minPos; %bounding box size
sizeSpace = norm(boundBox); %indicator of size of data space
stopThresh = 1e-3*bandWidth; %when mean has converged
clustCent = []; %center of clust
beenVisitedFlag = zeros(1,numPts,'uint8'); %track if a point has been seen already
numInitPts = numPts; %number of points to possibly use as initialization points
clusterVotes = zeros(1,numPts,'uint16'); %used to resolve conflicts on cluster membership
while numInitPts
tempInd = ceil( (numInitPts-1e-6)*rand); %pick a random seed point
stInd = initPtInds(tempInd); %use this point as start of mean
myMean = dataPts(:,stInd); % initialize mean to this point's location
myMembers = []; % points that will get added to this cluster
thisClusterVotes = zeros(1,numPts,'uint16'); %used to resolve conflicts on cluster membership
while 1 %loop until convergence
sqDistToAll = sum((repmat(myMean,1,numPts) - dataPts).^2); %dist squared from mean to all points still active
inInds = find(sqDistToAll < bandSq); %points within bandWidth
thisClusterVotes(inInds) = thisClusterVotes(inInds)+1; %add a vote for all the in points belonging to this cluster
myOldMean = myMean; %save the old mean
myMean = mean(dataPts(:,inInds),2); %compute the new mean
myMembers = [myMembers inInds]; %add any point within bandWidth to the cluster
beenVisitedFlag(myMembers) = 1; %mark that these points have been visited
%*** plot stuff ****
if plotFlag
figure(12345),clf,hold on
if numDim == 2
plot(dataPts(1,:),dataPts(2,:),'.')
plot(dataPts(1,myMembers),dataPts(2,myMembers),'ys')
plot(myMean(1),myMean(2),'go')
plot(myOldMean(1),myOldMean(2),'rd')
pause
end
end
%**** if mean doesn't move much stop this cluster ***
if norm(myMean-myOldMean) < stopThresh
%check for merge possibilities
mergeWith = 0;
for cN = 1:numClust
distToOther = norm(myMean-clustCent(:,cN)); %distance from possible new clust max to old clust max
if distToOther < bandWidth/2 %if it's within bandwidth/2 merge new and old
mergeWith = cN;
break;
end
end
if mergeWith > 0 % something to merge
clustCent(:,mergeWith) = 0.5*(myMean+clustCent(:,mergeWith)); %record the max as the mean of the two merged (I know, biased towards new ones)
%clustMembsCell{mergeWith} = unique([clustMembsCell{mergeWith} myMembers]); %record which points inside
clusterVotes(mergeWith,:) = clusterVotes(mergeWith,:) + thisClusterVotes; %add these votes to the merged cluster
else %it's a new cluster
numClust = numClust+1; %increment clusters
clustCent(:,numClust) = myMean; %record the mean
%clustMembsCell{numClust} = myMembers; %store my members
clusterVotes(numClust,:) = thisClusterVotes;
end
break;
end
end
initPtInds = find(beenVisitedFlag == 0); %we can initialize with any of the points not yet visited
numInitPts = length(initPtInds); %number of active points in set
end
[val,data2cluster] = max(clusterVotes,[],1); %a point belongs to the cluster with the most votes
%*** If they want the cluster2data cell find it for them
if nargout > 2
cluster2dataCell = cell(numClust,1);
for cN = 1:numClust
myMembers = find(data2cluster == cN);
cluster2dataCell{cN} = myMembers;
end
end
This is the test code I am using to try and get the Mean Shift program to work:
clear
profile on
nPtsPerClust = 250;
nClust = 3;
totalNumPts = nPtsPerClust*nClust;
m(:,1) = [1 1];
m(:,2) = [-1 -1];
m(:,3) = [1 -1];
var = .6;
bandwidth = .75;
clustMed = [];
%clustCent;
x = var*randn(2,nPtsPerClust*nClust);
%*** build the point set
for i = 1:nClust
x(:,1+(i-1)*nPtsPerClust:(i)*nPtsPerClust) = x(:,1+(i-1)*nPtsPerClust:(i)*nPtsPerClust) + repmat(m(:,i),1,nPtsPerClust);
end
tic
[clustCent,point2cluster,clustMembsCell] = MeanShiftCluster(x,bandwidth);
toc
numClust = length(clustMembsCell)
figure(10),clf,hold on
cVec = 'bgrcmykbgrcmykbgrcmykbgrcmyk';%, cVec = [cVec cVec];
for k = 1:min(numClust,length(cVec))
myMembers = clustMembsCell{k};
myClustCen = clustCent(:,k);
plot(x(1,myMembers),x(2,myMembers),[cVec(k) '.'])
plot(myClustCen(1),myClustCen(2),'o','MarkerEdgeColor','k','MarkerFaceColor',cVec(k), 'MarkerSize',10)
end
title(['no shifting, numClust:' int2str(numClust)])
The test script generates random data x. In my case, I want to use my matrix D of size 516 x 19, but I am not sure how to adapt my data to this function. The function returns results that do not agree with my understanding of the algorithm.
Does anyone know how to do this?
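One thing worth checking, going by the function's own header comment, is orientation: dataPts is documented as numDim x numPts, whereas a 516 x 19 matrix is usually laid out as numPts x numDim. Below is a minimal sketch of the preparation step, assuming the rows of D are the 516 observations and the columns are the 19 features. D here is a random stand-in, and the z-scoring and the bandwidth heuristic are illustrative assumptions, not part of the original File Exchange code:

```matlab
% Stand-in for the real data: 516 observations (rows) x 19 features (columns).
D = randn(516, 19);

% Put every feature on a comparable scale so that one scalar bandwidth is
% meaningful in all 19 dimensions (assumption: the features differ in scale).
mu = mean(D, 1);
sigma = std(D, 0, 1);
sigma(sigma == 0) = 1;                    % guard against constant columns
Dz = bsxfun(@rdivide, bsxfun(@minus, D, mu), sigma);

% MeanShiftCluster expects numDim x numPts, so transpose to 19 x 516.
dataPts = Dz';

% Hypothetical starting bandwidth: half the median distance from the data
% centroid to the points. Only a rough first value; it needs tuning.
centroid = mean(dataPts, 2);
distToCentroid = sqrt(sum(bsxfun(@minus, dataPts, centroid).^2, 1));
bandwidth = 0.5 * median(distToCentroid);

% Run the clustering only if the File Exchange function is on the path.
if exist('MeanShiftCluster', 'file') == 2
    [clustCent, point2cluster, clustMembsCell] = ...
        MeanShiftCluster(dataPts, bandwidth, false);
    fprintf('Found %d clusters\n', size(clustCent, 2));
end
```

If D were passed untransposed, the function would read it as 19 points in 516 dimensions, which may be one reason the output looks wrong. With the transposed input, point2cluster has one entry per row of D. The bandwidth matters a lot with a flat kernel: a value tuned for the 2-D test script (0.75) is likely too small for 19 standardized dimensions and could give nearly one cluster per point.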