I've just started to learn to code and had a go at writing the standard k-means algorithm. I tried my implementation on a data set generated with three different Gaussians and it seems to work well. However I tried it on the iris data set and every now and then (around third of the time) my function returns only two sets, in other words it returns only two clusters.
I had a peek at the code for the local MATLAB kmeans function but I ended up getting lost due to my lack of coding knowledge. I would really appreciate any help!
function [R,C,P,it] = mykmeans(X,K)
% X -- data matrix
% K -- number of clusters
% C -- partition sets
% P -- matrix of prototypes
% R -- binary indicator matrix: R(i,j) specifies whether the ith data is
% classified into jth cluster
% it -- number of iterations until convergence
% N points with M dimensions
[N,M] = size(X) ;
%% Initialisation
% At this step we randomly partition the data matrix into K equally sized
% matrices and compute the centre of each of these matrices.
% I -- randomised index vector
% v -- number of data points assigned to each cluster
% U -- randomly partitioned matrices
v = N/K ;
C = cell(K,1) ;
U = cell(K,1) ;
I = randperm(N) ;
oldR = zeros(N,K) ;
% C{1} = X(I(1:v),:) ;
% U{1} = mean(X(I(1:v),:)) ;
for k=1:K
C{k} = X(I(1+v*(k-1):k*v),:) ;
U{k} = mean(C{k}) ;
end
P = cell2mat(U) ;
converged = 0 ;
it = 0 ;
while converged ~= 1
%% Assignment step
% Each element of D{n} contains squared euclidean distance of nth data
% point from the kth prototype
D = cell(N,1) ;
R = zeros(N,K) ;
for n=1:N
D{n} = sum((repmat(X(n,:),K,1) - P).^2,2) ;
[~,k] = min(D{n}) ;
R(n,k) = 1 ;
end
%% Update step
C = cell(K,1) ; % reset C
for k=1:K
for n=1:N
P(k,:) = R(n,k)*X(n,:) + P(k,:) ; % compute numerator of mean vector
if R(n,k) == 1
C{k} = [C{k};X(n,:)] ;
end
end
end
P = P ./ (sum(R)') ; % divide by denominator of mean vectors to get prototypes
%% Check for convergence
if sum(sum(R == oldR))==N*K || it == 100 % convergence criteria
converged = 1 ;
else
oldR = R ;
it = it+1 ;
end
end %while
The problem indeed seems not a coding problem but an issue of understanding k-means.
In fact during k-means, clusters can become empty. You will need to account in your code for this, as otherwise the number of clusters in your result can be smaller than k.
A possible solutions could be:
assigning a random data point as the new cluster center for the empty cluster
choosing the point most far from the biggest cluster as the new cluster center for the empty cluster
The general approach therefore goes such as:
Initialize k cluster centers (for example: random)
Assign all data points to the closest cluster center
Recompute the cluster centers based on the assignment
Check for empty clusters
Repeat steps 2 - 4 until convergence (== cluster centers did not change in the last iteration)
A great illustration of the issue of empty clusters can be found here.
Related
I'm new to this concept of parallel pooling on MATLAB (I'm using the version 2019 a) and coding. This code that I'm going to share with you was available on the net, with some few modifications that I've made it for my requirements.
Problem Statement: I'm having a non-linear system (Rossler equation) & I have to plot its Bifurcation diagram, I tried to do it normally using for loop but its computation time was too much and my computer got hanged several times, so I got an advice to parallel pool my code in order to come out of this problem. I tried to learn how to parallel pool using MATLAB on the net but still I'm not able to resolve my Issues as there are still some problems since there are 2 parfor loops in my code I'm having problems with Indexing and in assignment of the global parameter (Please note: This code is written for normal execution without using parallel pooling).
I'm attaching my code below here, please excuse if I've mentioned a lot many lines of codes.
clc;
a = 0.2; b = 0.2; global c;
crange = 1:0.05:90; % Range for parameter c
k = 0; tspan = 0:0.1:500; % Time interval for solving Rossler system
xmax = []; % A matrix for storing the sorted value of x1
for c = crange
f = #(t,x) [-x(2)-x(3); x(1)+a*x(2); b+x(3)*(x(1)-c)];
x0 = [1 1 0]; % initial condition for Rossler system
k = k + 1;
[t,x] = ode45(f,tspan,x0); % call ode() to solve Rossler system
count = find(t>100); % find all the t values which is >10
x = x(count,:);
j = 1;
n = length(x(:,1)); % find the length of vector x1(x in our problem)
for i=2 : n-1
% check for the min value in 1st column of sol matrix
if (x(i-1,1)+eps) < x(i,1) && x(i,1) > (x(i+1,1)+eps)
xmax(k,j)=x(i,1); % Sorting the values of x1 in increasing order
j=j+1;
end
end
% generating bifurcation map by plotting j-1 element of kth row each time
if j>1
plot(c,xmax(k,1:j-1),'k.','MarkerSize',1);
end
hold on;
index(k)=j-1;
end
xlabel('Bifuracation parameter c');
ylabel('x max');
title('Bifurcation diagram for c');
This can be made compatible with parfor by taking a few relatively simple steps. Firstly, parfor workers cannot produce on-screen graphics, so we need to change things to emit a result. In your case, this is not totally trivial since your primary result xmax is being assigned-to in a not-completely-uniform manner - you're assigning different numbers of elements on different loop iterations. Not only that, it appears not to be possible to predict up-front how many columns xmax needs.
Secondly, you need to make some minor changes to the loop iteration to be compatible with parfor, which requires consecutive integer loop iterates.
So, the major change is to have the loop write individual rows of results to a cell array I've called xmax_cell. Outside the parfor loop, it's trivial to convert this back to matrix form.
Putting all this together, we end up with this, which works correctly in R2019b as far as I can tell:
clc;
a = 0.2; b = 0.2;
crange = 1:0.05:90; % Range for parameter c
tspan = 0:0.1:500; % Time interval for solving Rossler system
% PARFOR loop outputs: a cell array of result rows ...
xmax_cell = cell(numel(crange), 1);
% ... and a track of the largest result row
maxNumCols = 0;
parfor k = 1:numel(crange)
c = crange(k);
f = #(t,x) [-x(2)-x(3); x(1)+a*x(2); b+x(3)*(x(1)-c)];
x0 = [1 1 0]; % initial condition for Rossler system
[t,x] = ode45(f,tspan,x0); % call ode() to solve Rossler system
count = find(t>100); % find all the t values which is >10
x = x(count,:);
j = 1;
n = length(x(:,1)); % find the length of vector x1(x in our problem)
this_xmax = [];
for i=2 : n-1
% check for the min value in 1st column of sol matrix
if (x(i-1,1)+eps) < x(i,1) && x(i,1) > (x(i+1,1)+eps)
this_xmax(j) = x(i,1);
j=j+1;
end
end
% Keep track of what's the maximum number of columns
maxNumCols = max(maxNumCols, numel(this_xmax));
% Store this row into the output cell array.
xmax_cell{k} = this_xmax;
end
% Fix up xmax - push each row into the resulting matrix.
xmax = NaN(numel(crange), maxNumCols);
for idx = 1:numel(crange)
this_max = xmax_cell{idx};
xmax(idx, 1:numel(this_max)) = this_max;
end
% Plot
plot(crange, xmax', 'k.', 'MarkerSize', 1)
xlabel('Bifuracation parameter c');
ylabel('x max');
title('Bifurcation diagram for c');
I found matlab file (https://in.mathworks.com/matlabcentral/fileexchange/9700-random-vectors-with-fixed-sum) which generates probability matrix with uniform probability and each column has sum 1. file is as follow
function [x,v] = randfixedsum(n,m,s,a,b)
%[x,v] = randfixedsum(n,m,s,a,b)
%
% This generates an n by m array x, each of whose m columns
% contains n random values lying in the interval [a,b], but
% subject to the condition that their sum be equal to s. The
% scalar value s must accordingly satisfy n*a <= s <= n*b. The
% distribution of values is uniform in the sense that it has the
% conditional probability distribution of a uniform distribution
% over the whole n-cube, given that the sum of the x's is s.
%
% The scalar v, if requested, returns with the total
% n-1 dimensional volume (content) of the subset satisfying
% this condition. Consequently if v, considered as a function
% of s and divided by sqrt(n), is integrated with respect to s
% from s = a to s = b, the result would necessarily be the
% n-dimensional volume of the whole cube, namely (b-a)^n.
%
% This algorithm does no "rejecting" on the sets of x's it
% obtains. It is designed to generate only those that satisfy all
% the above conditions and to do so with a uniform distribution.
% It accomplishes this by decomposing the space of all possible x
% sets (columns) into n-1 dimensional simplexes. (Line segments,
% triangles, and tetrahedra, are one-, two-, and three-dimensional
% examples of simplexes, respectively.) It makes use of three
% different sets of 'rand' variables, one to locate values
% uniformly within each type of simplex, another to randomly
% select representatives of each different type of simplex in
% proportion to their volume, and a third to perform random
% permutations to provide an even distribution of simplex choices
% among like types. For example, with n equal to 3 and s set at,
% say, 40% of the way from a towards b, there will be 2 different
% types of simplex, in this case triangles, each with its own
% area, and 6 different versions of each from permutations, for
% a total of 12 triangles, and these all fit together to form a
% particular planar non-regular hexagon in 3 dimensions, with v
% returned set equal to the hexagon's area.
%
% Roger Stafford - Jan. 19, 2006
% Check the arguments.
if (m~=round(m))|(n~=round(n))|(m<0)|(n<1)
error('n must be a whole number and m a non-negative integer.')
elseif (s<n*a)|(s>n*b)|(a>=b)
error('Inequalities n*a <= s <= n*b and a < b must hold.')
end
% Rescale to a unit cube: 0 <= x(i) <= 1
s = (s-n*a)/(b-a);
% Construct the transition probability table, t.
% t(i,j) will be utilized only in the region where j <= i + 1.
k = max(min(floor(s),n-1),0); % Must have 0 <= k <= n-1
s = max(min(s,k+1),k); % Must have k <= s <= k+1
s1 = s - [k:-1:k-n+1]; % s1 & s2 will never be negative
s2 = [k+n:-1:k+1] - s;
w = zeros(n,n+1); w(1,2) = realmax; % Scale for full 'double' range
t = zeros(n-1,n);
tiny = 2^(-1074); % The smallest positive matlab 'double' no.
for i = 2:n
tmp1 = w(i-1,2:i+1).*s1(1:i)/i;
tmp2 = w(i-1,1:i).*s2(n-i+1:n)/i;
w(i,2:i+1) = tmp1 + tmp2;
tmp3 = w(i,2:i+1) + tiny; % In case tmp1 & tmp2 are both 0,
tmp4 = (s2(n-i+1:n) > s1(1:i)); % then t is 0 on left & 1 on right
t(i-1,1:i) = (tmp2./tmp3).*tmp4 + (1-tmp1./tmp3).*(~tmp4);
end
% Derive the polytope volume v from the appropriate
% element in the bottom row of w.
v = n^(3/2)*(w(n,k+2)/realmax)*(b-a)^(n-1);
% Now compute the matrix x.
x = zeros(n,m);
if m == 0, return, end % If m is zero, quit with x = []
rt = rand(n-1,m); % For random selection of simplex type
rs = rand(n-1,m); % For random location within a simplex
s = repmat(s,1,m);
j = repmat(k+1,1,m); % For indexing in the t table
sm = zeros(1,m); pr = ones(1,m); % Start with sum zero & product 1
for i = n-1:-1:1 % Work backwards in the t table
e = (rt(n-i,:)<=t(i,j)); % Use rt to choose a transition
sx = rs(n-i,:).^(1/i); % Use rs to compute next simplex coord.
sm = sm + (1-sx).*pr.*s/(i+1); % Update sum
pr = sx.*pr; % Update product
x(n-i,:) = sm + pr.*e; % Calculate x using simplex coords.
s = s - e; j = j - e; % Transition adjustment
end
x(n,:) = sm + pr.*s; % Compute the last x
% Randomly permute the order in the columns of x and rescale.
rp = rand(n,m); % Use rp to carry out a matrix 'randperm'
[ig,p] = sort(rp); % The values placed in ig are ignored
x = (b-a)*x(p+repmat([0:n:n*(m-1)],n,1))+a; % Permute & rescale x
return
but I want to generate matrix which has each row elements sum 1 and each row have uniform probability in matlab. how to do this with above programme. to run above programme I am calling it to another file and setting parameters
i.e.
m=4;n=4; a=0; b=1.5;s=1;
[x,v] = randfixedsum(n,m,s,a,b)
create a random matrix and divide each row by sum of elements of that row:
function result = randrowsum(m ,n)
rnd = rand(m,n);
rowsums = sum(rnd,2);
result = bsxfun(#rdivide, rnd, rowsums);
end
to create an m * n random matrix :
a=randrowsum(3,4)
check if sum of each row is 1:
sum(a,2)
I would say the easiest was is to generate the array with the given function.
[x,v] = randfixedsum(n,m,s,a,b);
Then just transport the results.
x = x';
I have tried hours but I cannot find solution.
I have "two Donuts" Data sample (variable "X")
you can download file below link
donut dataset(rings.mat)
which spreads to 2D shape like below image
First 250pts are located inside donuts and last 750 pts are located outside donuts.
and I need to perform spectral clustering.
I made (similarity matrix "W") with Gaussian similarity distance.
and I made degree matrix by sum of each raw of "W"
and then I computed eigen value(E) and eigen Vector(V)
and the shape of "V" is not good.
what is wrong with my trial???
I cannot figure out.
load rings.mat
[D, N] = size(X); % data stored in X
%initial plot data
figure; hold on;
for i=1:N,
plot(X(1,i), X(2,i),'o');
end
% perform spectral clustering
W = zeros(N,N);
D = zeros(N,N);
sigma = 1;
for i=1:N,
for j=1:N,
xixj2 = (X(1,i)-X(1,j))^2 + (X(2,i)-X(2,j))^2 ;
W(i,j) = exp( -1*xixj2 / (2*sigma^2) ) ; % compute weight here
% if (i==j)
% W(i,j)=0;
% end;
end;
D(i,i) = sum(W(i,:)) ;
end;
L = D - W ;
normL = D^-0.5*L*D^-0.5;
[u,s,v] = svd(normL);
If you use the Laplacian like it is in your code (the "real" laplacian), then to cluster your points into two sets you will want the eigenvector corresponding to second smallest eigenvalue.
The intuitive idea is to connect all of your points to each other with springs, where the springs are stiffer if the points are near each other, and less stiff for points far away. The eigenvectors of the Laplacian are the modes of vibration if you hit your spring network with a hammer and watch it oscillate - smaller eigenvalues corresponding to lower frequency "bulk" modes, and larger eigenvalues corresponding to higher frequency oscillations. You want the eigenvalue corresponding to the second smallest eigenvalue, which will be like the second mode in a drum, with a positive clustered together, and negative part clustered together.
Now there is some confusion in the comments about whether to use the largest or smallest eigenvalue, and it is because the laplacian in the paper linked there by dave is slightly different, being the identity minus your laplacian. So there they want the largest ones, whereas you want the smallest. The clustering in the paper is also a bit more advanced, and better, but not as easy to implement.
Here is your code, modified to work:
load rings.mat
[D, N] = size(X); % data stored in X
%initial plot data
figure; hold on;
for i=1:N,
plot(X(1,i), X(2,i),'o');
end
% perform spectral clustering
W = zeros(N,N);
D = zeros(N,N);
sigma = 0.3; % <--- Changed to be smaller
for i=1:N,
for j=1:N,
xixj2 = (X(1,i)-X(1,j))^2 + (X(2,i)-X(2,j))^2 ;
W(i,j) = exp( -1*xixj2 / (2*sigma^2) ) ; % compute weight here
% if (i==j)
% W(i,j)=0;
% end;
end;
D(i,i) = sum(W(i,:)) ;
end;
L = D - W ;
normL = D^-0.5*L*D^-0.5;
[u,s,v] = svd(normL);
% New code below this point
cluster1 = find(u(:,end-1) >= 0);
cluster2 = find(u(:,end-1) < 0);
figure
plot(X(1,cluster1),X(2,cluster1),'.b')
hold on
plot(X(1,cluster2),X(2,cluster2),'.r')
hold off
title(sprintf('sigma=%d',sigma))
Here is the result:
Now notice that I changed sigma to be smaller - from 1.0 to 0.3. When I left it at 1.0, I got the following result:
which I assume is because with sigma=1, the points in the inner cluster were able to "pull" on the outer cluster (which they are about distance 1 away from) enough so that it was more energetically favorable to split both circles in half like a solid vibrating drum, rather than have two different circles.
This question already has answers here:
Random numbers that add to 100: Matlab
(4 answers)
Closed 7 years ago.
I am looking how to pick 10 positive non-zero elements in 1x10 array randomly whose sum is 1
Example :
A=[0.0973 0.1071 0.0983 0.0933 0.1110 0.0942 0.1062 0.0970 0.0981 0.0974]
Note: If we sum the elements in above matrix it will be 1. I need matlab to generate a matrix like this randomly
Try using Roger's fex submission: http://www.mathworks.com/matlabcentral/fileexchange/9700-random-vectors-with-fixed-sum
Here is a copy of the content of the file (in case the link dies).
All the credit obviously goes to the original poster Roger Stafford:
function [x,v] = randfixedsum(n,m,s,a,b)
% [x,v] = randfixedsum(n,m,s,a,b)
%
% This generates an n by m array x, each of whose m columns
% contains n random values lying in the interval [a,b], but
% subject to the condition that their sum be equal to s. The
% scalar value s must accordingly satisfy n*a <= s <= n*b. The
% distribution of values is uniform in the sense that it has the
% conditional probability distribution of a uniform distribution
% over the whole n-cube, given that the sum of the x's is s.
%
% The scalar v, if requested, returns with the total
% n-1 dimensional volume (content) of the subset satisfying
% this condition. Consequently if v, considered as a function
% of s and divided by sqrt(n), is integrated with respect to s
% from s = a to s = b, the result would necessarily be the
% n-dimensional volume of the whole cube, namely (b-a)^n.
%
% This algorithm does no "rejecting" on the sets of x's it
% obtains. It is designed to generate only those that satisfy all
% the above conditions and to do so with a uniform distribution.
% It accomplishes this by decomposing the space of all possible x
% sets (columns) into n-1 dimensional simplexes. (Line segments,
% triangles, and tetrahedra, are one-, two-, and three-dimensional
% examples of simplexes, respectively.) It makes use of three
% different sets of 'rand' variables, one to locate values
% uniformly within each type of simplex, another to randomly
% select representatives of each different type of simplex in
% proportion to their volume, and a third to perform random
% permutations to provide an even distribution of simplex choices
% among like types. For example, with n equal to 3 and s set at,
% say, 40% of the way from a towards b, there will be 2 different
% types of simplex, in this case triangles, each with its own
% area, and 6 different versions of each from permutations, for
% a total of 12 triangles, and these all fit together to form a
% particular planar non-regular hexagon in 3 dimensions, with v
% returned set equal to the hexagon's area.
%
% Roger Stafford - Jan. 19, 2006
% Check the arguments.
if (m~=round(m))|(n~=round(n))|(m<0)|(n<1)
error('n must be a whole number and m a non-negative integer.')
elseif (s<n*a)|(s>n*b)|(a>=b)
error('Inequalities n*a <= s <= n*b and a < b must hold.')
end
% Rescale to a unit cube: 0 <= x(i) <= 1
s = (s-n*a)/(b-a);
% Construct the transition probability table, t.
% t(i,j) will be utilized only in the region where j <= i + 1.
k = max(min(floor(s),n-1),0); % Must have 0 <= k <= n-1
s = max(min(s,k+1),k); % Must have k <= s <= k+1
s1 = s - [k:-1:k-n+1]; % s1 & s2 will never be negative
s2 = [k+n:-1:k+1] - s;
w = zeros(n,n+1); w(1,2) = realmax; % Scale for full 'double' range
t = zeros(n-1,n);
tiny = 2^(-1074); % The smallest positive matlab 'double' no.
for i = 2:n
tmp1 = w(i-1,2:i+1).*s1(1:i)/i;
tmp2 = w(i-1,1:i).*s2(n-i+1:n)/i;
w(i,2:i+1) = tmp1 + tmp2;
tmp3 = w(i,2:i+1) + tiny; % In case tmp1 & tmp2 are both 0,
tmp4 = (s2(n-i+1:n) > s1(1:i)); % then t is 0 on left & 1 on right
t(i-1,1:i) = (tmp2./tmp3).*tmp4 + (1-tmp1./tmp3).*(~tmp4);
end
% Derive the polytope volume v from the appropriate
% element in the bottom row of w.
v = n^(3/2)*(w(n,k+2)/realmax)*(b-a)^(n-1);
% Now compute the matrix x.
x = zeros(n,m);
if m == 0, return, end % If m is zero, quit with x = []
rt = rand(n-1,m); % For random selection of simplex type
rs = rand(n-1,m); % For random location within a simplex
s = repmat(s,1,m);
j = repmat(k+1,1,m); % For indexing in the t table
sm = zeros(1,m); pr = ones(1,m); % Start with sum zero & product 1
for i = n-1:-1:1 % Work backwards in the t table
e = (rt(n-i,:)<=t(i,j)); % Use rt to choose a transition
sx = rs(n-i,:).^(1/i); % Use rs to compute next simplex coord.
sm = sm + (1-sx).*pr.*s/(i+1); % Update sum
pr = sx.*pr; % Update product
x(n-i,:) = sm + pr.*e; % Calculate x using simplex coords.
s = s - e; j = j - e; % Transition adjustment
end
x(n,:) = sm + pr.*s; % Compute the last x
% Randomly permute the order in the columns of x and rescale.
rp = rand(n,m); % Use rp to carry out a matrix 'randperm'
[ig,p] = sort(rp); % The values placed in ig are ignored
x = (b-a)*x(p+repmat([0:n:n*(m-1)],n,1))+a; % Permute & rescale x
return
we are working on a project and trying to get some results with KPCA.
We have a dataset (handwritten digits) and have taken the 200 first digits of each number so our complete traindata matrix is 2000x784 (784 are the dimensions).
When we do KPCA we get a matrix with the new low-dimensionality dataset e.g.2000x100. However we don't understand the result. Shouldn;t we get other matrices such as we do when we do svd for pca? the code we use for KPCA is the following:
function data_out = kernelpca(data_in,num_dim)
%% Checking to ensure output dimensions are lesser than input dimension.
if num_dim > size(data_in,1)
fprintf('\nDimensions of output data has to be lesser than the dimensions of input data\n');
fprintf('Closing program\n');
return
end
%% Using the Gaussian Kernel to construct the Kernel K
% K(x,y) = -exp((x-y)^2/(sigma)^2)
% K is a symmetric Kernel
K = zeros(size(data_in,2),size(data_in,2));
for row = 1:size(data_in,2)
for col = 1:row
temp = sum(((data_in(:,row) - data_in(:,col)).^2));
K(row,col) = exp(-temp); % sigma = 1
end
end
K = K + K';
% Dividing the diagonal element by 2 since it has been added to itself
for row = 1:size(data_in,2)
K(row,row) = K(row,row)/2;
end
% We know that for PCA the data has to be centered. Even if the input data
% set 'X' lets say in centered, there is no gurantee the data when mapped
% in the feature space [phi(x)] is also centered. Since we actually never
% work in the feature space we cannot center the data. To include this
% correction a pseudo centering is done using the Kernel.
one_mat = ones(size(K));
K_center = K - one_mat*K - K*one_mat + one_mat*K*one_mat;
clear K
%% Obtaining the low dimensional projection
% The following equation needs to be satisfied for K
% N*lamda*K*alpha = K*alpha
% Thus lamda's has to be normalized by the number of points
opts.issym=1;
opts.disp = 0;
opts.isreal = 1;
neigs = 30;
[eigvec eigval] = eigs(K_center,[],neigs,'lm',opts);
eig_val = eigval ~= 0;
eig_val = eig_val./size(data_in,2);
% Again 1 = lamda*(alpha.alpha)
% Here '.' indicated dot product
for col = 1:size(eigvec,2)
eigvec(:,col) = eigvec(:,col)./(sqrt(eig_val(col,col)));
end
[~, index] = sort(eig_val,'descend');
eigvec = eigvec(:,index);
%% Projecting the data in lower dimensions
data_out = zeros(num_dim,size(data_in,2));
for count = 1:num_dim
data_out(count,:) = eigvec(:,count)'*K_center';
end
we have read lots of papers but still cannot get the hand of kpca's logic!
Any help would be appreciated!
PCA Algorithm:
PCA data samples
Compute mean
Compute covariance
Solve
: Covariance matrix.
: Eigen Vectors of covariance matrix.
: Eigen values of covariance matrix.
With the first n-th eigen vectors you reduce the dimensionality of your data to the n dimensions. You can use this code for the PCA, it has an integraded example and it is simple.
KPCA Algorithm:
We choose a kernel function in you code this is specified by:
K(x,y) = -exp((x-y)^2/(sigma)^2)
in order to represent your data in a high dimensional space hopping that, in this space your data will be well represented for further porposes like classification or clustering whereas this task could be harder to be solved in the initial feature space. This trick is aslo known as "Kernel trick". Look figure.
[Step1] Constuct gram matrix
K = zeros(size(data_in,2),size(data_in,2));
for row = 1:size(data_in,2)
for col = 1:row
temp = sum(((data_in(:,row) - data_in(:,col)).^2));
K(row,col) = exp(-temp); % sigma = 1
end
end
K = K + K';
% Dividing the diagonal element by 2 since it has been added to itself
for row = 1:size(data_in,2)
K(row,row) = K(row,row)/2;
end
Here because the gram matrix is symetric the half of the values are computed and the final result is obtained by adding the computed so far gram matrix and its transpose. Finally, we divide by 2 as the comments mention.
[Step2] Normalize the kernel matrix
This is done by this part of your code:
K_center = K - one_mat*K - K*one_mat + one_mat*K*one_mat;
As the comments mention a pseudocentering procedure must be done. For an idea about the proof here.
[Step3] Solve the eigenvalue problem
For this task this part of the code is responsible.
%% Obtaining the low dimensional projection
% The following equation needs to be satisfied for K
% N*lamda*K*alpha = K*alpha
% Thus lamda's has to be normalized by the number of points
opts.issym=1;
opts.disp = 0;
opts.isreal = 1;
neigs = 30;
[eigvec eigval] = eigs(K_center,[],neigs,'lm',opts);
eig_val = eigval ~= 0;
eig_val = eig_val./size(data_in,2);
% Again 1 = lamda*(alpha.alpha)
% Here '.' indicated dot product
for col = 1:size(eigvec,2)
eigvec(:,col) = eigvec(:,col)./(sqrt(eig_val(col,col)));
end
[~, index] = sort(eig_val,'descend');
eigvec = eigvec(:,index);
[Step4] Change representaion of each data point
For this task this part of the code is responsible.
%% Projecting the data in lower dimensions
data_out = zeros(num_dim,size(data_in,2));
for count = 1:num_dim
data_out(count,:) = eigvec(:,count)'*K_center';
end
Look the details here.
PS: I encurage you to use code written from this author and contains intuitive examples.