We are working on a project and trying to get some results with KPCA.
We have a dataset of handwritten digits and have taken the first 200 samples of each digit, so our complete train-data matrix is 2000x784 (784 being the number of dimensions).
When we do KPCA we get a matrix with the new low-dimensional dataset, e.g. 2000x100. However, we don't understand the result. Shouldn't we get other matrices, as we do when we use SVD for PCA? The code we use for KPCA is the following:
function data_out = kernelpca(data_in,num_dim)
%% Checking to ensure the output dimension is less than the input dimension.
if num_dim > size(data_in,1)
fprintf('\nThe dimension of the output data has to be less than the dimension of the input data\n');
fprintf('Closing program\n');
return
end
%% Using the Gaussian kernel to construct the kernel matrix K
% K(x,y) = exp(-||x - y||^2 / sigma^2)
% K is a symmetric Kernel
K = zeros(size(data_in,2),size(data_in,2));
for row = 1:size(data_in,2)
for col = 1:row
temp = sum(((data_in(:,row) - data_in(:,col)).^2));
K(row,col) = exp(-temp); % sigma = 1
end
end
K = K + K';
% Dividing the diagonal element by 2 since it has been added to itself
for row = 1:size(data_in,2)
K(row,row) = K(row,row)/2;
end
% We know that for PCA the data has to be centered. Even if the input data
% set 'X' is centered, there is no guarantee that the data, when mapped
% into the feature space [phi(x)], is also centered. Since we never actually
% work in the feature space, we cannot center the data there. To include this
% correction a pseudo-centering is done using the kernel matrix.
one_mat = ones(size(K))./size(K,1); % the "1_N" matrix: every entry equals 1/N
K_center = K - one_mat*K - K*one_mat + one_mat*K*one_mat;
clear K
%% Obtaining the low dimensional projection
% The following equation needs to be satisfied for K
% N*lamda*alpha = K*alpha
% Thus the lamda's have to be normalized by the number of points
opts.issym=1;
opts.disp = 0;
opts.isreal = 1;
neigs = 30;
[eigvec, eigval] = eigs(K_center, neigs, 'lm', opts);
% Keep the eigenvalues (diagonal of eigval) and normalize them by the number of points
eig_val = diag(eigval)./size(data_in,2);
% Again 1 = lamda*(alpha.alpha)
% Here '.' indicates the dot product
for col = 1:size(eigvec,2)
eigvec(:,col) = eigvec(:,col)./sqrt(eig_val(col));
end
[~, index] = sort(eig_val,'descend');
eigvec = eigvec(:,index);
%% Projecting the data in lower dimensions
data_out = zeros(num_dim,size(data_in,2));
for count = 1:num_dim
data_out(count,:) = eigvec(:,count)'*K_center';
end
We have read lots of papers but still cannot get the hang of KPCA's logic!
Any help would be appreciated!
PCA Algorithm:
Given data samples x_i, i = 1, ..., N:
Compute the mean: m = (1/N) * sum_i x_i
Compute the covariance: C = (1/N) * sum_i (x_i - m)*(x_i - m)'
Solve the eigenvalue problem: C*v = lambda*v
C: covariance matrix.
v: eigenvectors of the covariance matrix.
lambda: eigenvalues of the covariance matrix.
With the first n eigenvectors you reduce the dimensionality of your data to n dimensions. You can use this code for PCA; it has an integrated example and it is simple.
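For concreteness, here is a minimal sketch of those steps in MATLAB (not the linked code; X and n_keep are placeholder names), assuming X holds one sample per row:
% Minimal PCA sketch (illustrative only): X is N-by-d, one sample per row.
m = mean(X,1);                        % mean of the data
Xc = bsxfun(@minus, X, m);            % center the data
C = (Xc'*Xc) / size(X,1);             % covariance matrix
[V, D] = eig(C);                      % eigenvectors V, eigenvalues on diag(D)
[~, order] = sort(diag(D),'descend');
V = V(:,order);                       % sort eigenvectors by decreasing eigenvalue
Y = Xc * V(:,1:n_keep);               % project onto the first n_keep eigenvectors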
KPCA Algorithm:
We choose a kernel function; in your code this is specified by:
K(x,y) = exp(-||x - y||^2 / sigma^2)
in order to represent your data in a high-dimensional space, hoping that in this space your data will be better represented for further purposes like classification or clustering, whereas this task could be harder to solve in the initial feature space. This trick is also known as the "kernel trick".
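As a toy illustration of the kernel trick (using a degree-2 polynomial kernel rather than the Gaussian one above, simply because its feature map is finite and can be written out), the kernel value computed in the input space equals an inner product in the feature space without ever forming phi(x):
% Toy kernel-trick check with k(x,y) = (x'*y)^2 in 2-D.
% The explicit feature map is phi(v) = [v1^2; sqrt(2)*v1*v2; v2^2].
x = [1; 2];  y = [3; 4];
phi = @(v) [v(1)^2; sqrt(2)*v(1)*v(2); v(2)^2];
k_input   = (x'*y)^2;          % kernel evaluated directly in input space
k_feature = phi(x)'*phi(y);    % same value via the feature-space inner product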
[Step 1] Construct the Gram matrix
K = zeros(size(data_in,2),size(data_in,2));
for row = 1:size(data_in,2)
for col = 1:row
temp = sum(((data_in(:,row) - data_in(:,col)).^2));
K(row,col) = exp(-temp); % sigma = 1
end
end
K = K + K';
% Dividing the diagonal element by 2 since it has been added to itself
for row = 1:size(data_in,2)
K(row,row) = K(row,row)/2;
end
Because the Gram matrix is symmetric, only the lower-triangular half of the values is computed, and the full matrix is obtained by adding the half computed so far to its transpose. Finally, the diagonal is divided by 2 since it was added to itself, as the comments mention.
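For reference, the same Gaussian Gram matrix can be built without the double loop (a sketch, assuming data_in holds one point per column as in the code above):
% Vectorized Gaussian Gram matrix, sigma = 1 (equivalent to the loops above).
sq = sum(data_in.^2, 1);                              % squared norm of each column
D2 = bsxfun(@plus, sq', sq) - 2*(data_in'*data_in);   % pairwise squared distances
K  = exp(-D2);                                        % Gaussian kernel, sigma = 1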
[Step 2] Center the kernel matrix
This is done by this part of your code:
K_center = K - one_mat*K - K*one_mat + one_mat*K*one_mat;
As the comments mention, a pseudo-centering procedure must be done, because we cannot center the (implicit) feature-space images directly. Here one_mat is an N-by-N matrix whose entries all equal 1/N, so the formula removes the row and column means of K, which amounts to centering the data in feature space.
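Spelled out as a self-contained sketch (N being the number of data points):
N = size(K,1);
one_mat = ones(N)/N;                                        % the "1_N" matrix
K_center = K - one_mat*K - K*one_mat + one_mat*K*one_mat;   % centered kernel matrix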
[Step 3] Solve the eigenvalue problem
For this task this part of the code is responsible.
%% Obtaining the low dimensional projection
% The following equation needs to be satisfied for K
% N*lamda*alpha = K*alpha
% Thus the lamda's have to be normalized by the number of points
opts.issym=1;
opts.disp = 0;
opts.isreal = 1;
neigs = 30;
[eigvec, eigval] = eigs(K_center, neigs, 'lm', opts);
% Keep the eigenvalues (diagonal of eigval) and normalize them by the number of points
eig_val = diag(eigval)./size(data_in,2);
% Again 1 = lamda*(alpha.alpha)
% Here '.' indicates the dot product
for col = 1:size(eigvec,2)
eigvec(:,col) = eigvec(:,col)./sqrt(eig_val(col));
end
[~, index] = sort(eig_val,'descend');
eigvec = eigvec(:,index);
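To convince yourself of this normalization you can check (a sketch) that each column now satisfies lamda*(alpha.alpha) = 1, taking the sorting permutation into account:
% Sanity check: eig_val(index(k)) * ||eigvec(:,k)||^2 should be (close to) 1.
check = zeros(1, size(eigvec,2));
for k = 1:size(eigvec,2)
check(k) = eig_val(index(k)) * (eigvec(:,k)'*eigvec(:,k));
end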
[Step 4] Change the representation of each data point
For this task this part of the code is responsible.
%% Projecting the data in lower dimensions
data_out = zeros(num_dim,size(data_in,2));
for count = 1:num_dim
data_out(count,:) = eigvec(:,count)'*K_center';
end
Each row of data_out is the projection of the data onto one kernel principal component: eigvec(:,count)'*K_center gives, for every data point, the weighted sum of its centered kernel values against all training points, with the weights coming from the corresponding eigenvector alpha.
PS: I encourage you to use the code written by this author; it contains intuitive examples.
Related
I'm generating 3d fractal noise in MATLAB using a variety of methods. It's working relatively well, but I'm having an issue where I see vertical striping artifacts in my noise. This happens regardless of what data type or resolution I use.
Edit: I figured it out. The solution is posted as an answer below. Thanks everyone for your thoughts and guidance!
expo = 2^6;
dims = [expo,expo,expo];
beta = -4.5;
render = randnd(beta, dims); % Create volumetric fractal
render = render - min(render); % Set floor to zero
render = render ./ max(render); % Set ceiling to one
%render = imbinarize(render); % BW Threshold option
render = render .* 255; % For greyscale
slicer = 1; % Turn on image slicer/saver
i = 0; % Page counter
format = '.png';
imagename = '___testDump/slice';
imshow(render(:,:,1),[0 255]); %Single test image
if slicer == 1
for c = 1:length(render)
i = i+1;
pagenumber = num2str(i);
filename = [imagename, pagenumber, format];
imwrite(uint8(render(:,:,i)),filename)
end
end
function X = randnd(beta,varargin)
seed = 999;
rng(seed); % Set seed
%% X = randnd(beta,varargin)
% Based on similar functions by Jon Yearsley and Hristo Zhivomirov
% Written by Marcin Konowalczyk
% Timmel Group # Oxford University
%% Parse the input
narginchk(0,Inf); nargoutchk(0,1);
if nargin < 2 || isempty(beta); beta = 0; end % Default to white noise
assert(isnumeric(beta) && isequal(size(beta),[1 1]),'''beta'' must be a number');
assert(-6 <= beta && beta <= 6,'''beta'' out of range'); % Put on reasonable bounds
%% Generate N-dimensional white noise with 'randn'
X = randn(varargin{:});
if isempty(X); return; end; % Usually happens when size vector contains zeros
% Squeeze prevents an error if X has more than one leading singleton dimension
% This is a slight deviation from the pure functionality of 'randn'
X = squeeze(X);
% Return if white noise is requested
if beta == 0; return; end;
%% Generate corresponding N-dimensional matrix of multipliers
N = size(X);
% Create matrix of multipliers (M) of X in the frequency domain
M = [];
for j = 1:length(N)
n = N(j);
if (rem(n,2)~=0) % if n is odd
% Nyquist frequency bin does not show up in odd-numbered fft
k = ifftshift(-(n-1)/2:(n-1)/2);
else
k = ifftshift(-n/2:n/2-1);
end
% Spectral multipliers
m = (k.^2)';
if isempty(M);
M = m;
else
% Create the permutation vector
M_perm = circshift(1:length(size(M))+1,[0 1]);
% Permute a singleton dimension to the beginning of M
M = permute(M,M_perm);
% Add m along the first dimension of M
M = bsxfun(@plus,M,m);
end
end
% Reverse M to match X (since new dimensions were being added from the left)
M = permute(M,length(size(M)):-1:1);
assert(isequal(size(M),size(X)),'Bad programming error'); % This should never occur
% Shape the amplitude multipliers by beta/4 which corresponds to shaping the power by beta
M = M.^(beta/4);
% Set the DC component to zero
M(1,1) = 0;
%% Multiply X by M in frequency domain
Xstd = std(X(:));
Xmean = mean(X(:));
X = real(ifftn(fftn(X).*M));
% Force zero mean unity standard deviation
X = X - mean(X(:));
X = X./std(X(:));
% Restore the standard deviation and mean from before the spectral shaping.
% This ensures the random sample from randn is truly random. After all, if
% the mean was always exactly zero it would not be all that random.
X = X + Xmean;
X = X.*Xstd;
end
Here is my solution:
My "min/max" code (lines 6 and 7) was bad. I wanted to divide all values in the matrix by the single largest value in the matrix so that all values would be between 0 and 1. Because I used max() improperly, I was stepping through the max value of each column and using that as my divisor; thus the vertical stripes.
In the end this is what my code looks like. X is the 3 dimensional matrix:
minVal = min(X,[],'all'); % Get the lowest value in the entire matrix
X = X - minVal; % Set min value to zero
maxVal = max(X,[],'all'); % Get the highest value in the entire matrix
X = X ./ maxVal; % Set max value to one
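For anyone hitting the same issue: on a matrix, max(A) operates column-wise, which is what produced the striping; the 'all' option (R2018b and later) reduces over every element, equivalent to max(A(:)). A quick sketch:
A = magic(4);
colMax = max(A);            % 1-by-4 vector: the maximum of each column
allMax = max(A, [], 'all'); % scalar: the maximum over all elements (R2018b+)
oldMax = max(A(:));         % equivalent scalar form for older releases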
My question may be a simple one, but I could not think of a logical explanation for it:
When I use
rref(hilb(8)), rref(hilb(9)), rref(hilb(10)), rref(hilb(11))
it gives me the result that I expected, a unit matrix.
However when it comes to the
rref(hilb(12))
it does not give a nonsingular matrix as expected. I used Wolfram and it gives the unit matrix for the same case, so I am sure that it should have given a unit matrix. There may be a round-off error or something like that, but then 1/11 or 1/7 also have some troublesome decimals,
so why does MATLAB behave like this when it comes to 12?
It does indeed seem like a precision error. This makes sense, as the determinant of the Hilbert matrix of order n tends to 0 as n tends to infinity (see here). However, you can use rref with the tol parameter:
[R,jb] = rref(A,tol)
and take tol to be very small to get more precise results. For example, rref(hilb(12),1e-20)
will give you the identity matrix.
EDIT- more details regarding the role of the tol parameter.
The source code of rref is provided at the bottom of the answer. The tol is used when we search for a maximal element in absolute value in a certain part of a column, to find the pivot row.
% Find value and index of largest element in the remainder of column j.
[p,k] = max(abs(A(i:m,j))); k = k+i-1;
if (p <= tol)
% The column is negligible, zero it out.
A(i:m,j) = zeros(m-i+1,1);
j = j + 1;
If all the elements are smaller than tol in absolute value, the relevant part of the column is filled with zeros. This seems to be where the precision error for rref(hilb(12)) occurs. By reducing tol we avoid this issue in rref(hilb(12),1e-20).
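A quick way to see this for yourself (a sketch; the exact numbers depend on your MATLAB version) is to compare the default tolerance rref computes with the behaviour under a much smaller one:
A = hilb(12);
[m,n] = size(A);
default_tol = max(m,n) * eps(class(A)) * norm(A,inf)  % the tolerance rref uses by default
R_default = rref(A);          % some pivot candidates fall below default_tol and get zeroed
R_tight   = rref(A, 1e-20);   % with a tiny tolerance this returns the identity
isequal(R_tight, eye(12))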
source code:
function [A,jb] = rref(A,tol)
%RREF Reduced row echelon form.
% R = RREF(A) produces the reduced row echelon form of A.
%
% [R,jb] = RREF(A) also returns a vector, jb, so that:
% r = length(jb) is this algorithm's idea of the rank of A,
% x(jb) are the bound variables in a linear system, Ax = b,
% A(:,jb) is a basis for the range of A,
% R(1:r,jb) is the r-by-r identity matrix.
%
% [R,jb] = RREF(A,TOL) uses the given tolerance in the rank tests.
%
% Roundoff errors may cause this algorithm to compute a different
% value for the rank than RANK, ORTH and NULL.
%
% Class support for input A:
% float: double, single
%
% See also RANK, ORTH, NULL, QR, SVD.
% Copyright 1984-2005 The MathWorks, Inc.
% $Revision: 5.9.4.3 $ $Date: 2006/01/18 21:58:54 $
[m,n] = size(A);
% Does it appear that elements of A are ratios of small integers?
[num, den] = rat(A);
rats = isequal(A,num./den);
% Compute the default tolerance if none was provided.
if (nargin < 2), tol = max(m,n)*eps(class(A))*norm(A,'inf'); end
% Loop over the entire matrix.
i = 1;
j = 1;
jb = [];
while (i <= m) && (j <= n)
% Find value and index of largest element in the remainder of column j.
[p,k] = max(abs(A(i:m,j))); k = k+i-1;
if (p <= tol)
% The column is negligible, zero it out.
A(i:m,j) = zeros(m-i+1,1);
j = j + 1;
else
% Remember column index
jb = [jb j];
% Swap i-th and k-th rows.
A([i k],j:n) = A([k i],j:n);
% Divide the pivot row by the pivot element.
A(i,j:n) = A(i,j:n)/A(i,j);
% Subtract multiples of the pivot row from all the other rows.
for k = [1:i-1 i+1:m]
A(k,j:n) = A(k,j:n) - A(k,j)*A(i,j:n);
end
i = i + 1;
j = j + 1;
end
end
% Return "rational" numbers if appropriate.
if rats
[num,den] = rat(A);
A=num./den;
end
I am constructing an adjacency list based on intensity difference of the pixels in an image.
The code snippet in Matlab is as follows:
m=1;
len = size(cur_label, 1);
for j=1:len
for k=1:len
if(k~=j) % avoiding diagonal elements
intensity_diff = abs(indx_intensity(j)-indx_intensity(k)); % intensity difference of two pixels
if intensity_diff<=10 % difference thresholded by 10
adj_list(m, 1) = j; % storing the vertices of the edge
adj_list(m, 2) = k;
m = m+1;
end
end
end
end
y = sparse(adj_list(:,1),adj_list(:,2),1); % creating a sparse matrix from the adjacency list
How can I avoid these nasty nested for loops? If the image size is big, it works disastrously slowly. If anyone has a solution, it would be a great help to me.
Regards
Ratna
I am assuming the input indx_intensity is a 1D array here. With that assumption, here's a vectorized approach with broadcasting/bsxfun -
%// Threshold parameter
thresh = 10;
%// Get elementwise differentiation between elements in indx_intensity
diffs = abs(bsxfun(@minus,indx_intensity(:),indx_intensity(:).')); %//'
%// Threshold the differentiations against the threshold, thus giving us a
%// 2D square matrix. Then, set the diagonal elements to zero to avoid them.
mask = diffs <= thresh;
mask(1:len+1:end) = 0;
%// Get the indices of the TRUE elements in the valid mask as final output.
[R,C] = find(mask);
adj_list_out = [C R];
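If you still need the sparse adjacency matrix at the end, as in the original code, you can build it directly from the mask or from the edge list; a sketch using the variables above:
%// Option 1: straight from the logical mask
y = sparse(mask);
%// Option 2: from the edge list, mirroring the original sparse() call
y = sparse(adj_list_out(:,1), adj_list_out(:,2), 1, len, len);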
I am attempting to simulate a 2D field of non-stationary, single variable spatial data based on a correlation function in such a way that I can tweak various parameters and observe the effects.
Specifically, I am trying to (quickly) generate non-stationary data following a varying Matérn covariance structure based on equation (9) in this paper:
http://arxiv.org/pdf/1411.3174.pdf
where R is the Matérn or exponential covariance function, and Omega_s represents the covariance matrix at each point s. After making an n^2 x n^2 cell of 2x2 Omega matrices OMs, which specify the way the simulated random field will correlate at given locations, I use the code below to build the correlation matrix:
rho = zeros(n^2);
for i = 1:n^2 % Point 1
for j = 1:n^2 % Point 2
if i == j
rho(i,j) = 1; % Correlation at lag 0 = 1
else
x1 = Crd(i,1); y1 = Crd(i,2); % Crd is n^2 by 2 matrix of
x2 = Crd(j,1); y2 = Crd(j,2); % coordinates in 2D
% Lag vector
h = [x1-x2;y1-y2];
% Correlation function:
R = exp(-norm(h)^alpha);
Q = h'*(((OMs{x1,y1}+OMs{x2,y2})/2) \ h);
OMEGA1 = (det(OMs{x1,y1})^(0.25))*det(OMs{x2,y2})^(0.25);
OMEGA2 = det((OMs{x1,y1}+OMs{x2,y2})/2)^(-0.5);
rho(i,j) = OMEGA1*OMEGA2*R*Q^(0.5);
end
end
end
Unfortunately, for moderately sized fields - even 40 by 40 - this takes a lot of time. I need to run this for many different parameters - is there some way it can be done faster, without those loops?
I have the following code for calculating PCA in MATLAB:
train_out = train';
test_out = test';
% subtract off the mean for each dimension
mn = mean(train_out,2);
train_out = train_out - repmat(mn,1,train_size);
test_out = test_out - repmat(mn,1,test_size);
% calculate the covariance matrix
covariance = 1 / (train_size-1) * train_out * train_out';
% find the eigenvectors and eigenvalues
[PC, V] = eig(covariance);
% extract diagonal of matrix as vector
V = diag(V);
% sort the variances in decreasing order
[junk, rindices] = sort(-1*V);
V = V(rindices);
PC = PC(:,rindices);
% project the original data set
out = PC' * train_out;
train_out = out';
out = PC' * test_out;
test_out = out';
The train and test matrices have observations in rows and feature variables in columns. When I perform classification on the original data (without PCA) I get much better results than with PCA, even when I keep all dimensions. When I tried doing PCA directly on the whole dataset (train + test) I noticed that the correlations between these new principal components and the previous ones are either near 1 or near -1, which I find strange. I am probably doing something wrong but just can't figure it out.
The code is correct; however, using the princomp function may be easier:
train_out=train; % save original data
test_out=test;
mn = mean(train_out);
train_out = bsxfun(@minus,train_out,mn); % subtract mean
test_out = bsxfun(@minus,test_out,mn);
[coefs,scores,variances] = princomp(train_out,'econ'); % PCA
pervar = cumsum(variances) / sum(variances);
dims = max(find(pervar < var_frac)); % var_frac - e.g. 0.99 - fraction of variance explained
train_out = train_out*coefs(:,1:dims); % dims - keep this many dimensions
test_out = test_out*coefs(:,1:dims); % result is in train_out and test_out
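Note that princomp is deprecated in newer MATLAB releases in favour of pca (in the Statistics Toolbox); a sketch of the same steps using pca, mirroring the code above, would be:
train_out = train; % save original data
test_out = test;
mn = mean(train_out);
train_out = bsxfun(@minus,train_out,mn); % subtract mean
test_out = bsxfun(@minus,test_out,mn);
[coefs,scores,variances] = pca(train_out,'Economy',true); % PCA
pervar = cumsum(variances) / sum(variances);
dims = max(find(pervar < var_frac)); % var_frac - e.g. 0.99 - fraction of variance explained
train_out = train_out*coefs(:,1:dims); % keep this many dimensions
test_out = test_out*coefs(:,1:dims); % result is in train_out and test_out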