Why does the first principal component show the least difference in my PCA? - matlab

I am using PCA on a set of facial images in MATLAB.
Creating an average face and randomizing others are working fine.
In my function vectorComparison I want to see the difference along each principal component vector when scaling by the standard deviation. But when I use eig_face_index = 1 I see less of a difference than when I use 2, or 3, etc.
The higher indexes also seem to add more color, which could be due to noise in the eigenfaces, as I am using RGB space.
Why does my initial vector show the least difference? Shouldn't it be the other way around?
Here is all of the code I am using:
main.m
clear;clc;close all;
[imvecs,img] = loadImages();
meanval = meanValue(imvecs);
[T, D] = covarianceMatrix(imvecs, meanval);
[eigvecs, eigvals] = findEigVecs(imvecs, T, D);
eigenfaces = createEigenFaces(eigvecs, imvecs, img);
%%
[mean_image] = createAverageFace(meanval, img);
%%
[stdev_vec] = createRandomFace(eigvals, eigvecs, imvecs, meanval, img);
%%
vectorComparison(meanval, eigvecs, stdev_vec, img, mean_image);
loadImages.m
function [imvecs,img] = loadImages()
images = dir('D:My\Path\*.png');
imgPath = 'D:My\Path\';
img=imread([imgPath images(1).name]);
n=length(images);
for i = 1:n
img = imread([imgPath images(i).name]);
imvecs{i} = double(img(:));
end
return
meanValue.m
function meanval = meanValue(imvecs, imageNr)
%Creates the mean value from our images.
sumvec=imvecs{1};
for i = 2:(size(imvecs,2))
sumvec = sumvec + imvecs{i};
end
meanval = sumvec ./(size(imvecs,2));
return
covarianceMatrix.m
function [T, D] = covarianceMatrix(imvecs, meanval)
D = [];
for i = 1:size(imvecs,2),
diff = imvecs{i} - meanval;
D = [D, diff];
end
%Dimensionality reduction
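%Instead of the full pixel-by-pixel covariance D*D' (which would be huge),
%we use the much smaller images-by-images matrix D'*D; findEigVecs maps its
%eigenvectors back to image space with D*U(:,i).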
T = (D' * D) ./ (size(imvecs,2));
return
findEigVecs.m
function [eigvecs, eigvals] = findEigVecs(imvecs, T, D)
[U,eigvals,V] = svd( T );
eigvecs = [];
for i = 1:size(imvecs,2),
eigvec = D * U(:,i);
eigvec = eigvec ./ sum(eigvec);
eigvecs = [eigvecs, eigvec];
end
return
createEigenFaces.m
function [eigenfaces] = createEigenFaces(eigvecs, imvecs, img)
for i = 1:size(imvecs,2),
eigface = reshape(eigvecs( : , i), size(img));
eigface = eigface - min(min(min((eigface))));
eigface = eigface ./ max(max(max((eigface))));
eigenfaces{i}=eigface;
%figure;imagesc(eigface);
end
return
createAverageFace.m
function [mean_image] = createAverageFace(meanval, img)
mean_image = reshape(meanval, size(img));
figure;imagesc(mean_image./255);
title('Average Face')
return
createRandomFace.m
function [stdev_vec] = createRandomFace(eigvals, eigvecs, imvecs, meanval, img)
stdev_vec = sqrt(diag(eigvals));
t = (100 * rand(size(imvecs,2),1) - 50) .* stdev_vec;
new_face1 = meanval + (eigvecs * t);
new_face1 = reshape(new_face1, size(img));
figure;imagesc(new_face1./255);
title('Random Face')
return
vectorComparison.m
function [] = vectorComparison(meanval, eigvecs, stdev_vec, img, mean_image)
t = zeros(17,1);
eig_face_index = 1;
t(eig_face_index) = 1000;
t = t.*stdev_vec;
new_face1 = meanval + (eigvecs * t);
new_face1 = reshape(new_face1, size(img));
new_face2 = meanval - (eigvecs * t);
new_face2 = reshape(new_face2, size(img));
figure;
title('PCA Comparison')
subplot(3,1,1), subimage(new_face1./255)
subplot(3,1,2), subimage(mean_image./255)
subplot(3,1,3), subimage(new_face2./255)
return

I found what was wrong here.
In my function findEigVecs I make each vector a unit vector (length 1) by dividing by the sum of the vector itself. That only works as long as the entries of the vector are all positive.
However, since an eigenvector is equally valid pointing in the positive or the negative direction, we cannot rely on the sign of its entries, so we cannot use sum here.
Instead we need to divide by norm, MATLAB's function for the Euclidean length of a vector.
function [eigvecs, eigvals] = findEigVecs(imvecs, T, D)
[U,eigvals,V] = svd( T );
eigvecs = [];
for i = 1:size(imvecs,2),
eigvec = D * U(:,i);
eigvec = eigvec ./ norm(eigvec);
eigvecs = [eigvecs, eigvec];
end
return
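To see why dividing by the sum misbehaves, here is a quick sanity check with a made-up two-element vector whose entries have mixed signs (the values are arbitrary, not from the face data):
v = [0.6; -0.59];   % entries nearly cancel, so sum(v) is tiny (0.01)
v ./ sum(v)         % gives roughly [60; -59] -- wildly rescaled
v ./ norm(v)        % gives a unit-length vector, as intended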
If anyone is using a version of the code above, note that the random face function will produce values that are too strong in RGB space.
Replace createRandomFace with the following:
stdev_vec = sqrt(diag(eigvals));
n = size(imvecs,2);   % number of images
min_range = 0;
max_range = 2;
t = ((max_range - min_range)*rand(n,1)) .* stdev_vec + min_range;
new_face1 = meanval + (eigvecs * t);
new_face1 = reshape(new_face1, size(img));
figure;imagesc(new_face1./255);

Related

How to correct grid search?

I am trying to find the optimal hyperparameters for my SVM model using a grid search, but it simply returns 1 for both hyperparameters.
function evaluations = inner_kfold_trainer(C,q,k,features_xy,labels)
features_xy_flds = kdivide(features_xy, k);
labels_flds = kdivide(labels, k);
evaluations = zeros(k,3);
for i = 1:k
fprintf('Fold %i of %i\n',i,k);
train_data = cell2mat(features_xy_flds(1:end ~= i));
train_labels = cell2mat(labels_flds(1:end ~= i));
test_data = cell2mat(features_xy_flds(i));
test_labels = cell2mat(labels_flds(i));
%AU1
train_labels = train_labels(:,1);
test_labels = test_labels(:,1);
[k,~] = size(test_labels);
%train
sv = fitcsvm(train_data,train_labels, 'KernelFunction','polynomial', 'PolynomialOrder',q,'BoxConstraint',C);
sv.predict(test_data);
%Calculate evaluative measures
%svm_outputs = zeros(k,1);
sv_predictions = sv.predict(test_data);
[precision,recall,F1] = evaluation(sv_predictions,test_labels);
evaluations(i,1) = precision;
evaluations(i,2) = recall;
evaluations(i,3) = F1;
end
save('eval.mat', 'evaluations');
end
This is an inner-fold cross-validation function. Below is the grid function, where something seems to be going wrong:
function [q,C] = grid_search(features_xy,labels,k)
% n x n grid
n = 3;
q_grid = linspace(1,19,n);
C_grid = linspace(1,59,n);
tic
evals = zeros(n,n,3);
for i = 1:n
for j = 1:n
fprintf('## i=%i, j=%i ##\n', i, j);
svm_results = inner_kfold_trainer(C_grid(i), q_grid(j),k,features_xy,labels);
evals(i,j,:) = mean(svm_results(:,:));
% precision only
%evals(i,j,:) = max(svm_results(:,1));
toc
end
end
f = evals;
% retrieving the best value of the hyper parameters, to use in the outer
% fold
[M1,I1] = max(f);
[~,I2] = max(M1(1,1,:));
index = I1(:,:,I2);
C = C_grid(index(1))
q = q_grid(index(2))
end
When I run grid_search(features_xy,labels,8), for example, I get C=1 and q=1 for any k (number of folds) value. Also, features_xy is a 500*98 matrix.
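Not part of the original post, but one way to sanity-check the retrieval step is to index a single 2-D slice of the scores directly and convert the linear index with ind2sub (this sketch assumes evals(:,:,3) holds the mean F1 per (C, q) cell, as in the code above):
F1_means = evals(:,:,3);                      % n-by-n mean F1 per grid cell
[~, linIdx] = max(F1_means(:));               % best cell as a linear index
[iBest, jBest] = ind2sub(size(F1_means), linIdx);
C_best = C_grid(iBest)
q_best = q_grid(jBest)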

My approximate entropy script for MATLAB isn't working

This is my approximate entropy calculator in MATLAB: https://en.wikipedia.org/wiki/Approximate_entropy
I'm not sure why it isn't working. It's returning a negative value. Can anyone help me with this? R1 is the data.
FindSize = size(R1);
N = FindSize(1);
% N = input ('insert number of data values');
%if you want to put your own N in, take away the % from the line above and insert the % before the N = FindSize(1)
%m = input ('insert m: integer representing length of data, embedding dimension ');
m = 2;
%r = input ('insert r: positive real number for filtering, threshold ');
r = 0.2*std(R1);
for x1= R1(1:N-m+1,1)
D1 = pdist2(x1,x1);
C11 = (D1 <= r)/(N-m+1);
c1 = C11(1);
end
for i1 = 1:N-m+1
s1 = sum(log(c1));
end
phi1 = (s1/(N-m+1));
for x2= R1(1:N-m+2,1)
D2 = pdist2(x2,x2);
C21 = (D2 <= r)/(N-m+2);
c2 = C21(1);
end
for i2 = 1:N-m+2
s2 = sum(log(c2));
end
phi2 = (s2/(N-m+2));
Ap = phi1 - phi2;
Apen = Ap(1)
Following the documentation provided by the Wikipedia article, I developed this small function that calculates the approximate entropy:
function res = approximate_entropy(U,m,r)
N = numel(U);
res = zeros(1,2);
for i = [1 2]
off = m + i - 1;
off_N = N - off;
off_N1 = off_N + 1;
x = zeros(off_N1,off);
for j = 1:off
x(:,j) = U(j:off_N+j);
end
C = zeros(off_N1,1);
for j = 1:off_N1
dist = abs(x - repmat(x(j,:),off_N1,1));
C(j) = sum(~any((dist > r),2)) / off_N1;
end
res(i) = sum(log(C)) / off_N1;
end
res = res(1) - res(2);
end
I first tried to replicate the computation shown in the article, and the result I obtain matches the result shown in the example:
U = repmat([85 80 89],1,17);
approximate_entropy(U,2,3)
ans =
-1.09965411068114e-05
Then I created another example that shows a case in which approximate entropy produces a meaningful result (the entropy of the first sample is always less than the entropy of the second one):
% starting variables...
s1 = repmat([10 20],1,10);
s1_m = mean(s1);
s1_s = std(s1);
s2_m = 0;
s2_s = 0;
% datasample will not always return a perfect M and S match
% so let's repeat this until equality is achieved...
while ((s1_m ~= s2_m) && (s1_s ~= s2_s))
s2 = datasample([10 20],20,'Replace',true,'Weights',[0.5 0.5]);
s2_m = mean(s2);
s2_s = std(s2);
end
m = 2;
r = 3;
ae1 = approximate_entropy(s1,m,r)
ae2 = approximate_entropy(s2,m,r)
ae1 =
0.00138568170752751
ae2 =
0.680090884817465
Finally, I tried with your sample data:
fid = fopen('O1.txt','r');
U = cell2mat(textscan(fid,'%f'));
fclose(fid);
m = 2;
r = 0.2 * std(U);
approximate_entropy(U,m,r)
ans =
1.08567461184858

Weird phenomenon when converting RGB to HSV manually in Matlab

I have written a small MATLAB function which takes an image in RGB and converts it to HSV according to the conversion formulas found here.
The problem is that when I apply this to a color spectrum there is a cut in the spectrum and some values are wrong; see the images below (to make the comparison easier I have used the internal hsv2rgb() function to convert back to RGB). This does not happen with MATLAB's own function rgb2hsv(), but I cannot find what I have done wrong.
This is my function
function [ I_HSV ] = RGB2HSV( I_RGB )
%UNTITLED3 Summary of this function goes here
% Detailed explanation goes here
[MAX, ind] = max(I_RGB,[],3);
if max(max(MAX)) > 1
I_r = I_RGB(:,:,1)/255;
I_g = I_RGB(:,:,2)/255;
I_b = I_RGB(:,:,3)/255;
MAX = max(cat(3,I_r, I_g, I_b),[],3);
else
I_r = I_RGB(:,:,1);
I_g = I_RGB(:,:,2);
I_b = I_RGB(:,:,3);
end
MIN = min(cat(3,I_r, I_g, I_b),[],3);
d = MAX - MIN;
I_V = MAX;
I_S = (MAX - MIN) ./ MAX;
I_H = zeros(size(I_V));
a = 1/6*mod(((I_g - I_b) ./ d),1);
b = 1/6*(I_b - I_r) ./ d + 1/3;
c = 1/6*(I_r - I_g) ./ d + 2/3;
H = cat(3, a, b, c);
for m=1:size(H,1);
for n=1:size(H,2);
if d(m,n) == 0
I_H(m,n) = 0;
else
I_H(m,n) = H(m,n,ind(m,n));
end
end
end
I_HSV = cat(3,I_H,I_S,I_V);
end
Original spectrum
Converted spectrum
The error was in my simplification of the calculations of a, b, and c. Changing it to the following solved the problem.
function [ I_HSV ] = RGB2HSV( I_RGB )
%UNTITLED3 Summary of this function goes here
% Detailed explanation goes here
[MAX, ind] = max(I_RGB,[],3);
if max(max(MAX)) > 1
I_r = I_RGB(:,:,1)/255;
I_g = I_RGB(:,:,2)/255;
I_b = I_RGB(:,:,3)/255;
MAX = max(cat(3,I_r, I_g, I_b),[],3);
else
I_r = I_RGB(:,:,1);
I_g = I_RGB(:,:,2);
I_b = I_RGB(:,:,3);
end
MIN = min(cat(3,I_r, I_g, I_b),[],3);
D = MAX - MIN;
I_V = MAX;
I_S = D ./ MAX;
I_H = zeros(size(I_V));
a = 1/6*mod(((I_g - I_b) ./ D),6);
b = 1/6*((I_b - I_r) ./ D + 2);
c = 1/6*((I_r - I_g) ./ D + 4);
H = cat(3, a, b, c);
for m=1:size(H,1);
for n=1:size(H,2);
if D(m,n) == 0
I_H(m,n) = 0;
else
I_H(m,n) = H(m,n,ind(m,n));
end
if MAX(m,n) == 0
I_S(m,n) = 0; % saturation is zero when the pixel is black
end
end
end
I_HSV = cat(3,I_H,I_S,I_V);
end
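As a quick sanity check (not in the original post), the corrected function can be compared against MATLAB's built-in rgb2hsv on a random image; the maximum difference should be at or near floating-point precision:
I = rand(64, 64, 3);                     % random RGB image with values in [0, 1]
diffHSV = abs(RGB2HSV(I) - rgb2hsv(I));
max(diffHSV(:))                          % expected to be ~0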

How to show 40 Gabor filters in MATLAB

Can someone help me show Gabor filters in MATLAB? I can show them, but it's not what I want. This is my code:
[Gf,gabout] = gaborfilter1(B,sx,sy,f,theta(j));
G{m,n,i,j} = Gf;
and this is the Gabor filter function:
function [Gf,gabout] = gaborfilter(I,Sx,Sy,f,theta);
if isa(I,'double')~=1
I = double(I);
end
for x = -fix(Sx):fix(Sx)
for y = -fix(Sy):fix(Sy)
xPrime = x * cos(theta) + y * sin(theta);
yPrime = y * cos(theta) - x * sin(theta);
Gf(fix(Sx)+x+1,fix(Sy)+y+1) = exp(-.5*((xPrime/Sx)^2+(yPrime/Sy)^2))*cos(2*pi*f*xPrime);
end
end
Imgabout = conv2(I,double(imag(Gf)),'same');
Regabout = conv2(I,double(real(Gf)),'same');
gabout = sqrt(Imgabout.*Imgabout + Regabout.*Regabout);
Then I display it with this code:
imshow(G{m,n,i,j},[]);
and the results:
But I want this result. Can someone help me solve this?
Use the following function. I hope this is useful.
----------------------------------------------------------------
gfs = GaborFilter(51,0.45,0.05,6,4);
n=0;
for s=1:6
for d=1:4
n=n+1;
subplot(6,4,n)
imshow(real(squeeze(gfs(s,d,:,:))),[])
end
end
Sample Image
----------------------------------------------------------------
function gfs = GaborFilter(winLen,uh,ul,S,D)
% gfs(SCALE, DIRECTION, :, :)
winLen = winLen + mod(winLen, 2) -1;
x0 = (winLen + 1)/2;
y0 = x0;
if S==1
a = 1;
su = uh/sqrt(log(4));
sv = su;
else
a = (uh/ul)^(1/(S-1));
su = (a-1)*uh/((a+1)*sqrt(log(4)));
if D==1
tang = 1;
else
tang = tan(pi/(2*D));
end
sv = tang * (uh - log(4)*su^2/uh)/sqrt(log(4) - (log(4)*su/uh)^2);
end
sx = 1/(2*pi*su);
sy = 1/(2*pi*sv);
coef = 1/(2*pi*sx*sy);
gfs = zeros(S, D, winLen, winLen);
for d = 1:D
theta = (d-1)*pi/D;
for s = 1:S
scale = a^(-(s-1));
gab = zeros(winLen);
for x = 1:winLen
for y = 1:winLen
X = scale * ((x-x0)*cos(theta) + (y-y0)*sin(theta));
Y = scale * (-(x-x0)*sin(theta) + (y-y0)*cos(theta));
gab(x, y) = -0.5 * ( (X/sx).^2 + (Y/sy).^2 ) + (2*pi*1j*uh)*X ;
end
end
gfs(s, d, :, :) = scale * coef * exp(gab);
end
end
Replace the "cos" component by complex part->complex(0, (2*pi*f*xprime)) ans also multiply equation by scaling factor of (1/sqrt(2*Sy*Sx)).

Recomposing vector input to algorithm from matrix output

I've written some code to implement an algorithm that takes as input a vector q of real numbers, and returns as an output a complex matrix R. The Matlab code below produces a plot showing the input vector q and the output matrix R.
Given only the complex matrix output R, I would like to obtain the input vector q. Can I do this using least-squares optimization? Since there is a recursive running sum in the code (rs_r and rs_i), the calculation for a column of the output matrix is dependent on the calculation of the previous column.
Perhaps a non-linear optimization can be set up to recompose the input vector q from the output matrix R?
Looking at this in another way, I've used an algorithm to compute a matrix R. I want to run the algorithm "in reverse," to get the input vector q from the output matrix R.
If there is no way to recompose the starting values from the output, thereby treating the problem as a "black box," then perhaps the mathematics of the model itself can be used in the optimization? The program evaluates an equation for Utilde(tau, omega), which is the output matrix R.
The tau (time) variable comprises the columns of the response matrix R, whereas the omega (frequency) variable comprises the rows of the response matrix R. The integration is performed as a recursive running sum from tau = 0 up to the current tau timestep.
Here are the plots created by the program posted below:
Here is the full program code:
N = 1001;
q = zeros(N, 1); % here is the input
q(1:200) = 55;
q(201:300) = 120;
q(301:400) = 70;
q(401:600) = 40;
q(601:800) = 100;
q(801:1001) = 70;
dt = 0.0042;
fs = 1 / dt;
wSize = 101;
Glim = 20;
ginv = 0;
R = get_response(N, q, dt, wSize, Glim, ginv); % R is output matrix
rows = wSize;
cols = N;
figure; plot(q); title('q value input as vector');
ylim([0 200]); xlim([0 1001])
figure; imagesc(abs(R)); title('Matrix output of algorithm')
colorbar
Here is the function that performs the calculation:
function response = get_response(N, Q, dt, wSize, Glim, ginv)
fs = 1 / dt;
Npad = wSize - 1;
N1 = wSize + Npad;
N2 = floor(N1 / 2 + 1);
f = (fs/2)*linspace(0,1,N2);
omega = 2 * pi .* f';
omegah = 2 * pi * f(end);
sigma2 = exp(-(0.23*Glim + 1.63));
sign = 1;
if(ginv == 1)
sign = -1;
end
ratio = omega ./ omegah;
rs_r = zeros(N2, 1);
rs_i = zeros(N2, 1);
termr = zeros(N2, 1);
termi = zeros(N2, 1);
termr_sub1 = zeros(N2, 1);
termi_sub1 = zeros(N2, 1);
response = zeros(N2, N);
% cycle over cols of matrix
for ti = 1:N
term0 = omega ./ (2 .* Q(ti));
gamma = 1 / (pi * Q(ti));
% calculate for the real part
if(ti == 1)
Lambda = ones(N2, 1);
termr_sub1(1) = 0;
termr_sub1(2:end) = term0(2:end) .* (ratio(2:end).^-gamma);
else
termr(1) = 0;
termr(2:end) = term0(2:end) .* (ratio(2:end).^-gamma);
rs_r = rs_r - dt.*(termr + termr_sub1);
termr_sub1 = termr;
Beta = exp( -1 .* -0.5 .* rs_r );
Lambda = (Beta + sigma2) ./ (Beta.^2 + sigma2); % vector
end
% calculate for the complex part
if(ginv == 1)
termi(1) = 0;
termi(2:end) = (ratio(2:end).^(sign .* gamma) - 1) .* omega(2:end);
else
termi = (ratio.^(sign .* gamma) - 1) .* omega;
end
rs_i = rs_i - dt.*(termi + termi_sub1);
termi_sub1 = termi;
integrand = exp( 1i .* -0.5 .* rs_i );
if(ginv == 1)
response(:,ti) = Lambda .* integrand;
else
response(:,ti) = (1 ./ Lambda) .* integrand;
end
end % ti loop
No, you cannot do so unless you know the "model" itself for this process. If you intend to treat the process as a complete black box, then it is impossible in general, although in any specific instance, anything can happen.
Even if you DO know the underlying process, it may still not work, as any least-squares estimator depends on the starting values; if you do not have a good guess there, it may converge to the wrong set of parameters.
It turns out that by using the mathematics of the model, the input can be estimated. This is not true in general, but for my problem it seems to work. The cumulative integral is eliminated by a partial derivative.
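To spell out that step (my reading of the code below): for ginv = 0 the response is (1./Lambda) .* exp(1i * -0.5 * rs_i), where rs_i is the trapezoidal running sum of termi over tau, so imag(log(R)) is the cumulative integral of termi. Taking a 3-point time derivative of imag(log(R)) therefore recovers termi(tau, omega) = omega .* ((omega./omegah).^gamma - 1) with gamma = 1/(pi*q(tau)) at each timestep, and since that expression depends only on the local q value, q can be fitted column by column with fminbnd.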
N = 1001;
q = zeros(N, 1);
q(1:200) = 55;
q(201:300) = 120;
q(301:400) = 70;
q(401:600) = 40;
q(601:800) = 100;
q(801:1001) = 70;
dt = 0.0042;
fs = 1 / dt;
wSize = 101;
Glim = 20;
ginv = 0;
R = get_response(N, q, dt, wSize, Glim, ginv);
rows = wSize;
cols = N;
cut_val = 200;
imagLogR = imag(log(R));
Mderiv = zeros(rows, cols-2);
for k = 1:rows
val = deriv_3pt(imagLogR(k,:), dt);
val(val > cut_val) = 0;
Mderiv(k,:) = val(1:end-1);
end
disp('Running iteration');
q0 = 10;
q1 = 500;
NN = cols - 2;
qout = zeros(NN, 1);
for k = 1:NN
data = Mderiv(:,k);
qout(k) = fminbnd(@(q) curve_fit_to_get_q(q, dt, rows, data),q0,q1);
end
figure; plot(q); title('q value input as vector');
ylim([0 200]); xlim([0 1001])
figure;
plot(qout); title('Reconstructed q')
ylim([0 200]); xlim([0 1001])
Here are the supporting functions:
function output = deriv_3pt(x, dt)
% Function to compute dx/dt using the 3pt symmetrical rule
% dt is the timestep
N = length(x);
N0 = N - 1;
output = zeros(N0, 1);
denom = 2 * dt;
for k = 2:N0
output(k - 1) = (x(k+1) - x(k-1)) / denom;
end
function sse = curve_fit_to_get_q(q, dt, rows, data)
fs = 1 / dt;
N2 = rows;
f = (fs/2)*linspace(0,1,N2); % vector for frequency along cols
omega = 2 * pi .* f';
omegah = 2 * pi * f(end);
ratio = omega ./ omegah;
gamma = 1 / (pi * q);
termi = ((ratio.^(gamma)) - 1) .* omega;
Error_Vector = termi - data;
sse = sum(Error_Vector.^2);