I am implementing the Expectation Maximization algorithm in MATLAB. The algorithm operates on a 214096 x 2 data matrix, and while computing the probabilities there is a multiplication of (214096 x 2) * (2 x 2) * (2 x 214096) matrices, which results in an out-of-memory error in MATLAB. Is there a way to fix this problem?
Equation: [image]
MATLAB code:
D = size(X,2); % dimension
N = size(X,1); % number of samples
K = 4; % number of Gaussian Mixture components ( Also number of clusters )
% Initialization
p = [0.2, 0.3, 0.2, 0.3]; % arbitrary pi, probabilities of clusters, apriori probability of cluster
[idx,mu] = kmeans(X,K); % initial means of the components, theta is mu and variance
% compute the covariance of the components
sigma = zeros(D,D,K);
for k = 1:K
tempmat = X(idx==k,:);
sigma(:,:,k) = cov(tempmat); % Sigma j
sigma_det(k) = det(sigma(:,:,k));
end
% calculate x-mu
for k=1: K
check=length( X(idx == k,1))
for lidx = 1: length( X(idx == k,1))
cidx = find( idx == k) ;
Xmu(cidx(lidx),:) = X(cidx(lidx),:) - mu(k,:); %( x-mu ) calculation on cluster level
end
end
% compute P(Cj|x; theta(t)), and take log to simplified calculation
%Eq 14.14 denominator
denom = 0;
for k=1:K
calc_sigma_1_2 = sigma_det(k)^(-1/2);
calc_x_mu = Xmu(idx == k,:);
calc_sigma_inv = inv(sigma(:,:,k));
calc_x_mu_tran = calc_x_mu.';
factor = calc_sigma_1_2 * exp (-1/2 * calc_x_mu * calc_sigma_inv * calc_x_mu_tran ) * p(k);
denom = denom + factor;
end
for k =1:K
calc_sigma_1_2 = sigma_det(k)^(-1/2);
calc_x_mu = Xmu(idx == k,:);
calc_sigma_inv = inv(sigma(:,:,k));
calc_x_mu_tran = calc_x_mu.';
factor = calc_sigma_1_2 * exp (-1/2 * calc_x_mu_tran * calc_sigma_inv * calc_x_mu ) * p(k);
pdf(k) = factor/denom;
end
%%%% Equation 14.14 ends
It seems that you tried to apply a vector-based equation by simply substituting a matrix for the vector; this is not how it works.
(x - mu).' * inv(sigma) * (x - mu)
is supposed to be the (squared) Mahalanobis norm of (x - mu), and you want to obtain this value for each row of the matrix X, thus
(X - mu) * inv(sigma) =: A <- this is ok, this results in an N x d matrix
and now you have to do a point-wise multiplication of A with (X - mu), not a matrix product, and finally sum over the second axis (columns). This way you end up with an N-element vector, each element containing the Mahalanobis norm of the corresponding row of X. The N x N intermediate matrix that caused the out-of-memory error is never formed.
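The row-wise computation described above can be sketched as follows. This is a minimal illustration on toy data, assuming a single component whose mean and covariance are taken from the data itself; the names Xc, A, d2 and d2_loop are made up for the example. The per-row loop is included only as a reference to compare against.

```matlab
% Toy data standing in for the real 214096 x 2 matrix (values are illustrative).
X     = [1 2; 3 4; 5 7; 2 1];   % N x d samples, one per row
mu    = mean(X, 1);             % 1 x d mean (assumption: a single component)
Sigma = cov(X);                 % d x d covariance

Xc = bsxfun(@minus, X, mu);     % N x d centered data
A  = Xc / Sigma;                % same result as Xc * inv(Sigma), N x d
d2 = sum(A .* Xc, 2);           % N x 1, one quadratic form per row

% Reference: the same values computed one row at a time.
d2_loop = zeros(size(X, 1), 1);
for n = 1:size(X, 1)
    d2_loop(n) = Xc(n, :) * (Sigma \ Xc(n, :).');
end
```

The largest intermediate here is N x d, so memory grows linearly with the number of samples. The Gaussian factor for each row would then be exp(-0.5 * d2) applied element-wise.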
I tried to implement the cost function of the Dual Absolute Quadric in Matlab according to the following equation mentioned in this paper, with this data.
My problem is that the results do not converge.
The code is below.
main code
%---------------------
% clear and close all
%---------------------
clearvars
close all
clc
%---------------------
% Data type long
%---------------------
format long g
%---------------------
% Read data
%---------------------
load('data.mat')
%---------------------------
% Display The Initial Guess
%---------------------------
disp('=======================================================')
disp('Initial Intrinsic parameters: ');
disp(A);
disp('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')
%=========================================================================
DualAbsoluteQuadric = Optimize(A,@DAQ);
%---------------------
% Display The Results
%---------------------
disp('=======================================================')
disp('Dual Absolute Quadric cost function: ');
disp(DualAbsoluteQuadric);
disp('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')
The optimization function used is:
function output = Optimize(A,func)
%------------------------------
options = optimoptions('lsqnonlin','Algorithm','levenberg-marquardt',...
'Display','iter','FunctionTolerance',1e-16,'Tolx',1e-16,...
'MaxFunctionEvaluations', 1000, 'MaxIterations',39,...
'OptimalityTolerance',1e-16);
%------------------------------
% NonLinear Optimization
%------------------------------
output_line = lsqnonlin(func,[A(1,1), A(1,3), A(2,2), A(2,3), A(1,2)],...
[],[],options);
%------------------------------------------------------------------------
output = Reshape(output_line);
The Dual Absolute Quadric Function:
function cost = DAQ(params)
Aj = [params(1) params(5) params(2) ;
0 params(3) params(4) ;
0 0 1 ];
Ai = [params(1) params(5) params(2) ;
0 params(3) params(4) ;
0 0 1 ];
% W^-1 (IAC Image of the Absolute Conic)
W_inv = Ai * Aj';
%----------------
%Find plane at infinity from MQM' ~ w (Dual Absolute Quadric)
Plane_at_infinity = PlaneAtInfinity(W_inv);
%Find H_Infty = [e21]F+e21*n'
Homography_at_infty = H_Infty(Plane_at_infinity);
%----------------
% Initialization
%----------------
global Fs;
% Initialize the cost as a vector
% (N-1 * N-2)/2: 9*8/2 = 36
vector_size = (size(Fs,3)-1)*(size(Fs,4)-2)/2;
cost = zeros(1, vector_size);
% Cost Function
k = 0;
loop_size = 3 * vector_size;
Second_Term = W_inv / norm(W_inv,'fro');
for i=1:3:loop_size
k = k+1;
First_Term = Homography_at_infty(:,i:i+2) * W_inv * ((Homography_at_infty(:,i:i+2))');
First_Term = First_Term / norm(First_Term, 'fro');
cost(k) = norm(First_Term - Second_Term,'fro');
end
end
Plane at infinity function:
function P_infty = PlaneAtInfinity(W_inv)
global PPM;
% Symbolic variables
X = sym('X', 'real');
Y = sym('Y', 'real');
Z = sym('Z', 'real');
L2 = sym('L2','real');
n = [X; Y; Z];
% DAQ
Q = [W_inv , (W_inv * n) ;
(n' * W_inv) , (n' * W_inv * n)];
% Get one only camera matrix (any)
M = PPM(:, :, 3);
% Autocalibration equation
m = M * Q * M';
% solve linear equations
solution = solve(m(1, 1) == (L2 * W_inv(1, 1)), ...
m(2, 2) == (L2 * W_inv(2, 2)), ...
m(3, 3) == (L2 * W_inv(3, 3)), ...
m(1, 3) == (L2 * W_inv(1, 3)));
P_infty = [double(solution.X(1)) double(solution.Y(1))...
double(solution.Z(1))]';
Homography at infinity function:
function H_Inf = H_Infty(planeInf)
global Fs;
k = 1;
% (3 x 3) x ((N-1)*(N-2) /2)
H_Inf = zeros(3,3*(size(Fs,3)-1)*(size(Fs,4)-2)/2);%(3*3)*36
for i = 2:size(Fs,3)
for j = i+1:size(Fs,4)
[~, ~, V] = svd(Fs(:,:,i,j)');
epip = V(:,end);
H_Inf(:,k:k+2) = epipole(Fs(:,:,i,j)) * Fs(:,:,i,j)+ epip * planeInf';
k = k+3;
end
end
end
Reshape function:
function output = Reshape(input)
%---------------------
% Reshape Intrinsics
%---------------------
% K = [a skew u0 ;
% 0 B v0 ;
% 0 0 1 ];
output = [input(1) input(5) input(2) ;
0 input(3) input(4) ;
0 0 1 ];
end
Epipole Function:
function epip = epipole(Fs)
% SVD Decompostition of (Fs)^T
[~,~,V] = svd(Fs');
% Get the epipole from the last vector of the SVD
epi = V(:,end);
% Reshape the Vector into Matrix
epip = [ 0 -epi(3) epi(2);
epi(3) 0 -epi(1);
-epi(2) epi(1) 0 ];
end
The plane at infinity has to be calculated as follows:
function plane = computePlaneAtInfinity(P, K)
%Input
% P - Projection matrices
% K - Approximate Values of Intrinsics
%
%Output
% plane - coordinate of plane at infinity
% Compute the DIAC W^-1
W_invert = K * K';
% Construct Symbolic Variables to Solve for Plane at Infinity
% X,Y,Z is the coordinate of plane at infinity
% XX is the scale
X = sym('X', 'real');
Y = sym('Y', 'real');
Z = sym('Z', 'real');
XX = sym('XX', 'real');
% Define Normal to Plane at Infinity
N = [X; Y; Z];
% Equation of Dual Absolute Quadric (DAQ)
Q = [W_invert, (W_invert * N); (N' * W_invert), (N' * W_invert * N)];
% Select Any One Projection Matrix
M = P(:, :, 2);
% Left hand side of the equation
LHS = M * Q * M';
% Solve for [X, Y, Z] considering the System of Linear Equations
% We need 4 equations for 4 variables X,Y,Z,XX
S = solve(LHS(1, 1) == (XX * W_invert(1, 1)), ...
LHS(1, 2) == (XX * W_invert(1, 2)), ...
LHS(1, 3) == (XX * W_invert(1, 3)), ...
LHS(2, 2) == (XX * W_invert(2, 2)));
plane = [double(S.X(1)); double(S.Y(1)); double(S.Z(1))];
end
Is there any function in Matlab which calculates the correlation ratio?
Here is an implementation I tried to do, but the results are not right.
function cr = correlation_ratio(X, Y, L)
ni = zeros(1, L);
sigmai = ni;
for i = 0:(L-1)
Yn = Y(X == i);
ni(1, i+1) = numel(Yn);
m = (1/ni(1, i+1))*sum(Yn);
sigmai(1, i+1) = (1/ni(1, i+1))*sum((Yn - m).^2);
end
n = sum(ni);
prod = ni.*sigmai;
cr = (1-(1/n)*sum(prod))^0.5;
This is the equation on the Wikipedia page:
$$\eta^2 = \frac{\sum_x n_x (\bar{y}_x - \bar{y})^2}{\sum_{x,i} (y_{x,i} - \bar{y})^2}$$
where:
η is the correlation ratio,
y_{x,i} are the sample values (x is the class label, i the sample index),
ȳ_x (y_x with a bar on top) is the mean of the sample values for class x,
ȳ (y with a bar on top) is the mean of all samples across all classes, and
n_x is the number of samples in class x.
This is how I translated it into code:
function eta = correlation_ratio(X, Y)
X = X(:); % make sure we've got column vectors, simplifies things below a bit
Y = Y(:);
L = max(X);
mYx = zeros(1, L+1); % we'll write mean per class here
nx = zeros(1, L+1); % we'll write number of samples per class here
for i = unique(X).'
Yn = Y(X == i);
if numel(Yn)>1
mYx(i+1) = mean(Yn);
nx(i+1) = numel(Yn);
end
end
mY = mean(Y); % mean across all samples
eta = sqrt(sum(nx .* (mYx - mY).^2) / sum((Y-mY).^2));
The loop could be replaced with accumarray.
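A minimal sketch of that accumarray variant is given below, on toy data. It assumes the class labels in X are non-negative integers (as in the loop version, which indexes with i+1). Unlike the loop above, it also counts classes with a single sample rather than skipping them, so the two can differ when singleton classes are present.

```matlab
% Toy data: class labels X are assumed to be non-negative integers.
X = [0 0 1 1 1 2].';
Y = [1 3 2 4 6 5].';

nx  = accumarray(X + 1, 1);     % number of samples per class
sYx = accumarray(X + 1, Y);     % sum of Y per class
mYx = sYx ./ max(nx, 1);        % class means (empty classes stay at 0)
mY  = mean(Y);                  % mean across all samples

% Empty classes have nx == 0 and therefore contribute nothing to the sum.
eta = sqrt(sum(nx .* (mYx - mY).^2) / sum((Y - mY).^2));
```

For this toy input the between-class sum is 7.5 and the total sum of squares is 17.5, so eta is sqrt(7.5/17.5).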
Let's say I have a matrix over GF(2) , i.e. a binary matrix. Now how do I go about computing the left null space of the given matrix over the finite field of 2?
Does MATLAB provide an in-built function for this?
I don't know of any Matlab packages for linear algebra over finite fields, but I programmed a simple
function that calculates LU-factorizations of matrices modulo a prime p (for example, 2):
function [L,D,U,rows,cols] = ModLU(A,p)
%
% LU-factorization of A, modulo p:
% A(rows,cols) - mod(L * diag(D)*U,p)
%
[m,n] = size(A);
% inverses in mod-p:
% mod(k*invp(k+1)) = 0 if k==0; 1 otherwise
invp = 2:p-2;
for i = 2:p-2; invp = mod(invp.*[2:p-2],p); end
invp = [0,1,invp,p-1];
% Initialize outputs:
L = eye(m); U = A;
rows = 1:m;
cols = 1:n;
% Sweep
for i = 1:m
% Pivoting
[row,col] = find(U(i:end,:));
if isempty(row); break; end
row = row(1)+i-1; col = col(1);
r = 1:m; r(i) = row; r(row) = i;
c = 1:n; c(i) = col; c(col) = i;
ri = rows(i); rows(i) = rows(row); rows(row)=ri;
ci = cols(i); cols(i) = cols(col); cols(col)=ci;
rinv = 1:m; rinv(r) = 1:m;
U = U(r,c); L=L(r,r);
% Gaussian elimination
L(i+1:end,i ) = mod(invp(U(i,i)+1) * U(i+1:end,i),p);
U(i+1:end,i:end) = mod(U(i+1:end,i:end) + (p-L(i+1:end,i)) * U(i,i:end),p);
end
% Factorize diagonal
D = zeros(m,1); D(1:min(m,n)) = diag(U);
U = mod(diag(invp(D+1)) * U,p );
Also, for an upper triangular matrix with ones on the diagonal, a function that calculates
the right-null space modulo p:
function N = NullPU(U,p)
% for an upper triangular matrix, calculate a base for the null space modulo p:
% U * N = 0
n = size(U,2);
rank = size(find(diag(U)),1);
A = U(1:rank,:);
for i=rank:-1:2
A(1:i-1,:) = mod(A(1:i-1,:) + (p-1) * A(1:i-1,i) * A(i,:),p);
end
N = [mod(p-A(:,rank+1:end),p); eye(n-rank)];
These functions are simply combined into a function that calculates the null space of
matrix A, modulo p:
function N = NullP(A,p)
% Calculate a basis for the null space of A, modulo p:
% mod(A*N,p) = 0
[L,D,U,rows,cols] = ModLU(A,p);
N = NullPU(U,p);
N(cols,:) = N;
Note that this function calculates a basis for the right null space of A, modulo p. The left
null space is found using
N = NullP(A',p)';
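For the p = 2 case, a result can be cross-checked with a much simpler routine. The sketch below is not the LU approach above; it is plain Gauss-Jordan elimination on the augmented matrix [A | I] modulo 2, written as a script on a toy matrix. The rows of the right-hand block that end up next to all-zero rows on the left are left null vectors. The variable names are made up for the example.

```matlab
% Left null space of a binary matrix over GF(2) by row reduction of [A | I].
A = [1 0 1;
     0 1 1;
     1 1 0];                    % toy matrix: row 3 = row 1 + row 2 (mod 2)
[m, n] = size(A);
W = [A, eye(m)];                % right block records the row operations
r = 0;                          % number of pivots found so far
for c = 1:n
    piv = find(W(r+1:end, c), 1) + r;   % first row below the pivots with a 1
    if isempty(piv), continue; end
    r = r + 1;
    W([r piv], :) = W([piv r], :);      % move the pivot row into place
    for k = 1:m                         % clear column c in all other rows
        if k ~= r && W(k, c) == 1
            W(k, :) = mod(W(k, :) + W(r, :), 2);
        end
    end
end
N = W(r+1:end, n+1:end);        % each row y satisfies mod(y * A, 2) == 0
```

For this A the rank is 2 and N is the single row [1 1 1], since the three rows of A sum to zero modulo 2.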
I am trying to simulate the rotation dynamics of a system. I am testing my code to verify that it's working using simulation, but I never recovered the parameters I pass to the model. In other words, I can't re-estimate the parameters I chose for the model.
I am using MATLAB for that and specifically ode45. Here is my code:
% Load the input-output data
[torque outputs] = DataLogs2();
u = torque;
% using the simulation data
Ixx = 1.00;
Iyy = 2.00;
Izz = 3.00;
x0 = [0; 0; 0];
Ts = .02;
t = 0:Ts:Ts * ( length(u) - 1 );
[ T, x ] = ode45( @(t,x) rotationDyn( t, x, u(1+floor(t/Ts),:), Ixx, Iyy, Izz), t, x0 );
w = x';
N = length(w);
q = 1; % a counter for the A and B matrices
% The Algorithm
for k=1:1:N
w_telda = [ 0 -w(3, k) w(2,k); ...
w(3,k) 0 -w(1,k); ...
-w(2,k) w(1,k) 0 ];
if k == N % to handle the problem of the last iteration
w_dash(:,k) = (-w(:,k))/Ts;
else
w_dash(:,k) = (w(:,k+1)-w(:,k))/Ts;
end
a = kron( w_dash(:,k)', eye(3) ) + kron( w(:,k)', w_telda );
A(q:q+2,:) = a; % a 3N*9 matrix
B(q:q+2,:) = u(k,:)'; % a 3N*1 matrix % u(:,k)
q = q + 3;
end
% Forcing J to be diagonal. This is the case when we consider our quadcopter as two thin uniform
% rods crossed at the origin with a point mass (motor) at the end of each.
A_new = [A(:, 1) A(:, 5) A(:, 9)];
vec_J_diag = A_new\B;
J_diag = diag([vec_J_diag(1), vec_J_diag(2), vec_J_diag(3)])
eigenvalues_J_diag = eig(J_diag)
error = norm(A_new*vec_J_diag - B)
where my dynamic model is defined as:
function [dw, y] = rotationDyn(t, w, tau, Ixx, Iyy, Izz, varargin)
% The output equation
y = [w(1); w(2); w(3)];
% State equation
% dw = (I^-1)*( tau - cross(w, I*w) );
dw = [Ixx^-1 * tau(1) - ((Izz-Iyy)/Ixx)*w(2)*w(3);
Iyy^-1 * tau(2) - ((Ixx-Izz)/Iyy)*w(1)*w(3);
Izz^-1 * tau(3) - ((Iyy-Ixx)/Izz)*w(1)*w(2)];
end
Practically, what this code should do is calculate the eigenvalues of the inertia matrix J, i.e. recover the Ixx, Iyy, and Izz that I passed to the model at the very beginning (1, 2 and 3), but all I get is wrong results.
Is the problem with using ode45?
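Well, the problem wasn't in the ode45 call. The problem is that in system identification one can only create an (n-1)-sample signal from an n-sample signal, because the forward difference at sample k needs sample k+1. Thus the loop has to end at N-1 in the above code, instead of inserting a made-up derivative for the last sample.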
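The length mismatch is easy to see on a small example. The sketch below uses illustrative toy values for w and the same Ts as the question; w_used is a made-up name for the samples that actually have a derivative estimate.

```matlab
Ts = 0.02;
w  = [0 1 4 9;                  % toy 3 x N angular-rate samples (illustrative)
      0 2 4 6;
      1 1 1 1];
N  = size(w, 2);

w_dash = diff(w, 1, 2) / Ts;    % 3 x (N-1): (w(:,k+1) - w(:,k)) / Ts
w_used = w(:, 1:N-1);           % only the samples that have a derivative
```

Building the regression rows from w_used and w_dash (k = 1 to N-1) avoids the artificial last row, which in the original loop was -w(:,N)/Ts and did not correspond to any real derivative.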
I wrote code to implement steepest descent backpropagation, and I am having issues with it. I am using the Machine CPU dataset and have scaled the inputs and outputs into the range [0, 1].
The code in MATLAB/Octave is as follows:
steepest descent backpropagation
%SGD = Steepest Gradient Descent
function weights = nnSGDTrain (X, y, nhid_units, gamma, max_epoch, X_test, y_test)
iput_units = columns (X);
oput_units = columns (y);
n = rows (X);
W2 = rand (nhid_units + 1, oput_units);
W1 = rand (iput_units + 1, nhid_units);
train_rmse = zeros (1, max_epoch);
test_rmse = zeros (1, max_epoch);
for (epoch = 1:max_epoch)
delW2 = zeros (nhid_units + 1, oput_units)';
delW1 = zeros (iput_units + 1, nhid_units)';
for (i = 1:rows(X))
o1 = sigmoid ([X(i,:), 1] * W1); %1xn+1 * n+1xk = 1xk
o2 = sigmoid ([o1, 1] * W2); %1xk+1 * k+1xm = 1xm
D2 = o2 .* (1 - o2);
D1 = o1 .* (1 - o1);
e = (y_test(i,:) - o2)';
delta2 = diag (D2) * e; %mxm * mx1 = mx1
delta1 = diag (D1) * W2(1:(end-1),:) * delta2; %kxm * mx1 = kx1
delW2 = delW2 + (delta2 * [o1 1]); %mx1 * 1xk+1 = mxk+1 %already transposed
delW1 = delW1 + (delta1 * [X(i, :) 1]); %kx1 * 1xn+1 = k*n+1 %already transposed
end
delW2 = gamma .* delW2 ./ n;
delW1 = gamma .* delW1 ./ n;
W2 = W2 + delW2';
W1 = W1 + delW1';
[dummy train_rmse(epoch)] = nnPredict (X, y, nhid_units, [W1(:);W2(:)]);
[dummy test_rmse(epoch)] = nnPredict (X_test, y_test, nhid_units, [W1(:);W2(:)]);
printf ('Epoch: %d\tTrain Error: %f\tTest Error: %f\n', epoch, train_rmse(epoch), test_rmse(epoch));
fflush (stdout);
end
weights = [W1(:);W2(:)];
% plot (1:max_epoch, test_rmse, 1);
% hold on;
plot (1:max_epoch, train_rmse(1:end), 2);
% hold off;
end
predict
%Now SFNN Only
function [o1 rmse] = nnPredict (X, y, nhid_units, weights)
iput_units = columns (X);
oput_units = columns (y);
n = rows (X);
W1 = reshape (weights(1:((iput_units + 1) * nhid_units),1), iput_units + 1, nhid_units);
W2 = reshape (weights((((iput_units + 1) * nhid_units) + 1):end,1), nhid_units + 1, oput_units);
o1 = sigmoid ([X ones(n,1)] * W1); %nxiput_units+1 * iput_units+1xnhid_units = nxnhid_units
o2 = sigmoid ([o1 ones(n,1)] * W2); %nxnhid_units+1 * nhid_units+1xoput_units = nxoput_units
rmse = RMSE (y, o2);
end
RMSE function
function rmse = RMSE (a1, a2)
rmse = sqrt (sum (sum ((a1 - a2).^2))/rows(a1));
end
I have also trained on the same dataset using mlp from the R RSNNS package, and the RMSE for the training set (first 100 examples) is around 0.03. But with my implementation I cannot achieve an RMSE lower than 0.14. The error sometimes grows for higher learning rates, and no learning rate gets me below 0.14. Also, a paper I referred to reports an RMSE of around 0.03 for the training set.
I want to know where the problem is in the code. I have followed Raul Rojas' book and confirmed that things are okay.
In the backpropagation code, the line
e = (y_test(i,:) - o2)';
is not correct, because o2 is the output for an example from the training set, while I was taking the difference from an example of the test set, y_test. The line should have been:
e = (y(i,:) - o2)';
which correctly computes the difference between the output predicted by the current model and the target output of the corresponding example.
It took me 3 days to find this bug, which had stopped me from making any further modifications.