I am using Matlab to find the spectral radius of the Jacobi iteration matrix where A=[4 2 1;1 3 1;1 1 4].
I can't seem to input the correct commands to get the size of the error after 5 iterations. Can someone help me?
Here is the list of commands I have put into Matlab so far:
A=[4 2 1;1 3 1;1 1 4]
A =
4 2 1
1 3 1
1 1 4
D=diagonal(diagonal(A));L=(A,-1);U=(A,1);
b=([3 -1 4])
x0j=zeros([0 0 0]);
x=D\(-(U+L)*x0j+b);r=b-A*x %Jacobi iteration.
------------------------------------------------------------------------------
Error using *
Inputs must be 2-D, or at least one input must be scalar.
To compute element wise TIMES, use TIMES (.*) instead.
The spectral radius of a matrix is the maximum of the modulus of its eigenvalues. It can be simply computed using max(abs(eig(·))).
However, as others have noticed, your whole code seems pretty mixed up and not actually valid Matlab code, so your problem is not really to compute the spectral radius, is it? The algorithm is very straightforward and easy to implement:
% diagonal part of A and rest
D = diag(diag(A));
R = A - D;
% iteration matrix and offset
T = - inv(D) * R;
C = inv(D) * b;
% spectral radius condition
rho = max(abs(eig(T)));
if rho >= 1
    error('no convergence')
end
% initial guess
x = randn(size(b));
% iteration
while norm(A * x - b) > 1e-15
    x = T * x + C;
end
Note that I used inv(D) to directly follow the description in Wikipedia, but the inverse of a diagonal matrix can be easily computed using diag(1 ./ diag(D)).
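For instance, the iteration setup above can avoid inv entirely. A small sketch, reusing the D, R, and b from the code above:
Dinv = diag(1 ./ diag(D)); % invert the diagonal element by element
T = -Dinv * R;             % same iteration matrix as before
C = Dinv * b;              % same offset as before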
I don't really see why one would need to separate R into an upper and lower part. I suppose it has to do with numerical efficiency, but then, Matlab is a very efficient high-level language for matrix computations already. So actually there is no need to implement the Jacobi algorithm in it explicitly when you can simply write A \ b – except for educational purposes I guess.
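For your concrete system, the direct solve is just this (a quick sketch using the A and b from your question; note that b has to be a column vector here):
A = [4 2 1; 1 3 1; 1 1 4];
b = [3; -1; 4];
x = A \ b   % direct solve, no iteration needed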
As a part of project I have to construct a cubic spline with natural boundary conditions without using any built-in MATLAB functions such as spline or csape.
I tried programming the following function.
While I'm pretty sure it's correct up to the point where it calculates the coefficients q, I can't figure out how I will eventually get the actual cubic polynomials. What I am getting right now as an output when calling the function is 9 distinct values for S.
Any help or hints would be appreciated.
function S=cubic_s(x,y)
n=length(x);
%construction of the tri-diagonal matrix
for j=1:n
    V(j,1)=1;
    V(j,2)=4;
    V(j,3)=1;
end
%the first row should be (1,0,...,0) and the last (0,0,...,0,1)
V(1,2)=1; V(n,2)=1; V(2,3)=0; V(n-1,1)=0;
d=[-1 0 1];
A=spdiags(V,d,n,n);
%construction of the vector b
b=zeros(n,1);
%the first and last elements of b must equal 0
b(1)=0; b(n)=0;
%distance between two consecutive points
h=(x(n)-x(1))/(n-1);
for j=2:n-1
    b(j,1)=(6/h^2)*(y(j+1)-2*y(j)+y(j-1));
end
%solving for the coefficients q
q=A\b;
%finding the polynomials with the formula for the cubic spline
for j=1:n-1
    for z=x(j):0.01:x(j+1)
        S(j)=(q(j)/(6*h))*(x(j+1)-z)^3+(q(j+1)/(6*h))*(z-x(j))^3+(z-x(j))*(y(j+1)/h-(q(j+1)*h)/6)+(x(j+1)-z)*(y(j)/h-(q(j)*h)/6);
    end
end
You should save S at every z step; see the code below.
function plot_spline
x = (0:10);
y = [1 4 3 7 1 5 2 1 6 2 3];
[XX,YY]=cubic_s(x,y);
plot(x,y,'*r', XX,YY,'-k')
function [XX,YY]=cubic_s(x,y)
n=length(x);
%construction of the tri-diagonal matrix
for j=1:n
    V(j,1)=1;
    V(j,2)=4;
    V(j,3)=1;
end
%the first row should be (1,0,...,0) and the last (0,0,...,0,1)
V(1,2)=1; V(n,2)=1; V(2,3)=0; V(n-1,1)=0;
d=[-1 0 1];
A=spdiags(V,d,n,n);
%construction of the vector b
b=zeros(n,1);
%the first and last elements of b must equal 0
b(1)=0; b(n)=0;
%distance between two consecutive points
h=(x(n)-x(1))/(n-1);
for j=2:n-1
    b(j,1)=(6/h^2)*(y(j+1)-2*y(j)+y(j-1));
end
%solving for the coefficients q
q=A\b;
%finding the polynomials with the formula for the cubic spline
enum = 1;
for j=1:n-1
    for z=x(j):0.01:x(j+1)
        YY(enum)=(q(j)/(6*h))*(x(j+1)-z)^3+(q(j+1)/(6*h))*(z-x(j))^3+(z-x(j))*(y(j+1)/h-(q(j+1)*h)/6)+(x(j+1)-z)*(y(j)/h-(q(j)*h)/6);
        XX(enum)=z;
        enum = enum+1;
    end
end
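As a side note, XX and YY grow inside the loop. A small sketch of how you could preallocate them before the final loop (my addition, assuming the same 0.01 spacing):
npts = 0;
for j=1:n-1
    npts = npts + numel(x(j):0.01:x(j+1)); % samples contributed by segment j
end
XX = zeros(1, npts);
YY = zeros(1, npts);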
I'm kind of new to Matlab and Stack Overflow to begin with, so if I do something wrong outside of the guidelines, please don't hesitate to point it out. Thanks!
I have been trying to do convolution between two functions and I have been having a hard time trying to get it to work.
t=0:.01:10;
h=exp(-t);
x=zeros(size(t)); % When I used length(t), I would get an error that says in conv(), A and B must be vectors.
x(1)=2;
x(4)=5;
y=conv(h,x);
figure; subplot(3,1,1);plot(t,x); % The discrete function would not show (at x=1 and x=4)
subplot(3,1,2);plot(t,h);
subplot(3,1,3);plot(t,y(1:length(t))); %Nothing is plotted here when ran
I commented my issues with the code. I don't understand the difference between length and size in this case and how it would make a difference.
For the second comment, x=1 should have an amplitude of 2, while x=4 should have an amplitude of 5. When plotted, it shows nothing at the locations specified but looks jumbled up at x=0. I'm assuming that's the reason why the convolved plot won't display.
The original problem statement is given if it helps to understand what I was thinking throughout.
Consider an input signal x(t) that consists of two delta functions at t = 1 and t = 4 with amplitudes A1 = 5 and A2 = 2, respectively, to a linear system with impulse response h that is an exponential pulse (h(t) = e^(-t)). Plot x(t), h(t) and the output of the linear system y(t) for t in the range of 0 to 10 using increments of 0.01. Use the MATLAB built-in function conv.
The initial question regarding size vs length
length yields a scalar that is equal to the largest dimension of the input. In the case of your array, the size is 1 x N, so length yields N.
size(t)
% 1 1001
length(t)
% 1001
If you pass a scalar (N) to ones, zeros, or a similar function, it will create a square matrix that is N x N. This results in the error that you see when using conv since conv does not accept matrix inputs.
size(ones(length(t)))
% 1001 1001
When you pass a vector to ones or zeros, the output has that size. Since size returns a vector (as shown above), the output is the same size as t (and a vector), so conv has no issues:
size(ones(size(t)))
% 1 1001
If you want a vector, you need to explicitly specify the number of rows and columns. Also, in my opinion, it's better to use numel to get the number of elements in a vector, as it's less ambiguous than length:
z = zeros(1, numel(t));
The second question regarding the convolution output:
First of all, the impulses that you create are at the first and fourth index of x and not at the locations where t = 1 and t = 4. Since you create t using a spacing of 0.01, t(1) actually corresponds to t = 0 and t(4) corresponds to t = 0.03.
You instead want to use the value of t to specify where to put your impulses:
x(t == 1) = 2;
x(t == 4) = 5;
Note that due to floating point errors, you may not have exactly t == 1 and t == 4, so you can compare against a small epsilon instead:
x(abs(t - 1) < eps) = 2;
x(abs(t - 4) < eps) = 5;
Once we make this change, we get the expected scaled and shifted versions of the input function.
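Putting these pieces together, a corrected version of your script could look like this (a sketch keeping your variable names; swapping plot for stem on the input is my addition, since plot connects the samples and hides isolated impulses):
t = 0:0.01:10;
h = exp(-t);                     % impulse response
x = zeros(size(t));              % input signal, same length as t
x(abs(t - 1) < eps) = 2;         % impulse at t = 1
x(abs(t - 4) < eps) = 5;         % impulse at t = 4
y = conv(h, x);                  % convolve input with impulse response
figure;
subplot(3,1,1); stem(t, x);      % stem shows the isolated impulses clearly
subplot(3,1,2); plot(t, h);
subplot(3,1,3); plot(t, y(1:length(t)));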
I have a matrix a and I want to calculate the distance from one point to all other points. So really the outcome matrix should have a zero (at the point I have chosen) and should appear as some sort of circle of numbers around that specific point.
This is what I have already, but I can't seem to get the correct outcome.
a = [1 2 3 4 5 6 7 8 9 10]
for i = 2:20
    a(i,:) = a(i-1,:) + 1;
end
N = 10
for I = 1:N
    for J = 1:N
        dx = a(I,1)-a(J,1);
        dy = a(I,2)-a(J,2);
        distance(I,J) = sqrt(dx^2 + dy^2)
    end
end
Your a starts as a 1D vector, and the way you build it up is incompatible with the nested loop, which computes the distance in 2D space from each point to each other point. So the following answer applies to the problem of finding all pairwise distances in an N-by-D matrix, as your loop does for the case D = 2.
Option 1 - pdist
I think you are looking for pdist with the 'euclidean' distance option.
a = randn(10, 2); %// 2D, 10 samples
D = pdist(a,'euclidean'); %// euclidean distance
Follow that by squareform to get the square matrix with zero on the diagonal as you want it:
distances = squareform(D);
Option 2 - bsxfun
If you don't have pdist, which is in the Statistics Toolbox, you can do this easily with bsxfun:
da = bsxfun(@minus, a, permute(a, [3 2 1])); % N-by-D-by-N array of pairwise differences
distances = squeeze(sqrt(sum(da.^2, 2)));    % N-by-N distance matrix
Option 3 - reformulated equation
You can also use an alternate form of Euclidean (2-norm) distance,
||A-B|| = sqrt ( ||A||^2 + ||B||^2 - 2*A.B )
Writing this in MATLAB for two data arrays u and v of size NxD,
dot(u-v,u-v,2) == dot(u,u,2) + dot(v,v,2) - 2*dot(u,v,2) % useful identity
%// there are actually small differences from floating point precision, but...
abs(dot(u-v,u-v,2) - (dot(u,u,2) + dot(v,v,2) - 2*dot(u,v,2))) < 1e-15
With the reformulated equation, the solution becomes:
aa = a*a';                    % Gram matrix of inner products
a2 = sum(a.*a, 2);            % squared norms, equal to diag(aa)
a2 = bsxfun(@plus, a2, a2');  % ||A||^2 + ||B||^2 for every pair
distances = sqrt(a2 - 2*aa);
You might use this method if Option 2 eats up too much memory.
Timings
For a random data matrix of size 1e3-by-3 (N-by-D), here are timings for 100 runs (Core 2 Quad, 4GB DDR2, R2013a).
Option 1 (pdist): 1.561150 sec (0.560947 sec in pdist)
Option 2 (bsxfun): 2.695059 sec
Option 3 (bsxfun alt): 1.334880 sec
Findings: (i) if you do the computation with bsxfun, use the alternate formula. (ii) The pdist+squareform option has comparable performance. (iii) The reason squareform takes about twice as much time as pdist is probably that pdist only computes the triangular half, since the distance matrix is symmetric. If you can do without the square matrix, you can skip squareform and do the computation in about 40% of the time required to do it manually with bsxfun (0.5609/1.3348).
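For reference, a sketch of how such timings can be collected (my own harness, not the one used above):
a = randn(1e3, 3);       % N-by-D test data
tic;
for r = 1:100            % 100 runs, as in the timings above
    D = pdist(a, 'euclidean');
    distances = squareform(D);
end
toc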
This is what I was looking for, but thanks for all the suggestions.
A = rand(5, 5);
select_cell = [3 3];
distance = zeros(size(A, 1), size(A, 2));
for i = 1:size(A, 1)
    for j = 1:size(A, 2)
        distance(i, j) = sqrt((i - select_cell(1))^2 + (j - select_cell(2))^2);
    end
end
disp(distance)
Also you can improve it by using vectorisation:
distances = sqrt((x - xCenter).^2 + (y - yCenter).^2);
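For the grid above, x and y would be coordinate grids built with meshgrid. A small sketch of the fully vectorised version (the meshgrid setup is my addition; select_cell as before):
A = rand(5, 5);
select_cell = [3 3];
[x, y] = meshgrid(1:size(A, 2), 1:size(A, 1)); % x holds column indices, y holds row indices
distances = sqrt((y - select_cell(1)).^2 + (x - select_cell(2)).^2);
disp(distances)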
IMPORTANT: data_matrix is D X N, where D is number of dimensions and N is number of data points!
final_dist_pairs = data_matrix' * data_matrix;  % N-by-N Gram matrix of inner products
norms = diag(final_dist_pairs);                 % squared norms of the data points
final_dist_pairs = bsxfun(@plus, norms, norms') - 2 * final_dist_pairs; % squared pairwise distances; take sqrt for Euclidean
Hope it helps!
Another important thing: never use the pdist function of MATLAB. It evaluates sequentially, something like a for loop, and takes a lot of time, maybe O(N^2).
I am trying to write an algorithm that determines $\mu$, $\sigma$ and $\pi$ for each class of a mixture of multivariate normal distributions.
The algorithm partially works: it converges when I set the initial guess values ($\mu$, $\sigma$, $\pi$) near the real values. But when I set the values far from the real ones, the algorithm does not converge; the sigma goes to 0 (2.30760684053766e-24, 2.30760684053766e-24).
I think the problem is my covariance calculation; I am not sure that this is the right way. I found it on Wikipedia.
I would be grateful if you could check my algorithm, especially the covariance part.
Have a nice day,
Thanks,
Setup: a mixture of 2 Gaussians.
size(x) = [400, 2] (400 points, 2-dimensional Gaussians)
mu is 2 x 2 (row 1 = mu of the first Gaussian, row 2 = mu of the second)
for i = 1 : k
    gaussEvaluation(i,:) = pInit(i) * mvnpdf(x, muInit(i,:), sigmaInit(i,:) * eye(d));
    gaussEvaluationSum = sum(gaussEvaluation(i,:));
    %mu calculation
    for j = 1 : d
        mu(i,j) = sum(gaussEvaluation(i,:) * x(:,j)) / gaussEvaluationSum;
    end
    %sigma calculation, method 1
    %for j = 1 : n
    %    v = (x(j,:) - muNew(i,:));
    %    sigmaNew(i) = sigmaNew(i) + gaussEvaluation(i,j) * (v * v');
    %end
    %sigmaNew(i) = sigmaNew(i) / gaussEvaluationSum;
    %sigma calculation, method 2
    sub = bsxfun(@minus, x, mu(i,:));
    sigma(i,:) = sum(gaussEvaluation(i,:) * (sub .* sub)) / gaussEvaluationSum;
    %p calculation
    p(i) = gaussEvaluationSum / n;
end
Two points: you can observe this even when you implement Gaussian mixture EM correctly, but in your case the code does also seem to be incorrect.
First, this is just a problem that you have to deal with when fitting mixtures of gaussians. Sometimes one component of the mixture can collapse on to a single point, resulting in the mean of the component becoming that point and the variance becoming 0; this is known as a 'singularity'. Hence, the likelihood also goes to infinity.
Check out slide 42 of this deck: http://www.cs.ubbcluj.ro/~csatol/gep_tan/Bishop-CUED-2006.pdf
The likelihood function that you are evaluating is not log-concave, so the EM algorithm will not converge to the same parameters with different initial values. The link I gave above also gives some solutions to avoid this over-fitting problem, such as putting a prior or regularization term on your parameters. You can also consider running multiple times with different starting parameters and discarding any results with variance 0 components as having over-fitted, or just reduce the number of components you are using.
In your case, your equation is right; the covariance update calculation on Wikipedia is the same as the one on slide 45 of the above link. However, in a 2D space the mean of each component should be a length-2 vector and its covariance a 2x2 matrix. Hence your code (for two components) is wrong: a 2x2 matrix is fine for storing the two means, but you also use a 2x2 matrix to store the covariances, which should instead be a 2x2x2 array.
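For illustration, a minimal sketch of a full-covariance M-step update (the variable names are assumptions of mine: x is n-by-d data, resp(i,:) holds the responsibilities of component i, and mu is k-by-d):
[n, d] = size(x);
k = 2;
sigma = zeros(d, d, k);                      % one d-by-d covariance per component
for i = 1:k
    w = resp(i,:) / sum(resp(i,:));          % normalized responsibilities
    sub = bsxfun(@minus, x, mu(i,:));        % center the data at the component mean
    sigma(:,:,i) = (bsxfun(@times, w', sub))' * sub; % weighted covariance: sum of w_j*(x_j-mu)'*(x_j-mu)
    sigma(:,:,i) = sigma(:,:,i) + 1e-6 * eye(d);     % small regularizer against the singularity mentioned above
end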
I am trying to solve the following optimization problem in Octave.
The first constraint is that A be positive semi-definite.
S is a set of data points such that if (xi, xj) is in S then xi is similar to xj, and D is a set of data points such that if (xi, xj) is in D then xi and xj are dissimilar. Note that the above formula is two separate sums, and the second sum is not nested. Also xi and xj are assumed to be column vectors of length N.
Because this is a nonlinear optimization I am trying to use octave's nonlinear program solver, sqp.
The problem is that if I just provide it with the function to optimize, the BFGS method it uses to find the Hessian fails on some small toy tests. Because of this, I tried to provide my own Hessian function, but now this problem occurs:
error: __qp__: operator *: nonconformant arguments (op1 is 2x2, op2 is 3x1)
error: called from:
error: /usr/share/octave/3.6.3/m/optimization/qp.m at line 393, column 26
error: /usr/share/octave/3.6.3/m/optimization/sqp.m at line 414, column 32
when I make the following call to sqp
[A, ~, Info] = sqp(initial_guess, {@toOpt, @CalculateGradient, @CalculateHessian}, ...
                   [], [], 0, [], maxiter);
I simplified the constraint that A be positive semi-definite and diagonal by solving only for the diagonal entries and constraining them all to be >= 0. initial_guess is a vector of ones of length N.
Here is my code to calculate what I believe to be the Hessian matrix
%Hessian = CalculateHessian(A)
%calculates the Hessian of the function we are optimizing as follows
%H(i,j) = (sumsq(D(:,i),1) * sumsq(D(:,j),1)) / (sum(A.*sumsq(D,1))^2)
%where D is a matrix of differences between observations that are dissimilar, with one difference on each row
%and sumsq is the sum of the squares
%input A: the current guess for A
%output Hessian: The hessian of the function we are optimizing
function Hessian = CalculateHessian(A)
global HessianNumerator; %this is a matrix with the numerator of H(i,j)
global Dsum_of_squares; %the sum of the squares of the differences of each dimensions of the dissimilar observations
if(iscolumn(A)) %if A is a column vector
    A = A'; %make it a row vector; necessary to prevent broadcasting
endif
if(~isempty(Dsum_of_squares)) %if dissimilar constraints were provided
    Hessian = HessianNumerator / (sum(A.*Dsum_of_squares)^2)
else
    Hessian = HessianNumerator; %the Hessian is a matrix of 0s
endif
endfunction
and Dsum_of_squares and HessianNumerator are
[dissimilarRow,dissimilarColumn] = find(D); %find which observations are dissimilar to each other
DissimilarDiffs = X(dissimilarRow,:) - X(dissimilarColumn,:); %take the difference between the dissimilar observations
Dsum_of_squares = sumsq(DissimilarDiffs,1);
HessianNumerator = Dsum_of_squares .* Dsum_of_squares'; %calculate the numerator of the Hessian. it is a constant value
X is a M x N matrix with one observation per row.
D is a M x M dissimilarity matrix. if D(i,j) is 1 then row i of X is dissimlar to row j. 0 otherwise.
I believe my error is in one of the following areas (from least likely to most likely):
The math I used to derive the Hessian function is wrong. The formula I am using is in my comments for the function.
My implementation of the math.
The Hessian Matrix that sqp wants is different from the one described on the Hessian Matrix Wikipedia page.
Any help would be greatly appreciated. If you need me to post more code I would be happy to do so. Right now the amount of code to try and solve the optimization is about 160 lines.
Here is the test case I am running that causes the code to fail. It works if I only pass it the gradient function.
X = [1 2 3;
4 5 6;
7 8 9;
10 11 12];
S = [0 1 1 0;
1 0 0 0;
1 0 0 0;
0 0 0 0]; %this means row 1 of X is similar to rows 2 and 3
D = [0 0 0 0;
0 0 0 0;
0 0 0 1;
0 0 1 0]; %this means row 3 of X is dissimilar to row 4
gml(X,S,D, 200); %200 is the maximum number of iterations for sqp to run