I am using the SOR method and need to find the optimal weight factor. I think a good way to go about this is to run my SOR code with a number of omegas from 0 to 2, then store the number of iterations for each of these. Then I can see which iteration is the lowest and which omega it corresponds to. Being a novice programer, however, I am unsure how to go about this.
Here is my SOR code:
function [x, l] = SORtest(A, b, x0, TOL,w)
[m n] = size(A); % assigning m and n to number of rows and columns of A
l = 0; % counter variable
x = [0;0;0]; % introducing solution matrix
max_iter = 200;
while (l < max_iter) % loop until max # of iters.
l = l + 1; % increasing counter variable
for i=1:m % looping through rows of A
sum1 = 0; sum2 = 0; % intoducing sum1 and sum2
for j=1:i-1 % looping through columns
sum1 = sum1 + A(i,j)*x(j); % computing sum using x
end
for j=i+1:n
sum2 = sum2 + A(i,j)*x0(j); % computing sum using more recent values in x0
end
x(i) =(1-w)*x0(i) + w*(-sum1-sum2+b(i))/A(i,i); % assigning elements to the solution matrix.
end
if abs(norm(x) - norm(x0)) < TOL % checking tolerance
break
end
x0 = x; % assigning x to x0 before relooping
end
end
That's pretty easy to do. Simply loop through values of w and determine what the total number of iterations is at each w. Once the function finishes, check to see if this is the current minimum number of iterations required to get a solution. If it is, then update what the final solution would be. Once we iterate over all w, the result would be the solution vector that produced the smallest number of iterations to converge. Bear in mind that SOR has the w such that it does not include w = 0 or w = 2, or 0 < w < 2, so we can't include 0 or 2 in the range. As such, do something like this:
omega_vec = 0.01:0.01:1.99;
final_x = x0;
min_iter = intmax;
for w = omega_vec
[x, iter] = SORtest(A, b, x0, TOL, w);
if iter < min_iter
min_iter = iter;
final_x = x;
end
end
The loop checks to see if the total number of iterations at each w is less than the current minimum. If it is, log this and also record what the solution vector was. The final solution vector that was the minimum over all w will be stored in final_x.
Related
I have attempted to implement the iterative version of gradient descent algorithm which however is not working correctly. The vectorized implementation of the same algorithm however works correctly.
Here is the iterative implementation :
function [theta] = gradientDescent_i(X, y, theta, alpha, iterations)
% get the number of rows and columns
nrows = size(X, 1);
ncols = size(X, 2);
% initialize the hypothesis vector
h = zeros(nrows, 1);
% initialize the temporary theta vector
theta_temp = zeros(ncols, 1);
% run gradient descent for the specified number of iterations
count = 1;
while count <= iterations
% calculate the hypothesis values and fill into the vector
for i = 1 : nrows
for j = 1 : ncols
term = theta(j) * X(i, j);
h(i) = h(i) + term;
end
end
% calculate the gradient
for j = 1 : ncols
for i = 1 : nrows
term = (h(i) - y(i)) * X(i, j);
theta_temp(j) = theta_temp(j) + term;
end
end
% update the gradient with the factor
fact = alpha / nrows;
for i = 1 : ncols
theta_temp(i) = fact * theta_temp(i);
end
% update the theta
for i = 1 : ncols
theta(i) = theta(i) - theta_temp(i);
end
% update the count
count += 1;
end
end
And below is the vectorized implementation of the same algorithm :
function [theta, theta_all, J_cost] = gradientDescent(X, y, theta, alpha)
% set the learning rate
learn_rate = alpha;
% set the number of iterations
n = 1500;
% number of training examples
m = length(y);
% initialize the theta_new vector
l = length(theta);
theta_new = zeros(l,1);
% initialize the cost vector
J_cost = zeros(n,1);
% initialize the vector to store all the calculated theta values
theta_all = zeros(n,2);
% perform gradient descent for the specified number of iterations
for i = 1 : n
% calculate the hypothesis
hypothesis = X * theta;
% calculate the error
err = hypothesis - y;
% calculate the gradient
grad = X' * err;
% calculate the new theta
theta_new = (learn_rate/m) .* grad;
% update the old theta
theta = theta - theta_new;
% update the cost
J_cost(i) = computeCost(X, y, theta);
% store the calculated theta value
if i < n
index = i + 1;
theta_all(index,:) = theta';
end
end
Link to the dataset can be found here
The filename is ex1data1.txt
ISSUES
For initial theta = [0, 0] (this is a vector!), learning rate of 0.01 and running this for 1500 iterations I get the optimal theta as :
theta0 = -3.6303
theta1 = 1.1664
The above is the output for the vectorized implementation which I know I have implemented correctly (it passed all the test cases on Coursera).
However, when I implemented the same algorithm using the iterative method (1st code I mentioned) the theta values I get are (alpha = 0.01, iterations = 1500):
theta0 = -0.20720
theta1 = -0.77392
This implementation fails to pass the test cases and I know therefore that the implementation is incorrect.
I am however unable to understand where I am going wrong as the iterative code does the same job, same multiplications as the vectorized one and when I tried to trace the output of 1 iteration of both the codes, the values came same (on pen and paper!) but failed when I ran them on Octave.
Any help regarding this would be of great help especially if you could point out where I went wrong and what exactly was the cause of failure.
Points to consider
The implementation of hypothesis is correct as I tested it out and both the codes gave the same results, so no issues here.
I printed the output of the gradient vector in both the codes and realised that the error lies here because the outputs here were very different!
Additionally, here is the code for pre-processing the data :
function[X, y] = fileReader(filename)
% load the dataset
dataset = load(filename);
% get the dimensions of the dataset
nrows = size(dataset, 1);
ncols = size(dataset, 2);
% generate the X matrix from the dataset
X = dataset(:, 1 : ncols - 1);
% generate the y vector
y = dataset(:, ncols);
% append 1's to the X matrix
X = [ones(nrows, 1), X];
end
What is going wrong with the first code is that the theta_temp and the h vectors are not being initialised properly. For the very first iteration (when count value equals 1) your code runs properly because for that particular iteration the the h and the theta_temp vectors have been initialised to 0 properly. However, since these are temporary vectors for each iteration of gradient descent, they have not been initialised to 0 vectors again for the subsequent iterations. That is, for iteration 2, the values that are modified into h(i) and theta_temp(i) are just added to the old values. Hence because of that, the code does not work properly. You need to update the vectors as zero vectors at the beginning of each iteration and then they would work correctly. Here is my implementation of your code (the first one, observe the changes) :
function [theta] = gradientDescent_i(X, y, theta, alpha, iterations)
% get the number of rows and columns
nrows = size(X, 1);
ncols = size(X, 2);
% run gradient descent for the specified number of iterations
count = 1;
while count <= iterations
% initialize the hypothesis vector
h = zeros(nrows, 1);
% initialize the temporary theta vector
theta_temp = zeros(ncols, 1);
% calculate the hypothesis values and fill into the vector
for i = 1 : nrows
for j = 1 : ncols
term = theta(j) * X(i, j);
h(i) = h(i) + term;
end
end
% calculate the gradient
for j = 1 : ncols
for i = 1 : nrows
term = (h(i) - y(i)) * X(i, j);
theta_temp(j) = theta_temp(j) + term;
end
end
% update the gradient with the factor
fact = alpha / nrows;
for i = 1 : ncols
theta_temp(i) = fact * theta_temp(i);
end
% update the theta
for i = 1 : ncols
theta(i) = theta(i) - theta_temp(i);
end
% update the count
count += 1;
end
end
I ran the code and it gave the same values of theta which you have mentioned. However, what I wonder is how did you state that the output of hypothesis vector was the same in both cases where clearly, this was one of the reasons for the first code failing!
Write a MATLAB code to compute and determine the convergence rate of :
(exp(h)-(1+h+1/2*h^2))/h with h=1/2, 1/2^2,..., 1/2^10
My code was:
h0=(0.5)^i;
TOL=10^(-8);
N=10;
i=1;
flag=0;
table=zeros(30,1);
table(1)=h0
while i < N
h=(exp(h0)-(1+h0+0.5*h0^2))/h0;
table (i+1)=h;
if abs(h-h0)< TOL
flag=1;
break;
end
i=i+1;
h0=h;
end
if flag==1
h
else
error('failed');
end
The answer I received does not make quite any sense. Please help.
The main problem is in your tolerance check
if abs(h-h0) < TOL
If your expression approaches its limit fast enough, h can become 0 as h0 is larger than the tolerance. If so, the criteria isn't met and the loop continues. Next iteration h0 is 0 and h will be evaluated as NaN (since divition with 0 is bad)
If you, like in this case, expect a convergence towards 0, you could instead check
if h > TOL
or you could also add a NaN-check
if abs(h-h0) < TOL || isnan(h)
Apart from that, there are a few problems with your code.
First you initiate h0 using i (the imaginary number), you probably intented to use i = 1, but that line is below in you code.
You use a whileloop but in your case, as you intend to increment i, a forloop would be just as good.
The use of both the variable table, h and h0 is unnecessary. Make a single result vector, initialized with your h0 at the first index - see example below.
TOL = 1e-8; % Tolerance
N = 10; % Max number of iterations
% Value vector
h = zeros(N+1,1);
% Init value
h(1) = (0.5)^1;
for k = 1:N
h(k+1) = (exp(h(k)) - (1 + h(k) + 0.5*h(k)^2))/h(k);
if isnan(h(k+1)) || abs(diff(h(k + [0 1]))) < TOL
N = k;
break
end
end
% Crop vector
h = h(1:N);
% Display result
fprintf('Converged after %d iterations\n', N)
% Plot (with logarithmic y-xis)
semilogy(1:N, h,'*-')
I am aiming to minimize the below cost function over W
J = (E)^2
E = A - W .* B
Such that W(n+1) = W(n) - (u/2) * delJ
delJ = gradient of J = -2 * E .* B
u = step_size=0.2
where:
- A, B are STFT matrix of 2 audio signals (dimension is 257x4000 for a 16s audio with window size = 256 , 75% overlap, nfft=512)
- W is a matrix constructed with [257x1] vector repeated 4000 times (so that it become 257x4000] matrix
I have written my customize function as below. The problem is, the elements in A and B are so small (~e-20) that even after 1000 iteration that no change happens to g.
I am certainly missing something, if anybody can help or guide me to some link which explains the whole process for a new person.
[M,N] = size(A);
E =#(x) A - repmat(x,1,N) .* B; % Error Function
J = #(x) E(x) .^ 2; % Cost Function
G = #(x) -2 * E(x) .* B; % Gradiant Function
alpha = .2; % Step size
maxiter = 500; % Max iteration
dwmin = 1e-6; % Min change in gradiation
tolerence = 1e-6; % Max Gradiant norm
gnorm = inf;
w = rand(M,1);
dw = inf;
for i = 1:maxiter
g = G(w);
gnorm = norm(g);
wnew = w - (alpha/2)*g(:,1);
dw = norm(wnew-w)
if or(dw < dwmin, gnorm < tolerence)
break
end
end
w = wnew;
A & B are always positive real numbered vectors.
Your problem is actually a series of independent problems. If we index each row of A and B and each element of w with i, then minimizing the sum of squares across the error matrix
A - repmat(w, 1, N) .* B
is the same as minimizing the the sum of squares across the error vector
A(i, :) - w(i) * B(i, :)
for all rows separately. The latter problem can be solved using one of the least-squares operators of Matlab, in particular mrdivide or /:
for i = 1 : M
w(i) = A(i, :) / B(i, :);
end
As far as I can tell, there is no way to further vectorize this calculation.
In any case, there is no need to use a gradient descent or other form of optimization algorithm.
I'm writing a function of the Gauss Seidel method of solving a linear system of equations of the form Ax=b, x being the unknown we are looking for.
I am having a problem with the while loop in my function, it seems that it runs infinitely. I can't seem to figure out why.
This is my function for creating the coefficient matrix A and the column vectors x and b, all with the same number of rows of course. No problem with this one.
function [A, b, x0] = test_system(n)
u = ones(n, 1);
A = spdiags([u 4*u u], [-1 0 1], n, n);
b = zeros(n, 1);
b(1) = 3;
b(2 : 2 : end-2) = -2;
b(3 : 2 : end-1) = 2;
b(end) = -3;
x0 = ones(n, 1);
This is my function for solving the system. I have included all of it just in case, but I believe the real problem is within the while loop at the very end which runs infinitely when I execute the function. The counter doesn't break away from it either. I can't really see what its problem is. Any clues?
Be gentle, I'm new at Matlab :)
function [x] = GaussSeidel(A,b,x0,tol)
% implementation of the GaussSeidel iterative method
% for solving a linear system of equations Ax = b
%INPUTS:
% A: coefficient matrix
% b: column vector of constants
% x0: setup for the unknown vector (using vector of ones)
% tol: result must be within 'tol' of correct answer.
%OUTPUTS:
% x: unknown
%check that A is a matrix
if ~(ismatrix(A))
error('A is not a matrix');
end
%check that A is square
[m,n] = size(A);
if m ~= n
error('Matrix A is not square');
end
%check that b is a column vector
if ~(iscolumn(b))
error('b is not a column vector');
end
%check that x0 is a column vector
if ~(iscolumn(x0))
error('x0 is not a column vector');
end
%check that A, b and x0 agree in size
[rowA,colA] = size(A);
[rowb,colb] = size(b);
[rowx0,colx0] = size(x0);
if ~isequal(colA,rowb)||~isequal(rowb,rowx0)
error('matrix dimensions of A, b and xo do not agree');
end
%check that A and b have real entries
if ~isreal(A) || ~isreal(b)
error('matrix A or vector b do not have real entries');
end
%check that the provided tolerance is positive
if tol <= 0
error('tolerance must be positive');
end
%check that A is strictly diagonally dominant
absoluteA = abs(A);
row_sum=sum(absoluteA,2);
diagonal=diag(absoluteA);
if ~all(2*diagonal > row_sum)
warning('matrix A is not strictly diagonally dominant');
end
L = tril(A,-1);
U = triu(A,+1);
D = diag(diag(A));
x = x0;
M1 = inv(D).*L;
M2 = inv(D).*U;
M3 = D\b;
k = 0; %iterations counter
disp(size(M1));
disp(size(M2));
disp(size(M3));
disp(size(x));
while (norm(A*x - b) > tol)
for i=1:n
x(i) = - M1(i,:).*x - M2(i,:).*x + M3(i,:);
end
k=k+1;
if(k >= 10e4)
error('too many iterations carried out');
end
end
end %end function
Coding Discussion
The source of the coding error is the use of element-wise operations instead of matrix-matrix and matrix-vector operations.
These operations produce zero-matrices, so no progress is made.
The M matrices should be defined by
M1 = inv(D)*L; % Note that D\L is more efficient
M2 = inv(D)*U; % Note that D\U is more efficient
M3 = D\b;
and the iterator performing the update should be
x(i) = - M1(i,:)*x - M2(i,:)*x + M3(i,:);
Method Discussion
I also think it is worth mentioning that the code is currently implementing the Jacobi Method since the updater has the form
while a Gauss-Seidel update has the form
.
I don’t have 50 reputation so I can not comment this.
The line if(k >= 10e4), I think this does not make what you think it would. 10e4 is 100,000 and 1e4 is 10,000. This will be the reason why you think your counter does not work. Matlab ist still running because it’s running further than you think it will. Also I run in the same Problem knedlsepp already pointed out.
I have given P(x1...n) discrete independent probability values which represent for example the possibility of happening X.
I want a universal code for the question: With which probability does happening X occur at the same time 0-n times?
For example:
Given: 3 probabilities P(A),P(B),P(C) that each car(A,B,C) parks. Question would be: With which probability would no car, one car, two cars, and three cars park?
The answer for example for two cars parking at the same time would be:
P(A,B,~C) = P(A)*P(B)*(1-P(C))
P(A,~B,C) = P(A)*(1-P(B))*P(C)
P(~A,B,C) = (1-P(A))*P(B)*P(C)
P(2 of 3) = P(A,B,~C) + P(A,~B,C) + P(~A,B,C)
I have written the code for all possibilities, but the more values I get, of course the slower it gets due to more possible combinations.
% probability: Vector with probabilities P1, P2, ... PN
% result: Vector with results as stated above.
% All possibilities:
result(1) = prod(probability);
shift_vector = zeros(anzahl_werte,1);
for i = 1:anzahl_werte
% Shift Vector allocallization
shift_vector(i) = 1;
% Compute all unique permutations of the shift_vector
mult_vectors = uperm(shift_vector);
% Init Result Vector
prob_vector = zeros(length(mult_vectors(:,1)), 1);
% Calc Single Probabilities
for k = 1:length(mult_vectors(:,1))
prob_vector(k) = prod(abs(mult_vectors(k,:)'-probability));
end
% Sum of this Vector for one probability.
result(i+1) = sum(prob_vector);
end
end
%%%%% Calculate Permutations
function p = uperm(a)
[u, ~, J] = unique(a);
p = u(up(J, length(a)));
end % uperm
function p = up(J, n)
ktab = histc(J,1:max(J));
l = n;
p = zeros(1, n);
s = 1;
for i=1:length(ktab)
k = ktab(i);
c = nchoosek(1:l, k);
m = size(c,1);
[t, ~] = find(~p.');
t = reshape(t, [], s);
c = t(c,:)';
s = s*m;
r = repmat((1:s)',[1 k]);
q = accumarray([r(:) c(:)], i, [s n]);
p = repmat(p, [m 1]) + q;
l = l - k;
end
end
%%%%% Calculate Permutations End
Does anybody know a way to speed up this function? Or maybe Matlab has an implemented function for that?
I found the name of the calculation:
Poisson binomial distribution
How about this?
probability = [.3 .2 .4 .7];
n = numel(probability);
combs = dec2bin(0:2^n-1).'-'0'; %'// each column is a combination of n values,
%// where each value is either 0 or 1. A 1 value will represent an event
%// that happens; a 0 value will represent an event that doesn't happen.
result = NaN(1,n+1); %// preallocate
for k = 0:n; %// number of events that happen
ind = sum(combs,1)==k; %// combinations with exactly k 1's
result(k+1) = sum(prod(...
bsxfun(#times, probability(:), combs(:,ind)) + ... %// events that happen
bsxfun(#times, 1-probability(:), ~combs(:,ind)) )); %// don't happen
end