I'm trying to implement the Jacobi iteration in MATLAB but am unable to get it to converge. I have looked online and elsewhere for working code for comparison but am unable to find any that is something similar to my code and still works. Here is what I have:
function x = Jacobi(A,b,tol,maxiter)
n = size(A,1);
xp = zeros(n,1);
x = zeros(n,1);
k=0; % number of steps
while(k<=maxiter)
k=k+1;
for i=1:n
xp(i) = 1/A(i,i)*(b(i) - A(i,1:i-1)*x(1:i-1) - A(i,i+1:n)*x(i+1:n));
end
err = norm(A*xp-b);
if(err<tol)
x=xp;
break;
end
x=xp;
end
This just blows up no matter what A and b I use. It's probably a small error I'm overlooking but I would be very grateful if anyone could explain what's wrong because this should be correct but is not so in practice.
Your code is correct. The reason why it may not seem to work is because you are specifying systems that may not converge when you are using Jacobi iterations.
To be specific (thanks to #Saraubh), this method will converge if your matrix A is strictly diagonally dominant. In other words, for each row i in your matrix, the absolute summation of all of the columns j at row i without the diagonal coefficient at i must be less than the diagonal itself. In other words:
However, there are some systems that will converge with Jacobi, even if this condition isn't satisfied, but you should use this as a general rule before trying to use Jacobi for your system. It's actually more stable if you use Gauss-Seidel. The only difference is that you are re-using the solution of x and feeding it into the other variables as you progress down the rows. To make this Gauss-Seidel, all you have to do is change one character within your for loop. Change it from this:
xp(i) = 1/A(i,i)*(b(i) - A(i,1:i-1)*x(1:i-1) - A(i,i+1:n)*x(i+1:n));
To this:
xp(i) = 1/A(i,i)*(b(i) - A(i,1:i-1)*xp(1:i-1) - A(i,i+1:n)*x(i+1:n));
**HERE**
Here are two examples that I will show you:
Where we specify a system that does not converge by Jacobi, but there is a solution. This system is not diagonally dominant.
Where we specify a system that does converge by Jacobi. Specifically, this system is diagonally dominant.
Example #1
A = [1 2 2 3; -1 4 2 7; 3 1 6 0; 1 0 3 4];
b = [0;1;-1;2];
x = Jacobi(A, b, 0.001, 40)
xtrue = A \ b
x =
1.0e+09 *
4.1567
0.8382
1.2380
1.0983
xtrue =
-0.1979
-0.7187
0.0521
0.5104
Now, if I used the Gauss-Seidel solution, this is what I get:
x =
-0.1988
-0.7190
0.0526
0.5103
Woah! It converged for Gauss-Seidel and not Jacobi, even though the system isn't diagonally dominant, I may have an explanation for that, and I'll provide later.
Example #2
A = [10 -1 2 0; -1 -11 -1 3; 2 -1 10 -1; 0 3 -1 8];
b = [6;25;-11;15];
x = Jacobi(A, b, 0.001, 40);
xtrue = A \ b
x =
0.6729
-1.5936
-1.1612
2.3275
xtrue =
0.6729
-1.5936
-1.1612
2.3274
This is what I get with Gauss-Seidel:
x =
0.6729
-1.5936
-1.1612
2.3274
This certainly converged for both, and the system is diagonally dominant.
As such, there is nothing wrong with your code. You are just specifying a system that can't be solved using Jacobi. It's better to use Gauss-Seidel for iterative methods that revolve around this kind of solving. The reason why is because you are immediately using information from the current iteration and spreading this to the rest of the variables. Jacobi does not do this, which is the reason why it diverges more quickly. For Jacobi, you can see that Example #1 failed to converge, while Example #2 did. Gauss-Seidel converged for both. In fact, when they both converge, they're quite close to the true solution.
Again, you need to make sure that your systems are diagonally dominant so you are guaranteed to have convergence. Not enforcing this rule... well... you'll be taking a risk as it may or may not converge.
Good luck!
Though this does not point out the problem in your code, I believe that you are looking for the Numerical Methods: Jacobi File Exchange Submission.
%JACOBI Jacobi iteration for solving a linear system.
% Sample call
% [X,dX] = jacobi(A,B,P,delta,max1)
% [X,dX,Z] = jacobi(A,B,P,delta,max1)
It seems to do exactly what you describe.
As others have pointed out that not all systems are convergent using Jacobi method, but they do not point out why? Actually only a small sub-set of systems converge with Jacobi method.
The convergence criteria is that the "sum of all the coefficients (non-diagonal) in a row" must be lesser than the "coefficient at the diagonal position in that row". This criteria must be satisfied by all the rows. You can read more at: Jacobi Method Convergence
Before you decide to use Jacobi method, you must see whether this criteria is satisfied by the numerical method or not. The Gauss-Seidel method has a slightly more relaxed convergence criteria which allows you to use it for most of the Finite Difference type numerical methods.
Related
Actually I have to calculate values of 3 variables from probably 8 or 9 non-linear equations(may be more for accuracy).
I was using lsqnonlin and fsolve.
Using lsqnonlin, it says solver stopped prematurely (mainly due to value of iteration, funEvals and tolerance) and the output is far away from exact solution. I tried but I don't know on what basis I should set those parameters.
Using fsolve, it says no solution found.
I also used LMFnlsq and LMFsolve but it gives the output nowhere near the exact solution? I tried to change other parameters too but I could not bring those solutions to my desired values.
Is there any other way to solve these overdetermined non-linear equations?
My code till now:
x0 = [20 40 275];
eqn = #(x)[((((x(1)-Sat(1,1))^2+(x(2)-Sat(1,2))^2+(x(3)-Sat(1,3))^2))-dis(1)^2);
((((x(1)-Sat(2,1))^2+(x(2)-Sat(2,2))^2+(x(3)-Sat(2,3))^2))-dis(2)^2);
((((x(1)-Sat(3,1))^2+(x(2)-Sat(3,2))^2+(x(3)-Sat(3,3))^2))- dis(3)^2);
((((x(1)-Sat(4,1))^2+(x(2)-Sat(4,2))^2+(x(3)-Sat(4,3))^2))- dis(4))^2;
((((x(1)-Sat(5,1))^2+(x(2)-Sat(5,2))^2+(x(3)-Sat(5,3))^2))- dis(5))^2;
((((x(1)-Sat(6,1))^2+(x(2)-Sat(6,2))^2+(x(3)-Sat(6,3))^2))- dis(6))^2;
((((x(1)-Sat(7,1))^2+(x(2)-Sat(7,2))^2+(x(3)-Sat(7,3))^2))- dis(7))^2;
((((x(1)-Sat(8,1))^2+(x(2)-Sat(8,2))^2+(x(3)-Sat(8,3))^2))- dis(8))^2;
((((x(1)-Sat(9,1))^2+(x(2)-Sat(9,2))^2+(x(3)-Sat(9,3))^2))- dis(9))^2;
((((x(1)-Sat(10,1))^2+(x(2)-Sat(10,2))^2+(x(3)-Sat(10,3))^2))- dis(10))^2];
lb = [0 0 0];
ub = [100 100 10000];
options = optimoptions('lsqnonlin','MaxFunEvals',3000,'MaxIter',700,'TolFun',1e-18);%,'TolX',1);
x= lsqnonlin(eqn,x0,lb,ub,options)
**Error:**
**Solver stopped prematurely.**
lsqnonlin stopped because it exceeded the iteration limit,
options.MaxIter = 700 (the selected value).
x = 20.349 46.633 9561.5
Hoping for some suggestions!
Thanks in advance!
I usually model this explicitly:
min w'w
f_i(x) = w_i
w is a free variable
L<=x<=U
It should be easy to calculate a feasible (but non-optimal) solution in advance. If you can find a "good" initial solution that would be even better. Then use a general purpose NLP solver (e.g. fmincon) and pass on your initial feasible solution (both x and w). The best thing is to use a modeling system that allows automatic differentiation. Otherwise you should provide correct and precise gradients (and if needed second derivatives). See also the advice here.
Im implementing the k-means algorithm on matlab without using the k-means built-in function, The stopping criteria is when the new centroids doesn't change by new iterations, but i cannot implement it in matlab , can anybody help?
Thanks
Setting no change as a stopping criteria is a bad idea. There are a few main reasons you shouldn't use a 0 change condition
even for a well behaved function the difference between 0 change and a very small change (say 1e-5 perhaps)could be 1000+ iterations, so you are wasting time trying to get them to be exactly the same. Especially because computers usually keep far more digits than we are interested in. IF you only need 1 digit accuracy, why wait for the computer to find an answer within 1e-31?
computers have floating point errors everywhere. Try doing some easily reversible matrix operations like a = rand(3,3); b = a*a*inv(a); a-b theoretically this should be 0 but you will see it isn't. So these errors alone could prevent your program from ever stopping
dithering. lets say we have a 1d k means problem with 3 numbers and we want to split them into 2 groups. One iteration the grouping can be a,b vs c. the next iteration could be a vs b,c the next could be a,b vs c the next.... This is of course a simplified example, but there can be instances where a few data points can dither between clusters, and you will end up with a never ending algorithm. Since those few points are reassigned, the change will never be 0
the solution is to use a delta threshold. basically you subtract the current values from the previous and if they are less than a threshold you are done. This on its own is powerful, but as with any loop, you need a backup escape plan. And that is setting a max_iterations variable. Look at matlabs documentation for kmeans, even they have a MaxIter variable (default is 100) so even if your kmeans doesn't converge, at least it wont run endlessly. Something like this might work
%problem specific
max_iter = 100;
%choose a small number appropriate to your problem
thresh = 1e-3;
%ensures it runs the first time
delta_mu = thresh + 1;
num_iter = 0;
%do your kmeans in the loop
while (delta_mu > thresh && num_iter < max_iter)
%save these right away
old_mu = curr_mu;
%calculate new means and variances, this is the standard kmeans iteration
%then store the values in a variable called curr_mu
curr_mu = newly_calculate_values;
%use the two norm to find the delta as a single number. no matter what
%the original dimensionality of mu was. If old_mu -new_mu was
% 0 the norm is still 0. so it behaves well as a distance measure.
delta_mu = norm(old_mu - curr_mu,2);
num_ter = num_iter + 1;
end
edit
if you don't know the 2 norm is essentially the euclidean distance
I have some code which uses sparse indexing (and there's no way that I can get around that). I run this in a function, and use it for two problems, where the sizes of all the variables involved do not change. However, for one problem, the sparse indexing part takes 5 seconds, and for the other, takes 25 seconds.
I checked the size of every variable involved, and they are the same for both problems. I also checked that xv is a full matrix for both problem types.
So, anyone else ever run into something weird like this? Any ideas as to why this would happen? Mainly I am trying to make the code more efficient, and while 5 seconds is ok for my particular application, 25 seconds (especially when I can't explain it) is very bad.
Edit: Here is a link to a photo that profiles this weird behavior. The runtime values were recorded on the third run to ensure that the size of X is also not changing. And I did check that xv is a dense (not sparse) matrix both times.
https://www.dropbox.com/s/i41j6afanzbjdyg/weird_bcd_thing.png?dl=0
Thanks so much for any help!
Code below (runs in a for loop). If I use ptype = 1, then it's 5 seconds, ptype = 3 is 25 seconds.
clvec = cliques{k};
xcurr = full(X(clvec));
xv = reshape(xcurr - Z(offset_index(k) + 1 : offset_index(k) + ncl^2),ncl,ncl);
%these two functions both take a dense symmetric matrix and return a dense symmetric matrix, and in both cases the size is the same for a given k.
if ptype == 1
xv = proj_PSD(xv,0,0);
elseif ptype == 3
xv = proj_Schoenberg(xv,0);
end
Xd = vec(xv) - xcurr;
%THIS IS THE WEIRD LINE
tic
X(clvec) = xv;
toc;
In the 'WEIRD LINE' : X(clvec) = xv;
You are using a random access to a sparse matrix.
This access in a sparse matrix is not constant and depends on its data. The time is may depend on the matrix values and the indices you are trying to access.
This is not the case in regular matrix, where you usually get a stable access time, and faster.
In order to assure a stable constant access try to change the implementation based on your specific matrix usage, try to avoid values assign by random access.
See next code for as a reference:
X = sparse(randi(100,50,1),randi(100,50,1),randn(1),100,100);
for i=1:10000
rand_inds{i} = randperm(10000,100);
end
for i=1:100
ti = tic;
X(rand_inds{i}) = 3;
to_X(i) = toc(ti);
end
Xf = full(X);
for i=1:100
ti = tic;
Xf(rand_inds{i}) = 3;
to_Xf(i) = toc(ti);
end
figure;plot(to_X);hold on;plot(to_Xf,'r');
I solved my problem! I'm posting the answer because I think it's interesting.
One thing I didn't mention in the question is that the loop goes from k = 1 to k = L, and for ptype = 3, we add one more step, and that's assigning all the diagonal indices to 0:
X(diag_index) = 0
where diag_index is computed ahead of time.
The problem is, instead of just assigning the values to 0, MATLAB will automatically discard these indices, and the next loop, when accessing diagonal indices, it has to re-allocate for X. So, I changed that line to
X(diag_index) = eps;
and now they both run equally fast! (It's not the best solution, since that's going to be a source of error later, but there's no more mystery!)
The answer is never what you think it would be...
I am trying to use solve() to solve a system of equations of the following form
eq1=a1x+a2y;
eq2=b1x+b2y;
where a1 = .05 for values of x<5, .1 for values of 5
Is there a way to solve for this using solve? As in sol = solve(eq1,eq2);
I'm not sure what you're trying to do here. Can you please post a real example (with numbers) and what you would like the output to be?
I think you're trying to solve linear simultaeneous equations. Assuming that is what you are trying to do:
I would suggest multiplying all of your equations by 20, so that your minimum quanta size of 0.05 becomes 1.00. Your problem then becomes the solution of linear equations for integer values.
Note that if the system is fully constrained (that is, if there are n independent constraints on the n equations you want to solve) then there will only be one solution and it may not necessarily be an integer solution. For example the system:
1 = 2a + 4b
3 = a + b
has the solution a = 5.5, b = -2.5. No other solution is possible.
For under-constrained systems, i.e.
0 = 3x + y
x > 0
Then there will be an infinite number of solutions, some of which may have both x and y being integer values. (Or there may be no integer solutions at all.)
Okay let me give you a quick rundown.
if you want to solve an equation or a system of equations and conditions then you need to define them as such, so let me explain.
so by example
clear all; %just to be safe
syms x y b
a=0.5;
somevalue=1;
someothervalue=3;
eq1= a*x+a*y == somevalue; %this is your first equation
eq2= b*x+b*y == someothervalue; %this is your 2nd equation
cond1= x<5; %this is a condition which matlab sees as an "equation"
eqs=[eq1,eq2,cond1]; %these are the equations and conditions you want to solve for, use this for solve
eqs=[eq1,eq2]; %use this for vpasolve and set your condition in range
vars=[x,y,b]; %these are the variable you want to solve for
range = [-Inf 5; NaN NaN; NaN NaN]; %NaN means you set no range
%you can use solve or vpasolve, second one being numeric, which is the one you'll probably want
n=5;
sol=zeros(n,numel(vars));
for i = 1:n
temp1 = vpasolve(eqs, vars, range, 'random', true);
temp = vpasolve(eqs, vars, 'random', true);
sol(i,1) = temp.x;
sol(i,2) = temp.y;
sol(i,3) = temp.b;
end
sol
Now when I run this myself I can't get the range to properly work for some reason, still trying to figure that out. When you don't set a range it works just fine, if you can use the solve function then there also isn't a problem.
In theory the range function should work fine like this so it might be a bug on my end.
If you use solve you have some nice options where you can use assume to set extra conditions that are a bit more advanced, like only checking for real solutions or only integers, etc.
This question is related to these two:
Introduction to vectorizing in MATLAB - any good tutorials?
filter that uses elements from two arrays at the same time
Basing on the tutorials I read, I was trying to vectorize some procedure that takes really a lot of time.
I've rewritten this:
function B = bfltGray(A,w,sigma_r)
dim = size(A);
B = zeros(dim);
for i = 1:dim(1)
for j = 1:dim(2)
% Extract local region.
iMin = max(i-w,1);
iMax = min(i+w,dim(1));
jMin = max(j-w,1);
jMax = min(j+w,dim(2));
I = A(iMin:iMax,jMin:jMax);
% Compute Gaussian intensity weights.
F = exp(-0.5*(abs(I-A(i,j))/sigma_r).^2);
B(i,j) = sum(F(:).*I(:))/sum(F(:));
end
end
into this:
function B = rngVect(A, w, sigma)
W = 2*w+1;
I = padarray(A, [w,w],'symmetric');
I = im2col(I, [W,W]);
H = exp(-0.5*(abs(I-repmat(A(:)', size(I,1),1))/sigma).^2);
B = reshape(sum(H.*I,1)./sum(H,1), size(A, 1), []);
Where
A is a matrix 512x512
w is half of the window size, usually equal 5
sigma is a parameter in range [0 1] (usually one of: 0.1, 0.2 or 0.3)
So the I matrix would have 512x512x121 = 31719424 elements
But this version seems to be as slow as the first one, but in addition it uses a lot of memory and sometimes causes memory problems.
I suppose I've made something wrong. Probably some logic mistake regarding vectorizing. Well, in fact I'm not surprised - this method creates really big matrices and probably the computations are proportionally longer.
I have also tried to write it using nlfilter (similar to the second solution given by Jonas) but it seems to be hard since I use Matlab 6.5 (R13) (there are no sophisticated function handles available).
So once again, I'm asking not for ready solution, but for some ideas that would help me to solve this in reasonable time. Maybe you will point me what I did wrong.
Edit:
As Mikhail suggested, the results of profiling are as follows:
65% of time was spent in the line H= exp(...)
25% of time was used by im2col
How big are I and H (i.e. numel(I)*8 bytes)? If you start paging, then the performance of your second solution is going to be affected very badly.
To test whether you really have a problem due to too large arrays, you can try and measure the speed of the calculation using tic and toc for arrays A of increasing size. If the execution time increases faster than by the square of the size of A, or if the execution time jumps at some size of A, you can try and split the padded I into a number of sub-arrays and perform the calculations like that.
Otherwise, I don't see any obvious places where you could be losing lots of time. Well, maybe you could skip the reshape, by replacing B with A in your function (saves a little memory as well), and writing
A(:) = sum(H.*I,1)./sum(H,1);
You may also want to look into upgrading to a more recent version of Matlab - they've worked hard on improving performance.