How to calculate Imbalance Accuracy Metric in multi-class classification

I am sorry to bother you, but I found an interesting article "Mortaz, E. (2020). Imbalance accuracy metric for model selection in multi-class imbalance classification problems. Knowledge-Based Systems, 210, 106490" (https://www.sciencedirect.com/science/article/pii/S0950705120306195) where they calculate this measure (IAM) (the formula is in the paper, and I understood it), but I would like to ask: how can I replicate it in R?
I apologise in advance for the dumb question. Thank you for your attention!

The IAM formula provided in the article is:

IAM = (1/k) * sum_{i=1..k} [ c_ii - max( sum_{j != i} c_ij , sum_{j != i} c_ji ) ] / max( sum_j c_ij , sum_j c_ji )

where c_ij is the element (i, j) in the classifier's confusion matrix c, and k is the number of classes in the classification (k >= 2). It is shown that this metric can be used as a solo metric in multi-class model selection.
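As a small worked example (my own, not from the paper): for the 2-class confusion matrix [[8, 2], [3, 7]], the class-1 term is (8 - max(2, 3)) / max(10, 11) = 5/11 ≈ 0.45 and the class-2 term is (7 - max(3, 2)) / max(10, 9) = 4/10 = 0.40, so IAM ≈ (0.45 + 0.40) / 2 ≈ 0.43.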
The code to implement the IAM (Imbalance Accuracy Metric) in Python is provided below:
def IAM(c):
    '''
    c is a nested list presenting the confusion matrix of the classifier (len(c) >= 2)
    '''
    l = len(c)
    iam = 0
    for i in range(l):
        sum_row = 0
        sum_col = 0
        sum_row_no_i = 0
        sum_col_no_i = 0
        for j in range(l):
            sum_row += c[i][j]
            sum_col += c[j][i]
            if j != i:
                sum_row_no_i += c[i][j]
                sum_col_no_i += c[j][i]
        iam += (c[i][i] - max(sum_row_no_i, sum_col_no_i)) / max(sum_row, sum_col)
    return iam / l
c = [[2129, 52, 0, 1],
     [499, 70, 0, 2],
     [46, 16, 0, 1],
     [85, 18, 0, 7]]

IAM(c)  # -0.5210576475801445
The code to implement the IAM (Imbalance Accuracy Metric) in R is provided below:
IAM <- function(c) {
  # c is a matrix representing the confusion matrix of the classifier.
  l <- nrow(c)
  result <- 0
  for (i in 1:l) {
    sum_row <- 0
    sum_col <- 0
    sum_row_no_i <- 0
    sum_col_no_i <- 0
    for (j in 1:l) {
      sum_row <- sum_row + c[i, j]
      sum_col <- sum_col + c[j, i]
      if (i != j) {
        sum_row_no_i <- sum_row_no_i + c[i, j]
        sum_col_no_i <- sum_col_no_i + c[j, i]
      }
    }
    result <- result + (c[i, i] - max(sum_row_no_i, sum_col_no_i)) / max(sum_row, sum_col)
  }
  return(result / l)
}
c <- matrix(c(2129, 52, 0, 1, 499, 70, 0, 2, 46, 16, 0, 1, 85, 18, 0, 7),
            nrow = 4, ncol = 4, byrow = TRUE)  # byrow = TRUE so the rows match the Python example
IAM(c)  # -0.5210576475801445
Another example, using the iris dataset (a 3-class problem) and sklearn:
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter = 1000).fit(X, y)
pred = clf.predict(X)
c = confusion_matrix(y, pred)
print('confusion matrix:')
print(c)
print(f'accuracy : {clf.score(X, y)}')
print(f'IAM : {IAM(c)}')
confusion matrix:
[[50 0 0]
[ 0 47 3]
[ 0 1 49]]
accuracy : 0.97
IAM : 0.92

Related

Is there a way to derive an implicit ODE function from a big symbolic expression?

I'm Davide and I have a problem with the derivation of a function that should later be given to ode15i in Matlab.
Basically, I've derived a big symbolic expression that describes the motion of a spacecraft with a flexible appendage (like a solar panel). My goal is to obtain a function handle that can be integrated using the built-in implicit solver in Matlab (ode15i).
The problem I've encountered is the slowness of the symbolic computations, especially in the function "daeFunction" (I ran it and lost any hope of a response after 3/4 hours had passed).
The system of equations, derived using Lagrange's method, is an implicit ODE.
The complex nature of the system arises from the flexibility modelling of the solar panel.
I am open to any suggestions that would help me in:
running the code properly.
running it as efficiently as possible.
Thanks in advance.
Hereafter I copy the code. Note: Matlab R2021a was used.
close all
clear
clc
syms t
syms r(t) [3 1]
syms angle(t) [3 1]
syms delta(t)
syms beta(t) [3 1]
mu = 3.986e14;
mc = 1600;
mi = 10;
k = 10;
kt = 10;
Ii = [1 0 0 % for the first link it is different, thus I should do a function or something that writes everything into an array or a vector
0 5 0
0 0 5];
% Dimension of satellite
a = 1;
b = 1.3;
c = 1;
Ic = 1/12*mc* [b^2+c^2 0 0
0 c^2+a^2 0
0 0 a^2+b^2];
ra_c = [0 1 0]';
a = diff(r,t,t);
ddelta = diff(delta,t);
dbeta = diff(beta,t);
dddelta = diff(delta,t,t);
ddbeta = diff(beta,t,t);
R= [cos(angle1).*cos(angle3)-cos(angle2).*sin(angle1).*sin(angle3) sin(angle1).*cos(angle3)+cos(angle2).*cos(angle1).*sin(angle3) sin(angle2).*sin(angle3)
-cos(angle1).*sin(angle3)-cos(angle2).*sin(angle1).*cos(angle3) -sin(angle1).*sin(angle3)+cos(angle2).*cos(angle1).*cos(angle3) sin(angle2).*cos(angle3)
sin(angle2).*sin(angle3) -sin(angle2).*cos(angle1) cos(angle2)];
d_angle1 = diff(angle1,t);
d_angle2 = diff(angle2,t);
d_angle3 = diff(angle3,t);
dd_angle1 = diff(angle1,t,t);
dd_angle2 = diff(angle2,t,t);
dd_angle3 = diff(angle3,t,t);
d_angle = [d_angle1;d_angle2;d_angle3];
dd_angle = [dd_angle1;dd_angle2;dd_angle3];
omega = [d_angle2.*cos(angle1)+d_angle3.*sin(angle2).*sin(angle1);d_angle2.*sin(angle1)-d_angle3.*sin(angle2).*cos(angle1);d_angle1+d_angle3.*cos(angle2)]; % this should describe correctly omega_oc
d_omega = diff(omega,t);
v1 = diff(r1,t);
v2 = diff(r2,t);
v3 = diff(r3,t);
v = [v1; v2; v3];
[J,r_cgi,R_ci]= Jacobian_Rob(4,delta,beta);
% Perform matrix multiplication
for mm = 1:4
vel(:,mm) = J(:,:,mm)*[ddelta;dbeta];
end
vel = formula(vel);
dr_Ccgi = vel(1:3,:);
omega_ci = vel(4:6,:);
assumeAlso(angle(t),'real');
assumeAlso(d_angle(t),'real');
assumeAlso(dd_angle(t),'real');
assumeAlso(r(t),'real');
assumeAlso(a(t),'real');
assumeAlso(v(t),'real');
assumeAlso(beta(t),'real');
assumeAlso(delta(t),'real');
assumeAlso(dbeta(t),'real');
assumeAlso(ddelta(t),'real');
assumeAlso(ddbeta(t),'real');
assumeAlso(dddelta(t),'real');
omega = formula(omega);
Tc = 1/2*v'*mc*v+1/2*omega'*R*Ic*R'*omega;
% kinetic energy of all appendices
for h = 1:4
Ti(h) = 1/2*v'*mi*v+mi*v'*skew(omega)*R*ra_c+mi*v'*skew(omega)*R*r_cgi(:,h)+mi*v'*R*dr_Ccgi(:,h)+1/2*mi*ra_c'*R'*skew(omega)'*skew(omega)*R*ra_c ...
+ mi*ra_c'*R'*skew(omega)'*skew(omega)*R*r_cgi(:,h)+mi*ra_c'*R'*skew(omega)'*R*dr_Ccgi(:,h)+1/2*omega'*R*R_ci(:,:,h)*Ii*(R*R_ci(:,:,h))'*omega ...
+ omega'*R*R_ci(:,:,h)*Ii*R_ci(:,:,h)'*omega_ci(:,h)+1/2*omega_ci(:,h)'*R_ci(:,:,h)*Ii*R_ci(:,:,h)'*omega_ci(:,h)+1/2*mi*r_cgi(:,h)'*R'*skew(omega)'*skew(omega)*R*r_cgi(:,h)+mi*r_cgi(:,h)'*R'*skew(omega)'*R*dr_Ccgi(:,h)...
+ 1/2*mi*dr_Ccgi(:,h)'*dr_Ccgi(:,h);
Ugi(h) = -mu*mi/norm(r,2)+mu*mi*r'/(norm(r,2)^3)*(R*ra_c+R*R_ci(:,:,h)*r_cgi(:,h));
end
Ugc = -mu*mc/norm(r,2);
Ue = 1/2*kt*(delta)^2+sum(1/2*k*(beta).^2);
U = Ugc+sum(Ugi)+Ue;
L = Tc + sum(Ti) - U;
D = 1/2 *100* (ddelta^2+sum(dbeta.^2));
%% Equation of motion derivation
eq = [diff(jacobian(L,v),t)'-jacobian(L,r)';
diff(jacobian(L,d_angle),t)'-jacobian(L,angle)';
diff(jacobian(L,ddelta),t)'-jacobian(L,delta)'+jacobian(D,ddelta)';
diff(jacobian(L,dbeta),t)'-jacobian(L,beta)'+jacobian(D,dbeta)'];
%% Reduction to first order sys
[sys,newVars,R1]=reduceDifferentialOrder(eq,[r(t); angle(t); delta(t); beta(t)]);
DAEs = sys;
DAEvars = newVars;
%% ode15i implicit solver
pDAEs = symvar(DAEs);
pDAEvars = symvar(DAEvars);
extraParams = setdiff(pDAEs,pDAEvars);
f = daeFunction(DAEs,DAEvars,'File','ProvaSum');
y0est = [6778e3 0 0 0.01 0.1 0.3 0 0.12 0 0 0 7400 0 0 0 0 0 0 0 0]';
yp0est = zeros(20,1);
opt = odeset('RelTol', 10.0^(-7),'AbsTol',10.0^(-7),'Stats', 'on');
[y0,yp0] = decic(f,0,y0est,[],yp0est,[],opt);
% Integration
[tSol,ySol] = ode15i(f,[0 0.5],y0,yp0,opt);
%% Functions
function [J,p_cgi,R_ci]=Jacobian_Rob(N,delta,beta)
% Function to compute the Jacobian; see Robotics by Siciliano
% N: total number of links
% delta [1x1], beta [N-1x1]: variables that describe the position of the solar
% panel elements
beta = formula(beta);
L_link = [1 1 1 1]'; % Length of each link element in [m]; later to be derived from a file or passed as a function input
for I = 1 : N
A1 = Homog_Matrix(I,delta,beta);
A1 = formula(A1);
R_ci(:,:,I) = A1(1:3,1:3);
if I ~= 1
p_cgi(:,I) = A1(1:3,4) + A1(1:3,1:3)*[1 0 0]'*L_link(I)/2;
else
p_cgi(:,I) = A1(1:3,4) + A1(1:3,1:3)*[0 0 1]'*L_link(I)/2;
end
for j = 1:I
A_j = formula(Homog_Matrix(j,delta,beta));
z_j = A_j(1:3,3);
Jp(:,j) = skew(z_j)*(p_cgi(:,I)-A_j(1:3,4));
Jo(:,j) = z_j;
end
if N-I > 0
Jp(:,I+1:N) = zeros(3,N-I);
Jo(:,I+1:N) = zeros(3,N-I);
end
J(:,:,I)= [Jp;Jo];
end
J = formula(J);
p_cgi = formula(p_cgi);
R_ci = formula(R_ci);
end
function [A_CJ]=Homog_Matrix(J,delta,beta)
% This function is made specifically for the solar panel
% define basic rotation matrices
Rx = @(angle) [1 0 0
0 cos(angle) -sin(angle)
0 sin(angle) cos(angle)];
Ry = @(angle) [ cos(angle) 0 sin(angle)
0 1 0
-sin(angle) 0 cos(angle)];
Rz = @(angle) [cos(angle) -sin(angle) 0
sin(angle) cos(angle) 0
0 0 1];
if isa(beta,"sym")
beta = formula(beta);
end
L_link = [1 1 1 1]'; % Length of each link element in [m]; later to be derived from a file or passed as a function input
% Rotation matrix how C sees B
R_CB = Rz(-pi/2)*Ry(-pi/2); % Clarify notation: R_CB represents the rotation matrix that describes how frame B is seen from C
% it is the same as if it were written R_B2C,
% because it brings a vector written in B to the C
% frame --> p_C = R_CB * p_B
% (same convention used in Siciliano: how C sees the B frame)
A_AB = [R_CB zeros(3,1)
zeros(1,3) 1];
A_B1 = [Rz(delta) zeros(3,1)
zeros(1,3) 1];
A_12 = [Ry(-pi/2)*Rx(-pi/2)*Rz(beta(1)) L_link(1)*[0 0 1]'
zeros(1,3) 1];
if J == 1
A_CJ = A_AB*A_B1;
elseif J == 0
A_CJ = A_AB;
else
A_CJ = A_AB*A_B1*A_12;
end
for j = 3:J
A_Jm1J = [Rz(beta(j-1)) L_link(j-1)*[1 0 0]'
zeros(1,3) 1];
A_CJ = A_CJ*A_Jm1J;
end
end
function [S]=skew(r)
S = [ 0 -r(3) r(2); r(3) 0 -r(1); -r(2) r(1) 0];
end
I found your question interesting. My suggestion is to handle the problem numerically: symbolic manipulation in Matlab is powerful, but it is much slower than numerical calculation. You can easily rewrite the ODE as a system of first-order ODEs and solve them using numerical integration functions like ode45, as in the sketch below. Your code is very lengthy and I couldn't manage to follow its details.
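For example, a single second-order equation can be written in first-order form with the state y = [x; x']. A minimal sketch with an illustrative damped oscillator (the parameters are made up and unrelated to the spacecraft model):
% Rewrite x'' = -d*x' - k*x as the first-order system y' = [y(2); -d*y(2) - k*y(1)]
damping = 0.5; stiffness = 10;                           % illustrative values
odefun = @(t, y) [y(2); -damping*y(2) - stiffness*y(1)];
[tSol, ySol] = ode45(odefun, [0 10], [1; 0]);            % initial position 1, zero velocity
plot(tSol, ySol(:,1))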
All the Best.
Yasien

Implementing a Gaussian blur kernel in MATLAB from scratch

I recently implemented a box average filter in MATLAB. I found it to be a fun exercise. I wanted to do the same thing with a Gaussian blur filter, so as to eventually solve some super-resolution problems. But I have found this much more challenging. There are several questions on here about such a topic. All of the answers in those topics have been very helpful. However, I have not been able to create the kernel successfully. Currently my code simply implements an averaging filter. Wiki page for Gaussian blur: https://en.wikipedia.org/wiki/Gaussian_blur
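For reference, the 2D kernel from that Wikipedia page is G(x, y) = (1/(2*pi*sigma^2)) * exp(-(x^2 + y^2)/(2*sigma^2)); that weight, evaluated at each neighbour offset (x, y), is what I have been trying to compute.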
Here is my code:
function out_im = GaussianBlur(in_im, sigma, N) % input image, sigma, number of iterations
    [rows, cols] = size(in_im);
    orig_im = in_im;
    Filter = zeros(rows, cols);
    w2 = 0;
    for k = 1 : N
        for LR_ROWS = 2 : 1 : rows - 1
            for LR_COLS = 2 : 1 : cols - 1
                for i = -1 : 1
                    for j = -1 : 1
                        w1 = exp(-(orig_im(LR_ROWS, LR_COLS).^2/(2*sigma^2)));
                        w2 = w2 + w1;
                        Filter(LR_ROWS, LR_COLS) = Filter(LR_ROWS, LR_COLS) + ...
                            w1*orig_im(LR_ROWS+i, LR_COLS+j);
                    end
                end
                Filter(LR_ROWS, LR_COLS) = Filter(LR_ROWS, LR_COLS)./w2;
                w2 = 0;
            end
        end
        orig_im = Filter;
        Filter = zeros(rows, cols);
    end
    out_im = orig_im;
end
Any suggestions or advice is greatly appreciated! Thank you.

Fast CVX solvers in Matlab

I am wondering what the fastest convex optimizer in Matlab is, or whether there is any way to speed up the current solvers. I'm using CVX, but it's taking forever to solve the optimization problem I have.
The optimization I have is to solve
minimize norm(Ax-b, 2)
subject to
x >= 0
and x.d <= delta
where the sizes of A and b are very large.
Is there any way that I can solve this with a least-squares solver and then transfer it to the constrained version to make it faster?
I'm not sure what x.d <= delta means, but I'll just assume it's supposed to be x <= delta.
You can solve this problem using the projected gradient method or an accelerated projected gradient method (which is just a slight modification of the projected gradient method that "magically" converges much faster). Here is some Python code that shows how to minimize 0.5*||Ax - b||^2 subject to the constraint 0 <= x <= delta using FISTA, which is an accelerated projected gradient method. More details about the projected gradient method and FISTA can be found, for example, in Boyd's manuscript on proximal algorithms.
import numpy as np
import matplotlib.pyplot as plt
def fista(gradf, proxg, evalf, evalg, x0, params):
    # This code does FISTA with line search
    maxIter = params['maxIter']
    t = params['stepSize']  # Initial step size
    showTrigger = params['showTrigger']
    increaseFactor = 1.25
    decreaseFactor = .5
    costs = np.zeros((maxIter, 1))
    xkm1 = np.copy(x0)
    vkm1 = np.copy(x0)
    for k in range(1, maxIter + 1):
        costs[k-1] = evalf(xkm1) + evalg(xkm1)
        if k % showTrigger == 0:
            print("Iteration: " + str(k) + " cost: " + str(costs[k-1]))
        t = increaseFactor*t
        acceptFlag = False
        while acceptFlag == False:
            if k == 1:
                theta = 1
            else:
                a = tkm1
                b = t*(thetakm1**2)
                c = -t*(thetakm1**2)
                theta = (-b + np.sqrt(b**2 - 4*a*c))/(2*a)
            y = (1 - theta)*xkm1 + theta*vkm1
            (gradf_y, fy) = gradf(y)
            x = proxg(y - t*gradf_y, t)
            fx = evalf(x)
            if fx <= fy + np.vdot(gradf_y, x - y) + (.5/t)*np.sum((x - y)**2):
                acceptFlag = True
            else:
                t = decreaseFactor*t
        tkm1 = t
        thetakm1 = theta
        vkm1 = xkm1 + (1/theta)*(x - xkm1)
        xkm1 = x
    return (xkm1, costs)
if __name__ == '__main__':
    delta = 5.0
    numRows = 300
    numCols = 50
    A = np.random.randn(numRows, numCols)
    ATrans = np.transpose(A)
    xTrue = delta*np.random.rand(numCols, 1)
    b = np.dot(A, xTrue)
    noise = .1*np.random.randn(numRows, 1)
    b = b + noise

    def evalf(x):
        AxMinusb = np.dot(A, x) - b
        val = .5 * np.sum(AxMinusb ** 2)
        return val

    def gradf(x):
        AxMinusb = np.dot(A, x) - b
        grad = np.dot(ATrans, AxMinusb)
        val = .5 * np.sum(AxMinusb ** 2)
        return (grad, val)

    def evalg(x):
        return 0.0

    def proxg(x, t):
        return np.maximum(np.minimum(x, delta), 0.0)

    x0 = np.zeros((numCols, 1))
    params = {'maxIter': 500, 'stepSize': 1.0, 'showTrigger': 5}
    (x, costs) = fista(gradf, proxg, evalf, evalg, x0, params)

    plt.figure()
    plt.plot(x)
    plt.plot(xTrue)
    plt.figure()
    plt.semilogy(costs)
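For the Matlab side of the question, the same idea without the acceleration (a plain projected-gradient iteration with a fixed step size) can be sketched in a few lines; the data sizes mirror the Python example above and are illustrative only:
rng(0);                                  % illustrative random data
numRows = 300; numCols = 50; delta = 5.0;
A = randn(numRows, numCols);
xTrue = delta*rand(numCols, 1);
b = A*xTrue + 0.1*randn(numRows, 1);
x = zeros(numCols, 1);
t = 1/norm(A)^2;                         % fixed step size <= 1/L, with L = ||A||_2^2
for k = 1:500
    g = A'*(A*x - b);                    % gradient of 0.5*||A*x - b||^2
    x = min(max(x - t*g, 0), delta);     % project onto the box 0 <= x <= delta
end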

Neural network code explanation

The following is an implementation of a simple Perceptron supplied in a blog.
input = [0 0; 0 1; 1 0; 1 1];
numIn = 4;
desired_out = [0; 1; 1; 1];
bias = -1;
coeff = 0.7;
rand('state', sum(100*clock));
weights = -1*2.*rand(3,1);
iterations = 10;
for i = 1:iterations
    out = zeros(4,1);
    for j = 1:numIn
        y = bias*weights(1,1) + ...
            input(j,1)*weights(2,1) + input(j,2)*weights(3,1);
        out(j) = 1/(1+exp(-y));
        delta = desired_out(j) - out(j);
        weights(1,1) = weights(1,1) + coeff*bias*delta;
        weights(2,1) = weights(2,1) + coeff*input(j,1)*delta;
        weights(3,1) = weights(3,1) + coeff*input(j,2)*delta;
    end
end
I have the following questions:
(1) which one is training data here?
(2) which one is test data here?
(3) which are the labels here?
The training data is [0 0; 0 1; 1 0; 1 1]; in other words, every row is one set of training data, as follows:
>> input
input =
0 0
0 1
1 0
1 1
and target is
desired_out =
0
1
1
1
Think of desired_out: these are your labels. Every row in the training data (input) has a specific output (label) in the binary set {0, 1}, because this example implements an OR logic gate.
In Matlab you can use the or function as below for further understanding:
>> or(0,0)
ans =
0
>> or(1,0)
ans =
1
>> or(0,1)
ans =
1
>> or(1,1)
ans =
1
Note that your code does not have any test phase; it just tries to learn the weights and the other parameters of the perceptron. However, you can add a test phase to your code with a small program:
NumDataTest = 10;
test = randi([0, 1], [NumDataTest, 2]) ...
    + (2*rand(NumDataTest, 2) - 1)/20;
so the test data will look similar to this:
test =
1.0048 1.0197
0.0417 0.9864
-0.0180 1.0358
1.0052 1.0168
1.0463 0.9881
0.9787 0.0367
0.9624 -0.0239
0.0065 0.0404
1.0085 -0.0109
-0.0264 0.0429
To test this data you can use your own program with the code below:
for i = 1:NumDataTest
    y = bias*weights(1,1) + test(i,1)*weights(2,1) + test(i,2)*weights(3,1);
    out(i) = 1/(1+exp(-y));
end
and finally:
table(test(:,1),test(:,2),out,'VariableNames',{'input1' 'input2' 'output'})
output is
input1 input2 output
_________ _________ ________
1.0048 1.0197 0.99994
0.041677 0.98637 0.97668
-0.017968 1.0358 0.97527
1.0052 1.0168 0.99994
1.0463 0.98814 0.99995
0.97875 0.036674 0.9741
0.96238 -0.023861 0.95926
0.0064527 0.040392 0.095577
1.0085 -0.010895 0.97118
-0.026367 0.042854 0.080808
Code section:
clc
clear
input = [0 0; 0 1; 1 0; 1 1];
numIn = 4;
desired_out = [0; 1; 1; 1];
bias = -1;
coeff = 0.7;
rand('state', sum(100*clock));
weights = -1*2.*rand(3,1);
iterations = 100;
for i = 1:iterations
    out = zeros(4,1);
    for j = 1:numIn
        y = bias*weights(1,1) + input(j,1)*weights(2,1) + input(j,2)*weights(3,1);
        out(j) = 1/(1+exp(-y));
        delta = desired_out(j) - out(j);
        weights(1,1) = weights(1,1) + coeff*bias*delta;
        weights(2,1) = weights(2,1) + coeff*input(j,1)*delta;
        weights(3,1) = weights(3,1) + coeff*input(j,2)*delta;
    end
end
%% Test Section
NumDataTest = 10;
test = randi([0, 1], [NumDataTest, 2]) ...
    + (2*rand(NumDataTest, 2) - 1)/20;
for i = 1:NumDataTest
    y = bias*weights(1,1) + test(i,1)*weights(2,1) + test(i,2)*weights(3,1);
    out(i) = 1/(1+exp(-y));
end
table(test(:,1), test(:,2), out, 'VariableNames', {'input1' 'input2' 'output'})
I hope this helps and sorry for my English if it's bad

Abnormal behavior of an algorithm implemented in Matlab depending on the input

I'm doing a homework assignment for scientific computing, specifically the iterative methods Gauss-Seidel and SOR in Matlab. The problem is that for one matrix I get unexpected results (the solution does not converge), while for another matrix it converges.
Here's the code for sor, where:
A: matrix of the system A * x = b
xIni: vector with the initial iterate
b: independent (right-hand side) vector of the system A * x = b
maxIter: maximum number of iterations
tol: tolerance
In particular, the SOR method receives a sixth parameter called w, which corresponds to the relaxation parameter.
Here's the code for the SOR method:
function [x2, iter] = sor(A, xIni, b, maxIter, tol, w)
    x1 = xIni;
    x2 = x1;
    iter = 0;
    i = 0;
    j = 0;
    n = size(A, 1);
    for iter = 1:maxIter
        for i = 1:n
            a = w / A(i,i);
            x = 0;
            for j = 1:i-1
                x = x + (A(i,j) * x2(j));
            end
            for j = i+1:n
                x = x + (A(i,j) * x1(j));
            end
            x2(i) = (a * (b(i) - x)) + ((1 - w) * x1(i));
        end
        x1 = x2;
        if (norm(b - A * x2) < tol)
            break;
        end
    end
Here's the code for the Gauss-Seidel method:
function [x, iter] = Gauss(A, xIni, b, maxIter, tol)
    x = xIni;
    xnew = x;
    iter = 0;
    i = 0;
    j = 0;
    n = size(A, 1);
    for iter = 1:maxIter
        for i = 1:n
            a = 1 / A(i,i);
            x1 = 0;
            x2 = 0;
            for j = 1:i-1
                x1 = x1 + (A(i,j) * xnew(j));
            end
            for j = i+1:n
                x2 = x2 + (A(i,j) * x(j));
            end
            xnew(i) = a * (b(i) - x1 - x2);
        end
        x = xnew;
        if (norm(A*xnew - b) <= tol)
            break;
        end
    end
For this input:
A = [1 2 -2; 1 1 1; 2 2 1];
b = [1; 2; 5];
when calling the function Gauss-Seidel or sor:
[x, iter] = gauss(A, [0; 0; 0], b, 1000, eps)
[x, iter] = sor(A, [0; 0; 0], b, 1000, eps, 1.5)
the output for gauss is:
x =
1.0e+304 *
1.6024
-1.6030
0.0011
iter =
1000
and for sor is:
x =
NaN
NaN
NaN
iter =
1000
However, for the following system it is able to find the solution:
A = [ 4 -1 0 -1 0 0;
-1 4 -1 0 -1 0;
0 -1 4 0 0 -1;
-1 0 0 4 -1 0;
0 -1 0 -1 4 -1;
0 0 -1 0 -1 4 ]
b = [1 0 0 0 0 0]'
Solution:
[x, iter] = sor(A, zeros(6,1), b, 1000, eps, 1.5)
x =
0.2948
0.0932
0.0282
0.0861
0.0497
0.0195
iter =
52
Does the behavior of the methods depend on the conditioning of the matrices? I noticed that the second matrix is better conditioned than the first. Any suggestions?
From the wiki article on Gauss-Seidel:
convergence is only guaranteed if the matrix is either diagonally dominant, or symmetric and positive definite
Since SOR is similar to Gauss-Seidel, I expect the same conditions to hold for SOR, but you might want to look that one up.
Your first matrix is definitely not diagonally dominant or symmetric. Your second matrix, however, is symmetric and positive definite (because all(A==A.') and all(eig(A)>0)); a quick check is sketched below.
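A quick sketch of those checks on the two matrices from the question (a rough test; strict diagonal dominance would use a strict inequality):
A1 = [1 2 -2; 1 1 1; 2 2 1];
all(2*abs(diag(A1)) >= sum(abs(A1), 2))   % diagonal dominance check: false
isequal(A1, A1.') && all(eig(A1) > 0)     % symmetric positive definite check: false
A2 = [4 -1 0 -1 0 0; -1 4 -1 0 -1 0; 0 -1 4 0 0 -1;
      -1 0 0 4 -1 0; 0 -1 0 -1 4 -1; 0 0 -1 0 -1 4];
all(2*abs(diag(A2)) >= sum(abs(A2), 2))   % true
isequal(A2, A2.') && all(eig(A2) > 0)     % true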
If you use Matlab's default method (A\b) as the "real" solution and plot the norm of the difference between each iteration and that "real" solution, it becomes obvious that the first matrix is never going to converge, while the second matrix already produces acceptable results after a few iterations.
Always get to know the limitations of your algorithms before applying them in the wild :)