I follow the formula on wiki:
http://en.wikipedia.org/wiki/Pseudo_inverse
to compute pseudoinverse but i can not receive the right result. for example:
I want to find theta of the equation: Y=R*theta, i write on matlab:
R = -[1/sqrt(2) 1 1/sqrt(2) 0;0 1/sqrt(2) 1 1/sqrt(2);-1/sqrt(2) 0 1/sqrt(2) 1];
% R is 3x4 matrix
Y = [0; -1/sqrt(2);-1]; %Y is 3x1 matrix
B1 = pinv(R);
theta1 = B1*Y;
result1 = R*theta1 - Y
B2 = R'*inv(R*R');
theta2 = B2*Y;
result2 = R*theta2 - Y
and this is the result:
result1 =
1.0e-15 *
-0.1110
-0.2220
-0.2220
Warning: Matrix is close to singular or badly scaled.
Results may be inaccurate. RCOND = 1.904842e-17.
> In pseudoinverse at 14
result2 =
0.1036
-0.1768
-0.3536
Cleary, theta2 is the wrong answer, but i don't know how and why. I read many book and they give me the same formula as wiki.
Can anybody help me to do pseudo inverse by hand ? thanks !
The algebra tells you that a pseudo-inverse can be used to solve such equations, but the algebra isn't accounting for finite precision computation.
In fact computation of a pseudo-inverse using the matrix multiplication method is not suitable because it is numerically unstable. Use the \ operator for matrix division, as in
theta = R \ Y;
Algebraically, matrix division is the same as multiplication by pseudo-inverse. But MATLAB's implementation is far more stable.
For more information, including stable methods, see
Moore–Penrose pseudoinverse on Wikipedia
As Ben has already said, matrix inversion is numerically unstable. The function inv is not recommended unless you want to have the actual inversion of a matrix, see for example this link. The misuse of inv is the mistake a new student of numerical linear algebra most often makes.
In most linear algebra computations, you can circumvent inv by using a numerically-stable algorithm. For example, we have LU factorization for linear solvers, and QR or SVD method for ordinary least squares.
You're not calculating B1 correctly. In your case
B1 = inv(R'*R)*R';
Because R is leading (traditionally it is the other way around). However, that doesn't solve your singularity problem.
pinv used SVD to calculate the pseudo-inverse, which you can read about here.
So basically it does in a more elegant fashion:
[U,S,V] = svd(R);
S(abs(S)<(sum(sum(S))*1e-8)) = 0; % removes near-singular values.
Stemp = 1./S;
Stemp(isinf(Stemp)) = 0; % This take the inverse of non-zero terms... I'm sure there is faster way
B1 = V * Stemp' * U';
And then you can continue on your way...
Related
I need to solve the linear system
A x = b
which can be done efficiently by
x = A \ b
But now A is very large and I actually only need one component, say x(1). Is there a way to solve this more efficiently than to compute all components of x?
A is not sparse. Here, efficiency is actually an issue because this is done for many b.
Also, storing the inverse of K and multiplying only its first row to b is not possible because K is badly conditioned. Using the \ operator employs the LDL solver in this case, and accuracy is lost when the inverse is explicitly used.
I don't think you'd technically get a speed-up over the very optimized Matlab routine however if you understand how it is solved then you can just solve for one part of x. E.g the following. in traditional solver you use backsub for QR solve for instance. In LU solve you use both back sub and front sub. I could get LU. Unfortunately, it actually starts at the end due to how it solves it. The same is true for LDL which would employ both. That doesn't preclude that fact there may be more efficient ways of solving whatever you have.
function [Q,R] = qrcgs(A)
%Classical Gram Schmidt for an m x n matrix
[m,n] = size(A);
% Generates the Q, R matrices
Q = zeros(m,n);
R = zeros(n,n);
for k = 1:n
% Assign the vector for normalization
w = A(:,k);
for j=1:k-1
% Gets R entries
R(j,k) = Q(:,j)'*w;
end
for j = 1:k-1
% Subtracts off orthogonal projections
w = w-R(j,k)*Q(:,j);
end
% Normalize
R(k,k) = norm(w);
Q(:,k) = w./R(k,k);
end
end
function x = backsub(R,b)
% Backsub for upper triangular matrix.
[m,n] = size(R);
p = min(m,n);
x = zeros(n,1);
for i=p:-1:1
% Look from bottom, assign to vector
r = b(i);
for j=(i+1):p
% Subtract off the difference
r = r-R(i,j)*x(j);
end
x(i) = r/R(i,i);
end
end
The method mldivide, generally represented as \ accepts solving many systems with the same A at once.
x = A\[b1 b2 b3 b4] # where bi are vectors with n rows
Solves the system for each b, and will return an nx4 matrix, where each column is the solution of each b. Calling mldivide like this should improve efficiency becaus the descomposition is only done once.
As in many decompositions like LU od LDL' (and in the one you are interested in particular) the matrix multiplying x is upper diagonal, the first value to be solved is x(n). However, having to do the LDL' decomposition, a simple backwards substitution algorithm won't be the bottleneck of the code. Therefore, the decomposition can be saved in order to avoid repeating the calculation for every bi. Thus, the code would look similar to this:
[LA,DA] = ldl(A);
DA = sparse(DA);
% LA = sparse(LA); %LA can also be converted to sparse matrix
% loop over bi
xi = LA'\(DA\(LA\bi));
% end loop
As you can see in the documentation of mldivide (Algorithms section), it performs some checks on the input matrixes, and having defined LA as full and DA as sparse, it should directly go for a triangular solver and a tridiagonal solver. If LA was converted to sparse, it would use a triangular solver too, and I don't know if the conversion to sparse would represent any improvement.
I'm trying to solve the following equation for the variable h:
F is the normal cumulative distribution function
f is the probability density function of chi-square distribution.
I want to solve it numerically using Matlab.
First I tried to solve the problem with a simpler version of function F and f.
Then, I tried to solve the problem as below:
n = 3; % n0 in the equation
k = 2;
P = 0.99; % P* in the equation
F = #(x,y,h) normcdf(h./sqrt((n-1)*(1./x+1./y)));
g1 = #(y,h)integral(#(x) F(x,y,h).*chi2pdf(x,n),0,Inf, 'ArrayValued', true);
g2 = #(h) integral(#(y) (g1(y,h).^(k-1)).*chi2pdf(y,n),0,Inf)-P;
bsolve = fzero(g2,0)
I obtained an answer of 5.1496. The problem is that this answer seems wrong.
There is a paper which provides a table of h values obtained by solving the above equation. This paper was published in 1984, when the computer power was not good enough to solve the equation quickly. They solved it with various values:
n0 = 3:20, 30:10:50
k = 2:10
P* = 0.9, 0.95 and 0.99
They provide the value of h in each case.
Now I'm trying to solve this equation and get the value of h with different values of n0, k and P* (for example n0=25, k=12 and P*=0.975), where the paper does not provide the value of h.
The Matlab code above gives me different values of h than the paper.
For example:
n0=3, k=2 and P*=0.99: my code gives h = 5.1496, the paper gives h = 10.276.
n0=10, k=4 and P*=0.9: my code gives h = 2.7262, the paper gives h = 2.912.
The author says
The integrals were evaluated using 32 point Gauss-Laguerre numerical quadrature. This was accomplished by evaluating the inner integral at the appropriate 32 values of y which were available from the IBM subroutine QA32. The inner integral was also evaluated using 32 point Gauss-Laguerre numerical quadrature. Each of the 32 values of the inner integral was raised to the k-1 power and multiplied by the appropriate constant.
Matlab seems to use the same method of quadrature:
q = integral(fun,xmin,xmax) numerically integrates function fun from xmin to xmax using global adaptive quadrature and default error tolerances.
Does any one have any idea which results are correct? Either I have made some mistake(s) in the code, or the paper could be wrong - possibly because the author used only 32 values in estimating the integrals?
This does work, but the chi-squared distribution has (n-1) degrees of freedom, not n as in the code posted. If that's fixed then the Rinott's constant values agree with literature. Or at least, the literature that isn't behind a paywall. Can't speak to the 1984 Wilcox.
I have an unconstrained quadratic optimization problem- I have to find u that will minimize norm (u^H * A_k *u - d_k)) where A_k is 4x4 matrix (k=1,2...180) and u is 4x1 vector(H denotes hermitian).
There are several functions in MATLAB to solve optimization problems but I am not able to figure out the method I need to use for my OP. I would highly appreciate if anyone provides me some hint or suggestion to solve this problem in MATLAB.
Math preliminaries:
If A is symmetric, positive definite, then the the matrix A defines a norm. This norm, called the A norm, is written ||x||_A = x'Ax. Your problem, minimize (over x) |x'*A*x - d| is equivalent to searching for a vector x whose 'A norm' is equal to d. Such a vector is trivial to find by simply scaling up or down any non-zero vector until it has the appropriate magnitude.
Pick arbitrary y (can't be all zero). eg. y = [1; zeros(n-1, 1)]; or y = rand(n, 1);
Solve for some scalar c such that x = c*y and x'Ax=d. Substituting we get c^2 y'Ay=d hence:.
Trivial procedure to find an answer:
y = [1; zeros(n-1)]; %Any arbitrary y will do as long as its not all zero
c = sqrt(d / (y'*A*y))
x = c * y`
x now solves your problem. No toolbox required!
options = optimoptions('fminunc','GradObj','on','TolFun',1e-9,'TolX',1e-9);
x = fminunc(#(x)qfun(x,A,d),zeros(4,1),options);
with
function [y,g]=qfun(x,A,d)
y=(x'*A*x-d);
g=2*y*(A+A')*x;
y=y*y;
Seems to work.
Linearly Non-Separable Binary Classification Problem
First of all, this program isn' t working correctly for RBF ( gaussianKernel() ) and I want to fix it.
It is a non-linear SVM Demo to illustrate classifying 2 class with hard margin application.
Problem is about 2 dimensional radial random distrubuted data.
I used Quadratic Programming Solver to compute Lagrange multipliers (alphas)
xn = input .* (output*[1 1]); % xiyi
phi = gaussianKernel(xn, sigma2); % Radial Basis Function
k = phi * phi'; % Symmetric Kernel Matrix For QP Solver
gamma = 1; % Adjusting the upper bound of alphas
f = -ones(2 * len, 1); % Coefficient of sum of alphas
Aeq = output'; % yi
beq = 0; % Sum(ai*yi) = 0
A = zeros(1, 2* len); % A * alpha <= b; There isn't like this term
b = 0; % There isn't like this term
lb = zeros(2 * len, 1); % Lower bound of alphas
ub = gamma * ones(2 * len, 1); % Upper bound of alphas
alphas = quadprog(k, f, A, b, Aeq, beq, lb, ub);
To solve this non linear classification problem, I wrote some kernel functions such as gaussian (RBF), homogenous and non-homogenous polynomial kernel functions.
For RBF, I implemented the function in the image below:
Using Tylor Series Expansion, it yields:
And, I seperated the Gaussian Kernel like this:
K(x, x') = phi(x)' * phi(x')
The implementation of this thought is:
function phi = gaussianKernel(x, Sigma2)
gamma = 1 / (2 * Sigma2);
featDim = 10; % Length of Tylor Series; Gaussian Kernel Converge 0 so It doesn't have to Be Inf Dimension
phi = []; % Kernel Output, The Dimension will be (#Sample) x (featDim*2)
for k = 0 : (featDim - 1)
% Gaussian Kernel Trick Using Tylor Series Expansion
phi = [phi, exp( -gamma .* (x(:, 1)).^2) * sqrt(gamma^2 * 2^k / factorial(k)) .* x(:, 1).^k, ...
exp( -gamma .* (x(:, 2)).^2) * sqrt(gamma^2 * 2^k / factorial(k)) .* x(:, 2).^k];
end
end
*** I think my RBF implementation is wrong, but I don' t know how to fix it. Please help me here.
Here is what I got as output:
where,
1) The first image : Samples of Classes
2) The second image : Marking The Support Vectors of Classes
3) The third image : Adding Random Test Data
4) The fourth image : Classification
Also, I implemented Homogenous Polinomial Kernel " K(x, x') = ( )^2 ", code is:
function phi = quadraticKernel(x)
% 2-Order Homogenous Polynomial Kernel
phi = [x(:, 1).^2, sqrt(2).*(x(:, 1).*x(:, 2)), x(:, 2).^2];
end
And I got surprisingly nice output:
To sum up, the program is working correctly with using homogenous polynomial kernel but when I use RBF, it isn' t working correctly, there is something wrong with RBF implementation.
If you know about RBF (Gaussian Kernel) please let me know how I can make it right..
Edit: If you have same issue, use RBF directly that defined above and dont separe it by phi.
Why do you want to compute phi for Gaussian Kernel? Phi will be infinite dimensional vector and you are bounding the terms in your taylor series to 10 when we don't even know whether 10 is enough to approximate the kernel values or not! Usually, the kernel is computed directly instead of getting phi (and the computing k). For example [1].
Does this mean we should never compute phi for Gaussian? Not really, no, but we have to be slightly smarter about it. There have been recent works [2,3] which show how to compute phi for Gaussian so that you can compute approximate kernel matrices while having just finite dimensional phi's. Here [4] I give the very simple code to generate the approximate kernel using the trick from the paper. However, in my experiments I needed to generate anywhere from 100 to 10000 dimensional phi's to be able to get a good approximation of the kernel (depending upon on the number of features the original input had as well as the rate at which the eigenvalues of the original matrix tapers off).
For the moment, just use code similar to [1] to generate the Gaussian kernel and then observe the result of SVM. Also, play around with the gamma parameter, a bad gamma parameter can result in really bad classification.
[1] https://github.com/ssamot/causality/blob/master/matlab-code/Code/mfunc/indep/HSIC/rbf_dot.m
[2] http://www.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf
[3] http://www.eecs.berkeley.edu/~brecht/papers/08.rah.rec.nips.pdf
[4] https://github.com/aruniyer/misc/blob/master/rks.m
Since Gaussian kernel is often referred as mapping to infinity dimensions, I always have faith in its capacity. The problem here maybe due to a bad parameter while keeping in mind grid search is always needed for SVM training. Thus I propose you could take a look at here where you could find some tricks for parameter tuning. Exponentially increasing sequence is usually used as candidates.
Given L and U LU decomposition and vector of constants b such that LU*x=b , is there any built in function which find the x ? Mean something like -
X = functionName(L,U,b)
Note that in both L and U we are dealing with triangular matrices which can be solved directly by forward and backward substitution without using the Gaussian elimination process.
Edit :
Solving this linear equation system should be according to the following steps -
1. define y - s.t Ux=y
2. solve Ly=b by forward substitution
3. solve Ux=y by backward substitution
4. return y
Edit 2 :
I found linalg::matlinsolveLU but I didn't try it cause I have too old version (R2010a) . Is it working for anyone ?
If you have:
A = rand(3);
b = rand(3,1);
then the solution to the system can be simply computed as:
x = A\b
Or if you already have an LU decomposition of A, then:
[L,U] = lu(A);
xx = U\(L\b)
the mldivide function is smart enough to detect that the matrix is triangular and chose an algorithm accordingly (forward/backward substitution)
I think this is what you're looking for:
A = rand(3,3); % Random 3-by-3 matrix
b = rand(3,1); % Random 3-by-1 vector
[L,U] = lu(A); % LU decomposition
x = U\(L\b) % Solve system of equations via mldivide (same as x = A\b or x = (L*U)\b)
err = L*U*x-b % Numerical error
The system of equations is solved using mldivide. You might also look at qr which implements QR decomposition instead of using LU decomposition. qr can directly solve A*x = b type problems and is more efficient. Also look at linsolve. For symbolic systems you may still be able to use mldivide, or try linalg::matlinsolveLU in MuPAD.