LDLt factorization using SciPy's Python bindings to LAPACK - scipy

I am trying to get the LDLt factorization of a given symmetric matrix with SciPy's Python bindings to LAPACK using the dsysv routine which actually solves linear systems using this matrix factorization.
I have tried the following:
import numpy as np
from scipy.linalg.lapack import dsysv
A = np.random.randint(1, 1000, size=(5, 5))
A = (A + A.T)
b = np.random.randn(5)
lult, piv, x, _ = dsysv(A, b, lower=1)
Where x would be the solution for the above linear system and lult and piv contain information about the factorization.
How can I reconstruct LDLt from it? Sometimes negative values are contained in piv and from the docs I was not able to understand their meaning.
LAPACK's sytrf actually computes this factorization (without solving any linear system) but it does not seem available via SciPy.
There is an example here with the output I am interested in (see eq. 3-23).

All the required information is found in the documentation of sytrf. But admittedly, it is somewhat verbose.
So just give me the code:
import numpy as np
from scipy.linalg.lapack import dsysv

def swapped(i, k, n):
    """identity matrix where ith row and column are swapped with kth row and column"""
    P = np.eye(n)
    P[i, i] = 0
    P[k, k] = 0
    P[i, k] = 1
    P[k, i] = 1
    return P

# example
n = 5
A = np.random.rand(n, n)
A = (A + A.T)
b = np.random.randn(n)
lult, piv, x, _ = dsysv(A, b, lower=1)

# reconstruct L and D
D = np.zeros_like(A, dtype=float)
L = np.eye(n)
k = 0
while k < n:
    i = piv[k]
    if i < 0:
        s = 2  # negative pivot: 2x2 block
    else:
        s = 1  # positive pivot: 1x1 block
    if s == 1:
        i = i - 1  # piv is 1-based (Fortran)
        D[k, k] = lult[k, k]  # D(k) overwrites A(k,k)
        Pk = swapped(k, i, n)
        v = lult[k+1:n, k]  # v overwrites A(k+1:n,k)
        Lk = np.eye(n)
        Lk[k+1:n, k] = v
    else:
        m = -i - 1
        D[k:k+2, k:k+2] = lult[k:k+2, k:k+2]  # the lower triangle of D(k) overwrites A(k,k), A(k+1,k), and A(k+1,k+1)
        D[k, k+1] = D[k+1, k]  # D is symmetric
        Pk = swapped(k+1, m, n)
        v = lult[k+2:n, k:k+2]  # v overwrites A(k+2:n,k:k+1)
        Lk = np.eye(n)
        Lk[k+2:n, k:k+2] = v
    L = L.dot(Pk).dot(Lk)
    if s == 1:
        k += 1
    else:
        k += 2

print(np.max(np.abs(A - L.dot(D).dot(L.T))))  # should be close to 0
The snippet above reconstructs L and D from the decomposition (it would need to be adapted to reconstruct U from a UDUt decomposition). I will try to explain below. First a quote from the documentation:
... additional row interchanges are required to recover U or L explicitly (which is seldom necessary).
Reconstructing L (or U) requires a number of iterations with row exchanging operations and matrix multiplication. This is not very efficient (less so when done in Python) but luckily this reconstruction is seldom necessary. So make sure you really have to do this!
We reconstruct L from L = P(1)*L(1)* ... *P(k)*L(k)* ... (Fortran indices are 1-based). So we need to iterate k from 0 to n-1, obtain P(k) and L(k) in each step, and multiply them.
P(k) is a permutation matrix, defined by piv. A positive value of piv is straightforward (i = piv[k], 1-based, hence the i - 1 in the code). It means that the ith and kth row/column were swapped in A before performing the operation. In this case the kth diagonal element of lult corresponds to the kth diagonal element of D. L(k) contains the kth column of the lower triangular matrix, after the swapping.
A negative value of piv means that the corresponding element of D is a 2x2 block instead of just one element, and L(k) corresponds to two columns of the lower triangular matrix.
Now for each step in k we obtain L(k), apply the swapping operation P(k), and combine it with the existing L. We also obtain the 1x1 or 2x2 block of D and correspondingly increase k by 1 or 2 for the next step.
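The pivot convention described above can be made concrete with a small helper. This is only a sketch for the lower-triangular case, assuming piv holds LAPACK's 1-based pivot values as returned by dsysv with lower=1; the name decode_pivots is made up for illustration and is not part of SciPy.

```python
def decode_pivots(piv):
    """Split LAPACK-style pivots (lower=1 case) into diagonal blocks.

    Returns a list of (k, size, swap) tuples with 0-based indices:
    the block D(k) starts at row k and is size x size; `swap` is the
    row/column interchanged with row k (1x1 block) or row k+1 (2x2 block).
    """
    blocks = []
    k = 0
    n = len(piv)
    while k < n:
        p = int(piv[k])
        if p > 0:
            # 1x1 block: rows/columns k and p-1 were interchanged
            blocks.append((k, 1, p - 1))
            k += 1
        else:
            # 2x2 block: rows/columns k+1 and -p-1 were interchanged
            blocks.append((k, 2, -p - 1))
            k += 2
    return blocks


# Hypothetical pivot vector with one 2x2 block starting at row 1
print(decode_pivots([1, -3, -3, 4, 5]))
```

Separating the bookkeeping from the matrix products like this makes it easier to check the block structure before building L and D.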
I won't blame anyone for not comprehending my explanation. I simply wrote it down as I figured it out... Hopefully, the combination of the code snippet, the description, and the original documentation proves useful :)

dsysv is the linear system solver and it does all the magic internally, including the calls to dsytrf, so it is not needed for the factorization itself. As kazemakase mentioned, the factorization is now available in SciPy (PR 7941; it will appear officially in version 1.1) and you can just use scipy.linalg.ldl() to get the factorization and the permutation information of the outer factors. Actually this was the reason why ?sytrf and ?hetrf were added.
You can look at its source code to see how ipiv is sanitized.
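A minimal usage sketch of scipy.linalg.ldl, assuming SciPy 1.1 or later; the random test matrix and the seed are arbitrary choices for illustration.

```python
import numpy as np
from scipy.linalg import ldl

rng = np.random.default_rng(0)
n = 5
A = rng.random((n, n))
A = A + A.T  # symmetric test matrix

# lu: outer factor (a row-permuted lower triangular matrix)
# d: block diagonal factor with 1x1 and 2x2 blocks
# perm: row permutation that brings lu into triangular form
lu, d, perm = ldl(A, lower=True)

# The factors reproduce A without any manual pivot handling
residual = np.max(np.abs(A - lu @ d @ lu.T))
print(residual)

# Reordering the rows of lu with perm gives a genuinely lower triangular matrix
L_tri = lu[perm, :]
print(np.allclose(L_tri, np.tril(L_tri)))
```

Note that lu itself is generally not triangular; only lu[perm, :] is, which is the "permutation information of the outer factors" mentioned above.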
With SciPy v.1.1 built with OpenBlas on Windows 10 machine vs. matlab using mkl, the performance is given below
Adding a JIT compiler on top of it would probably bring it to MATLAB speed, since the ipiv handling and the construction of the factors are done in pure NumPy/Python. Or better, cythonize it if performance is of the utmost importance.

Updating scipy to version >= 1.0.0 should do the trick.
A wrapper to sytrf has been added to the master branch in mid-September, just before the 1.0.0 Beta release.
You can find the relevant pull-request and commit on Github.


Trying to solve an error problem with Matlab v 9.13 when doing matrix multiplication

Here is the problem and current code I am using.
Use the svd() function in MATLAB to compute A2, the rank-2 approximation of A. Clearly state what A2 is, rounded to 4 decimal places. Also, compute the root-mean square error (RMSE) between A and A2. Which approximation is better, or ? Explain.
[U , S, V] = svd(A)
k = 2
V = V(:,1:k)
V = transpose(V)
Ak = U(:,1:k) .* S(1:k,1:k) .* V
diffA = A - A2
fro_norm = norm(diffA,'fro')
RMSE2 = (fro_norm)/sqrt(m*n)
However, when running it, the line Ak = ... keeps giving an error because the matrix sizes are not compatible. I understand that the matrix sizes need to match in order to do the multiplication, but I also know what the problem requires: when k = 2, U has to use the first 2 columns, S has to use the first 2 rows and first 2 columns, and V has to be the transpose of V using only its first two columns.
I must be missing something in my understanding of the calculation, or creation of the sub k matrices. The matrix I have to use is a 3 x 3.
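For comparison, here is how the rank-k truncation described above looks in NumPy. This is a sketch, not the assignment's solution: the 3 x 3 matrix is made up because the question does not give one. The point is only that the three truncated factors are combined with the matrix product (`@` in NumPy, which corresponds to `*` rather than `.*` in MATLAB), since an elementwise product needs compatible shapes and would not compute the approximation anyway.

```python
import numpy as np

# Made-up 3 x 3 example matrix (the question's matrix is not given)
A = np.array([[4.0, 0.0, 2.0],
              [1.0, 3.0, 1.0],
              [0.0, 2.0, 5.0]])
m, n = A.shape
k = 2

# NumPy returns the singular values as a vector s and V already transposed (Vt)
U, s, Vt = np.linalg.svd(A)

# Rank-k approximation: (m x k) @ (k x k) @ (k x n), a matrix product throughout
Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# RMSE = Frobenius norm of the difference divided by sqrt(m*n)
rmse = np.linalg.norm(A - Ak, 'fro') / np.sqrt(m * n)
print(np.round(Ak, 4))
print(rmse)
```

Because the Frobenius norm of A - Ak equals the root of the sum of the squared discarded singular values, the RMSE here reduces to s[2] / 3, which gives a quick sanity check.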

Adding a sparse vector to a dense vector in Matlab

Suppose I have a high dimensional vector v which is dense and another high dimensional vector x which is sparse and I want to do an operation which looks like
v = v + x
Ideally since one needs to update only a few entries in v this operation should be fast but it is still taking a good amount of time even when I have declared x to be sparse. I have tried with v being in full as well as v being in sparse form and both are fairly slow.
I have also tried to extract the indices from the sparse vector x by calling find and then updating the original vector in a for loop. This is faster than the above operations, but is there a way to achieve the same with much less code?
Quoting from the Matlab documentation (emphasis mine):
Binary operators yield sparse results if both operands are sparse, and full results if both are full. For mixed operands, the result is full unless the operation preserves sparsity. If S is sparse and F is full, then S+F, S*F, and F\S are full, while S.*F and S&F are sparse. In some cases, the result might be sparse even though the matrix has few zero elements.
Therefore, if you wish to keep x sparse, I think using logical indexing to update v with the nonzero values of x is best. Here is a sample function that shows either logical indexing or explicitly full-ing x is best (at least on my R2015a install):
function [] = blur()
    n = 5E6;
    v = rand(n,1);
    x = sprand(n,1,0.001);
    xf = full(x);
    vs = sparse(v);
    disp(['Full-Sparse: ',num2str(timeit(@() v + x) ,'%9.5f')]);
    disp(['Full-Full: ',num2str(timeit(@() v + xf) ,'%9.5f')]);
    disp(['Sparse-Sparse: ',num2str(timeit(@() vs + x) ,'%9.5f')]);
    disp(['Logical Index: ',num2str(timeit(@() update(v,x)),'%9.5f')]);
end

function [] = update(v,x)
    mask = x ~= 0;
    v(mask) = v(mask) + x(mask);
end
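The same idea, touching only the nonzero positions, sketched in NumPy. The sparse vector is represented here by an index array and a value array (an assumption made to keep the example free of a sparse library); indexed `+=` then updates just those entries.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
v = rng.random(n)

# Hypothetical sparse vector: about 0.1% nonzeros stored as (indices, values)
nnz = n // 1000
idx = rng.choice(n, size=nnz, replace=False)  # unique positions
vals = rng.random(nnz)

# Dense reference result: builds a full-length x and touches all n entries
x_full = np.zeros(n)
x_full[idx] = vals
expected = v + x_full

# Sparse-style update: touches only the nnz stored entries, in place
# (correct as written only because idx contains no repeated positions)
v[idx] += vals
print(np.allclose(v, expected))
```

If the index array could contain repeats, `np.add.at(v, idx, vals)` would be the safer call, at some cost in speed.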

Vectorization of double for loop including sine of two variables

I need to numerically evaluate some integrals which are all of the form shown in this image:
These integrals are the matrix elements of a N x N matrix, so I need to evaluate them for all possible combinations of n and m in the range of 1 to N. The integrals are symmetric in n and m which I have implemented in my current nested for loop approach:
function [V] = coulomb3(N, l, R, R0, c, x)
r1 = 0.01:x:R;
r2 = R:x:R0;
r = [r1 r2];
rl1 = r1.^(2*l);
rl2 = r2.^(2*l);
sines = zeros(N, length(r));
V = zeros(N, N);
for i = 1:N;
    sines(i, :) = sin(i*pi*r/R0);
end
x1 = length(r1);
x2 = length(r);
for nn = 1:N
    for mm = 1:nn
        f1 = (1/6)*rl1.*r1.^2.*sines(nn, 1:x1).*sines(mm, 1:x1);
        f2 = ((R^2/2)*rl2 - (R^3/3)*rl2.*r2.^(-1)).*sines(nn, x1+1:x2).*sines(mm, x1+1:x2);
        value = 4*pi*c*x*trapz([f1 f2]);
        V(nn, mm) = value;
        V(mm, nn) = value;
    end
end
I figured that calling sin(x) in the loop was a bad idea, so I calculate all the needed values and store them. To evaluate the integrals I used trapz, but as the first and the second/third integrals have different ranges the function values need to be calculated separately and then combined.
I've tried a couple of different ways of vectorization, but the only one that gives the correct results takes much longer than the above loop (I used gmultiply, but the arrays created are enormous). I've also made an analytical solution (which is possible assuming m and n are integers and R0 > R > 0), but these solutions involve a cosine integral function (cosint in MATLAB), which is extremely slow for large N.
I'm not sure the entire thing can be vectorized without creating very large arrays, but the inner loop at least should be possible. Any ideas would be greatly appreciated!
The inputs I use currently are:
R0 = 1000;
R = 8.4691;
c = 0.393*10^(-2);
x = 0.01;
l = 0; % can reasonably be 0-6
N = 20; % Increasing the value will give the same results,
% but I would like to be able to do at least N = 600
Using these values
V(1, 1:3) = 873.379900963549 -5.80688363271849 -3.38139152472590
The diagonal values never converge with increasing R0, though, so they are less interesting.
You will lose the gain from the symmetry of the problem with my approach, but that only costs a factor of 2. Odds are that you'll still benefit in the end.
The idea is to use multidimensional arrays, making use of trapz supporting these inputs. I'll demonstrate the first term in your figure, as the two others should be done similarly, and the point is the technique:
r1 = 0.01:x:R;
r2 = R:x:R0;
r = [r1 r2].';
rl1 = r1.'.^(2*l);
rl2 = r2.'.^(2*l);
sines = zeros(length(r),N); %// CHANGED!!
%// V = zeros(N, N); not needed now, see later
%// you can define sines in a vectorized way as well:
sines = sin(r*(1:N)*pi/R0); %//' now size [Nr, N] !
%// note that implicitly r is of size [Nr, 1, 1]
%// and sines is of size [Nr, N, 1]
sines2mat = permute(sines,[1, 3, 2]); %// size [Nr, 1, N]
%// the first term in V: perform integral along first dimension
%//V1 = 1/6*squeeze(trapz(bsxfun(@times,bsxfun(@times,r.^(2*l+2),sines),sines2mat),1))*x; %// 4*pi*c prefactor might be physics, not math
V1 = 1/6*permute(trapz(bsxfun(@times,bsxfun(@times,r.^(2*l+2),sines),sines2mat),1),[2,3,1])*x; %// 4*pi*c prefactor might be physics, not math
The key point is that bsxfun(@times,r.^(2*l+2),sines) is a matrix of size [Nr,N,1], which is again multiplied by sines2mat using bsxfun. The result is of size [Nr,N,N], and an element (k1,k2,k3) corresponds to an integrand at radial point k1, n=k2 and m=k3. Using trapz() with explicitly the first dimension (which would be the default) reduces this to an array of size [1,N,N], which is just what you need after a good squeeze(). Update: as per @Dev-iL's comment you should use permute instead of squeeze to get rid of the leading singleton dimension, as that might be more efficient.
The two other terms can be handled the same way, and of course it might still help if you restructure the integrals based on overlapping and non-overlapping parts.
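As a side note on memory, the [Nr,N,N] intermediate can be avoided altogether: integrating sines(:,n) .* f .* sines(:,m) with the trapezoidal rule is a weighted inner product, so the whole N x N block is a single matrix product. This is a different formulation from the bsxfun approach above, shown here as a NumPy sketch with made-up small sizes and only the first term's integrand; the trapezoid weights are written out by hand rather than taken from a library routine.

```python
import numpy as np

# Small made-up parameters, first integral only (r from 0.01 up to R)
N, l, R, R0, dx = 6, 0, 8.4691, 50.0, 0.01
r = np.arange(0.01, R, dx)                  # radial grid, shape (Nr,)
f = r ** (2 * l + 2) / 6.0                  # radial factor of the first term
sines = np.sin(np.outer(r, np.arange(1, N + 1)) * np.pi / R0)  # shape (Nr, N)

# Trapezoidal weights on a uniform grid: dx inside, dx/2 at both ends
w = np.full(r.size, dx)
w[0] = w[-1] = dx / 2

# Broadcasting version: builds an (Nr, N, N) array, then integrates along axis 0
integrand = f[:, None, None] * sines[:, :, None] * sines[:, None, :]
V_broadcast = np.sum(w[:, None, None] * integrand, axis=0)

# Matrix-product version: same numbers, largest temporary is only (Nr, N)
V_matmul = sines.T @ ((w * f)[:, None] * sines)

print(np.allclose(V_broadcast, V_matmul))
```

The matrix-product form also hands the heavy lifting to BLAS, which is usually what matters once N reaches several hundred.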

What is the Haskell / hmatrix equivalent of the MATLAB pos function?

I'm translating some MATLAB code to Haskell using the hmatrix library. It's going well, but I'm stumbling on the pos function, because I don't know what it does or what its Haskell equivalent would be.
The MATLAB code looks like this:
[U,S,V] = svd(Y,0);
diagS = diag(S);
A = U * diag(pos(diagS-tau)) * V';
E = sign(Y) .* pos( abs(Y) - lambda*tau );
M = D - A - E;
My Haskell translation so far:
(u,s,v) = svd y
diagS = diag s
a = u `multiply` (diagS - tau) `multiply` v
This actually type checks ok, but of course, I'm missing the "pos" call, and it throws the error:
inconsistent dimensions in matrix product (3,3) x (4,4)
So I'm guessing pos does something with matrix size? Googling "matlab pos function" didn't turn up anything useful, so any pointers are very much appreciated! (Obviously I don't know much MATLAB)
Incidentally this is for the TILT algorithm to recover low rank textures from a noisy, warped image. I'm very excited about it, even if the math is way beyond me!
Looks like the pos function is defined in a different MATLAB file:
function P = pos(A)
P = A .* double( A > 0 );
I can't quite decipher what this is doing. Assuming that boolean values cast to doubles with "True" == 1.0 and "False" == 0.0, it turns negative values to zero and leaves positive numbers unchanged?
It looks as though pos finds the positive part of a matrix. You could implement this directly with mapMatrix
pos :: (Storable a, Num a, Ord a) => Matrix a -> Matrix a
pos = mapMatrix go where
  go x | x > 0     = x
       | otherwise = 0
Though Matlab makes no distinction between Matrix and Vector unlike Haskell.
But it's worth analyzing that MATLAB fragment more. Per http://www.mathworks.com/help/matlab/ref/svd.html the first line computes the "economy-sized" Singular Value Decomposition of Y, i.e. three matrices such that
U * S * V' = Y
where, assuming Y is m x n with m >= n, U is m x n, S is n x n and diagonal, and V is n x n. Further, both U and V have orthonormal columns. In linear algebraic terms this separates the linear transformation Y into two "rotation" components and the central singular-value scaling component.
Since S is diagonal, we extract that diagonal as a vector using diag(S) and then subtract a term tau (a scalar, or a vector of the same length). This might produce a diagonal containing negative values, which cannot be properly interpreted as singular values, so pos is there to trim out the negative values, setting them to 0. We then use diag to convert the resulting vector back into a diagonal matrix and multiply the pieces back together to get A, a modified form of Y.
Note that we can skip some steps in Haskell as svd (and its "economy-sized" partner thinSVD) return vectors of singular values instead of mostly 0'd diagonal matrices.
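(u, s, v) = thinSVD y
-- note the trans here, that was the ' in Matlab
-- (subtract tau from s before clamping to reproduce pos(diagS-tau))
a = u `multiply` diag (mapVector (max 0) s) `multiply` trans v
Above, mapVector (cmap in newer hmatrix releases) maps max 0 over the Vector of singular values s; hmatrix's storable Vector has no Functor instance, so fmap does not apply here. diag (from Numeric.Container) then reinflates the Vector into a Matrix prior to the multiplys. With a little thought it's easy to see that max 0 is just pos applied to a single element.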
(A > 0) returns a mask marking the elements of A which are larger than zero,
so for example, if you have
A = [ -1 2 -3 4
5 6 -7 -8 ]
then B = (A > 0) returns
B = [ 0 1 0 1
1 1 0 0]
Note that we have a 1 wherever the corresponding element of A is larger than zero, and a 0 otherwise.
Now if you multiply this elementwise with A using the .* notation, then you are multiplying each element of A that is larger than zero by 1, and every other element by zero. That is, A .* B means
[ -1*0 2*1 -3*0 4*1
5*1 6*1 -7*0 -8*0 ]
giving finally,
[ 0 2 0 4
5 6 0 0 ]
So you need to write your own function that returns positive values intact and sets negative values to zero.
Also, for a general SVD, u and v do not match in dimension, so you would actually need to re-diagonalize pos(diagS - tau), so that the dimensions of u * diag(pos(diagS - tau)) agree with those of v.
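For readers more at home in NumPy, the behaviour of pos worked through above can be reproduced in a few lines. This is an illustrative sketch in Python, not part of the TILT code; the matrix Y and the threshold tau in the second half are made up.

```python
import numpy as np

def pos(a):
    """Positive part: keep entries > 0, replace all others by 0."""
    return np.maximum(a, 0)

A = np.array([[-1, 2, -3, 4],
              [5, 6, -7, -8]])
print(pos(A))
# Same result as the MATLAB definition A .* double(A > 0)
print(np.array_equal(pos(A), A * (A > 0)))

# Thresholding singular values as in U * diag(pos(diagS - tau)) * V'
Y = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])
tau = 2.0  # made-up threshold
U, s, Vt = np.linalg.svd(Y, full_matrices=False)  # economy-size SVD, s is a vector
A_low = U @ np.diag(pos(s - tau)) @ Vt
print(A_low.shape)
```

Because the economy-size factors are (3 x 2), (2,) and (2 x 2) here, rebuilding the diagonal from the thresholded vector keeps all three shapes compatible, which is the dimension issue raised above.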

Atomic sparse-matrix-multiply-and-compare to avoid out-of-memory error

I have two sparse matrices A (logical, 80274 x 80274) and B (non-negative integer, 21018 x 80274) and a vector c (positive integer, 21018 x 1).
I'd like to find the result res (logical, 21018 x 80274) of
mat = B * A;
res = mat > sparse(diag(c - 1)) * spones(mat);
# Note that since c is positive, this is equivalent to
# res = bsxfun(@gt, mat, c-1)
# but octave's sparse bsxfun support is somewhat shoddy,
# so I'm doing this instead as a workaround
The problem is B * A has enough nonzero values (I think 60824321 which doesn't seem like a lot, but somehow, the computation of spones(mat) uses up over a gigabyte of memory before octave crashes) to exhaust all my machine's memory even though most of these do not exceed c-1.
Is there a way to do this without computing the intermediate matrix mat = B * A ?
CLARIFICATION: It probably doesn't matter, but B and c are actually double matrices that happen to only be holding integer values (and B is sparse).
Can't you just work on the nonzero values of mat? (for the zero values you know the result will be 0):
c = c(:); % make c a column
ind = find(mat>0); % linear index
[row, ~] = ind2sub(size(mat),ind); % row index within mat (for use in c)
res = mat(ind) > c(row)-1; % results for the nonzero values of mat
You can try updating to Octave 3.8.1: bsxfun has been made sparse-aware there, which greatly improves performance.
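Another way to keep memory bounded is to never hold the whole product at once: compute B * A one block of rows at a time, compare each block against the matching entries of c, and keep only the positions that pass. The sketch below uses Python with scipy.sparse rather than Octave, and the function name thresholded_product and the block size are made up for illustration. It relies on the facts stated in the question: c is positive and the matrices are nonnegative, so only stored nonzeros of the product can exceed c - 1.

```python
import numpy as np
from scipy import sparse

def thresholded_product(B, A, c, block_rows=1000):
    """Sparse boolean res with res[i, j] = (B @ A)[i, j] > c[i] - 1.

    Assumes c >= 1 and nonnegative B and A. Only one row block of the
    product B @ A is held in memory at a time.
    """
    B = sparse.csr_matrix(B)
    A = sparse.csr_matrix(A)
    c = np.asarray(c).ravel()
    rows, cols = [], []
    for start in range(0, B.shape[0], block_rows):
        stop = min(start + block_rows, B.shape[0])
        blk = (B[start:stop] @ A).tocoo()            # product for this row block only
        keep = blk.data > c[start:stop][blk.row] - 1  # row-wise threshold
        rows.append(blk.row[keep] + start)
        cols.append(blk.col[keep])
    rows = np.concatenate(rows)
    cols = np.concatenate(cols)
    data = np.ones(rows.size, dtype=bool)
    return sparse.csr_matrix((data, (rows, cols)), shape=(B.shape[0], A.shape[1]))

# Tiny made-up example checked against the dense computation
rng = np.random.default_rng(0)
A = sparse.random(30, 30, density=0.2, random_state=1, format="csr")
A.data[:] = 1.0                                   # mimic a logical matrix
B = sparse.csr_matrix(rng.integers(0, 3, size=(12, 30)).astype(float))
c = rng.integers(1, 6, size=12)
res = thresholded_product(B, A, c, block_rows=5)
dense = (B.toarray() @ A.toarray()) > (c[:, None] - 1)
print(np.array_equal(res.toarray(), dense))
```

The block size trades peak memory for loop overhead; the final result only stores the entries that passed the comparison, which the question says is a small fraction of the product's nonzeros.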