approximate low rank matrix with weighted sum of rows - scipy

I'd like to approximate a given n x m matrix A with n >> m as a weighted sum W of some k rows B (ideally selected from A, but could also be arbitrary). The weights must sum up to 1 and need to be positive.
import numpy as np
n = 1000 # rows
m = 3 # columns
k = 2 # hidden rank
# create random matrix with rank k
A = np.random.rand(n, k).dot(np.random.rand(k, m))
# estimate hidden rank
u, s, vt = np.linalg.svd(A, full_matrices=False, compute_uv=True)
k_est = np.count_nonzero(~np.isclose(s, 0))
# truncate to k_est
B = np.diag(s[:k_est]) @ vt[..., :k_est, :]
W = u[..., :k_est]
# do some magic with B and W to come up with
assert np.all(W >= 0)
assert np.all(np.isclose(W.sum(1), 1))
assert np.all(np.isclose(A, W @ B))
I tried SVD, which reproduces A as W @ B, but the resulting weights are negative and don't sum to 1.
My gut feeling is that I'm looking for something like the convex hull of the rows of A, but with only k_est vertices.
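If the k rows B are already fixed (for instance picked from A) and are affinely independent, the weights for each row of A can be found with a least-squares solve in which an extra column of ones encodes the sum-to-one constraint. The following is a minimal sketch under those assumptions, not a full solution to the question: `convex_weights` is a hypothetical helper name, the test data are constructed so that an exact answer exists, and positivity is only observed, not enforced (rows of A outside the convex hull of B would get negative weights).

```python
import numpy as np

def convex_weights(A, B):
    """Solve W @ B ~= A with rows of W summing to 1 (augmented least squares).

    Hypothetical helper: positivity of W is NOT enforced here.
    """
    ones_a = np.ones((A.shape[0], 1))
    ones_b = np.ones((B.shape[0], 1))
    # Appending a column of ones turns "rows of W sum to 1" into one more equation.
    A_aug = np.hstack([A, ones_a])
    B_aug = np.hstack([B, ones_b])
    # Solve B_aug.T @ W.T = A_aug.T in the least-squares sense.
    W_t, *_ = np.linalg.lstsq(B_aug.T, A_aug.T, rcond=None)
    return W_t.T

rng = np.random.default_rng(0)
k, m, n = 3, 3, 1000
B = rng.random((k, m))                   # k "corner" rows (assumed known)
W_true = rng.dirichlet(np.ones(k), n)    # positive weights, each row sums to 1
A = W_true @ B                           # every row of A is a convex combination of B

W = convex_weights(A, B)
```

Choosing which k rows to use as B is the harder part of the problem and is not addressed by this sketch.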

Related

How to create a mxn matrix with a specific rank in matlab?

I want to create a m by n matrix with rank k.
Like A is 8 × 8 with rank 5 or B is 4 × 6 with rank 4.
So I try to write a function in MATLAB like below.
My thought is:
1. Generate an m by n matrix of zeros.
2. Generate a random m by n matrix and convert it into reduced row echelon form.
3. Assign the rank of the matrix from step 2 to num.
4. If num = k, assign the current matrix to the output.
5. Break the iteration.
function output = check_rank(m,n,k)
while 1
output = zeros(m,n);
matrix = randi(20,m,n);
tmp = rref(matrix);
num = rank(tmp);
if (num == k)
output = matrix;
break;
end
disp(output);
end
A = check_rank(8,8,4)
The outcome is an infinite loop and all the answers are 6x6 zeros matrix:
[Image: Command Window output]
I have also tried the method from "How to create a rank k matrix using MATLAB?":
A = zeros(8,8);
for i = 1:4, A = A + randn(8,1) * randn(1,8); end
A
rank(A)
This achieves my goal, but I don't understand why it works.
Thanks, @anonymous!
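One way to see why the loop works: each term randn(8,1) * randn(1,8) is an outer product and therefore has rank 1, and adding k of them is the same as multiplying an m-by-k factor by a k-by-n factor. With random Gaussian factors that product has rank k with probability 1, assuming k <= min(m, n). The NumPy sketch below illustrates the identity; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 8, 8, 4

U = rng.standard_normal((m, k))   # columns play the role of randn(8,1)
V = rng.standard_normal((k, n))   # rows play the role of randn(1,8)

# Sum of k rank-1 outer products, as in the MATLAB loop.
A_loop = np.zeros((m, n))
for i in range(k):
    A_loop += np.outer(U[:, i], V[i, :])

# The same matrix written as one product of an m-by-k and a k-by-n factor.
A_prod = U @ V

same = np.allclose(A_loop, A_prod)
rank = np.linalg.matrix_rank(A_loop)
```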
If you want to generate a random matrix with specified rank, you can try to build a user function like below
function [Y,rk] = fn(m,n,k)
P = orth(randn(m,k));
Q = orth(randn(n,k))';
Y = P*Q;
rk = rank(Y);
end
where P has orthonormal columns and Q has orthonormal rows (neither is square, so they are not unitary in the strict sense). Y is the generated matrix with random values, and rk helps you check the rank.
Example
>> [Y,rk] = fn(8,6,5)
Y =
3.8613e-02 7.5837e-03 -7.1011e-02 -7.0392e-02 -3.8519e-02 1.6612e-01
-3.1381e-02 -3.6287e-02 1.4888e-01 -7.6202e-02 -3.7867e-02 3.2707e-01
-1.9689e-01 2.2684e-01 1.2606e-01 -1.2657e-03 1.9724e-01 7.2793e-02
-1.2652e-01 7.7531e-02 1.3906e-01 3.1568e-02 1.8327e-01 -1.3804e-01
-2.6604e-01 -1.4345e-01 1.6961e-03 -9.7833e-02 5.9299e-01 -1.5765e-01
1.7787e-01 -3.5007e-01 3.8482e-01 -6.0741e-02 -2.1415e-02 -2.4317e-01
8.9910e-02 -2.5538e-01 -1.8029e-01 -7.0032e-02 -1.0739e-01 2.2188e-01
-3.4824e-01 3.7603e-01 2.8561e-02 2.6553e-02 2.4871e-02 6.8021e-01
rk = 5
You can easily use eye function:
I = eye(k);
M = zeros(m,n);
M(1:k, 1:k) = I;
The rank(M) is equal to k.

Gram Schmidt Orthonormalisation

I am writing the following code for Gram Schmidt Orthogonalization. It says that there's an error in calling the function. What's the error and how to rectify it?
A =[1,1,1,1;-1,4,4,-1;4,-2,2,0];
A =A';
B=myGramschmidt(A);
function [B] = myGramschmidt(A)
x1=A(:,1);
x2=A(:,2);
x3=A(:,3);
v1=x1;
c = dot(v1);
v2 = x2-((dot(x2,v1)/c)* v1);
d = dot(v2);
v3 = x3-((dot(x3,v1)/c)* v1)-((dot(x3,v2)/d)* v2);
C=[v1,v2,v3];
V1=normc(v1);
V2=normc(v2);
V3=normc(v3);
B=[V1,V2,V3];
end
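Here is an implementation based on the Wikipedia Gram-Schmidt page; Luis Mendo is correct as to why you got the error. Note that dot takes two arguments, so dot(v1) and dot(v2) in your code should presumably be dot(v1,v1) and dot(v2,v2).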
function [B] = myGramschmidt(A)
B = A;
for k = 1:size(A, 1)
for j = 1:k-1
B(k, :) = B(k, :) - proj(B(j, :), A(k, :));
end
end
end
function p = proj(u, v)
% https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process#The_Gram.E2.80.93Schmidt_process
p = dot(v, u) / dot(u, u) * u;
end
Try this vectorized implementation in Python.
I would also suggest going through David C. Lay's book for the theory.
import numpy as np

def replace_zero(array):
    # replace zeros by ones so that a later division is safe
    for i in range(len(array)):
        if array[i] == 0:
            array[i] = 1
    return array

def gram_schmidt(A, norm=True, row_vect=False):
    """Orthonormalizes vectors by gram-schmidt process
    Parameters
    -----------
    A : ndarray,
        Matrix having vectors in its columns
    norm : bool,
        Do you need Normalized vectors?
    row_vect: bool,
        Does Matrix A has vectors in its rows?
    Returns
    -------
    G : ndarray,
        Matrix of orthogonal vectors
    Gram-Schmidt Process
    --------------------
    The Gram–Schmidt process is a simple algorithm for
    producing an orthogonal or orthonormal basis for any
    nonzero subspace of Rn.
    Given a basis {x1,....,xp} for a nonzero subspace W of Rn,
    define
    v1 = x1
    v2 = x2 - (x2.v1/v1.v1) * v1
    v3 = x3 - (x3.v1/v1.v1) * v1 - (x3.v2/v2.v2) * v2
    .
    .
    .
    vp = xp - (xp.v1/v1.v1) * v1 - (xp.v2/v2.v2) * v2 - .......
         .... - (xp.v(p-1) / v(p-1).v(p-1) ) * v(p-1)
    Then {v1,.....,vp} is an orthogonal basis for W .
    In addition,
    Span {v1,.....,vk} = Span {x1,.....,xk} for 1 <= k <= p
    References
    ----------
    Linear Algebra and Its Applications - By David.C.Lay
    """
    if row_vect:
        # if true, transpose it to make column vector matrix
        A = A.T
    no_of_vectors = A.shape[1]
    G = A[:, 0:1].copy()  # copy the first vector in matrix
    # 0:1 is done to be consistent with dimensions - [[1,2,3]]
    # iterate from 2nd vector to number of vectors
    for i in range(1, no_of_vectors):
        # calculates weights (coefficients) for every vector in G
        numerator = A[:, i].dot(G)
        denominator = np.diag(np.dot(G.T, G))  # to get elements in diagonal
        weights = np.squeeze(numerator / denominator)
        # projected vector onto subspace G
        projected_vector = np.sum(weights * G,
                                  axis=1,
                                  keepdims=True)
        # orthogonal vector to subspace G
        orthogonalized_vector = A[:, i:i+1] - projected_vector
        # now add the orthogonal vector to our set
        G = np.hstack((G, orthogonalized_vector))
    if norm:
        # to get orthonormal vectors (unit orthogonal vectors)
        # replace zero with 1 to avoid division by 0 if the matrix has a zero vector
        # or a normalization value comes out to be zero
        G = G / replace_zero(np.linalg.norm(G, axis=0))
    if row_vect:
        return G.T
    return G
G = np.array([[1,0,0],[1,1,0],[1,1,1],[1,1,1]])
gram_schmidt(G)
>
array([[ 0.5 , -0.8660254 , 0. ],
[ 0.5 , 0.28867513, -0.81649658],
[ 0.5 , 0.28867513, 0.40824829],
[ 0.5 , 0.28867513, 0.40824829]])
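As a sanity check, a Gram-Schmidt result can be compared with NumPy's QR factorization: for linearly independent input columns, the Q factor agrees with the Gram-Schmidt vectors column by column up to sign. The sketch below uses a compact stand-in (`gram_schmidt_cols`, a hypothetical name, not the function above) so that it runs on its own; it subtracts projections one at a time, which is the "modified" variant of the process.

```python
import numpy as np

def gram_schmidt_cols(A):
    """Orthonormalize the columns of A (assumed linearly independent)."""
    A = np.asarray(A, dtype=float)
    Q = np.zeros_like(A)
    for i in range(A.shape[1]):
        v = A[:, i].copy()
        for j in range(i):
            # subtract the component along each earlier orthonormal vector
            v -= (Q[:, j] @ v) * Q[:, j]
        Q[:, i] = v / np.linalg.norm(v)
    return Q

X = np.array([[1, 0, 0], [1, 1, 0], [1, 1, 1], [1, 1, 1]])
Q_gs = gram_schmidt_cols(X)
Q_np, _ = np.linalg.qr(X)

# Q^T Q should be the identity if the columns are orthonormal.
orthonormal = np.allclose(Q_gs.T @ Q_gs, np.eye(3))
# Columns may differ by a sign, so compare absolute values.
matches_qr = np.allclose(np.abs(Q_gs), np.abs(Q_np))
```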

How to check whether vector b is in Col A?

How do I determine whether b∈Col A or b∉Col A in matlab? A being an m x n matrix where m >= n, and b being a vector. Is there a built in function for this already, or would I need to create one? If b∈Col A, how would I go about determining whether matrix A has orthonormal columns/is orthogonal?
You can use ismember as explained in a previous answer.
% some sample data
A = [eye(3); zeros(3)];
v = [0; 1; 0; 0; 1; 0];
ismember(A', v', 'rows')
To check orthogonality, you could do the following:
% A scalar initialised outside the for-loop. It stores the sum of the absolute inner products.
dp = 0;
% Take the columns of A one by one and compute the inner product with all subsequent columns. If the columns are orthogonal, every inner product is zero, so the sum of their absolute values is zero as well. (Without abs, positive and negative products could cancel.)
for i = 1:size(A, 2)
dp = dp + sum(abs(A(:, i)'*A(:, i+1:end)));
end
if (dp == 0)
disp('The columns are orthogonal')
else
disp('The columns are not orthogonal')
end
To have orthonormal columns, the norm of each column has to be 1, so:
% Check each column for unit length
M = mat2cell(A, size(A, 1), ones(size(A, 2), 1));
if find(cellfun(@(x)norm(x,2), M) ~= 1)
disp('Columns are not of unit length')
else
disp('Columns are of unit length')
end
Note that all these operations become simpler and faster if m=n (since you allow this case).
Say you have a matrix A that is nxm and a vector b that is nx1, and you want to see if b is a column in A.
You can do this by taking the transpose of both A and b, and then looking to see if the vector b is a member of A. This is the code:
member = ismember(A',b','rows');
Here is an example:
A =
1 5
2 2
3 3
4 4
b =
1
2
3
4
member = ismember(A',b','rows')
member =
1
0
So the first column of A and b are a match but the second column of A and b are not the same. If you want to check the orthogonality of the columns you can do this:
orthcheck = triu(A'*A, 1);
If every entry of this strictly upper triangular part is zero, the columns are orthogonal. A'*A holds the dot products of all pairs of columns, and you only need the part above the diagonal because the matrix is symmetric (the diagonal itself holds the squared column norms).
Another way of testing if v is a column of A:
any(all(bsxfun(@eq,A,v))) %// gives 1 if it is; 0 otherwise
To test if the columns of A are orthogonal:
product = A'*A; %'// I'm using ' in case you have complex numbers
product(1:size(A,2)+1:end) = 0; %// remove diagonal
all(product(:)==0) %// gives 1 if it is; 0 otherwise
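Note that the ismember and bsxfun tests above check whether b is literally one of the columns of A. If "Col A" is meant as the column space (the span of the columns), one common test is to compare the rank of A with the rank of the augmented matrix [A b]. The NumPy sketch below shows that idea together with a tolerance-based check for orthonormal columns; the helper names are hypothetical, real-valued input is assumed, and the default tolerances of `matrix_rank` and `allclose` are assumed to be adequate for the data.

```python
import numpy as np

def in_column_space(A, b):
    """True if b is (numerically) a linear combination of the columns of A."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    # Appending b leaves the rank unchanged exactly when b lies in Col A.
    return np.linalg.matrix_rank(np.hstack([A, b])) == np.linalg.matrix_rank(A)

def has_orthonormal_columns(A):
    """True if A.T @ A is the identity up to floating-point tolerance (real A)."""
    A = np.asarray(A, dtype=float)
    return np.allclose(A.T @ A, np.eye(A.shape[1]))

A = np.vstack([np.eye(3), np.zeros((3, 3))])        # same sample data as above
b_in = np.array([2.0, -1.0, 0.5, 0.0, 0.0, 0.0])    # a combination of the columns
b_out = np.array([0.0, 1.0, 0.0, 0.0, 1.0, 0.0])    # has a component outside Col A
```

Comparing against a tolerance avoids the exact `== 0` and `~= 1` floating-point comparisons, which can fail because of rounding.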

Iterating over all integer vectors summing up to a certain value in MATLAB?

I would like to find a clean way so that I can iterate over all the vectors of positive integers of length, say n (called x), such that sum(x) == 100 in MATLAB.
I know it is an exponentially complex task. If the length is sufficiently small, say 2-3 I can do it by a for loop (I know it is very inefficient) but how about longer vectors?
Thanks in advance,
Here is a quick and dirty method that uses recursion. The idea is that to generate all vectors of length k that sum to n, you first generate the vectors of length k-1 that sum to n-i for each i = 0..n, and then append i to the end of each of these. Note that entries may be zero, i.e. the vectors are non-negative rather than strictly positive.
You could speed this up by pre-allocating x in each loop.
Note that the size of the output is (n + k - 1 choose n) rows and k columns.
function x = genperms(n, k)
if k == 1
x = n;
elseif n == 0
x = zeros(1,k);
else
x = zeros(0, k);
for i = 0:n
y = genperms(n-i,k-1);
y(:,end+1) = i;
x = [x; y];
end
end
Edit
As alluded to in the comments, this will run into memory issues for large n and k. A streaming solution is preferable, which generates the outputs one at a time. In a non-strict language like Haskell this is very simple -
genperms n k
  | k == 1 = return [n]
  | n == 0 = return (replicate k 0)
  | otherwise = [i:y | i <- [0..n], y <- genperms (n-i) (k-1)]
viz.
>> mapM_ print $ take 10 $ genperms 100 30
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,100]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,99]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,98]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,97]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,96]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,95]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,94]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,93]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,92]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,91]
which runs virtually instantaneously - no memory issues to worry about.
In Python you could achieve something nearly as simple using generators and the yield keyword. In Matlab it is certainly possible, but I leave the translation up to you!
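A Python version along those lines might look like the sketch below. It mirrors the Haskell recursion (the first entry varies slowest) and, like it, yields non-negative entries; `compositions` is an illustrative name, not a library function.

```python
from itertools import islice

def compositions(n, k):
    """Lazily yield every length-k tuple of non-negative integers summing to n."""
    if k == 1:
        yield (n,)
        return
    for i in range(n + 1):
        # fix the first entry, then recurse on the remaining k-1 entries
        for rest in compositions(n - i, k - 1):
            yield (i,) + rest

# Only the first few vectors are ever materialized.
first_three = list(islice(compositions(100, 30), 3))
```

For strictly positive entries, one option is to iterate over `compositions(n - k, k)` and add 1 to every entry of each tuple.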
This is one possible method to generate all vectors at once (will give memory problems for moderately large n):
s = 10; %// desired sum
n = 3; %// number of digits
vectors = cell(1,n);
[vectors{:}] = ndgrid(0:s); %// I assume by "integer" you mean non-negative int
vectors = cell2mat(cellfun(@(c) reshape(c,1,[]), vectors, 'uni', 0).');
vectors = vectors(:,sum(vectors)==s); %// each column is a vector
Now you can iterate over those vectors:
for vector = vectors %// take one column at each iteration
%// do stuff with the vector
end
To avoid memory problems it is better to generate each vector as needed, instead of generating all of them initially. The following approach iterates over all possible n-vectors in one for loop (regardless of n), rejecting those vectors whose sum is not the desired value:
s = 10; %// desired sum
n = 3; %// number of digits
for number = 0:(s+1)^n-1 %// each entry ranges over 0..s, so count in base s+1
vector = mod(floor(number./(s+1).^(n-1:-1:0)), s+1).'; %// base-(s+1) digits as a column vector of n rows
if sum(vector) ~= s
continue %// reject that vector
end
%// do stuff with the vector
end

Matlab Generating a Matrix

I am trying to generate a matrix in matlab which I will use to solve a polynomial regression formula.
Here is how I am trying to generate the matrix:
I have an input vector X containing N elements and an integer d, which determines how many columns of powers are added to the matrix. The matrix is built as follows:
N = [X^d X^{d-1} ... X^2 X O]
where O is a vector of the same length as X containing all ones.
Whenever d > 2, it does not work.
Can you see any errors in my code? (I am new to MATLAB.)
function [ PR ] = PolyRegress( X, Y, d )
O = ones(length(X), 1)
N = [X O]
for j = 2:d
tmp = power(X, j)
N = [tmp N]
end
%TO DO: compute PR
end
It looks like the matlab function vander already does what you want to do.
The VANDER function will only generate powers of the vector up to d = length(X)-1. For a more general solution, you can use the BSXFUN function (works with any value of d):
N = bsxfun(@power, X(:), d:-1:0)
Example:
>> X = (1:.5:2);
>> d = 5;
>> N = bsxfun(@power, X(:), d:-1:0)
N =
1 1 1 1 1 1
7.5938 5.0625 3.375 2.25 1.5 1
32 16 8 4 2 1
I'm not sure if this is the order you want, but it can be easily reversed: use 0:d instead of d:-1:0...
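For comparison, here is a NumPy sketch of the same design matrix. `np.vander` accepts the number of columns explicitly, so passing `N=d + 1` gives powers d down to 0 for any d. The least-squares fit with `np.linalg.lstsq` is an assumption about how the matrix will be used for the regression, and the sample data are made up for illustration.

```python
import numpy as np

X = np.array([1.0, 1.5, 2.0])
d = 5

# Columns are X**d, X**(d-1), ..., X, 1 (decreasing powers by default).
N = np.vander(X, N=d + 1)

# Hypothetical use: fit a degree-2 polynomial to made-up data y = 3x^2 - x + 2.
x = np.linspace(0.0, 4.0, 9)
y = 3 * x**2 - x + 2
coeffs, *_ = np.linalg.lstsq(np.vander(x, N=3), y, rcond=None)
```

Passing `increasing=True` to `np.vander` reverses the column order, analogous to using 0:d instead of d:-1:0.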