I am trying to implement NMF with the alternating least squares (ALS) method. I am just curious about the following basic implementation of the problem:
If I understand correctly, we can solve each matrix equation stated in this pseudocode without the nonnegativity constraint, using the closed-form least-squares solution, and then set the negative entries to 0, in a brute-force way. Is this understanding correct? Is this a basic alternative to more complicated constrained-optimization approaches, where we use projected gradient descent, for example? More importantly, if implemented in this basic way, will the algorithm have any practical value? I want to use NMF for variable-reduction purposes, and it is important that I use NMF, since my data is by definition non-negative. I am looking for opinions on this one.
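For concreteness, here is a minimal Matlab sketch of the clipped-ALS scheme I am describing (the random initialization, the update order, and the name naiveALS are my own assumptions, not the original pseudocode):
% Naive ALS for NMF: solve each least-squares subproblem in closed form,
% then clip the negative entries to zero. M is m-by-n, r is the target rank.
function [W, H] = naiveALS(M, r, maxiter)
[m, n] = size(M);
W = rand(m, r);                  % random nonnegative initialization (assumed)
for it = 1:maxiter
    H = (W'*W) \ (W'*M);         % unconstrained least squares for H
    H(H < 0) = 0;                % clip: project onto the nonnegative orthant
    W = ((H*H') \ (H*M'))';      % unconstrained least squares for W
    W(W < 0) = 0;                % clip again
end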
If I understand correctly, we can solve each matrix equation stated in this pseudocode without nonnegativity constraints, with a closed-form solution, and set the negative entries to 0, in a brute force way. Is this understanding correct? Yes.
Is this a basic alternative to more complicated, constrained optimization approaches, such as projected gradient descent? In a sense, yes. It is indeed a fast way of computing a nonnegative factorization. However, articles on NMF point out that although this method is fast, it does not guarantee convergence of the nonnegative factors. A better implementation to use would be Hierarchical Alternating Least Squares for NMF (HALS-NMF). Check this paper for a comparison of some popular NMF algorithms: http://www.cc.gatech.edu/~hpark/papers/jgo.pdf
More importantly, if implemented in this basic way, will the algorithm have any practical value? Judging from my experience, I would say the results aren't as good as those of, say, HALS or BPP (block principal pivoting).
Using nonnegative least squares in this algorithm, as opposed to clipping off the negative values, would obviously be better, but in general I would not recommend this basic ALS/ANLS method, as it has bad convergence properties (it often fluctuates or can even diverge). A minimal Matlab implementation of a better method, the accelerated Hierarchical Alternating Least Squares method for NMF (of Cichocki et al.), which is currently one of the fastest methods, is shown here (code by Nicolas Gillis):
% Accelerated hierarchical alternating least squares (HALS) algorithm of
% Cichocki et al.
%
% See N. Gillis and F. Glineur, "Accelerated Multiplicative Updates and
% Hierarchical ALS Algorithms for Nonnegative Matrix Factorization",
% Neural Computation 24 (4), pp. 1085-1105, 2012.
% See http://sites.google.com/site/nicolasgillis/
%
% [U,V,e,t] = HALSacc(M,U,V,alpha,delta,maxiter,timelimit)
%
% Input.
% M : (m x n) matrix to factorize
% (U,V) : initial matrices of dimensions (m x r) and (r x n)
% alpha : nonnegative parameter of the accelerated method
% (alpha=0.5 seems to work well)
% delta : parameter to stop inner iterations when they become
% ineffective (delta=0.1 seems to work well).
% maxiter : maximum number of iterations
% timelimit : maximum time allotted to the algorithm
%
% Output.
% (U,V) : nonnegative matrices s.t. UV approximate M
% (e,t) : error and time after each iteration,
% can be displayed with plot(t,e)
%
% Remark. With alpha = 0, it reduces to the original HALS algorithm.
function [U,V,e,t] = HALSacc(M,U,V,alpha,delta,maxiter,timelimit)
% Initialization
etime = cputime; nM = norm(M,'fro')^2;
[m,n] = size(M); [m,r] = size(U);
a = 0; e = []; t = []; iter = 0;
if nargin <= 3, alpha = 0.5; end
if nargin <= 4, delta = 0.1; end
if nargin <= 5, maxiter = 100; end
if nargin <= 6, timelimit = 60; end
% Scaling, p. 72 of the thesis
eit1 = cputime; A = M*V'; B = V*V'; eit1 = cputime-eit1; j = 0;
scaling = sum(sum(A.*U))/sum(sum( B.*(U'*U) )); U = U*scaling;
% Main loop
while iter <= maxiter && cputime-etime <= timelimit
% Update of U
if j == 1, % Do not recompute A and B at first pass
% Use actual computational time instead of estimates rhoU
eit1 = cputime; A = M*V'; B = V*V'; eit1 = cputime-eit1;
end
j = 1; eit2 = cputime; eps = 1; eps0 = 1;
U = HALSupdt(U',B',A',eit1,alpha,delta); U = U';
% Update of V
eit1 = cputime; A = (U'*M); B = (U'*U); eit1 = cputime-eit1;
eit2 = cputime; eps = 1; eps0 = 1;
V = HALSupdt(V,B,A,eit1,alpha,delta);
% Evaluation of the error e at time t
if nargout >= 3
cnT = cputime;
e = [e sqrt( (nM-2*sum(sum(V.*A))+ sum(sum(B.*(V*V')))) )];
etime = etime+(cputime-cnT);
t = [t cputime-etime];
end
iter = iter + 1; j = 1;
end
% Update of V <- HALS(M,U,V)
% i.e., optimizing min_{V >= 0} ||M-UV||_F^2
% with an exact block-coordinate descent scheme
function V = HALSupdt(V,UtU,UtM,eit1,alpha,delta)
[r,n] = size(V);
eit2 = cputime; % Use actual computational time instead of estimates rhoU
cnt = 1; % Enter the loop at least once
eps = 1; eps0 = 1; eit3 = 0;
while cnt == 1 || (cputime-eit2 < (eit1+eit3)*alpha && eps >= (delta)^2*eps0)
nodelta = 0; if cnt == 1, eit3 = cputime; end
for k = 1 : r
deltaV = max((UtM(k,:)-UtU(k,:)*V)/UtU(k,k),-V(k,:));
V(k,:) = V(k,:) + deltaV;
nodelta = nodelta + deltaV*deltaV'; % used to compute norm(V0-V,'fro')^2;
if V(k,:) == 0, V(k,:) = 1e-16*max(V(:)); end % safety procedure
end
if cnt == 1
eps0 = nodelta;
eit3 = cputime-eit3;
end
eps = nodelta; cnt = 0;
end
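A minimal usage sketch (the data, dimensions, and random initialization below are illustrative assumptions, not part of the original code):
% Illustrative call to HALSacc on random nonnegative data
m = 200; n = 100; r = 10;
M = rand(m, n);                    % nonnegative matrix to factorize
U0 = rand(m, r); V0 = rand(r, n);  % random nonnegative starting point
[U, V, e, t] = HALSacc(M, U0, V0, 0.5, 0.1, 100, 60);
plot(t, e);                        % error versus time, as noted in the header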
For full code and comparison to other methods, see
https://sites.google.com/site/nicolasgillis/code
(section Accelerated MU and HALS algorithms for NMF)
and
N. Gillis and F. Glineur, "Accelerated Multiplicative Updates and Hierarchical ALS Algorithms for Nonnegative Matrix Factorization", Neural Computation 24 (4), pp. 1085-1105, 2012.
Yes, this can be done, but no, you should not do it.
The bottleneck in NMF is not the non-negative least squares calculation; it's the computation of the right-hand side of the least squares equations and the loss calculation (if that is used to determine convergence). In my experience, with a fast NNLS solver, the NNLS adds less than 1% relative runtime compared to basic least squares solving. Nowadays (maybe not when you asked the question) there are very fast NNLS methods, such as TNT-NN and sequential coordinate descent.
I have tried this clipping method and the model quality was really poor; it was nowhere near that of HALS or multiplicative updates.
So, I'm trying to implement the Gauss-Seidel method in Matlab, and I found code that does this, but when I apply it to my matrices I get the "Subscripted assignment dimension mismatch." error. I will show you my code so you get a better idea.
%size of the matrix
n = 10;
% my matrices are empty in the beginning because my professor wants to run
% the algorithm for n = 100 and n = 1000. A's diagonal values are 3 and
% every other value is -1. b has the constants; the first and last value
% will be 2, while every other value will be 1.
A = [];
b = [];
%assign the values to my matrices
for i=1:n
for j=1:n
if i == j
A(i,j) = 3;
else
A(i,j) = -1;
end
end
end
for i=2:n-1
b(i) = 1;
end
%here is the Gauss-Seidel algorithm
idx = 0;
while max(error) > 0.5 * 10^(-4)
idx = idx + 1;
Z = X;
for i = 1:n
j = 1:n; % define an array of the coefficients' indices
j(i) = []; % eliminate the unknown's own coefficient from the remaining coefficients
Xtemp = X; % copy the unknowns to a new variable
Xtemp(i) = []; % eliminate the unknown in question from the set of values
X(i) = (b(i) - sum(A(i,j) * Xtemp)) / A(i,i);
end
Xsolution(:,idx) = X;
error = abs(X - Z);
end
GaussSeidelTable = [1:idx;Xsolution]'
MaTrIx = [A X b]
I get the error for the Xsolution(:,idx) = X; part. I don't know what else to do. The code posted online works, though; the only difference is that its matrices are hardcoded in the m-file, with A a 5x5 matrix and b a 5x1 matrix.
I am unable to run your code because some variables are not initialised, at least error and X. I assume the problem is that Xsolution is already initialised, with a different size, from a previous run. Insert Xsolution = []; before the while loop to fix this.
Besides removing the error, I have some suggestions to improve your code:
Use functions; then there are no "left over" variables from a previous run, which cause errors like the one you got here.
Don't use the variable names error or i. error is a built-in function for throwing errors and i is the imaginary unit. Both can cause hard-to-debug errors.
Initialise A with A = -ones(n); A(eye(n)==1) = 3;, it's faster not to use a for loop in this case. To initialise b as you describe it (first and last value 2, every other value 1), you can simply write b = ones(n,1); b([1 n]) = 2;
Use preallocation:
the first time you run the code, Xsolution(:,idx) = X will create an Xsolution with the size of X.
the second time you run it, the existing Xsolution does not fit the size of the new X.
this is another reason why you always want to allocate an array before using it.
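Putting these suggestions together, here is a minimal corrected sketch (the initial guess X = zeros(n,1), the iteration cap, and the function name are my assumptions; note also that Gauss-Seidel is only guaranteed to converge for suitable matrices, e.g. strictly diagonally dominant ones, which is why the cap matters):
function [X, GaussSeidelTable] = gaussSeidelDemo(n)
% System from the question: diagonal 3, every other entry -1,
% right-hand side 1 everywhere except 2 in the first and last entry.
A = -ones(n); A(eye(n) == 1) = 3;
b = ones(n, 1); b([1 n]) = 2;
X = zeros(n, 1);                 % initial guess (assumed)
tol = 0.5e-4; maxIter = 1000;    % stopping parameters (assumed)
Xsolution = zeros(n, maxIter);   % preallocate
for idx = 1:maxIter
    Z = X;                       % previous iterate
    for k = 1:n
        others = [1:k-1, k+1:n]; % indices of the other unknowns
        X(k) = (b(k) - A(k,others)*X(others)) / A(k,k); % Gauss-Seidel update
    end
    Xsolution(:, idx) = X;
    if max(abs(X - Z)) <= tol, break; end  % convergence check
end
GaussSeidelTable = [1:idx; Xsolution(:, 1:idx)]';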
I have a series and I do not know how to sum the terms together in my for loop.
for j=1:50
E=a(j,1).*(x.^j)
(what should I do now)
end
Thanks in advance
Just for completeness I'll add the vectorized answer:
j = 1:50;
E = sum(a(:).' .* (x.^j)) % assuming a is an n-by-1 vector of coefficients, as in your loop, and x is a scalar
This way you won't need a loop at all, and this is generally the preferred Matlab approach. You should revisit this once you've understood the basics of Matlab.
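A quick sanity check against a loop version (the dummy values for a and x are assumptions):
a = rand(50,1); x = 0.9;          % dummy coefficients and evaluation point
j = 1:50;
E_vec = sum(a(:).' .* (x.^j));    % vectorized sum
E_loop = 0;
for k = 1:50
    E_loop = E_loop + a(k,1)*(x^k);
end
abs(E_vec - E_loop)               % should be ~0 up to floating-point roundoff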
You would have to:
1) store each element separately and then add them together, so that you don't overwrite their values as the loop goes on.
Here is a very simple example:
clear
clc
a = rand(50,1); % generate dummy values for the coefficients;
n = 50;
x = 3; % dummy x value
MySum = zeros(1,n);
for Counter = 1:n
CurrentValue = a(Counter,1)*(x^Counter); % Calculate the current value
MySum(Counter) = CurrentValue; % Store in an array
end
TotalSum = sum(MySum) ;% Once the loop is complete, sum all the values together.
This is not the most efficient way. However, it allows you to access every individual term calculated at each iteration, which could be useful.
2) Alternatively, you could simply add each "current value" to the previously calculated sum, so that the final sum is the last sum calculated in the loop:
CurrentSum = 0; % initialize the running sum
for Counter = 1:n
CurrentValue = a(Counter,1)*(x^Counter); % calculate the current term
CurrentSum = CurrentSum + CurrentValue; % add it to the running sum
end
TotalSum = CurrentSum
So basically your problem comes down to this (with E initialised to 0 before the loop):
E = E + a(j,1).*(x.^j)
That was a pretty long answer for a simple question, sorry! Hope the principles of indexing and for loops are clearer for you now :)
E = 0; % initialize the sum
for j=1:50
E = E + a(j,1).*(x.^j); % accumulate the j-th term
end
I am applying an ML estimate of a Bernoulli random variable. I initially have the following code (with N = 100 observations):
N = 100; % number of observations
muBern = 0.75;
bernoulliSamples = rand(1, N);
bernoulliSamples(bernoulliSamples < muBern) = 1;
bernoulliSamples(bernoulliSamples > muBern & bernoulliSamples ~= 1) = 0;
bernoulliSamples; % 1xN matrix of Bernoulli measurements, 1's and 0's
estimateML = zeros(1,N);
for n = 1:N
estimateML(n) = (1/n)*sum(bernoulliSamples(1:n)); % The ML estimate for muBern
end
This works fairly well, but every run of the code is only one possible result of taking N=100 observations. I want to repeat this experiment I=100 times and take the average of all the results, to get a solution that accurately represents the experiment.
I = 100; N = 100; % number of experiments and observations per experiment
muBern = 0.75;
bernoulliSamples = rand(I, N);
bernoulliSamples(bernoulliSamples < muBern) = 1;
bernoulliSamples(bernoulliSamples > muBern & bernoulliSamples ~= 1) = 0;
bernoulliSamples; % IxN matrix of Bernoulli measurements, 1's and 0's
estimateML = zeros(I,N);
for n = 1:N
estimateML(n,:) = (1/n)*sum(bernoulliSamples(1:n,2)); % The ML estimate for muBern
end
I am wondering if this for loop is doing what I want it to: each row should represent a completely different experiment. Is the second code instance doing the same thing as the first one, only with 100 different results coming from 100 different experiments?
You don't need any loops. In the single-experiment case, replace the loop by this, which does the same thing:
estimateML = cumsum(bernoulliSamples) ./ (1:N);
In the multiple-experiment case, use this:
estimateML = bsxfun(@rdivide, cumsum(bernoulliSamples,2), 1:N);
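In newer Matlab versions (R2016b and later), implicit expansion makes bsxfun unnecessary:
estimateML = cumsum(bernoulliSamples, 2) ./ (1:N); % same result via implicit expansion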
I came up with the answer; I was just overthinking it. If anyone is interested, the following is what I was looking for:
for n = 1:N
estimateML(:,n) = (1/n)*sum(bernoulliSamples(:,1:n),2); % The ML estimate for muBern
end
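A quick way to confirm this matches the bsxfun answer above (the variable names with the _loop and _vec suffixes are mine):
estimateML_vec = bsxfun(@rdivide, cumsum(bernoulliSamples,2), 1:N);
estimateML_loop = zeros(I,N);
for n = 1:N
    estimateML_loop(:,n) = (1/n)*sum(bernoulliSamples(:,1:n),2);
end
max(abs(estimateML_loop(:) - estimateML_vec(:))) % should be ~0 up to roundoff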
My code works, BUT I need to add 2 more things:
output - a vector containing the sequence of estimates, including the initial guess x0
input - the maximum number of iterations
function [ R, E ] = myNewton( f,df,x0,tol )
i = 1;
while abs(f(x0)) >= tol
R(i) = x0;
E(i) = abs(f(x0));
i = i+1;
x0 = x0 - f(x0)/df(x0);
end
if abs(f(x0)) < tol
R(i) = x0;
E(i) = abs(f(x0));
end
end
Well, everything you need is pretty much done already, and you should be able to handle the rest.
The number of iterations performed is contained in the variable i, so you need to return it; change the function signature to this:
function [ R, E , i] = myNewton( f,df,x0,tol )
Plot the sequence of estimates:
plot(R); %after you call myNewton
Display the number of iterations:
disp(i); %after you call myNewton
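To also cover the maximum-iterations input you asked for, here is one possible sketch (the parameter name maxIter is my assumption):
function [R, E, i] = myNewton(f, df, x0, tol, maxIter)
% Newton iteration; R holds the sequence of estimates, starting with x0.
i = 1;
R(i) = x0;
E(i) = abs(f(x0));
while abs(f(x0)) >= tol && i <= maxIter
    x0 = x0 - f(x0)/df(x0);   % Newton step
    i = i + 1;
    R(i) = x0;                % record the current estimate
    E(i) = abs(f(x0));        % and its residual
end
end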