[Figure 1: hypothesis plot of mean entropy (y-axis) vs. bits (x-axis).]
This question is in continuation of a previous one I asked: Matlab: Plot of entropy vs digitized code length.
I want to calculate the entropy of a random variable that is a discretized (0/1) version of a continuous random variable x. The random variable denotes the state of a nonlinear dynamical system called the Tent Map. Iterating the Tent Map yields a time series of length N.
The code should exit as soon as the entropy of the discretized time series becomes equal to the entropy of the dynamical system. It is known theoretically that the entropy of the system is H = log_e(2) = ln(2) ≈ 0.69. The objective of the code is to find the number of iterations, j, needed to produce the same entropy as the entropy of the system, H.
Problem 1: When I calculate the entropy of the binary time series, which is the information message, should I do it in the same base as H? Or should I convert the value of H to bits, because the information message is in 0/1? The two choices give different results, i.e., different values of j.
Problem 2: It can happen that the probability of 0's or 1's becomes zero, so the corresponding log term becomes infinite and the entropy evaluates to NaN. To prevent this, I thought of putting in a check using if-else. But the check
if entropy(:,j)==NaN
    entropy(:,j)=0;
end
does not seem to be working. I shall be grateful for ideas and help to solve this problem. Thank you.
UPDATE: I implemented the suggestions and answers to correct the code. However, my earlier logic for solving the problem was not proper. In the revised code, I want to calculate the entropy for time series of length 2, 8, 16, 32 bits. For each code length, the entropy is calculated. The entropy calculation for each code length is repeated N times, each time starting from a different initial condition of the dynamical system. This approach is adopted to check at which code length the entropy becomes 1. The plot of entropy vs bits should increase from zero and gradually approach 1, after which it saturates, remaining constant for all the remaining bits. I am unable to get this curve (Figure 1). I shall appreciate help in finding where I am going wrong.
clear all
H = 1; % in bits
Bits = [2,8,16,32,64];
threshold = 0.5;
N = 100; % Number of runs of the experiment
for r = 1:length(Bits)
    t = Bits(r)
    for Runs = 1:N
        x(1) = rand;
        for j = 2:t
            % Iterating the Tent Map
            if x(j - 1) < 0.5
                x(j) = 2 * x(j - 1);
            else
                x(j) = 2 * (1 - x(j - 1));
            end % if
        end
        % Binarizing the output of the Tent Map
        s = (x >= threshold);
        p1 = sum(s == 1) / length(s); % probability of 1's
        p0 = 1 - p1;                  % probability of 0's
        entropy(t) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1); % entropy in bits
        if isnan(entropy(t))
            entropy(t) = 0;
        end
        %disp(abs(lambda-H))
    end
    Entropy_Run(Runs) = entropy(t)
end
Entropy_Bits(r) = mean(Entropy_Run)
plot(Bits,Entropy_Bits)
For problem 1, H and entropy can be in either nats or bits, as long as both are computed in the same units. In other words, you should use either log for both or log2 for both. With the code sample you provided, H and entropy are consistently calculated in nats. If you prefer to work in bits, the conversion of H gives H = log(2)/log(2) = 1 (or, using the conversion factor 1/log(2) ≈ 1.443, H ≈ 0.69 * 1.443 ≈ 1).
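For instance, a quick sanity check of the unit conversion (a sketch; p1 is just an example probability):
p1 = 0.3;                                  % example probability of 1's
H_nats = -p1*log(p1) - (1-p1)*log(1-p1);   % entropy in nats
H_bits = -p1*log2(p1) - (1-p1)*log2(1-p1); % entropy in bits
disp(H_bits - H_nats/log(2));              % ~0, since bits = nats / ln(2)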
For problem 2, as @noumenal already pointed out, you can check for NaN using isnan. Alternatively, you could check whether p1 is strictly inside (0,1) (excluding 0 and 1):
if (p1 > 0 && p1 < 1)
    entropy(:,j) = -p1 * log(p1) - (1 - p1) * log(1 - p1); % entropy in nats (natural log)
else
    entropy(:,j) = 0;
end
First you just define a function that computes the entropy:
function [mean_entropy, bits] = compute_entropy(bits, blocks, threshold, replicate)
if replicate
    disp('Replication is ON');
else
    disp('Replication is OFF');
end
%%
% Populate random vector of initial conditions, one per run
if replicate
    seed = 849;
    rng(seed);
else
    rng('default');
end
rs = rand(blocks, 1);
%%
% One column of bit entropies per run
trial_entropy = zeros(length(bits), blocks);
for r = 1:length(rs)
    bit_entropy = zeros(length(bits), 1); % H
    % Traverse bit trials
    for b = 1:length(bits)
        t = bits(b);            % time-series length (in bits) for this trial
        tent_map = zeros(t, 1); % Preallocate for memory management
        % Initialize
        tent_map(1) = rs(r);
        for j = 2:t % j is the iterator, t is the current bit length
            if tent_map(j - 1) < threshold
                tent_map(j) = 2 * tent_map(j - 1);
            else
                tent_map(j) = 2 * (1 - tent_map(j - 1));
            end % if
        end
        % Binarize the output of the Tent Map
        s = (tent_map >= threshold);
        p1 = sum(s) / length(s); % probability of 1's
        %p0 = 1 - p1;            % probability of 0's
        bit_entropy(b) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1); % entropy in bits
        if isnan(bit_entropy(b))
            bit_entropy(b) = 0;
        end
    end
    trial_entropy(:, r) = bit_entropy;
    disp('Trial Statistics')
    data = get_summary(bit_entropy);
    disp('Mean')
    disp(data.mean);
    disp('SD')
    disp(data.sd);
end
% Mean for each bit index across runs
mean_entropy = mean(trial_entropy, 2);
disp('Overall Statistics')
data = get_summary(trial_entropy(:));
disp('Mean')
disp(data.mean);
disp('SD')
disp(data.sd);
    function summary = get_summary(entropy)
        summary = struct('mean', mean(entropy), 'sd', std(entropy));
    end
end
and then you just have to call it from a script:
% Entropy Script
clear all
%% Settings
replicate = false; % Use true for debugging only.
%H = 1; %in bits
Bits = 2.^(1:6);
Threshold = 0.5;
%Tolerance = 0.001;
Blocks = 100; %Number of runs of the experiment
%% Run
[mean_entropy, bits] = compute_entropy(Bits, Blocks, Threshold, replicate);
plot(bits, mean_entropy);
I am struggling to plot the PDF and CDF graphs of Sn, where
Sn = X1 + X2 + X3 + ... + Xn,
using the central limit theorem, for n = 1, 2, 3, 4, 5, 10, 20, 40.
I am taking Xi to be a continuous uniform random variable on the interval (0,3).
Here is what I have done so far:
close all
%different sizes of input X
%N=[1 5 10 50];
N = [1 2 3 4 5 10 20 40];
%interval (0,3) for the random variables
a = 0;
b = 3;
%to store sums of different sizes of input
for i = 1:length(N)
    %generate uniform random numbers in the interval
    X = a + (b-a).*rand(N(i),1);
    S = zeros(1,length(X));
    S = cumsum(X);
    cd = cdf('Uniform',S,0,3);
    plot(cd);
    hold on;
end
legend('n=1','n=2','n=3','n=4','n=5','n=10','n=20','n=40');
title('CDF PLOT')
figure;
for i = 1:length(N)
    %generate uniform random numbers in the interval
    X = a + (b-a).*rand(N(i),1);
    S = zeros(1,length(X));
    S = cumsum(X);
    cd = pdf('Uniform',S,0,3);
    plot(cd);
    hold on;
end
legend('n=1','n=2','n=3','n=4','n=5','n=10','n=20','n=40');
title('PDF PLOT')
My output is nowhere near what I am expecting; any help is much appreciated.
This can be done with vectorization using rand() and cumsum().
For example, the code below generates 40 replications of 10000 samples of a Uniform(0,3) distribution and stores them in X. To meet the Central Limit Theorem (CLT) assumptions, they are independent and identically distributed (i.i.d.). Then cumsum() transforms this into 10000 copies of each partial sum Sn = X1 + X2 + ... + Xn: the first row holds 10000 copies of S_1 = X1, the 5th row holds 10000 copies of S_5 = X1 + X2 + X3 + X4 + X5, and the last row holds 10000 copies of S_40.
% MATLAB R2019a
% Setup
N = [1:5 10 20 40]; % values of n we are interested in
LB = 0; % lowerbound for X ~ Uniform(LB,UB)
UB = 3; % upperbound for X ~ Uniform(LB,UB)
n = 10000; % Number of copies (samples) for each random variable
% Generate random variates
X = LB + (UB - LB)*rand(max(N),n); % X ~ Uniform(LB,UB) (i.i.d.)
Sn = cumsum(X);
You can see from the image that in the n = 2 case the sum indeed has a Triangular(0,3,6) distribution. For the n = 40 case, the sum is approximately Normally distributed (Gaussian) with mean 60 (40*mean(X) = 40*1.5 = 60). This shows the convergence in distribution for both the probability density function (PDF) and the cumulative distribution function (CDF).
Note: The CLT is often stated with convergence in distribution to a Normal distribution with zero mean as it has been shifted. Shifting the results by subtracting mean(Sn) = n*mean(X) = n*0.5*(LB+UB) from Sn gets this done.
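A minimal sketch of that shift for the n = 40 row (using Sn, LB, UB from the code above; the scaling step is optional):
% Shift S_40 to zero mean, then (optionally) scale to unit variance
Sn40 = Sn(40,:) - 40*0.5*(LB+UB);        % centered copies of S_40
Z40  = Sn40 / sqrt((40/12)*(UB-LB)^2);   % standardized: approx. Normal(0,1)
histogram(Z40,'Normalization','pdf')
hold on
z = -4:0.01:4;
plot(z, normpdf(z,0,1), 'r-')            % standard Normal pdf for comparison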
Code below isn't the gold standard but it produced the image.
figure
s(11) = subplot(6,2,1) % n = 1
histogram(Sn(1,:),'Normalization','pdf')
title(s(11),'n = 1')
s(12) = subplot(6,2,2)
cdfplot(Sn(1,:))
title(s(12),'n = 1')
s(21) = subplot(6,2,3) % n = 2
histogram(Sn(2,:),'Normalization','pdf')
title(s(21),'n = 2')
s(22) = subplot(6,2,4)
cdfplot(Sn(2,:))
title(s(22),'n = 2')
s(31) = subplot(6,2,5) % n = 5
histogram(Sn(5,:),'Normalization','pdf')
title(s(31),'n = 5')
s(32) = subplot(6,2,6)
cdfplot(Sn(5,:))
title(s(32),'n = 5')
s(41) = subplot(6,2,7) % n = 10
histogram(Sn(10,:),'Normalization','pdf')
title(s(41),'n = 10')
s(42) = subplot(6,2,8)
cdfplot(Sn(10,:))
title(s(42),'n = 10')
s(51) = subplot(6,2,9) % n = 20
histogram(Sn(20,:),'Normalization','pdf')
title(s(51),'n = 20')
s(52) = subplot(6,2,10)
cdfplot(Sn(20,:))
title(s(52),'n = 20')
s(61) = subplot(6,2,11) % n = 40
histogram(Sn(40,:),'Normalization','pdf')
title(s(61),'n = 40')
s(62) = subplot(6,2,12)
cdfplot(Sn(40,:))
title(s(62),'n = 40')
sgtitle({'PDF (left) and CDF (right) for Sn with n \in \{1, 2, 5, 10, 20, 40\}';'note different axis scales'})
for tgt = [11:10:61 12:10:62]
xlabel(s(tgt),'Sn')
if rem(tgt,2) == 1
ylabel(s(tgt),'pdf')
else % rem(tgt,2) == 0
ylabel(s(tgt),'cdf')
end
end
Key functions used for plot: histogram() from base MATLAB and cdfplot() from the Statistics toolbox. Note this could be done manually without requiring the Statistics toolbox with a few lines to obtain the cdf and then just calling plot().
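For instance, a manual empirical CDF takes only a sort and a normalized index (a sketch using the S_40 row from above):
% Empirical CDF of S_40 without the Statistics toolbox
v = sort(Sn(40,:));              % sorted samples
F = (1:numel(v)) / numel(v);     % empirical CDF value at each sorted point
plot(v, F)
xlabel('S_n'), ylabel('cdf')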
There was some concern in comments over the variance of Sn.
Note the variance of Sn is given by (n/12)*(UB-LB)^2 (derivation below). Monte Carlo simulation shows our samples of Sn do have the correct variance; indeed, the sample variance converges to this value as the number of copies grows. Simply call var(Sn(40,:)).
% with n = 10000
var(Sn(40,:))       % 29.9505 here (sample variance; will vary slightly with the random seed)
(40/12)*((UB-LB)^2) % 30, the theoretical variance of S_40
You can see the convergence is very good by S_40:
step = 0.01;
Domain = 40:step:80;
mu = 40*(LB+UB)/2;
sigma = sqrt((40/12)*((UB-LB)^2));
figure, hold on
histogram(Sn(40,:),'Normalization','pdf')
plot(Domain,normpdf(Domain,mu,sigma),'r-','LineWidth',1.4)
ylabel('pdf')
xlabel('S_n')
Derivation of the mean and variance of Sn:
E[Sn] = E[X1 + X2 + ... + Xn] = E[X1] + ... + E[Xn] = n*E[X1] = n*(LB+UB)/2
For the expectation (mean), the second equality holds by linearity of expectation; the third equality holds since the X_i are identically distributed. Likewise, since the X_i are independent,
Var(Sn) = Var(X1 + X2 + ... + Xn) = Var(X1) + ... + Var(Xn) = n*Var(X1) = (n/12)*(UB-LB)^2.
The discrete version of this is posted here.
I am writing a Matlab script that will approximate sin(x) and cos(x) using their Maclaurin polynomials.
When I input arg = (5*pi)/4, I expect to get the correct approximations for
sin((5*pi)/4) = -0.7071067811865474617
cos((5*pi)/4) = -0.7071067811865476838.
Instead I get the following when running the script:
Approximation of sin(3.92699) >> -0.7071067811865474617
Actual sin(3.92699) = -0.7071067811865474617
Error approximately = 0.0000000000000000000 (0)
----------------------------------------------------------
Approximation of cos(3.92699) >> 0.7071067811865474617
Actual cos(3.92699) = -0.7071067811865476838
Error approximately = 0.0000000000000001110 (1.1102e-16)
I am getting the correct answers for sin but incorrect for cosine when the argument (angle) is in quadrant 3 or 4. The problem is that I am getting the wrong sign on the cos(arg) value. Where have I messed up?
CalculatorForSineCosine.m
% Argument for sine/cosine in radians.
arg = (5*pi)/4;
% Move the argument x so it's within [0, pi/2].
newArg = moveArgumentV2(arg);
% Calculate what degree we need for our Taylor polynomial.
TOL = 0; % If 0, assume we want machine epsilon.
r = findDegreeV2(TOL);
% Plot nth degree Taylor polynomial around x = 0 for sine
% and calculate the approximation of sin(x).
[approximatedSin, errorSin] = sin_taylorV2(r, newArg);
eS = num2str(errorSin); % errorSin in string format
% Plot nth degree Taylor polynomial around x = 0 for cosine
% and calculate the approximation of cos(x).
[approximatedCos, errorCos] = cos_taylorV2(r, newArg);
eC = num2str(errorCos); % errorCos in string format
% Print out the result.
fprintf('\nApproximation of sin(%.5f)\t >> %.19f\n', arg, approximatedSin);
fprintf('Actual sin(%.5f)\t\t\t\t = %.19f\n', arg, sin(arg));
fprintf('Error approximately\t\t\t\t = %.19f (%s)\n', errorSin, eS);
disp("----------------------------------------------------------")
fprintf('Approximation of cos(%.5f)\t >> %.19f\n', arg, approximatedCos);
fprintf('Actual cos(%.5f)\t\t\t\t = %.19f\n', arg, cos(arg));
fprintf('Error approximately\t\t\t\t = %.19f (%s)\n\n', errorCos, eC);
sin_taylorV2.m
function [approximatedSin, errorSin] = sin_taylorV2(r, x)
%% Approximate sin(x) with Q_2n+1(x), where 2n+1 = degree of polynomial.
n = (r - 1)/2;
% Approximate sin(x) using its Taylor polynomial.
approximatedSin = 0;
for k = 0:n
    approximatedSin = approximatedSin + (((-1).^k) .* (x.^(2.*k+1)))./(factorial(2.*k+1));
end
% Calculate the error.
errorSin = abs(sin(x) - approximatedSin);
end
cos_taylorV2.m
function [approximatedCos, errorCos] = cos_taylorV2(r, x)
%% Approximate cos(x) with Q_2n(x), where 2n = degree of polynomial and n = # terms.
n = (r - 1)/2;
% Approximate cos(x) using its Taylor polynomial.
approximatedCos = 0;
for k = 0:n
    approximatedCos = approximatedCos + (((-1).^k) .* (x.^(2.*k)))./(factorial(2.*k));
end
% Calculate the error.
errorCos = abs(cos(x) - approximatedCos);
end
moveArgumentV2.m
function newArg = moveArgumentV2(arg)
%% Moves the argument x to the interval [-pi/2, pi/2].
% Make use of sine's periodicity and choose n as ceil((x-pi)/(2*pi)).
n = ceil((arg-pi)/(2*pi));
x1 = arg - 2*pi*n;    % New angle will be in [-pi, pi]
x2 = abs(x1);         % Angle will be in [0, pi]
if (x2 < pi/2) && (x2 > 0)
    x3 = x2;
else
    x3 = pi - x2;
end
newArg = x3*sign(x1); % Angle will be in [-pi/2, pi/2]
end
I would like to point out two things in your code.
First, you don't need the moveArgumentV2(arg) function: the Maclaurin/Taylor series of sin(x) and cos(x) have an infinite radius of convergence, so the series converge for any real x, disregarding the round-off errors inherent in every arithmetic operation done on a computer.
As a matter of fact, following your code, we can write a function that approximates the cos as:
function y = mycos(x,n)
    y = 0;
    for k = 0:n
        term = (-1)^k*x.^(2*k)/factorial(2*k);
        y = y + term;
    end
end
Notice this function works for values outside the range [-pi,pi]:
x = -10*pi:0.1:10*pi;
ye = cos(x);       % exact value
ya = mycos(x,100); % approximated value
plot(x,ye,x,ya,'o')
The values returned by the mycos function are close to the exact value given by the cos built-in function. This happens because I calculated the approximation with the first 100 terms. The error, however, for higher values of x, is extremely large if we use just a few terms.
ya = mycos(x,10) % approximated value with 10 terms only
plot(x,ye-ya); title('error')
The problem now is that we can't just increase the number of terms without running into another problem.
If we increase the number of terms, the mycos function crumbles due to round-off errors, because the factorial function overflows. A good idea is to change your code to avoid the factorial function altogether. Notice the recurrence between successive terms in the Maclaurin expansion of the cos function; you can then write another function without the factorial:
function y = mycos2(x,n)
    term = 1;
    y = 1;
    for k = 1:n
        term = -term.*x.^2/(2*k-1)/(2*k);
        y = y + term;
    end
end
Here, we calculate each term in the series expansion from the previously calculated term. We avoid the calculation of the factorial and make use of what we already have. This speeds up the code and avoids overflow. In fact, if we now calculate the cos approximation with 500 terms, we get:
x = -10*pi:0.5:10*pi;
ye = cos(x); % exact value
ya = mycos(x,500); % approximated value
ya2 = mycos2(x,500); % approximated value
plot(x,ye,x,ya,'x',x,ya2,'s')
legend('ye','ya','ya2')
Notice in this figure that the x marks are the calculations done with the mycos function, while the square marks were done without using the factorial function. The first function crumbles for values outside the range [-2,2], but the second one runs just fine. It works even when I use 1e5 terms. Increasing the number of terms reduces the error, so you can estimate how many terms you will need for an approximation, given a desired tolerance. If the required factorial argument exceeds 170, the first function will not work properly.
factorial(170) returns 7.2574e+306, but factorial(171) returns Inf, so any term whose factorial argument exceeds 170 (for the cos series, which uses factorial(2*k), that is roughly the 86th term onward) will have problems in the first function. Avoid the calculation of factorial at all costs.
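As a rough sketch of that term-count estimate (using the same term recurrence as mycos2; xmax and tol are made-up values):
% How many series terms are needed for |x| <= xmax and tolerance tol?
xmax = 10*pi; tol = 1e-12;
term = 1; k = 0;
while abs(term) > tol
    k = k + 1;
    term = -term * xmax^2 / ((2*k-1) * (2*k)); % same recurrence as mycos2
end
fprintf('%d terms needed for |x| <= %.2f\n', k, xmax);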
This is what I tried:
x = -3*pi:0.01:3*pi;
y = x;
for ii = 1:numel(y)
    y(ii) = moveArgumentV2(y(ii)); % moveArgumentV2 is not vectorized
end
plot(sin(x))
hold on
plot(sin(y))
Both sin(x) and sin(y) produce the same plot. But:
plot(cos(x))
hold on
plot(cos(y))
Now we see that cos(x) and cos(y) are not the same! This is because moveArgumentV2 changes the angle to be in the first and fourth quadrant (in the range [-pi/2, pi/2]), which is what you need for the sin function, but is not adequate for the cos function.
I would modify sin_taylorV2 and cos_taylorV2 to call moveArgumentV2, so you don't rely on the caller to know what the valid input range is. In cos_taylorV2 you would need to call it this way:
x = moveArgumentV2(x+pi/2) - pi/2;
and in sin_taylorV2 you'd call it the same way you do now.
Or, better, write cos_taylorV2 in terms of sin_taylorV2, which we know to be correct. This avoids code duplication.
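A minimal sketch of that wrapper, relying on the identity cos(x) = sin(x + pi/2) (and on the fact that moveArgumentV2 preserves the sine of its argument):
function [approximatedCos, errorCos] = cos_taylorV2(r, x)
    % cos(x) = sin(x + pi/2), so reuse the sine approximation,
    % range-reducing the shifted argument for the sine series
    newArg = moveArgumentV2(x + pi/2);
    [approximatedCos, ~] = sin_taylorV2(r, newArg);
    errorCos = abs(cos(x) - approximatedCos);
end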
I am trying to implement NMF with Alternating Least Squares method. I am just curious about the following basic implementation of the problem:
If I understand correctly, we can solve each matrix equation stated in this pseudocode without nonnegativity constraints, in closed form, and then set the negative entries to 0, in a brute-force way. Is this understanding correct? Is this a basic alternative to more complicated, constrained optimization problems, where we use projected gradient descent, for example? More importantly, if implemented in this basic way, will the algorithm have any practical value? I want to use NMF for variable reduction purposes, and it is important that I use NMF, since my data is by definition non-negative. I am looking for opinions on this one.
If I understand correctly, we can solve each matrix equation stated in this pseudocode without nonnegativity constraints, in closed form, and then set the negative entries to 0, in a brute-force way. Is this understanding correct? Yes.
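In Matlab terms, a minimal sketch of that brute-force scheme (assuming a nonnegative data matrix M; the rank r and iteration count are placeholders):
% Naive ALS for NMF: factorize a nonnegative M (m x n) as W*H
r = 10;                            % placeholder reduced rank
W = rand(size(M,1), r);            % random nonnegative initialization
for iter = 1:100                   % placeholder iteration count
    H = W \ M;      H(H < 0) = 0;  % unconstrained least squares for H, then clip
    W = (H' \ M')'; W(W < 0) = 0;  % same for W
end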
Is this a basic alternative to more complicated, constrained optimization problems, where we use projected gradient descent, for example? In a sense, yes. This is indeed a fast way of doing nonnegative factorization. However, articles related to NMF point out that although this method is fast, it does not guarantee convergence of the nonnegative factors. A better implementation to use would be Hierarchical Alternating Least Squares for NMF (HALS-NMF). Check this paper for a comparison of some popular NMF algorithms: http://www.cc.gatech.edu/~hpark/papers/jgo.pdf
More importantly, if implemented in this basic way, will the algorithm have any practical value? Based on my experience, I would say that the results aren't as good as those of, say, HALS or BPP (block principal pivoting).
Using nonnegative least squares in this algorithm, as opposed to clipping off negative values, would obviously be better, but in general I would not recommend this basic ALS/ANNLS method: it has bad convergence properties (it often fluctuates or can even show divergence). A minimal Matlab implementation of a better method, the accelerated Hierarchical Alternating Least Squares method for NMF (of Cichocki et al.), which is currently one of the fastest methods, is shown below (code by Nicolas Gillis):
% Accelerated hierarchical alternating least squares (HALS) algorithm of
% Cichocki et al.
%
% See N. Gillis and F. Glineur, "Accelerated Multiplicative Updates and
% Hierarchical ALS Algorithms for Nonnegative Matrix Factorization",
% Neural Computation 24 (4), pp. 1085-1105, 2012.
% See http://sites.google.com/site/nicolasgillis/
%
% [U,V,e,t] = HALSacc(M,U,V,alpha,delta,maxiter,timelimit)
%
% Input.
%   M : (m x n) matrix to factorize
%   (U,V) : initial matrices of dimensions (m x r) and (r x n)
%   alpha : nonnegative parameter of the accelerated method
%           (alpha = 0.5 seems to work well)
%   delta : parameter to stop inner iterations when they become
%           ineffective (delta = 0.1 seems to work well)
%   maxiter : maximum number of iterations
%   timelimit : maximum time allotted to the algorithm
%
% Output.
%   (U,V) : nonnegative matrices s.t. UV approximates M
%   (e,t) : error and time after each iteration,
%           can be displayed with plot(t,e)
%
% Remark. With alpha = 0, it reduces to the original HALS algorithm.
function [U,V,e,t] = HALSacc(M,U,V,alpha,delta,maxiter,timelimit)
% Initialization
etime = cputime; nM = norm(M,'fro')^2;
[m,n] = size(M); [m,r] = size(U);
a = 0; e = []; t = []; iter = 0;
if nargin <= 3, alpha = 0.5; end
if nargin <= 4, delta = 0.1; end
if nargin <= 5, maxiter = 100; end
if nargin <= 6, timelimit = 60; end
% Scaling, p. 72 of the thesis
eit1 = cputime; A = M*V'; B = V*V'; eit1 = cputime-eit1; j = 0;
scaling = sum(sum(A.*U))/sum(sum( B.*(U'*U) )); U = U*scaling;
% Main loop
while iter <= maxiter && cputime-etime <= timelimit
    % Update of U
    if j == 1 % Do not recompute A and B at first pass
        % Use actual computational time instead of estimates rhoU
        eit1 = cputime; A = M*V'; B = V*V'; eit1 = cputime-eit1;
    end
    j = 1; eit2 = cputime; eps = 1; eps0 = 1;
    U = HALSupdt(U',B',A',eit1,alpha,delta); U = U';
    % Update of V
    eit1 = cputime; A = (U'*M); B = (U'*U); eit1 = cputime-eit1;
    eit2 = cputime; eps = 1; eps0 = 1;
    V = HALSupdt(V,B,A,eit1,alpha,delta);
    % Evaluation of the error e at time t
    if nargout >= 3
        cnT = cputime;
        e = [e sqrt( (nM-2*sum(sum(V.*A))+ sum(sum(B.*(V*V')))) )];
        etime = etime+(cputime-cnT);
        t = [t cputime-etime];
    end
    iter = iter + 1; j = 1;
end
% Update of V <- HALS(M,U,V)
% i.e., optimizing min_{V >= 0} ||M-UV||_F^2
% with an exact block-coordinate descent scheme
function V = HALSupdt(V,UtU,UtM,eit1,alpha,delta)
[r,n] = size(V);
eit2 = cputime; % Use actual computational time instead of estimates rhoU
cnt = 1;        % Enter the loop at least once
eps = 1; eps0 = 1; eit3 = 0;
while cnt == 1 || (cputime-eit2 < (eit1+eit3)*alpha && eps >= (delta)^2*eps0)
    nodelta = 0; if cnt == 1, eit3 = cputime; end
    for k = 1 : r
        deltaV = max((UtM(k,:)-UtU(k,:)*V)/UtU(k,k),-V(k,:));
        V(k,:) = V(k,:) + deltaV;
        nodelta = nodelta + deltaV*deltaV'; % used to compute norm(V0-V,'fro')^2
        if V(k,:) == 0, V(k,:) = 1e-16*max(V(:)); end % safety procedure
    end
    if cnt == 1
        eps0 = nodelta;
        eit3 = cputime-eit3;
    end
    eps = nodelta; cnt = 0;
end
For full code and comparison to other methods, see
https://sites.google.com/site/nicolasgillis/code
(section Accelerated MU and HALS algorithms for NMF)
and
N. Gillis and F. Glineur, "Accelerated Multiplicative Updates and Hierarchical ALS Algorithms for Nonnegative Matrix Factorization", Neural Computation 24 (4), pp. 1085-1105, 2012.
Yes, this can be done, but no you should not do it.
The bottleneck in NMF is not the non-negative least squares calculation, it's the calculation of the right-hand side of the least squares equations and the loss calculation (if used to determine convergence). In my experience, with a fast NNLS solver, the NNLS adds less than 1% relative runtime compared to basic least squares solving. Nowadays (maybe not when you asked the question) there are very fast methods such as TNT-NN and sequential coordinate descent which make things very fast.
I have tried this method and the model quality was really poor; the results were nowhere near those of HALS or multiplicative updates.
I'm trying to get a volume-time graph of a .wav file. First, I recorded sound (patient exhalations) via Android as a .wav file, but when I read this .wav file in MATLAB it has negative values. What is the meaning of the negative values? Second, MATLAB experts, could you please check whether the code below does what my comments say it does? Also, another question: after
Y = fft(WindowArray);
p = abs(Y).^2;
I took the power of the values returned from fft... is that correct, and what is the goal of this step?
[data, fs] = wavread('newF2');
% read exhalation audio wav file (1 channel, mono)
% sampling frequency is 44100 Hz
% windows of 0.1 s and overlap of 0.05 s
WINDOW_SIZE = fs*0.1; % 4410 = fs*0.1
array_size = length(data); % array size of data
numOfPeaks = (array_size/(WINDOW_SIZE/2)) - 1;
step = floor(WINDOW_SIZE/2); % step size used in loop
transformed = data;
start = 1;
k = 1;
t = 1;
g = 1;
o = 1;
% performing fft on each window and finding the peak of each window
while (((start+WINDOW_SIZE)-1) <= array_size)
    j = 1;
    i = start;
    while (j <= WINDOW_SIZE)
        WindowArray(j) = transformed(i);
        j = j + 1;
        i = i + 1;
    end
    Y = fft(WindowArray);
    p = abs(Y).^2;        % power
    [a, b] = max(abs(Y)); % find max a and its index b
    [m, i] = max(p);      % the maximum of the power m and its index i
    maximum(g) = m;
    index(t) = i;
    power(o) = a;
    indexP(g) = b;
    start = start + step;
    k = k + 1;
    t = t + 1;
    g = g + 1;
    o = o + 1;
end
% low pass filter
% filtering noise: ignore values that are less than 5% of the maximum
for u = 1:length(maximum)
    M = max(maximum); % highest value in the array
    Accept = 0.05 * M;
    if (maximum(u) > Accept)
        maximum = maximum(u:length(maximum));
        break;
    end
end
% preparing the time axis of the graph;
% locations of the peak flow rates are estimated
TotalTime = (numOfPeaks * 0.1);
time1 = [0:0.1:TotalTime];
if (length(maximum) > ceil(numOfPeaks))
    maximum = maximum(1:ceil(numOfPeaks));
end
time = time1(1:length(maximum));
% plotting frequency-time graph
figure(1);
plot(time, maximum);
ylabel('Frequency');
xlabel('Time (in seconds)');
% plotting volume-time graph
figure(2);
plot(time, cumsum(maximum)); % integration over time to get volume
ylabel('Volume');
xlabel('Time (in seconds)');
(I only answer the part of the question which I understood.)
By default, Matlab normalizes the audio waveform to the range -1...1. Use the 'native' option if you want the raw integer data.
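For example (audioread is the current replacement for the older wavread; the 'native' flag skips the normalization):
[data, fs] = audioread('newF2.wav');           % normalized doubles in [-1, 1]
[raw, fs]  = audioread('newF2.wav', 'native'); % raw integer samples, e.g. int16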
First, in your code, p = abs(Y).^2 is already the proper way to square the values returned from the FFT in MATLAB (in Python it would be p = abs(Y)**2). The reason you take the absolute value of the FFT return values is that those numbers are complex, with a real and an imaginary part; the absolute value (or modulus) of a complex number is its magnitude. The goal of taking the power could be to eventually obtain an RMS (root mean squared) value of your overall amplitude, but you could also have something else in mind. When you say volume-time I assume you want decibels, so try something like this (Python; note the power spectrum is plotted against frequency, since a single FFT has no time axis):
import matplotlib.pyplot as plt
from numpy import arange, ceil, log10
from numpy.fft import fft
from scipy.io import wavfile

def plot_signal(file_name):
    sampFreq, snd = wavfile.read(file_name)
    snd = snd / (2.**15)  # convert sound array to floating point values
                          # floating point values range from -1 to 1
    s1 = snd[:, 0]  # left channel
    s2 = snd[:, 1]  # right channel

    n = len(s1)
    p = fft(s1)   # take the Fourier transform of the left channel
    m = len(s2)
    p2 = fft(s2)  # take the Fourier transform of the right channel

    # keep only the unique (non-mirrored) points
    nUniquePts = int(ceil((n + 1) / 2.0))
    p = p[0:nUniquePts]
    p = abs(p)    # fft returns complex numbers; take the magnitude (modulus)

    mUniquePts = int(ceil((m + 1) / 2.0))
    p2 = p2[0:mUniquePts]
    p2 = abs(p2)

    '''
    Left Channel
    '''
    p = p / float(n)  # scale by the number of points so that
                      # the magnitude does not depend on the length
                      # of the signal or on its sampling frequency
    p = p**2          # square it to get the power

    # multiply by two (see technical document for details)
    # odd nfft excludes the Nyquist point
    if n % 2 > 0:  # we've got an odd number of points in the fft
        p[1:len(p)] = p[1:len(p)] * 2
    else:          # we've got an even number of points in the fft
        p[1:len(p) - 1] = p[1:len(p) - 1] * 2

    # the power spectrum is a function of frequency, not time
    freqArray = arange(0, nUniquePts, 1.0) * (sampFreq / n)
    plt.plot(freqArray / 1000, 10 * log10(p), color='k')
    plt.xlabel('Frequency (kHz)')
    plt.ylabel('LeftChannel_Power (dB)')
    plt.show()

    '''
    Right Channel
    '''
    p2 = p2 / float(m)  # same scaling for the right channel
    p2 = p2**2

    if m % 2 > 0:  # we've got an odd number of points in the fft
        p2[1:len(p2)] = p2[1:len(p2)] * 2
    else:          # we've got an even number of points in the fft
        p2[1:len(p2) - 1] = p2[1:len(p2) - 1] * 2

    freqArray2 = arange(0, mUniquePts, 1.0) * (sampFreq / m)
    plt.plot(freqArray2 / 1000, 10 * log10(p2), color='k')
    plt.xlabel('Frequency (kHz)')
    plt.ylabel('RightChannel_Power (dB)')
    plt.show()
I hope this helps.
I'm trying to create a matrix such that if I define a random number between 0 and 1 and a random location in the matrix, I want all the values around that to "diffuse" out. Here's sort of an example:
0.214 0.432 0.531 0.631 0.593 0.642
0.389 0.467 0.587 0.723 0.654 0.689
0.421 0.523 0.743 0.812 0.765 0.754
0.543 0.612 0.732 0.843 0.889 0.743
0.322 0.543 0.661 0.732 0.643 0.694
0.221 0.321 0.492 0.643 0.521 0.598
If you notice, there's a peak at (4,5) = 0.889, and all the other numbers decrease as they move away from that peak.
I can't figure out a nice way to write code that does this. Any thoughts? I need to be able to generate this type of matrix with random peaks and a random rate of decrease...
Without knowing what other constraints you want to implement:
Come up with a function z = f(x,y) whose peak value is at (x0,y0) == (0,0) and whose values range between [0,1]. As an example, the PDF of the Normal distribution with mu = 0 and sigma = 1/sqrt(2*pi) has a peak of 1.0 at x == 0, and its lower bound is zero. Similarly, a bivariate normal PDF with mu = {0,0} and determinant(sigma) == [1/(2*pi)]^2 will have similar characteristics.
Any mathematical function may have its domain shifted: f(x-x0, y-y0)
Your code will look something like this:
someFunction = @(x,y) theFunctionYouPicked(x,y);
[x0,y0,peak] = ... % you supply these values
myFunction = @(x,y) peak * someFunction(x - x0, y - y0);
[dimX,dimY] = ...  % you supply these values
mymatrix = bsxfun( myFunction, 0:dimX, (0:dimY)' );
You can read more about bsxfun here; however, here's an example of how it works:
bsxfun( blah, [a b c], [d e f]' )
That should give the following matrix (or its transpose ... I don't have matlab in front of me):
[blah(a,d) blah(a,e) blah(a,f);
blah(b,d) blah(b,e) blah(b,f);
blah(c,d) blah(c,e) blah(c,f)]
Get a toy example working, then you can tinker with it to be more flexible. If the function dictating how it decreases is random (with the constraint that points closer to (x0,y0) are larger than more distant points), it won't be an issue to make a procedural function instead of using strictly mathematical ones.
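For instance, a toy version of the steps above with a Gaussian bump (a sketch; the peak location, height, decay rate, and grid size are made-up values):
% Gaussian bump: value 'peak' at (x0, y0), decaying smoothly toward 0
sigma = 5;                                   % decay rate (made-up)
someFunction = @(x,y) exp(-(x.^2 + y.^2) / (2*sigma^2));
x0 = 20; y0 = 35; peak = 0.889;              % made-up peak location/height
myFunction = @(x,y) peak * someFunction(x - x0, y - y0);
mymatrix = bsxfun(myFunction, 1:60, (1:60)');
imagesc(mymatrix); colorbar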
In response to your answer:
Your equation could be thought of as a model for gravity where an object instantaneously induces a force on another mass, then stops exerting force. Following that logic, it could be modified to a naive vector formulation like this:
% v1 & v2 are vectors that point from the two peak points to the point [ii,jj]
theMatrix(ii,jj) = norm( (r1 / norm( v1 )) * v1 / norm( v1 ) ...
+ (r2 / norm( v2 )) * v2 / norm( v2 ) ...
);
The most extreme type of corner case you'll run into is one where v1 & v2 point in the same direction as in the following row:
[ . . A X1 X2 . . ]
... where you want a value for A with respect to X1 & X2. Using the above expression, it'll boil down to A = X1 / norm(v1) + X2 / norm(v2), which will definitely exceed the peak value at X1 because norm(v1) == 1. You could certainly do some dirty stuff to Band-Aid it, but personally I'd start looking for a different function.
Along those lines, if you used Newton's Law of Universal Gravitation with a few modifications:
You wouldn't need an analogue for G, so you could just assume G == 1
Treat each of the points in the matrix as having mass m2 == 1, so the equation reduces to: F_12 == -1 * (m1 / r^2) * RHAT_12
Sum the "force" vectors and calculate the norm to get each value
... you'll still run into the same problem. The corner case I laid out above would boil down to A = X1/norm(v1)^2 + X2/norm(v2)^2 == X1 + X2/4. Since it's inversely proportional to the square of the distances, it'd be easier to Band-Aid than the linear one, but I wouldn't recommend it.
Similarly, if you use polynomials it won't scale well; you can design one that won't ever exceed your chosen peaks, but there wouldn't be a lower bound.
You could use the logistic function to help with this:
1 / (1 + e^(-c*x))
Here's an example of using the logistic function on a degree 4 polynomial with peaks at points 2 & 4; you'll note I gave the polynomial a scaling factor to pull the polynomial down to relatively small values so calculated values aren't so close together.
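For instance, a minimal sketch of the squashing step itself (c is a made-up steepness):
c = 1;                                % steepness (made-up)
logistic = @(x) 1 ./ (1 + exp(-c*x));
x = linspace(-6, 6, 200);
plot(x, logistic(x))                  % smooth, bounded in (0,1)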
I ended up creating code that does what I want based on a dimension, which I provide. Here's the code:
dims = 100;
A = zeros(dims);
b = floor(1 + dims*rand(1));
c = floor(1 + dims*rand(1));
d = rand(1);
x1 = c;
y1 = b;
A(x1,y1) = d;
for i = 1:dims
    for j = i
        k = 1 - j;
        while k <= j
            if x1-j > 0 && y1+k > 0 && y1+k <= dims
                if A(x1-j,y1+k) == 0
                    A(x1-j,y1+k) = eqn(d,x1-j,y1+k,x1,y1);
                end
            end
            k = k + 1;
        end
    end
    for k = i
        j = 1 - k;
        while j <= k
            if x1+j > 0 && y1+k > 0 && y1+k <= dims && x1+j <= dims
                if A(x1+j,y1+k) == 0
                    A(x1+j,y1+k) = eqn(d,x1+j,y1+k,x1,y1);
                end
            end
            j = j + 1;
        end
    end
    for j = i
        k = 1 - j;
        while k <= j
            if x1+j > 0 && y1-k > 0 && x1+j <= dims && y1-k <= dims
                if A(x1+j,y1-k) == 0
                    A(x1+j,y1-k) = eqn(d,x1+j,y1-k,x1,y1);
                end
            end
            k = k + 1;
        end
    end
    for k = i
        j = 1 - k;
        while j <= k
            if x1-j > 0 && y1-k > 0 && x1-j <= dims && y1-k <= dims
                if A(x1-j,y1-k) == 0
                    A(x1-j,y1-k) = eqn(d,x1-j,y1-k,x1,y1);
                end
            end
            j = j + 1;
        end
    end
end
colormap('hot');
imagesc(A);
colorbar;
If you notice, the code calls a function (I called it eqn), which provides the information for how to change the values in each cell. The function I settled on is d/distance (distance being computed using the standard distance formula).
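A matching sketch of that eqn helper, written from the description above:
function val = eqn(d, x, y, x1, y1)
    % peak height d divided by the Euclidean distance from the peak (x1, y1)
    val = d / sqrt((x - x1)^2 + (y - y1)^2);
end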
It seems to work pretty well. I'm now just trying to develop a good way to have multiple peaks in the same square without one peak completely overwriting the other.