I have a joint probability density f(x,y,z) and I wish to find the conditional distribution X|Y=y,Z=z, which is equivalent to treating x as data and y and z as parameters (constants).
For example, if I have X|Y=y,Z=z being the pdf of a N(1-2y,3z^2+2), the function would be:
syms x y z
f(y,z) = 1/sqrt(2*pi*(3*z^2+2)) * exp(-1/(2*(3*z^2+2)) * (x-(1-2*y))^2);
I would like to compare it to the following:
syms mu s L a b
Normal(mu,s) = (1/sqrt(2*pi*s^2)) * exp(-1/(2*s^2) * (x-mu)^2);
Exponential(L) = L * exp(-L*x);
Gamma(a,b) = (b^a / gamma(a)) * x^(a-1)*exp(-b*x);
Beta(a,b) = (1/beta(a,b)) * x^(a-1)*(1-x)^(b-1);
Question
How do I make a program whichDistribution that would be able to print which of these four, f is equivalent to (up to proportionality) with respect to the variable x, and what are the parameters? E.g. f and x as above, the distribution is Normal, mu=1-2*y, s=3*z^2+2.
NB: there would not always be a unique solution, since some distributions are are equivalent (e.g. Gamma(1,L)==Exponential(L))
Desired outputs
syms x y z
f = 1/sqrt(2*pi*(3*z^2+2)) * exp(-1/(2*(3*z^2+2)) * (x-(1-2*y))^2)
whichDistribution(f,x) %Conditional X|Y,Z
% Normal(1-2*y,3*z^2+2)
syms x y
f = y^(1/2)*exp(-(x^2)/2 - y/2 * (1+(4-x)^2+(6-x)^2)) % this is not a pdf because it is missing a constant of proportionality, but it should still work
whichDistribution(f,x) %Conditional X|Y
% Normal(10*y/(2*y+1), 1/(2*y+1))
whichDistribution(f,y) %Conditional Y|X
% Gamma(3/2, x^2 - 10*x + 53/2)
f = exp(-x) %also missing a constant of proportionality
whichDistribution(f,x)
% Exponential(1)
f = 1/(2*pi)*exp(-(x^2)/2 - (y^2)/2)
whichDistribution(f,x)
% Normal(0,1)
whichDistribution(f,y)
% Normal(0,1)
What I have tried so far:
Using solve():
q = solve(f(y,z) == Normal(mu,s), mu, s)
Which gives wrong results, since parameters can't depend on x:
>> q.mu
ans =
(z1^2*(log((2^(1/2)*exp(x^2/(2*z1^2) - (x + 2*y - 1)^2/(6*z^2 + 4)))/(2*pi^(1/2)*(3*z^2 + 2)^(1/2))) + pi*k*2i))/x
>> q.s
ans =
z1
Attempting to simplify f(y,z) up to proportionality (in x variable) using a propto() function that I wrote:
>> propto(f(y,z),x)
ans =
exp(-(x*(x + 4*y - 2))/(2*(3*z^2 + 2)))
>> propto(Normal(mu,s),x)
ans =
exp((x*(2*mu - x))/(2*s^2))
This is almost on the money, since it is easy to spot that s^2=3*z^2 + 2 and 2*mu=-(4*y - 2), but I don't know how to deduce this programmatically.
In case it is useful: propto(f,x) attempts to simplify f by dividing f by children of f which don't involve x, and then output whichever form has the least number of children. Here is the routine:
function out = propto(f,x)
oldf = f;
newf = propto2(f,x);
while (~strcmp(char(oldf),char(newf))) % if the form of f changed, do propto2 again. When propto2(f) == f, stop
oldf = newf;
newf = propto2(oldf,x);
end
out = newf;
end
function out = propto2(f,x)
t1 = children(expand(f)); % expanded f
i1 = ~has([t1{:}],x);
out1 = simplify(f/prod([t1{i1}])); % divides expanded f by terms that do not involve x
t2 = children(f); % unexpanded f
i2 = ~has([t2{:}],x);
out2 = simplify(f/prod([t2{i2}])); % divides f by terms that do not involve x
A = [f, symlength(f); out1, symlength(out1); out2, symlength(out2)];
A = sortrows(A,2); % outputs whichever form has the fewest number of children
out = A(1,1);
end
function L = symlength(f)
% counts the number of children of f by repeatingly applying children() to itself
t = children(f);
t = [t{:}];
L = length(t);
if (L == 1)
return
end
oldt = f;
while(~strcmp(char(oldt),char(t)))
oldt = t;
t = children(t);
t = [t{:}];
t = [t{:}];
end
L = length(t);
end
edit: added desired outputs
edit2: clarified the desired function
I have managed to solve my own problem using solve() from Symbolic Toolbox. There were two issues with my original approach: I needed to set up n simultaneous equations for n parameters, and the solve() doesn't cope well with exponentials:
solve(f(3) == g(3), f(4) == g(4), mu,s)
yields no solutions, but
logf(x) = feval(symengine,'simplify',log(f),'IgnoreAnalyticConstraints');
logg(x) = feval(symengine,'simplify',log(g),'IgnoreAnalyticConstraints');
solve(logf(3) == logg(3), logf(4) == logg(4), mu,s)
yields good solutions.
Solution
Given f(x), for each PDF g(x) we attempt to solve simultaneously
log(f(r1)) == log(g(r1)) and log(f(r2)) == log(g(r2))
for some simple non-equal numbers r1, r2. Then output g for which the solution has the lowest complexity.
The code is:
function whichDist(f,x)
syms mu s L a b x0 x1 x2 v n p g
f = propto(f,x); % simplify up to proportionality
logf(x) = feval(symengine,'simplify',log(f),'IgnoreAnalyticConstraints');
Normal(mu,s,x) = propto((1/sqrt(2*pi*s)) * exp(-1/(2*s) * (x-mu)^2),x);
Exponential(L,x) = exp(-L*x);
Gamma(a,b,x) = x^(a-1)*exp(-b*x);
Beta(a,b,x) = x^(a-1)*(1-x)^(b-1);
ChiSq(v,x) = x^(v/2 - 1) * exp(-x/2);
tdist(v,x) = (1+x^2 / v)^(-(v+1)/2);
Cauchy(g,x0,x) = 1/(1+((x-x0)/g)^2);
logf = logf(x);
best_sol = {'none', inf};
r1 = randi(10); r2 = randi(10); r3 = randi(10);
while (r1 == r2 || r2 == r3 || r1 == r3) r1 = randi(10); r2 = randi(10); r3 = randi(10); end
%% check Exponential:
if (propto(logf,x) == x) % pdf ~ exp(K*x), can read off Lambda directly
soln = -logf/x;
if (~has(soln,x)) % any solution can't depend on x
fprintf('\nExponential: rate L = %s\n\n', soln);
return
end
end
%% check Chi-sq:
if (propto(logf + x/2, log(x)) == log(x)) % can read off v directly
soln = 2*(1+(logf + x/2) / log(x));
if (~has(soln,x))
dof = feval(symengine,'simplify',soln,'IgnoreAnalyticConstraints');
fprintf('\nChi-Squared: v = %s\n\n', dof);
return
end
end
%% check t-dist:
h1 = propto(logf,x);
h = simplify(exp(h1) - 1);
if (propto(h,x^2) == x^2) % pdf ~ exp(K*x), can read off Lambda directly
soln = simplify(x^2 / h);
if (~has(soln,x))
fprintf('\nt-dist: v = %s\n\n', soln);
return
end
end
h = simplify(exp(-h1) - 1); % try again if propto flipped a sign
if (propto(h,x^2) == x^2) % pdf ~ exp(K*x), can read off Lambda directly
soln = simplify(x^2 / h);
if (~has(soln,x))
fprintf('\nt-dist: v = %s\n\n', soln);
return
end
end
%% check Normal:
logn(x) = feval(symengine,'simplify',log(Normal(mu,s,x)),'IgnoreAnalyticConstraints');
% A = (x - propto(logf/x, x))/2;
% B = simplify(-x/(logf/x - mu/s)/2);
% if (~has(A,x) && ~has(B,x))
% fprintf('Normal: mu = %s, s^2 = %s', A, B);
% return
% end
logf(x) = logf;
try % attempt to solve the equation
% solve simultaneously for two random non-equal integer values r1,r2
qn = solve(logf(r1) == logn(r1), logf(r2) == logn(r2), mu, s);
catch error
end
if (exist('qn','var')) % if solve() managed to run
if (~isempty(qn.mu) && ~isempty(qn.s) && ~any(has([qn.mu,qn.s],x))) % if solution exists
complexity = symlength(qn.mu) + symlength(qn.s);
if complexity < best_sol{2} % store best solution so far
best_sol{1} = sprintf('Normal: mu = %s, s^2 = %s', qn.mu, qn.s);
best_sol{2} = complexity;
end
end
end
%% check Cauchy:
logcau(x) = feval(symengine,'simplify',log(Cauchy(g,x0,x)),'IgnoreAnalyticConstraints');
f(x) = f;
try
qcau = solve(f(r1) == Cauchy(g,x0,r1), f(r2) == Cauchy(g,x0,r2), g, x0);
catch error
end
if (exist('qcau','var'))
if (~isempty(qcau.g) && ~isempty(qcau.x0) && ~any(has([qcau.g(1),qcau.x0(1)],x)))
complexity = symlength(qcau.g(1)) + symlength(qcau.x0(1));
if complexity < best_sol{2}
best_sol{1} = sprintf('Cauchy: g = %s, x0 = %s', qcau.g(1), qcau.x0(1));
best_sol{2} = complexity;
end
end
end
f = f(x);
%% check Gamma:
logg(x) = feval(symengine,'simplify',log(Gamma(a,b,x)),'IgnoreAnalyticConstraints');
t = children(logf); t = [t{:}];
if (length(t) == 2)
if (propto(t(1),log(x)) == log(x) && propto(t(2),x) == x)
soln = [t(1)/log(x) + 1, -t(2)/x];
if (~any(has(soln,x)))
fprintf('\nGamma: shape a = %s, rate b = %s\n\n',soln);
return
end
elseif (propto(t(2),log(x)) == log(x) && propto(t(1),x) == x)
soln = [t(2)/log(x) + 1, -t(1)/x];
if (~any(has(soln,x)))
fprintf('\nGamma: shape a = %s, rate b = %s\n\n',soln);
return
end
end
end
logf(x) = logf;
try % also try using solve(), just in case.
qg = solve(logf(r1) == logg(r1), logf(r2) == logg(r2), a, b);
catch error
end
if (exist('qg','var'))
if (~isempty(qg.a) && ~isempty(qg.b) && ~any(has([qg.a,qg.b],x)))
complexity = symlength(qg.a) + symlength(qg.b);
if complexity < best_sol{2}
best_sol{1} = sprintf('Gamma: shape a = %s, rate b = %s', qg.a, qg.b);
best_sol{2} = complexity;
end
end
end
logf = logf(x);
%% check Beta:
B = feval(symengine,'simplify',log(propto(f,x-1)),'IgnoreAnalyticConstraints');
if (propto(B,log(x-1)) == log(x-1))
B = B / log(x-1) + 1;
A = f / (x-1)^(B-1);
A = feval(symengine,'simplify',log(abs(A)),'IgnoreAnalyticConstraints');
if (propto(A,log(abs(x))) == log(abs(x)))
A = A / log(abs(x)) + 1;
if (~any(has([A,B],x)))
fprintf('\nBeta1: a = %s, b = %s\n\n', A, B);
return
end
end
elseif (propto(B,log(1-x)) == log(1-x))
B = B / log(1-x);
A = simplify(f / (1-x)^(B-1));
A = feval(symengine,'simplify',log(A),'IgnoreAnalyticConstraints');
if (propto(A,log(x)) == log(x))
A = A / log(x) + 1;
if (~any(has([A,B],x)))
fprintf('\nBeta1: a = %s, b = %s\n\n', A, B);
return
end
end
end
%% Print solution with lowest complexity
fprintf('\n%s\n\n', best_sol{1});
end
Tests:
>> syms x y z
>> f = y^(1/2)*exp(-(x^2)/2 - y/2 * (1+(4-x)^2+(6-x)^2))
>> whichDist(f,x)
Normal: mu = (10*y)/(2*y + 1), s^2 = 1/(2*y + 1)
>> whichDist(f,y)
Gamma: a = 3/2, b = x^2 - 10*x + 53/2
>> Beta(a,b,x) = propto((1/beta(a,b)) * x^(a-1)*(1-x)^(b-1), x);
>> f = Beta(1/z + 7*y/(1-sqrt(z)), z/y + 1/(1-z), x)
Beta: a = -(7*y*z - z^(1/2) + 1)/(z*(z^(1/2) - 1)), b = -(y + z - z^2)/(y*(z - 1))
All correct.
Sometimes bogus answers if the parameters are numeric:
whichDist(Beta(3,4,x),x)
Beta: a = -(pi*log(2)*1i + pi*log(3/10)*1i - log(2)*log(3/10) + log(2)*log(7/10) - log(3/10)*log(32) + log(2)*log(1323/100000))/(log(2)*(log(3/10) - log(7/10))), b = (pi*log(2)*1i + pi*log(7/10)*1i + log(2)*log(3/10) - log(2)*log(7/10) - log(7/10)*log(32) + log(2)*log(1323/100000))/(log(2)*(log(3/10) - log(7/10)))
So there is room for improvement and I will still award bounty to a better solution than this.
Edit: Added more distributions. Improved Gamma and Beta distribution identifications by spotting them directly without needing solve().
For the purpose of generalization, I hope Matlab can automatically compute the 1st & 2nd derivatives of the associated function f(x). (in case I change f(x) = sin(6x) to f(x) = sin(8x))
I know there exists built-in commands called diff() and syms, but I cannot figure out how to deal with them with the index i in the for-loop. This is the key problem I am struggling with.
How do I make changes to the following set of codes? I am using MATLAB R2019b.
n = 10;
h = (2.0 * pi) / (n - 1);
for i = 1 : n
x(i) = 0.0 + (i - 1) * h;
f(i) = sin(6 * x(i));
dfe(i) = 6 * cos(6 * x(i)); % first derivative
ddfe(i) = -36 * sin(6 * x(i)); % second derivative
end
You can simply use subs and double to do that. For your case:
% x is given here
n = 10;
h = (2.0 * pi) / (n - 1);
syms 'y';
g = sin(6 * y);
for i = 1 : n
x(i) = 0.0 + (i - 1) * h;
f(i) = double(subs(g,y,x(i)));
dfe(i) = double(subs(diff(g),y,x(i))); % first derivative
ddfe(i) = double(subs(diff(g,2),y,x(i))); % second derivative
end
By #Daivd comment, you can vectorize the loop as well:
% x is given here
n = 10;
h = (2.0 * pi) / (n - 1);
syms 'y';
g = sin(6 * y);
x = 0.0 + ((1:n) - 1) * h;
f = double(subs(g,y,x));
dfe = double(subs(diff(g),y,x)); % first derivative
ddfe = double(subs(diff(g,2),y,x)); % second derivative
I am doing two updates for h and x given in this [paper] http://paris.cs.illinois.edu/pubs/nasser-icassp2015.pdf (You don't have to read the paper, just look for the equations 4,10 and 14 for updating h and x given on page 2 and 3).
This is the code snippet that I have tried so far. Can you tell me if its correct? Also, is there any way to optimize these for loops?
In some cases (t-tau) term was negative and MATLAB was giving an error. So, I put a condition that only implement if (t-tau)>0. Doing this is correct or is there any other way to take care of the negative indices?
%updates
Lh=10;
lambda = 0.1*(sum(reverberatedspeechspec(:))/(size(Y,1)*size(Y,2)));
S=reverberatedspeechspec;
W=basis_mel_act; W=gather(W); W = double(W); %W is the dictionary
%---initialization for H(RIR)----
H=rand(size(Y,1),Lh);
nmfIter = 50;
%---initialization for X(Activations)-----
W_trans=W';
X=W_trans*S; X=double(X);
Y = zeros(size(S,1));
for idx = 1 : size(Y,1)
Y(idx,:) = filter(S(idx,:),1,H(idx,:));
end
Stilde = zeros(size(S));
Ytilde = zeros(size(S));
for iter=1:nmfIter
% update for H
Stilde = W*X;
Ytilde = zeros(size(S));
for j=1:size(Stilde,1)
Ytilde (j,:) = filter(Stilde(j,:),1,H(j,:));
end
ratio = Y./Ytilde;
numerator = zeros(size(H));
denominator = numerator;
for k = 1 :size(Y,1)
for tau = 1:Lh
for t= 1:size(Y,2)
if gt (t-tau , 0)
numerator (k,tau) = numerator(k,tau) + ratio(k,t) * Stilde(k,t-tau);
denominator(k,tau) = denominator (k,tau) + Stilde (k,t-tau);
end
end
end
end
H = H .* numerator ./denominator ;
%updating Ytilde after getting a new value for H
for j=1:size(Stilde,1)
Ytilde (j,:) = filter(Stilde(j,:),1,H(j,:));
end
%update for X
ratio = Y./Ytilde;
ratio = [ratio zeros(size(Y,1),Lh)]; %zero padding Y and Ytilde for (t+tau) term in update of X.
Product = H.' * W; % Product of H_transpose and W in th update which is equivalent to the term ∑H(k,tau)W(k,r)
numerator = zeros(size(H));
denominator = numerator;
for r = 1:size(W,2)
for t = 1:size(Y,2)
for k = 1:size(Y,1)
for tau = 1:Lh
numerator(r,t) = numerator(r,t) + ratio(k,t+tau) * Product(tau,r);
denominator(r,t) = denominator(r,t) + Product(tau,r) + lambda;
end
end
end
end
X = X .* numerator ./denominator;
end