"Spotting" probability density functions of distributions programmatically (Symbolic Toolbox) - matlab
I have a joint probability density f(x,y,z) and I wish to find the conditional distribution X|Y=y,Z=z, which is equivalent to treating x as data and y and z as parameters (constants).
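(Equivalently: viewed as a function of x alone, f(x | y, z) = f(x, y, z) / f_YZ(y, z), and the denominator does not involve x, so the conditional is proportional to the joint density with y and z held fixed. This is why identifying the conditional only needs to work up to proportionality.)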
For example, if X|Y=y,Z=z is distributed as N(1-2y, 3z^2+2), the function would be:
syms x y z
f(y,z) = 1/sqrt(2*pi*(3*z^2+2)) * exp(-1/(2*(3*z^2+2)) * (x-(1-2*y))^2);
I would like to compare it to the following:
syms mu s L a b
Normal(mu,s) = (1/sqrt(2*pi*s^2)) * exp(-1/(2*s^2) * (x-mu)^2);
Exponential(L) = L * exp(-L*x);
Gamma(a,b) = (b^a / gamma(a)) * x^(a-1)*exp(-b*x);
Beta(a,b) = (1/beta(a,b)) * x^(a-1)*(1-x)^(b-1);
Question
How do I write a function whichDistribution that prints which of these four distributions f is equivalent to (up to proportionality) with respect to the variable x, and what the parameters are? E.g. for f and x as above, the distribution is Normal with mu = 1-2*y and s^2 = 3*z^2+2.
NB: the answer will not always be unique, since some distributions coincide (e.g. Gamma(1,L) == Exponential(L)).
Desired outputs
syms x y z
f = 1/sqrt(2*pi*(3*z^2+2)) * exp(-1/(2*(3*z^2+2)) * (x-(1-2*y))^2)
whichDistribution(f,x) %Conditional X|Y,Z
% Normal(1-2*y,3*z^2+2)
syms x y
f = y^(1/2)*exp(-(x^2)/2 - y/2 * (1+(4-x)^2+(6-x)^2)) % this is not a pdf because it is missing a constant of proportionality, but it should still work
whichDistribution(f,x) %Conditional X|Y
% Normal(10*y/(2*y+1), 1/(2*y+1))
whichDistribution(f,y) %Conditional Y|X
% Gamma(3/2, x^2 - 10*x + 53/2)
f = exp(-x) %also missing a constant of proportionality
whichDistribution(f,x)
% Exponential(1)
f = 1/(2*pi)*exp(-(x^2)/2 - (y^2)/2)
whichDistribution(f,x)
% Normal(0,1)
whichDistribution(f,y)
% Normal(0,1)
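(For reference, here is how the second example's parameters arise; whichDistribution should in effect automate this kernel-matching argument. As a function of x, the exponent of f is -(x^2)/2 - y/2*(1 + (4-x)^2 + (6-x)^2) = -(1+2*y)/2 * x^2 + 10*y*x + const = -(1+2*y)/2 * (x - 10*y/(1+2*y))^2 + const, which is the kernel of Normal(10*y/(2*y+1), 1/(2*y+1)). As a function of y, f is proportional to y^(3/2 - 1) * exp(-(x^2 - 10*x + 53/2)*y), which is the kernel of Gamma(3/2, x^2 - 10*x + 53/2).)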
What I have tried so far:
Using solve():
q = solve(f(y,z) == Normal(mu,s), mu, s)
Which gives wrong results, since parameters can't depend on x:
>> q.mu
ans =
(z1^2*(log((2^(1/2)*exp(x^2/(2*z1^2) - (x + 2*y - 1)^2/(6*z^2 + 4)))/(2*pi^(1/2)*(3*z^2 + 2)^(1/2))) + pi*k*2i))/x
>> q.s
ans =
z1
Attempting to simplify f(y,z) up to proportionality (in the variable x) using a propto() function that I wrote:
>> propto(f(y,z),x)
ans =
exp(-(x*(x + 4*y - 2))/(2*(3*z^2 + 2)))
>> propto(Normal(mu,s),x)
ans =
exp((x*(2*mu - x))/(2*s^2))
This is almost on the money, since it is easy to spot that s^2=3*z^2 + 2 and 2*mu=-(4*y - 2), but I don't know how to deduce this programmatically.
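For what it is worth, in the Normal case the parameters can also be deduced mechanically by matching coefficients of the quadratic in x: the log-kernel of N(mu, s^2) is -x^2/(2*s^2) + (mu/s^2)*x + const, so if c2 and c1 are the coefficients of x^2 and x, then s^2 = -1/(2*c2) and mu = -c1/(2*c2). A rough sketch (it assumes the simplified log-kernel really is quadratic in x):
syms x y z
K = exp(-(x*(x + 4*y - 2))/(2*(3*z^2 + 2)));   % the propto(f(y,z),x) output above
logK = feval(symengine,'simplify',log(K),'IgnoreAnalyticConstraints');
c2 = diff(logK, x, 2)/2;          % coefficient of x^2
c1 = subs(diff(logK, x), x, 0);   % coefficient of x
s2 = simplify(-1/(2*c2))          % 3*z^2 + 2
mu = simplify(-c1/(2*c2))         % 1 - 2*y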
In case it is useful: propto(f,x) attempts to simplify f by dividing it by those children of f which do not involve x, and then outputs whichever form has the fewest children. Here is the routine:
function out = propto(f,x)
oldf = f;
newf = propto2(f,x);
while (~strcmp(char(oldf),char(newf))) % if the form of f changed, do propto2 again. When propto2(f) == f, stop
oldf = newf;
newf = propto2(oldf,x);
end
out = newf;
end
function out = propto2(f,x)
t1 = children(expand(f)); % expanded f
i1 = ~has([t1{:}],x);
out1 = simplify(f/prod([t1{i1}])); % divides expanded f by terms that do not involve x
t2 = children(f); % unexpanded f
i2 = ~has([t2{:}],x);
out2 = simplify(f/prod([t2{i2}])); % divides f by terms that do not involve x
A = [f, symlength(f); out1, symlength(out1); out2, symlength(out2)];
A = sortrows(A,2); % output whichever form has the fewest children
out = A(1,1);
end
function L = symlength(f)
% counts the number of children of f by repeatedly applying children() to the result
t = children(f);
t = [t{:}];
L = length(t);
if (L == 1)
return
end
oldt = f;
while(~strcmp(char(oldt),char(t)))
oldt = t;
t = children(t);
t = [t{:}];
t = [t{:}];
end
L = length(t);
end
edit: added desired outputs
edit2: clarified the desired function
I have managed to solve my own problem using solve() from the Symbolic Toolbox. There were two issues with my original approach: I needed to set up n simultaneous equations for n parameters, and solve() doesn't cope well with exponentials:
solve(f(3) == g(3), f(4) == g(4), mu,s)
yields no solutions, but
logf(x) = feval(symengine,'simplify',log(f),'IgnoreAnalyticConstraints');
logg(x) = feval(symengine,'simplify',log(g),'IgnoreAnalyticConstraints');
solve(logf(3) == logg(3), logf(4) == logg(4), mu,s)
yields good solutions.
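Concretely, for the first example this looks as follows (a minimal sketch using the propto() helper above; the hard-coded points 3 and 4 play the role of r1, r2, and the result may need a final simplify()):
syms x y z mu s
f = 1/sqrt(2*pi*(3*z^2+2)) * exp(-1/(2*(3*z^2+2)) * (x-(1-2*y))^2);
g = 1/sqrt(2*pi*s) * exp(-1/(2*s) * (x-mu)^2);   % Normal parameterized by its variance s
logf(x) = feval(symengine,'simplify',log(propto(f,x)),'IgnoreAnalyticConstraints');
logg(x) = feval(symengine,'simplify',log(propto(g,x)),'IgnoreAnalyticConstraints');
q = solve(logf(3) == logg(3), logf(4) == logg(4), mu, s);
simplify(q.mu)   % expected: 1 - 2*y
simplify(q.s)    % expected: 3*z^2 + 2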
Solution
Given f(x), for each PDF g(x) we attempt to solve simultaneously
log(f(r1)) == log(g(r1)) and log(f(r2)) == log(g(r2))
for two distinct small integers r1, r2. Then output the g for which the solution has the lowest complexity.
The code is:
function whichDist(f,x)
syms mu s L a b x0 x1 x2 v n p g
f = propto(f,x); % simplify up to proportionality
logf(x) = feval(symengine,'simplify',log(f),'IgnoreAnalyticConstraints');
Normal(mu,s,x) = propto((1/sqrt(2*pi*s)) * exp(-1/(2*s) * (x-mu)^2),x);
Exponential(L,x) = exp(-L*x);
Gamma(a,b,x) = x^(a-1)*exp(-b*x);
Beta(a,b,x) = x^(a-1)*(1-x)^(b-1);
ChiSq(v,x) = x^(v/2 - 1) * exp(-x/2);
tdist(v,x) = (1+x^2 / v)^(-(v+1)/2);
Cauchy(g,x0,x) = 1/(1+((x-x0)/g)^2);
logf = logf(x);
best_sol = {'none', inf};
r1 = randi(10); r2 = randi(10); r3 = randi(10);
while (r1 == r2 || r2 == r3 || r1 == r3) r1 = randi(10); r2 = randi(10); r3 = randi(10); end
%% check Exponential:
if (propto(logf,x) == x) % pdf ~ exp(K*x), can read off Lambda directly
soln = -logf/x;
if (~has(soln,x)) % any solution can't depend on x
fprintf('\nExponential: rate L = %s\n\n', soln);
return
end
end
%% check Chi-sq:
if (propto(logf + x/2, log(x)) == log(x)) % can read off v directly
soln = 2*(1+(logf + x/2) / log(x));
if (~has(soln,x))
dof = feval(symengine,'simplify',soln,'IgnoreAnalyticConstraints');
fprintf('\nChi-Squared: v = %s\n\n', dof);
return
end
end
%% check t-dist:
h1 = propto(logf,x);
h = simplify(exp(h1) - 1);
if (propto(h,x^2) == x^2) % h ~ x^2/v, so v can be read off directly
soln = simplify(x^2 / h);
if (~has(soln,x))
fprintf('\nt-dist: v = %s\n\n', soln);
return
end
end
h = simplify(exp(-h1) - 1); % try again if propto flipped a sign
if (propto(h,x^2) == x^2) % h ~ x^2/v, so v can be read off directly
soln = simplify(x^2 / h);
if (~has(soln,x))
fprintf('\nt-dist: v = %s\n\n', soln);
return
end
end
%% check Normal:
logn(x) = feval(symengine,'simplify',log(Normal(mu,s,x)),'IgnoreAnalyticConstraints');
% A = (x - propto(logf/x, x))/2;
% B = simplify(-x/(logf/x - mu/s)/2);
% if (~has(A,x) && ~has(B,x))
% fprintf('Normal: mu = %s, s^2 = %s', A, B);
% return
% end
logf(x) = logf;
try % attempt to solve the equation
% solve simultaneously for two random non-equal integer values r1,r2
qn = solve(logf(r1) == logn(r1), logf(r2) == logn(r2), mu, s);
catch error
end
if (exist('qn','var')) % if solve() managed to run
if (~isempty(qn.mu) && ~isempty(qn.s) && ~any(has([qn.mu,qn.s],x))) % if solution exists
complexity = symlength(qn.mu) + symlength(qn.s);
if complexity < best_sol{2} % store best solution so far
best_sol{1} = sprintf('Normal: mu = %s, s^2 = %s', qn.mu, qn.s);
best_sol{2} = complexity;
end
end
end
%% check Cauchy:
logcau(x) = feval(symengine,'simplify',log(Cauchy(g,x0,x)),'IgnoreAnalyticConstraints');
f(x) = f;
try
qcau = solve(f(r1) == Cauchy(g,x0,r1), f(r2) == Cauchy(g,x0,r2), g, x0);
catch error
end
if (exist('qcau','var'))
if (~isempty(qcau.g) && ~isempty(qcau.x0) && ~any(has([qcau.g(1),qcau.x0(1)],x)))
complexity = symlength(qcau.g(1)) + symlength(qcau.x0(1));
if complexity < best_sol{2}
best_sol{1} = sprintf('Cauchy: g = %s, x0 = %s', qcau.g(1), qcau.x0(1));
best_sol{2} = complexity;
end
end
end
f = f(x);
%% check Gamma:
logg(x) = feval(symengine,'simplify',log(Gamma(a,b,x)),'IgnoreAnalyticConstraints');
t = children(logf); t = [t{:}];
if (length(t) == 2)
if (propto(t(1),log(x)) == log(x) && propto(t(2),x) == x)
soln = [t(1)/log(x) + 1, -t(2)/x];
if (~any(has(soln,x)))
fprintf('\nGamma: shape a = %s, rate b = %s\n\n',soln);
return
end
elseif (propto(t(2),log(x)) == log(x) && propto(t(1),x) == x)
soln = [t(2)/log(x) + 1, -t(1)/x];
if (~any(has(soln,x)))
fprintf('\nGamma: shape a = %s, rate b = %s\n\n',soln);
return
end
end
end
logf(x) = logf;
try % also try using solve(), just in case.
qg = solve(logf(r1) == logg(r1), logf(r2) == logg(r2), a, b);
catch error
end
if (exist('qg','var'))
if (~isempty(qg.a) && ~isempty(qg.b) && ~any(has([qg.a,qg.b],x)))
complexity = symlength(qg.a) + symlength(qg.b);
if complexity < best_sol{2}
best_sol{1} = sprintf('Gamma: shape a = %s, rate b = %s', qg.a, qg.b);
best_sol{2} = complexity;
end
end
end
logf = logf(x);
%% check Beta:
B = feval(symengine,'simplify',log(propto(f,x-1)),'IgnoreAnalyticConstraints');
if (propto(B,log(x-1)) == log(x-1))
B = B / log(x-1) + 1;
A = f / (x-1)^(B-1);
A = feval(symengine,'simplify',log(abs(A)),'IgnoreAnalyticConstraints');
if (propto(A,log(abs(x))) == log(abs(x)))
A = A / log(abs(x)) + 1;
if (~any(has([A,B],x)))
fprintf('\nBeta: a = %s, b = %s\n\n', A, B);
return
end
end
elseif (propto(B,log(1-x)) == log(1-x))
B = B / log(1-x);
A = simplify(f / (1-x)^(B-1));
A = feval(symengine,'simplify',log(A),'IgnoreAnalyticConstraints');
if (propto(A,log(x)) == log(x))
A = A / log(x) + 1;
if (~any(has([A,B],x)))
fprintf('\nBeta: a = %s, b = %s\n\n', A, B);
return
end
end
end
%% Print solution with lowest complexity
fprintf('\n%s\n\n', best_sol{1});
end
Tests:
>> syms x y z
>> f = y^(1/2)*exp(-(x^2)/2 - y/2 * (1+(4-x)^2+(6-x)^2))
>> whichDist(f,x)
Normal: mu = (10*y)/(2*y + 1), s^2 = 1/(2*y + 1)
>> whichDist(f,y)
Gamma: a = 3/2, b = x^2 - 10*x + 53/2
>> Beta(a,b,x) = propto((1/beta(a,b)) * x^(a-1)*(1-x)^(b-1), x);
>> f = Beta(1/z + 7*y/(1-sqrt(z)), z/y + 1/(1-z), x)
>> whichDist(f,x)
Beta: a = -(7*y*z - z^(1/2) + 1)/(z*(z^(1/2) - 1)), b = -(y + z - z^2)/(y*(z - 1))
All correct.
It sometimes gives bogus answers when the parameters are numeric:
whichDist(Beta(3,4,x),x)
Beta: a = -(pi*log(2)*1i + pi*log(3/10)*1i - log(2)*log(3/10) + log(2)*log(7/10) - log(3/10)*log(32) + log(2)*log(1323/100000))/(log(2)*(log(3/10) - log(7/10))), b = (pi*log(2)*1i + pi*log(7/10)*1i + log(2)*log(3/10) - log(2)*log(7/10) - log(7/10)*log(32) + log(2)*log(1323/100000))/(log(2)*(log(3/10) - log(7/10)))
So there is room for improvement, and I will still award the bounty to a better solution than this one.
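One cheap guard that might help with such cases (a sketch only; checkCandidate is a hypothetical helper, not part of the code above): before accepting a candidate, substitute the recovered parameters back into the reference kernel and check that the ratio to f no longer depends on x, i.e. the match really holds up to proportionality. Running simplify() on the recovered parameters before printing might also clean up numeric cases like the one above.
function ok = checkCandidate(f, kernel, params, vals, x)
% f      : the (possibly unnormalized) pdf being identified
% kernel : a reference kernel expression, e.g. Gamma(a,b,x) evaluated at symbolic a, b, x
% params : sym vector of its parameters, e.g. [a b]
% vals   : candidate parameter values (from solve() or from direct spotting)
% x      : the variable of interest
cand  = subs(kernel, params, vals);                            % candidate kernel with parameters plugged in
ratio = simplify(f / cand, 'IgnoreAnalyticConstraints', true); % should reduce to an expression free of x
ok    = ~has(ratio, x);
end
% example (inside whichDist, after the Gamma solve step): checkCandidate(f, Gamma(a,b,x), [a b], [qg.a qg.b], x)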
Edit: Added more distributions. Improved Gamma and Beta distribution identifications by spotting them directly without needing solve().