Matlab: fmincon, Portfolio Optimization, difficulties with free variables

I'm trying to replicate a paper ("Is the market portfolio mean-variance efficient after all?" by Levy & Roll) in Matlab. I want to minimize the following function D subject to its constraints:
Objective function:
D = sqrt( alpha * (1/N) * sum_i ((mu_i - mu_i^sam)/sigma_i^sam)^2 + (1-alpha) * (1/N) * sum_i ((sigma_i - sigma_i^sam)/sigma_i^sam)^2 )
Constraints:
diag(sigma) * C * diag(sigma) * x_m = q * (mu - r_z * 1)
I want to solve for mu and sigma; furthermore, alpha, q and r_z are free variables, where:
alpha: 0 ≤ α ≤ 1 is a parameter determining the relative weight assigned to deviations of the means relative to deviations of the standard deviations.
q: q > 0 is the constant of proportionality.
r_z: the zero-beta rate (risk-free interest rate).
I have:
mu^sam: Vector of means of historical returns
sigma^sam: Vector of standard deviations of historical returns
Covariance matrix
weights x_m,i
Now I don't know how to handle the free variables alpha, q and r_z.
My objective function D:
function dist = distOpt(alpha, N, x0, mu_sam, sigma_sam)
sumMu = 0;
sumSigma = 0;
for i = 1:N
    sumMu = sumMu + ((x0(i) - mu_sam(i)) / sigma_sam(i))^2;
    sumSigma = sumSigma + ((x0(N+i) - sigma_sam(i)) / sigma_sam(i))^2;
end
dist = sqrt(alpha * (1/N) * sumMu + (1-alpha) * (1/N) * sumSigma);
end
And constraints:
Mu_sam=repelem(-0.0001,y-1);
Sigma_sam=repelem(-0.0002,y-1);
x0=[Mu_sam Sigma_sam];
function [c, ceq] = confuneq(x0, rz, Cor, Weights)
q = 1;
ceq = diag(x0(7:end)) * Cor * diag(x0(7:end)) * Weights' - q * (x0(1:6)' - rz);
c = [];
end
My use of fmincon:
optFunc = @(x0) distOpt(alpha, (y-1), x0, Mu_sam, Sigma_sam);
options = optimoptions(@fmincon, 'Algorithm', 'sqp', 'MaxFunctionEvaluations', 12000, 'FunValCheck', 'on');
nonlcon = @(x0) confuneq(x0, r_z, rho, Weights);
[x0, fval, exitFlag] = fmincon(optFunc, x0, [], [], [], [], [], [], nonlcon, options);
Can anyone help?

"Now I don't know how to handle the free variables alpha, q and r_z"
I think you have two options:
1) Choose fixed values for them.
2) Include them in your x, so fmincon finds optimal values for them as well (see the sketch below).
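For option 2, a minimal sketch (untested), assuming the layout x = [mu(1:N), sigma(1:N), alpha, q, r_z]; note that q > 0 has to be approximated with a small positive lower bound, since fmincon bounds are inclusive:

N = y - 1;
x0 = [Mu_sam, Sigma_sam, 0.5, 1, 0.02]; % initial guesses for alpha, q, r_z
lb = [-Inf(1, 2*N), 0, 1e-8, -Inf];     % enforces 0 <= alpha and (approximately) q > 0
ub = [ Inf(1, 2*N), 1,  Inf,  Inf];     % enforces alpha <= 1
optFunc = @(x) distOpt(x(2*N+1), N, x(1:2*N), Mu_sam, Sigma_sam);
% confuneq would have to be adapted to read q = x(2*N+2) and r_z = x(2*N+3)
% from x instead of receiving them as fixed inputs:
nonlcon = @(x) confuneq(x, rho, Weights);
[x, fval, exitFlag] = fmincon(optFunc, x0, [], [], [], [], lb, ub, nonlcon, options);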

Related

Fitting Data with Linear Combination of Vectors in MATLAB with Constraints

Working in MATLAB. I have the following equation:
S = aW + bX + cY + dZ
where S, W, X, Y, and Z are all known n x 1 vectors. I am trying to fit the data of S with a linear combination of the basis vectors W, X, Y, and Z, with the constraint that the constants (a,b,c,d) are greater than 0. I have managed to do this in Excel's Solver, and have attempted to figure it out in MATLAB, being directed towards functions like fmincon, but I am not overly familiar with MATLAB and feel I am misunderstanding the use of fmincon.
I am looking for help with understanding fmincon's use towards my problem, or redirection towards a more efficient method for solving.
Currently I have:
initials = [0.2 0.2 0.2 0.2];
fun = @(x) x(1)*W + x(2)*X + x(3)*Y + x(4)*Z;
lb = [0,0,0,0];
soln = fmincon(fun,initials,data,b,Aeq,beq,lb,ub);
I am receiving an error stating "A must have 4 column(s)", where A refers to my variable data, which corresponds to S in the above equation. I do not understand why it is expecting 4 columns. Also, note that the variables in my snippet above that are not explicitly defined are defined as [], serving as placeholders.
Using fmincon is huge overkill in this case. It's like using a big heavy microscope to crack nuts... or a Martian rover to tow a pushcart or... anyway :) Maybe it's OK if you don't have to process large sets of vectors, but if you need to fit hundreds of thousands of such vectors it can take hours. This classic solution will be faster by several orders of magnitude.
% first make an n x 4 matrix of your vectors
P = [W,X,Y,Z];
% now your equation looks like S = P*m, where m is a 4 x 1 vector
% containing your coefficients ( m = [a;b;c;d] )
% so the solution is simply
m_1 = inv(P'*P)*P'*S;
% or you can use this form
m_2 = (P'*P)\P'*S;
% or even simpler
m_3 = (S'/P')';
% all three solutions should give exactly the same result
% the third solution is the neatest but may not work in every version of matlab
% your modeled vector will be
Sm = P*m_3; % you can use any of m_1, m_2 or m_3
% and your residual
R = S - Sm;
If you need to process many vectors, don't use a for loop. For loops are very slow in Matlab and you should use matrices instead, if possible. S can also be an n x k matrix, where k is the number of vectors you want to process. In this case m will be a 4 x k matrix.
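For example, the batched form is a sketch of the same one-liner:

P = [W, X, Y, Z];
M = (P'*P) \ (P'*S); % S is n x k; column j of M holds [a;b;c;d] for S(:,j)
Sm = P*M;            % modeled vectors, n x k
R = S - Sm;          % residuals, n x k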
What you are trying to do is similar to the answer I gave at is there a python (or matlab) function that achieves the minimum MSE between given set of output vector and calculated set of vector?.
The procedure is similar to what you are doing in Excel with Solver. You create an objective function that takes (a, b, c, d) as the input parameters and outputs a measure of fit (MSE), then use fmincon or a similar solver to find the (a, b, c, d) that minimize this MSE. See the code below (I have no MATLAB at hand to test it, but it should work).
function [a, b, c, d] = optimize_abcd(W, X, Y, Z, S)
%=========================================================
% Second argument of fmincon is the starting point; the second-to-last
% argument is the lower bound, which ensures the solutions are all positive.
% Note: S is passed in as well, since the nested MSE function references it.
res = fmincon(@MSE, [0,0,0,0], [], [], [], [], [0,0,0,0], []);
a = res(1);
b = res(2);
c = res(3);
d = res(4);
    function mse = MSE(x_)
        a_ = x_(1);
        b_ = x_(2);
        c_ = x_(3);
        d_ = x_(4);
        S_ = a_*W + b_*X + c_*Y + d_*Z;
        mse = norm(S_ - S);
    end
end
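A hypothetical call, assuming W, X, Y, Z and S are the n x 1 vectors from the question:

[a, b, c, d] = optimize_abcd(W, X, Y, Z, S);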

Draw random numbers from a custom probability density function in Matlab

I want to sample R random numbers from a custom probability density function in Matlab.
This is the expression of the probability density function evaluated at x:
f(x) = 1/(2*sqrt(pi)) * x^(-3/2) * exp(-1/(4*x)), for x > 0.
I thought about using slicesample
R=10^6;
f = @(x) 1/(2*pi^(1/2))*(1/(x^(3/2)))*exp(-1/(4*x));
epsilon= slicesample(0.3,R,'pdf',f,'thin',1,'burnin',1000);
However, it does not work because I get the error
Error using slicesample (line 175)
The step-out procedure failed.
I tried to change starting value and values of thin and burning parameters but it does not seem to work. Could you advise, either on how to make slicesample work or on alternative solutions to sample random numbers from a custom probability density function in Matlab?
Let X be a random variable distributed according to your target pdf. Applying the change of variable y = 1/x and using the well-known theorem for a function of a random variable, the distribution of Y = 1/X is recognized to be a Gamma distribution with parameters α = 1/2, β = 1/4.
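Spelled out, with x = 1/y and |dx/dy| = 1/y^2:

f_Y(y) = f_X(1/y) * (1/y^2) = 1/(2*sqrt(pi)) * y^(3/2) * exp(-y/4) * y^(-2) = 1/(2*sqrt(pi)) * y^(-1/2) * exp(-y/4)

which is exactly the Gamma density β^α/Γ(α) * y^(α-1) * exp(-βy) with α = 1/2, β = 1/4, since β^α/Γ(α) = (1/2)/sqrt(pi).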
Therefore, it suffices to generate a Gamma random variable (using gamrnd) with those parameters and take the inverse. Note that Matlab's definition of the Gamma distribution uses parameters A = α, B = 1/β.
R = 1e5; % desired sample size
x = 1./gamrnd(1/2, 4, [1 R]); % result
Check:
histogram(x, 'Normalization', 'pdf', 'BinEdges', 0:.1:10)
hold on
f = @(x) 1/2/sqrt(pi)./x.^(3/2).*exp(-1/4./x); % target pdf
fplot(f, 'linewidth', .75)

numerical integration for Gaussian function - indefinite integral

My approach
fun = @(y) (1/sqrt(pi))*exp(-(y-1).^2).*log(1 + exp(-4*y));
integral(fun,-Inf,Inf)
This gives NaN.
So I tried plotting it.
y= -10:0.1:10;
plot(y,exp(-(y-1).^2).*log(1 + exp(-4*y)))
Then I understood that the significant part of the domain is roughly from -4 to +4.
So changed the limits to
integral(fun,-10,10)
However, I do not want to always plot the graph first to find its limits. So is there any way to compute the integral directly from -Inf to Inf?
Discussion
If your integrals are always of the form
I = ∫ exp(-x^2) f(x) dx, taken over the whole real line,
I would use a high-order Gauss–Hermite quadrature rule.
It's similar to the Gauss-Legendre-Kronrod rule that forms the basis for quadgk but is specifically tailored for integrals over the real line with a standard Gaussian multiplier.
Rewriting your equation with the substitution x = y-1, we get
I = (1/sqrt(pi)) ∫ exp(-x^2) log(1 + exp(-4(x+1))) dx
The integral can then be computed using the Gauss-Hermite rule of arbitrary order (within reason):
>> order = 10;
>> [nodes,weights] = GaussHermiteRule(order);
>> f = @(x) log(1 + exp(-4*(x+1)))/sqrt(pi);
>> sum(f(nodes).*weights)
ans =
0.1933
I'd note that the function below builds a full order x order matrix to compute nodes, so it shouldn't be made too large.
There is a way to avoid this by explicitly computing the weights, but I decided to be lazy.
Besides, even at order 100, the Gaussian multiplier is about 2E-98, so the integrand's contribution is extremely minimal.
And while this isn't inherently adaptive, a high-order rule should be sufficient in most cases ... I hope.
Code
function [nodes,weights] = GaussHermiteRule(n)
% ------------------------------------------------------------------------------
% Find the nodes and weights for a Gauss-Hermite Quadrature integration.
%
if (n < 1)
error('There is no Gauss-Hermite rule of order 0 or lower.');
elseif (abs(n - round(n)) > eps())
error('Given order ''n'' must be a strictly positive integer.');
else
n = round(n);
end
% Get the nodes and weights from the Golub-Welsch function
k = (0:n)' ;  % recurrence indices k = 0..n
b = k*0 ;     % Hermite recurrence: b_k = 0
a = b + 0.5 ; % a_k = 1/2
c = k ;       % c_k = k
[nodes,weights] = GolubWelsch(a,b,c,sqrt(pi));
end
function [xk,wk] = GolubWelsch(ak,bk,ck,mu0)
%GolubWelsch
% Calculate the approximate* nodes and weights (normalized to 1) of an orthogonal
% polynomial family defined by a three-term recurrence relation of the form
%     x p_k(x) = a_k p_{k+1}(x) + b_k p_k(x) + c_k p_{k-1}(x)
%
% The weight scale factor mu0 is the integral of the weight function over the
% orthogonal domain.
%
% Calculate the terms for the orthonormal version of the polynomials
alpha = sqrt(ak(1:end-1) .* ck(2:end));
% Build the symmetric tridiagonal matrix
T = full(spdiags([[alpha;0],bk,[0;alpha]],[-1,0,+1],length(alpha),length(alpha)));
% Calculate the eigenvectors and values of the matrix
[V,xk] = eig(T,'vector');
% Calculate the weights from the eigenvectors - technically, Golub-Welsch requires
% a normalization, but since MATLAB returns unit eigenvectors, it is omitted.
wk = mu0*(V(1,:).^2)';
end
I've had success with transforming such infinite-bounded integrals using a numerical variable transformation, as explained in Numerical Recipes 3e, section 4.5.3. Basically, you substitute in y=c*tan(t)+b and then numerically integrate over t in (-pi/2,pi/2), which sweeps y from -infinity to infinity. You can tune the values of c and b to optimize the process. This approach largely dodges the question of trying to determine cutoffs in the domain, but for this to work reliably using quadrature you have to know that the integrand does not have features far from y=b.
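A minimal sketch of that substitution for this integral (untested; b is set to the Gaussian peak, and the log term is rewritten in the numerically stable softplus form log(1+exp(z)) = max(z,0) + log1p(exp(-|z|)), so the transformed integrand cannot produce 0*Inf at large |y|):

softplus = @(z) max(z,0) + log1p(exp(-abs(z)));  % stable log(1+exp(z))
fun = @(y) (1/sqrt(pi)) .* exp(-(y-1).^2) .* softplus(-4*y);
c = 1; b = 1;                                    % tuning knobs
g = @(t) fun(c*tan(t) + b) .* c .* sec(t).^2;    % y = c*tan(t) + b, dy = c*sec(t)^2 dt
I = integral(g, -pi/2, pi/2)                     % should agree with the ~0.1933 above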
A quick and dirty solution would be to look for a position where your function is sufficiently small and then take that as the limits. This assumes that for x > 0 the function fun decreases monotonically and that fun(x) is roughly the same size as fun(-x) for all x.
%// A small number
epsilon = eps;
%// Stepsize for searching bound
stepTest = 1;
%// Starting position for searching bound
position = 0;
%// Not yet small enough
smallEnough = false;
%// Search bound
while ~smallEnough
smallEnough = (fun(position) < epsilon);
position = position + stepTest;
end
%// Calculate integral
integral(fun, -position, position)
If you were happy with plotting the function and deciding by eye where to cut, then this code will suffice, I guess.

Basic SVM Implemented in MATLAB

Linearly Non-Separable Binary Classification Problem
First of all, this program isn't working correctly for RBF ( gaussianKernel() ) and I want to fix it.
It is a non-linear SVM demo to illustrate classifying 2 classes with a hard-margin application.
The problem is about 2-dimensional, radially distributed random data.
I used the quadratic programming solver quadprog to compute the Lagrange multipliers (alphas):
xn = input .* (output*[1 1]);     % xi*yi
phi = gaussianKernel(xn, sigma2); % radial basis function
k = phi * phi';                   % symmetric kernel matrix for the QP solver
gamma = 1;                        % adjusts the upper bound of the alphas
f = -ones(2 * len, 1);            % coefficient of the sum of alphas
Aeq = output';                    % yi
beq = 0;                          % sum(ai*yi) = 0
A = zeros(1, 2 * len);            % A * alpha <= b; there is no such term here
b = 0;                            % there is no such term here
lb = zeros(2 * len, 1);           % lower bound of the alphas
ub = gamma * ones(2 * len, 1);    % upper bound of the alphas
alphas = quadprog(k, f, A, b, Aeq, beq, lb, ub);
To solve this non-linear classification problem, I wrote some kernel functions such as Gaussian (RBF), homogeneous and non-homogeneous polynomial kernel functions.
For RBF, I implemented the Gaussian kernel
K(x, x') = exp( -||x - x'||^2 / (2*sigma^2) )
Using a Taylor series expansion of the cross term exp(2*gamma*<x, x'>), it yields a truncated explicit feature map phi, and I separated the Gaussian kernel like this:
K(x, x') = phi(x)' * phi(x')
The implementation of this thought is:
function phi = gaussianKernel(x, Sigma2)
gamma = 1 / (2 * Sigma2);
featDim = 10; % length of the Taylor series; the terms converge to 0, so it doesn't have to be infinite-dimensional
phi = [];     % kernel output; the dimension will be (#samples) x (featDim*2)
for k = 0 : (featDim - 1)
    % Gaussian kernel trick using the Taylor series expansion
    phi = [phi, exp( -gamma .* (x(:, 1)).^2) * sqrt(gamma^2 * 2^k / factorial(k)) .* x(:, 1).^k, ...
                exp( -gamma .* (x(:, 2)).^2) * sqrt(gamma^2 * 2^k / factorial(k)) .* x(:, 2).^k];
end
end
*** I think my RBF implementation is wrong, but I don't know how to fix it. Please help me here.
Here is what I got as output:
where,
1) The first image : Samples of Classes
2) The second image : Marking The Support Vectors of Classes
3) The third image : Adding Random Test Data
4) The fourth image : Classification
Also, I implemented the homogeneous polynomial kernel K(x, x') = (<x, x'>)^2; the code is:
function phi = quadraticKernel(x)
% 2nd-order homogeneous polynomial kernel (explicit feature map)
phi = [x(:, 1).^2, sqrt(2).*(x(:, 1).*x(:, 2)), x(:, 2).^2];
end
And I got surprisingly nice output:
To sum up, the program works correctly with the homogeneous polynomial kernel, but when I use RBF it doesn't; there is something wrong with the RBF implementation.
If you know about RBF (Gaussian kernel), please let me know how I can make it right.
Edit: If you have the same issue, use the RBF directly as defined above and don't separate it into phi.
Why do you want to compute phi for the Gaussian kernel? Phi would be an infinite-dimensional vector, and you are bounding the terms in your Taylor series to 10 when we don't even know whether 10 is enough to approximate the kernel values or not! Usually, the kernel is computed directly instead of getting phi (and then computing K). For example [1].
Does this mean we should never compute phi for the Gaussian? Not really, no, but we can be slightly smarter about it. There have been recent works [2,3] which show how to compute phi for the Gaussian so that you can compute approximate kernel matrices while having just finite-dimensional phi's. Here [4] I give very simple code to generate the approximate kernel using the trick from the paper. However, in my experiments I needed to generate anywhere from 100- to 10000-dimensional phi's to get a good approximation of the kernel (depending on the number of features the original input had, as well as the rate at which the eigenvalues of the original matrix taper off).
For the moment, just use code similar to [1] to generate the Gaussian kernel and then observe the result of the SVM. Also, play around with the gamma parameter; a bad gamma parameter can result in really bad classification.
[1] https://github.com/ssamot/causality/blob/master/matlab-code/Code/mfunc/indep/HSIC/rbf_dot.m
[2] http://www.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf
[3] http://www.eecs.berkeley.edu/~brecht/papers/08.rah.rec.nips.pdf
[4] https://github.com/aruniyer/misc/blob/master/rks.m
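For reference, a minimal direct computation of the Gaussian kernel matrix in the spirit of [1] (a sketch; rbfKernel is an assumed name, and the implicit-expansion line needs R2016b or later, otherwise use bsxfun):

function K = rbfKernel(X, sigma2)
% X is n-by-d, one sample per row; returns the n-by-n Gaussian kernel matrix
% K(i,j) = exp(-||X(i,:) - X(j,:)||^2 / (2*sigma2)), with no explicit phi
sq = sum(X.^2, 2);        % squared row norms, n-by-1
D = sq + sq' - 2*(X*X');  % pairwise squared Euclidean distances
K = exp(-D / (2*sigma2));
end

In the QP setup above, k = rbfKernel(xn, sigma2) would then replace the phi * phi' product.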
Since the Gaussian kernel is often described as mapping to an infinite-dimensional space, I always have faith in its capacity. The problem here may be due to a bad parameter, keeping in mind that a grid search is always needed for SVM training. Thus I propose you take a look here, where you can find some tricks for parameter tuning. An exponentially increasing sequence is usually used as the candidates.
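A sketch of such a grid search (trainAndValidateSVM is a hypothetical helper that would return, e.g., cross-validated accuracy for one parameter pair):

sigma2Grid = 2.^(-5:2:9);  % exponentially spaced candidates for the RBF width
gammaGrid  = 2.^(-3:2:7);  % and for the box bound on the alphas
best = struct('score', -Inf, 'sigma2', NaN, 'gamma', NaN);
for s2 = sigma2Grid
    for g = gammaGrid
        score = trainAndValidateSVM(input, output, s2, g); % hypothetical helper
        if score > best.score
            best = struct('score', score, 'sigma2', s2, 'gamma', g);
        end
    end
end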

Using Matlab to Minimize the amplitude of this sum

I have output matrix Y as:
Y=E*A;
where
E=Exponential phase vectors (1xM)
A=Fixed complex numbers (MxN)
say,
M=4;N=256;
a = complex(randn(4,256),randn(4,256)); % coefficient matrix of "A"
theta has four values (the phases in E)
Objective:
What range of theta will minimize the peak of the sum expression of the kth column of Y? i.e.,
theta subject to 0 ≤ theta_m < 2*pi, for m = 1,2,...,M
for e.g.,
Min: Y(theta) = Peak{ a11*exp(j*theta1) + a21*exp(j*theta2) + ... + aM1*exp(j*thetaM) }
My Question:
Can I use MATLAB to develop a logic to formulate such a problem and solve it?
I think this is related to Linear Programming with constraints (??).
My approach would be to use the fmincon function from the Matlab Optimization Toolbox.
Your theta vector would be the x vector of solutions you're looking for.
Then specify your problem by defining the objective function f(x) as the max over Yk for all k between 1 and N.
objectiveFunction = @(x) maxY(x, A); % maxY returns the peak over the N columns of Y
Express your constraint 0 <= theta < 2*pi with the bound constraints lb and ub (lower and upper bounds).
Then call fmincon
options = optimset('Algorithm','sqp', ...
'Display', 'iter-detailed',...
'UseParallel', 'always',...
'MaxFunEvals', 3000);
[result, ~, ~] = fmincon(objectiveFunction,x0,[],[],[],[],lb,ub,[],options);
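For concreteness, a sketch under the example sizes above (maxY, x0, lb and ub are the names the snippet leaves undefined; maxY here takes the peak magnitude over all N columns of Y):

M = 4; N = 256;
A = complex(randn(M, N), randn(M, N));               % fixed complex coefficients
maxY = @(theta, A) max(abs(exp(1j*theta(:).') * A)); % peak of |Y_k| over k = 1..N
objectiveFunction = @(theta) maxY(theta, A);
x0 = zeros(1, M);                                    % initial phases
lb = zeros(1, M);                                    % 0 <= theta_m
ub = 2*pi*ones(1, M);                                % theta_m <= 2*pi
options = optimset('Algorithm','sqp','Display','iter-detailed','MaxFunEvals',3000);
[thetaOpt, peakValue] = fmincon(objectiveFunction, x0, [],[],[],[], lb, ub, [], options);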