I have a question regarding portfolio optimization in MATLAB. Is there a way to plot and obtain the values on the INefficient frontier (the bottom locus of points that envelops the feasible solutions, as opposed to the efficient frontier, which envelops the top portion)?
if ~exist('quadprog')
    msgbox('The Optimization Toolbox(TM) is required to run this example.','Product dependency')
    return
end
returns = [0.1 0.15 0.12];
STDs = [0.2 0.25 0.18];
correlations = [1.0 0.3 0.4
                0.3 1.0 0.3
                0.4 0.3 1.0];
% Convert STDs and correlations to a covariance matrix
covariances = corr2cov(STDs , correlations);
% Calculating and Plotting Efficient Frontier
portopt(returns , covariances , 20)
% Random portfolio generation
weights = exprnd(1,1000,3);
total = sum(weights , 2);
total = total(:,ones(3,1));
weights = weights./total;
[portRisk , portReturn] = portstats(returns , covariances , weights);
hold on
plot(portRisk , portReturn , '.r')
title('Mean-Variance Efficient Frontier and Random Portfolios')
hold off
Is there a way/command to obtain the return/risk/weights of the lower envelope of the feasible region, in the same way that the efficient frontier can be calculated?
Thanks in advance!
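One idea, for what it's worth (a sketch rather than a built-in command; it assumes quadprog and the returns/covariances defined above): the bottom branch of the minimum-variance frontier can be traced by minimizing portfolio variance for target returns between the lowest asset return and the return of the global minimum-variance portfolio.
nAssets = numel(returns);
lb = zeros(nAssets,1); % long-only weights, matching the random portfolios above
ub = ones(nAssets,1);
opts = optimoptions('quadprog','Display','off');
% global minimum-variance (GMV) portfolio: budget constraint only
wGMV = quadprog(2*covariances,[],[],[],ones(1,nAssets),1,lb,ub,[],opts);
rGMV = returns*wGMV;
% sweep target returns below rGMV; the minimum-risk portfolio at each
% target lies on the lower (inefficient) branch of the frontier
targetRet = linspace(min(returns), rGMV, 20);
lowRisk = zeros(size(targetRet));
lowW = zeros(nAssets, numel(targetRet)); % weights for each point
Aeq = [returns; ones(1,nAssets)]; % return target plus budget constraint
for k = 1:numel(targetRet)
    w = quadprog(2*covariances,[],[],[],Aeq,[targetRet(k);1],lb,ub,[],opts);
    lowRisk(k) = sqrt(w'*covariances*w);
    lowW(:,k) = w;
end
hold on
plot(lowRisk, targetRet, 'g-') % lower envelope of the feasible region
hold off
Note this traces the classic lower branch of the minimum-variance bullet; depending on the constraints, the true bottom boundary of a long-only feasible set may include additional segments.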
I want to generate 100 different 5-by-5 random matrices in [0,1] using MATLAB, with the following property: the required matrix A=[a_{ij}] satisfies the condition a_{ih}+a_{hj}-a_{ij}-0.5=0 for all i, j, h (matrix A is a so-called fuzzy preference matrix, i.e. a_{ji}=1-a_{ij} for all i, j, and it is also consistent). But I am stuck writing the MATLAB code. Could anyone help me? Thanks!
Example of such a matrix, but not consistent:
A=[.5    .5    .5    .8155 .5   .3423;...
   .5    .5    .6577 .8155 .5   .3423;...
   .5    .3423 .5    .8662 .75  .3423;...
   .1845 .1845 .1338 .5    .25  .25;...
   .5    .5    .25   .75   .5   .25;...
   .6577 .6577 .6577 .75   .75  .5]
Example of a 3-by-3 consistent fuzzy preference matrix:
B=[.5 .2 .5;...
   .8 .5 .8;...
   .5 .2 .5]
To solve this question we need to find solutions to a system of linear equations. The unknowns are the matrix entries, so for an NxN matrix there will be N^2 unknowns. There are N^3 equations, since there is an equation for each combination of i, j, and h. This is an overdetermined system, but it is consistent: it's easy to see that the constant matrix with all entries 0.5 is a solution.
The system of equations can be written in matrix form, with matrix M defined as follows:
N = 5; % size of resulting matrix
M = zeros(N^3,N^2); % one row per equation (i,j,h), one column per entry of A
t = 0;
for i = 1:N
    for j = 1:N
        for h = 1:N
            t = t+1;
            M(t,(i-1)*N+h) = M(t,(i-1)*N+h)+1; % coefficient of a_{ih}
            M(t,(h-1)*N+j) = M(t,(h-1)*N+j)+1; % coefficient of a_{hj}
            M(t,(i-1)*N+j) = M(t,(i-1)*N+j)-1; % coefficient of a_{ij}
        end
    end
end
The right-hand side is a vector of N^3 0.5's. The rank of the matrix appears to be N^2-N+1 (I think this is because the main diagonal of the result must be filled with 0.5, so there are really only N^2-N free unknowns). The rank of the augmented system is the same, so we do have an infinite number of solutions, forming an (N-1)-dimensional affine space.
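These rank claims are easy to check numerically (a quick sanity check, reusing M from above):
rank(M) % appears to be N^2 - N + 1
rank([M, 0.5*ones(N^3,1)]) % same rank, so the augmented system is consistent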
So we can find the solutions as the sum of the vector of 0.5's plus any element of the null space of the matrix.
a = 0.5*ones(N^2,1) % particular solution to the non-homogeneous equation
nullspace = null(M) % columns form a basis for the null space
So we can now generate as many solutions as we like by adding multiples of the basis vectors of the null space to a,
s = a+sum(rand(1,N-1).*nullspace,2) % always a solution
The final problem is to sample uniformly from this solution space while requiring that all the entries stay within [0,1].
I think it's hard even to say what uniform sampling over this space is, but at least I can generate random elements by the following:
% First check the size of the largest element:
% we can't add/subtract more than 0.5 to the particular solution
for p = 1:size(nullspace,2)
    mn(p) = 0.5/max(abs(nullspace(:,p)));
end
c = 0;
mat = {};
while c < 5 % generate 5 matrices
    % a solution is the particular part 0.5*ones(N) plus a random
    % element of the null space (scaled so no entry moves by more than 0.5),
    % reshaped to be a square matrix
    newM = 0.5*ones(N) + reshape(sum((2*rand(1,size(nullspace,2))-1).*mn.*nullspace,2),N,N)
    % make sure the entries are all in [0,1]
    if all(newM(:)>=0) && all(newM(:)<=1)
        c = c+1;
        mat{c} = newM;
    end
end
% mat is a cell array of 5 matrices satisfying the condition
I don't have much idea about the distribution of the resulting matrices.
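As a sanity check (a sketch, reusing N and mat from above), you can verify that a generated matrix satisfies the defining condition up to rounding error:
A = mat{1};
maxViolation = 0;
for i = 1:N
    for j = 1:N
        for h = 1:N
            maxViolation = max(maxViolation, abs(A(i,h)+A(h,j)-A(i,j)-0.5));
        end
    end
end
maxViolation % should be on the order of machine precision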
I am using the MATLAB curve-fitting tool cftool to fit my data. The issue is that the y values vary over a very large range (strongly decreasing) with respect to x. A sample is given below:
x      y
0.1    237.98
1      25.836
10     3.785
30     1.740
100    0.804
300    0.431
1000   0.230
2000   0.180
The fitted form is y = a/x^b + c/x^d, with a, b, c, and d as constants. The curve fit from MATLAB is quite accurate for large y-values (that is, at the lower x-range), with less than 0.1% deviation. However, at higher x-values the accuracy of the fit is not good (around 11% deviation). I would also like to include the % deviation in the curve-fitting iteration to make sure the data is captured exactly. A plot of the fit and the data is given for reference.
Can anyone suggest better ways to fit the data?
The most common way to fit a curve is to do a least squares fit, which minimizes the sum of the square differences between the data and the fit. This is why your fit is tighter when y is large: an 11% deviation on a value of 0.18 is only a squared error of 0.000392, while a 0.1% deviation on a value of 240 is a squared error of 0.0576, much more significant.
If what you care about is relative deviations rather than absolute (squared) errors, then you can either reformulate the fitting algorithm or transform your data in a clever way. The latter is a common and useful tool to know.
One way to do this in your case is to fit log(y) instead of y. This has the effect of penalizing relative rather than absolute error:
data = [0.1  237.98
        1    25.836
        10   3.785
        30   1.740
        100  0.804
        300  0.431
        1000 0.230
        2000 0.180];
x = data(:,1);
y = data(:,2);
% Set up fittype and options.
ft = fittype( 'a/x^b + c/x^d', 'independent', 'x', 'dependent', 'y' );
opts = fitoptions( 'Method', 'NonlinearLeastSquares' );
opts.Display = 'Off';
opts.StartPoint = [0.420712466925742 0.585539298834167 0.771799485946335 0.706046088019609];
%% Usual least-squares fit
[fitresult] = fit( x, y, ft, opts );
yhat = fitresult(x);
% Plot fit with data.
figure
semilogy( x, y );
hold on
semilogy( x, yhat);
deviation = abs((y-yhat))./y * 100
%% log-transformed fit
[fitresult] = fit( x, log(y), ft, opts );
yhat = exp(fitresult(x));
% Plot fit with data.
figure
semilogy( x, y );
hold on
semilogy( x, yhat );
deviation = abs((y-yhat))./y * 100
One approach would be to fit to the lowest sum-of-squared relative error rather than the lowest sum-of-squared absolute error. When I use the data posted in your question, fitting to the lowest sum-of-squared relative error yields +/- 4 percent error, so this may be a useful option. To see whether you might want to consider this approach, here are the coefficients I determined from your posted data using this method:
a = 2.2254477037465399E+01
b = 1.0038013513610324E+00
c = 4.1544917994119190E+00
d = 4.2684956973959676E-01
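If you want to reproduce this kind of fit yourself, here is a minimal sketch using lsqnonlin (Optimization Toolbox) to minimize the squared relative residuals directly; it assumes x and y from the earlier answer, and the starting point p0 is just a rough guess informed by the coefficients above:
model  = @(p,x) p(1)./x.^p(2) + p(3)./x.^p(4);
relres = @(p) (model(p,x) - y)./y; % vector of relative residuals
p0 = [20 1 4 0.4]; % rough starting guess (an assumption)
p  = lsqnonlin(relres, p0); % minimizes the sum of squared relative errors
maxPctDev = max(abs(relres(p)))*100 % worst-case percent deviation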
The perfcurve function in MATLAB falsely asserts AUC = 1 when two records are clearly misclassified for reasonable cutoff values.
If I run the same data through a confusion matrix with cutoff 0.5, the accuracy is rightfully below 1.
The MWE contains data from one of my folds. I noticed the problem because I saw perfect auc with less than perfect accuracy in my results.
I use MATLAB R2016a and Ubuntu 16.04 64-bit.
% These are the true classes in one of my test-set folds
classes = transpose([ones(1,9) 2*ones(1,7)])
% These are predictions from my classifier
% Class 1 is very well predicted
% Class 2 has two records predicted as class 1 with threshold 0.5
confidence = transpose([1.0 1.0 1.0 1.0 0.9999 1.0 1.0...
1.0 1.0 0.0 0.7694 0.0 0.9917 0.0 0.0269 0.002])
positiveClass = 1
% Nevertheless, the AUC yields a perfect 1
% I understand why X, Y, T have fewer values than classes and confidence
% Identical records are dealt with by one point on the ROC curve
[X,Y,T,AUC] = perfcurve(classes, confidence, positiveClass)
% The confusion matrix for comparison
threshold = 0.5
confus = confusionmat(classes,(confidence<threshold)+1)
accuracy = trace(confus)/sum(sum(confus))
This simply means that there is another cutoff where the separation is perfect: the AUC summarizes the ranking over all possible thresholds, not the accuracy at one particular threshold. In your data the lowest class-1 confidence (0.9999) is still higher than the highest class-2 confidence (0.9917), so any threshold between them classifies perfectly.
Try:
threshold = 0.995
confus = confusionmat(classes,(confidence<threshold)+1)
accuracy = trace(confus)/sum(sum(confus))
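You can also read a perfectly separating threshold off perfcurve's own outputs (a sketch, reusing X, Y, T from above): the ROC point (X,Y) = (0,1) means zero false positives and all true positives.
idx = find(X == 0 & Y == 1, 1); % ROC point with FPR = 0 and TPR = 1
perfectThreshold = T(idx) % any threshold at this point separates the classes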
Can someone please help me vectorize a moving-slope calculation? I am trying to eliminate the for loop but I am not sure how to do so.
>> pv = [18 19 20 20.5 20.75 21 21.05 21.07 21.07]'; %% price vector
>> slen = 3; %% slope length
function [slope] = slope(pv , slen)
    svec = (1:1:slen)';
    coef = [];
    slope = zeros(size(pv));
    for i = slen+1 : size(pv,1)
        X = [ones(slen,1) svec];
        y = pv( (i - (slen-1)) : i );
        a = X\y;
        slope(i,1) = a(2);
    end
>> slp = slope(pv,3)
slp =
0
0
0
0.75
0.375
0.25
0.15
0.035
0.01
Thanks
EDIT: completely changing answer to make it scalable
function [slope] = calculate_slope(pv , slen) %% Note: bad practice to give a function and a variable the same name
    svec = (1:1:slen)';
    X = [ones(slen,1) svec];
    %% the following two lines create all the sliding windows of length slen (as a submatrix of a larger matrix)
    c = repmat( flipud(pv), 1, length(pv));
    d = flipud(reshape(c(1:end-1), length(pv)-1, length(pv) + 1));
    %% then run the MATLAB solver on all windows simultaneously
    least_sq_result = X\d( end - slen + 1:end, (slen+1):end);
    slope = [zeros(slen-1, 1); least_sq_result(2,:)']; %% padding with zeros is optional
EDIT: fixed swapped indices
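For reference, calling this on the data from the question (a quick check; note it also fills in the slope of the very first full window, at index slen, which the original loop skipped):
slp2 = calculate_slope(pv, slen) % slopes start at index slen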
Finding the slope in a sliding window using least-squares regression is equivalent to first-order Savitzky-Golay filtering (using a differentiating filter). The concept of SG filtering is to perform local polynomial fits in a sliding window, then use the local model to smooth the signal or compute its derivative. When the data points are equally spaced in time (as they are here), the computation can be run very efficiently by pre-computing a set of filter coefficients and then convolving them with the data. This should be much faster than constructing a giant matrix and doing regression on it.
This is a pretty standard technique, and there's definitely existing MATLAB code floating around. Search for something like 'Savitzky-Golay differentiation'. Note that SG filters can also perform smoothing (the built-in MATLAB SG filtering functions do this), but you want the version that does differentiation.
Savitzky, A. and Golay, M. J. E. (1964). "Smoothing and Differentiation of Data by Simplified Least Squares Procedures." Analytical Chemistry, 36(8), 1627-1639.
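For a linear fit over equally spaced samples, the slope filter has a closed form (the centered window positions divided by their sum of squares), so the whole computation reduces to a single convolution. A minimal sketch along those lines, using trailing windows zero-padded at the start like the vectorized answer above:
pv   = [18 19 20 20.5 20.75 21 21.05 21.07 21.07]';
slen = 3;
svec = (1:slen)';
% least-squares slope coefficients for one window of equally spaced points
g = (svec - mean(svec)) / sum((svec - mean(svec)).^2);
% conv flips its kernel, so flip g to apply it as written
slp = conv(pv, flipud(g), 'valid'); % one slope per full window
slp = [zeros(slen-1,1); slp] % align with the original price vector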
I'm trying to generate a random road which will be used as input for a Quarter-car model.
I used the procedure described in this article http://link.springer.com/article/10.1007%2Fs12544-013-0127-8/fulltext.html .
In Figure 2, generated roads are plotted with a maximum elevation of 15 mm for the A-B category and 100 mm for D-E. My problem is that I get much higher amplitudes than those reported by them.
I'm not sure what I'm doing wrong, any guidance would be appreciated.
Length of road = 250 meters
Spatial frequency band = 0.004 -> 4
I used formula (8) and the simplified version, formula (9), from the article; both give me the same results.
My MATLAB code:
clear all;close all;
% spatial frequency (n0) cycles per meter
Omega0 = 0.1;
% psd ISO (used for formula 8)
Gd_0 = 32 * (10^-6);
% waviness
w = 2;
% road length
L = 250;
%delta n
N = 1000;
Omega_L = 0.004;
Omega_U = 4;
delta_n = 1/L; % delta_n = (Omega_U - Omega_L)/(N-1);
% spatial frequency band
Omega = Omega_L:delta_n:Omega_U;
%PSD of road
Gd = Gd_0.*(Omega./Omega0).^(-w);
% calculate amplitude using formula(8) in the article
%Amp = sqrt(2*Gd*delta_n);
%calculate amplitude using simplified formula(9) in the article
k = 3;
Amp = sqrt(delta_n) * (2^k) * (10^-3) * (Omega0./Omega);
%random phases
Psi = 2*pi*rand(size(Omega));
% x abscissa from 0 to L
x = 0:0.25:250;
% road signal
h = zeros(size(x));
for i = 1:length(x)
    h(i) = sum( Amp.*cos(2*pi*Omega*x(i) + Psi) );
end
plot(x, h*1000 );
xlabel('Distance m');
ylabel('Elevation (mm)');
grid on
In this paper:
Josef Melcer “numerical simulation of vehicle motion along the road structure”, 2012 (just google it)
only the final formula for the road height is given (formula 4), and it is different from the formula in Agostinacchio's paper. The difference is the 2*pi in the cosine term. Deleting the 2*pi term leads to much "better" amplitudes (better in the sense that the scripted plot fits the plots in Agostinacchio's paper better). But I am not sure whether this is physically and mathematically correct.
Do you have another solution?
I managed to contact the author of the article to review my code, and he said it is correct. It seems that the values for k were wrong in the article: k=6 was actually k=5, k=5 was k=4, and so on; that's why the amplitudes were higher than expected.
Of course, the formulas differ slightly from article to article; some use sin() instead of cos(), or the angular spatial frequency (which already includes the 2*pi term) instead of the spatial frequency.
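To illustrate that last point (a sketch reusing Amp, Omega, Psi, and x from the script above; the row/column broadcasting needs R2016b or newer): building the profile from the spatial frequency with an explicit 2*pi, or from the angular spatial frequency with none, gives the same signal. As a bonus, this also vectorizes the original for loop.
OmegaAng  = 2*pi*Omega; % angular spatial frequency, rad/m
h_cycles  = Amp * cos(2*pi*(Omega'*x) + Psi'); % Omega in cycles/m, explicit 2*pi
h_angular = Amp * cos(OmegaAng'*x + Psi'); % rad/m, 2*pi already included
max(abs(h_cycles - h_angular)) % ~0, the two conventions agree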