Assign outcomes to discrete distribution Matlab

I am trying to fit a distribution to a discrete dataset.
The possible outcomes are A = [1 3 4 5 9 10], with respective probabilities
prob = [0.2 0.15 0.1 0.05 0.35 0.15];
I have used makedist to find the distribution
pd = makedist('Multinomial','probabilities', prob);
I wonder if there is a way to include the outcomes in A (values between 1 and 10) in the distribution, so that I can calculate the mean and the variance of the possible outcomes with mean(pd) and var(pd). Currently the mean is 3.65, but my aim is to have mean(pd) = 5.95, which is the probability-weighted sum of the possible outcomes. Thanks in advance.

The solution is pretty easy. The possible outcomes of a multinomial distribution are defined by a sequence of values starting at 1 and ending at numel(prob). From the official documentation page:
Create a multinomial distribution object for a distribution with three
possible outcomes. Outcome 1 has a probability of 1/2, outcome 2 has a
probability of 1/3, and outcome 3 has a probability of 1/6.
pd = makedist('Multinomial','probabilities',[1/2 1/3 1/6])
Basically, your range of possible outcomes (1 to 10) includes a few values, namely 2, 6, 7 and 8, that have zero probability. Thus, define your distribution as follows in order to obtain the expected result:
p = [0.20 0.00 0.15 0.10 0.05 0.00 0.00 0.00 0.35 0.15];
pd = makedist('Multinomial','probabilities',p);
mean(pd) % 5.95
var(pd) % 12.35
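The zero-padded vector can also be built from A and prob instead of typing it by hand. A minimal sketch, assuming A contains only positive integers; the names mu and sigma2 are mine, and the direct formulas are just a cross-check that does not need makedist:

```matlab
A    = [1 3 4 5 9 10];
prob = [0.2 0.15 0.1 0.05 0.35 0.15];

% Place each probability at the index equal to its outcome;
% outcomes that do not appear in A keep probability 0.
p = zeros(1, max(A));
p(A) = prob;

% Cross-check without any toolbox: probability-weighted mean and variance.
mu     = sum(A .* prob);
sigma2 = sum((A - mu).^2 .* prob);
```

Passing p to makedist('Multinomial','probabilities',p) should then give the same values from mean(pd) and var(pd).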

Related

Magnitude ratio fitting of a first order system with cftool

I am trying to plot the magnitude ratio of a first order system using cftool. I'm aware there are other ways to do that, but I need to get to the solution through this method.
I have simulated an RC circuit and, after applying a sine input at several frequencies, I have measured the output of the system.
Here are the vectors I have created in MATLAB with the data I have measured:
f = [1 10 100 120 130 150 160 170 1000 2000 3000 10000];
Vi = zeros(1,12);
Vi(1,:) = 1; %amplitude
Vo = [0.99 0.99 0.85 0.79 0.77 0.73 0.7 0.68 0.16 0.08 0.05 0.02]; %amplitudes
Vdb = 20*log10(Vo./Vi); %Vo converted to dB
Now, given that an RC circuit is a first order system, I know that the relationship between magnitude ratio and frequency can be written as:
M(omega) = 1/(sqrt(1 + (omega * tau)^2))
So, opening cftool in MATLAB, I have set:
X data: f
Y data: Vdb
Custom Equation: 1/sqrt(1 + (2*pi*a*x)^2) %omega = 2*pi*f
Using these settings, however, cftool doesn't plot what I expected to see, so I would like to figure out where my mistakes are.
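I believe the Y data should be Vo, not Vdb: the custom equation models the linear magnitude ratio (between 0 and 1), while Vdb is in decibels.
If you want the curve fit for the relationship between the voltage gain in dB and the frequency, then you need to alter the custom equation, e.g. to 20*log10(1/sqrt(1 + (2*pi*a*x)^2)).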
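To see the difference between the two forms outside of cftool, here is a rough sketch using only base MATLAB. Assumptions that are mine and not from the question: I fit the cutoff frequency fc = 1/(2*pi*tau) instead of tau itself (so the parameter is not tiny compared to the default fminsearch tolerances), I use fminsearch with a plain least-squares cost, and the names magLin, magDb, cost and fcHat are made up for illustration.

```matlab
f  = [1 10 100 120 130 150 160 170 1000 2000 3000 10000];
Vo = [0.99 0.99 0.85 0.79 0.77 0.73 0.7 0.68 0.16 0.08 0.05 0.02];
Vi = ones(size(f));
M  = Vo ./ Vi;                      % linear magnitude ratio

% First-order magnitude ratio, linear and in dB, with omega*tau = f/fc.
magLin = @(fc, f) 1 ./ sqrt(1 + (f ./ fc).^2);
magDb  = @(fc, f) -10 * log10(1 + (f ./ fc).^2);   % = 20*log10(magLin)

% Least-squares fit of the linear model to the linear data.
cost  = @(fc) sum((magLin(fc, f) - M).^2);
fcHat = fminsearch(cost, 150);      % initial guess: gain is ~0.7 near 160 Hz
tauHat = 1 / (2*pi*fcHat);          % corresponding time constant

semilogx(f, 20*log10(M), 'ko', f, magDb(fcHat, f), 'b-');
xlabel('f [Hz]'); ylabel('Magnitude ratio [dB]');
```

The point is only that linear data goes with the linear model and dB data goes with the dB model; mixing them (Vdb with the linear equation) is what gives the unexpected plot.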

decay rate of data in matlab

I want to know how quickly some data returns to baseline after an initial peak (here at ca x=5);
The quadratic fit looks about right (from the figures option of matlab, shown below), but I'm looking for a concise quantification of this curve. I therefore presume that the 'decay rate' of an exponential function would be a very straightforward one.
Is this assumption correct?
If so: I looked at the formula on Wikipedia and tried, unsuccessfully, to solve for the time constant. Can someone help me out, or is this actually not such a trivial problem?
edit: I was planning to find the peak using MathWorks' findpeaks() function, and the lowest point of the curve using the 'inverse' findpeaks() (as in: -y)
%approx data values of the curves below
y= [0 0.07 0.08 0.08 0.08 0.06 0.06 0.05 0.04 0.05 0.04 0.02 0.01 0.02 0.01 0.01 0.03 0.02 0.02 0.02 0.03 0.01 0.02 0.01 0.01 0.03 0.02 0.01 0.02 0.01];
x=1:numel(y);
plot(x,y);
These are the two options I was looking for. Maybe someone can elaborate on or improve this answer regarding the differences between the two approaches; for me this is good enough, thanks for the comments. Before this step, using the data provided in the question, the local maximum and minimum have to be extracted, which can be done easily using findpeaks().
Approach 1) requires the Curve Fitting Toolbox from Matlab [Source]
%Fit a Single-Term Exponential Model, copy from Mathworks documentation, all credits go there
x = (0:0.2:5)';
y = 2*exp(-0.2*x) + 0.1*randn(size(x));
f = fit(x,y,'exp1')
f =
General model Exp1:
f(x) = a*exp(b*x)
Coefficients (with 95% confidence bounds):
a = 2.021 (1.89, 2.151)
b = -0.1812 (-0.2104, -0.152)
plot(f,x,y)
or approach 2) requires the Optimization Toolbox from Matlab [Source]
%copy from Mathworks documentation, all credits go there
rng default % for reproducibility
d = linspace(0,3);
y = exp(-1.3*d) + 0.05*randn(size(d));
fun = @(r)exp(-d*r)-y;
x0 = 4;
x = lsqnonlin(fun,x0)
plot(d,y,'ko',d,exp(-x*d),'b-')
legend('Data','Best fit')
xlabel('t')
ylabel('exp(-tx)')
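Both examples above use synthetic data from the documentation. For the data in the question, a toolbox-free alternative is a log-linear fit of the part after the peak. This is only a sketch under my own assumptions: the baseline is taken as zero, the peak is taken as the first maximum (via max instead of findpeaks), and the names k, tau and a are mine.

```matlab
y = [0 0.07 0.08 0.08 0.08 0.06 0.06 0.05 0.04 0.05 0.04 0.02 0.01 0.02 0.01 ...
     0.01 0.03 0.02 0.02 0.02 0.03 0.01 0.02 0.01 0.01 0.03 0.02 0.01 0.02 0.01];
x = 1:numel(y);

% Keep only the part from the (first) maximum onwards.
[~, iPeak] = max(y);
t     = x(iPeak:end) - x(iPeak);    % time since the peak
yTail = y(iPeak:end);               % must be > 0 for the log below

% Fit log(y) = log(a) - k*t, i.e. y = a*exp(-k*t).
c   = polyfit(t, log(yTail), 1);
k   = -c(1);                        % decay rate
tau = 1 / k;                        % time constant
a   = exp(c(2));                    % fitted amplitude at the peak

plot(t, yTail, 'ko', t, a*exp(-k*t), 'b-');
legend('Data', 'Log-linear fit');
```

If the curve settles at a non-zero baseline, it is probably better to subtract an estimate of that baseline first (or fit a*exp(-k*t)+c with one of the two approaches above), since the zero-baseline assumption would otherwise bias k.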

How can I plot cumulative plots with specific x values?

I was trying to find out how to plot a cumulative distribution function (cdf) with specific x values, but was not successful.
For example, if the dataset is:
x = [2.50 5.21 7.67 8.43 9.15 11.47 14.59 21.45];
y = [0.20 0.09 0.15 0.13 0.17 0.04 0.7 0.15]; % (total 1)
The graph shape definitely looks wrong when I use y = cdfplot(x).
I also plotted the graph with cumsum(y) and x to check the shape and it looks fine, but I would like to know if there is any function that plots cumulative distribution plots from such data.
There's the stairs function for creating "stairstep graphs", which should be exactly what you want, incorporating your cumsum(y) idea.
Please see the following code snippet. I added two additional points for the start and end of some interval, here [0 ... 25]. Also, your values in y sum up to something larger than 1, so I modified these values, too.
x = [0 2.50 5.21 7.67 8.43 9.15 11.47 14.59 21.45 25];
y = [0 0.10 0.09 0.05 0.10 0.14 0.04 0.4 0.08 0];
stairs(x, cumsum(y));
xlim([-1 26]);
ylim([-0.2 1.2]);
That'd be the output (Octave 5.1.0, but also tested with MATLAB Online):
Hope that helps!
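If you'd rather keep the original values than edit them by hand, a variation is to normalize them so that they sum to 1 before taking the cumulative sum. This is a sketch under my own assumption that the values in y are relative weights; the names w, p and F are made up, and the padding by one unit on each side is arbitrary.

```matlab
x = [2.50 5.21 7.67 8.43 9.15 11.47 14.59 21.45];
w = [0.20 0.09 0.15 0.13 0.17 0.04 0.7 0.15];   % original values, sum > 1

p = w / sum(w);        % normalize so the probabilities sum to 1
F = cumsum(p);         % cumulative probability at each x

% Pad with a point before the first x (F = 0) and after the last x (F = 1)
% so the first and last steps are visible.
stairs([x(1)-1, x, x(end)+1], [0, F, 1]);
ylim([-0.2 1.2]);
xlabel('x'); ylabel('F(x)');
```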

Matlab function perfcurve falsely asserts ROC AUC = 1

The perfcurve function in Matlab falsely asserts AUC = 1 when two records are clearly misclassified for reasonable cutoff values.
If I run the same data through a confusion matrix with cutoff 0.5, the accuracy is rightfully below 1.
The MWE contains data from one of my folds. I noticed the problem because I saw perfect auc with less than perfect accuracy in my results.
I use Matlab 2016a and Ubuntu 16.4 64bit.
% These are the true classes in one of my test-set folds
classes = transpose([ones(1,9) 2*ones(1,7)])
% These are predictions from my classifier
% Class 1 is very well predicted
% Class 2 has two records predicted as class 1 with threshold 0.5
confidence = transpose([1.0 1.0 1.0 1.0 0.9999 1.0 1.0...
1.0 1.0 0.0 0.7694 0.0 0.9917 0.0 0.0269 0.002])
positiveClass = 1
% Nevertheless, the AUC yields a perfect 1
% I understand why X, Y, T have fewer values than classes and confidence
% Identical records are dealt with by one point on the ROC curve
[X,Y,T,AUC] = perfcurve(classes, confidence, positiveClass)
% The confusion matrix for comparison
threshold = 0.5
confus = confusionmat(classes,(confidence<threshold)+1)
accuracy = trace(confus)/sum(sum(confus))
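This simply means that there is another cutoff where the separation is perfect. The AUC depends only on how the scores rank the two classes, not on one particular threshold: the lowest class 1 score (0.9999) is still above the highest class 2 score (0.9917), so any cutoff between them classifies every record correctly.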
Try:
threshold = 0.995
confus = confusionmat(classes,(confidence<threshold)+1)
accuracy = trace(confus)/sum(sum(confus))
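You can also check the AUC by hand without perfcurve, using the fact that it equals the fraction of (positive, negative) pairs in which the positive record gets the higher score, with ties counted as one half. A small sketch in base MATLAB; the variable names are mine, and bsxfun is used so that it also runs on releases without implicit expansion:

```matlab
classes    = [ones(1,9) 2*ones(1,7)]';
confidence = [1.0 1.0 1.0 1.0 0.9999 1.0 1.0 1.0 1.0 ...
              0.0 0.7694 0.0 0.9917 0.0 0.0269 0.002]';

pos = confidence(classes == 1);     % scores of the positive class
neg = confidence(classes == 2);     % scores of the negative class

% 9-by-7 matrix of score differences for every (positive, negative) pair.
d = bsxfun(@minus, pos, neg');

% Fraction of correctly ordered pairs; ties count as half.
aucPairs = mean(d(:) > 0) + 0.5 * mean(d(:) == 0);

% Size of the gap in which any threshold separates the classes perfectly.
gap = min(pos) - max(neg);
```

Here every pair is ordered correctly, so aucPairs is 1 even though the cutoff 0.5 misclassifies two records.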

Matlab Portfolio Optimization: Calculating the IN-efficient frontier

I have a question regarding Portfolio Optimization in Matlab. Is there a way to plot and obtain the values on the IN-efficient frontier (the bottom locus of points that envelops the feasible solutions, as opposed to the efficient frontier, which envelops the top portion)?
if ~exist('quadprog')
msgbox('The Optimization Toolbox(TM) is required to run this example.','Product dependency')
return
end
returns = [0.1 0.15 0.12];
STDs = [0.2 0.25 0.18];
correlations = [ 1 0.3 0.4
0.3 1 0.3
0.4 0.3 1 ];
% Converting to correlation and STD to covariance matrix
covariances = corr2cov(STDs , correlations);
% Calculating and Plotting Efficient Frontier
portopt(returns , covariances , 20)
% random Portfolio Generation
weights = exprnd(1,1000,3);
total = sum(weights , 2);
total = total(:,ones(3,1));
weights = weights./total;
[portRisk , portReturn] = portstats(returns , covariances , weights);
hold on
plot(portRisk , portReturn , '.r')
title('Mean-Variance Efficient Frontier and Random Portfolios')
hold off
Is there a way/command to obtain the return/risk/weight of the lower envelope of the feasible solution in the same way that the efficient frontier can be calculated?
Thanks in advance!