Custom histogram density evaluation in MatLab

Does MatLab have any built-in function to evaluate the density of a random variable from a custom histogram? (I suspect there are lots of ways to do this; I am just looking to see whether there is already built-in MatLab functionality.)
Thanks.

The function hist gives you an approximation of the probability density you are evaluating (unnormalized, since it returns bin counts).
If you want a continuous representation of it, this article from the MATLAB documentation explains how to get one using spline together with fnder and fnplt from the Curve Fitting Toolbox. Basically, the article interpolates the cumulative area of the histogram with a cubic spline and then differentiates the result, which gives a smooth curve that follows the bar heights.
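The resulting code is:
y = randn(1,5001); % Replace y by your own dataset
hist(y); % Draw the histogram so the smooth curve can be overlaid
[heights,centers] = hist(y);
n = length(centers);
w = centers(2)-centers(1); % Bin width
t = linspace(centers(1)-w/2,centers(end)+w/2,n+1); % Bin edges
dt = diff(t);
Fvals = cumsum([0,heights.*dt]); % Cumulative area under the bars
F = spline(t, [0, Fvals, 0]); % Cubic spline with zero slope at both ends
DF = fnder(F); % Derivative of the spline: the smoothed histogram
hold on
fnplt(DF, 'r', 2)
hold off
ylims = ylim;
ylim([0,ylims(2)]);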
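If a piecewise-constant density is enough, the histogram itself can be normalized and looked up directly. The following is only a sketch, assuming a release that provides histcounts and discretize (R2015a or later); the names p, edges, xq and pq are illustrative and not taken from the article above.

```matlab
rng(0);
y = randn(1,5001);                                   % replace with your own data
[p, edges] = histcounts(y, 'Normalization', 'pdf');  % bar heights integrate to 1

xq = [-1 0 0.5 10];               % points at which to evaluate the density
idx = discretize(xq, edges);      % bin index of each query point, NaN if outside
pq = zeros(size(xq));             % density is taken as 0 outside the histogram
inside = ~isnan(idx);
pq(inside) = p(idx(inside));
```

For a custom histogram, your own bin edges can be passed as the second argument, as in histcounts(y, edges, 'Normalization', 'pdf').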

A popular way is to use kernel density estimation. The simplest way to do this in Matlab is using ksdensity.
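A minimal sketch of the ksdensity route, assuming the Statistics and Machine Learning Toolbox is installed; the sample data and the query points are made up for illustration.

```matlab
rng(0);
y = randn(1,5001);             % replace with your own data
[f, xi] = ksdensity(y);        % density estimate f at 100 automatically chosen points xi
f0 = ksdensity(y, [-1 0 1]);   % or evaluate the estimate at specific points
plot(xi, f);
```

The second call is probably the closest match to "evaluating the density" at given values, although it smooths the raw data rather than a histogram you have already built.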

Related

How to plot precision and recall of a CNN in MATLAB?

How to plot the precision and recall curves of a CNN?
I have generated the scores from CNN and want to plot the precision-recall curve, but I am unable to get that.
I have calculated TP, TN, FP, and FN using:
idx = (ACTUAL()==1);
p = length(ACTUAL(idx));
n = length(ACTUAL(~idx));
N = p+n;
tp = sum(ACTUAL(idx)==PREDICTED(idx));
tn = sum(ACTUAL(~idx)==PREDICTED(~idx));
fp = n-tn;
fn = p-tp;
The formulas for precision and recall are
precision = tp/(tp+fp)
recall = tp/(tp+fn)
but with these, I am getting an undesired plot.
I have obtained scores of the CNN using the following command:
[YTest,score]=classify(convnet,TestData)

MATLAB has a function for creating ROC curves and similar performance curves (such as precision-recall curves) in the Statistics and Machine Learning Toolbox: perfcurve.
By default, the ROC curve is calculated.
The function has the following syntax:
[X, Y] = perfcurve(labels, scores, posclass)
Here, labels is the true label for each sample, scores is the prediction of the CNN (or any other classifier), and posclass is the label of the class you assume to be "positive" - which appears to be 1 in your example. The outputs of the perfcurve function are the (x, y) coordinates of the ROC curve, so you can easily plot it using
plot(X, Y)
To make perfcurve plot the precision-recall curve instead of the ROC curve, you have to set the optional 'XCrit' and 'YCrit' arguments of the function. As described in the documentation, different pre-defined criteria such as number of false positives ('fp'), true positive rate ('tpr'), accuracy ('accu') and many more, or even custom functions can be used.
By setting 'XCrit' to 'tpr' (Recall) and 'YCrit' to 'prec' (Precision), a precision-recall curve is created:
[X, Y] = perfcurve(labels, scores, posclass, 'XCrit', 'tpr', 'YCrit', 'prec');
plot(X, Y);
xlabel('Recall')
ylabel('Precision')
xlim([0, 1])
ylim([0, 1])
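If the toolbox is not available, the same curve can be approximated by hand by sweeping a cut-off over the sorted scores. This is only a sketch using synthetic labels and scores (both are made-up stand-ins for real data), and it ignores tied scores, which perfcurve handles more carefully.

```matlab
% Synthetic stand-ins for the true labels and classifier scores (assumed data)
rng(0);
labels = [ones(50,1); zeros(50,1)];    % 1 = positive class
scores = [0.6 + 0.25*randn(50,1); 0.4 + 0.25*randn(50,1)];

% Sort by descending score; each prefix is the set predicted positive
[~, order] = sort(scores, 'descend');
isPos = labels(order) == 1;

tp = cumsum(isPos);     % true positives at each cut-off
fp = cumsum(~isPos);    % false positives at each cut-off

recall    = tp / sum(isPos);
precision = tp ./ (tp + fp);

plot(recall, precision);
xlabel('Recall'); ylabel('Precision');
xlim([0 1]); ylim([0 1]);
```

With the output of classify, the score variable is assumed here to hold one column per class, so a single column (the one belonging to the positive class) would be used as scores.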
For example (using randomly generated data and an SVM): [figure not preserved]

The answer of hbaderts is correct but the end of the answer is wrong.
[X,Y] = perfcurve(labels,scores,posclass,'xCrit', 'fpr', 'yCrit', 'tpr');
Then the generated Receiver operating characteristic (ROC) curve is correct.

Fourier transform of normal density function

I am using the following MATLAB code to perform a Fourier transformation of a normal density function:
N=100;
j=0:(N-1);
a=-5;
b=5;
dx = (b-a)/N;
x = a+j*dx;
dt = 2*pi/(N*dx);
f1 = -N/2*dt;
f2 = N/2*dt;
t= f1+ j*dt;
GX = normpdf(x,0,1);
fft_GX = real(fft(GX))';
However, I do not get the expected bell-shaped curve when I try to plot fft_GX.
The Fourier transform of a normal density has the form e^(-t^2/2). Can someone please help me see what I am doing incorrectly?

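Try using abs instead of real. Because your x grid starts at -5 rather than 0, the FFT output carries a phase factor that flips the sign of every other sample, so the real part oscillates while the magnitude does not.
Another helpful function to recenter the frequency domain is fftshift. Otherwise you will see the plot from 0 to 2*pi, I believe, instead of the more recognizable view from -pi to pi.
fft_GX = abs(fftshift(fft(GX)))';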
plot(fft_GX);
You may need to do some further normalization based on the number of samples you have, but it looks more like the expected bell curve than what you were seeing originally.
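One possible normalization, as a sketch: scale by the grid spacing dx so the sum approximates an integral, and multiply by a phase factor that accounts for the grid starting at a instead of 0. The convention assumed here is F(t) = integral of f(x)*exp(-1i*t*x) dx, and N is assumed to be even. The density is written out explicitly so that normpdf is not needed.

```matlab
N  = 100;
a  = -5;  b = 5;
dx = (b - a)/N;
x  = a + (0:N-1)*dx;
dt = 2*pi/(N*dx);
t  = (-N/2:N/2-1)*dt;               % frequency grid centred on zero

GX = exp(-x.^2/2)/sqrt(2*pi);       % standard normal density, same values as normpdf(x,0,1)

% dx turns the sum into an approximate integral;
% exp(-1i*t*a) corrects for x starting at a rather than 0
F = dx * exp(-1i*t*a) .* fftshift(fft(GX));

plot(t, real(F), 'o', t, exp(-t.^2/2), '-');
legend('FFT approximation', 'exp(-t^2/2)');
```

With these settings the real part of F should follow exp(-t.^2/2) closely, and the imaginary part should be negligible.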

pchip for angular data

I'm trying to fit a shape-preserving interpolation curve to my angular data (r/phi).
However, as I have repeated x-values when I transform the data points to (x/y), I cannot simply use pchip.
I know that for spline interpolation there are cscvn and fnplt. Is there anything similar for pchip?
Furthermore, there is an example of spline fitting to angular data in the MATLAB documentation of "spline", but I don't quite see how I could adapt it to pchip and different data points.
I also found the interparc function by John d'Errico, but I would like to keep my data points instead of having equally spaced ones.
To make it clearer, here is a figure of my data points with linear (blue) and spline interpolation (black). The curve I'd like to get would be something in between these two, without the steep edges in the linear case but with less overshoot than in the spline case.
Thanks for your help!

Use 1D parametric interpolation, treating x and y as separate functions of a parameter t:
n = 20;
r = 1 + rand(n-1,1)*0.01;%noisy r's
theta = sort(2*pi*rand(n-1,1));
% closing the circle
r(end+1) = r(1);
theta(end+1) = theta(1);
% convert to cartesian
[x,y] = pol2cart(theta,r);
% interpolate with parameter t
t = (1:n)';
v = [x,y];
tt = linspace(1,n,100);
X = interp1(t,v,tt,'pchip');
% plot
plot(x,y,'o');
hold on
plot(X(:,1),X(:,2));
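A variation on the same idea, as a sketch: when the points are unevenly spaced, the cumulative chord length can serve as the parameter instead of the point index. The r and phi values below are hypothetical sample data, not taken from the question.

```matlab
% Hypothetical sample data in polar form (replace with your own r/phi values)
phi = [0 0.6 1.4 2.1 3.0 3.9 4.6 5.5]';
r   = [1.0 1.3 0.9 1.2 1.0 1.4 0.8 1.1]';

% Close the curve by repeating the first point
phi(end+1) = phi(1);
r(end+1)   = r(1);
[x, y] = pol2cart(phi, r);

% Chord-length parameter: cumulative distance between consecutive points
t = [0; cumsum(hypot(diff(x), diff(y)))];

% Interpolate x(t) and y(t) separately with pchip
tt = linspace(0, t(end), 400)';
xx = pchip(t, x, tt);
yy = pchip(t, y, tt);

plot(x, y, 'o', xx, yy, '-');
axis equal
```

The original data points are kept exactly, since pchip interpolates them. Note that pchip has no periodic end condition, so the curve may show a slight kink where it closes.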

How to write MatLab Code for bimodal Probability Density Functions?

I want to write a bimodal Probability Density Function (PDF with multiple peaks, Galtung S) without using the pdf function from statistics toolbox. Here is my code:
x = 0:0.01:5;
d = [0.5;2.5];
a = [12;14]; % scale parameter
y = 2*a(1).*(x-d(1)).*exp(-a(1).*(x-d(1)).^2) + ...
2*a(2).*(x-d(2)).*exp(-a(2).*(x-d(2)).^2);
plot(x,y)
Here's the curve (figure not preserved).
I would like to change the mathematical formula to get rid of the dips in the curve that appear at approx. 0<x<.5 and 2<x<2.5.
Is there a way to implement x>d(1) and x>d(2) in line 4 of the code to avoid y < 0? I would rather not solve this with a loop because I need to convert the formula to a CDF later on.

If you want to plot only for x>max(d), you can use logical indexing:
plot(x(x>max(d)),y(x>max(d)))
If you want to plot for all x but show max(y,0) instead of y, you can simply write:
plot(x,max(y,0))
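To apply the conditions x>d(1) and x>d(2) inside the formula itself, without a loop, each term can be multiplied by a logical mask. The sketch below assumes implicit expansion (R2016b or later; bsxfun would be needed on older releases). The equal weights w are an added assumption so that the result integrates to 1, and the CDF line uses the closed form 1-exp(-a*(x-d)^2) of each masked term.

```matlab
x = 0:0.01:5;
d = [0.5; 2.5];     % shift of each component
a = [12; 14];       % scale parameter of each component
w = [0.5; 0.5];     % mixture weights (assumed; they must sum to 1)

u = x - d;          % 2-by-N matrix, one row per component
mask = u > 0;       % each component is active only for x > d(k)

y = sum(w .* (2*a.*u.*exp(-a.*u.^2)) .* mask, 1);   % bimodal PDF, never negative
F = sum(w .* (1 - exp(-a.*u.^2)) .* mask, 1);       % matching CDF

plot(x, y, x, F);
legend('PDF', 'CDF');
```

Unlike max(y,0), which clips the sum after the fact, the mask removes the negative part of each component separately, so the tail of the first peak is not distorted by the second term.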

What interpolation technique does Matlab plot function use to show the data?

It seems to be a very basic question, but I wonder: when I plot x values against y values, what interpolation technique is used behind the scenes to show the discrete data as continuous? Consider the following example:
x = 0:pi/100:2*pi;
y = sin(x);
plot(x,y)
My guess is it is a Lagrangian interpolation?

No, it's just a linear interpolation. Your example uses a quite long dataset, so you can't tell the difference. Try plotting a short dataset and you'll see it.

MATLAB's plot performs simple linear interpolation. For finer resolution you'd have to supply more sample points or interpolate between the given x values.
For example, taking the sine from FamousBlueRaincoat's answer, one can just create an x vector with more equidistant values. Note that the linearly interpolated values coincide with the original plot lines, because plot itself uses linear interpolation. Note also that the x_ip vector does not include all of the original points, which is why the curves do not coincide at the point (~0.8, ~0.7).
Code:
x = 0:pi/4:2*pi;
y = sin(x);
x_ip = linspace(x(1),x(end),5*numel(x));
y_lin = interp1(x,y,x_ip,'linear');
y_pch = interp1(x,y,x_ip,'pchip');
y_v5c = interp1(x,y,x_ip,'v5cubic');
y_spl = interp1(x,y,x_ip,'spline');
plot(x,y,x_ip,y_lin,x_ip,y_pch,x_ip,y_v5c,x_ip,y_spl,'LineWidth',1.2)
set(gca,'xlim',[pi/5 pi/2],'ylim',[0.5 1],'FontSize',16)
hLeg = legend(...
'No Interpolation','Linear Interpolation',...
'PChip Interpolation','v5cubic Interpolation',...
'Spline Interpolation');
set(hLeg,'Location','south','Fontsize',16);
By the way, this also applies to mesh and other plotting functions:
[X,Y] = meshgrid(-8:2:8);
R = sqrt(X.^2 + Y.^2) + eps;
Z = sin(R)./R;
figure
mesh(Z)

No, Lagrangian interpolation with 200 equally spaced points would be an incredibly bad idea. (See: Runge's phenomenon).
The plot command simply connects the given (x,y) points by straight lines, in the order given. To see this for yourself, use fewer points:
x = 0:pi/4:2*pi;
y = sin(x);
plot(x,y)