How to extract fitted data from normal probability density function - matlab

If I fit univariate data with a normal distribution, how can I get the fitted values back in MATLAB?
I am using this simple example
load hospital % data
x = hospital.Weight;
[mu sigma]=normfit(x) %normal fitting
%To visualize the pdf
xval=min(x):0.1:max(x)
yval=normpdf(xval,mu,sigma)
plot(xval,yval)
yval gives the pdf values at the points in xval. Now, if I would like to extract the fitted values of 'x' after approximating it with the above normal distribution, how do I do that? As can be seen in the picture, the y-axis values are pdf values and lie between 0 and 1; however, I want the corresponding fitted values from the data that follow the normal distribution.
Would the fitted values be x_fitted = yval*sigma + mu? I think I am missing some basic maths here.

normfit simply gives you the mu and sigma of the fitted normal pdf. From those you build that pdf with normpdf. So the desired y values for your input x would be
y = normpdf(x,mu,sigma)
which you could plot with
hold on
plot(x,y,'ro')
Note that, with this procedure, the data lie exactly on the normal pdf, even if those data do not actually follow a normal distribution.
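As a minimal sketch of the same idea without the toolbox functions, the snippet below uses synthetic data in place of hospital.Weight (the numbers are hypothetical), computes mu and sigma with mean and std (which is what normfit returns for uncensored data), and writes the normal pdf out explicitly instead of calling normpdf:

```matlab
% Synthetic stand-in for hospital.Weight (hypothetical values, not the real data set)
x = 150 + 25 * randn(100, 1);

% For uncensored data, normfit returns the sample mean and sample standard deviation
mu = mean(x);
sigma = std(x);

% Normal pdf written out explicitly; same values as normpdf(x, mu, sigma)
y = exp(-0.5 * ((x - mu) / sigma).^2) / (sigma * sqrt(2 * pi));

% Sort by x so the points can also be joined by a line
[xs, order] = sort(x);
plot(xs, y(order), 'ro-');
xlabel('x'); ylabel('pdf value');
```

The values in y are densities evaluated at the data points, not rescaled data, so an expression such as yval*sigma + mu does not recover "fitted" x values.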

Related

How can I take data and plot it as a normal distribution in MATLAB

I am new to MATLAB. I want to take a vector of data, normalize it, and plot it as a normal distribution. I have code to normalize my data and plot it as a histogram, but the result does not look normally distributed. Can someone point me in the right direction? The code below normalizes the data:
subplot(3, 1, 1)
[x, y] = hist(data, 50);
bar(y, x/trapz(y, x))
So this normalizes my histogram but does not make a normally distributed curve. The data is not random and is stored as a vector.
What you probably need is histfit, which produces a histogram and fits a distribution to it at the same time. You can choose the number of bins and the distribution to fit as arguments to the function. The example below uses 100 bins for the histogram and fits a normal curve to your data:
histfit(data, 100, 'normal')
The default number of bins is the square root of the number of elements in data, rounded up, and the default distribution is normal. See the histfit documentation for details.
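If histfit is not available, something similar can be put together by hand. The sketch below is only an illustration under assumptions: the data vector is synthetic, the histogram is normalized by count times bin width (slightly different from the trapz normalization in the question), and the normal curve uses the sample mean and standard deviation:

```matlab
% Hypothetical data vector standing in for the real measurements
data = 5 + 2 * randn(1000, 1);

% Histogram counts and bin centers, then scale so the bar areas sum to 1
[counts, centers] = hist(data, 50);
binWidth = centers(2) - centers(1);
dens = counts / (sum(counts) * binWidth);

% Normal pdf with the sample mean and standard deviation, on a fine grid
mu = mean(data);
sigma = std(data);
xg = linspace(min(data), max(data), 200);
pdfg = exp(-0.5 * ((xg - mu) / sigma).^2) / (sigma * sqrt(2 * pi));

% Overlay the curve on the normalized histogram
bar(centers, dens, 1);
hold on
plot(xg, pdfg, 'r', 'LineWidth', 2);
hold off
```

The bars only follow the red curve if the data are roughly normal; normalizing a histogram changes its scale, not its shape.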

Difficulty plotting correct IFFT of 2D function from given data

I am trying to read a 2D data set into a matrix, plot the matrix, and plot the IFFT of the matrix. The data is a 128x2 data set, with frequency in the first column (A) and amplitude in the second column (B).
Unfortunately, plotting the matrix of the data does not produce the correct waveform, and the IFFT seems to be incorrect as well.
waves = csvread('10cm.txt');
A = waves(:,1);
B = abs(waves(:,2));
Matrix = [A B];
waves_transform = abs(ifft2(waves));
figure, plot(waves);
figure, plot(waves_transform)
When I read in each column of the data and plot A vs B, the waveform of the data is correct but the ifft2 of the data is incorrect. I need to properly take the inverse Fourier transform of the two dimensional data that I have read in.
waves = csvread('10cm.txt');
A = waves(:,1);
B = abs(waves(:,2));
Matrix = [A B];
waves_transform = abs(ifft2(Matrix));
figure, plot(A,B);
figure, plot(waves_transform)
[Figures: plots of waves and waves_transform]
Does anyone know why reading in the data and plotting it directly gives a different graph than reading in each column and plotting one against the other? Also, can anyone help me take the IFFT of the 2D data correctly?
10cm.txt DATA FILE HERE: http://pastebin.com/0t0TwVvC
According to MATLAB documentation, if you do plot(Y) and Y is a matrix, then the plot function plots the columns of Y versus their row number. The x-axis scale ranges from 1 to the number of rows in Y.
So, in your case you have to do:
plot(waves(:,1), waves(:,2))
Might I also suggest NumPy, a free and IMO better package for Python.
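The difference between the two plot calls can be sketched with synthetic data in place of 10cm.txt (the values below are hypothetical). The last step is an assumption rather than something the answer states: if the second column is a one-dimensional spectrum, a 1-D ifft of that column alone may be closer to what is wanted than ifft2 of the whole matrix, which also transforms across the frequency column:

```matlab
% Hypothetical stand-in for the 128x2 file: column 1 = frequency, column 2 = amplitude
A = (0:127)';
B = exp(-((A - 20).^2) / 50);
M = [A B];

figure; plot(M);      % each column against row index 1..128 (two curves)
figure; plot(A, B);   % amplitude against frequency (one curve)

% Assumption: B is a 1-D spectrum, so transform that column only
b_time = ifft(B);
figure; plot(abs(b_time));
```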

How to export fitted curve to 1D vector

With the use of Curve Fitting toolbox I'm fitting to 11 data points a curve described by a custom equation. As a result I get something like this:
I want to save the 1D vector represented by the red line on the plot above into a MATLAB variable. I tried the Fit->Save to Workspace... option from the Curve Fitting toolbox menu, but the saved variables do not contain any of the fitted data.
How can I save the fitted data into a MATLAB variable?
The saved MATLAB-object (default name is fittedmodel) contains the fitted function as a function-handle and the fitted coefficients corresponding to this function-handle. You can then evaluate at the data points of your choice with feval.
In the following example the fitted function will be evaluated at the original datapoints x:
y = feval(fittedmodel, x);
Now you can directly plot the result:
plot(x,y);
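To get a smooth 1D vector like the red line rather than values at the 11 original points only, the model can be evaluated on a denser grid. The sketch below assumes the Curve Fitting Toolbox is installed; the data and the 'poly2' model are hypothetical stand-ins for the custom equation and for the fittedmodel variable saved from the app:

```matlab
% Hypothetical 11-point data set standing in for the question's data
x = (1:11)';
y = 0.5 * x.^2 - 2 * x + 3;

% Stand-in for the model saved from the app (requires Curve Fitting Toolbox)
fittedmodel = fit(x, y, 'poly2');

% Evaluate the fitted curve on a dense grid to obtain a 1-D vector
xq = linspace(min(x), max(x), 200)';
yq = feval(fittedmodel, xq);

% Original points and the evaluated curve
plot(x, y, 'bo', xq, yq, 'r-');
```

Here yq is an ordinary 200x1 numeric vector that can be saved or processed like any other variable.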

Sum of MATLAB Gaussian distribution of an image is greater than 1

I am using the below code to calculate the probabilities of pixel intensities for the image given below. However, the total sum of probabilities sum(sum(probOfPixelIntensities)) is greater than 1.
I'm not sure where the mistake may be. Any help in figuring this out would be greatly appreciated. Thanks in advance.
clear all
clc
close all
I = imread('Images/cameraman.jpg');
I = rgb2gray(I);
imshow(I)
muHist = 134;
sigmaHist = 54;
Iprob = normpdf(double(I), muHist, sigmaHist);
sum(sum(Iprob))
What you are doing is computing the PDF value for every pixel in the image. Iprob is not a normal distribution; you are simply evaluating a normal PDF with a known mean and standard deviation at each pixel intensity.
Essentially, you are just performing a data transformation where the image pixel intensities get mapped to values on a normal PDF with a known mean and standard deviation. This is not the same as a PDF, and that's why the sum is not 1. On top of this, the image pixel intensities don't even follow a normal distribution themselves, so there is no reason for the sum to be 1.
Not much more to say other than the output of normpdf is not what you are expecting it to be. You should opt to read the documentation of normpdf more carefully: http://www.mathworks.com/help/stats/normpdf.html
If it is your desire to determine the actual PDF of the image, what you need to do is find the histogram of the image, and not do a data transformation. You can do that with imhist. Once you do that, assuming that encountering the intensities is equiprobable, you would divide each histogram entry by the total size of the image and then sum along all bins. You should get the sum to be 1 in this case.
Just to verify, let's use the image you provided in your post. We'll read this in from StackOverflow. Once we do that, compute the PDF and then sum over all bins:
%// Load in image
im = rgb2gray(imread('http://i.stack.imgur.com/0XiU5.jpg'));
%// Compute PDF
h = imhist(im) / numel(im);
%// Sum over all bins
fprintf('Total sum over all bins is: %f\n', sum(h));
We get:
Total sum over all bins is: 1.000000
Just to be absolutely sure you understand, this is the PDF of the image. What you did before was a data transformation in which every image pixel intensity was mapped to the value of a Gaussian PDF with a known mean and standard deviation. This will not give you a sum of 1 as you expect.
Remember that a PDF is only a probability density function $p(x)$. What is constrained is its integral over the whole domain, $\int_D p(x)\,dx = 1$; probabilities, which lie in $[0, 1]$, come from integrating $p(x)$ over a region, not from individual values of $p(x)$.
According to the MATLAB manual, Y = normpdf(X,mu,sigma) computes the pdf at each of the values in X using the normal distribution with mean mu and standard deviation sigma.
The integral of the pdf is equal to 1.
The sum of the output values is not.
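A small numerical check can make the distinction concrete. This is only a sketch: mu and sigma are the values from the question, the image is a hypothetical random one rather than the cameraman picture, and the normal pdf is written out explicitly instead of calling normpdf:

```matlab
mu = 134;  sigma = 54;   % values from the question
gaussPdf = @(v) exp(-0.5 * ((v - mu) / sigma).^2) / (sigma * sqrt(2 * pi));

% Integral of the density over (effectively) its whole domain
v = linspace(mu - 10 * sigma, mu + 10 * sigma, 20001);
area = trapz(v, gaussPdf(v));

% Sum of density values evaluated at pixel intensities (hypothetical random image)
I = randi([0 255], 256, 256);
total = sum(sum(gaussPdf(I)));

fprintf('Integral of pdf: %f\n', area);
fprintf('Sum over pixels: %f\n', total);
```

The integral comes out very close to 1, while the sum over pixels depends on how many pixels there are and is larger than 1 for an image of this size.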

matlab: cdfplot of relative error

The figure shown above is the cumulative distribution function (cdf) plot of the relative error (the code used to generate the plot is attached below). The relative error is defined as abs(measured-predicted)/(measured). What could be the error, or how should the plot be interpreted, given that it is supposed to be a smooth curve?
X = load('measured.txt');
Xhat = load('predicted.txt');
idx = find(X>0);
x = X(idx);
xhat = Xhat(idx);
relativeError = abs(x-xhat)./(x);
cdfplot(relativeError);
The input data file is a 4x4 matrix with zeros on the diagonal and some unmeasured entries (represented by 0). I appreciate your kind help. Thanks!
The plot should be a discontinuous one because you are using discrete data. You are not plotting an analytic function which has an explicit (or implicit) function that maps, say, x to y. Instead, all you have is at most 16 points that relate x and y.
The CDF only "grows" when new samples are counted; otherwise its value remains steady, just because there isn't any satisfying sample that could increase the "frequency".
You can check the example in MathWorks' cdfplot documentation to understand the concept of "empirical cdf". Again, only when you observe a sample can you increase the cdf.
If you really want to "get" a smooth curve, either 1) add more points so that the discontinuous line looks smoother, or 2) find any statistical model of whatever you are working on, and plot the analytic function instead.
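The staircase shape can be reproduced by hand, which may help show why it is expected. The sketch below uses a handful of hypothetical relative-error values (not the data from measured.txt and predicted.txt) and builds the empirical cdf with sort and stairs instead of cdfplot:

```matlab
% Hypothetical relative errors standing in for the real samples
relativeError = [0.05 0.10 0.12 0.20 0.35 0.50];

% Empirical cdf: after the k-th smallest sample, the cdf equals k/n
xs = sort(relativeError(:));
n = numel(xs);
F = (1:n)' / n;

% Step plot: the curve jumps by 1/n at each observed sample
stairs(xs, F);
xlabel('relative error'); ylabel('empirical cdf');
```

With only a few samples each jump of 1/n is clearly visible; with many samples the same construction looks close to a smooth curve.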