Calculate x value from normal distribution in MATLAB

I have data x with a nonzero mean. I can plot the normal distribution with the following:
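x = randi([1 20],1,60);   % sample data
a = mean(x);              % sample mean
b = std(x);               % sample standard deviation
xs = -130:1:120;          % plotting grid (separate name so the data in x is kept)
y = normpdf(xs,a,b);      % avoid the name "norm", which shadows the built-in function
plot(xs,y)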
To calculate the x value for 97%, I have to look up the z-score for 97% and then use z = (x - mean)/standard deviation to solve for x. How can I calculate the x value directly in code?
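One way to get the value directly is to invert the normal CDF. The sketch below assumes that "the x value for 97%" means the 97th percentile of the fitted normal distribution; the names p, z and x_p are illustrative. It uses only base MATLAB (erfinv). With the Statistics and Machine Learning Toolbox, norminv(p,a,b) should give the same number.

```matlab
x = randi([1 20], 1, 60);        % sample data, as in the question
a = mean(x);                     % sample mean
b = std(x);                      % sample standard deviation
p = 0.97;                        % target cumulative probability
z = sqrt(2) * erfinv(2*p - 1);   % z-score whose standard normal CDF equals p
x_p = a + b*z;                   % undo z = (x - mean)/std to get the x value
```

For p = 0.97 the z-score is about 1.88, so x_p lies roughly 1.88 standard deviations above the mean.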

Related

Trouble with the unmixing matrix from fastica toolbox in Matlab

I'm using the FastIca toolbox (https://research.ics.aalto.fi/ica/fastica/) but am confused about the orientation of the resulting W (separating/unmixing) matrix.
Let X be an n x B matrix, where n is the number of signals in a data set and B is the number of time points sampled.
I've been calculating the W matrix using:
[A,W] = fastica(X)
However, because W is an n x n matrix, I can't tell how it is oriented and whether to use W or its transpose (W.') in subsequent calculations, and I can't find a clear answer in the documentation.
To help me get my bearings, is fastica maximising the independence in the rows of Y where Y = W.'X or Y = WX?
It should be Y=W*X. To be sure, you can reduce the number of components to estimate, and then W should no longer be square:
[A,W] = fastica(X,'numOfIC',2)
If W has size 2 x n, then Y=W*X. Otherwise W has size n x 2 and Y=W'*X.
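The size check can be written out as a short sketch. Here W and X are random stand-ins so that the snippet runs without the toolbox; in practice W would be the matrix returned by fastica(X,'numOfIC',2). The sizes n and B are hypothetical.

```matlab
n = 4; B = 500;               % hypothetical sizes: n signals, B time points
X = randn(n, B);              % stand-in data, one signal per row
W = randn(2, n);              % stand-in for the unmixing matrix (2 components)

if size(W, 2) == size(X, 1)   % W is 2 x n: apply it directly
    Y = W * X;
else                          % W is n x 2: transpose first
    Y = W.' * X;
end
size(Y)                       % one row per estimated component
```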

Matlab: generate random numbers from custom made probability density function

I have a dataset with 3-hourly precipitation amounts for the month of January in the period 1977-1983 (see attachment). However, I want to generate precipitation data for the period 1984-1990 based upon these data. Therefore, I was wondering if it would be possible to create a custom made probability density function of the precipitation amounts (1977-1983) and from this, generate random numbers (precipitation data) for the desired period (1984-1990).
Is this possible in Matlab and could someone help me by doing so?
Thanks in advance!
A histogram will give you an estimate of the PDF -- just divide the bin counts by the total number of samples. From there you can estimate the CDF by integrating. Finally, you can choose a uniformly distributed random number between 0 and 1 and estimate the argument of the CDF that would yield that number. That is, if y is the random number you choose, then you want to find x such that CDF(x) = y. The value of x will be a random number with the desired PDF.
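A minimal sketch of these steps, using only base MATLAB functions. The variable precip is placeholder data standing in for the 1977-1983 measurements, and the number of bins nb is an arbitrary choice; both are assumptions for illustration.

```matlab
precip = -2*log(rand(500,1));            % placeholder data; replace with the real measurements
nb = 20;                                 % number of histogram bins (assumption)
lo = min(precip); hi = max(precip);
edges = linspace(lo, hi, nb+1)';         % bin edges

% Histogram counts: map each sample to its bin index, then count per bin
idx = min(floor((precip - lo) / (hi - lo) * nb) + 1, nb);
counts = accumarray(idx, 1, [nb 1]);

pdf_est = counts / sum(counts);          % estimated probability per bin
cdf_hi = cumsum(pdf_est);                % CDF at the upper edge of each bin
cdf_hi(end) = 1;                         % guard against round-off
cdf_lo = [0; cdf_hi(1:end-1)];           % CDF at the lower edge of each bin

N = 1000;                                % number of new samples
y = rand(N,1);                           % uniform random numbers in (0,1)
bin = sum(y > cdf_hi', 2) + 1;           % bin whose CDF range contains y (implicit expansion, R2016b+)

% Solve CDF(x) = y by linear interpolation inside the selected bin
frac = (y - cdf_lo(bin)) ./ (cdf_hi(bin) - cdf_lo(bin));
x_new = edges(bin) + frac .* (edges(bin+1) - edges(bin));
```

Empty bins are never selected because the CDF does not increase across them, so the generated values stay within the range of the original data.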
If you have the Statistics and Machine Learning Toolbox, you can estimate the PDF of the data with a kernel distribution:
Percip_pd = fitdist(Percip,'Kernel');
Then use it for generating N random numbers from the same distribution:
y = random(Percip_pd,N,1);
Quoting @AnonSubmitter85:
"estimate the CDF by integrating. Finally, you can choose a uniformly
distributed random number between 0 and 1 and estimate the argument of
the CDF that would yield that number. That is, if y is the random
number you choose, then you want to find x such that CDF(x) = y. The
value of x will be a random number with the desired PDF."
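% random sampling from a discretized PDF (inverse transform)
N = 10;                       % number of resamples
x = linspace(-4, 4, 100);     % support on which the PDF is tabulated
p = exp(-x.^2/2);             % your PDF evaluated on x (here: standard normal shape)
p = p / sum(p);               % normalize so the probabilities sum to 1
s = cumsum(p);                % its cumulative distribution
s(end) = 1;                   % guard against round-off
r = rand(N,1);                % random numbers between 0 and 1
indices = zeros(N,1);
for ii = 1:N
    indices(ii) = find(s >= r(ii), 1, 'first'); % first CDF value that reaches the random number
end
resamples = x(indices)        % the resamples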

Check whether predicted values follow the Gaussian distribution or not using MATLAB?

I have used a Gaussian process for my predictions. Assume the predicted values are stored in x, of size 1900 x 1. I want to check whether their distribution follows a Gaussian distribution. I need this in order to compare the distributions of the values predicted by other methods, such as NN and KNN, and to judge which one most closely follows a normal distribution.
How can I do this? A result in numerical form would be best. The code is as follows:
m = mean(ypred); % mean of r
s = std(ypred); % stdev of r
pd = makedist('Normal','mu',m,'sigma',s); % make probability distribution with mu = m and sigma = s
[h,p] = kstest(ypred,'CDF',pd); % calculate probability that it is a normal distribution
The ypred values are the output obtained from fitrgp in MATLAB. A sample of the ypred values is attached.
The figure is a residual Q-Q plot of the measured and predicted values.
You can make a One-sample Kolmogorov-Smirnov test:
x = 1 + 2.*randn(1000,1); % just some random normal distributed data, replace it with your actual 1900x1 vector.
m = mean(x); % mean of x
s = std(x); % standard deviation of x
pd = makedist('Normal','mu',m,'sigma',s); % make probability distribution with mu = m and sigma = s
[h,p] = kstest(x,'CDF',pd); % test x against the fitted normal distribution
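Here p is the p-value of the test, i.e. the probability of seeing a deviation from the normal CDF at least this large if the data really were normally distributed (it is not the probability that the data are normal), and h = 1 if the null hypothesis is rejected at the 0.05 significance level. Since the null hypothesis is "the data follow a normal distribution", h = 0 means that the test found no evidence against normality. Note that because mu and sigma are estimated from the same data, the test is conservative; lillietest from the same toolbox accounts for this.
Since x in this example was sampled from a normal distribution, most likely h = 0 and p > 0.05. If you run the above code with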
x = 1 + 2.*rand(1000,1); % sampled from uniform distribution
h will most likely be 1 and p<0.05. Of course you can write the whole thing as a one-liner to avoid creating m,s and pd.
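For a number that can be compared across methods (GP, NN, KNN), the KS distance itself can also be computed with base MATLAB. This is a sketch under the assumption that each prediction vector is compared with a normal distribution fitted to its own mean and standard deviation; D and the placeholder data are illustrative and not part of the answer above. A smaller D means the empirical CDF is closer to the fitted normal CDF.

```matlab
x = 1 + 2*randn(1000,1);             % placeholder data; replace with your 1900x1 vector
n = numel(x);
z = sort((x - mean(x)) / std(x));    % standardize with the sample mean/std, then sort
F = 0.5 * erfc(-z / sqrt(2));        % standard normal CDF at each sorted value
% KS distance: largest gap between the empirical CDF steps and the normal CDF
D = max([(1:n)'/n - F; F - (0:n-1)'/n]);
```

With the toolbox, the same statistic should be available as the third output of kstest.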

Generating a random number based off normal distribution in matlab

I am trying to generate a random number based off of normal distribution traits that I have (mean and standard deviation). I do NOT have the Statistics and Machine Learning toolbox.
I know one way to do it would be to generate a uniform random number r between 0 and 1 and find the value whose cumulative probability equals r. I can do this by entering the standard normal density function
f = @(y) (1/(1*2.50663))*exp(-((y).^2)/(2*1^2))
and solving for
r=integral(f,-Inf,z)
and then extrapolating from that z-value to the final answer X with the equation
z=(X-mu)/sigma
But as far as I know, there is no matlab command that allows you to solve for x where x is the limit of an integral. Is there a way to do this, or is there a better way to randomly generate this number?
You can use the built-in randn function which yields random numbers pulled from a standard normal distribution with a zero mean and a standard deviation of 1. To alter this distribution, you can multiply the output of randn by your desired standard deviation and then add your desired mean.
% Define the distribution that you'd like to get
mu = 2.5;
sigma = 2.0;
% You can generate a matrix of values of any size
sz = [10000 1];
value = (randn(sz) * sigma) + mu;
% mean(value)
% 2.4696
%
% std(value)
% 1.9939
If you just want a single number from the distribution, you can use the no-input version of randn to yield a scalar
value = (randn * sigma) + mu;
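The inversion described in the question (finding the upper limit z for which the integral equals r) can also be done numerically with fzero, although randn is simpler and faster. The sketch below uses only base MATLAB; the starting guess 0 and the variable names are illustrative choices. The closed-form expression with erfinv is included as a cross-check.

```matlab
mu = 2.5; sigma = 2.0;                         % example mean and standard deviation
f = @(y) exp(-y.^2/2) / sqrt(2*pi);            % standard normal PDF
r = rand;                                      % uniform random number in (0,1)
z = fzero(@(z) integral(f, -Inf, z) - r, 0);   % solve integral(f,-Inf,z) = r for z
X = mu + sigma*z;                              % undo z = (X - mu)/sigma

z_check = sqrt(2) * erfinv(2*r - 1);           % closed-form inverse of the normal CDF
```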
Just for the fun of it, you can generate a Gaussian random variable using a uniform random generator:
The negative of the logarithm of a uniform random variable on (0,1) has an exponential distribution
The square root of that has a Rayleigh distribution
Multiply by the cosine (or sine) of a uniform random variable on (0,2*pi) and the result is Gaussian. You need to multiply by sqrt(2) to normalize.
The obtained Gaussian variable is normalized (zero mean, unit standard deviation). If you need specific mean and standard deviation, multiply by the latter and then add the former.
Example (normalized Gaussian):
m = 1; n = 1e5; % desired output size
x = sqrt(-2*log(rand(m,n))).*cos(2*pi*rand(m,n));
Check:
>> mean(x)
ans =
-0.001194631660594
>> std(x)
ans =
0.999770464360453
>> histogram(x,41)

Lognormal random numbers in specific range in Matlab

I want to construct a lognormal distribution on the range [0.42,1.19], some of whose elements are given as D=[1.19,1.00,0.84,0.71,0.59,0.50,0.42]. The mean should be 0.84 and the standard deviation as small as possible. It is also given that 90% of the CDF (= 90% of the grains) lies between 0.59 and 1.19.
Once I know all the elements of this lognormal distribution which incorporate the given conditions I can find its pdf, which is what I require. Here are simple steps I tried:
D=[1.19,1.00,0.84,0.71,0.59,0.50,0.42];
s=0.30; % std dev of the lognormal distribution
m=0.84; % mean of the lognormal distribution
mu=log(m^2/sqrt(s^2+m^2)); % mean of the associated normal dist.
sigma=sqrt(log((s^2/m^2)+1)); % std dev of the associated normal dist.
[r,c]=size(D);
for i=1:c
D_normal(i)=mu+(sigma.*randn(1));
w(i)=(D_normal(i)-mu)/sigma; % the probability or the wt. percentage
end
sizes=exp(D_normal);
If you have the statistics toolbox and you want to draw random values from the lognormal distribution, you can simply call LOGNRND. If you want to know the density of the lognormal distribution with a given mean and sigma at a specific value, you use LOGNPDF.
Since you're calculating weights, you may be looking for the density. These would be, in your example:
weights = lognpdf([1.19,1.00,0.84,0.71,0.59,0.50,0.42],0.84,0.3)
weights =
0.095039 0.026385 0.005212 0.00079218 6.9197e-05 5.6697e-06 2.9244e-07
EDIT
If you want to know what percentage of grains falls into the range of 0.59 to 1.19, you use LOGNCDF:
100*diff(logncdf([0.59,1.19],0.84,0.3))
ans =
1.3202
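That's not a lot. If you plot the distribution, you'll notice that the lognormal distribution with your values peaks a bit above 2. That is because lognpdf and logncdf take the mean and standard deviation of the associated normal distribution (mu and sigma in your code) as parameters, not the mean and standard deviation of the lognormal values themselves, so 0.84 and 0.3 are interpreted on the log scale here.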
x = 0:0.01:10;
figure
plot(x,lognpdf(x,0.84,0.3))
It seems that you are looking to generate truncated lognormal random numbers. If my assumption is correct, you can use either rejection sampling or inverse transform sampling to generate the necessary samples. Caveat: rejection sampling is very inefficient if the interval between your bounds carries little probability, e.g. if the bounds are very far from the mean.
Rejection Sampling
If x ~ LogNormal(mu,sigma) I(lb < x < ub )
Then generate, x ~ LogNormal(mu,sigma) and accept the draw if lb < x < ub.
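A minimal sketch of rejection sampling without the toolbox. It assumes the bounds and the mean/standard deviation from the question, converted to mu and sigma with the question's own formulas; the names mean_ln, std_ln, n and take are illustrative.

```matlab
mean_ln = 0.84; std_ln = 0.30;                  % mean and std of the lognormal (from the question)
mu = log(mean_ln^2 / sqrt(std_ln^2 + mean_ln^2));
sigma = sqrt(log(std_ln^2 / mean_ln^2 + 1));
lb = 0.42; ub = 1.19;                           % truncation bounds
n = 1000;                                       % number of samples wanted

samples = zeros(n,1);
k = 0;                                          % number of accepted samples so far
while k < n
    cand = exp(mu + sigma*randn(n,1));          % untruncated lognormal draws
    cand = cand(cand > lb & cand < ub);         % keep only draws inside the bounds
    take = min(numel(cand), n - k);
    samples(k+1:k+take) = cand(1:take);
    k = k + take;
end
```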
Inverse Transform Sampling
If x ~ LogNormal(mu,sigma) I(lb < x < ub ) then, for lb < x < ub,
CDF(x) = ( phi((log(x) - mu)/sigma) - phi((log(lb) - mu)/sigma) ) / ( phi((log(ub) - mu)/sigma) - phi((log(lb) - mu)/sigma) )
where phi is the standard normal CDF.
Generate u ~ Uniform(0,1).
Set CDF(x) = u and invert for x.
In other words,
x = exp( mu + sigma * phi_inverse( phi((log(lb) - mu)/sigma) + u * ( phi((log(ub) - mu)/sigma) - phi((log(lb) - mu)/sigma) ) ) )
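A sketch of inverse transform sampling in base MATLAB, where the uniform draw is mapped into the interval between the standard normal CDF values at the two bounds before inverting. The helper names Phi and PhiInv are illustrative; they are built from erfc and erfcinv so that no toolbox is needed. The bounds and the mean/standard deviation are taken from the question.

```matlab
mean_ln = 0.84; std_ln = 0.30;                  % mean and std of the lognormal (from the question)
mu = log(mean_ln^2 / sqrt(std_ln^2 + mean_ln^2));
sigma = sqrt(log(std_ln^2 / mean_ln^2 + 1));
lb = 0.42; ub = 1.19;                           % truncation bounds

Phi    = @(z) 0.5 * erfc(-z / sqrt(2));         % standard normal CDF
PhiInv = @(p) -sqrt(2) * erfcinv(2*p);          % its inverse

a = Phi((log(lb) - mu) / sigma);                % CDF value at the lower bound
b = Phi((log(ub) - mu) / sigma);                % CDF value at the upper bound
u = rand(1000,1);                               % u ~ Uniform(0,1)
x = exp(mu + sigma * PhiInv(a + u*(b - a)));    % truncated lognormal samples
```

Unlike rejection sampling, every uniform draw yields one sample, so the cost does not depend on how much probability lies between the bounds.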