How to simulate the Hypergeometric distribution in Matlab - matlab

i want to simulate in Matlab program the Hypergeometric distribution with probability mass function and parameters as described here : https://en.wikipedia.org/wiki/Hypergeometric_distribution
how can i code that while producing random numbers from uniform distribution.

Most sensible would be to use the builtin hypergeometric generator.
If you have to do this for an assignment or some other arbitrary reason, the generic solution when an inverse CDF exists is to do inversion—use the uniform generator to create a p-value (a value between 0 and 1), and plug that into the inverse CDF. Since Matlab provides an inverse CDF function, this should be straightforward.

This is easily done with the randperm function, which generates a sample without replacement.
Let the distribution parameters be defined as follows:
N = 10; % population size
K = 3; % number of success states in the population
n = 5; % number of draws
Then the variable k obtained as
k = sum(randperm(N,n)<=K);
has a hypergeometric distribution with parameters N,K,n.
If you really need to use a uniform random number generator (rand function):
[~, x] = sort(rand(1,N));
x = x(1:n); % this gives the same result as randperm(N,n)
k = sum(x<=K);

Related

Draw random numbers from a custom probability density function in Matlab

I want to sample R random numbers from a custom probability density function in Matlab.
This is the expression of the probability density function evaluated at x.
I thought about using slicesample
R=10^6;
f = #(x) 1/(2*pi^(1/2))*(1/(x^(3/2)))*exp(-1/(4*x));
epsilon= slicesample(0.3,R,'pdf',f,'thin',1,'burnin',1000);
However, it does not work because I get the error
Error using slicesample (line 175)
The step-out procedure failed.
I tried to change starting value and values of thin and burning parameters but it does not seem to work. Could you advise, either on how to make slicesample work or on alternative solutions to sample random numbers from a custom probability density function in Matlab?
Let X be a random variable distributed according to your target pdf. Applying the change of variable y = 1/x and using the well-known theorem for a function of a random variable, the distribution of Y = 1/X is recognized to be a Gamma distribution with parameters α = 1/2, β = 1/4.
Therefore, it suffices to generate a Gamma random variable (using gamrnd) with those parameters and take the inverse. Note that Matlab's definition of the Gamma distribution uses parameters A = α, B = 1/β.
R = 1e5; % desired sample size
x = 1./gamrnd(1/2, 4, [1 R]); % result
Check:
histogram(x, 'Normalization', 'pdf', 'BinEdges', 0:.1:10)
hold on
f = #(x) 1/2/sqrt(pi)./x.^(3/2).*exp(-1/4./x); % target pdf
fplot(f, 'linewidth', .75)

How to use the randn function in Matlab to create an array of values (range 0-10) of size 1,000 that follows a Gaussian distribution? [duplicate]

Matlab has the function randn to draw from a normal distribution e.g.
x = 0.5 + 0.1*randn()
draws a pseudorandom number from a normal distribution of mean 0.5 and standard deviation 0.1.
Given this, is the following Matlab code equivalent to sampling from a normal distribution truncated at 0 at 1?
while x <=0 || x > 1
x = 0.5 + 0.1*randn();
end
Using MATLAB's Probability Distribution Objects makes sampling from truncated distributions very easy.
You can use the makedist() and truncate() functions to define the object and then modify (truncate it) to prepare the object for the random() function which allows generating random variates from it.
% MATLAB R2017a
pd = makedist('Normal',0.5,0.1) % Normal(mu,sigma)
pdt = truncate(pd,0,1) % truncated to interval (0,1)
sample = random(pdt,numRows,numCols) % Sample from distribution `pdt`
Once the object is created (here it is pdt, the truncated version of pd), you can use it in a variety of function calls.
To generate samples, random(pdt,m,n) produces a m x n array of samples from pdt.
Further, if you want to avoid use of toolboxes, this answer from #Luis Mendo is correct (proof below).
figure, hold on
h = histogram(cr,'Normalization','pdf','DisplayName','#Luis Mendo samples');
X = 0:.01:1;
p = plot(X,pdf(pdt,X),'b-','DisplayName','Theoretical (w/ truncation)');
You need the following steps
1. Draw a random value from uniform distribution, u.
2. Assuming the normal distribution is truncated at a and b. get
u_bar = F(a)*u +F(b) *(1-u)
3. Use the inverse of F
epsilon= F^{-1}(u_bar)
epsilon is a random value for the truncated normal distribution.
Why don't you vectorize? It will probably be faster:
N = 1e5; % desired number of samples
m = .5; % desired mean of underlying Gaussian
s = .1; % desired std of underlying Gaussian
lower = 0; % lower value for truncation
upper = 1; % upper value for truncation
remaining = 1:N;
while remaining
result(remaining) = m + s*randn(1,numel(remaining)); % (pre)allocates the first time
remaining = find(result<=lower | result>upper);
end

Generating a random number based off normal distribution in matlab

I am trying to generate a random number based off of normal distribution traits that I have (mean and standard deviation). I do NOT have the Statistics and Machine Learning toolbox.
I know one way to do it would be to randomly generate a random number r from 0 to 1 and find the value that gives a probability of that random number. I can do this by entering the standard normal function
f= #(y) (1/(1*2.50663))*exp(-((y).^2)/(2*1^2))
and solving for
r=integral(f,-Inf,z)
and then extrapolating from that z-value to the final answer X with the equation
z=(X-mew)/sigma
But as far as I know, there is no matlab command that allows you to solve for x where x is the limit of an integral. Is there a way to do this, or is there a better way to randomly generate this number?
You can use the built-in randn function which yields random numbers pulled from a standard normal distribution with a zero mean and a standard deviation of 1. To alter this distribution, you can multiply the output of randn by your desired standard deviation and then add your desired mean.
% Define the distribution that you'd like to get
mu = 2.5;
sigma = 2.0;
% You can any size matrix of values
sz = [10000 1];
value = (randn(sz) * sigma) + mu;
% mean(value)
% 2.4696
%
% std(value)
% 1.9939
If you just want a single number from the distribution, you can use the no-input version of randn to yield a scalar
value = (randn * sigma) + mu;
Just for the fun of it, you can generate a Gaussian random variable using a uniform random generator:
The logarithm of a uniform random variable on (0,1) has an exponential distribution
The square root of that has a Rayleigh distribution
Multiply by the cosine (or sine) of a uniform random variable on (0,2*pi) and the result is Gaussian. You need to multiply by sqrt(2) to normalize.
The obtained Gaussian variable is normalized (zero mean, unit standard deviation). If you need specific mean and standard deviation, multiply by the latter and then add the former.
Example (normalized Gaussian):
m = 1; n = 1e5; % desired output size
x = sqrt(-2*log(rand(m,n))).*cos(2*pi*rand(m,n));
Check:
>> mean(x)
ans =
-0.001194631660594
>> std(x)
ans =
0.999770464360453
>> histogram(x,41)

Random samples from Lognormal distribution

I have a parameter X that is lognormally distributed with mean 15 and standard deviation 0.48. For monte carlo simulation in MATLAB, I want to generate 40,000 samples from this distribution. How could be done in MATLAB?
To generate an MxN matrix of lognornally distributed random numbers with parameter mu and sigma, use lognrnd (Statistics Toolbox):
result = lognrnd(mu,sigma,M,N);
If you don't have the Statistics Toolbox, you can equivalently use randn and then take the exponential. This exploits the fact that, by definition, the logarithm of a lognormal random variable is a normal random variable:
result = exp(mu+sigma*randn(M,N));
The parameters mu and sigma of the lognormal distribution are the mean and standard deviation of the associated normal distribution. To see how the mean and standard deviarion of the lognormal distribution are related to parameters mu, sigma, see lognrnd documentation.
To generate random samples, you need the inverted cdf. If you have done this, generating samples is nothing more than 'my_icdf(rand(n, m))'
First get the cdf (integrating the pdf) and then invert the function to get the inverted cdf.
You can convert between the mean and variance of the Lognormal distribution and its parameters (mu,sigma) which correspond to the associated Normal (Gaussian) distribution using the formulas.
The approach below uses the Probability Distribution Objects introduced in MATLAB 2013a. More specifically, it uses the makedist, random, and pdf functions.
% Notation
% if X~Lognormal(mu,sigma) the E[X] = m & Var(X) = v
m = 15; % Target mean for Lognormal distribution
v = 0.48; % Target variance Lognormal distribution
getLmuh=#(m,v) log(m/sqrt(1+(v/(m^2))));
getLvarh=#(m,v) log(1 + (v/(m^2)));
mu = getLmuh(m,v);
sigma = sqrt(getLvarh(m,v));
% Generate Random Samples
pd = makedist('Lognormal',mu,sigma);
X = random(pd,1000,1); % Generates a 1000 x 1 vector of samples
You can verify the correctness via the mean and var functions and the distribution object:
>> mean(pd)
ans =
15
>> var(pd)
ans =
0.4800
Generating samples via the inverse transform is also made easy using the icdf (inverse CDF) function.
% Alternate way to generate X~Lognormal(mu,sigma)
U = rand(1000,1); % U ~ Uniform(0,1)
X = icdf(pd,U); % Inverse Transform
The graphic generated by following code (MATLAB 2018a).
Xrng = [0:.01:20]';
figure, hold on, box on
h(1) = histogram(X,'DisplayName','Random Sample (N = 1000)');
h(2) = plot(Xrng,pdf(pd,Xrng),'b-','DisplayName','Theoretical PDF');
legend('show','Location','northwest')
title('Lognormal')
xlabel('X')
ylabel('Probability Density Function')
% Options
h(1).Normalization = 'pdf';
h(1).FaceColor = 'k';
h(1).FaceAlpha = 0.35;
h(2).LineWidth = 2;

How do I compute a PMF and CDF for a binomial distribution in MATLAB?

I need to calculate the probability mass function, and cumulative distribution function, of the binomial distribution. I would like to use MATLAB to do this (raw MATLAB, no toolboxes). I can calculate these myself, but was hoping to use a predefined function and can't find any. Is there something out there?
function x = homebrew_binomial_pmf(N,p)
x = [1];
for i = 1:N
x = [0 x]*p + [x 0]*(1-p);
end
You can use the function NCHOOSEK to compute the binomial coefficient. With that, you can create a function that computes the value of the probability mass function for a set of k values for a given N and p:
function pmf = binom_dist(N,p,k)
nValues = numel(k);
pmf = zeros(1,nValues);
for i = 1:nValues
pmf(i) = nchoosek(N,k(i))*p^k(i)*(1-p)^(N-k(i));
end
end
To plot the probability mass function, you would do the following:
k = 0:40;
pmf = binom_dist(40,0.5,k);
plot(k,pmf,'r.');
and the cumulative distribution function can be found from the probability mass function using CUMSUM:
cummDist = cumsum(pmf);
plot(k,cummDist,'r.');
NOTE: When the binomial coefficient returned from NCHOOSEK is large you can end up losing precision. A very nice alternative is to use the submission Variable Precision Integer Arithmetic from John D'Errico on the MathWorks File Exchange. By converting your numbers to his vpi type, you can avoid the precision loss.
octave provides a good collection of distribution pdf, cdf, quantile; they have to be translated from octave, but this is relatively trivial (convert endif to end, convert != to ~=, etc;) see e.g. octave binocdf for the binomial cdf function.
looks like for the CDF of the binomial distribution, my best bet is the incomplete beta function betainc.
For PDF
x=1:15
p=.45
c=binopdf(x,15,p)
plot(x,c)
Similarly CDF
D=binocdf(x,15,p)
plot(x,D)