Can someone explain, or point me to a page that explains, how to create normally distributed random numbers in MATLAB using just the error function, the inverse of the error function, and rand() (the uniform random number generator between 0 and 1)? The random numbers don't have to be bounded to a certain interval. I'm having trouble understanding the concept of the error function and its inverse, and how they relate to creating random numbers that are normally distributed.
You need to apply the method called inverse transform sampling, which works as follows. Assume you want to generate a random variable with a given distribution function F. If you can compute the inverse function F^{-1}, then you can obtain the desired random variable by applying F^{-1} to random samples with uniform distribution on the interval [0,1].
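A short justification of why this works, as a sketch that assumes F is continuous and strictly increasing so that F^{-1} exists:

```latex
U \sim \mathrm{Uniform}(0,1), \quad X = F^{-1}(U)
\;\Longrightarrow\;
P(X \le x) = P\bigl(F^{-1}(U) \le x\bigr) = P\bigl(U \le F(x)\bigr) = F(x).
```

So X has exactly the distribution function F.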
The error function (erf in MATLAB) almost gives the distribution function of a normal random variable: the standard normal CDF is Phi(x) = (1 + erf(x/sqrt(2)))/2. Its inverse function is called erfinv in MATLAB. Uniformly distributed random numbers are generated with rand.
With these ingredients you should be able to do the task. Please give it a try, and then compare with the code below:
N = 1e6; % number of samples
x = sqrt(2)*erfinv(2*rand(1,N)-1); % factors 2 and sqrt(2) come from the definition of erf
hist(x,31) % plot histogram to check it is (approximately) normal
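As a rough sanity check, the sketch below inverts the standard normal CDF with erfinv, then shifts and scales the result. The values mu = 500 and sigma = 5 are illustrative only; the sample mean and standard deviation should land close to them.

```matlab
mu    = 500;                      % target mean (illustrative value)
sigma = 5;                        % target standard deviation (illustrative value)
N     = 1e6;                      % number of samples

u = rand(N,1);                    % Uniform(0,1) samples
z = sqrt(2)*erfinv(2*u - 1);      % standard normal via the inverse CDF
y = mu + sigma*z;                 % shift and scale to N(mu, sigma^2)

stats = [mean(y) std(y)];         % should be close to [mu sigma]
```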
This link from Mathworks seems to give the answer.
Here's the example from the link:
% First, initialize the random number generator to make the results in this
% example repeatable.
rng(0,'twister');
% Create a vector of 1000 random values drawn from a normal distribution
% with a mean of 500 and a standard deviation of 5.
a = 5;
b = 500;
y = a.*randn(1000,1) + b;
% Calculate the sample mean, standard deviation, and variance.
stats = [mean(y) std(y) var(y)]
% stats =
%
% 499.8368 4.9948 24.9483
%
% The mean and variance are not 500 and 25 exactly because they are
% calculated from a sampling of the distribution.
I am using MATLAB R2020a on macOS. I am trying to find the exponentially weighted moving mean of the cycle period of an ECG signal, and have used the dsp.MovingAverage System object from the DSP System Toolbox, calling the commands shown. However, I am not sure how to specify how many of the elements of the vector to include in the weighted mean. At the moment, is it just adding a weight to all of the elements and then finding the moving mean?
movavgExp = dsp.MovingAverage('Method', 'Exponential weighting', 'ForgettingFactor', 0.1);
Whenever I set the WindowLength property as specified in the DSP documentation, it produces a warning:
movavgExp = dsp.MovingAverage(10, 'Method', 'Exponential weighting', 'ForgettingFactor', 0.1);
Warning: The WindowLength property is not relevant in this configuration of the System object.
I would really appreciate any suggestions for this, thanks in advance!
From the Mathworks page for dsp.MovingAverage:
"Exponential weighting — The block multiplies the samples by a set of weighting factors. The magnitude of the weighting factors decreases exponentially as the age of the data increases, but the magnitude never reaches zero. To compute the average, the algorithm sums the weighted data."
So there is no real averaging window as you use all your signal up to time t (exponentially weighted) for the mean value at that instant.
Of course older samples are weighted less than newer ones, and the parameter that controls this is the ForgettingFactor. I guess you could then define an "effective" averaging window as the number of samples whose weight is larger than a threshold.
Unfortunately it doesn't seem like dsp.MovingAverage can return the weights itself, but you can calculate them yourself. From the MathWorks page,
$w_N = \lambda w_{N-1} + 1$
where $w_N$ is the weight for the Nth sample and $\lambda$ is your forgetting factor. Remember to initialize the weight for the first sample to 1, so that you could have something like:
w = zeros(length(x),1); % where x is your signal
w(1) = 1; % initialize the weight for the first sample
for i = 2:length(x)
    w(i) = lambda*w(i-1) + 1; % calculate the successive weights
end
To then get the averaging window for the N-th sample, I would probably normalize the weights from 1 to N with respect to their sum:
thr = 1.e-3; % your threshold, you'll probably have to play with this a bit
lengthAveragWdw = zeros(length(x),1);
for i = 1:length(x)
    wi = w(1:i); % weights used to calculate the moving average up to the i-th sample
    wi = wi./sum(wi); % normalize the weights
    lengthAveragWdw(i) = sum(wi >= thr); % count the number of samples whose weight is greater than the threshold
end
where thr is a threshold value that you have to decide beforehand.
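Another way to look at the effective window is sketched below. It assumes the exponential method follows the recursion xbar_N = (1 - 1/w_N)*xbar_(N-1) + (1/w_N)*x_N with w_N = lambda*w_(N-1) + 1 (this is my reading of the documentation, so treat it as an assumption). Unrolling that recursion gives sample k the weight lambda^(N-k)/w_N in the mean at time N. The values of lambda, N and thr, and the random test signal, are illustrative only.

```matlab
lambda = 0.9;                 % forgetting factor (illustrative value)
thr    = 1e-3;                % weight threshold (illustrative value)
N      = 200;                 % number of samples seen so far
x      = randn(N,1);          % hypothetical stand-in for the signal

% Recursive form of the exponentially weighted mean (assumed algorithm)
w    = 1;
xbar = x(1);
for n = 2:N
    w    = lambda*w + 1;
    xbar = (1 - 1/w)*xbar + x(n)/w;
end

% Equivalent explicit weights of each sample in the mean at time N
k             = (1:N)';
sampleWeights = lambda.^(N - k) / w;      % newest sample has the largest weight
xbarExplicit  = sum(sampleWeights .* x);  % matches xbar up to round-off

% "Effective" window: number of samples whose weight exceeds the threshold
effLen = sum(sampleWeights >= thr);
```

With this view the weights decay geometrically with the age of the sample, so the effective window depends only on lambda and thr once N is large.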
This line of code is supposed to generate exponential service times, but I am not able to get the logic behind it.
% Exponential service time with rate 1
mean = 1;
dt = -mean * log(1 - rand());
This is the source link, but MATLAB is needed to open the example.
I was also wondering whether exprnd(1) would give the same result, i.e. generate numbers from the exponential distribution with a mean of 1?
You are right!
First, note that MATLAB parameterizes the Exponential distribution by the mean, not the rate, so exprnd(5) would have a rate lambda = 1/5.
This line of code is another way to do the same thing:
-mean * log(1 - rand());
This is the inverse transform for the Exponential distribution.
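If X follows an Exponential distribution with rate $\lambda$ (mean $1/\lambda$), then

$$F(x) = P(X \le x) = 1 - e^{-\lambda x}, \quad x \ge 0,$$

and rewriting the cumulative distribution function (CDF) and letting U ~ Uniform(0,1), we can derive the inverse transform:

$$U = 1 - e^{-\lambda X} \;\Rightarrow\; X = -\frac{1}{\lambda}\ln(1-U) \overset{d}{=} -\frac{1}{\lambda}\ln(U).$$

Note the last equality (in distribution) is because 1-U and U are equal in distribution. In other words, 1-U ~ Uniform(0,1) and U ~ Uniform(0,1).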
You can test this yourself with this example code with multiple approaches.
% MATLAB R2018b
rate = 1; % mean = 1/rate = 1
NumSamples = 1000;
% Approach 1
X1 = (-1/rate)*log(1-rand(NumSamples,1)); % inverse transform
% Approach 2
X2 = exprnd(1/rate,NumSamples,1);
% Approach 3
pd = makedist('Exponential','mu',1/rate); % create probability distribution object (mu is the mean)
X3 = random(pd,NumSamples,1);
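A quick way to convince yourself that the inverse transform is doing the right thing is to compare the empirical CDF of the samples against 1 - exp(-rate*x). This is only a sketch; the rate, the sample size and the query points xq are illustrative choices, and it needs no toolbox.

```matlab
rate = 2;                               % illustrative rate, so mean = 1/rate = 0.5
n    = 1e5;                             % number of samples
X    = -(1/rate)*log(1 - rand(n,1));    % inverse transform sampling

xq      = [0.1 0.5 1 2];                % points at which to compare the CDFs
empCDF  = mean(X <= xq, 1);             % fraction of samples <= each xq (R2016b+)
theoCDF = 1 - exp(-rate*xq);            % exact Exponential CDF
maxErr  = max(abs(empCDF - theoCDF));   % should be small for large n
```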
EDIT: The OP asked if there was a reason to generate from the CDF rather than from the probability density function (PDF). This is my attempt to answer that.
The inverse transform method uses the CDF to take advantage of the fact that the CDF is itself a probability and so must be on the interval [0, 1]. Then it is very easy to generate very good (pseudo) random numbers which will be on that interval. The CDF is sufficient to uniquely define the distribution, and inverting the CDF means that its unique "shape" will properly map the uniformly distributed numbers on [0, 1] to a non-uniform shape in the domain which will follow the probability density function (PDF).
You can see the CDF performing this nonlinear mapping in this figure.
One use of the PDF would be Acceptance-Rejection methods, which can be useful for some distributions including custom PDFs (thanks to @pjs for jogging my memory).
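For illustration, a minimal acceptance-rejection sketch might look like the following. The target PDF f(x) = 6x(1-x) on [0,1] is a hypothetical example chosen for this sketch, with a Uniform(0,1) proposal and the bound M = max f = 1.5.

```matlab
f = @(x) 6*x.*(1 - x);      % hypothetical target PDF on [0,1]
M = 1.5;                    % upper bound on f (its maximum, at x = 0.5)
n = 10000;                  % number of accepted samples wanted

samples = zeros(n,1);
count   = 0;
while count < n
    x = rand();             % candidate drawn from the Uniform(0,1) proposal
    u = rand();             % uniform height used for the accept/reject test
    if u*M <= f(x)          % accept with probability f(x)/M
        count = count + 1;
        samples(count) = x;
    end
end
```

On average only 1/M of the candidates are accepted, so a tight bound M matters for efficiency.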
I've got an arbitrary probability density function discretized as a matrix in Matlab, that means that for every pair x,y the probability is stored in the matrix:
A(x,y) = probability
This is a 100x100 matrix, and I would like to be able to generate random samples of two dimensions (x,y) out of this matrix and also, if possible, to be able to calculate the mean and other moments of the PDF. I want to do this because after resampling, I want to fit the samples to an approximated Gaussian Mixture Model.
I've been looking everywhere but I haven't found anything as specific as this. I hope you may be able to help me.
Thank you.
If you really have a discrete probability density function defined by A (as opposed to a continuous probability density function that is merely described by A), you can "cheat" by turning your 2D problem into a 1D problem.
%define the possible values for the (x,y) pair
row_vals = [1:size(A,1)]'*ones(1,size(A,2)); %all x values
col_vals = ones(size(A,1),1)*[1:size(A,2)]; %all y values
%convert your 2D problem into a 1D problem
A = A(:);
row_vals = row_vals(:);
col_vals = col_vals(:);
%calculate your fake 1D CDF, assumes sum(A(:))==1
CDF = cumsum(A); %remember, first term out of cumsum is not zero
%because of the operation we're doing below (interp1 followed by ceil)
%we need the CDF to start at zero
CDF = [0; CDF(:)];
%generate random values
N_vals = 1000; %give me 1000 values
rand_vals = rand(N_vals,1); %spans zero to one
%look into CDF to see which index the rand val corresponds to
out_val = interp1(CDF,[0:1/(length(CDF)-1):1],rand_vals); %spans zero to one
ind = ceil(out_val*length(A));
%using the inds, you can lookup each pair of values
xy_values = [row_vals(ind) col_vals(ind)];
I hope that this helps!
Chip
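One caveat with the interp1 lookup: interp1 needs distinct sample points, so it can fail when A contains zero-probability cells (the CDF then has repeated values). A possible alternative, sketched here with a hypothetical random matrix standing in for A, counts how many CDF entries lie below each random draw and converts the linear index back with ind2sub.

```matlab
A = rand(100,100);               % hypothetical stand-in for the probability matrix
A(A < 0.2) = 0;                  % include some zero-probability cells on purpose
A = A/sum(A(:));                 % normalize so the probabilities sum to 1

N_vals = 1000;                   % number of (x,y) samples to draw
CDF = cumsum(A(:));              % 1D CDF over the column-major linear index
CDF = CDF/CDF(end);              % guard against round-off so the last value is 1

r   = rand(N_vals,1);            % Uniform(0,1) draws
ind = sum(r > CDF', 2) + 1;      % first index whose CDF value is >= r (R2016b+)

[row, col] = ind2sub(size(A), ind);
xy_values  = [row col];          % sampled (x,y) index pairs
```

Cells with zero probability are never selected, because the CDF does not increase across them.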
I don't believe MATLAB has built-in functionality for generating multivariate random variables with an arbitrary distribution. As a matter of fact, the same is true for univariate random numbers. But while the latter can easily be generated from the inverse of the cumulative distribution function, a multivariate CDF cannot be inverted in the same way, so generating such numbers is much messier (the main problem is that two or more variables can be correlated). So this part of your question is far beyond the scope of this site.
Since half an answer is better than no answer, here's how you can compute the mean and higher moments numerically using matlab:
%generate some dummy input
xv=linspace(-50,50,101);
yv=linspace(-30,30,100);
[x y]=meshgrid(xv,yv);
%define a discretized two-hump Gaussian distribution
A=floor(15*exp(-((x-10).^2+y.^2)/100)+15*exp(-((x+25).^2+y.^2)/100));
A=A/sum(A(:)); %normalized to sum to 1
%plot it if you like
%figure;
%surf(x,y,A)
%actual half-answer starts here
%get normalized pdf
weight=trapz(xv,trapz(yv,A));
A=A/weight; %A normalized to 1 according to trapz^2
%mean
mean_x=trapz(xv,trapz(yv,A.*x));
mean_y=trapz(xv,trapz(yv,A.*y));
So, the point is that you can perform a double integral on a rectangular mesh using two consecutive calls to trapz. This allows you to compute the integral of any quantity that has the same shape as your mesh, but a drawback is that vector components have to be computed independently. If you only wish to compute things which can be parametrized with x and y (which are naturally the same size as your mesh), then you can get along without having to do any additional thinking.
You could also define a function for the integration:
function res=trapz2(xv,yv,A,arg)
    % arg must be a scalar or an array of the same size as A
    if ~isscalar(arg) && ~isequal(size(arg),size(A))
        error('Size of A and arg must be the same!')
    end
    res=trapz(xv,trapz(yv,A.*arg));
end
This way you can compute stuff like
weight=trapz2(xv,yv,A,1);
mean_x=trapz2(xv,yv,A,x);
NOTE: the reason I used a 101x100 mesh in the example is that the double call to trapz should be performed in the proper order. If you interchange xv and yv in the calls, you get the wrong answer due to inconsistency with the definition of A, but this will not be evident if A is square. I suggest avoiding symmetric quantities during the development stage.
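Higher moments follow the same pattern. The sketch below uses a single hypothetical Gaussian (center (10,0), standard deviations 5 and 3) instead of the two-hump example, so that the results can be compared with known values; the helper int2 is just a local shorthand for the double trapz call.

```matlab
xv = linspace(-50,50,101);
yv = linspace(-30,30,100);
[x,y] = meshgrid(xv,yv);                      % x and y are 100x101, like A

% hypothetical test density: Gaussian with sigma_x = 5, sigma_y = 3
A = exp(-(x-10).^2/(2*5^2) - y.^2/(2*3^2));

int2 = @(Q) trapz(xv, trapz(yv, Q));          % integrate over y (rows), then x
A = A/int2(A);                                % normalize the PDF to integrate to 1

mean_x = int2(A.*x);                          % first moments
mean_y = int2(A.*y);
var_x  = int2(A.*(x-mean_x).^2);              % second central moments
var_y  = int2(A.*(y-mean_y).^2);
cov_xy = int2(A.*(x-mean_x).*(y-mean_y));     % covariance (about 0 here)
```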
I was wondering if it is possible to generate a random distribution that is a function of a certain parameter. In other words, if I type rand(1,5) in MATLAB, I get 5 uniformly distributed random numbers between 0 and 1. Is it possible to have this result as a function of a certain parameter? Do you know of any algorithm for that? I just need it on an interval; I don't need a 2D representation.
I think you want to do this:
http://en.wikipedia.org/wiki/Inverse_transform_sampling
In MATLAB, it's quite straightforward, you simply specify the function.
n = 10000; % number of random draws
r = rand(n, 1); % generate uniform random numbers
f = @norminv; % specify transforming function
tr = f(r); % transformed numbers, now normally distributed
hist(tr, 30) % plot histogram
This example is a bit contrived, since we could simply have used randn. But the method holds generally.
If you have the Statistics toolbox, and you want to sample from one of the popular distributions, take a look at the random number generators that are available to you, link.
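If no closed-form inverse CDF is available, the same idea can be sketched numerically: tabulate the CDF on a grid and invert it with interp1. The density f(x) = x/2 on [0,2] and the grid size are hypothetical choices for illustration; only base MATLAB functions are used.

```matlab
n  = 10000;                           % number of random draws
xg = linspace(0, 2, 2001)';           % grid covering the support [0,2]
pdfVals = xg/2;                       % hypothetical example density f(x) = x/2

cdfVals = cumtrapz(xg, pdfVals);      % numerical CDF on the grid
cdfVals = cdfVals/cdfVals(end);       % force the CDF to end exactly at 1

[cdfU, iu] = unique(cdfVals);         % interp1 needs distinct sample points
tr = interp1(cdfU, xg(iu), rand(n,1));  % numerical inverse CDF applied to uniforms
```

For this density the exact mean is 4/3, which gives a simple check on the result.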
I am trying to input the following piecewise function into matlab as a probability distribution. Then I'm trying to generate random values of X. I have the statistic tool box so I can generate the random numbers using that, but I cannot figure out how to input the function so that I can actually generate the random numbers.
P(X)= Ax 0<=x<1
A/2 1<=x<2
0 otherwise
A is a normalization constant.
I ultimately want to show a histogram of 10,000 trials from this distribution and find the mean and standard deviation of my simulation.
Samples from a given distribution can be generated, for instance, using inverse transform sampling (see http://en.wikipedia.org/wiki/Inverse_transform_sampling). It's quite easy, since you just generate uniformly distributed values and then apply the inverse of your cumulative distribution function.
The cumulative distribution function can be computed by integrating the probability density function; in your case
x^2/2 ... for x in [0,1]
x/2 ... for x in (1,2]
Note that the normalizing constant is A=1.
Now, the m-file doing this is the following:
function vals = genDist(len)
    vals = rand(len,1);
    for i=1:length(vals)
        if vals(i)<=1/2 % vals(i) 0..0.5
            vals(i) = sqrt(2*vals(i)); %inverse function of x^2/2
        else % vals(i) 0.5-1
            vals(i) = vals(i)*2; %inverse function of x/2
        end
    end
end
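To get the histogram, mean and standard deviation asked for in the question, something along these lines should work. It is a vectorized sketch of the same inverse transform; the bin count is arbitrary, and the reference values 13/12 and sqrt(35)/12 are the exact mean and standard deviation of this density with A = 1.

```matlab
n = 10000;                         % number of trials
u = rand(n,1);                     % Uniform(0,1) samples
x = zeros(n,1);

lowerPart     = u <= 0.5;          % CDF values reached on 0 <= x <= 1
x(lowerPart)  = sqrt(2*u(lowerPart));   % inverse of x^2/2
x(~lowerPart) = 2*u(~lowerPart);        % inverse of x/2

hist(x, 40)                        % histogram of the simulated values

sampleMean = mean(x);              % exact value: 13/12
sampleStd  = std(x);               % exact value: sqrt(35)/12
```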