What's the range of a random variable generated with randn in matlab?

I am using matlab to plot the random variables satisfying the normal distribution. I plot the histogram as
w = 0.2;
y = randn(1, 1000)*w;
hist(y);
This shows the values in the histogram ranging from -40 to 40, but why is that? Since the width of the normal distribution is 0.2, I thought the range of the variable should be within -1 to 1, right? So why does hist show -40 to 40? How do I know the actual range of the random variable? Thanks.

A normal (Gaussian) random variable can in theory take any value from -infinity to +infinity. However, the distribution is bell-shaped, which means that values far from the mean are less likely, but there is still a chance that they occur. So if you use randn(1,1000000) instead of randn(1, 1000), you will very likely see a larger range. The value 0.2 that you multiply randn() by just sets the standard deviation of the samples to 0.2 (so the variance, or power, of this random signal becomes 0.04).

Can you give a bit more information?
When I run your snippet, I get a Gaussian histogram with min and max:
>> [min(y) max(y)]
ans =
-0.6464 0.7157
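A minimal sketch of what the answers above describe: the observed range of 0.2*randn grows slowly with the number of samples, while the standard deviation stays near 0.2. The sample sizes and variable names are arbitrary choices for illustration, not taken from the question.

```matlab
% Sketch: observed range of w*randn for growing sample sizes.
% The sample sizes below are assumptions chosen for illustration.
w = 0.2;                          % standard deviation after scaling
sizes = [1e3 1e4 1e5 1e6];
ranges = zeros(numel(sizes), 2);  % one row per sample size: [min max]
for k = 1:numel(sizes)
    y = w * randn(1, sizes(k));
    ranges(k, :) = [min(y) max(y)];
end
disp([sizes(:) ranges])           % columns: N, min, max
sample_std = std(y);              % std of the largest sample, close to w
```

The extremes typically land a few multiples of w away from zero, nowhere near -40 or 40.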

Related

Sampling at exactly Nyquist rate in Matlab

Today I stumbled upon a strange outcome in matlab. Let's say I have a sine wave such that
f = 1;
Fs = 2*f;
t = linspace(0,1,Fs);
x = sin(2*pi*f*t);
plot(x)
and the outcome is in the figure.
when I set,
f = 100
outcome is in the figure below,
What is the exact reason for this? By the Nyquist sampling theorem, it should have generated the sine properly. Of course when I take Fs >> f I get better results and a very good sine shape. My explanation to myself is that Matlab was having a hard time with floating-point numbers, but I am not sure if this is true at all. Does anyone have any suggestions?
In the first case you only generate 2 samples (the third input of linspace is number of samples), so it's hard to see anything.
In the second case you generate 200 samples from time 0 to 1 (including those two values). So the sampling period is 1/199, and the sampling frequency is 199, which is slightly below the Nyquist rate. So there is aliasing: you see the original signal of frequency 100 plus its alias at frequency 99.
In other words: the following code reproduces your second figure:
t = linspace(0,1,200);
x = .5*sin(2*pi*100*t) -.5*sin(2*pi*99*t);
plot(x)
The .5 and -.5 above stem from the fact that a sine wave can be decomposed as the sum of two spectral deltas at positive and negative frequencies, and the coefficients of those deltas have opposite signs.
The sum of those two sinusoids is equivalent to amplitude modulation, namely a sine of frequency 99.5 modulated by a sine of frequency 1/2. Since time spans from 0 to 1, the modulator signal (whose frequency is 1/2) only completes half a period. That's what you see in your second figure.
To avoid aliasing you need to increase sample rate above the Nyquist rate. Then, to recover the original signal from its samples you can use an ideal low pass filter with cutoff frequency Fs/2. In your case, however, since you are sampling below the Nyquist rate, you would not recover the signal at frequency 100, but rather its alias at frequency 99.
Had you sampled above the Nyquist rate, for example Fs = 201, the original signal could ideally be recovered from the samples.† But that would require an almost ideal low pass filter, with a very sharp transition between passband and stopband. Namely, the alias would now be at frequency 101 and should be rejected, whereas the desired signal would be at frequency 100 and should be passed.
To relax the filter requirements you can sample well above the Nyquist rate. That way the aliases are further apart from the signal and the filter has an easier job separating signal from aliases.
† That doesn't mean the graph looks like your original signal (see SergV's answer); it only means that after ideal lowpass filtering it will.
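As a rough numerical check of the aliasing argument in this answer (a sketch, with variable names of my own choosing): on the 200-point linspace grid the effective sampling rate is 199, and the samples of the 100 Hz sine coincide with those of a sign-flipped 99 Hz sine, and also with the amplitude-modulated form.

```matlab
% Sketch: samples of a 100 Hz sine taken at an effective rate of 199 Hz.
f = 100;
t = linspace(0, 1, 2*f);                     % 200 samples, endpoints included
Fs_eff = (numel(t) - 1) / (t(end) - t(1));   % effective rate: 199
x = sin(2*pi*f*t);                           % what the question plots
x_alias = -sin(2*pi*(Fs_eff - f)*t);         % 99 Hz alias, sign flipped
x_am = cos(2*pi*99.5*t) .* sin(2*pi*0.5*t);  % carrier times half-period envelope
diff_alias = max(abs(x - x_alias));          % rounding error only
diff_am = max(abs(x - x_am));                % rounding error only
```

Both differences come out at the level of floating-point rounding, so the three expressions are indistinguishable at the sample instants.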
Your problem is not related to the Nyquist theorem and aliasing. It is simply a problem of graphical representation. You can change your code so that the frequency of the sine is below the Nyquist limit, but the graph will look as strange as before:
t = linspace(0,1,Fs+2);
plot(sin(2*pi*f*t));
Result:
To explain the problem I modified your code:
Fs=100;
f=12; %f << Fs
t=0:1/Fs:0.5; % step =1/Fs
t1=0:1/(10*Fs):0.5; % step=1/(10*Fs) for precise graphic representation
subplot(2, 1, 1);
plot(t,sin(2*pi*f*t),'-b',t,sin(2*pi*f*t),'*r');
subplot(2, 1, 2);
plot(t1,sin(2*pi*f*t1),'g',t,sin(2*pi*f*t),'r*');
See the result:
Red stars - values of sin(2*pi*f*t) sampled at rate Fs.
Blue line - the lines that connect the red stars. This is the usual data representation of the function plot(): linear interpolation between data points.
Green curve - sin(2*pi*f*t)
Your eyes and brain can easily see that these graphs represent a sine.
Now change the frequency to a higher value:
f=48; % 2*f < Fs !!!
Look at the blue lines and red stars. Your eyes and brain no longer recognize that these graphs represent the same sine. But your "red stars" are still valid values of the sine. See the bottom graph.
Finally, here are the same plots for a sine with frequency f=50 (2*f = Fs):
P.S.
The Nyquist-Shannon sampling theorem states for your case that if:
2*f < Fs
and you have an infinite number of samples (red stars on our plots)
then you can reproduce the values of the function at any time (green curve on our plots). You must use sinc interpolation to do it.
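A minimal sketch of the sinc interpolation mentioned above, assuming Fs=100 and f=12 as in the earlier example. Only a finite record is available here (the theorem assumes infinitely many samples), so the reconstruction is approximate and is evaluated in the middle of the record. my_sinc is a hypothetical helper defined inline, not a built-in.

```matlab
% Sketch: reconstruct a sampled sine between its samples by sinc interpolation.
Fs = 100; f = 12;                  % 2*f < Fs
n = 0:399;                         % sample indices: 4 s of data (finite record)
xs = sin(2*pi*f*n/Fs);             % the "red stars"
% Hypothetical helper: sin(pi*u)/(pi*u), with the value 1 at u == 0.
my_sinc = @(u) (u == 0) + (u ~= 0) .* sin(pi*u) ./ (pi*u + (u == 0));
t_fine = 1.9:0.001:2.1;            % dense grid mid-record, away from the edges
x_rec = zeros(size(t_fine));
for k = 1:numel(t_fine)
    % Weighted sum of all samples, each weighted by a shifted sinc.
    x_rec(k) = sum(xs .* my_sinc(Fs*t_fine(k) - n));
end
err = max(abs(x_rec - sin(2*pi*f*t_fine)));   % small, limited by truncation
```

The residual error comes from cutting off the sinc tails at the ends of the record; a longer record shrinks it.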
copied from Matlab Help:
linspace
Generate linearly spaced vectors
Syntax
y = linspace(a,b)
y = linspace(a,b,n)
Description
The linspace function generates linearly spaced vectors. It is similar to the colon operator ":", but gives direct control over the number of points.
y = linspace(a,b) generates a row vector y of 100 points linearly spaced between and including a and b.
y = linspace(a,b,n) generates a row vector y of n points linearly spaced between and including a and b. For n < 2, linspace returns b.
Examples
Create a vector of 100 linearly spaced numbers from 1 to 500:
A = linspace(1,500);
Create a vector of 12 linearly spaced numbers from 1 to 36:
A = linspace(1,36,12);
linspace does not make the sampling interval obvious, so you can use the more common form:
t = 0:Ts:1;
or
t = 0:1/Fs:1;
and change the Fs values.
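To make the difference concrete, here is a small sketch comparing the two ways of building the time vector (Fs = 200 is just an example value):

```matlab
% Sketch: linspace fixes the number of points, the colon form fixes the step.
Fs = 200;
t_lin = linspace(0, 1, Fs);              % Fs points INCLUDING both endpoints
t_col = 0:1/Fs:1;                        % step 1/Fs, which gives Fs+1 points
rate_lin = 1 / (t_lin(2) - t_lin(1));    % 199, not 200
rate_col = 1 / (t_col(2) - t_col(1));    % 200
```

If you prefer linspace, linspace(0, 1-1/Fs, Fs) also gives a spacing of 1/Fs.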
The first figure is due to the floating-point approximation of zero: the two samples are sin(0) and sin(2*pi), and sin(2*pi) does not evaluate to exactly 0. Notice that the plotted range is at the 10^(-16) level.
I wrote the function reconstruct_FFT that can recover critically sampled data even for short observation intervals if the input sequence of samples is periodic. It performs lowpass filtering in the frequency domain.

Matlab create array random samples Gaussian distribution

I'd like to make an array of random samples from a Gaussian distribution.
Mean value is 0 and variance is 1.
If I take enough samples, I would think my maximum value of a sample can be 0+1=1.
However, I find that I get values like 4.2891 ...
My code:
x = 0+sqrt(1)*randn(100000,1);
mean(x)
var(x)
max(x)
This would give me a mean like 0, a var of 0.9937 but my maximum value is 4.2891?
Can anyone help me out why it does this?
As others have mentioned, there is no bound on the possible values that x can take on in a gaussian distribution. However, the farther x is from the mean, the less likely it is to be drawn.
To give you some intuition for what the variance actually means in the gaussian case, you can look at the 68-95-99.7 rule (these percentages are specific to the normal distribution). The rule says:
about 68% of the population will be within one sigma of the mean
about 95% of the population will be within two sigmas of the mean
about 99.7% of the population will be within three sigmas of the mean
Here sigma = sqrt(var) is the standard deviation of the distribution.
So while in theory it is possible to draw any x from a gaussian distribution, in practice it is unlikely to draw anything past 5 or 6 standard deviations away for a population of 100000.
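The rule is easy to check empirically. A quick sketch with the same sample size as in the question (variable names are my own):

```matlab
% Sketch: empirical check of the 68-95-99.7 rule on standard normal samples.
x = randn(100000, 1);
mu = mean(x);
sigma = std(x);
frac = zeros(1, 3);
for k = 1:3
    % Fraction of samples within k standard deviations of the mean.
    frac(k) = mean(abs(x - mu) <= k*sigma);
end
disp(frac)                         % roughly 0.68, 0.95, 0.997
largest = max(abs(x));             % typically between 4 and 5 for this size
```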
This will yield N random numbers drawn from a Gaussian (normal) distribution.
N = 100;
mu = 0;
sigma = 1;
Xs = normrnd(mu, sigma, N, 1); % N-by-1 vector; normrnd(mu, sigma, N) would return an N-by-N matrix
EDIT:
I just realized that your code is in fact equivalent to what I've written.
As others already pointed out: variance is not the maximum distance a sample can deviate from the mean! (It is just the average of the squares of those distances)

Generate white noise with amplitude between [-1 1] with Matlab

I'm using the Matlab function Y = WGN(M,N,P) to generate white noise with Gaussian distribution. This function uses a power value (dB Watts) to calculate the amplitude of the output signal. Since I want to get an output amplitude range of -1 V to 1 V there is a function mode 'linear'.
I'm trying to use the 'linear' mode to produce the output but the result is an output amplitude range of [-4 4]
RandomSignal = wgn(10000,1,1,1,'linear');
Time = linspace(0,10,10000);
figure()
plot(Time,RandomSignal)
figure()
hist(RandomSignal,100)
Is there another function to produce this result, or am I just doing something wrong?
As others have said, you can't limit a Gaussian distribution.
What you can do is define your range to be 6 standard deviations wide, and then use sigma*randn(m,1) + Mu to generate your signal.
For example, if you want a range of [-1 1], choose sigma=2/6=0.333 and Mu=0. Each sample then has a 99.7% chance of being inside the range. You can then clip the samples that fall outside the range to -1 or 1.
This will not be a pure Gaussian distribution, but this is the closest you can get.
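A minimal sketch of that recipe, assuming a signal length of 10000 as in the question (the variable names are my own):

```matlab
% Sketch: Gaussian noise with sigma = 1/3, clipped to [-1, 1].
N = 10000;
sigma = 2/6;                       % [-1, 1] spans 6 standard deviations
x = sigma * randn(N, 1);           % zero-mean Gaussian samples
frac_out = mean(abs(x) > 1);       % fraction outside the range, about 0.003
x_clipped = max(-1, min(x, 1));    % saturate the few outliers at -1 or 1
```

Clipping piles the outliers up at exactly -1 and 1, which is why the result is no longer a pure Gaussian.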
Why not just take the output of randn, whatever its range, and then normalize it, for example:
noise=randn(400); noise=noise./max(abs(noise(:)));
so whatever the output of randn is, you will finally have white noise inside [-1 1].
Gaussian noise has an unbounded range. (The support of the Gaussian pdf is infinite.)
You can use rand rather than Gaussian generator. The output range of rand is 0-1, so to make it in the range -1 1 you use rand(args)*2 -1.
It should be noted that this generator is sampling a uniform density.
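For comparison, a short sketch of the uniform alternative (N is an arbitrary example size):

```matlab
% Sketch: white noise with a uniform density on (-1, 1).
N = 100000;
u = 2*rand(N, 1) - 1;              % rand is uniform on (0,1); rescale to (-1,1)
u_var = var(u);                    % close to 1/3 for a uniform density on (-1,1)
```

Every sample is guaranteed to lie inside the range, but the histogram is flat rather than bell-shaped.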
Don't want to say something very wrong, but when I copied your code and changed
RandomSignal = .25*wgn(10000,1,1,1,'linear');
it was then ok. Hope it works for you.(Assuming random data/4 is still random data)
Hello, it is probably late to answer this question, but I think this can be useful to others.
The function below produces Gaussian random noise restricted to (-1,1) by redrawing every sample that falls outside that interval.
function nois = randn_fun_Q(S_Q)
% S_Q is the number of rows of the noise vector. For example, in
% system identification, it is the number of rows of the covariance matrix.
nois = zeros(S_Q, 1);          % preallocate the output column
for j = 1:S_Q
    nois_n = randn;
    while abs(nois_n) >= 1     % reject samples outside (-1,1) and redraw
        nois_n = randn;
    end
    nois(j) = nois_n;
end
end
You can call this function from your main code (for example inside a "for" loop) to produce noise points in the range (-1,1); a single call with S_Q = n already returns n of them.

Matlab ksdensity point range

I am using this form of the ksdensity function in MATLAB.
[f,xi] = ksdensity(x)
The documentation says that "f is the vector of density values evaluated at the points in xi ... The density is evaluated at 100 equally spaced points that cover the range of the data in x. "
Now, my xi values cover a much larger range than the data in x. Why is this?
For my data,
>> min(x)
ans =
-2.2588
>> min(xi)
ans =
-6.8010
>> max(x)
ans =
6.5326
>> max(xi)
ans =
11.0748
I know I can specify an xi range myself, but why is it not equally spaced between min and max of x by default?
It makes it hard to compare histogram estimators and kernel estimators when the bins in the histogram only cover the range of x, whereas the test points given from ksdensity exceed this range.
ksdensity performs a smoothing of the histogram with a Gaussian kernel. As an illustration, check the output of the impulse response:
[f,xi] = ksdensity(0);
plot(xi,f)
The width of the Gaussian is 1, which means that you'll add about 3 on both sides of the data (which can make it appear as if there were negative values in a strictly positive histogram). Thus, if your data span a small range only, you need to shrink the kernel width in ksdensity.

Generating random noise in matlab

When I add Gaussian noise to an array, shouldn't the histogram be Gaussian? Although the noise is random, the distribution should be Gaussian, right? That is not what I get.
A=zeros(10);
A=imnoise(A,'gaussian');
imhist(A)
Two things could be going on:
You don't have enough of a sample size, or
The default mean of imnoise with gaussian distribution is 0, meaning you're only seeing the right half of the bell curve.
Try
imhist(imnoise(zeros(1000), 'gaussian', 0.5));
This is what your code is doing:
A = zeros(10);
mu = 0; sd = 0.1; %# mean, std dev
B = A + randn(size(A))*sd + mu; %# add gaussian noise
B = max(0,min(B,1)); %# make sure that 0 <= B <= 1
imhist(B) %# intensities histogram
Can you see where the problem is? (Hint: RANDN returns numbers ~N(0,1), thus the resulting added noise is ~N(mu,sd^2), and everything below 0 is clipped to 0.)
Perhaps what you are trying to do is:
hist( randn(1000,1) )
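To see the effect of the clipping step in numbers, here is a sketch that repeats the computation above without imnoise, once with mean 0 and once with mean 0.5 (the array size 1000 and the variable names are my own choices):

```matlab
% Sketch: how clipping to [0,1] distorts zero-mean Gaussian noise.
sd = 0.1;                                     % std dev used in the code above
B0 = max(0, min(0   + sd*randn(1000), 1));    % mean 0: left half clipped to 0
B5 = max(0, min(0.5 + sd*randn(1000), 1));    % mean 0.5: bell fits inside [0,1]
frac_zero_0 = mean(B0(:) == 0);               % about one half
frac_zero_5 = mean(B5(:) == 0);               % essentially none
```

With mean 0, roughly half of the pixels end up in the first histogram bin, so only the right half of the bell curve remains visible.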
imnoise() is a function that can be applied to images, not plain arrays.
Maybe you can look into the randn() function, instead.
You might not see a bell-curve with a sampling frame of only 10.
See the central limit theorem.
http://en.wikipedia.org/wiki/Central_limit_theorem
I would try increasing the sampling frame to something much larger.
Reference:
Law of Large Numbers
http://en.wikipedia.org/wiki/Law_of_large_numbers