Proper way to add noise to signal - matlab

In many fields I have seen that when noise is added, a specification such as zero mean and a given variance is stated. I need to add AWGN, colored noise, and uniform noise at varying SNR in dB. The following code shows how I generated and added the noise. I am aware of the function awgn(), but it is something of a black box: it does not reveal how the noise is actually added. So, can somebody please explain the correct way to generate and add noise? Thank you.
SNR = [-10:5:30]; % in dB
snr = 10 .^ (0.1 .* SNR);
for I = 1:length(snr)
noise = 1 / sqrt(2) * (randn(1, N) + 1i * randn(1, N));
u = y + noise .* snr(I);
end

I'm adding another answer since it strikes me that Steven's is not quite correct and Horchler's suggestion to look inside function awgn is a good one.
Both MATLAB and Octave (in the communications toolbox) have a function awgn that adds white Gaussian noise to attain a desired signal-to-noise power level; the following is the relevant portion of the code (from the Octave function):
if (meas == 1) % <-- if using signal power to determine appropriate noise power
p = sum( abs( x(:)) .^ 2) / length(x(:));
if (strcmp(type,"dB"))
p = 10 * log10(p);
endif
endif
if (strcmp(type,"linear"))
np = p / snr;
else % <-- in dB
np = p - snr;
endif
y = x + wgn (m, n, np, 1, seed, type, out);
As you can see by the way p (the power of the input data) is computed, the answer from Steven does not appear to be quite right.
You can ask the function to compute the total power of your data array and combine that with the desired s/n value you provide to compute the appropriate power level of the added noise. You do this by passing the string "measured" among the optional inputs, like this (see here for the Octave documentation or here for the MATLAB documentation):
y = awgn (x, snr, 'measured')
This leads ultimately to meas=1 and so meas==1 being true in the code above. The function awgn then uses the signal passed to it to compute the signal power, and from this and the desired s/n it then computes the appropriate power level for the added noise.
As the documentation further explains:
By default the snr and pwr are assumed to be in dB and dBW
respectively. This default behavior can be chosen with type set to
"dB". In the case where type is set to "linear", pwr is assumed to be
in Watts and snr is a ratio.
This means you can pass a negative or 0 dB snr value. The result will also depend then on other options you pass, such as the string "measured".
For the MATLAB case I suggest reading the documentation, which explains how to use the function awgn in different scenarios. Note that the implementations in Octave and MATLAB are not identical: the computation of the noise power should be the same, but the available options may differ.
And here is the relevant part from wgn (called above by awgn):
if (strcmp(type,"dBW"))
np = 10 ^ (p/10);
elseif (strcmp(type,"dBm"))
np = 10 ^((p - 30)/10);
elseif (strcmp(type,"linear"))
np = p;
endif
if(!isempty(seed))
randn("state",seed);
endif
if (strcmp(out,"complex"))
y = (sqrt(imp*np/2))*(randn(m,n)+1i*randn(m,n)); % imp=1 assuming impedance is 1 Ohm
else
y = (sqrt(imp*np))*randn(m,n);
endif
If you want to check the power of your noise (np), the awgn and wgn functions assume the following relationships hold:
np = var(y,1); % linear scale
np = 10*log10(np); % in dB
where var(...,1) is the population variance for the noise y.
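The steps that awgn and wgn perform above can be condensed into a few lines. The following is a minimal sketch in Python (standard library only), not the toolbox code: the names add_awgn and measured_snr_db are made up for illustration, and the sketch assumes the 'measured' behaviour with the SNR given in dB and an impedance of 1 Ohm.

```python
import math
import random

def add_awgn(signal, snr_db, rng=None):
    """Add white Gaussian noise so that signal power / noise power equals snr_db (dB).

    Hypothetical helper that mirrors awgn(x, snr, 'measured'); accepts real
    or complex sequences.
    """
    rng = rng or random.Random()
    n = len(signal)
    # Measured signal power: mean of |x|^2 (the quantity p in the awgn excerpt).
    sig_power = sum(abs(x) ** 2 for x in signal) / n
    # Convert the dB SNR to a linear power ratio, then derive the noise power.
    noise_power = sig_power / (10 ** (snr_db / 10))
    if any(isinstance(x, complex) for x in signal):
        # Complex noise: split the power equally between real and imaginary parts.
        sigma = math.sqrt(noise_power / 2)
        noise = [complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(n)]
    else:
        sigma = math.sqrt(noise_power)
        noise = [rng.gauss(0, sigma) for _ in range(n)]
    return [x + w for x, w in zip(signal, noise)]

def measured_snr_db(clean, noisy):
    """Empirical SNR in dB, computed from the clean signal and its noisy version."""
    sig_power = sum(abs(x) ** 2 for x in clean)
    noise_power = sum(abs(y - x) ** 2 for x, y in zip(clean, noisy))
    return 10 * math.log10(sig_power / noise_power)

if __name__ == "__main__":
    rng = random.Random(0)
    x = [math.sin(2 * math.pi * 0.01 * k) for k in range(10000)]
    for snr in (-10, 0, 10, 20):
        y = add_awgn(x, snr, rng)
        print(snr, round(measured_snr_db(x, y), 2))
```

The measured SNR only approaches the requested value for long signals, because the noise power of any single realization fluctuates around its expected value.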

Most answers here forget that SNR is specified in decibels. Therefore, you shouldn't encounter a 'division by 0' error, because you should really divide by 10^(targetSNR/10), which is strictly positive for any real targetSNR.

This 'should not divide by 0' problem could be easily solved if you add a condition to check if targetSNR is 0 and do these only if it is not 0. When your target SNR is 0, it means it's pure noise.
function out_signal = addAWGN(signal, targetSNR)
sigLength = length(signal); % length
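awgnNoise = randn(size(signal)); % original (unit-variance) noise
pwrSig = sqrt(sum(signal.^2))/sigLength; % signal RMS amplitude divided by sqrt(sigLength)
pwrNoise = sqrt(sum(awgnNoise.^2))/sigLength; % noise RMS amplitude divided by sqrt(sigLength)
if targetSNR ~= 0
scaleFactor = (pwrSig/pwrNoise)/targetSNR; % amplitude scale; targetSNR is a linear amplitude ratio, not dB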
awgnNoise = scaleFactor*awgnNoise;
out_signal = signal + awgnNoise; % add noise
else
out_signal = awgnNoise; % noise only
end

You can use randn() to generate a noise vector 'awgnNoise' of the length you want. Then, given a specified SNR value, calculate the power of the original signal and the power of the noise vector 'awgnNoise'.
Get the right amplitude scaling factor for the noise vector and just scale it.
The following code is an example to corrupt signal with white noise, assuming input signal is 1D and real valued.
function out_signal = addAWGN(signal, targetSNR)
sigLength = length(signal); % length
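awgnNoise = randn(size(signal)); % original (unit-variance) noise
pwrSig = sqrt(sum(signal.^2))/sigLength; % signal RMS amplitude divided by sqrt(sigLength)
pwrNoise = sqrt(sum(awgnNoise.^2))/sigLength; % noise RMS amplitude divided by sqrt(sigLength)
scaleFactor = (pwrSig/pwrNoise)/targetSNR; % amplitude scale; targetSNR is a linear amplitude ratio, not dB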
awgnNoise = scaleFactor*awgnNoise;
out_signal = signal + awgnNoise; % add noise
Be careful about the sqrt(2) factor when you deal with a complex signal and generate the real and imaginary parts of the noise separately.

Related

Plot Bit-Error-rate (BER) vs Eb_N0 of a discrete BPSK signal passed through integrator in MATLAB

I am a novice in communication systems, but I need to solve this question and would appreciate input on it. Apologies if you find my question lengthy.
I have a BPSK signal such that the input to a low-pass filter (LPF) is of the form
$s(k) = A + n_k, \quad k = 0, 1, \ldots$
where A is a constant and the $n_k$ are independent zero-mean normal random variables with variance $\sigma^2$.
Suppose the LPF is an integrator of the form
$y = \frac{1}{N} \sum_{k=0}^{N-1} s(k)$
After applying this LPF, the output y is a test statistic with bit-energy-to-noise ratio $E_b/N_0$. We know that for BPSK the probability of making a wrong decision using this test statistic is
$P_b = \text{theoretical BER} = 0.5\,\mathrm{erfc}\left(\sqrt{E_b/N_0}\right)$
Suppose further that an infinite impulse response (IIR) filter corresponds to
$\tilde{y}(n) = (1 - \alpha)\,s(n) + \alpha\,\tilde{y}(n-1), \quad n = 0, 1, \ldots, N-1$, where $\tilde{y}(-1) = 0$
How can I use MATLAB to simulate the signal s(k) and the filter operations producing $y$ and $\tilde{y}(n)$ above for N = 8, and plot the simulated bit error rate vs. Eb/N0 together with the theoretical BER vs. Eb/N0 for comparison? The Eb/N0 range should be from 0 to 10 dB.
Note: For the IIR filter I have already determined $\alpha = (N-1)/(N+1)$.
My code fails to achieve the desired results, as I do not know how to simulate WGN with variance σ². The one I have coded is for zero variance. Additionally, I don't know how to pass my signal through the filters mentioned above.
Could somebody please help me rectify this piece of code? If you could help with a code snippet, it would be a significant step forward for me. Thanks in advance!
Here is my approach.
clc;
clear all;
close all;
N=8; % Number of Bits to be processed
Eb_N0_dB = 0 : 1: 10;
for i = 1:length(Eb_N0_dB)
x = rand(1,N)>0.5; %generating 0,1s
xpolar = 2*x -1; %BPSK modulation
noise = 1/sqrt(2)*[randn(1,N) + j*randn(1,N)]; %noise
y = xpolar + 10^(-Eb_N0_dB(i)/20)*noise;
xdecode = real(y)>0; %receiver hard decison decoding
nErr(i) = size(find([x - xdecode]),2);
end
simulatedBER = nErr/N; %simulated BER
theoryBER = 0.5*erfc(sqrt(10.^(Eb_N0_dB/10))); % theoretical BER
%plot
figure
semilogy(Eb_N0_dB,simulatedBER,'bs-');
hold on
semilogy(Eb_N0_dB,theoryBER,'mx-');
axis([0 10 10^-5 0.5])
grid on
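One way to set up the simulation described in the question is sketched below in Python (standard library only). It is an illustration under stated assumptions, not a definitive solution: the function names are made up, and Eb/N0 is assumed to be defined at the integrator output, Eb/N0 = N·A²/(2σ²), which is the choice that makes the integrator agree with Pb = 0.5 erfc(sqrt(Eb/N0)). The per-sample noise standard deviation σ then follows from the requested Eb/N0.

```python
import math
import random

def simulate_ber(ebn0_db, n_samples=8, n_bits=20000, filt="integrator", seed=0):
    """Monte Carlo BER for s(k) = +/-A + n_k followed by a low-pass filter.

    Assumption (not stated in the question): Eb/N0 = N * A^2 / (2 * sigma^2),
    i.e. Eb/N0 refers to the integrator output.
    """
    rng = random.Random(seed)
    amp = 1.0
    ebn0 = 10 ** (ebn0_db / 10)
    sigma = math.sqrt(n_samples * amp ** 2 / (2 * ebn0))  # per-sample noise std
    alpha = (n_samples - 1) / (n_samples + 1)              # IIR coefficient from the question
    errors = 0
    for _ in range(n_bits):
        bit = rng.random() > 0.5
        level = amp if bit else -amp
        samples = [level + rng.gauss(0, sigma) for _ in range(n_samples)]
        if filt == "integrator":
            stat = sum(samples) / n_samples
        else:
            stat = 0.0                                      # y~(-1) = 0
            for s in samples:
                stat = (1 - alpha) * s + alpha * stat
        if (stat > 0) != bit:                               # hard decision at zero
            errors += 1
    return errors / n_bits

def theory_ber(ebn0_db):
    """Theoretical BPSK bit error rate, 0.5 * erfc(sqrt(Eb/N0))."""
    return 0.5 * math.erfc(math.sqrt(10 ** (ebn0_db / 10)))

if __name__ == "__main__":
    for ebn0_db in (0, 4, 8):
        print(ebn0_db,
              simulate_ber(ebn0_db, filt="integrator"),
              simulate_ber(ebn0_db, filt="iir"),
              theory_ber(ebn0_db))
```

Note that the number of simulated bits per Eb/N0 point matters: near 10 dB the theoretical BER is of the order of 1e-6, so millions of bits are needed before any errors show up, and 8 bits per point cannot resolve the curve.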

Linear regression -- Stuck in model comparison in Matlab after estimation?

I want to determine how well the estimated model fits new, future data. A prediction error plot is often used for this. Basically, I want to compare the measured output with the model output. I am using the Least Mean Squares (LMS) algorithm as the equalization technique. Can somebody please explain the proper way to plot the comparison between the model and the measured data? If the estimates are close to the true values, the curves should lie very close to each other. Below is the code. u is the input to the equalizer, x is the noisy received signal, y is the output of the equalizer, and w is the vector of equalizer weights. Should the graph be plotted using x and y*w? But x is noisy. I am confused, since the measured output x is noisy and the model output y*w is noise-free.
%% Channel and noise level
h = [0.9 0.3 -0.1]; % Channel
SNRr = 10; % Noise Level
%% Input/Output data
N = 1000; % Number of samples
Bits = 2; % Number of bits for modulation (2-bit for Binary modulation)
data = randi([0 1],1,N); % Random signal
d = real(pskmod(data,Bits)); % BPSK Modulated signal (desired/output)
r = filter(h,1,d); % Signal after passing through channel
x = awgn(r, SNRr); % Noisy Signal after channel (given/input)
%% LMS parameters
epoch = 10; % Number of epochs (training repetitions)
eta = 1e-3; % Learning rate / step size
order=10; % Order of the equalizer
U = zeros(1,order); % Input frame
W = zeros(1,order); % Initial weights
%% Algorithm
for k = 1 : epoch
for n = 1 : N
U(1,2:end) = U(1,1:end-1); % Sliding window
U(1,1) = x(n); % Present Input
y = (W)*U'; % Calculating output of LMS
e = d(n) - y; % Instantaneous error
W = W + eta * e * U ; % Weight update rule of LMS
J(k,n) = e * e'; % Instantaneous square error
end
end
Let's go step by step.
First of all, when using a fitting method it is good practice to look at the RMS error. To get it, we need the error between the desired signal d and the equalizer output y. As I understand it, x is the input to the model and y is its output. You already compute this error, but inside the loop without saving it. Let's modify your code:
%% Algorithm
for k = 1 : epoch
for n = 1 : N
U(1,2:end) = U(1,1:end-1); % Sliding window
U(1,1) = x(n); % Present input
y(n) = (W)*U'; % Output of the LMS equalizer
e(n) = d(n) - y(n); % Instantaneous error (desired minus output)
W = W + eta * e(n) * U ; % Weight update rule of LMS
J(k,n) = e(n) * (e(n))'; % Instantaneous squared error
end
end
Now e holds the errors from the last epoch, so we can use something like this:
rms(e)
Also I'd like to compare results using mean error and standard deviation:
mean(e)
std(e)
And some visualization:
histogram(e)
Second point: the compare function cannot be applied to plain vectors; it is intended for dynamic system models, so you would need a workaround that wraps this method as a dynamic model. Alternatively, we can use a function such as goodnessOfFit. If you want an error at each step that takes all previous data points into account, compute it at each point over the range [1:currentNumber].
About the LMS method: there is a built-in function that implements LMS. Let's try it on your data set:
alg = lms(0.001);
eqobj = lineareq(10,alg);
y1 = equalize(eqobj,x);
And let's look at the result:
plot(x)
hold on
plot(y1)
There are a lot of examples of such implementation of this function: look here for example.
I hope this was helpful for you!
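To make the comparison concrete, here is a minimal sketch of the same LMS loop in Python (standard library only), with the per-epoch mean squared error between the desired symbols d and the equalizer output y recorded. The names lms_equalizer and make_data are illustrative; the channel, step size, order, and epoch count are taken from the question, and the noise level assumes what awgn(r, 10) does by default (signal power taken as 0 dBW).

```python
import random

def make_data(n=1000, snr_db=10, seed=0):
    """BPSK symbols through the channel h plus Gaussian noise (illustrative stand-in)."""
    rng = random.Random(seed)
    h = [0.9, 0.3, -0.1]
    d = [1.0 if rng.random() > 0.5 else -1.0 for _ in range(n)]
    sigma = (10 ** (-snr_db / 10)) ** 0.5      # noise std for an assumed 0 dBW signal
    x = []
    for k in range(n):
        r = sum(h[j] * d[k - j] for j in range(len(h)) if k - j >= 0)
        x.append(r + rng.gauss(0, sigma))
    return x, d

def lms_equalizer(x, d, order=10, eta=1e-3, epochs=10):
    """Train an LMS equalizer; return weights, last-epoch outputs, and per-epoch MSE."""
    w = [0.0] * order
    u = [0.0] * order                          # input frame (tapped delay line)
    y = [0.0] * len(x)
    mse_per_epoch = []
    for _ in range(epochs):
        sq_err = 0.0
        for n in range(len(x)):
            u = [x[n]] + u[:-1]                # slide the window
            y[n] = sum(wi * ui for wi, ui in zip(w, u))
            e = d[n] - y[n]                    # error against the desired symbol
            w = [wi + eta * e * ui for wi, ui in zip(w, u)]
            sq_err += e * e
        mse_per_epoch.append(sq_err / len(x))
    return w, y, mse_per_epoch

if __name__ == "__main__":
    x, d = make_data()
    w, y, mse = lms_equalizer(x, d)
    for epoch, value in enumerate(mse, start=1):
        print(epoch, round(value, 4))
```

Plotting d against y from the last epoch, or the per-epoch MSE as a learning curve, gives the model-versus-reference comparison; the noisy input x is what the equalizer consumes, not what its output should be compared with.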
The difference between the observed data and the model output is known as the residual.
The difference between the observed value of the dependent variable
(y) and the predicted value (ŷ) is called the residual (e). Each data
point has one residual.
Residual = Observed value - Predicted value
e = y - ŷ
Both the sum and the mean of the residuals are equal to zero. That is,
Σ e = 0 and ē = 0.
A residual plot is a graph that shows the residuals on the vertical
axis and the independent variable on the horizontal axis. If the
points in a residual plot are randomly dispersed around the horizontal
axis, a linear regression model is appropriate for the data;
otherwise, a non-linear model is more appropriate.
Here is an example of residual plots from a model of mine. On the vertical axis is the difference between the output of the model and the measured value. On the horizontal axis is one of the independent variables used in the model.
We can see that most of the residuals are within 0.2 units which happens to be my tolerance for this model. I can therefore make a conclusion as to the worth of the model.
See here for a similar question.
Regarding your question about the lack of noise in your model's output: we are creating a linear model. That's the clue.
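The residual definition quoted above can be checked numerically. The sketch below (Python, standard library only, with made-up data and illustrative function names) fits a straight line by ordinary least squares and computes the residuals. The zero-sum property holds for a least-squares fit that includes an intercept; it is not guaranteed for other estimators such as an LMS equalizer.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(xs, ys, slope, intercept):
    """Residual = observed value - predicted value, one per data point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

def rms(values):
    """Root mean square of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]        # made-up observations
    slope, intercept = fit_line(xs, ys)
    e = residuals(xs, ys, slope, intercept)
    print("sum of residuals:", sum(e))          # zero up to rounding error
    print("RMS residual:", rms(e))
```

Plotting e against xs gives the residual plot described above; the RMS value is a single-number summary of the same information.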

BPSK modulation and SNR : Matlab

Consider an additive white Gaussian noise (AWGN) communication channel over which a BPSK-modulated signal is transmitted. The received noisy signal is y[k] = s[k] + w[k], where s[k] is a +1 or -1 symbol and w[k] is zero-mean white Gaussian noise.
I want to estimate the signal s and evaluate the performance by varying the SNR from 0 to 40 dB. Let the estimated signal be hat_s.
The graph would then have the SNR range on the X axis and, on the Y axis, the mean square error between the known signal values and the estimates, i.e., of s[k] - hat_s[k].
Question 1: How do I define signal-to-noise ratio? Would the formula of SNR be sigma^2/sigma^2_w. I am confused about the term in the numerator: what is the variance of the signal, sigma^2, usually considered?
Question 2: But, I don't know what the value of the variance of the noise is, so how does one add noise?
This is what I have done.
N = 100; %number of samples
s = 2*round(rand(N,1))-1;
%bpsk modulation
y = awgn(s,10,'measured'); %adding noise but I don't know the variance of the signal and noise
%estimation using least squares
hat_s = y./s;
mse_s = ((s-hat_s).^2)/N;
Please correct me where I am wrong. Thank you.
First, I think it is important to know what we have in a BPSK system:
The constellation of a BPSK system is [-A, A], in this case [-1, 1].
The SNR will vary from 0 dB to 40 dB.
I think the answer lies in this function:
y = awgn( ... ); from MATLAB Central:
y = awgn(x,snr) adds white Gaussian noise to the vector signal x. The
scalar snr specifies the signal-to-noise ratio per sample, in dB. If x
is complex, awgn adds complex noise. This syntax assumes that the
power of x is 0 dBW.
y = awgn(x,snr,sigpower) is the same as the syntax above, except that
sigpower is the power of x in dBW.
y = awgn(x,snr,'measured') is the same as y = awgn(x,snr), except that
awgn measures the power of x before adding noise.
You use y = awgn(x,snr,'measured'), so you do not need to worry, because MATLAB handles everything for you: it measures the power of the signal and then adds noise with the variance needed to reach that SNR.
Let's see how this works:
Eb/N0 = A^2/N0 = dmin^2/(4 N0)
The constellation [A, -A] is in this case [1, -1], so
10 log10(A^2/N0) = 10 log10(1/N0) = EbN0_dB
EbN0_lin = 10^(0.1*EbN0_dB)
With that:
noise_var = 0.5/EbN0_lin; % sigma^2 = N0/2
and the signal will be something like this:
y = s + sqrt(noise_var)*randn(size(s));
So in your case, I would generate the signal as you do:
>> N = 100; %number of samples
>> s = 2*round(rand(N,1))-1; %bpsk modulation
Then prepare an SNR vector that varies from 0 to 40 dB:
>> SNR_DB = 0:1:40;
After that, calculate all the possible signals:
>> y = zeros(100,length(SNR_DB));
>> for i = 1:41
y(:,i) = awgn(s,SNR_DB(i),'measured');
end
At this point the best way to inspect the signal is a constellation plot, like this:
>> scatterplot(y(:,1));
>> scatterplot(y(:,41));
You can see a bad signal at 0 dB, where the noise power equals the signal power, and a very good signal at 40 dB, where the signal power is far greater than the noise power. In dB, the SNR is the signal power minus the noise power, so 0 dB means the noise power equals the signal power, and 40 dB means the signal power is 10,000 times the noise power.
Then, for your plot, calculate the MSE; MATLAB has a function for this:
err = immse(X,Y) calculates the mean-squared error (MSE) between the
arrays X and Y. X and Y can be arrays of any dimension, but must be of
the same size and class.
So with this:
>> for i = 1:41
err(i) = immse(s,y(:,i));
end
>> stem(SNR_DB,err)
For the plots, since we are working in dB, it would be better to use logarithmic axes.
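The relation between SNR and the resulting MSE can be checked with a short simulation. The sketch below (Python, standard library only; bpsk_mse is an illustrative name) follows the awgn 'measured' convention, where SNR is the ratio of signal power to noise power. For ±1 symbols the signal power is 1, so the noise variance is 10^(-SNR/10), and the MSE between s and y should come out close to that value. Under the N0/2 convention of the formula noise_var = 0.5/EbN0_lin above, this SNR corresponds to 2·Eb/N0, i.e. 3 dB above Eb/N0, for real BPSK.

```python
import random

def bpsk_mse(snr_db, n=20000, seed=0):
    """Return (simulated MSE between s and y, noise variance) for unit-power BPSK."""
    rng = random.Random(seed)
    s = [1.0 if rng.random() > 0.5 else -1.0 for _ in range(n)]
    noise_var = 1.0 / (10 ** (snr_db / 10))    # signal power is 1 for +/-1 symbols
    sigma = noise_var ** 0.5
    y = [sk + rng.gauss(0, sigma) for sk in s]
    mse = sum((sk - yk) ** 2 for sk, yk in zip(s, y)) / n
    return mse, noise_var

if __name__ == "__main__":
    for snr_db in range(0, 41, 10):
        mse, noise_var = bpsk_mse(snr_db)
        print(snr_db, mse, noise_var)
```

On a logarithmic vertical axis the MSE falls along a straight line, one decade per 10 dB of SNR.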

How to generate noise using specific variance

In the MATLAB function awgn(), which is used to add noise to a signal, is there a way to specify the variance?
In general, I would simply have written noisevec = sqrt(2)*randn(length(X),1);, which creates a noise vector of variance 2. Then the noisy observations are
Y = X+noisevec
But I would like to apply awgn() and then check whether the variance of the noise is indeed as specified by the user. How can I do that?
% add noise to produce
% an SNR of 10dB, use:
X = sin(0:pi/8:6*pi);
Y = awgn(X,10,'measured');
UPDATE : Based on the solution, the output should be same when generating noise with specific variance using the awgn() given in the answer/ solution provided and when using without awgn(). Is something wrong in my understanding? Here is how I checked.
x = rand(1,10); % generating source input
snr =10;
variance = 0.1;
%This procedure is based on the answer
y1 = awgn(x, snr, 'measured');
y1 = x + (y1 - x) * sqrt(variance / var(y1 - x));
%This is the traditional way, without using awgn()
y2 = x+sqrt(variance)*randn(1,10);
y1 is not equal to y2. I wonder why?
awgn does not generate a noise with a specific variance. But if you have to generate a noise with a specific variance, you may consider defining your own noise generator which could be simply scaling the noise up or down to the desired level:
function y = AddMyNoise(x, variance)
y = awgn(x, 10, 'measured');
y = x + (y - x) * sqrt(variance / var(y - x));
end
UPDATE: Note that this method of forcing the output to have a specific variance could be dangerous: It will give strange outputs if x has few elements. In the limit of x being a scalar, this approach will add a fixed value of +-sqrt(variance) to x. No white noise anymore. But if you have more than a few data points, you will get a reasonably white noise.
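The difference between the two approaches in the question can be illustrated with a small sketch (Python, standard library only; noise_with_variance is a made-up name). Scaling randn-style noise by sqrt(variance) sets the variance only in expectation, whereas rescaling by sqrt(variance / var(noise)) forces the sample variance of that particular realization to the target. In both cases the noise vectors come from independent random draws, which is why y1 and y2 are never equal sample by sample.

```python
import random
import statistics

def noise_with_variance(n, variance, rng, exact=False):
    """Gaussian noise with the given variance.

    exact=False: variance is correct in expectation only.
    exact=True : rescaled so the sample variance (N-1 normalization, like
                 MATLAB's default var) equals the target for this realization.
    """
    noise = [rng.gauss(0, variance ** 0.5) for _ in range(n)]
    if exact:
        scale = (variance / statistics.variance(noise)) ** 0.5
        noise = [w * scale for w in noise]
    return noise

if __name__ == "__main__":
    rng = random.Random(0)
    plain = noise_with_variance(10, 0.1, rng)
    forced = noise_with_variance(10, 0.1, rng, exact=True)
    print("plain sample variance :", statistics.variance(plain))
    print("forced sample variance:", statistics.variance(forced))
```

With only 10 samples the sample variance of the plain draw can be far from 0.1; it tightens around the target as the number of samples grows.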

Echo State Network learning Mackey-Glass function, but how?

I got this example of a minimal Echo State Network (ESN) which I analyse while trying to understand Echo State Networks. Unfortunately I have some problems understanding why this really works. It all breaks down to the questions:
[ What defines | What is] the echo state of an ESN?
What is it that makes an ESN learn such complex nonlinear functions as the Mackey-Glass function so easily and quickly?
First here is a little piece of code that shows the important part of initialization:
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Generate the ESN reservoir
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
rand('seed', 42);
trainLen = 2000;
testLen = 2000;
initLen = 100;
data = load('MackeyGlass_t17.txt');
% Input neurons
inSize = 1;
% Output neurons
outSize = 1;
% Reservoir size
resSize = 1000;
% Leaking rate
a = 0.3;
% Input weights
Win = ( rand(resSize, (inSize+1) ) - 0.5) .* 1;
% Reservoir weights
W = rand(resSize, resSize) - 0.5;
Running the reservoir:
I understand that every single data point of the input data set is propagated from the input neuron to the reservoir neurons. After a warm-up of length initLen the states are accepted and stored in the matrix X. When this is done, every single column of X represents a "vector of reservoir neuron activations". And here comes the point where I am not sure whether I got it right:
The comment already says "collected states" or "design matrix" X. Am I right that all this does is store the state of the whole network in the columns of matrix X?
If we assume that t is just a time parameter, then X(:,t) represents the network state at time t, doesn't it?
In my example this would mean that there are 1,900 time slices, each representing the whole network state at its corresponding time step (X is therefore a 1002x1900 matrix). Another question that occurs to me here is:
why are a 1 (I guess it is the bias) and the input value u prepended to this vector: X(:,t-initLen) = [1;u;x];
So:
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Run the reservoir with the data and collect X.
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Allocated memory for the design (collected states) matrix
X = zeros((1+inSize) + resSize, trainLen - initLen);
% Vector of reservoir neuron activations (used for calculation)
x = zeros(resSize, 1);
% Update of the reservoir neuron activations
xUpd = zeros(resSize, 1);
for t = 1:trainLen
u = data(t);
xUpd = tanh( Win * [1;u] + W * x );
x = (1-a) * x + a * xUpd;
if ( t > initLen )
X(:,t-initLen) = [1;u;x];
end
end
Training part:
The training part is also still a bit of magic to me. I am familiar with how linear regression works, so that is not the problem here.
What I see is that this part just takes the whole state matrix X and performs a single linear regression step on the input data to generate the output weight vector Wout, and that's it.
So all that has been done so far, if I'm not mistaken, is initializing the output weights according to the state matrix X, which itself was generated using the input data and randomly generated (input and reservoir) weights.
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Train the output
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Set the corresponding target matrix directly
Yt = data(initLen+2:trainLen+1)';
% Regularization coefficient
reg = 1e-8;
% Get X transposed - needed twice therefore it is a little faster
X_T = X';
% Yt * pseudo_inverse(X); (linear regression task)
Wout = Yt * X_T * (X * X_T + reg * eye(1+inSize+resSize))^(-1);
Running the ESN in a generative mode:
I can run this in two modes: generative or predictive. But this is the part where I can only say "Well... it works", without knowing exactly why.
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Run the trained ESN in a generative mode. no need to initialize here,
% because x is initialized with training data and we continue from there.
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Y = zeros(outSize,testLen);
u = data(trainLen+1);
for t = 1:testLen
xUpd = tanh( Win*[1;u] + W*x );
x = (1-a)*x + a*xUpd;
% Generative mode:
u = Wout*[1;u;x];
% This would be a predictive mode:
%u = data(trainLen+t+1);
Y(:,t) = u;
end
It works pretty well as you can see (generative mode):
I know this is quite a huge "question", if it can even be considered one. I feel like I understand the individual parts, but what I'm missing is the big picture of this magic black box called an Echo State Network.
The echo state network (ESN) is basically a clever way to train a Recurrent Neural Network.
The ESN has a "reservoir" of hidden units which are coupled.
The inputs are connected to the reservoir through input-to-hidden connections (plus a bias). These connections are not trained. They are randomly initialized, and this is the code snippet that does this initialization (I am using Python).
Win = (random.rand(resSize,1+inSize)-0.5) * 1
The units in the reservoir are coupled, meaning basically that there exist hidden to hidden connections. Again the weights in the reservoir are not trained but initialized. However, initialization of the reservoir weights is tricky. Those weights (depicted by W in the code) are first randomly initialized and then they are multiplied by a factor which takes into account the spectral radius of the random matrix. Careful initialization of these connections is very important because it affects the dynamics of the ESN (do not forget it is a recurrent network). I guess if you want to know more details about this you have to be able to understand linear system theory.
Now, after initializing properly the two weight matrices you start presenting inputs to the reservoir. For each input presented to the reservoir the activations are calculated and these activations are the state of the ESN. Look at the figure below.
This figure shows a plot of 200 activations for 20 inputs.
So, after presenting all inputs to the ESN the states are collected into a matrix X. This is the code snippet that does this in python:
x = zeros((resSize,1))
for t in range(trainLen):
    u = data[t]
    x = (1-a)*x + a*tanh( dot( Win, vstack((1,u)) ) + dot( W, x ) )
    if t >= initLen:
        X[:,t-initLen] = vstack((1,u,x))[:,0]
The state of the ESN is therefore a function of the finite history of the inputs presented to the network.
Now, in order to predict the output from the states of the oscillators the only thing that has to be learned is how to couple the outputs to the oscillators, i.e. the hidden to output connections:
# train the output
reg = 1e-8 # regularization coefficient
X_T = X.T
Wout = dot( dot(Yt,X_T), linalg.inv( dot(X,X_T) + \
reg*eye(1+inSize+resSize) ) )
Then after the network has been trained the predictive capability is tested using the test sample of the data.
The generative mode means that you start with a particular value of the time series and then you use that value to predict the next value in the time series but then you use the predicted value to predict the next value and so on. In effect you are generating the time series, hence generative mode. It allows you to predict multiple steps into the future, as opposed to predictive mode where you get one value from the time series and predict the next one.
And this is why the ESN seems to be doing a pretty good job. The target signal is pretty complex and yet in generative mode it does very well.
Finally, as far as "minimal implementation" goes, I guess it refers to the size of the reservoir (1000), which apparently is pretty small.
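The "echo state" idea, that the reservoir state ends up depending on the input history rather than on where the reservoir started, can be demonstrated with a small experiment. The sketch below (Python, standard library only) drives two copies of the same reservoir from different initial states with identical input and measures how far apart the states are. It rests on several assumptions that differ from the code above: a tiny reservoir of 20 units, a sine wave standing in for the Mackey-Glass data, and reservoir weights scaled by their maximum absolute row sum (a simple sufficient condition for contraction) instead of by the spectral radius. All function names are illustrative.

```python
import math
import random

def make_reservoir(res_size, max_row_sum, rng):
    """Random input weights [bias, input] and reservoir weights scaled by row sum."""
    w_in = [[rng.random() - 0.5 for _ in range(2)] for _ in range(res_size)]
    w = [[rng.random() - 0.5 for _ in range(res_size)] for _ in range(res_size)]
    # Scale so that every row has absolute sum <= max_row_sum (infinity norm).
    norm = max(sum(abs(v) for v in row) for row in w)
    w = [[v * max_row_sum / norm for v in row] for row in w]
    return w_in, w

def step(x, u, w_in, w, leak):
    """Leaky-integrator update: x <- (1-a)*x + a*tanh(Win*[1;u] + W*x)."""
    new_x = []
    for i in range(len(x)):
        pre = w_in[i][0] + w_in[i][1] * u + sum(wij * xj for wij, xj in zip(w[i], x))
        new_x.append((1 - leak) * x[i] + leak * math.tanh(pre))
    return new_x

def state_distance_after(n_steps, res_size=20, leak=0.3, seed=42):
    """Largest per-unit gap between two reservoirs fed the same input."""
    rng = random.Random(seed)
    w_in, w = make_reservoir(res_size, 0.9, rng)
    xa = [rng.uniform(-1, 1) for _ in range(res_size)]   # two different
    xb = [rng.uniform(-1, 1) for _ in range(res_size)]   # initial states
    for t in range(n_steps):
        u = math.sin(0.1 * t)                            # stand-in input signal
        xa = step(xa, u, w_in, w, leak)
        xb = step(xb, u, w_in, w, leak)
    return max(abs(a - b) for a, b in zip(xa, xb))

if __name__ == "__main__":
    for n in (0, 10, 100, 1000):
        print(n, state_distance_after(n))
```

The gap shrinks toward zero as more input is presented, which is also why the first initLen states are discarded as warm-up before the states are collected into X.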