What is theory behind AWGN channel command matlab - matlab

Kindly tell me the theory behind the two commands (marked with ***) of an AWGN channel in the below shown code.
Code:
N_all = [10^3*ones(1,6) 10^3*ones(1,5)];
Eb_no = [0:2:20];
for ii=1:length(Eb_no)
N = N_all(ii);
b = (1/sqrt(2))*rand(1,N)>0.5;
ip = qpsk_new(b);
s = ip;
*** noise = 1/sqrt(2) * [randn(1,N/2)+j*randn(1,N/2)];
*** y = s+10^(-Eb_no(ii)/20)*noise;
end

The randn function in the first marked line generates complex, Gaussian-distributed (1), independent (2) samples with zero mean and unit variance. The second marked line scales those samples according to the specified signal-to-noise ratio (EB/N0) and adds (3) them to the signal.
These operations stem from the definition of AWGN:
The "G" in "AWGN" means "Gaussian".
The "W" means "white". The term "white" applied to a stochastic process means that the samples are statistically independent (or uncorrelated; but in the Gaussian case they are equivalent conditions).
The "A" is "additive", so you add the noise to the signal.

Related

Small bug in MATLAB R2017B LogLikelihood after fitnlm?

Background: I am working on a problem similar to the nonlinear logistic regression described in the link [1] (my problem is more complicated, but link [1] is enough for the next sections of this post). Comparing my results with those obtained in parallel with a R package, I got similar results for the coefficients, but (very approximately) an opposite logLikelihood.
Hypothesis: The logLikelihood given by fitnlm in Matlab is in fact the negative LogLikelihood. (Note that this impairs consequently the BIC and AIC computation by Matlab)
Reasonning: in [1], the same problem is solved through two different approaches. ML-approach/ By defining the negative LogLikelihood and making an optimization with fminsearch. GLS-approach/ By using fitnlm.
The negative LogLikelihood after the ML-approach is:380
The negative LogLikelihood after the GLS-approach is:-406
I imagine the second one should be at least multiplied by (-1)?
Questions: Did I miss something? Is the (-1) coefficient enough, or would this simple correction not be enough?
Self-contained code:
%copy-pasting code from [1]
myf = #(beta,x) beta(1)*x./(beta(2) + x);
mymodelfun = #(beta,x) 1./(1 + exp(-myf(beta,x)));
rng(300,'twister');
x = linspace(-1,1,200)';
beta = [10;2];
beta0=[3;3];
mu = mymodelfun(beta,x);
n = 50;
z = binornd(n,mu);
y = z./n;
%ML Approach
mynegloglik = #(beta) -sum(log(binopdf(z,n,mymodelfun(beta,x))));
opts = optimset('fminsearch');
opts.MaxFunEvals = Inf;
opts.MaxIter = 10000;
betaHatML = fminsearch(mynegloglik,beta0,opts)
neglogLH_MLApproach = mynegloglik(betaHatML);
%GLS Approach
wfun = #(xx) n./(xx.*(1-xx));
nlm = fitnlm(x,y,mymodelfun,beta0,'Weights',wfun)
neglogLH_GLSApproach = - nlm.LogLikelihood;
Source:
[1] https://uk.mathworks.com/help/stats/examples/nonlinear-logistic-regression.html
This answer (now) only details which code is used. Please see Tom Lane's answer below for a substantive answer.
Basically, fitnlm.m is a call to NonLinearModel.fit.
When opening NonLinearModel.m, one gets in line 1209:
model.LogLikelihood = getlogLikelihood(model);
getlogLikelihood is itself described between lines 1234-1251.
For instance:
function L = getlogLikelihood(model)
(...)
L = -(model.DFE + model.NumObservations*log(2*pi) + (...) )/2;
(...)
Please also not that this notably impacts ModelCriterion.AIC and ModelCriterion.BIC, as they are computed using model.LogLikelihood ("thinking" it is the logLikelihood).
To get the corresponding formula for BIC/AIC/..., type:
edit classreg.regr.modelutils.modelcriterion
this is Tom from MathWorks. Take another look at the formula quoted:
L = -(model.DFE + model.NumObservations*log(2*pi) + (...) )/2;
Remember the normal distribution has a factor (1/sqrt(2*pi)), so taking logs of that gives us -log(2*pi)/2. So the minus sign comes from that and it is part of the log likelihood. The property value is not the negative log likelihood.
One reason for the difference in the two log likelihood values is that the "ML approach" value is computing something based on the discrete probabilities from the binomial distribution. Those are all between 0 and 1, and they add up to 1. The "GLS approach" is computing something based on the probability density of the continuous normal distribution. In this example, the standard deviation of the residuals is about 0.0462. That leads to density values that are much higher than 1 at the peak. So the two things are not really comparable. You would need to convert the normal values to probabilities on the same discrete intervals that correspond to individual outcomes from the binomial distribution.

Simulation of Markov chains

I have the following Markov chain:
This chain shows the states of the Spaceship, which is in the asteroid belt: S1 - is serviceable, S2 - is broken. 0.12 - the probability of destroying the Spaceship by a collision with an asteroid. 0.88 - the probability of that a collision will not be critical. Need to find the probability of a serviceable condition of the ship after the third collision.
Analytical solution showed the response - 0.681. But it is necessary to solve this problem by simulation method using any modeling tool (MATLAB Simulink, AnyLogic, Scilab, etc.).
Do you know what components should be used to simulate this process in Simulink or any other simulation environment? Any examples or links.
First, we know the three step probability transition matrix contains the answer (0.6815).
% MATLAB R2019a
P = [0.88 0.12;
0 1];
P3 = P*P*P
P(1,1) % 0.6815
Approach 1: Requires Econometrics Toolbox
This approach uses the dtmc() and simulate() functions.
First, create the Discrete Time Markov Chain (DTMC) with the probability transition matrix, P, and using dtmc().
mc = dtmc(P); % Create the DTMC
numSteps = 3; % Number of collisions
You can get one sample path easily using simulate(). Pay attention to how you specify the initial conditions.
% One Sample Path
rng(8675309) % for reproducibility
X = simulate(mc,numSteps,'X0',[1 0])
% Multiple Sample Paths
numSamplePaths = 3;
X = simulate(mc,numSteps,'X0',[numSamplePaths 0]) % returns a 4 x 3 matrix
The first row is the X0 row for the starting state (initial condition) of the DTMC. The second row is the state after 1 transition (X1). Thus, the fourth row is the state after 3 transitions (collisions).
% 50000 Sample Paths
rng(8675309) % for reproducibility
k = 50000;
X = simulate(mc,numSteps,'X0',[k 0]); % returns a 4 x 50000 matrix
prob_survive_3collisions = sum(X(end,:)==1)/k % 0.6800
We can bootstrap a 95% Confidence Interval on the mean probability to survive 3 collisions to get 0.6814 ± 0.00069221, or rather, [0.6807 0.6821], which contains the result.
numTrials = 40;
ProbSurvive_3collisions = zeros(numTrials,1);
for trial = 1:numTrials
Xtrial = simulate(mc,numSteps,'X0',[k 0]);
ProbSurvive_3collisions(trial) = sum(Xtrial(end,:)==1)/k;
end
% Mean +/- Halfwidth
alpha = 0.05;
mean_prob_survive_3collisions = mean(ProbSurvive_3collisions)
hw = tinv(1-(0.5*alpha), numTrials-1)*(std(ProbSurvive_3collisions)/sqrt(numTrials))
ci95 = [mean_prob_survive_3collisions-hw mean_prob_survive_3collisions+hw]
maxNumCollisions = 10;
numSamplePaths = 50000;
ProbSurvive = zeros(maxNumCollisions,1);
for numCollisions = 1:maxNumCollisions
Xc = simulate(mc,numCollisions,'X0',[numSamplePaths 0]);
ProbSurvive(numCollisions) = sum(Xc(end,:)==1)/numSamplePaths;
end
For a more complex system you'll want to use Stateflow or SimEvents, but for this simple example all you need is a single Unit Delay block (output = 0 => S1, output = 1 => S2), with a Switch block, a Random block, and some comparison blocks to construct the logic determining the next value of the state.
Presumably you must execute the simulation a (very) large number of times and average the results to get a statistically significant output.
You'll need to change the "seed" of the random generator each time you run the simulation.
This can be done by setting the seed to be "now" (or something similar to that).
Alternatively you could quite easily vectorize the model so that you only need to execute it once.
If you want to simulate this, it is fairly easy in matlab:
servicable = 1;
t = 0;
while servicable =1
t = t+1;
servicable = rand()<=0.88
end
Now t represents the amount of steps before the ship is broken.
Wrap this in a for loop and you can do as many simulations as you like.
Note that this can actually give you the distribution, if you want to know it after 3 times, simply add && t<3 to the while condition.

Scale Factor in Matlabs `conv()`

I have the following code which is used to deconvolve a signal. It works very well, within my error limit...as long as I divide my final result by a very large factor (11000).
width = 83.66;
x = linspace(-400,400,1000);
a2 = 1.205e+004 ;
al = 1.778e+005 ;
b1 = 94.88 ;
c1 = 224.3 ;
d = 4.077 ;
measured = al*exp(-((abs((x-b1)./c1).^d)))+a2;
rect = #(x) 0.5*(sign(x+0.5) - sign(x-0.5));
rt = rect(x/83.66);
signal = conv(rt,measured,'same');
check = (1/11000)*conv(signal,rt,'same');
Here is what I have. measured represents the signal I was given. Signal is what I am trying to find. And check is to verify that if I convolve my slit with the signal I found, I get the same result. If you use what I have exactly, you will see that the check and measured are off by that factor of 11000~ish that I threw up there.
Does anyone have any suggestions. My thoughts are that the slit height is not exactly 1 or that convolve will not actually effectively deconvolve, as I request it to. (The use of deconv only gives me 1 point, so I used convolve instead).
I think you misunderstand what conv (and probably also therefore deconv) is doing.
A discrete convolution is simply a sum. In fact, you can expand it as a sum, using a couple of explicit loops, sums of products of the measured and rt vectors.
Note that sum(rt) is not 1. Were rt scaled to sum to 1, then conv would preserve the scaling of your original vector. So, note how the scalings pass through here.
sum(rt)
ans =
104
sum(measured)
ans =
1.0231e+08
signal = conv(rt,measured);
sum(signal)
ans =
1.0640e+10
sum(signal)/sum(rt)
ans =
1.0231e+08
See that this next version does preserve the scaling of your vector:
signal = conv(rt/sum(rt),measured);
sum(signal)
ans =
1.0231e+08
Now, as it turns out, you are using the same option for conv. This introduces an edge effect, since it truncates some of the signal so it ends up losing just a bit.
signal = conv(rt/sum(rt),measured,'same');
sum(signal)
ans =
1.0187e+08
The idea is that conv will preserve the scaling of your signal as long as the kernel is scaled to sum to 1, AND there are no losses due to truncation of the edges. Of course convolution as an integral also has a similar property.
By the way, where did that quoted factor of roughly 11000 come from?
sum(rt)^2
ans =
10816
Might be coincidence. Or not. Think about it.

Frequency array feeds FFT

The final goal I am trying to achieve is the generation of a ten minutes time series: to achieve this I have to perform an FFT operation, and it's the point I have been stumbling upon.
Generally the aimed time series will be assigned as the sum of two terms: a steady component U(t) and a fluctuating component u'(t). That is
u(t) = U(t) + u'(t);
So generally, my code follows this procedure:
1) Given data
time = 600 [s];
Nfft = 4096;
L = 340.2 [m];
U = 10 [m/s];
df = 1/600 = 0.00167 Hz;
fn = Nfft/(2*time) = 3.4133 Hz;
This means that my frequency array should be laid out as follows:
f = (-fn+df):df:fn;
But, instead of using the whole f array, I am only making use of the positive half:
fpos = df:fn = 0.00167:3.4133 Hz;
2) Spectrum Definition
I define a certain spectrum shape, applying the following relationship
Su = (6*L*U)./((1 + 6.*fpos.*(L/U)).^(5/3));
3) Random phase generation
I, then, have to generate a set of complex samples with a determined distribution: in my case, the random phase will approach a standard Gaussian distribution (mu = 0, sigma = 1).
In MATLAB I call
nn = complex(normrnd(0,1,Nfft/2),normrnd(0,1,Nfft/2));
4) Apply random phase
To apply the random phase, I just do this
Hu = Su*nn;
At this point start my pains!
So far, I only generated Nfft/2 = 2048 complex samples accounting for the fpos content. Therefore, the content accounting for the negative half of f is still missing. To overcome this issue, I was thinking to merge the real and imaginary part of Hu, in order to get a signal Huu with Nfft = 4096 samples and with all real values.
But, by using this merging process, the 0-th frequency order would not be represented, since the imaginary part of Hu is defined for fpos.
Thus, how to account for the 0-th order by keeping a procedure as the one I have been proposing so far?

Finding the difference between two signals

I have two signals, let's call them 'a' and 'b'. They are both nearly identical signals (recorded from the same input and contain the same information) however, because I recorded them at two different 'b' is time shifted by an unknown amount. Obviously, there is random noise in each.
Currently, I am using cross correlation to compute the time shift, however, I am still getting improper results.
Here is the code I am using to calculate the time shift:
function [ diff ] = FindDiff( signal1, signal2 )
%FINDDIFF Finds the difference between two signals of equal frequency
%after an appropritate time shift is applied
% Calculates the time shift between two signals of equal frequency
% using cross correlation, shifts the second signal and subtracts the
% shifted signal from the first signal. This difference is returned.
length = size(signal1);
if (length ~= size(signal2))
error('Vectors must be equal size');
end
t = 1:length;
tx = (-length+1):length;
x = xcorr(signal1,signal2);
[mx,ix] = max(x);
lag = abs(tx(ix));
shifted_signal2 = timeshift(signal2,lag);
diff = signal1 - shifted_signal2;
end
function [ shifted ] = timeshift( input_signal, shift_amount )
input_size = size(input_signal);
shifted = (1:input_size)';
for i = 1:input_size
if i <= shift_amount
shifted(i) = 0;
else
shifted(i) = input_signal(i-shift_amount);
end
end
end
plot(FindDiff(a,b));
However the result from the function is a period wave, rather than random noise, so the lag must still be off. I would post an image of the plot, but imgur is currently not cooperating.
Is there a more accurate way to calculate lag other than cross correlation, or is there a way to improve the results from cross correlation?
Cross-correlation is usually the simplest way to determine the time lag between two signals. The position of peak value indicates the time offset at which the two signals are the most similar.
%// Normalize signals to zero mean and unit variance
s1 = (signal1 - mean(signal1)) / std(signal1);
s2 = (signal2 - mean(signal2)) / std(signal2);
%// Compute time lag between signals
c = xcorr(s1, s2); %// Cross correlation
lag = mod(find(c == max(c)), length(s2)) %// Find the position of the peak
Note that the two signals have to be normalized first to the same energy level, so that the results are not biased.
By the way, don't use diff as a name for a variable. There's already a built-in function in MATLAB with the same name.
Now there are two functions in Matlab:
one called finddelay
and another called alignsignals that can do what you want, I believe.
corr finds a dot product between vectors (v1, v2). If it works bad with your signal, I'd try to minimize a sum of squares of differences (i.e. abs(v1 - v2)).
signal = sin(1:100);
signal1 = [zeros(1, 10) signal];
signal2 = [signal zeros(1, 10)];
for i = 1:length(signal1)
signal1shifted = [signal1 zeros(1, i)];
signal2shifted = [zeros(1, i) signal2];
d2(i) = sum((signal1shifted - signal2shifted).^2);
end
[fval lag2] = min(d2);
lag2
It is computationally worse than cross-calculation which can be speeded up by using FFT. As far as I know you can't do this with euclidean distance.
UPD. Deleted wrong idea about cross-correlation with periodic signals
You can try matched filtering in frequency domain
function [corr_output] = pc_corr_processor (target_signal, ref_signal)
L = length(ref_signal);
N = length(target_signal);
matched_filter = flipud(ref_signal')';
matched_filter_Res = fft(matched_filter,N);
corr_fft = matched_filter_Res.*fft(target_signal);
corr_out = abs(ifft(corr_fft));
The peak of the matched filter maximum-index of corr_out above should give you the lag amount.