How to generate data with a specific trend in MATLAB

I want to test the Akaike criterion (a criterion that indicates where a significant change occurs in a time series), but to do that I need to generate data that follow, for example, a sinusoidal trend, a linear trend with a positive or negative slope, a constant trend, etc. So far I have only done this with random numbers:
%Implementation of the Akaike method for Earth sciences.
N=100;
data=zeros(N,1);
for i=1:N
data(i,1)=unifrnd(1,N);
end
%AIC=zeros(N-1,1);
data=rand(1,N);
for k=1:N
%y=datasample(data,k);
AIC(k,1)=k*log(var(data(1:k),1))+(N-k-1)*log(var(data(k+1:N),1));
end
AIC(1)=NaN;
%AIC(N-1)=[];AIC(N)=[];
%disp(AIC)
%plot(AIC)
subplot(2,1,1)
plot(data,'Marker','.')
subplot(2,1,2)
plot(AIC,'Marker','.')
So, how can I generate data with different trends in MATLAB?
Thanks a lot in advance.

What you can do is start off with a known curve, then add some noise or random values so that the signal still follows the trend but is noisy. Given a set of independent values, use these to generate values for your sinusoidal curve, a line with a positive or negative slope, and a constant value.
Something like this comes to mind:
X = 1 : N; % N is defined in your code
Y1 = sin(X) + rand(1, N); % Sinusoidal
slope1 = 2; intercept = 3;
Y2 = slope1*X + intercept + rand(1, N); % Line with a positive slope
slope2 = -1; intercept2 = 0.5;
Y3 = slope2*X + intercept2 + rand(1, N); % Line with a negative slope
B = 2;
Y4 = B*ones(1, N) + rand(1, N); % Constant line
rand is a MATLAB function that generates floating-point values uniformly distributed on [0,1]. Y1, Y2, Y3 and Y4 are the trends you want: each follows the curve you defined, plus a bit of random noise so that you don't get the underlying trend exactly. The noise controls how closely each series resembles the ideal curve; increase the magnitude of the random values to decrease the similarity.
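For instance, here is a sketch of that last point (noise_amp is just an illustrative name):
noise_amp = 5; % larger factor makes the series less similar to the clean trend
Y1_noisy = sin(X) + noise_amp*rand(1, N); % a much noisier sinusoidal trend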

Related

MATLAB: How to apply ifft correctly to bring a "filtered" signal back to the time domain?

I am trying to get the output of a Gaussian pulse going through a coax cable. I made a vector that represents a coax cable; I got attenuation and phase delay information online and used Euler's equation to create a complex array.
I FFTed my Gaussian vector and convolved it with my cable. The issue is, I can't figure out how to properly IFFT the convolution. I read about ifft in the MathWorks documentation and looked at other people's questions. Someone with a similar problem was advised in the answers to remove n = 2^nextpow2(L) and FFT over length(t) instead. I was able to get a more reasonable plot from that, and it made sense why that is the case. I am confused about whether or not I should be using the 'symmetric' option in ifft; it makes a big difference in my plots. The main reason I added it is that I was getting complex numbers in the IFFTed convolution (timeHF). I would truly appreciate some help, thanks!
clc, clear
Fs = 14E12; %1 sample per pico seconds
tlim = 4000E-12;
t = -tlim:1/Fs:tlim; %in pico seconds
ag = 0.5; %peak of Gaussian
bg = 0; %peak location
wg = 50E-12; %FWHM
x = ag.*exp(-4 .* log(2) .* (t-bg).^2 / (wg).^2); %Gauss. in terms of FWHM
Ly = x;
L = length(t);
%n = 2^nextpow2(L); %test output in time domain with and without as suggested online
fNum = fft(Ly,L);
frange = Fs/L*(0:(L/2)); %half of the spectrum
fNumMag = abs(fNum/L); %divide by n to normalize
% COAX modulation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
%phase data
mu = 4*pi*1E-7;
sigma_a = 2.9*1E7;
sigma_b = 5.8*1E6;
a = 0.42E-3;
b = 1.75E-3;
er = 1.508;
vf = 0.66;
c = 3E8;
l = 1;
Lso = sqrt(mu) /(4*pi^3/2) * (1/(sqrt(sigma_a)*a) + 1/(b*sqrt(sigma_b)));
Lo = mu/(2*pi) * log(b/a);
%to = l/(vf*c);
to = 12E-9; %measured
phase = -pi*to*(frange + 1/2 * Lso/Lo * sqrt(frange));
%attenuation Data
k1 = 0.34190;
k2 = 0.00377;
len = 1;
mldb = (k1 .* sqrt(frange) + k2 .* frange) ./ 100 .* len ./1E6;
mldb1 = mldb ./ 0.3048; %original equation is in inches
tfMag = 10.^(mldb1./-10);
% combine to make in complex form
tfC = [];
for ii = 1: L/2 + 1
tfC(ii) = tfMag(ii) * (cosd(phase(ii)) + 1j*sind(phase(ii)));
end
%END ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
%convolute both h and signal
fNum = fNum(1:L/2+1);
convHF = tfC.*fNum;
convHFMag = abs(convHF/L);
timeHF = ifft(convHF, length(t), 'symmetric'); %this is the part im confused about
% Ignore,
% tfC(numel(fNum)) = 0;
% convHF = tfC.*fNum;
% convHFMag = abs(convHF/n);
% timeHF = ifft(convHF);
%% plotting
% subplot(2, 2, 1);
% plot(t, Ly)
% title('Gaussian input');
% xlabel('time in seconds')
% ylabel('V')
% grid
subplot(2, 2, 1)
plot(frange, abs(tfC(1: L/2 + 1)));
set(gca, 'Xscale', 'log')
title('coax cable model')
xlabel('Hz')
ylabel('|H(s)|V/V')
grid
ylim([0 1.1])
subplot(2, 2, 2);
plot(frange, convHFMag(1:L/2+1), '.-', frange, fNumMag(1:L/2+1)) %make both range and function the same length
title('The input signal Vs its convolution with coax');
xlabel('Hz')
ylabel('V')
legend('Convolution','Lorentzian in frequency domain');
xlim([0, 5E10])
grid
subplot(2, 2, [3, 4]);
plot(t, Ly, t, timeHF)
% plot(t, real(timeHF(1:length(t)))) %make both range and function the same length
legend('Input', 'Output')
title('Signal at the output')
xlabel('time in seconds')
ylabel('V')
grid
It's important to understand the principles of the FFT deeply in order to use it correctly.
When you apply the Fourier transform to a real signal, the coefficients at negative frequencies are the conjugates of those at positive frequencies. When you apply the FFT to a real numerical signal, you can show mathematically that the conjugates of the coefficients that should sit at the negative frequencies (-f) now appear at (Fsampling - f), where Fsampling = 1/dt is the sampling frequency and dt the sampling period. This behaviour is called aliasing; it is always present when you apply the FFT to a discrete-time signal, and the sampling period must be chosen small enough for the two spectra not to overlap (the Shannon criterion).
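As a quick numerical check of that conjugate symmetry (a standalone sketch, independent of the code below):
xr = randn(1, 8); % any real signal
Xr = fft(xr);
max(abs(Xr(2:end) - conj(Xr(end:-1:2)))) % ~0, up to roundoff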
When you want to apply a frequency filter to a signal, we say that we keep the first half of the spectrum, because the high frequencies (> Fsampling/2) are due to aliasing and are not characteristic of the original signal. To do so, we set the second half of the spectrum to zero before multiplying by the filter. However, by doing so you also lose half of the amplitude of the original signal, which ifft alone will not recover. The 'symmetric' option recovers it by adding, at the high frequencies (> Fsampling/2), the conjugates of the coefficients at the lower ones (< Fsampling/2).
I simplified the code to explain briefly what's happening and implemented a hand-made symmetrisation for you (the x_r3 line below). Note that I reduced the sampling rate by a factor of 100 so that the spectrum displays correctly:
close all
clc, clear
Fs = 14E10; %sampling frequency, reduced by a factor of 100 from the original 14E12
tlim = 4000E-12;
t = -tlim:1/Fs:tlim; %in pico seconds
ag = 0.5; %peak of guassian
bg = 0; %peak location
wg = 50E-12; %FWHM
NT = length(t);
x_i = ag.*exp(-4 .* log(2) .* (t-bg).^2 / (wg).^2); %Gauss. in terms of FWHM
fftx_i = fft(x_i);
f = 1/(2*tlim)*(0:NT-1);
fftx_r = fftx_i;
fftx_r(floor(NT/2):end) = 0; % The removal of high frequencies due to aliasing leads to losing half the amplitude
% HERE YOU APPLY THE FILTER
x_r1 = ifft(fftx_r); % without symmetrisation (half the amplitude lost)
x_r2 = ifft(fftx_r, 'symmetric'); % with symmetrisation
x_r3 = ifft(fftx_r+[0, conj(fftx_r(end:-1:2))]); % hand-made symmetrisation
figure();
subplot(211)
hold on
plot(t, x_i, 'r')
plot(t, x_r2, 'r-+')
plot(t, x_r3, 'r-o')
plot(t, x_r1, 'k--')
hold off
legend('Initial', 'Matlab sym', 'Hand made sym', 'No sym')
title('Time signals')
xlabel('time in seconds')
ylabel('V')
grid
subplot(212)
hold on
plot(f, abs(fft(x_i)), 'r')
plot(f, abs(fft(x_r2)), 'r-+')
plot(f, abs(fft(x_r3)), 'r-o')
plot(f, abs(fft(x_r1)), 'k--')
hold off
legend('Initial', 'Matlab sym', 'Hand made sym', 'No sym')
title('Power spectra')
xlabel('frequency in hertz')
ylabel('V')
grid
The resulting figure shows the four time signals (top) and their power spectra (bottom).
Do not hesitate if you have further questions. Good luck!
---------- EDIT ----------
The amplitude of the discrete Fourier transform is not the same as that of the continuous one. If you want to display the signal in the frequency domain, you need to apply a normalization based on the convention you have chosen. A common convention is that the Fourier transform of a Dirac delta function has amplitude one everywhere.
A numerical Dirac delta has an amplitude of one at a single index and zeros elsewhere, and it leads to a power spectrum equal to one everywhere. However, in your case the time axis has a sampling period dt, so the integral over time of a numerical Dirac is not 1 but dt. You must therefore normalize your frequency-domain signal by multiplying it by a factor dt (= 1 picosecond in your case) to respect the convention. You can also note that this makes the frequency-domain signal homogeneous to [unit of the original signal multiplied by time], which is the correct unit of a Fourier transform.
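A minimal sketch of that normalization, applied to the simplified example above (X_phys is just an illustrative name):
dt = 1/Fs; % sampling period of the example above
X_phys = fft(x_i)*dt; % frequency-domain signal scaled to the continuous-transform convention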

MATLAB fft vs Mathematical fourier

I am using the following code to generate an FFT and a mathematical Fourier transform of a signal. I then want to mathematically recreate the original signal from the FFT. This works for the mathematical transform but not for the FFT, since it is a discrete transform. Does anyone know what change I can make to my inverse-transform equation so that it works for the FFT?
clear all; clc;
N = 1024;
N2 = 1023;
SNR = -10;
fs = 1024;
Ts = 1/fs;
t = (0:(N-1))*Ts;
x = 0.5*sawtooth(2*2*pi*t);
x1 = fft(x);
Magnitude1 = abs(x1);
Phase1 = angle(x1)*360/(2*pi);
for m = 1:1024
f(m) = m; % Sinusoidal frequencies
a = (2/N)*sum(x.*cos(2*pi*f(m)*t)); % Cosine coeff.
b = (2/N)*sum(x.*sin(2*pi*f(m)*t)); % Sine coeff
Magnitude(m) = sqrt(a^2 + b^2); % Magnitude spectrum
Phase(m) = -atan2(b,a); % Phase spectrum
end
subplot(2,1,1);
plot(f,Magnitude1./512); % Plot magnitude spectrum
......Labels and title.......
subplot(2,1,2);
plot(f,Magnitude,'k'); % Plot phase spectrum
ylabel('Phase (deg)','FontSize',14);
pause();
x2 = zeros(1,1024); % Waveform vector
for m = 1:24
f(m) = m; % Sinusoidal frequencies
x2 = (1/m)*(x2 + Magnitude1(m)*cos(2*pi*f(m)*t + Phase1(m)));
end
x3 = zeros(1,1024); % Waveform vector
for m = 1:24
f(m) = m; % Sinusoidal frequencies
x3 = (x3 + Magnitude(m)*cos(2*pi*f(m)*t + Phase(m)));
end
plot(t,x,'--k'); hold on;
plot(t,x2,'k');
plot(t,x3,'b');
There are a few comments to make about the Fourier transform, and I hope I can explain everything for you. Also, I don't know what you mean by "mathematical Fourier transform", as none of the expressions in your code resembles the Fourier series of the sawtooth wave.
To understand exactly what the fft function does, we can do things step by step.
First, following your code, we create and plot one period of the sawtooth wave.
n = 1024;
fs = 1024;
dt = 1/fs;
t = (0:(n-1))*dt;
x = 0.5*sawtooth(2*pi*t);
figure; plot(t,x); xlabel('t [s]'); ylabel('x');
We can now calculate a few things.
First, the Nyquist frequency, the maximum detectable frequency from the samples.
f_max = 0.5*fs
f_max =
512
Also, the minimum detectable frequency,
f_min = 1/t(end)
f_min =
1.000977517106549
Now calculate the discrete Fourier transform with MATLAB's fft function:
X = fft(x)/n;
This function obtains the complex coefficients of each term of the discrete Fourier transform. Notice that it computes the coefficients in complex-exponential notation, not in terms of sines and cosines. The division by n guarantees that the first coefficient equals the arithmetic mean of the samples.
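You can verify that claim directly:
abs(X(1) - mean(x)) % ~0, up to roundoff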
If you want to plot the magnitude/phase of the transformed signal, you can type:
f = linspace(f_min,f_max,n/2); % frequency vector
a0 = X(1); % constant amplitude
X(1)=[]; % we don't have to plot the first component, as it is the constant amplitude term
XP = X(1:n/2); % we keep only the first half of the array, as the second half is the complex-conjugate mirror of the first
figure
subplot(2,1,1)
plot(f,abs(XP)); ylabel('Amplitude');
subplot(2,1,2)
plot(f,angle(XP)); ylabel('Phase');
xlabel('Frequency [Hz]')
What does this plot mean? It shows the amplitude and phase of the complex coefficients of the terms in the Fourier series that represent the original signal (the sawtooth wave). You can use these coefficients to obtain the signal approximation in terms of a (truncated) Fourier series. Of course, to do that, we need the whole transform (not only the first half, which is what is usually plotted).
X = fft(x)/n;
amplitude = abs(X);
phase = angle(X);
f = fs*[(0:(n/2)-1)/n (-n/2:-1)/n]; % frequency vector with all components
% we calculate the value of x for each time step
for j=1:n
x_approx(j) = 0;
for k=1:n % summation done using a for
x_approx(j) = x_approx(j)+X(k)*exp(2*pi*1i/n*(j-1)*(k-1));
end
x_approx(j) = x_approx(j);
end
Notice: the code above is for clarification and is not intended to be well written. The summation can be done in MATLAB in a much better way than with a for loop (a vectorized sketch is given at the end of this answer), and some warnings will pop up advising the user to preallocate each variable for speed.
The above code calculates the x(ti) for each time ti, using the terms of the truncated Fourier series. If we plot both the original signal and the approximated one, we get:
figure
plot(t,x,t,x_approx)
legend('original signal','signal from fft','location','best')
The original signal and the approximated one are nearly equal. As a matter of fact,
norm(x-x_approx)
ans =
1.997566360514140e-12
which is almost zero, but not exactly zero.
Also, the plot above will issue a warning, due to the use of complex coefficients when calculating the approximated signal:
Warning: Imaginary parts of complex X and/or Y arguments ignored
But you can check that the imaginary term is very close to zero. It is not exactly zero due to roundoff errors in the computations.
norm(imag(x_approx))
ans =
1.402648396024229e-12
Notice in the code above how to interpret and use the results of the fft function, and how they are represented in complex-exponential form, not in terms of sines and cosines as you coded.
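As noted above, the double loop can be vectorized as a single matrix-vector product; here is a minimal sketch (names ending in _vec are just illustrative):
k = 0:n-1;
W = exp(2*pi*1i/n*(k.')*k); % inverse-DFT matrix; the 1/n factor is already in X
x_approx_vec = (W*X.').'; % same values as x_approx from the loops
norm(x_approx - x_approx_vec) % ~0, up to roundoff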

Interpolation using chebyshev points

Interpolate the Runge function of Example 10.6 at Chebyshev points for n from 10 to 170 in increments of 10. Calculate the maximum interpolation error on the uniform evaluation mesh x = -1:.001:1 and plot the error vs. polynomial degree as in Figure 10.8 using semilogy. Observe spectral accuracy.
The Runge function is given by: f(x) = 1 / (1 + 25x^2)
My code so far:
x = -1:0.001:1;
n = 170;
i = 10:10:170;
cx = cos(((2*i + 1)/(2*(n+1)))*pi); %chebyshev pts
y = 1 ./ (1 + 25*x.^2); %true fct
%chebyshev polynomial, don't know how to construct using matlab
yc = polyval(c, x); %graph of approx polynomial fct
plot(x, yc);
mErr = (1 / ((2.^n).*(n+1)!))*%n+1 derivative of f evaluated at max x in [-1,1], not sure how to do this
%plotting stuff
I know very little MATLAB, so I am struggling to create the interpolating polynomial. I did some searching, but I was confused by the existing functions, since I didn't find one that simply takes in the points and the polynomial to be interpolated. I am also a bit confused about whether I should be doing i = 0:1:n with n = 10:10:170, or whether n is fixed here. Any help is appreciated, thank you.
Since you know very little about MATLAB, I will try to explain everything step by step:
First, to visualize the Runge function, you can type:
f = @(x) 1./(1+25*x.^2); % Runge function
% plot Runge function over [-1,1];
x = -1:1e-3:1;
y = f(x);
figure;
plot(x,y); title('Runge function'); xlabel('x'); ylabel('y');
The @(x) part of the code is a function handle, a very useful feature of MATLAB. Notice the function is properly vectorized, so it can take either a scalar or an array as an argument. The plot function is straightforward.
To understand the Runge phenomenon, consider a linearly spaced vector over [-1,1] with 10 elements and use these points to obtain the interpolating (Lagrange) polynomial. You get the following:
% 10 linearly spaced points
xc = linspace(-1,1,10);
yc = f(xc);
p = polyfit(xc,yc,9); % gives the coefficients of the interpolating polynomial of degree 9
hold on; plot(xc,yc,'o',x,polyval(p,x));
The polyfit function does a polynomial curve fit: it obtains the coefficients of the interpolating polynomial, given the points x, y and the degree n of the polynomial. You can then easily evaluate the polynomial at other points with the polyval function.
Observe that, close to the ends of the domain, you get an oscillating polynomial and the interpolation is not a good approximation of the function. As a matter of fact, you can plot the absolute error, comparing the value of the function f(x) with the interpolating polynomial p(x):
plot(x,abs(y-polyval(p,x))); xlabel('x');ylabel('|f(x)-p(x)|');title('Error');
This error can be reduced if, instead of using a linearly spaced vector, you use other points for the interpolation. A good choice is the Chebyshev nodes, which should reduce the error. As a matter of fact, notice that:
% find 10 Chebyshev nodes and mark them on the plot
n = 10;
k = 1:10; % iterator
xc = cos((2*k-1)/2/n*pi); % Chebyshev nodes
yc = f(xc); % function evaluated at Chebyshev nodes
hold on;
plot(xc,yc,'o')
% find polynomial to interpolate data using the Chebyshev nodes
p = polyfit(xc,yc,n-1); % gives the coefficients of the interpolating polynomial of degree n-1 = 9
plot(x,polyval(p,x),'--'); % plot polynomial
legend('Runge function','Chebyshev nodes','interpolating polynomial','location','best')
Notice how the error is reduced close to the ends of the domain. You no longer get the highly oscillatory behaviour of the interpolating polynomial. If you plot the error, you will observe:
plot(x,abs(y-polyval(p,x))); xlabel('x');ylabel('|f(x)-p(x)|');title('Error');
If you now change the number of Chebyshev nodes, you will get an even better approximation. A small modification of the code lets you run it again for different numbers of nodes. You can store the maximum error and plot it as a function of the number of nodes:
n=1:20; % number of nodes
% pre-allocation for speed
e_ln = zeros(1,length(n)); % error for the linearly spaced interpolation
e_cn = zeros(1,length(n)); % error for the chebyshev nodes interpolation
for ii=1:length(n)
% linearly spaced vector
x_ln = linspace(-1,1,n(ii)); y_ln = f(x_ln);
p_ln = polyfit(x_ln,y_ln,n(ii)-1);
e_ln(ii) = max( abs( y-polyval(p_ln,x) ) );
% Chebyshev nodes
k = 1:n(ii); x_cn = cos((2*k-1)/2/n(ii)*pi); y_cn = f(x_cn);
p_cn = polyfit(x_cn,y_cn,n(ii)-1);
e_cn(ii) = max( abs( y-polyval(p_cn,x) ) );
end
figure
plot(n,e_ln,n,e_cn);
xlabel('no of points'); ylabel('maximum absolute error');
legend('linearly space','chebyshev nodes','location','best')
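If you want to follow the original assignment exactly (n = 10:10:170 and a semilogy plot), here is a minimal sketch built on the same f, x and y as above. Note that polyfit warns about ill-conditioning at such high degrees, so the curve may flatten or degrade; a Chebyshev-specific (e.g. barycentric) evaluation would be more robust:
nn = 10:10:170; % n values from the assignment, interpreted as the number of nodes
e_cheb = zeros(size(nn));
for ii = 1:numel(nn)
k = 1:nn(ii);
x_cn = cos((2*k-1)/(2*nn(ii))*pi); % Chebyshev nodes
p_cn = polyfit(x_cn, f(x_cn), nn(ii)-1);
e_cheb(ii) = max(abs(y - polyval(p_cn, x))); % max error on the uniform mesh
end
figure
semilogy(nn, e_cheb)
xlabel('polynomial degree'); ylabel('maximum absolute error');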

finding and plotting second derivative of a signal

clc
clear all
x= -2:0.1:4;
y= (3*x.^3)-(26*x)+10;
y1= diff(y)./diff(x);
y2= diff(y1)./diff(x);
plot(x,y,'R')
hold on
plot(x,y1,'G')
hold on
plot(x,y2,'B')
Is this the right way to find the second derivative?
It's not finding the second derivative, and it gives an error that matrix dimensions must agree in line 6.
You have values y that correspond to the values x. When calculating the approximation y1 to the first derivative of y, you use finite differences. These finite differences approximate the derivative best at the midpoints between the elements of x. This means your calculation of the approximation of the second derivative should use those midpoints, (x(1:end-1)+x(2:end))/2:
x1 = (x(1:end-1) + x(2:end))/2;
y2 = diff(y1) ./ diff(x1);
Note that these approximations y2 best approximate the second derivative at the positions x2 = (x1(1:end-1) + x1(2:end))/2.
All in all your code might look like this:
%// Clear command window, delete all variables, close all figures
clc;
clear all;
close all;
%// Define function
f = @(x) (3*x.^3)-(26*x)+10;
%// Define positions
x = -2:0.1:4;
%// Calculate f(x)
y= f(x);
%// Calculate approximation to first derivative and their positions
x1 = (x(1:end-1) + x(2:end))/2;
y1 = diff(y) ./ diff(x);
%// Calculate approximation to second derivative and their positions
x2 = (x1(1:end-1) + x1(2:end))/2;
y2 = diff(y1) ./ diff(x1);
%// Plot
plot(x, y, 'R',...
x1,y1,'G',...
x2,y2,'B');
legend('f','f''', 'f''''');
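As an aside, if you prefer derivative estimates at the original points x (so no midpoint bookkeeping is needed), MATLAB's gradient function, which uses central differences, is an option; a minimal sketch:
%// Approximate first and second derivatives at the original x values
y1g = gradient(y, x);
y2g = gradient(y1g, x);
plot(x, y, 'R', x, y1g, 'G', x, y2g, 'B');
legend('f', 'f''', 'f''''');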

Integration of normal probability distribution function with random numbers

function Y=normpdf(X)
syms X
Y = normpdf(X);
int(Y,X,1,inf)
end
I need to integrate the normal PDF from 1 to infinity for the case of N = 100, where N is the total number of values generated. I know I need to use randn() to generate the random numbers, but I don't know how to use it in this situation.
You could draw N = 100 random numbers with t = randn(N, 1). First sort them with t = sort(t); then the integrated PDF, i.e. the cumulative distribution function (CDF), is approximated by your samples as p = (1:N)/N at the points t, as you can see with plot(t, p). It will overlap well with hold on; plot(t, normcdf(t), 'r').
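Collected into a runnable sketch:
N = 100; % number of samples
t = sort(randn(N, 1)); % sorted standard normal samples
p = (1:N)'/N; % empirical CDF evaluated at the sorted samples
plot(t, p)
hold on
plot(t, normcdf(t), 'r') % exact normal CDF for comparison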
A perhaps more intuitive approach is to partition the x axis into bins in order to estimate the CDF:
N = 100; % number of samples
t = randn(N, 1); % random data
x = linspace(-10,10,200); % define bins
estim_cdf = mean(bsxfun(@le, t, x)); % estimate CDF
plot(x, estim_cdf);
hold on
plot(x, normcdf(x), 'r')
Note that @s.bandara's solution can be interpreted as the limiting case of this as the number of bins tends to infinity, and therefore it probably gives more accurate results.