Spectrum Derivative in MATLAB, the end point problem - matlab

I am trying to take the derivative of a spectrum with 125 bands using the following lines:
dW=diff(wavelength);
dR=diff(data);
df=dR./dW;
The problem comes in the next step: I want to compare the derivative with the original spectrum, both numerically and visually by plotting, but df has 124 elements while my original wavelength vector has 125. Do I have to remove the first or the last band? The output of some spectral analysis software keeps the original size. Taking the average of adjacent bands does not work either; it makes the graph behave erratically.

diff is basically:
Y = [X(2)-X(1) X(3)-X(2) ... X(m)-X(m-1)]
which means it has to be one shorter than your input (you can't subtract something from nothing, right?).
What you have to do of course depends on what you want to do, but the least "meaning-altering" approach (it keeps causality with respect to the sampling positions) would be to prepend a single arbitrary value to dW and dR.
By the way, your ratio df=dR./dW will contain Inf or NaN wherever dW is zero, which happens as soon as two consecutive wavelength values are the same. (Where two consecutive data values are the same, dR is zero and the derivative there is simply 0.)
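As an alternative to padding, here is a minimal sketch of two common ways to get a derivative that lines up with an axis. The data are hypothetical stand-ins (a uniformly spaced 125-band wavelength axis and a smooth curve), and the names dfMid and wMid are made up for illustration. gradient returns a vector of the same length as its input, while the diff result is most naturally plotted against the midpoints between bands.

```matlab
% Hypothetical stand-in data: 125 uniformly spaced bands and a smooth spectrum
wavelength = linspace(400, 2500, 125);
data = (wavelength / 1000).^2;

% Option 1: gradient keeps the input length. It uses central differences
% in the interior and one-sided differences at the two end points.
df = gradient(data, wavelength);

% Option 2: keep diff, but assign each value to the midpoint between bands
dfMid = diff(data) ./ diff(wavelength);
wMid  = (wavelength(1:end-1) + wavelength(2:end)) / 2;

% plot(wavelength, df, wMid, dfMid, 'o')   % uncomment to compare visually
```

With option 1, df can be compared element by element with the original 125-band spectrum; with option 2 nothing is invented at the end points, at the cost of a shifted axis.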

Related

finding power spectrum of signal using two approaches

I am trying to find the power spectrum of a signal. The signal has 100000 points and the sample frequency is 1000 Hz. I computed the power spectrum using two approaches. The first takes the whole signal as one piece and computes its power spectrum. The second divides the signal into a 100-by-1000 matrix, computes the spectrum of each row, and then takes the mean over all rows. My problem is that I should get the same answer with both approaches, but I get different answers. I do not know where the error in my code is.
N=100000;
SF=1000;
a=0.1;
b=0.3;
amplitude1=1;
amplitude2=0.5;
t=0:1/SF:100;
f1=SF*a;
f2=SF*b;
A=amplitude1*sin(2*pi*f1*t)+amplitude2*sin(2*pi*f2*t);
Y=2*randn(1,length(A))+A;
bin=[0 :N/2];
fax_Hz=(bin*SF)/N;
FFT=fft(Y);
spectra=2/(SF*length(Y))*(FFT.*conj(FFT));
plot(fax_Hz,spectra(1,1:50001));
D=reshape(Y(1,1:100000),[100,1000]);
M=length(D(1,:));
for i=1:100
FFT_1(i,:)=fft(D(i,:));
S(i,:)=(2/(SF*M))*(FFT_1(i,:).*conj(FFT_1(i,:)));
end
S_f=mean(S);
figure
plot (S_f);
I just updated the code. I do not know why, but when I added noise to the signal the two plots look shifted.
The main problem is with reshape: you are treating each row as a separate sequence, but reshape fills the first column completely before moving on to the second one, so the rows of D are not contiguous pieces of the signal.
You can use the following instead.
D=reshape(A(1,1:100000),[1000,100]).';
Normalization is another problem. You can either use ifft instead of fft, since ifft includes a 1/N factor by default, or keep your normalization and use sum instead of mean. There still seems to be a small discrepancy in the amplitudes; I am not sure where that is coming from.
At the end to plot use the following:
bin=[0 :N];
fax_Hz=(bin*SF)/N;
FFT=ifft(A);
spectra=FFT.*conj(FFT);
plot(fax_Hz,spectra); hold on
D=reshape(A(1,1:100000),[1000,100]).';
M=length(D(1,:));
for i=1:100
FFT_1(i,:)=ifft(D(i,:));
S(i,:)=FFT_1(i,:).*conj(FFT_1(i,:));
end
S_f=mean(S);
plot(fax_Hz(1:100:end-1), S_f);
Note: the fax_Hz(1:100:end-1) is a hacky way of getting the length of the vectors to be the same.
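For comparison, here is a minimal sketch of the two approaches with the same density normalization, 1/(SF*length), applied to both. The segment length L and the names K, Pfull, Pseg, fFull and fSeg are assumptions made for illustration. With this scaling the total power (the sum of the spectrum times the bin width) agrees between the two estimates. The peak heights of the two sinusoids are still not expected to match, because the peak of a pure tone in a density estimate grows with the length of the transformed record.

```matlab
SF = 1000;                 % sample frequency in Hz
N  = 100000;               % total number of samples
L  = 1000;                 % samples per segment (assumed)
K  = N / L;                % number of segments
t  = (0:N-1) / SF;         % exactly N time points
Y  = sin(2*pi*100*t) + 0.5*sin(2*pi*300*t) + 2*randn(1, N);

% reshape fills column by column, so each COLUMN is one contiguous segment
D = reshape(Y, L, K);

% Approach 1: periodogram of the whole record
Pfull = abs(fft(Y)).^2 / (SF * N);
fFull = (0:N-1) * SF / N;

% Approach 2: average the periodograms of the K segments
% (fft of a matrix works along columns)
Pseg = mean(abs(fft(D)).^2, 2).' / (SF * L);
fSeg = (0:L-1) * SF / L;

% Total power from both estimates; both equal mean(Y.^2) by Parseval
powFull = sum(Pfull) * SF / N;
powSeg  = sum(Pseg)  * SF / L;
```

Plotting Pfull against fFull and Pseg against fSeg puts both curves on a proper frequency axis, so no index stepping is needed to make the vector lengths agree.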

Time Series from spectrum

I am having a small problem while converting a spectrum to a time series. I have read many articles and I think I am applying the right procedure, but I do not get the right results. Could you help me find the error?
I have a time series like:
When I compute the spectrum I do:
%number of points
nPoints=length(timeSeries);
%time interval
dt=time(2)-time(1);
%Fast Fourier transform
p=abs(fft(timeSeries))./(nPoints/2);
%power of positive frequencies
spectrum=p(1:(nPoints/2)).^2;
%frequency
dfFFT=1/tDur;
frequency=(1:nPoints)*dfFFT;
frequency=frequency(1:(nPoints)/2);
%plot spectrum
semilogy(frequency,spectrum); grid on;
xlabel('Frequency [Hz]');
ylabel('Power Spectrum [N*m]^2/[Hz]');
title('SPD load signal');
And I obtain:
I think the spectrum is well computed. However now I need to go back and obtain a time series from this spectrum and I do:
df=frequency(2)-frequency(1);
ap = sqrt(2.*spectrum*df)';
%random number form -pi to pi
epsilon=-pi + 2*pi*rand(1,length(ap));
%transform to time series
randomSeries=length(time).*real(ifft(pad(ap.*exp(epsilon.*i.*2.*pi),length(time))));
%Add the mean value
randomSeries=randomSeries+mean(timeSeries);
However, the plot looks like:
It is one order of magnitude lower than the original series.
Any recommendation?
There are (at least) two things going on here. The first is that you are throwing away information, and then substituting random numbers for that information.
The FFT of a real sequence is a sequence of complex numbers, each with a real and an imaginary part. Converting those numbers to polar form gives you magnitude and phase angle. You are capturing the magnitude with p=abs(fft(...)), but you are not capturing the phase angle (which would involve atan2(...)). You are then making up random numbers (epsilon=...) and using those in place of the original phases when you reconstruct your time series. Also, the FFT of a real sequence has a particular (conjugate) symmetry, and substituting random numbers for the phase angle destroys that symmetry, which means that the IFFT will in general no longer be a real sequence but a sequence of complex numbers. Since you only look at the real part of the IFFT, you are throwing away information again. If this is an audio signal, the results may sound somewhat like the original (or they may be completely different), but the waveform definitely won't match...
The second issue is scaling. In implementations where neither transform is normalized, ifft(fft(...)) scales the result by the number of points in the signal. (MATLAB's ifft already divides by N, so the round trip itself returns the original there; any extra factor in your code, such as the division by nPoints/2 or the multiplication by length(time), has to be undone consistently.) There are several ways to handle the scaling, each more attractive in different scenarios depending on what you are trying to do: you can scale the fft() result before you do the ifft(), scale the ifft() result at the end, or, as I have seen in some cases, scale both by a factor of sqrt(N). Doing it twice has the same end result as scaling once by N, but it is a bit less efficient since you do the scaling twice...
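Both points can be illustrated with a minimal sketch. It assumes an even-length, real test signal (random noise here, purely as a stand-in), and all variable names are made up for illustration. The first part keeps the phase and recovers the signal; the second part uses random phases but mirrors them so the spectrum keeps the conjugate symmetry of a real signal, which makes the inverse transform real with the same magnitude spectrum (though a different waveform).

```matlab
N = 1024;                          % even length, assumed
x = randn(1, N);                   % stand-in for the measured time series
X = fft(x);
mag = abs(X);                      % magnitude (what the spectrum keeps)
ph  = angle(X);                    % phase (what the spectrum discards)

% Magnitude and phase together reproduce the signal
xBack = real(ifft(mag .* exp(1i * ph)));

% Random phases with conjugate symmetry enforced
half = N / 2;
phi  = -pi + 2*pi*rand(1, half - 1);         % one phase per positive bin
Xr = zeros(1, N);
Xr(1)          = X(1);                       % DC bin stays real
Xr(2:half)     = mag(2:half) .* exp(1i * phi);
Xr(half + 1)   = mag(half + 1);              % Nyquist bin must be real
Xr(half + 2:N) = conj(Xr(half:-1:2));        % mirror for negative frequencies
xr = ifft(Xr);                               % imaginary part is round-off only
xRand = real(xr);
```

Note that 1i is used instead of i, so the result does not depend on whether i has been reused as a loop variable.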

Determine the position and value of peak

I have a graph with five major peaks. I'd like to find the position and value of the first peak (the one furthest to the right). I have more than 100 different plots like this, the peak grows and shrinks from plot to plot, and I will need to use a for loop. I'm just stuck on determining the x and y values to a large number of significant figures using MATLAB code.
Here's one of the many plots:
If you know for sure that you will always have 5 peaks, I think the File Exchange function extrema will be very helpful.
It returns the maxima (and minima if needed) in descending order, so the first elements of the outputs zmax and imax are the largest value and its index, their second elements are the second-largest maximum and its index, and so on.
If the peak you need is always the smallest of the five, you just need zmax(5) and imax(5) to get the 5th biggest maximum.
If you have access to Signal Processing Toolbox, findpeaks is the function you are looking for. It can be invoked using different options including number of peaks, which can be helpful when that information is available.
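If neither tool is available, a plain local-maximum search is a rough substitute. The sketch below is not findpeaks or extrema; it is a simple neighbour comparison run on hypothetical data (five Gaussian bumps), and the threshold minHeight is an assumed value that would need tuning for real plots. It picks the right-most peak and prints it with many significant figures.

```matlab
% Hypothetical stand-in data: five Gaussian peaks of decreasing height
x = linspace(0, 10, 2001);
centers = [1 3 5 7 9];
heights = [5 4 3 2 1];
y = zeros(size(x));
for k = 1:numel(centers)
    y = y + heights(k) * exp(-((x - centers(k)) / 0.2).^2);
end

% A sample is a local maximum if it exceeds its left neighbour and is
% not smaller than its right neighbour; the end points are excluded
isPeak = [false, y(2:end-1) > y(1:end-2) & y(2:end-1) >= y(3:end), false];

minHeight = 0.5;                       % assumed threshold against noise
idx = find(isPeak & y > minHeight);    % indices of the major peaks

% Right-most peak: the one with the largest x
[~, last] = max(x(idx));
peakX = x(idx(last));
peakY = y(idx(last));
fprintf('peak at x = %.12g, y = %.12g\n', peakX, peakY);
```

Inside a for loop over the 100+ data sets, peakX and peakY can be stored in preallocated vectors, one entry per plot.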

Detect incorrect points in a homogeneous surface

In my project I have huge surfaces of 20,000 points computed by an algorithm. This algorithm sometimes makes an error, computing one or more points in a small area incorrectly.
This error can not be solved in the algorithm, but needs to be detected afterwards.
The error can be seen in the next figure:
As you can see, there is a wrongly computed point that not only breaks the otherwise homogeneous surface, but also destroys the aesthetics of the plot (which is also important in the project).
Sometimes it can be more than one point, but in general no more than 5 or 6. The error is always in the Z axis, so there is no need to check X and Y.
I have been racking my brain to find a reasonably "generic" algorithm to detect these points.
I thought that maybe I could take patches of the surface, average the Z values, and then detect the points that fall outside the variance... but I don't think it will always work.
Any ideas?
NOTE: I don't want someone to write code for me, just an idea.
PS: relevant code for the above image:
[x,y] = meshgrid([-2:.07:2]);
Z = x.*exp(-x.^2-y.^2);
subplot(1,2,1)
surf(x,y,Z,gradient(Z))
subplot(1,2,2)
Z(35,35)=Z(35,35)+0.3;
surf(x,y,Z,gradient(Z))
The standard trick is to use a Laplacian, looking for the largest outliers. (This is not unlike what Mohsen posed for an answer, but is actually a bit easier.) You could even probably do it with conv2, so it would be pretty efficient.
I could offer a few ways to implement the idea. A simple one is to use my gridfit tool, found on the File Exchange. (Gridfit essentially uses a Laplacian for its smoothing operation.) Fit the surface with all points included, then look for the single point that was perturbed the most by the fit. Exclude it, then rerun the fit, again looking for the largest outlier. (With gridfit, you can use weights to give points a zero weight, a simple way to exclude a point or list of points.) When the largest perturbation that was needed is small enough, you can decide to stop the process. A nice thing is gridfit will also impute new values for the outliers, filling in all of the holes.
A second approach is to use the Laplacian directly, in more of a filtering approach. Here, you simply compute a value at each point that is the average of each neighbor to the left, right, above, and below. The single value that is most largely in disagreement with its computed average is replaced with a new value. Or, you can use a weighted average of the new value with the old one there. Again, iterate until the process does not generate anything larger than some tolerance. (This is the basis of an old outlier detection and correction scheme that I recall from the Fortran IMSL libraries, but probably dates back to roughly 30 years ago.)
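A minimal sketch of the neighbour-average idea using conv2, run on the example surface from the question. It only performs a single detection pass (no iteration and no replacement of the value), and the kernel K and the names Zavg and resid are assumptions made for illustration.

```matlab
% Example surface from the question, with one perturbed point
[x, y] = meshgrid(-2:.07:2);
Z = x .* exp(-x.^2 - y.^2);
Z(35,35) = Z(35,35) + 0.3;

% Average of the four neighbours (left, right, above, below).
% 'valid' keeps only interior points, where all four neighbours exist.
K = [0 1 0; 1 0 1; 0 1 0] / 4;
Zavg = conv2(Z, K, 'valid');

% Disagreement between each interior point and its neighbour average
resid = Z(2:end-1, 2:end-1) - Zavg;

% Largest disagreement; shift indices by one to get back to Z's indexing
[~, k] = max(abs(resid(:)));
[r, c] = ind2sub(size(resid), k);
r = r + 1;
c = c + 1;
fprintf('largest outlier at Z(%d,%d)\n', r, c);
```

To follow the iterative scheme described above, the flagged value would be replaced by its neighbour average (or a weighted mix of old and new) and the pass repeated until the largest disagreement drops below a chosen tolerance.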
Since your function seems to vary smoothly, these abrupt changes can be detected by looking at the derivatives. You can:
1. Take the derivative in one direction.
2. Calculate the mean and standard deviation of the derivative.
3. Find the points whose derivative is further from the mean than a certain multiple of the standard deviation.
Here is the code
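U=diff(Z);                      % first difference along dimension 1 (down the rows)
V=(U-mean(U(:)))/std(U(:));     % standardize: distance from the mean in standard deviations
surf(x(2:end,:),y(2:end,:),V)
V=[zeros(1,size(V,2)); V];      % prepend a row of zeros so V has the same size as Z
V(abs(V)<10)=0;                 % keep only jumps beyond 10 standard deviations
V=sign(V);                      % opposite signs on entering and leaving a bad patch
W=cumsum(V);                    % nonzero on the rows between the two jumps
[I,J]=find(W);
outliers = [I, J];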
For your example, the plot of V shows a peak at around 21.7 while the second-largest peak is at around 1.9528, so a threshold of 10 seems reasonable.
and running the code returns
outliers =
35 35
The cumsum is needed for cases where a patch of adjacent points is incorrect.

spike in my inverse fourier transform

I am trying to compare two data sets in MATLAB. To do this I need to filter the data sets by Fourier transforming the data, filtering it and then inverse Fourier transforming it.
When I inverse Fourier transform the data, however, I get a spike at either end of the red data set (the picture shows the first spike). It should be close to zero at the start, like the blue line. I am comparing many data sets and this only happens occasionally.
I have three questions about this phenomenon. First, what may be causing it? Second, how can I remedy it? Third, will it affect the data further along the time series, or just the beginning and end of the time series, as it appears to from the picture?
Any help would be great thanks.
When using the DFT you must remember that it assumes a periodic signal (a superposition of harmonic functions).
As you can see, the start point is the exact continuation of the last point, in the manner of a periodic function.
Did you perform any zero padding in the spectrum domain?
Anyhow, windowing might reduce the overshoot.
Knowing more about the filter and the original data would be helpful.
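The periodicity point can be made concrete with a minimal sketch. It assumes a hypothetical signal (a ramp plus a sine, so the first and last samples differ) and a brick-wall low-pass mask H; the cutoff keep and all names are made up for illustration. Instead of windowing, it uses a different and simpler trick, named plainly: subtract the straight line through the first and last sample before filtering, so the periodic extension has no jump, then add the line back.

```matlab
N = 1000;
n = 0:N-1;
x = 0.01*n + sin(2*pi*5*n/N);      % ramp + sine: end points do not match

% Brick-wall low-pass: keep bins 0..keep and their negative-frequency mirrors
keep = 20;                          % assumed cutoff, in bins
H = zeros(1, N);
H([1:keep+1, N-keep+1:N]) = 1;      % symmetric mask keeps the output real

% Filtering the raw signal: the wrap-around jump causes a spike at the ends
yRaw = real(ifft(fft(x) .* H));

% Remove the line through the end points, filter, then restore the line
trend = x(1) + (x(end) - x(1)) * n / (N - 1);
yDet  = real(ifft(fft(x - trend) .* H)) + trend;

errRaw = abs(yRaw(1) - x(1));       % error at the first sample, raw
errDet = abs(yDet(1) - x(1));       % error at the first sample, detrended
fprintf('edge error: raw %.3g, detrended %.3g\n', errRaw, errDet);
```

In this sketch the distortion is concentrated near the two ends of the record, which is consistent with the spike appearing only at the beginning and end of the time series.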
If you mean a spike near zero frequency, I would check the DC component.
You seem interested in the shape, so doing
x = x - mean(x)
or
x -= mean(x)
or
x -= x.mean()
(I love numpy!)
will just constrain the dataset to have zero amplitude at zero frequency, so you can go ahead with comparing the amplitudes of the spectra.
(As a side note: did you check that you use fftshift and ifftshift appropriately? This has always been a source of trouble for me.)
Could be the numerical equivalent of Gibbs' phenomenon. If that's correct, there's no way to remedy it except for filtering.