Fit sine wave with a distorted time-base (MATLAB)

I want to know the best way to fit a sine-wave with a distorted time base, in Matlab.
The distortion in time is given by an n-th order polynomial (n ~ 10) of the form t_distort = P(t).
For example, consider the distortion t_distort = 8 + 12t + 6t^2 + t^3 (which is just the expansion of (t+2)^3).
This will distort a sine-wave as follows:
I want to be able to find the distortion given this distorted sine-wave. (i.e. I want to find the function t = G(t_distort), but t_distort = P(t) is unknown.)

If your resolution is high enough, then this is basically an angle-demodulation problem. The standard way to demodulate an angle-modulated signal is to take the derivative, followed by an envelope detector, followed by an integrator.
Since I don't know the exact numbers you're using, I'll show an example with my own numbers. Suppose my original timebase has 10 million points from 0 to 100:
t = 0:0.00001:100;
I then get the distorted timebase and calculate the distorted sine wave:
td = 0.02*(t+2).^3;
yd = sin(td);
Now I can demodulate it. Take the "derivative" using finite differences divided by the step size from before:
ydot = diff(yd)/0.00001;
The envelope can be easily detected:
envelope = abs(hilbert(ydot));
This gives an approximation for the derivative of P(t). The last step is an integrator, which I can approximate using a cumulative sum (we have to scale it again by the step size):
tdguess = cumsum(envelope)*0.00001;
This gives a curve that's almost identical to the original distorted timebase (so, it gives a good approximation of P(t)):
You won't be able to get the constant term of the polynomial since we made our approximation from its derivative, which of course eliminates the constant term. You wouldn't even be able to find a unique constant term mathematically from just yd, since infinitely many values will yield the same distorted sine wave. You can get the other three coefficients of P(t) using polyfit if you know the degree of P(t) (ignore the last number, it's the constant term):
>> polyfit(t(1:10000000), tdguess, 3)
ans =
0.0200 0.1201 0.2358 0.4915
This is pretty close to the original, aside from the constant term: 0.02*(t+2)^3 = 0.02t^3 + 0.12t^2 + 0.24t + 0.16.
You wanted the inverse mapping G. Can you do that knowing a close approximation for P(t), as found so far?
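One possible route, sketched under the assumption that the estimate above is good enough: since tdguess is monotonic (the envelope is positive), the inverse map can be tabulated by simply swapping the two axes with interp1. Here tq is a hypothetical vector of distorted-time query points, and the recovered mapping is only determined up to the unknown constant offset:
tq = linspace(tdguess(1), tdguess(end), 1000); % hypothetical query points in distorted time
G  = interp1(tdguess, t(1:end-1), tq);         % t(1:end-1) matches the sample lost to diff
plot(tq, G)                                    % approximate t = G(t_distort)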

Here's an analytically driven route that takes asin of the signal, with proper unwrapping of the angle. You can then fit a polynomial to the angle using polyfit, or use other fitting methods (look up fit, for instance). Last, take sin of the fitted function and compare the signal to the fitted one. See this pedagogical example:
% generate data
t=linspace(0,10,1e2);
x=0.02*(t+2).^3;
y=sin(x);
% take asin.^2 to obtain the points of "discontinuity", where y hits +-1 and asin flips branch
da=(asin(y).^2);
[~,locs]=findpeaks(da); % this can be done in many other ways too...
% construct the asin according to the proper phase unwrapping
an=NaN(size(y));
an(1:locs(1)-1)=asin(y(1:locs(1)-1));
for n=2:numel(locs)
    an(locs(n-1)+1:locs(n)-1)=(n-1)*pi+(-1)^(n-1)*asin(y(locs(n-1)+1:locs(n)-1));
end
an(locs(n)+1:end)=n*pi+(-1)^n*asin(y(locs(n)+1:end)); % n keeps its final loop value here
r=~isnan(an);
p=polyfit(t(r),an(r),3);
figure;
subplot(2,1,1); plot(t,y,'.',t,sin(polyval(p,t)),'r-');
subplot(2,1,2); plot(t,x,'.',t,(polyval(p,t)),'r-');
title(['mean error ' num2str(mean(abs(x-polyval(p,t))))]);
p =
0.0200 0.1200 0.2400 0.1600
The reason I preallocate with NaN and avoid taking the asin at the points of discontinuity (locs) is to reduce the error of the fit later. As you can see, for 100 points between 0 and 10 the average error is of the order of floating-point accuracy, and the polynomial coefficients are as exact as you can get them.
The fact that you are not taking a derivative (as in the very elegant Hilbert-transform answer) allows this method to be numerically exact. Under the same conditions the Hilbert-transform solution has a much bigger average error (order of unity vs 1e-15).
The only limitations of this method are that you need enough points in the regime where the asin flips direction, and that the function inside the sin is well behaved. If there's a sampling issue you can truncate the data and keep only a smaller range closer to zero, such that it is still enough to characterize the function inside the sin. After all, you don't need millions of points to fit a 3-parameter function.

Related

Find zero crossing points in a signal and plot them (MATLAB)

I have a voice signal 's', of which you can see an extract here:
I would like to plot the zero crossing points in the same graph. I have tried with the following code:
zci = @(v) find(v(:).*circshift(v(:), [-1 0]) <= 0); % Returns Zero-Crossing Indices Of Argument Vector
zx = zci(s);
figure
set(gcf,'color','w')
plot(t,s)
hold on
plot(t(zx),s(zx),'o')
But it does not interpolate between the points where the sign changes, so the result is:
However, I'd like the highlighted points to be as near as possible to zero.
I hope someone can help me. Thank you in advance for your responses.
Try this?
w = 1;
crossPts = [];
for k = 1:(length(s)-1)
    if s(k)*s(k+1) < 0
        crossPts(w) = (t(k)+t(k+1))/2; % midpoint of the two bracketing samples
        w = w + 1;
    end
end
figure
set(gcf,'color','w')
plot(t,s)
hold on
plot(crossPts, zeros(size(crossPts)), 'o') % zeros(size(...)) keeps this a vector
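If you want the circles to sit even closer to zero, a small variation on the same loop (a sketch under the same assumptions about s and t) linearly interpolates between the two samples that bracket each sign change instead of taking their midpoint:
w = 1;
crossPts = [];
for k = 1:(length(s)-1)
    if s(k)*s(k+1) < 0
        % the line through (t(k),s(k)) and (t(k+1),s(k+1)) crosses zero here
        crossPts(w) = t(k) - s(k)*(t(k+1) - t(k))/(s(k+1) - s(k));
        w = w + 1;
    end
end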
Important questions: what is the highest-frequency component of the signal you are measuring? Can you re-measure this signal? What is your sampling rate? What is this analysis for (schoolwork or scholarly research)? You may have quite a bit of trouble measuring the zeros of this function with any significance or accuracy, because it looks like your waveform has frequency content above half of your sampling rate (above your Nyquist frequency).
Upsampling/interpolating your entire waveform will let you locate the zeros much more precisely (but with no greater degree of accuracy), and this is a huge no-no in the scientific community. While my method may not look super pretty, it's the most accurate method that doesn't make unsafe assumptions. If you just want it to look pretty, I would recommend interp1 with the 'spline' method. You can interpolate the whole waveform and then use the above answer to find more accurate zeros.
Also, you could calculate the zeros on the interpolated waveform and then display it on the raw data.
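That could look something like this (the 10x upsampling factor is an arbitrary choice of mine):
tFine = linspace(t(1), t(end), 10*numel(t)); % upsample the time axis
sFine = interp1(t, s, tFine, 'spline');      % spline-interpolated waveform
% now run the zero-crossing search above on (tFine, sFine) and plot over (t, s)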
A remotely possible way to improve your data:
If you're measuring a human voice, why not try filtering to the range of human speech? This should be fine mathematically and could possibly improve your waveform.
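A minimal sketch of that idea, assuming fs is your sampling rate (the 80-3000 Hz band is a placeholder for whatever speech range you care about):
[b, a] = butter(4, [80 3000]/(fs/2)); % 4th-order Butterworth band-pass
sFilt  = filtfilt(b, a, s);           % zero-phase filtering keeps the crossings aligned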

Transforming draws in Matlab from Gaussian mixture to uniform

Consider the following draws for a 2x1 vector in Matlab with a probability distribution that is a mixture of two Gaussian components.
P=10^3; %number of draws
v=1;
%First component
mu_a = [0,0.5];
sigma_a = [v,0;0,v];
%Second component
mu_b = [0,8.2];
sigma_b = [v,0;0,v];
%Combine
MU = [mu_a;mu_b];
SIGMA = cat(3,sigma_a,sigma_b);
w = ones(1,2)/2; %equal weight 0.5
obj = gmdistribution(MU,SIGMA,w);
%Draws
RV_temp = random(obj,P);%Px2
% Transform each component of RV_temp into a uniform in [0,1] by estimating the cdf.
RV1=ksdensity(RV_temp(:,1), RV_temp(:,1),'function', 'cdf');
RV2=ksdensity(RV_temp(:,2), RV_temp(:,2),'function', 'cdf');
Now, if we check whether RV1 and RV2 are uniformly distributed on [0,1] by doing
ecdf(RV1)
ecdf(RV2)
we can see that RV1 is uniformly distributed on [0,1] (the empirical cdf is close to the 45 degree line) while RV2 is not.
I don't understand why. It seems that the more distant mu_a(2) and mu_b(2) are, the worse the job done by ksdensity for a reasonable number of draws. Why?
When you have a mixture of N(0.5,v) and N(8.2,v), the range of the generated data is larger than if the expectations were closer together, like the N(0,v) and N(0,v) you have in the other dimension. You then ask ksdensity to approximate a function using P points spread across this larger range.
As in standard linear interpolation, the denser the points, the better the approximation of the function (inside the range); the same applies here. Thus in the N(0.5,v) and N(8.2,v) dimension, where the points are "sparse" (or sparser, is that a word?), the approximation is worse than in the N(0,v) and N(0,v) dimension, where the points are denser.
As a small side note, is there any reason you do not apply ksdensity directly to the bivariate data? Also, I cannot reproduce your comment that 5e2 points are also good. Final comment: 1e3 is typically preferred over 10^3.
I think this is simply about the number of samples you're using. For the first component, the means of the two Gaussians are relatively close, hence a thousand samples are enough to obtain a cdf really close to the U[0,1] cdf. For the second component, though, the separation is larger, and you need more samples. With 100000 samples, I obtained the following result:
With 1000 I obtained this:
This is clearly farther from the uniform cdf. Try increasing the number of samples to a million and check whether the result again gets closer.
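For instance, re-running the question's setup with more draws (a sketch; only P changes, everything else is taken from the code above) should show RV2's empirical cdf straightening out:
P = 1e5;                   % instead of 10^3
RV_temp = random(obj, P);  % same gmdistribution object as above
RV2 = ksdensity(RV_temp(:,2), RV_temp(:,2), 'function', 'cdf');
ecdf(RV2)                  % should now track the 45-degree line closely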

How to produce a log-scale FFT with MATLAB

It's my first time performing an FFT in MATLAB, experimenting with some example code from the MathWorks website. I was wondering if it is possible to take the code I have and change the x axis to a log-scale representation rather than a linear one. I understand most of the code, but I'm still not 100% sure what the x-axis line of code is doing, apart from the +1 at the end of the line, which accounts for the fact that MATLAB's indexing starts at 1 rather than 0.
My code so far is:
[y,fs] = wavread('Wav/800Hz_2sec.wav');
NFFT = 4096;
Y = fft(y,NFFT)/length(y);
f = fs/2*linspace(0,1,NFFT/2+1);
plot(f,2*abs(Y(1:NFFT/2+1)))
Frequency usually comes out on a linear scale from the discrete Fourier transform. If you want, you can make a new frequency vector on a log scale and interpolate the results you already have:
fnew=fs/2.*logspace(log10(fs/length(y)),0,npts);
Ynew= interp1(f,Y(1:NFFT/2+1),fnew);
where npts is the length of your new frequency vector. For just plotting, use
loglog(f,2*abs(Y(1:NFFT/2+1)));
Honestly, IMO, the interpolation doesn't work very well, because the FFT of a real signal produces strong peaks and troughs in the spectrum, so unless you smooth the spectrum first, the interpolated spectrum won't look as nice.
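As a sketch of that smoothing step (the 8-bin moving average and the 512-point log grid are arbitrary choices of mine, not part of the code above):
mag = 2*abs(Y(1:NFFT/2+1));
magSmooth = conv(mag, ones(8,1)/8, 'same');       % crude moving-average smoothing
fnew = logspace(log10(f(2)), log10(f(end)), 512); % log-spaced grid; skip the DC bin f(1) = 0
semilogx(fnew, interp1(f, magSmooth, fnew))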

Detect steps in a Piecewise constant signal

I have a piecewise constant signal shown below. I want to detect the location of step transition (Marked in red).
My current approach:
1. Smooth the signal using a moving-average filter (http://www.mathworks.com/help/signal/examples/signal-smoothing.html)
2. Perform a discrete wavelet transform to find the discontinuities
3. Locate the discontinuities to get the positions of the step transitions
I am currently implementing the last step, detecting the discontinuities. However, I cannot get the precise locations and I end up with many false detections.
My questions:
1. Is this the correct approach?
2. If yes, can someone share some info on / an algorithm to use for the last step?
3. Please suggest an alternate/better approach.
Thanks
Convolve your signal with the 1st derivative of a Gaussian to find the step positions; this is similar to a Canny edge detection in 1-D. You can do it in a multi-scale approach: start from a "large" sigma (say ~10 pixels) and detect the local maxima, then move to a smaller sigma (~2 pixels) to converge on the exact pixels where the steps are.
You can see an implementation of this approach here.
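A minimal single-scale sketch of the idea (y is your signal; the sigma and the peak-height threshold are placeholders to tune, and the multi-scale refinement is omitted):
sigma = 10;                          % the "large" scale first
xk = -4*sigma:4*sigma;               % kernel support
g1 = -xk .* exp(-xk.^2/(2*sigma^2)) / (sigma^3*sqrt(2*pi)); % 1st derivative of a Gaussian
response = conv(y(:).', g1, 'same'); % steps in y become peaks in the response
[~, stepLocs] = findpeaks(abs(response), 'MinPeakHeight', 0.5*max(abs(response)));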
If your function is really piecewise constant, why not just use the abs of diff compared to a threshold?
th = 0.1;
x_steps = x(abs(diff(y)) > th)
where x is a vector with your x-axis values, y is your y-axis data, and th is a threshold.
Example:
>> x = [2 3 4 5 6 7 8 9];
>> y = [1 1 1 2 2 2 3 3];
>> th = 0.1;
>> x_steps = x(abs(diff(y)) > th)
x_steps =
4 7
Regarding your point 3 (please suggest an alternate/better approach):
I suggest using a Potts "filter". This is a variational approach to get an accurate estimate of your piecewise constant signal (similar to total-variation minimization). It can be interpreted as adaptive median filtering. Given the Potts estimate u, the jump points are the points of non-zero gradient of u, that is, where diff(u) ~= 0. (There are free MATLAB implementations of Potts filters on the web.)
See also http://en.wikipedia.org/wiki/Step_detection
Total Variation Denoising can produce a piecewise constant signal. Then, as pointed out above, "abs of diff compared to a threshold" returns the position of the transitions.
There exist very efficient algorithms for TVDN that process millions of data points within milliseconds:
http://www.gipsa-lab.grenoble-inp.fr/~laurent.condat/download/condat_fast_tv.c
Here's an implementation of a variational approach with Python and MATLAB interfaces that also uses TVDN:
https://github.com/qubit-ulm/ebs
I think smoothing with a sharper lowpass filter should work better.
Try using medfilt1() (a median filter) instead, since you have very concrete levels. If you know how long your plateaus are, you can take half or a quarter of the plateau length as the filter window, for example. Then you get very sharp edges, which should be detectable with a Haar wavelet or even simple differentiation.
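Something along these lines, re-using the threshold idea from the answer above (the window length of 15 samples is a placeholder; make it roughly half your plateau length, and keep it odd):
ySmooth = medfilt1(y, 15);               % median filtering preserves the sharp edges
stepIdx = find(abs(diff(ySmooth)) > th); % then simple differencing plus a threshold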

How to use inverse FFT on amplitude-frequency response?

I am trying to create an application for calculating coefficients for a graphic equalizer FIR filter. I am doing some prototyping in Matlab but I have some problems.
I have started with the following Matlab code:
% binamps vector holds 2^13 = 8192 bins of desired amplitude values for frequencies in range 0.001 .. 22050 Hz (half of samplerate 44100 Hz)
% it looks just fine, when I use Matlab plot() function
% now I get ifft
n = size(binamps,1);
iff = ifft(binamps, n);
coeffs = real(iff); % throw away the imaginary part, because FIR module will not use it anyway
But when I do an fft() of the coefficients, I see that the frequencies are stretched by a factor of 2 and the end of my AFR data is lost:
p = fft(coeffs, n); % take the fourier transform of coefficients for a test
nUniquePts = ceil((n+1)/2);
p = p(1:nUniquePts); % select just the first half since the second half
% is a mirror image of the first
p = abs(p); % take the absolute value, or the magnitude
p = p/n; % scale by the number of points so that
% the magnitude does not depend on the length
% of the signal or on its sampling frequency
p = p.^2; % square it to get the power
sampFreq = 44100;
freqArray = (0:nUniquePts-1) * (sampFreq / n); % create the frequency array
semilogx(freqArray, 10*log10(p))
axis([10, 30000 -Inf Inf])
xlabel('Frequency (Hz)')
ylabel('Power (dB)')
So I guess I am using ifft wrong. Do I need to make my binamps vector twice as long and create a mirror image in the second half of it? If that is the case, is it just MATLAB's implementation of ifft, or do other C/C++ FFT libraries (especially Ooura's FFT) also need mirrored data for the inverse FFT?
Is there anything else I should know to get the FIR coefficients out of ifft?
Your frequency-domain vector needs to be complex rather than just real, and it needs to be symmetric about the midpoint in order to get a purely real time-domain signal. Set the real parts to your desired magnitude values and set the imaginary parts to zero. The real parts need to have even symmetry such that A[N - i] = A[i] (A[0] and A[N/2] are "special", being the DC and Nyquist components; just set these to zero).
The above applies to any general purpose complex-to-complex FFT/IFFT, not just MATLAB's implementation.
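As a concrete sketch of that construction, assuming binamps is the column of 8192 desired magnitudes from the question (the length doubling, fftshift, and Hann window are my own choices here, and the topmost bin is absorbed into the zeroed Nyquist slot):
N = 2*8192;                    % double length to make room for the mirror image
A = zeros(N, 1);               % A(1) (DC) and A(N/2+1) (Nyquist) stay zero
A(2:N/2) = binamps(1:N/2-1);   % positive-frequency magnitudes, zero phase
A(N/2+2:N) = flipud(A(2:N/2)); % even symmetry: A(k) == A(N-k+2)
h = real(ifft(A));             % the imaginary part is only rounding noise now
h = fftshift(h);               % center the zero-phase impulse response
h = h .* hann(N);              % window so the truncation is well behaved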
Note that if you're trying to design a time-domain filter with an arbitrary frequency response, then you'll need to do some windowing in the frequency domain first. You might find this article helpful; it talks about arbitrary FIR filter design using MATLAB, in particular fir2.
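For reference, a toy fir2 call looks like this (the breakpoints and gains are placeholders, not anything from the question):
fb = [0 0.1 0.3 0.6 1];   % breakpoints, normalized to Nyquist (must run from 0 to 1)
mb = [1 1 0.5 2 0];       % desired magnitude at each breakpoint
b  = fir2(512, fb, mb);   % order-512 FIR approximating that response (fir2 windows internally)
freqz(b, 1, 4096, 44100); % inspect the realized response at fs = 44100 Hz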
To get a real result, the input to any typical generic IFFT (not just MATLAB's implementation) needs to be complex-conjugate-symmetric. So doing an IFFT with a given number of independent specification points will require an FFT at least twice as long (preferably even longer, to allow for some transition to zero from the highest-frequency cut-off).
Trying to get a real result by throwing away the "imaginary" portion of a complex result won't work, as you would be throwing away actual information content that the time-domain filter needs in order to realize the frequency response given to the IFFT. However, if the original data is conjugate-symmetric, then the imaginary portion of the IFFT/FFT result will be (usually insignificant) rounding-error noise that can be thrown away.
Also, the DTFT of a finite frequency response will produce an infinitely long FIR. To get a finite-length FIR, you will need to compromise your frequency-response specification so that there is little energy left in the latter portion of the time-domain representation, which has to be truncated to make the FIR realizable (finite). One common (but not necessarily the best) way to do this is to window the FIR result produced by the IFFT and, by trial and error, try different windows until you find a FIR filter for which an FFT produces a result "close enough" to your original frequency spec.