Issue Regarding MATLAB code for regression technique - matlab

I am working on a steering wheel angle sensor that measures the absolute angle of the steering wheel. Because these sensors rely on gears and several mechanical joints, errors creep into the sensor values over time despite the initial calibration, owing to mechanical wear and to environmental and road conditions (e.g. offset, phase change, flattening of the signal, delay).
These measurement errors distort what I derive from the signal. For example, a velocity vs. time curve that shows a clear peak for a calibrated (close to ideal) sensor may show a flattened peak, or no peak at all, when the measured signal suffers from hysteresis, and that affects my final task.
I have a hysteresis tolerance of, say, 1.20 degrees, so I want to examine my signal in detail and see whether an offset, a delay or a reduction in amplitude has occurred. This would help me decide whether to reduce the number of sensors used for my task, change the sensor hardware to lessen the hysteresis, or take some other action to reduce it.
I am not sure whether what I have done so far is right. I am getting some values for the hysteresis, but I have a few questions about one of the techniques I am using. Any idea for improving this technique, or a better approach, would be great guidance.
I have an ideal sensor signal (recorded under the ideal conditions we want) and, from one sensor, data of 6 different drives of the car. I will explain just one example: my first drive and its relation to my reference sensor data.
The reference signal and the sensor signal are each 1x1626100 double arrays for one recording, and in every recording the ideal and measured signals share the same time base.
In short, I want to find the hysteresis of the measured sensor signal relative to the reference signal.
For this I apply a regression-line technique to the upper and lower values of the difference signal.
I take the difference of my signals (reference minus measured value, after applying my limits to the signal).
I then set a threshold above and below the difference signal by hand and fit one regression line to the values above the upper threshold and another to the values below the lower threshold. The difference between the upper and lower regression lines is what I call the hysteresis (loss). Please have a look at figures 3 and 4 for a clear view.
The problem with this technique is that I define the thresholds for the upper and lower regression lines myself after looking at the data, e.g. up = 0.4, low = -0.4.
Question:
Is it possible to write a program that decides the best lines of fit by itself, without my supplying a threshold?
That is, is there any way for my code to find the best regression line for the upper values and the best regression line for the lower values, and then calculate the hysteresis?
I would be really grateful for any help, as I have been unable to find a solution to this problem.
Thanks in anticipation.
clear all
clc
drv(6)=load('20170420__142024.mat');
t=drv(6).q_T0;
ref=drv(6).Pos;
lws_7=drv(6).SteeringWheelAngle;
swav=drv(6).SteeringWhellRotSpd;
vel=drv(6).KBI_angez_Geschw;
sig_diff=ref-lws_7;
mean_sig_diff = mean(sig_diff);
offset_removed_sig_diff = detrend(sig_diff ,'constant' );
offset_removed_mean_sig_diff = detrend(mean_sig_diff ,'constant');
figure(1)
ax11=subplot(321);
histfit(sig_diff)
dum=['Drive ' num2str(i) ': Difference Signal With offset Removed for drive '];
title(dum)
hold on
plot([mean_sig_diff mean_sig_diff],[0 10000],'r')
hold off
ax12=subplot(322);
histfit(offset_removed_sig_diff)
dum=['Drive ' num2str(i) ': Difference Signal With offset Removed'];
title(dum)
hold on
plot([offset_removed_mean_sig_diff offset_removed_mean_sig_diff],[0 10000],'r')
hold off
swvel_thres=20;
vehvel=60;
SAmax=90;
t_lim=t(((lws_7<SAmax)&(lws_7>-SAmax)&(swav<swvel_thres)&(vel>vehvel)));
sig_diff_lim = sig_diff((lws_7<SAmax)&(lws_7>-SAmax)& (swav<swvel_thres)&(vel>vehvel));
offset_rem_sig_diff_lim = detrend(sig_diff_lim,'constant');
mean_sig_diff_lim = mean(sig_diff_lim);
offsetmean_sig_diff_lim = detrend(mean_sig_diff_lim,'constant');
figure(2)
ax21=subplot(321);
histfit(sig_diff_lim)
dum=['Drive ' num2str(i) ': Limited Difference Signal With offset Removed for drive '];
title(dum)
hold on
plot([ mean_sig_diff_lim mean_sig_diff_lim],[0 10000],'r')
hold off
ax22=subplot(322);
histfit(offset_rem_sig_diff_lim )
dum=['Drive ' num2str(i) ': Limited Difference Signal With offset Removed'];
title(dum)
hold on
plot([offsetmean_sig_diff_lim offsetmean_sig_diff_lim],[0 10000],'r')
hold off
up=0.4;
low=-up;
stats_up = regstats(offset_rem_sig_diff_lim((offset_rem_sig_diff_lim>up)),t_lim((offset_rem_sig_diff_lim>up)), 'linear', {'beta'}); %calculate linear regression for upper values
intercept_up=stats_up.beta(1);
slope_up=stats_up.beta(2);
stats_low = regstats(offset_rem_sig_diff_lim((offset_rem_sig_diff_lim<low)),t_lim((offset_rem_sig_diff_lim<low)), 'linear', {'beta'}); %calculate linear regression for lower values
intercept_low=stats_low.beta(1);
slope_low=stats_low.beta(2);
Hysteresis_LinReg = abs(intercept_low)+abs(intercept_up);
figure(4)
% ax31=subplot(321);
plot(t_lim, offset_rem_sig_diff_lim ,t_lim, t_lim*slope_up+intercept_up, t_lim ,t_lim*slope_low+intercept_low);grid
legend('diff','reg up','reg low')
title(' Limited Difference Signal With offset Removed with regression lines for drive ')
figure(5)
histfit(offset_rem_sig_diff_lim)
dum=['Drive ' num2str(i) ':Offset Removed Limited Difference Signal with Regression Lines for drive '];
title(dum)
hold on
plot([ intercept_up intercept_up],[0 12000],'r')
hold off
hold on
plot([intercept_low intercept_low],[0 12000],'r')
hold off

You can try a 1D version of the k-means algorithm. k-means divides the data set into k sets (called clusters) according to how close the points are to each other; in your case k = 3 (middle points, upper points, lower points).
You can use the kmeans() function provided by MATLAB. It takes an n-by-p data matrix, so for a 1D problem you can leave out the time coordinate and pass only the "y" (i.e. signal) values as a column vector.
After k-means is done, just select the clusters whose mean values are the lowest and the highest, which gives you the lower and upper points. You can get the mean of each cluster by using this version of the function (see the linked docs):
[idx,C] = kmeans(___)
Matrix C will contain the means, and idx shows which point belongs to which set (cluster).
Then just fit lines to your chosen sets of points.
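As a rough illustration of the steps above, the sketch below clusters a synthetic stand-in for the offset-removed difference signal and fits one line to each outer cluster. The synthetic data, the variable names and the 'Replicates' setting are assumptions made for illustration; kmeans() needs the Statistics and Machine Learning Toolbox, and the hysteresis is computed the same way as in the question (sum of the absolute intercepts).

```matlab
% Synthetic stand-in for offset_rem_sig_diff_lim (assumption, not real data):
% points scatter around three levels, -0.6 (lower), 0 (middle), +0.6 (upper).
rng(0);                                  % reproducible random numbers
n     = 6000;
t     = linspace(0, 50, n)';             % time axis in seconds
level = 0.6 * (randi(3, n, 1) - 2);      % -0.6, 0 or +0.6 for each sample
y     = level + 0.05 * randn(n, 1);      % add measurement noise

% Cluster the signal values only (1D k-means with k = 3).
[idx, C] = kmeans(y, 3, 'Replicates', 5);

% Rank the clusters by their mean: first = lower, last = upper.
[~, order] = sort(C);
isLow = (idx == order(1));
isUp  = (idx == order(3));

% Fit a straight line (slope, intercept) to each outer cluster.
pUp  = polyfit(t(isUp),  y(isUp),  1);
pLow = polyfit(t(isLow), y(isLow), 1);

% Same definition as in the question: sum of the absolute intercepts.
Hysteresis_kmeans = abs(pUp(2)) + abs(pLow(2));
fprintf('Estimated hysteresis: %.3f\n', Hysteresis_kmeans);
```

With real data, y would be the offset-removed, limited difference signal and t the matching time vector; whether three clusters separate cleanly depends on how distinct the upper and lower branches are.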

Related

Any good ways to obtain zero local means in audio signals?

I have asked this question on DSP.SE before, but my question has got no attention. Maybe it was not so related to signal processing.
I needed to divide a discrete audio signal into segments to have some statistical processing and analysis on them. Therefore, segments with fixed local mean would be very helpful for my case. Length of segments are predefined, e.g. 512 samples.
I have tried several things. I do use reshape() function to divide audio signal into segments, and then calculate means of every segment as:
L = 512; % Length of segment
N = floor(length(audio(:,1))/L); % Number of segments
seg = reshape(audio(1:N*L,1), L, N); % Reshape into LxN sized matrix
x = mean(seg); % Calculate mean of each column
Subtracting x(k) from each seg(:,k) would make each local mean zero, yet it would distort the audio signal a lot when the segments are joined back.
So, since the mean of a Hann window is almost 0.5, subtracting 2*x(k)*hann(L) from each seg(:,k) was the first thing I tried. But this time the multiplication by 2 (to make the mean of the scaled window almost equal to 1) distorted the neighbourhood of the midpoint of each segment.
Then I used convolution with a smaller Hann window instead of multiplying directly, and subtracted the results (as shown in the figure below) from each seg(:,k).
This last step gives better results, yet it is still not very useful when the segments are smaller. I have seen many amazing approaches on this site for different problems, so I wonder whether there are any clever ways or existing methods to obtain zero local means that distort an audio signal less. I read that this property is useful in some decompositions such as EMD. So maybe I need such a decomposition?
You can try to use a moving average filter:
x = cumsum(rand(15*512, 1)-0.5); % generate a random input signal
mean_filter = 1/512 * ones(1, 512); % generate a mean filter
% filtfilt (zero-phase) is used instead of filter to obtain a symmetric moving average.
% The result is not named "mean" so that the built-in mean() is not shadowed.
local_mean = filtfilt(mean_filter, 1, x);
% plot the result
figure
subplot(2,1,1)
plot(x);
hold on
plot(local_mean);
subplot(2,1,2)
plot(x - local_mean);
You can tune the filter by changing the length of the mean filter. A shorter filter gives smaller means inside each interval, but it also removes more of the low-frequency content of your signal.
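One way to check what a moving-average subtraction does to the segment means is sketched below on a made-up test signal. It uses movmean() (base MATLAB, R2016a or later) as a swapped-in centered moving average rather than the filtfilt() call above; the test signal and the variable names are assumptions.

```matlab
L = 512;                                  % segment length (from the question)
N = 15;                                   % number of segments
n = (0:N*L-1)';                           % sample index
% Made-up test signal: a tone riding on an offset plus a slow drift.
x = sin(2*pi*n/32) + 3 + 5*sin(2*pi*n/(N*L));

local_mean = movmean(x, L);               % centered moving average of length L
y = x - local_mean;                       % signal with the local mean removed

% Compare the per-segment means before and after the subtraction.
seg_before = mean(reshape(x, L, N));      % 1-by-N vector of segment means
seg_after  = mean(reshape(y, L, N));
max_before = max(abs(seg_before));
max_after  = max(abs(seg_after));
fprintf('Largest |segment mean|: %.3f before, %.3f after\n', max_before, max_after);
```

The segment means do not become exactly zero, but they shrink considerably without introducing steps at the segment borders; the largest residuals sit in the first and last segment, where the averaging window is truncated.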

Finite Difference Time Domain (FDTD) method for 1D EM Wave

I have attempted to write a code in order to solve the following coupled partial differential EM wave equations:
The code employs finite difference time domain using the Yee algorithm which can be read about in the following two online documents:
http://www.eecs.wsu.edu/~schneidj/ufdtd/ufdtd.pdf
http://www.eecs.wsu.edu/~schneidj/ufdtd/chap3.pdf
I want my source at the left hand side boundary to be a sinusoidal wave with parameters as such:
Ex(0,t) = E0 sin(2πft) for 0 ≤ t ≤ 1/f
The code updates the electric and magnetic field properties of the wave with each loop.
My initial code is as follows:
%FDTD Yee algorithm to solve coupled EM wave equations
clear
clc
G=50; %Specify size of the grid
f=10^3; %choose initial frequency of wave in hertz
e=1; %specify permittivity and permeability (normalised condition)
u=1;
Nt=150; %time steps
E0=1; %Electric Field initial amplitude
%Specify step sizes using Courant condition
c=3*10^8;
dx=0.01;
dt=dx/2*c;
%make constant terms
c1=-dt/(dx*e);
c2=-dt/(dx*u);
%create vector place holders
Ex=zeros(1,G);
Hy=zeros(1,G);
%create updating loop
M=moviein(Nt);
for t=1:Nt
% Spatial Ex
for k=2:G-1
Ex(k)=Ex(k)+c1*(Hy(k)-Hy(k-1));
end
Ex(G)=0; %PEC boundary condition
%E Source at LHS boundary
Ex(1)=E0*sin(2*pi*f*t);
%Spatial Hy
for n=1:G-1
Hy(n)=Hy(n)+c2*(Ex(n)-Ex(n+1));
end
Hy(G)=0; %PMC boundary condition
plot(Ex);
M(:,t) = getframe;
end
movie(M,1);
Basically I want to create an updating movie which shows the sinusoidal wave propagating to the right hand side boundary coded as a perfect electrical conductor; therefore reflecting the wave, and then propagating back to the left hand side boundary coded as a perfect insulator; absorbing the wave.
The problems I have are as follows:
1) I'm not sure how to properly implement my desired source. It doesn't appear to be purely sinusoidal.
2) The wave I've coded begins to propagate, but it quickly disappears for the majority of the simulation. I do not know why this is occurring.
3) I do not know much about running a movie simulation, and the plot oscillates while the solution is being run. How can I stop this?
Your wave attenuates because the difference equations are not correctly implemented; instead of:
Ex(k)=Ex(k)+c1*(Hy(k)-Hy(k-1));
you should use
Ex1(k)=Ex(k)+c1*(Hy(k)-Hy(k-1));
and instead of:
Hy(n)=Hy(n)+c2*(Ex(n)-Ex(n+1));
you should use:
Hy1(n)=Hy(n)+c2*(Ex(n)-Ex(n+1));
and, at the end of the loop, update the current "dataframe":
Hy = Hy1;
Ex = Ex1;
(you should also take care of the boundary conditions).
If you want a fixed plot frame that doesn't change when your data changes, first create an axes to plot into, with defined xmin/xmax and ymin/ymax; see http://www.mathworks.com/help/matlab/ref/axis.html
You should set the Courant number closer to 1, say 0.995. Thus delta_t = 0.995*delta_x/c.
Also, assuming delta_x is in metric units, e and u should be in metric units too.
I do not know the specific coding language used, but in C or C++ there is no need for intermediate variables such as Ex1.
Also, there should be at least 10 samples per wavelength for accuracy (preferably 60). With wavelength = 60*delta_x, the frequency is roughly of the order of 10 to the power of 9. Also, I think the sinusoidal source should be E0 * sin(2 * pi * f * t * delta_t), since t in your loop is a step index rather than a time. You need to adjust your constants and try it again.
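Putting the suggestions above together (Courant number just below 1, SI units for the permittivity and permeability, about 60 cells per wavelength, and the source evaluated at the physical time t*delta_t), a minimal sketch might look like the following. It assumes the sign convention dEx/dt = -(1/eps) dHy/dz and dHy/dt = -(1/mu) dEx/dz, since the equations themselves are not reproduced in the question. The grid size, the number of time steps and holding Ex(1) at zero once the single source period is over are also assumptions, and the animation is left out.

```matlab
% 1D FDTD (Yee) sketch in SI units; values marked "assumption" are illustrative.
c    = 299792458;              % speed of light in vacuum [m/s]
mu0  = 4*pi*1e-7;              % vacuum permeability [H/m]
eps0 = 1/(mu0*c^2);            % vacuum permittivity [F/m]

G  = 200;                      % number of grid cells (assumption)
Nt = 500;                      % number of time steps (assumption)
dx = 0.01;                     % cell size [m]
S  = 0.995;                    % Courant number, just below the limit of 1
dt = S*dx/c;                   % time step [s]
f  = c/(60*dx);                % 60 cells per wavelength -> about 5e8 Hz
E0 = 1;                        % source amplitude

Ex = zeros(1, G);
Hy = zeros(1, G);              % Hy(k) sits half a cell to the right of Ex(k)
peakEx = 0;                    % largest |Ex| seen during the run

for n = 1:Nt
    % Update Hy from the spatial difference of Ex.
    Hy(1:G-1) = Hy(1:G-1) - dt/(mu0*dx) * (Ex(2:G) - Ex(1:G-1));
    % Update the interior Ex nodes from the spatial difference of Hy.
    Ex(2:G-1) = Ex(2:G-1) - dt/(eps0*dx) * (Hy(2:G-1) - Hy(1:G-2));
    % Hard source at the left boundary: one period of a sine, then zero.
    tn = n*dt;                 % physical time of this step
    if tn <= 1/f
        Ex(1) = E0*sin(2*pi*f*tn);
    else
        Ex(1) = 0;
    end
    Ex(G) = 0;                 % PEC at the right boundary
    peakEx = max(peakEx, max(abs(Ex)));
end
fprintf('Peak |Ex| over the run: %.3f\n', peakEx);
```

To animate the run, one option is to call plot(Ex) followed by axis([1 G -2 2]) and drawnow inside the loop, so the frame stays fixed while the data changes.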

Time delay estimation using crosscorrelation

I have two sensors separated by some distance which receive a signal from a source. The signal in its pure form is a sine wave at a frequency of 17 kHz. I want to estimate the TDOA between the two sensors. I am using cross-correlation, and below is my code:
x1; % signal as received by sensor1
x2; % signal as received by sensor2
len = length(x1);
nfft = 2^nextpow2(2*len-1);
X1 = fft(x1);
X2 = fft(x2);
X = X1.*conj(X2);
m = ifft(X);
r = [m(end-len+1) m(1:len)];
[a,i] = max(r);
td = i - length(r)/2;
I am filtering my signals x1 and x2 by removing all frequencies below 17kHz.
I am having two problems with the above code:
1. With the sensors and source at the same place, I am getting a different value of 'td' each time. I am not sure what is wrong. Is it because of the noise? If so, can anyone please provide a solution? I have read many papers and gone through other questions on Stack Overflow, so please answer with code along with theory instead of just stating the theory.
2. The value of 'td' sometimes does not match the delay calculated using xcorr. What am I doing wrong? Below is my code for td using xcorr:
[xc,lags] = xcorr(x1,x2);
[m,i] = max(xc);
td = lags(i);
One problem you might have is the fact that you only use a single frequency. At f = 17 kHz and an estimated speed of sound v = 340 m/s (I assume you use ultrasound), the wavelength is lambda = v / f = 2 cm. This means that your length measurement has an unambiguous range of 2 cm (sorry, cannot find a good link, google yourself). In other words, you already need to know your distance to better than 2 cm before you can use the result of your measurement to refine the distance.
Think of it in another way: when taking the cross-correlation between two perfect sines, the result should be a 'comb' of peaks with spacing equal to the wavelength. If they overlap perfectly, and you displace one signal by one wavelength, they still overlap perfectly. This means that you first have to know which of these peaks is the right one, otherwise a different peak can be the highest every time purely by random noise. Did you make a plot of the calculated cross-correlation before trying to blindly find the maximum?
This problem is the same as in interferometry, where it is easy to measure small distance variations with a resolution smaller than a wavelength by measuring phase differences, but you have no idea about the absolute distance, since you do not know the absolute phase.
The solution to this is actually easy: let your source generate more frequencies. Even using (band-limited) white-noise should work without problems when calculating cross-correlations, and it removes the ambiguity problem. You should see the white noise as a collection of sines. The cross-correlation of each of them will generate a comb, but with different spacing. When adding all those combs together, they will add up significantly only in a single point, at the delay you are looking for!
White noise, a maximum length sequence or other non-periodic signals should be used as the test signal for time delay measurement using cross-correlation. This is because non-periodic signals have only one cross-correlation peak, so there is no ambiguity in determining the time delay. It is possible to use burst-type periodic signals to do the job, but with degraded SNR. If you have to use a continuous periodic signal as the test signal, then you can only measure a time delay within one period of the periodic test signal. This should explain why, in your case, using a lower-frequency sine wave as the test signal works while using a higher-frequency sine wave does not. This is demonstrated in these videos: https://youtu.be/L6YJqhbsuFY, https://youtu.be/7u1nSD0RlwY .
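To see why a broadband test signal removes the ambiguity, the sketch below delays a white-noise sequence by a known number of samples and recovers that delay from the peak of an FFT-based cross-correlation. The signal length, the delay of 37 samples and the noise level are made-up values; zero-padding both FFTs to nfft avoids circular wrap-around.

```matlab
rng(1);                                   % reproducible random numbers
N         = 4096;                         % samples per sensor (assumption)
trueDelay = 37;                           % delay of sensor 2 in samples (assumption)

s  = randn(N, 1);                         % white-noise source signal
x1 = s + 0.1*randn(N, 1);                 % sensor 1: source plus sensor noise
x2 = [zeros(trueDelay, 1); s(1:N-trueDelay)] + 0.1*randn(N, 1);  % delayed copy

% Linear cross-correlation via zero-padded FFTs.
nfft = 2^nextpow2(2*N - 1);
X1 = fft(x1, nfft);
X2 = fft(x2, nfft);
r  = real(ifft(X2 .* conj(X1)));          % r(k+1) is the correlation at lag k

% Reorder so that the lags run from -(N-1) to +(N-1).
r    = [r(nfft-N+2:nfft); r(1:N)];
lags = (-(N-1):(N-1))';

[~, iMax] = max(r);
estDelay  = lags(iMax);                   % positive: x2 lags behind x1
fprintf('Estimated delay: %d samples\n', estDelay);
```

Dividing estDelay by the sampling rate gives the delay in seconds. Replacing s with a pure sine in this sketch produces a comb of nearly equal peaks, which is the ambiguity described above.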

Time Series from spectrum

I am having a small problem while converting a spectrum to a time series. I have read many articles and I think I am applying the right procedure, but I do not get the right results. Could you help me find the error?
I have a time series like:
When I compute the spectrum I do:
%number of points
nPoints=length(timeSeries);
%time interval
dt=time(2)-time(1);
%Fast Fourier transform
p=abs(fft(timeSeries))./(nPoints/2);
%power of positive frequencies
spectrum=p(1:(nPoints/2)).^2;
%frequency
dfFFT=1/tDur;
frequency=(1:nPoints)*dfFFT;
frequency=frequency(1:(nPoints)/2);
%plot spectrum
semilogy(frequency,spectrum); grid on;
xlabel('Frequency [Hz]');
ylabel('Power Spectrum [N*m]^2/[Hz]');
title('SPD load signal');
And I obtain:
I think the spectrum is well computed. However now I need to go back and obtain a time series from this spectrum and I do:
df=frequency(2)-frequency(1);
ap = sqrt(2.*spectrum*df)';
%random number form -pi to pi
epsilon=-pi + 2*pi*rand(1,length(ap));
%transform to time series
randomSeries=length(time).*real(ifft(pad(ap.*exp(epsilon.*i.*2.*pi),length(time))));
%Add the mean value
randomSeries=randomSeries+mean(timeSeries);
However, the plot looks like:
It is one order of magnitude lower than the original series.
Any recommendation?
There are (at least) two things going on here. The first is that you are throwing away information, and then substituting random numbers for that information.
The FFT of a real sequence is a sequence of complex numbers consisting of a real and imaginary part. Converting those numbers to polar form gives you magnitude and phase angle. You are capturing the magnitude part with p=abs(fft(...)), but you are not capturing the phase angle (which would involve atan2(...)). You are then making up random numbers (epsilon=...) and using those to replace the original numbers when you reconstruct your time series. Also, as the FFT of a real sequence has a particular symmetry, substituting random numbers for the phase angle destroys that symmetry, which means that the IFFT will in general no longer be a real sequence, but a sequence of complex numbers - and again, you're only looking at the real portion of the IFFT, so you're throwing away information again. If this is an audio signal, the results may sound somewhat like the original (or they may be completely different), but the waveform definitely won't match...
The second issue is that in many implementations, ifft(fft(...)) will scale the result by the number of points in the signal. There are several different ways to avoid that, with differing results, but sometimes more attractive in different scenarios, depending on what you are trying to do. You can either scale the fft() result before you do the ifft(), or scale the ifft() result at the end, or in some cases, I've even seen both being scaled by a factor of sqrt(N) - doing it twice has the end result of scaling the final result by N, but it is a bit less efficient since you do the scaling twice...
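If the goal is a new random realisation with the same magnitude spectrum rather than the original waveform, the symmetry issue described above can be handled by randomising the phases of the positive-frequency bins only and mirroring them as complex conjugates. The sketch below does this on a made-up series; timeSeriesDemo and the other names are illustrative, and an even number of samples is assumed.

```matlab
rng(2);                                     % reproducible random numbers
N = 1024;                                   % even number of samples (assumption)
timeSeriesDemo = cumsum(randn(N, 1));       % made-up stand-in for the measured series

X   = fft(timeSeriesDemo);
mag = abs(X);                               % keep magnitudes, discard original phases

% Random phases for the positive-frequency bins 2 .. N/2.
phi = -pi + 2*pi*rand(N/2 - 1, 1);

Y          = zeros(N, 1);
Y(1)       = X(1);                          % DC bin stays real (carries the mean)
Y(2:N/2)   = mag(2:N/2) .* exp(1i*phi);     % positive frequencies
Y(N/2+1)   = mag(N/2+1);                    % Nyquist bin must be real
Y(N/2+2:N) = conj(Y(N/2:-1:2));             % negative frequencies mirror the positive ones

y            = ifft(Y);                     % MATLAB's ifft already divides by N
maxImag      = max(abs(imag(y)));           % should be at rounding-error level
randomSeries = real(y);
```

Because every magnitude is kept and MATLAB's ifft applies the 1/N factor itself, randomSeries has the same mean and the same total power as timeSeriesDemo, so no extra scaling by the number of points is needed in this form.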

Matlab: Peak detection for clusters of peaks

I am working with biological signal data, and am trying to count the number of regions with a high density of high amplitude peaks. As seen in the figure below, the regions of interest (as observed qualitatively) are contained in red boxes and 8 such regions were observed for this particular trial. The goal is to mathematically achieve this same result in near real time without the intervention or observation of the researcher.
The data seen plotted below is the result of raw data from a 24-bit ADC being processed by an FIR filter, with no other processing yet being done.
What I am looking for is a method, or ideally code, to help me detect such regions as identified while subsequently ignoring some of the higher amplitude peaks in between the regions of interest (i.e. between regions 3 and 4, 5 and 6, or 7 and 8 there is a narrow region of high amplitude which is not of concern). It is worth noting that the maximum is not known prior to computation.
Thanks for your help.
Data
https://www.dropbox.com/s/oejyy6tpf5iti3j/FIRData.mat
Can you work with thresholds?
Define:
(1) "amplitude threshold": if the signal is greater than the threshold, it is considered a peak
(2) "window size": a fixed time duration
Algorithm:
If n peaks are detected within a duration of "window size", then consider the signal within that window a cluster of peaks. (I worked with eye-blink EEG data this way before; not sure if it is suitable for your application.)
P.S. If you have data that are already labelled by a human, you can train a classifier to find your thresholds and window size.
Does it make sense in your problem to have some sort of "window size"? In other words, given a region of "high" amplitude, if you shrink the duration of the region, at what point will it become meaningless to your analysis?
If you can come up with a window, just apply this window to your data as it comes in and compute the energy within the window. Then, you can define some energy threshold and perform simple peak detection on the energy signal.
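The windowed-energy idea can be sketched as follows on synthetic data. The background level, the four artificial bursts, the window length and the threshold are all made-up values that would have to be tuned for the real recording; movmean() needs MATLAB R2016a or later.

```matlab
rng(3);                                    % reproducible random numbers
N = 10000;
x = 0.05*(2*rand(N, 1) - 1);               % low-level background (made-up data)
burstStart = [1000 3500 6000 8500];        % four bursts of dense, large peaks
burstLen   = 400;
for s = burstStart
    k = (s:s+burstLen-1)';
    x(k) = x(k) + (-1).^k;                 % alternating +/-1 spikes
end

win    = 200;                              % window length in samples (assumption)
energy = movmean(x.^2, win);               % mean energy inside the sliding window
thr    = 0.25;                             % energy threshold (assumption)

above    = energy > thr;                   % true where the window energy is high
rising   = find(diff([0; above]) == 1);    % first sample of each detected region
falling  = find(diff([above; 0]) == -1);   % last sample of each detected region
nRegions = numel(rising);
fprintf('Detected %d regions\n', nRegions);
```

A short, isolated spike raises the windowed energy only a little, so it stays below the threshold, whereas a dense cluster of peaks fills the window and is detected. For near-real-time use, a trailing window (for example movmean(x.^2, [win-1 0])) avoids looking ahead.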
By inspection of your data, the regions with high amplitude peaks are repeated at what appears to be fairly uniform intervals. This suggests that you might fit a sine or cosine wave (or a combination of the two) to your data.
Excuse my crude sketch but what I mean is something like this:
Once you make this identification, you can use the FFT to get the dominant spatial frequencies. Keep in mind that the spatial frequency spectrum of your signal may be fairly complex, due to spurious data, but what you are after is one or two dominant frequencies of your data.
For example, I made up a sinusoid and you can do the calculation like this:
N = 255; % # of samples
x = linspace(-1/2, 1/2, N);
dx = x(2)-x(1);
nu = 8; % frequency in cycles/interval
vx = (1/(dx))*[-(N-1)/2:(N-1)/2]/N; % spatial frequency
y = sin(2*pi*nu*x); % this would be your data
F = fftshift(abs(fft(y))/N);
figure; set(gcf,'Color',[1 1 1]);
subplot(2,1,1);plot(x,y,'-b.'); grid on; xlabel('x'); grid on;
subplot(2,1,2);plot(vx,F,'-k.'); axis([-1.3*nu 1.3*nu 0 0.6]); xlabel('frequency'); grid on;
Which gives:
Note the peaks at ± nu, the dominant spatial frequency. Now once you have the dominant spatial frequencies you can reconstruct the sine wave using the frequencies that you have obtained from the FFT.
Finally, once you have your sine wave you can identify the boxes with centers at the peaks of the sine waves.
This is also a nice approach because it effectively filters out the spurious or less relevant spikes, helping you to properly place the boxes at your intended locations.
Since I don't have your data, I wasn't able to complete all of the code for you, but the idea is sound and you should be able to proceed from this point.
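The remaining steps (picking the dominant frequency from the FFT, rebuilding the sine wave and locating its peaks) could be sketched as follows. It reuses the made-up sinusoid from the example above, and the names nuEst, yFit and centres are illustrative. On real data the peak of the spectrum may be far less clean.

```matlab
N  = 255;                                  % # of samples
x  = linspace(-1/2, 1/2, N);
dx = x(2) - x(1);
nu = 8;                                    % frequency in cycles/interval
y  = sin(2*pi*nu*x);                       % this would be your data

Y    = fft(y);
f    = (0:N-1)/(N*dx);                     % frequency of each FFT bin
half = 2:floor(N/2)+1;                     % positive-frequency bins, DC skipped

[~, k] = max(abs(Y(half)));                % dominant bin
kDom   = half(k);
nuEst  = f(kDom);                          % estimated dominant frequency
amp    = 2*abs(Y(kDom))/N;                 % approximate amplitude
phi    = angle(Y(kDom));                   % phase relative to the first sample

% Rebuild the dominant sinusoid on the original grid.
yFit = amp*cos(2*pi*nuEst*(x - x(1)) + phi);

% Local maxima of the fitted sinusoid: candidate centres of the boxes.
isPeak  = yFit(2:end-1) > yFit(1:end-2) & yFit(2:end-1) >= yFit(3:end);
centres = x(find(isPeak) + 1);
fprintf('Estimated dominant frequency: %.2f cycles/interval\n', nuEst);
```

The frequency estimate is limited to the FFT bin spacing 1/(N*dx), so the fitted wave can drift slightly against the data over a long record.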