I have a large pulse oximetry signal. Part of it is noisy and will corrupt my data if I use it. Do you have any strategy to automatically remove the noisy parts? (Since the data is very long and there are many channels, I can't really do it manually.)
Please see the attached picture to get a feel for the signal.
You can filter it, but you need to know either the spectral characteristics of the signal so you can extract it, or the spectral characteristics of the noise so you can suppress it. Do you have a noise-free reference signal, or do you know where in the spectrum your signal of interest lies?
This might be a problem identical to removing outliers from a time series.
This problem can be solved by fitting the time series with a given model, as shown in this link. For example, try the following simulation code.
xdata = (0:0.1:2*pi)';
y0 = sin(xdata); % pure data
gnoise = y0.*randn(size(y0)); % noise component
ydata = y0 + gnoise; % noisy data
f = fittype('a*sin(b*x)');
fit1 = fit(xdata,ydata,f,'StartPoint',[1 1]);
% Flag points with large residuals as outliers before plotting them
fdata = feval(fit1,xdata);
I = abs(fdata - ydata) > 1.5*std(ydata);
outliers = excludedata(xdata,ydata,'indices',I);
plot(fit1,'r-',xdata,ydata,'k.',outliers,'m*')
xlim([0 2*pi])
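If the goal is to drop the noisy samples rather than just flag them, a minimal follow-up sketch is to refit with the flagged points excluded via the 'Exclude' option of fit; the robust 'Bisquare' variant in the comment is an assumption worth trying on real pulse-oximetry data rather than part of the original example.
fit2 = fit(xdata,ydata,f,'StartPoint',[1 1],'Exclude',outliers);    % refit without the flagged points
% fit3 = fit(xdata,ydata,f,'StartPoint',[1 1],'Robust','Bisquare'); % or let robust fitting down-weight them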
I have calculated the power spectrum of a signal. The steps are:
FFT of the time signal
square of the absolute value of the FFT divided by the length of the signal, i.e. the power spectrum
Now I want to convert it back into the time domain. What steps should I follow?
The reconstruction of the original signal from the frequency domain requires both the magnitude and the phase information. Since computing the power spectrum keeps only the magnitude, you no longer have all the information required to uniquely reconstruct the original signal.
In other words, we can find examples where different signals have exactly the same power spectrum. In that case it is not possible to retrieve which of those different signals was the original one.
As a simple illustration, let's say the original signal x is:
x = [0.862209 0.43418 0.216947544 0.14497645];
For sake of argument, let's consider some other signal y, which I've specially crafted for the purpose of this example as:
y = [-0.252234 -0.0835824 -0.826926341 -0.495571572];
As shown in the following plots, those two signals might appear completely unrelated:
They do however share the same power spectrum:
N = length(x);   % number of samples (4 in this example)
f = (0:N-1)/N;   % normalized frequency axis
Xf = fft(x,N);
Yf = fft(y,N);
hold off; plot(f, Xf.*conj(Xf)/N, 'b');
hold on; plot(f, Yf.*conj(Yf)/N, 'r:');
xlabel('Normalized frequency');
legend('Px', 'Py')
title('Power spectrum');
As a result, someone who only sees the power spectrum and doesn't know that you started with x, could very well guess that you instead started with y.
That said, the fact that those signals have the same power spectrum could tell you that those signals aren't as unrelated as you might think. In fact those signals also share the same autocorrelation function in the time-domain:
Rx = xcorr(x);
Ry = xcorr(y);
t = [0:length(Rx)-1] - length(x) + 1;
hold off; stem(t, Rx, 'bo');
hold on; stem(t, Ry, 'rx');
legend('Rxx', 'Ryy');
xlabel('lag');
title('Autocorrelation');
This is to be expected, since the autocorrelation can be obtained by computing the inverse transform (with ifft) of the power spectrum. This, however, is about as much as you can recover in the time domain. Any signal with this autocorrelation function would be as good a guess as any for the original signal. If you are very motivated you could attempt to solve the set of non-linear equations obtained from the definition of the autocorrelation and obtain a list of possible signals. That would still not be sufficient to tell which one was the original, and as you noticed when comparing my example x and y, there wouldn't be a whole lot to make of it.
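A quick way to verify the ifft relation with the x from above (a minimal sketch; the zero-padding to 2N-1 samples is only there so the circular result from ifft matches the linear lags returned by xcorr):
x = [0.862209 0.43418 0.216947544 0.14497645];
N = length(x);
Xf = fft(x, 2*N-1);                               % zero-padded so circular = linear correlation
R_from_psd = real(fftshift(ifft(Xf.*conj(Xf))));  % ifft of the power spectrum, lags -(N-1)..(N-1)
R_direct   = xcorr(x);                            % time-domain autocorrelation at the same lags
max(abs(R_from_psd - R_direct))                   % difference is at machine-precision level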
The easiest way to see the non-uniqueness of the power (or amplitude) spectrum for describing the time domain signal is that both white noise and the delta function in the time domain have the same power (or amplitude) spectrum - a constant - in the frequency domain.
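A minimal numerical illustration of this last point (the length and the number of averaged noise realizations below are arbitrary choices; a single noise realization is not flat, so the periodogram is averaged over many of them):
N = 64; K = 2000;                        % signal length and number of noise realizations (assumed)
d  = [1 zeros(1,N-1)];                   % unit impulse
Pd = abs(fft(d)).^2 / N;                 % exactly 1/N in every bin
W  = randn(K,N) / sqrt(N);               % white-noise rows with matching average energy
Pw = mean(abs(fft(W,[],2)).^2, 1) / N;   % averaged periodogram, also approximately 1/N
plot(0:N-1, Pd, 'b', 0:N-1, Pw, 'r:');
legend('impulse', 'white noise (averaged)'); xlabel('bin'); title('Power spectrum');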
I'm working in the area of biosignal acquisition. I ran an experiment, detailed below, and am now trying to obtain some results from the data.
I have a text file of a signal in Matlab. I loaded the signal onto a waveform generator, then I recorded the generator output on an oscilloscope.
I imported the recorded signal from the oscilloscope back into Matlab.
The Pearson correlation coefficient between the original signal and the oscilloscope signal is 0.9958 (obtained using the corrcoef function).
I want to compute the SNR of the oscilloscope signal (what I'm calling my signal plus whatever noise is introduced through the digital-to-analog conversion and vice versa). I have attached a snippet of the two signals for reference.
So my original signal is X and oscilloscope signal is X + N.
I used the snr function to compute SNR as follows.
snr(original, (oscilloscope - original))
The result I got was 20.44 dB.
This seems off to me, as I would have thought that with such a high correlation the SNR should be much higher.
Or is it not appropriate to try and compute SNR in this sort of situation?
All help is appreciated.
Thanks
Edit: Graph of a couple of results vs Sleutheye's simulated relationship
You might be surprised at how even such a moderate SNR can still result in fairly high correlations.
I ran an experiment to illustrate the approximate relation between correlation and the signal-to-noise-ratio estimate. Since I did not have your specific EEG signal, I just used a constant reference signal and some white Gaussian noise. Keep in mind that the relationship could be affected by the nature of the signal and noise, but it should give you an idea of what to expect. The simulation can be executed with the following code:
SNR = [10:1:40];   % SNR values to test, in dB
M = 10000;         % number of samples per trial
C = zeros(size(SNR));
for i=1:length(SNR)
    x = ones(1,M);                                   % constant reference "signal"
    K = sqrt(sum(x.*x)/M)*power(10, -SNR(i)/20);     % noise standard deviation for the target SNR
    z = x + K*randn(size(x));                        % noisy observation
    C(i) = xcorr(x,z,0)./sqrt(sum(x.*x)*sum(z.*z));  % normalized zero-lag correlation
end
figure(1);
hold off; plot(SNR, C);                                  % simulated correlation vs SNR
corr0 = 0.9958;
hold on; plot([SNR(1) SNR(end)], [corr0 corr0], 'k:');   % your measured correlation
snr0 = 20.44;
hold on; plot([snr0 snr0], [min(C) max(C)], 'r:');       % your measured SNR
xlabel('SNR (dB)');
ylabel('Correlation');
The dotted black horizontal line highlights your 0.9958 correlation measurement, and the dotted red vertical line highlights your 20.44 dB SNR result.
I'd say that's a pretty good match!
In fact, for this specific case in my simulation (x = 1; z = x + N(0,σ)), if we denote by C(x,z) the correlation between x and z and by σ the noise standard deviation, we can actually show that C(x,z) ≈ 1/sqrt(1 + σ^2), or equivalently SNR ≈ 10*log10(C^2/(1 - C^2)) dB.
Given a correlation value of 0.9958, this would yield an SNR of 20.79dB, which is consistent with your results.
Hi all, I have a sound with noise. I want to remove that noise; how can I remove it?
Original Sound: Zamfir-EinsamerHirte
Noisy Sound: Zamfir-EinsamerHirte_noisy
[y4,Fs]=audioread('Zamfir-EinsamerHirte_noisy.ogg');
ffty4=fft(y4);
First I analysed the signal
shiftedffty4=fftshift(ffty4);
spectrumy41=abs(shiftedffty4);
phaseffty41 = angle(shiftedffty4);
N4=length(spectrumy41);
t4=-Fs/2:Fs/N4:Fs/2-Fs/N4;
spectrumy42=abs(fftshift(ffty4))/N4;
phaseffty42=angle(fftshift(ffty4));
Secondly, I made an all-pass filter with the same length as the spectrum, multiplied it with the FFT of the noisy sound, took the inverse FFT, removed the imaginary parts, and played the sound. The sound is still noisy.
allpassfilter=ones([N4,2]);
allpassfilter(spectrumy42>1e+06)=0;
filteredy4=allpassfilter.*ffty4;
filteredyeni4=ifft(filteredy4);
filteredyy4=real(filteredyeni4);
sound(filteredyy4,Fs);
But I couldn't remove the noise. The problem is that I don't know how to zero out the noise locations in allpassfilter, like below:
allpassfilter(spectrumy42>1e+06)=0;
How can I do this? Any help will be appreciated! Thanks in advance.
I downloaded the clean and noisy audio files.
First let's analyze a small portion of the audio.
% data_n / data_c below are the noisy and clean recordings, loaded beforehand with audioread
n=1024*8; % a small portion of data
w1=1e5;
w2=w1+n-1;
sig_noisy=data_n(w1:w2,1); % noisy audio
sig_clean=data_c(w1:w2,1); % clean audio
figure; hold all
plot(sig_noisy,'b')
plot(sig_clean,'r','LineWidth',2)
ylim([-1.5 1.5])
legend('Noisy','Clean')
As seen here, the noisy audio is a somewhat saturated and truncated version of the clean signal. Truncating (clipping) a signal causes harmonics at higher frequencies. Let's look at the power spectral densities of the signals.
n=1024*1; % a smaller portion of data
w1=1e5;
w2=w1+n-1;
sig_noisy=data_n(w1:w2,1); % noisy
sig_clean=data_c(w1:w2,1); % clean
[psd_noisy, f] = pwelch(sig_noisy);
[psd_clean, ~] = pwelch(sig_clean);
figure; hold all
plot(f/pi,db(psd_noisy),'b')
plot(f/pi,db(psd_clean),'r')
xlabel('Normalized Freq.')
legend('Noisy','Clean')
You can see that the noisy audio has harmonics plus noise at high frequencies. Now, if you assume that the characteristics of the noise do not change through the rest of the audio, you can design a filter by looking at this small portion of the audio. Since you already have the noisy and clean signals together, why not use a deconvolution method?
For example, if you deconvolve the clean signal with the noisy one, you obtain the inverse response of your system (h_inv), which also gives the filter coefficients you can use to filter the noisy signal (sig_noisy = sig_clean * h).
Here I use the Wiener deconvolution method. Also note that this function is not meant only for images; you can use Matlab's deconvolution methods with 1D signals as well.
h_inv=deconvwnr(sig_clean,sig_noisy);
figure,plot(h_inv)
legend('h^-^1')
As I said, these are the filter coefficients you need. For example, if I filter the noisy signal with h_inv:
sig_filtered=conv(sig_noisy,h_inv,'same');
[psd_filtered, ~] = pwelch(sig_filtered);
figure; hold all
plot(f/pi,db(psd_noisy),'b')
plot(f/pi,db(psd_clean),'r')
plot(f/pi,db(psd_filtered),'k')
xlabel('Normalized Freq.')
legend('Noisy','Clean','Filtered')
The filtered signal spectrum is pretty close to the clean signal spectrum. Now that you have the filter coefficients, just filter the whole noisy audio with h_inv and listen to the result.
filtered_all=conv(data_n(:,1),h_inv,'same');
sound(filtered_all,48000)
You may try other deconvolution methods and see the results. You can also zero out the unwanted spectrum in the Fourier domain and take the inverse Fourier transform to get a clean signal; however, since the signal is long, you will have to do that in a sliding window. Alternatively, you can design cascaded notch filters to filter each harmonic separately.
I see that there are four strong harmonics, so I design four notch filters, one for each, plus a lowpass filter to remove the high-frequency noise.
% First notch
fc1=0.0001; bw1=0.05; N=4;
f = fdesign.notch('N,F0,BW',N,fc1,bw1); h = design(f);
% Second notch
fc2=0.21; bw2=0.2;
f = fdesign.notch('N,F0,BW',N,fc2,bw2); h2 = design(f);
% Third notch
fc3=0.41; bw3=0.2;
f = fdesign.notch('N,F0,BW',N,fc3,bw3); h3 = design(f);
% Fourth notch
fc4=0.58; bw4=0.2;
f = fdesign.notch('N,F0,BW',N,fc4,bw4); h4 = design(f);
% A Final lowpass filter
f = fdesign.lowpass('Fp,Fst,Ap,Ast',0.6,0.65,1,30); h5 = design(f);
% Cascade the filters
hd = dfilt.cascade(h, h2, h3, h4, h5);
% See the filter characteristic
ff=fvtool(hd,'Color','white');
% Now we can filter our small portion of the noisy signal
sig_filtered2 = filter(hd,sig_noisy);
[psd_filtered2,f] = pwelch(sig_filtered2);
figure; hold all
plot(f/pi,db(psd_noisy),'b');
plot(f/pi,db(psd_clean),'r');
plot(f/pi,db(psd_filtered2),'k');
xlabel('Normalized Freq.')
legend('Noisy','Clean','Filtered')
Now you can filter the whole audio
filtered_all2 = filter(hd,data_n(:,1));
sound(filtered_all2,48000)
Hope I helped.
I am pretty new to Matlab and I am trying to write a simple frequency-based speech detection algorithm. The end goal is to run the script on a wav file and have it output start/end times for each speech segment. If I use the code:
fr = 128;
[ audio, fs, nbits ] = wavread(audioPath);
spectrogram(audio,fr,120,fr,fs,'yaxis')
I get a useful frequency intensity vs. time graph like this:
By looking at it, it is very easy to see when speech occurs. I could write an algorithm to automate the detection process by looking at each x-axis frame, figuring out which frequencies are dominant (have the highest intensity), testing the dominant frequencies to see if enough of them are above a certain intensity threshold (the difference between yellow and red on the graph), and then labeling that frame as either speech or non-speech. Once the frames are labeled, it would be simple to get start/end times for each speech segment.
My problem is that I don't know how to access that data. I can use the code:
[S,F,T,P] = spectrogram(audio,fr,120,fr,fs);
to get all the features of the spectrogram, but the results of that code don't make any sense to me. The bounds of the S,F,T,P arrays and matrices don't correlate to anything I see on the graph. I've looked through the help files and the API, but I get confused when they start throwing around algorithm names and acronyms - my DSP background is pretty limited.
How could I get an array of the frequency intensity values for each frame of this spectrogram analysis? I can figure the rest out from there, I just need to know how to get the appropriate data.
What you are trying to do is called speech activity detection. There are many approaches to this; the simplest might be a band-pass filter that passes the frequencies where speech is strongest, between 1 kHz and 8 kHz. You could then compare the total signal energy with the band-limited energy, and if the majority of the energy is in the speech band, classify the frame as speech. That's one option, but there are others too.
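A minimal sketch of that band-energy idea follows (the file name, frame/hop lengths, filter order, and the 50% threshold are all assumptions to tune; it also assumes fs > 16 kHz so the 8 kHz band edge is valid, and uses audioread, the current replacement for wavread):
[audio, fs] = audioread('speech.wav');        % hypothetical file name
audio = audio(:,1);                           % first channel only
frameLen = round(0.030*fs);                   % ~30 ms frames
hop      = round(0.010*fs);                   % ~10 ms step
[b, a]   = butter(4, [1000 8000]/(fs/2), 'bandpass');   % "speech band" from above
audioBP  = filtfilt(b, a, audio);             % zero-phase band-limited copy
nFrames  = floor((length(audio) - frameLen)/hop) + 1;
isSpeech = false(nFrames, 1);
for k = 1:nFrames
    idx = (k-1)*hop + (1:frameLen);
    isSpeech(k) = sum(audioBP(idx).^2) > 0.5*sum(audio(idx).^2);   % majority of energy in the band
end
tFrames = ((0:nFrames-1)*hop + frameLen/2)/fs;   % frame centers in seconds, for start/end times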
To get the frequencies at the peaks you could use the FFT to get the spectrum and then use peakdetect.m. But this is a very naïve approach, as you will get a lot of peaks belonging to the harmonic frequencies of a base sine.
Theoretically you should use some sort of cepstrum (also known as the spectrum of the spectrum), which collapses the harmonics' periodicity in the spectrum down to the base frequency, and then use that with peak detection. Or you could use existing tools that do this, such as Praat.
Be aware that speech analysis is usually done on frames of around 30 ms, stepping in 10 ms. You could further filter out false detections by ensuring a formant is detected in N sequential frames.
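For instance, a minimal way to keep only detections that persist for N sequential frames (assuming a per-frame logical vector isSpeech like the one in the sketch above; N = 5 is an arbitrary choice):
N = 5;                                                            % assumed: require 5 consecutive frames
isSpeechStable = conv(double(isSpeech), ones(N,1), 'same') >= N;  % true only inside runs of length >= N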
Why don't you use fft with fftshift:
%% Time specifications:
Fs = 100; % samples per second
dt = 1/Fs; % seconds per sample
StopTime = 1; % seconds
t = (0:dt:StopTime-dt)';
N = size(t,1);
%% Sine wave:
Fc = 12; % hertz
x = cos(2*pi*Fc*t);
%% Fourier Transform:
X = fftshift(fft(x));
%% Frequency specifications:
dF = Fs/N; % hertz
f = -Fs/2:dF:Fs/2-dF; % hertz
%% Plot the spectrum:
figure;
plot(f,abs(X)/N);
xlabel('Frequency (in hertz)');
title('Magnitude Response');
Why do you want to use complex stuff?
A nice and complete solution may be found at https://dsp.stackexchange.com/questions/1522/simplest-way-of-detecting-where-audio-envelopes-start-and-stop
Have a look at the STFT (short-time Fourier transform) or (even better) the DWT (discrete wavelet transform), both of which estimate the frequency content in blocks (windows) of data. That is what you need if you want to detect sudden changes in the amplitude of certain ("speech") frequencies.
Don't use a plain FFT over the whole recording, since it computes the relative frequency content over the entire duration of the signal, making it impossible to determine when a certain frequency occurred.
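As a rough sketch of extracting per-frame band energy from the spectrogram output the question already computes (reusing audio, fr, and fs from the question's code; the band edges and the threshold below are assumptions to tune):
[S,F,T,P] = spectrogram(audio, fr, 120, fr, fs);   % rows of P: frequencies F, columns: frame times T
band      = F >= 1000 & F <= 8000;                 % assumed speech band (see the first answer above)
bandPower = sum(P(band,:), 1);                     % power in the band for each frame
speechFrames = bandPower > 10*median(bandPower);   % assumed threshold; tune for your recording
speechTimes  = T(speechFrames);                    % frame times labeled as speech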
If you still want to use the built-in STFT function (spectrogram), then to plot the maximum over frequency you can use the following command:
plot(T,(floor(abs(max(S,[],1)))))
I want to ask some questions related to my last question, so I don't want to post in another thread. My question contains code, so I can't post it as a comment; therefore I have edited my old question into a new one. Please take a look and help. Thank you.
I'm new to FFT and DSP and I want to ask you some questions about calculating the FFT in Matlab. The following code is from the Matlab help; I just removed the noise.
Can I choose the length of the signal L to be different from NFFT?
I'm not sure if I used the window correctly. When I use a window (hann in the following code), I can't get the exact amplitude values.
When L and NFFT have different values, the amplitude values are different too. How can I get the exact amplitude of the input signal? (In the following code I used an already-known signal to check whether the code works correctly. But if I got the signal from a sensor and didn't know its amplitude ahead of time, how could I check it?)
I thank you very much and look forward to hearing from you :)
Fs = 1000; % Sampling frequency
T = 1/Fs; % Sample time
L = 512; % Length of signal
NFFT=1024; % number of fft points
t = (0:L-1)*T; % Time vector
x = 0.7*sin(2*pi*50*t) + sin(2*pi*120*t); % input signal
X = fft(hann(L).*x', NFFT)/L;
f = Fs/2*linspace(0,1,NFFT/2+1);
plot(f,2*abs(X(1:NFFT/2+1))) % Plot single-sided amplitude spectrum.
L is the number of samples in your input signal. If L < NFFT, the signal is zero-padded up to NFFT.
I would recommend you do some reading on the effect of zero-padding on FFTs. Typically it is best to use L = NFFT as this will give you the best representation of your data.
An accepted answer on the use of zero-padding and FFTs is given here:
https://dsp.stackexchange.com/questions/741/why-should-i-zero-pad-a-signal-before-taking-the-fourier-transform
In your experiment you are seeing different amplitudes because you get a different amount of spectral leakage with each different L.
You need to apply a window function prior to the FFT to get consistent results for frequency components that have a non-integral number of periods within your sampling window.
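On the amplitude question specifically: a window scales every spectral line by its coherent gain (the mean of the window, about 0.5 for Hann), so a common fix is to normalize by the window sum instead of by L. A minimal sketch reusing the variables from the code in the question (peak readings are still approximate when the tone falls between bins):
w = hann(L);                     % same window as in the question
Xw = fft(w.*x', NFFT) / sum(w);  % normalize by the window sum, not by L
f = Fs/2*linspace(0,1,NFFT/2+1);
plot(f, 2*abs(Xw(1:NFFT/2+1)))   % peaks read close to 0.7 and 1.0 (slightly low off-bin)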
You might also want to consider using periodogram instead of using the FFT directly - it takes care of window functions and a lot of the other housekeeping for you.
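A minimal sketch of the periodogram route, again reusing the question's variables (the 'power' scaling option makes a sinusoid's peak read approximately A^2/2, so sqrt(2*pxx) is an amplitude estimate; exact values again depend on leakage):
[pxx, fp] = periodogram(x, hann(L), NFFT, Fs, 'power');   % window handling and scaling done for you
plot(fp, sqrt(2*pxx))                                     % approximate amplitude spectrum
xlabel('Frequency (Hz)'); ylabel('Estimated amplitude');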