Add noise to speech with different sampling rates in matlab - matlab

I am trying to add a noise file (.wav) file to a speech signal(.wav).
[b fs]=audioread('AzBio_01-01_S60.wav');
[babble fs1]=audioread('cafeteria_babble.wav');
The issue is that both the files have different sampling rates (fs=22050, fs1=44100).
When i add them it distorts the other signal since the sampling rate is different. How do i solve this. I am playing the sound via
p=audioplayer (total,fs)
play (p)

The distortion will be due to the sample values summing to more than 1 not the sample rate differences. To deal with that risk divide both vectors by two before summing them.
That said, you do still need to deal with the sample rate difference. There are two ways of doing this: either reduce the sampling rate of the signal with the higher sampling rate or increase the sampling rate of the other signal.
Reducing the sampling rate of the signal with the higher rate is easier but comes with the disadvantage that you are actually losing data. You could do this by using resample. resample(babble, fs, fs1)
Alternatively you can upsample the signal with the lower sampling frequency. This won't give you any more data but it will mean the sample rates match. You can do this with interp1. b = interp1(1:length(b),b,0.5:0.5:length(b))
So in summary
[b fs]=audioread('AzBio_01-01_S60.wav');
[babble fs1]=audioread('cafeteria_babble.wav');
b = interp1(1:length(b),b,0.5:0.5:length(b)); % interpolate b to fs1
total = b/2 + babble/2; % sum at half the sample values
p=audioplayer (total,fs1);
play (p)

Related

Does the duration of a signal affect its frequency component's amplitude? Also, does the sampling frequency affect the power of a signal?

I have two questions that are bugging me:
Does the duration of an audio signal affect the amplitude of frequency components of that same signal? For example, I am recording the sound of a fan using a microphone. At first, I record only for 10 sec and convert the audio signal into frequency spectrum. Then, I record the same sound for 20 sec and then convert the audio signal into frequency spectrum. In both the cases, the sound of the fan is same, but does the duration of the signal affect the amplitude of frequency components in the spectrum plot?
For example, I have 2 audio signals. For the first one, I have that same fan sound recording for 10 sec and the sampling frequency is 5KHz, and for the second recording, I have that same audio signal but now the sampling frequency is changed to 15KHz. I used MATLAB to check the power for both the signals and the power for both the signals was same, however I want to know why. Formula that I used was Power=rms(signal)^2. According to me the second signal should have more power because now there are more number of samples compared to the first recording and since those extra samples would also have a random amplitude, the average shouldn't be the same as for the first one. Am I thinking it right?
Can anyone provide their thoughts? Thank You!
This answer is from: https://dsp.stackexchange.com/questions/75025/does-the-duration-of-a-signal-affect-its-frequency-components-amplitude-also
Power is energy per unit time. If you increase the duration, you increase the energy, but due to the normalization with time the power would be the same.
The DFT as given by
X[k]=∑n=0N−1x[n]e−j2πnk/N
will scale the frequency component by N as given by the summation over N samples. This can be normalized by multiplying the result by 1/N.
The frequency components of the signal levels will be the same in a normalized DFT (normalized by dividing by the total number of samples) for signal components that occupy one bin (pure tones), but the noise floor as observed may be lower by the change in sampling rate: if the noise floor is limited by quantization noise, the total quantization noise (well approximated as white noise, meaning constant across all frequencies) will be spread over a wider frequency range, so increasing the sampling rate will cause the contribution of quantization noise on a per Hz basis (noise density) to be lower. Further the duration will effect the frequency resolution in each bin in the frequency domain; for an unwindowed DFT, the equivalent noise bandwidth per bin is fs/N where fs is the sampling rate and N is the number of bins. At a given sampling rate, increasing the number of bins will increase the total duration and thus reduce the noise bandwidth per bin; causing the overall DFT noise floor as observed to decrease. (Same noise power just less measured in each bin). Windowing if done in the time domain will reduce the signal level by the coherent gain of the window, but increase the equivalent noise bandwidth such that pure tones will be reduced more than noise that is spread over multiple bins.

Playing a varying frequency in Hertz in matlab?

I have some matlab code which produces frequency (in hertz) of a sound over 5 seconds. The code as it stands outputs 100 samples per second, and I want to play the 5 second block to see what this sounds like, but I'm having issues with sampling rate and sound / soundsc commands.
My frequency oscillates (data here ) and I'd be very grateful if someone could help me convert this data into some kind of real-time approximation of what it should sound like.
Something like this may be helpful
Fs=2000; %sample rate, Hz
t=0:1/Fs:5; %time vector
F=298+sin(2*pi*t); %put your own F here
S=sin(2*pi*F.*t); %here is the sound vector
%visual check
figure(1);
plot(t,S)
figure(2);
plot(t,F)
%listen
wavplay(S,Fs)
This is like FM modulation, but different. If you have an Fold vector with a different sample rate, you can convert it with the command
F=interp1(told,Fold,t); %told and Fold are F at a different sample rate
%check it
plot(told,Fold,t,F)
First, your sampling rate should be at least twice the maximum frequency according the Nyquist–Shannon sampling theorem.
Next, you need to generate a sinusoid:
Signal = sin(2*pi*Phi);
where Phi is the phase corresponding to the desired frequency pattern, which is simply an integral of the frequency (which you can do numerically or analytically).

How can filter data to remove noise in matlab?

I have 2 arrays of 800000 input and output data samples of a system. The system in a kind of oven that works among 0 and 10 volts. The sample time is 0.001s.
I have to identify the model of this system, but first of all, given that the data are clearly dirty, I would like to filter the noise.
How can I do it with the System Identification Toolbox of Matlab?
Moreover, how can I estimate the cutoff frequency to remove the noise?
Thank you in advance.
PS: given that this is a bit out of topic, please, post your answer here thank you.
The cutoff frequency is directly given by you sampling time or sampling frequency.
you sampling frequency is 1/(sampling time) and must at least 2 the factor of the highest frequency of interest:
http://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem
f_s = 1/T_s >= 2*f_cutOff
You can then simply to same frequency domain processing in the case you sampling frequency is realy high enough. The easiest way would to have a look at the frequency domain (with function fft() ). And check first where you have high noise components. Then filter out these components (zeroing) and then transform it back into time domain ( with function ifft() ).
Noise is modeled as a white Gaussian distribution in the simplest case. If you estimate the noise energy, you can make a dummy noise by calling
noise = A*randn(1,N);
Here, A is the amplitude and N is the sample count. then just take the fft of this signal and subtract it from the fft of input signal and take the inverse fft (ifft)

Matlab: Finding dominant frequencies in a frame of audio data

I am pretty new to Matlab and I am trying to write a simple frequency based speech detection algorithm. The end goal is to run the script on a wav file, and have it output start/end times for each speech segment. If use the code:
fr = 128;
[ audio, fs, nbits ] = wavread(audioPath);
spectrogram(audio,fr,120,fr,fs,'yaxis')
I get a useful frequency intensity vs. time graph like this:
By looking at it, it is very easy to see when speech occurs. I could write an algorithm to automate the detection process by looking at each x-axis frame, figuring out which frequencies are dominant (have the highest intensity), testing the dominant frequencies to see if enough of them are above a certain intensity threshold (the difference between yellow and red on the graph), and then labeling that frame as either speech or non-speech. Once the frames are labeled, it would be simple to get start/end times for each speech segment.
My problem is that I don't know how to access that data. I can use the code:
[S,F,T,P] = spectrogram(audio,fr,120,fr,fs);
to get all the features of the spectrogram, but the results of that code don't make any sense to me. The bounds of the S,F,T,P arrays and matrices don't correlate to anything I see on the graph. I've looked through the help files and the API, but I get confused when they start throwing around algorithm names and acronyms - my DSP background is pretty limited.
How could I get an array of the frequency intensity values for each frame of this spectrogram analysis? I can figure the rest out from there, I just need to know how to get the appropriate data.
What you are trying to do is called speech activity detection. There are many approaches to this, the simplest might be a simple band pass filter, that passes frequencies where speech is strongest, this is between 1kHz and 8kHz. You could then compare total signal energy with bandpass limited and if majority of energy is in the speech band, classify frame as speech. That's one option, but there are others too.
To get frequencies at peaks you could use FFT to get spectrum and then use peakdetect.m. But this is a very naïve approach, as you will get a lot of peaks, belonging to harmonic frequencies of a base sine.
Theoretically you should use some sort of cepstrum (also known as spectrum of spectrum), which reduces harmonics' periodicity in spectrum to base frequency and then use that with peakdetect. Or, you could use existing tools, that do that, such as praat.
Be aware, that speech analysis is usually done on a frames of around 30ms, stepping in 10ms. You could further filter out false detection by ensuring formant is detected in N sequential frames.
Why don't you use fft with `fftshift:
%% Time specifications:
Fs = 100; % samples per second
dt = 1/Fs; % seconds per sample
StopTime = 1; % seconds
t = (0:dt:StopTime-dt)';
N = size(t,1);
%% Sine wave:
Fc = 12; % hertz
x = cos(2*pi*Fc*t);
%% Fourier Transform:
X = fftshift(fft(x));
%% Frequency specifications:
dF = Fs/N; % hertz
f = -Fs/2:dF:Fs/2-dF; % hertz
%% Plot the spectrum:
figure;
plot(f,abs(X)/N);
xlabel('Frequency (in hertz)');
title('Magnitude Response');
Why do you want to use complex stuff?
a nice and full solution may found in https://dsp.stackexchange.com/questions/1522/simplest-way-of-detecting-where-audio-envelopes-start-and-stop
Have a look at the STFT (short-time fourier transform) or (even better) the DWT (discrete wavelet transform) which both will estimate the frequency content in blocks (windows) of data, which is what you need if you want to detect sudden changes in amplitude of certain ("speech") frequencies.
Don't use a FFT since it calculates the relative frequency content over the entire duration of the signal, making it impossible to determine when a certain frequency occured in the signal.
If you still use inbuilt STFT function, then to plot the maximum you can use following command
plot(T,(floor(abs(max(S,[],1)))))

how can the noise be removed from a recorded sound,using fft in MATLAB?

I want to remove noises from a recorded sound and make the fft of it finding fundamental frequencies of that sound, but I don't know how to remove those noises. I'm recording the sound of falling objects from different heights. I want to find the relation between the height and the maximum frequency of the recorded sound.
[y,fs]=wavread('100cmfreefall.wav');
ch1=y(:,1);
time=(1/44100)*length(ch1);
t=linspace(0,time,length(ch1));
L=length(ch1);
NFFT = 2^nextpow2(L); % Next power of 2 from length of y
Y = fft(y,NFFT)/L;
Y1=log10(Y);
figure(1)
f = fs/2*linspace(0,1,NFFT/2+1);
plot(f,2*abs(Y1(1:NFFT/2+1))) ;
[b,a]=butter(10,3000/(44100/2),'high');
Y1=filtfilt(b,a,Y1);
% freqz(b,a)
figure(2)
plot(f,2*abs(Y1(1:NFFT/2+1))) ;
title('Single-Sided Amplitude Spectrum of y(t)');
xlabel('Frequency (Hz)');
ylabel('|Y(f)|')
xlim([0 50000])
% soundsc(ch1(1:100000),44100)
Saying that there is noise in your signal is very vague and doesn't convey much information at all. Some of the questions are:
Is the noise high frequency or low frequency?
Is it well separated from your signal's frequency band or is it mixed in?
Does the noise follow a statistical model? Can it be described as a stationary process?
Is the noise another deterministic interfering signal?
The approach you take will certainly depend on the answers to the above questions.
However, from the experiment setup that you described, my guess is that your noise is just a background noise, that in most cases, can be approximated to be white in nature. White noise refers to a statistical noise model that has a constant power at all frequencies.
The simplest approach will be to use a low pass filter or a band pass filter to retain only those frequencies that you are interested in (a quick look at the frequency spectrum should reveal this, if you do not know it already). In a previous answer of mine, to a related question on filtering using MATLAB, I provide examples of creating low-pass filters and common pitfalls. You can probably read through that and see if it helps you.
A simple example:
Consider a sinusoid with a frequency of 50 Hz, sampled at 1000 Hz. To that, I add Gaussian white noise such that the SNR is ~ -6dB. The original signal and the noisy signal can be seen in the top row of the figure below (only 50 samples are shown). As you can see, it almost looks as if there is no hope with the noisy signal as all structure seems to have been destroyed. However, taking an FFT, reveals the buried sinusoid (shown in the bottom row)
Filtering the noisy signal with a narrow band filter from 48 to 52 Hz, gives us a "cleaned" signal. There will of course be some loss in amplitude due to the noise. However, the signal has been retrieved from what looked like a lost cause at first.
How you proceed depends on your exact application. But I hope this helped you understand some of the basics of noise filtering.
EDIT
#Shabnam: It's been nearly 50 comments, and I really do not see you making any effort to understand or at the very least, try things on your own. You really should learn to read the documentation and learn the concepts and try it instead of running back for every single error. Anyway, please try the following (modified from your code) and show the output in the comments.
[y,fs]=wavread('100cmfreefall.wav');
ch1=y(:,1);
time=(1/fs)*length(ch1);
t=linspace(0,time,length(ch1));
L=length(ch1);
NFFT = 2^nextpow2(L);
f = fs/2*linspace(0,1,NFFT/2+1);
[b,a]=butter(10,3e3/(fs/2),'high');
y1=filtfilt(b,a,ch1);
figure(1)
subplot(2,1,1)
Y=fft(ch1,NFFT)/L;
plot(f,log10(abs(Y(1:NFFT/2+1))))
title('unfiltered')
subplot(2,1,2)
Y1=fft(y1,NFFT)/L;
plot(f,log10(abs(Y1(1:NFFT/2+1))))
title('filtered')
Answer to your question is highly dependent on the characteristics of what you call "noise" - its spectral distribution, the noise being stationary or not, the source of the noise (does it originate in the environment or the recording chain?).
If the noise is stationary, i.e its statistical characteristics do not change over time, you can try recording a few seconds (10-15 is a good initial guess) of noise only, preform FFT, and then subtract the value of the noise in FFT bin n from your measurement FFT bin n.
You can read some background here: http://en.wikipedia.org/wiki/Noise_reduction