How to convert a speech spectrum to time domain - matlab

I am doing speech analysis. I recorded sound for 5 seconds, removed the DC offset, normalised it, applied a Hamming window, and took the spectrum using fft. I want to hear how much the sound has changed. Is there a way to convert the FFT back to the time domain?
clc,clear;
% Record your voice for 5 seconds.
%recObj = audiorecorder;
recObj = audiorecorder(96000, 16, 1);
disp('Start speaking.')
recordblocking(recObj,5);
disp('End of Recording.');
% Play back the recording.
play(recObj);
get(recObj);
myspeech = getaudiodata(recObj);
wavwrite(double(myspeech),96000,'C://Users//naveen//Desktop//unprocessed')
% Store data in double-precision array.
myRecording = getaudiodata(recObj);
% Plot the samples.
figure,plot(myRecording),title('Original Sound');
%Offset Elimination
a = myRecording;
a=double(a);
D = a-mean(a);
figure,plot(D),title('Sound after Offset Elimination');
%normalizing
w = D/max(abs(D));
figure,plot(w),title('Normalized Sound');
% hamming window
a1=double(w);
%a1=a1';
N=length(w);
hmw = hamming(N);
temp = a1.*hmw;
a1 = temp;
%Fast Fourier Transform
a2=double(a1);
N=length(a1);
n=ceil(log2(N));
nz=2^n;
fs = 96000;
x_z=0*[1:nz];
x_z(1:N)=a2;
X=fft(x_z);
x1=abs(X);
wq=double(0:nz-1)*(fs/nz);
figure,stem(wq,x1),title('Spectrum');
xlabel('Frequency (Hz)');
ylabel('Magnitude of FFT Coefficients');
nz1=round(nz/2)
x2=x1(1:nz1);
w1=wq(1:nz1);
figure,plot(w1,x2);
title('Half Length Spectrum of Sound');
nz2=nz1*10;

Just as you use fft, you can apply ifft, which computes the inverse Fourier transform (http://www.mathworks.es/es/help/matlab/ref/ifft.html)

Using the abs() function on complex data is a lossy operation which throws away any phase information. The phase information encodes the waveform shapes as well as the timing of any transients in the FFT window. Since that information has been discarded, a magnitude spectrum or spectrogram alone can't be turned back into audio that sounds like the original speech.
But if you keep the full complex results of the FFT, then a complex IFFT might be used in some sort of resynthesis process.
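A minimal sketch of the difference, using a synthetic two-tone signal in place of the recorded speech (the test signal and variable names are assumptions for illustration, not from the question). Keeping the complex FFT output lets ifft recover the waveform to within rounding error, while inverting only the magnitude does not:

```matlab
% Synthetic stand-in for a recorded signal (assumption for illustration)
fs = 8000;                          % sampling rate in Hz
t  = (0:fs-1).'/fs;                 % one second of samples
x  = sin(2*pi*440*t) + 0.5*sin(2*pi*1000*t + pi/3);

X = fft(x);                         % full complex spectrum (magnitude and phase)

% Inverse transform of the complex spectrum recovers the signal
xRec = real(ifft(X));               % real() removes tiny imaginary rounding residue
errComplex = max(abs(x - xRec));

% Inverse transform of the magnitude alone (phase discarded) does not
xMag = real(ifft(abs(X)));
errMagnitude = max(abs(x - xMag));

fprintf('Max error with phase kept:      %g\n', errComplex);
fprintf('Max error with phase discarded: %g\n', errMagnitude);
% sound(xRec, fs) would play the reconstructed signal
```

In the question's code this would mean calling ifft(X) instead of working from x1 = abs(X), then keeping the first N samples, since x_z was zero-padded to nz. The result is the windowed signal, not the raw recording.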

Related

Getting frequency response of my microphone

I'm trying to get a frequency response curve from a microphone that I have connected to my pc, using matlab.
I think I'm pretty close of getting the final code, but i think I'm missing something.
This is what I have right now:
close all, clear all, clc
x = 5; % seconds recording
Fs = 44100; % Sampling frequency
T = 1/Fs; % Sample time
L = x*Fs; % Length of signal (number of samples)
t = (0:L-1)*T; % Time vector
% Record your voice for 'x' seconds.
recObj = audiorecorder(Fs, 24, 1);
disp('Start of Recording.');
recordblocking(recObj, x);
disp('End of Recording.');
% Store data in double-precision array.
myRecording = getaudiodata(recObj);
NFFT = 2^nextpow2(L); % Next power of 2 from length of y
fourier = fft(myRecording);
Y = fft(myRecording,NFFT)/L;
f = Fs/2*linspace(0,1,NFFT/2+1);
X = 2*abs(Y(1:NFFT/2+1));
samples = get(recObj,'TotalSamples');
plot(f,X)
title('Single-Sided Amplitude Spectrum')
xlabel('Frequency (Hz)')
ylabel('|Y(f)|')
This part of the code is correct I think.
For example, when I play a tone of 5kHz I get this plot:
Now I play pink noise, and add this small part of code to convert it to dB, so I can get the frequency response curve:
dbX = db(X);
plot(f,dbX)
I expected a frequency response curve (like those you can find on Google Images; I don't have enough reputation for more than 2 links, so sorry for not including a picture here), but I got this instead:
Clearly I'm doing something wrong, but I don't know what ..
You are closer than you think. Here are three tips to get a little closer.
Firstly, you need a logarithmic plot of your frequency domain data. Use semilogx() instead of plot.
Secondly, you're going to need to smooth the data. The simplest function for this in Matlab is smooth() but there are more advanced options that may better suit your needs.
Finally, in order to get a relative response, subtract the mean from your data: relative_dbX = dbX - mean(dbX)
Putting it all together:
dbX = db(X);
relative_dbX = dbX - mean(dbX); % MATLAB is case-sensitive: dbX, not dbx
smoothed_dbX = smooth(relative_dbX);
semilogx(f, smoothed_dbX);
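Another option is to estimate the transfer function between the stimulus signal (i.e. the reference audio waveform) and the response (what you measure with your microphone) using Welch's method. pwelch gives the power spectral density of each signal, and tfestimate gives the transfer function estimate directly.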
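A hedged sketch of such a measurement, using the Signal Processing Toolbox function tfestimate (a Welch-based transfer function estimator). White noise serves as the stimulus, and a simulated low-pass FIR filter stands in for the speaker-plus-microphone path; the filter, segment length, and variable names are assumptions, not part of the original answer:

```matlab
% Requires the Signal Processing Toolbox (tfestimate, fir1, hann, freqz)
rng(0);                                  % reproducible noise
fs = 44100;                              % sampling rate in Hz
stimulus = randn(5*fs, 1);               % reference signal: 5 s of white noise

% Hypothetical system standing in for the measured path
b = fir1(64, 0.25);                      % low-pass FIR, cutoff at 0.25*(fs/2)
response = filter(b, 1, stimulus);       % what the "microphone" would record

% Welch-averaged transfer function estimate
nfft = 4096;
[Txy, f] = tfestimate(stimulus, response, hann(nfft), nfft/2, nfft, fs);

% True response of the simulated system at the same frequencies, for comparison
Htrue = freqz(b, 1, f, fs);

semilogx(f, 20*log10(abs(Txy)), f, 20*log10(abs(Htrue)), '--');
xlabel('Frequency (Hz)'); ylabel('Magnitude (dB)');
legend('tfestimate', 'true response'); grid on
```

With a real microphone, response would be the recording made while stimulus is played back, and the two signals would need to be time-aligned first.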

How to find the frequency of a periodic sound signal?

I'm working on sound signals of a walking pattern, which has obvious regular patterns:
Then I thought I can get the frequency of walking (approximately 1.7Hz from the image) using FFT function:
x = walk_5; % Walking sound with a size of 711680x2 double
Fs = 48000; % sampling frequency
L=length(x);
t=(1:L)/Fs; %time base
plot(t,x);
figure;
NFFT=2^nextpow2(L);
X=fft(x,NFFT);
Px=X.*conj(X)/(NFFT*L); %Power of each freq components
fVals=Fs*(0:NFFT/2-1)/NFFT;
plot(fVals,Px(1:NFFT/2),'b','LineSmoothing','on','LineWidth',1);
title('One Sided Power Spectral Density');
xlabel('Frequency (Hz)')
ylabel('PSD');
But then it doesn't give me what I expected:
FFT result:
The zoomed-in image is very noisy:
and there is no information near 1.7Hz
Here is the graph from log domain using
semilogy(fVals,Px(1:NFFT));
It's pretty symmetric though:
I couldn't find anything wrong with my code. Do you have any solutions to easily extract the 1.7Hz from the walking pattern?
here is the link for the audio file in mat
https://www.dropbox.com/s/craof8qkz9n5dr1/walk_sound.mat?dl=0
Thank you very much!
Kai
I suggest forgetting about the DFT approach, since your signal is not well suited to this type of analysis for several reasons. Even looking at the spectrum in the frequency range you are interested in, there is no easy way to estimate the peak:
Of course you could try PSD/STFT and other funky methods, but that is overkill. I can think of two rather simple methods for this task.
The first one is based simply on the autocorrelation function (ACF):
Calculate the ACF.
Find its peaks, setting a minimum distance between them. Since the expected frequency is around 1.7 Hz, the period is about 0.58 s, so let's use 0.5 s as the minimum distance.
Calculate the average distance between the peaks found.
This gave me an approximate frequency of 1.72 Hz.
The second approach is based on the observation that your signal already has periodic peaks, so we can simply search for them using the findpeaks function.
Define the minimum peak distance in a same way as before.
Define the minimum peak height. For example 10% of maximum peak.
Get the average difference.
This gave me an average frequency of 1.7 Hz.
Easy and fast method. There are obviously some things that can be improved, such as:
Refining thresholds
Finding both positive and negative peaks
Taking care of some missing peaks, i.e. due to low amplitude
Anyway that should get you started, instead of being stuck with crappy FFT and lazy semilogx.
Code snippet:
load walk_sound
fs = 48000;
dt = 1/fs;
x = walk_5(:,1);
x = x - mean(x);
N = length(x);
t = 0:dt:(N-1)*dt;
% FFT based
win = hamming(N);
X = abs(fft(x.*win));
X = 2*X(1:N/2+1)/sum(win);
X = 20*log10(X/max(abs(X)));
f = 0:fs/N:fs/2;
subplot(2,1,1)
plot(t, x)
grid on
xlabel('t [s]')
ylabel('A')
title('Time domain signal')
subplot(2,1,2)
plot(f, X)
grid on
xlabel('f [Hz]')
ylabel('A [dB]')
title('Signal Spectrum')
% Autocorrelation
[ac, lag] = xcorr(x);
min_dist = ceil(0.5*fs);
[pks, loc] = findpeaks(ac, 'MinPeakDistance', min_dist);
% Average distance/frequency
avg_dt = mean(gradient(loc))*dt;
avg_f = 1/avg_dt;
figure
plot(lag*dt, ac);
hold on
grid on
plot(lag(loc)*dt, pks, 'xr')
title(sprintf('ACF - Average frequency: %.2f Hz', avg_f))
% Simple peak finding in time domain
[pkst, loct] = findpeaks(x, 'MinPeakDistance', min_dist, ...
'MinPeakHeight', 0.1*max(x));
avg_dt2 = mean(gradient(loct))*dt;
avg_f2 = 1/avg_dt2;
figure
plot(t, x)
grid on
hold on
plot(loct*dt, pkst, 'xr')
xlabel('t [s]')
ylabel('A')
title(sprintf('Peak search in time domain - Average frequency: %.2f Hz', avg_f2))
Here's a nifty solution:
Take the absolute value of your raw data before taking the FFT. The data has a ton of high frequency noise that is drowning out whatever low frequency periodicity is present in the signal. The amplitude of the high frequency noise gets bigger every 1.7 seconds, and the increase in amplitude is visible to the eye, and periodic, but when you multiply the signal by a low frequency sine wave and sum everything you still end up with something close to zero. Taking the absolute value changes this, making those amplitude modulations periodic at low frequencies.
Try the following code comparing the FFT of the regular data with the FFT of abs(data). Note that I took a few liberties with your code, such as combining what I assume were the two stereo channels into a single mono channel.
x = (walk_5(:,1)+walk_5(:,2))/2; % Convert from stereo to mono
Fs = 48000; % sampling frequency
L=length(x); % length of sample
fVals=(0:L-1)*(Fs/L); % frequency range for FFT
walk5abs=abs(x); % Take the absolute value of the raw data
Xold=abs(fft(x)); % FFT of the data (abs in Matlab takes complex magnitude)
Xnew=abs(fft(walk5abs-mean(walk5abs))); % FFT of the absolute value of the data, with average value subtracted
figure;
plot(fVals,Xold/max(Xold),'r',fVals,Xnew/max(Xnew),'b')
axis([0 10 0 1])
legend('old method','new method')
[~,maxInd]=max(Xnew); % Index of maximum value of FFT
walkingFrequency=fVals(maxInd) % print max value
And plotting the FFT for both the old method and the new, from 0 to 10 Hz gives:
As you can see it detects a peak at about 1.686 Hz, and for this data, that's the highest peak in the FFT spectrum.
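The effect can be reproduced without the recording. The sketch below builds a synthetic signal (an assumption for illustration, not the walking data): white noise whose amplitude is modulated at 1.7 Hz. The FFT of the raw signal shows no distinct peak at 1.7 Hz, while the FFT of its absolute value does:

```matlab
rng(1);                                   % reproducible noise
Fs = 1000;                                % sampling frequency in Hz (assumed)
T  = 20;                                  % duration in s; 1.7 Hz * 20 s = 34 whole cycles
t  = (0:T*Fs-1).'/Fs;
fStep = 1.7;                              % modulation ("step") frequency in Hz

envelope = 1 + 0.8*cos(2*pi*fStep*t);     % slowly varying loudness
x = envelope .* randn(size(t));           % zero-mean noise carrier

L = length(x);
fVals = (0:L-1).'*(Fs/L);                 % frequency axis for the FFT

Xraw = abs(fft(x));                       % spectrum of the raw signal
xAbs = abs(x);                            % rectified signal
Xabs = abs(fft(xAbs - mean(xAbs)));       % spectrum of the rectified signal, mean removed

band = fVals > 0.5 & fVals < 10;          % search the 0.5-10 Hz band only
fBand = fVals(band);
XabsBand = Xabs(band);
[~, idx] = max(XabsBand);
detectedFrequency = fBand(idx);

fprintf('Detected modulation frequency: %.2f Hz\n', detectedFrequency);
plot(fVals, Xraw/max(Xraw), 'r', fVals, Xabs/max(Xabs), 'b');
axis([0 10 0 1]); legend('raw', 'abs'); xlabel('Frequency (Hz)');
```

Multiplying zero-mean noise by the envelope leaves a signal whose average is still zero at every instant, so nothing accumulates at 1.7 Hz. Rectifying gives the signal a mean that follows the envelope, which is what the FFT then picks up.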

frequency domain interpolation changes the signal spectrum

I am working on some experimental data related to a sine-sweep excitation.
I first reconstructed the signal using the amplitude and frequency information I get from the data file:
% finz: frequency
% ginz: amplitude
R = 4; % sweep rate
tz = 60/R*log2(finz/finz(1)); % time
u_swt = sin(2*pi*((60*finz(1)/(R*log(2.))*(2.^(R/60*tz)-1))));
time_sign = ginz.*u_swt;
freq_sign = fft(time_sign);
This is what I obtain:
I then tried to interpolate the frequency data before computing the time signal to obtain a 'nicer' signal (having more samples it should be easier to reconstruct it):
ginz = interp(ginz,200);
finz = interp(finz,200);
But now the spectrum is changed:
Why is the frequency spectrum so different? Am I doing something wrong in the interpolation? Should I not interpolate the data?
The details of the signal you are working with are not clear to me. For instance, can you please provide typical examples of finz and ginz? Also, it is not clear what you are hoping to achieve through interpolation, so it is hard to advise on its use.
However, if you interpolate a time series you should expect its spectrum to change as it increases the sampling frequency. The frequency of an interpolated signal will become smaller relative to the new sampling frequency. Therefore, the signal spectrum will be (not being very technical here) pushed towards zero. I have provided a script below which creates white Gaussian noise, and plots the spectrum for different levels of interpolation. In the first subfigure with no interpolation, the spectrum is uniformly occupied (by design - white noise). In subsequent subfigures, with increasing interpolation the occupied spectrum becomes smaller and smaller. Hope this helps.
% white Gaussian noise (WGN)
WGN = randn(1,1000);
% DFT of WGN
DFT_WGN = abs(fft(WGN));
% one-sided spectrum
DFT_WGN = DFT_WGN(1:length(WGN)/2);
% interpolated WGN by factor of 2 (q = 2)
WGN_interp_2 = interp(WGN,2);
% DFT of interpolated WGN
DFT_WGN_interp_2 = abs(fft(WGN_interp_2));
% one-sided spectrum
DFT_WGN_interp_2 = DFT_WGN_interp_2(1:length(DFT_WGN_interp_2 )/2);
% interpolated WGN by factor of 10 (q = 10)
WGN_interp_10 = interp(WGN,10);
% DFT of interpolated WGN
DFT_WGN_interp_10 = abs(fft(WGN_interp_10));
% one-sided spectrum
DFT_WGN_interp_10 = DFT_WGN_interp_10(1:length(DFT_WGN_interp_10 )/2);
figure
subplot(3,1,1)
plot(DFT_WGN)
ylabel('DFT')
subplot(3,1,2)
plot(DFT_WGN_interp_2)
ylabel('DFT (q:2)')
subplot(3,1,3)
plot(DFT_WGN_interp_10)
ylabel('DFT (q:10)')
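To make the same point with a single tone, here is a small sketch using a hypothetical 50 Hz sine (not the sweep data from the question). After interp, the peak sits at the same frequency in Hz, provided the frequency axis is rebuilt with the new sampling rate, but at a smaller fraction of that sampling rate. That is why the spectrum looks squeezed towards zero when plotted against bin index:

```matlab
% Requires the Signal Processing Toolbox (interp)
fs = 1000;                         % original sampling rate in Hz (assumed)
t  = (0:fs-1)/fs;                  % one second of samples
x  = sin(2*pi*50*t);               % hypothetical 50 Hz test tone

q   = 4;                           % interpolation factor
xi  = interp(x, q);                % interpolated signal, q times as many samples
fsi = q*fs;                        % sampling rate after interpolation

% Locate the spectral peak in the first half of each spectrum
N  = length(x);
Ni = length(xi);
X  = abs(fft(x));
Xi = abs(fft(xi));
[~, k]  = max(X(1:N/2));
[~, ki] = max(Xi(1:Ni/2));

peakHz       = (k-1)*fs/N;         % peak frequency before interpolation, in Hz
peakHzInterp = (ki-1)*fsi/Ni;      % peak frequency after interpolation, in Hz
peakNorm       = peakHz/fs;        % same peaks as a fraction of the sampling rate
peakNormInterp = peakHzInterp/fsi;

fprintf('Peak: %g Hz before, %g Hz after interpolation\n', peakHz, peakHzInterp);
fprintf('Relative to fs: %g before, %g after\n', peakNorm, peakNormInterp);
```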

Ways to Compute Spectrum Matlab

I have a question about computing the spectrum of a time series in Matlab. I have read the documentation for the 'fft' function. However, I have seen two ways of implementing it, and they give me different results. I would appreciate an explanation of this difference:
1st Method:
nPoints=length(timeSeries);
% Time specifications:
Fs = 1; % samples per second
Fs = 50;
freq = 0:nPoints-1; %Numerators of frequency series
freq = freq.*Fs./nPoints;
% Fourier Transform:
X = fft(timeSeries)/nPoints; % normalize the data
% find the Nyquist frequency
cutOff = ceil(nPoints./2);
% take only the first half of the spectrum
X = abs(X(1:cutOff));
% Frequency specifications:
freq = freq(1:cutOff);
%Plot spectrum
semilogy(handles.plotLoadSeries,freq,X);
2nd Method:
NFFT = 2^nextpow2(nPoints); % Next power of 2 from length of y
Y = fft(timeSeries,NFFT)/nPoints;
f = 1/2*linspace(0,1,NFFT/2+1);
% % Plot single-sided amplitude spectrum.
% plot(handles.plotLoadSeries, f,2*abs(Y(1:NFFT/2+1)))
semilogy(handles.plotLoadSeries,f,2*abs(Y(1:NFFT/2+1)));
I thought that it was not necessary to use the 'nextpow2' function with the 'fft' function in Matlab. So which one is correct?
Thanks
The short answer: you need windowing for spectrum analysis.
Now for the long answer... In the second approach, you are using an optimised FFT algorithm useful when the length of the input vector is a power of two. Let's assume that your original signal has 401 samples (as in my example below) from an infinitely long signal; nextpow2() will give you NFFT=512 samples. When you feed the shorter, 401-sample signal into the fft() function, it is implicitly zero-padded to match the requested length of 512 (NFFT). But (here comes the tricky part): zero-padding your signal is equivalent to multiplying an infinitely long signal by a rectangular function, an operation that in the frequency domain translates to a convolution with a sinc function. This would be the reason behind the increased noise floor at the bottom of your semilogarithmic plot.
A way to avoid this noise increase is to create manually the 512-sample signal you want to feed into fft(), using a smoother window function instead of the default rectangular one. Windowing means just multiplying your signal by a tapered, symmetric one. There are tons of literature on choosing a good windowing function, but a typically accurate one with low sidelobes (low noise increase) is the Hamming function, implemented in MATLAB as hamming().
Here is a figure illustrating the issue (in the frequency domain and time domain):
...and the code to generate this figure:
clear
% Create signal
fs = 40; % sampling freq.
Ts = 1/fs; % sampling period
t = 0:Ts:10; % time vector
s = sin(2*pi*3*t); % original signal
N = length(s);
% FFT (length not power of 2)
S = abs(fft(s)/N);
freq = fs*(0:N-1)/N;
% FFT (length power of 2)
N2 = 2^nextpow2(N);
S2 = abs(fft(s, N2)/N2);
freq2 = fs*(0:N2-1)/N2;
t2 = (0:N2-1)*Ts; % longer time vector
s2 = [s,zeros(1,N2-N)]; % signal that was implicitly created for this FFT
% FFT (windowing before FFT)
s3 = [s.*hamming(N).',zeros(1,N2-N)];
S3 = abs(fft(s3, N2)/N2);
% Frequency-domain plot
figure(1)
subplot(211)
cla
semilogy(freq,S);
hold on
semilogy(freq2,S2,'r');
semilogy(freq2,S3,'g');
xlabel('Frequency [Hz]')
ylabel('FFT')
grid on
legend( 'FFT[401]', 'FFT[512]', 'FFT[512] with windowing' )
% Time-domain plot
subplot(212)
cla
plot(s)
hold on
plot(s3,'g')
xlabel('Index')
ylabel('Amplitude')
grid on
legend( 'Original samples', 'Windowed samples' )

How would i down-sample a .wav file then reconstruct it using nyquist? - in MATLAB

This is all done in MATLAB 2010
My objective is to show the results of undersampling, sampling at the Nyquist rate, and oversampling.
First I need to downsample the .wav file to get an incomplete or partial data stream that I can then reconstruct.
Here's the flow chart of what I'm going to be doing. The flow is: analog signal -> sampling analog filter -> ADC -> resample down -> resample up -> DAC -> reconstruction analog filter
what needs to be achieved:
F = frequency in Hz (1 Hz = 1 cycle/s), e.g. 100 Hz = 100 cycles/s
Sampling period at the Nyquist rate: Ts = 1/(2F)
Example problem: highest frequency = 1000 Hz
Ts = 1/(2 x 1000 Hz) = 1/2000 s = 5x10^(-4) s, i.e. a sampling period of 0.5 ms
This is my first signal processing project using matlab.
what i have so far.
% Fs = frequency sampled (44100hz or the sampling frequency of a cd)
[test,fs]=wavread('test.wav'); % loads the .wav file
left=test(:,1);
% Plot of the .wav signal time vs. strength
time=(1/44100)*length(left);
t=linspace(0,time,length(left));
plot(t,left)
xlabel('time (sec)');
ylabel('relative signal strength')
% This is where I would need to sample it at different frequencies (above, below, and at the Nyquist frequency), I think.
soundsc(left,fs) % plays the resulting audio, which is the same as the original (only at or above the Nyquist frequency, however)
Can anyone tell me how to make it better, and how to do the sampling at various frequencies?
Here's the .wav file: http://www.4shared.com/audio/11xvNmkd/piano.html
EDIT:
%Play decimated file ( soundsc(y,fs) )
%Play Original file ( soundsc(play,fs ) )
%Play reconstructed File ( soundsc(final,fs) )
[piano,fs]=wavread('piano.wav'); % loads piano
play=piano(:,1); % Renames the file as "play"
t = linspace(0,time,length(play)); % Time vector
x = play;
y = decimate(x,25);
stem(x(1:30)), axis([0 30 -2 2]) % Original signal
title('Original Signal')
figure
stem(y(1:30)) % Decimated signal
title('Decimated Signal')
%changes the sampling rate
fs1 = fs/2;
fs2 = fs/3;
fs3 = fs/4;
fs4 = fs*2;
fs5 = fs*3;
fs6 = fs*4;
wavwrite(y,fs/25,'PianoDecimation');
%------------------------------------------------------------------
%Downsampled version of piano is now upsampled to the original
[PianoDecimation,fs]=wavread('PianoDecimation.wav'); % loads piano
play2=PianoDecimation(:,1); % Renames the file as "play2"
%upsampling
UpSampleRatio = 2; % 2*fs = nyquist rate sampling
play2Up=zeros(length(PianoDecimation)*UpSampleRatio, 1);
play2Up(1:UpSampleRatio:end) = play2; % fill in every N'th sample
%low pass filter
ResampFilt = firpm(44, [0 0.39625 0.60938 1], [1 1 0 0]);
fsUp = (fs*UpSampleRatio)*1;
wavwrite(play2Up,fsUp,'PianoUpsampled');
%Plot2
%data vs time plot
time=(1/44100)*length(play2);
t=linspace(0,time,length(play2));
stem(t,play2)
title('Upsampled graph of piano')
xlabel('time(sec)');
ylabel('relative signal strength')
[PianoUpsampled,fs]=wavread('PianoUpsampled.wav'); % loads piano
final=PianoUpsampled(:,1); % Renames the file as "play"
%-------------------------------------------------------------
%resampling
[piano,fs]=wavread('piano.wav'); % loads piano
x=piano(:,1); % Renames the file as "play"
m = resample(x,3,2);
Original:
http://www.4shared.com/audio/11xvNmkd/piano.html
New:
http://www.4shared.com/audio/nTRBNSld/PianoUs.html
The easiest thing to do is change sample rates by an integer factor. Downsampling consists of running the data through a low-pass filter followed by discarding samples, while upsampling consists of inserting samples then running the data through a low pass filter (also known as a reconstruction filter or interpolating filter). Aliasing occurs when the filtering steps are skipped or poorly done. So, to show the effect of aliasing, I suggest you simply discard or insert samples as required, then create a new WAV file at the new sample rate. To discard samples, you can do:
DownSampleRatio = 2;
%# Normally apply a low pass filter here
leftDown = left(1:DownSampleRatio:end); %# extract every N'th sample
fsDown = fs/DownSampleRatio;
wavwrite(leftDown, fsDown, filename);
To create samples you can do:
UpSampleRatio = 2;
leftUp = zeros(length(left)*UpSampleRatio, 1);
leftUp(1:UpSampleRatio:end) = left; %# fill in every N'th sample
%# Normally apply a low pass filter here
fsUp = fs*UpSampleRatio;
wavwrite(leftUp, fsUp, filename);
You can just play back the written WAV files to hear the effects.
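A small sketch of the aliasing this produces, using a synthetic tone rather than the piano file (the tone frequency, sample rates, and the reuse of the name left for the synthetic signal are assumptions for illustration). A 3 kHz tone sampled at 8 kHz is downsampled by 2 without filtering. The new Nyquist frequency is 2 kHz, so the tone folds back to 4000 - 3000 = 1000 Hz:

```matlab
fs = 8000;                              % original sampling rate in Hz
t  = (0:fs-1).'/fs;                     % one second of samples
f0 = 3000;                              % tone above the post-downsampling Nyquist frequency
left = sin(2*pi*f0*t);                  % synthetic stand-in for the audio channel

DownSampleRatio = 2;
leftDown = left(1:DownSampleRatio:end); % discard samples, deliberately no low-pass filter
fsDown = fs/DownSampleRatio;            % 4000 Hz, so Nyquist is now 2000 Hz

% Find the dominant frequency of the downsampled signal
N = length(leftDown);
X = abs(fft(leftDown));
[~, k] = max(X(1:N/2));
aliasHz = (k-1)*fsDown/N;

fprintf('Original tone: %d Hz, tone after downsampling: %g Hz\n', f0, aliasHz);
% soundsc(left, fs); pause(1.5); soundsc(leftDown, fsDown)  % listen to the difference
```

Low-pass filtering below fsDown/2 before discarding samples, which is what decimate does, would remove the tone instead of folding it to a new frequency.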
As an aside, you asked for improvements to your code - I prefer to initialize the t vector as t = (0:(length(left)-1))/fs;.
The DSP technique you need is called decimation.