I am trying to use concepts learned in signals analysis to isolate a specific frequency from a sound file. I have a short WAV file that consists of a person talking but also has other noises with unknown frequencies both above and below the desired signal. I have an upper and lower bound for the frequency range that should contain the part of the sound I want.
I think I should be able to do this without using the signals analysis toolbox or the butter filter.
so far I have this code which plots the power spectrum for the signal:
[y, Fs] = audioread('filename.wav','double');
t = 1:1:length(y);
y = transpose(y);
a = ifft(y);
a_k = abs([a((length(y)/2)+1:-1:2),a(1:1:(length(y)/2)+1)]);
bar((-length(y)/2)+1:1:(length(y)/2),a_k);
The power spectrum looks like this:
I think I should be able to use what I have to filter our anything above or below my known range, but I am not sure how to start doing that.
To apply ideal band pass filter you can use the code below. However, please note that the ideal filter may not produce the most optimal results if your signal is not periodic (see the wiki article here).
%% Initialisation
Fs = 44100;
t0 = 0;
t1 = 1;
t = t0 : 1/Fs : t1;
f1 = 10;
f2 = 40;
y = 10*cos(2*pi*f1*t) + 20*sin(2*pi*f2*t);
%% Cosine series
true_fft = fft(y);
nfft = length(y);
% The number of unique points
if mod(nfft, 2) == 0
num_pts = round(nfft/2) + 1;
else
num_pts = ceil(nfft/2);
end
% The vector that contains only unique points
fftT = true_fft(1 : num_pts);
% The vector that contains the unique frequencies
u_f = (0 : num_pts-1)*Fs/nfft;
%% Filtered signal
% Definition of the frequency band
f_low = 5;
f_high = 15;
[~, idx_low] = min(abs(u_f - f_low));
[~, idx_high] = min(abs(u_f - f_high));
filtFFTT = fftT;
filtFFTT([1: idx_low idx_high : end]) = 0;
if mod(nfft, 2) == 0
filtFFTT = [filtFFTT conj(filtFFTT((end - 1) : -1 : 2))];
else
filtFFTT = [filtFFTT conj(filtFFTT(end : -1 : 2))];
end
%% Data visualisation
figure;
plot(t, ifft(filtFFTT));
Related
I'm trying to get volume-time graph of .wav file. First, I recorded sound (patient exhalations) via android as .wav file, but when I read this .wav file in MATLAB it has negative values. What is the meaning of negative values? Second, MATLAB experts could you please check if the code below does the same as written in my comments? Also another question. Y = fft(WindowArray);
p = abs(Y).^2;
I took the power of values returned from fft...is that correct and what is the goal of this step??
[data, fs] = wavread('newF2');
% read exhalation audio wav file (1 channel, mono)
% frequency is 44100 HZ
% windows of 0.1 s and overlap of 0.05 seconds
WINDOW_SIZE = fs*0.1; %4410 = fs*0.1
array_size = length(data); % array size of data
numOfPeaks = (array_size/(WINDOW_SIZE/2)) - 1;
step = floor(WINDOW_SIZE/2); %step size used in loop
transformed = data;
start =1;
k = 1;
t = 1;
g = 1;
o = 1;
% performing fft on each window and finding the peak of windows
while(((start+WINDOW_SIZE)-1)<=array_size)
j=1;
i =start;
while(j<=WINDOW_SIZE)
WindowArray(j) = transformed(i);
j = j+1;
i = i +1;
end
Y = fft(WindowArray);
p = abs(Y).^2; %power
[a, b] = max(abs(Y)); % find max a and its indices b
[m, i] = max(p); %the maximum of the power m and its indices i
maximum(g) = m;
index(t) = i;
power(o) = a;
indexP(g) = b;
start = start + step;
k = k+1;
t = t+1;
g = g+1;
o=o+1;
end
% low pass filter
% filtering noise: ignor frequencies that are less than 5% of maximum frequency
for u=1:length(maximum)
M = max(maximum); %highest value in the array
Accept = 0.05* M;
if(maximum(u) > Accept)
maximum = maximum(u:length(maximum));
break;
end
end
% preparing the time of the graph,
% Location of the Peak flow rates are estimated
TotalTime = (numOfPeaks * 0.1);
time1 = [0:0.1:TotalTime];
if(length(maximum) > ceil(numOfPeaks));
maximum = maximum(1:ceil(numOfPeaks));
end
time = time1(1:length(maximum));
% plotting frequency-time graph
figure(1);
plot(time, maximum);
ylabel('Frequency');
xlabel('Time (in seconds)');
% plotting volume-time graph
figure(2);
plot(time, cumsum(maximum)); % integration over time to get volume
ylabel('Volume');
xlabel('Time (in seconds)');
(I only answer the part of the question which I understood)
Per default Matlab normalizes the audio wave to - 1...1 range. Use the native option if you want the integer data.
First, in your code it should be p = abs(Y)**2, this is the proper way to square the values returned from the FFT. The reason why you take the absolute value of the FFT return values is because those number are complex numbers with a Real and Imaginary part, therefore the absolute value (or modulus) of an imaginary number is the magnitude of that number. The goal of taking the power could be for potentially obtaining an RMS value (root mean squared) of your overall amplitude values, but you could also have something else in mind. When you say volume-time I assume you want decibels, so try something like this:
def plot_signal(file_name):
sampFreq, snd = wavfile.read(file_name)
snd = snd / (2.**15) #Convert sound array to floating point values
#Floating point values range from -1 to 1
s1 = snd[:,0] #left channel
s2 = snd[:,1] #right channel
timeArray = arange(0, len(snd), 1)
timeArray = timeArray / sampFreq
timeArray = timeArray * 1000 #scale to milliseconds
timeArray2 = arange(0, len(snd), 1)
timeArray2 = timeArray2 / sampFreq
timeArray2 = timeArray2 * 1000 #scale to milliseconds
n = len(s1)
p = fft(s1) # take the fourier transform
m = len(s2)
p2 = fft(s2)
nUniquePts = ceil((n+1)/2.0)
p = p[0:nUniquePts]
p = abs(p)
mUniquePts = ceil((m+1)/2.0)
p2 = p2[0:mUniquePts]
p2 = abs(p2)
'''
Left Channel
'''
p = p / float(n) # scale by the number of points so that
# the magnitude does not depend on the length
# of the signal or on its sampling frequency
p = p**2 # square it to get the power
# multiply by two (see technical document for details)
# odd nfft excludes Nyquist point
if n % 2 > 0: # we've got odd number of points fft
p[1:len(p)] = p[1:len(p)] * 2
else:
p[1:len(p) -1] = p[1:len(p) - 1] * 2 # we've got even number of points fft
plt.plot(timeArray, 10*log10(p), color='k')
plt.xlabel('Time (ms)')
plt.ylabel('LeftChannel_Power (dB)')
plt.show()
'''
Right Channel
'''
p2 = p2 / float(m) # scale by the number of points so that
# the magnitude does not depend on the length
# of the signal or on its sampling frequency
p2 = p2**2 # square it to get the power
# multiply by two (see technical document for details)
# odd nfft excludes Nyquist point
if m % 2 > 0: # we've got odd number of points fft
p2[1:len(p2)] = p2[1:len(p2)] * 2
else:
p2[1:len(p2) -1] = p2[1:len(p2) - 1] * 2 # we've got even number of points fft
plt.plot(timeArray2, 10*log10(p2), color='k')
plt.xlabel('Time (ms)')
plt.ylabel('RightChannel_Power (dB)')
plt.show()
I hope this helps.
In the attached code I create an arbitrarily complex signal generator (3 sine waves in this case) and generate 256 + 50 values using it. I then create an FFT using the first 256 values, plot them and then do an inverse FFT and verify that it produces a very close representation of the original signal. All good so far.
What I'd like to do now is, using the FFT results, attempt to generate the additional 50 values that were not part of the FFT data set. Is there a straight-forward in MatLab?
Without being exactly sure how I assume I can create a signal generator using the center frequency of each bin and the FFT result and generate the signal that way, but that does seem like a lot of work so before I went down that path I figured I'd see if there was an easy way that I just haven't found.
I have MatLab & the Signal Processing and DSP packages to use at this time.
Thanks!
FFTsize = 256;
futureSize = 50;
t = 1:FFTsize+futureSize;
F1_bars = 10;
RadPerBar1 = (2*pi)/F1_bars;
L1 = 1;
Offset1 = 0;
F2_bars = 8;
RadPerBar2 = (2*pi)/F2_bars;
L2 = 1;
Offset2 = 0;
F3_bars = 50;
RadPerBar3 = (2*pi)/F3_bars;
L3 = 1;
Offset3 = 0;
Sig = (L1*sin(RadPerBar1*t + Offset1) +...
L2*sin(RadPerBar2*t + Offset2) +...
L3*sin(RadPerBar3*t + Offset3));
DataSet = Sig(1:FFTsize);
FFT = fft(DataSet)/FFTsize;
%Suggested by Mad Physicist
paddedFFT = [FFT(1:ceil(FFTsize/2)) zeros(1, futureSize) FFT(ceil(FFTsize/2)+1 : end)];
IFFT = ifft(FFT)*FFTsize;
%Suggested by Mad Physicist
IFFT2 = ifft(paddedFFT)*(FFTsize + futureSize);
figure(111);
hold off;
plot(abs(FFT));
hold on;
plot(abs(paddedFFT),'--r');
legend('FFT','Padded FFT','Position','best');
hold off;
figure(110)
hold off;
plot(Sig);
hold on;
plot(IFFT,'--r');
plot(t, IFFT2,'g');
legend('Input Signal','IFFT','Padded IFFT','Position','best');
hold off;
Pad the FFT with 50 zeros before you invert it:
futureSize = 50;
paddedFFT = [FFT(1:ceil(FFTsize/2)) zeros(1, futureSize) FFT(ceil(FFTsize/2)+1 : end)];
IFFT = ifft(paddedFFT)*(FFTsize + futureSize);
I'm trying to find the number of peaks in Matlab. When I plot the given wav file according to my threshold values, I can easily see that there are two distinct peaks. But how can I write "There are two peaks" on the screen? Here's my first attempt:
hfile = 'two.wav';
[stereo1, Fs, nbits, readinfo] = wavread(hfile);
mono1 = mean(stereo1,2);
M = round(0.01*Fs);
N = 2^nextpow2(4*M);
w = gausswin(M);
[S,F,T,P] = spectrogram(mono1,w,120,N,Fs);
thresh_l=1000;
thresh_h=10000000;
% take the segment of P relating to your frequencies of interest
P2 = P(F>thresh_l&F<thresh_h,:);
%show the mean power in that band over time
m = mean(P2);
[pks,loc]=findpeaks(T,'npeaks',m);
message = sprintf('The number of peaks found = %d',length(pks));
msgbox(message);
how about looking for where the first derivative switches sign? say x is your time series
dx = diff(x);
% need to add 1 b/c of the offset generated by diff
peak_loc = find((dx(1:end - 1) > 0) & (dx(2:end) <= 0)) + 1;
peak_num = len(peak_loc);
Im tying to find the fundamental frequency of a note using harmonic product spectrum. This is the part that implements the HPS algorithm
seg_fft = seg_fft(1 : size(seg_fft,1)/2 ); % FFT data
seg_fft = abs(seg_fft);
seg_fft2 = ones(size(seg_fft));
seg_fft3 = ones(size(seg_fft));
seg_fft4 = ones(size(seg_fft));
seg_fft5 = ones(size(seg_fft));
for i = 1:floor((length(seg_fft)-1)/2)
seg_fft2(i,1) = (seg_fft(2*i,1) + seg_fft((2*i)+1,1))/2;
end
for i = 1:floor((length(seg_fft)-2)/3)
seg_fft3(i,1) = (seg_fft(3*i,1) + seg_fft((3*i)+1,1) + seg_fft((3*i)+2,1))/3;
end
for i = 1:floor((length(seg_fft)-3)/4)
seg_fft4(i,1) = (seg_fft(4*i,1) + seg_fft((4*i)+1,1) + seg_fft((4*i)+2,1) + seg_fft((4*i)+3,1))/4;
end
for i = 1:floor((length(seg_fft)-4)/5)
seg_fft5(i,1) = (seg_fft(5*i,1) + seg_fft((5*i)+1,1) + seg_fft((5*i)+2,1) + seg_fft((5*i)+3,1) + seg_fft((5*i)+4,1))/5;
end
f_ym = (seg_fft) .* (seg_fft2) .* (seg_fft3) .* (seg_fft4) .*(seg_fft5);
Now when i play F4, the 2nd harmonic(698 -F5) has a higher amplitude. So HPS is supposed to help me detect the fundamental which is F4 and NOT F5.
When i do the HPS these are the graphs I get:
The figures above show the plots of seg_fft2, seg_fft3, seg_fft4 and seg_fft5 respectively.
But what I dont understand is how come the frequency points obtained in these graphs are not factors of the original spectrum?? Isn't that how HPS is supposed to work??
This is the plot I obtained when I took the product of all 5.
the peak is at 698Hz.. But shouldn't it be at 349Hz instead??
But after the whole code is run, I do get the fundamental as F4.. Its all very confusing.... Can someone tell me why my graphs are different from what is expected yet I get the correct fundamental please????
This is the rest of the code
%HPS, PartIII: find max
f_y1 = max(f_ym)
for c = 1 : length(f_ym)
if(f_ym(c,1) == f_y1)
index = c;
end
end
% Convert that to a frequency
f_y(h) = (index / NFFT) * FS;
h=h+1;
%end
V = abs(f_y);
Please do help...... Thanx in advance....
Intro: I'm using MATLAB's Neural Network Toolbox in an attempt to forecast time series one step into the future. Currently I'm just trying to forecast a simple sinusoidal function, but hopefully I will be able to move on to something a bit more complex after I obtain satisfactory results.
Problem: Everything seems to work fine, however the predicted forecast tends to be lagged by one period. Neural network forecasting isn't much use if it just outputs the series delayed by one unit of time, right?
Code:
t = -50:0.2:100;
noise = rand(1,length(t));
y = sin(t)+1/2*sin(t+pi/3);
split = floor(0.9*length(t));
forperiod = length(t)-split;
numinputs = 5;
forecasted = [];
msg = '';
for j = 1:forperiod
fprintf(repmat('\b',1,numel(msg)));
msg = sprintf('forecasting iteration %g/%g...\n',j,forperiod);
fprintf('%s',msg);
estdata = y(1:split+j-1);
estdatalen = size(estdata,2);
signal = estdata;
last = signal(end);
[signal,low,high] = preprocess(signal'); % pre-process
signal = signal';
inputs = signal(rowshiftmat(length(signal),numinputs));
targets = signal(numinputs+1:end);
%% NARNET METHOD
feedbackDelays = 1:4;
hiddenLayerSize = 10;
net = narnet(feedbackDelays,[hiddenLayerSize hiddenLayerSize]);
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
signalcells = mat2cell(signal,[1],ones(1,length(signal)));
[inputs,inputStates,layerStates,targets] = preparets(net,{},{},signalcells);
net.trainParam.showWindow = false;
net.trainparam.showCommandLine = false;
net.trainFcn = 'trainlm'; % Levenberg-Marquardt
net.performFcn = 'mse'; % Mean squared error
[net,tr] = train(net,inputs,targets,inputStates,layerStates);
next = net(inputs(end),inputStates,layerStates);
next = postprocess(next{1}, low, high); % post-process
next = (next+1)*last;
forecasted = [forecasted next];
end
figure(1);
plot(1:forperiod, forecasted, 'b', 1:forperiod, y(end-forperiod+1:end), 'r');
grid on;
Note:
The function 'preprocess' simply converts the data into logged % differences and 'postprocess' converts the logged % differences back for plotting. (Check EDIT for preprocess and postprocess code)
Results:
BLUE: Forecasted Values
RED: Actual Values
Can anyone tell me what I'm doing wrong here? Or perhaps recommend another method to achieve the desired results (lagless prediction of sinusoidal function, and eventually more chaotic timeseries)? Your help is very much appreciated.
EDIT:
It's been a few days now and I hope everyone has enjoyed their weekend. Since no solutions have emerged I've decided to post the code for the helper functions 'postprocess.m', 'preprocess.m', and their helper function 'normalize.m'. Maybe this will help get the ball rollin.
postprocess.m:
function data = postprocess(x, low, high)
% denormalize
logdata = (x+1)/2*(high-low)+low;
% inverse log data
sign = logdata./abs(logdata);
data = sign.*(exp(abs(logdata))-1);
end
preprocess.m:
function [y, low, high] = preprocess(x)
% differencing
diffs = diff(x);
% calc % changes
chngs = diffs./x(1:end-1,:);
% log data
sign = chngs./abs(chngs);
logdata = sign.*log(abs(chngs)+1);
% normalize logrets
high = max(max(logdata));
low = min(min(logdata));
y=[];
for i = 1:size(logdata,2)
y = [y normalize(logdata(:,i), -1, 1)];
end
end
normalize.m:
function Y = normalize(X,low,high)
%NORMALIZE Linear normalization of X between low and high values.
if length(X) <= 1
error('Length of X input vector must be greater than 1.');
end
mi = min(X);
ma = max(X);
Y = (X-mi)/(ma-mi)*(high-low)+low;
end
I didn't check you code, but made a similar test to predict sin() with NN. The result seems reasonable, without a lag. I think, your bug is somewhere in synchronization of predicted values with actual values.
Here is the code:
%% init & params
t = (-50 : 0.2 : 100)';
y = sin(t) + 0.5 * sin(t + pi / 3);
sigma = 0.2;
n_lags = 12;
hidden_layer_size = 15;
%% create net
net = fitnet(hidden_layer_size);
%% train
noise = sigma * randn(size(t));
y_train = y + noise;
out = circshift(y_train, -1);
out(end) = nan;
in = lagged_input(y_train, n_lags);
net = train(net, in', out');
%% test
noise = sigma * randn(size(t)); % new noise
y_test = y + noise;
in_test = lagged_input(y_test, n_lags);
out_test = net(in_test')';
y_test_predicted = circshift(out_test, 1); % sync with actual value
y_test_predicted(1) = nan;
%% plot
figure,
plot(t, [y, y_test, y_test_predicted], 'linewidth', 2);
grid minor; legend('orig', 'noised', 'predicted')
and the lagged_input() function:
function in = lagged_input(in, n_lags)
for k = 2 : n_lags
in = cat(2, in, circshift(in(:, end), 1));
in(1, k) = nan;
end
end