DSP - Filtering in the frequency domain via FFT

I've been playing around a little with the Exocortex implementation of the FFT, but I'm having some problems.
Whenever I modify the amplitudes of the frequency bins before calling the iFFT, the resulting signal contains clicks and pops, especially when low frequencies are present in the signal (like drums or basses). However, this does not happen if I attenuate all the bins by the same factor.
Here is an example of the output buffer of a 4-sample FFT:
// Bin 0 (DC)
FFTOut[0] = 0.0000610351563
FFTOut[1] = 0.0
// Bin 1
FFTOut[2] = 0.000331878662
FFTOut[3] = 0.000629425049
// Bin 2
FFTOut[4] = -0.0000381469727
FFTOut[5] = 0.0
// Bin 3, this is the first and only negative frequency bin.
FFTOut[6] = 0.000331878662
FFTOut[7] = -0.000629425049
The output is composed of pairs of floats, each representing the real and imaginary parts of a single bin. So bin 0 (array indexes 0, 1) would represent the real and imaginary parts of the DC frequency. As you can see, bins 1 and 3 have the same values (except for the sign of the Im part), so I guess bin 3 is the first negative frequency, and finally indexes (4, 5) would be the last positive frequency bin.
Then to attenuate the frequency bin 1 this is what I do:
// Attenuate the 'positive' bin
FFTOut[2] *= 0.5;
FFTOut[3] *= 0.5;
// Attenuate its corresponding negative bin.
FFTOut[6] *= 0.5;
FFTOut[7] *= 0.5;
For the actual tests I'm using a 1024-length FFT and I always provide all the samples so no 0-padding is needed.
// Attenuate
var halfSize = fftWindowLength / 2;
float leftFreq = 0f;
float rightFreq = 22050f;
for( var c = 1; c < halfSize; c++ )
{
var freq = c * (44100d / halfSize);
// Calc. positive and negative frequency indexes.
var k = c * 2;
var nk = (fftWindowLength - c) * 2;
// This kind of attenuation corresponds to a high-pass filter.
// The attenuation at the transition band is linearly applied, could
// this be the cause of the distortion of low frequencies?
var attn = (freq < leftFreq) ?
0 :
(freq < rightFreq) ?
((freq - leftFreq) / (rightFreq - leftFreq)) :
1;
// Attenuate positive and negative bins.
mFFTOut[ k ] *= (float)attn;
mFFTOut[ k + 1 ] *= (float)attn;
mFFTOut[ nk ] *= (float)attn;
mFFTOut[ nk + 1 ] *= (float)attn;
}
Obviously I'm doing something wrong but can't figure out what.
I don't want to use the FFT output as a means to generate a set of FIR coefficients since I'm trying to implement a very basic dynamic equalizer.
What's the correct way to filter in the frequency domain? What am I missing?
Also, is it really necessary to attenuate the negative frequencies as well? I've seen an FFT implementation where the negative-frequency values are zeroed before synthesis.
Thanks in advance.

There are two issues: the way you use the FFT, and the particular filter.
Filtering is traditionally implemented as convolution in the time domain. You're right that multiplying the spectra of the input and filter signals is equivalent. However, when you use the Discrete Fourier Transform (DFT) (implemented with a Fast Fourier Transform algorithm for speed), you actually calculate a sampled version of the true spectrum. This has lots of implications, but the one most relevant to filtering is that the time-domain signal is treated as periodic.
Here's an example. Consider a sinusoidal input signal x with 1.5 cycles in the period, and a simple low pass filter h. In Matlab/Octave syntax:
N = 1024;
n = (1:N)'-1; %# define the time index
x = sin(2*pi*1.5*n/N); %# input with 1.5 cycles per 1024 points
h = hanning(129) .* sinc(0.25*(-64:1:64)'); %# windowed sinc LPF, Fc = pi/4
h = [h./sum(h)]; %# normalize DC gain
y = ifft(fft(x) .* fft(h,N)); %# inverse FT of product of sampled spectra
y = real(y); %# due to numerical error, y has a tiny imaginary part
%# Depending on your FT/IFT implementation, might have to scale by N or 1/N here
plot(y);
And here's the graph:
The glitch at the beginning of the block is not what we expect at all. But if you consider fft(x), it makes sense. The Discrete Fourier Transform assumes the signal is periodic within the transform block. As far as the DFT knows, we asked for the transform of one period of this:
This leads to the first important consideration when filtering with DFTs: you are actually implementing circular convolution, not linear convolution. So the "glitch" in the first graph is not really a glitch when you consider the math. So then the question becomes: is there a way to work around the periodicity? The answer is yes: use overlap-save processing. Essentially, you calculate N-long products as above, but only keep N/2 points.
Nproc = 512;
xproc = zeros(2*Nproc,1); %# initialize temp buffer
idx = 1:Nproc; %# initialize half-buffer index
ycorrect = zeros(2*Nproc,1); %# initialize destination
for ctr = 1:(length(x)/Nproc) %# iterate over x 512 points at a time
    xproc(1:Nproc) = xproc((Nproc+1):end); %# shift 2nd half of last iteration to 1st half of this iteration
    xproc((Nproc+1):end) = x(idx); %# fill 2nd half of this iteration with new data
    yproc = ifft(fft(xproc) .* fft(h,2*Nproc)); %# calculate new buffer
    ycorrect(idx) = real(yproc((Nproc+1):end)); %# keep 2nd half of new buffer
    idx = idx + Nproc; %# step half-buffer index
end
And here's the graph of ycorrect:
This picture makes sense - we expect a startup transient from the filter, then the result settles into the steady state sinusoidal response. Note that now x can be arbitrarily long. The limitation is Nproc > 2*min(length(x),length(h)).
Now onto the second issue: the particular filter. In your loop, you create a filter whose spectrum is essentially H = [0 (1:511)/512 1 (511:-1:1)/512]'; If you do hraw = real(ifft(H)); plot(hraw), you get:
It's hard to see, but there are a bunch of non-zero points at the far left edge of the graph, and then a bunch more at the far right edge. Using Octave's built-in freqz function to look at the frequency response we see (by doing freqz(hraw)):
The magnitude response has a lot of ripples from the high-pass envelope down to zero. Again, the periodicity inherent in the DFT is at work. As far as the DFT is concerned, hraw repeats over and over again. But if you take one period of hraw, as freqz does, its spectrum is quite different from the periodic version's.
So let's define a new signal: hrot = [hraw(513:end) ; hraw(1:512)]; We simply rotate the raw DFT output to make it continuous within the block. Now let's look at the frequency response using freqz(hrot):
Much better. The desired envelope is there, without all the ripples. Of course, the implementation isn't so simple now, you have to do a full complex multiply by fft(hrot) rather than just scaling each complex bin, but at least you'll get the right answer.
Note that for speed, you'd usually pre-calculate the DFT of the padded h; I left it in the loop to compare more easily with the original.
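For reference, here is a minimal sketch of that precomputed variant, reusing x and h from the blocks above (Hpad is just an illustrative name):
Nproc = 512;
Hpad = fft(h, 2*Nproc);                         %# DFT of the zero-padded filter, computed once
xproc = zeros(2*Nproc,1);                       %# temp buffer, as before
ycorrect = zeros(size(x));                      %# destination
idx = 1:Nproc;                                  %# half-buffer index
for ctr = 1:(length(x)/Nproc)
    xproc(1:Nproc) = xproc((Nproc+1):end);      %# shift in the previous half-buffer
    xproc((Nproc+1):end) = x(idx);              %# append Nproc new input samples
    yproc = ifft(fft(xproc) .* Hpad);           %# reuse the precomputed filter spectrum
    ycorrect(idx) = real(yproc((Nproc+1):end)); %# keep only the valid second half
    idx = idx + Nproc;
end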

Your primary issue is that frequencies aren't well defined over short time intervals. This is particularly true for low frequencies, which is why you notice the problem most there.
Therefore, when you take really short segments out of the sound and filter them, the filtered segments won't join up into a continuous waveform; you hear the jumps between segments, and this is what generates the clicks you hear.
For example, taking some reasonable numbers: if I start with a waveform at 27.5 Hz (A0 on a piano), digitized at 44100 Hz, it will look like this (where the red part is 1024 samples long):
First we'll start with a low-pass at 40 Hz. Since the original frequency is less than 40 Hz, a low-pass filter with a 40 Hz cut-off shouldn't really have any effect, and we should get an output that almost exactly matches the input. Right? Wrong, wrong, wrong, and this is basically the core of your problem. The problem is that for the short sections the idea of 27.5 Hz isn't clearly defined, and can't be represented well in the DFT.
That 27.5 Hz isn't particularly meaningful over the short segment can be seen by looking at the DFT in the figure below. Note that although the longer segment's DFT (black dots) shows a peak at 27.5 Hz, the short one's (red dots) doesn't.
Clearly, then, filtering below 40 Hz will just capture the DC offset; the result of the 40 Hz low-pass filter is shown in green below.
The blue curve (taken with a 200 Hz cut-off) is starting to match up much better. But note that it's not the low frequencies that are making it match up well, but the inclusion of high frequencies. It's not until we include every frequency possible in the short segment, up to 22 kHz, that we finally get a good representation of the original sine wave.
The reason for all of this is that a small segment of a 27.5 Hz sine wave is not a 27.5 Hz sine wave, and its DFT doesn't have much to do with 27.5 Hz.
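To see this numerically, here is a minimal sketch (the segment lengths, the normalization, and the plotted frequency range are my own choices, meant to mirror the black-dot/red-dot comparison above):
fs = 44100; f0 = 27.5;
tLong  = (0:8*4096-1)/fs;                 % a longer observation (~0.74 s)
tShort = (0:1023)/fs;                     % 1024 samples (~23 ms)
XL = abs(fft(sin(2*pi*f0*tLong)));
XS = abs(fft(sin(2*pi*f0*tShort)));
fL = (0:numel(XL)-1)*fs/numel(XL);        % bin-to-frequency mapping
fS = (0:numel(XS)-1)*fs/numel(XS);
plot(fL, XL/max(XL), 'k.', fS, XS/max(XS), 'r.');
xlim([0 300]);                            % the long DFT peaks near 27.5 Hz; the short one cannot resolve it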

Are you attenuating the value of the DC frequency sample to zero? It appears that you are not attenuating it at all in your example. Since you are implementing a high pass filter, you need to set the DC value to zero as well.
This would explain the low-frequency distortion: if that DC value is non-zero, the large transition produces a lot of ripple in the frequency response at low frequencies.
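In MATLAB terms, a minimal sketch (inputBlock is an illustrative stand-in, not a variable from the question):
inputBlock = randn(1024, 1);  % stand-in for one buffer of time-domain audio samples
X = fft(inputBlock);
X(1) = 0;                     % MATLAB's X(1) is bin 0 (DC); zero it for a true high-pass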
Here is an example in MATLAB/Octave to demonstrate what might be happening:
N = 32;                     % number of frequency-domain samples in the filter
os = 4;                     % oversampling (interpolation) factor
Fs = 1000;                  % sample rate used for the frequency axes
X = [ones(1,4) linspace(1,0,8) zeros(1,3) 1 zeros(1,4) linspace(0,1,8) ones(1,4)];
x = ifftshift(ifft(X));     % corresponding time-domain signal
Xos = fft(x, N*os);         % zero-padded FFT: interpolates the frequency response
f1 = linspace(-Fs/2, Fs/2-Fs/N, N);         % frequency axis for the N original samples
f2 = linspace(-Fs/2, Fs/2-Fs/(N*os), N*os); % frequency axis for the interpolated response
hold off;
plot(f2, abs(Xos), '-o');   % interpolated response between the samples
hold on;
grid on;
plot(f1, abs(X), '-ro');    % the original response samples
hold off;
xlabel('Frequency (Hz)');
ylabel('Magnitude');
Notice that in my code, I am creating an example of the DC value being non-zero, followed by an abrupt change to zero, and then a ramp up. I then take the IFFT to transform into the time domain. Then I perform a zero-padded fft (which is done automatically by MATLAB when you pass in an fft size bigger than the input signal) on that time-domain signal. The zero-padding in the time-domain results in interpolation in the frequency domain. Using this, we can see how the filter will respond between filter samples.
One of the most important things to remember is that even though you are setting filter response values at given frequencies by attenuating the outputs of the DFT, this guarantees nothing for frequencies occurring between sample points. This means the more abrupt your changes, the more overshoot and oscillation between samples will occur.
Now to answer your question on how this filtering should be done. There are a number of ways, but one of the easiest to implement and understand is the window design method. The problem with your current design is that the transition width is huge. Most of the time you will want transitions that are as sharp as possible, with as little ripple as possible.
In the next code, I will create an ideal filter and display the response:
N = 32;
os = 4;
Fs = 1000;
X = [ones(1,8) zeros(1,16) ones(1,8)];
x = ifftshift(ifft(X));
Xos = fft(x, N*os);
f1 = linspace(-Fs/2, Fs/2-Fs/N, N);
f2 = linspace(-Fs/2, Fs/2-Fs/(N*os), N*os);
hold off;
plot(f2, abs(Xos), '-o');
hold on;
grid on;
plot(f1, abs(X), '-ro');
hold off;
xlabel('Frequency (Hz)');
ylabel('Magnitude');
Notice that there is a lot of oscillation caused by the abrupt changes.
The FFT, or Discrete Fourier Transform, is a sampled version of the Fourier transform. The Fourier transform is applied to a signal over the continuous range -infinity to infinity, while the DFT is applied over a finite number of samples. This in effect results in a rectangular windowing (truncation) in the time domain when using the DFT, since we are only dealing with a finite number of samples. Unfortunately, the transform of a rectangular window is a sinc-type function (sin(x)/x).
The problem with having sharp transitions in your filter (a quick jump from 0 to 1 in one sample) is that this has a very long response in the time domain, which is being truncated by the rectangular window. So to help minimize this problem, we can multiply the time-domain signal by a more gradual window. If we multiply by a Hanning window, by adding the line:
x = x .* hanning(N).';
after taking the IFFT, we get this response:
So I would recommend trying the window design method, since it is fairly simple (there are better ways, but they get more complicated). Since you are implementing an equalizer, I assume you want to be able to change the attenuations on the fly, so I would suggest calculating and storing the filter in the frequency domain whenever the parameters change. Then, for each input audio buffer, you take the FFT of the buffer, multiply by your frequency-domain filter samples, and take the IFFT to get back to the time domain. This will be a lot more efficient than all of the branching you are doing for each sample.
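A hedged sketch of that flow (all of the names and sizes here are illustrative, and the block handling borrows the overlap-save idea from the first answer rather than a plain per-block multiply):
% --- design step: run whenever the EQ gains change ---
N = 1024;                              % number of frequency samples in the design
desiredMag = ones(N,1);                % fill with your per-bin gains (conjugate-symmetric, real)
h = real(ifft(desiredMag));            % zero-phase impulse response, wrapped around the ends
h = fftshift(h) .* hanning(N);         % centre it and taper with a window to reduce ripple
Hfilt = fft(h, 2*N);                   % frequency-domain filter, stored for the audio loop

% --- per audio block of N new samples (overlap-save, as in the first answer) ---
prevBlock = zeros(N,1);                % last N input samples from the previous call
newBlock = randn(N,1);                 % stand-in for the next N input samples
yfull = ifft(fft([prevBlock; newBlock]) .* Hfilt);
yout = real(yfull(N+1:end));           % valid filtered output for this block
prevBlock = newBlock;                  % save for the next call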

First, about the normalization: that is a known (non-)issue. The DFT/IDFT pair would require a factor of 1/sqrt(N) (apart from the standard cosine/sine factors) in each of them (direct and inverse) to make them symmetric and truly invertible. Another possibility is to divide one of them (the direct or the inverse) by N; this is a matter of convenience and taste. Often FFT routines do not perform this normalization at all, and the user is expected to be aware of it and normalize as they prefer.
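For instance, with MATLAB/Octave's convention the 1/N factor lives entirely in the inverse transform, which you can check directly:
x = randn(8,1);
max(abs(ifft(fft(x)) - x))   % ~1e-16: ifft already applies the 1/N factor, fft applies none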
Second: in a (say) 16-point DFT, what you call bin 0 would correspond to zero frequency (DC), bin 1 to a low frequency, ..., bin 4 to a medium frequency, bin 8 to the highest frequency, and bins 9...15 to the "negative frequencies". In your example, then, bin 1 is actually both the low frequency and the medium frequency. Apart from this consideration, there is nothing conceptually wrong with your "equalization". I don't understand what you mean by "the signal gets distorted at low frequencies". How do you observe that?
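As a concrete illustration of that bin layout, a minimal sketch (N = 16 and fs = 44100 are just example values):
N = 16; fs = 44100;
k = 0:N-1;
f = k*fs/N;                    % bins 0..N/2 run from 0 (DC) up to fs/2 (the highest frequency)
f(k > N/2) = f(k > N/2) - fs;  % bins N/2+1..N-1 are the negative frequencies
disp([k; f]);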

Related

Frequency domain phase shift, amplitude, hope size and non-linearity

I am trying to implement a frequency domain phase shift, but there are a few points about which I am not sure.
1- I am able to get a perfect reconstruction from a sine or sweep signal using a hanning window with a hop size of 50%. Nevertheless, how should I normalise my result when using a hop size > 50%?
2- When shifting the phase of low frequency signals (f<100, window size<1024, fs=44100) I can clearly see some non-linearity in my result. Is this because the window size is too short for low frequencies?
Thank you very much for your help.
clear
freq=500;
fs=44100;
endTime=0.02;
t = 1/fs:1/fs:(endTime);
f1=linspace(freq,freq,fs*endTime);
x = sin(2*pi*f1.*t);
targetLength=numel(x);
L=1024;
w=hanning(L);
H=L*.50;% Hopsize of 50%
N=1024;
%match input length with window length
x=[zeros(L,1);x';zeros(L+mod(length(x),H),1)];
pend=length(x)- L ;
pin=0;
count=1;
X=zeros(N,1);
buffer0pad= zeros(N,1);
outBuffer0pad= zeros(L,1);
y=zeros(length(x),1);
delay=-.00001;
df = fs/N;
f= -fs/2:df:fs/2 - df;
while pin<pend
    buffer = x(pin+1:pin+L).*w;
    % append zero padding in the middle
    buffer0pad(1:(L)/2)=buffer((L)/2+1: L);
    buffer0pad(N-(L)/2+1:N)=buffer(1:(L)/2);
    X = fft(buffer0pad,N);
    % Phase modification
    X = abs(X).*exp(1i*(angle(X))-(1i*2*pi*f'*delay));
    outBuffer=real(ifft(X,N));
    % undo zero padding
    outBuffer0pad(1:L/2)=outBuffer(N-(L/2-1): N);
    outBuffer0pad(L/2+1:L)=outBuffer(1:(L)/2);
    % Overlap-add
    y(pin+1:pin+L) = y(pin+1:pin+L) + outBuffer0pad;
    pin=pin+H;
    count=count+1;
end
%match output length with original input length
output=y(L+1:numel(y)-(L+mod(targetLength,H)));
figure(2)
plot(t,x(L+1:numel(x)-(L+mod(targetLength,H))))
hold on
plot(t,output)
hold off
Anything below 100 Hz has less than two cycles in your FFT window. Note that a DFT or FFT represents any waveform, including a single non-integer-periodic sinusoid, by summing up a whole bunch of sinusoids of very different frequencies, i.e. possibly a lot more than just one. That's just how the math works.
For a von Hann window containing less than 2 cycles, these are often a bunch of mostly completely different frequencies (possibly very far away in terms of percentage from your low frequency). Shifting the phase of all those completely different frequencies may or may not shift your windowed low frequency sinusoid by the desired amount (depending on how different in frequency your signal is from being integer-periodic).
Also for low frequencies, the complex conjugate mirror needs to be shifted in the opposite direction in phase in order to still represent a completely real result. So you end up mixing 2 overlapped and opposite phase changes, which again is mostly a problem if the low frequency signal is far from being integer periodic in the DFT aperture.
Using a longer window in time and samples allows more cycles of a given frequency to fit inside it (thus possibly needing a lesser power of very different frequency sinusoids to be summed up in order to compose, make up or synthesize your low frequency sinusoid); and the complex conjugate is farther away in terms of FFT result bin index, thus reducing interference.
A sequence using any hop of a von Hann window that is 50% / (some integer) of the window length is non-lossy (except for the very first or last window). All other hop sizes modulate or destroy information, and thus can't be normalized by a constant for reconstruction.
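You can check the 50% hop case numerically; a minimal sketch (window length, number of hops, and the explicit periodic-Hann formula are my own choices):
L = 1024; H = L/2;                             % 50% hop
w = 0.5 - 0.5*cos(2*pi*(0:L-1)'/L);            % periodic von Hann window (same as hanning(L,'periodic'))
acc = zeros(L + 10*H, 1);
for p = 0:10
    acc(p*H + (1:L)) = acc(p*H + (1:L)) + w;   % overlap-add the bare windows
end
plot(acc);                                     % away from the first and last window the sum is exactly 1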

Any good ways to obtain zero local means in audio signals?

I have asked this question on DSP.SE before, but it got no attention. Maybe it was not so related to signal processing.
I need to divide a discrete audio signal into segments in order to do some statistical processing and analysis on them. Segments with a fixed local mean would be very helpful for my case. The segment length is predefined, e.g. 512 samples.
I have tried several things. I use the reshape() function to divide the audio signal into segments, and then calculate the mean of every segment as:
L = 512; % Length of segment
N = floor(length(audio(:,1))/L); % Number of segments
seg = reshape(audio(1:N*L,1), L, N); % Reshape into LxN sized matrix
x = mean(seg); % Calculate mean of each column
Subtracting x(k) from each seg(:,k) would make each local mean zero, yet it would distort the audio signal a lot when the segments are joined back together.
So, since the mean of a Hanning window is almost 0.5, subtracting 2*x(k)*hann(L) from each seg(:,k) was the first thing I tried. But this time, multiplying by 2 (to make the mean of the Hanning window almost equal to 1) distorted the neighborhood of the midpoint of each segment.
Then I used convolution with a smaller Hanning window instead of multiplying directly, and subtracted the result (as shown in the figure below) from each seg(:,k).
This last step gives better results, yet it is still not very useful when the segments are smaller. I have seen many amazing approaches on this site for different problems, so I just wonder if there are any clever ways or existing methods to obtain zero local means that distort the audio signal less. I read that this property is useful in some decompositions such as EMD. So maybe I need such a decomposition?
You can try to use a moving average filter:
x = cumsum(rand(15*512, 1)-0.5);          % generate a random input signal
mean_filter = 1/512 * ones(1, 512);       % generate a moving-average (mean) filter
local_mean = filtfilt(mean_filter, 1, x); % filtfilt is used instead of filter to obtain a symmetric moving average
% plot the result
figure
subplot(2,1,1)
plot(x);
hold on
plot(local_mean);
subplot(2,1,2)
plot(x - local_mean);
You can tune the filter by changing the length of the mean filter. Using a shorter filter drives the local means inside each interval closer to zero, but it also filters more low frequencies out of your signal.

Sampling at exactly Nyquist rate in Matlab

Today I stumbled upon a strange outcome in MATLAB. Let's say I have a sine wave such that
f = 1;
Fs = 2*f;
t = linspace(0,1,Fs);
x = sin(2*pi*f*t);
plot(x)
and the outcome is in the figure.
when I set,
f = 100
outcome is in the figure below,
What is the exact reason for this? By the Nyquist sampling theorem, it should have generated the sine properly. Of course, when I take Fs >> f I get better results and a very good sine shape. My explanation to myself is that MATLAB is having a hard time with floating-point numbers, but I am not sure if this is true at all. Anyone have any suggestions?
In the first case you only generate 2 samples (the third input of linspace is number of samples), so it's hard to see anything.
In the second case you generate 200 samples from time 0 to 1 (including those two values). So the sampling period is 1/199, and the sampling frequency is 199, which is slightly below the Nyquist rate. So there is aliasing: you see the original signal of frequency 100 plus its alias at frequency 99.
In other words: the following code reproduces your second figure:
t = linspace(0,1,200);
x = .5*sin(2*pi*99*t) -.5*sin(2*pi*100*t);
plot(x)
The .5 and -.5 above stem from the fact that a sine wave can be decomposed as the sum of two spectral deltas at positive and negative frequencies, and the coefficients of those deltas have opposite signs.
The sum of those two sinusoids is equivalent to amplitude modulation, namely a sine of frequency 99.5 modulated by a sine of frequency 1/2. Since time spans from 0 to 1, the modulator signal (whose frequency is 1/2) only completes half a period. That's what you see in your second figure.
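You can verify that product form numerically; a quick sum-to-product check, independent of any sampling:
t = linspace(0, 1, 10000);
y1 = .5*sin(2*pi*99*t) - .5*sin(2*pi*100*t);
y2 = -sin(pi*t) .* cos(2*pi*99.5*t);   % 99.5 Hz carrier modulated by a 1/2 Hz sine
max(abs(y1 - y2))                      % numerically zero (rounding error only)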
To avoid aliasing you need to increase sample rate above the Nyquist rate. Then, to recover the original signal from its samples you can use an ideal low pass filter with cutoff frequency Fs/2. In your case, however, since you are sampling below the Nyquist rate, you would not recover the signal at frequency 100, but rather its alias at frequency 99.
Had you sampled above the Nyquist rate, for example Fs = 201, the original signal could ideally be recovered from the samples.† But that would require an almost ideal low pass filter, with a very sharp transition between passband and stopband. Namely, the alias would now be at frequency 101 and should be rejected, whereas the desired signal would be at frequency 100 and should be passed.
To relax the filter requirements you can sample well above the Nyquist rate. That way the aliases are further apart from the signal and the filter has an easier job separating signal from aliases.
† That doesn't mean the graph looks like your original signal (see SergV's answer); it only means that after ideal lowpass filtering it will.
Your problem is not related to the Nyquist theorem and aliasing. It is a simple problem of graphical representation. You can change your code so that the frequency of the sine is below the Nyquist limit, but the graph will look as strange as before:
t = linspace(0,1,Fs+2);
plot(sin(2*pi*f*t));
Result:
To explain the problem, I modify your code:
Fs=100;
f=12; %f << Fs
t=0:1/Fs:0.5; % step =1/Fs
t1=0:1/(10*Fs):0.5; % step=1/(10*Fs) for precise graphic representation
subplot (2, 1, 1);
plot(t,sin(2*pi*f*t),"-b",t,sin(2*pi*f*t),"*r");
subplot (2, 1, 2);
plot(t1,sin(2*pi*f*t1),"g",t,sin(2*pi*f*t),"r*");
See result:
Red stars - values of sin(2*pi*f*t) sampled at rate Fs.
Blue lines - lines which connect the red stars. This is the usual data representation of plot(): linear interpolation between data points.
Green curve - sin(2*pi*f*t1), i.e. the sine evaluated on a much finer grid.
Your eyes and brain can easily see that these graphs represent the sine.
Now change the frequency to a higher value:
f=48; % 2*f < Fs !!!
Look at the blue lines and red stars. Your eyes and brain no longer see that these graphs represent the same sine, but the "red stars" are still valid values of the sine; see the bottom graph.
Finally, here are the same graphs for a sine with frequency f=50 (2*f = Fs):
P.S.
The Nyquist-Shannon sampling theorem states for your case that if:
Fs > 2*f
and you have an infinite number of samples (red stars on our plots),
then you can reproduce the values of the function at any time (green curve on our plots). You must use sinc interpolation to do it.
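As an illustration, a minimal sinc-interpolation sketch (the finite sum of sincs is only an approximation near the edges of the observation; Fs = 201 is chosen to be just above the Nyquist rate for f = 100):
Fs = 201; f = 100;
n = 0:Fs-1;                        % one second of samples
xs = sin(2*pi*f*n/Fs);             % the samples (the "red stars")
t = linspace(0, 1, 4000);          % dense time grid for the reconstruction
xr = zeros(size(t));
for k = 1:numel(n)
    xr = xr + xs(k) * sinc(Fs*t - n(k));   % sum of shifted sinc kernels
end
plot(t, xr);                       % approximates sin(2*pi*f*t), apart from edge effects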
copied from Matlab Help:
linspace
Generate linearly spaced vectors
Syntax
y = linspace(a,b)
y = linspace(a,b,n)
Description
The linspace function generates linearly spaced vectors. It is similar to the colon operator ":", but gives direct control over the number of points.
y = linspace(a,b) generates a row vector y of 100 points linearly spaced between and including a and b.
y = linspace(a,b,n) generates a row vector y of n points linearly spaced between and including a and b. For n < 2, linspace returns b.
Examples
Create a vector of 100 linearly spaced numbers from 1 to 500:
A = linspace(1,500);
Create a vector of 12 linearly spaced numbers from 1 to 36:
A = linspace(1,36,12);
With linspace the sampling interval is not made explicit, so you can use the more common form:
t = 0:Ts:1;
or
t = 0:1/Fs:1;
and change the Fs values.
The first figure is due to the numerical approximation of 0 in sin(0) and sin(2*pi): notice that the values are on the order of 10^(-16).
I wrote the function reconstruct_FFT that can recover critically sampled data even for short observation intervals if the input sequence of samples is periodic. It performs lowpass filtering in the frequency domain.

How to use inverse FFT on amplitude-frequency response?

I am trying to create an application for calculating coefficients for a graphic equalizer FIR filter. I am doing some prototyping in Matlab but I have some problems.
I have started with the following Matlab code:
% binamps vector holds 2^13 = 8192 bins of desired amplitude values for frequencies in range 0.001 .. 22050 Hz (half of samplerate 44100 Hz)
% it looks just fine, when I use Matlab plot() function
% now I get ifft
n = size(binamps,1);
iff = ifft(binamps, n);
coeffs = real(iff); % throw away the imaginary part, because FIR module will not use it anyway
But when I take the fft() of the coefficients, I see that the frequencies are stretched by a factor of 2 and the end of my AFR data is lost:
p = fft(coeffs, n); % take the fourier transform of coefficients for a test
nUniquePts = ceil((n+1)/2);
p = p(1:nUniquePts); % select just the first half since the second half
% is a mirror image of the first
p = abs(p); % take the absolute value, or the magnitude
p = p/n; % scale by the number of points so that
% the magnitude does not depend on the length
% of the signal or on its sampling frequency
p = p.^2; % square it to get the power
sampFreq = 44100;
freqArray = (0:nUniquePts-1) * (sampFreq / n); % create the frequency array
semilogx(freqArray, 10*log10(p))
axis([10, 30000 -Inf Inf])
xlabel('Frequency (Hz)')
ylabel('Power (dB)')
So I guess I am using ifft wrong. Do I need to make my binamps vector twice as long and create a mirror in the second half of it? If that is the case, is this just MATLAB's implementation of ifft, or do other C/C++ FFT libraries (especially Ooura FFT) also need mirrored data for the inverse FFT?
Is there anything else I should know to get the FIR coefficients out of ifft?
Your frequency domain vector needs to be complex rather than just real, and it needs to be symmetric about the mid point in order to get a purely real time domain signal. Set the real parts to your desired magnitude values and set the imaginary parts to zero. The real parts need to have even symmetry such that A[N - i] = A[i] (A[0] and A[N / 2] are "special", being the DC and Nyquist components - just set these to zero.)
The above applies to any general purpose complex-to-complex FFT/IFFT, not just MATLAB's implementation.
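Here is a minimal sketch of that construction (N and the magnitude values are illustrative, and the indexing is MATLAB's 1-based convention):
N = 16;
mag = rand(N/2 - 1, 1);          % desired magnitudes for bins 1 .. N/2-1
A = zeros(N, 1);
A(2:N/2) = mag;                  % positive-frequency bins (real parts = magnitudes, imaginary parts = 0)
A(N:-1:N/2+2) = mag;             % mirror so that A[N - i] = A[i] in the 0-based terms used above
A(1) = 0;                        % DC
A(N/2 + 1) = 0;                  % Nyquist
a = ifft(A);
max(abs(imag(a)))                % ~1e-16: the time-domain result is real up to rounding error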
Note that if you're trying to design a time domain filter with an arbitrary frequency response then you'll need to do some windowing in the frequency domain first. You might find this article helpful - it talks about arbitrary FIR filter design using MATLAB, in particular fir2.
To get a real result, the input to any typical generic IFFT (not just Matlab's implementation) needs to be complex-conjugate-symmetric. So doing an IFFT with a given number of independent specification points will require an FFT at least twice as long (preferably even longer to allow for some transition to zero from the highest frequency cut-off).
Trying to get a real result by throwing away the "imaginary" portion of a complex result won't work, as you will be throwing away actual required information content the time-domain filter needs for the given frequency response input to the IFFT. However, if the original data is conjugate-symmetric, then the imaginary portion of the IFFT/FFT result will be (usually insignificant) rounding-error noise that can be thrown away.
Also, the DTFT of a finite frequency response will produce an infinitely long FIR. To get a finite-length FIR, you will need to compromise your frequency response specification so that there is little energy left in the latter portion of the time-domain representation that has to be truncated to make the FIR realizable or finite. One common (but not necessarily the best) way to do this is to window the FIR result produced by the IFFT and, by trial and error, try different windows until you find a FIR filter for which an FFT produces a result "close enough" to your original frequency spec.

Using Cepstrum for PDA

Hey, I am currently developing an algorithm to decide whether a frame is voiced or unvoiced. I am trying to use the cepstrum to discriminate between these two situations. I use MATLAB for my implementation.
I have some problems saying something general about the frame, but my current implementation looks like this (I'm aware that MATLAB has the function rceps, but that hasn't worked for me either):
ceps = abs(ifft(log10(abs(fft(frame.*window')).^2+eps)));
Can anybody give me a small demo that will convert the frame to the power cepstrum, i.e. a single lollipop at the pitch frequency? For instance, use this code to generate the test signal:
fs = 8000;
timelength = 25e-3;
freq = 500;
k = 0:1/fs:timelength-(1/fs);
s = 0.8*sin(2*pi*freq*k);
Thanks.
According to Wikipedia, the power cepstrum is (deep breath) the magnitude squared of the Fourier transform of the log of the magnitude squared of the Fourier transform of the signal. So I think you're looking for
function c = ceps(frame, win)
c = abs(fft(log10(abs(fft(frame.*win)).^2+eps))).^2;
Note that I changed one of your variable names because WINDOW is a predefined function in the Signal Processing Toolbox.
But ifft and fft only differ by a scale factor, and the outer abs won't change the overall shape, so where's the lollipop, right? See further down on the Wikipedia page.
A sinusoidal time input isn't going to give you an impulse in the cepstrum. The sine should yield an impulse in the spectrum, which will still be an impulse after the logmag operation, which will transform into a level shift in the cepstrum. To get something impulsive in the cepstrum, you need something periodic in the spectrum, which means you need something with multiple harmonic frequencies in the time domain. Consider, for instance, a square wave:
N = 1024;
h = hann(N, 'periodic');
f = 10;
x = sin(2*pi*f*((1:N)'-1)/N); %# sine with f cycles per block
s = 2*(x > 0) - 1; %# square wave
cx = ceps(x, h);
cs = ceps(s, h);
cs will have your longed-for lollipop, not cx.
There always seems to be a large component in the 0th cepstral bin. I guess this is because the logarithm operation always gives the input to the second FFT a big level shift? Also, I don't get the idea of quefrency; I would have expected the lollipop to be at N/f. So maybe there's still something wrong with this code, or (more likely) with my understanding.