Finding energy in frequency bands of an audio file vector - MATLAB

I have an audio file which I imported into my MATLAB workspace and now have as a vector.
I have broken the vector into windows 100 ms long:
window_length = fs*0.1;
How can I find the energy in certain frequency bands (0-1000 Hz, 1000-2000 Hz, etc.)?
I've tried to use the filter below:
% Create band-pass filter for the 500-1000 Hz band
% (ellipord takes the passband edges first, then the stopband edges)
[N,Wp]=ellipord([500 1000]/(fs/2),[450 1050]/(fs/2),1,40);
[b,a]=ellip(N,1,40,Wp);              % numerator b, denominator a
window_filtered=filter(b,a,window);
% Find filtered energy (sum of squared samples)
Energy_band_X_X(position)=sum(window_filtered.^2);
However, my results are too large and don't make any sense.
Thanks!

I recommend using a simple FFT to find the entire frequency spectrum and then summing the energy in the band of interest. You should also normalize your input data; for example, divide your data by its maximum absolute value so the samples lie between -1 and 1. If you are dealing with 16-bit or 8-bit integer-valued audio samples, then your energy values are going to be very large.
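To illustrate the idea, here is a minimal NumPy sketch (not the asker's MATLAB code; the sampling rate, tone frequency, and band edges are made up for the example). By Parseval's theorem, summing the squared FFT magnitudes in a band, divided by the number of samples, gives that band's share of the time-domain energy:

```python
import numpy as np

# Hypothetical example: a 100 ms window of a 440 Hz tone at fs = 8000 Hz
fs = 8000
t = np.arange(0, 0.1, 1 / fs)            # 800 samples
x = np.sin(2 * np.pi * 440 * t)
x = x / np.max(np.abs(x))                # normalize to [-1, 1]

def band_energy(x, fs, f_lo, f_hi):
    """Energy of x in [f_lo, f_hi) Hz, computed from the FFT (Parseval)."""
    X = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), 1 / fs)
    band = (np.abs(freqs) >= f_lo) & (np.abs(freqs) < f_hi)
    return np.sum(np.abs(X[band]) ** 2) / len(x)

e_low = band_energy(x, fs, 0, 1000)      # band containing the tone
e_high = band_energy(x, fs, 1000, 2000)  # band with (almost) no energy
```

Because the input is normalized, `e_low` matches the time-domain energy `sum(x**2)` of the window rather than blowing up with the integer sample scale.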

Related

How to generate an accurate FFT plot of guitar harmonics with only 256 data points @ 44.1 kHz Fs? [MATLAB]

I'm trying to make a real-time(ish) monophonic guitar-to-MIDI program. I want a latency of <= 6 ms. To find what note was played, I aim to sample 256 points (which should take approximately 6 ms), run an FFT, and analyze the magnitude plot to determine the pitch of the note played.
When I do this in MATLAB, it gives me back very unstable/inaccurate results, with peaks appearing in random places, etc.
The note being input is 110 Hz, sampled @ 44.1 kHz. I've applied a high-pass filter at 500 Hz with a roll-off of 48 dB/octave, so only the higher harmonics of the signal should remain. The audio lasts for 1 second (filled with zeros after 256 samples).
Code:
% FFT work
guitar = wavread('C:\Users\Donnacha\Desktop\Astring110hz.wav'); % wavread is deprecated; use audioread in newer MATLAB
guitar = guitar(1:44100);   % keep the first second of audio
X = fft(guitar);
Xmag = abs(X);
plot(Xmag);
[Zoomed-in FFT plot]
I was hoping to see all the harmonics of 110 Hz (the A note on guitar) starting above 500 Hz.
How would I achieve accurate results from an FFT with so little data?
You can't (at least not reliably for all notes in a guitar's range).
256 samples at 44.1kHz is less than one period of most low string guitar notes. One period of vibration from a guitar's open low E string takes around 535 samples, depending on the guitar's tuning and intonation.
Harmonics often require multiple periods (repetitions) of a guitar note waveform within the FFT window in order to show up reliably in the FFT's spectrum. The more periods within the FFT window, the more reliably and sharply the harmonics show up. Even more periods are required if the data is von Hann (et al.) windowed to avoid "leakage" windowing artifacts. So you have to pick the minimum number of periods needed based on the lowest note needed, your window type, and your statistical reliability and frequency resolution requirements.
An alternative is to concatenate several sets of your 256 samples into a longer window, at least as long as several periods of the lowest pitch you want to reliably plot.
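The numbers above can be checked in a few lines. This NumPy sketch (a toy signal, not the asker's recording) shows that a 256-sample window gives bins wider than the 110 Hz harmonic spacing, while a 4096-sample window (e.g. 16 concatenated blocks) resolves the harmonics:

```python
import numpy as np

fs = 44100
f0 = 110.0                          # open A string fundamental

# 256-sample window: bins are fs/256 ~= 172 Hz apart -- wider than the
# 110 Hz spacing between harmonics, so neighbors cannot be told apart.
spacing_short = fs / 256

# 4096 samples (~93 ms) give ~= 10.8 Hz bins.
n_long = 4096
spacing_long = fs / n_long

# Toy signal: harmonics 5-8 of 110 Hz (everything below 500 Hz filtered out)
t = np.arange(n_long) / fs
sig = sum(np.sin(2 * np.pi * f0 * k * t) for k in range(5, 9))

# With the long window, the strongest bin lands within one bin of a harmonic
mag = np.abs(np.fft.rfft(sig))
peak_bin = int(np.argmax(mag))
peak_hz = peak_bin * fs / n_long
```

With only 256 samples, all four harmonics would fall into three adjacent ~172 Hz bins, which is why the short-window peaks look random.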

window size and overlapping in pwelch function of matlab for PSD evaluations

Could anyone please suggest an ideal window size and overlap for the pwelch function in MATLAB? I have several 200 ms EEG signals with a sampling rate of 1000 Hz (signal length, or number of samples, = 200) for which I want to evaluate spectral power. By default, pwelch uses a Hamming window and divides the data into 8 segments with 50% overlap. Are these default values okay for a signal with only 200 samples? The defaults are working fine and give me a PSD plot, but I want to make sure that what I am doing is conceptually correct, and to ask whether someone could suggest a better way of doing this. This is a study of the ERP response to stimuli in a 200 ms time window. I want to compare the spectral powers in different frequency bands.
Thanks for the help!
Considering your time window is only 200 ms, I would suggest using pwelch with a single 200 ms window and no overlap. The frequency precision of the underlying Fourier transform is related to the length of the data segment used. Specifically, the spacing between Fourier transform bins is Fs / N, where Fs is the sampling rate of the data and N is the length of the segment. So, for example, with a sampling rate of 1000 Hz and a 200 ms data segment (200 samples), the effective resolution of your frequency transform will be 5 Hz (1000 / 200).
see also: https://electronics.stackexchange.com/questions/12407/what-is-the-relation-between-fft-length-and-frequency-resolution
If you were analyzing a longer section of data (say, a few seconds), then using pwelch with overlapping windows would be more appropriate.
A good introduction to frequency based analyses is the text 'Analyzing Neural Time Series Data' by M.X. Cohen.
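The single-window setup can be sketched with SciPy's welch, which is the Python counterpart of pwelch (toy 10 Hz sinusoid plus noise standing in for the EEG data):

```python
import numpy as np
from scipy.signal import welch

fs = 1000                        # sampling rate (Hz)
n = 200                          # 200 ms epoch -> 200 samples
rng = np.random.default_rng(0)
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(n)

# A single Hamming window spanning the whole epoch, no overlap
f, pxx = welch(x, fs=fs, window='hamming', nperseg=n, noverlap=0)

# Frequency bins come out spaced fs/n = 5 Hz apart
```

With `nperseg` equal to the full epoch length, Welch's method reduces to a single windowed periodogram, which is all the averaging a 200-sample record allows.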

How to specify a certain number of fft points in a frequency

I apologize in advance if my question doesn't make sense; I am confused myself and have trouble understanding it. It's a general question that I need answered for the code I am currently working on.
I have a sample rate of 44.1 kHz for an audio file (WAV) on which I want to use an FFT. If I am right, that means there are 44100 points in 1 second. I have fulfilled the Nyquist-Shannon theorem, which requires fs/2 > fmax. I have a frequency resolution (FFT size) of 4096. I have defined a variable that is 300 Hz, and I want to know which points correspond to it.
If you mean that your FFT size is 4096 and your sample rate is 44.1 kHz, then each bin has a resolution of 44100/4096 = 10.7666015625 Hz, and a 300 Hz sine wave will have a peak at the bin with index 4096*300/44100 = 27.863945578231293, so in practice it will have a maximum at bin index 28, with some energy in adjacent bins. (Note this is using the common convention of indices starting at 0 - if you are using MATLAB then the indices will most likely be 1-based and you will need to compensate for this.)
See this useful answer for a more detailed explanation of how bin indices relate to frequency.
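The arithmetic from the answer can be written out directly; this NumPy sketch also checks it against an actual FFT of a 300 Hz sine (the signal is made up for the check):

```python
import numpy as np

fs = 44100
n_fft = 4096
f = 300.0

bin_spacing = fs / n_fft             # 44100/4096 ~= 10.77 Hz per bin
exact_index = n_fft * f / fs         # ~= 27.86 (0-based; add 1 for MATLAB)
nearest_bin = round(exact_index)     # 28

# Check against an actual FFT of a 300 Hz sine
x = np.sin(2 * np.pi * f * np.arange(n_fft) / fs)
peak_bin = int(np.argmax(np.abs(np.fft.rfft(x))))
```

Since 300 Hz falls between bins 27 and 28, the peak appears at bin 28 with leakage into the neighbors, exactly as the answer describes.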

Summing Frequency Spectrums

I've a set of data from an EEG device from which I want to find the strength of different brain waves in Matlab. I tried to use EEGLAB but I wasn't really sure how, so at this point I'm simply using the dsp toolbox in Matlab.
For background: I've 15 epochs, 4 seconds in length. The device sampled at 256 Hz, and there are 264 sensors, so there are 1024 data points for each sensor for each epoch, i.e. my raw data is 264 x 1024 x 15. The baseline is removed. The data in each epoch is going to be used to train a classifier eventually, so I'm dealing with each epoch individually. I'll come up with more data samples later.
Anyway, what I've done so far is apply a Hann window to the data and then run fft on the windowed data, so now I have the information in the frequency domain. However, I'm not quite sure how to go from the power in the FFT bins to the power in certain frequency bands (e.g. alpha, 8-13 Hz), to get the values I seek.
I know the answer should be straightforward, but I can't seem to find it online, and there's further confusion from certain sources recommending a wavelet transform. Here's the little bit of code I have so far; the input "data" is one epoch, i.e. 264 x 1024.
% apply a Hann window to each sensor's time series
siz = size(data);                        % data is 264 x 1024
hann_window = hann(siz(2));
hann_window = repmat(hann_window.', siz(1), 1);
hann_data = data .* hann_window;         % element-wise multiply, not a matrix product
% run fft along the time dimension (rows = sensors)
X = fft(hann_data, [], 2);
X_mag = abs(X);
Thanks for the assistance!
If I'm understanding your question correctly, you want to scale the FFT output to get the correct power. To do this you need to divide by the number of samples used for the FFT:
X_mag = abs(X)/size(X,2); % divide by the FFT length (1024 here)
See this question for more info.
Once the content is scaled correctly, you can find the power in a band (e.g. 8 - 13 Hz) by integrating the content from the start to the stop of the band. Since you are dealing with discrete values it is a discrete integration. For perspective, this is equivalent to changing the resolution bandwidth of a spectrum analyzer.
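A NumPy sketch of that discrete integration (a toy 10 Hz oscillation standing in for one sensor's 4 s epoch; fs = 256 Hz as in the question):

```python
import numpy as np

fs = 256                           # EEG sampling rate (Hz)
n = 1024                           # 4 s epoch -> 1024 samples
t = np.arange(n) / fs
data = np.sin(2 * np.pi * 10 * t)  # toy 10 Hz "alpha" oscillation

w = np.hanning(n)
X = np.fft.rfft(data * w) / n      # scale by the number of samples
freqs = np.fft.rfftfreq(n, 1 / fs)

# Discrete integration over the alpha band (8-13 Hz): sum the bin powers
alpha = (freqs >= 8) & (freqs <= 13)
alpha_power = np.sum(np.abs(X[alpha]) ** 2)
rest_power = np.sum(np.abs(X[~alpha]) ** 2)
```

For the toy signal, essentially all the power lands in the alpha band, mirroring what you would expect from a dominant alpha rhythm.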

Channel vocoder using FFT - what to do about DC Component and Nyquist frequency?

I am trying to implement a channel vocoder using the iOS Accelerate vDSP FFT algorithms. I am having trouble figuring out how to treat the DC component and Nyquist frequency.
The modulator and carrier signals are both float arrays of length n. On each, I perform a forward FFT and am returned a frequency plot (call it bin[]) of length n/2.
As per the vDSP specifications, bin[1] contains the first frequency above 0Hz, bin[2] the second, etc... bin[0] contains the DC Component in the real part and the Nyquist frequency (which would normally be in bin[n/2]) in the imaginary part. vDSP essentially packs the frequency plot into as little space as possible (the imaginary part for bin[0] and bin[n/2] should always be zero before the packing).
I split the frequency plot for both carrier and modulator into k bands. My goal is to multiply each frequency in carrier.band[x] by the total magnitude of the frequencies in modulator.band[x]. Essentially, increasing the intensity of those frequencies in the carrier that are also present in the modulator.
So if n=8 and k=2, the second band for the modulator would contain bin[2] and bin[3]. Simple enough to find the total magnitude: just sum the magnitudes of each bin (for example, mag[2] = sqrt( bin[2].real*bin[2].real + bin[2].imag*bin[2].imag )).
That works great for all bands except the first one, because the first band contains the weird bin[0] with the DC component and Nyquist frequency.
How do I handle that first bin when calculating the total magnitude of a band? Do I just assume the magnitude for the first bin is JUST the DC component by itself? Do I discard the Nyquist frequency?
Thank you to anyone who can provide some guidance! I appreciate it.
I suggest you ignore 0 Hz and Nyquist since they contain no useful information in the case of an audio signal.
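The packed layout and the zeroing advice can be emulated outside vDSP; this is a NumPy sketch (not the Accelerate API) with n=8 and k=2 as in the question:

```python
import numpy as np

n = 8
x = np.random.default_rng(1).standard_normal(n)

X = np.fft.rfft(x)                   # length n/2 + 1: [DC, 1, 2, 3, Nyquist]

# Emulate vDSP's packed layout: DC in bin[0].real, Nyquist in bin[0].imag
packed = X[:n // 2].copy()
packed[0] = X[0].real + 1j * X[n // 2].real

# Following the advice above: zero DC and Nyquist before computing band sums
packed[0] = 0.0

k = 2
bands = np.split(packed, k)          # n/2 bins -> k bands of n/(2k) bins
totals = [np.sum(np.abs(b)) for b in bands]
```

With bin[0] zeroed, the first band's total is just the magnitudes of the genuine low-frequency bins, and the second band matches |bin[2]| + |bin[3]| as described in the question.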