MFCC spectrogram vs SciPy spectrogram - scipy

I am currently working on a Convolutional Neural Network (CNN) and started to look at different spectrogram plots:
With regard to the Librosa plot (MFCC), the spectrogram is very different from the other spectrogram plots. I took a look at the comment posted here about the "undetailed" MFCC spectrogram. How would I carry out, in Python code, the solution suggested there?
Also, would this poor-resolution MFCC plot miss any nuances as the images go through the CNN?
Any help in carrying out the Python Code mentioned here will be sincerely appreciated!
Here is my Python code for the comparison of the Spectrograms and here is the location of the wav file being analyzed.
Python Code
# Load various imports
import os
import librosa
import librosa.display
import matplotlib.pyplot as plt
#24bit accessible version
import wavfile
plt.figure(figsize=(17, 30))
filename = 'AWCK AR AK 47 Attached.wav'
librosa_audio, librosa_sample_rate = librosa.load(filename, sr=None)
xmin = 0
plt.title('Original Audio - 24BIT')
fig_1 = plt.plot(librosa_audio)
sr = librosa_sample_rate
mfccs = librosa.feature.mfcc(y=librosa_audio, sr=librosa_sample_rate, n_mfcc=40)
librosa.display.specshow(mfccs, sr=librosa_sample_rate, x_axis='time', y_axis='hz')
plt.title('Librosa Plot')
X = librosa.stft(librosa_audio)
Xdb = librosa.amplitude_to_db(abs(X))
librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='hz')
# plt.colorbar()
# maximum frequency
Fs = 96000.
samplerate, data =
plt.specgram(data, Fs=samplerate)
plt.title('Scipy Plot (Fs=96000)')

MFCCs are not spectrograms (time-frequency) but "cepstrograms" (time-cepstrum). Comparing an MFCC plot with a spectrogram visually is not easy, and I am not sure it is very useful either. If you wish to do so, invert the MFCC to get back a (mel) spectrogram by applying an inverse DCT. You can probably use librosa's mfcc_to_mel (librosa.feature.inverse.mfcc_to_mel) for that.
This will let you estimate how much data was lost in the forward MFCC transformation. But it may not say much about how much information relevant to your task was lost, or how much irrelevant noise was removed.
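As a rough illustration of what the inverse DCT gives back, the sketch below applies an orthonormal DCT-II to one made-up log-mel frame, keeps only the first 13 coefficients, and inverts the result. The frame values, the sizes (32 mel bands, 13 coefficients) and the helper names are assumptions for illustration only; on real data the MFCC matrix would be passed to mfcc_to_mel instead.

```python
import math

def dct2_ortho(x):
    """Orthonormal DCT-II, the transform that turns log-mel energies into MFCCs."""
    n = len(x)
    coeffs = []
    for k in range(n):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs

def idct2_ortho(coeffs, n):
    """Inverse of dct2_ortho; coefficients beyond len(coeffs) are taken as zero."""
    frame = []
    for i in range(n):
        value = coeffs[0] * math.sqrt(1.0 / n)
        for k in range(1, len(coeffs)):
            value += coeffs[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
        frame.append(value)
    return frame

# Made-up log-mel frame (dB): a smooth spectral tilt plus fine ripple across bands.
n_mels = 32
log_mel = [
    -20.0
    + 10.0 * math.cos(math.pi * (i + 0.5) * 1 / n_mels)
    + 2.0 * math.cos(math.pi * (i + 0.5) * 15 / n_mels)
    for i in range(n_mels)
]

mfcc = dct2_ortho(log_mel)[:13]      # truncate: only the low-order coefficients survive
approx = idct2_ortho(mfcc, n_mels)   # smoothed log-mel frame recovered from the MFCCs
rms_error = math.sqrt(sum((a - b) ** 2 for a, b in zip(approx, log_mel)) / n_mels)
print(f"RMS error after keeping 13 of {n_mels} coefficients: {rms_error:.3f} dB")
```

The fine ripple sits in a high-order coefficient, so it is lost in the truncation while the smooth tilt survives. That is the "undetailed" look of an MFCC plot expressed as a number.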
This needs to be evaluated for your task and dataset. The best way is to try different settings, and evaluate performance using the evaluation metrics that you care about.
Note that MFCCs may not be such a great representation for the typical 2D CNNs that are applied to spectrograms. That is because locality has been reduced: in the MFCC domain, frequencies that are close to each other are no longer next to each other on the vertical axis. And because 2D CNNs have kernels with limited locality (typically 3x3 or 5x5 early on), this can reduce the performance of the model.


How to generate smooth filtered envelope on EMG data in Matlab

I'm new to analysing EMG data and would appreciate some carefully explained help.
I would like to generate a smooth, linear envelope signal of my EMG data (50 kHz sampling rate) like the one published in this paper:
My end goal is to be able to analyze the relationship between EMG activity (output) and action potentials fired from upstream neurons (putative input) recorded at the same time.
Even though this paper lists the filtering methods out quite clearly, I do not understand what they mean or how to perform them in matlab, which is the analysis tool I have available to me.
In the code I have written so far, I can dc offset as well as rectify my data:
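x = EMGtime_data;
y = EMGvoltage_data;
% dc offset (assumed here to be removed by subtracting the mean)
y_dc = y - mean(y);
% Rectification of the EMG signal (full-wave, assumed)
rec_y = abs(y_dc);
plot(x, rec_y)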
But then I am not sure how to proceed.
I have tried the envelope function, but it is not as smooth as I would like:
For instance, if I used the following:
I get this (which also doesn't seem to care that the data is rectified):
Even if I were to accept the envelope function, I'm not sure how to access just the processed envelope data to adjust the plot (i.e. change the y-range), or analyse the data further for on-set and off-set of the signal since the results of this function seem to be coupled with the original trace.
I have also come across fastrms.m, which seems promising. Unfortunately, I do not understand how to implement this function since the general explanation is over my head and the example code is lacking any defined variable (so I don't know where to integrate my own data!)
The example code from fastrms.m file exchange is here
Fs = 200; T = 5; N = T*Fs; t = linspace(0,T,N);
noise = randn(N,1);
[a,b] = butter(5, [9 12]/(Fs/2));
x = filtfilt(a,b,noise);
window = gausswin(0.25*Fs);
rms = fastrms(x,window,[],1);
plot(t,x,t,rms*[1 -1],'LineWidth',2);
xlabel('Time (sec)'); ylabel('Signal')
title('Instantaneous amplitude via RMS')
I will be eternally grateful for help in understanding how to filter and smooth EMG data!
To analyse EMG signals in the time domain, researchers use a combination of rectification and low-pass filtering, which is also called finding the “linear envelope” of the signal.
As mentioned both in the sentence above and in the explanation of the figure from your attached article, to plot the overlaid signal you can simply low-pass filter your rectified signal at a specific cutoff frequency.
In your attached article, the signal was filtered at 8 Hz.
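The rectify-then-low-pass recipe can be sketched as follows. This is a minimal illustration, not the paper's method: it swaps in a simple one-pole low-pass run forwards and backwards (so there is no phase lag), whereas the filter type and order used in the article are not stated here. The synthetic burst, the 2 kHz sampling rate and the function name are assumptions, and the effective cutoff of the two-pass filter is only approximately 8 Hz.

```python
import math

def linear_envelope(samples, fs, cutoff_hz=8.0):
    """Remove DC, full-wave rectify, then low-pass with a zero-phase one-pole filter."""
    mean = sum(samples) / len(samples)
    rectified = [abs(s - mean) for s in samples]
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)  # smoothing factor

    def one_pole(x):
        out, state = [], x[0]
        for value in x:
            state += alpha * (value - state)
            out.append(state)
        return out

    forward = one_pole(rectified)
    # Filtering the time-reversed output again cancels the lag of the first pass.
    return one_pole(forward[::-1])[::-1]

# Synthetic stand-in for EMG: a 150 Hz burst between 0.4 s and 0.6 s on a DC offset.
fs = 2000
t = [n / fs for n in range(fs)]
emg = [0.5 + (math.sin(2 * math.pi * 150 * ti) if 0.4 <= ti < 0.6 else 0.0) for ti in t]

envelope = linear_envelope(emg, fs)
peak_index = max(range(len(envelope)), key=envelope.__getitem__)
print(f"envelope peaks at t = {t[peak_index]:.3f} s")
```

The envelope comes back as its own list, separate from the raw trace, so it can be plotted on its own axis range or thresholded to find the onset and offset of activity.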
For a better understanding of EMG signal analysis, I think this document could help you a lot (link).

PPG signal diastolic peak detection using matlab

I'm working on PPG signals, and I want to detect some points for feature extraction, but I can't detect the point illustrated in the following figure on my own dataset:
I have tried to use fft, as in the following code:
close all
%% Data Importation and Extraction
increment = 1;
x = [1:increment:length(PPG)];
d = deriv2(PPG);
subplot 211
plot(x,PPG);xlim([0 100]);grid on
subplot 212
plot(diff(diff(PPG)));xlim([0 100]);grid on
Here is my own dataset:
I recently did some coursework on estimating heart rate (BPM) by analysing every 5 seconds' worth of samples. (The input was taken from a phone camera with the flash on.)
But I did my implementation in Python, using the peak detection function available in SciPy (I got decent results with it), although I'm not sure whether a similar function is available in MATLAB.
You can use the available parameters to detect the relevant positive peaks accordingly.
A helpful tip: Prominence is the most important parameter out of all the available parameters.
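To show what a prominence threshold does, here is a small standard-library sketch. It is a simplified stand-in for SciPy's scipy.signal.find_peaks(x, prominence=...), not a copy of it (plateaus and the wlen option are ignored). The synthetic PPG-like pulse, with a tall systolic peak followed by a smaller diastolic bump, is made up for illustration.

```python
import math

def find_prominent_peaks(x, min_prominence):
    """Return (index, prominence) for local maxima with at least min_prominence."""
    peaks = []
    for i in range(1, len(x) - 1):
        if not (x[i - 1] < x[i] >= x[i + 1]):
            continue
        # Walk outwards until a higher sample (or the edge), tracking the lowest point.
        left_min = x[i]
        j = i - 1
        while j >= 0 and x[j] <= x[i]:
            left_min = min(left_min, x[j])
            j -= 1
        right_min = x[i]
        j = i + 1
        while j < len(x) and x[j] <= x[i]:
            right_min = min(right_min, x[j])
            j += 1
        prominence = x[i] - max(left_min, right_min)
        if prominence >= min_prominence:
            peaks.append((i, prominence))
    return peaks

def pulse(tau):
    """One made-up beat: systolic peak at 0.2 s, smaller diastolic bump at 0.5 s."""
    return math.exp(-((tau - 0.2) / 0.07) ** 2) + 0.4 * math.exp(-((tau - 0.5) / 0.08) ** 2)

fs = 100  # samples per second, one beat per second, three beats
ppg = [pulse((n % fs) / fs) for n in range(3 * fs)]

systolic = [i for i, _ in find_prominent_peaks(ppg, 0.5)]
all_peaks = [i for i, _ in find_prominent_peaks(ppg, 0.1)]
diastolic = [i for i in all_peaks if i not in systolic]
print("systolic:", systolic, "diastolic:", diastolic)
```

A high threshold keeps only the systolic peaks; lowering it lets the diastolic bumps through, and they can then be separated from the systolic ones by comparing the two result lists.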

Matlab - creating psd with right scaling

I want to analyze audio data (.wav with PCM, 32 kHz sampling rate) and create its PSD with the axes Sxx (watts/hertz, not dB) and f (hertz).
So I would start by reading out the audio data with:
After this I'm having some problems because I don't really know how to proceed, and Matlab always tells me that the psd functions won't be supported in the future and that I should use pwelch. (I also tried to compute the autocorrelation and then take its Fourier transform to get to Sxx, but it didn't work out really well.)
So could anybody tell me how I can get from my vector x to a vector with the PSD values in watts/hertz and plot it afterwards?
Very grateful for every kind of help! :)
Update 1: Yes, I did read the documentation of pwelch, but I'm afraid my English is too bad to understand it completely.
So if I use the psd documentation:
nfft = 2^nextpow2(length(x));
Pxx = abs(fft(x,nfft)).^2/length(x)/fs;
Hpsd = dspdata.psd(Pxx(1:length(Pxx)/2),'fs',fs);
I'm able to get the plot in dB with the peak at the right frequency. (I don't know how dspdata.psd works, though.)
I tried out:
This gives me a non-dB scale, but the peak is at the wrong frequency.
Update 2:
First of all, thanks a lot for your detailed answer! At the moment I'm working on my Matlab skills as well as my English, but all the specific technical terms give me a hard time.
When using your example of pwelch on a wav file with a clear frequency of 1 kHz, the plot shows me the peak at roughly 0.14. Could it maybe still be a specially scaled x-axis?
If I try it this way:
fax_Hz= bin_vals*fs/N;
the result seems right (is this way correct?), but I still need some time to search for a proper way to display the y-axis in W/Hz, since I don't know how the audio signal was created.
Update 3:
This wav file should have a dominant frequency at 1khz with a duration of 3 seconds and a sampling frequency of 44100Hz. (If I plot the data received from audioread the oscillation seems reasonable)
I get a peak at 0.14 on the x-axis.
if I use
instead, the peak is at 1000. Is this way right? And how should I interpret the different scaling on the y-axis (pwelch vs. square of abs)?
I also wanted to ask if it is possible to get a flat PSD of AWGN in Matlab? (Since you only have a finite number of samples, I don't know how to get there.)
Thanks again for your detailed support!
Update 4
So I have a new problem, for which I think it is probably necessary to go a bit more into detail. My plan is basically to do the following:
Read an audio file ([y,fs]) and generate white noise with a certain SNR ([n,fs])
Generate a filter H which shapes the PSD(y) similar to the PSD(n)
Generate an inverse filter G=H^(-1) which reverts the effect of H.
My problem is that with pwelch, the length of the resulting vector pyy is much smaller than the length of y. Since my filter is determined by P=sqrt(pnn/pyy), I can't multiply fft(y)*H and therefore get no results.
Do you know any help for this Problem?
Or is there a way to go back from a PSD (Welch estimated) to a normal signal (like an inverse function for pwelch)?
In the example you have from the psd documentation, you compute a psd estimate yourself, then put it into a dspdata.psd container and plot it. What dspdata.psd does here for you is basically compute the frequency axis and provide it to the plot command, nothing more. You get a plot of the spectral density estimate, but that's the one you compute yourself using fft, which is the simplest and worst psd estimate you can get, a so-called periodogram.
Your use of pwelch is almost correct, but you have to give the sampling frequency as the 5th argument, and then use the frequency axis information in your plot. Calling [Pxx, f] = pwelch(x, [], [], [], fs) and plotting with plot(f, Pxx) should give you the peak at the correct frequency.
What pwelch gives you is the spectral density of the signal over Hz. Correct axis labels would therefore be
xlabel('frequency (Hz)')
ylabel('psd (1/Hz)')
The signal you give pwelch is a pure sequence of numbers without physical dimensions. By specifying the sampling rate, the time axis gets a physical unit, s, therefore the resulting frequency is in Hz and the density is in 1/Hz. But still your time series values have no physical dimension, and therefore the density cannot be related to something like W. Has your audiosignal been obtained by a calibrated A/D converter? If yes, you should be able to relate your data to a physical dimension and units, but that's a nontrivial step.
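The scaling argument can be checked numerically. The sketch below computes a plain one-sided periodogram (not Welch's method) scaled as a density, so that summing it over frequency and multiplying by the bin width returns the mean square of the samples. The test tone, the sampling rate and the function name are assumptions; the result is in (sample units)^2 per Hz and becomes W/Hz only if the samples are calibrated to a physical quantity.

```python
import cmath
import math

def periodogram(x, fs):
    """One-sided periodogram: (frequencies in Hz, PSD in sample-units^2 per Hz)."""
    n = len(x)
    half = n // 2
    freqs, psd = [], []
    for k in range(half + 1):
        X = sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        p = abs(X) ** 2 / (fs * n)
        # Fold negative frequencies onto positive ones, except at DC and Nyquist.
        if k != 0 and not (n % 2 == 0 and k == half):
            p *= 2.0
        freqs.append(k * fs / n)
        psd.append(p)
    return freqs, psd

fs = 1000                      # assumed sampling rate in Hz
n = 200
x = [math.sin(2 * math.pi * 50 * m / fs) for m in range(n)]  # 50 Hz tone, amplitude 1

freqs, psd = periodogram(x, fs)
df = fs / n                                     # width of one frequency bin in Hz
peak_freq = freqs[max(range(len(psd)), key=psd.__getitem__)]
total_power = sum(psd) * df                     # integral of the density over frequency
mean_square = sum(v * v for v in x) / n
print(peak_freq, total_power, mean_square)
```

The integral of the density matching the mean square of the signal is the property that makes it a power spectral density. The absolute values still mean nothing physically until the sample values do.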
On a personal note, I'd really advise you to brush up on your English, because using software, especially programming interfaces, without properly understanding the documentation is a recipe for disaster.

Using Matlab FFT to extract frequencies from EEG signal

I am new to BCI. I have a Mindset EEG device from Neurosky and I record the Raw data values coming from the device in a csv file. I can read and extract the data from the csv into Matlab and I apply FFT. I now need to extract certain frequencies (Alpha, Beta, Theta, Gamma) from the FFT.
Where Delta = 1-3 Hz
Theta= 4-7 Hz
Alpha = 8-12 Hz
Beta = 13-30 Hz
Gamma = 31-40 Hz
This is what I did so far:
f = (0:N-1)*(Fs/N);
title ('Raw Signal');
p = abs(fft(rawDouble));
figure,plot (f,p);
title('Magnitude of FFT of Raw Signal');
Can anyone tell me how to extract those particular frequency ranges from the signal?? Thank you very much!
For convenient analysis of EEG data with Matlab you might consider using the EEGLAB toolbox or the FieldTrip toolbox.
Both toolboxes come with good tutorials.
You may find it easier to start off with MATLAB's periodogram function, rather than trying to use the FFT directly. This takes care of windowing the data for you and various other implementation details.
I think the easiest way is to filter your signal in those ranges after you load your data.
band=[30 100] eeglocal.lowpass(band(2)).highpass(band(1));
Then you can select the time range you want to process.
That should be all you need.
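If the goal is only to read the band content off the FFT rather than to filter the signal, one option is to sum the power of the bins that fall inside each band. The sketch below does that with a plain DFT. The band edges are the ones listed in the question, while the 128 Hz sampling rate, the two-second synthetic signal and the function name are assumptions for illustration.

```python
import cmath
import math

BANDS = {"delta": (1, 3), "theta": (4, 7), "alpha": (8, 12), "beta": (13, 30), "gamma": (31, 40)}

def band_powers(x, fs, bands=BANDS):
    """Sum the mean-square power of every DFT bin whose frequency lies in each band."""
    n = len(x)
    powers = dict.fromkeys(bands, 0.0)
    for k in range(1, n // 2):                 # skip DC and the Nyquist bin
        f = k * fs / n                         # bin index -> frequency in Hz
        X = sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        p = 2.0 * abs(X) ** 2 / n ** 2         # one-sided power of this bin
        for name, (lo, hi) in bands.items():
            if lo <= f <= hi:
                powers[name] += p
    return powers

# Synthetic "EEG": a strong 10 Hz (alpha) rhythm plus a weaker 20 Hz (beta) rhythm.
fs = 128
n = 256
raw = [2.0 * math.sin(2 * math.pi * 10 * m / fs) + math.sin(2 * math.pi * 20 * m / fs)
       for m in range(n)]

powers = band_powers(raw, fs)
dominant = max(powers, key=powers.get)
print(powers, dominant)
```

On real recordings the bin powers are noisy, so averaging over several windows (as periodogram or pwelch-style estimates do) before summing per band is usually worthwhile.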

matlab FFT. Stuck understanding relationship between frequency and result

We're trying to analyse flow around a circular cylinder and we have a set of Cp values that we got from a wind tunnel experiment. Initially, we started off with a sample frequency of 20 Hz and tried to find the frequency of vortex shedding using FFT in Matlab. We got a frequency of around 7 Hz. Next, we did the same experiment, but the only thing we changed was the sampling frequency, from 20 Hz to 200 Hz. We got the frequency of the vortex shedding to be around 70 Hz (this is where the peak is located in the graph). The graph doesn't change regardless of the Cp data that we enter. The only time the peak differs is when we change the sample frequency. It seems like the increase in the frequency of vortex shedding is proportional to the sample frequency, and this doesn't seem to make sense at all. Any help with establishing a relation between sample frequency and vortex shedding frequency would be greatly appreciated.
The problem you are seeing is related to "data aliasing": the FFT cannot detect frequencies higher than the Nyquist frequency (half the sampling frequency).
With data aliasing, a peak at a real frequency f above the Nyquist frequency shows up folded back into the range from 0 to Fs/2, at |f - Fs*round(f/Fs)|. In your 20 Hz sampling (assuming 70 Hz is the true frequency), that folds to 10 Hz, which means you're not seeing the real information. Separately, FFT "windowing" can help by reducing spectral leakage, although it does not remove aliasing.
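For a real-valued signal, a tone above the Nyquist frequency is folded back into the range between 0 and half the sampling rate. A minimal sketch of that folding rule follows; the function name and the example frequencies are illustrative assumptions.

```python
import math

def aliased_frequency(f_true, fs):
    """Apparent frequency (Hz) of a real tone at f_true Hz when sampled at fs Hz."""
    nearest_multiple = math.floor(f_true / fs + 0.5) * fs   # closest multiple of fs
    return abs(f_true - nearest_multiple)

for f_true, fs in [(7, 20), (13, 20), (27, 20), (70, 200)]:
    print(f"{f_true} Hz sampled at {fs} Hz appears at {aliased_frequency(f_true, fs)} Hz")
```

Tones at 7, 13 and 27 Hz are indistinguishable once sampled at 20 Hz, which is why the sampling rate has to exceed twice the highest frequency present before the FFT result can be trusted.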
Another problem that you may be experiencing is related to noisy data generation via single-FFT measurement. It's better to take lots of data, use windowing with overlap, and make sure you have at least 5 FFTs which you average to find your result. As Steven Lowe mentioned, you should also sample at faster rates if possible. I would recommend sampling at the fastest rate your instruments can sample.
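The averaging idea can be sketched as below: split the record into overlapping segments, apply a Hann window to each, and average the squared FFT magnitudes. This is a bare-bones illustration in the spirit of Welch's method with no density scaling applied. The segment length, overlap, test tone, noise level and function name are all assumptions.

```python
import cmath
import math
import random

def averaged_spectrum(x, fs, seg_len=64, overlap=0.5):
    """Average the windowed power spectra of overlapping segments of x."""
    step = int(seg_len * (1 - overlap))
    window = [0.5 - 0.5 * math.cos(2 * math.pi * m / seg_len) for m in range(seg_len)]  # Hann
    starts = range(0, len(x) - seg_len + 1, step)
    n_bins = seg_len // 2 + 1
    avg = [0.0] * n_bins
    for s in starts:
        seg = [x[s + m] * window[m] for m in range(seg_len)]
        for k in range(n_bins):
            X = sum(seg[m] * cmath.exp(-2j * math.pi * k * m / seg_len) for m in range(seg_len))
            avg[k] += abs(X) ** 2 / len(starts)
    freqs = [k * fs / seg_len for k in range(n_bins)]
    return freqs, avg, len(starts)

random.seed(0)
fs = 200
x = [math.sin(2 * math.pi * 25 * n / fs) + random.gauss(0.0, 0.5) for n in range(512)]

freqs, avg, n_segments = averaged_spectrum(x, fs)
peak_freq = freqs[max(range(len(avg)), key=avg.__getitem__)]
print(f"{n_segments} segments averaged, peak at {peak_freq} Hz")
```

Shorter segments give more averages and a smoother estimate, at the cost of coarser frequency resolution (fs / seg_len per bin).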
Lastly, I would recommend that you read some excerpts from Numerical Recipes in C:
Section 12.0 -- Introduction to FFT
Section 12.1 (Discusses data aliasing)
Section 13.4 (Discusses FFT windowing)
You don't need to read the C source code, just the explanations. Numerical Recipes in C has excellent condensed information on the subject.
If you have any more questions, leave them in the comments. I'll try to do my best in answering them.
Good luck!
This is probably not a programming problem; it sounds like an experiment-measurement problem.
I think the sampling frequency has to be at least twice the oscillation frequency, otherwise you get artifacts; this might explain the difference. Note that the ratio of the FFT frequency to the sampling frequency is 0.35 in both cases. Can you repeat the experiment with higher sampling rates? I'm thinking that if this is a narrow cylinder in a strong wind, it may be vibrating/oscillating faster than the sampling rate can detect.
I hope this helps - there's a 97.6% probability that I don't know what I'm talking about ;-)
If it's not an aliasing problem, it sounds like you could be plotting the frequency response on a normalised frequency scale, which will change with sample frequency. Here's an example of a reasonably good way to plot a frequency response of a signal in Matlab:
Fs = 100;
Tmax = 10;
time = 0:1/Fs:Tmax;
omega = 2*pi*10; % 10 Hz
signal = 10*sin(omega*time) + rand(1,Tmax*Fs+1);
Nfft = 2^8;
[Pxx,freq] = pwelch(signal,Nfft,[],[],Fs)
Note that the sample frequency must be explicitly passed to the pwelch command in order to output the “real” frequency data. Otherwise, when you change the sample frequency the bin where the resonance occurs will seem to shift, which is similar to the problem you describe.
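The effect is easy to reproduce: the FFT only knows bin numbers, and a bin becomes a frequency in Hz through f = k * Fs / N. In the sketch below the same made-up samples give a peak at 7 Hz or 70 Hz depending solely on the Fs that is assumed, matching the constant 0.35 ratio noted earlier in the thread. The signal and the function name are illustrative assumptions.

```python
import cmath
import math

def peak_bin(x):
    """Index of the strongest non-DC bin in the one-sided DFT of x."""
    n = len(x)
    mags = [abs(sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)))
            for k in range(n // 2 + 1)]
    return max(range(1, len(mags)), key=mags.__getitem__)

# Samples oscillating at 0.35 cycles per sample; no sampling rate is attached yet.
n = 200
samples = [math.cos(2 * math.pi * 0.35 * m) for m in range(n)]

k = peak_bin(samples)
normalised = k / n                       # cycles per sample, independent of Fs
for fs in (20, 200):
    print(f"Fs = {fs} Hz -> peak at {k * fs / n} Hz (normalised {normalised})")
```

If the plotted peak scales exactly with the sampling rate, as here, the data handed to the FFT are probably identical in both runs and only the axis scaling has changed.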
Methinks you need to do some serious reading on digital signal processing before you can even begin to understand all the nuances of the DFT (FFT). If I was you, I'd get grounded in it first with this great book:
Discrete-Time Signal Processing
If you want more of a mathematical treatment that will really expand your abilities,
Fourier Analysis by Körner
Take a look at this related question. While it was originally asked about VB, the responses are generically about FFTs.
I tried using the frequency response code as above, but it seems that I don't have the appropriate toolbox in Matlab. Is there any way to do the same thing without using the fft command? So far, this is what I have:
% FFT Algorithm
Fs = 200; % Sampling frequency
T = 1/Fs; % Sample time
L = 65536; % Length of signal
t = (0:L-1)*T; % Time vector
y = data1; % Your CP values go in this vector
NFFT = 2^nextpow2(L); % Next power of 2 from length of y
Y = fft(y,NFFT)/L;
f = Fs/2*linspace(0,1,NFFT/2);
% Plot single-sided amplitude spectrum.
title(' y(t)')
xlabel('Frequency (Hz)')
I think there might be something wrong with the code I am using. I'm not sure what though.
A colleague of mine has written some nice GPL-licenced functions for spectral analysis:
(Update: this code is now part of one of the Octave modules:
But it might be tricky to extract just the pieces you need from there.)
They're written for both Matlab and Octave and serve mostly as a drop-in replacement for the analogous functions in the Signal Processing Toolbox. (So the code above should still work fine.)
It may help with your data analysis; better than rolling your own with fft and the like.