SciPy Butterworth bandpass filter

I am new to Butterworth filters and want to understand my result.
The data has two columns, date and relative velocity variation, and 764 rows.
import pandas as pd
import matplotlib.pyplot as plt
from scipy.signal import butter, lfilter

data = pd.read_csv('dtt_median_ZZ-f1-m10_f07_1.csv')
date = data.Date
data = data.M * (-100)

def butter_bandpass(lowcut, highcut, fs, order=5):
    nyq = 0.5 * fs  # Nyquist frequency
    low = lowcut / nyq
    high = highcut / nyq
    b, a = butter(order, [low, high], btype='band')
    return b, a

def butter_bandpass_filter(data, lowcut, highcut, fs, order=5):
    b, a = butter_bandpass(lowcut, highcut, fs, order=order)
    y = lfilter(b, a, data)
    return y

F = butter_bandpass_filter(data, 1, 20, 200, 3)
#print(F)
plt.plot(date, F)
plt.show()
I want to know whether the x axis of this plot is in the frequency domain or the time domain.
If it is in the frequency domain, is there a way to transform it to the time domain (an FFT?), because I need to control the period in order to see the variation over that period.
The sampling rate I used to acquire the data was 200 Hz, but I think the sampling rate in this code is different. So what should the value of fs (the sampling rate) be?
What would be a normal range for lowcut and highcut? I used 0.7-1.0 Hz to export the data from the raw data, but I think I have to use a different frequency range.
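For what it is worth, lfilter returns one value per input sample, so plotting its output against date stays in the time domain and no inverse FFT is involved. A minimal sketch of that on synthetic data rather than the CSV above: the 5 Hz and 60 Hz tones, the 10 s duration, and the use of second-order sections via sosfilt instead of lfilter are all assumptions made for illustration.

```python
import numpy as np
from scipy.signal import butter, sosfilt

fs = 200.0                        # assumed sampling rate of the series, in Hz
t = np.arange(2000) / fs          # 10 s of synthetic samples (hypothetical data)
in_band = np.sin(2 * np.pi * 5 * t)     # 5 Hz: inside the 1-20 Hz passband
out_band = np.sin(2 * np.pi * 60 * t)   # 60 Hz: outside the passband
x = in_band + out_band

# Passing fs lets butter() take the cutoffs directly in Hz.
sos = butter(3, [1, 20], btype="band", fs=fs, output="sos")
y = sosfilt(sos, x)               # causal filtering, like lfilter

# y has the same length as x: it is still a time series.
steady = slice(int(2 * fs), None)       # skip the start-up transient
rms_in = np.sqrt(np.mean(in_band[steady] ** 2))
rms_out = np.sqrt(np.mean(y[steady] ** 2))
print(len(x), len(y), round(rms_in, 3), round(rms_out, 3))
```

The fs argument has to describe the spacing of the rows that are actually handed to the filter. If the rows of the CSV are not 1/200 s apart (for example one value per day, which is only a guess from the date column), fs and the cutoffs would need to be expressed in the units of that spacing.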

Related

matlab, frequency calculation and code review

I've got a logical/statistical problem:
I have to find out whether the firing rates of a neuron in response to four different stimuli are significantly different.
I calculated the frequencies via the PSTH/binning method in MATLAB and I am not sure if it was the right way. Afterwards I did an ANOVA and a Tukey test in JMP. At first sight it looks good but, as mentioned, I don't think the calculation was correct.
Maybe this is not the right forum for my problem, but perhaps someone can find my mistake or has a better solution. Thanks :D
bins is the number of bins, calculated as the total duration (800 ms) divided by the bin width (10 ms).
In the end this function should give me a histogram of frequency over time (ms) and the frequencies themselves (here a 1x80 vector with the average frequency per bin).
Doing this for the four different stimuli gave me 4 vectors, which I put into JMP to run the Tukey test.
function [freq] = BinFireRate(data, dur, times_snippet, binwidth)
%function that plots the firing rate of a given dataset via the binning method in [Hz]
%in: dataset (n x m matrix), dur as the duration observed per trial [ms],
%times_snippet (1 x n vector) to convert data into time values [ms],
%binwidth [ms]
%out: histogram of firing rate (freq) over time, and the frequency [Hz]
%vector itself (1x80 here)
bins = dur / binwidth;
spiketimes_stim = data .* times_snippet;
spiketimes_stim = spiketimes_stim(spiketimes_stim ~= 0);
[spikes_per_bin, bincenters] = hist(spiketimes_stim, bins);
freq = ((spikes_per_bin / binwidth) / length(data(:, 1))) * 1000;
bar(bincenters, freq);
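One detail worth checking in the function above: hist(x, bins) spaces its bins between the smallest and largest spike time, so the bins are only exactly binwidth wide if spikes happen to occur at 0 ms and at dur. A minimal Python sketch of the same trial-averaged rate with explicit bin edges follows; the function name, the raster layout (trials in rows) and the toy data are assumptions, not the poster's actual data.

```python
import numpy as np

def binned_firing_rate(raster, times_ms, dur_ms, binwidth_ms):
    """Trial-averaged firing rate per bin, in Hz.

    raster   : (n_trials, n_samples) array, nonzero where a spike occurred
    times_ms : (n_samples,) time of each sample in ms
    """
    raster = np.asarray(raster)
    n_trials = raster.shape[0]
    # Pool the spike times of all trials.
    _, sample_idx = np.nonzero(raster)
    spike_times = np.asarray(times_ms)[sample_idx]
    # Explicit edges from 0 to dur_ms, so every bin is exactly binwidth_ms wide.
    n_bins = int(round(dur_ms / binwidth_ms))
    edges = np.linspace(0.0, dur_ms, n_bins + 1)
    counts, _ = np.histogram(spike_times, bins=edges)
    # spikes per bin -> spikes per second per trial
    rate_hz = counts / (n_trials * binwidth_ms / 1000.0)
    centers = edges[:-1] + binwidth_ms / 2.0
    return centers, rate_hz

# Hypothetical raster: 2 trials, 800 samples at 1 ms spacing.
times_ms = np.arange(800) + 0.5
raster = np.zeros((2, 800), dtype=int)
raster[0, 5] = 1
raster[1, 5] = 1
raster[1, 15] = 1
centers, rate_hz = binned_firing_rate(raster, times_ms, dur_ms=800, binwidth_ms=10)
```

With two spikes in the first 10 ms bin across two trials, the first bin comes out at about 100 Hz, which matches the (spikes / binwidth / trials) * 1000 scaling used in the MATLAB code.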

Plotting fft data in matlab

I am analyzing ECG data using MATLAB. The data is made up of two columns, one the time in milliseconds and the other contains the volts (mV) and is imported into MATLAB from a CSV file.
I use the built-in fft function in MATLAB (i.e fft(mV)). Now that I have the transformed data, I don't know how to plot it.
I know that I need the frequency data but I'm having trouble understanding where that comes from and what the other axis even is.
When you say "the time in milliseconds", I hope you have sampled at an even interval, since the FFT assumes evenly spaced samples. If you have not, then you have two choices.
You can interpolate the data between the points so as to "guess" where the graph would be at evenly spaced times in the time domain.
OR
You can re-sample at a regular interval. Returning the time in milliseconds is not really necessary for this as the interval must be equal, but it could function as a validator to prove that the data is correct.
Once you have your data with a regular sampling period, you can use it to obtain the FFT.
function [ X, f ] = ctft( x, T )
% x = sample array
% T = sampling period
% X = fft amplitude
% f = frequency
N = length(x);
X = fftshift(fft( x, N ))*( (2*pi) / N );
f = linspace( -1, 1-2/N, N)/(2*T); % bin spacing 1/(N*T); assumes N is even
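The same two steps (interpolate onto an even grid, then build the frequency axis from the sampling period) can be sketched in Python with NumPy. The helper names and the synthetic 50 Hz signal are assumptions for illustration, and the spectrum here is scaled by 1/N rather than by the 2*pi/N factor used in ctft.

```python
import numpy as np

def resample_evenly(t_ms, x, T_ms):
    """Linearly interpolate (t_ms, x) onto a grid with spacing T_ms."""
    t_even = np.arange(t_ms[0], t_ms[-1] + T_ms / 2, T_ms)
    return t_even, np.interp(t_even, t_ms, x)

def two_sided_spectrum(x, T):
    """Return (f, X): frequency axis in Hz and the FFT scaled by 1/N."""
    N = len(x)
    X = np.fft.fftshift(np.fft.fft(x)) / N
    f = np.fft.fftshift(np.fft.fftfreq(N, d=T))   # T in seconds
    return f, X

# Hypothetical, unevenly sampled 50 Hz sine (times in ms).
rng = np.random.default_rng(0)
t_ms = np.sort(rng.uniform(0, 1000, 5000))
mv = np.sin(2 * np.pi * 50 * t_ms / 1000)

t_even, mv_even = resample_evenly(t_ms, mv, T_ms=1.0)
f, X = two_sided_spectrum(mv_even, T=1.0e-3)
peak_hz = abs(f[np.argmax(np.abs(X))])
```

Plotting f against abs(X) gives the spectrum: the x axis is frequency in Hz, the y axis is the magnitude of each frequency component.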

Filter data using frequency model

I would like to filter my time-domain vector v using a given frequency response.
The filter is given as a set of two vectors, f and h, where f is the frequency and h is the magnitude of the response.
I cannot take the FFT of my data v and then multiply in the frequency domain, since v is extremely large and transforming it is not feasible.
I've tried the Yule-Walker approach, but it doesn't fit my data.
How can I apply this given frequency response to the data?
Perhaps you could try a linear phase FIR filter. You can design such a filter to achieve your desired frequency response, then filter your data using it.
Let me first generate some hypothetical f, h, and v so that my solution contains a working example:
n = 20; % (desired filter order)/2
Fs = 50; % Data sampling rate
f = (0:n)'/n * (Fs/2); % Freq for filter response
h = (f < 10); % Hypothetical low-pass filter
t = (0:1000)'/Fs;
v = sin(t) + sin(2*pi*20*t); % Hypothetical data
Design an FIR filter using ifft
You can inverse Fourier transform the filter's frequency response to obtain FIR filter coefficients.
Here, I am assuming that f is evenly spaced, starts at 0, and ends at the Nyquist frequency. If this is not the case, then maybe you can interpolate f and h so that this is true. Also keep in mind that the resulting filter will have a length of 2*n, where n = length(f)-1
Since we are assuming a symmetric real filter, the filter should have the same response for negative frequencies. ifft expects the frequencies to start at zero, though, so these negative frequency responses get aliased into higher frequencies. Assuming that h is a column vector:
h_shifted = [h; flipud(h(2:end-1))];
Then we inverse transform this and shift it back:
b_shifted = ifft(h_shifted);
b = [b_shifted(n+1:end); b_shifted(1:n)];
If you have the Signal Processing Toolbox, you can also try designfilt which gives you a few more options with regards to filter design.
Filter your data
Now that you have your filter coefficients, you can use filter or fftfilt to apply this filter to your data. Remember to account for the filter delay. For example:
v_padded = [v; zeros(n,1)];
y_padded = fftfilt(b, v_padded);
y = y_padded(n+1:end);
Of course, if you are filtering the data in blocks, you should pad with the subsequent data instead of with zeros :)
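For readers without MATLAB, the design and filtering steps above can be sketched in Python with NumPy and SciPy. This is a direct translation under the same assumptions (f evenly spaced from 0 to Nyquist); lfilter stands in for fftfilt, and np.fft.fftshift performs the half-swap that produces b.

```python
import numpy as np
from scipy.signal import lfilter

n = 20                                  # (desired filter order) / 2
fs = 50.0                               # data sampling rate in Hz
f = np.arange(n + 1) / n * (fs / 2)     # 0 .. Nyquist, evenly spaced
h = (f < 10).astype(float)              # hypothetical low-pass magnitude
t = np.arange(1001) / fs
v = np.sin(t) + np.sin(2 * np.pi * 20 * t)   # hypothetical data

# Mirror the response onto the negative-frequency bins (length 2*n).
h_full = np.concatenate([h, h[-2:0:-1]])
# Inverse transform, then rotate so the main tap sits at index n.
b = np.fft.fftshift(np.fft.ifft(h_full).real)

# Filter with n samples of zero padding, then drop the n-sample delay.
v_padded = np.concatenate([v, np.zeros(n)])
y = lfilter(b, [1.0], v_padded)[n:]
```

Away from the first and last 2*n samples, y should follow the slow sin(t) component closely, while the 20 Hz component, which sits exactly on a frequency sample where h is zero, is removed.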

Filtering signal noise using Fourier Transforms and MATLAB

So, I've been given three different MATLAB (I'm using MATLAB R2014b) files with signals containing noise. I simply just plotted the values I was given for the first part. For example, the plot of the first signal looks like the one below.
Then, I did the Fourier Transform of the signal and plotted those values as well to determine where the noise and signal occur in the frequency spectrum. To show this, I added the plot image of the first signal below.
Finally, I am supposed to create a filter using the basic MATLAB commands and filter the noise out of the plot of the signal and then do the Fourier Transform of the signal again and plot the results. The filter portion will look something like this...
b = fir1(n,w,'type');
freqz(b,1,512);
in = filter(b,1,in);
Where n is the order of the filter, w is the cutoff frequency (cutoff frequency divided by half the sampling rate), and 'type' is something to the effect of low/high/stop/etc... So, my question is how do I figure out what the n, w, and type values of the filter I am creating are supposed to be?! Thanks in advance for any help!
I believe your high frequency components are noise, but it actually depends on your data.
See this example,
Fs = 2000;
L = 200;
t = (0 : L - 1)/Fs;
data = chirp(t,20,.05,50) + chirp(t,500,.1,700);
subplot(411)
plot(t,data,'LineWidth',2);
title('Original Data')
N = 2^nextpow2(L);
y = fft(data,N)/L;
f = Fs/2 * linspace(0,1,N/2+1);
subplot(412)
plot(f,abs(y(1:N/2+1)))
title('Spectrum of Original Data')
b = fir1(40,2*[1 200]/Fs);
newd = filter(b,1,data);
subplot(413)
plot(t,newd)
title('Filtered Data')
newy = fft(newd,N)/L;
subplot(414)
plot(f,abs(newy(1:N/2+1)))
title('Spectrum of Filtered Data')
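You can use b = fir1(40,2*[200 800]/Fs); to keep the high-frequency band instead. Strictly speaking that is a 200-800 Hz bandpass; a true high-pass would be b = fir1(40,2*200/Fs,'high');.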
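A rough SciPy counterpart of the filtering step, for comparison. firwin uses a Hamming window by default, as fir1 does, but takes the number of taps (order + 1). The two-tone test signal is a hypothetical stand-in for the chirps.

```python
import numpy as np
from scipy.signal import firwin, freqz, lfilter

fs = 2000.0
t = np.arange(200) / fs
# Hypothetical stand-in for the chirps: a 100 Hz tone plus a 600 Hz tone.
data = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 600 * t)

# fir1(40, 2*[1 200]/Fs) has order 40, i.e. 41 taps.
# With fs given, firwin takes the band edges directly in Hz.
b = firwin(41, [1, 200], pass_zero=False, fs=fs)
newd = lfilter(b, [1.0], data)

# Gain of the filter at the two tone frequencies.
_, resp = freqz(b, worN=[100.0, 600.0], fs=fs)
gain_100, gain_600 = np.abs(resp)
print(round(gain_100, 3), round(gain_600, 4))
```

Checking the gain at the frequencies you care about, as in the last lines, is a quick way to see whether a candidate n and w actually separate the signal from the noise before filtering real data.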
If the x-axis of the second plot is correct, I can assume:
A. The sampling frequency is 2000 Hz.
B. The "noise" is in the low frequencies. It also seems from the original signal that you need to filter out the low-frequency baseline.
If so, you need a highpass filter, so 'type' = 'high'.
The order depends on how sharp you want the filter to be. From the plots, it seems that you can use n = 12 or 20.
The cutoff frequency should be about 0.1, if the peak in the low frequencies is indeed the noise that you want to filter out and if 1000 Hz on the x-axis is indeed the Nyquist frequency.
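The normalized cutoff is just the cutoff in Hz divided by half the sampling rate. A small SciPy sketch of a high-pass design along those lines; the 100 Hz edge and the 41 taps are assumptions chosen for illustration, not values read from the poster's data.

```python
import numpy as np
from scipy.signal import firwin, freqz

fs = 2000.0                 # sampling rate read off the plot (assumed)
cutoff_hz = 100.0           # hypothetical edge of the low-frequency noise
w = cutoff_hz / (fs / 2)    # normalized cutoff: 0.1

# A high-pass firwin design needs an odd number of taps; 41 is an assumption.
b = firwin(41, w, pass_zero=False)

# Gain at DC (should be blocked) and at 500 Hz (should pass).
_, resp = freqz(b, worN=[0.0, 500.0], fs=fs)
gain_dc, gain_500 = np.abs(resp)
print(w, round(gain_dc, 4), round(gain_500, 3))
```

A shorter filter has a wider transition band around the cutoff, so it is worth checking the gain at the noise and signal frequencies for whichever order is chosen.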

MATLAB FFT xaxis limits messing up and fftshift

This is the first time I'm using the fft function and I'm trying to plot the frequency spectrum of a simple cosine function:
f = cos(2*pi*300*t)
The sampling rate is 220500. I'm plotting one second of the function f.
Here is my attempt:
time = 1;
freq = 220500;
t = 0 : 1/freq : 1 - 1/freq;
N = length(t);
df = freq/(N*time);
F = fftshift(fft(cos(2*pi*300*t))/N);
faxis = -N/2 / time : df : (N/2-1) / time;
plot(faxis, real(F));
grid on;
xlim([-500, 500]);
Why do I get odd results when I increase the frequency to 900Hz? These odd results can be fixed by increasing the x-axis limits from, say, 500Hz to 1000Hz. Also, is this the correct approach? I noticed many other people didn't use fftshift(X) (but I think they only did a single sided spectrum analysis).
Thank you.
Here is my response as promised.
The first of your questions, why you "get odd results when you increase the frequency to 900 Hz", is related to MATLAB's plot rescaling functionality as described by @Castilho. When you change the range of the x-axis, MATLAB will try to be helpful and rescale the y-axis. If the peaks lie outside of your specified range, MATLAB will zoom in on the small numerical errors generated in the process. You can remedy this with the 'ylim' command if it bothers you.
However, your second, more open question "is this the correct approach?" requires a deeper discussion. Allow me to tell you how I would go about making a more flexible solution to achieve your goal of plotting a cosine wave.
You begin with the following:
time = 1;
freq = 220500;
This raises an alarm in my head immediately. Looking at the rest of the post, you appear to be interested in frequencies in the sub-kHz range. If that is the case, then this sampling rate is excessive as the Nyquist limit (sr/2) for this rate is above 100 kHz. I'm guessing you meant to use the common audio sampling rate of 22050 Hz (but I could be wrong here)?
Either way, your analysis works out numerically OK in the end. However, you are not helping yourself to understand how the FFT can be used most effectively for analysis in real-world situations.
Allow me to post how I would do this. The following script does almost exactly what your script does, but opens some potential on which we can build:
%// These are the user parameters
durT = 1;
fs = 22050;
NFFT = durT*fs;
sigFreq = 300;
%//Calculate time axis
dt = 1/fs;
tAxis = 0:dt:(durT-dt);
%//Calculate frequency axis
df = fs/NFFT;
fAxis = 0:df:(fs-df);
%//Calculate time domain signal and convert to frequency domain
x = cos( 2*pi*sigFreq*tAxis );
F = abs( fft(x, NFFT) / NFFT );
subplot(2,1,1);
plot( fAxis, 2*F )
xlim([0 2*sigFreq])
title('single sided spectrum')
subplot(2,1,2);
plot( fAxis-fs/2, fftshift(F) )
xlim([-2*sigFreq 2*sigFreq])
title('whole fft-shifted spectrum')
You calculate a time axis and calculate your number of FFT points from the length of the time axis. This is very odd. The problem with this approach is that the frequency resolution of the FFT changes as you change the duration of your input signal, because N is dependent on your "time" variable. The MATLAB fft command will use an FFT size that matches the size of the input signal.
In my example, I calculate the frequency axis directly from the NFFT. This is somewhat irrelevant in the context of the above example, as I set the NFFT to equal the number of samples in the signal. However, using this format helps to demystify your thinking and it becomes very important in my next example.
** SIDE NOTE: You use real(F) in your example. Unless you have a very good reason to only be extracting the real part of the FFT result, then it is much more common to extract the magnitude of the FFT using abs(F). This is the equivalent of sqrt(real(F).^2 + imag(F).^2).**
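For comparison, the same idea (frequency axis derived from NFFT, magnitude via abs) can be sketched with NumPy. Variable names mirror the MATLAB script; this is an illustrative translation rather than part of the original answer.

```python
import numpy as np

dur_t = 1.0
fs = 22050
nfft = int(dur_t * fs)
sig_freq = 300.0

t_axis = np.arange(nfft) / fs
f_axis = np.arange(nfft) * fs / nfft      # df = fs / nfft

x = np.cos(2 * np.pi * sig_freq * t_axis)
F = np.abs(np.fft.fft(x, n=nfft)) / nfft  # magnitude, scaled by NFFT

half = nfft // 2
peak_bin = np.argmax(F[:half])
peak_hz = f_axis[peak_bin]
peak_amp = 2 * F[peak_bin]                # single-sided amplitude
print(peak_hz, round(peak_amp, 6))
```

Because the signal contains an exact integer number of cycles, the peak lands on a single bin at 300 Hz and the single-sided amplitude comes out as the cosine's amplitude of 1.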
Most of the time you will want to use a shorter NFFT. This might be because you are perhaps running the analysis in a real-time system, or because you want to average the result of many FFTs together to get an idea of the average spectrum for a time-varying signal, or because you want to compare spectra of signals that have different durations without wasting information. Just using the fft command with a value of NFFT < the number of elements in your signal will result in an FFT calculated from only the first NFFT points of the signal, because the input is truncated. This is a bit wasteful.
The following example is much more relevant to useful application. It shows how you would split a signal into blocks and then process each block and average the result:
%//These are the user parameters
durT = 1;
fs = 22050;
NFFT = 2048;
sigFreq = 300;
%//Calculate time axis
dt = 1/fs;
tAxis = 0:dt:(durT-dt);
%//Calculate frequency axis
df = fs/NFFT;
fAxis = 0:df:(fs-df);
%//Calculate time domain signal
x = cos( 2*pi*sigFreq*tAxis );
%//Buffer it and window
win = hamming(NFFT);%//chose window type based on your application
x = buffer(x, NFFT, NFFT/2); %// 50% overlap between frames in this instance
x = x(:, 2:end-1); %//optional step to remove zero padded frames
x = ( x' * diag(win) )'; %//efficiently window each frame using matrix algebra
%// Calculate mean FFT
F = abs( fft(x, NFFT) / sum(win) );
F = mean(F,2);
subplot(2,1,1);
plot( fAxis, 2*F )
xlim([0 2*sigFreq])
title('single sided spectrum')
subplot(2,1,2);
plot( fAxis-fs/2, fftshift(F) )
xlim([-2*sigFreq 2*sigFreq])
title('whole fft-shifted spectrum')
I use a Hamming window in the above example. The window that you choose should suit the application: http://en.wikipedia.org/wiki/Window_function
The overlap amount that you choose will depend somewhat on the type of window you use. In the above example, the Hamming window weights the samples in each buffer towards zero away from the centre of each frame. In order to use all of the information in the input signal, it is important to use some overlap. However, if you just use a plain rectangular window, the overlap becomes pointless as all samples are weighted equally. The more overlap you use, the more processing is required to calculate the mean spectrum.
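A NumPy sketch of the same block-averaging scheme, with 50% overlap and a Hamming window. The framing loop is a hypothetical stand-in for MATLAB's buffer; it keeps only full frames, so there are no zero-padded frames to discard.

```python
import numpy as np

fs = 22050
nfft = 2048
hop = nfft // 2                         # 50% overlap between frames
sig_freq = 300.0

t_axis = np.arange(fs) / fs             # 1 s of signal
x = np.cos(2 * np.pi * sig_freq * t_axis)

# Slice the signal into overlapping full-length frames, one per row.
n_frames = 1 + (len(x) - nfft) // hop
frames = np.stack([x[i * hop : i * hop + nfft] for i in range(n_frames)])

# Window every frame, transform, normalize by the window sum, then average.
win = np.hamming(nfft)
F = np.abs(np.fft.fft(frames * win, axis=1)) / win.sum()
F_mean = F.mean(axis=0)

f_axis = np.arange(nfft) * fs / nfft
peak_bin = np.argmax(F_mean[: nfft // 2])
peak_hz = f_axis[peak_bin]
peak_amp = 2 * F_mean[peak_bin]         # single-sided amplitude
```

With NFFT = 2048 the bin spacing is about 10.8 Hz, so the peak lands on the bin nearest 300 Hz rather than exactly on it, and its height is slightly below 1 because the tone falls between bins.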
Hope this helps your understanding.
Your result is perfectly right. Your frequency axis calculation is also right. The problem lies in the y-axis scale. When you use the function xlim, MATLAB automatically recalculates the y scale so that you can see "meaningful" data. When the cosine peaks lie outside the limit you chose (when f>500Hz), there are no peaks to show, so the scale is calculated based on some veeeery small noise (here on my computer, with MATLAB 2011a, the y scale was 10^-16).
Changing the limit is indeed the correct approach, because if you don't change it you can't see the peaks on the frequency spectrum.
One thing I noticed, however. Is there a reason for you to plot the real part of the transform? Usually, it is abs(F) that gets plotted, and not the real part.
edit: Actually, your frequency axis is only right because df, in this case, is 1. The faxis line is right, but the df calculation isn't.
The FFT calculates N points from -Fs/2 to Fs/2. So N points over a range of Fs yields a df of Fs/N. As N/time = Fs => time = N/Fs. Substituting that into the expression for df that you used: your_df = Fs/(N*(N/Fs)) = (Fs/N)^2. As Fs/N = 1, the final result was right :P
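A quick numeric check of this point, as a small Python sketch with hypothetical helper names: the two expressions agree for a one-second signal and diverge as soon as the duration changes.

```python
def df_correct(fs, n_samples):
    """Frequency spacing of an n_samples-point FFT at sampling rate fs."""
    return fs / n_samples

def df_from_question(fs, n_samples, duration_s):
    """The df expression used in the question's script."""
    return fs / (n_samples * duration_s)

fs = 220500
one_second = (df_correct(fs, fs), df_from_question(fs, fs, 1))
two_seconds = (df_correct(fs, 2 * fs), df_from_question(fs, 2 * fs, 2))
print(one_second)    # (1.0, 1.0)
print(two_seconds)   # (0.5, 0.25)
```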