How to use rp, rs, and Wn parameters in scipy.signal.filter_design.ellip? - scipy

I'd like to try out the elliptic filter design function from SciPy in scipy.signal.filter_design.ellip. I'm familiar with the filter design functions in Octave, but I'm not sure how to use this:
From the documentation at http://www.scipy.org/doc/api_docs/SciPy.signal.filter_design.html
ellip(N, rp, rs, Wn, btype = 'low', analog = 0, output = 'ba')
Elliptic (Cauer) digital and analog filter design.
Description:
Design an Nth order lowpass digital or analog elliptic filter and return the filter coefficients in (B,A) or (Z,P,K) form.
See also ellipord.
I understand N (order), btype (low or high), analog (true/false), and output (ba vs. zpk).
What are rp, rs, and Wn and how are they supposed to work?
From my experience with Octave, I'm guessing that rp and rs have to do with the maximum allowed ripple in the pass and stop bands, and that Wn is a weight or controls the cutoff frequency, but how these work isn't documented and I can't find any examples.

I believe HYRY is correct. In my experience the SciPy clones of the MATLAB functions work well, apart from the poor documentation. Yes, rp is the maximum allowable ripple in the passband and rs is the minimum attenuation required in the stopband, both in dB. Wn is the cutoff (edge) frequency; for a digital filter it is normalized so that 1 corresponds to the Nyquist frequency.
So... here's some code showing how to use it to replicate the filter that MathWorks uses as an example:
import numpy as np
import scipy.signal
import matplotlib.pyplot as plt
# 6th-order lowpass, 3 dB passband ripple, 50 dB stopband attenuation;
# 300.0/500.0 is an edge at 300 Hz when the Nyquist frequency is 500 Hz
b, a = scipy.signal.ellip(6, 3, 50, 300.0/500.0)
# freqz returns the frequencies first, then the complex response
w, h = scipy.signal.freqz(b, a)
fig = plt.figure()
ax1 = fig.add_subplot(111)
plt.title('Digital filter frequency response')
plt.semilogy(w, np.abs(h), 'b')
plt.ylabel('Amplitude', color='b')
plt.xlabel('Frequency (rad/sample)')
plt.grid()
ax2 = ax1.twinx()
angles = np.unwrap(np.angle(h))
plt.plot(w, angles, 'g')
plt.ylabel('Angle (radians)', color='g')
plt.show()
Sorry the formatting is so rough, but it works! You'll notice that the frequency scale differs from what MATLAB shows; that is purely cosmetic. This is what you get:

I think this function behaves the same as in Octave or MATLAB, so you can read the MATLAB documentation for it.
http://www.mathworks.com/help/toolbox/signal/ref/ellip.html
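To make the roles of rp, rs, and Wn concrete, here is a minimal sketch that designs the same filter and measures its response. The sampling rate fs = 1000 Hz and the 400 Hz point used to sample the stopband are my own assumptions for illustration; they are not part of the question.

```python
import numpy as np
from scipy import signal

fs = 1000.0          # assumed sampling rate (Hz), so Nyquist is 500 Hz
cutoff_hz = 300.0    # passband edge (Hz)
rp, rs = 3.0, 50.0   # passband ripple and stopband attenuation, both in dB

# Wn is the edge frequency divided by the Nyquist frequency
b, a = signal.ellip(6, rp, rs, cutoff_hz / (fs / 2))

# freqz returns frequencies (rad/sample) first, then the complex response
w, h = signal.freqz(b, a, worN=4096)
freqs_hz = w * fs / (2 * np.pi)
gain_db = 20 * np.log10(np.maximum(np.abs(h), 1e-12))

passband = gain_db[freqs_hz <= cutoff_hz]
stopband = gain_db[freqs_hz >= 400.0]   # assumed to be well past the transition band

print("passband gain range (dB):", passband.min(), passband.max())
print("largest stopband gain (dB):", stopband.max())
```

The passband gain should stay within rp dB of unity, and the stopband gain should stay at least rs dB down.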

Related

Remove noise and smoothen the ecg signal

I am processing Long term afib dataset - https://physionet.org/content/ltafdb/1.0.0/
When I test on 30 s strips of this data, my model is not correctly predicting the signals, so I am trying to deal with the noise in this dataset. Here is how it looks:
Here is the code to plot -
def plot_filter_graphs(data,xmin,xmax,order):
    import numpy as np
    from numpy import sin, cos, pi, linspace
    from numpy.random import randn
    from scipy import signal
    from scipy.signal import lfilter, lfilter_zi, filtfilt, butter
    from matplotlib.pyplot import plot, legend, show, grid, figure, savefig,xlim
    lowcut=1
    highcut=35
    nyq = 0.5 * 300
    low = lowcut / nyq
    high = highcut / nyq
    b, a = signal.butter(order, [low, high], btype='band')
    # Apply the filter to data with lfilter (a single causal pass).
    z = lfilter(b, a,data)
    # Use filtfilt to apply the filter.
    y = filtfilt(b, a, data)
    y = np.flipud(y)
    y = signal.lfilter(b, a, y)
    y = np.flipud(y)
    # Make the plot.
    figure(figsize=(16,5))
    plot(data,'b',linewidth=1.75)
    plot(z, 'r--', linewidth=1.75)
    plot( y, 'k', linewidth=1.75)
    xlim(xmin,xmax)
    legend(('actual',
            'lfilter',
            'filtfilt'),
           loc='best')
    grid(True)
    show()
I am using a Butterworth band-pass filter to remove the noise. I also tried filtfilt and lfilter, but neither gives a good result.
Any suggestions on how the noise can be removed so that the signal is clean enough to be used for model prediction?

Simple scipy curve_fit test not returning expected results

I am trying to estimate the amplitude, frequency, and phase of an incoming signal of about 50Hz based on measurement of only a few cycles. The frequency needs to be precise to .01 Hz. Since the signal itself is going to be a pretty clear sine wave, I am trying parameter fitting with SciPy's curve_fit. I've never used it before, so I wrote a quick test function.
I start by generating samples of a single cycle of a dummy cosine wave
from math import *
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
fs = 1000 # Sampling rate (Hz)
T = .1 # Length of collection (s)
windowlength = int(fs*T) # Number of samples
f0 = 10 # Fundamental frequency of our wave (Hz)
wave = [0]*windowlength
for x in range(windowlength):
    wave[x] = cos(2*pi*f0*x/fs)
t = np.linspace(0,T,int(fs*T)) # This will be our x-axis for plotting
Then I try to fit those samples to a function, adapting the code from the official example provided by scipy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
# Define function to fit
def sinefit(x, A, ph, f):
    return A * np.sin(2*pi*f * x + ph)
# Call curve_fit
popt,cov = curve_fit(sinefit, t, wave, p0=[1,np.pi/2,10])
# Plot the result
plt.plot(t, wave, 'b-', label='data')
plt.plot(t, sinefit(t, *popt), 'r-', label='fit')
print("[Amplitude,phase,frequency]")
print(popt)
This gives me popt = [1., 1.57079633, 9.9] and the plot
plot output
My question is: why is my frequency off? I initialized the curve_fit function with the exact parameters of the cosine wave, so shouldn't the first iteration of the LM algorithm realize that there is zero residual and that it has already arrived at the correct answer? That seems to be the case for amplitude and phase, but frequency is 0.1Hz too low.
I expect this is a dumb coding mistake, since the original wave and the fit are clearly lined up in the plot. I also confirmed that the difference between them was zero across the entire sample. If they really were .1 Hz out of phase, there would be a phase shift of 3.6 degrees over my 100ms window.
Any thoughts would be very much appreciated!
The problem is that your array t is not correct. The last value in your t is 0.1, but with a sampling period of 1/fs = 0.001, the last value in t should be 0.099. That is, the times of the 100 samples are [0, 0.001, 0.002, ..., 0.098, 0.099].
You can create t correctly with either
t = np.linspace(0, T, int(fs*T), endpoint=False)
or
t = np.arange(windowlength)/fs # Use float(fs) if you are using Python 2
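A quick way to see the effect is to fit the same samples against both time axes. This is a minimal sketch that reuses the parameters from the question; the names t_bad and t_good are mine.

```python
import numpy as np
from scipy.optimize import curve_fit

fs = 1000            # sampling rate (Hz)
T = 0.1              # length of collection (s)
f0 = 10              # frequency of the test wave (Hz)
n = int(fs * T)      # number of samples

# Samples taken at k/fs for k = 0 .. n-1
wave = np.cos(2 * np.pi * f0 * np.arange(n) / fs)

def sinefit(x, A, ph, f):
    return A * np.sin(2 * np.pi * f * x + ph)

# Time axis that includes the endpoint: spacing is T/(n-1), not 1/fs
t_bad = np.linspace(0, T, n)
# Time axis that excludes the endpoint: spacing is exactly 1/fs
t_good = np.linspace(0, T, n, endpoint=False)

p0 = [1, np.pi / 2, 10]
popt_bad, _ = curve_fit(sinefit, t_bad, wave, p0=p0)
popt_good, _ = curve_fit(sinefit, t_good, wave, p0=p0)

print("frequency with endpoint included:", popt_bad[2])
print("frequency with endpoint excluded:", popt_good[2])
```

With the stretched axis the spacing is 0.1/99 instead of 0.001, so the fit has to lower the frequency by the factor 99/100 to match the same samples, which is where the 9.9 Hz comes from.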

MATLAB filtering a signal results in NaN [duplicate]

I'm trying to filter theta range (3-8 Hz) from a 10 min long EEG signal with sampling rate of 500Hz. This is my code. Please help me to understand what's wrong. Right now the filtered signal seems to be ruined. Thank you so much!
fs=500;
Wp = [3 8]/(fs/2); Ws = [2.5 8.5]/(fs/2);
Rp = 3; Rs = 40;
[n,Wn] = buttord(Wp,Ws,Rp,Rs);
[b,a] = butter(n,Wn,'bandpass');
fdata = filter(b,a,data);
x=0:ts:((length(data)/fs)-ts);
f=-fs/2:fs/(length(data)-1):fs/2;
subplot(2,2,1)
plot(x,data)
subplot(2,2,2)
z1=abs(fftshift(fft(data)));
plot(f,z1)
xlim([0 150]);
subplot(2,2,3)
plot(x,fdata)
subplot(2,2,4)
z=abs(fftshift(fft(fdata)));
plot(f,z);
xlim([0 150]);
Your code (line 4) gives a filter order, n, equal to 37. I've had issues of numerical precision with Butterworth filters of such large orders; even with orders as low as 8. The problem is that butter gives absurd b and a values for large orders. Check your b and a vectors, and you'll see they contain values of about 1e21 (!)
The solution is to use the zero-pole representation of the filter, instead of the coefficient (b, a) representation. You can read more about this here. In particular,
In general, you should use the [z,p,k] syntax to design IIR filters. To analyze or implement your filter, you can then use the [z,p,k] output with zp2sos. If you design the filter using the [b,a] syntax, you may encounter numerical problems. These problems are due to round-off errors. They may occur for filter orders as low as 4.
In your case, you could proceed along the following lines:
[z, p, k] = butter(n,Wn,'bandpass');
[sos,g] = zp2sos(z,p,k);
filt = dfilt.df2sos(sos,g);
fdata = filter(filt,data)
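For readers working in SciPy rather than MATLAB, the same idea can be sketched with second-order sections. This is only an illustration: the band edges come from the question, while the two-tone test signal is made up so that the example is self-contained.

```python
import numpy as np
from scipy import signal

fs = 500.0
wp = np.array([3.0, 8.0]) / (fs / 2)    # passband edges, normalized to Nyquist
ws = np.array([2.5, 8.5]) / (fs / 2)    # stopband edges, normalized to Nyquist

# Minimum Butterworth order meeting 3 dB passband loss / 40 dB stopband attenuation
n, wn = signal.buttord(wp, ws, 3, 40)

# Ask for second-order sections directly instead of (b, a) coefficients
sos = signal.butter(n, wn, btype='bandpass', output='sos')

# Made-up test signal: a 5 Hz tone inside the band plus a 50 Hz tone outside it
t = np.arange(0, 20, 1 / fs)
data = np.sin(2 * np.pi * 5 * t) + np.sin(2 * np.pi * 50 * t)

fdata = signal.sosfilt(sos, data)
print("filter order:", n, "sections:", sos.shape[0])
print("all output samples finite:", bool(np.all(np.isfinite(fdata))))
```

Each row of sos describes one biquad, so the large polynomial coefficients of the (b, a) form are never built.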

MATLAB How to implement a Ram-Lak filter (Ramp filter) in the frequency domain?

I have an assignment to implement a Ram-Lak filter, but nearly no information given on it (except look at fft, ifft, fftshift, ifftshift).
I have a sinogram that I have to filter via Ram-Lak. Also the number of projections is given.
I try to use the filter
1/4 if I == 0
(b^2)/(2*pi^2) * 0 if I even
-1/(pi^2 * I^2) if I odd
b seems to be the cut-off frequency, I has something to do with the sampling rate?
Also it is said that the convolution of two functions is a simple multiplication in Fourier space.
I do not understand how to implement the filter at all, especially with no b given, not told what I is and no idea how to apply this to the sinogram, I hope someone can help me here. I spent 2hrs googling and trying to understand what is needed to do here, but I could not understand how to implement it.
The formula you listed is an intermediate result if you wanted to do an inverse Radon transform without filtering in the Fourier domain. An alternative is to do the entire filtered back projection algorithm using convolution in the spatial domain, which might have been faster 40 years ago; you would eventually rederive the formula you posted. However, I wouldn't recommend it now, especially not for your first reconstruction; you should really understand the Hilbert transform first.
Anyway, here's some Matlab code which does the obligatory Shepp-Logan phantom filtered back projection reconstruction. I show how you can do your own filtering with the Ram-Lak filter. If I was really motivated, I would replace radon/iradon with some interp2 commands and summations.
phantomData=phantom();
N=size(phantomData,1);
theta = 0:179;
N_theta = length(theta);
[R,xp] = radon(phantomData,theta);
% make a Ram-Lak filter. it's just abs(f).
N1 = length(xp);
freqs=linspace(-1, 1, N1).';
myFilter = abs( freqs );
myFilter = repmat(myFilter, [1 N_theta]);
% do my own FT domain filtering
ft_R = fftshift(fft(R,[],1),1);
filteredProj = ft_R .* myFilter;
filteredProj = ifftshift(filteredProj,1);
ift_R = real(ifft(filteredProj,[],1));
% tell matlab to do inverse FBP without a filter
I1 = iradon(ift_R, theta, 'linear', 'none', 1.0, N);
subplot(1,3,1);imagesc( real(I1) ); title('Manual filtering')
colormap(gray(256)); axis image; axis off
% for comparison, ask matlab to use their Ram-Lak filter implementation
I2 = iradon(R, theta, 'linear', 'Ram-Lak', 1.0, N);
subplot(1,3,2);imagesc( real(I2) ); title('Matlab filtering')
colormap(gray(256)); axis image; axis off
% for fun, redo the filtering wrong on purpose
% exclude high frequencies to create a low-resolution reconstruction
myFilter( myFilter > 0.1 ) = 0;
ift_R = real(ifft(ifftshift(ft_R .* myFilter,1),[],1));
I3 = iradon(ift_R, theta, 'linear', 'none', 1.0, N);
subplot(1,3,3);imagesc( real(I3) ); title('Low resolution filtering')
colormap(gray(256)); axis image; axis off
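The filtering step on its own can also be sketched in NumPy. This is a rough illustration, not a port of the code above: ramp_filter_sinogram is a hypothetical helper name, the sinogram is assumed to have one projection per column, and frequencies are in cycles per sample (so the ramp peaks at 0.5 rather than 1, a constant scale difference from the MATLAB version).

```python
import numpy as np

def ramp_filter_sinogram(sinogram):
    """Apply a Ram-Lak (|f|) filter along axis 0 of a 2-D sinogram."""
    n_det = sinogram.shape[0]
    # fftfreq gives frequencies in FFT order, so no fftshift/ifftshift is needed
    ramp = np.abs(np.fft.fftfreq(n_det))
    ft = np.fft.fft(sinogram, axis=0)
    # Multiplication in the Fourier domain is convolution in the spatial domain
    filtered = np.fft.ifft(ft * ramp[:, None], axis=0)
    return np.real(filtered)

# Made-up input: one constant column and one cosine column with 4 cycles over 64 samples
n_det = 64
k = np.arange(n_det)
const_col = np.ones(n_det)
cos_col = np.cos(2 * np.pi * 4 * k / n_det)
sino = np.column_stack((const_col, cos_col))

out = ramp_filter_sinogram(sino)
print("constant column after filtering (max abs):", np.abs(out[:, 0]).max())
print("cosine column scale factor:", out[0, 1] / cos_col[0])
```

The constant column is removed entirely because the ramp is zero at DC, and the cosine column is scaled by its frequency, 4/64.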

Using scipy.stats.gaussian_kde with 2 dimensional data

I'm trying to use the scipy.stats.gaussian_kde class to smooth out some discrete data collected with latitude and longitude information, so it shows up as somewhat similar to a contour map in the end, where the high densities are the peak and low densities are the valley.
I'm having a hard time putting a two-dimensional dataset into the gaussian_kde class. I've played around to figure out how it works with 1 dimensional data, so I thought 2 dimensional would be something along the lines of:
from scipy import stats
from numpy import array
data = array([[1.1, 1.1],
              [1.2, 1.2],
              [1.3, 1.3]])
kde = stats.gaussian_kde(data)
kde.evaluate([1,2,3],[1,2,3])
which is saying that I have 3 points at [1.1, 1.1], [1.2, 1.2], [1.3, 1.3]. and I want to have the kernel density estimation using from 1 to 3 using width of 1 on x and y axis.
When creating the gaussian_kde, it keeps giving me this error:
raise LinAlgError("singular matrix")
numpy.linalg.linalg.LinAlgError: singular matrix
Looking into the source code of gaussian_kde, I realize that the way I'm thinking about what dataset means is completely different from how the dimensionality is calculated, but I could not find any sample code showing how multi-dimensional data works with the module. Could someone help me with some sample ways to use gaussian_kde with multi-dimensional data?
This example seems to be what you're looking for:
import numpy as np
import scipy.stats as stats
from matplotlib.pyplot import imshow
# Create some dummy data
rvs = np.append(stats.norm.rvs(loc=2,scale=1,size=(2000,1)),
                stats.norm.rvs(loc=0,scale=3,size=(2000,1)),
                axis=1)
# gaussian_kde expects shape (n_dims, n_points), hence the transpose
kde = stats.gaussian_kde(rvs.T)
# Regular grid to evaluate kde upon
x_flat = np.r_[rvs[:,0].min():rvs[:,0].max():128j]
y_flat = np.r_[rvs[:,1].min():rvs[:,1].max():128j]
x,y = np.meshgrid(x_flat,y_flat)
grid_coords = np.append(x.reshape(-1,1),y.reshape(-1,1),axis=1)
z = kde(grid_coords.T)
z = z.reshape(128,128)
# np.ptp works on all NumPy versions; the ndarray.ptp method was removed in NumPy 2.0
imshow(z,aspect=np.ptp(x_flat)/np.ptp(y_flat))
Axes need fixing, obviously.
You can also do a scatter plot of the data with
scatter(rvs[:,0],rvs[:,1])
I think you are mixing up kernel density estimation with interpolation or maybe kernel regression. KDE estimates the distribution of points if you have a larger sample of points.
I'm not sure which interpolation you want, but either the splines or rbf in scipy.interpolate will be more appropriate.
If you want one-dimensional kernel regression, then you can find a version in scikits.statsmodels with several different kernels.
update: here is an example (if this is what you want)
>>> data = 2 + 2*np.random.randn(2, 100)
>>> kde = stats.gaussian_kde(data)
>>> kde.evaluate(np.array([[1,2,3],[1,2,3]]))
array([ 0.02573917, 0.02470436, 0.03084282])
gaussian_kde has variables in rows and observations in columns, so reversed orientation from the usual in stats. In your example, all three points are on a line, so it has perfect correlation. That is, I guess, the reason for the singular matrix.
Adjusting the array orientation and adding a small noise, the example works, but still looks very concentrated, for example you don't have any sample point near (3,3):
>>> data = np.array([[1.1, 1.1],
...                  [1.2, 1.2],
...                  [1.3, 1.3]]).T
>>> data = data + 0.01*np.random.randn(2,3)
>>> kde = stats.gaussian_kde(data)
>>> kde.evaluate(np.array([[1,2,3],[1,2,3]]))
array([ 7.70204299e+000, 1.96813149e-044, 1.45796523e-251])
I found it difficult to understand the SciPy manual's description of how gaussian_kde works with 2D data. Here is an explanation which is intended to complement @endolith's example. I divided the code into several steps with comments to explain the less intuitive bits.
First, the imports:
import numpy as np
import scipy.stats as st
from matplotlib.pyplot import imshow, show
Create some dummy data: these are 1-D arrays of the "X" and "Y" point coordinates.
np.random.seed(142) # for reproducibility
x = st.norm.rvs(loc=2, scale=1, size=2000)
y = st.norm.rvs(loc=0, scale=3, size=2000)
For 2-D density estimation the gaussian_kde object has to be initialised with an array with two rows containing the "X" and "Y" datasets. In NumPy terminology, we "stack them vertically":
xy = np.vstack((x, y))
so the "X" data is in the first row xy[0,:] and the "Y" data are in the second row xy[1,:] and xy.shape is (2, 2000). Now create the gaussian_kde object:
dens = st.gaussian_kde(xy)
We will evaluate the estimated 2-D density PDF on a 2-D grid. There is more than one way of creating such a grid in NumPy. I show here an approach which is different from (but functionally equivalent to) @endolith's method:
gx, gy = np.mgrid[x.min():x.max():128j, y.min():y.max():128j]
gxy = np.dstack((gx, gy)) # shape is (128, 128, 2)
gxy is a 3-D array: gxy[i, j] holds the 2-element pair of "X" and "Y" values at that grid point, i.e. [ gx[i, j], gy[i, j] ].
We have to invoke dens() (or dens.pdf() which is the same thing) on each of the 2-D grid points. NumPy has a very elegant function for this purpose:
z = np.apply_along_axis(dens, 2, gxy)
In words, the callable dens (could have been dens.pdf as well) is invoked along axis=2 (the third axis) of the 3-D array gxy, and the values should be returned as a 2-D array. The only glitch is that the shape of z will be (128,128,1) and not the (128,128) I expected. Note that the documentation says:
The shape of out [the return value, L.D.] is identical to the shape of arr, except along the
axis dimension. This axis is removed, and replaced with new dimensions
equal to the shape of the return value of func1d. So if func1d returns
a scalar out will have one fewer dimensions than arr.
For a single point, dens() returns a 1-element array rather than the scalar I was hoping for, which explains the extra axis. I didn't investigate the issue any further, because this is easy to fix:
z = z.reshape(128, 128)
after which we can generate the image:
imshow(z, aspect=np.ptp(gx) / np.ptp(gy))  # np.ptp also works on NumPy 2.0, where ndarray.ptp was removed
show() # needed if you try this in PyCharm
Here is the image. (Note that I have implemented @endolith's version as well and got an image indistinguishable from this one.)
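As an alternative to np.apply_along_axis, the grid points can be passed to the KDE in a single call, since gaussian_kde accepts an array of shape (2, n_points). This is a minimal sketch under my own choices: a 64x64 grid to keep it quick, and NumPy's default_rng for the dummy data instead of st.norm.rvs.

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(142)           # seeded for reproducibility
x = rng.normal(loc=2, scale=1, size=2000)
y = rng.normal(loc=0, scale=3, size=2000)

# Variables in rows, observations in columns: shape (2, 2000)
dens = st.gaussian_kde(np.vstack((x, y)))

# 64x64 grid spanning the data range
gx, gy = np.mgrid[x.min():x.max():64j, y.min():y.max():64j]

# Flatten the grid into a (2, 4096) array of points and evaluate in one call
points = np.vstack((gx.ravel(), gy.ravel()))
z = dens(points).reshape(gx.shape)

print("grid shape:", z.shape)
print("peak density:", z.max())
```

The result already has the 2-D grid shape, so no extra axis has to be removed afterwards.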
The example posted in the top answer didn't work for me. I had to tweak it a little bit and it works now:
import numpy as np
import scipy.stats as stats
from matplotlib import pyplot as plt
# Create some dummy data
rvs = np.append(stats.norm.rvs(loc=2,scale=1,size=(2000,1)),
                stats.norm.rvs(loc=0,scale=3,size=(2000,1)),
                axis=1)
# gaussian_kde expects shape (n_dims, n_points), hence the transpose
kde = stats.gaussian_kde(rvs.T)
# Regular grid to evaluate kde upon
x_flat = np.r_[rvs[:,0].min():rvs[:,0].max():128j]
y_flat = np.r_[rvs[:,1].min():rvs[:,1].max():128j]
x,y = np.meshgrid(x_flat,y_flat)
grid_coords = np.append(x.reshape(-1,1),y.reshape(-1,1),axis=1)
z = kde(grid_coords.T)
z = z.reshape(128,128)
# np.ptp works on all NumPy versions; the ndarray.ptp method was removed in NumPy 2.0
plt.imshow(z,aspect=np.ptp(x_flat)/np.ptp(y_flat))
plt.show()