how to find out peak rise and decay - scipy

I am doing some signal processing and I am new to it. I am using scipy.signal for the calculations.
I am able to find the peak height and width, but I was wondering if I can also find the rise time and decay time of a peak. That would be the distance from the left width point to the peak maximum, and from the peak maximum to the right width point.
So far I have this, which is from the tutorial:
import numpy as np
import matplotlib.pyplot as plt
from scipy.misc import electrocardiogram
from scipy.signal import find_peaks, peak_widths
x = electrocardiogram()[2000:4000]
peaks, _ = find_peaks(x, height=0)
plt.plot(x)
plt.plot(peaks, x[peaks], "x")
plt.plot(np.zeros_like(x), "--", color="gray")
plt.show()
results_full = peak_widths(x, peaks, rel_height=1)
I think I am looking for the first moment or derivative

This depends on the type of signal. For this signal in particular, an approach that worked was to find all peaks and then filter them by a prominence threshold, defined as the midpoint of the prominence range.
Once I have the peaks of interest, I use the positions of the previous and next peaks as the left and right points.
import numpy as np
import matplotlib.pyplot as plt
from scipy.misc import electrocardiogram
from scipy.signal import find_peaks, peak_prominences
x = electrocardiogram()[2000:3500]
#b, a = butter(4, 0.001, 'high')
#x = lfilter(b, a, x)
peaks, _ = find_peaks(x)
prominences, _, _ = peak_prominences(x, peaks)
selected = prominences > 0.5 * (np.min(prominences) + np.max(prominences))
left = peaks[:-1][selected[1:]]
right = peaks[1:][selected[:-1]]
top = peaks[selected]
plt.figure(figsize=(14, 4))
plt.plot(x)
plt.plot(top, x[top], "x")
plt.plot(left, x[left], ".", markersize=20)
plt.plot(right, x[right], ".", markersize=20)
plt.show()
If you want to use a height threshold instead, it helps to first remove frequencies lower than the signal frequency, for example with a high-pass filter.
from scipy.signal import butter, lfilter
x = electrocardiogram()
plt.figure(figsize=(14, 4))
b, a = butter(4, 0.01, 'high')
plt.plot(x[2000:10000])
x = lfilter(b, a, x)
plt.plot(x[2000:10000])
plt.legend(['original', 'highpass filtered'])
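One caveat if rise and decay times are measured afterwards: lfilter is a causal filter, so it shifts and slightly distorts the waveform. A possible alternative is zero-phase filtering with filtfilt. The sketch below uses a synthetic signal rather than the ECG data; the 360 Hz sampling rate, the 0.5 Hz cutoff and the filter order are assumptions chosen for illustration only.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 360.0  # assumed sampling rate in Hz
t = np.arange(0, 10, 1 / fs)

# Synthetic stand-in for the data: a 5 Hz oscillation on a slow baseline drift.
drift = 0.5 + 0.5 * np.sin(2 * np.pi * 0.05 * t)
x = np.sin(2 * np.pi * 5 * t) + drift

# High-pass Butterworth filter; passing fs lets the cutoff be given in Hz.
b, a = butter(2, 0.5, btype="highpass", fs=fs)

# filtfilt runs the filter forward and backward, so peaks are not shifted in time.
y = filtfilt(b, a, x)

print(x.mean(), y.mean())
```

The mean of the filtered signal should end up much closer to zero than the mean of the input, which makes a fixed height threshold easier to choose.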
About coding style: if you are coming from MATLAB you may be used to having everything in the global scope, but I always say that modules are your friends :). I would simply import scipy.signal instead of importing its member functions into the global namespace. You can use an alias for a module, like import matplotlib.pyplot as plt, and you can look up which alias is commonly used for each module. This is more about interoperability between programmers than a requirement, which is why I wrote the code in your style.
The derivatives
You can use rise = (x[top] - x[left]) / (top - left) and fall = (x[top] - x[right]) / (top - right). These are not the actual values of the derivatives, but they are related features.
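If the goal is the rise and decay times as defined in the question (left width point to peak, and peak to right width point), the interpolated crossing positions returned by peak_widths can be used directly. This is a minimal sketch on a synthetic triangular pulse, not on the ECG data; the names rise_time, decay_time, rise_slope and decay_slope are made up for illustration, and all times are in samples.

```python
import numpy as np
from scipy.signal import find_peaks, peak_widths

# Synthetic pulse (an assumption, standing in for real data):
# rises from 0 to 1 over 10 samples, then decays back to 0 over 30 samples.
x = np.concatenate([np.linspace(0, 1, 11), np.linspace(1, 0, 31)[1:]])

peaks, _ = find_peaks(x, height=0.5)

# peak_widths returns the widths, the height at which each width was measured,
# and the interpolated left/right crossing positions (in samples).
widths, width_heights, left_ips, right_ips = peak_widths(x, peaks, rel_height=1)

rise_time = peaks - left_ips      # left width point -> peak
decay_time = right_ips - peaks    # peak -> right width point

# Average slopes over the rise and the decay (signal units per sample).
rise_slope = (x[peaks] - width_heights) / rise_time
decay_slope = (width_heights - x[peaks]) / decay_time

print(rise_time, decay_time)
print(rise_slope, decay_slope)
```

Dividing the times by the sampling rate converts them to seconds. A smaller rel_height (for example 0.5) measures the times at half prominence instead of at the base of the peak.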
Also if you want to find the max de

Related

How to set different stride with uniform filter in scipy?

I am using the following code to run uniform filter on my data:
import numpy as np
from scipy.ndimage import uniform_filter
a = np.arange(1000)
b = uniform_filter(a, size=10)
The filter right now seems to work as if a stride was set to size // 2.
How to adjust the code so that the stride of the filter is not half of the size?
You seem to be misunderstanding what uniform_filter is doing.
In this case, it creates an array b that replaces every a[i] with the mean of a block of size 10 centered at a[i]. So, something like:
for i in range(0, len(a)):  # for the 1D case
    b[i] = mean(a[i-10//2 : i+10//2])
Note that this tries to access values with indices outside the range 0..999. By default, uniform_filter assumes that the data before position 0 is a reflection of the data that follows it, and similarly at the end.
Also note that b has the same dtype as a. In the example, where a is of integer type, the mean is also computed as an integer, which can cause some loss of precision.
Here is some code and plot to illustrate what's happening:
import matplotlib.pyplot as plt
import numpy as np
from scipy.ndimage import uniform_filter
fig, axes = plt.subplots(ncols=2, figsize=(15,4))
for ax in axes:
    if ax == axes[1]:
        a = np.random.uniform(-1,1,50).cumsum()
        ax.set_title('random curve')
    else:
        a = np.arange(50, dtype=float)
        ax.set_title('values from 0 to 49')
    b = uniform_filter(a, size=10)
    ax.plot(a, 'b-')
    ax.plot(-np.arange(0, 10)-1, a[:10], 'b:') # show the reflection at the start
    ax.plot(50 + np.arange(0, 10), a[:-11:-1], 'b:') # show the reflection at the end
    ax.plot(b, 'r-')
plt.show()
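uniform_filter has no stride parameter: it always produces one output value per input element. If a strided result is what is wanted, one possible approach is to filter first and then keep every n-th value by slicing. The sketch below assumes a hypothetical stride equal to the window size, which reduces to non-overlapping block means; the names strided and block_means are made up for illustration.

```python
import numpy as np
from scipy.ndimage import uniform_filter

# Use a float array so the means are not truncated to integers.
a = np.arange(1000, dtype=float)
size = 10

b = uniform_filter(a, size=size)

# For an even size the window at index i covers a[i - size//2 : i + size//2],
# so starting at size // 2 and stepping by size gives non-overlapping windows.
strided = b[size // 2::size]

# Cross-check against plain block means computed with a reshape.
block_means = a.reshape(-1, size).mean(axis=1)

print(strided[:3], block_means[:3])
```

Any other stride works the same way, for example b[::3] for a stride of 3 with overlapping windows.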

scipy.special yields fluctuating result for confluent hypergeometric function

The scipy implementation of the confluent hypergeometric function gives me wrong results. Here is a minimal example:
import matplotlib.pyplot as plt
import numpy as np
from scipy import special
x=np.arange(0,1,.001)
f=special.hyp1f1(30,60,-1/x)
plt.scatter(x,f,s=.05)
When I run it, it produces the following plot:
[Figure: output of scipy.special.hyp1f1]
I wonder if there is a way to fix these fluctuations, which are definitely not correct. In fact, the function should be strictly positive in that range.
Starting from the explanation at scipy.special.hyp1f1, here is an attempt to approximate the function with a polynomial.
Apparently, hyp1f1(-1/x) works well between x=0 and about x=0.2. Note that at exactly x=0 the function isn't properly defined. The approximation with a 5th degree polynomial is much too large for x<0.4. With an 80th degree polynomial, the approximation seems correct for x>0.025 but quickly gets out of bounds for smaller x. (With more than 90 terms the polynomial can't be calculated in this way anymore.)
Probably the best solution would be to use a high-degree polynomial for x>=0.1 and the original hyp1f1 when x is smaller.
import matplotlib.pyplot as plt
import numpy as np
from scipy import special
x = np.linspace(0.001, 1, 1000)
f = special.hyp1f1(30, 60, -1 / x)
plt.scatter(x, f, s=1, color='r', label='hyp1f1')
for terms in range(80, 1, -10):
    k10 = np.arange(terms)
    c10 = special.poch(30, k10) / (special.poch(60, k10) * special.factorial(k10))
    poly10 = np.poly1d(c10[::-1])
    plt.scatter(x, poly10(-1 / x), s=1, label=f'{terms} terms', color=plt.cm.Set1(terms / 80))
plt.ylim(-3.5, 3.7)
plt.legend(scatterpoints=10, ncol=3)
plt.show()
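The piecewise idea described above (series for x>=0.1, hyp1f1 below that) could be sketched as follows. The helper names hyp1f1_series and hyp1f1_piecewise, the x_switch argument and the default of 80 terms are assumptions made for this illustration; how well the hyp1f1 branch behaves for small x depends on the installed SciPy version.

```python
import numpy as np
from scipy import special


def hyp1f1_series(a, b, z, terms=80):
    """Truncated power series of 1F1(a; b; z) (hypothetical helper)."""
    k = np.arange(terms)
    coeffs = special.poch(a, k) / (special.poch(b, k) * special.factorial(k))
    # polyval expects coefficients in ascending order of the power of z.
    return np.polynomial.polynomial.polyval(z, coeffs)


def hyp1f1_piecewise(a, b, x, x_switch=0.1, terms=80):
    """Evaluate 1F1(a; b; -1/x): scipy's hyp1f1 below x_switch, series above."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    small = x < x_switch
    out[small] = special.hyp1f1(a, b, -1.0 / x[small])
    out[~small] = hyp1f1_series(a, b, -1.0 / x[~small], terms)
    return out


x = np.linspace(0.001, 1, 1000)
f = hyp1f1_piecewise(30, 60, x)
print(f[-1], special.hyp1f1(30, 60, -1.0))
```

The switch point is a tuning knob: it should sit where both branches still agree, which the plot above can help to judge.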

Curve fitting of sine function in python using scipy is not yielding desired output

I'm trying to fit a sine function to my data. No errors are shown, but the fit doesn't seem to work.
def sin_fun(x,a,b):
    return (a*np.sin(b*x))
p_opt,p_cov=cf(sin_fun,xdata,ydata)
print(p_opt)
plt.plot(xdata,sin_fun(xdata,*p_opt))
plt.scatter(xdata,ydata)
plt.show()
This is the output I am getting:
I have simulated your data. There are two reasons why your code isn't doing what you want. First, your sin_fun needs a y-offset parameter; otherwise the function will always be symmetrical about y = 0. Second, the fit works better if you provide curve_fit with a reasonable initial guess, which is done using the p0 argument. Have a look here:
from scipy.optimize import curve_fit as cf
import numpy as np
from matplotlib import pyplot as plt
# simulate your data
xdata = np.linspace(0, 25000, 256)
ydata = 15000 * np.sin(xdata/2000) + 22000
# add some noise
ydata += np.random.rand(xdata.size) * 2000
# sin function needs a y-offset -> c
def sin_fun(x,a,b,c):
    return a*np.sin(b*x)+c
# need a reasonable guess -> note that the guess is not quite right but curve_fit still works
p_opt,p_cov=cf(sin_fun,xdata,ydata, p0=(10000, 1/2500, 15000))
print(p_opt)
plt.plot(xdata,sin_fun(xdata,*p_opt))
plt.plot(xdata,ydata, 'r.', ms=1)
plt.show()
With these fixes you can get a good fit. You could also add a phase parameter to your function to help fit other sinusoids.
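As a rough sketch of the phase-parameter variant, the example below adds a phase phi and derives the initial guess from the data instead of typing it in by hand. The FFT-based guess, the simulated phase of 0.7 and the noise level are assumptions made for this illustration and are not part of the answer above.

```python
import numpy as np
from scipy.optimize import curve_fit


def sin_fun(x, a, b, phi, c):
    # amplitude a, angular frequency b, phase phi, y-offset c
    return a * np.sin(b * x + phi) + c


# Simulated data (assumed values, similar in scale to the example above).
rng = np.random.default_rng(0)
xdata = np.linspace(0, 25000, 256)
ydata = 15000 * np.sin(xdata / 2000 + 0.7) + 22000
ydata += rng.normal(0, 500, xdata.size)

# Initial guess: the dominant FFT bin gives frequency and phase,
# the mean gives the offset, the standard deviation gives the amplitude.
step = xdata[1] - xdata[0]
spectrum = np.fft.rfft(ydata - ydata.mean())
freqs = np.fft.rfftfreq(xdata.size, d=step)
k = np.argmax(np.abs(spectrum[1:])) + 1  # skip the zero-frequency bin
p0 = (
    np.sqrt(2) * ydata.std(),
    2 * np.pi * freqs[k],
    np.angle(spectrum[k]) + np.pi / 2,  # a sine lags a cosine by pi/2
    ydata.mean(),
)

p_opt, p_cov = curve_fit(sin_fun, xdata, ydata, p0=p0)
print(p_opt)
```

The FFT guess is only as fine as the frequency resolution of the data, so it works best when the record covers at least a couple of periods.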

Why does the HMC sampler return negative values for hyperparameters that need to be positive? [older GPflow versions before 1.0]

I'd like to build a GP with marginalized hyperparameters.
I have seen that this is possible with the HMC sampler provided in gpflow from this notebook
However, when I tried to run the following code as a first step of this (NOTE this is on gpflow 0.5, an older version), the returned samples are negative, even though the lengthscale and variance need to be positive (negative values would be meaningless).
import numpy as np
from matplotlib import pyplot as plt
import gpflow
from gpflow import hmc
X = np.linspace(-3, 3, 20)
Y = np.random.exponential(np.sin(X) ** 2)
Y = (Y - np.mean(Y)) / np.std(Y)
k = gpflow.kernels.Matern32(1, lengthscales=.2, ARD=False)
m = gpflow.gpr.GPR(X[:, None], Y[:, None], k)
m.kern.lengthscales.prior = gpflow.priors.Gamma(1., 1.)
m.kern.variance.prior = gpflow.priors.Gamma(1., 1.)
# don't want the likelihood variance to be a hyperparameter for now, so fix it
m.likelihood.variance = 1e-6
m.likelihood.variance.fixed = True
m.optimize(maxiter=1000)
samples = m.sample(500)
print(samples)
Output:
[[-0.43764571 -0.22753325]
[-0.50418501 -0.11070128]
[-0.5932655 0.00821438]
[-0.70217714 0.05077999]
[-0.77745654 0.09362291]
[-0.79404456 0.13649446]
[-0.83989415 0.27118385]
[-0.90355789 0.29589641]
...
I don't know much about the details of HMC sampling, but I would expect the sampled posterior hyperparameters to be positive. I've checked the code and it seems it may be related to the Log1pe transform, though I failed to figure it out myself.
Any hint on this?
It would be helpful if you specified which GPflow version you are using, especially since the output you posted suggests a really old version of GPflow (pre-1.0), and this is something that has been improved since.
What is happening here (in old GPflow) is that the sample() method returns a single S x P array, where S is the number of samples and P is the number of free parameters. For example, for an M x M matrix parameter with a lower-triangular transform (such as the Cholesky factor of the covariance of the approximate posterior, q_sqrt), only M * (M + 1) / 2 parameters are actually stored and optimised. These are the values in the unconstrained space, i.e. they can take any real value. Transforms (see the gpflow.transforms module) provide the mapping between this unconstrained value (anywhere between minus and plus infinity) and the constrained value (e.g. gpflow.transforms.positive for lengthscales and variances).
In old GPflow, the model provides a get_samples_df() method that takes the S x P array returned by sample() and returns a pandas DataFrame with columns for all the trainable parameters, which is what you want. Or, ideally, you would just use a recent version of GPflow, in which the HMC sampler directly returns the DataFrame!
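To illustrate the idea of mapping unconstrained samples to positive values, here is a plain NumPy sketch. It assumes the positive transform is a softplus, log(1 + exp(u)), as the name Log1pe suggests; GPflow's own transform may differ in detail (for instance by a small constant offset), and softplus below is a hypothetical helper, not a GPflow function.

```python
import numpy as np


def softplus(u):
    """Map unconstrained values to positive ones: log(1 + exp(u))."""
    # logaddexp(0, u) computes log(exp(0) + exp(u)) in a numerically stable way.
    return np.logaddexp(0.0, u)


# First rows of the unconstrained samples printed in the question.
samples = np.array([
    [-0.43764571, -0.22753325],
    [-0.50418501, -0.11070128],
    [-0.5932655, 0.00821438],
])

constrained = softplus(samples)
print(constrained)
```

Every output value is strictly positive, whatever the sign of the unconstrained sample.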

Keep the scaling while drawing a weighted networkx graph

When I draw a weighted networkx graph, the layout does not really represent the weights in terms of distance. I was curious whether there is a parameter that I am missing, or some other problem.
I started by making a simulated dataset as follows:
from pylab import plot,show
from numpy import vstack,array
from numpy.random import rand
from scipy.cluster.vq import kmeans,vq
from scipy.spatial.distance import euclidean
import networkx as nx
from scipy.spatial.distance import pdist, squareform, cdist
# data generation
data = vstack((rand(5,2) + array([12,12]),rand(5,2)))
a = pdist(data, 'euclidean')
def givexy(index1D, VectorLength):
    return [index1D % VectorLength, index1D // VectorLength]
import matplotlib.pyplot as plt
plt.plot(data[:,0], data[:,1], 'o')
plt.show()
Then I calculate the Euclidean distance between all pairs and use the distance as the weight:
G = nx.empty_graph(1)
for cnt, item in enumerate(a):
    print(cnt)
    G.add_edge(givexy(cnt, 10)[0], givexy(cnt, 10)[1], weight=item, length=0)
pos = nx.spring_layout(G)
nx.draw_networkx(G, pos)
edge_labels = dict([((u, v), "%.2f" % d['weight'])
                    for u, v, d in G.edges(data=True)])
nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels)
#~ nx.draw(G,pos,edge_labels=edge_labels)
plt.show()
exit()
You might get a different plot, because for some reason the layout is random. My main problem is the distance between nodes: for example, the distance between nodes 4 and 8 is 0.82, but it looks longer than the distance between nodes 7 and 0.
Any hints?
Thank you.
The spring layout doesn't explicitly use the weights as distances. In general, edges with a higher weight are drawn shorter.
If you want to specify the positions explicitly, you can do that:
from numpy import vstack,array
from numpy.random import rand
from scipy.spatial.distance import euclidean, pdist
import networkx as nx
import matplotlib.pyplot as plt
# data generation
data = vstack((rand(5,2) + array([12,12]),rand(5,2)))
a = pdist(data, 'euclidean')
def givexy(index1D, VectorLength):
    return [index1D % VectorLength, index1D // VectorLength]
plt.plot(data[:,0], data[:,1], 'o')
G = nx.Graph()
for cnt, item in enumerate(a):
    print(cnt)
    G.add_edge(givexy(cnt, 10)[0], givexy(cnt, 10)[1], weight=item, length=0)
pos = {}
for node, row in enumerate(data):
    pos[node] = row
nx.draw_networkx(G, pos)
plt.savefig('drawing.png')
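One more thing worth checking in the code above: pdist returns a condensed vector with n*(n-1)/2 entries, ordered as the pairs (0,1), (0,2), ..., (0,n-1), (1,2), and so on, so taking index % 10 and index // 10 does not recover the node pair that a distance belongs to. A minimal sketch of one way to pair them up, using itertools.combinations, is shown below; the seeded random generator is an assumption made only to keep the example reproducible.

```python
from itertools import combinations

import networkx as nx
import numpy as np
from scipy.spatial.distance import pdist

# Two clusters of five 2-D points, as in the question (seeded for reproducibility).
rng = np.random.default_rng(0)
data = np.vstack((rng.random((5, 2)) + 12, rng.random((5, 2))))
n = len(data)

# combinations(range(n), 2) yields pairs in the same order as pdist's output.
distances = pdist(data, "euclidean")
G = nx.Graph()
for (i, j), d in zip(combinations(range(n), 2), distances):
    G.add_edge(i, j, weight=d)

# Use the real coordinates as node positions, so drawn lengths match the weights.
pos = {node: data[node] for node in G.nodes}

print(G.number_of_edges(), G[4][8]["weight"])
```

Passing this pos to nx.draw_networkx(G, pos) places every node at its true coordinates, so the drawn edge lengths are proportional to the stored weights.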