How to compute the derivative of a cubic spline interpolation using scipy? - scipy

I have a dataset that looks like this:
position  number_of_tag_at_this_position
3         4
8         6
13        25
23        12
I want to apply cubic spline interpolation to this dataset to interpolate the tag density; to do so, I run:
import numpy as np
from scipy import interpolate
x = [3, 8, 13, 23]
y = [4, 6, 25, 12]
tck = interpolate.splrep(x, y)  # cubic
Now I would like to calculate the derivative of the function at each point of the interpolation. How can I do this?
Thanks for your help!

See the manual:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.splev.html
Note the der parameter.
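For instance, a minimal sketch with the data from the question (the evaluation points xs below are an illustrative choice, not from the original post):
import numpy as np
from scipy import interpolate

x = [3, 8, 13, 23]
y = [4, 6, 25, 12]
tck = interpolate.splrep(x, y)  # cubic spline by default (k=3)

# der=1 makes splev return the first derivative of the spline
xs = np.linspace(3, 23, 50)  # illustrative evaluation points
dydx = interpolate.splev(xs, tck, der=1)
print(dydx)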

Related

Using numerical methods to plot solution to first-order nonlinear differential equation in Matlab

I have a question about plotting x(t), the solution to the following differential equation knowing that dx/dt equals the expression below. The value of x is 0 at t = 0.
syms x
dxdt = -(1.0*(6.84e+45*x^2 + 5.24e+32*x - 2.49e+42))/(2.47e+39*x + 7.12e+37)
I want to plot the solution of this first-order nonlinear differential equation. The analytical solution involves complex numbers, which isn't relevant here because the equation models a real-life process, but MATLAB can solve the equation using numerical methods and plot it. Can someone please suggest how to do this?
In MATLAB, try this:
tspan = [0 10];
x0 = 0;
[t,x] = ode45(@(t,x) -(1.0*(6.84e+45*x^2 + 5.24e+32*x - 2.49e+42))/(2.47e+39*x + 7.12e+37), tspan, x0);
plot(t,x,'b')
I tried it and this is what I got (plot omitted).
Hope that helps.
I have written an example of how to do this in Python with SymPy and matplotlib. SymPy can calculate both definite and indefinite integrals. By computing the indefinite integral and adding a constant, you can make the result evaluate to 0 at t = 0. Once you have the integral, it is just a matter of plotting: define an array from a start point to an end point with enough points in between (I used 500; it could likely be fewer), calculate the value of the integral plus the constant at each of those points, and plot the result with matplotlib. There are plenty of other questions on how to customize plots with matplotlib.
This displays a basic plot of the indefinite integral of the function dxdt under the assumption that x(0) = 0. The tuple passed to Plotting() sets the range of x values to plot; 500 evenly spaced points are plotted between the minimum and maximum values given in the call.
For more information on customizing the plot, I recommend the matplotlib documentation; documentation on integration can be found in the SymPy documentation.
from sympy import integrate, lambdify
from sympy.abc import x
import matplotlib.pyplot as plt
import numpy as np

def Plotting(xRange, dxdt):
    # Calculate the indefinite integral symbolically
    xt = integrate(dxdt, x)
    # Convert the symbolic expression to a numerical function
    f = lambdify(x, xt)
    # Constant of integration chosen so the solution is 0 at x = 0
    C = -f(0)
    # Define x values; the last argument of linspace is the number of points to plot
    xValues = np.linspace(xRange[0], xRange[1], 500)
    yValues = [f(xv) + C for xv in xValues]
    # Initialize figure
    fig = plt.figure(figsize=(4, 3))
    ax = fig.add_axes([0, 0, 1, 1])
    # Plot data
    ax.plot(xValues, yValues)
    plt.show()
    plt.close("all")

# Define the function
dxdt = -(1.0*(6.84e45*x**2 + 5.24e32*x - 2.49e42))/(2.47e39*x + 7.12e37)
# Run the Plotting function, with the left- and right-most points given as a tuple
# and the function as the second argument
Plotting((-0.025, 0.05), dxdt)

Linear regression in MATLAB [duplicate]

This question already has an answer here:
How do I determine the coefficients for a linear regression line in MATLAB? [closed]
(1 answer)
Closed 7 years ago.
How can I do a linear regression in MATLAB when several data points share the same x value?
Here is an example with minimal data (not the data I actually use):
y = [1,2,3,4,5,6,7,8,9,10];
x = [2,2,2,4,4,6,6,6,10,10];
If I use polyfit or \:
x = temp(:,1); y = temp(:,2);
b1 = x\y;
yCalc1 = b1*x;
plot(x,yCalc1,'-r');
Then the linear regression is wrong because (I suppose) it does not account for several values having the same x.
Here is a graph with my real data: blue dots are my data, and the red line is the linear regression (it's wrong); don't pay attention to the green dashed line.
And here is the "same" graph, done with Excel: blue dots are my data, and the red line is the linear regression (it's right).
Do you think it would be mathematically correct to take the mean of the y values that share the same x?
If you intend to solve simple linear regression in the matrix form Y = XB with the \ operator, you need to add a column of ones to your X in order to estimate the intercept.
y0 = [1,2,3,4,5,6,7,8,9,10];
x0 = [2,2,2,4,4,6,6,6,10,10];
X1 = [ones(length(x0),1) x0'];
b = X1\y0';
y = b(1) + x0*b(2)
plot(x0,y0,'o')
hold on
plot(x0,y,'--r')
You can find a good Matlab example here
So, Dan suggested a function to me and it's working now.
If you want to do the same thing, do it like this:
use the fitlm function (http://fr.mathworks.com/help/stats/fitlm.html?refresh=true#bunfd6c-2)
Example data (as column vectors, which table and fitlm expect):
y = [1,2,3,4,5,6,7,8,9,10]';
x = [2,2,2,4,4,6,6,6,10,10]';
tbl = table(x,y)
lm = fitlm(tbl,'linear')
and you will get the fitted coefficients.
A linear regression is an equation of the form y = ax + b. In the result, a corresponds to the x row (equal to 0.15663 below) and b corresponds to the (Intercept) row (equal to 1.4377 below).
With other values (here, my real data), MATLAB shows this result:
Linear regression model:
    y ~ 1 + x

Estimated Coefficients:
                   Estimate    SE           tStat     pValue
                   ________    _________    ______    ___________
    (Intercept)    1.4377      0.031151     46.151    5.8802e-290
    x              0.15663     0.0054355    28.816    1.2346e-145

Number of observations: 1499, Error degrees of freedom: 1497
Root Mean Squared Error: 0.135
R-squared: 0.357, Adjusted R-Squared: 0.356
F-statistic vs. constant model: 830, p-value = 1.23e-145
Thanks to Dan again!

How to make a polynomial approximation in Scilab?

I have a set of measurements that I want to approximate. I know I can do it with a 4th-degree polynomial, but I don't know how to find its five coefficients using Scilab.
For now, I have to use the user-friendly functions of OpenOffice Calc... So, to stick with Scilab only, I'd like to know whether a built-in function exists, or whether a simple script can do it.
There is no built-in polyfit function like in Matlab, but you can make your own:
function cf = polyfit(x, y, n)
    A = ones(length(x), n+1)
    for i = 1:n
        A(:, i+1) = x(:).^i
    end
    cf = lsq(A, y(:))
endfunction
This function accepts two vectors of equal size (they can be either row or column vectors; the colon operator makes sure they are treated as columns in the computation) and the degree of the polynomial.
It returns a column of coefficients, ordered from the 0th to the nth degree.
The computational method is straightforward: set up the (generally overdetermined) linear system that requires the polynomial to pass through every point, then solve it in the least-squares sense with lsq (in practice, cf = A\y(:) seems to perform identically, although the algorithm there is slightly different).
Example of usage:
x = [-3 -1 0 1 3 5 7]
y = [50 74 62 40 19 35 52]
n = 4
cf = polyfit(x, y, n)
t = linspace(min(x), max(x))'  // now use these coefficients to plot the polynomial
A = ones(length(t), n+1)
for i = 1:n
    A(:, i+1) = t.^i
end
plot(x, y, 'r*')
plot(t, A*cf)
Output: a plot of the sample points and the fitted polynomial curve (image not shown).
The ATOMS toolbox "stixbox" includes Matlab-compatible polyfit and polyval functions.
// Scilab 6.x.x needs:
atomsInstall(["stixbox";"makematrix";"distfun";"helptbx";"linalg"]) // install toolboxes
// POLYNOMIAL CURVE FITTING
// Needs the toolboxes above
x = [-3 -1 0 1 3 5 7];
y = [50 74 62 40 19 35 52];
plot(x,y,"."); // plot the sample points only
pcoeff = polyfit(x,y,4); // calculate the polynomial coefficients (4th degree)
xp = linspace(-3,7,100); // generate more x values for a smoother fitted curve
yp = polyval(pcoeff,xp); // calculate the y values of the fitted curve
plot(xp,yp,"k"); // plot the fitted curve in black

Arbitrary sampling over an interpolation

I have arbitrary points (8192, 4678, 1087.2, 600, 230.4, etc.) that I want to interpolate and resample at other defined points (100, 500.3, 802, 2045, 4399.5125, etc.).
I tried cubic spline interpolation, but it uses a fixed sampling step, and depending on the step size it may not generate values at the points I need.
How would you do it?
If your points are x1 = [...] and y1 = [...] and you want to evaluate the spline at a new set of points x2 = [...], then you use
y2 = spline(x1,y1,x2)
Example:
x1 = [0,2,4,6,8].'
y1 = [24,25,22,14,6].'
x2 = [2,2.5,3,3.5,4].'
y2 = spline(x1,y1,x2)
y2 =
25.0000
24.7227
24.1563
23.2617
22.0000
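An equivalent sketch in Python with SciPy (same data; CubicSpline's default not-a-knot boundary condition matches MATLAB's spline, so this should reproduce the y2 values above):
import numpy as np
from scipy.interpolate import CubicSpline

x1 = np.array([0, 2, 4, 6, 8])
y1 = np.array([24, 25, 22, 14, 6])
x2 = np.array([2, 2.5, 3, 3.5, 4])

cs = CubicSpline(x1, y1)  # not-a-knot end conditions by default
print(cs(x2))             # should match MATLAB's spline output above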
It all depends on the underlying physical phenomenon. There is a fine line between interpolating and just making up stuff.
I would probably first upsample & filter until I have a meaningful signal at a fixed sampling rate.
I would then use some interpolation method to estimate the signal at the goal points.
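A rough sketch of that idea with SciPy (the signal, sampling rates, and goal points below are purely illustrative assumptions):
import numpy as np
from scipy import signal

# An illustrative signal at a fixed sampling rate
t = np.linspace(0, 1, 50, endpoint=False)
y = np.sin(2 * np.pi * 3 * t)

# Upsample to a denser fixed rate (the Fourier-based resample also filters)
y_up, t_up = signal.resample(y, 500, t=t)

# Then estimate the signal at the arbitrary goal points by interpolation
goal = np.array([0.1, 0.237, 0.5125])
print(np.interp(goal, t_up, y_up))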
I would recommend considering doing this backwards.
Rather than generating a lot of points and hoping that the points you need are among them, calculate a formula for the interpolation (perhaps piecewise linear, or something more complicated) and evaluate that function at the required points.
For example, if you have x = [1 2 3 4 10] and y = [11 22 13 24 11], the linear interpolation at the point 6 would be:
24 + (6-4) * (11-24) / (10-4)
which evaluates to about 19.67. It should not be too hard to generalize this; see the sketch below.
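For instance, a minimal SciPy sketch (assuming the same example data; interp1d builds the piecewise-linear formula once and can then be evaluated at any required point):
import numpy as np
from scipy import interpolate

x = np.array([1, 2, 3, 4, 10])
y = np.array([11, 22, 13, 24, 11])

f = interpolate.interp1d(x, y)  # piecewise-linear by default
print(f(6))                     # about 19.67, matching the hand calculation above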

Using scipy.stats.gaussian_kde with 2 dimensional data

I'm trying to use the scipy.stats.gaussian_kde class to smooth out some discrete data collected with latitude and longitude information, so that in the end it shows up somewhat like a contour map, where the high densities are the peaks and the low densities are the valleys.
I'm having a hard time putting a two-dimensional dataset into the gaussian_kde class. I've played around to figure out how it works with 1-dimensional data, so I thought 2-dimensional data would be something along the lines of:
from scipy import stats
from numpy import array
data = array([[1.1, 1.1],
              [1.2, 1.2],
              [1.3, 1.3]])
kde = stats.gaussian_kde(data)
kde.evaluate([1,2,3],[1,2,3])
which is meant to say that I have 3 points, at [1.1, 1.1], [1.2, 1.2], and [1.3, 1.3], and that I want the kernel density estimate evaluated from 1 to 3, with a width of 1, on the x and y axes.
When creating the gaussian_kde, it keeps giving me this error:
raise LinAlgError("singular matrix")
numpy.linalg.linalg.LinAlgError: singular matrix
Looking into the source code of gaussian_kde, I realize that the way I'm thinking about what the dataset means is completely different from how the dimensionality is calculated, but I could not find any sample code showing how multi-dimensional data works with the module. Could someone help me with some sample ways to use gaussian_kde with multi-dimensional data?
This example seems to be what you're looking for:
import numpy as np
import scipy.stats as stats
from matplotlib.pyplot import imshow
# Create some dummy data
rvs = np.append(stats.norm.rvs(loc=2,scale=1,size=(2000,1)),
                stats.norm.rvs(loc=0,scale=3,size=(2000,1)),
                axis=1)
kde = stats.kde.gaussian_kde(rvs.T)
# Regular grid to evaluate kde upon
x_flat = np.r_[rvs[:,0].min():rvs[:,0].max():128j]
y_flat = np.r_[rvs[:,1].min():rvs[:,1].max():128j]
x,y = np.meshgrid(x_flat,y_flat)
grid_coords = np.append(x.reshape(-1,1),y.reshape(-1,1),axis=1)
z = kde(grid_coords.T)
z = z.reshape(128,128)
imshow(z,aspect=x_flat.ptp()/y_flat.ptp())
Axes need fixing, obviously.
You can also do a scatter plot of the data with
scatter(rvs[:,0],rvs[:,1])
I think you are mixing up kernel density estimation with interpolation or maybe kernel regression. KDE estimates the distribution of points, given a larger sample of points.
I'm not sure which kind of interpolation you want, but either the splines or rbf in scipy.interpolate would be more appropriate.
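For instance, a minimal interpolation sketch with scipy.interpolate.Rbf (the points and values below are purely illustrative):
import numpy as np
from scipy import interpolate

x = np.array([1.1, 1.2, 1.3])
y = np.array([1.1, 1.2, 1.3])
z = np.array([5.0, 7.0, 6.0])   # values observed at the (x, y) points

rbf = interpolate.Rbf(x, y, z)  # radial basis function interpolant
print(rbf(1.15, 1.15))          # interpolated value at a new point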
If you want one-dimensional kernel regression, then you can find a version in scikits.statsmodels with several different kernels.
update: here is an example (if this is what you want)
>>> data = 2 + 2*np.random.randn(2, 100)
>>> kde = stats.gaussian_kde(data)
>>> kde.evaluate(np.array([[1,2,3],[1,2,3]]))
array([ 0.02573917, 0.02470436, 0.03084282])
gaussian_kde has variables in rows and observations in columns, so the orientation is reversed from the usual one in stats. In your example, all three points lie on a line, so they are perfectly correlated. That is, I guess, the reason for the singular matrix.
Adjusting the array orientation and adding a small amount of noise makes the example work, but the result still looks very concentrated; for example, you don't have any sample point near (3,3):
>>> data = np.array([[1.1, 1.1],
...                  [1.2, 1.2],
...                  [1.3, 1.3]]).T
>>> data = data + 0.01*np.random.randn(2,3)
>>> kde = stats.gaussian_kde(data)
>>> kde.evaluate(np.array([[1,2,3],[1,2,3]]))
array([ 7.70204299e+000, 1.96813149e-044, 1.45796523e-251])
I found it difficult to understand the SciPy manual's description of how gaussian_kde works with 2-D data. Here is an explanation which is intended to complement @endolith's example. I divided the code into several steps with comments to explain the less intuitive bits.
First, the imports:
import numpy as np
import scipy.stats as st
from matplotlib.pyplot import imshow, show
Create some dummy data: these are 1-D arrays of the "X" and "Y" point coordinates.
np.random.seed(142) # for reproducibility
x = st.norm.rvs(loc=2, scale=1, size=2000)
y = st.norm.rvs(loc=0, scale=3, size=2000)
For 2-D density estimation the gaussian_kde object has to be initialised with an array with two rows containing the "X" and "Y" datasets. In NumPy terminology, we "stack them vertically":
xy = np.vstack((x, y))
so the "X" data is in the first row xy[0,:] and the "Y" data are in the second row xy[1,:] and xy.shape is (2, 2000). Now create the gaussian_kde object:
dens = st.gaussian_kde(xy)
We will evaluate the estimated 2-D density PDF on a 2-D grid. There is more than one way of creating such a grid in NumPy. I show here an approach which is different from (but functionally equivalent to) @endolith's method:
gx, gy = np.mgrid[x.min():x.max():128j, y.min():y.max():128j]
gxy = np.dstack((gx, gy)) # shape is (128, 128, 2)
gxy is a 3-D array; the [i, j]-th element of gxy contains the pair of corresponding "X" and "Y" values: gxy[i, j] is [gx[i, j], gy[i, j]], i.e. the i-th grid "X" value paired with the j-th grid "Y" value.
We have to invoke dens() (or dens.pdf() which is the same thing) on each of the 2-D grid points. NumPy has a very elegant function for this purpose:
z = np.apply_along_axis(dens, 2, gxy)
In words, the callable dens (it could have been dens.pdf as well) is invoked along axis=2 (the third axis) in the 3-D array gxy, and the values should come back as a 2-D array. The only glitch is that the shape of z will be (128, 128, 1) and not the (128, 128) I expected. Note that the documentation says:
The shape of out [the return value, L.D.] is identical to the shape of arr, except along the
axis dimension. This axis is removed, and replaced with new dimensions
equal to the shape of the return value of func1d. So if func1d returns
a scalar out will have one fewer dimensions than arr.
Most likely dens() returned a 1-element array rather than the scalar I was hoping for. I didn't investigate the issue any further, because it is easy to fix:
z = z.reshape(128, 128)
after which we can generate the image:
imshow(z, aspect=gx.ptp() / gy.ptp())
show() # needed if you try this in PyCharm
Here is the image. (Note that I have implemented @endolith's version as well and got an image indistinguishable from this one.)
The example posted in the top answer didn't work for me. I had to tweak it a little bit and it works now:
import numpy as np
import scipy.stats as stats
from matplotlib import pyplot as plt
# Create some dummy data
rvs = np.append(stats.norm.rvs(loc=2,scale=1,size=(2000,1)),
                stats.norm.rvs(loc=0,scale=3,size=(2000,1)),
                axis=1)
kde = stats.kde.gaussian_kde(rvs.T)
# Regular grid to evaluate kde upon
x_flat = np.r_[rvs[:,0].min():rvs[:,0].max():128j]
y_flat = np.r_[rvs[:,1].min():rvs[:,1].max():128j]
x,y = np.meshgrid(x_flat,y_flat)
grid_coords = np.append(x.reshape(-1,1),y.reshape(-1,1),axis=1)
z = kde(grid_coords.T)
z = z.reshape(128,128)
plt.imshow(z,aspect=x_flat.ptp()/y_flat.ptp())
plt.show()