get strange result from scipy.signal.lsim when run twice - scipy

I run scipy.signal.lsim 10 times, and it seems that X0 is only used the first time. Why?
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

t = np.linspace(0.0, 100, 100*100)
transfun = []
for i in range(10):
    transfun.append(signal.lti([1], [1 + i, 1]))
y = []
for i in range(10):
    y.append(np.sin(2*np.pi*300*t) + np.random.normal(0, 1, 10000) + 50)
sensor_output = []
for i in range(10):
    tout, yout, xout = signal.lsim(transfun[i], y[i], t, X0=[50.0])
    sensor_output.append(yout)
fig = plt.figure()
for i in range(10):
    plt.subplot(10, 1, i+1)
    plt.plot(t, y[i])
    plt.plot(t, sensor_output[i])
plt.show()

lsim takes an initial state vector as an argument, not an initial output value.
Transfer functions don't really have state vectors, but under the hood lsim is converting the transfer function to a state-space realization (which does have a state vector), and using that to simulate the system.
One problem is that, for a given transfer function, there's no unique realization. The lsim documentation doesn't say how it converts transfer functions to state-space realizations; given your results I took a guess that happened to work (see below), but it's not robust.
To solve this for general transfer functions (i.e., not just first-order), you'd need to work with a specific state-space realization and specify more than just the initial output, otherwise the problem is under-constrained (I guess a typical approach would be to require dy/dt = 0, and similarly for all higher derivatives).
Below is a quick-and-dirty fix for your problem, and a sketch of how to do this for first-order state-space realizations.
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

nex = 3
t = np.linspace(0, 40, 1001)
taus = [1, 5, 10]
transfun = [signal.lti([1], [tau, 1]) for tau in taus]

u = np.tile(50, t.shape)
yinit = 49
sensor_output = [signal.lsim(tf, u, t, X0=[yinit*tau])[1]
                 for tf, tau in zip(transfun, taus)]

fig = plt.figure()
for i in range(nex):
    plt.subplot(nex, 1, i+1)
    plt.plot(t, u)
    plt.plot(t, sensor_output[i])
plt.savefig('img1.png')
# Different SS realizations of the same TF need different x0
# to get the same initial y0.
g = signal.tf2ss([1], [10, 1])
# create an SS system with the same TF
k = 1.234
g2 = (g[0], k*g[1], g[2]/k, g[3])
# desired initial output value
y0 = 321
# solve for the initial state of the two SS systems
x0 = y0 / g[2]
x0_2 = y0 / g2[2]
output = [signal.lsim(g, u, t, X0=x0)[1],
          signal.lsim(g2, u, t, X0=x0_2)[1]]

fig = plt.figure()
for i, out in enumerate(output):
    plt.subplot(len(output), 1, i+1)
    plt.plot(t, u)
    plt.plot(t, out)
plt.savefig('img2.png')
plt.show()

Related

How to correctly set the 'rtol' and 'atol' in scipy integration module 'solve_ivp' for solving a system of ODE with unknown analytic solution?

I was trying to reproduce some results of the ode45 solver in Python using solve_ivp. Though all parameters, initial conditions, step size, and 'atol' and 'rtol' (which are 1e-6 and 1e-3) are the same, I am getting different solutions. Both solutions converge to a periodic solution, but of different kinds. As solve_ivp uses the same RK4(5) method as ode45, this discrepancy in the final result is not quite understandable. How can we know which one is the correct solution?
The code is included below
import sys
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
from matplotlib.patches import Circle

# Pendulum rod lengths (m), bob masses (kg).
L1, L2, mu, a1 = 1, 1, 1/5, 1
m1, m2, B = 1, 1, 0.1
# The gravitational acceleration (m.s-2).
g = 9.81
# The forcing frequency, forcing amplitude
w, a_m = 10, 4.5
A = (a_m*w**2)/g
A1 = a_m/g

def deriv(t, y, mu, a1, B, w, A):  # beware of the order of the arguments
    """Return the first derivatives of y = theta1, z1, theta2, z2, z3."""
    a, c, b, d, e = y
    adot = c
    cdot = (-(1-A*np.sin(e))*(((1+mu)*np.sin(a))-(mu*np.cos(a-b)*np.sin(b)))-((mu/a1)*((d**2)+(a1*np.cos(a-b)*c**2))*np.sin(a-b))-(2*B*(1+(np.sin(a-b))**2)*c)-((2*B*A/w)*(2*np.sin(a)-(np.cos(a-b)*np.sin(b)))*np.cos(e)))/(1+mu*(np.sin(a-b))**2)
    bdot = d
    ddot = ((-a1*(1+mu)*(1-A*np.sin(e))*(np.sin(b)-(np.cos(a-b)*np.sin(a))))+(((a1*(1+mu)*c**2)+(mu*np.cos(a-b)*d**2))*np.sin(a-b))-((2*B/mu)*(((1+mu*(np.sin(a-b))**2)*d)+(a1*(1-mu)*np.cos(a-b)*c)))-((2*B*a1*A/(w*mu))*(((1+mu)*np.sin(b))-(2*mu*np.cos(a-b)*np.sin(a)))*np.cos(e)))/(1+mu*(np.sin(a-b))**2)
    edot = w
    return adot, cdot, bdot, ddot, edot

# Initial conditions: theta1, dtheta1/dt, theta2, dtheta2/dt, phase.
y0 = np.array([3.15, -0.1, 3.13, 0.1, 0])

# Do the numerical integration of the equations of motion
sol = solve_ivp(deriv, [0, 40000], y0, args=(mu, a1, B, w, A), method='RK45',
                t_eval=np.arange(0, 40000, 0.005), dense_output=True,
                rtol=1e-3, atol=1e-6)
T = sol.t
Y = sol.y
I am expecting similar results from ode45 in MATLAB and solve_ivp in Python. How can I exactly reproduce the ode45 result in Python? What is the reason for the discrepancy?
Even if ode45 and RK45 use the same underlying scheme, they do not necessarily use the exact same strategy regarding the evolution of the time step and its adaptation to match the error tolerance. Thus, it is difficult to know which one is better.
The only thing you can do is simply try lower tolerances, e.g. 1e-10. Then both solutions should end up being virtually identical... Here, your current error tolerance might be insufficiently low, so that small discrepancies in the fine details of both algorithms create a visible difference in the solution.
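As a minimal sketch of that suggestion, assuming the deriv function, y0 and the parameters from the question are already defined (a much shorter time span is used here only to keep the run time reasonable):
sol_tight = solve_ivp(deriv, [0, 400], y0, args=(mu, a1, B, w, A), method='RK45',
                      t_eval=np.arange(0, 400, 0.005),
                      rtol=1e-10, atol=1e-10)
# With tolerances this tight, the Python and MATLAB trajectories should stay
# virtually identical for much longer before the chaotic dynamics separate them.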

GPFlow multiple independent realizations of same GP, irregular sampling times/lengths

In GPflow I have multiple time series; the sampling times are not aligned across time series, and the time series may have different lengths (longitudinal data). I assume that they are independent realizations from the same GP. What is the right way to handle this with SVGP, and more generally with GPflow? Do I need to use coregionalization? The coregionalization notebook assumes correlated trajectories, while I want a shared mean/kernel but independent realizations.
Yes, the Coregion kernel implemented in GPflow is what you can use for your problem.
Let's set up some data from the generative model you describe, with different lengths for the timeseries:
import numpy as np
import gpflow
import matplotlib.pyplot as plt
Ns = [80, 90, 100] # number of observations for three different realizations
Xs = [np.random.uniform(0, 10, size=N) for N in Ns] # observation locations
# three different draws from the same GP:
k = gpflow.kernels.Matern52(variance=2.0, lengthscales=0.5) # kernel
Ks = [k(X[:, None]) for X in Xs]
Ls = [np.linalg.cholesky(K) for K in Ks]
vs = [np.random.randn(N, 1) for N in Ns]
fs = [(L @ v).squeeze(axis=-1) for L, v in zip(Ls, vs)]
To actually set up the training data for the gpflow GP model:
# output indicator for the observations: which timeseries is this?
os = [o * np.ones(N) for o, N in enumerate(Ns)] # [0 ... 0, 1 ... 1, 2 ... 2]
# now assemble the three timeseries in single data set:
allX = np.concatenate(Xs)
allo = np.concatenate(os)
allf = np.concatenate(fs)
X = np.c_[allX, allo]
Y = allf[:, None]
assert X.shape == (sum(Ns), 2)
assert Y.shape == (sum(Ns), 1)
# now let's set up a copy of the original kernel:
k2 = gpflow.kernels.Matern52(active_dims=[0]) # the same as k above, but with different hyperparameters
# and a Coregionalization kernel that effectively says they are all independent:
kc = gpflow.kernels.Coregion(output_dim=len(Ns), rank=1, active_dims=[1])
kc.W.assign(np.zeros(kc.W.shape))
kc.kappa.assign(np.ones(kc.kappa.shape))
gpflow.set_trainable(kc, False) # we want W and kappa fixed
The Coregion kernel defines a covariance matrix B = W Wᵀ + diag(kappa), so by setting W=0 we prescribe zero correlations (independent realizations) and kappa=1 (actually the default) ensures that the variance hyperparameter of the copy of the original kernel remains interpretable.
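As a quick sanity check (assuming GPflow 2.x, where parameters expose .numpy()), you can build B explicitly and confirm it is the identity:
B = kc.W.numpy() @ kc.W.numpy().T + np.diag(kc.kappa.numpy())
print(B)  # identity of size len(Ns): unit variance, zero cross-correlation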
Now construct the actual model and optimize hyperparameters:
k2c = k2 * kc
m = gpflow.models.GPR((X, Y), k2c, noise_variance=1e-5)
opt = gpflow.optimizers.Scipy()
opt.minimize(m.training_loss, m.trainable_variables, compile=False)
which recovers the initial variance and lengthscale hyperparameters pretty well.
If you want to predict, you have to provide the extra "output" column in the Xnew argument to m.predict_f(), e.g. as follows:
Xtest = np.linspace(0, 10, 100)
Xtest_augmented = np.c_[Xtest, np.zeros_like(Xtest)]
f_mean, f_var = m.predict_f(Xtest_augmented)
(whether you set the output column to 0, 1, or 2 does not matter, as we set them all to be the same with our choice of W and kappa).
If your input was more than one-dimensional, you could set
active_dims=list(range(X.shape[1] - 1)) for the first kernel(s) and active_dims=[X.shape[1]-1] for the Coregion kernel.
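A small sketch of that setup, assuming the augmented input X has shape (N, D+1) with the output index in the last column, exactly as in the construction above:
D = X.shape[1] - 1  # number of "real" input dimensions
k2 = gpflow.kernels.Matern52(active_dims=list(range(D)))
kc = gpflow.kernels.Coregion(output_dim=len(Ns), rank=1, active_dims=[D])
kc.W.assign(np.zeros(kc.W.shape))
kc.kappa.assign(np.ones(kc.kappa.shape))
gpflow.set_trainable(kc, False)
m = gpflow.models.GPR((X, Y), k2 * kc, noise_variance=1e-5)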

How to use scipy signal for MIMO systems

I am looking for a way to simulate the output of a system for various input signals. To be more precise, I have a system defined by its transfer function H that takes one input and has one output. I generated several signals (stored in a numpy array). What I would like to do is get the response of the system to each input signal without using a for loop. Is there a way to proceed? Below is the code I wrote so far.
from __future__ import division
import numpy as np
from scipy import signal
nbr_inputs = 5
t_in = np.arange(0,10,0.2)
dim = (nbr_inputs, len(t_in))
x = np.cumsum(np.random.normal(0,2e-3, dim), axis=1)
H = signal.TransferFunction([1, 3, 3], [1, 2, 1])
t_out, y, _ = signal.lsim(H, x[0], t_in) # here, I would just like to simply write x
thanks for your help
This is not a MIMO system, it is a SISO system but you have multiple inputs.
You can create a MIMO system and apply your inputs all at once; they will be computed channel by channel, but simultaneously. However, you can't use scipy.signal.lsim for MIMO systems yet. You can use other options such as python-control (if you have the slycot extension, otherwise again no MIMO) or harold if you have Python 3.6 or greater (disclaimer: I'm the author).
import numpy as np
from harold import *
import matplotlib.pyplot as plt
nbr_inputs = 5
t_in = np.arange(0,10,0.2)
dim = (nbr_inputs, len(t_in))
x = np.cumsum(np.random.normal(0,2e-3, dim), axis=1)
# Forming a 1x5 system, common denominator will be completed automatically
H = Transfer([[[1, 3, 3]]*nbr_inputs], [1, 2, 1])
The keyword per_channel=True applies first input to first channel, second input to second and so on. Otherwise combined response is returned. You can check the shapes by playing around with it to see what I mean.
# Notice it is x.T below -> input shape = <num samples>, <num inputs>
y, t = simulate_linear_system(H, x.T, t_in, per_channel=True)
plt.plot(t, y)
This gives a plot with one response per input channel.

Kalman Filter (pykalman): Value for obs_covariance and model without intercept

I am looking at the KalmanFilter from pykalman shown in examples:
pykalman documentation
Example 1
Example 2
and I am wondering about the difference between setting
observation_covariance=100,
vs
observation_covariance=1,
The documentation states
observation_covariance R: e(t)^2 ~ Gaussian (0, R)
How should this value be set correctly?
Additionally, is it possible to apply the Kalman filter without intercept in the above module?
The observation covariance shows how much error you assume to be in your input data. The Kalman filter works fine on normally distributed data. Under this assumption you can use the 3-sigma rule to calculate the covariance (in this case the variance) of your observation based on the maximum error in the observation.
The values in your question can be interpreted as follows:
Example 1
observation_covariance = 100
sigma = sqrt(observation_covariance) = 10
max_error = 3*sigma = 30
Example 2
observation_covariance = 1
sigma = sqrt(observation_covariance) = 1
max_error = 3*sigma = 3
So you need to choose the value based on your observation data. The more accurate the observation, the smaller the observation covariance.
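For instance, here is a tiny sketch of that 3-sigma bookkeeping (the max_error value is only an illustrative assumption, not something taken from the question):
max_error = 30.0                     # largest observation error you expect
sigma = max_error / 3.0              # 3-sigma rule
observation_covariance = sigma ** 2  # = 100, as in Example 1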
Another point: you can tune your filter by manipulating the covariance, but I think it's not a good idea. The higher the observation covariance value, the weaker the impact a new observation has on the filter state.
Sorry, I did not understand the second part of your question (about the Kalman Filter without intercept). Could you please explain what you mean?
You are trying to use a regression model and both intercept and slope belong to it.
---------------------------
UPDATE
I prepared some code and plots to answer your questions in details. I used EWC and EWA historical data to stay close to the original article.
First of all, here is the code (pretty much the same as in the examples above, but with a different notation).
from pykalman import KalmanFilter
import numpy as np
import matplotlib.pyplot as plt
# reading data (quick and dirty)
Datum=[]
EWA=[]
EWC=[]
for line in open('data/dataset.csv'):
    f1, f2, f3 = line.split(';')
    Datum.append(f1)
    EWA.append(float(f2))
    EWC.append(float(f3))
n = len(Datum)
# Filter Configuration
# both slope and intercept have to be estimated
# transition_matrix
F = np.eye(2) # identity matrix because x_(k+1) = x_(k) + noise
# observation_matrix
# H_k = [EWA_k 1]
H = np.vstack([np.matrix(EWA), np.ones((1, n))]).T[:, np.newaxis]
# transition_covariance
Q = [[1e-4, 0],
     [0, 1e-4]]
# observation_covariance
R = 1 # max error = 3
# initial_state_mean
X0 = [0,
      0]
# initial_state_covariance
P0 = [[1, 0],
      [0, 1]]
# Kalman-Filter initialization
kf = KalmanFilter(n_dim_obs=1, n_dim_state=2,
                  transition_matrices=F,
                  observation_matrices=H,
                  transition_covariance=Q,
                  observation_covariance=R,
                  initial_state_mean=X0,
                  initial_state_covariance=P0)
# Filtering
state_means, state_covs = kf.filter(EWC)
# Restore EWC based on EWA and estimated parameters
EWC_restored = np.multiply(EWA, state_means[:, 0]) + state_means[:, 1]
# Plots
plt.figure(1)
ax1 = plt.subplot(211)
plt.plot(state_means[:, 0], label="Slope")
plt.grid()
plt.legend(loc="upper left")
ax2 = plt.subplot(212)
plt.plot(state_means[:, 1], label="Intercept")
plt.grid()
plt.legend(loc="upper left")
# check the result
plt.figure(2)
plt.plot(EWC, label="EWC original")
plt.plot(EWC_restored, label="EWC restored")
plt.grid()
plt.legend(loc="upper left")
plt.show()
I could not retrieve the data using pandas, so I downloaded them and read them from a file.
Here you can see the estimated slope and intercept:
To test the estimated data I restored the EWC value from the EWA using the estimated parameters:
About the observation covariance value
By varying the observation covariance value you tell the Filter how accurate the input data is (normally you just describe your confidence in the observation using some datasheets or your knowledge about the system).
Here are estimated parameters and the restored EWC values using different observation covariance values:
You can see the filter follows the original function better with a higher confidence in the observation (smaller R). If the confidence is low (bigger R), the filter leaves the initial estimate (slope = 0, intercept = 0) very slowly and the restored function ends up far away from the original one.
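A rough sketch of how those comparison plots can be reproduced, assuming F, H, Q, X0, P0, EWA and EWC are defined as in the code above (the specific R values are just examples):
plt.figure(3)
for R_test in [0.1, 1, 100]:
    kf_R = KalmanFilter(n_dim_obs=1, n_dim_state=2,
                        transition_matrices=F, observation_matrices=H,
                        transition_covariance=Q, observation_covariance=R_test,
                        initial_state_mean=X0, initial_state_covariance=P0)
    means_R, _ = kf_R.filter(EWC)
    plt.plot(np.multiply(EWA, means_R[:, 0]) + means_R[:, 1],
             label="EWC restored, R = %s" % R_test)
plt.plot(EWC, label="EWC original")
plt.grid()
plt.legend(loc="upper left")
plt.show()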
About the frozen intercept
If you want to freeze the intercept for some reason, you need to change the whole model and all filter parameters.
In the normal case we had:
x = [slope; intercept] #estimation state
H = [EWA 1] #observation matrix
z = [EWC] #observation
Now we have:
x = [slope] #estimation state
H = [EWA] #observation matrix
z = [EWC-const_intercept] #observation
Results:
Here is the code:
from pykalman import KalmanFilter
import numpy as np
import matplotlib.pyplot as plt
# only slope has to be estimated (it will be manipulated by the constant intercept) - mathematically incorrect!
const_intercept = 10
# reading data (quick and dirty)
Datum=[]
EWA=[]
EWC=[]
for line in open('data/dataset.csv'):
    f1, f2, f3 = line.split(';')
    Datum.append(f1)
    EWA.append(float(f2))
    EWC.append(float(f3))
n = len(Datum)
# Filter Configuration
# transition_matrix
F = 1 # identity matrix because x_(k+1) = x_(k) + noise
# observation_matrix
# H_k = [EWA_k]
H = np.matrix(EWA).T[:, np.newaxis]
# transition_covariance
Q = 1e-4
# observation_covariance
R = 1 # max error = 3
# initial_state_mean
X0 = 0
# initial_state_covariance
P0 = 1
# Kalman-Filter initialization
kf = KalmanFilter(n_dim_obs=1, n_dim_state=1,
                  transition_matrices=F,
                  observation_matrices=H,
                  transition_covariance=Q,
                  observation_covariance=R,
                  initial_state_mean=X0,
                  initial_state_covariance=P0)
# Creating the observation based on EWC and the constant intercept
z = EWC[:] # copy the list (not just assign the reference!)
z[:] = [x - const_intercept for x in z]
# Filtering
state_means, state_covs = kf.filter(z) # the estimation for the EWC data minus constant intercept
# Restore EWC based on EWA and estimated parameters
EWC_restored = np.multiply(EWA, state_means[:, 0]) + const_intercept
# Plots
plt.figure(1)
ax1 = plt.subplot(211)
plt.plot(state_means[:, 0], label="Slope")
plt.grid()
plt.legend(loc="upper left")
ax2 = plt.subplot(212)
plt.plot(const_intercept*np.ones((n, 1)), label="Intercept")
plt.grid()
plt.legend(loc="upper left")
# check the result
plt.figure(2)
plt.plot(EWC, label="EWC original")
plt.plot(EWC_restored, label="EWC restored")
plt.grid()
plt.legend(loc="upper left")
plt.show()

BFGS Fails to Converge

The model I'm working on is a multinomial logit choice model. It's a very specific dataset, so existing MNLogit libraries don't fit my data.
So basically, it's a very complex function which takes 11 parameters and returns a (negative) log-likelihood value. I then need to find the optimal parameter values that minimize it using scipy.optimize.minimize.
Here are the problems that I encounter with different methods:
'Nelder-Mead': it works well and always gives me the correct answer. However, it's EXTREMELY slow. For another function with a more complicated setup, it takes 15 hours to get to the optimal point. At the same time, the same function takes only 1 hour in Matlab using fminunc (which uses BFGS by default).
'BFGS': This is the method used by Matlab. It works well for simple functions. However, for the function that I have, it always fails to converge and returns 'Desired error not necessarily achieved due to precision loss.' I've spent a lot of time playing around with the options but still could not get it to work.
'Powell': It quickly converges successfully but returns a wrong answer. The code is printed below (x0 is the correct answer; Nelder-Mead works from any initial value), and you can get the data here: https://www.dropbox.com/s/aap2dhor5jyxy94/data.csv
Thanks!
import pandas as pd
import numpy as np
from scipy.optimize import minimize
# https://www.dropbox.com/s/aap2dhor5jyxy94/data.csv
df = pd.read_csv('data.csv', index_col=0)
dfhh = df.hh
B = df.loc[:, 'b0':'b4'].values  # NT*5
P = df.loc[:, 'p1':'p4'].values  # NT*4
F = df.loc[:, 'f1':'f4'].values  # NT*4
SDV = df.loc[:, 'lagb1':'lagb4'].values
def Li(x):
    b1 = x[0]   # coeff on prices
    b2 = x[1]   # coeff on features
    a = x[2:7]  # the remaining 4 values are the alphas
    E = np.exp(a + b1*P + b2*F)  # (1*4) + (NT*4) + (NT*4): build matrix (NT*J) for each exp()
    E = np.insert(E, 0, 1, axis=1)  # (NT*5)
    denom = E.sum(1)
    return -np.log((B * E).sum(1) / denom).sum()
x0 = np.array([-32.31028223, 0.23965953, 0.84739154, 0.25418215,-3.38757007,-0.38036966])
np.random.seed(0)
x0 = x0 + np.random.rand(6)
minL = minimize(Li, x0, method='Nelder-Mead',options={'xtol': 1e-8, 'disp': True})
# minL = minimize(Li, x0, method='BFGS')
# minL = minimize(Li, x0, method='Powell', options={'xtol': 1e-12, 'ftol': 1e-12})
print(minL)
Update: 03/07/14 Simpler Version of the Code
Now Powell works well with a very small tolerance; however, Powell is slower than Nelder-Mead in this case. BFGS still fails to work.