Can I use ANOVA to compare coefficients significance from three different conditions (in linear regression)? - linear-regression

I'm trying to compare coefficients from linear regressions run on three different groups (A, B, C), and I want to test whether one group's coefficient is significantly higher than the others'. Can I use ANOVA for this?
The data with the coefficients looks like this:

Condition   IV      DV        coef      p value
A           force   moving    0.1833    0.008
B           force   moving   -0.0758    0.001
C           force   moving    0.4973    0.000
Additional Info
I used a 7-point Likert scale for the survey.
I ran a linear regression with multiple IVs and a single DV.
And the table above is the data I got from linear regression.
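Not an answer from the thread, only a sketch of one common alternative, assuming the raw survey responses (not just the fitted coefficients) are available and that MATLAB's Statistics and Machine Learning Toolbox is installed: pool the three conditions and test whether the force slope differs by condition through an interaction term. The variable names force, moving and condition are hypothetical.
condition = categorical(condition);           % group labels A/B/C
tbl = table(force, moving, condition);        % pooled data from all three conditions
mdl = fitlm(tbl, 'moving ~ force*condition'); % force:condition terms capture slope differences
anova(mdl)                                    % F-test on the force:condition interaction
A significant force:condition interaction would indicate that the force coefficient differs between at least two of the conditions.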

Related

Application of tf function for large systems

I'm looking at the transfer functions (transfer matrix) of a multiple-input single-output (MISO) system. The system has 32 dynamic states, four inputs, and one output. The system A, B, C, and D matrices are calculated in Matlab code, and the state-space model is created as sys=ss(A,B,C,D).
My question is: why are the transfer functions obtained by applying the "tf" function to "sys" (a 1x4 transfer-function array) different from those obtained by applying "tf" to the individual subsystems "sys(1)", "sys(2)", "sys(3)", and "sys(4)", even though the system matrices of "sys(1)" to "sys(4)" completely match the corresponding matrices and matrix columns of "sys"?
I tried the same thing for a simple 4th-order system, and the results match completely. I also tried it for a 32-state system (of the same dimension as my original system) in which all system matrices are generated by the randn function. I then extracted the transfer-function coefficients using cell2mat(T.den) and cell2mat(T.num) for sys and for sys(1) to sys(4). All the denominator coefficients match, and the numerator coefficients match as well, except for one of the transfer functions.
It should be mentioned that in the original example the matrix A is singular, whereas in the synthetic example (of dimension 32) the condition number of the system matrix is around 120.
You can find the code below.
Your help is highly appreciated.
clear all;
clc;
%% Building the system matrices
A = randn(32,32);
B = randn(32,4);
C = randn(1,32);
D = randn(1,4);
sys = ss(A,B,C,D);   % creating the state-space model
TFF = tf(sys);       % calculating the transfer matrix
%% Extracting the numerator and denominator coefficients of the 4 transfer
% functions from the full transfer matrix
for i = 1:4
    Ti = TFF(i);
    Tin(i,:) = cell2mat(Ti.num);   % numerator coefficients
    Tid(i,:) = cell2mat(Ti.den);   % denominator coefficients
    clear Ti
end
%% Calculating the numerator and denominator coefficients from the individual
% transfer functions
TF1 = tf(sys(1));
T1n = cell2mat(TF1.num);
T1d = cell2mat(TF1.den);
TF2 = tf(sys(2));
T2n = cell2mat(TF2.num);
T2d = cell2mat(TF2.den);
TF3 = tf(sys(3));
T3n = cell2mat(TF3.num);
T3d = cell2mat(TF3.den);
TF4 = tf(sys(4));
T4n = cell2mat(TF4.num);
T4d = cell2mat(TF4.den);
num2str([T1n.' - Tin(1,:).'])   % error between the numerator coefficients
                                % of TF1 from the two approaches
num2str([T2n.' - Tin(2,:).'])
num2str([T3n.' - Tin(3,:).'])
num2str([T4n.' - Tin(4,:).'])
num2str([T1d.' - Tid(1,:).'])
num2str([T2d.' - Tid(2,:).'])
num2str([T3d.' - Tid(3,:).'])
num2str([T4d.' - Tid(4,:).'])
It is a mixture of a few things. The resulting model is not guaranteed to be minimal after such model conversions, so you get some discrepancies; moreover, a SISO subsystem extracted from the MIMO system is not guaranteed to agree with a SISO conversion of that channel, since the conversion can remove poles belonging to modes that the selected input does not act on.
However, beyond order 5 or 6, transfer-matrix representations are numerically terrible to perform operations with, so try to avoid them. Check other properties of the subsystems, such as Bode plots, for comparison.
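Not part of the original answer, just a sketch of the suggested check (assuming the Control System Toolbox): compare one channel of the full model against its separately converted SISO version in the frequency domain, and see how many states survive a minimal realisation.
sys1 = sys(1);                 % first-input channel of the MISO model
bode(sys1, tf(sys1));          % the two responses should overlay even if the polynomial coefficients differ
order(minreal(sys1))           % states uncontrollable from input 1 are removed here
If the Bode plots agree and minreal removes states, the coefficient mismatch is a representation issue rather than a modelling error.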

Stochastic spread method for pairs trading by Elliott et al. (2005) - Kalman filter + EM algorithm in MATLAB, am I doing something wrong?

I am implementing the stochastic spread method for pairs trading by Elliott et al. (2005).
The procedure consists of modeling the spread between two stocks, log(P1)-log(P2), as a mean reverting process, calibrated from market observations.
The hidden state process for the spread can be written like this:
x_{t+1} = A + Bx_t + Ce_{t+1}
The observation process is:
y_t = x_t + D*w_t
Both e_t and w_t are i.i.d. Gaussian N(0,1).
Elliott gives the Kalman filter equations in his paper, which I have implemented in my code for the updating step:
function [xt_t, st_t, xt_tm, kt, st_tm] = EMupdate(DATA_t, xt_t_m1, st_t_m1, A, B, C2, D2)
st_tm = B^2*st_t_m1 + C2;             % predicted variance s_{t|t-1}
kt    = st_tm/(st_tm + D2);           % Kalman gain k_t
xt_tm = A + B*xt_t_m1;                % one-step prediction x_{t|t-1}
xt_t  = xt_tm + kt*(DATA_t - xt_tm);  % filtered state x_{t|t}
st_t  = st_tm - kt*st_tm;             % filtered variance s_{t|t}
end
where
xt_t is x_{t|t}
xt_t_m1 is x_{t-1|t-1}
xt_tm is x_{t|t-1}
st_t is s_{t|t} (the MSE, denoted as P in e.g. Hamilton (1994))
st_t_m1 is s_{t-1|t-1}
st_tm is s_{t|t-1}
kt is the kalman gain for time t
DATA_t is the observed data for time t, y_t
A, B, C2, D2 are the estimated parameters (which I have estimated using the EM algorithm in another code).
This update step is done every time a new data point arrives. I am storing all the x's, s's and k's in vectors. I am supposed to compare y_t with x_{t|t-1}, and given a large deviation between the two, a trade should be initiated. However, the two follow each other very closely, and I am unsure whether I have done something wrong.
Can someone see if I am doing something wrong?
Please tell me if I should link more of my code.
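For reference, a minimal sketch (my own, not from the question) of the loop I would put around the update step, storing the one-step prediction x_{t|t-1} so it can be compared with the observation y_t; y is assumed to hold the observed spread:
T  = numel(y);
xf = zeros(T,1); P = zeros(T,1); xp = zeros(T,1); K = zeros(T,1); Pp = zeros(T,1);
xf(1) = y(1);  P(1) = D2;                      % initialisation as in step 2a below
for t = 2:T
    [xf(t), P(t), xp(t), K(t), Pp(t)] = EMupdate(y(t), xf(t-1), P(t-1), A, B, C2, D2);
end
plot(2:T, y(2:T), 2:T, xp(2:T)); legend('y_t', 'x_{t|t-1}');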
UPDATE: My procedure: (P is the same as s above)
1. To generate the spread between two stocks, I take the difference between the log-prices: y = log(p1) - log(p2).
2. I set a training period of 252 days, in which I estimate the initial parameters (A, B, C2 and D2) using the EM algorithm. I run the EM algorithm on all the data for the training period, that is y(1), y(2), ..., y(252), together with initial guesses for A, B, C2 and D2:
2a. I set x_{1|1} = y(1). Furthermore I set the MSE, P_{1|1} = D2, my initial guess for D^2.
2b. I recursively calculate the Kalman filter quantities x_{t|t}, x_{t+1|t}, P_{t|t}, P_{t+1|t} and k_t for all t = 1...252 (the entire training period) using my initial guesses for A, B, C2 and D2.
2c. After I have calculated the Kalman filter quantities for the entire training period, I (backward) recursively calculate the Kalman smoother quantities for the entire training period as well, t = 1...252. These are x_{t|T}, P_{t|T}, P_{t,t-1|T} and j_t.
3. I then compute the log-likelihood value and the updated values for A, B, C2 and D2. Then I repeat the steps from 1 until the log-likelihood converges and I obtain optimal values for A, B, C2 and D2.
Is it correct to calculate Kalman filters for the entire training period before starting to calculate Kalman smoothers? Or should I, for example, calculate Kalman filters up till t=2, then Kalman smoothers for T=2, then Kalman filters up till t=3, then smoothers for T=3 etc.?
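For reference, a textbook scalar Rauch-Tung-Striebel backward pass for this model looks like the sketch below (my own transcription, not taken from Elliott's paper), using the xf, P, xp and Pp vectors from the filtering sketch above. Because it needs x_{t+1|T} and P_{t+1|T}, the backward pass can only start once the forward filter has covered the whole training window.
xs = xf;  Ps = P;                               % smoothed values, initialised at t = T
for t = T-1:-1:1
    J     = P(t)*B / Pp(t+1);                   % smoother gain j_t
    xs(t) = xf(t) + J*(xs(t+1) - xp(t+1));      % x_{t|T}
    Ps(t) = P(t)  + J^2*(Ps(t+1) - Pp(t+1));    % P_{t|T}
end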
Now I have values for A, B, C2 and D2 and can begin my test period, also 252 days. I don't update my estimates for A, B, C2 and D2, but keep them constant. For each new observation I compute the Kalman filter quantities (the same as in 2b). Finally I compare y(t) to x_{t|t-1} over the test period.
My results look like this:
While a paper by Chen, Ren and Lu has the following results:
NB: Not the same security... but the difference is obvious nonetheless.
It seems that either you're underestimating the noise variance from the training data, or your training data is not stationary within your training window. Try increasing the noise variance and you'll see that the filter actually smooths the time series. Your current under-estimation of the noise variance leads the Kalman filter to "forget" the past and give the last sample a high weighting.
Checking this is quite easy: increase the measurement noise/error variance (the matrix R in the Kalman filter) and see how it affects the output.
If the model is not linear-Gaussian, the Kalman filter will not be optimal. However, it should still smooth your data, so keep "training" it until it provides acceptable predictions.
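A quick sketch of that check, reusing the filtering loop from above with an inflated measurement variance (the factor 10 is an arbitrary choice):
D2_infl = 10*D2;
xf2 = xf;  P2 = P;  xp2 = xp;
for t = 2:T
    [xf2(t), P2(t), xp2(t)] = EMupdate(y(t), xf2(t-1), P2(t-1), A, B, C2, D2_infl);
end
plot(2:T, y(2:T), 2:T, xp(2:T), 2:T, xp2(2:T));
legend('y_t', 'x_{t|t-1}, original D2', 'x_{t|t-1}, inflated D2');
With the larger D2 the one-step prediction should visibly smooth y_t instead of tracking it almost exactly.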

sequence prediction using HMM Matlab

I'm currently learning murphyk's toolbox for Hidden Markov Models. However, I have a problem determining my model's coefficients and also choosing the algorithm for sequence prediction by log-likelihood.
My Scenario:
I have a flying bird's trajectory in 3-D space, i.e. its X, Y and Z coordinates, which falls into the continuous-HMM category. I have 200 observations of the flying bird, i.e. 500 rows of trajectory data, and I want to predict the sequence. I want to sample that as 20 data points, i.e. after every 10 points. So my first question is: are the following parameters valid for my case?
O = 3; %Number of coefficients in a vector
T = 20; %Number of vectors in a sequence
nex = 50; %Number of sequences
M = 2; %Number of mixtures
Q = 20; %Number of states
And the second question is: what algorithm is appropriate for sequence prediction, and is training compulsory for that?
From what I understand, I'm assuming you're training 200 different classes (HMMs) and each class has 500 training examples (observation sequences).
O is the dimensionality of vectors, seems to be correct.
There is no need to have a fixed T, it depends on the observation sequences you have.
M is the number of multivariate Gaussians (or mixtures) in the GMM of a state. More mixtures will fit your data better and give you better accuracy, but at the cost of performance. Choose a suitable value.
N does not need to be equal to T. For the best number of states N (your Q), you'll have to benchmark and see for yourself:
Determining the number of hidden states in a Hidden Markov Model
Yes, you have to train your classes using the Baum-Welch algorithm, optionally preceded by something like the segmental k-means procedure. After that you can easily perform isolated unit recognition using Forward/Backward probability or Viterbi probability by simply selecting the class with the highest probability.
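For concreteness, a training/scoring sketch along the lines of the toolbox's mhmm_em_demo (function names and signatures are written from memory, so treat them as assumptions and check them against the toolbox). Here data is assumed to be an O x T x nex array of training sequences for one class, and newseq an O x T sequence to score:
prior0    = normalise(rand(Q,1));
transmat0 = mk_stochastic(rand(Q,Q));
[mu0, Sigma0] = mixgauss_init(Q*M, reshape(data, O, []), 'full');
mu0     = reshape(mu0, [O Q M]);
Sigma0  = reshape(Sigma0, [O O Q M]);
mixmat0 = mk_stochastic(rand(Q,M));
% Baum-Welch (EM) training for this class
[LL, prior1, transmat1, mu1, Sigma1, mixmat1] = ...
    mhmm_em(data, prior0, transmat0, mu0, Sigma0, mixmat0, 'max_iter', 10);
% Score a new sequence against the trained class
loglik = mhmm_logprob(newseq, prior1, transmat1, mu1, Sigma1, mixmat1);
Repeating this per class and picking the class with the highest loglik gives the isolated-unit recognition described above.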

Variable levels of smoothing within the same Matlab matrix

I currently have a large matrix M (~100x100x50 elements) containing both positive and negative values. At the moment, if I want to smooth this matrix, I use the smooth3 function to apply a Gaussian kernel over the entire 3-D matrix.
What I want to achieve is a variable level of smoothing within this matrix - i.e. different parts of the matrix M are smoothed with different values of sigma depending on the value in a similar 3-D matrix, d (with values ranging from 0 to 1). Where d is 0, no smoothing occurs; where d is 1, a maximum level of smoothing occurs.
The fact that the matrix is 3-D is incidental. Smoothing in 3 dimensions would be nice but is not essential, and my current code (which performs various other manipulations) handles each of the 50 slices of M separately anyway. I am happy to replace smooth3 with a convolution of M with a Gaussian kernel and to perform this convolution on each slice individually. What I can't figure out is how to vary the sigma of this Gaussian kernel (based on d) according to its location in M and produce the output accordingly.
An alternative approach might be to use matrix d as a mask for a very smooth version of the matrix, Ms, and somehow combine M and Ms to give an equivalent result. However, I'm not convinced that this will work, as I can't think of a way to combine M and Ms that won't give artefacts from either M or Ms when 0 < d < 1... any thoughts?
[I'm using 2009b, and only have access to the Signal Processing toolbox.]
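One possible work-around using only base MATLAB (my own sketch, not from the thread; the sigma ladder is an arbitrary choice): pre-smooth each slice at a few sigma levels and, at every pixel, linearly interpolate between the two nearest levels according to d.
sigmas = [0.5 1 2 4];                        % hypothetical ladder of smoothing levels
nLev   = numel(sigmas) + 1;                  % level 1 is the unsmoothed slice (sigma = 0)
out    = zeros(size(M));
for k = 1:size(M,3)
    slice = M(:,:,k);
    dk    = d(:,:,k);
    stack = zeros([size(slice) nLev]);
    stack(:,:,1) = slice;
    for s = 1:numel(sigmas)
        hw = ceil(3*sigmas(s));
        g  = exp(-(-hw:hw).^2 / (2*sigmas(s)^2));
        g  = g / sum(g);                     % normalised 1-D Gaussian
        stack(:,:,s+1) = conv2(g, g, slice, 'same');   % separable 2-D smoothing
    end
    pos = 1 + dk*(nLev - 1);                 % map d in [0,1] to a fractional level
    lo  = floor(pos);  hi = min(lo + 1, nLev);  w = pos - lo;
    [r, c] = ndgrid(1:size(slice,1), 1:size(slice,2));
    out(:,:,k) = (1 - w).*stack(sub2ind(size(stack), r, c, lo)) ...
               + w.*stack(sub2ind(size(stack), r, c, hi));
end
This is only piecewise-linear in sigma and ignores how the kernels should blend between regions, but it avoids any toolbox dependency.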
You should have a look at the Guided Image Filter. It is a computationally efficient generalization of the bilateral filter.
http://research.microsoft.com/en-us/um/people/jiansun/papers/guidedfilter_eccv10.pdf
It will allow you to do proper smoothing based on your guidance matrix.
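For what it's worth, the grey-scale guided filter from that paper is only a few lines per slice. Below is my own rough transcription (the window radius, the regularisation constant, and the use of d as the guidance image are assumptions; borders are handled crudely by the plain 'same' convolution):
k = 1;                              % any slice index
I = d(:,:,k);                       % guidance image (the smoothing-control matrix)
p = M(:,:,k);                       % input slice to be filtered
r = 4;  eps_gf = 1e-3;              % window radius and regularisation (assumed values)
box   = ones(2*r+1) / (2*r+1)^2;
fmean = @(X) conv2(X, box, 'same'); % windowed mean
mean_I = fmean(I);   mean_p = fmean(p);
var_I  = fmean(I.*I) - mean_I.^2;
cov_Ip = fmean(I.*p) - mean_I.*mean_p;
a = cov_Ip ./ (var_I + eps_gf);     % per-window linear coefficients in q = a*I + b
b = mean_p - a.*mean_I;
q = fmean(a).*I + fmean(b);         % filtered output slice
Larger eps_gf gives stronger smoothing where the guidance image is flat, which is the knob that plays the role of sigma here.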

Calculating an inverse matrix in Matlab

I'm running an optimization algorithm that requires calculation of the inverse of a matrix. The goal of the algorithm is to eliminate negative values from the matrix A and obtain the new matrix B. Basically, I start with known square matrices B and C of the same size.
I start by calculating the matrix A which is equal to:
A = B^-1 * C
Or in Matlab:
A = B\C;
I use this because Matlab told me B\C is more accurate than inv(B)*C.
The negative values in A are then divided by two, and A is then normalised so that its rows have length 1. Using this new A, I calculate a new B with:
(1/N) * A * C' = B^-1
where N is just a scaling factor (the number of columns in A). This new B is then used again in the first step, and these iterations continue until the negatives in A are gone.
My problem is I have to calculate B from the second equation and then normalise it.
invB = (1/N)*A*C';
B = inv(invB);
I've been calculating B as inv(B^-1), but after a few iterations I start getting messages that B^-1 is "close to singular or badly scaled."
This algorithm actually works for smaller matrices (around 70x70) but when it gets up to about 500x500 I start getting these messages.
Are there any better ways to calculate inv(B^-1)?
You should definitely heed warnings about singular matrices. Results in numerical linear algebra tend to break down as you move toward matrices with high condition numbers. The underlying idea is: if
A*b_1 = c
and we're actually solving the problem (because we are using approximate numbers when we use computers)
(A + matrix error)*b_2 = (c + vector error)
how close are b_1 and b_2 as a function of the matrix and vector errors? When A has a small condition number, b_1 and b_2 are close. When A has a large condition number, b_1 and b_2 are not close.
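As a small illustration (my own, not from the original answer) of how a large condition number amplifies tiny errors:
M_ill = hilb(12);                        % a classically ill-conditioned matrix
rhs   = M_ill*ones(12,1);                % exact solution is a vector of ones
b1 = M_ill \ rhs;
b2 = M_ill \ (rhs + 1e-10*randn(12,1));  % tiny perturbation of the right-hand side
fprintf('cond = %.2e   ||b1 - b2|| = %.2e\n', cond(M_ill), norm(b1 - b2));
A perturbation of order 1e-10 in the data produces a change in the solution that is many orders of magnitude larger than the perturbation itself.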
There is an informative piece of analysis you could do on your algorithm. At each iteration, after you've found B, use Matlab to find its condition number. This is
cond(B)
You will likely see the number climb rapidly. This indicates that every time you iterate your algorithm, you should trust your result for B less and less.
Problems like this crop up all the time in numerical mathematics. If you'll be working with numerical algorithms frequently, you should take some time to familiarize yourself with the role of condition numbers in the field and with preconditioning techniques, as mentioned in the other answer. My preferred text for this is "Numerical Linear Algebra" by Lloyd Trefethen, but any text on numerical linear algebra should address some of these issues.
Best of luck,
Andrew
The main issue is that your matrix has a high condition number (i.e. rcond(B) is really small in your case). This is probably due to the iterative structure of your algorithm: as you do each iteration, your small singular values get smaller and smaller, so your condition number grows exponentially. You should look into preconditioning to avoid this kind of behavior.
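Not from the original answers, but a small sketch of how to monitor the problem and soften the explicit inversion at each iteration (the pinv tolerance is an assumption to tune, not a recommended value):
invB = (1/N)*A*C';
fprintf('cond of B^-1 at this iteration: %g\n', cond(invB));   % watch this grow
tol = 1e-10*norm(invB);          % drop singular values that are numerically meaningless
B   = pinv(invB, tol);           % damped alternative to inv(invB)
Whether the truncation is acceptable depends on what the normalisation and the rest of the algorithm need from B, so treat this as a diagnostic experiment rather than a fix.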