Understanding how pseudo-random numbers in MATLAB imply statistical independence

Consider the following MATLAB code, in which I generate some data using a pseudo-random number generator.
I would like your help to understand "how" random are these numbers from a statistical point of view, in the terms I explain below.
I first set some parameters
%%%%%%%%Parameters
clear
rng default
Xsup=-1:6;
Zsup=1:10;
n_m=200;
n_w=200;
R=n_m;
Then I generate the data
%%%%%%%%Creation of data [XZ,etapair,zetapair,etasingle,zetasingle]
%Vector X of dimension n_mx1
idX=randi(size(Xsup,2),n_m,1); %n_mx1
X=Xsup(idX).'; %n_mx1
%Vector Z of dimension n_wx1
idZ=randi(size(Zsup,2),n_w,1);
Z=Zsup(idZ).'; %n_wx1
%Combine X and Z in a matrix XZ of dimension (n_m*n_w)x2,
%which lists all possible combinations of values in X and Z
[cX, cZ] = ndgrid(X,Z);
XZ = [cX(:), cZ(:)]; %(n_m*n_w)x2
%Vector etapair of dimension (n_m*n_w)x1
etapair=randn(n_m*n_w,1); %(n_m*n_w)x1
%Vector zetapair of dimension (n_m*n_w)x1
zetapair=randn(n_m*n_w,1); %(n_m*n_w)x1
%Vector etasingle of dimension (n_m*n_w)x1
etasingle=max(randn(n_m,R),[],2); %n_mx1
etasingle=repmat(etasingle, n_w,1); %(n_m*n_w)x1
%Vector zetasingle of dimension (n_m*n_w)x1
zetasingle=max(randn(n_w,R),[],2); %n_wx1
zetasingle=kron(zetasingle, ones(n_m,1)); %(n_m*n_w)x1
Let me now translate these draws into statistical terms:
For t=1,...,n_m*n_w, XZ(t,1) can be thought of as a realisation of a random variable X_t
For t=1,...,n_m*n_w, XZ(t,2) can be thought of as a realisation of a random variable Z_t
For t=1,...,n_m*n_w, etapair(t) can be thought of as a realisation of a random variable E_t
For t=1,...,n_m*n_w, zetapair(t) can be thought of as a realisation of a random variable Q_t
For t=1,...,n_m*n_w, etasingle(t) can be thought of as a realisation of a random variable Y_t
For t=1,...,n_m*n_w, zetasingle(t) can be thought of as a realisation of a random variable S_t
My belief was that the pseudo-random number generator in MATLAB allows one to claim that
(X_1,X_2,..., Z_1,Z_2,...,E_1,E_2,..., Q_1,Q_2...,Y_1,Y_2,...,S_1,S_2,...) are mutually independent
as explained here
As a check of this hypothetical claim, I define W_t:=-E_t-Q_t+Y_t+S_t and empirically compute Pr(W_t<=1|X_t=5, Z_t=1)
If mutual independence holds, then Pr(W_t<=1|X_t=5, Z_t=1)=Pr(W_t<=1) and their empirical counterparts below, named option1 and option2, should be ALMOST the same.
%option 1
num1=zeros(n_m*n_w,1);
for h=1:n_m*n_w
if -etapair(h)-zetapair(h)+etasingle(h)+zetasingle(h)<=1 && XZ(h,1)==5 && XZ(h,2)==1
num1(h)=1;
end
end
den1=zeros(n_m*n_w,1);
for h=1:n_m*n_w
if XZ(h,1)==5 && XZ(h,2)==1
den1(h)=1;
end
end
option1=sum(num1)/sum(den1);
%option 2
num2=zeros(n_m*n_w,1);
for h=1:n_m*n_w
if -etapair(h)-zetapair(h)+etasingle(h)+zetasingle(h)<=1
num2(h)=1;
end
end
option2=sum(num2)/(n_m*n_w);
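As an aside (not required for the question), both loops above can be replaced by logical indexing; a minimal sketch that reproduces option1 and option2:
%Vectorized equivalents of the two loop-based estimates
W = -etapair - zetapair + etasingle + zetasingle; %(n_m*n_w)x1
cond = XZ(:,1)==5 & XZ(:,2)==1;                   %conditioning event
option1v = sum(W<=1 & cond)/sum(cond);            %empirical Pr(W_t<=1 | X_t=5, Z_t=1)
option2v = mean(W<=1);                            %empirical Pr(W_t<=1)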
Question: is the difference between option1 (=0.0021) and option2 (=0.0012) covered by the "ALMOST", or am I doing something wrong?

By the very nature of observing random events, you cannot guarantee theoretically accurate results for a given empirical trial.
You have set rng default at the start of your script, which means you will always get the same result (option1 = 0.0021, option2 = 0.0012).
Running your script many times and averaging the results should bring us close to the theoretical value.
kk = 10000;
option1 = zeros(kk, 1);
option2 = zeros(kk, 1);
for ii = 1:kk
% No need to use 'clear' here. If you were concerned
% for some reason, you could use 'clearvars -except kk option1 option2 ii'
% do not use 'rng default'. Use 'rng shuffle' if anything, but not necessary
Xsup = -1:6;
% ... all your other code
% replace 'option1=...' with 'option1(ii)=...'
% replace 'option2=...' with 'option2(ii)=...'
end
fprintf('Results:\nMean option1 = %f\nMean option2 = %f\n', mean(option1), mean(option2));
Results:
>> Mean option1 = 0.001461
>> Mean option2 = 0.001458
We can see these agree to some degree of accuracy, which can be made arbitrarily high by running enough trials. This is as expected for independent variables.
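If you want to quantify "some degree of accuracy", the Monte Carlo standard errors of the two means can be computed directly from the stored trials; a minimal sketch:
% Standard errors of the two estimated probabilities over kk trials
se1 = std(option1)/sqrt(kk);
se2 = std(option2)/sqrt(kk);
fprintf('SE option1 = %g, SE option2 = %g\n', se1, se2);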
Note: if you have the Parallel Computing Toolbox, this for loop can easily be swapped for a parfor, and you can run the trials many times faster.

Related

Verify Law of Large Numbers in MATLAB

The problem:
If a large number of fair N-sided dice are rolled, the average of the simulated rolls is likely to be close to the mean of 1, 2, ..., N, i.e. the expected value of one die. For example, the expected value of a 6-sided die is 3.5.
Given N, simulate 1e8 N-sided dice rolls by creating a vector of 1e8 uniformly distributed random integers. Return the difference between the mean of this vector and the mean of integers from 1 to N.
My code:
function dice_diff = loln(N)
% the mean of integer from 1 to N
A = 1:N
meanN = sum(A)/N;
% I do not have any idea what I am doing here!
V = randi(1e8);
meanvector = V/1e8;
dice_diff = meanvector - meanN;
end
First of all, make sure every time you ask a question that it is as clear as possible, to make it easier for other users to read.
If you check how randi works, you can see this:
R = randi(IMAX,N) returns an N-by-N matrix containing pseudorandom
integer values drawn from the discrete uniform distribution on 1:IMAX.
randi(IMAX,M,N) or randi(IMAX,[M,N]) returns an M-by-N matrix.
randi(IMAX,M,N,P,...) or randi(IMAX,[M,N,P,...]) returns an
M-by-N-by-P-by-... array. randi(IMAX) returns a scalar.
randi(IMAX,SIZE(A)) returns an array the same size as A.
So, if you want to use randi in your problem, you have to use it like this:
V=randi(N, 1e8,1);
and you need some more changes:
function dice_diff = loln(N)
%the mean of integer from 1 to N
A = 1:N;
meanN = mean(A);
V = randi(N, 1e8,1);
meanvector = mean(V);
dice_diff = meanvector - meanN;
end
For future problems, try using the command
help randi
and MATLAB will explain how the function randi (or any other function) works.
Make sure to check that the code above gives the desired result.
As pointed out, take a closer look at the use of randi(). From the general case
X = randi([LowerInt,UpperInt],NumRows,NumColumns); % UpperInt > LowerInt
you can adapt to dice rolling by
Rolls = randi([1 NumSides],NumRolls,NumSamplePaths);
as an example. Exchanging NumRolls and NumSamplePaths will yield Rolls.', or transpose(Rolls).
According to the Law of Large Numbers, the updated sample average after each roll should converge to the true mean, ExpVal (short for expected value), as the number of rolls (trials) increases. The plot produced by the code below shows this for two sample paths.
To get the sample mean for each number of dice rolls, I used arrayfun() with
CumulativeAvg1 = arrayfun(@(jj)mean(Rolls(1:jj,1)),[1:NumRolls]);
which is equivalent to using the cumulative sum, cumsum(), to get the same result.
CumulativeAvg1 = (cumsum(Rolls(:,1))./(1:NumRolls).'); % equivalent
% MATLAB R2019a
% Create Dice
NumSides = 6; % positive nonzero integer
NumRolls = 200;
NumSamplePaths = 2;
% Roll Dice
Rolls = randi([1 NumSides],NumRolls,NumSamplePaths);
% Output Statistics
ExpVal = mean(1:NumSides);
CumulativeAvg1 = arrayfun(@(jj)mean(Rolls(1:jj,1)),[1:NumRolls]);
CumulativeAvgError1 = CumulativeAvg1 - ExpVal;
CumulativeAvg2 = arrayfun(@(jj)mean(Rolls(1:jj,2)),[1:NumRolls]);
CumulativeAvgError2 = CumulativeAvg2 - ExpVal;
% Plot
figure
subplot(2,1,1), hold on, box on
plot(1:NumRolls,CumulativeAvg1,'b--','LineWidth',1.5,'DisplayName','Sample Path 1')
plot(1:NumRolls,CumulativeAvg2,'r--','LineWidth',1.5,'DisplayName','Sample Path 2')
yline(ExpVal,'k-')
title('Average')
xlabel('Number of Trials')
ylim([1 NumSides])
subplot(2,1,2), hold on, box on
plot(1:NumRolls,CumulativeAvgError1,'b--','LineWidth',1.5,'DisplayName','Sample Path 1')
plot(1:NumRolls,CumulativeAvgError2,'r--','LineWidth',1.5,'DisplayName','Sample Path 2')
yline(0,'k-')
title('Error')
xlabel('Number of Trials')

Solving system of equations on MATLAB, when a constant exists in variable matrix?

How do I solve the following system of equations on MATLAB when one of the elements of the variable vector is a constant? Please do give the code if possible.
More generally, if the solution is to use symbolic math, how will I go about generating a large number of variables, say 12 (rather than just two), even before solving them?
For example, create a number of symbolic variables using syms, and then make the system of equations like below.
syms a1 a2
A = [matrix]
x = [1;a1;a2];
y = [1;0;0];
eqs = A*x == y
sol = solve(eqs,[a1, a2])
sol.a1
sol.a2
In case you have a system with many variables, you could define all the symbols using syms, and solve it like above.
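For instance, with 12 unknowns you can generate all the symbols in a single call instead of listing them by hand; below is a minimal sketch, where the 12x13 coefficient matrix is a made-up placeholder:
a = sym('a',[12 1]);      % creates a1, a2, ..., a12 in one call
A = randi(10,12,13);      % placeholder coefficient matrix (hypothetical)
x = [1; a];               % the first element of x is the known constant
y = zeros(12,1);
sol = solve(A*x == y, a); % access the results via sol.a1, ..., sol.a12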
You could also perform a parameter optimization with fminsearch. First you have to define a cost function, in a separate function file, in this example called cost_fcn.m.
function J = cost_fcn(p)
% make sure p is a vector
p = reshape(p, [length(p) 1]);
% system of equations, can be linear or nonlinear
A = magic(12); % your system, I took some arbitrary matrix
sol = A*p;
% the goal of the system of equations to reach, can be zero, or some other
% vector
goal = zeros(12,1);
% calculate the error
error = goal - sol;
% Use a cost criterion, e.g. sum of squares
J = sum(error.^2);
end
This cost function will contain your system of equations and the goal solution. This can be any kind of system. The vector p will contain the parameters that are being estimated, which will be optimized, starting from some initial guess. To do the optimization, you will have to create a script:
% initial guess, can be zeros, or some other starting point
p0 = zeros(12,1);
% do the parameter optimization
p = fminsearch(@cost_fcn, p0);
In this case p0 is the initial guess, which you provide to fminsearch. The values of this initial guess are then adjusted until a minimum of the cost function is found. When the parameter optimization is finished, p will contain the parameters that result in the lowest error for your system of equations. It is however possible that this is only a local minimum, especially if there is no exact solution to the problem.
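If you are unsure whether fminsearch found an (almost) exact solution or only a compromise, you can request the achieved cost as a second output; a minimal sketch:
% Jmin close to zero suggests an (almost) exact solution;
% a clearly positive Jmin suggests a least-squares compromise or a local minimum
[p, Jmin] = fminsearch(@cost_fcn, p0);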
Your system is over-constrained, meaning you have more equations than unknowns, so you can't solve it exactly. What you can do is find a least-squares solution, using mldivide. First re-arrange your equations so that you have all the constant terms on the right side of the equal sign, then use mldivide:
>> A = [0.0297 -1.7796; 2.2749 0.0297; 0.0297 2.2749]
A =
0.029700 -1.779600
2.274900 0.029700
0.029700 2.274900
>> b = [1-2.2749; -0.0297; 1.7796]
b =
-1.274900
-0.029700
1.779600
>> A\b
ans =
-0.022191
0.757299

Vectorizing the solution of a linear equation system in MATLAB

Summary: This question deals with the improvement of an algorithm for the computation of linear regression.
I have a 3D array (dlMAT) representing monochrome photographs of the same scene taken at different exposure times (the vector IT). Mathematically, every vector along the 3rd dimension of dlMAT represents a separate linear regression problem that needs to be solved. The equation whose coefficients need to be estimated is of the form:
DL = R*IT^P, where DL and IT are obtained experimentally and R and P must be estimated.
The above equation can be transformed into a simple linear model after applying a logarithm:
log(DL) = log(R) + P*log(IT) => y = a + b*x
Presented below is the most "naive" way to solve this system of equations, which essentially involves iterating over all "3rd dimension vectors" and fitting a polynomial of order 1 to (IT, DL(ind1,ind2,:)):
%// Define some nominal values:
R = 0.3;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(3)+P;
rMAT = 0.1*randn(3)+R;
%// Generate "fake" observation data:
dlMAT = bsxfun(@times,rMAT,bsxfun(@power,permute(IT,[3,1,2]),pMAT));
%// Regression:
sol = cell(size(rMAT)); %// preallocation
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
end
end
fittedP = cellfun(@(x)x(1),sol); %// Estimate of pMAT
fittedR = cellfun(@(x)exp(x(2)),sol); %// Estimate of rMAT
The above approach seems like a good candidate for vectorization, since it does not utilize MATLAB's main strength, which is matrix operations. For this reason, it does not scale very well and takes much longer to execute than I think it should.
There exist alternative ways to perform this computation based on matrix division, as demonstrated here and here, which involve something like this:
sol = [ones(size(x)),log(x)]\log(y);
That is, appending a vector of 1s to the observations, followed by mldivide to solve the equation system.
The main challenge I'm facing is how to adapt my data to the algorithm (or vice versa).
Question #1: How can the matrix-division-based solution be extended to solve the problem presented above (and potentially replace the loops I am using)?
Question #2 (bonus): What is the principle behind this matrix-division-based solution?
The secret ingredient behind the solution that includes matrix division is the Vandermonde matrix. The question discusses a linear problem (linear regression), and such problems can always be formulated as a matrix problem, which \ (mldivide) can solve in a mean-square error sense. Such an algorithm, solving a similar problem, is demonstrated and explained in this answer.
Below is benchmarking code that compares the original solution with two alternatives suggested in chat:
function regressionBenchmark(numEl)
clc
if nargin<1, numEl=10; end
%// Define some nominal values:
R = 5;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(numEl)+P;
rMAT = 0.1*randn(numEl)+R;
%// Generate "fake" measurement data using the relation "DL = R*IT.^P"
dlMAT = bsxfun(@times,rMAT,bsxfun(@power,permute(IT,[3,1,2]),pMAT));
%% // Method1: loops + polyval
disp('-------------------------------Method 1: loops + polyval')
tic; [fR,fP] = method1(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method2: loops + Vandermonde
disp('-------------------------------Method 2: loops + Vandermonde')
tic; [fR,fP] = method2(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method3: vectorized Vandermonde
disp('-------------------------------Method 3: vectorized Vandermonde')
tic; [fR,fP] = method3(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
function [fittedR,fittedP] = method1(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
end
end
fittedR = cellfun(@(x)exp(x(2)),sol);
fittedP = cellfun(@(x)x(1),sol);
function [fittedR,fittedP] = method2(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = flipud([ones(numel(IT),1) log(IT(:))]\log(squeeze(dlMAT(ind1,ind2,:)))).';
end
end
fittedR = cellfun(@(x)exp(x(2)),sol);
fittedP = cellfun(@(x)x(1),sol);
function [fittedR,fittedP] = method3(IT,dlMAT)
N = 1; %// Degree of polynomial
VM = bsxfun(@power, log(IT(:)), 0:N); %// Vandermonde matrix
result = fliplr((VM\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
%// Compressed version:
%// result = fliplr(([ones(numel(IT),1) log(IT(:))]\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
fittedR = exp(real(reshape(result(:,2),size(dlMAT,1),size(dlMAT,2))));
fittedP = real(reshape(result(:,1),size(dlMAT,1),size(dlMAT,2)));
The reason why method 2 can be vectorized into method 3 is essentially that matrix multiplication can be separated by the columns of the second matrix. If A*B produces matrix X, then by definition A*B(:,n) gives X(:,n) for any n. Moving A to the right-hand side with mldivide, this means that the divisions A\X(:,n) can be done in one go for all n with A\X. The same holds for an overdetermined system (linear regression problem), in which there is no exact solution in general, and mldivide finds the matrix that minimizes the mean-square error. In this case too, the operations A\X(:,n) (method 2) can be done in one go for all n with A\X (method 3).
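As a quick illustration of this column-separability (the matrices here are random placeholders, made up just for the check):
% mldivide solves all right-hand-side columns in one call
A = randn(5,2);              % overdetermined: 5 equations, 2 unknowns
X = randn(5,3);              % three right-hand sides
sol_all = A\X;               % one call, all columns
sol_col = A\X(:,2);          % a single column
norm(sol_all(:,2) - sol_col) % ~0: the two solutions agree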
The benefit of the improved algorithm grows with the size of dlMAT: for the case of 500*500 (or 2.5E5) elements, the speedup from Method 1 to Method 3 is about x3500!
It is also interesting to observe the output of profile for each of the three methods (here, for the case of 500*500).
From the above it is seen that rearranging the elements via squeeze and flipud takes up about half (!) of the runtime of Method 2. It is also seen that some time is lost on the conversion of the solution from cells to matrices.
Since the 3rd solution avoids all of these pitfalls, as well as the loops altogether (which mostly means re-evaluation of the code on every iteration), it unsurprisingly results in a considerable speedup.
Notes:
There was very little difference between the "compressed" and the "explicit" versions of Method 3, slightly in favor of the "explicit" version. For this reason the "compressed" version was not included in the comparison.
A solution was attempted where the inputs to Method 3 were gpuArray-ed. This did not provide improved performance (and even somewhat degraded it), possibly due to a wrong implementation, or the overhead associated with copying matrices back and forth between RAM and VRAM.

Simulation of Markov chains

I have the following Markov chain:
This chain shows the states of a Spaceship in the asteroid belt: S1 - serviceable, S2 - broken. 0.12 is the probability of the Spaceship being destroyed by a collision with an asteroid; 0.88 is the probability that a collision will not be critical. I need to find the probability that the ship is still serviceable after the third collision.
The analytical solution gives 0.681. But it is necessary to solve this problem by a simulation method using a modeling tool (MATLAB Simulink, AnyLogic, Scilab, etc.).
Do you know what components should be used to simulate this process in Simulink or any other simulation environment? Any examples or links.
First, we know the three-step transition probability matrix contains the answer (0.6815).
% MATLAB R2019a
P = [0.88 0.12;
0 1];
P3 = P*P*P
P3(1,1) % 0.6815
Approach 1: Requires Econometrics Toolbox
This approach uses the dtmc() and simulate() functions.
First, create the Discrete Time Markov Chain (DTMC) with the probability transition matrix, P, and using dtmc().
mc = dtmc(P); % Create the DTMC
numSteps = 3; % Number of collisions
You can get one sample path easily using simulate(). Pay attention to how you specify the initial conditions.
% One Sample Path
rng(8675309) % for reproducibility
X = simulate(mc,numSteps,'X0',[1 0])
% Multiple Sample Paths
numSamplePaths = 3;
X = simulate(mc,numSteps,'X0',[numSamplePaths 0]) % returns a 4 x 3 matrix
The first row is the X0 row for the starting state (initial condition) of the DTMC. The second row is the state after 1 transition (X1). Thus, the fourth row is the state after 3 transitions (collisions).
% 50000 Sample Paths
rng(8675309) % for reproducibility
k = 50000;
X = simulate(mc,numSteps,'X0',[k 0]); % returns a 4 x 50000 matrix
prob_survive_3collisions = sum(X(end,:)==1)/k % 0.6800
We can bootstrap a 95% Confidence Interval on the mean probability to survive 3 collisions to get 0.6814 ± 0.00069221, or rather, [0.6807 0.6821], which contains the result.
numTrials = 40;
ProbSurvive_3collisions = zeros(numTrials,1);
for trial = 1:numTrials
Xtrial = simulate(mc,numSteps,'X0',[k 0]);
ProbSurvive_3collisions(trial) = sum(Xtrial(end,:)==1)/k;
end
% Mean +/- Halfwidth
alpha = 0.05;
mean_prob_survive_3collisions = mean(ProbSurvive_3collisions)
hw = tinv(1-(0.5*alpha), numTrials-1)*(std(ProbSurvive_3collisions)/sqrt(numTrials))
ci95 = [mean_prob_survive_3collisions-hw mean_prob_survive_3collisions+hw]
% Survival probability as a function of the number of collisions
maxNumCollisions = 10;
numSamplePaths = 50000;
ProbSurvive = zeros(maxNumCollisions,1);
for numCollisions = 1:maxNumCollisions
Xc = simulate(mc,numCollisions,'X0',[numSamplePaths 0]);
ProbSurvive(numCollisions) = sum(Xc(end,:)==1)/numSamplePaths;
end
For a more complex system you'll want to use Stateflow or SimEvents, but for this simple example all you need is a single Unit Delay block (output = 0 => S1, output = 1 => S2), with a Switch block, a Random block, and some comparison blocks to construct the logic determining the next value of the state.
Presumably you must execute the simulation a (very) large number of times and average the results to get a statistically significant output.
You'll need to change the "seed" of the random generator each time you run the simulation.
This can be done by setting the seed to be "now" (or something similar to that).
Alternatively you could quite easily vectorize the model so that you only need to execute it once.
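If you drive the repeated runs from a MATLAB script, re-seeding from the clock is a one-liner; a minimal sketch:
rng('shuffle') % seeds the random number generator based on the current time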
If you want to simulate this, it is fairly easy in MATLAB:
servicable = 1;
t = 0;
while servicable == 1
t = t+1;
servicable = rand() <= 0.88;
end
Now t represents the number of steps before the ship breaks.
Wrap this in a for loop and you can run as many simulations as you like.
Note that this actually gives you the full distribution of t; if you only want to know the state after 3 collisions, simply add && t<3 to the while condition.
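As a sketch of that for-loop wrapping (the run count of 50000 is an arbitrary choice), estimating the probability of surviving 3 collisions:
% Repeat the simulation many times and estimate Pr(serviceable after 3 collisions)
nRuns = 50000;
survived = zeros(nRuns,1);
for run = 1:nRuns
servicable = 1;
for t = 1:3 % three collisions
servicable = servicable && (rand() <= 0.88);
end
survived(run) = servicable;
end
mean(survived) % should be close to 0.88^3 = 0.6815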

difference equations in MATLAB - why the need to switch signs?

Perhaps this is more of a math question than a MATLAB one, not really sure. I'm using MATLAB to compute an economic model - the New Hybrid ISLM model - and there's a confusing step where the author switches the sign of the solution.
First, the author declares symbolic variables and sets up a system of difference equations. Note that the suffixes "a" and "2t" both mean "time t+1", "2a" means "time t+2" and "t" means "time t":
%% --------------------------[2] MODEL proc-----------------------------%%
% Define endogenous vars ('a' denotes t+1 values)
syms y2a pi2a ya pia va y2t pi2t yt pit vt ;
% Monetary policy rule
ia = q1*ya+q2*pia;
% ia = q1*(ya-yt)+q2*pia; %%option speed limit policy
% Model equations
IS = rho*y2a+(1-rho)*yt-sigma*(ia-pi2a)-ya;
AS = beta*pi2a+(1-beta)*pit+alpha*ya-pia+va;
dum1 = ya-y2t;
dum2 = pia-pi2t;
MPs = phi*vt-va;
optcon = [IS ; AS ; dum1 ; dum2; MPs];
Edit: The equations that are going into the matrix, as they would appear in a textbook, are as follows (curly braces indicate time-period values, Greek letters are parameters):
First equation:
y{t+1} = rho*y{t+2} + (1-rho)*y{t} - sigma*(i{t+1}-pi{t+2})
Second equation:
pi{t+1} = beta*pi{t+2} + (1-beta)*pi{t} + alpha*y{t+1} + v{t+1}
Third and fourth are dummies:
y{t+1} = y{t+1}
pi{t+1} = pi{t+1}
Fifth is simple:
v{t+1} = phi*v{t}
Moving on, the author computes the matrix A:
%% ------------------ [3] Linearization proc ------------------------%%
% Differentiation
xx = [y2a pi2a ya pia va y2t pi2t yt pit vt] ; % define vars
jopt = jacobian(optcon,xx);
% Define Linear Coefficients
coef = eval(jopt);
B = [ -coef(:,1:5) ] ;
C = [ coef(:,6:10) ] ;
% B[c(t+1) l(t+1) k(t+1) z(t+1)] = C[c(t) l(t) k(t) z(t)]
A = inv(C)*B ; %(Linearized reduced form )
As far as I understand, this A is the solution to the system. It's the matrix that turns time t+1 and t+2 variables into t and t+1 variables (it's a forward-looking model). My question is essentially why is it necessary to reverse the signs of all the partial derivatives in B in order to get this solution? I'm talking about this step:
B = [ -coef(:,1:5) ] ;
Reversing the sign here obviously reverses the sign of every component of A, but I don't have a clear understanding of why it's necessary. My apologies if the question is unclear or if this isn't the best place to ask.
I think the key is that each row of optcon is a residual that equals zero when the model holds, rather than an explicit expression for a t+1 variable.
You've got an output vector of residuals called optcon = [IS;AS;dum1;dum2;MPs], and two input vectors of states: [y2a pi2a ya pia va] at time t+1 and [y2t pi2t yt pit vt] at time t. These two are concatenated into a single vector for the call to jacobian(), then separated after; the same thing could have been done in two calls. The first 5 columns of the output of jacobian() are the partial derivatives of optcon with respect to the input vector at time t+1, and the second 5 columns are with respect to the input vector at time t.
In order to get the reduced form, you need an equation relating the state at time t to the state at time t+1. Because each residual equals zero, the linearization reads coef(:,1:5)*x{t+1} + coef(:,6:10)*x{t} = 0. Moving the t+1 block across the equals sign flips its sign, giving -coef(:,1:5)*x{t+1} = coef(:,6:10)*x{t}, i.e. B*x{t+1} = C*x{t} with B = -coef(:,1:5) and C = coef(:,6:10). That is why the signs of the first-half partial derivatives must be reversed.
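To see the sign flip in isolation, here is a minimal sketch with a hypothetical one-equation model (the symbols a, b, xa and xt are made up purely for illustration):
syms xa xt a b
F = a*xa + b*xt;           % residual: the model asserts F = 0
J = jacobian(F, [xa xt]);  % J = [a, b]
B = -J(1); C = J(2);       % a*xa + b*xt = 0  =>  -a*xa = b*xt, i.e. B*xa = C*xt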