How to find best fitting variable given two correlating variables and an outcome in Matlab? - matlab

I have 9 variables and a final rate, in this case I will call the variables (A, B, C, D, E, F, G, H, I) and the final rate Z. It is given that variables A, B and one other variable from (C...I) together are good predictors of Z. I have the data in a spreadsheet for 12 years for all of these variables as well as the rate.
I need to find a formula so I can test this. I've been trying for hours to use MatLab to use Multiple Variable Linear Regression w/ least squares method but I just can't seem to find information, get correlating data or be sure that it is proper. I'm also open to using other methods such as predictive distribution, kernels, but I do need to use the least squares method and have a general formula I can show my work with.
I'm thinking of doing a formula such as Z = (mean(z) - b1*A - b2*B - b3*(VAR)) + b1A + b2B + b3C (where VAR is tested for each other variable) but I'm not sure how to do this on matlab or if that is even the correct way. Please explain why you use that formula as well.
Thanks

Related

How to define custom functions in Maple?

I'm new to Maple and I'm looking for a simple way to automate some tasks. In particular, I'm looking for a way to define custom "action" that perform some steps automatically.
As as an example I would like to define a quick way to compute the determinant of the Hessian of a polynomial. Currently the way I do this is opening Maple, create a new worksheet than performing the following commands:
p := (x, y) -> x^2*y + 3*x^3 + y^3
with(VectorCalculus):
h := Hessian(p(x, y), [x, y])
Determinant(h)
What I would like to do is to compute the hessian determinant directly with something like
HessDet(p)
where HessDet would be a custom command that performs the operations above. How does one achieve something like this in Maple?
First things first: The value assigned to your p is a procedure which can return a polynomial expression, but not itself a polynomial. It's important not to muddle expressions and procedures. Doing so is a common cause of problems for new users.
Being able to throw around p(x,y) may be visually pleasing to your eye, but it serves little programmatic purpose here. The fact that the formal parameters of procedure p happen to be called x and y, along with the fact that you called procedure p with arguments x and y, is actually just another common source of confusion. Don't create procedures merely to call them in this way.
Also, your call p(x,y) makes it look magic that your code snippet "knows" how many arguments would be required by procedure p. So it's already a muddle to have your candidate HessDet accept p as a procedure.
So instead let's keep it straightforward, by writing HessDet to accept a polynomial rather than a procedure. We can programmatically ascertain the names in which this expression of of type polynom.
restart;
HessDet:=proc(p::algebraic)
local H,vars;
vars:=indets(p,
And(name,Non(constant),
satisfies(u->type(p,polynom(anything,u)))));
H:=VectorCalculus:-Hessian(p,[vars[]]);
LinearAlgebra:-Determinant(H);
end proc:
Now some examples of using it,
P := x^2*y + 3*x^3 + y^3;
HessDet(P);
p := (x, y) -> x^2*y + 3*x^3 + y^3;
HessDet(p(x,y));
HessDet(x^3-x^2+4*x);
HessDet(s^2*t + 3*s^3 + t^3);
HessDet(s[r]^2*t[r] + 3*s[r]^3 + t[r]^3);
You might also wonder how you could re-use this custom procedure across sessions, without having to type it in each time. Two reasonable ways are:
Put the (above) defining plaintext definition of HessDet inside a personal initialization file.
Create a (.mla) Maple Library Archive file, then Save your HessDet to that, and then augment the Library search path in your initialization file.
It might look like 2) is more effort, but only the Save step is needed for repeats, and you can store many custom procedures to the same archive. Your choice...
[edit] The OP has asked for clarification of the first part of the above procedure HessDet, which I suspect means the call to indets.
If P is assigned an expression then then the call indets(P,name) will return a set of all the names present in that expression. Basically, it returns the set of all indeterminate subexpressions of the expression which are of type name in Maple's technical sense.
For example,
P := x*y + sin(a*Pi)*x;
x y + sin(a Pi) x
indets( P,
name );
{Pi, a, x, y}
Perhaps the name of the constant Pi is not wanted here. Ie,
indets( P,
And( name,
Non(constant) ) );
{a, x, y}
Perhaps we want only the non-constant names in which the expression is a polynomial? Ie,
indets( P,
And( name,
Non(constant),
satisfies(u->type(p,polynom(anything,u))) ) );
{x, y}
That last result is an advanced way of using the following tests:
type(P, polynom(anything, x));
true
type(P, polynom(anything, y));
true
type(P, polynom(anything, a));
false
A central issue here is that the OP made no mention of what kind of polynomials are to be handled by the custom procedure. So I guessed with some defensive coding, in hope of less surprises later on. The original Question states that the input could be a "polynomial", but we weren't told what kind of coefficients there might be.
Perhaps the coefficients will always be real and exact or numeric. Perhaps the custon procedure should throw an error when not supplied such. These details weren't mentioned in the Question.

Q-Learning equation in Deep Q Network

I'm new to reinforcement learning at all, so I may be wrong.
My questions are:
Is the Q-Learning equation ( Q(s, a) = r + y * max(Q(s', a')) ) used in DQN only for computing a loss function?
Is the equation recurrent? Assume I use DQN for, say, playing Atari Breakout, the number of possible states is very large (assuming the state is single game's frame), so it's not efficient to create a matrix of all the Q-Values. The equation should update the Q-Value of given [state, action] pair, so what will it do in case of DQN? Will it call itself recursively? If it will, the quation can't be calculated, because the recurrention won't ever stop.
I've already tried to find what I want and I've seen many tutorials, but almost everyone doesn't show the background, just implements it using Python library like Keras.
Thanks in advance and I apologise if something sounds dumb, I just don't get that.
Is the Q-Learning equation ( Q(s, a) = r + y * max(Q(s', a')) ) used in DQN only for computing a loss function?
Yes, generally that equation is only used to define our losses. More specifically, it is rearranged a bit; that equation is what we expect to hold, but it generally does not yet precisely hold during training. We subtract the right-hand side from the left-hand side to compute a (temporal-difference) error, and that error is used in the loss function.
Is the equation recurrent? Assume I use DQN for, say, playing Atari Breakout, the number of possible states is very large (assuming the state is single game's frame), so it's not efficient to create a matrix of all the Q-Values. The equation should update the Q-Value of given [state, action] pair, so what will it do in case of DQN? Will it call itself recursively? If it will, the quation can't be calculated, because the recurrention won't ever stop.
Indeed the space of state-action pairs is much too large to enumerate them all in a matrix/table. In other words, we can't use Tabular RL. This is precisely why we use a Neural Network in DQN though. You can view Q(s, a) as a function. In the tabular case, Q(s, a) is simply a function that uses s and a to index into a table/matrix of values.
In the case of DQN and other Deep RL approaches, we use a Neural Network to approximate such a "function". We use s (and potentially a, though not really in the case of DQN) to create features based on that state (and action). In the case of DQN and Atari games, we simply take a stack of raw images/pixels as features. These are then used as inputs for the Neural Network. At the other end of the NN, DQN provides Q-values as outputs. In the case of DQN, multiple outputs are provided; one for every action a. So, in conclusion, when you read Q(s, a) you should think "the output corresponding to a when we plug the features/images/pixels of s as inputs into our network".
Further question from comments:
I think I still don't get the idea... Let's say we did one iteration through the network with state S and we got following output [A = 0.8, B = 0.1, C = 0.1] (where A, B and C are possible actions). We also got a reward R = 1 and set the y (a.k.a. gamma) to 0.95 . Now, how can we put these variables into the loss function formula https://imgur.com/a/2wTj7Yn? I don't understand what's the prediction if the DQN outputs which action to take? Also, what's the target Q? Could you post the formula with placed variables, please?
First a small correction: DQN does not output which action to take. Given inputs (a state s), it provides one output value per action a, which can be interpreted as an estimate of the Q(s, a) value for the input state s and the action a corresponding to that particular output. These values are typically used afterwards to determine which action to take (for example by selecting the action corresponding to the maximum Q value), so in some sense the action can be derived from the outputs of DQN, but DQN does not directly provide actions to take as outputs.
Anyway, let's consider the example situation. The loss function from the image is:
loss = (r + gamma max_a' Q-hat(s', a') - Q(s, a))^2
Note that there's a small mistake in the image, it has the old state s in the Q-hat instead of the new state s'. s' in there is correct.
In this formula:
r is the observed reward
gamma is (typically) a constant value
Q(s, a) is one of the output values from our Neural Network that we get when we provide it with s as input. Specifically, it is the output value corresponding to the action a that we have executed. So, in your example, if we chose to execute action A in state s, we have Q(s, A) = 0.8.
s' is the state we happen to end up in after having executed action a in state s.
Q-hat(s', a') (which we compute once for every possible subsequent action a') is, again, one of the output values from our Neural Network. This time, it's a value we get when we provide s' as input (instead of s), and again it will be the output value corresponding to action a'.
The Q-hat instead of Q there is because, in DQN, we typically actually use two different Neural Networks. Q-values are computed using the same Neural Network that we also modify by training. Q-hat-values are computed using a different "Target Network". This Target Network is typically a "slower-moving" version of the first network. It is constructed by occasionally (e.g. once every 10K steps) copying the other Network, and leaving its weights frozen in between those copy operations.
Firstly, the Q function is used both in the loss function and for the policy. Actual output of your Q function and the 'ideal' one is used to calculate a loss. Taking the highest value of the output of the Q function for all possible actions in a state is your policy.
Secondly, no, it's not recurrent. The equation is actually slightly different to what you have posted (perhaps a mathematician can correct me on this). It is actually Q(s, a) := r + y * max(Q(s', a')). Note the colon before the equals sign. This is called the assignment operator and means that we update the left side of the equation so that it is equal to the right side once (not recurrently). You can think of it as being the same as the assignment operator in most programming languages (x = x + 1 doesn't cause any problems).
The Q values will propagate through the network as you keep performing updates anyway, but it can take a while.

Solving with multiple assumptions

Suppose I have a CNF expression with variables (a,b,c,d,e,f,g). How would I go about using a SAT solver to find an assignment for (d,e,f) given that {a,b,c,g} = {1,0,0,1} and {a,b,c,g} = {1,1,1,1}? If it was one assumption, calling a sat solver to find assignments for {d,e,f} would be straight-forward (E.g., by adding unit clauses to the CNF). But what if I have multiple assumptions? Is this possible?
Here are the steps for what (I think) harold was trying to describe to you. You have some CNF formula F over the variables a, b, c, d, e, f and g.
Duplicate the formula, calling the duplicate G.
In G, replace the variable a with aa, b with bb, c with cc, and g with gg.
Add unit clauses to F so that (a,b,c,g) = (1,0,0,1).
Add unit clauses to G so that (aa,bb,cc,gg) = (1,1,1,1).
Concatenate the formulas F and G and feed the result into the SAT solver.
The solver will find a satisfying assignment consistent with both (a,b,c,g)'s and (aa,bb,cc,gg)'s preset values.
It is not quite clear if you want a practical answer or an interesting theoretical answer. I will go after practical.
For each set of assumptions, call a sat solver that supports solve with assumptions on that set of assumptions (example). Do this sequentially on the same solver instance.
Pros:
You do not mix satisfiability of mutually exclusive sets of assumptions. If set of assumptions A is sat for a formula F and the other set A' is unsat for F, each call to the solver tells you if those assumptions are sat/unsat.
Learned clauses from the first call may stick around for the second call. The intermediate learned clauses talk about the same variables. (Note: If you have a disjoint formula F & G where F is over variables X, G is over variables Y and X and Y share no variables, resolution -- the inference rule used in CDCL -- cannot derive clauses mixing F and G. There is no obvious gain of mixing the two together instead of splitting them apart unless one instance is much easier to prove unsat and stop early.)
Cons:
If instance A is hard to solve in practice but A' is trivial, you might get stuck on A.
It is not parallel so if you have way more instances than two that you want to solve ASAP you'll need additional mechanisms.
I know this is a bit of an obvious answer, but it is worth trying. If that fails, you can try doing fancier things like solving w.r.t. the assumptions A union A', and only if that is unsat solving falling back on this strategy of A then A'. This won't help for your example as (a,b,c,g) = (1,0,0,1) and (a,b,c,g) = (1,1,1,1) are mutually exclusive.

Using nlinfit in Matlab?

I'm having trouble understanding and applying the use of nlinfit function in Matlab. So, let's say I'm given vectors
x = [1, 2, 3, 4, 5]
y = [2.3, 2.1, 1.7, .95, .70]
and I'm asked to fit this data to an exponential form (I don't know if the numbers will work, I made them up) where y = A*e^(Bx) + C (A/B/C are constants).
My understanding is that nlinfit takes 4 arguments, the two vectors, a modelfunction which in this case should be the equation I have above, and then beta0, which I don't understand at all. My question is how do you implement the modelfunction in nlinft, and how do you find beta0 (when only working with 2 vectors you want to plot/fit) and how should it be implemented? Can someone show me an example so that I can apply this function for any fit? I suspect I'll be using this a lot in the future and really want to learn it.
Check out the second example in the docs: http://www.mathworks.com/help/stats/nlinfit.html
Basically you pass a function handle as your modelfunction parameter. Either make a function in a file and then just pass it the function name with an # in front or else make an anonymous function like this:
nlinfit(x, y, #(b,x)(b(1).*exp(b(2).*x) + b(3)), beta0)
You'll notice that in the above I have stuck all your parameters into a single vector. The first parameter of your function must be a vector of all the points you are trying to solve for (i.e. A, B and C in your case) and the second must be x.
As woodchips has said beta0 is your starting point so your best guess (doesn't have to be great) of your A, B and C parameters. so something like [1 1 1] or rand(3,1), it is very problem specific though. You should play around with a few. Just remember that this is a local search function and thus can get stuck on local optima so your starting points can actually be quite important.
beta0 is your initial guess at the parameters. The better your guess, the more likely you will see convergence to a viable solution. nlinfit is no more than an optimization. It has to start somewhere.

order of the outputs for the MATLAB solve function

I have been tinkering with the MATLAB solve function for a while, but cannot seem how it determines the order that it outputs the symbolic variables.
Specifically, I have a system of equations that I want to solve simultaneously.
a = f(a, b, c, d)
b = f(a, b, c, d)
c = f(a, b, c, d)
d = f(a, b, c, d)
and these equations are symbolic and have other symbolic variables (aside from a, b, c, and d). (so the solution outputs aren't numeric, but are symbolic).
For example, when I am solving the for the equations of motion for an inverted spring pendulum, I have two equations that are both dependent on phiDDot and lenDDot. I use the solve function to solve for phiDDot and lenDDot separately using this call:
[eom2, eom1] = solve(Lag(1)==0, Lag(2)==0, ddphi, ddlen);
The solution for ddphi corresponds to the second term of the matrix outputted, while ddlen corresponds to the first term of the matrix. I was wondering whether there was some way to tell MATLAB to output ddphi first and ddlen second, or at least determine what order they are outputted. Not knowing the order of the variables becomes a big problem when I am solving for more than 4 variables, and trying to solve the differential equations using ode45.
Any advice would be helpful!!
I believe that it's alphabetical based on the ASCII values of the variable names in your equations. As per the documentation for solve, sym/symvar is used to parse the equations in the case where you don't supply the names of output variables. The help for sym/symvar indicates that it returns variables in lexicographical order, i.e. alphabetical (symvar does the same, even though it doesn't say so, by making calls to setdiff). If you look at the actual code for solve.m (type edit solve in your command window) and examine the sub-function called assignOutputs (line 190 in R2012b) you'll see that it makes a call to sort and that there's a comment about lexicographical order.
In R2012b (and likely earlier) the documentation differs from that of R2013a in a way that seems relevant to your issue. In R2013a, this sentence is added:
If you explicitly specify independent variables vars, then the solver uses the same order
to return the solutions.
I'm still running R2012b, so I can't confirm this different behavior.