A set of results in Decision Tree - matlab

I have a data set of size 1672x6; part of it is shown in the picture.
The x values are A1, A2, A3, A4, A5 and A6, and the y values are B1, B2, ..., B1672.
I used the following code to generate the decision tree:
vars = {'A1','A2','A3','A4','A5','A6'}
x = [A1 A2 A3 A4 A5 A6];
y = [B];
t = classregtree(x, y, 'method','classification', 'names',vars, ...
'categorical',[2 4], 'prune','off');
view(t)
and it generates super crazy trees.
I want to get the values that are greater than the values I supply. When I write:
inst = [3 2.3 2 0 1 0];
prediction = eval(t, inst)
It only gives me the single B value (like B271) that matches that instance, but I want to get all B values whose variables are greater than those in inst, i.e. A1>3, A2>2.3, A3>2, A4>0, A5>1, A6>0. How can I get them?

You seem to be confusing two things: building a decision tree and finding rows that satisfy a condition.
If you want to find all the rows whose values are greater than inst, the following simple code prints the indices of all such rows.
for i = 1:size(x,1)
    if all(x(i,:) > inst)   % every column exceeds the threshold in inst
        disp(i)             % print the index of the matching row
    end
end
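The same search can be written without a loop by using logical indexing. This is only a sketch: the small x, y and inst below are made-up stand-ins for the real 1672x6 data, and the comparison x > inst relies on implicit expansion (R2016b or later; on older releases bsxfun(@gt, x, inst) should do the same job).

```matlab
% Hypothetical 3x3 data standing in for the real x (1672x6) and y (1672x1)
x    = [1 2 3; 4 5 6; 7 8 9];
y    = {'B1'; 'B2'; 'B3'};
inst = [3 4 5];

% Row i matches when every column of x(i,:) exceeds the same column of inst
rows    = find(all(x > inst, 2));   % indices of matching rows
matches = y(rows);                  % the corresponding B labels
```

With the real data, y(rows) would list every B entry whose A1 to A6 values all exceed those in inst.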
However, a decision tree is a totally different topic. In a decision tree, you have a set of conditions (A1 to A6 in your case), many rows for training (B1 to B1672) and a consequence for each one of them. When a new test case is queried, the tree picks the best matching consequence out of all the consequences seen in training.
Some decision tree tutorials: 1, 2 and Wikipedia

Related

What is a MATLAB function for « if-in »?

Problem statement: Provide a function that does the following: c1) two vectors D1 and D2 have 7 elements each, then form the division between the corresponding components of each vector and assign the result to a vector name D3, placing a statement that avoids division by zero (i.e., does not divide if element to denominator is null);
The idea of the problem is to show an error message whenever one of the elements of vector D2 is equal to 0.
My attempt:
D1 = [d1 d2 d3 d4 d5 d6 d7]
D2= [d21 d22 d23 d24 d25 d26 d27]
for i= 1:length(D1)
if 0 in D2
fprintf(‘error: division by 0/n’)
else
D3=D1./D2
end
I don’t know if the “if-in” structure exists in Matlab. If it doesn’t, what could be an equivalent?
Thanks in advance!!!
One way to avoid any division by zero is to modify D2 by replacing every 0 with nan. Divisions by nan produce nan, so it's easy to tell which divisions would have caused a problem by simply inspecting the resulting vector D3. Moreover, almost all of MATLAB's functions are able to handle nans nicely (i.e. without crashing) or can be instructed to do so by setting some option.
What I've just described can be accomplished by using logical indexing, as follows:
% Definition of D1 and D2
D1 = [d1 d2 d3 d4 d5 d6 d7]
D2 = [d21 d22 d23 d24 d25 d26 d27]
% Replace 0s with NaNs
D2(D2==0) = nan;
% Perform the divisions at once
D3 = D1./D2 ;
For more details on logical indexing, look at the relevant section here.
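As for the literal question: MATLAB has no in keyword, but a membership test on a vector can be written with any (or ismember). The following is a minimal sketch of the structure the OP attempted, with made-up numeric values in place of d1 ... d27:

```matlab
% Hypothetical values; the question leaves d1..d7 and d21..d27 unspecified
D1 = [1 2 3 4 5 6 7];
D2 = [2 0 4 5 0 8 10];

% "if 0 in D2" can be expressed as any(D2 == 0) or ismember(0, D2)
if any(D2 == 0)
    fprintf('error: division by 0\n');
else
    D3 = D1 ./ D2;   % element-wise division, no loop needed
end
```

Note that no for loop is required here: the test and the division both operate on the whole vector at once.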
As the OP requests a function that does the job, here's a possible implementation:
function D3 = vector_divide(D1, D2)
% VECTOR_DIVIDE Element-wise D1./D2, returning NaN wherever D2 is zero.
% Verify that both inputs are numeric and have the same dimensions
if isnumeric(D1) && isnumeric(D2) && isequal(size(D1), size(D2))
    % Replace 0s with NaNs
    D2(D2==0) = nan;
    % Perform the divisions at once
    D3 = D1./D2;
else
    disp('D1 and D2 should both be numeric and have the same size!');
    D3 = [];
end
end
Error handling in case of non-numeric arrays or size mismatch might vary depending on project requirements, if any. For instance, I could have used error (instead of disp) to display a message and terminate the program.

MATLAB Simple Calculation

I am working on MATLAB on my own, and was doing problem 9 on Project Euler
It states
" A Pythagorean triplet is a set of three natural numbers, a < b < c, for which,
a^2 + b^2 = c^2
For example, 3^2 + 4^2 = 9 + 16 = 25 = 5^2.
There exists exactly one Pythagorean triplet for which a + b + c = 1000.
Find the product abc."
Below is the code I wrote; it runs, but does not produce any output. I was hoping to get some feedback on what's wrong, so I can fix it.
Thanks,
syms a;
syms b;
syms c;
d= 1000;
d= a + b + c ;
ab= a.^2 + b.^2;
ab= c.^2;
c
I propose a vectorized way (that is, without using loops) to solve the problem. It may seem relatively complicated, especially if you come from other programming languages; but for Matlab you should get used to this way of approaching problems.
Ingredients:
Vectorization;
Indexing;
Transpose;
Implicit singleton expansion;
hypot;
find.
Read up on these concepts if you are not familiar with them, and then try to solve the problem yourself (which of course is the whole point of Project Euler). As a hint, the code below proceeds along these lines:
Generate a 1×1000 vector containing all possible values for a and b.
Compute a 1000×1000 matrix with the values of c corresponding to each pair a, b.
From that compute a new matrix such that each entry contains a+b+c
Find the row and column indices where that matrix equals 1000. Those indices are the desired a and b (why?).
You'll get more than one solution (why?). Pick one.
Compute the product of the obtained a and b and the corresponding c.
Once you have tried it yourself, you may want to check the code:
ab = 1:1000; % step 1
cc = hypot(ab,ab.'); % step 2
sum_abc = ab+ab.'+cc; % step 3
[a, b] = find(sum_abc==1000); % step 4
a = a(1); b = b(1); % step 5
prod_abc = a*b*cc(a,b); % step 6
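A possible variation, offered only as a sketch: because c is fixed by c = 1000 - a - b, the condition a^2 + b^2 = c^2 can be tested directly in exact integer arithmetic, which avoids comparing the floating-point output of hypot with ==. Requiring a < b < c also removes the duplicate (swapped) solution. The names A, B, C and mask are my own and not part of the code above.

```matlab
ab = 1:1000;
[A, B] = ndgrid(ab, ab);      % A(i,j) = ab(i), B(i,j) = ab(j)
C = 1000 - A - B;             % c is determined by the sum constraint

% Keep only ordered triplets that satisfy the Pythagorean condition
mask = (A < B) & (B < C) & (A.^2 + B.^2 == C.^2);

a = A(mask); b = B(mask); c = C(mask);
prod_abc = a*b*c;
```

All quantities stay well below 2^53, so the equality test on doubles is exact here.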

Conceptual issues in AR model

I have some basic questions regarding multivariate model. In the ARFIT toolbox, the demo file ardem.m shows the working of a 2nd order bivariate (v1,v2) AR model. The coefficient matrices
A1 = [ 0.4 1.2; 0.3 0.7 ]
A2 = [ 0.35 -0.3; -0.4 -0.5 ]
are concatenated into
A = [ A1 A2 ]
Then a transpose of A is taken. So the result is a 2*4 matrix.
My question is: shouldn't there be only 4 coefficients, i.e. 2 for variable v1 and 2 for variable v2? Why are there 8 coefficients? I assumed the equation has the form
v(k,:) = a11*v1(k-1)+a12*v1(k-2) + a21*v2(k-1)+ a22*v2(k-2)
where a11 = 0.4, a12=1.2, a21=0.3 and a22=0.7.
I think I am misunderstanding something. Can somebody please explain what the correct representation is?
The matrices A1 and A2 contain transfer coefficients that describe the contribution of states at times k-1 and k-2, respectively, to the state at time k. Since this is a bivariate process, we are following two variables which can influence each other, and both A1 and A2 are 2 x 2. Writing v1 = v(k,1) and v2 = v(k,2):
v1(k) = A1(1,1)*v1(k-1) + A1(1,2)*v2(k-1) + A2(1,1)*v1(k-2) + A2(1,2)*v2(k-2)
and similarly for v2(k). Then collectively A1 and A2 contain 8 elements. If the two processes were independent then A1 and A2 would be diagonal and would collectively contain only 4 nonzero elements.
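To make the bookkeeping concrete, here is a small sketch of a single update step. The previous states v_km1 and v_km2 are made-up numbers, and the noise term is left out, as in the equation above. The matrix form and the written-out scalar form give the same v1(k), and all 8 entries of A1 and A2 are used once both components are computed.

```matlab
A1 = [0.4 1.2; 0.3 0.7];
A2 = [0.35 -0.3; -0.4 -0.5];

v_km1 = [1; 2];   % hypothetical state [v1; v2] at time k-1
v_km2 = [3; 4];   % hypothetical state [v1; v2] at time k-2

% Matrix form: both components at once
v_k = A1*v_km1 + A2*v_km2;

% Scalar form of the first component, term by term
v1_k = A1(1,1)*v_km1(1) + A1(1,2)*v_km1(2) ...
     + A2(1,1)*v_km2(1) + A2(1,2)*v_km2(2);
```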
By the way this is not really a Matlab question so I don't think this is the right forum for this question.

matrix assignment from a matrix A to a matrix B using conditional statements based on a third matrix C

I have two questions if you can kindly respond:
Q1) I have a matrix choice, in which each person makes one of 4 possible choices, denoted 1, 2, 3 and 4.
I have three matrices A1, A2, A3 with income information for each person and each time period. Say I have n people and t time periods, so A1, A2, A3 and choice are all n-by-t.
Now I want to make another matrix B, where B picks the element from A according to the value in the choice matrix, i.e. if choice(n,t)==1, then B(n,t) = A1(n,t). If choice(n,t)==2, then B(n,t) = A2(n,t), and so on.
I have tried a for loop with if statements, but I am unable to make it work. Please help.
Q2) I have a matrix A of incomes, of dimension n-by-t. Some people have low income, some have high income. Say anyone with income<1000 is low and above 1000 is high. At the end of my simulations, I need to know whether each person was high income or low income. How can I make a high-income and a low-income matrix from the bigger matrix?
Q1:
C = choice %else the code gets too long
B = A1 .* (C==1) + A2 .* (C==2) + A3 .* (C==3)
I'm not sure how you want to handle the value '4' in B if you only have A1 A2 A3, but this should work.
[EDIT]:
If the choice is '4', that element of B will be 0 for the B I defined above.
Q2:
this one's a little vague. Maybe this is what you wanted:
HighIncome = A > 1000
LowIncome = A <= 1000
If this doesn't do it, please explain your objective more precisely.
[EDIT]:
Based on your slightly less vague explanation of Q2, it sounds like you want something like this:
A_high_income = A .* (A > 1000)
A_low_income = A .* (A <= 1000)
CHOICE_high_income = choice .* (A > 1000)
CHOICE_low_income = choice .* (A <= 1000)
The high-income matrices have zeros at the low-income positions and vice versa.
This doesn't make very much sense IMHO, but it's the closest I could get to your description.
If this doesn't do it, follow the instructions in my comment below and post some examples.
Q1: You can use three simple statements and some logical indexing.
B = A1;
B(choice == 2) = A2(choice == 2);
B(choice == 3) = A3(choice == 3);
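A tiny worked example may help to compare the two Q1 approaches. The 2-by-2 matrices below are invented for illustration. The only difference shows up where choice is 4: the masked sum leaves a 0 there, while the logical-indexing version keeps the value from A1.

```matlab
% Hypothetical 2-by-2 data (n = 2 people, t = 2 periods)
A1 = [10 11; 12 13];
A2 = [20 21; 22 23];
A3 = [30 31; 32 33];
choice = [1 2; 3 4];

% Logical-indexing version
B_idx = A1;
B_idx(choice == 2) = A2(choice == 2);
B_idx(choice == 3) = A3(choice == 3);

% Masked-sum version
B_mask = A1 .* (choice==1) + A2 .* (choice==2) + A3 .* (choice==3);
```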
Q2: To separate A and choice into two parts based on income, you first find the indices of "low income" rows and use that to get rows from the matrices.
lowIncomeNdx = any(A < 1000, 2);
lowIncome = A(lowIncomeNdx, :);
lowIncomeChoice = choice(lowIncomeNdx, :);
highIncome = A(~lowIncomeNdx, :);
highIncomeChoice = choice(~lowIncomeNdx, :);
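As a quick check of the row selection, here is a sketch with invented numbers. Note the design choice hidden in any: a person counts as low income if their income is below 1000 in at least one period. If the intent is "below 1000 in every period", all(A < 1000, 2) would be the test to use instead.

```matlab
% Hypothetical incomes for 3 people over 2 periods
A      = [500 1500; 2000 3000; 900 800];
choice = [1 2; 3 1; 2 2];

lowIncomeNdx    = any(A < 1000, 2);          % one logical flag per person (row)
lowIncome       = A(lowIncomeNdx, :);        % rows of the low-income people
highIncome      = A(~lowIncomeNdx, :);       % rows of everyone else
lowIncomeChoice = choice(lowIncomeNdx, :);   % matching rows of choice
```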

MATLAB: Transform a flat file list into a multi-dimensional array

I am completely stuck with this: I start out with a flat-file style list that I get from an SQL statement like this, and I want to transform it into a 4D array.
SELECT a1, a2, a3, a4, v FROM table A;
a1 a2 a3 a4 v
--------------
2 2 3 3 100
2 1 2 2 200
3 3 3 3 300
...
a1 to a4 are some identifiers (integers) from a range of (1:5), which are also the coordinates for the new to be populated 4D array.
v is a value (double) e.g. a result from a measurement.
What I now want is to transform this list into a 4D array of dimension (5,5,5,5) where each v is put at the right coordinates.
This could easily be done using a for loop, however as I have lots of data this is not really feasible.
If I had just 1 dimension, I would do something like this:
a1 = [2;5;7]; % Identifiers
v = [17;18;19]; % Values
b1 = (1:10)'; % Range of Identifiers
V = zeros(10,1); % Create result vector with correct dimensions
idx = ismember(b1, a1); % Do the look up
V(idx) = v; % Insert
My question: How can I do this for the above mentioned 4D array without using a for loop. Is there a "Matlab Way" of doing it?
Any help is greatly appreciated!
Thanks,
Janosch
You should be able to do what you want using linear indexing, and the sub2ind function. It would look something like this.
x = zeros(5,5,5,5); % initialize output array
idx = sub2ind(size(x), a1, a2, a3, a4); % linear indices of the 4D coordinates
x(idx) = v;
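Here is the same idea applied to the three sample rows from the question, as a sketch. It assumes the SQL result has already been loaded into column vectors a1 ... a4 and v. If the same coordinate can occur more than once, plain assignment keeps only one of the values; accumarray is shown as an alternative that sums duplicates instead, which may or may not be what you want.

```matlab
% The three sample rows from the question, as column vectors
a1 = [2; 2; 3];
a2 = [2; 1; 3];
a3 = [3; 2; 3];
a4 = [3; 2; 3];
v  = [100; 200; 300];

% Linear-indexing version
x = zeros(5,5,5,5);
idx = sub2ind(size(x), a1, a2, a3, a4);
x(idx) = v;

% Alternative: accumarray builds the array directly and sums duplicates
x_acc = accumarray([a1 a2 a3 a4], v, [5 5 5 5]);
```

After this, x(2,2,3,3) holds the value from the first row of the list, and all entries without a measurement stay at 0.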