I am trying to build a decision tree, but the result is strange and I can't figure out what is wrong. There are seven variables, each coded as 1 or 2; for example, for variable 1 the value 1 means warm and 2 means cold, and for variable 2 the value 1 means yes and 2 means no.
vars = {'TEMP' 'SKIN' 'BIRTH' 'AQUATIC' 'AERIAL' 'LEGS' 'HIBER'};
x = [1 1 1 2 2 1 2
2 2 2 2 2 2 1
2 2 2 1 2 2 2
1 1 1 1 2 2 2
2 1 2 1 2 1 1
2 2 2 2 2 1 2
1 1 1 2 1 1 1
1 1 2 2 1 1 2
1 1 1 2 2 1 2
2 2 1 1 2 2 2
2 2 2 1 2 1 2
1 1 2 1 2 1 2
1 1 1 2 2 1 1
2 2 2 1 2 2 2
2 1 2 1 2 1 1];
s = {'M';'R';'F';'M';'A';'R';'M';'B';'M';'F';'R';'B';'M';'F';'A'};
y = cellstr(s);
t = classregtree(x, y, 'method','classification', 'names',vars,...
'categorical',[1 7], 'prune','off');
view(t)
The result is a tree with only one split and no other information. What is wrong with this?
I'm not an expert on decision trees, but playing a little with the parameters of classregtree (minparent, to be exact) helped:
vars = {'TEMP' 'SKIN' 'BIRTH' 'AQUATIC' 'AERIAL' 'LEGS' 'HIBER'};
x = [1 1 1 2 2 1 2
2 2 2 2 2 2 1
2 2 2 1 2 2 2
1 1 1 1 2 2 2
2 1 2 1 2 1 1
2 2 2 2 2 1 2
1 1 1 2 1 1 1
1 1 2 2 1 1 2
1 1 1 2 2 1 2
2 2 1 1 2 2 2
2 2 2 1 2 1 2
1 1 2 1 2 1 2
1 1 1 2 2 1 1
2 2 2 1 2 2 2
2 1 2 1 2 1 1];
y = {'M';'R';'F';'M';'A';'R';'M';'B';'M';'F';'R';'B';'M';'F';'A'};
t = classregtree(x,y,'method','classification','Names',vars, ...
'categorical',[1 7],'prune','off','minparent',1);
view(t);
With that change I was able to produce a tree that looks fine. A likely explanation: minparent defaults to 10, so with only 15 observations the child nodes are probably too small to be split again after the first split. Also, since MATLAB release R2011a, classregtree has been obsolete and is superseded by the fitrtree (RegressionTree) and fitctree (ClassificationTree) functions; classregtree is kept for backward compatibility only. I recommend updating your code to use those functions instead:
t = fitctree(x,y,'PredictorNames',vars, ...
'CategoricalPredictors',{'TEMP' 'HIBER'},'Prune','off','MinParentSize',1);
I'm dealing with matrices of this format:
M =
1 1 3
1 1 1
1 2 2
1 2 1
1 2 2
2 1 5
2 1 1
2 2 3
2 2 4
2 2 2
...
What I want to do is extract sub matrices where the values in the first and second column can be grouped such that:
M1 =
1 1 3
1 1 1
M2 =
1 2 2
1 2 1
1 2 2
M3 =
2 1 5
2 1 1
...
I have been trying to think hard about how to index the matrix for this and I have a matrix available:
I =
1 1
1 2
2 1
2 2
...
that I could use for indexing, but I'm not sure how. I don't want to use a for loop, since the matrices can be rather large and the running time could become very high.
Thank you for reading!
This is easily done with unique and accumarray:
M = [ 1 1 3
1 1 1
1 2 2
1 2 1
1 2 2
2 1 5
2 1 1
2 2 3
2 2 4
2 2 2 ]; %// data
[~, ~, u] = unique(M(:,1:2), 'rows'); % unique labels of rows based on columns 1 and 2
M_split = accumarray(u(:), (1:size(M,1)).', [], @(x){M(sort(x),:)}); % group rows based on labels
This gives a cell array containing the partial matrices. In your example,
M_split{1} =
1 1 3
1 1 1
M_split{2} =
1 2 2
1 2 1
1 2 2
M_split{3} =
2 1 5
2 1 1
M_split{4} =
2 2 3
2 2 4
2 2 2
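The same grouping can also be written with logical indexing per label, which some readers find easier to follow than accumarray. This is only a sketch of an alternative, not the answer's method: arrayfun is still an implicit loop over the groups (not over the rows), and the variable names keys and g are introduced here for illustration.

```matlab
M = [1 1 3; 1 1 1; 1 2 2; 1 2 1; 1 2 2; 2 1 5; 2 1 1; 2 2 3; 2 2 4; 2 2 2];

% g(i) is the group label of row i, based on columns 1 and 2.
[keys, ~, g] = unique(M(:, 1:2), 'rows');

% One cell per group; rows keep their original relative order.
M_split = arrayfun(@(k) M(g == k, :), (1:size(keys, 1)).', ...
                   'UniformOutput', false);
```

Each row of keys gives the (column 1, column 2) pair that the matching cell of M_split belongs to.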
I have the code below and want to make it faster, but I don't know how to replace the for loops with something more efficient in MATLAB.
If the user count is 5, the item count is 2 and the time count is 4, I want to create this matrix:
1 1 1
1 1 2
1 1 3
1 1 4
1 2 1
1 2 2
1 2 3
1 2 4
2 1 1
2 1 2
2 1 3
2 1 4
...
result=zeros(userCount*itemCount*timeCount,4);
j=0;
for i=1:userCount
result(j*itemCount*timeCount+1:j*itemCount*timeCount+itemCount*timeCount,1)=ones(itemCount*timeCount,1)*i;
j=j+1;
end
j=0;
h=1;
for i=1:userCount*itemCount
result(j*timeCount+1:j*timeCount+timeCount,2)=ones(timeCount,1)*(h);
j=j+1;
h=h+1;
if h>itemCount
h=1;
end
end
j=0;
for i=1:userCount*itemCount
result(j*timeCount+1:j*timeCount+timeCount,3)=1:timeCount;
j=j+1;
end
for i=1:size(subs,1)
f=(result(:,1)==subs(i,1)& result(:,2)==subs(i,2));
result(f,:)=[];
end
What you are describing is enumerating every combination of three independent index ranges (their Cartesian product). One way to achieve this is to use ndgrid and unroll each output into a single column vector:
userCount = 5; itemCount = 2; timeCount = 4;
[X,Y,Z] = ndgrid(1:timeCount,1:itemCount,1:userCount);
result = [Z(:) Y(:) X(:)];
We get:
result =
1 1 1
1 1 2
1 1 3
1 1 4
1 2 1
1 2 2
1 2 3
1 2 4
2 1 1
2 1 2
2 1 3
2 1 4
2 2 1
2 2 2
2 2 3
2 2 4
3 1 1
3 1 2
3 1 3
3 1 4
3 2 1
3 2 2
3 2 3
3 2 4
4 1 1
4 1 2
4 1 3
4 1 4
4 2 1
4 2 2
4 2 3
4 2 4
5 1 1
5 1 2
5 1 3
5 1 4
5 2 1
5 2 2
5 2 3
5 2 4
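The last loop in the question also deletes every row whose (user, item) pair appears in subs. That step can probably be done without a loop as well. A minimal sketch, assuming subs is a two-column matrix with one [user item] pair per row (the values below are made up for illustration):

```matlab
userCount = 5; itemCount = 2; timeCount = 4;
[X, Y, Z] = ndgrid(1:timeCount, 1:itemCount, 1:userCount);
result = [Z(:) Y(:) X(:)];

% Hypothetical exclusion list: one [user item] pair per row.
subs = [1 2; 3 1];

% Logical mask of rows whose first two columns match any row of subs.
drop = ismember(result(:, 1:2), subs, 'rows');
result(drop, :) = [];
```

With these example values, each excluded pair removes timeCount rows from result.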
Assume there is an m by n all-numeric matrix:
X11 X12 ... X1n
X21 X22 ... X2n
...
Xm1 Xm2 ... Xmn
I need help writing a method that works somewhat like ORDER BY in ANSI SQL. The method should be able to sort the matrix by any number of column indices passed in on the command line.
For example, if cmdline is:
% myOrderby -col 1,2,5
It'll sort the matrix by columns 1, 2 and 5.
However, if cmdline is
% myOrderby -col 1,4,8,11
then it would sort the matrix by columns 1, 4, 8 and 11.
I know how to implement the method if there was a fixed maximum number of columns to "order by". I am looking for a method that can sort a matrix based on any number of columns.
Is this possible?
You can sort by an arbitrary number of fields:
# note, your columns start at 1, while arrays index from 0
my @cols = map($_-1, @param_cols);
my @sorted = sort {
my $r = 0;
$r ||= $a->[$_] <=> $b->[$_] for @cols;
return $r;
} @matrix;
You will loop through the indices, breaking out of the loop when the comparison shows a difference.
@M = map{$a=$_;map{$b=$_;map{$c=$_;map{$d=$_;
map[$a,$b,$c,$d,$_],2,1}1,2}2,1}1,2}2,1;
sub by_cols {
my ($row1,$row2,@indices) = @_;
foreach my $col (@indices) {
my $d = $row1->[$col] <=> $row2->[$col];
return $d if $d;
}
return 0;
}
print "@$_\n" for sort { by_cols($a,$b, 3,4,1) } @M;
Output
2 1 2 1 1
2 1 1 1 1
1 1 2 1 1
1 1 1 1 1
2 2 2 1 1
2 2 1 1 1
1 2 2 1 1
1 2 1 1 1
2 1 2 1 2
2 1 1 1 2
1 1 2 1 2
1 1 1 1 2
2 2 2 1 2
2 2 1 1 2
1 2 2 1 2
1 2 1 1 2
2 1 2 2 1
2 1 1 2 1
1 1 2 2 1
1 1 1 2 1
2 2 2 2 1
2 2 1 2 1
1 2 2 2 1
1 2 1 2 1
2 1 2 2 2
2 1 1 2 2
1 1 2 2 2
1 1 1 2 2
2 2 2 2 2
2 2 1 2 2
1 2 2 2 2
1 2 1 2 2
This is a similar request to my earlier post, "Iterate one vector through another in Matlab".
I am using Luis' suggestion with the following code:
E=[1 2 3 4 5 6 7 8 9 10];
A = [1 2];
s = size(E,2);
t = numel(A);
C = cell(1,s);
[C{:}] = ndgrid(A);
C = cat(s+1, C{:});
C = fliplr(reshape(C, t^s, s));
This produces a good result for C: a 1024x10 matrix with all possible arrangements of 1 and 2 across 10 columns. What I want to do is remove any rows that are not in non-decreasing order. For example, now I get:
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 2
1 1 1 1 1 1 1 1 2 1
1 1 1 1 1 1 1 1 2 2
All are valid except the third row, since it goes from 2 back to 1.
I have code to get the desired result, but it is very slow and inefficient.
E=[1 2 3 4 5 6 7 8 9 10];
A = [1 2];
s = size(E,2);
t = numel(A);
C = cell(1,s);
[C{:}] = ndgrid(A);
C = cat(s+1, C{:});
C = fliplr(reshape(C, t^s, s));
min=0;
for row=1:size(C,1)
for col=1:size(C,2)
if(C(row,col)>min)
min=C(row,col);
elseif(C(row,col)<min)
C(row,:)=0;
continue;
end
end
min=0;
end
C = C(any(C,2),:); %remove all zero rows
The desired output is now:
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 2
1 1 1 1 1 1 1 1 2 2
1 1 1 1 1 1 1 2 2 2
1 1 1 1 1 1 2 2 2 2
1 1 1 1 1 2 2 2 2 2
1 1 1 1 2 2 2 2 2 2
1 1 1 2 2 2 2 2 2 2
1 1 2 2 2 2 2 2 2 2
1 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2
Any ideas on how to optimize my code so I do not need to use nested loops?
The super-simple but not-quite-so-obvious solution via a couple of row-wise operations:
d = diff(C, [], 2);
m = min(d, [], 2);
C = C(m>=0, :);
Of course, in this particular example it would be far easier to just generate the resulting matrix directly:
C = flipud(triu(ones(s+1,s).*(max(A)-min(A))) + min(A));
but I assume you're also interested in less trivial values of A ;)
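For a less trivial A, the row-wise diff filter still applies unchanged. The sketch below uses a made-up three-value A and a shorter row length s to keep the output small; the only assumption is that "increasing order" means each entry is greater than or equal to the one before it.

```matlab
A = [1 2 3];          % example values, not from the question
s = 4;                % number of columns per row
t = numel(A);

% All t^s rows of length s over the values in A (same construction as above).
C = cell(1, s);
[C{:}] = ndgrid(A);
C = cat(s+1, C{:});
C = fliplr(reshape(C, t^s, s));

% Keep a row only if no step between neighbouring columns is negative.
keep = all(diff(C, [], 2) >= 0, 2);
C = C(keep, :);
```

The number of surviving rows should equal the number of multisets of size s drawn from t values, nchoosek(s+t-1, s), which is a quick sanity check on the filter.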
It was hard to phrase the question, but here's an example of what I'm looking for:
1 2 3 4
2 1 1 1
2 2 3 1
0 0 0 0
In each column, I add the values of the first three rows and store the sum in the fourth row, so that it becomes:
1 2 3 4
2 1 1 1
2 2 3 1
5 5 7 6
I think you can use sum:
octave:23> m = [1 2 3 4; 2 1 1 1; 2 2 3 1; 0 0 0 0]
m =
1 2 3 4
2 1 1 1
2 2 3 1
0 0 0 0
octave:24> m(end, :) = sum(m)
m =
1 2 3 4
2 1 1 1
2 2 3 1
5 5 7 6
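Summing the whole matrix only works here because the last row starts out as zeros and the matrix is square. A slightly more defensive sketch indexes the last row with end and sums only the rows above it; the matrix below is a made-up non-square example with a non-zero placeholder row.

```matlab
m = [1 2 3 4 5;
     2 1 1 1 1;
     9 9 9 9 9];      % last row is a placeholder to be overwritten

% Column-wise sum (dimension 1) of every row except the last.
m(end, :) = sum(m(1:end-1, :), 1);
```

Passing the dimension explicitly keeps the column-wise behaviour even when only one row sits above the placeholder.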