Why doesn't my decision tree include all attributes? - matlab

I trained a decision tree classifier in MATLAB using 7 attributes. But when I draw the decision tree, it doesn't include all of the attributes; it only includes 1 or 2 of them. What is wrong with my code?
vars = {'Asymmetry' 'Border irregularity' 'colors' 'contrast' 'Co-relation' ...
'Homogeneity' 'Energy'};
x = [0.148 0.298 3 0.027 0.959 0.992 0.692
0.248 0.462 3 0.015 0.997 0.996 0.837
0.683 0.827 3 0.030 0.974 0.989 0.634
0.170 0.509 3 0.065 0.964 0.977 0.399
0.663 0.764 3 0.061 0.945 0.983 0.645
0.641 0.671 3 0.050 0.953 0.987 0.703
0.653 0.796 2 0.062 0.961 0.981 0.528
0.458 0.704 2 0.019 0.934 0.993 0.852
0.555 0.729 2 0.087 0.976 0.980 0.380
0.454 0.657 2 0.059 0.953 0.982 0.467
0.379 0.497 2 0.058 0.976 0.979 0.445
0.443 0.486 2 0.034 0.896 0.998 0.810
0.194 0.342 2 0.012 0.956 0.997 0.895
0.248 0.462 3 0.015 0.977 0.996 0.837
0.155 0.340 2 0.010 0.930 0.966 0.911
0.458 0.704 2 0.019 0.934 0.993 0.852];
y = {'Cancer';'Cancer';'Cancer';'Cancer';'Cancer';'Cancer';'Cancer';'Cancer';'Cancer';'Cancer';'Cancer';'Cancer';'non-Cancer';'non-Cancer';'non-Cancer';'non-Cancer'};
t = fitctree(x,y,'PredictorNames',vars, ...
'CategoricalPredictors',{},'Prune','off');
view(t);
X1=[0.148 0.186 2 0.139 0.984 0.992 0.558]
label = predict(t,X1);
view(t,'mode','graph');
The output image of the code:

There's nothing wrong with your code. A decision tree is not designed to use all of your variables; at each node it picks the variable and threshold that best separate the classes under the split criterion, and it stops splitting when further splits no longer help. With only 16 observations, it is not surprising that one or two splits are enough. Forcing the tree to use every variable would tend to overfit, especially considering that some of your variables are correlated with each other.
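To see which predictors a fitted tree actually splits on, you can inspect the tree object. The sketch below uses made-up toy data (not the data from the question) in which only the first column separates the classes; the names `p1`, `p2`, `p3` are hypothetical. It requires the Statistics and Machine Learning Toolbox.

```matlab
% Toy data, invented for illustration: column 1 separates the classes,
% columns 2 and 3 are pure noise.
rng(0);
X = [[rand(20,1); rand(20,1) + 2], rand(40,2)];
Y = [repmat({'A'},20,1); repmat({'B'},20,1)];

t = fitctree(X, Y, 'PredictorNames', {'p1','p2','p3'});

% CutPredictor holds the predictor used at each node (empty for leaf nodes)
isBranch = ~cellfun(@isempty, t.CutPredictor);
usedPredictors = unique(t.CutPredictor(isBranch))

% Importance is zero for predictors the tree never split on
imp = predictorImportance(t)
```

As far as I know, `fitctree` also refuses by default to split nodes with fewer than 10 observations (`'MinParentSize'`), which limits depth on a 16-row data set. Passing, for example, `'MinParentSize',2` allows a deeper tree, at the cost of a higher risk of overfitting.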

Related

Split a vector into smaller vectors in matlab

Is there an efficient way to do the following?
I have this vector:
0.923
0.757
0.552
0.298
0.079
0.925
0.769
0.565
0.297
0.075
0.927
0.777
0.572
0.294
0.072
0.931
0.778
0.57
0.292
0.07
0.933
0.78
0.566
0.293
0.075
I want to split this vector into smaller vectors of 5 values each, adding a 1 at the top and a 0 at the end of each vector,
like this:
1
0.923
0.757
0.552
0.298
0.079
0
1
0.925
0.769
0.565
0.297
0.075
0
1
0.927
0.777
0.572
0.294
0.072
0
1
0.931
0.778
0.57
0.292
0.07
0
1
0.933
0.78
0.566
0.293
0.075
0
Can I use cumsum to find the difference between values 1 and 2 in the same vector?
For example, for the first vector:
0.923 - 1 = -0.077
and then form another vector with the answers?
v =[0.923
0.757
0.552
0.298
0.079
0.925
0.769
0.565
0.297
0.075
0.927
0.777
0.572
0.294
0.072
0.931
0.778
0.57
0.292
0.07
0.933
0.78
0.566
0.293
0.075];
A = reshape(v,5,[]);
A = [ones(1,size(A,2)) ; A ; zeros(1,size(A,2))]
A =
1.00000 1.00000 1.00000 1.00000 1.00000
0.92300 0.92500 0.92700 0.93100 0.93300
0.75700 0.76900 0.77700 0.77800 0.78000
0.55200 0.56500 0.57200 0.57000 0.56600
0.29800 0.29700 0.29400 0.29200 0.29300
0.07900 0.07500 0.07200 0.07000 0.07500
0.00000 0.00000 0.00000 0.00000 0.00000
B = diff(A)
B =
-0.077000 -0.075000 -0.073000 -0.069000 -0.067000
-0.166000 -0.156000 -0.150000 -0.153000 -0.153000
-0.205000 -0.204000 -0.205000 -0.208000 -0.214000
-0.254000 -0.268000 -0.278000 -0.278000 -0.273000
-0.219000 -0.222000 -0.222000 -0.222000 -0.218000
-0.079000 -0.075000 -0.072000 -0.070000 -0.075000
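On the cumsum part of the question: cumsum accumulates values, so it is the inverse of diff rather than a way to compute differences. A minimal sketch on the first ten values of the vector; the variable names `D` and `Aback` are mine, not from the question:

```matlab
% First two groups of five values from the question
v = [0.923 0.757 0.552 0.298 0.079 0.925 0.769 0.565 0.297 0.075].';

A = reshape(v, 5, []);                            % one group per column
A = [ones(1,size(A,2)); A; zeros(1,size(A,2))];   % pad with 1 on top, 0 at the bottom

% diff computes lower minus upper; negate it to get upper minus lower,
% so the first entry is 1 - 0.923
D = -diff(A);

% cumsum undoes diff when seeded with the first row
Aback = cumsum([A(1,:); diff(A)]);
```

Here `D` has 6 rows and 2 columns, and `Aback` reproduces `A` up to rounding error.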

Bar plot with standard deviation

I am plotting a bar plot with standard deviations in MATLAB. The data are as follows:
y = [0.776 0.707 1.269; 0.749 0.755 1.168; 0.813 0.734 1.270; 0.845 0.844 1.286];
std_dev = [0.01 0.055 0.052;0.067 0.119 0.106;0.036 0.077 0.060; 0.029 0.055 0.051];
I am using the following code:
figure
hold on
bar(y)
errorbar(y,std_dev,'.')
But the standard deviation bars do not appear in the correct positions.
If all the bars have the same color:
x=1:15;
y = [0.776 0.707 1.269 0 0.749 0.755 1.168 0 0.813 0.734 1.270 0 0.845 0.844 1.286];
std_dev = [0.01 0.055 0.052 0 0.067 0.119 0.106 0 0.036 0.077 0.060 0 0.029 0.055 0.051];
figure
hold on
bar(x,y)
errorbar(x,y,std_dev,'.')
XTickLabel={'1' ; '2'; '3' ; '4'};
XTick=2:4:15;
set(gca, 'XTick',XTick);
set(gca, 'XTickLabel', XTickLabel);
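If you would rather keep the grouped bars from the original code, one option is to ask each bar series where its bars are centered and draw the error bars there. This is a sketch assuming MATLAB R2019b or later, where Bar objects expose the `XEndPoints` property; on older releases the bar centers have to be computed by hand.

```matlab
y = [0.776 0.707 1.269; 0.749 0.755 1.168; 0.813 0.734 1.270; 0.845 0.844 1.286];
std_dev = [0.01 0.055 0.052; 0.067 0.119 0.106; 0.036 0.077 0.060; 0.029 0.055 0.051];

figure
hb = bar(y);        % one Bar object per column of y
hold on
for k = 1:numel(hb)
    % x-coordinates of the bar centers for series k (R2019b+)
    xk = hb(k).XEndPoints(:);
    errorbar(xk, y(:,k), std_dev(:,k), 'k.');
end
hold off
```

The bars keep their per-series colors, and each error bar sits on top of its own bar instead of on the group center.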

classifier.setOptions(weka.core.Utils.splitOptions()) uses only default values even when other values are provided in MATLAB

import weka.core.Instances.*
filename = 'C:\Users\Girish\Documents\MATLAB\DRESDEN_NSC.csv';
loader = weka.core.converters.CSVLoader();
loader.setFile(java.io.File(filename));
data = loader.getDataSet();
data.setClassIndex(data.numAttributes()-1);
%% classification
classifier = weka.classifiers.trees.J48();
classifier.setOptions( weka.core.Utils.splitOptions('-C 0.25 -M 2') );
classifier.buildClassifier(data);
classifier.toString()
ev = weka.classifiers.Evaluation(data);
v(1) = java.lang.String('-t');
v(2) = java.lang.String(filename);
v(3) = java.lang.String('-split-percentage');
v(4) = java.lang.String('66');
prm = cat(1,v(1:4));
ev.evaluateModel(classifier, prm)
Result:
Time taken to build model: 0.04 seconds
Time taken to test model on training split: 0.01 seconds
=== Error on training split ===
Correctly Classified Instances 767 99.2238 %
Incorrectly Classified Instances 6 0.7762 %
Kappa statistic 0.9882
Mean absolute error 0.0087
Root mean squared error 0.0658
Relative absolute error 1.9717 %
Root relative squared error 14.042 %
Total Number of Instances 773
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure MCC ROC Area PRC Area Class
0.994 0.009 0.987 0.994 0.990 0.984 0.999 0.999 Nikon
1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 Sony
0.981 0.004 0.990 0.981 0.985 0.980 0.999 0.997 Canon
Weighted Avg. 0.992 0.004 0.992 0.992 0.992 0.988 1.000 0.999
=== Confusion Matrix ===
a b c <-- classified as
306 0 2 | a = Nikon
0 258 0 | b = Sony
4 0 203 | c = Canon
=== Error on test split ===
Correctly Classified Instances 358 89.9497 %
Incorrectly Classified Instances 40 10.0503 %
Kappa statistic 0.8482
Mean absolute error 0.0656
Root mean squared error 0.2464
Relative absolute error 14.8485 %
Root relative squared error 52.2626 %
Total Number of Instances 398
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure MCC ROC Area PRC Area Class
0.885 0.089 0.842 0.885 0.863 0.787 0.908 0.832 Nikon
0.993 0.000 1.000 0.993 0.997 0.995 0.997 0.996 Sony
0.796 0.060 0.841 0.796 0.818 0.749 0.897 0.744 Canon
Weighted Avg. 0.899 0.048 0.900 0.899 0.899 0.853 0.938 0.867
=== Confusion Matrix ===
a b c <-- classified as
123 0 16 | a = Nikon
0 145 1 | b = Sony
23 0 90 | c = Canon
Second run, with different options:
import weka.core.Instances.*
filename = 'C:\Users\Girish\Documents\MATLAB\DRESDEN_NSC.csv';
loader = weka.core.converters.CSVLoader();
loader.setFile(java.io.File(filename));
data = loader.getDataSet();
data.setClassIndex(data.numAttributes()-1);
%% classification
classifier = weka.classifiers.trees.J48();
classifier.setOptions( weka.core.Utils.splitOptions('-C 0.1 -M 1') );
classifier.buildClassifier(data);
classifier.toString()
ev = weka.classifiers.Evaluation(data);
v(1) = java.lang.String('-t');
v(2) = java.lang.String(filename);
v(3) = java.lang.String('-split-percentage');
v(4) = java.lang.String('66');
prm = cat(1,v(1:4));
ev.evaluateModel(classifier, prm)
Result:
Time taken to build model: 0.04 seconds
Time taken to test model on training split: 0 seconds
=== Error on training split ===
Correctly Classified Instances 767 99.2238 %
Incorrectly Classified Instances 6 0.7762 %
Kappa statistic 0.9882
Mean absolute error 0.0087
Root mean squared error 0.0658
Relative absolute error 1.9717 %
Root relative squared error 14.042 %
Total Number of Instances 773
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure MCC ROC Area PRC Area Class
0.994 0.009 0.987 0.994 0.990 0.984 0.999 0.999 Nikon
1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 Sony
0.981 0.004 0.990 0.981 0.985 0.980 0.999 0.997 Canon
Weighted Avg. 0.992 0.004 0.992 0.992 0.992 0.988 1.000 0.999
=== Confusion Matrix ===
a b c <-- classified as
306 0 2 | a = Nikon
0 258 0 | b = Sony
4 0 203 | c = Canon
=== Error on test split ===
Correctly Classified Instances 358 89.9497 %
Incorrectly Classified Instances 40 10.0503 %
Kappa statistic 0.8482
Mean absolute error 0.0656
Root mean squared error 0.2464
Relative absolute error 14.8485 %
Root relative squared error 52.2626 %
Total Number of Instances 398
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure MCC ROC Area PRC Area Class
0.885 0.089 0.842 0.885 0.863 0.787 0.908 0.832 Nikon
0.993 0.000 1.000 0.993 0.997 0.995 0.997 0.996 Sony
0.796 0.060 0.841 0.796 0.818 0.749 0.897 0.744 Canon
Weighted Avg. 0.899 0.048 0.900 0.899 0.899 0.853 0.938 0.867
=== Confusion Matrix ===
a b c <-- classified as
123 0 16 | a = Nikon
0 145 1 | b = Sony
23 0 90 | c = Canon
Both sets of options give the same result, which is the result for the default options, i.e. -C 0.25 -M 2 for the J48 classifier.
Please help! I have been stuck on this for a long time. I have tried different approaches, but nothing has worked for me.
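One possible cause, which I have not verified against your Weka version: to my understanding, the static method `Evaluation.evaluateModel(classifier, options)` treats whatever remains in the options array after the general options (`-t`, `-split-percentage`, ...) as scheme options and applies them to the classifier with `setOptions`, then rebuilds the model itself. If no scheme options are in that array, the classifier would be reset to its defaults, which would explain the identical output. Under that assumption, a sketch of a workaround is to pass the J48 options in the same array:

```matlab
filename = 'C:\Users\Girish\Documents\MATLAB\DRESDEN_NSC.csv';

% General evaluation options first, then the J48 scheme options.
% MATLAB converts a cell array of char vectors to a Java String[].
prm = {'-t', filename, '-split-percentage', '66', '-C', '0.1', '-M', '1'};

% Only run the Weka part when Weka is on the Java class path
if exist('weka.classifiers.Evaluation', 'class') == 8
    classifier = weka.classifiers.trees.J48();
    result = weka.classifiers.Evaluation.evaluateModel(classifier, prm);
    disp(char(result))
end
```

If the assumption holds, the training and test statistics should now change when `-C` and `-M` change. Note that `-M 1` with a low confidence factor may still produce a tree close to the default one on some data sets.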

predict value of curve in matlab

Suppose I have the following vector of points:
X=[ 0.401 0.398 0.395 0.392 0.388 0.384 0.381 0.377 0.373 0.368 0.364 0.359 0.354 0.349 0.344 0.339 0.334 0.328 0.322 0.316 0.310 0.304 0.297 0.291 0.284 0.277 0.270 0.263 0.256 0.249 0.242 0.234 0.227 0.220 0.212 0.205 0.198 0.190 0.183 0.176 0.169 0.161 0.154 0.147 0.140 0.134 0.127 0.120 0.113 0.107 0.101 0.094 0.088 0.082 0.076 0.070 0.064 0.059 0.053 0.048 0.042 0.037 0.032 0.027 0.022 0.018 0.013 0.009 0.004 0.000 -0.004 -0.008 -0.012 -0.016 -0.019 -0.023 -0.026 -0.030 -0.033 -0.036 -0.039 -0.042 -0.045 -0.048 -0.050 -0.053 -0.055 -0.058 -0.060 -0.062 -0.064 -0.066 -0.068 -0.070 -0.072 -0.074 -0.076 -0.077 -0.079 -0.080];
Y=[0.347 0.362 0.377 0.393 0.409 0.426 0.442 0.459 0.477 0.494 0.512 0.530 0.548 0.567 0.585 0.604 0.622 0.641 0.659 0.678 0.696 0.715 0.733 0.750 0.768 0.785 0.801 0.817 0.833 0.848 0.863 0.876 0.890 0.902 0.914 0.925 0.935 0.945 0.953 0.961 0.969 0.975 0.981 0.986 0.990 0.993 0.996 0.998 0.999 1.000 1.000 0.999 0.998 0.996 0.994 0.991 0.988 0.984 0.979 0.974 0.969 0.963 0.957 0.951 0.944 0.937 0.930 0.922 0.914 0.906 0.898 0.889 0.881 0.872 0.863 0.855 0.846 0.837 0.827 0.818 0.809 0.800 0.791 0.782 0.773 0.764 0.755 0.747 0.738 0.729 0.721 0.712 0.704 0.696 0.688 0.680 0.672 0.664 0.656 0.649];
When I plot the points X and Y, this is what I get:
I want to calculate the value of 'Width' of the curve W. How can I do that?
The points are not sorted in ascending order of X, so subtracting the first point from the last would not give a positive width. A more robust approach is to use max and min on the X array, which does not depend on the ordering:
Width = max(X) - min(X);
It's certainly as simple as that! FWIW, your title says one thing, but your question asks another. Suggest you either edit your question or title for clarity.
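A quick check on a handful of the X values from the question (the subset `Xs` is picked by hand for brevity) shows that the result does not depend on the order of the points:

```matlab
% A few of the X values from the question, in their original (descending) order
Xs = [0.401 0.304 0.101 0.000 -0.080];

Width = max(Xs) - min(Xs);

% Reversing the order gives the same width
WidthFlipped = max(fliplr(Xs)) - min(fliplr(Xs));
```

Both expressions evaluate to approximately 0.481 for this subset, which spans the same range as the full X vector.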

How to solve Ax=b when some values of x are known and b is vector of zeros

I have a matrix Q=A (64x64), a vector f=b that is all zeros, and I know some values of x=q. I know that I should move the corresponding columns and rows (of the known x=q) to the right-hand side of the equation (into b), but I don't know how to do that in MATLAB. I need to do this for the 1st, 5th, 9th, 13th, 17th, 21st, 25th, 29th, 33rd, 37th, 41st, 45th, 49th, 53rd, 57th and 61st columns and rows. Can you help me, please?
This is the program:
clear all;
K=zeros(64,64);
f=zeros(64,1);
ne=32;
E= 8000; %Young
P= 0.51; %Poisson
Lambda=(E*P)/((1+P)*(1-2*P));
Eta=E/(2*(1+P));
%ILOK
ILOK=[
1 3 5 7 2 4 6 8;
5 7 9 11 6 8 10 12;
9 11 13 15 10 12 14 16;
13 15 17 19 14 16 18 20;
17 19 21 23 18 20 22 24;
21 23 25 27 22 24 26 28;
25 27 29 31 26 28 30 32;
29 31 33 35 30 32 34 36;
33 35 37 39 34 36 38 40;
37 39 41 43 38 40 42 44;
41 43 45 47 42 44 46 48;
45 47 49 51 46 48 50 52;
49 51 53 55 50 52 54 56;
53 55 57 59 54 56 58 60;
57 59 61 63 58 60 62 64;
61 63 1 3 62 64 2 4;
3 0 7 0 4 0 8 0;
7 0 11 0 8 0 12 0;
11 0 15 0 12 0 16 0;
15 0 19 0 16 0 20 0;
19 0 23 0 20 0 24 0;
23 0 27 0 24 0 28 0;
27 0 31 0 28 0 32 0;
31 0 35 0 32 0 36 0;
35 0 39 0 36 0 40 0;
39 0 43 0 40 0 44 0;
43 0 47 0 44 0 48 0;
47 0 51 0 48 0 52 0;
51 0 55 0 52 0 56 0;
55 0 59 0 56 0 60 0;
59 0 63 0 60 0 64 0;
63 0 3 0 64 0 4 0;
];
%x
xm=[
9.000 14.500 8.315 13.396;
8.315 13.396 6.364 10.253;
6.364 10.253 3.444 5.549;
3.444 5.549 0.000 0.000;
0.000 0.000 -3.444 -5.549;
-3.444 -5.549 -6.364 -10.253;
-6.364 -10.253 -8.315 -13.396;
-8.315 -13.396 -9.000 -14.500;
-9.000 -14.500 -8.315 -13.396;
-8.315 -13.396 -6.364 -10.253;
-6.364 -10.253 -3.444 -5.549;
-3.444 -5.549 0.000 0.000;
0.000 0.000 3.444 5.549;
3.444 5.549 6.364 10.253;
6.364 10.253 8.315 13.396;
8.315 13.396 9.000 14.500;
14.500 20.000 13.396 18.748;
13.396 18.748 10.253 14.142;
10.253 14.142 5.549 7.654;
5.549 7.654 0.000 0.000;
0.000 0.000 -5.549 -7.654;
-5.549 -7.654 -10.253 -14.142;
-10.253 -14.142 -13.396 -18.748;
-13.396 -18.748 -14.500 -20.000;
-14.500 -20.000 -13.396 -18.748;
-13.396 -18.748 -10.253 -14.142;
-10.253 -14.142 -5.549 -7.654;
-5.549 -7.654 0.000 0.000;
0.000 0.000 5.549 7.654;
5.549 7.654 10.253 14.142;
10.253 14.142 13.396 18.748;
13.396 18.748 14.500 20.000;
];
%y
ym=[
0.000 0.000 3.444 5.549
3.444 5.549 6.364 10.253
6.364 10.253 8.315 13.396
8.315 13.396 9.000 14.500
9.000 14.500 8.315 13.396
8.315 13.396 6.364 10.253
6.364 10.253 3.444 5.549
3.444 5.549 0.000 0.000
0.000 0.000 -3.444 -5.549
-3.444 -5.549 -6.364 -10.253
-6.364 -10.253 -8.315 -13.396
-8.315 -13.396 -9.000 -14.500
-9.000 -14.500 -8.315 -13.396
-8.315 -13.396 -6.364 -10.253
-6.364 -10.253 -3.444 -5.549
-3.444 -5.549 0.000 0.000
0.000 0.000 5.549 7.654
5.549 7.654 10.253 14.142
10.253 14.142 13.396 18.748
13.396 18.748 14.500 20.000
14.500 20.000 13.396 18.748
13.396 18.748 10.253 14.142
10.253 14.142 5.549 7.654
5.549 7.654 0.000 0.000
0.000 0.000 -5.549 -7.654
-5.549 -7.654 -10.253 -14.142
-10.253 -14.142 -13.396 -18.748
-13.396 -18.748 -14.500 -20.000
-14.500 -20.000 -13.396 -18.748
-13.396 -18.748 -10.253 -14.142
-10.253 -14.142 -5.549 -7.654
-5.549 -7.654 0.000 0.000
];
%Ke and fe of the element
for k=1:ne
x=xm(k,:);%k-th row of the x matrix
y=ym(k,:);%k-th row of the y matrix
Au=zeros(4,4);
Av=zeros(4,4);
Auv=zeros(4,4);
Avu=zeros(4,4);
%Numerical integration
for i=1:9
a=0.774596669241483;
gaus=[1 0 0 68/81;
2 0 a 40/81;
3 a 0 40/81;
4 0 -a 40/81;
5 -a 0 40/81;
6 a a 25/81;
7 a -a 25/81;
8 -a -a 25/81;
9 -a a 25/81];
r=gaus(i,2);
s=gaus(i,3);
N=[(1/4)*(1-r)*(1-s);
(1/4)*(1+r)*(1-s);
(1/4)*(1+r)*(1+s);
(1/4)*(1-r)*(1+s)];
Nr=[(1/4)*(s-1);
(1/4)*(1-s);
(1/4)*(s+1);
(1/4)*(-s-1)];
Ns=[(1/4)*(r-1);
(1/4)*(-1-r);
(1/4)*(r+1);
(1/4)*(1-r)];
%Jacobian matrix
j1=Nr'*x';
j2=Nr'*y';
j3=Ns'*x';
j4=Ns'*y';
J=[j1 j2;
j3 j4];
detJ=abs(det(J));
invJ=inv(J);
%Nx and Ny
Nx=invJ(1,1)*Nr+invJ(1,2)*Ns;
Ny=invJ(2,1)*Nr+invJ(2,2)*Ns;
ds=gaus(i,4)*detJ;
Au=Au+(Nx*(Lambda*Nx'+2*Eta*Nx')+Eta*Ny*Ny')*ds;
Av=Av+(Ny*(Lambda*Ny'+2*Eta*Ny')+Eta*Nx*Nx')*ds;
Auv=Auv+(Nx*Lambda*Ny'+Eta*Ny*Nx')*ds;
Avu=Avu+(Ny*Lambda*Nx'+Eta*Nx*Ny')*ds;
Ke=[Au Auv;
Avu Av];
fe=zeros(8,1);
end
%K and f
N=8;
je=1:N;
mg(je)=ILOK(k,je);
igl=mg;
inen=find(igl);
K(igl(inen),igl(inen))=K(igl(inen),igl(inen))+Ke(igl>0,igl>0);
f(igl(inen))=f(igl(inen))+fe(igl>0);
Ke=zeros(8,8);
fe=zeros(8,1);
end
K;
And then I need to solve q=K\f
I would like to move the columns and rows from this matrix to f (in this case).
Thank you for your help :)
Solving Ax = 0 is known as finding the null space of the matrix A.
Z = null(A)
Z = null(A,'r')
Z = null(A) is an orthonormal basis for the null space of A obtained
from the singular value decomposition. That is, A*Z has negligible
elements, size(Z,2) is the nullity of A, and Z'*Z = I.
Z = null(A,'r') is a "rational" basis for the null space obtained from
the reduced row echelon form. A*Z is zero, size(Z,2) is an estimate
for the nullity of A, and, if A is a small matrix with integer
elements, the elements of the reduced row echelon form (as computed
using rref) are ratios of small integers.
The orthonormal basis is preferable numerically, while the rational
basis may be preferable pedagogically.
Please refer to the fully worked out examples in the References section below, as they include MATLAB-specific examples, and worked out "by hand" solutions.
Good luck!
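The step described in the question, moving the columns of the known unknowns to the right-hand side, can also be done by index partitioning. Below is a minimal sketch on a hypothetical 4x4 system standing in for the 64x64 K; the matrix, the indices in `known`, and the values in `qKnown` are made up for illustration.

```matlab
% Hypothetical small system standing in for the assembled 64x64 K
K = [ 4 -1  0 -1;
     -1  4 -1  0;
      0 -1  4 -1;
     -1  0 -1  4];
f = zeros(4,1);

known  = [1 3];      % indices of the prescribed unknowns (assumed)
qKnown = [1; 2];     % their prescribed values (assumed)

free = setdiff(1:size(K,1), known);   % indices still to be solved for

q = zeros(size(K,1), 1);
q(known) = qKnown;

% Move the known columns to the right-hand side and solve the reduced system
q(free) = K(free,free) \ (f(free) - K(free,known)*qKnown);
```

For the system in the question, the index set would be `known = 1:4:61`, with `qKnown` holding the 16 prescribed values in the same order. Only the rows in `free` are satisfied afterwards; the rows in `known` give the reactions, `K(known,:)*q - f(known)`.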
References
http://www.mathworks.com/help/matlab/ref/null.html
http://www.math.sunysb.edu/~badger/mat211f12/solver.pdf