I'm trying to use TreeBagger to build a classifier based on the UCI Diabetes 130-US database, http://archive.ics.uci.edu/ml/datasets/Diabetes+130-US+hospitals+for+years+1999-2008.
I have imported the data as a table (letting MATLAB decide on the data types) and have done some cleaning on it. I'm calling the classifier as in an example, but with my own data:
num_trees = 50;
B = TreeBagger(num_trees, train, train.readmitted,...
'OOBPrediction','On',...
'Method','classification');
oobErrorBaggedEnsemble = oobError(B);
plot(oobErrorBaggedEnsemble)
xlabel 'Number of grown trees';
ylabel 'Out-of-bag classification error';
I get the following error:
Error using classreg.learning.internal.table2PredictMatrix>makeXMatrix (line 100)
Table variable is not a valid predictor.
Error in classreg.learning.internal.table2PredictMatrix (line 57)
Xout = makeXMatrix(X,CategoricalPredictors,vrange,pnames);
Error in classreg.learning.classif.CompactClassificationTree/predict (line 639)
X = classreg.learning.internal.table2PredictMatrix(X,[],[],...
Error in CompactTreeBagger/treeEval (line 1083)
[labels,~,nodes] = predict(tree,x);
Error in CompactTreeBagger/predictAccum (line 1414)
thisR = treeEval(bagger,it,thisX,doclassregtree);
Error in CompactTreeBagger/error (line 470)
predictAccum(bagger,X,'useifort',useIforT,...
Error in TreeBagger/oobError (line 1479)
err = error(bagger.Compact,bagger.X,bagger.Y,...
train is a table, and train.readmitted is a cell array retrieved from the table. Most of the columns are cell arrays, as most of the data in this dataset is categorical.
I'm wondering if there are certain data types that the classifier can't handle.
Thanks for any help!
Support for tables in the Statistics and Machine Learning Toolbox was introduced in R2016a. In earlier versions, data could only be passed to the fit* functions or TreeBagger as arrays.
The behavior changed from R2015(a,b) to R2016a.
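For releases before R2016a, one way to follow this advice is to convert the table to a numeric matrix before calling TreeBagger. The sketch below is only an illustration, not a tested fix for the Diabetes 130-US data: the toy table and the names predictorNames, X and Y are assumptions made for the example. Cell-string columns are encoded as integer codes with grp2idx, which imposes an arbitrary order on the categories, so the result should be treated with care.

```matlab
% Toy stand-in for the cleaned table (hypothetical values, not the real data).
train = table( ...
    {'Female';'Male';'Female';'Male';'Female';'Male'}, ...
    [5; 3; 8; 2; 7; 4], ...
    {'NO';'YES';'NO';'YES';'NO';'YES'}, ...
    'VariableNames', {'gender','time_in_hospital','readmitted'});

% Keep the response variable out of the predictors.
predictorNames = setdiff(train.Properties.VariableNames, {'readmitted'}, 'stable');

% Build a numeric predictor matrix, one column per table variable.
X = zeros(height(train), numel(predictorNames));
for k = 1:numel(predictorNames)
    col = train.(predictorNames{k});
    if iscellstr(col)
        X(:,k) = grp2idx(col);   % one integer code per distinct string
    else
        X(:,k) = double(col);    % assumes the column is numeric or logical
    end
end
Y = train.readmitted;            % cell array of class labels

num_trees = 50;
B = TreeBagger(num_trees, X, Y, ...
    'OOBPrediction', 'On', ...
    'Method', 'classification');
```

With arrays as input, the oobError and plot calls from the question can be applied to B unchanged.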
I am trying to learn relevant features in a 300*299 training matrix by taking a random row from it as my test data and applying sequentialfs on it. I have used the following code:
>> Md1=fitcdiscr(xtrain,ytrain);
>> func = @(xtrain, ytrain, xtest, ytest) sum(ytest ~= predict(Md1,xtest));
>> learnt = sequentialfs(func,xtrain,ytrain)
xtrain and ytrain are 299*299 and 299*1 respectively. predict gives me the predicted label for xtest (which is a random row from the original xtrain).
However, when I run my code, I get the following error:
Error using crossval>evalFun (line 480)
The function '@(xtrain,ytrain,xtest,ytest)sum(ytest~=predict(Md1,xtest))' generated the following error:
X must have 299 columns.
Error in crossval>getFuncVal (line 497)
funResult = evalFun(funorStr,arg(:));
Error in crossval (line 343)
funResult = getFuncVal(1, nData, cvp, data, funorStr, []);
Error in sequentialfs>callfun (line 485)
funResult = crossval(fun,x,other_data{:},...
Error in sequentialfs (line 353)
crit(k) = callfun(fun,x,other_data,cv,mcreps,ParOptions);
Error in new (line 13)
learnt = sequentialfs(func,xtrain,ytrain)
Where did I go wrong?
You should build your classifier inside func, not before.
sequentialfs calls the function each time on different sets, and a classifier must be built specifically for each set, using only the features sequentialfs selected for that iteration.
In case that wasn't clear: in practice, you should move the first line of your code (the fitcdiscr call) inside the body of func.
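As a sketch of that change, the discriminant model can be fitted inside the anonymous function, so that each call trains on exactly the columns sequentialfs passes in. The example uses the fisheriris sample data that ships with the toolbox rather than the 299-column matrix from the question, and the names X and y are placeholders chosen for illustration.

```matlab
% Sample data shipped with the Statistics toolbox (stand-in for the real matrix).
load fisheriris
X = meas;                 % 150-by-4 predictor matrix
y = grp2idx(species);     % numeric class labels, so ~= can compare them

% Fit the classifier inside the criterion function: xtrain here already
% contains only the candidate feature columns chosen by sequentialfs.
func = @(xtrain, ytrain, xtest, ytest) ...
    sum(ytest ~= predict(fitcdiscr(xtrain, ytrain), xtest));

% Returns a logical row vector marking the selected features.
learnt = sequentialfs(func, X, y);
```

Because the model is rebuilt on every call, the number of columns seen by predict always matches the number used for training, which is what the "X must have 299 columns" error was complaining about.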
Source: MathWorks
I have a time-domain signal with more than 4500 samples. From this signal I have extracted the following signature:
Using the code below (in MATLAB), I have managed to create a wavelet transform out of this signature.
Current_DIR = cd; % Save the current directory name.
cd(tempdir); % Work in a temporary directory.
familyName = 'MyWAVE T1';
familyShortName = 'mywa';
familyWaveType = 1;
familyNums = '';
fileWaveName = 'mywa.mat';
myna =F; %F is the signal
save myna mywa
wavemngr('add',familyName,familyShortName,familyWaveType, ...
familyNums,fileWaveName)
Once I had created the wavelet, I tried to plot it, which was successful; it looked as follows:
So far, so good.
When I try to use the wavemenu tool in matlab to view the wavelet it gives me the following error:
>> wavemenu
Error using vertcat
Dimensions of matrices being concatenated are not consistent.
Error in wdstem (line 38)
yy = [zeros(1,n);y;nan*ones(size(y))];
Error in wvdtool (line 390)
wdstem(axe_Lo_D,xVal_f,Lo_D,stemCOL,1);
Error while evaluating UIControl Callback
My question is, am I doing something wrong in the process? Is it even possible to take a signal and transform it into a wavelet in matlab?
Thank you in advance for your help. :)
I am trying to run the example at:
http://nl.mathworks.com/help/stats/group-comparisons-using-categorical-arrays.html
using Matlab R2013b.
clear
load('carsmall')
cars = table(MPG,Weight,Model_Year);
cars.Model_Year = nominal(cars.Model_Year);
fit = fitlm(cars,'MPG~Weight*Model_Year')
Unfortunately, I get the error:
Error using classreg.regr.FitObject/assignData (line 257)
Predictor and response variables must have the same length.
Error in classreg.regr.TermsRegression/assignData (line 349)
model = assignData@classreg.regr.ParametricRegression(model,X,y,w,asCat,varNames,excl);
Error in LinearModel.fit (line 852)
model = assignData(model,X,y,weights,asCatVar,dummyCoding,model.Formula.VariableNames,exclude);
Error in fitlm (line 111)
model = LinearModel.fit(X,varargin{:});
Any clue why?
I want to use ARX. X is a 1000x13 matrix (1000 samples with 13 features). I want to see the relationship between, for example, the 1st and 2nd columns of X. I don't know how to set the input parameters correctly. What should the size of [na nb nk] be for my regression problem? The MATLAB documentation doesn't have much detail.
Here is my code:
data = iddata(X(:,1),[],1); %I have to make iddata object first.
Y = arx(data,[ [ones(size(X(:,1),2),size(X(:,1),2))] [ones(size(X(:,2),1),size(X(:,1),1))] [ones(size(X(:,1),2),size(X(:,1),1))] ])
Error is:
Error using horzcat
Dimensions of matrices being concatenated are not consistent.
I tried to change the dimensions of [na nb nk], but every time, I got an error like:
Y = arx(data,[ [ones(size(X(:,1),2),size(X(:,1),2))] 1 [ones(size(X(:,1),2),size(X(:,1),1))] ])
Invalid ARX orders. Note that continuous-time ARX models are not supported.
Y = arx(data,[ 1 1 1])
Error using arx (line 77)
The model orders must be compatible with the input and output dimensions of the estimation data.
So I'm using an arima(3,0,0) model to forecast values in Matlab. I have a vector of initial values for it to use to forecast off of but I keep getting an error.
Here is my code:
model1=arima(3,0,0);
[EstMdl1,~,logL1]=estimate(model1,qtrdatachangelog2000(:,1));
Ymdl1pred=zeros(length(qtrdatachangelogafter2000),1);
for i=1:length(qtrdatachangelogafter2000)-2;
[Ymdl1pred(i)]=forecast(EstMdl1,1,'Y0',qtrdatachangelogafter2000(i+2,1));
end;
The error I get:
Error using internal.econ.LagIndexableTimeSeries.checkPresampleData (line 653)
Number of rows in presample array 'Y0' must be at least 3.
Error in arima/forecast (line 498)
Y0 = internal.econ.LagIndexableTimeSeries.checkPresampleData(zeros(maxPQ,numPaths),
'Y0', Y0, OBJ.P);
I'm assuming this is because the AR(3) model has 3 parameters and thus needs at least 3 rows of data before it can start, which is why I used i+2 in my for loop, but it still gives the error. Please help.
Indexing with i+2 does not select 3 rows; it selects only 1 row. You need to specify the start and end index of the range: i:i+2. You can use this code:
[Ymdl1pred(i)]=forecast(EstMdl1,1,'Y0',qtrdatachangelogafter2000(i:i+2,1));
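Put together, the corrected loop might look like the sketch below. It runs on simulated data because the original series are not shown; the variables y, yFit, yRest and nForecasts, and the split at observation 80, are assumptions for illustration. Only the i:i+2 indexing comes from the answer above.

```matlab
rng(0);                                            % reproducible toy data
y = filter(1, [1 -0.5 0.2 -0.1], randn(120, 1));   % simulated AR(3)-like series

yFit  = y(1:80);       % estimation sample (hypothetical split)
yRest = y(81:end);     % later sample that supplies the presample rows

EstMdl1 = estimate(arima(3,0,0), yFit, 'Display', 'off');

% Each one-step forecast needs the three most recent observations (P = 3),
% so the loop stops two rows before the end of yRest.
nForecasts = numel(yRest) - 2;
Ymdl1pred = zeros(nForecasts, 1);
for i = 1:nForecasts
    Ymdl1pred(i) = forecast(EstMdl1, 1, 'Y0', yRest(i:i+2));
end
```

Sizing Ymdl1pred to the number of forecasts, rather than to the full length of the series, avoids the two trailing zeros that the preallocation in the question would leave behind.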