Create a Tree Stump in Matlab

Any idea how to create a decision-tree stump for use with boosting in Matlab? Is there some parameter I can pass to classregtree to make sure I end up with only one level? I tried pruning and it doesn't always give a stump (single cut); sometimes I could only get two cuts (an unbalanced tree).
I'm aware of the ClassificationTree.template and fitensemble functions, but I want to write my own boosting algorithm to use with LDA or other classifiers that fitensemble does not provide.
Thanks

I believe you can just set the minparent parameter equal to your number of observations. Using the iris example data:
>> load fisheriris;
>> t = classregtree(meas,species,...
'names',{'SL' 'SW' 'PL' 'PW'}, 'minparent', 150)
t =
Decision tree for classification
1 if PL<2.45 then node 2 elseif PL>=2.45 then node 3 else setosa
2 class = setosa
3 class = versicolor
Not sure, but it may be quicker to eventually code it manually - especially if you're incorporating other custom code anyway. Good luck!

If t1 is your tree, as returned by classregtree, I think you can create a decision stump t2 with the command
t2 = prune(t1, 'level', max(prunelist(t1)-1));
Does that do what you need?

Related

PyTorch - how to undersample using WeightedRandomSampler

I have an unbalanced dataset and would like to undersample the class that is overrepresented. How do I go about it? I would like to use WeightedRandomSampler, but I am also open to other suggestions.
So far I am assuming that my code will have to be structured something like the following, but I don't know exactly how to do it.
trainset = datasets.ImageFolder(path_train,transform=transform)
...
sampler = data.WeightedRandomSampler(weights=..., num_samples=..., replacement=...)
...
trainloader = data.DataLoader(trainset, batch_size=batch_size, sampler=sampler)
I hope someone can help. Thanks a lot
From my understanding, the 'weights' argument of PyTorch's WeightedRandomSampler is somewhat similar to the 'p' argument of numpy.random.choice, which is the probability that a sample will get randomly selected. PyTorch uses the weights to randomly sample training examples, and the docs state that the weights don't have to sum to 1, which is why it's not exactly like numpy's random choice. The larger the weight, the more likely that sample will get sampled.
When you have replacement=True, training examples can be drawn more than once, which means you can have copies of training examples in your train set that get used to train your model; that is oversampling. Conversely, samples whose weights are low compared to the other training-sample weights have a lower chance of being selected at all; that is undersampling.
I have no clue how the num_samples argument works when used with the train loader, but I can warn you NOT to put your batch size there. Today I tried putting the batch size and it gave horrible results; my co-worker put the number of classes * 100 and his results were much better. I also tried putting the size of all my training data for num_samples, and it had better results but took forever to train. Either way, play around with it and see what works best for you. I would guess that the safe bet is to use the number of training examples for the num_samples argument.
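As a toy illustration of how the weights and num_samples arguments behave (my own sketch, not part of the original answer):
import torch
from torch.utils.data import WeightedRandomSampler

# Three samples: index 2 has a much larger weight than 0 and 1,
# so it should dominate the drawn indices.
weights = torch.tensor([0.1, 0.1, 0.8])

# num_samples is how many indices one pass over the sampler yields.
sampler = WeightedRandomSampler(weights=weights, num_samples=10, replacement=True)

print(list(sampler))  # e.g. [2, 2, 0, 2, 2, 1, 2, 2, 2, 0]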
Here's the example I saw somebody else use, and I use it as well for binary classification. It seems to work just fine: you take the inverse of the number of training examples for each class, and you assign every training example of that class its respective weight.
A quick example using your trainset object (imports added for completeness):
import numpy as np
from torch.utils import data
from torch.utils.data import WeightedRandomSampler

labels = np.array(trainset.samples)[:,1] # turn to array and take all of column index 1, which are the labels
labels = labels.astype(int) # change to int
majority_weight = 1/num_of_majority_class_training_examples
minority_weight = 1/num_of_minority_class_training_examples
sample_weights = np.array([majority_weight, minority_weight]) # this assumes the minority class is the integer 1 in the labels object; if not, switch places so it's minority_weight, majority_weight
weights = sample_weights[labels] # this goes through each training example and uses the labels 0 and 1 as the index into sample_weights, which is the weight you want for that class
sampler = WeightedRandomSampler(weights=weights, num_samples=len(labels), replacement=True) # num_samples=len(labels) follows the "number of training examples" guess above
trainloader = data.DataLoader(trainset, batch_size=batch_size, sampler=sampler)
Since the PyTorch docs say the weights don't have to sum to 1, I think you can also just use the ratio between the imbalanced classes. For example, if you had 100 training examples of the majority class and 50 training examples of the minority class, that's a 2:1 ratio. To counterbalance this, I think you can just use a weight of 1.0 for each majority-class training example and a weight of 2.0 for every minority-class training example, because technically you want the minority class to be twice as likely to be selected, which would balance your classes during random selection.
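For example, that ratio idea could be coded like this (a hypothetical sketch with made-up class counts):
import numpy as np

# Hypothetical counts: 100 majority examples, 50 minority examples (a 2:1 ratio)
class_counts = np.array([100, 50])

# Weight each class by the inverse of its share: majority -> 1.0, minority -> 2.0
class_weights = class_counts.max() / class_counts

labels = np.array([0] * 100 + [1] * 50) # 0 = majority, 1 = minority
weights = class_weights[labels]         # per-example weights for the sampler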
I hope this helped a little bit. Sorry for the sloppy writing, I was in a huge rush and saw that nobody answered. I struggled through this myself without being able to find any help for it either. If it doesn't make sense just say so and I'll re-edit it and make it more clear when I get free time.
Based on torchdata (disclaimer: I'm the author) one can create a custom undersampler.
First, an _Equalizer base class which:
creates multiple RandomSubsetSamplers (one for each class)
depending on the function passed ("min" or "max"), behaves as an undersampler or an oversampler
Code:
import builtins
import torch
from torch.utils.data import Sampler
# RandomSubsetSampler ships with the torchdata package mentioned above;
# the exact import path may vary between torchdata versions.
from torchdata.samplers import RandomSubsetSampler

class _Equalizer(Sampler):
    def __init__(self, labels: torch.tensor, function):
        if len(labels.shape) > 1:
            raise ValueError(
                "labels can only have a single dimension (N, ), got shape: {}".format(
                    labels.shape
                )
            )
        # One tensor of example indices per class
        tensors = [
            torch.nonzero(labels == i, as_tuple=False).flatten()
            for i in torch.unique(labels)
        ]
        # "min" equalizes down to the smallest class, "max" up to the largest
        self.samples_per_label = getattr(builtins, function)(map(len, tensors))
        # Sample with replacement only for classes smaller than the target count
        self.samplers = [
            iter(
                RandomSubsetSampler(
                    tensor,
                    replacement=len(tensor) < self.samples_per_label,
                    num_samples=self.samples_per_label
                    if len(tensor) < self.samples_per_label
                    else None,
                )
            )
            for tensor in tensors
        ]

    @property
    def num_samples(self):
        return self.samples_per_label * len(self.samplers)

    def __iter__(self):
        # Each round yields one index from every class sampler, in random class order
        for _ in range(self.samples_per_label):
            for index in torch.randperm(len(self.samplers)).tolist():
                yield next(self.samplers[index])

    def __len__(self):
        return self.num_samples
Now we can create the undersampler (the oversampler is included as well, since it is just as short):
class RandomUnderSampler(_Equalizer):
    def __init__(self, labels: torch.tensor):
        super().__init__(labels, "min")

class RandomOverSampler(_Equalizer):
    def __init__(self, labels):
        super().__init__(labels, "max")
Just pass your labels to __init__ (they have to be 1D, but can contain two or more classes) and you can over- or undersample your data.
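A hypothetical usage sketch (the labels below are made up, and trainset is assumed to be the Dataset from earlier):
import torch
from torch.utils.data import DataLoader

# Made-up imbalanced labels: 100 examples of class 0, 50 of class 1
labels = torch.tensor([0] * 100 + [1] * 50)

sampler = RandomUnderSampler(labels)  # yields 50 indices per class
loader = DataLoader(trainset, batch_size=32, sampler=sampler)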

Interclass and Intraclass classification structure of CNN

I am working on an inter-class and intra-class classification problem with one CNN: first there are two classes, Cat and Dog; then within Cat there is a classification into three different breeds of cats, and within Dog into five different breeds of dogs.
I haven't tried coding it yet; I am just working out whether the approach is feasible.
My question is: what would be a feasible design for this kind of problem?
For training, I am thinking of first building a CNN-1 network that differentiates cat from dog over all the training images. After the separation into cat and dog, CNN-2 and CNN-3 would then be trained further on the breeds of cat and dog respectively. I am just not sure how testing would work in this situation.
I have approached a similar problem previously in Python. Hopefully this is helpful and you can come up with an alternative implementation in Matlab if that is what you are using.
After all was said and done, I landed on a single model for all predictions. For your purpose you could have one binary output for dog vs. cat, another multi-class output for the dog breeds, and another multi-class output for the cat breeds.
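For illustration, here is a minimal sketch of such a three-output model in Keras (the layer sizes and output names are my own assumptions, not from the original answer):
from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(224, 224, 3))
x = layers.Conv2D(32, 3, activation="relu")(inputs)
x = layers.GlobalAveragePooling2D()(x)

# One shared backbone, three task-specific heads
animal = layers.Dense(1, activation="sigmoid", name="animal")(x)        # dog vs. cat
dog_breed = layers.Dense(5, activation="softmax", name="dog_breed")(x)  # 5 dog breeds
cat_breed = layers.Dense(3, activation="softmax", name="cat_breed")(x)  # 3 cat breeds

model = Model(inputs, [animal, dog_breed, cat_breed])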
Using Tensorflow, I created a mask for the irrelevant classes. For example, if the image was of a cat, then all of the dog breeds are irrelevant and they should not impact model training for that example. This required a customized TF Dataset (that converted 0's to -1 for the mask) and a customized loss function that returned 0 error when the mask was present for that example.
Finally, the training process. Specific to your question, you will have to create custom accuracy functions that handle the mask values however you want them handled, but otherwise this part of the process should be standard. It is best practice to spread the classes evenly through the training data, but they can all be trained together.
If you google "Multi-Task Training" you can find additional resources for this problem.
Here are some code snips if you are interested:
For the customized TF dataset that masks irrelevant labels...
import tensorflow as tf
from multiprocessing import cpu_count

# Replace 0's with -1 for the mask when there aren't any labels
def produce_mask(features):
    for filt, tensor in features.items():
        if "target" in filt:
            condition = tf.equal(tf.math.reduce_sum(tensor), 0)
            features[filt] = tf.where(condition, tf.ones_like(tensor) * -1, tensor)
    return features

def create_dataset(filepath, batch_size=10):
    ...
    # **** This is where the mask was applied to the dataset
    dataset = dataset.map(produce_mask, num_parallel_calls=cpu_count())
    ...
    return parsed_features
Custom loss function. I was using binary-crossentropy because my problem was multi-label. You will likely want to adapt this to categorical-crossentropy.
from tensorflow.keras import backend

# Custom loss function: masked entries contribute zero error
def masked_binary_crossentropy(y_true, y_pred):
    mask = backend.cast(backend.not_equal(y_true, -1), backend.floatx())
    return backend.binary_crossentropy(y_true * mask, y_pred * mask)
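A possible categorical cross-entropy adaptation of that idea (my sketch, not the original author's code):
def masked_categorical_crossentropy(y_true, y_pred):
    # An example is relevant only if none of its labels carry the -1 mask
    relevant = backend.all(backend.not_equal(y_true, -1), axis=-1)
    mask = backend.cast(relevant, backend.floatx())
    return backend.categorical_crossentropy(y_true, y_pred) * mask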
Then the custom accuracy metric. I was using top-k accuracy; you may need to modify this for your purposes, but it gives the general idea. In contrast to the loss function, instead of converting masked values to 0, which would over-inflate the accuracy, this function filters those examples out entirely. That works because the outputs are measured individually, so each output (binary, cat breed, dog breed) gets its own accuracy measure, filtered to only the relevant examples.
backend is the Keras backend.
from tensorflow.keras.metrics import top_k_categorical_accuracy

def top_5_acc(y_true, y_pred, k=5):
    # Keep only the rows whose labels are not entirely masked
    mask = backend.cast(backend.not_equal(y_true, -1), tf.bool)
    mask = tf.math.reduce_any(mask, axis=1)
    masked_true = tf.boolean_mask(y_true, mask)
    masked_pred = tf.boolean_mask(y_pred, mask)
    return top_k_categorical_accuracy(masked_true, masked_pred, k)
Edit
No, in the scenario I described above there is only one model, and it is trained with all of the data together. There are 3 outputs from the single model. The mask is a major part of this, as it allows the network to adjust only the weights that are relevant to the example: if the image is a cat, the dog-breed prediction does not contribute any loss.
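To make that concrete, compiling the single model might look like this (a sketch that reuses the hypothetical output names and masked loss from above):
model.compile(
    optimizer="adam",
    loss={
        "animal": "binary_crossentropy",               # dog vs. cat, always relevant
        "dog_breed": masked_categorical_crossentropy,  # masked out for cat images
        "cat_breed": masked_categorical_crossentropy,  # masked out for dog images
    },
)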

How to use `crossval` in matlab for a Leave one Out Validation method

I have been reading the documentation (here and here), but it's really unclear to me and I don't see how to practically use crossval to do a leave-one-out cross-validation.
vals = crossval(fun,X)
vals = crossval(fun,X,Y,...)
mse = crossval('mse',X,y,'Predfun',predfun)
mcr = crossval('mcr',X,y,'Predfun',predfun)
val = crossval(criterion,X1,X2,...,y,'Predfun',predfun)
vals = crossval(...,'name',value)
I really don't understand the fun part.
I have estimated the chlorophyll rate with different indices. Then I did a linear regression between those indices and the field-measured chlorophyll rate. Now I want to validate them: one of my estimates is a column with 22 entries, so I want to use 21 of them for training and 1 as a test, and loop 22 times so that every data point has been used as the test.
But I don't know where I should put the regression model. If my regression is Y = aX + b, do I re-use the a and b calculated before on the training part, or do I fit a new linear regression on the training part and then see what the test prediction is with that?
I am not sure I totally understood how to build a leave-one-out model.
Then I want to know the result of the test by calculating the RMSE (and maybe the R²).
How do I code that using crossval?
I saw the answer to the question here, but I don't have access to the crossvalind function with my license.
Well, I finally figured it out; this is my script.
First I loaded my data and defined the linear regression function:
X=indicesCha_without_Cloud(:,3);
y=Cha_g_m2t_without_Cloud(:,3);
testval=@(XTRAIN,ytrain,XTEST)Linear_regression_indices(XTRAIN,ytrain,XTEST);
where in my case fun (in the MathWorks help) is testval, and Linear_regression_indices is a very simple function:
function [ Linear_regression_indices ] = Linear_regression_indices(XTRAIN,ytrain,XTEST )
% Fit a degree-1 polynomial (a linear regression) on the training data
% and evaluate it at the test points
Linear_regression_indices=(polyval(polyfit(XTRAIN,ytrain,1),XTEST));
end
There are two ways to do it, and they both give the same result.
The first simply uses the crossval function:
cvMse = crossval('mse',X,y,'predfun',testval,'leaveout',1);
This will do as many folds as there are data points, each time using one of them as XTEST.
The second uses cvpartition:
c = cvpartition(n,'LeaveOut') creates a random partition for leave-one-out cross validation on n observations. Leave-one-out is a special case of 'KFold', in which the number of folds equals the number of observations. link
c = cvpartition(length(y),'LeaveOut'); % n = number of observations, per the doc above
cvMse2=crossval('mse',X,y,'predfun',testval,'partition',c);
Then the RMSE can easily be calculated:
RMSE=sqrt(cvMse);
RMSE2=sqrt(cvMse2);
Then I simply get my answer; in my case RMSE = 0.3548.

Control Toolbox: How does the function minreal work in Matlab?

I have a question about the function minreal in Matlab. From the Matlab help I would assume that the output is a minimal realization of a system. To my understanding this means that the output system is observable and controllable.
Example:
num = [ 6.40756397363316, -4511.90326777420, 7084807.91317081, -3549645853.18273, 2307781024837.00, -761727788683491, 2.26760542619190e+17, -1.54992537527829e+19, 5.58719150155001e+21 ];
den = [ 1, 824.614362937241, 1036273.19811846, 592905955.793358, 319582996989.696, 106244022544031, 2.87990542333047e+16, 2.36284104437760e+18, 3.50241006466156e+20, 0];
G = tf(num,den);
G_min = minreal(ss(G));
But it is not a minimal realization:
>> size(G_min)
State-space model with 1 outputs, 1 inputs, and 9 states.
>> rank(obsv(G_min))
ans = 6
>> rank(ctrb(G_min))
ans = 5
So obviously: rank(obsv(G_min)) != rank(ctrb(G_min)) != 9 (number of states).
Where is my mistake?
Thank you very much.
Conceptually you are correct, in that a minimal realization is controllable and observable. However, minreal makes no guarantees of that. As per the doc:
Pole-zero cancellation is a straightforward search through the poles and zeros looking for matches that are within tolerance. Transfer functions are first converted to zero-pole-gain form.
That is, minreal just does a somewhat mindless search for whether poles and zeros are close to each other, and makes no guarantees that the result satisfies any other conditions. Note that in your case you could specify a larger tolerance and more states would be eliminated:
>> G_red = minreal(G,10)
G_red =
6.408 s + 74.87
------------------------
s^2 + 625.7 s + 1.703e05
Continuous-time transfer function.
and you'd get something closer to what you might expect.
Alternatively, you'd most likely be better off transforming to a balanced realization and deciding which states to eliminate yourself. See the doc for balreal for an example of how to use it with modred to achieve this.
You might also take note of the doc for obsv, which clearly states that you shouldn't trust its results for anything other than toy problems:
obsv is here for educational purposes and is not recommended for serious control design. Computing the rank of the observability matrix is not recommended for observability testing. Ob will be numerically singular for most systems with more than a handful of states. This fact is well documented in the control literature. For example, see section III in http://lawww.epfl.ch/webdav/site/la/users/105941/public/NumCompCtrl.pdf

LIBSVM in MATLAB/Octave - what's the output of libsvmread?

The second output of the libsvmread command is a set of features for each given training example.
For example, in the following MATLAB command:
[heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
This second variable (heart_scale_inst) holds content in a form that I don't understand, for example:
<1, 1> -> 0.70833
What is the meaning of it? How is it to be used (I can't plot it, the way it is)?
PS. If anyone could please recommend a good LIBSVM tutorial, I'd appreciate it. I haven't found anything useful and the README file isn't very clear... Thanks.
The definitive tutorial on LIBSVM for beginners is called A Practical Guide to SVM Classification; it is available from the site of the LIBSVM authors.
The second output is called the instance matrix. It is a matrix; let's call it M. M(1,:) holds the features of data point 1, and so on. The matrix is sparse, which is why it prints out strangely. If you want to see it in full, print full(M).
[heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
With heart_scale_label and heart_scale_inst you should be able to train an SVM by issuing:
mod = svmtrain(heart_scale_label,heart_scale_inst,'-c 1 -t 0');
I strongly suggest you read the guide linked above to learn how to set the c parameter (and possibly, in the case of an RBF kernel, the gamma parameter), but the above line is how you would train with that data.
I think it is the probability with which the test case has been predicted to belong to the heart_scale label category.