I'm trying to use MATLAB's TreeBagger class, which implements a random forest.
I get some results, and can do a classification in MATLAB after training the classifier.
However I'd like to "see" the trees, or want to know how the classification works.
For example, let's run this minimal example I found here: Matlab treebagger example
So, I end up with a classifier stored in "B".
How can I inspect the trees? Like having a look at each node, to see on which criteria (e.g. feature) the decision is made?
Entering B returns:
B =
TreeBagger
Ensemble with 20 bagged decision trees:
Training X: [6x2]
Training Y: [6x1]
Method: classification
Nvars: 2
NVarToSample: 2
MinLeaf: 1
FBoot: 1
SampleWithReplacement: 1
ComputeOOBPrediction: 0
ComputeOOBVarImp: 0
Proximity: []
ClassNames: '0' '1'
I can't see anything like B.trees or similar.
And a follow-up question would be:
How do you port a random forest you prototyped in MATLAB to another language?
Then you need to know how each tree works, so you can implement it in the target language.
I hope you get the point, or understand my query ;)
Thanks for answers!
Best,
Patrick
Found out how to inspect the trees: run the view() command. E.g., for inspecting the first tree of the example:
>> view(B.Trees{1})
Decision tree for classification
1 if x2<650 then node 2 elseif x2>=650 then node 3 else 0
2 if x1<4.5 then node 4 elseif x1>=4.5 then node 5 else 1
3 class = 0
4 class = 0
5 class = 1
By passing some more arguments to the view() command, the tree can also be visualized:
view(B.Trees{1},'mode','graph')
To view multiple trees, just use a loop:
for n = 1:20 % number of trees in the ensemble
    view(B.Trees{n});
end
You can find the source here.
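For the follow-up question about porting, the rules printed by view() translate almost mechanically into another language. Here is a hand-translated Python sketch of tree 1, with the thresholds and class labels taken from the printout above (a full port would repeat this for every tree and take a majority vote over the ensemble):

```python
# Hand-translated decision rules of tree 1 from the view(B.Trees{1}) output:
#   node 1: if x2 < 650 go to node 2, else node 3
#   node 2: if x1 < 4.5 go to node 4, else node 5
#   node 3: class 0, node 4: class 0, node 5: class 1

def tree1(x1, x2):
    """Evaluate the decision rules of tree 1 for a single sample."""
    if x2 < 650:
        if x1 < 4.5:
            return 0  # node 4
        return 1      # node 5
    return 0          # node 3

print(tree1(3.0, 400))  # -> 0
print(tree1(5.0, 400))  # -> 1
print(tree1(5.0, 700))  # -> 0
```

The function and sample inputs are illustrative; the thresholds are exactly those shown in the printed tree.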
I am totally new to ASP; I am learning clingo and I have a problem with variables. I am working on graphs and paths in graphs, so I used a tuple such as g((1,2,3)). What I want is to add a new node to the path so that the tuple sequence holds. For instance, the code below will give me (0,(1,2,3)), but what I want is (0,1,2,3).
Thanks in advance.
g((1,2,3)).
g((0,X)):-g(X).
Naive fix:
g((0,X,Y,Z)) :- g((X,Y,Z)).
However, I sense that you want to store the path in the tuple as if it were a list. Bad news: unlike Prolog, clingo isn't meant to handle lists as terms of atoms (like your example does). Lists are handled by indexing the elements; for example, the list [a,b,c] would be stored in predicates like p(1,a). p(2,b). p(3,c). Why? Because of grounding: you aim for a small ground program to reduce the complexity of the solving process. To put it in numbers: assume you are searching for a path which includes all n nodes. This sums up to n! potential paths. For n=10 that is 3628800 potential paths, introducing 3628800 predicates for a comparatively small graph. Numbering the nodes as mentioned leads to only n*n potential predicates to represent the path. For n=10 that is just 100; compared to 3628800, a huge gain.
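The size gap described above is just arithmetic; a quick check of the two counts for n = 10:

```python
# Compare the two path encodings mentioned above for n nodes:
# - storing whole paths as tuples needs on the order of n! ground terms,
# - indexing positions needs only n*n path(Position, Node) atoms.
import math

n = 10
tuple_terms = math.factorial(n)  # one term per permutation of the n nodes
indexed_atoms = n * n            # one atom per (position, node) pair

print(tuple_terms)    # 3628800
print(indexed_atoms)  # 100
```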
To get an impression what you are searching for, run the following example derived from the potassco website:
% generating path: for every time exactly one node
{ path(T,X) : node(X) } = 1 :- T=1..6.
% one node isn't allowed on two different positions
:- path(T1,X), path(T2,X), T1!=T2.
% there has to be an edge between 2 adjacent positions
:- path(T,X), path(T+1,Y), not edge(X,Y).
#show path/2.
% Nodes
node(1..6).
% (Directed) Edges
edge(1,(2;3;4)). edge(2,(4;5;6)). edge(3,(1;4;5)).
edge(4,(1;2)). edge(5,(3;4;6)). edge(6,(2;3;5)).
Output:
Answer: 1
path(1,1) path(2,3) path(3,4) path(4,2) path(5,5) path(6,6)
Answer: 2
path(1,1) path(2,3) path(3,5) path(4,4) path(5,2) path(6,6)
Answer: 3
path(1,6) path(2,2) path(3,5) path(4,3) path(5,4) path(6,1)
Answer: 4
path(1,1) path(2,4) path(3,2) path(4,5) path(5,6) path(6,3)
Answer: 5
...
I'm trying to write a neural network for binary classification in PyTorch, and I'm confused about the loss function.
I see that BCELoss is a common function specifically geared for binary classification. I also see that an output layer of N outputs for N possible classes is standard for general classification. However, for binary classification it seems like it could be either 1 or 2 outputs.
So, should I have 2 outputs (1 for each label) and then convert my 0/1 training labels into [1,0] and [0,1] arrays, or use something like a sigmoid for a single-variable output?
Here are the relevant snippets of code so you can see:
self.outputs = nn.Linear(NETWORK_WIDTH, 2) # 1 or 2 dimensions?
def forward(self, x):
# other layers omitted
x = self.outputs(x)
return F.log_softmax(x) # <<< softmax over multiple vars, sigmoid over one, or other?
criterion = nn.BCELoss() # <<< Is this the right function?
net_out = net(data)
loss = criterion(net_out, target) # <<< Should target be an integer label or 1-hot vector?
Thanks in advance.
For binary outputs you can use a single output unit:
self.outputs = nn.Linear(NETWORK_WIDTH, 1)
Then you use a sigmoid activation to map the value of your output unit to a range between 0 and 1 (of course, you need to arrange your training labels this way too):
def forward(self, x):
# other layers omitted
x = self.outputs(x)
return torch.sigmoid(x)
Finally you can use the torch.nn.BCELoss:
criterion = nn.BCELoss()
net_out = net(data)
loss = criterion(net_out, target)
This should work fine for you.
You can also use torch.nn.BCEWithLogitsLoss; this loss function already includes the sigmoid, so you can leave it out of your forward.
If you want to use 2 output units, that is also possible. But then you need to use torch.nn.CrossEntropyLoss instead of BCELoss. The softmax activation is already included in this loss function.
Edit: I just want to emphasize that there is a real difference in doing so. Using 2 output units gives you twice as many weights compared to using 1 output unit, so these two alternatives are not equivalent.
Some theoretical background:
For binary classification (say class 0 and class 1), the network should have only 1 output unit. Its output will be 1 (class 1 present / class 0 absent) or 0 (class 1 absent / class 0 present).
For the loss calculation, you should first pass the output through a sigmoid and then through binary cross-entropy (BCE). The sigmoid transforms the network's output into a probability (between 0 and 1), and minimizing BCE then maximizes the likelihood of the desired output.
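To make the sigmoid + BCE pipeline concrete, here is a minimal pure-Python sketch (no PyTorch) of what it computes for a single logit z and a 0/1 target y; the value z = 2.0 is an arbitrary example:

```python
# What torch.sigmoid followed by nn.BCELoss computes for one sample.
import math

def sigmoid(z):
    # maps any real logit into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y):
    # binary cross-entropy: -[y*log(p) + (1-y)*log(1-p)]
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

z = 2.0           # raw network output (logit)
p = sigmoid(z)    # predicted probability of class 1
loss = bce(p, 1)  # target is class 1, so the loss is small when p is near 1

print(round(p, 4))     # 0.8808
print(round(loss, 4))  # 0.1269
```

BCEWithLogitsLoss fuses these two steps for numerical stability, which is why you drop the sigmoid from forward when using it.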
I'm new to neural networks, but I have a question regarding artificial neural networks (ANNs).
I have a list of objects. Each object contains a longitude, a latitude, and a list of words.
What I want to do is predict the location based on the text contained in the object (similar texts should have similar locations). Right now I'm using cosine similarity to calculate the similarity between the objects' texts, but I'm stuck on how I can use that information to train my neural network. I have a matrix containing each object and how many times each word appeared in that object. E.g., if I had these two objects:
Obj C: 54.123, 10.123, [This is a text for object C]
Obj B: 57.321, 11.113, [This is a another text for object B]
Then I have something like the following matrix
      This  is  a  text  for  object  C  another  B
ObjC:  1    1   1   1     1     1     1     0     0
ObjB:  1    1   1   1     1     1     0     1     1
I would also have something like the following for the similarity between the two objects (note that the numbers are not real):
      ObjC  ObjB
ObjC   1    0.25
ObjB  0.25   1
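For reference, the cosine similarity described here can be computed directly from the two count vectors in the matrix above (the 0.25 in the question is a placeholder, not the actual value); a small Python sketch:

```python
# Cosine similarity between the two word-count rows of the matrix above.
import math

obj_c = [1, 1, 1, 1, 1, 1, 1, 0, 0]
obj_b = [1, 1, 1, 1, 1, 1, 0, 1, 1]

dot = sum(a * b for a, b in zip(obj_c, obj_b))
norm_c = math.sqrt(sum(a * a for a in obj_c))
norm_b = math.sqrt(sum(b * b for b in obj_b))

similarity = dot / (norm_c * norm_b)
print(round(similarity, 4))  # 0.8018
```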
I have looked at how I use neural network to either classify things into groups (like A,B,C) or predict something like a housing price, but nothing that I find helpful for my problem.
I would consider the prediction right if it is within some distance X, since I’m dealing with location.
This might be a stupid question, but could someone point me in the right direction?
Everything helps!
Regards
Problem
I am trying to find the connected components of my undirected graph.
MATLAB's function conncomp does exactly this: Mathworks - connected graph components
Example
Using the example given on MATLAB's web page to keep it easy and repeatable:
G = graph([1 1 4],[2 3 5],[1 1 1],6);
plot(G)
bins = conncomp(G)
bins =
1 1 1 2 2 3
Two questions about this.
First question: using this, how can I recover the original node indices, so that
cluster1 = (1 2 3); (instead of (1 1 1))
cluster2 = (4 5); (instead of (2 2))
Second question:
I am working on a big dataset and I know many nodes are not connected, so is there a way to only display clusters that contain more than one node?
Thanks for your help, I am majorly stuck here.
You can use splitapply for the first part, like so:
clusters = splitapply(@(x) {x}, 1:numnodes(G), bins)
This returns a cell array where each cell contains the indices of the nodes in a group. You can filter this down in the usual way using cellfun:
discard = cellfun(@isscalar, clusters);
clusters(discard) = [];
(Note that splitapply is new in R2015b - but the OP is using graph, also new in R2015b, so it should be fine for them)
Actually, the first part of the question can be answered very simply, as MATLAB's conncomp provides a tool for this:
bins=conncomp(G,'OutputForm','cell');
This creates a cell array that contains the clusters, with all node names in the cells.
For the second part of the question I guess there are several ways, but this one works as well:
clusters = bins(cellfun(@numel, bins) > 1);
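The group-then-filter logic in both answers is easy to mirror in other languages. A Python sketch of the same two steps, using the bins vector from the example above:

```python
# Group node indices by their connected-component label, then drop the
# single-node clusters. bins is the conncomp output from the example.
from collections import defaultdict

bins = [1, 1, 1, 2, 2, 3]  # component label per node; nodes are 1-based

groups = defaultdict(list)
for node, label in enumerate(bins, start=1):
    groups[label].append(node)

clusters = [nodes for nodes in groups.values() if len(nodes) > 1]
print(clusters)  # [[1, 2, 3], [4, 5]]
```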
I am using the crossvalind function on very small data. However, I observe that it gives me unexpected results. Is this supposed to happen?
I have MATLAB R2012a, and here is my output:
crossvalind('KFold',1:1:11,5)
ans =
2
5
1
3
2
1
5
3
5
1
5
Notice the absence of set 4. Is this a bug? I expected at least 2 elements per set, but one set gets 0, and it happens a lot that the values are not uniformly distributed across the sets.
The help for crossvalind says that the form you are using is crossvalind(METHOD, GROUP, ...). In this case, GROUP is e.g. the class labels of your data. So 1:11 as the second argument is confusing here, because it suggests no two examples have the same label. I think this is sufficiently unusual that you shouldn't be surprised if the function does something strange.
I tried doing:
numel(unique(crossvalind('KFold', rand(11, 1) > 0.5, 5)))
and it reliably gave 5 as a result, which is what I would expect; my example would correspond to a two-class problem. (As a general rule, I would guess you'd want something like numel(unique(group)) <= numel(group) / folds.) My hypothesis would be that it tries to have one example of each class in the Kth fold, and at least 2 examples in every other, with a difference between fold sizes of no more than 1 - but I haven't looked at the code to verify this.
It is possible that you mean to do:
crossvalind('KFold', 11, 5);
which would compute 5 folds for 11 data points - this doesn't attempt to do anything clever with labels, so you would be sure that there will be K folds.
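For intuition, a rough Python sketch of the kind of balanced assignment you would expect from crossvalind('KFold', 11, 5): every fold non-empty and fold sizes differing by at most 1. (The real function also randomizes the assignment; this illustration skips the shuffle for clarity.)

```python
# Deterministic round-robin assignment of n data points to k folds.
n, k = 11, 5
folds = [(i % k) + 1 for i in range(n)]

print(folds)            # [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1]
print(len(set(folds)))  # 5 -> all folds are present
sizes = [folds.count(f) for f in range(1, k + 1)]
print(sizes)            # [3, 2, 2, 2, 2] -> sizes differ by at most 1
```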
However, in your problem, if you really have very few data points, then it is probably better to do leave-one-out cross validation, which you could do with:
crossvalind('LeaveMOut', 11, 1);
although a better method would be:
for leave_out=1:11
fold_number = (1:11) ~= leave_out;
<code here; where fold_number is 0, this is the leave-one-out example. fold_number = 1 means that the example is in the main fold.>
end
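The leave-one-out loop above can be sketched in plain Python as well: each pass builds a boolean mask in which exactly one example (the held-out one) is marked False.

```python
# Plain-Python version of the leave-one-out loop: one mask per pass,
# with exactly one False entry marking the held-out example.
n = 11
masks = []
for leave_out in range(1, n + 1):
    fold_mask = [i != leave_out for i in range(1, n + 1)]
    masks.append(fold_mask)

print(len(masks))                                  # 11 passes, one per point
print(all(m.count(False) == 1 for m in masks))     # True
```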