Traversing a tree and assigning a subtree in Julia - macros

I am trying to manipulate a tree in Julia. The tree is created as an object. All I want is to substitute one of the branches with another one. I can do it manually, but I cannot do it with a recursive function.
mutable struct ILeaf
    majority::Any   # +1 when prediction is correct
    values::Vector  # num_of_samples
    indicies::Any   # holds the index of training samples
end
mutable struct INode
    featid::Integer
    featval::Any
    left::Union{ILeaf,INode}
    right::Union{ILeaf,INode}
end
ILeafOrNode = Union{ILeaf,INode}
And my function for changing the tree is below (tree is the original tree; using lr_stack, I want to replace one of its branches with subtree):
function traverse_and_assign(tree, subtree, lr_stack) # by using Global LR_stack
    if top(lr_stack) == 0
        tree = subtree
    elseif top(lr_stack) == :LEFT
        pop!(lr_stack)
        return traverse_and_assign(tree.left, subtree, lr_stack)
    else # right otherwise
        pop!(lr_stack)
        return traverse_and_assign(tree.right, subtree, lr_stack)
    end
end
What happens is that I cannot change the original tree.
On the other hand :
tree.left.left = subtree
works perfectly fine.
What is wrong with my code ? Do I have to write a macro for this ?
edit#1
In order to generate data :
n, m = 10^3, 5 ;
features = randn(n, m);
labels = rand(1:2, n);
edit#2
use 100 samples for training the decision tree :
base_learner = build_iterative_tree(labels, features, [1:20;])
then give other samples one by one :
i = 21
feature = features[21, :]; label = labels[21]
gtree_stack, lr_stack = enter_iterate_on_tree(base_learner, feature[:], i, label[1])
get the indices of incorrect samples
ids = subtree_ids(gtree_stack)
build the subtree:
subtree = build_iterative_tree(l, f, ids)
update the original tree(base_learner):
traverse_and_assign(base_learner, subtree, lr_stack)

I am still missing an MWE (minimal working example), but maybe I can help with one problem without it.
In Julia, values are bound to variables, and function parameters are new variables. Let's test what that means:
function test_assign!(tree, subtree)
    tree = subtree
    return tree
end
a = 4;
b = 5;
test_assign!(a, b) # returns 5
show(a) # 4! a is not changed
What happened? The value 4 was bound to tree and the value 5 was bound to subtree.
Then subtree's value (5) was bound to tree.
And nothing else! This means a is still bound to 4.
How could we change a? This will work:
mutable struct SimplifiedNode
    featid::Integer
end
function test_assign!(tree, subtree)
    tree.featid = subtree.featid
end
a = SimplifiedNode(4)
b = SimplifiedNode(5)
test_assign!(a, b)
show(a) # SimplifiedNode(5)
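Why? What happened?
The value of a (which is something like a pointer to a mutable struct) is bound to tree, and the value of b is bound to subtree.
So a and tree are bound to the same structure! This means that if we change that structure, a is bound to the changed structure.
The same applies to your function: tree = subtree only rebinds the local variable tree, while tree.left.left = subtree works because it changes a field of a structure that the caller also references. So no macro is needed; the recursion has to stop at the parent node and assign to its left or right field.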

Related

Make the basis of a function from nest loop outer components

I have a segment of code where a composition of nested loops needs to be run at various times; however, each time the operations within the nested loops are different. Is there a way to make the outer portion (loop composition) somehow a functional piece, so that the internal operations are variable. For example, below, two code blocks are shown which both use the same loop introduction, but have different purposes. According to the principle of DRY, how can I improve this, so as not to need to repeat myself each time a similar loop needs to be used?
% BLOCK 1
for a = 0:max(aVec)
    for p = find(aVec'==a)
        iDval = iDauVec{p};
        switch numel(iDval)
            case 2
                r = rEqVec(iDval);
                qVec(iDval(1)) = qVec(p) * (r(2)^0.5 / (r(1)^0.5 + r(2)^0.5));
                qVec(iDval(2)) = qVec(p) - qVec(iDval(1));
            case 1
                qVec(iDval) = qVec(p);
        end
    end
end
% BLOCK 2
for gen = 0:max(genVec)-1
    for p = find(genVec'==gen)
        iDval = iDauVec{p};
        QinitVec(iDval) = QinitVec(p)/numel(iDval);
    end
end
You can write your loop structure as a function, which takes a function handle as one of its inputs. Within the loop structure, you can call this function to carry out your operation.
It looks as if the code inside the loop needs the values of p and iDval, and needs to assign to different elements of a vector variable in the workspace. In that case a suitable function definition might be something like this:
function vec = applyFunctionInLoop(aVec, vec, iDauVec, funcToApply)
    for a = 0:max(aVec)
        for p = find(aVec'==a)
            iDval = iDauVec{p};
            vec = funcToApply(vec, iDval, p);
        end
    end
end
You would need to put the code for each different operation you want to carry out in this way into a function with suitable input and output arguments:
function qVec = myFunc1(qVec, iDval, p)
    % The output name must match qVec exactly; MATLAB is case-sensitive.
    switch numel(iDval)
        case 2
            r = rEqVec(iDval); % see note
            qVec(iDval(1)) = qVec(p) * (r(2)^0.5 / (r(1)^0.5 + r(2)^0.5));
            qVec(iDval(2)) = qVec(p) - qVec(iDval(1));
        case 1
            qVec(iDval) = qVec(p);
    end
end
function v = myFunc2(v, ix, q)
    v(ix) = v(q)/numel(ix);
end
Now you can use your loop structure to apply each function:
qVec = applyFunctionInLoop(aVec, qVec, iDauVec, @myFunc1);         % @ passes a handle instead of calling myFunc1
QinitVec = applyFunctionInLoop(genVec, QinitVec, iDauVec, @myFunc2); % BLOCK 2 loops over genVec
and so on.
In most of the answer I've kept to the same variable names you used in your question, but in the definition of myFunc2 I've changed the names to emphasise that these variables are local to the function definition - the function is not operating on the variables you passed in to it, but on the values of those variables, which is why we have to pass the final value of the vector out again.
Note that if you want to use the values of other variables in your functions, such as rEqVec in myFunc1, you need to think about whether those variables will be available in the function's workspace. I recommend reading these help pages on the Mathworks site:
Share Data Between Workspaces
Dynamic Function Creation with Anonymous and Nested Functions
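One way to make extra data such as rEqVec available to the operation is to capture it in an anonymous function when the handle is created. The following is only a minimal sketch with made-up toy data: genVec, iDauVec, QinitVec, weights and shareOut are hypothetical, and the loop is written inline instead of going through applyFunctionInLoop.

```matlab
% Toy data (hypothetical): node 1 is the root with daughters 2 and 3.
genVec   = [0 1 1];
iDauVec  = {[2 3], [], []};
QinitVec = [8 0 0];
weights  = [0 0.25 0.75];   % extra workspace data the operation needs

% The handle captures the value of weights at the moment it is created.
shareOut = @(parentVal, ix) parentVal .* weights(ix);

for gen = 0:max(genVec)
    for p = find(genVec == gen)
        ix = iDauVec{p};
        if isempty(ix)
            continue;        % leaf node, nothing to distribute
        end
        QinitVec(ix) = shareOut(QinitVec(p), ix);
    end
end
disp(QinitVec)
```

Because the value is copied into the handle, changing weights afterwards has no effect on shareOut unless the handle is created again.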

How to make a Matlab structure constant once it is already created?

Suppose I have a function defined in foo.m. This function can take a parameter thing of type struct. Once foo makes changes to thing, I want to "lock" thing so that it can no longer be changed. I essentially want to make it constant. I want to do this to ensure it isn't modified further down the line. How do I do this in Matlab?
You should define the variable in the function to be persistent, and lock your function in memory using mlock.
mlock locks the currently running function in memory so that subsequent clear functions do not remove it. Locking a function in memory also prevents any persistent variables defined in the file from getting reinitialized.
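As a rough sketch of how persistent and mlock could be combined, assume a hypothetical function file lockedThing.m. The name and the write-once rule are illustrative assumptions, not part of the answer above: the first value passed in is stored, and later attempts to pass a different value are ignored.

```matlab
function out = lockedThing(newValue)
%LOCKEDTHING Hypothetical write-once store for a struct (save as lockedThing.m).
    persistent thing isSet
    mlock                     % keep this file and its persistent data through "clear functions"
    if isempty(isSet)
        isSet = false;        % first call ever: nothing stored yet
    end
    if nargin > 0 && ~isSet
        thing = newValue;     % accepted only on the first call that supplies a value
        isSet = true;
    end
    out = thing;              % callers get a copy; editing it does not change the stored value
end
```

With this file on the path, calling lockedThing(struct('a', 1)) and then s = lockedThing(struct('a', 2)) leaves s.a equal to 1. To release the function again, call munlock('lockedThing') followed by clear lockedThing.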
Solution 1: Good if you don't know what form your struct will have in advance
You could 'capture' that variable with an anonymous function handle and only refer to your structure with that from now on. An anonymous function handle captures the state of the workspace at the time it is created. You will be able to access its elements as if it were the original struct, but if you try to assign to it, you'll generate an error.
E.g.
>> S_.a = 1;
>> S_.b = 2;
>> S = @() S_;
>> S_.a = 3;
>> S_
S_ =
scalar structure containing the fields:
a = 3
b = 2
>> S()
ans =
scalar structure containing the fields:
a = 1
b = 2
It's almost identical in syntax, except for the annoyance that you'll have to call it with ().
I've used it on the terminal here, but obviously it can easily also be used in the context of a function.
Small caveat; if you redefine and overwrite the anonymous function, obviously, this backfires, since it will inherit whatever new workspace it had access to at the time of the redefinition.
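The caveat can be seen in a few lines. This is a small sketch using the same S_ and S names as above; first, second and valueOfB are hypothetical variable names added for illustration.

```matlab
S_.a = 1;
S_.b = 2;
S = @() S_;        % snapshot taken while S_.a is 1
S_.a = 3;          % changing S_ does not affect the snapshot

first = S();       % copy of the captured struct
S = @() S_;        % redefinition: a new snapshot, now with S_.a equal to 3
second = S();

disp([first.a, second.a])

% getfield reads one field without a temporary variable.
valueOfB = getfield(S(), 'b');
```

The first snapshot still holds a = 1, while the redefined handle returns a = 3, so the protection only lasts as long as the handle itself is not overwritten.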
Solution 2: Good if you know your struct's form in advance:
Assume you know in advance that your struct will only contain fields a and b. Create a class with the same properties restricting 'SetAccess', e.g.
classdef ConstStruct
    properties (GetAccess = 'public', SetAccess = 'private')
        a
        b
    end
    methods
        %constructor
        function obj = ConstStruct(S)
            obj.a = S.a;
            obj.b = S.b;
        end
    end
end
Then in your main code:
>> MyStruct = struct('a',1,'b',2)
MyStruct =
a: 1
b: 2
>> MyStruct = ConstStruct(MyStruct)
MyStruct =
ConstStruct with properties:
a: 1
b: 2
>> MyStruct.a
ans =
1
>> MyStruct.a = 2
You cannot set the read-only property 'a' of 'ConstStruct'.

How to make a "call with reference" to recursive MatLab function

As said in the title, I have a recursive function and I am trying to build a tree data structure out of it to save my results. Every node consists in just one single number. The problem is that when I input the tree to the next call of the function, it seems that only the value of the tree is passed along, not the actual tree. Does anyone know how to pass a reference to the tree instead?
Initial Call:
tree = struct('left', 'empty','right', 'empty','feature','empty');
decisiontree_train(AttributeSet, LabelSet, 50, tree, 'node');
Recursive Function:
function decisiontree_train( data, labels, before_split_purity_percentage, tree, branch )
    % a1 is 0, a2 is 1
    [ a1_split_data, a2_split_data, a1_split_labels, a2_split_labels, ...
        split_feature ] = decisiontree_split( data, labels );
    new_tree = struct('left', 'empty','right', 'empty','feature','empty');
    if strcmp(branch, 'left')
        tree.left = new_tree;
        new_tree.feature = split_feature;
    elseif strcmp(branch, 'right')
        tree.right = new_tree;
        new_tree.feature = split_feature;
    elseif strcmp(branch, 'node')
        tree.feature = split_feature;
        new_tree = tree;
    end
    [ after_split_purity_percentage ] = decisiontree_classcount( a1_split_labels );
    if after_split_purity_percentage < 100 && ...
            after_split_purity_percentage > before_split_purity_percentage
        decisiontree_train(a1_split_data, a1_split_labels, ...
            after_split_purity_percentage, new_tree, 'left');
    end
    [ after_split_purity_percentage ] = decisiontree_classcount( a2_split_labels );
    if after_split_purity_percentage < 100 && ...
            after_split_purity_percentage > before_split_purity_percentage
        decisiontree_train(a2_split_data, a2_split_labels, ...
            after_split_purity_percentage, new_tree, 'right');
    end
    % add variable to workspace
    % assignin('base', 'a1_split_data', a1_split_data)
end
Unless you use object-oriented MATLAB, there is no pass by reference. Although written for a different question, the answers to it largely apply to your case as well. If you are using MATLAB R2015b or newer, use MATLAB OOP and implement your tree using a handle class. If performance isn't a big concern, do the same.
If, as is likely, neither is true, you have to work around the issue. MATLAB uses copy-on-write, so changing your function to take the tree structure as its first input argument and return the modified tree isn't a bad idea. In typical cases only very little data is really copied.
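The return-value workaround might look roughly like this. It is a simplified sketch, not the asker's training code: growTree is a hypothetical stand-in for decisiontree_train, and the stored "feature" is just the remaining depth. Local functions inside a script file require MATLAB R2016b or newer; otherwise growTree can be saved in its own file.

```matlab
tree = growTree(2);               % the finished tree comes back as the return value
disp(tree.left.right.feature)

function node = growTree(depth)
% Hypothetical stand-in for decisiontree_train.
    node = struct('left', 'empty', 'right', 'empty', 'feature', depth);
    if depth > 0
        % Each recursive call returns a subtree. Assigning it to a field of
        % node is what makes the result visible to the caller.
        node.left  = growTree(depth - 1);
        node.right = growTree(depth - 1);
    end
end
```

The key difference from the original function is that every call hands its node back to its caller instead of trying to modify an input argument in place.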

How can I get the Methodlist while iterating?

I want to iterate through all classes and packages in a specific path.
After that, I want to get the MethodList.
In the command window I can use the following, and it works fine:
a = ?ClassName;
a.MethodList(X);
Now I separate this into a function:
function s = befehlsreferenz(path)
    s = what(path); % list MATLAB files in folder
    for idx = 1:numel(s.classes)
        c = s.classes(idx);
        b = ?c;
        b.MethodList(0);
    end
end
I get an error:
Too many outputs requested. Most likely cause is missing [] around left hand side that has a comma separated list
expansion. Error in (line 7) b.MethodList(0);
While debugging I can see:
c: 1x1 cell = ‘Chapter’
b: empty 0x0 meta.class
Why is b empty? How can I get the methodlist?
1 Edit:
Here is an example class, also not working with it.
classdef TestClass
    %TESTCLASS Summary of this class goes here
    % Detailed explanation goes here
    properties
    end
    methods
        function [c] = hallo(a)
            c = 1;
        end
    end
end
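When struggling with operators in MATLAB, it is typically best to use the underlying function instead, which here is meta.class.fromName(c). The ? operator only accepts a literal class name, so ?c looks for a class named c rather than using the contents of the variable c, which is why b is empty.
Relevant documentation: http://de.mathworks.com/help/matlab/ref/metaclass.html
Furthermore, s.classes(idx) is a cell; use cell indexing instead: s.classes{idx}.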
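Putting both points together, the loop could look something like the sketch below. To keep it self-contained, it uses two classes that ship with MATLAB as stand-ins for s.classes; classNames, counts and methodNames are hypothetical names chosen for illustration.

```matlab
classNames = {'inputParser', 'containers.Map'};   % stand-ins for s.classes
counts = zeros(1, numel(classNames));
for idx = 1:numel(classNames)
    name = classNames{idx};               % curly braces return the char vector itself
    mc = meta.class.fromName(name);       % accepts a variable, unlike the ? operator
    methodNames = {mc.MethodList.Name};   % one name per meta.method object
    counts(idx) = numel(methodNames);
    fprintf('%s has %d methods\n', name, counts(idx));
end
```

In the original function the same pattern would apply to each entry of s.classes, with name taken from s.classes{idx}.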

Matlab coder & dynamic field references

I'm trying to conjure up a little parser that reads a .txt file containing parameters for an algorithm so I don't have to recompile it every time I change a parameter. The application is C code generated from .m via Coder, which unfortunately prohibits me from using a lot of handy MATLAB gimmicks.
Here's my code so far:
% read textfile
string = readfile(filepath);
% do fancy rearranging
linebreaks = zeros(size(string));
equals = zeros(size(string));
% find delimiters
for n=1:size(string,2)
    if strcmp(string(n),char(10))
        linebreaks(n) = 1;
    elseif strcmp(string(n), '=')
        equals(n) = 1;
    end
end
% write first key-value pair
idx_s = find(linebreaks);idx_s = [idx_s length(string)];
idx_e = find(equals);
key = string(1:idx_e(1)-1);
value = str2double(string(idx_e(1)+1:idx_s(1)-1));
parameters.(key) = value;
% find number of parameters
count = length(idx_s);
% write remaining key-value pairs
for n=2:count
    key = string(idx_s(n-1)+1:idx_e(n)-1);
    value = str2double(string(idx_e(n)+1:idx_s(n)-1));
    parameters.(key) = value;
end
The problem is that seemingly coder does not support dynamic fieldnames for structures like parameters.(key) = value.
I'm a bit at a loss as to how else I am supposed to come up with a parameter struct that holds all my key-value pairs without hardcoding it. It would somewhat (though not completely) defeat the purpose if the names of keys were not dynamically linked to the parameter file (more manual work if parameters get added/deleted, etc.). If anybody has an idea how to work around this, I'd be very grateful.
As you say, dynamic fieldnames for structures aren't allowed in MATLAB code to be used by Coder. I've faced situations much like yours before, and here's how I handled it.
First, we can list some nice tools that are allowed in Coder. We're allowed to have classes (value or handle), which can be quite handy. Also, we're allowed to have variable sized data if we use coder.varsize to specifically designate it. We also can use string values in switch statements if we like. However, we cannot use coder.varsize for properties in a class, but you can have varsized persistent variables if you like.
What I'd do in your case is create a handle class for storing and retrieving the values. The following example is pretty basic, but will work and could be expanded. If a persistent variable were used in a method, you could even create a varsized allocated storage for the data, but in my example, it's a property and has been limited in the number of values it can store.
classdef keyval < handle %#codegen
    %KEYVAL A key and value class designed for Coder
    %   Stores up to maxvals keys and values.
    properties (SetAccess = private)
        numvals = 0
    end
    properties (Access = private)
        intdata
    end
    properties (Constant)
        maxvals = 100;
        maxkeylength = 30;
    end
    methods
        function obj = keyval
            %KEYVAL Constructor for keyval class
            % The field names here must match those used in put and getvalueforkey.
            obj.intdata = repmat(struct('key', char(zeros(1, obj.maxkeylength)), 'value', 0), 1, obj.maxvals);
        end
        function result = put(obj, key, value)
            %PUT Adds a key and value pair into storage
            % Result is 0 if successful, 1 on error
            result = 0;
            if obj.numvals >= obj.maxvals
                result = 1;
                return;
            end
            obj.numvals = obj.numvals + 1;
            % Pad (or truncate) the key to the fixed storage length.
            tempstr = char(zeros(1,obj.maxkeylength));
            tempstr(1,1:min(end,numel(key))) = key(1:min(end, obj.maxkeylength));
            obj.intdata(obj.numvals).key = tempstr;
            obj.intdata(obj.numvals).value = value;
        end
        function keystring = getkeyatindex(obj, index)
            %GETKEYATINDEX Get a key name at an index
            keystring = deblank(obj.intdata(index).key);
        end
        function value = getvalueforkey(obj, keyname)
            %GETVALUEFORKEY Gets a value associated with a key.
            % Returns NaN if not found
            value = NaN;
            for i=1:obj.numvals
                if strcmpi(keyname, deblank(obj.intdata(i).key))
                    value = obj.intdata(i).value;
                end
            end
        end
    end
end
This class implements a simple key/value addition as well as lookup. There are a few things to note about it. First, it's very careful in the assignments to make sure we don't overrun the overall storage. Second, it uses deblank to clear out the trailing zeros that are necessary in the string storage. In this situation, it's not permitted for the strings in the structure to be of different length, so when we put a key string in there, it needs to be exactly the same length with trailing nulls. Deblank cleans this up for the calling function.
The constant properties allocate the total amount of space we're allowed in the storage array. These can be increased, obviously, but not at runtime.
At the MATLAB command prompt, using this class looks like:
>> obj = keyval
obj =
keyval with properties:
numvals: 0
>> obj.put('SomeKeyName', 1.23456)
ans =
0
>> obj
obj =
keyval with properties:
numvals: 1
>> obj.put('AnotherKeyName', 34567)
ans =
0
>> obj
obj =
keyval with properties:
numvals: 2
>> obj.getvalueforkey('SomeKeyName')
ans =
1.2346
>> obj.getkeyatindex(2)
ans =
AnotherKeyName
>> obj.getvalueforkey(obj.getkeyatindex(2))
ans =
34567
If a totally variable storage area is desired, the use of persistent variables with coder.varsize would work, but that will limit the use of this class to a single instance. Persistent variables are nice, but you only get one of them ever. As written, you can use this class in many different places in your program for different storage. If you use a persistent variable, you may only use it once.
If you know some of the key names and are later using them to determine functionality, remember that you can switch on strings in MATLAB, and this works in Coder.
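For the case where the key names are known in advance, a switch on the key string can fill a struct with fixed fields, which avoids dynamic field names altogether. The sketch below assumes two hypothetical parameters, gain and offset, and uses a small cell array of keys only to keep the example short; in the real parser the keys and values would come from the file.

```matlab
params = struct('gain', 0, 'offset', 0);   % hypothetical fixed set of parameter names
keys   = {'offset', 'gain', 'unknown'};    % stand-ins for keys read from the file
values = [1.5, 4, 99];

for n = 1:numel(keys)
    switch keys{n}
        case 'gain'
            params.gain = values(n);
        case 'offset'
            params.offset = values(n);
        otherwise
            % Unrecognised key: ignored here, could also be reported as an error.
    end
end
disp(params)
```

The trade-off is that every new parameter needs one extra case, but the struct layout stays fixed, which is what the code generator requires.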