How to implement a binary tree in MATLAB

Can somebody please help me implement a binary tree in MATLAB? Can it be done the same way as in C/C++, using pointers? I read a related question whose solution uses 'struct', but that code executes 'n' times, where n is predefined. In my problem the tree has to be built dynamically, i.e.:
1. Take a node
1.1 Do some processing
1.2 If the two resulting answers satisfy the condition, add them as the left and right children
1.3 Continue the process until the condition is false.
2. Trace back and move to the next node.
Thanks in advance.

This may only partly answer your question. To get anywhere close to the mechanisms of pointers in C/C++, you might start by checking the object-oriented features of MATLAB, namely the ability to create handle classes.
There is a fully documented example of a doubly linked list implementation, which comes pretty close to a binary tree.
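The steps in the question can be sketched as a recursive build. This is only a minimal sketch, written in Python rather than MATLAB because Python objects already behave like references; in MATLAB, Node would be a handle class (classdef Node < handle). The names Node, grow, process and accept are made up for illustration, and the halving rule is a hypothetical stand-in for the real processing step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def grow(node, process, accept):
    """Expand `node` recursively while both candidate children pass `accept`."""
    a, b = process(node.value)                # step 1.1: do some processing
    if accept(a) and accept(b):               # step 1.2: check the condition
        node.left, node.right = Node(a), Node(b)
        grow(node.left, process, accept)      # step 1.3: continue downwards
        grow(node.right, process, accept)     # step 2: back up, next node
    return node


def count(node):
    """Number of nodes in the subtree rooted at `node`."""
    return 0 if node is None else 1 + count(node.left) + count(node.right)


# Hypothetical rule: split a value into two halves while each half is >= 1.
root = grow(Node(8.0), lambda v: (v / 2, v / 2), lambda v: v >= 1)
```

The tree size is not fixed in advance; recursion stops on its own when the condition fails, and returning from a call is the "trace back" step.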

How are CPU Register values updated?

I know this might be a silly question, but I am really curious about it, since I don't know much about computer architecture.
Suppose I have a register R1 and I load the value of a variable, say LOCK=5, into it, so R1 now holds the value 5. If I update LOCK to 10 some time later, will the register still hold 5, or will it be updated?
When it comes to register-based CPU architectures, I think Neo from The Matrix has a valuable lesson: "There are no variables."
Variables, as you're using them in higher-level programming languages, are an abstraction for describing to the compiler what operations to perform on a particular piece of data. That data may reside in system memory or, for temporary values, may never leave the register file.
However, once the program has been compiled to a binary, there no longer are variables! For debugging purposes the compiler may annotate the code with information of the kind "at this particular position in the code, what is referred to as variable 'x' right now happens to be held in …". A register simply holds whatever value the last instruction wrote to it, and it changes only when another instruction writes to it.
I think the best way to understand this is to compile some very simple programs and look at their respective assembly, to see how things fit together. The Godbolt Compiler Explorer is a really valuable tool here.
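To make the LOCK example from the question concrete, here is a toy model in Python. It is only a sketch: memory and registers are plain dictionaries, load and store are made-up helpers standing in for load/store instructions, and caches, compiler optimizations and multi-core effects are ignored.

```python
memory = {"LOCK": 5}      # main memory: name -> value
registers = {"R1": 0}     # register file


def load(reg, addr):
    """Copy the current value at `addr` into `reg`, like a load instruction."""
    registers[reg] = memory[addr]


def store(addr, value):
    """Write `value` to memory at `addr`; registers are not touched."""
    memory[addr] = value


load("R1", "LOCK")              # R1 receives a copy of LOCK
store("LOCK", 10)               # LOCK changes in memory only
after_store = registers["R1"]   # R1 still holds the old copy
load("R1", "LOCK")              # an explicit reload is needed
after_reload = registers["R1"]
```

In this model the register keeps its copy until some instruction writes to it again, so the new value of LOCK only shows up in R1 after the second load.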

How to pass multiple variables from one model to another model (inner/outer)

Let's say we have the following model:
Collector:
model Collector
Real collect_here;
annotation(defaultComponentPrefixes="inner");
end Collector;
and the following model potentially multiple times:
model Calculator
outer Collector collector;
Real calculatedVariable = 2*time;
equation
calculatedVariable = collector.collect_here;
end Calculator;
The code above works if Calculator is present only once in the system to be simulated. If the model exists more than once, I get a singular system. This is demonstrated by the Example below. Changing the parameter works gives either a working or a failing system.
model Example
parameter Boolean works = true;
inner Collector collector;
Calculator calculator1;
Calculator calculator2 if not works;
end Example;
Using an array inside the collector to pass multiple variables into it doesn't solve it.
Another possible solution is to use connectors, but I only made it work with one Calculator.
Using multiple instances of Calculator breaks the model, because the single variable collector.collect_here then has multiple equations trying to compute its value. Dymola therefore complains that the system is structurally singular, which in this case means that the resulting system has more equations than variables.
To give a bit more insight: checking Collector on its own will actually fail, because since Modelica 3.0 every component has to be balanced (meaning it has to have as many equations as unknowns). That is not the case for Collector, which has one unknown but no equation. This strongly limits the possible applications of the inner/outer construct, as basically every variable has to be computed where it is defined.
In the given example this is compensated in the overall system if exactly one Calculator is used. So this single combination will work. Although this works, it is something that should not be done - for the obvious reason of being very error-prone (and all sub-models should pass the check).
Your question on how to solve this issue lacks a description of what the issue actually is. There are some cases I can think of where your approach could be useful:
You want to plot multiple variables from a single point, which would be collector. For this purpose "variable selections" should be the most straightforward way to go: see Dymola Manual Vol. 1, Section "4.3.11 Matching and variable selections" on how to apply them.
You want to carry out some mathematical operation on those variables. Then it could be useful to have a vectorized input of variable size. This enables an arbitrary number of connections to this input. For an example of this take a look at: Modelica.Blocks.Math.MultiSum
You want to route multiple signals between different models (which is unlikely judging from your description, but still): Then expandable connectors would be a good possibility. To get an impression of what that does, take a look at Modelica.Blocks.Examples.BusUsage.
Hope this helps, otherwise please specify more clearly what you actually want to achieve with your code.
I prepared a demonstrative library for such a scenario some days ago. You can access it at https://gist.github.com/beutlich/e630b2bf6cdf3efe96e5e9a637124fe1. If you read the documentation on Example2 you can see the link to an article from H. Elmqvist et al., which is the key to your problem. That is, you need a connector, and inherited connections from every Calculator to the one Collector.

What exactly is the phytree object in MATLAB?

This question has bothered me for a while, so I am posting it here in case someone else has a similar issue. After stepping through the code in the debugger and printing out the variables, I understand that the phytree object is a struct array with three fields, i.e., tree, dist and names. Here, tree is a matrix of size (number of branches) x 2. But because the data is large, I cannot quite figure out what exactly the matrix tree is. Can someone help? Thanks in advance.
The output of seqneighjoin is not a struct array with the fields tree, dist and names; it's a phytree object that has some internal properties called tree, dist and names. Since you're already taking a look at the code with the debugger, take a look at the line right at the end of phytree.m - you'll see that it specifies that the output tr is an object of class phytree, not a struct.
I'm not sure if you have much background using object-oriented programming in MATLAB, but it's a bigger topic than I can discuss here - I'll just say that an "object" is something that has properties that store information in the same way that a struct has fields that store information; but an object also has methods that are functions stored as part of the object and that act on it. For the phytree object, these methods are functions such as prune for removing branches, getnewickstr for getting a Newick-formatted string, and so on.
You can find out more about MATLAB OO programming in the documentation. Unfortunately, there's a bit of an issue with that - in R2008a, MATLAB introduced a new form of OO, and all the current documentation is based on that style of OO. phytree is implemented using the old style of OO, so you may need to look at the doc for an old version of MATLAB to find out its syntax.
You shouldn't be trying to access the internal tree property directly. If you want to get it, use get(tr, 'Pointers'). It's an array listing which branches are connected to which other branches/leaves.
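To show how a two-column pointer array can encode a binary tree, here is a small Python sketch. It assumes a numbering convention that should be checked against the phytree documentation: leaves are numbered 1 to num_leaves, and row i of the array describes branch num_leaves + i by listing its two children. The three-leaf array below is made up for illustration and is not output from seqneighjoin; children_map and leaves_under are hypothetical helpers.

```python
def children_map(pointers, num_leaves):
    """Map each branch number to its (first, second) child numbers."""
    return {num_leaves + i + 1: tuple(row) for i, row in enumerate(pointers)}


def leaves_under(node, kids, num_leaves):
    """List the leaf numbers below `node`, in left-to-right order."""
    if node <= num_leaves:          # assumed: numbers up to num_leaves are leaves
        return [node]
    first, second = kids[node]
    return (leaves_under(first, kids, num_leaves)
            + leaves_under(second, kids, num_leaves))


# Made-up tree: branch 4 joins leaves 1 and 2; branch 5 joins branch 4 and leaf 3.
pointers = [[1, 2], [4, 3]]
num_leaves = 3
kids = children_map(pointers, num_leaves)
root = num_leaves + len(pointers)   # assumed: the last branch is the root
all_leaves = leaves_under(root, kids, num_leaves)
```

Under that assumption, each row is read as "this branch connects these two nodes", and walking down from the last row recovers the whole tree.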

Control Data Flow graphs or intermediate representation

We are working on a project to come up with an intermediate representation for the code in terms of something called an assignment decision diagram. It would be very helpful if someone could tell us how the code is compiled and how to access the graphs generated during compilation, i.e. after the code has been parsed against the grammar.
Even help with accessing the code after the compiler has parsed it would be fine. Any help on how to go about doing this is also appreciated.
Currently, there is not a well-defined intermediate representation of Chisel as it goes between the user source code and the specified C++ or Verilog backends.
However, I believe there is a current project among the Chisel devs to break apart the backend and allow access to the IR (and allow for user-defined compiler passes).
In the meantime, check out Backend.scala (particularly the elaborate() method). That's where a lot of the magic originates. I believe it is possible to jump into the Scala command line in the middle of elaboration, which will give you access to the hardware tree representation, but I'm not sure how meaningful or useful that will be for you.

random forest code review

I'm doing a research project on the random forest algorithm. I have found numerous implementations of the algorithm, but the main part of the code is often written in Fortran, which I don't know at all.
I have to edit the code, change the main parameters (like tree depth, number of feature variables, ...) and trace the algorithm's performance during each run.
Currently I'm using "Windows-Precompiled-RF_MexStandalone-v0.02-". The train and predict functions are MATLAB MEX files and cannot be opened or edited. Can anyone give me advice on what to do, or is there a valid and completely MATLAB-based version of random forests?
I've read randomforest-matlab carefully. The main training part is unfortunately a DLL file. Through reading more, most of my questions are now resolved. My question mainly was how to run several trees simultaneously.
Have you taken a look at these libraries?
Stochastic Bosque
randomforest-matlab
If you're doing a research project on it, the best thing is probably to implement the individual tree training yourself in C and then write MEX wrappers. I'd start with an ID3 tree (before attempting C4.5, for instance). Then write the random forest code itself, which, once you have written the tree code, isn't all that hard.
You'll:
learn a lot
be able to modify them as much as you like
eventually move on to exploring new areas with them
I've implemented them myself from scratch so I can help once you post some of your own code. But I don't think anybody on this site will write the code for you.
Will it take effort? Yes. Will you come out of it with more knowledge and ability than you had going in? Undoubtedly.
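To give an idea of how little code the forest part needs once a tree learner exists, here is a minimal Python sketch. It is not the ID3/C implementation suggested above: to stay short it uses one-level decision stumps in place of full trees, and all names (train_stump, train_forest, n_trees, n_features) and the toy data are made up for illustration.

```python
import random
from collections import Counter


def train_stump(rows, labels, features):
    """Pick the (feature, threshold) split with the fewest misclassifications."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab


def train_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]        # bootstrap sample
        feats = rng.sample(range(len(rows[0])), n_features)   # feature subset
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]


# Toy data: two features, two well-separated classes.
rows = [(0, 0), (1, 1), (2, 2), (8, 8), (9, 9), (10, 10)]
labels = [0, 0, 0, 1, 1, 1]
forest = train_forest(rows, labels)
```

The parameters you want to vary (number of trees, number of features per split) are plain function arguments here; replacing train_stump with a depth-limited tree learner would add the tree-depth parameter without changing the forest code.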
There is a nice library in R called randomForest. It is based on the original implementation of Breiman in Fortran but it is now mainly recoded in C.
http://cran.r-project.org/web/packages/randomForest/index.html
The main parameters you talk about (tree depth, number of features to be tested, ...) are directly available.
Another library I would recommend is Weka. It is Java-based and lucid. Performance is slightly off compared to R, though. The source code can be downloaded from http://www.cs.waikato.ac.nz/ml/weka/