MATLAB code analysis and visualization tools? - matlab

I just picked up a MATLAB codebase that's light on documentation and original developers (who all shot through long ago).
I'm comfortable with MATLAB but could still use some static analysis tools to visualize the program for a quick idea how it works, without acquainting myself with all 148 source files...
I can't find anything like this for MATLAB -- searches return plenty on m-lint results but that's not what I'm looking for, I'm particularly interested in code structure visualization.
Cheers

You can use doxygen plus an appropriate filter, such as UsingDoxygenwithMatlab.
Be sure to set EXTRACT_ALL = YES to get auto-generated documentation for code without comments. There are other parameters for generating call trees and such, not sure if they work with the converted MatLab code.

Some of this answer may be useful. And don't forget the publish function.

Related

Converting SBML model into a simulatable Matlab Function

I'm looking for a tool to convert a SBML model into a Matlab function. I've tried SBMLTranslate() function from libSBML but this returns a Matlab struct, not a function. Does anybody know if such tool exists? Thanks
There are at least three efforts in this direction:
Frank Bergmann offers an online service for SBML translation where you can upload an SBML file and it will generate a MATLAB file. The comments at the top of the generated MATLAB file explain how to use the results. The C++ source code is available on SourceForge.
Bergmann's code referenced above was used by Stanley Gu to create sbml2matlab, a Windows standalone program. Off-hand, I don't know whether Gu's version changed or enhanced the algorithm used by the Bergmann version, but it seems likely. (Note: Gu now works at Google and does not maintain this code anymore, as far as I know.)
The Systems Biology Format Converter (SBFC) is a framework written principally by Nicolas Rodriguez; it includes a collection of converters, one of which is an SBML-to-MATLAB converter. This converter is written in Java.
I have not compared the results of the translators myself yet, so cannot speak to the differences or quality of output. If you try them and have any feedback to relate, please let the authors know. Knowing what has or hasn't worked for real users will help improve things in the future.
A final caveat is that all of these have been research projects, so make sure to set your expectations accordingly. (This is not a criticism of the authors; the authors are very good – I know most of them personally – but the reality of academic development work is that we all lack the time and resources to make these systems comprehensive, hardened, polished, and documented to the degree that we wish we could.)

how to understand the linkagemex function inside of the defaule linkage function in matlab

I need to rewrite the linkage function in matlab. Now, as I examine it, I realized there is a method called linkagemex inside of it. But I simply cannot step into this method to see its code. Can anyone help me out with this strange situastion?
function Z= linkage (Y, method, pdistArg, varargin)
Z=linkagemex(Y,method);
PS. I think I am pretty good at learning, but matlab is not so easy to learn. If you have good references to learn it well, feel free to let me know. Thanks very much for your time and attention.
As #m.s. mentions, you've found a call to a MEX function. MEX functions are implemented as C code that is compiled into a function callable by MATLAB.
As you've found, you can't step into this method (as it is compiled C code, not MATLAB code), and you don't have access to the C source code, as it's not supplied with MATLAB.
Normally, you would be at kind of a dead end here. Fortunately, that's not quite the case with linkagemex. You'll notice on line 240 of linkage.m that it actually does a test to see whether linkagemex is present. If it isn't, it instead calls a local subfunction linkageold.
I think you can assume that linkageold does at least roughly the same thing as linkagemex. You may like to test them out with a few suitable input arguments to see if they give the same results. If so, then you should be able to rewrite linkage using the code from linkageold rather than linkagemex.
I'm going to comment more generally, related to your PS. Over the last few days I've been answering a few of your questions - and you do seem like a fast learner. But it's not really that MATLAB is hard to learn - you should realize that what you're attempting (rewriting the clustering behaviour of phytree) is not an easy thing to do for even a very advanced user.
MathWorks write their stuff in a way that makes it (hopefully) easy to use - but not necessarily in a way that makes it easy for users to extend or modify. Sometimes they do things for performance reasons that make it impossible for you to modify, as you've found with linkagemex. In addition, phytree is implemented using an old style of OO programming that is no longer properly documented, so even if you have the code, it's difficult to work out what it even does, unless you happen to have been working with MATLAB for years and remember how the old style worked.
My advice would be that you might find it easier to just implement your own clustering method from scratch, rather than trying to build on top of phytree. There will be a lot of further headaches for you down the road you're on, and mostly what you'll learn is that phytree is implemented in an obscure old-fashioned way. If you take the opportunity to implement your own from scratch, you could instead be learning how to implement things using more modern OO methods, which would be more useful for you in the future.
Your call though, that's just my thoughts. Happy to continue trying to answer questions when I can, if you choose to continue with the phytree route.
You came across a MEX function, which "are dynamically linked subroutines that the MATLAB interpreter loads and executes". Since these subroutines are natively compiled, you cannot step into them. See also the MATLAB documentation about MEX functions.

random forest code review

I'm doing a research project on random forest algorithm. I have found numerous implementations of the algorithm but the main part of the code is often written in Fortran while I'm completely naive in it.
I have to edit the code, change the main parameters (like tree depth, num of feature variables, ...) and trace the algorithm's performance during each run.
Currently I'm using "Windows-Precompiled-RF_MexStandalone-v0.02-". The train and predict functions are matlab mex files and can not be opened or edited. Can anyone give me a piece of advice on what to do or is there a valid and completely matlab-based version of random forests.
I've read the randomforest-matlab carefully. The main training part unfortunately is a dll file. Through reading more, most of my wonders is now resolved. My question mainly was how to run several trees simultaneously.
Have you taken a look at these libraries?
Stochastic Bosque
randomforest-matlab
If you're doing a research project on it, the best thing is probably to implement the individual tree training yourself in C and then write Mex wrappers. I'd start with an ID3 tree (before attempting C4.5 for instance.) Then write the random forest code itself, which, once you write the tree code, isn't all that hard.
You'll:
learn a lot
be able to modify them as much as you like
eventually move on to exploring new areas with them
I've implemented them myself from scratch so I can help once you post some of your own code. But I don't think anybody on this site will write the code for you.
Will it take effort? Yes. Will you come out of it with more knowledge and ability than you had going in? Undoubtably.
There is a nice library in R called randomForest. It is based on the original implementation of Breiman in Fortran but it is now mainly recoded in C.
http://cran.r-project.org/web/packages/randomForest/index.html
The main parameters you talk about (tree depth, number of features to be tested, ...) are directly available.
Another library I would recommend is Weka. It is java based and lucid.Performance is slightly off though compared to R. The source code can be downloaded from http://www.cs.waikato.ac.nz/ml/weka/

How to invoke a matlab function from mathematica?

I would like to call a matlab function from mathematica. How best to do that?
I have found an ancient post on Wolfram site describing a way to do this, is this still the way to connect the two?
You can try NETLink for this at least under Windows:
In[1]:= Needs["NETLink`"]
matlab = CreateCOMObject["matlab.application"]
Out[2]= «NETObject[COMInterface[MLApp.DIMLApp]]»
And then you can invoke Matlab functions:
In[4]:= matlab#Execute["version"]
Out[4]= "
ans =
7.9.0.529 (R2009b)
"
In[5]:= matlab#Execute["a=2"]
matlab#Execute["a*2"]
Out[5]= "
a =
2
"
Out[6]= "
ans =
4
"
HTH
You can use mEngine. The precompiled Windows MathLink executable works with Mathematica 8. On Windows you may need to add MATLAB to the system path.
The advantage of this compared to the NETLink method is that transferring variables between Mathematica and MATLAB will be as easy as mGet["x"] or mPut["x"]. Although this might be possible with NETLink too, the advantage of mEngine is that you don't need to implement it yourself (which is great if like me you don't know anything about COM or .NET)
I would imagine that this is a difficult problem in general, but can be easily solved with a little programming for a particular case. I'll demonstrate with C#.
I would build a string of calls, like so.
Mathematica calls a C# program, through MathLink. This is near trivial to setup, and Mathematica has a sample project in Mathematica\8.0\SystemFiles\Links\NETLink directory.
C# program calls Matlab. There are several ways to make this call, and this handy link describes how to do it and offers sample code.
C# program returns Matlab results.
All in all I could do this in less than 50 lines of C# code, for a specific problem. Not too much work, in other words. Possible problems are data conversion, but if you want to send back and forth arrays of data, MathLink offers a lot out of the box. Similarly Mathematica can be linked to MATLAB through Java, though I haven't done that myself.
Perhaps the easiest connection could be made through Python. Mathematica offers an installable MathLink python library, located at Mathematica\8.0\SystemFiles\Links\NETLink, and Matlab has an addon library called PyMat, which can be downloaded here, but this package hasn't been maintained for a long time and supports only the most ancient of Matlabs.
Alternatively you can forgo Matlab altogether in favor of SAGE and/or numpy.
There is now a new package for this --- MATLink. It is the most complete such package I am aware of. (Disclaimer: I'm one of the developers of MATLink.)
MATLink lets you ...
seamlessly call MATLAB functions form Mathematica
transfer data between the two systems
Most MATLAB data types are supported, including sparse arrays, structs and cells.
A more complete description is available here. For detailed examples, see the website.

Are there any tools to visualize template/class methods and their usage?

I have taken over a large code base and would like to get an overview how and where certain classes and their methods are used.
Is there any good tool that can somehow visualize the dependencies and draw a nice call tree or something similar?
The code is in C++ in Visual Studio if that helps narrow down any selection.
Here are a few options:
CodeDrawer
CC-RIDER
Doxygen
The last one, doxygen, is more of an automatic documentation tool, but it is capable of generating dependency graphs and inheritance diagrams. It's also licensed under the GPL, unlike the first two which are not free.
When I have used Doxygen it has produced a full list of callers and callees. I think you have to turn it on.
David, thanks for the suggestions. I spent the weekend trialing the programs.
Doxygen seems to be the most comprehensive of the 3, but it still leaves some things to be desired in regard to callers of methods.
All 3 seem to have problems with C++ templates to varying degrees. CC-Rider simply crashed in the middle of the analysis and CodeDrawer does not show many of the relationships. Doxygen worked pretty well, but it too did not find and show all relations and instead overwhelmed me with lots of macro references until I filtered them out.
So, maybe I should clarify "large codebase" a bit for eventual other suggestions: >100k lines of code overall spread out over more than 100 template files plus several actual class files pulling it all together.
Any other tools out there, that might be up to the task and could do better (more thoroughly)? Oh and specifically: anything that understands IDL and COM interfaces?
When I have used Doxygen it has produced a full list of callers and callees. I think you have to turn it on.
I did that of course, but like I mentioned, doxygen does not consider interfaces between objects as they are defined in the IDL. It "only" shows direct C++ calls.
Don't get me wrong, it is already amazing what it does, but it is still not complete from my high level view trying to get a good understanding of how everything fits together.
In Java I would start with JDepend. In .NET, with NDepend. Don't know about C++.