Your experiences with Matlab/F#/R for data analysis and modeling algorithms - matlab

I've been using F# for a while now to model algorithms before coding them in C++, and also using it afterwards to check the results of the C++ code, and also against real-world recorded data.
For the modeling side of things, it's very handy, but for the 'data mashup' kind of stuff, pulling in data from CSV and other sources, generating statistics, drawing charts etc., my colleague teases me no end ("why are you coding that yourself? It's built in to MatLab").
And I have another colleague who swears by R, which also has charting stuff 'built-in'.
I know that MatLab, R and F# are not strictly comparable, so I'm not asking for a 'feature comparison shoot out'. I just wondered what other people are using for these kind of pre- and post-analysis scenarios, and how happy they are with it.
(If there's anyone out there working on wrapping Microsoft Charts into something F#-friendly, let me know, I'd be happy to participate...)
(Note: answers to this question will be subjective, but based on experience, please)

I have very little experience with F#, but regarding C++/Matlab/R: If the speed of your program's execution is the most important, use C++. If speed of implementation is the most important, use Matlab or R. This is true for a number of reasons, not the least of which is their massive libraries of math/stats packages.
Both Matlab and R can be sped up through parallelism: so generally, I think that speed and quality of implementation should be a bigger concern. That's where the real "value" of programming is taking place, in the design of the application. It's not a minor proposition if you can write 3 or 4 good R programs in the same time it takes you to write 1 good C++ program.
Regarding F#: so far as it is part of Microsoft's framework, it must have a lot to offer. If you're developing in Visual Studio or working on a big .Net project (for instance), it might make sense to use F#. On the other hand, you can call both Matlab and R from .Net applications, so I would probably argue that their libraries should be a bigger concern. For instance, see this article as an example for R and the Matlab Builder.
Long story short: comparing F# and Matlab/R isn't a good comparison. F# is a general purpose programming language, while Matlab/R can be viewed as massive mathematical/data analysis toolkits. Some people call Matlab or R from F# in order to take advantage of each language's benefits (e.g. see this discussion, this article on Matlab/F#, or this article on R/F#).
So far as charting is concerned: R is extremely strong on this front. Have a look at the graphics view on CRAN and this series of posts on the LearnR blog about Lattice and ggplot2.

I've worked a bit with matlab and python/pylab for these purposes. What these tools have 'built-in' is a programming environment, a shell, and gui tools designed for quickly looking at data from a variety of sources.
In a few commands, you can go from having a csv file to interactive plots on the screen, then to an image export in just about any format. It takes a minute or two to go from data to visualization once you have the hang of it. I would imagine this is uncommon in the C++ world (although I have seen some professors with pretty impressive work-flows).
I've tried R, but I can't say much useful about it. It seems to offer about the same set of features, but it may be troublesome to Google for support.
If you are spending more than a couple minutes getting from data to plot using your current method, it's definitely worth learning one of these environments. The best choice depends on your colleagues, your work environment, experience, and your budget.

This is a reasonable close double to the previous question on suitable functional language for scientific/statistical computing so you may want to peruse the long and detailed answers there.
Answers depends, as so often, on your experience and prior language training. I very much prefer R for data munging / modeling / visualization.

I use R because on the one hand it has everything built in and on the other hand you can still manipulate almost everything or start from scratch. Nevertheless, R is rather slow for heavy calculations (although I do all my Monte Carlo simulations in it).
I would say that Matlab is best for the availability of mathematical functionalities in general, R is best for data input/manipulation/visualisation/analysis/etc., and C++ for high-speed subroutines. You can by the way easily integrate C++ (or C, fortran, ...) code in R. Why not read and manipulate input data in R, apply the models in C++, and analyse/visualize output back in R?

I always prototype my models in MATLAB. If my prototype is fast enough, I refactor and it's done. If not, I go back and implement certain functions in C to be called by MATLAB. This requires knowledge of a low level language, which I think is always going to be the case if you are doing anything that is technically challenging.
I'm intrigued with this Lisp flavor if it ever gets off the ground.

Related

Should I use Object Oriented Programming in MATLAB?

I have an issue, where I need to handle a lot of figures in matlab and the code is starting to get messy. Different kinds of plot objects are added to the code in different stages and some have legends and some does not. The problem is that there is no NULL legends. As soon as an object is created, so is a legend. However, until the legend(handles,...) is called they are not shown. This means that if things are plotted and some, need a legend entry and some not, a lot of handles needs to be passed around.
Now, the file is starting to be quite long, about 1500 lines, with some globals that spans over many functions in the file and so. To prevent the "Do not use globals" comments to pour in, yes I know globals are normally unnecessary, but the code was like that when I laid my hands on it. However, now the code is getting more and more messy and I think about using Object Oriented Programming (OOP) to handle figures.
The idea is to have the custom figure objects handling themselves and thus make more readable code, split up in smaller blocks. The idea is to have a design like
class Figure
private:
MainFrame;
SubFrame;
Lines;
Legends;
Title;
X-Label;
Y-Label;
Methods:
To be defined, for example formatting plotting, edit title,…
The complete design is not really thought through completely, but the point of this questions is really about using OOP in matlab. What I have seen so far it os not really used were much. Are there a reason for this? Could anyone give pros and cons to OOP in matlab? Is OOP recommended or not in matlab?
I have added the information about my issue since I understand that OOP is more needed for large complex issues, so an answer would preferably take the drawbacks in comparison with the complexity of the problem into account. (For example, do never use OOP in matlab, do it only when you have complex problems, do it whenever you like,...)
Okay the question is about OOP in Matlab - but is it not OOP in Matlab in your organisation?
By that I mean to think who is going to use/develop and maintain the code going forward.
Background: I have used OOP for my own toolbox (because its complex/large enough to warrant it - and I develop/maintain it) - however in consultancy jobs for the majority of my clients I create functions (which in some instances call my toolbox) - because when the job is finished they get the source code and the majority are (much) more comfortable working with functions rather than classes.
In summary - I decide on whether to use OOP on the job specifics and the situation where the code will be used (developed & maintained) in the future.
So back to your topic - I would consider where you think the code is going to go and who will develop/maintain it. Will they be comfortable with classes - or will they be more comfortable with functions?
FYI: Last year I was talking to Mathworks and they said that they run multiple "Intro to Matlab" courses per week - but only 1 "Matlab Classes" per quarter!! That gives you an indication on the level of Matlab class use in industry.

Which language should I prefer working with if I want to use the Fast Artificial Neural Network Library (FANN)?

I am working on reducing dimentionality of a set of (Boolean) vectors with both the number and dimentionality of vectors tending to be of the order of 10^5-10^6 using autoencoders. Hence even though speed is not of essence (it is supposed to be a pre-computation for a clustering algorithm) but obviously one would expect that the computations take a reasonable amount of time. Seeing how the library itself was written in c++ would it be a good idea to stick to it or to code in Java (Since the rest of the code is written in Java)? Or would it not matter at all?
That question is difficult to answer. It depends on:
How computationally demanding will be your code? If the hard part is done by the library and your code is only to generate the input and post-process the output, Java would be a valid choice. Compare it to Matlab: The language is very slow but the built-in algorithms are super-fast.
How skilled are you (or your team, or your future students) in Java and C++. Consider learning C++ takes a lot of time. If you have only a small scaled project, it could be easier to buy a bigger machine or wait two days instead of one, to get the results.
Have you legacy code in one of the languages you want to couple or maybe re-use?
Overall, I would advice you to set up a benchmark example in whatever language you like more. Then give it a try. If the speed is ok, stick to it. If you wait to long, think about alternatives (new hardware, parallel execution, different language).

Integrating NetLogo and Java : when should we think about this integration as a good option?

I just came to know about this excellent tutorials
http://scientificgems.wordpress.com/2013/12/11/integrating-netlogo-and-java-part-1/
http://scientificgems.wordpress.com/2013/12/12/integrating-netlogo-and-java-2/
http://scientificgems.wordpress.com/2013/12/13/integrating-netlogo-and-java-3/
Their example concerns about computation needed for patch diffusion and shows how to access patch variable from java and change them in netlogo.
I was wondering if anyone has any idea or comments on when we should think of writing an extension to make our model work better? I am new to netlogo itself, but I think it's good to know what are the options that I might not be aware of :)
I think looking through the extensions listed at https://github.com/NetLogo/NetLogo/wiki/Extensions , both the ones that we (on the NetLogo team) have built ourselves and the ones that have come from the user community, gives you a pretty good idea of the range of things extensions can be good for.
Some broad categories:
data structures (tables, arrays, matrices, priority queues...)
algorithms (networks, statistics, discrete event scheduling, diffusion, ...)
integration with other tools (R, SQL databases, MatLab, ...)
media (sound playback, sound synthesis, images, movies, speech, ...)
new device types (Gogo, Arduino, WiiMote...)
visualization (ray-tracing, sprites, Java2D drawing, ...)
not necessarily exhaustive!

Matlab vs Aforge vs OpenCV

I am about to start a project in visual image-processing and have no had experience with Matlab, Aforge, OpenCV and was wondering if anyone had any experiences with these different software packages.
I was also wondering which of the three packages were most efficient I assume OpenCV but has anyone had any experience?
Thanks
Jamie.
The question you need to ask yourself is which is more important - your time or the computer's time. If your task is really simple, you may be able to code it up in MATLAB and have it work right off the bat. MATLAB is by far the easiest for development - a scripted language with built-in memory management, a huge array of provided functions, and a great interface for displaying and manipulating data while debugging.
On the other hand, MATLAB is at least an order of magnitude slower than compiled openCV code for many tasks. This is especially true if you use the intel performance primitives libraries.
If you know how to code in MATLAB, I would suggest writing and debugging your algorithms in that language, then porting them to c/c++ with openCV for speed. If there are only a couple of simple functions that you need to speed up, you can call c code from MATLAB, but it's hard to get this working right the first few times you try it, so you're probably better off just rewriting your finished code entirely in c/c++
First, please elaborate about your project's needs. It has the biggest impact on the choice, in addition to other factors - your general programming knowledge (If you haven't dealt with dot net but just with C++, AForge is not a good choice, for example).
Generally,
Both AForge and OpenCV has a built-in interface to .Net, and OpenCV also with C++, python, and more. Matlab might be more efficient, but if you don't have any experience with it - you should also learn its syntax. Take it into consideration.
Matlab probably has the largest variety of functions, but it is more complicated than the other projects. OpenCV and AForge themselves have some differences - see them described in this StackOverflow question/ answers.
I worked last year in two similar projects with cars on the highway. Afaik, Matlab allows to process only one picture frame at a time (surely you could elaborate an algorithm to compute a stream) but using Simulink you can process the stream directly.
On the other hand, i found AForge a lot friendlier and easier to use since you can easily adjust the processing parameters from a GUI (not so fast/easy) to do in Matlab/simulink.
I'd go for Aforge.Net. It's also fast enough if you're worrying about processing speed. (using 640x480)
If you are asking about using one of these in .net,easily you can get info by this:
1-matlab mostly used in simulation of projects not the End-prototype project; my numer : 30;
2-aforge (as I'v used in many project) if you do not need the circular process like capturing image, or recognition of something in images or ... you'll find it very good, cause it is easy to use but useful for single processes; my number : 50
3-opencv very good at speed and useful for circular processes, for example you can capture images from a webcam and Instantly cartoonize it without any delay, But not easy-to-use as aforge. I like it anyway cause of its speed and MANY functions it gives us mostly anything we need in programming; my number : 80
Dr.Taha - Tahasoft.net

Is there a tool that supports discrete mathematics?

Discrete mathematics (also finite mathematics) deals with topics such as logic, set theory, information theory, partially ordered sets, proofs, relations, and a number of other topics.
For other branches of mathematics, there are tools that support programming. For statistics, there is R and S that have many useful statistics functions built in. For numerical analysis, Octave can be used as a language or integrated into C++.
I don't know of any languages or packages that deal specifically with discrete mathematics (although just about every language can be used to implement algorithms used in discrete mathematics, there should be libraries or environments out there designed specifically for these applications).
The current version of Mathematica is 7. License costs:
Home Edition: $295.
Standard: $2,495 Win/Mac/Linux PC ($3,120 for Solaris)
Government: $1,996 ($2,496 for Solaris)
Educational: $1,095 ($1,370 for Solaris)
Student: $139.95 (no Solaris)
Above, the Home Edition link says:
Mathematica Home Edition is a fully functional version of Mathematica Professional with the same features.
The current version of Maple is 12. License costs:
Student: $99
Commercial: $1,895
Academic: $995
Government: $1,795
And yes, check out Sage, mentioned above by Thomas Owens.
Mathematica
Mathematica has a Combinatorica package, which though quite venerable at this point, provides a good deal of support for combinatorics and graphs. Commands like this are available:
NecklacePolynomial[8, m, Cyclic];
GrayCodeSubsets[{1, 2, 3, 4}];
IntegerPartitions[6]
I'd say Mathematica is your best bet.. even if it does not come with some functionality out of the box, it has very well designed supplementary packages available for it on the net
check out http://www.wolfram.com/products/mathematica/analysis/
you might be interested in the links for Number Theory, Graph Visualizations
I also found Sage. It appears to be the closest thing to Mathematica that's open source, but I'm not sure how well it handles discrete mathematics.
Maple and Matlab would be a couple of Mathematical software packages that may cover part of what you want.
Stanford GraphBase, written primarily by Donald Knuth is a great package for combinatorial computing. I wouldn't call it an extensive code base, but it has great support for graphs and a great deal of discrete mathematics can be formulated in terms of graph theory. It's written in CWEB, which is (IMO) a more readable version of C.
EDIT: It's free.
I love Mathematica and used it to prototype ideas during my PhD in computational physics. However, Mathematica tries to be all things to all people and there are a few downsides:
Being a for-profit company, bug-fixes sometimes come in the next major release: you pay.
Being a proprietary product, sharing code with non-Mathematica people (the world) is problematic.
New features are often half-baked and break when you try to take it beyond the embedded example.
It's user base (tutorials, advice, external libraries) is less active than say python's,
Mulitpanel figures are difficult to generate; see SciDraw library.
That being said, Mathematica's core functionality is amazing for the following reasons:
Its default math functionality is quite robust allowing quick solutions.
It allows both functional and procedural programming.
One can quickly code & publish in a variety of formats: pdf, interactive website.
A new Discrete Book came out.
Bottom line
Apple users expecting ease of use, will like Mathematica for its Apple-like, get-up-and-go feel.
Linux users wanting extensibility, will find Mathematica frustrating for having its Apple-like, box-welded-shut design.