I am currently learning GPU programming to improve the performance of machine learning algorithms. Initially I tried to learn CUDA programming in pure C, then I found PyCUDA, which to me looks like a Python wrapper around the CUDA library, and then I found Theano and Pylearn2 and got a little confused:
I understand them in this way:
pycuda: Python wrapper for the CUDA library
theano: similar to NumPy, but runs transparently on GPU or CPU
pylearn2: deep learning package built on Theano that implements several machine learning/deep learning models
Since I am new to GPU programming, should I start learning from a C/C++ implementation, or is starting from PyCUDA enough, or even starting from Theano? For example, I would like to implement a random forest model after learning GPU programming. Thanks.
Your understanding is almost right. I would just add some remarks about Theano. It's much more than a NumPy which can run on the GPU. Theano is really a math expression compiler: it translates symbolic math expressions into highly optimized C/CUDA code, targeting both the CPU and the GPU. The code it generates is often much more efficient than what most programmers would write by hand. Theano can also do symbolic differentiation (very useful for gradient-based optimization) and has features for improving numerical stability (which is probably useful, though I don't know to what extent). It's very likely Theano will be enough to implement what you need. If you still decide to learn CUDA or PyCUDA, choose the one based on the language you will use, C++ or Python.
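To give a feel for the "math expression compiler" part, here is a minimal sketch (the expression itself is just an arbitrary example):

    # Build a symbolic expression, ask Theano for its gradient, and compile
    # both into a callable function that runs on CPU or GPU.
    import theano
    import theano.tensor as T

    x = T.dscalar('x')                  # symbolic scalar
    y = x ** 2 + 3 * x                  # symbolic expression
    gy = T.grad(y, x)                   # symbolic differentiation

    f = theano.function([x], [y, gy])   # compiled to optimized C/CUDA code
    print(f(2.0))                       # [array(10.0), array(7.0)]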
This page of IMSL says
To obtain improved performance we recommend linking with High
Performance versions of LAPACK and BLAS, if available.
What are these High Performance versions of LAPACK and BLAS?
There are plenty of good implementations to pick from:
Intel MKL is likely the best on Intel machines. It's not free though, so that may be a problem.
According to their benchmark, OpenBLAS compares quite well with Intel MKL and is free
Eigen is also an option and has a largish (albeit old) benchmark showing good performance on small matrices (though it's not technically a drop-in BLAS library)
ATLAS, OSKI, POSKI are examples of auto-tuned kernels which claim to work well on many architectures
Generally, it is quite hard to pick one of these without benchmarking because:
some implementations work better on different types of matrices; for example, Eigen works better on matrices of small rank (in the hundreds)
some are optimised for specific architectures (e.g. Intel's)
in some cases the multithreading of the BLAS library may conflict with a multithreaded application (e.g. OpenBLAS)
developers' benchmarks tend to emphasise the cases which work better on their own implementation.
I would suggest picking one or two of the libraries that apply to your use case and benchmarking them for your particular application on your particular (or a similar) machine. This is quite easy to do even after compiling your code.
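As a rough illustration, here is how you could time the same operation from Python, assuming you have NumPy builds linked against each candidate BLAS library (the matrix size is arbitrary):

    # Check which BLAS/LAPACK NumPy was built against, then time a large
    # matrix multiply that gets dispatched to the underlying BLAS (dgemm).
    import time
    import numpy as np

    np.show_config()                     # prints the BLAS/LAPACK NumPy uses

    n = 2000
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)

    t0 = time.time()
    c = a @ b                            # handled by the linked BLAS
    print("matmul %dx%d: %.3f s" % (n, n, time.time() - t0))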
LAPACK and BLAS are performance libraries that basically provide linear algebra operations, such as solving systems of linear equations. You can find such libraries useful in computer vision (object detection and classification, for example), classical algorithms, modelling, and so on.
TASKING provides a full C implementation of the LAPACK and BLAS performance libraries. Both libraries are ISO C99 compliant and come with full documentation and examples; you can check them out here:
http://www.tasking.com/products/tasking-lapack-performance-libraries
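To make the "system of linear equations" part concrete, here is a small sketch in Python; NumPy hands this off to whatever LAPACK it was built against, so it only illustrates the kind of routine (here *gesv) these libraries provide:

    # Solve A x = b; np.linalg.solve calls LAPACK's *gesv under the hood.
    import numpy as np

    A = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
    b = np.array([9.0, 8.0])

    x = np.linalg.solve(A, b)
    print(x)    # [2. 3.]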
Do you know if there is a C library to handle FMUs and run simulations, including a good solver?
As far as I know there are:
FMUSDK from QTronic
FMI Library from Modelon
Both can open FMUs, but they only let you run an FMU for co-simulation with a simple Euler solver.
Libraries that include a good solver handling discontinuities, but that are not in C, are:
PyFMI from Modelon: For Python
JFMI from Ptolemy: for Java (not sure if this includes a good solver)
I don't think the FMUSDK is really maintained, so the FMI Library is probably the better choice between those two.
To improve the solver, you'd probably have to pair the FMI Library with a solver like Sundials and figure out how to stitch the two together. Note that this is exactly what Modelon has done with the PyFMI library. While it is a Python library, I suspect you could find a relatively easy way to integrate it into a non-Python project, as long as you are able to use C code to integrate them (which apparently you are).
I suspect calling PyFMI from C is going to be easier than stitching FMIL together with Sundials on your own.
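For a sense of how short the Python side is, here is a minimal PyFMI sketch; the FMU filename is just a placeholder for your own model, and as far as I know simulate() uses the Assimulo/Sundials solvers for model-exchange FMUs:

    # Load an FMU and simulate it with PyFMI's default solver setup.
    from pyfmi import load_fmu

    model = load_fmu("bouncingBall.fmu")      # placeholder FMU file
    res = model.simulate(final_time=3.0)

    print(res["time"][-1])                    # final simulation time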
Is it possible to use a MATLAB code on Scilab? Is that what is meant when saying that Scilab is a "clone" from MATLAB?
There is a tool to automatically convert Matlab source to Scilab source; it's called M2SCI. A script parses the Matlab source code and replaces Matlab-specific functions with Scilab ones. See the documentation of the mfile2sci function.
Yes, you can use MATLAB code in Scilab. See these links for more information:
http://help.scilab.org/docs/5.4.0/fr_FR/section_36184e52ee88ad558380be4e92d3de21.html
http://help.scilab.org/docs/5.4.0/en_US/index.html
I would not bet on it, but if your code is simple enough, the chances are good.
The problems are:
There is encrypted p-code in Matlab that Scilab will not be able to open.
Matlab usually comes with a number of toolboxes that might not be available to you (I think especially Simulink)
Last but not least (I don't know about Scilab specifically), there are usually minute differences in how functions are implemented.
There are a number of projects out there trying to replicate/replace MATLAB:
Julia language: which has a relatively similar syntax to MATLAB and offers great performance, but it still lacks a lot of toolboxes/libraries and does not have a GUI like MATLAB's. I think this has the brightest future among all MATLAB alternatives.
Python language and its libraries NumPy and matplotlib: the most used alternative. I think at this moment the community is a couple of orders of magnitude bigger than MATLAB's, and Python is the de facto standard in machine learning and data science. Still, the syntax and memory model are a bit far from what people are used to in the MATLAB ecosystem. There is also no equivalent to SIMULINK, although the Spyder and Jupyter projects have come a long way in terms of the development environment (see the small example after this list).
Octave: basically a clone of MATLAB, to the point that they consider any incompatibility a bug. If you have a long MATLAB code base that you don't want to touch, this is the safest bet. But again, no alternative for SIMULINK.
Scilab and its fork ScicosLab are the best alternatives in terms of GUI, having a SIMULINK replica (xcos / scicos) and graphical user interface development features. However, the community is not as big as Octave's and the syntax is not completely compatible. Sadly, the Scilab development team has gone through a devastating family crisis, leading to the software falling behind.
Honorary mention of the Modelica language implementations OpenModelica and JModelica for being a superior alternative to SIMULINK/Simscape. You should know that you can also load Modelica scripts in xcos and scicos. If you want to know more about JModelica, you may see this post.
You may check MATLAB's AlternativeTo page to see more free and open-source alternatives.
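As a tiny taste of the NumPy/matplotlib route mentioned in the list above, roughly the equivalent of MATLAB's plot(t, sin(2*pi*t)):

    # Plot one period of a sine wave with NumPy and matplotlib.
    import numpy as np
    import matplotlib.pyplot as plt

    t = np.linspace(0.0, 1.0, 101)
    plt.plot(t, np.sin(2 * np.pi * t))
    plt.xlabel("t")
    plt.ylabel("sin(2*pi*t)")
    plt.show()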
I'm doing some numerical estimation and correction with the Kalman filter, and would like to better estimate my parameters of Q and R, preferably dynamically.
http://en.wikipedia.org/wiki/Kalman_filter#Estimation_of_the_noise_covariances_Qk_and_Rk
That article mentions that GNU Octave is currently the best way of determining these parameters from data:
http://en.wikipedia.org/wiki/GNU_Octave#C.2B.2B_integration
Unfortunately it is written for Matlab, and there is supposedly a C++ implementation. I'm very weak in C++ and would not even know how to import a C++ library and link it properly in Xcode. All of my C++ libraries to date have been wrapped in 3rd-party Objective-C classes.
Has anyone used the C++ implementation for scientific computing or engineering applications on iPhone? I'd appreciate any pointers or tutorials on how to do this kind of analysis with Objective-C.
Additional keywords:
estimating covariance from data
Autocovariance Least-Squares (ALS) technique
noise covariance
Thank you!
I do not know of any such C++ library. If you fancy doing numerical analysis on iOS, the best way to go is the Accelerate framework, specifically (from this description):
Linear Algebra: LAPACK and BLAS
The Basic Linear Algebra Subprograms (BLAS) and Linear Algebra Package
(LAPACK) libraries contain—as you would expect—functions to perform
linear algebra computations such as solving simultaneous linear
equations, least squares solutions of linear equations, and eigenvalue
problems. The BLAS library serves as a building block for the LAPACK
library. The BLAS and LAPACK libraries are widely distributed and
industry standard computational libraries. They are available on a
number of different platforms and architectures. So, if you are
already using these libraries you should feel right at home, as the
APIs are exactly the same on Mac OS X.
You'll need a fairly good grounding in C, pointers, arrays and such though; no way around it, I feel. There is a detailed description of how to use these linear algebra primitives to implement Kalman filtering (although that one uses R, so it is probably not of much use to you).
This SO post on Kalman filtering expresses my opinion quite well. I'm afraid I think the chances of finding a magic Objective-C wrapper for Kalman filtering are fairly low, though I would be very happy to be proven wrong!
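Not Objective-C, but to show how little linear algebra a basic filter actually needs (each line below maps onto a handful of the BLAS/LAPACK-style operations Accelerate provides), here is a rough sketch of one predict/update step in NumPy; all the matrices are placeholders for your own model:

    # One Kalman predict/update step; F, H, Q, R, x, P come from your model.
    import numpy as np

    def kalman_step(x, P, z, F, H, Q, R):
        # predict
        x_pred = F @ x
        P_pred = F @ P @ F.T + Q
        # update
        y = z - H @ x_pred                        # innovation
        S = H @ P_pred @ H.T + R                  # innovation covariance
        K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
        x_new = x_pred + K @ y
        P_new = (np.eye(len(x)) - K @ H) @ P_pred
        return x_new, P_new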
I am writing programs that are based on robots navigating through mazes (would involve stochastic programming).
Since it will involve heavy matrix handling (a plus point for MATLAB) and simulating a robot (a plus point for Prolog), I am in a dilemma over the choice between MATLAB and Prolog.
Note: I do have MATLAB at my work environment, hence cost is not an issue.
As mentioned previously, I am not sure if you are looking for a comparison between MATLAB and Python or between MATLAB and Prolog. I can speak to the former, at least: MATLAB provides fast linear algebraic computation and a great IDE... and that's about it. Python will cost you far fewer headaches (and dollars), and you can manage "heavy matrix handling" nearly as easily if you tack on NumPy in particular, or SciPy in general.
Also, VPython (Visual Python) is a great 3D visualization tool that uses NumPy under the hood. I developed a robot simulator using VPython; you can check out screenshots and example code (for simple wall-following maze navigation) in a recent blog post.
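To give an idea of the "heavy matrix handling" side in Python, here is a toy sketch of a maze as a NumPy occupancy grid; the layout and the move check are made up purely for illustration:

    # A maze as a NumPy occupancy grid (1 = wall, 0 = free).
    import numpy as np

    maze = np.array([[0, 1, 0, 0],
                     [0, 1, 0, 1],
                     [0, 0, 0, 1],
                     [1, 1, 0, 0]])

    def can_move(pos, move):
        """Return True if moving from pos by move stays inside the maze
        and does not hit a wall."""
        r, c = pos[0] + move[0], pos[1] + move[1]
        inside = 0 <= r < maze.shape[0] and 0 <= c < maze.shape[1]
        return inside and maze[r, c] == 0

    print(can_move((0, 0), (1, 0)))   # True: cell (1, 0) is free
    print(can_move((0, 0), (0, 1)))   # False: cell (0, 1) is a wall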