Which language should I prefer working with if I want to use the Fast Artificial Neural Network Library (FANN)? - neural-network

I am working on reducing dimentionality of a set of (Boolean) vectors with both the number and dimentionality of vectors tending to be of the order of 10^5-10^6 using autoencoders. Hence even though speed is not of essence (it is supposed to be a pre-computation for a clustering algorithm) but obviously one would expect that the computations take a reasonable amount of time. Seeing how the library itself was written in c++ would it be a good idea to stick to it or to code in Java (Since the rest of the code is written in Java)? Or would it not matter at all?

That question is difficult to answer. It depends on:
How computationally demanding will be your code? If the hard part is done by the library and your code is only to generate the input and post-process the output, Java would be a valid choice. Compare it to Matlab: The language is very slow but the built-in algorithms are super-fast.
How skilled are you (or your team, or your future students) in Java and C++. Consider learning C++ takes a lot of time. If you have only a small scaled project, it could be easier to buy a bigger machine or wait two days instead of one, to get the results.
Have you legacy code in one of the languages you want to couple or maybe re-use?
Overall, I would advice you to set up a benchmark example in whatever language you like more. Then give it a try. If the speed is ok, stick to it. If you wait to long, think about alternatives (new hardware, parallel execution, different language).

Related

Simulation speed of SystemC vs SystemVerilog

Does anyone have a pointer to measurements of how fast a behavioral model written in SystemC is compared to the same model written in SystemVerilog? Or at least first hand experience of the relative simulation speed of the two, when modeling the same thing?
To clarify, this is for high level system simulation, where the model is the minimum code to mimic the behavior. It's decidedly not synthesizable. In System Verilog, it would use as many non-synthesizable behavioral constructs as possible, for maximum execution speed. The point is, we're asking about an apples to apples comparison that someone did, where they tried to do both: stay high level (non-synthesizable), and make the code as close to equivalent as possible between the two languages.
It would be great if they stated the SystemVerilog environment used, and what machine each was run on, in order to be able to translate. Hopefully can be useful to a variety of people in choosing approach.
Thanks!
Like with other language benchmarks it is not possible to answer in general, because it depends on multiple factors like modeling style, compiler vendor and version, compilation options.
Also there are tools to convert Verilog to C++ and C++ to Verilog. So you can always match simulation performance by converting from one language to other.
Verilog can be converted to C++/SystemC using Verilator https://www.veripool.org/wiki/verilator
To convert C++/SystemC into Verilog you can use HLS (High-level synthesis) toos.

Numerical Integral of large numbers in Fortran 90

so I have the following Integral that i need to do numerically:
Int[Exp(0.5*(aCosx + bSinx + cCos2x + dSin2x))] x=0..2Pi
The problem is that the output at any given value of x can be extremely large, e^2000, so larger than I can deal with in double precision.
I havn't had much luck googling for the following, how do you deal with large numbers in fortran, not high precision, i dont care if i know it to beyond double precision, and at the end i'll just be taking the log, but i just need to be able to handle the large numbers untill i can take the log..
Are there integration packes that have the ability to handle arbitrarily large numbers? Mathematica clearly can.. so there must be something like this out there.
Cheers
This is probably an extended comment rather than an answer but here goes anyway ...
As you've already observed Fortran isn't equipped, out of the box, with the facility for handling such large numbers as e^2000. I think you have 3 options.
Use mathematics to reduce your problem to one which does (or a number of related ones which do) fall within the numerical range that your Fortran compiler can compute.
Use Mathematica or one of the other computer algebra systems (eg Maple, SAGE, Maxima). All (I think) of these can be integrated into a Fortran program (with varying degrees of difficulty and integration).
Use a library for high-precision (often called either arbitray-precision or multiple-precision too) arithmetic. Your favourite search engine will turn up a number of these for you, some written in Fortran (and therefore easy to integrate), some written in C/C++ or other languages (and therefore slightly harder to integrate). You might start your search at Lawrence Berkeley or the GNU bignum library.
(Yes I know that I wrote that you have 3 options, but your question suggests that you aren't ready to consider this yet) You could write your own high-/arbitrary-/multiple-precision functions. Fortran provides everything you need to construct such a library, there is a lot of work already done in the field to learn from, and it might be something of interest to you.
In practice it generally makes sense to apply as much mathematics as possible to a problem before resorting to a computer, that process can not only assist in solving the problem but guide your selection or construction of a program to solve what's left of the problem.
I agree with High Peformance Mark that the best option here numerically is to use analytics to scale or simplify the result first.
I will mention that if you do want to brute force it, gfortran (as of 4.6, with the libquadmath library) has support for quadruple precision reals, which you can use by selecting the appropriate kind. As long as your answers (and the intermediate results!) don't get too much bigger than what you're describing, that may work, but it will generally be much slower than double precision.
This requires looking deeper at the problem you are trying to solve and the behavior of the underlying mathematics. To add to the good advice already provided by Mark and Jonathan, consider expanding the exponential and trig functions into Taylor series and truncating to the desired level of precision.
Also, take a step back and ask why you are trying to accomplish by calculating this value. As an example, I recently had to debug why I was getting outlandish results from a property correlation which was calculating vapor pressure of a fluid to see if condensation was occurring. I spent a long time trying to understand what was wrong with the temperature being fed into the correlation until I realized the case causing the error was a simulation of vapor detonation. The problem was not in the numerics but in the logic of checking for condensation during a literal explosion; physically, a condensation check made no sense. The real problem was the code was asking an unnecessary question; it already had the answer.
I highly recommend Forman Acton's Numerical Methods That (Usually) Work and Real Computing Made Real. Both focus on problems like this and suggest techniques to tame ill-mannered computations.

Compilation optimization for iPhone : floating point or fixed point?

I'm building a library for iphone (speex, but i'm sure it will apply to a lot of other libs too) and the make script has an option to use fixed point instead of floating point.
As the iphone ARM processor has the VFP extension and performs very well floating point calculations, do you think it's a better choice to use the fixed point option ?
If someone already benchmarked this and wants to share , i would really thank him.
Well, it depends on the setup of your application, here is some guidelines
First try turning on optimization to 0s (Fastest Smallest)
Turn on Relax IEEE Compliance
If your application can easily process floating point numbers in contiguous memory locations independently, you should look at the ARM NEON intrinsic's and assembly instructions, they can process up to 4 floating point numbers in a single instruction.
If you are already heavily using floating point math, try to switch some of your logic to fixed point (but keep in mind that moving from an NEON register to an integer register results in a full pipeline stall)
If you are already heavily using integer math, try changing some of your logic to floating point math.
Remember to profile before optimization
And above all, better algorithms will always beat micro-optimizations such as the above.
If you are dealing with large blocks of sequential data, NEON is definitely the way to go.
Float or fixed, that's a good question. NEON is somewhat faster dealing with fixed, but I'd keep the native input format since conversions take time and eventually, extra memory.
Even if the lib offers a different output formats as an option, it almost alway means lib-internal conversions. So I guess float is the native one in this case. Stick to it.
Noone prevents you from micro-optimizing better algorithms. And usually, the better the algorithm, the more performance gain can be achieved through micro-optimizations due to the pipelining on modern machines.
I'd stay away from intrinsics though. There are so many posts on the net complaining about intrinsics doing something crazy, especially when dealing with immediate values.
It can and will get very troublesome, and you can hardly optimize anything with intrinsics either.

Matlab vs Aforge vs OpenCV

I am about to start a project in visual image-processing and have no had experience with Matlab, Aforge, OpenCV and was wondering if anyone had any experiences with these different software packages.
I was also wondering which of the three packages were most efficient I assume OpenCV but has anyone had any experience?
Thanks
Jamie.
The question you need to ask yourself is which is more important - your time or the computer's time. If your task is really simple, you may be able to code it up in MATLAB and have it work right off the bat. MATLAB is by far the easiest for development - a scripted language with built-in memory management, a huge array of provided functions, and a great interface for displaying and manipulating data while debugging.
On the other hand, MATLAB is at least an order of magnitude slower than compiled openCV code for many tasks. This is especially true if you use the intel performance primitives libraries.
If you know how to code in MATLAB, I would suggest writing and debugging your algorithms in that language, then porting them to c/c++ with openCV for speed. If there are only a couple of simple functions that you need to speed up, you can call c code from MATLAB, but it's hard to get this working right the first few times you try it, so you're probably better off just rewriting your finished code entirely in c/c++
First, please elaborate about your project's needs. It has the biggest impact on the choice, in addition to other factors - your general programming knowledge (If you haven't dealt with dot net but just with C++, AForge is not a good choice, for example).
Generally,
Both AForge and OpenCV has a built-in interface to .Net, and OpenCV also with C++, python, and more. Matlab might be more efficient, but if you don't have any experience with it - you should also learn its syntax. Take it into consideration.
Matlab probably has the largest variety of functions, but it is more complicated than the other projects. OpenCV and AForge themselves have some differences - see them described in this StackOverflow question/ answers.
I worked last year in two similar projects with cars on the highway. Afaik, Matlab allows to process only one picture frame at a time (surely you could elaborate an algorithm to compute a stream) but using Simulink you can process the stream directly.
On the other hand, i found AForge a lot friendlier and easier to use since you can easily adjust the processing parameters from a GUI (not so fast/easy) to do in Matlab/simulink.
I'd go for Aforge.Net. It's also fast enough if you're worrying about processing speed. (using 640x480)
If you are asking about using one of these in .net,easily you can get info by this:
1-matlab mostly used in simulation of projects not the End-prototype project; my numer : 30;
2-aforge (as I'v used in many project) if you do not need the circular process like capturing image, or recognition of something in images or ... you'll find it very good, cause it is easy to use but useful for single processes; my number : 50
3-opencv very good at speed and useful for circular processes, for example you can capture images from a webcam and Instantly cartoonize it without any delay, But not easy-to-use as aforge. I like it anyway cause of its speed and MANY functions it gives us mostly anything we need in programming; my number : 80
Dr.Taha - Tahasoft.net

Your experiences with Matlab/F#/R for data analysis and modeling algorithms

I've been using F# for a while now to model algorithms before coding them in C++, and also using it afterwards to check the results of the C++ code, and also against real-world recorded data.
For the modeling side of things, it's very handy, but for the 'data mashup' kind of stuff, pulling in data from CSV and other sources, generating statistics, drawing charts etc., my colleague teases me no end ("why are you coding that yourself? It's built in to MatLab").
And I have another colleague who swears by R, which also has charting stuff 'built-in'.
I know that MatLab, R and F# are not strictly comparable, so I'm not asking for a 'feature comparison shoot out'. I just wondered what other people are using for these kind of pre- and post-analysis scenarios, and how happy they are with it.
(If there's anyone out there working on wrapping Microsoft Charts into something F#-friendly, let me know, I'd be happy to participate...)
(Note: answers to this question will be subjective, but based on experience, please)
I have very little experience with F#, but regarding C++/Matlab/R: If the speed of your program's execution is the most important, use C++. If speed of implementation is the most important, use Matlab or R. This is true for a number of reasons, not the least of which is their massive libraries of math/stats packages.
Both Matlab and R can be sped up through parallelism: so generally, I think that speed and quality of implementation should be a bigger concern. That's where the real "value" of programming is taking place, in the design of the application. It's not a minor proposition if you can write 3 or 4 good R programs in the same time it takes you to write 1 good C++ program.
Regarding F#: so far as it is part of Microsoft's framework, it must have a lot to offer. If you're developing in Visual Studio or working on a big .Net project (for instance), it might make sense to use F#. On the other hand, you can call both Matlab and R from .Net applications, so I would probably argue that their libraries should be a bigger concern. For instance, see this article as an example for R and the Matlab Builder.
Long story short: comparing F# and Matlab/R isn't a good comparison. F# is a general purpose programming language, while Matlab/R can be viewed as massive mathematical/data analysis toolkits. Some people call Matlab or R from F# in order to take advantage of each language's benefits (e.g. see this discussion, this article on Matlab/F#, or this article on R/F#).
So far as charting is concerned: R is extremely strong on this front. Have a look at the graphics view on CRAN and this series of posts on the LearnR blog about Lattice and ggplot2.
I've worked a bit with matlab and python/pylab for these purposes. What these tools have 'built-in' is a programming environment, a shell, and gui tools designed for quickly looking at data from a variety of sources.
In a few commands, you can go from having a csv file to interactive plots on the screen, then to an image export in just about any format. It takes a minute or two to go from data to visualization once you have the hang of it. I would imagine this is uncommon in the C++ world (although I have seen some professors with pretty impressive work-flows).
I've tried R, but I can't say much useful about it. It seems to offer about the same set of features, but it may be troublesome to Google for support.
If you are spending more than a couple minutes getting from data to plot using your current method, it's definitely worth learning one of these environments. The best choice depends on your colleagues, your work environment, experience, and your budget.
This is a reasonable close double to the previous question on suitable functional language for scientific/statistical computing so you may want to peruse the long and detailed answers there.
Answers depends, as so often, on your experience and prior language training. I very much prefer R for data munging / modeling / visualization.
I use R because on the one hand it has everything built in and on the other hand you can still manipulate almost everything or start from scratch. Nevertheless, R is rather slow for heavy calculations (although I do all my Monte Carlo simulations in it).
I would say that Matlab is best for the availability of mathematical functionalities in general, R is best for data input/manipulation/visualisation/analysis/etc., and C++ for high-speed subroutines. You can by the way easily integrate C++ (or C, fortran, ...) code in R. Why not read and manipulate input data in R, apply the models in C++, and analyse/visualize output back in R?
I always prototype my models in MATLAB. If my prototype is fast enough, I refactor and it's done. If not, I go back and implement certain functions in C to be called by MATLAB. This requires knowledge of a low level language, which I think is always going to be the case if you are doing anything that is technically challenging.
I'm intrigued with this Lisp flavor if it ever gets off the ground.