Speed Comparison: C++ vs Objective C [duplicate] - iphone

When programming a CPU intensive or GPU intensive application on the iPhone or other portable hardware, you have to make wise algorithmic decisions to make your code fast.
But even great algorithm choices can be slow if the language you're using performs more poorly than another.
Is there any hard data comparing Objective-C to C++, specifically on the iPhone but maybe just on the Mac desktop, for performance of various similar language aspects? I am very familiar with this article comparing C and Objective-C, but this is a larger question of comparing two object oriented languages to each other.
For example, is a C++ vtable lookup really faster than an Obj-C message? How much faster? Threading, polymorphism, sorting, etc. Before I go on a quest to build a project with duplicate object models and various test code, I want to know if anybody has already done this and what the results where. This type of testing and comparison is a project in and of itself and can take a considerable amount of time. Maybe this isn't one project, but two and only the outputs can be compared.
I'm looking for hard data, not evangelism. Like many of you I love and hate both languages for various reasons. Furthermore, if there is someone out there actively pursuing this same thing I'd be interesting in pitching in some code to see the end results, and I'm sure others would help out too. My guess is that they both have strengths and weaknesses, my goal is to find out precisely what they are so that they can be avoided/exploited in real-world scenarios.

Mike Ash has some hard numbers for performance of various Objective-C method calls versus C and C++ in his post "Performance Comparisons of Common Operations". Also, this post
by Savoy Software is an interesting read when it comes to tuning the performance of an iPhone application by using Objective-C++.
I tend to prefer the clean, descriptive syntax of Objective-C over Objective-C++, and have not found the language itself to be the source of my performance bottlenecks. I even tend to do things that I know sacrifice a little bit of performance if they make my code much more maintainable.

Yes, well written C++ is considerably faster. If you're writing performance critical programs and your C++ is not as fast as C (or within a few percent), something's wrong. If your ObjC implementation is as fast as C, then something's usually wrong -- i.e. the program is likely a bad example of ObjC OOD because it probably uses some 'dirty' tricks to step below the abstraction layer it is operating within, such as direct ivar accesses.
The Mike Ash 'comparison' is very misleading -- I would never recommend the approach to compare execution times of programs you have written, or recommend it to compare C vs C++ vs ObjC. The results presented are provided from a test with compiler optimizations disabled. A program compiled with optimizations disabled is rarely relevant when you are measuring execution times. To view it as a benchmark which compares C++ against Objective-C is flawed. The test also compares individual features, rather than entire, real world optimized implementations -- individual features are combined in very different ways with both languages. This is far from a realistic performance benchmark for optimized implementations. Examples: With optimizations enabled, IMP cache is as slow as virtual function calls. Static dispatch (as opposed to dynamic dispatch, e.g. using virtual) and calls to known C++ types (where dynamic dispatch may be bypassed) may be optimized aggressively. This process is called devirtualization, and when it is used, a member function which is declared virtual may even be inlined. In the case of the Mike Ash test where many calls are made to member functions which have been declared virtual and have empty bodies: these calls are optimized away entirely when the type is known because the compiler sees the implementation and is able to determine dynamic dispatch is unnecessary. The compiler can also eliminate calls to malloc in optimized builds (favoring stack storage). So, enabling compiler optimizations in any of C, C++, or Objective-C can produce dramatic differences in execution times.
That's not to say the presented results are entirely useless. You could get some useful information about external APIs if you want to determine if there are measurable differences between the times they spend in pthread_create or +[NSObject alloc] on one platform or architecture versus another. Of course, these two examples will be using optimized implementations in your test (unless you happen to be developing them). But for comparing one language to another in programs you compile… the presented results are useless with optimizations disabled.
Object Creation
Consider also object creation in ObjC - every object is allocated dynamically (e.g. on the heap). With C++, objects may be created on the stack (e.g. approximately as fast as creating a C struct and calling a simple function in many cases), on the heap, or as elements of abstract data types. Each time you allocate and free (e.g. via malloc/free), you may introduce a lock. When you create a C struct or C++ object on the stack, no lock is required (although interior members may use heap allocations) and it often costs just a few instructions or a few instructions plus a function call.
As well, ObjC objects are reference counted instances. The actual need for an object to be a std::shared_ptr in performance critical C++ is very rare. It's not necessary or desirable in C++ to make every instance a shared, reference counted instance. You have much more control over ownership and lifetime with C++.
Arrays and Collections
Arrays and many collections in C and C++ also use strongly typed containers and contiguous memory. Since the address of the next element's members are often known, the optimizer can do much more, and you have great cache and memory locality. With ObjC, that's far from reality for standard objects (e.g. NSObject).
Dispatch
Regarding methods, many C++ implementations use few virtual/dynamic calls, particularly in highly optimized programs. These are static method calls and fodder for the optimizers.
With ObjC methods, each method call (objc message send) is dynamic, and is consequently a firewall for the optimizer. Ultimately, that results in many restrictions or inconveniences regarding what you can and cannot do to keep performance at a minimum when writing performance critical ObjC. This may result in larger methods, IMP caching, frequent use of C.
Some realtime applications cannot use any ObjC messaging in their render paths. None -- audio rendering is a good example of this. ObjC dispatch is simply not designed for realtime purposes; Allocations and locks may happen behind the scenes when messaging objects, making the complexity/time of objc messaging unpredictable enough that the audio rendering may miss its deadline.
Other Features
C++ also provides generics/template implementations for many of its libraries. These optimize very well. They are typesafe, and a lot of inlining and optimizations may be made with templates (consider it polymorphism, optimization, and specialization which takes place at compilation). C++ adds several features which just are not available or comparable in strict ObjC. Trying to directly compare langs, objects, and libraries which are very different is not so useful -- it's a very small subset of actual realizations. It's better to expand the question to a library/framework or real program, considering many aspects of design and implementation.
Other Points
C and C++ symbols can be more easily removed and optimized away in various stages of the build (stripping, dead code elimination, inlining and early inlining, as well as Link Time Optimization). The benefits of this include reduced binary sizes, reduced launch/load times, reduced memory consumption, etc.. For a single app, that may not be such a big deal; but if you reuse a lot of code, and you should, then your shared libraries could add a lot of unnecessary weight to the program, if implemented ObjC -- unless you are prepared to jump through some flaming hoops. So scalability and reuse are also factors in medium/large projects, and groups where reuse is high.
Included Libraries
ObjC library implementors also optimize for the environment, so its library implementors can make use of some language and environment features to offer optimized implementations. Although there are some pretty significant restrictions when writing an optimized program in pure ObjC, some highly optimized implementations exist in Cocoa. This is one of Cocoa's strong points, although the C++ standard library (what some people call the STL) is no slouch either. Cocoa operates at a much higher level of abstraction than C++ -- if you don't know well what you're doing (or should be doing), operating closer to the metal can really cost you. Falling back on to a good library implementation if you are not an expert in some domain is a good thing, unless you are really prepared to learn. As well, Cocoa's environments are limited; you can find implementations/optimizations which make better use of the OS.
If you're writing optimized programs and have experience doing so in both C++ and ObjC, clean C++ implementations will often be twice as fast or faster than clean ObjC (yes, you can compare against Cocoa). If you know how to optimize, you can often do better than higher level, general purpose abstractions. Although, some optimized C++ implementations will be as fast as or slower than Cocoa's (e.g. my initial attempt at file I/O was slower than Cocoa's -- primarily because the C++ implementation initializes its memory).
A lot of it comes down to the language features you are familiar with. I use both langs, they both have different strengths and models/patterns. They complement each other quite well, and there are great libraries for both. If you're implementing a complex, performance critical program, correct use of C++'s features and libraries will give you much more control and provide significant advantages for optimization, such that in the right hands, "several times faster" is a good default expectation (don't expect to win every time, or without some work, however). Remember, it takes years to understand C++ well enough to really reach that point.
I keep the majority of my performance critical paths as C++, but also recognize that ObjC is also a very good solution for some problems, and that there are some very good libraries available.

It's very hard to collect "hard data" for this that's not misguiding.
The biggest problem with doing a feature-to-feature comparison like you suggest is that the two languages encourage very different coding styles. Objective-C is a dynamic language with duck typing, where typical C++ usage is static. The same object-oriented architecture problem would likely have very different ideal solutions using C++ or Objective-C.
My feeling (as I have programmed much in both languages, mostly on huge projects): To maximize Objective-C performance, it has to be written very close to C. Whereas with C++, it's possible to make much more use of the language without any performance penalty compared to C.
Which one is better? I don't know. For pure performance, C++ will always have the edge. But the OOP style of Objective-C definitely has its merits. I definitely think it is easier to keep a sane architecture with it.

This really isn't something that can be answered in general as it really depends on how you use the language features. Both languages will have things that they are fast at, things that they are slow at, and things that are sometimes fast and sometimes slow. It really depends on what you use and how you use it. The only way to be certain is to profile your code.
In Objective C you can also write c++ code, so it might be easier to code in Objective C for the most part, and if you find something that doesn't perform well in it, then you can have a go at writting a c++ version of it and seeing if that helps (C++ tends to optimize better at compile time). Objective C will be easier to use if APIs you are interfacing with are also written in it, plus you might find it's style of OOP is easier or more flexible.
In the end, you should go with what you know you can write safe, robust code in and if you find an area that needs special attention from the other language, then you can swap to that. X-Code does allow you to compile both in the same project.

I have a couple of tests I did on an iPhone 3G almost 2 years ago, there was no documentation or hard numbers around in those days. Not sure how valid they still are but the source code is posted and attached.
This isn't a very extensive test, I was mainly interested in NSArray vs C Array for iterating a large number of objects.
http://memo.tv/nsarray_vs_c_array_performance_comparison
http://memo.tv/nsarray_vs_c_array_performance_comparison_part_ii_makeobjectsperformselector
You can see the C Array is much faster at high iterations. Since then I've realized that the bottleneck is probably not the iteration of the NSArray but the sending of the message. I wanted to try methodForSelector and calling the methods directly to see how big the difference would be but never got round to it. According to Mike Ash's benchmarks it's just over 5x faster.

I don't have hard data for Objective C, but I do have a good place to look for C++.
C++ started as C with Classes according to Bjarne Stroustroup in his reflection on the early years of C++ (http://www2.research.att.com/~bs/hopl2.pdf), so C++ can be thought of (like Objective C) as pushing C to its limits for object orientation.
What are those limits? In the 1994-1997 time frame, a lot of researchers figured out that object-orientation came at a cost due to dynamic binding, e.g. when C++ functions are marked virtual and there may/may not be children classes that override these functions. (In Java and C#, all functions expect ctors are inherently virtual, and there isnt' much you can do about it.) In "A Study of Devirtualization Techniques for a Java Just-In-Time Compiler" from researchers at IBM Research Tokyo, they contrast the techniques used to deal with this, including one from Urz Hölzle and Gerald Aigner. Urz Hölzle, in a separate paper with Karel Driesen, had shown that on average 5.7% of time in C++ programs (and up to ~50%) was spent in calling virtual functions (e.g. vtables + thunks). He later worked with some Smalltalk researachers in what ended up the Java HotSpot VM to solve these problems in OO. Some of these features are being backported to C++ (e.g. 'protected' and Exception handling).
As I mentioned, C++ is static typed where Objective C is duck typed. The performance difference in execution (but not lines of code) probably is a result of this difference.

This study says to really get the performance in a CPU intensive game, you have to use C. The linked article is complete with a XCode project that you can run.
I believe the bottom line is: Use Objective-C where you must interact with the iPhone's functions (after all, putting trampolines everywhere can't be good for anyone), but when it comes to loops, things like vector object classes, or intensive array access, stick with C++ STL or C arrays to get good performance.
I mean it would be totally silly to see position = [[Vector3 alloc] init] ;. You're just asking for a performance hit if you use references counts on basic objects like a position vector.

yes. c++ reign supreme in performance/expresiveness/resource tradeoff.
"I'm looking for hard data, not evangelism". google is your best friend.
obj-c nsstring is swapped with c++'s by apple enginneers for performance. in a resource constrained devices, only c++ cuts it as a MAINSTREAM oop language.
NSString stringWithFormat is slow
obj-c oop abstraction is deconstructed into procedural-based c-structs for performance, otherwise a MAGNITUDE order slower than java! the author is also aware of message caching - yet no-go. so modeling lots of small players/enemies objects is done in oop with c++ or else, lots of Procedural structs with a simple OOP wrapper around it with obj-c. there can be one paradigm that equates Procedural + Object-Oriented Programming = obj-c.
http://ejourneyman.wordpress.com/2008/04/23/writing-a-ray-tracer-for-cocoa-objective-c/

Related

Tooling for expressive, feature rich numeric computations on the JVM

I am looking for numeric computation tooling on the JVM. My major requirements are expressiveness/readability, ease of use, evaluation and features in terms of mathematical functions. I guess I am after something like the Matlab kernel (probably including some basic libraries and w/o graphics) on the JVM. I'd like to be able to "throw" computional code at a running JVM and want this code to be evaluated. I don't want to worry about types. Arbitrary precision and performance is not so important.
I guess there are some nice libraries out there but I think an appropriate language on top is needed to get the expressiveness.
Which tooling would you guys suggest to address expressive, feature rich numeric computation on the JVM ?
From the jGroovyLab page:
The GroovyLab environment aims to provide a Matlab/Scilab like scientific computing platform that is supported by a scripting engine implemented in Groovy language. The GroovyLab user can work either with a Matlab-lke command console, or with a flexible editor based on the jsyntaxpane (http://code.google.com/p/jsyntaxpane/) component, that offers more convenient code development. Also, GroovyLab supports Computer Algebra based on the symja (http://code.google.com/p/symja/) project.
And there is also GroovyLab:
GroovyLab is a collection of Groovy classes to provide matlab-like syntax and basic features (linear algebra, 2D/3D plots). It is based on jmathplot and jmatharray libs:
Groovy has a smooth learning curve for Java programmers and a flexible syntax similar to Ruby. It is also pretty easy to write a DSL on it.
Though Groovy's performance is pretty good for a dynamic language, you can use static compilation if you are in the need for it.
Most of Mathworks Matlab is built on the Intel Math Kernel Library (MKL), which is (IMHO) the unbeatable champion in linear algebra computations. There is java support, but it costs 500 dollar (the MKL, not just the java support)...
Best second option if you want to use java is jblas, which uses BLAS and LAPACK, the industry standards for linear algebra.
Pure java libraries' performances are horrible apparently, see here...
Spire sounds like it's aiming at the area you're looking at. It takes advantage of a lot of recent scala features such as macros to get decent performance without having to sacrifice the expressiveness of being in a high level language.
There's also breeze, which is targeted at machine learning but includes a fair amount of linear algebra stuff.
Depending how much work you want to get into and what languages you're already familiar with, Incanter in the Clojure world might be worth a look. Also quickly evolving in Clojure right now is core.matrix, which aims to encapsulate high-level common abstractions in linear algebra implemented with various methods or packages.
You highlighted expressiveness in your post, and the nice thing about Clojure is that, as a Lisp, it is possible to make or extend DSLs to closely match problem domains. This is one of the big draws of the language (and of Lisps in general).
I'm the original author of core.matrix for Clojure. So I have a clear affiniy and much more knowledge in this specific space. That said, I'm still going to try and give you an honest answer :-)
I was the the same position as you a year or so back, looking for a solution for numeric computation that would be scalable, flexible and suitable for deployment as a clustered cloud service.
I ended up going with Clojure for the following reasons:
Functional Programming: Clojure is a functional programming language at heart, more so than most other language (although not as much as Haskell....). Lazy infinite sequences, persistent data structures, immutability throughout etc. Makes for elegany code when you are dealing with big computations.
Metaprogramming: I saw a need to do code generation for vector / computational experessions. Hence being a Lisp was a big plus: once you have done code generation in a homoiconic language with a "whole language" macro system then it's hard to find anything else that comes close.
Concurrency - Clojure has an impressive and movel approach to multi-code concurrency. If you haven't seen it then watch: http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey
Interactive REPL: Something I've always felt is very important for data work. You want to be able to work with your code / data "live" to get a real feel for its properties. Having a dynamically typed language with an interactive REPL works wonders here.
JVM based: big advantage for pragmantic purposes, because of the huge library / tool ecosystem and the excellent engineering in the JVM as a runtime platform.
Community: I saw a lot of innovation going on in Clojure, particularly around the general area of data and analytics.
The main thing Clojure was lacking at that time was a good library / API for matrix operations. There were some nice tools in Incanter, but they weren't very general purpose or performant. Hence I started developing core.matrix, which is shaping up to be an idiomatic Clojure-flavoured equivalent of NumPY / SciPY. Right now it is still work in progress but good enough for production use if you are careful.
In terms of low-level matrix support, I also maintain vectorz-clj, which is my attempt to provide a core.mattrix implementation that offers high performance vector/matrix operations while remaining Pure Java (i.e. no native dependencies). If you are interested in the performance of this, you may like to see:
http://clojurefun.wordpress.com/2013/03/07/achieving-awesome-numerical-performance-in-clojure/
My second choice after Clojure would have been Scala. I liked Scala's slightly greater maturity and decent static type system. Both the languages are JVM based so the library / tool side was a tie. It was probably the Lisp features that clinched it.
If you happen to have access to Mathematica, then it's fairly easy to get it working with the JVM by means of J/Link. For Clojure, Clojuratica is an excellent library to make that as seemless as possible, although it's not been maintained for a while and it may take some effort to get it working in modern environments again.

Teaching Programming Best Practices to Perl Developers

I have been delivering training on Programming Practices and on Writing Quality Code to participants who have been working on Java since sometime. Object Oriented Analysis and Design is the base and I cover S.O.L.I.D. Principles and excerpts from books like Clean Code, Code Complete 2 and so on.
I am scheduled to deliver training to Perl Programmers(with less than 1 yr. exp. in Perl) in two days and they do not use the Moose(an extension of the Perl 5 object system which brings modern object-oriented language features).
I am now confused as to how to structure my training as they don't follow OOPs.
Any suggestions?
Regards,
Shardul.
Even without Moose, object-oriented programming in Perl is quite possible, and very common. Many CPAN modules offer their functionality through an object-oriented API, even if many of these also offer a non–object-oriented API. (A good example of this duality is IO::Compress::Zip.) Obviously the norms of object-oriented design in Perl are somewhat different from those in some languages — encapsulation is not enforced by the language, for example — but the overall principles and practices are the same.
And even without any sort of object-oriented programming, Moosish or otherwise, there's plenty to talk about in terms of laying out packages, organizing code into functions/subroutines/modules, structuring data, taking advantage of use warnings (or -w) and use strict and -T and CPAN modules, and so on.
I'd also recommend Mark Jason Dominus's book Higher-Order Perl, which he has made available for free download. I don't know to what extent you can race through the whole book in a day and put together something useful in time for your presentation — functional programming is a bit of a paradigm-shift for someone who's not used to it (be it you, or the programmers you're presenting to!) — but you may find some useful things in there that you can use.
A lot of the answers here are answers about teaching OOP to Perl programmers who don't use it, but your question sounds like you're stymied on how to teach a course on code quality, in light of the fact that your Perl programmers do not use OOP, not specifically that you want to teach OOP to non-OO programmers and force them into that paradigm.
That leaves us with two other paradigms of programming which Perl supports well enough:
Good ol' fashioned Structured Programming also Modular Programming
Functional programming support in Perl (also Higher-Order Perl)
I use both of these--combined with a healthy dose of objects, as well. So, I use objects for the same reason that I use good structure and modules and functional pipelines. Using the tool that brings order and sanity to the programming process. For example, object-oriented programming is the main form of polymorphism--but OOP is not polymorphism itself. Thus if you are writing idioms that assist in polymorphism, they assist in polymorphism, they don't have to be stuck in some ad-hoc library "class" and called like UtilClass->meta_operator( $object ) which has little polymorphism itself.
Moose is a great object language, but you don't call Moose->has( attribute => is => 'rw', isa => 'object' ). You call the operator has. The power of Moose lies in a library of objects that encapsulate the meta-operations on classes--but also in simple expressive operators that the rather open syntax of Perl allows. I would call that the appreciation of solving the problems that OOP solves with objects.
Also, I guess I have a problem with your problem, because "not OOP" is a big field. It can range from everything-in-the-mainline coding to not-strictly-OOP (where the process of programming is not simply OOP analysis). So I think you have to know your audience and know what it is they use to keep that code structured and sane. I can't imagine a modern Perl audience that isn't at least object-users.
From there, Perl Best Practices (often abbreviated PBP) can help you. But so would learning that
simply because OOP is one of the best supports for polymorphism it isn't polymorphism in itself
simply because OOP is one of the best supports for encapsulation it isn't encapsulation in itself.
That OOP has been assisted by structured and modular programming--and is not by itself those things. Some of its power is simply just those disciplines.
In addition, as big as an object author and consumer I am, OOP is not the way I think. Reusability is the way I think: What have I done before that I do not want to write again? What have I written that is similar? How can I make my current task just an adapter of what has been written before. (And often: how can I sneak my behavior branch an established module in a single line?)
As a result, a number of my constructs would fail the pedestrian goal of OOP. To give you a better view: I divide code into two "domains": Highly abstract and polymorphic Library code, and the Scripting that I need to do to get the particular function that I'm required in a current project. (this is essentially what "application" means, but I don't think it would be as clear). As a result, polymorphism is mainly instrumental in providing adaptability, but the adaptation itself is whatever takes the least lines of code. My optimum system would be a library that allows scripting/adaptation at any juncture between library behavior and a set of configurations or scripts that address a particular problem. Again, if I had my druthers, configuration would be injected from the scripted domain and no library code would say "I need a properties file" by itself, unless it was a library module encapsulating the algorithm of configuration instanced in properties files. It would just know that it needs "policies" (or decisions from the application domain) in order to fulfil its function.
Thus, my ideal application contains special purpose "objects", which conform to "roles" but where classes are useless overhead--except that the classes perform the behavior which allow injectable data and behavior. So some of my Perl "objects" violate OOP analysis, because they are simply encapsulations of one-off solutions, kind of like the push-pin (expando) JavaScript objects.
I will often (later) revise a special-purpose object and push it further back into the library domain as I find that I need to write something like this again. All objects in the library domain are just on some level of the spectrum of specified behavior. Also, I arrange "data networks" where there is a Sourced type of class that simply encapsulates the behavior of accessing data either in the object itself or another source object. This helps speed my solutions immensely, but I've never seen it addressed in any duck-cat-dog-car-truck OOP primer. Also templating--especially when combined with "data networks"--immensely useful in coding solutions in a half-dozen lines or a half-day of work.
So I guess I'm saying, to the extent that you only know OOP for structuring programming, you won't be able to appreciate how much some older, sound practices or other paradigms do for you--or how things that qualify as OOP can promote mediocre adaptability. (Besides components are far more current than "objects".) Encapsulation solves many problems, but it also promotes the lack of data where you need it. The idea is to get data where you need it so that your canned behavior can realize the specifics of the problem and operate on that.
Reread some stuff on structured programming
Read some stuff on functional programming (assuming that you're not already familiar with it.)
Also it's possible that even an established, "productive" Perl team is writing ... crap. If they are not OOP programmers because they are simply writing crap code, then by all means teach them OOP and if they lack even structured programming *shove both of them down their throats* (I have a hard time considering the label "professional", here).
Take a good look at 'Perl Best Practices' by Damian Conway. It has lots of solid material in it, and you won't go far wrong taking his advice.
Be aware, though, that Getopt::Clade is only available as a placeholder package - it is vapourware, in other words.
You might want to look at what's covered in the "Modern Perl" book too:
http://onyxneon.com/books/modern_perl/
As the others say - plenty to cover without Moose.
Setting up modules/distros
Testing and TAP
Deployment with cpanm / cpan / local::lib
Important changes 5.8 5.10 vs 5.12 vs 5.14, autodie etc.
Perl programmers must know about Perl's weakly functional features, like list contexts, map, grep, etc. A little functional style makes Perl infinitely more readable.
Perl programmers must also understand Perl's traditional OO features, especially modules, bless, and tie. Make them write an object or maybe tie a Cache::Memcached object around a query or something.

What are "not so well defined problems" that LISP is supposed to solve?

Most people agree that LISP helps to solve problems that are not well defined, or that are not fully understood at the beginning of the project.
"Not fully understood"" might indicate that we don't know what problem we are trying to solve, so the developer refines the problem domain continuously. But isn't this process language independent?
All this refinement does not take away the need for, say, developing algorithms/solutions for the final problem that does need to be solved. And that is the actual work.
So, I'm not sure what advantage LISP provides if the developer has no idea where he's going i.e. solving a problem that is not finalised yet.
Lisp (not "LISP") has a number of advantages when you're facing problems that are not well-defined. First of all, you have a REPL where you can quickly experiment with -- that helps in sketching out quick functions and trying to play with them, leading to a very rapid development cycle. Second, having a dynamically typed language is working well in this context too: with a statically typed language you need to "design more" before you begin, and changing the design leads to changing more code -- in contrast, with Lisps you just write the code and the data it operates on can change as needed. In addition to these, there's the usual benefits of a functional language -- one with first class lambda functions, etc (eg, garbage collection).
In general, these advantage have been finding their way into other languages. For example, Javascript has everything that I listed so far. But there is one more advantage for Lisps that is still not present in other languages -- macros. This is an important tool to use when your problem calls for a domain specific language. Basically, in Lisp you can extend the language with constructs that are specific to your problem -- even if these constructs lead to a completely different language.
Finally, you need to plan ahead for what happens when the code becomes more than a quick experiment. In this case you want your language to cope with "growing scripts into applications" -- for example, having a module system means that you can get a more "serious"
application. For example, in Racket you can get your solution separated into such modules, where each can be written in its own language -- it even has a statically typed language which makes it possible to start with a dynamically typed development cycle and once the code becomes more stable and/or big enough that maintenance becomes difficult, you can switch some modules into the static language and get the usual benefits from that. Racket is actually unique among Lisps and Schemes in this kind of support, but even with others the situation is still far more advanced than in non-Lisp languages.
In AI (Artificial Intelligence) historically Lisp was seen as the AI assembly language. It was used to build higher-level languages which help to work with the problem domain in a more direct way. Many of these domains need a lot of 'knowledge' for finding usable answers.
A typical example is an expert system for, say, oil exploration. The expert system gets as inputs (geological) observations and gives information about the chances to find oil, what kind of oil, in what depths, etc. To do that it needs 'expert knowledge' how to interpret the data. When you start such a project to develop such an expert system it is typically not clear what kind of inferences are needed, what kind of 'knowledge' experts can provide and how this 'knowledge' can be written down for a computer.
In this case one typically develops new languages on top of Lisp and you are not working with a fixed predefined language.
As an example see this old paper about Dipmeter Advisor, a Lisp-based expert system developed by Schlumberger in the 1980s.
So, Lisp does not solve any problems. But it was originally used to solve problems that are complex to program, by providing new language layers which should make it easier to express the domain 'knowledge', rules, constraints, etc. to find solutions which are not straight forward to compute.
The "big" win with a language that allows for incremental development is that you (typically) has a read-eval-print loop (or "listener" or "console") that you interact with, plus you tend to not need to lose state when you compile and load new code.
The ability to keep state around from test run to test run means that lengthy computations that are untouched by your changes can simply be kept around instead of being re-computed.
This allows you to experiment and iterate faster. Being able to iterate faster means that exploration is less of a hassle. Very useful for exploratory programming, something that is typical with dealing with less well-defined problems.

Comparing Common Lisp with Gambit w.r.t their library access and object systems

I'm pretty intrigued by Gambit Scheme, in particular by its wide range of supported platforms, and its ability to put C code right in your Scheme source when needed. That said, it is a Scheme, which has fewer "batteries included" as compared to Common Lisp. Some people like coding lots of things from scratch, (a.k.a. vigorous yak-shaving) but not me!
This brings me to my two questions, geared to people who have used both Gambit and some flavor of Common Lisp:
1) Which effectively has better access to libraries? Scheme has fewer libraries than Common Lisp. However, Gambit Scheme has smoother access to C/C++ code & libraries, which far outnumber Common Lisp's libraries. In your opinion, does the smoothness of Gambit's FFI outweigh its lack of native libraries?
2) How do Scheme's object systems (e.g. TinyCLOS, Meroon) compare to Common Lisp's CLOS? If you found them lacking, what feature(s) did you miss most? Finally, how important is an object system in Lisp/Scheme in the first place? I have heard of entire lisp-based companies (e.g. ITA Software) forgoing CLOS altogether. Are objects really that optional in Lisp/Scheme? I do fear that if Gambit has no good object system, I may miss them (my programming background is purely object-oriented).
Thanks for helping an aspiring convert from C++/Python,
-- Matt
PS: Someone with more than 1500 rep, could you please create a "gambit" tag? :) Thanks Jonas!
Sure Scheme as a whole has fewer libraries in the defined standard, but any given Scheme implementation usually builds on that standard to include more "batteries included" type of functions.
Gambit, for example, uses the Snow package system which will give you access to several support libraries.
Other Schemes fare even better, having access to more (or better) support libraries. Both Racket (with PlaneT) and Chicken (with eggs) immediately come to mind.
That said, the Common Lisp is quite rich also and a large number of interesting and useful libraries are a simple asdf-install away.
As for Scheme object systems, I personally tend to favor Chicken Scheme and have taken to favoring coops. That said, there's absolutely nothing wrong with TinyCLOS. Either would serve well and don't really find anything to be lacking. Though that last statement might have more to do with the fact that I don't tend to rely on a lot of object oriented-isms when writing Scheme. Both systems in my experience tend to surface when I want to write "protocols" and then have a way of specializing on the protocol, if that makes sense.
1) I haven't used Gambit Scheme, so I cannot really tell how smooth the C/C++ integration is. But all Common Lisps I have used have fully functional C FFI:s. So the availability of C libraries is the same. It takes some work to integrate, but I assume this is the case with Gambit Scheme as well. After all, Lisp and C are different languages..? But maybe you have a different experience, I would like to learn more in that case.
You may be interested in Quicklisp, a really good new Common Lisp project - it makes it very easy to install a lot of quality libraries.
2) C++ and Python are designed to use OOP and classes as the typical means of encapsulating and structuring data. CLOS does not have this ambition at all. Instead, it provides generic functions that can be specialized for certain types of arguments - not necessarily classes. Essentially this enables OOP, but in Common Lisp, OOP is a convenient feature rather than something fundamental for getting things done.
I think CLOS is a lot more well-designed and flexible than the C++ object model - TinyCLOS should be no different in that aspect.

Language requirements for AI development [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Why is Lisp used for AI?
What makes a language suitable for Artificial Intelligence development?
I've heard that LISP and Prolog are widely used in this field. What features make them suitable for AI?
Overall I would say the main thing I see about languages "preferred" for AI is that they have high order programming along with many tools for abstraction.
It is high order programming (aka functions as first class objects) that tends to be a defining characteristic of most AI languages http://en.wikipedia.org/wiki/Higher-order_programming that I can see. That article is a stub and it leaves out Prolog http://en.wikipedia.org/wiki/Prolog which allows high order "predicates".
But basically high order programming is the idea that you can pass a function around like a variable. Surprisingly a lot of the scripting languages have functions as first class objects as well. LISP/Prolog are a given as AI languages. But some of the others might be surprising. I have seen several AI books for Python. One of them is http://www.nltk.org/book. Also I have seen some for Ruby and Perl. If you study more about LISP you will recognize a lot of its features are similar to modern scripting languages. However LISP came out in 1958...so it really was ahead of its time.
There are AI libraries for Java. And in Java you can sort of hack functions as first class objects using methods on classes, it is harder/less convenient than LISP but possible. In C and C++ you have function pointers, although again they are much more of a bother than LISP.
Once you have functions as first class objects, you can program much more generically than is otherwise possible. Without functions as first class objects, you might have to construct sum(array), product(array) to perform the different operations. But with functions as first class objects you could compute accumulate(array, +) and accumulate(array, *). You could even do accumulate(array, getDataElement, operation). Since AI is so ill defined that type of flexibility is a great help. Now you can build much more generic code that is much easier to extend in ways that were not originally even conceived.
And Lambda (now finding its way all over the place) becomes a way to save typing so that you don't have to define every function. In the previous example, instead of having to make getDataElement(arrayelement) { return arrayelement.GPA } somewhere you can just say accumulate(array, lambda element: return element.GPA, +). So you don't have to pollute your namespace with tons of functions to only be called once or twice.
If you go back in time to 1958, basically your choices were LISP, Fortran, or Assembly. Compared to Fortran LISP was much more flexible (unfortunately also less efficient) and offered much better means of abstraction. In addition to functions as first class objects, it also had dynamic typing, garbage collection, etc. (stuff any scripting language has today). Now there are more choices to use as a language, although LISP benefited from being first and becoming the language that everyone happened to use for AI. Now look at Ruby/Python/Perl/JavaScript/Java/C#/and even the latest proposed standard for C you start to see features from LISP sneaking in (map/reduce, lambdas, garbage collection, etc.). LISP was way ahead of its time in the 1950's.
Even now LISP still maintains a few aces in the hole over most of the competition. The macro systems in LISP are really advanced. In C you can go and extend the language with library calls or simple macros (basically a text substitution). In LISP you can define new language elements (think your own if statement, now think your own custom language for defining GUIs). Overall LISP languages still offer ways of abstraction that the mainstream languages still haven't caught up with. Sure you can define your own custom compiler for C and add all the language constructs you want, but no one does that really. In LISP the programmer can do that easily via Macros. Also LISP is compiled and per the programming language shootout, it is more efficient than Perl, Python, and Ruby in general.
Prolog basically is a logic language made for representing facts and rules. What are expert systems but collections of rules and facts. Since it is very convenient to represent a bunch of rules in Prolog, there is an obvious synergy there with expert systems.
Now I think using LISP/Prolog for every AI problem is not a given. In fact just look at the multitude of Machine Learning/Data Mining libraries available for Java. However when you are prototyping a new system or are experimenting because you don't know what you are doing, it is way easier to do it with a scripting language than a statically typed one. LISP was the earliest languages to have all these features we take for granted. Basically there was no competition at all at first.
Also in general academia seems to like functional languages a lot. So it doesn't hurt that LISP is functional. Although now you have ML, Haskell, OCaml, etc. on that front as well (some of these languages support multiple paradigms...).
The main calling card of both Lisp and Prolog in this particular field is that they support metaprogramming concepts like lambdas. The reason that is important is that it helps when you want to roll your own programming language within a programming language, like you will commonly want to do for writing expert system rules.
To do this well in a lower-level imperative language like C, it is generally best to just create a separate compiler or language library for your new (expert system rule) language, so you can write your rules in the new language and your actions in C. This is the principle behind things like CLIPS.
The two main things you want are the ability to do experimental programming and the ability to do unconventional programming.
When you're doing AI, you by definition don't really know what you're doing. (If you did, it wouldn't be AI, would it?) This means you want a language where you can quickly try things and change them. I haven't found any language I like better than Common Lisp for that, personally.
Similarly, you're doing something not quite conventional. Prolog is already an unconventional language, and Lisp has macros that can transform the language tremendously.
What do you mean by "AI"? The field is so broad as to make this question unanswerable. What applications are you looking at?
LISP was used because it was better than FORTRAN. Prolog was used, too, but no one remembers that. This was when people believed that symbol-based approaches were the way to go, before it was understood how hard the sensing and expression layers are.
But modern "AI" (machine vision, planners, hell, Google's uncanny ability to know what you 'meant') is done in more efficient programming languages that are more sustainable for a large team to develop in. This usually means C++ these days--but it's not like anyone thinks of C++ as a good language for AI.
Hell, you can do a lot of what was called "AI" in the 70s in MATLAB. No one's ever called MATLAB "a good language for AI" before, have they?
Functional programming languages are easier to parallelise due to their stateless nature. There seems to already be a subject about it with some good answers here: Advantages of stateless programming?
As said, its also generally simpler to build programs that generate programs in LISP due to the simplicity of the language, but this is only relevant to certain areas of AI such as evolutionary computation.
Edit:
Ok, I'll try and explain a bit about why parallelism is important to AI using Symbolic AI as an example, as its probably the area of AI that I understand best. Basically its what everyone was using back in the day when LISP was invented, and the Physical Symbol Hypothesis on which it is based is more or less the same way you would go about calculating and modelling stuff in LISP code. This link explains a bit about it:
http://www.cs.st-andrews.ac.uk/~mkw/IC_Group/What_is_Symbolic_AI_full.html
So basically the idea is that you create a model of your environment, then searching through it to find a solution. One of the simplest to algorithms to implement is a breadth first search, which is an exhaustive search of all possible states. While producing an optimal result, it is usually prohibitively time consuming. One way to optimise this is by using a heuristic (A* being an example), another is to divide the work between CPUs.
Due to statelessness, in theory, any node you expand in your search could be ran in a separate thread without the complexity or overhead involved in locking shared data. In general, assuming the hardware can support it, then the more highly you can parallelise a task the faster you will get your result. An example of this could be the folding#home project, which distributes work over many GPUs to find optimal protein folding configurations (that may not have anything to do with LISP, but is relevant to parallelism).
As far as I know from LISP is that is a Functional Programming Language, and with it you are able to make "programs that make programs. I don't know if my answer suits your needs, see above links for more information.
Pattern matching constructs with instantiation (or the ability to easily construct pattern matching code) are a big plus. Pattern matching is not totally necessary to do A.I., but it can sure simplify the code for many A.I. tasks. I'm finding this also makes F# a convenient language for A.I.
Languages per se (without libraries) are suitable/comfortable for specific areas of research/investigation and/or learning/studying ("how to do the simplest things in the hardest way").
Suitability for commercial development is determined by availability of frameworks, libraries, development tools, communities of developers, adoption by companies. For ex., in internet you shall find support for any, even the most exotic issue/areas (including, of course, AI areas), for ex., in C# because it is mainstream.
BTW, what specifically is context of question? AI is so broad term.
Update:
Oooops, I really did not expect to draw attention and discussion to my answer.
Under ("how to do the simplest things in the hardest way"), I mean that studying and learning, as well as academic R&D objectives/techniques/approaches/methodology do not coincide with objectives of (commercial) development.
In student (or even academic) projects one can write tons of code which would probably require one line of code in commercial RAD (using of component/service/feature of framework or library).
Because..! oooh!
Because, there is no sense to entangle/develop any discussion without first agreeing on common definitions of terms... which are subjective and depend on context... and are not so easy to be formulate in general/abstract context.
And this is inter-disciplinary matter of whole areas of different sciences
The question is broad (philosophical) and evasively formulated... without beginning and end... having no definitive answers without of context and definitions...
Are we going to develop here some spec proposal?