Common Lisp: Are all functions built from the core functions, CAR, CDR, CONS, etc? - lisp

True or False?
Common Lisp has a mountain of functions. All of those functions are built (or could be built) using this small set of core functions: CAR, CDR, CONS, ATOM, EQ, QUOTE, COND, LAMBDA, LABEL, NULL.
If the answer is False, can you provide an example of a function that cannot be implemented using the core functions? Perhaps the list of core functions is incomplete, and one or two additional core functions are required?

[..] All of those functions are built (or could be built) using [..]
The important part is the could, which you already figured out yourself. That (almost*) all of (a) Lisp could be built using that small set of core functions (forms) is part of the beauty of Lisp. In practice, though, the set of functions (forms) that is not implemented in Lisp itself is a lot larger than that minimal core.
Why do implementations bother to implement that much, when they could get away with only the minimal core? As a small example, think of this expression:
(+ 1 2)
You could implement a Lisp that uses only the small set of core functions, and it would be able (given an appropriate parser for the numbers) to evaluate this expression. But it would be painfully slow! The systems (CPUs) available to us provide a lot of different instructions, which Lisp implementations (especially the compiling ones) try to leverage as much as possible in order to allow fast execution of Lisp programs. To come back to the example: that also means one wouldn't do actual calculations using Peano arithmetic, but would rather use the binary arithmetic implemented in hardware.
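To make that concrete, here is a minimal sketch in Common Lisp (the names peano, peano-add and peano-value are made up for illustration) of what addition looks like when numbers are themselves built out of cons cells, Peano-style, compared to the built-in + that maps onto hardware arithmetic:
(defun peano (n)                ; 3 => (T T T); ordinary arithmetic used only to build the example
  (if (= n 0) nil (cons t (peano (- n 1)))))

(defun peano-add (a b)          ; addition using nothing but CONS, CDR and NULL
  (if (null a) b (cons t (peano-add (cdr a) b))))

(defun peano-value (a)          ; (T T T) => 3, only to read the result back
  (if (null a) 0 (+ 1 (peano-value (cdr a)))))

(peano-value (peano-add (peano 1) (peano 2)))  ; => 3, in time proportional to the first operand
(+ 1 2)                                        ; => 3, essentially one machine instruction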
If the answer is False, can you provide an example of a function that cannot be implemented using the core functions?
That's easy: how would you implement format? Anything that's not part of the "algorithmic" nature of a programming language, i.e. the stuff interfacing with the "outside world", is usually not implemented in Lisp itself but, being dependent on the underlying system, most often in C or assembly.

False. Any function which deals with a datatype other than conses cannot be implemented from your proposed core set: for instance, important types such as NUMBER and SYMBOL cannot be dealt with at all. The same goes for any function which does I/O, and probably for other vast reaches of the language.
Your list sounds like it came from some incomplete description of a proposed core of Lisp 1.5.

Related

Typed Racket Optimizer

I am learning some Typed Racket at the moment and I have a somewhat philosophical dilemma:
Racket claims to be a language development framework, and Typed Racket is one such language implemented on top of it. The documentation mentions that because types are used, the compiler can now do more/better optimizations.
The concrete question:
Where do these optimizations happen?
1) In the compile/expand part (which is "programmable" as part of the language building framework)
-or-
2) further down the line in the (bytecode) optimizer (which is written in C and not directly modifiable via the framework).
If 2) is true, does that mean the type information is lost after the compile/expand stage and later "rebuilt/guessed" by the optimizer, or has the intermediate representation been altered to accommodate the type information and inform later stages about it?
The reason I am asking this specific question is that I want to get a feeling for how general the Racket language framework really is, i.e. whether it is also viable for statically typed languages without any modifications in the backend, versus the type system being only a front-end thing while the code at runtime is still dynamically typed (but statically checked, of course).
Thank you.
Typed Racket's optimizations occur during macro expansion. To see for yourself, you can change #lang typed/racket to #lang typed/racket #:no-optimize, which shows Typed Racket is in complete control of what optimizations are applied.
The optimizations consist of using type information to replace various uses of certain procedures with their unsafe equivalents. The unsafe procedures perform no runtime checks on the types of their arguments and cause undefined behavior (read: segfaults) if used incorrectly. You can find out more in the documentation section entitled Optimization in Typed Racket.
The exposure of the unsafe variants of procedures is what really makes it possible for user-defined languages to implement these optimizations. For example, if you wrote your own language with a type system that could prove vectors were never accessed with out-of-bounds indices, you could replace uses of vector-ref with unsafe-vector-ref.
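(As a rough analogue outside Racket: in Common Lisp, type declarations plus a low safety setting let a native-compiling implementation such as SBCL omit runtime type and bounds checks in much the same spirit. A minimal sketch, not Typed Racket itself:)
(defun sum-vector (v)
  ;; With SAFETY 0 the compiler is allowed to trust these declarations
  ;; and skip the corresponding runtime checks.
  (declare (type (simple-array double-float (*)) v)
           (optimize (speed 3) (safety 0)))
  (let ((total 0d0))
    (declare (type double-float total))
    (dotimes (i (length v) total)
      (incf total (aref v i)))))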
There are similar optimizations that occur at the bytecode level, but these mostly apply when the JIT can infer type information that's not visible at macro expansion time. These are not user-controlled, but you don't have to rely on them.

Would the ability to declare Lisp functions 'pure' be beneficial?

I have been reading a lot about Haskell lately, and the benefits that it derives from being a purely functional language. (I'm not interested in discussing monads for Lisp) It makes sense to me to (at least logically) isolate functions with side-effects as much as possible. I have used setf and other destructive functions plenty, and I recognize the need for them in Lisp and (most of) its derivatives.
Here we go:
Would something like (declare pure) potentially help an optimizing compiler? Or is this a moot point because it already knows?
Would the declaration help in proving a function or program, or at least a subset that was declared as pure? Or is this again something that is unnecessary because it's already obvious to the programmer and compiler and prover?
If nothing else, would it be useful to the programmer for the compiler to enforce purity for functions carrying this declaration, and would it add to the readability/maintainability of Lisp programs?
Does any of this make any sense? Or am I too tired to even think right now?
I'd appreciate any insights here. Info on compiler implementation or provability is welcome.
EDIT
To clarify, I didn't intend to restrict this question to Common Lisp. It clearly (I think) doesn't apply to certain derivative languages, but I'm also curious if some features of other Lisps may tend to support (or not) this kind of facility.
You have two answers, but neither touches on the real problem.
First, yes, it would obviously be good to know that a function is pure. There are a ton of compiler-level things that would like to know that, as well as user-level things. Given that Lisp languages are so flexible, you could twist things a bit: instead of a "pure" declaration that asks the compiler to try harder or something, you just make the declaration restrict the code in the definition. This way you can guarantee that the function is pure.
You can even do that with additional supporting facilities -- I mentioned two of them in a comment I made to johanbev's answer: add the notion of immutable bindings and immutable data structures. I know that in Common Lisp these are very problematic, especially immutable bindings (since CL loads code by "side-effecting" it into place). But such features would help simplify things, and they're not inconceivable (see for example the Racket implementation, which has immutable pairs and other immutable data structures, and has immutable bindings).
But the real question is what you can do in such restricted functions. Even a very simple-looking example is infested with issues. (I'm using Scheme-like syntax for this.)
(define-pure (foo x)
  (cons (+ x 1) (bar)))
It seems easy enough to tell that this function is indeed pure: it doesn't do anything obviously impure. Also, it seems that having define-pure restrict the body and allow only pure code would work fine in this case, and would allow this definition.
Now start with the problems:
It's calling cons, so it assumes that it is also known to be pure. In addition, as I mentioned above, it should rely on cons being what it is, so assume that the cons binding is immutable. Easy, since it's a known builtin. Do the same with bar, of course.
But cons does have a side effect (even if you're talking about Racket's immutable pairs): it allocates a new pair. This seems like a minor and ignorable point, but, for example, if you allow such things to appear in pure functions, then you won't be able to auto-memoize them. The problem is that someone might rely on every foo call returning a new pair -- one that is not-eq to any other existing pair. It seems that to make this work you need to further restrict pure functions to deal not only with immutable values, but also with values whose constructor doesn't always create a new value (e.g., it could hash-cons instead of allocate).
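To see why fresh allocation gets in the way of auto-memoization, consider a naive memoizing wrapper (written in Common Lisp for concreteness; memoize here is a hypothetical helper, not a standard function):
(defun memoize (fn)
  "Return a caching version of FN, keyed on its argument list."
  (let ((cache (make-hash-table :test #'equal)))
    (lambda (&rest args)
      (multiple-value-bind (value present-p) (gethash args cache)
        (if present-p
            value
            (setf (gethash args cache) (apply fn args)))))))
Once foo is routed through such a cache, two calls with equal arguments return the very same pair, so any caller that relied on (foo 1) and (foo 1) being distinct, not-eq objects silently changes behaviour.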
But that code also calls bar -- so now you need to make the same assumptions about bar: it must be known as a pure function, with an immutable binding. Note specifically that bar receives no arguments -- so in that case the compiler could not only require that bar is a pure function, it could also use that information and pre-compute its value. After all, a pure function with no inputs could be reduced to a plain value. (Note BTW that Haskell doesn't have zero-argument functions.)
And that brings another big issue in. What if bar is a function of one input? In that case you'd have an error, and some exception will get thrown ... and that's no longer pure. Exceptions are side-effects. You now need to know the arity of bar in addition to everything else, and you need to avoid other exceptions. Now, how about that input x -- what happens if it isn't a number? That will throw an exception too, so you need to avoid it too. This means that you now need a type system.
Change that (+ x 1) to (/ 1 x) and you can see that not only do you need a type system, you need one that is sophisticated enough to distinguish 0s.
Alternatively, you could re-think the whole thing and have new pure arithmetic operations that never throw exceptions -- but with all the other restrictions you're now quite a long way from home, with a language that is radically different.
Finally, there's one more side effect that remains a PITA: what if the definition of bar is (define-pure (bar) (bar))? It certainly is pure according to all of the above restrictions... But diverging is a form of side effect, so even this is no longer kosher. (For example, if you did make your compiler optimize nullary functions to values, then for this example the compiler itself would get stuck in an infinite loop.) (And yes, the fact that Haskell doesn't deal with that doesn't make it less of an issue.)
Given a Lisp function, knowing if it is pure or not is undecidable in general. Of course, necessary conditions and sufficient conditions can be tested at compile time. (If there are no impure operations at all, then the function must be pure; if an impure operation gets executed unconditionally, then the function must be impure; for more complicated cases, the compiler could try to prove that the function is pure or impure, but it will not succeed in all cases.)
If the user can manually annotate a function as pure, then the compiler could either (a.) try harder to prove that the function is pure, ie. spend more time before giving up, or (b.) assume that it is and add optimizations which would not be correct for impure functions (like, say, memoizing results). So, yes, annotating functions as pure could help the compiler if the annotations are assumed to be correct.
Apart from heuristics like the "trying harder" idea above, the annotation would not help to prove stuff, because it's not giving any information to the prover. (In other words, the prover could just assume that the annotation is always there before trying.) However, it could make sense to attach to pure functions a proof of their purity.
The compiler could either (a.) check at compile time whether pure functions are indeed pure, but this is undecidable in general, or (b.) add code to try to catch side effects in pure functions at runtime and report them as errors. (a.) would probably be helpful with simple heuristics (like "an impure operation gets executed unconditionally"), (b.) would be useful for debugging.
No, it seems to make sense. Hopefully this answer also does.
The usual goodies apply when we can assume purity and referential transparency. We can automatically memoize hotspots. We can automatically parallelize computation. We can do away with a lot of race conditions. We can also use structure sharing with data that we know cannot be modified; for instance, the (quasi) primitive cons does not need to copy the cons cells in the list it is consing onto. Those cells are not affected in any way by having another cons cell pointing to them. This example is kinda obvious, but compilers are often good at figuring out more complex structure sharing.
However, actually determining whether a lambda (a function) is pure or referentially transparent is very tricky in Common Lisp. Remember that a call (foo bar) starts by looking at (symbol-function 'foo). So in this case
(defun foo (bar)
  (cons 'zot bar))
foo() is pure.
The next lambda is also pure.
(defun quux ()
  (mapcar #'foo '(zong ding flop)))
However, later on we can redefine foo:
(let ((accu -1))
  (defun foo (bar)
    (incf accu)))
The next call to quux() is no longer pure! The old pure foo() has been redefined to an impure lambda. Yikes. This example is maybe somewhat contrived, but it's not that uncommon to redefine functions like this, for instance from within a let block. In that case it's not possible to know at compile time what will happen.
Common Lisp has very dynamic semantics, so actually being able to determine control flow and data flow ahead of time (for instance when compiling) is very hard, and in most useful cases entirely undecidable. This is quite typical of languages with dynamic type systems. There are a lot of common idioms in Lisp you cannot use if you must use static typing, and it is mainly these that foil any attempt to do much meaningful static analysis. We can do it for primitives like cons and friends. But for lambdas involving things other than primitives we are in much deeper water, especially in the cases where we need to look at complex interplay between functions. Remember that a lambda is only pure if all the lambdas it calls are also pure.
Off the top of my head, it might be possible, with some deep macrology, to do away with the redefinition problem. In a sense, each lambda would get an extra argument which is a monad representing the entire state of the Lisp image (we can obviously restrict ourselves to what the function will actually look at). But it's probably more useful to be able to declare purity ourselves, in the sense that we promise the compiler that this lambda is indeed pure. The consequences if it isn't are then undefined, and all sorts of mayhem could ensue...
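As a rough illustration of what such a user-declared purity facility could look like in Common Lisp (defpure and *pure-functions* are made-up names, and the check is only a shallow syntactic walk, not a proof):
(defvar *pure-functions* '(car cdr cons list if let quote + - *)
  "Operators we promise (to ourselves) are free of side effects.")

(defmacro defpure (name lambda-list &body body)
  "Like DEFUN, but warn at macro-expansion time if BODY calls anything
not registered in *PURE-FUNCTIONS*."
  (labels ((check (form)
             (when (consp form)
               (let ((op (car form)))
                 (cond ((eq op 'quote))     ; don't descend into quoted data
                       (t
                        (when (and (symbolp op)
                                   (not (member op (cons name *pure-functions*))))
                          (warn "~S calls ~S, which is not registered as pure."
                                name op))
                        (mapc #'check (cdr form))))))))
    (mapc #'check body))
  `(progn
     (eval-when (:compile-toplevel :load-toplevel :execute)
       (pushnew ',name *pure-functions*))
     (defun ,name ,lambda-list ,@body)))
With that, (defpure add1 (x) (+ x 1)) expands into an ordinary defun and registers add1, while a body containing, say, setf would trigger a warning. The consequences of lying to it are, as above, on your own head.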

Use of Clojure macros for DSLs

I am working on a Clojure project and I often find myself writing Clojure macros for DSLs. But I was watching a video about how a company uses Clojure in their real work, and the speaker said that in practice they do not use macros for their DSLs; they only use macros to add a little syntactic sugar. Does this mean I should write my DSL using standard functions and then add a few macros at the end?
Update:
After reading the many varied (and entertaining) responses to this question I have realized that the answer is not as clear cut as I first thought, for many reasons:
There are many different types of API in an application (internal, external)
There are many types of user of the API (business user who just wants to get something done fast, Clojure expert)
Are the macros there to hide boilerplate code?
I will go away and think about the question more deeply, but thanks for your answers, as they have given me lots to think about. Also, I noticed that Paul Graham takes the opposite view to the Christophe video and thinks macros should make up a large part of the codebase (25%):
http://www.paulgraham.com/avg.html
To some extent I believe this depends on the use / purpose of your DSL.
If you are writing a library-like DSL to be used in Clojure code and want it to be used in a functional way, then I would prefer functions over macros. Functions are "nice" for Clojure users because they can be composed dynamically into higher order functions etc. For example, you are writing a functional web framework like Ring.
If you are writing an imperative DSL that will be used pretty independently of other Clojure code, and you have decided that you definitely don't need higher-order functions, then the usage will be pretty similar and you can choose whichever makes most sense. For example, you might be creating some kind of business rules engine.
If you are writing a specialised DSL that needs to produce highly performant code, then you will probably want to use macros most of the time, since they will be expanded at compile time for maximum efficiency. For example, you're writing some graphics code that needs to expand to exactly the right sequence of OpenGL calls.
Yes!
Write functions whenever possible. Never write a macro when a function will do. If you write too many macros you end up with something that is much harder to extend. Macros, for example, can't be applied or passed around.
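To make the "can't be applied or passed around" point concrete (shown here in Common Lisp syntax; the same holds for Clojure macros):
(defun double-it (x) (* 2 x))
(mapcar #'double-it '(1 2 3))       ; => (2 4 6) -- a function is a first-class value

(defmacro twice (form)
  `(progn ,form ,form))
;; (mapcar #'twice '(1 2 3))        ; won't work: TWICE names a macro, not a function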
Christophe Grand: (not= DSL macros)
http://clojure.blip.tv/file/4522250/
No!
Don't be afraid of using macros extensively. Always write a macro when in doubt. Functions are inferior for implementing DSLs: they push the burden onto the runtime, whereas macros allow you to do heavyweight computation at compile time. Just think of the difference between implementing, say, an embedded Prolog as an interpreter function versus as a macro which compiles Prolog into some form of a WAM.
And do not listen to those who say that "macros can't be applied or passed around"; this argument is entirely a strawman. Those people are advocating interpreters over compilers, which is simply ridiculous.
A couple of tips on how to implement DSLs using macros:
Do it in stages. Define a long chain of languages from your DSL to the underlying Clojure. Keep each transform as simple as possible - this way you'd be able to easily maintain and debug your DSL compiler.
Prepare a toolbox of DSL components that you will reuse when implementing your DSLs. It should include target languages of different semantics (e.g., untyped eager functional, which is Clojure itself; untyped lazy functional; first-order logic; typed imperative; Hindley-Milner typed eager functional; dataflow; etc.). With macros it is trivial to combine properties of all those target semantics seamlessly.
Maintain a set of compiler-building tools. It should include parser generators (useful even if your DSLs are entirely in S-expressions), term rewriting engines, pattern matching engines, implementations for some common algorithms on graphs (e.g., graph colouring), etc.
Here's an example of a DSL in Haskell that uses functions rather than macros:
http://contracts.scheming.org/
Here is a video of Simon Peyton Jones giving a talk about this implementation:
http://ulf.wiger.net/weblog/2008/02/29/simon-peyton-jones-composing-contracts-an-adventure-in-financial-engineering/
Leverage the characteristics of Clojure and FP before going down the path of implementing your own language. I think SK-logic's tips give you a good indication of what is needed to implement a full blown language. There are times when it's worth the effort, but those are rare.

Comparing Common Lisp with Gambit w.r.t their library access and object systems

I'm pretty intrigued by Gambit Scheme, in particular by its wide range of supported platforms and its ability to put C code right in your Scheme source when needed. That said, it is a Scheme, which has fewer "batteries included" than Common Lisp. Some people like coding lots of things from scratch (a.k.a. vigorous yak-shaving), but not me!
This brings me to my two questions, geared to people who have used both Gambit and some flavor of Common Lisp:
1) Which effectively has better access to libraries? Scheme has fewer libraries than Common Lisp. However, Gambit Scheme has smoother access to C/C++ code & libraries, which far outnumber Common Lisp's libraries. In your opinion, does the smoothness of Gambit's FFI outweigh its lack of native libraries?
2) How do Scheme's object systems (e.g. TinyCLOS, Meroon) compare to Common Lisp's CLOS? If you found them lacking, what feature(s) did you miss most? Finally, how important is an object system in Lisp/Scheme in the first place? I have heard of entire lisp-based companies (e.g. ITA Software) forgoing CLOS altogether. Are objects really that optional in Lisp/Scheme? I do fear that if Gambit has no good object system, I may miss them (my programming background is purely object-oriented).
Thanks for helping an aspiring convert from C++/Python,
-- Matt
PS: Someone with more than 1500 rep, could you please create a "gambit" tag? :) Thanks Jonas!
Sure, Scheme as a whole has fewer libraries in the defined standard, but any given Scheme implementation usually builds on that standard to include more "batteries included" kinds of functionality.
Gambit, for example, uses the Snow package system which will give you access to several support libraries.
Other Schemes fare even better, having access to more (or better) support libraries. Both Racket (with PlaneT) and Chicken (with eggs) immediately come to mind.
That said, Common Lisp is also quite rich, and a large number of interesting and useful libraries are a simple asdf-install away.
As for Scheme object systems, I personally tend to favor Chicken Scheme and have taken to favoring coops. That said, there's absolutely nothing wrong with TinyCLOS. Either would serve well, and I don't really find anything to be lacking, though that last statement might have more to do with the fact that I don't tend to rely on a lot of object-oriented-isms when writing Scheme. Both systems, in my experience, tend to surface when I want to write "protocols" and then have a way of specializing on the protocol, if that makes sense.
1) I haven't used Gambit Scheme, so I cannot really tell how smooth the C/C++ integration is. But all the Common Lisps I have used have fully functional C FFIs, so the availability of C libraries is the same. It takes some work to integrate, but I assume this is the case with Gambit Scheme as well. After all, Lisp and C are different languages, right? But maybe you have a different experience; I would like to learn more in that case.
You may be interested in Quicklisp, a really good new Common Lisp project - it makes it very easy to install a lot of quality libraries.
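For instance, pulling in a library (alexandria here, purely as an example) is a one-liner once Quicklisp is installed:
(ql:quickload "alexandria")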
2) C++ and Python are designed to use OOP and classes as the typical means of encapsulating and structuring data. CLOS does not have this ambition at all. Instead, it provides generic functions that can be specialized for certain types of arguments - not necessarily classes. Essentially this enables OOP, but in Common Lisp, OOP is a convenient feature rather than something fundamental for getting things done.
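A small Common Lisp sketch of that style (the shape types and area function are made-up names for illustration):
(defgeneric area (shape)
  (:documentation "Dispatch on the class of SHAPE; there is no single 'owning' class."))

(defstruct circle radius)
(defstruct rect width height)

(defmethod area ((s circle)) (* pi (circle-radius s) (circle-radius s)))
(defmethod area ((s rect)) (* (rect-width s) (rect-height s)))

;; Methods can also specialize on built-in types, not just user classes:
(defmethod area ((s number)) s)

(area (make-circle :radius 2.0))   ; => ~12.566
(area 5)                           ; => 5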
I think CLOS is a lot more well-designed and flexible than the C++ object model - TinyCLOS should be no different in that aspect.

Language requirements for AI development [duplicate]

Possible Duplicate:
Why is Lisp used for AI?
What makes a language suitable for Artificial Intelligence development?
I've heard that LISP and Prolog are widely used in this field. What features make them suitable for AI?
Overall, I would say the main thing I see about languages "preferred" for AI is that they have higher-order programming along with many tools for abstraction.
It is higher-order programming (a.k.a. functions as first-class objects) that tends to be a defining characteristic of most AI languages, as far as I can see: http://en.wikipedia.org/wiki/Higher-order_programming. That article is a stub, and it leaves out Prolog (http://en.wikipedia.org/wiki/Prolog), which allows higher-order "predicates".
But basically higher-order programming is the idea that you can pass a function around like a variable. Surprisingly, a lot of the scripting languages have functions as first-class objects as well. LISP/Prolog are a given as AI languages, but some of the others might be surprising. I have seen several AI books for Python, one of them being http://www.nltk.org/book. I have also seen some for Ruby and Perl. If you study LISP more you will recognize that a lot of its features are similar to those of modern scripting languages. However, LISP came out in 1958... so it really was ahead of its time.
There are AI libraries for Java. And in Java you can sort of hack functions as first-class objects using methods on classes; it is harder/less convenient than LISP, but possible. In C and C++ you have function pointers, although again they are much more of a bother than in LISP.
Once you have functions as first-class objects, you can program much more generically than is otherwise possible. Without functions as first-class objects, you might have to construct sum(array) and product(array) to perform the different operations. But with functions as first-class objects you could compute accumulate(array, +) and accumulate(array, *). You could even do accumulate(array, getDataElement, operation). Since AI is so ill-defined, that type of flexibility is a great help. Now you can build much more generic code that is much easier to extend in ways that were not even originally conceived.
And lambda (now finding its way all over the place) becomes a way to save typing, so that you don't have to define every function. In the previous example, instead of having to define getDataElement(arrayelement) { return arrayelement.GPA } somewhere, you can just say accumulate(array, lambda element: return element.GPA, +). So you don't have to pollute your namespace with tons of functions that are only called once or twice.
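In Common Lisp terms, the accumulate idea above is basically REDUCE plus an inline LAMBDA (a small self-contained sketch; the student structure is made up for illustration):
(defstruct student name gpa)

(defvar *students*
  (list (make-student :name "Ada" :gpa 3.9)
        (make-student :name "Bob" :gpa 3.1)))

(reduce #'+ '(1 2 3 4))   ; => 10 -- the "sum" operation is passed in as an argument
(reduce #'* '(1 2 3 4))   ; => 24 -- same code, different operation

;; An inline lambda avoids defining a throwaway named getDataElement:
(reduce #'+ (mapcar (lambda (s) (student-gpa s)) *students*))   ; => 7.0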
If you go back in time to 1958, basically your choices were LISP, Fortran, or assembly. Compared to Fortran, LISP was much more flexible (unfortunately also less efficient) and offered much better means of abstraction. In addition to functions as first-class objects, it also had dynamic typing, garbage collection, etc. (stuff any scripting language has today). Now there are more languages to choose from, although LISP benefited from being first and from becoming the language that everyone happened to use for AI. Look at Ruby, Python, Perl, JavaScript, Java, C#, and even the latest proposed standard for C, and you start to see features from LISP sneaking in (map/reduce, lambdas, garbage collection, etc.). LISP was way ahead of its time in the 1950s.
Even now LISP still holds a few aces over most of the competition. The macro systems in LISP are really advanced. In C you can extend the language with library calls or simple macros (basically text substitution). In LISP you can define new language elements (think your own if statement; now think your own custom language for defining GUIs). Overall, LISP languages still offer ways of abstraction that the mainstream languages haven't caught up with. Sure, you can define your own custom compiler for C and add all the language constructs you want, but hardly anyone actually does that. In LISP the programmer can do it easily via macros. Also, LISP is compiled and, per the programming language shootout, it is generally more efficient than Perl, Python, and Ruby.
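A tiny Common Lisp example of "defining your own if statement" with a macro (my-unless is a made-up name; the standard library already provides UNLESS):
(defmacro my-unless (test &body body)
  "Run BODY only when TEST is false -- a new control construct built on IF."
  `(if ,test
       nil
       (progn ,@body)))

(my-unless (> 1 2)
  (print "runs, because 1 is not greater than 2"))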
Prolog basically is a logic language made for representing facts and rules. What are expert systems but collections of rules and facts? Since it is very convenient to represent a bunch of rules in Prolog, there is an obvious synergy there with expert systems.
Now, I don't think using LISP/Prolog for every AI problem is a given. In fact, just look at the multitude of machine learning/data mining libraries available for Java. However, when you are prototyping a new system or are experimenting because you don't know what you are doing, it is way easier to do it with a scripting language than a statically typed one. LISP was one of the earliest languages to have all these features we take for granted; basically there was no competition at all at first.
Also in general academia seems to like functional languages a lot. So it doesn't hurt that LISP is functional. Although now you have ML, Haskell, OCaml, etc. on that front as well (some of these languages support multiple paradigms...).
The main calling card of both Lisp and Prolog in this particular field is that they support metaprogramming concepts like lambdas. The reason that is important is that it helps when you want to roll your own programming language within a programming language, like you will commonly want to do for writing expert system rules.
To do this well in a lower-level imperative language like C, it is generally best to just create a separate compiler or language library for your new (expert system rule) language, so you can write your rules in the new language and your actions in C. This is the principle behind things like CLIPS.
The two main things you want are the ability to do experimental programming and the ability to do unconventional programming.
When you're doing AI, you by definition don't really know what you're doing. (If you did, it wouldn't be AI, would it?) This means you want a language where you can quickly try things and change them. I haven't found any language I like better than Common Lisp for that, personally.
Similarly, you're doing something not quite conventional. Prolog is already an unconventional language, and Lisp has macros that can transform the language tremendously.
What do you mean by "AI"? The field is so broad as to make this question unanswerable. What applications are you looking at?
LISP was used because it was better than FORTRAN. Prolog was used, too, but no one remembers that. This was when people believed that symbol-based approaches were the way to go, before it was understood how hard the sensing and expression layers are.
But modern "AI" (machine vision, planners, hell, Google's uncanny ability to know what you 'meant') is done in more efficient programming languages that are more sustainable for a large team to develop in. This usually means C++ these days--but it's not like anyone thinks of C++ as a good language for AI.
Hell, you can do a lot of what was called "AI" in the 70s in MATLAB. No one's ever called MATLAB "a good language for AI" before, have they?
Functional programming languages are easier to parallelise due to their stateless nature. There seems to already be a subject about it with some good answers here: Advantages of stateless programming?
As said, it's also generally simpler to build programs that generate programs in LISP due to the simplicity of the language, but this is only relevant to certain areas of AI such as evolutionary computation.
Edit:
OK, I'll try to explain a bit about why parallelism is important to AI, using symbolic AI as an example, as it's probably the area of AI that I understand best. Basically it's what everyone was using back in the day when LISP was invented, and the Physical Symbol System Hypothesis on which it is based is more or less the same way you would go about calculating and modelling stuff in LISP code. This link explains a bit about it:
http://www.cs.st-andrews.ac.uk/~mkw/IC_Group/What_is_Symbolic_AI_full.html
So basically the idea is that you create a model of your environment and then search through it to find a solution. One of the simplest algorithms to implement is breadth-first search, which is an exhaustive search of all possible states. While it produces an optimal result, it is usually prohibitively time-consuming. One way to optimise this is by using a heuristic (A* being an example); another is to divide the work between CPUs.
Due to statelessness, in theory any node you expand in your search could be expanded in a separate thread without the complexity or overhead involved in locking shared data. In general, assuming the hardware can support it, the more highly you can parallelise a task, the faster you will get your result. An example of this is the Folding@home project, which distributes work over many GPUs to find optimal protein-folding configurations (that may not have anything to do with LISP, but it is relevant to parallelism).
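For reference, a sequential breadth-first search looks roughly like this (a minimal Common Lisp sketch with made-up parameter names; parallelising it would amount to expanding several frontier nodes at once):
(defun breadth-first-search (start goal-p neighbours)
  "Return the first state satisfying GOAL-P reachable from START, or NIL.
NEIGHBOURS is a function from a state to a list of successor states."
  (let ((frontier (list start))
        (seen (make-hash-table :test #'equal)))
    (setf (gethash start seen) t)
    (loop
      (when (null frontier)
        (return nil))
      (let ((state (pop frontier)))
        (when (funcall goal-p state)
          (return state))
        (dolist (next (funcall neighbours state))
          (unless (gethash next seen)
            (setf (gethash next seen) t)
            (setf frontier (nconc frontier (list next)))))))))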
As far as I know, LISP is a functional programming language, and with it you are able to write "programs that write programs". I don't know if my answer suits your needs; see the above links for more information.
Pattern matching constructs with instantiation (or the ability to easily construct pattern matching code) are a big plus. Pattern matching is not totally necessary to do A.I., but it can sure simplify the code for many A.I. tasks. I'm finding this also makes F# a convenient language for A.I.
Languages per se (without libraries) are suitable/comfortable for specific areas of research/investigation and/or learning/studying ("how to do the simplest things in the hardest way").
Suitability for commercial development is determined by the availability of frameworks, libraries, development tools, communities of developers, and adoption by companies. For example, on the internet you will find support for almost any issue or area, even the most exotic (including, of course, AI areas), in C#, because it is mainstream.
BTW, what specifically is the context of the question? AI is such a broad term.
Update:
Oooops, I really did not expect to draw attention and discussion to my answer.
Under ("how to do the simplest things in the hardest way"), I mean that studying and learning, as well as academic R&D objectives/techniques/approaches/methodology do not coincide with objectives of (commercial) development.
In student (or even academic) projects one can write tons of code which would probably require one line of code in commercial RAD (using a component/service/feature of a framework or library).
Because..! oooh!
Because there is no sense in entangling/developing any discussion without first agreeing on common definitions of terms... which are subjective and depend on context... and are not so easy to formulate in a general/abstract context.
And this is an interdisciplinary matter spanning whole areas of different sciences.
The question is broad (philosophical) and evasively formulated... without beginning or end... having no definitive answers without context and definitions...
Are we going to develop here some spec proposal?