The purpose of Lisp syntax: to model the AST

As far as I know, Lisp syntax represents the AST directly, but in a high-level format that lets a human easily read and modify it, while at the same time making it easy for the machine to process the source code.
For this reason it is said that, in Lisp, code is data and data is code, since code (an s-expression) is essentially just an AST. We can plug more ASTs (which are our data, which is just Lisp code) into other ASTs (Lisp code), or run them independently, to extend functionality and manipulate the system on the fly (at runtime) without having to recompile the whole OS to integrate new code. In other languages, we have to recompile to turn the human-readable source code into a valid AST before it is compiled into machine code.
Is this the reason Lisp syntax was designed the way it is in the first place (to represent an AST while staying human-readable, satisfying both the human and the machine)? To enable stronger (on-the-fly, at runtime) as well as simpler (no recompile, so faster) communication between man and machine?
I have heard that the Lisp Machine holds all data in a single address space. In an operating system like Linux, programmers get only a virtual address space, can pretend it is the real physical address space, and can do whatever they want within it. Code and data in Linux live in separate regions, because effectively code is code and data is data. In a normal OS written in C (or a C-like language), operating a single address space for the whole system and mixing data with code would be very messy.
In a Lisp Machine, since code is data and data is code, is that the reason it has only a single address space (without the virtual layer)? Since we have GC and no raw pointers, should it be safe to operate on physical memory without breaking it (a single space being a lot less complicated)?
EDIT: I ask this because it is said that one of the advantages of Lisp is the single address space:
A safe language means a reliable environment without the need to separate tasks out into their own separate memory spaces.
The "clearly separated process" model characteristic of Unix has potent merits when dealing with software that might be unreliable to the point of being unsafe, as is the case with code written in C or C++, where an invalid pointer access can "take down the system." MS-DOS and its heirs are very unreliable in that sense, where just about any program bug can take the whole system down; "Blue Screen of Death" and the likes.
If the whole system is constructed and coded in Lisp, the system is as reliable as the Lisp environment. Typically this is quite safe, as once you get to the standards-compliant layers, they are quite reliable, and don't offer direct pointer access that would allow the system to self-destruct.
Third Law of Sane Personal Computing
Volatile storage devices (i.e. RAM) shall serve exclusively as read/write cache for non-volatile storage devices. From the perspective of all software except for the operating system, the machine must present a single address space which can be considered non-volatile. No computer system obeys this law which takes longer to fully recover its state from a disruption of its power source than an electric lamp would.
A single address space, as stated above, holds all the running processes in the same memory space. I am just curious why people insist that a single address space is better. I relate it to the AST-like syntax of Lisp to try to explain how it fits the single-space model.

Your question doesn't reflect reality very accurately, especially the part about code/data separation in Linux and other OSes. That separation is actually enforced not at the OS level, but by the compiler/program loader. At the OS level there are just memory pages that can have different protection bits set (executable, read-only, etc.), and above that level different executable formats exist (like ELF on Linux) that specify restrictions on different parts of program memory.
Returning to Lisp: as far as I know, the S-expression format was historically used by Lisp's creators because they wanted to concentrate on the semantics of the language, putting syntax aside for a while. There was a plan to eventually create a proper syntax for Lisp (see M-expressions), and there were some Lisp-based languages with more conventional syntax, like Dylan. But overall, the Lisp community came to the consensus that the benefits of S-expressions outweigh their drawbacks, so they stuck.
Regarding code as data: this is not strictly bound to S-expressions, as other code can be treated as data too. This whole approach is called meta-programming, and it is supported at different levels and with different mechanisms by many languages. Every language that supports eval (Perl, JavaScript, Python) allows code to be treated as data; it's just that the representation is almost always a string, while in Lisp it is a tree, which is much more convenient and facilitates advanced features like macros.
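For instance, here is a minimal Common Lisp sketch (the SWAP! macro is an illustrative name, not a standard operator) of code being built and rewritten as ordinary list data:

    ;; Build code as an ordinary list, then evaluate it:
    (defvar *expr* (list '+ 1 2 3))   ; *expr* holds the tree (+ 1 2 3)
    (eval *expr*)                     ; => 6

    ;; A macro receives its arguments as unevaluated trees and may
    ;; rewrite them freely before they are compiled:
    (defmacro swap! (a b)
      (let ((tmp (gensym)))           ; fresh symbol to avoid variable capture
        `(let ((,tmp ,a))
           (setf ,a ,b)
           (setf ,b ,tmp))))

    ;; (let ((x 1) (y 2)) (swap! x y) (list x y)) => (2 1)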

Related

LISP 1.5: How is Lisp like a machine language?

I wish that John McCarthy was still alive, but...
From the LISP 1.5 Programmer's Manual:
LISP can interpret and execute programs written in the form of S-expressions. Thus, like machine language, and unlike most other higher level languages, it can be used to generate programs for further execution.
I need more clarification on how machine language can be used to generate programs, and how Lisp can do the same.
All that is saying is that machine code can directly write machine instructions to memory and jump to those instructions to execute them; this is the basis of many attack vectors to break into software, in fact.
The point is, when you're writing machine code, it's easy to generate machine code. But when you're writing in a compiled language like C, you can't just generate C code at run time and then execute it - unless your program includes a C compiler.
Lisp - and, these days, many other languages, especially "scripting languages" like Perl, Python, Ruby, Tcl, Javascript, and command shells - have the ability to execute code that is generated at runtime. In Lisp, since code and data have the same structure, this is usually less work than it is in the other languages, where the code to be evaluated is generally a string that has to be parsed. (Though Perl has the ability to eval a block instead of a string, which lets the compiler do the parsing ahead of time for literal code.)
A machine language program can alter itself while running. The last assembly programming I did was for MS-DOS: a resident program that I would load before testing other programs. When my program misbehaved, a keystroke switched to the resident program, which could peek into the running program and alter it directly before resuming. It was quite handy, since I didn't have a debugger.
LISP had this from the very beginning, since it was originally interpreted. You could change the definition of a function while your program was running, and the whole language was always available at runtime, even eval and define. When LISP started getting compiled, it wasn't compiled all at once like Algol, but partially, allowing interpreted and compiled code to intermix at the same time. The fact that its code structure is list structure, and that symbols are a data type, contributed to this.
In the last interview I saw with McCarthy, he was asked what he thought of modern programming languages (not the LISP family, but Ruby, an Algol-family language said to be influenced by LISP), and before answering he asked whether it could represent code as data (like list structure). Since it can't, Ruby is, in his opinion, still behind what LISP was in the 60s.
Many new programming languages are emerging in the Algol family, and some of the most promising ones, like Perl 6 and Nemerle, are getting closer to the features LISP had in the 60s.
Machine language programs can fill memory regions with arbitrary bytes and then jump to the start of such a region, which will thus get executed right away.
Lisp programs can easily create arbitrary S-expressions in memory, using cons, and then simply call eval on those S-expressions to evaluate (interpret) them.
Programs in other high-level languages can just as easily fill memory regions with characters representing new code in the language's syntax. But they cannot run such code.
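As a concrete sketch of the Lisp case (plain Common Lisp, nothing implementation-specific):

    ;; Assemble a "program" at run time out of conses, then execute it:
    (let ((program (cons '* (list 6 7))))  ; builds the tree (* 6 7)
      (eval program))                      ; => 42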

Lisp as a meta environment

I'm working towards my Ph.D. on better software reuse through integrating different types of computer languages. Due to performance and safety concerns, I am not considering integrating them via foreign function calls and/or web services.
Lisp is my favorite vehicle, because of interactive development, macros, doing modifications at runtime, code as data (the usual things one would imagine hearing the word Lisp), and others.
There are some approaches that port different kinds of Lisp to virtual machines like the JVM (Clojure, Kawa, SISC, ABCL, etc.) or .NET (Clojure .NET, DotLisp, IronLisp). This is quite interesting, but one is restricted to the "universe" of the respective virtual machine.
Does anybody know of approaches in the other direction, i.e. running Java or C# on a Lisp system? I have found the remains of cloak, which seems to be more or less a dead project. To me it would be much more sensible to have Lisp as the common abstraction, hosting other languages like Java and C#.
What obstacles do you see to overcoming this lack of a generic and extendable "language environment" integrating languages like Java or C# (without foreign function calls or (web) services)? Is it due to the fact that no Lisp system runs on a virtual machine of some kind, like LLVM for instance, or something else?
Best regards, Ingmar
Lisp is a good platform for this kind of language hosting because of its macro capabilities. However, you want many more language features to do it well: modules, reader macros, high-level macro specification, and so on. Racket is one Lisp variant that's going forward in this direction. You can already use Algol 60, a variant of Prolog, a typed sister language, and so on. Guile is also moving in this direction with an ECMAScript implementation.
As far as implementing Java or C# on Lisp, it is possible in theory but it would require a massive amount of work to support these languages at a practical level (Racket used to implement a small portion of Java). It's also not clear that you would really gain anything considering that the CLR and JVM are both multi-language platforms now. What is more interesting is harnessing Lisp macros to define better custom languages (DSLs), defining useful dialects of your Lisp, or implementing another language specifically to bootstrap a useful tool (e.g., Guile implementing Emacs Lisp).
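As a (deliberately tiny, purely illustrative) taste of that last point, a macro can graft new surface syntax onto Lisp in a couple of lines; a real DSL would of course go much further:

    ;; A toy infix "DSL". INFIX is a hypothetical macro, not part of
    ;; any of the systems mentioned above:
    (defmacro infix (a op b)
      `(,op ,a ,b))

    ;; (infix 2 + 3)      => 5
    ;; (infix 10 expt 2)  => 100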
well "it depends", as always, right?
How much of Lisp do you want expose to Java, if any? For example, if you port the JVM to Lisp, do you somehow mate the JVMs need for a garbage collector to the actual underlying GC of the Lisp implementation, or do you simply write your own that GCs the JVM objects within the JVM heap.
It may very well be impractical to mate the two, for several reasons. The Lisp GC is pretty much hidden, much like Javas GC, from the actual implementation. That may be too hidden to work with a JVM implementation.
There's no reason you can't build a JVM in Lisp, it's just a bunch of byte codes. Lisp handles bytes just fine.
There have been implementations of the JVM in JavaScript, which is not much different from a Lisp at its core.
But beyond having a lispy command line to interact with the JVM, the JVM itself wouldn't be very "lispy". How could it be? It's a JAVA VM. The IMPLEMENTATION can be "lispy", but, ideally, none of that lisp-ness would bubble out to the JVM itself.
Beyond any advantages Lisp has in developing ANY program, I don't think Lisp lends itself specifically to being "better" at developing a virtual machine.
Lisp is great for developing other languages, particularly other S-expression-based languages. But a VM is a VM: a monster case statement, or some other mechanism that dispatches on numeric values.
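To make the "monster case statement" concrete, here is a minimal, hypothetical stack-machine loop in Common Lisp (opcodes and layout invented purely for illustration):

    (defun run-vm (code)
      "Execute a vector of opcodes and operands on a small stack machine."
      (let ((stack '())
            (pc 0))
        (loop while (< pc (length code))
              do (let ((op (aref code pc)))
                   (incf pc)
                   (case op
                     (:push (push (aref code pc) stack) (incf pc))
                     (:add  (push (+ (pop stack) (pop stack)) stack))
                     (:mul  (push (* (pop stack) (pop stack)) stack))
                     (t     (error "Unknown opcode: ~S" op)))))
        (pop stack)))

    ;; (run-vm #(:push 6 :push 7 :mul)) => 42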
Lisp is a perfect host language for such a meta-platform, but it is not necessarily an ideal target language for compiling something low-level and imperative. Fortunately, nothing stops you from generating, say, assembly code from within your Lisp environment.

What are the actual differences between Scheme and Common Lisp? (Or any other two dialects of Lisp)

Note: I am not asking which to learn, which is better, or anything like that.
I picked up the free version of SICP because I felt it would be nice to read (I've heard good stuff about it, and I'm interested in that sort of side of programming).
I know Scheme is a dialect of Lisp and I wondered: what is the actual difference between Scheme and, say, Common Lisp?
There seems to be a lot of 'CL has a larger stdlib... Scheme is not good for real-world programming...', but nothing actually saying 'this is because CL is this/has this'.
This is a bit of a tricky question, since the differences are both technical and (more importantly, in my opinion) cultural. An answer can only ever provide an imprecise, subjective view. This is what I'm going to provide here. For some raw technical details, see the Scheme Wiki.
Scheme is a language built on the principle of providing an elegant, consistent, well thought-through base language substrate which both practical and academic application languages can be built upon.
Rarely will you find someone writing an application in pure R5RS (or R6RS) Scheme, and because of the minimalistic standard, most code is not portable across Scheme implementations. This means that you will have to choose your Scheme implementation carefully, should you want to write some kind of end-user application, because the choice will largely determine what libraries are available to you. On the other hand, the relative freedom in designing the actual application language means that Scheme implementations often provide features unheard of elsewhere; PLT Racket, for example, enables you to make use of static typing and provides a very language-aware IDE.
Interoperability beyond the base language is provided through the community-driven SRFI process, but availability of any given SRFI varies by implementation.
Most Scheme dialects and libraries focus on functional programming idioms like recursion instead of iteration. There are various object systems you can load as libraries when you want to do OOP, but integration with existing code heavily depends on the Scheme dialect and its surrounding culture (Chicken Scheme seems to be more object-oriented than Racket, for instance).
Interactive programming is another point on which Scheme subcommunities differ. MIT Scheme is known for strong interactivity support, while PLT Racket feels much more static. In any case, interactive programming does not seem to be a central concern to most Scheme subcommunities, and I have yet to see a programming environment as interactive as those of most Common Lisps.
Common Lisp is a battle-worn language designed for practical programming. It is full of ugly warts and compatibility hacks -- quite the opposite of Scheme's elegant minimalism. But it is also much more featureful when taken for itself.
Common Lisp has bred a relatively large ecosystem of portable libraries. You can usually switch implementations at any time, even after application deployment, without too much trouble. Overall, Common Lisp is much more uniform than Scheme, and more radical language experiments, if done at all, are usually embedded as a portable library rather than defining a whole new language dialect. Because of this, language extensions tend to be more conservative, but also more combinable (and often optional).
Universally useful language extensions like foreign-function interfaces are not developed through formal means but rely on quasi-standard libraries available on all major Common Lisp implementations.
The language idioms are a wild mixture of functional, imperative, and object-oriented approaches, and in general, Common Lisp feels more like an imperative language than a functional one. It is also extremely dynamic, arguably more so than any of the popular dynamic scripting languages (class redefinition applies to existing instances, for example, and the condition handling system has interactivity built right in), and interactive, exploratory programming is an important part of "the Common Lisp way." This is also reflected in the programming environments available for Common Lisp, practically all of which offer some sort of direct interaction with the running Lisp compiler.
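A small example of that dynamism (standard CLOS behavior in any conforming Common Lisp): redefining a class updates instances that already exist:

    (defclass point ()
      ((x :initarg :x :accessor x)))

    (defvar *p* (make-instance 'point :x 1))

    ;; Redefine the class with an extra slot; the existing instance
    ;; *p* is (lazily) updated to conform to the new definition:
    (defclass point ()
      ((x :initarg :x :accessor x)
       (y :initform 0 :accessor y)))

    ;; (y *p*) => 0 -- the old instance acquired the new slot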
Common Lisp features a built-in object system (CLOS), a condition handling system significantly more powerful than mere exception handling, run-time patchability, and various kinds of built-in data structures and utilities (including the notorious LOOP macro, an iteration sublanguage much too ugly for Scheme but much too useful not to mention, as well as a printf-like formatting mechanism with GOTO support in format strings).
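Two tiny tastes of the features just mentioned (standard Common Lisp):

    ;; LOOP: an iteration sublanguage in its own right:
    (loop for i from 1 to 10
          when (evenp i)
            collect (* i i))   ; => (4 16 36 64 100)

    ;; FORMAT: printf on steroids. ~{...~} iterates over a list,
    ;; and ~^ escapes before printing a trailing separator:
    (format nil "~{~A~^, ~}" '(1 2 3))  ; => "1, 2, 3"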
Both because of the image-based, interactive development, and because of the larger language, Lisp implementations are usually less portable across operating systems than Scheme implementations are. Getting a Common Lisp to run on an embedded device is not for the faint of heart, for example. Similarly to the Java Virtual Machine, you also tend to encounter problems on machines where virtual memory is restricted (like OpenVZ-based virtual servers). Scheme implementations, on the other hand, tend to be more compact and portable. The increasing quality of the ECL implementation has mitigated this point somewhat, though its essence is still true.
If you care for commercial support, there are a couple of companies that provide their own Common Lisp implementations including graphical GUI builders, specialized database systems, et cetera.
Summing up, Scheme is a more elegantly designed language. It is primarily a functional language with some dynamic features. Its implementations represent various incompatible dialects with distinctive features. Common Lisp is a fully-fledged, highly dynamic, multi-paradigm language with various ugly but pragmatic features, whose implementations are largely compatible with one another. Scheme dialects tend to be more static and less interactive than Common Lisp; Common Lisp implementations tend to be heavier and trickier to install.
Whichever language you choose, I wish you a lot of fun! :)
Some basic practical differences:
Common Lisp has separate namespaces for variables and functions, whereas in Scheme there is just one: functions are values, and defining a function with a certain name is just defining a variable bound to a lambda. As a result, in Scheme you can use a function name as a variable, store it or pass it to other functions, and then call through that variable as if it were a function. In Common Lisp, by contrast, you need to explicitly convert a function into a value using (function ...), and explicitly call a function stored in a value using (funcall ...).
In Common Lisp, nil (the empty list) is considered false (e.g. in if) and is the only false value. In Scheme, the empty list is considered true, and the (distinct) #f is the only false value.
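A short Common Lisp illustration of the namespace point, with the Scheme equivalents in comments:

    (defun double (x) (* 2 x))

    ;; Passing the function requires #' (shorthand for FUNCTION),
    ;; because the variable DOUBLE and the function DOUBLE are distinct:
    (mapcar #'double '(1 2 3))   ; => (2 4 6)

    ;; Calling a function held in a variable requires FUNCALL:
    (let ((f #'double))
      (funcall f 21))            ; => 42

    ;; In Scheme (one namespace) the same thing is simply:
    ;;   (define (double x) (* 2 x))
    ;;   (map double '(1 2 3))
    ;;   (let ((f double)) (f 21))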
That's a hard question to answer impartially, especially because many of the LISP folks would classify Scheme as a LISP.
Josh Bloch (and this analogy may not be his invention) describes choosing a language as being akin to choosing a local pub. In that light, then:
The "Scheme" pub has a lot of programming-languages researchers in it. These people spend a lot of attention on the meaning of the language, on keeping it well-defined and simple, and on discussing innovative new features. Everyone's got their own version of the language, designed to allow them to explore their own particular corner of programming languages. The Scheme people really like the parenthesized syntax that they took from LISP; it's flexible and lightweight and uniform and removes many barriers to language extension.
The "LISP" pub? Well... I shouldn't comment; I haven't spent enough time there :).
Scheme:
originally a very small specification (the new R7RS seems to be heavier)
due to the simple syntax, Scheme can be learned quickly
implementations provide additional functions, but names can differ between implementations
Common Lisp:
many functions are defined by the much bigger specification
different namespaces for functions and variables (a Lisp-2)
Those are some of the points; surely there are many more, which I don't remember right now.

What are the uses of Perl's C translation backend?

Other than the purely obvious: "It translates Perl to C."; are there any real world uses (a.k.a. hacks) for the Perl compiler's optimized C translation backend, B::CC?
Not really. It means you can convert a (small) Perl script into a (big) C program, which will be much harder for the recipient to reverse engineer. In some paranoid circles, this might be accounted an advantage (for example, if your Perl code is embarrassingly bad and you'd rather conceal that fact from your paying customers). But mostly it is of limited to negative value.
Compiling a Perl program to an optree, which can then be executed, can sometimes take a while. You can save some of that time by using perlcc with any of its backends. It will, in one way or another, serialise the compiled optree and make loading it later, when executing your compiled binary, somewhat faster. I can see that being useful in, for example, CGI environments - for which, however, much better alternatives for avoiding startup costs are available.
Contrary to popular belief, perlcc doesn't make it very hard to reverse-engineer the resulting binary, as discussed in "How can I reverse-engineer a Perl program that has been compiled with perlcc?"

What is a Lisp image?

Essentially, I would like to know what a Lisp image is. Is it a slice of memory containing the Lisp interpreter and one or more programs, or what?
The Lisp image as dumped memory
The image is usually a file. It is a dump of the memory of a Lisp system. It contains all functions (often compiled to machine code), variable values, symbols, etc. of the Lisp system. It is a snapshot of a running Lisp.
To create an image, one starts the Lisp, uses it for a while and then one dumps an image (the name of the function to do that depends on the implementation).
Using a Lisp image
The next time one restarts Lisp, one can use the dumped image and get back roughly the state one was in before. When dumping an image one can also tell the Lisp what it should do when the dumped image is started. That way one can reconnect to servers, open files again, etc.
To start such a Lisp system, one needs a kernel and an image. Sometimes the Lisp can put both into a single file, so that an executable file contains both the kernel (with some runtime functionality) and the image data.
On a Lisp Machine (a computer, running a Lisp operating system) a kind of boot loader (the FEP, Front End Processor) may load the image (called 'world') into memory and then start this image. In this case there is no kernel and all that is running on the computer is this Lisp image, which contains all functionality (interpreter, compiler, memory management, GC, network stack, drivers, ...). Basically it is an OS in a single file.
Some Lisp systems will optimize the memory before dumping an image. They may do a garbage collection, order the objects in memory, etc.
Why use images?
Why would one use images? Because they save the time needed to load everything, and because one can ship preconfigured Lisp systems with application code and data to users. Starting a Common Lisp implementation with a saved image is usually fast - a few milliseconds on a current computer.
Since the Lisp image may contain a lot of functionality (a compiler, even a development environment, lots of debugging information, ...) it is typically several megabytes in size.
Using images in Lisp is very similar to what Smalltalk systems do. Squeak for example also uses an image of Smalltalk code and data and a runtime executable. There is a practical difference: most current Lisp systems use compiled machine code. Thus the image is not portable between different processor architectures (x86, x86-64, SPARC, POWER, ARM, ...) or even operating systems.
History
Such Lisp images have been in use for a long time. For example, the function SYSOUT in BBN Lisp from 1967 created such an image, and SYSIN would read it back in at startup.
Examples for functions saving images
For an example see the function save-image of LispWorks or read the SBCL manual on saving core images.
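For a concrete sketch, dumping an executable image in SBCL looks roughly like this (MAIN is a hypothetical entry point):

    ;; Define what the saved image should do when it starts:
    (defun main ()
      (format t "Hello from a saved image!~%"))

    ;; Dump the whole running Lisp, including MAIN, into a standalone
    ;; executable named "hello"; this call does not return:
    (sb-ext:save-lisp-and-die "hello"
                              :executable t
                              :toplevel #'main)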
Several language implementations use an 'image' to store everything that exists in the current context. This image may contain several abstraction layers of compilation (i.e. an intermediate parse level, an intermediate byte code, native op codes). Loading this image will be much faster than compiling all the source files. It's very much like the hibernate feature, at a program level.
For example, if your Lisp implementation uses an image, then first you (or the compiler vendor) bootstrap an image and save it.
Then you have two options: (1) load your Lisp files every time you invoke Lisp, or (2) load all your Lisp code once, save the image, and use that image from then on.
Hope that helps
In general, it's the storage part of the Lisp process (that is, all "Lisp" functions and data), but no parts of the underlying Lisp binary. On the plus side, this gives a fast start-up, since there's (essentially) no book-keeping to be done when loading the image; everything is just there. On the other hand, it means any open files, sockets, and what-have-you are missing, so using saved images as a kind of check-pointing requires a bit of implementation work to get right.