How does emacs interpreter evaluate an expression? - emacs

Is there an elisp interpreter and a byte code interpreter in emacs VM, or jus one interpreter?
If there is only one interpreter, what code format can it evaluate?
Only s-expression(so byte code is only a kind of s-expression?).
Only byte code(so s-expression will be compiled before evaluate?).
Both of them(then why not only byte code?).
I think emacs has only one interpreter, it can only evaluate s-expression, byte code is a kind of s-expression. Then why the interpreter doesn't compile the s-expression into byte-code before evaluating, why we need the byte-compile function? Cause of the macros?

GNU emacs has both, an evaluator, which works on S-expressions, and a byte-code compiler + interpreter.
Having the evaluator is nice for simple commands and functions, since it avoids the overhead of compilation. This is convenient for interactively defined helpers, say, some quickly hacked together functions to modify a buffer's content.
Having the byte-code interpreter is useful, since it increases the execution speed and lowers the memory overhead due to the more dense representation of the code. This is a concern, since a lot (if not most) of emacs is implemented in lisp.

Related

What is the difference and the relation between the Lisp interpreter and the Lisp image? Can they be used as synonoms?

I noticed some people using the terms as if they were synonoms.
For instance, in the same scenario, I heard "add this function to the lisp image evaluating it" and "eval this function into the Lisp interpreter to use it later".
However, I am not sure the use is technically precise. Thus, the question.
These are two orthogonal concepts. Let’s start from the usually comprehensive Common Lisp Glossary:
Lisp image n. a running instantiation of a Common Lisp implementation. A Lisp image is characterized by a single address space in which any object can directly refer to any another in conformance with this specification, and by a single, common, global environment.
So the key idea is that an image is a set of mutually referring Lisp objects (functions and data) that can be “called” or “accessed” during the execution of a program.
The way in which a Common Lisp program is executed depends instead from the way in which a system is implemented. It could be executed by compilation in machine language, for instance, or through some form of interpretation (or even a mix of the two). So a Lisp interpreter is just a particular way in which an implementation is done (and in the current Common Lisp systems there are many different ways to implement the language).
Image
"Image" is a file on disk.
"Add a function to the image"
means evaluate the function and save the image, so the function if immediately available on the next invocation.
REPL
"Interpreter" is (usually) a wrong level of abstraction; one should use
"REPL" instead. E.g., SBCL does not have an interpreter at all (everything is always compiled) but this is not a detail that is relevant to this topic.
"eval this function into the Lisp interpreter to use it later"
means evaluate the function in the current REPL and use it in the same process (i.e., it is available until Lisp is restarted).
An image is a copy of a Lisp heap written to disk (or another secondary storage). The Lisp heap is the memory for data storage in RAM of a computer. To write a Lisp heap to an image, the running Lisp is stopped and the memory is dumped to disk. Then the Lisp is either resumed or quit.
The image can be used to restore the heap upon starting a new Lisp. That's usually faster than starting a fresh Lisp and then loading the corresponding software.
A Lisp interpreter is a program which executes Lisp programs from source. Many Lisp implementations don't use an interpreter, but they execute compiled Lisp code, typically Lisp code compiled to native machine code.

"illegal terminating character after a colon: #\" in portacle, though no colons in code

I've recently set up Portacle 1.3 for learning common lisp on Win 7. However, whenever I run my code I get the error, even if there is no code.
Running individual lines works fine, however. The error only shows when I run the whole file.
I tried putting some code in an EVAL function, but I believe that only accepts one argument at a time, so I couldn't run a whole program in it.
I've found a similar error in this stackoverflow page, but their code contains colons and that's where their error lies.
I think it might be an error in the code that runs mine, seeing as I get the error even if I compile with no code, however I know nothing.
The full error:
main.lisp:1:1:
read-error:
READ error during COMPILE-FILE:
illegal terminating character after a colon: #\
Line: 1, Column: 13, File-Position: 12
Stream: #<SB-INT:FORM-TRACKING-STREAM for "file [path to file]\\main.lisp" {1005F5F0D3}>
Compilation failed.
Portacle is a standalone Emacs packaged with everything needed for Common Lisp development and which uses SBCL as the Common Lisp implementation.
I believe what you do when you say 'compile the whole file' is call slime-compile-and-load-file which is bound to the key sequence C-c C-k by default. There are a lot of moving components here:
Emacs is the text editor here. It also takes care of launching all the necessary components for Common Lisp development.
Slime is one such component. It serves as interface between Emacs and your Common Lisp implementation (SBCL in this case, but supports any Lisp in theory). Basically it sends the code you wrote in Emacs to your Lisp for evaluation.
SBCL is the Common Lisp implementation. In this case it is a compiler. This is what evaluates the code it receives and spews out the answers to the user interface in Emacs, through Slime. It also 'lives', in the sense that you interact with it by modifying the state of the loaded Lisp image, keeping track of defined functions, global dynamic variables and much more. This is why you can have the REPL, and why you need Slime to interact with it.
So to debug your problem, I would try to:
Launch SBCL from the Windows shell and run a simple .lisp file to check that everything works. You can put for example (format t "~a" (lisp-implementation-type)) in a .lisp file and run it in SBCL from the shell by calling (load "...\\file.lisp"). It should return "SBCL".
Create a completely new file using Emacs (and not weird Windows programs that could mess up the files) (C-x C-f), and try to call the compile from there (C-c C-k).
And I believe you made the right choice of IDE. Portacle is arguably the simplest tool out there if you are a total beginner in Common Lisp and do not know Emacs configuration. The keybindings are a bit daunting though.

The concept of Self-Hosting

So I'm developing a small programming language, and am trying to grasp around the concept of "Self-Hosting".
Wikipedia states:
The first self-hosting compiler (excluding assemblers) was written for Lisp by Hart and Levin at MIT in 1962. They wrote a Lisp compiler in Lisp, testing it inside an existing Lisp interpreter. Once they had improved the compiler to the point where it could compile its own source code, it was self-hosting.
From this, I understand that someone had a Lisp interpreter, (lets say in Python).
The Python program then reads a Lisp program which in turn can also read Lisp programs.
By the term, "Self-Hosting", this surely can't mean the Python program can cease to be of use, because removing that would remove the ability to run the Lisp program which reads other Lisp programs!
So by this, how does a program become able to host itself directly on the OS? Maybe I'm just not understanding it correctly.
In this case, the term self-hosting applies to the Lisp compiler they wrote, not the interpreter.
The Python Lisp interpreter (as in your example) would take Lisp source as input, and execute it directly.
The Lisp compiler (written in lisp) can take any Lisp source as input and generate a native machine binary[1] as output (which could then run without an interpreter).
With those two pieces, eliminating Python becomes feasible. The process would go as follows:
python.exe lispinterpret.py lispcompiler.lisp -i lispcompiler.lisp -o lispcompiler.exe
We ask Python to interpret a lisp program from source (lispcompiler.lisp), and we pass lispcompiler.lisp itself as input. lispcompiler.lisp then outputs lispcompiler.exe as output, which is a native machine binary (and doesn't depend on Python).
The next time you want to compile the compiler, the command is:
lispcompiler.exe -i lispcompiler.lisp -o lispcompiler2.exe
And you will have a new compiler without the use of Python.
[1] Or you could generate assembly code, which is passed to an assembler.

Might this permanently and accidentally overwrite the compiler's own functionality?

So I was writing my own function and I called it make-list and I got this from debugger:
The function MAKE-LIST is predefined in Clozure CL.
[Condition of type SIMPLE-ERROR]
Restarts:
0: [CONTINUE] Replace the definition of MAKE-LIST.
Fine, but what if I had accidentally chosen option 0?? Would my compiler be broken and forever have the wrong definition of an internal function, as I would have replaced it?
Only your currently running image would be broken, in which case you can restart CCL to restore it.
The only way to do permanent damage is to save the image, and chose to overwrite the original image file.
Many Lisp systems are written in Lisp themselves.
Clozure CL is such an example. Clozure CL is written in Clozure CL (with some C and assembler). Clozure CL can compile itself.
Thus many/most Common Lisp functions in Clozure CL are written in Clozure CL. So it needs some kind of a switch, where it allows to define or redefine built-in functionality. So there is definitely a way to edit the implementation's source code and change things. It would be best that your definitions are 'correct', so that the functioning of the Lisp system is not compromised. Keep in mind that redefinitions typically do not have an effect on inlined functions or on already expanded macros.
Now, if we as typical programmers use Clozure CL, some packages are protected and redefining the symbols are not allowed and an error is signaled. But you can continue and then change internal functions. As always in many Common Lisp, they are wide open for changes, but this comes with the responsibility for you as the programmer to do the right thing.
If you change a Lisp-internal function, there a some ways to leave permanent damage:
saving an image and using that later
using it to recompile CCL itself or parts of it
you could compile a file and somehow the generated code could be different than with the original compiler
you could compile a file and somehow the generated code includes an inlined version of the changed Lisp function
if one loads such a file, it could be automatically via some init file, it contains the changes and the changed code will be a part of the then currently running Lisp.

LISP 1.5 How lisp is like a machine language?

I wish that John McCarthy was still alive, but...
From LISP 1.5 Programmer's Manual :
LISP can interpret and execute programs written in the form of S-
expressions. Thus, like machine language, and unlike most other higher
level languages, it can be used to generate programs for further
execution.
I need more clarification about how machine language can used to generate programs and how Lisp can do it?
All that is saying is that machine code can directly write machine instructions to memory and jump to those instructions to execute them; this is the basis of many attack vectors to break into software, in fact.
The point is, when you're writing machine code, it's easy to generate machine code. But when you're writing in a compiled language like C, you can't just generate C code at run time and then execute it - unless your program includes a C compiler.
Lisp - and, these days, many other languages, especially "scripting languages" like Perl, Python, Ruby, Tcl, Javascript, and command shells - have the ability to execute code that is generated at runtime. In Lisp, since code and data have the same structure, this is usually less work than it is in the other languages, where the code to be evaluated is generally a string that has to be parsed. (Though Perl has the ability to eval a block instead of a string, which lets the compiler do the parsing ahead of time for literal code.)
A machine language can alter itself while running. The last assembly programming i did was for MS DOS and resident program that i used to run before testing other programs. When my program misbehaved, a keystroke switched to the resident program and could peek into the running program and alter it directly before resuming. It was quite handy since I didn't have a debugger.
LISP had this from the very beginning since it was originally interpreted. You could change the definition of a function while you were running and the whole langugage was always available at runtime, even eval and define. When it started getting compiled it wasn't compiled like Algol, but partially, allowing for interpreted and compiled code to intermix at the same time. The fact that its code structure was list structure and that symbols are a data type contributed to this.
Last interview I saw with McCarthy he was asked about what he thought of modern programming languages (Not LISP family but the Algol family language Ruby, that is said to be influenced by LISP), and before answering he asked if they could represent code as data (like list structure). Since it didn't, Ruby is still behind what LISP was in the 60s in his opinion.
Many new programming languages are emerging in the Algol family and some of the most promising ones, like Perl6 and Nemerle, are getting closer to the features LISP had in the 60s.
Machine language programs can fill memory regions with arbitrary bytes. Then they can just jump to the start of such region which will thus get executed right away.
Lisp language programs can easily create arbitrary S-expressions in memory, using cons. Then they can just call eval on these S-expressions to evaluate (interpret) them.
High level languages programs can easily fill memory regions with characters representing new code in the language's syntax. But they can not run such a code.