So I'm developing a small programming language, and am trying to get my head around the concept of "self-hosting".
Wikipedia states:
The first self-hosting compiler (excluding assemblers) was written for Lisp by Hart and Levin at MIT in 1962. They wrote a Lisp compiler in Lisp, testing it inside an existing Lisp interpreter. Once they had improved the compiler to the point where it could compile its own source code, it was self-hosting.
From this, I understand that someone had a Lisp interpreter (let's say written in Python).
The Python program then reads a Lisp program which in turn can also read Lisp programs.
Surely "self-hosting" can't mean the Python program can cease to be of use, because removing it would remove the ability to run the Lisp program which reads other Lisp programs!
So by this, how does a program become able to host itself directly on the OS? Maybe I'm just not understanding it correctly.
In this case, the term self-hosting applies to the Lisp compiler they wrote, not the interpreter.
The Python Lisp interpreter (as in your example) would take Lisp source as input, and execute it directly.
The Lisp compiler (written in Lisp) can take any Lisp source as input and generate a native machine binary[1] as output (which could then run without an interpreter).
With those two pieces, eliminating Python becomes feasible. The process would go as follows:
python.exe lispinterpret.py lispcompiler.lisp -i lispcompiler.lisp -o lispcompiler.exe
We ask Python to interpret a Lisp program from source (lispcompiler.lisp), and we pass lispcompiler.lisp itself as its input. lispcompiler.lisp then produces lispcompiler.exe as output, which is a native machine binary (and doesn't depend on Python).
The next time you want to compile the compiler, the command is:
lispcompiler.exe -i lispcompiler.lisp -o lispcompiler2.exe
And you will have a new compiler without the use of Python.
[1] Or you could generate assembly code, which is passed to an assembler.
I am interested in creating an Emacs extension that delegates the work to an external program.
However, my logic is written as a library in Common Lisp. If I can call the CL library directly from Elisp, that would be simpler for me; otherwise, I can use a client/server architecture.
I have looked into the Emacs LSP implementation, but I couldn't find a simple entry point on how to do it.
You could build a binary of your CL app and call it from the Elisp side. That seems to suit you fine, so here are some more pointers:
How to build a Common Lisp executable
short answer: see https://lispcookbook.github.io/cl-cookbook/scripting.html
Building a binary is done by calling sb-ext:save-lisp-and-die (on SBCL), typically from a fresh session started in the terminal, and not from a long-running image such as one with Slime connected. Note that this function's name differs across implementations.
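For instance, a minimal sketch of a build script for SBCL; the binary name and the main function are assumptions to adapt to your own program:
;; build.lisp -- run with: sbcl --load build.lisp
(defun main ()
  (format t "Hello from the binary!~%"))
(sb-ext:save-lisp-and-die "myprogram"
                          :toplevel #'main
                          :executable t)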
ASDF has a directive that allows you to do this declaratively and portably (across implementations). You add 3 lines to your .asd file, where you state your program's entry point. For example:
;; myprogram.asd
:build-operation "program-op" ;; leave this as is.
:build-pathname "myprogram"
:entry-point "myprogram::main" ;; up to you to write main.
Now, call (asdf:make :myprogram).
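For reference, a hedged sketch of what the whole .asd file could then contain; the system name, the component file and the main function are assumptions:
;; myprogram.asd
(asdf:defsystem "myprogram"
  :depends-on ()                      ; your library dependencies, if any
  :components ((:file "myprogram"))   ; myprogram.lisp defines myprogram::main
  :build-operation "program-op"       ; leave this as is.
  :build-pathname "myprogram"
  :entry-point "myprogram::main")     ; up to you to write main.
From a terminal, and assuming a recent ASDF that can find your system, building can then look like: sbcl --eval '(require :asdf)' --eval '(asdf:make :myprogram)' --quit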
See a more complete example in the Cookbook.
Call it from Elisp
See https://wikemacs.org/wiki/Emacs_Lisp_Cookbook#Processes
This returns the output as a string:
(shell-command-to-string "seq 8 12 | sort")
Full documentation: https://www.gnu.org/software/emacs/manual/html_node/elisp/Synchronous-Processes.html
Other approaches
Other approaches are discussed here: https://www.reddit.com/r/lisp/comments/kce20l/what_is_the_best_way_to_call_common_lisp_inside/
For example, one could start a lisp process with Slime and execute CL code with slime-eval.
Does choosing a programming language decide performance, when all of it is compiled down to some 1's and 0's?
E.g.: printf (in C) vs cout (in C++) vs print (in Python).
Do all of the above compile to the same binary code?
I'd appreciate any help in understanding this concept of programming languages and their role on the hardware in detail! Thanks in advance.
The choice of programming language can have many impacts on the performance of your code, how portable it is, its compatibility, and, among other things, how easily the objective can be put into code. To answer your question directly: C and C++ would likely produce comparable binaries when printing an output, if both were compiled for the same target environment. Python is different because it is an interpreted language, meaning the code is read by a program written in code native to the architecture and acted upon accordingly. Python is something of an edge case in this regard because it is technically compiled at execution time (and can be before distribution), but into an intermediate bytecode, similar in principle to Java bytecode, that is only understood by the Python interpreter.
The difference you bring up between lower-level languages like C and higher-level ones like Java, Python and even JavaScript is whether their execution is done by the native hardware or by an interpreter. Languages running on bare metal are generally understood to be faster than those running on interpreters, as the interpreter takes time to understand the code and uses its own system resources. Java tends to break this rule because its interpreter is a full virtual machine that understands very simple bytecode and JIT-compiles the hot paths to native code, making it competitive in speed with languages like C.
What kind of binary code they are compiled to depends on the compiler. For C and C++ there are dozens of different compilers, which may generate different binary code. Besides that, most compilers have optimization flags that greatly influence the generated binary code.
Python isn't even directly compiled into "machine code"; it's compiled into bytecode for the Python interpreter. The Python interpreter itself is a program that runs on the machine; it reads the Python bytecode and executes it, typically by internally calling predefined functions (that already exist as machine code).
I know there is no such thing, strictly speaking, as a compiled or interpreted language.
But, generally speaking, is LISP used to write scripts, like Python, Bash scripts, and batch scripts?
Or is it a general-purpose programming language like C++, Java, and C#?
Can anyone explain this in simple terms?
Early versions of the Lisp programming language, and Dartmouth BASIC, would be examples of interpreted languages (the implementation parses the source code and performs its behavior directly). However, Common Lisp (the current major dialect) is a compiled language.
Note that most Lisp compilers are not Just In Time compilers. You as a programmer can invoke the compiler, for example in Common Lisp with the functions COMPILE and COMPILE-FILE. Then Lisp code gets compiled.
Additionally most Lisp systems with both a compiler and an interpreter allow the execution of interpreted and compiled code to be freely mixed.
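For instance, a minimal sketch of invoking the compiler by hand at the REPL:
;; define a function, then compile it explicitly with COMPILE
(defun square (x) (* x x))
(compile 'square)
(compiled-function-p (symbol-function 'square))  ; => T
;; whole files go through COMPILE-FILE, then get loaded:
;; (load (compile-file "foo.lisp"))
Note that on implementations such as SBCL, which compile everything by default, the function is already compiled right after DEFUN.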
For more details check here
Lisp is a compiled general purpose language, in its modern use.
To clarify:
“LISP” is nowadays understood as “Common Lisp”
Common Lisp is an ANSI Standard
There are several implementations of Common Lisp, both free and commercial
Code is usually compiled, then loaded into an image. The order in which the individual parts/files of an entire system are compiled and loaded is usually defined through a system definition facility (which mostly means ASDF nowadays).
Most implementations also provide a means for loading source code when started. Example:
sbcl --load 'foo.lisp'
This makes it also possible to use lisp source files as “scripts”, even though they will very likely be compiled before execution.
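For example, SBCL also has a --script option that loads and runs a file non-interactively, without the startup banner:
;; hello.lisp -- run with: sbcl --script hello.lisp
(format t "Hello from a Lisp script!~%")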
Traditionally, LISP can be interpreted or compiled -- with some of each running at the same time. Compilation, in some cases, is to a virtual machine, as with Java.
LISP is a general-purpose programming language, but it is rarely used as such anymore. In the days of microcoded LISP machines, the entire operating system, including things like network, graphics, and printer drivers, was written in LISP itself. The very first IMAP mail client, for example, was written entirely in LISP.
The unusual syntax likely makes other programming languages, like Python, more attractive. But if one looks carefully, you can find LISP-inspired elements in popular languages like Perl.
Simple question here; I just can't seem to phrase it for Google in a way it can understand.
Say I wanted to execute a line of actual programming code (C++ or Java or Python... etc.) like SetCursorPos or printf from the command prompt's command line. I vaguely imagine I would have to invoke the compiler and pass the command to it like a parameter, from where it would then be converted into machine language and passed to... where exactly?
Okay so that was kind of two questions.
How to run actual code from the command line and
what exactly is happening when a fully compiled program, or converted line of code (presuming these are essentially binary containers at that point), is executed?
Question one takes priority obviously. Unfortunately, I can not find any documentation on it, just a bunch of stuff vaguely related to it.
How to run actual code from the command line
Without delving into the vast amounts of blurriness between them, there are two major categories of language implementations: interpreters and compilers.
With many interpreters (or implementations with implicit compilation, such as V8 JavaScript's JIT compiler, or pretty much anything with a REPL), running a single line from the command line should be fairly trivial. CPython (the standard implementation of Python) has the -c command-line option:
$ python -c 'print("Hello, world!")'
Hello, world!
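For comparison, several Lisp implementations can do the same; for instance SBCL (assuming it is installed) accepts forms to evaluate on the command line:
$ sbcl --noinform --eval '(format t "Hello, world!~%")' --quit
Hello, world!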
Language implementations with explicit compilation steps tend to be decidedly less simple. In particular, the compiler would need to accept source either directly from the argument list or from standard input (via piping or redirection). On the output side, your compiler would have to support immediately executing that program, or writing it to standard output so that an operating system feature (if one exists) can execute it from a pipe.
To my knowledge, most explicit compilers are not designed with such usage in mind. In such cases, your best bet is to see if there is a REPL available for the language in question, preferably one as compatible with your compiler as possible, or to create (or find) a wrapper that makes it look like your language has a REPL. The wrapper would:
Accept input along the lines of CPython above.
Create a temporary source file behind the scenes with the code to be run and any necessary boilerplate.
Pass that file to the compiler.
Automatically run the resulting executable.
Delete the source file and executable. These may be cleaned up by the operating system later instead, if they're in a temp directory.
From the point of view of the user, this should look pretty similar to the CPython example, as they wouldn't have to interact with or see the compiler or temporary files.
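A hedged sketch of such a wrapper, written here in Common Lisp with UIOP and driving the system C compiler; the file names and the cc invocation are illustrative assumptions:
;; assumes UIOP is available, e.g. after (require :asdf)
(defun run-one-c-statement (statement)
  "Wrap STATEMENT in a main(), compile it, run it, then clean up."
  (let ((src "/tmp/one-liner.c")
        (exe "/tmp/one-liner"))
    ;; 1. temporary source file with the necessary boilerplate
    (with-open-file (out src :direction :output :if-exists :supersede)
      (format out "#include <stdio.h>~%int main(void) { ~a; return 0; }~%"
              statement))
    ;; 2. pass it to the compiler
    (uiop:run-program (list "cc" src "-o" exe))
    ;; 3. run the resulting executable, sending its output to *standard-output*
    (uiop:run-program (list exe) :output *standard-output*)
    ;; 4. delete the source file and the executable
    (mapc #'delete-file (list src exe))))
;; (run-one-c-statement "printf(\"Hello, world!\\n\")")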
I wish that John McCarthy was still alive, but...
From the LISP 1.5 Programmer's Manual:
LISP can interpret and execute programs written in the form of S-expressions. Thus, like machine language, and unlike most other higher level languages, it can be used to generate programs for further execution.
I need more clarification on how machine language can be used to generate programs, and how Lisp can do it.
All that is saying is that machine code can directly write machine instructions to memory and jump to those instructions to execute them; this is the basis of many attack vectors to break into software, in fact.
The point is, when you're writing machine code, it's easy to generate machine code. But when you're writing in a compiled language like C, you can't just generate C code at run time and then execute it - unless your program includes a C compiler.
Lisp - and, these days, many other languages, especially "scripting languages" like Perl, Python, Ruby, Tcl, Javascript, and command shells - have the ability to execute code that is generated at runtime. In Lisp, since code and data have the same structure, this is usually less work than it is in the other languages, where the code to be evaluated is generally a string that has to be parsed. (Though Perl has the ability to eval a block instead of a string, which lets the compiler do the parsing ahead of time for literal code.)
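A minimal Common Lisp sketch of that difference: the "code" below is an ordinary list built at runtime, not a string that needs parsing:
(let* ((n 40)
       (code (list '+ n 2)))  ; builds the list (+ 40 2) -- code as plain data
  (eval code))                ; => 42, executed at runtime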
A machine language program can alter itself while running. The last assembly programming I did was for MS-DOS: a resident program that I used to run before testing other programs. When my program misbehaved, a keystroke switched to the resident program, where I could peek into the running program and alter it directly before resuming. It was quite handy, since I didn't have a debugger.
LISP had this from the very beginning, since it was originally interpreted. You could change the definition of a function while your program was running, and the whole language was always available at runtime, even eval and define. When it started getting compiled, it wasn't compiled like Algol, but partially, allowing interpreted and compiled code to be mixed at the same time. The fact that its code structure is list structure, and that symbols are a data type, contributed to this.
In the last interview I saw with McCarthy, he was asked what he thought of modern programming languages (not the LISP family, but the Algol-family language Ruby, which is said to be influenced by LISP), and before answering he asked whether it could represent code as data (like list structure). Since it couldn't, Ruby was, in his opinion, still behind what LISP was in the 60s.
Many new programming languages are emerging in the Algol family, and some of the most promising ones, like Perl 6 and Nemerle, are getting closer to the features LISP had in the 60s.
Machine language programs can fill memory regions with arbitrary bytes. Then they can just jump to the start of such a region, which will thus get executed right away.
Lisp language programs can easily create arbitrary S-expressions in memory, using cons. Then they can just call eval on these S-expressions to evaluate (interpret) them.
Programs in other high-level languages can easily fill memory regions with characters representing new code in the language's syntax, but they cannot run such code (short of bundling a compiler or interpreter with the program).
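To make the Lisp case concrete, a minimal sketch: build an S-expression with CONS/LIST, then either EVAL it or compile it on the fly:
(let ((form (cons '* (list 6 7))))                          ; the S-expression (* 6 7)
  (values (eval form)                                        ; => 42, evaluated directly
          (funcall (compile nil (list 'lambda '() form)))))  ; => 42, compiled first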