Why do we need `nil`? - lisp

I do not see why we need nil [1] when consing a sequence (a so-called proper list) of items. It seems to me we can achieve the same goal by using the so-called improper list (cons-ed pairs without an ending nil) alone. Since Lisps [2] already provide a primitive procedure to distinguish between a pair? and an atom (some implementations even provide atom?), when defining a procedure on a list, e.g., length, I can do the same with just dotted pairs, as shown below:
(define len
  (lambda (l)
    (cond ((pair? l) (+ 1 (len (cdr l))))
          (else 1))))
It is obvious that we can apply this procedure to an improper list like '(1 . (2 . 3)) to get the expected answer 3, in contrast to the traditional (length '(1 2 3)).
I'd like to hear any opinions in defense of the necessity of nil. Thanks in advance.
[1] Let's ignore the debate among nil/NIL, '() and ().
[2] Here it means the Lisp family of languages.

Working with lists without nil (or '()) would be like doing arithmetic without zero. Using only pairs without nil, how would we represent an empty list, or a singleton list '(1)?
It gets worse: since lists don't have to be lists of atoms, but can contain other lists, how would we represent the nested list '(1 2 (3 4))? If we do the following conversions:
'(3 4) => '(3 . 4)
'(1 2 x) => '(1 . (2 . x)) == '(1 2 . x)
we get:
'(1 2 (3 4)) => '(1 . (2 . (3 . 4))) == '(1 2 3 . 4)
But also:
'(1 2 3 4) => '(1 . (2 . (3 . 4))) == '(1 2 3 . 4)
So constructing lists only using pairs and no nil prevents us from distinguishing between a nested list structure and a flat list, at least at the end of the list. You can still include nested lists as any element except the last, so now there's a strange and arbitrary limitation on what the elements of a list can be.
More theoretically, proper lists are an inductively defined data type: a list is either the empty list, or it has a first element, which can be anything, and a rest, which is always another list defined in the same way. Take away the empty list, and now you have a data type where the rest might be another list, or it might be the last element of the list. We can't tell except by passing it to pair?, which leads to the problem with nested lists above. Keeping nil around lets us have whatever we like as list elements, and allows us to distinguish between 1, '(1), '((1)) and so on.
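The collision described above can be reproduced with a small sketch in Common Lisp. The helper name encode-without-nil is hypothetical (it is not a standard function); it applies the two conversions shown above recursively and assumes a non-empty proper list as input.

```lisp
;; Hypothetical helper: re-encode a non-empty proper list as nested pairs
;; with no NIL terminator, applying the conversions shown above.
(defun encode-without-nil (items)
  (flet ((encode-element (x)
           ;; Elements that are themselves lists get the same treatment.
           (if (consp x) (encode-without-nil x) x)))
    (if (null (rest items))
        (encode-element (first items))   ; last element becomes the final cdr
        (cons (encode-element (first items))
              (encode-without-nil (rest items))))))

;; Two different proper lists collapse into the same nil-free structure:
(equal (encode-without-nil '(1 2 (3 4)))
       (encode-without-nil '(1 2 3 4)))   ; => T

;; A singleton list becomes indistinguishable from its element:
(encode-without-nil '(1))                 ; => 1
```

Both results follow directly from dropping the terminator: once the last element sits in the final cdr, nothing records whether it was a nested list or the remainder of the flat one.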

You need it to represent "Nothing".

Related

Understanding function `tailp` in Common Lisp

While looking through the "Common Lisp Quick Reference" by Bert Burgemeister, I stumbled over tailp.
First, I misunderstood the definitions of this function. And I tried:
(tailp '(3 4 5) '(1 2 3 4 5))
But it returned
NIL
CLTL2 says, tailp is true iff the first argument is any (nthcdr n list) with existing n.
(nthcdr 2 '(1 2 3 4 5))
;; (3 4 5)
I further tried:
(tailp '(3 4 5) '(1 2 3 4 5))
;; NIL - and I would expect: T following the definition above.
(tailp '() '(1 2 3 4 5))
;; T
(tailp '5 '(1 2 3 4 . 5))
;; T
Then I tried the following, and understood that tailp looks for a cdr of l that is the very same object (the same address), not merely one with equal contents:
(defparameter l '(1 2 3 4 5 6))
(tailp (nthcdr 3 l) l)
;; T
But then I had my next question:
What is such a function useful for at all?
Wouldn't a function that checks whether a sublist is part of a list be more useful? (That is, one that checks whether it looks like a part of the list, rather than requiring it to share the same address?)
Remark:
Ah, okay, I am slowly beginning to understand that this is a kind of eq for the cdr parts of a list: "Is any cdr-derivative of the given list eq to the first argument?"
But maybe someone can explain to me in which situations such a test is useful?
Remark:
In a long discussion with @Lassi here, we found out:
Never use tailp on circular lists!
The behavior is undefined (it is already problematic in SBCL).
So tailp is meant for use on non-circular lists.
The basic purpose of tailp is to check whether list structure is shared, that is, whether the cons cells themselves are the same (which means EQL as the predicate), not just the contents of the cons cells.
One can also check if an item is in the last cdr:
CL-USER 87 > (tailp t '(1 2 3 4 . t))
T
CL-USER 88 > (tailp nil '(1 2 3 4 . nil))
T
CL-USER 89 > (tailp nil '(1 2 3 4))
T
CL-USER 90 > (tailp #1="e" '(1 2 3 4 . #1#))
T
This is one of the rarely used functions in Common Lisp.
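For the content-based test the question asks about, the standard functions equal and search are enough. The sketch below is only an illustration: equal-tail-p is a hypothetical name, not a standard function, and the lists are built with list rather than quoted so that no sharing of literals can interfere.

```lisp
;; Hypothetical content-based cousin of TAILP: walks the cdrs of LIST
;; like TAILP does, but compares with EQUAL instead of EQL.
(defun equal-tail-p (object list)
  (loop for tail = list then (cdr tail)
        when (equal object tail) return t
        when (atom tail) return nil))

(let ((big (list 1 2 3 4 5)))
  (list (tailp (list 3 4 5) big)          ; false: fresh conses, nothing shared
        (tailp (nthcdr 2 big) big)        ; true: the very same conses
        (equal-tail-p (list 3 4 5) big)   ; T: equal contents are enough here
        (search (list 3 4) big)))         ; 2: position of a matching run
```

search looks for a matching run anywhere in the sequence, while equal-tail-p only accepts a match that reaches the end of the list.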
Here is a case where tailp is useful:
(defun circular-list-p (l)
  (and (consp l)
       (tailp l (rest l))))
A couple of notes.
This terminates whenever l is a proper list, a dotted list, or a circular list whose cycle contains l itself: tailp is allowed not to terminate on circular lists if the first argument is not a tail of the second (i.e. there's no requirement for it to check circularity), but it must terminate if the first argument is a tail of the second. If the list is circular in that sense, that's exactly what we check for here, so it terminates. (I was confused about this for a while.) The case this does not cover is a list with a non-circular prefix leading into a cycle: there l is not a tail of (rest l), so tailp may loop.
The consp check is so (circular-list-p nil) is false: I think this is pragmatically the useful choice, although a pedant might argue that nil is circular.
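To see the predicate at work, one needs a circular list to feed it. The following is a sketch: make-ring is a hypothetical helper, and circular-list-p is repeated from above so that the example is self-contained.

```lisp
(defun circular-list-p (l)
  (and (consp l)
       (tailp l (rest l))))

;; Hypothetical helper: build a fresh list and tie its last cdr back to
;; its first cons, so that every cons lies on the cycle.
(defun make-ring (item &rest more)
  (let ((ring (cons item (copy-list more))))
    (setf (cdr (last ring)) ring)
    ring))

(list (circular-list-p (make-ring 1 2 3))   ; true
      (circular-list-p (list 1 2 3))        ; false
      (circular-list-p nil))                ; false
```

When trying this at a REPL, bind *print-circle* to t before printing a ring, otherwise the printer may not terminate.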
I'm pretty sure the answer to (tailp '(3 4 5) '(1 2 3 4 5)) can be either t or nil, since a smart compiler might turn it into (tailp '#1=(3 4 5) '(1 2 . #1#)) to reduce the memory footprint. Quoted data are immutable literals, so why not use the same memory twice?
Here is how tailp works:
(defparameter *tail* (list 3 4 5))
(defparameter *larger* (list* 1 2 *tail*))
(defparameter *replica* (copy-list *larger*))
(tailp *tail* *replica*) ; ==> nil
(tailp *tail* *larger*) ; ==> t
Since copy-list creates a new cons for each element in a list, the copy shares nothing but the empty list with any other list. It is unique.
*larger* has been made with a common tail with *tail* and thus (tailp *tail* *larger*) will be t.
It's important that it compares the arguments as the same objects. Since the tail does not need to be a list, it compares with eql. To check whether things merely look the same you would use equal, so tailp is more specific than that: the tail has to be pointer-equal (eq) or an eql atomic value.
So where do you use this? I'm thinking a functional data structure where you typically reuse shared structure. tailp might be used to identify a subtree of a parent. It's basically a more performant version of this:
(defun my-tailp (needle haystack)
  (cond ((eql needle haystack) t)
        ((atom haystack) nil)
        (t (my-tailp needle (cdr haystack)))))
@Sylwester:
Thank you, @Sylwester!
I recently read in Edi Weitz's book about tail wagging a list:
(defparameter *list* (list 'a 'b 'c 'd))
(defparameter *tail* (cdddr *list*))
This names the last cons cell of the list *tail*. Now one can attach a new element to it and rebind *tail* to the new last cons cell of the list.
(setf (cdr *tail*) (cons 'e 'nil)
      *tail* (cdr *tail*))
;; (E)
Now the list is:
*list*
;; (A B C D E)
and one can add further things to *tail* via setf without having to traverse the list again. (So it improves performance, but with a warning of course, because this is a destructive action.)
Perhaps, if one names a list as you did:
(defparameter *my-new-tail* '(F G H))
and tail-wags it onto the end of the list:
(setf (cdr *tail*) *my-new-tail*
      *tail* (cddr *my-new-tail*))
Or alternatively:
(defparameter *tail-of-my-new-tail* (cddr *my-new-tail*))
;; and then
(setf (cdr *tail*) *my-new-tail*
      *tail* *tail-of-my-new-tail*)
Then
(tailp *my-new-tail* *list*)
would be a test of whether this procedure was done correctly, that is, whether *my-new-tail* really has been tail-wagged onto *list*. So tailp would be quite a useful test in the context of tail wagging.
Or, as you said, if an implementation uses tail wagging for performance reasons (perhaps constantly keeping track of the tails of lists), it could use tailp to test whether a given list contributes to another list as the recently added tail.
This just came to mind when reading your answer, @Sylwester! (I didn't realize this while reading about tail wagging in the book, which is by the way a super useful one.) Thank you for answering!
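The check described above can be sketched with freshly consed lists (a literal such as '(F G H) should not be modified destructively later on). The function name wag-tail is hypothetical, and it assumes the new tail is a non-empty proper list.

```lisp
;; Hypothetical helper: destructively hang NEW-TAIL after the cons TAIL
;; and return the last cons of the combined list.
(defun wag-tail (tail new-tail)
  (setf (cdr tail) new-tail)
  (last new-tail))

(let* ((items (list 'a 'b 'c 'd))
       (tail (last items))
       (new-tail (list 'f 'g 'h)))
  (setf tail (wag-tail tail new-tail))
  (list items
        (and (tailp new-tail items) t)   ; the spliced list is a shared tail
        (and (tailp tail items) t)))     ; and so is the tracked last cons
;; => ((A B C D F G H) T T)
```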

Why is (type-of list) equal to CONS?

I am playing around with Common Lisp and just realized that
(type-of (cons 1 2)) is CONS
and
(type-of (list 1 2)) is also CONS
However the two are clearly not the same because all "proper" lists, must be conses with second element being a list.
That said, when there are only two elements, the second element is 2, and first element is 1, neither is a list, but the construct is also still called a cons.
This gets even more confusing since
(print (list (cons 1 2) 3)) ; this is a ((1 . 2) 3), an improper list, but still cons
(print (cons 1 (list 2 3))) ; this is a (1 2 3), a proper list, but still cons
(cons 1 (cons 2 3)) ; is not a proper list, but is a (1 2 . 3), but still cons...
All are cons, but why isn't (list 1 2) a list? It can't be a cons because cons and list must be different types in order to be told apart in the algorithm for determining whether or not it is a proper list (and in turn, (equal (list 1 2) (cons 1 2)) should be true; without this discrimination, there should be no difference between a cons and a list, there would just be a cons.
Can somebody please help me understand why it says that (type-of (list 1 2)) is cons, even though it is clearly a list (otherwise it would be an improper list to my understanding).
Proper and improper lists are not defined at the type level. This would require recursive type definitions, which in Lisp is only possible with a satisfies type, and in that case type-of would still not return such a complex type specifier:
b. the type returned does not involve and, eql, member, not, or, satisfies or values.
The list type could be defined as (or cons null):
The types cons and null form an exhaustive partition of the type list.
That means that nil is a list, and any cons cell is a list. See also the definition of listp.
In other words:
(typep '(a b c) 'cons)
=> T
But also:
(typep '(a b c) 'list)
=> T
And of course this is true for any supertype:
(typep '(a b c) 'sequence)
=> T
(typep '(a b c) 't)
=> T
The type-of function returns the most basic type, i.e. cons, which can be thought of as the type for which no other subtype satisfies typep (but read the specification, which gives the actual definition).
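The typep calls above can be gathered into one small sketch. The helper name satisfied-list-types is hypothetical; it only uses the standard typep and remove-if-not to report which of a few standard type specifiers an object belongs to.

```lisp
;; Hypothetical helper: report which of a few standard type specifiers
;; OBJECT satisfies, to show that type membership is not exclusive.
(defun satisfied-list-types (object)
  (remove-if-not (lambda (type) (typep object type))
                 '(null cons list sequence)))

(satisfied-list-types (cons 1 2))   ; => (CONS LIST SEQUENCE)
(satisfied-list-types (list 1 2))   ; => (CONS LIST SEQUENCE)
(satisfied-list-types nil)          ; => (NULL LIST SEQUENCE)
(satisfied-list-types 42)           ; => NIL
```

At the type level (cons 1 2) and (list 1 2) are indistinguishable, which is why type-of has nothing more specific than cons to report for either.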
Remarks
Just to clarify:
(cons 1 2)
... is a list, but it cannot be passed to functions that expect proper lists, like map. This is checked at runtime and, generally, there is no confusion because the cases where one uses improper lists are actually quite rare (when you treat cons cells as trees, for example). Likewise, circular lists require special treatment.
In order to check if a list is proper or not, you only need to check whether the last cons has a nil or not as its cdr.
Also, I saw that you wrote:
((1 . 2) 3) ; [...] an improper list
What you have here is a proper-list of two elements where the first one is an improper list, a.k.a. a dotted-list.
@coredump's answer is the correct one, but it's perhaps useful to see pragmatic reasons why it's correct.
Firstly, it's quite desirable that typechecks are quick. So if I say (typep x 'list), I'd like it not to have to go away for a long time to do the check.
Well, consider what a proper list checker has to look like. Something like this, perhaps:
(defun proper-list-p (x)
  (typecase x
    (null t)
    (cons (proper-list-p (rest x)))
    (t nil)))
For any good CL compiler this is a loop (and it can obviously be rewritten as an explicit loop if you might need to deal with rudimentary compilers). But it's a loop which is as long as the list you are checking, and this fails the 'typechecks should be quick' test.
In fact it fails a more serious test: typechecks should terminate. Consider a call like (proper-list-p #1=(1 . #1#)). Oops. So we need something like this, perhaps:
(defun proper-list-p (x)
  (labels ((plp (thing seen)
             (typecase thing
               (null (values t nil))
               (cons
                (if (member thing seen)
                    (values nil t)      ;or t t?
                    (plp (rest thing)
                         (cons thing seen))))
               (t (values nil nil)))))
    (plp x '())))
Well, this will terminate (and tell you whether the list is circular):
> (proper-list-p '#1=(1 . #1#))
nil
t
(This version considers circular lists not to be proper: I think the other decision is less useful but perhaps equally justified in some theoretical sense.)
But this is now quadratic in the length of the list. This can be made better by using a hashtable in the obvious way, but then the implementation is ludicrously consy for small lists (hashtables are big).
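Another option, not used in the answer above, is Floyd's tortoise-and-hare cycle detection, which stays linear in time and needs no extra storage. This is a minimal sketch under that substitution: the name proper-list-p/floyd is hypothetical, and it returns a single boolean rather than the two values used above.

```lisp
;; Hypothetical variant using Floyd's cycle detection: FAST advances two
;; conses per round, SLOW one; they can only meet again inside a cycle.
(defun proper-list-p/floyd (x)
  (let ((slow x)
        (fast x))
    (loop
      (cond ((null fast) (return t))      ; reached NIL: proper list
            ((atom fast) (return nil)))   ; non-NIL atom: dotted list
      (setf fast (cdr fast))
      (cond ((null fast) (return t))
            ((atom fast) (return nil)))
      (setf fast (cdr fast)
            slow (cdr slow))
      (when (and (consp fast) (eq fast slow))
        (return nil)))))                  ; pointers met: circular list

(proper-list-p/floyd '(1 2 3))      ; => T
(proper-list-p/floyd '(1 2 . 3))    ; => NIL
(let ((ring (list 1 2 3)))
  (setf (cdr (last ring)) ring)
  (proper-list-p/floyd ring))       ; => NIL
```

It is still a loop as long as the list, so it does nothing for the "typechecks should be quick" objection; it only removes the quadratic cost and the extra consing.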
Another reason is to consider the difference between representational type and intentional type: the representational type of something tells you how it is implemented, while the intentional type tells you what it logically is. And it's easy to see that, in a lisp with mutable data structures, it is absurdly difficult for the representational type of a (non-null) list to be different than that of a cons. Here's an example of why:
(defun make-list/last (length init)
  ;; return a list of length LENGTH, with each element being INIT,
  ;; and its last cons.
  (labels ((mlt (n list last)
             (cond ((zerop n)
                    (values list last))
                   ((null last)
                    (let ((c (cons init nil)))
                      (mlt (- n 1) c c)))
                   (t (mlt (- n 1) (cons init list) last)))))
    (mlt length '() '())))
(multiple-value-bind (list last) (make-list/last 10 3)
  (values
   (proper-list-p list)
   (progn
     (setf (cdr last) t)
     (proper-list-p list))
   (progn
     (setf (cdr (cdr list)) '(2 3))
     (proper-list-p list))))
So the result of the last form is t nil t: list is initially a proper list, then it isn't because I fiddled with its final cons, then it is again because I fiddled with some intermediate cons (and now, whatever I do to the cons bound to last will make no difference to that bound to list).
It would be insanely difficult to keep track, in terms of representational type, of whether something is a proper list or not, if you want to use anything that is remotely like linked lists. And type-of, for instance, tells you the representational type of something, which can only be cons (or null for empty lists).

push does not work as I would expect it - why?

I am experiencing a behavior of the push function that I don't get. Maybe someone could explain to me why Lisp behaves this way.
Suppose I define a list as a global variable, and then try to push a new value to it using the following code:
(defparameter *primes* '(3 5 7 11))
(push 2 *primes*)
Then *primes* is now (2 3 5 7 11). So far, so good.
Now I try to do the same thing, but without the *primes* variable, i.e.:
(push 2 '(3 5 7 11))
The result is an error message:
EVAL: 3 is not a function name; try using a symbol instead
Now I have two questions:
Why does this not work? I would expect that push returns the list (2 3 5 7 11), why does this not happen? Where am I wrong?
Apart from that, I don't get the error message. What is Lisp trying to tell me with 3 is not a function name? Of course, 3 is not a function name, but I don't try to call a function named 3 anywhere, do I?
Any help is appreciated :-)
If you read the CL Hyperspec for PUSH, you will read that push expects a place.
A place is something like a variable, a structure slot, a class slot, an array access, or similar. Since Lisp uses linked cons cells for lists, it does not make sense to push something in front of a cons cell, without a reference for that.
So the above is simple: we can't push onto a literal list, because it is not a place.
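A small sketch of the distinction, assuming nothing beyond standard Common Lisp (the function name push-demo is hypothetical): push stores the new list back into a place, whereas cons simply returns a new list and needs no place at all.

```lisp
;; Sketch: PUSH needs a place to store the new list into; CONS does not.
(defun push-demo ()
  (let ((primes (list 3 5 7 11))                 ; a variable is a place
        (rows (vector (list 'b 'c) (list 'y))))
    (push 2 primes)                   ; roughly (setf primes (cons 2 primes))
    (push 'a (aref rows 0))           ; an AREF form is a place too
    (values primes
            (aref rows 0)
            (cons 2 '(3 5 7 11)))))   ; no place needed, just a new list

(push-demo)
;; => (2 3 5 7 11), (A B C), (2 3 5 7 11)
```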
Why this error message?
This gets a bit complicated...
(push 2 '(3 5 7 11))
Is actually:
(push 2 (quote (3 5 7 11)))
A function can be a place, it then needs a corresponding setter function. Here the setter is thought to be (setf quote) - that's right, Common Lisp can sometimes have lists as function names, not only symbols.
If we look at the macroexpansion of above:
? (pprint (macroexpand '(push 2 (quote (3 5 7 11)))))
(LET* ((#:G328 2) (#:G327 (3 5 7 11)) (#:G326 (CONS #:G328 '#:G327)))
#:G327
#:G326
(FUNCALL #'(SETF QUOTE) #:G326 #:G327))
You can see that it tries to call the setter. But it also thinks that (3 5 7 11) is a Lisp form.
I give you an example, where it actually works, but we don't use quote, but a real accessor function:
CL-USER 40 > (let ((v (vector (list (list 'a 'b 'c) (list 'd 'e 'f))
                              (list (list 1 2 3) (list 4 5 6)))))
               (print v)
               (push 42 (first (aref v 1)))
               (print v)
               (values))
#(((A B C) (D E F)) ((1 2 3) (4 5 6)))
#(((A B C) (D E F)) ((42 1 2 3) (4 5 6)))
In the above, first is the getter and CL knows the corresponding setter. The form (aref v 1) is the call and returns the element at index 1 of the vector. We are then pushing onto the first list of that element.
Your call has a similar structure, and (3 5 7 11) is at a similar position to (aref v 1). The Lisp system says that in (3 5 7 11) the number 3 is not a valid function, which is correct. But the real error was in the push operation. Since the macro could not detect the error, it gets detected later, in the macro-expanded code.
I have only found the Emacs Lisp manual entry for push, but I guess it behaves similarly in Common Lisp
— Macro: push element listname
This macro creates a new list whose car is element and whose cdr is the list specified by listname, and saves that list in listname.
So it seems push is modifying its argument listname, which isn't possible with a literal list. To do what you have in mind, one would use cons instead.
As to the second part, "3 is not a function name": I would say push, or some function inside it, tries to evaluate the literal list. Evaluating (3 5 7 11) means calling the function 3 with arguments 5 7 11. Hence the error message.
Again from emacs, Ctrl-h f push
push is a Lisp macro in `cl.el'.
(push X PLACE)
Insert X at the head of the list stored in PLACE.
Analogous to (setf PLACE (cons X PLACE)), though more careful about
evaluating each argument only once and in the right order. PLACE may
be a symbol, or any generalized variable allowed by `setf'.
setf in turn allows place to be a symbolic reference such as (car x) or (aref x i), which explains why push evaluates the second argument.
I think you need CONS in second case:
(cons 2 '(3 5 7 11)) => (2 3 5 7 11)

How to watch out for the fact that NREVERSE may modify CARs instead

http://www.aiai.ed.ac.uk/~jeff/lisp/cl-pitfalls states this as one of Common Lisp pitfalls
Destructive functions that you think would modify CDRs might modify CARs instead. (Eg, NREVERSE.)
I am not sure what precautions I am supposed to take. The usual precaution I take, given that NREVERSE may modify CDRs, is to use NREVERSE only when the list (the argument) does not share a tail with any other list that my variables may refer to later (except for the variable I save the return value to). What precaution should I take given that NREVERSE may modify CARs? How is this something to watch out for?
Without any context this is very hard to understand.
Example:
(setq list1 (list 1 2 3 4))
We now have a list of four numbers. The variable list1 points to the first cons.
If we look at a destructive reverse we are talking about an operation which may alter the cons cells. There are different ways how this list can be reversed.
We could for example take the cons cells and reverse those. The first cons cell then is the last. The cdr of that cons cell then has to be changed into NIL.
CL-USER 52 > (setq list1 (list 1 2 3 4))
(1 2 3 4)
CL-USER 53 > (nreverse list1)
(4 3 2 1)
Now our variable list1 still points to the same cons cell, but its cdr has been changed:
CL-USER 54 > list1
(1)
To make sure that the variable points to a reversed list, the programmer then has the duty to update the variable and set it to the result of the nreverse operation. One may also be tempted to exploit the observable result that list1 points to the last cons.
The above is what a Lisp developer would usually expect. Most implementations of nreverse seem to work that way. But the ANSI CL standard does not specify how nreverse has to be implemented.
So what would it mean to change the CARs instead?
Let's look at an alternative implementation of nreverse:
(defun nreverse1 (list)
  (loop for e across (reverse (coerce list 'vector))
        for a on list do
          (setf (car a) e))
  list)
The function above leaves the chain of cons cells intact, but changes the cars.
CL-USER 56 > (setq list1 (list 1 2 3 4))
(1 2 3 4)
Now let's use the new version, nreverse1.
CL-USER 57 > (nreverse1 list1)
(4 3 2 1)
CL-USER 58 > list1
(4 3 2 1)
Now you see the difference: list1 still points to the whole list.
Summary: one needs to be aware that there are different implementations of nreverse possible. Don't exploit the usual behavior, where a variable then will point to the last cons. Just use the result of nreverse and everything is fine.
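In code, the advice in the summary comes down to always using the value nreverse returns. A minimal sketch, with the hypothetical function name squares-up-to, showing the common accumulate-then-reverse idiom and the store-back idiom for variables:

```lisp
;; Accumulate-then-reverse: only the return value of NREVERSE is used,
;; so it does not matter whether the implementation changes cars or cdrs.
(defun squares-up-to (n)
  (let ((acc '()))
    (dotimes (i n)
      (push (* (1+ i) (1+ i)) acc))   ; ACC is built from fresh conses
    (nreverse acc)))

(squares-up-to 4)   ; => (1 4 9 16)

;; When the list lives in a variable, store the result back:
(let ((items (list 1 2 3 4)))
  (setf items (nreverse items))
  items)            ; => (4 3 2 1)
```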
Side note: where could the second version have been used?
Some Lisp implementations on Lisp Machines allowed a compact vector-like representation of lists. If on such a Lisp implementation one would nreverse such a list, the implementors could provide an efficient vector-like nreverse.
In any case, whether it is the CAR or the CDR of a cons cell that gets modified, you shouldn't use NREVERSE if any cons cell (including the first cons cell) of the passed list may be shared with another list. Use REVERSE instead.
BTW, clisp indeed modifies CARs:
> (let ((a (list 1 2 3 4 5 6 7 8 9 0)))
    (nreverse a)
    a)
(0 9 8 7 6 5 4 3 2 1)

Using lists with Common LISP

I'm just starting out with LISP, as in, just opened the book, I'm two pages into it. I'm trying to understand what is and what is not an acceptable fn call. Every time I try to execute
(1 2 3 4)
I get an illegal fn call error
same goes for
(cdr (1 2 3 4))
(first (1 2 3 4))
(a b c d)
Are CL programs unable to return lists? How would I go about using these functions or printing a list? I'm using the SLIME implementation if it matters. LISP is very different than anything I've worked with before and I want to be sure I'm getting it conceptually.
You need to quote lists if you are using them as constants. Otherwise, the system will try to call the function 1 on the arguments 2 3 4, which will not work (note that function calls have the same syntax as lists). Your examples should be:
'(1 2 3 4)
(cdr '(1 2 3 4))
(first '(1 2 3 4))
'(a b c d)
Hooo boy.
Look up Practical Common Lisp by Seibel. He's such a nice guy, he put it online for free reading. It's very useful.
Part of the definition of Lisp is this rule:
When a list is seen: Using the first element of the list, apply it to the rest of the list.
But wait: how do you actually enter lists then? There are two operators to do this: QUOTE (a special operator, usually written with the ' shorthand) and LIST (an ordinary function).
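The two differ in whether the elements get evaluated, which a short sketch makes visible (the function name quote-vs-list is hypothetical):

```lisp
;; QUOTE returns its argument unevaluated; LIST is an ordinary function,
;; so its arguments are evaluated before the list is built.
(defun quote-vs-list ()
  (values '(1 (+ 1 1) 3)
          (list 1 (+ 1 1) 3)))

(quote-vs-list)
;; => (1 (+ 1 1) 3), (1 2 3)
```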
As an example, let's print a list to the screen on standard out:
(format *standard-output* "~a" '(1 2 3 4))
For format, a destination of t stands for *standard-output* (this is specified by the standard, not just SBCL behaviour), so usually we see (format t ....