Can a Lisp read procedure read this and how? - lisp

I'm writing a grammar which I intend to implement in a Lisp read procedure, i.e. reading one expression at a time from an input source which is i.e. mutable. Most of the grammar is just like Lisp, but the two pertinent changes are:
Whitespace is read and is part of the resulting syntax. Contiguous whitespace is grouped together like contiguous non-whitespace characters are grouped as identifiers, and the result of reading such a string is a "whitespace object", which stores the exact sequence of characters read. The evaluator ignores whitespace objects when they appear in a list (in other words, if foo is a whitespace object then (eval '(+ 3 foo 4)) is equivalent to (eval '(+ 3 4))), and if it is asked to evaluate one directly, it is self-evaluating.
Secondly, if several tokens other than whitespace tokens appear on the same line, those tokens are collected into a list and that list is the result of the read.
e.g.,
+ 3 4 5
(+ 3 4 5)
+ 3 4 (+ 1 4)
(+ 3 4 (+ 1 4))
all produce the value 12.
Is it possible to implement this reader as a Lisp read procedure that follows the typical expectations of a read procedure? If so, how? (I'm at a loss.)
Edit: Clarification on whitespace:
If we say that a "whitespace object" is simply a string and read, then reading the following segment:
(foo bar baz)
produces a syntax object like:
'(foo " " bar " " baz)
In other words, the whitespace between tokens is stored in the resultant syntax object.
Suppose I write a macro named ->, which takes a syntax object (scheme style macro), and whitespace? is a predicate identifying whitespace syntax objects
(define-macro (-> stx)
(let* ((stxl (syntax-object->list stx))
(obj (car stxl))
(let proc ((res empty))
(lst (cdr stxl)))
(let ((method (car lst)))
(if (whitespace? method)
; skip whitespace, recur immediately
(proc res (cdr lst))
; Insert obj as the second element in method
(let ((modified-method (cons (car method)
(cons obj (cdr method)))))
; recur
(proc (cons res modified-method) (cdr lst))))))))

The reading part of this is pretty easy. You just need a whitespace test, and then your reading function will install a custom reader character macro that detects whitespace and reads consecutive sequences of whitespace into a single object. First, the whitespace test and a whitespace object; these are pretty simple:
(defparameter *whitespace*
#(#\space #\tab #\return #\newline)
"A vector of whitespace characters.")
(defun whitespace-p (char)
"Returns true if CHAR is in *WHITESPACE*."
(find char *whitespace* :test 'char=))
(defstruct whitespace-object
characters)
Now the macro character function:
(defun whitespace-macro-char (stream char)
"A macro character function that consumes characters from
stream (including CHAR), until a non-whitespace character (or end of
file) is encountered. Returns a whitespace-object whose characters
slot contains a string of the whitespace characters."
(let ((chars (loop for c = (peek-char nil stream nil #\a)
while (whitespace-p c)
collect (read-char stream))))
(make-whitespace-object
:characters (coerce (list* char chars) 'string))))
Now the read function just has the same signature as the normal read, but copies the readtable, then installs the macro function, and calls read. The result from read is returned, and the readtable is restored:
(defun xread (&optional (stream *standard-input*) (eof-error-p t) eof-value recursive-p)
"Like READ, but called with *READTABLE* bound to a readtable in
which each whitespace characters (that is, each character in
*WHITESPACE*) is a macro characters whose macro function is
WHITESPACE-MACRO-CHAR."
(let ((rt (copy-readtable)))
(map nil (lambda (wchar)
(set-macro-character wchar #'whitespace-macro-char))
*whitespace*)
(unwind-protect (read stream eof-error-p eof-value recursive-p)
(setf *readtable* rt))))
Example:
(with-input-from-string (in "(+ 1 2 (* 3
4))")
(xread in))
(+ #S(WHITESPACE-OBJECT :CHARACTERS " ") 1
#S(WHITESPACE-OBJECT :CHARACTERS " ") 2
#S(WHITESPACE-OBJECT :CHARACTERS " ")
(* #S(WHITESPACE-OBJECT :CHARACTERS " ") 3
#S(WHITESPACE-OBJECT
:CHARACTERS "
")
4))
Now, to implement the eval counterpart that you want, you need to be able to remove whitespace objects from lists. This isn't too hard, and we can write a slightly more general utility function to do it for us:
(defun remove-element-if (predicate tree)
"Returns a new tree like TREE, but which contains no elements in an
element position which ssatisfy PREDICATE. An element is in element
position if it is the car of some cons cell in TREE."
(if (not (consp tree))
tree
(if (funcall predicate (car tree))
(remove-element-if predicate (cdr tree))
(cons (remove-element-if predicate (car tree))
(remove-element-if predicate (cdr tree))))))
CL-USER> (remove-element-if (lambda (x) (and (numberp x) (evenp x))) '(+ 1 2 3 4))
(+ 1 3)
CL-USER> (with-input-from-string (in "(+ 1 2 (* 3
4))")
(remove-element-if 'whitespace-object-p (xread in)))
(+ 1 2 (* 3 4))
So now the evaluation function is a simple wrapper around eval:
(defun xeval (form)
(eval (remove-element-if 'whitespace-object-p form)))
CL-USER> (with-input-from-string (in "(+ 1 2 (* 3
4))")
(xeval (xread in)))
15
Let's make sure that standalone whitespace objects still appear as expected:
CL-USER> (with-input-from-string (in " ")
(let* ((exp (xread in))
(val (xeval exp)))
(values exp val)))
#S(WHITESPACE-OBJECT :CHARACTERS " ")
#S(WHITESPACE-OBJECT :CHARACTERS " ")

Related

Common Lisp: Destructure a list in first, rest, last (like Python iterable unpacking)

Exercise 6.36 of David Touretzky's Common Lisp book asks for a function swap-first-last that swaps the first and last argument of any list. I feel really stupid right now, but I am unable to solve this with destructuring-bind.
How can I do what in Python would be first, *rest, last = (1,2,3,4) (iterable unpacking) in Common Lisp/with destructuring-bind?
After all trying out, and with some comments by #WillNess (thanks!) I came up with this idea:
macro bind
The idea is trying to subdivide the list and use the &rest functionality of the lambda list in destructuring-bind, however, using the shorter . notation - and using butlast and the car-last combination.
(defmacro bind ((first _rest last) expr &body body)
`(destructuring-bind ((,first . ,_rest) ,last)
`(,,(butlast expr) ,,(car (last expr)))
,#body)))
usage:
(bind (f _rest l) (list 1 2 3 4)
(list f _rest l))
;; => (1 (2 3) 4)
My original answer
There is no so elegant possibility like for Python.
destructuring-bind cannot bind more differently than lambda can: lambda-lists take only the entire rest as &rest <name-for-rest>.
No way there to take the last element out directly.
(Of course, no way, except you write a macro extra for this kind of problems).
(destructuring-bind (first &rest rest) (list 1 2 3 4)
(let* ((last (car (last rest)))
(*rest (butlast rest)))
(list first *rest last)))
;;=> (1 (2 3) 4)
;; or:
(destructuring-bind (first . rest) (list 1 2 3 4)
(let* ((last (car (last rest)))
(*rest (butlast rest)))
(list first *rest last)))
But of course, you are in lisp, you could theoretically write macros to
destructuring-bind in a more sophisticated way ...
But then, destructuring-bind does not lead to much more clarity than:
(defparameter *l* '(1 2 3 4))
(let ((first (car *l*))
(*rest (butlast (cdr *l*)))
(last (car (last *l*))))
(list first *rest last))
;;=> (1 (2 3) 4)
The macro first-*rest-last
To show you, how quickly in common lisp such a macro is generated:
;; first-*rest-last is a macro which destructures list for their
;; first, middle and last elements.
;; I guess more skilled lisp programmers could write you
;; kind of a more generalized `destructuring-bind` with some extra syntax ;; that can distinguish the middle pieces like `*rest` from `&rest rest`.
;; But I don't know reader macros that well yet.
(ql:quickload :alexandria)
(defmacro first-*rest-last ((first *rest last) expr &body body)
(let ((rest))
(alexandria:once-only (rest)
`(destructuring-bind (,first . ,rest) ,expr
(destructuring-bind (,last . ,*rest) (nreverse ,rest)
(let ((,*rest (nreverse ,*rest)))
,#body))))))
;; or an easier definition:
(defmacro first-*rest-last ((first *rest last) expr &body body)
(alexandria:once-only (expr)
`(let ((,first (car ,expr))
(,*rest (butlast (cdr ,expr)))
(,last (car (last ,expr))))
,#body))))
Usage:
;; you give in the list after `first-*rest-last` the name of the variables
;; which should capture the first, middle and last part of your list-giving expression
;; which you then can use in the body.
(first-*rest-last (a b c) (list 1 2 3 4)
(list a b c))
;;=> (1 (2 3) 4)
This macro allows you to give any name for the first, *rest and last part of the list, which you can process further in the body of the macro,
hopefully contributing to more readability in your code.

Function returns list but prints out NIL in LISP

I'm reading a file char by char and constructing a list which is consist of list of letters of words. I did that but when it comes to testing it prints out NIL. Also outside of test function when i print out list, it prints nicely. What is the problem here? Is there any other meaning of LET keyword?
This is my read fucntion:
(defun read-and-parse (filename)
(with-open-file (s filename)
(let (words)
(let (letter)
(loop for c = (read-char s nil)
while c
do(when (char/= c #\Space)
(if (char/= c #\Newline) (push c letter)))
do(when (or (char= c #\Space) (char= c #\Newline) )
(push (reverse letter) words)
(setf letter '())))
(reverse words)
))))
This is test function:
(defun test_on_test_data ()
(let (doc (read-and-parse "document2.txt"))
(print doc)
))
This is input text:
hello
this is a test
You're not using let properly. The syntax is:
(let ((var1 val1)
(var2 val2)
...)
body)
If the initial value of the variable is NIL, you can abbreviate (varN nil) as just varN.
You wrote:
(let (doc
(read-and-parse "document2.txt"))
(print doc))
Based on the above, this is using the abbreviation, and it's equivalent to:
(let ((doc nil)
(read-and-parse "document2.txt"))
(print doc))
Now you can see that this binds doc to NIL, and binds the variable read-and-parse to "document2.txt". It never calls the function. The correct syntax is:
(let ((doc (read-and-parse "document2.txt")))
(print doc))
Barmar's answer is the right one. For interest, here is a version of read-and-parse which makes possibly-more-idiomatic use of loop, and also abstracts out the 'is the character white' decision since this is something which is really not usefully possible in portable CL as the standard character repertoire is absurdly poor (there's no tab for instance!). I'm sure there is some library available via Quicklisp which deals with this better than the below.
I think this is fairly readable: there's an outer loop which collects words, and an inner loop which collects characters into a word, skipping over whitespace until it finds the next word. Both use loop's collect feature to collect lists forwards. On the other hand, I feel kind of bad every time I use loop (I know there are alternatives).
By default this collects the words as lists of characters: if you tell it to it will collect them as strings.
(defun char-white-p (c)
;; Is a character white? The fallback for this is horrid, since
;; tab &c are not a standard characters. There must be a portability
;; library with a function which does this.
#+LispWorks (lw:whitespace-char-p c)
#+CCL (ccl:whitespacep c) ;?
#-(or LispWorks CCL)
(member char (load-time-value
(mapcan (lambda (n)
(let ((c (name-char n)))
(and c (list c))))
'("Space" "Newline" "Page" "Tab" "Return" "Linefeed"
;; and I am not sure about the following, but, well
"Backspace" "Rubout")))))
(defun read-and-parse (filename &key (as-strings nil))
"Parse a file into a list of words, splitting on whitespace.
By default the words are returned as lists of characters. If
AS-STRINGS is T then they are coerced to strings"
(with-open-file (s filename)
(loop for maybe-word = (loop with collecting = nil
for c = (read-char s nil)
;; carry on until we hit EOF, or we
;; hit whitespace while collecting a
;; word
until (or (not c) ;EOF
(and collecting (char-white-p c)))
;; if we're not collecting and we see
;; a non-white character, then we're
;; now collecting
when (and (not collecting) (not (char-white-p c)))
do (setf collecting t)
when collecting
collect c)
while (not (null maybe-word))
collect (if as-strings
(coerce maybe-word 'string)
maybe-word))))

sbcl (and clisp): When is a character not a character? (using defconstant)

This question is about sbcl -- or so I thought originally. The question: When is a character not a character? Consider the following code:
(defconstant +asc-lf+ #\Newline)
(defconstant +asc-space+ #\Space)
(prin1 (type-of #\Newline )) (terpri)
(prin1 (type-of #\Space )) (terpri)
(prin1 (type-of +asc-lf+ )) (terpri)
(prin1 (type-of +asc-space+)) (terpri)
As expected, it produces:
STANDARD-CHAR
STANDARD-CHAR
STANDARD-CHAR
STANDARD-CHAR
Now consider this code:
(defun st (the-string)
(string-trim '(#\Newline #\Space) the-string))
(princ "\"")
(princ (st " abcdefgh "))
(princ "\"")
(terpri)
It produces:
"abcdefgh"
But consider this code:
(defconstant +asc-lf+ #\Newline)
(defconstant +asc-space+ #\Space)
(defun st (the-string)
(string-trim '(+asc-lf+ +asc-space+) the-string))
(princ "\"")
(princ (st " abcdefgh "))
(princ "\"")
(terpri)
When you load it using sbcl, it gives you:
While evaluating the form starting at line 6, column 0
of #P"/u/home/sbcl/experiments/type-conflict.d/2.lisp":"
debugger invoked on a TYPE-ERROR:
The value
+ASC-LF+
is not of type
CHARACTER
Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
0: [RETRY ] Retry EVAL of current toplevel form.
1: [CONTINUE] Ignore error and continue loading file "/u/home/sbcl/experiments/type-conflict.d/2.lisp".
2: [ABORT ] Abort loading file "/u/home/sbcl/experiments/type-conflict.d/2.lisp".
3: Exit debugger, returning to top level.
((FLET SB-IMPL::TRIM-CHAR-P :IN SB-IMPL::GENERIC-STRING-TRIM) #\ )
0]
At first, I was anticipating being able to report that clisp does the appropriate call to #'string-trim, with the anticipated returned value, or maybe errors out. But it does neither of these. The function returns the same string that was passed to it, without any trimming.
Is this what should be happening? What am I missing?
EDIT approx. 2017-10-21 08:50 UTC
The fine answer by PuercoPop inspires a follow-up question. If I should post this as a separate question, just give the word and I will.
Why is it that (at least with sbcl and clisp) this:
(defconstant +asc-lf+ #\Newline)
(defconstant +asc-space+ #\Space)
(prin1 (type-of (first (list #\Newline #\Space))))
(terpri)
(prin1 (type-of (first '(#\Newline #\Space))))
(terpri)
yields this?
STANDARD-CHAR
STANDARD-CHAR
With PuercoPop's answer, I would have expected it to yield something about a symbol, not a character, for the second expression.
The main confusion comes from
the dual purpose of lists: data and code. For evaluation (+ a b) is code, here a function call. Both (quote (+ a b)) and '(+ a b) is data, as they evaluate to the quoted literal data.
reading already creates objects. #\newline is already read as a character object. It is built-in syntax: Sharpsign Backslash It is not a string, not a symbol and not some yet unknown piece of data. It is read as an object of type character (I use the wording character object for that here, one could also just say character).
These are symbols:
foo
bar
+foo+
*the-foo*
When symbols get evaluated, they evaluate to their value.
These are character objects:
#\f
#\O
#\o
#\newline
When character objects get evaluated, they evaluate to themselves.
Thus '#\foo, (quote #\foo) and #\foo evaluate all to the same object.
These are lists
(newline #\newline) ; the first item is a symbol, the second a character object
(#\a #\b #\c) ; a list of character objects
(a b c) ; a list of symbols
What happens if we evaluate lists:
(+ a b) ; the sum of the values of A and B
(list a b) ; a list gets computed, with the values of variables a and b
(list 'a 'b) ; a list gets computed, with the symbols A and B
'(a b) ; a literal list of the symbols A and B
'(#\a #\b) ; a literal list of the character objects #\a and #\b
'(a #\a) ; a literal list of the symbol A and the character object #\a
(#\a #\b) ; an error, #\a is not a function/macro/special-form
(+ a 'b) ; an error, a symbol B is not a number
Evaluating backquoted lists:
`(a ,a #\a ,#\a) ; a list of the symbol a, the value of the variable a,
; the character object a and again the character object a
Your error:
'(+asc-lf+ +asc-space+) evaluates to a list of symbols.
The function STRING-TRIM expects a sequence of characters.
You need to write something like this:
(list +asc-lf+ +asc-space+) ; calling the function list
`(,+asc-lf+ ,+asc-space+) ; a backquoted list with comma for evaluation
(vector +asc-lf+ +asc-space+) ; the constructed vector is also a sequence
Also:
(list #\Newline #\Space) and '(#\Newline #\Space) evaluate both to a list of characters. The #\ syntax is a built-in feature of the Lisp reader to construct character objects. Thus #\newline is converted at read-time into a character-object:
CL-USER 82 > (describe (read))
#\Newline ; we type the nine characters #\Newline
#\Newline is a CHARACTER
Name "Newline"
Code 10
The problem is that you are quoting the "character list". So instead of a list of characters it is a list of symbols. That is
(defun st (the-string)
(string-trim (list +asc-lf+ +asc-space+) the-string))
The error message hints at this when it says
The value
+ASC-LF+ is not of type
CHARACTER
and not
The value
#\Newline is not of type
CHARACTER

return a line of text if match found

I am having some trouble working out how to return a line of text if a match is found.
(set 'wireshark "http://anonsvn.wireshark.org/wireshark/trunk/manuf")
(set 'arptable (map (fn (x) (parse x " ")) (exec "arp -a")))
(define (cleanIPaddress x)
(slice x 1 -1))
(define (cleanMACaddress x)
(upper-case (join (slice (parse x ":") 0 3) ":")))
(define (addIPandMACaddress x)
(list (cleanIPaddress (nth 1 x)) (cleanMACaddress (nth 3 x))))
(set 'arplist (map addIPandMACaddress arptable))
(set 'routerMAC (last (assoc (exec "ipconfig getoption en1 router") arplist)))
(find-all routerMAC (get-url wireshark))
returns
("20:AA:4B")
so I know that the code "works"
but I would like to retrieve the full line of text
"20:AA:4B Cisco-Li # Cisco-Linksys, LLC"
This can be performed simply by using a string-split procedure that allows us to use remove-if (the Common Lisp version of filter) to search through a string split by newlines removing any lines that do not contain the string we are searching for. That would result in a list of every line containing the string. The functions we will define here are already available via various Common Lisp libraries, but for the education purposes, we will define them all ourselves. The code you need works like so:
; First we need a function to split a string by character
(defun string-split (split-string string)
(loop with l = (length split-string)
for n = 0 then (+ pos l)
for pos = (search split-string string :start2 n)
if pos collect (subseq string n pos)
else collect (subseq string n)
while pos))
; Now we will make a function based on string-split to split by newlines
(defun newline-split (string)
(string-split "
" string))
; Finally, we go through our text searching for lines that match our string.
; Make sure to replace 'needle' with the string you wish to search for.
(remove-if #'(lambda (x)
(equal 'nil (search (string-upcase "needle")
(string-upcase x))))
(newline-split haystack))
You should be able to apply this strategy to the code you posted with a few small modifications. This code was tested on SBCL 1.0.55.0-abb03f9, an implementation of ANSI Common Lisp, on Mac OS X 10.7.5.
In the end I used:
(find-all (string routerMAC ".*") (get-url wireshark))

String to list without #\ in common lisp

I'd like to turn String into lists. For example, http => (h t t p).
I try:
(defun string-to-list (s)
(assert (stringp s) (s) "~s :questa non e una stringa")
(coerce s 'list))
but if I do
(string-to-list "http")
results:
(#\h #\t #\t #\p).
Can I remove #\ ?
thanks in advance :)
Why would you do that? What you ask is to split a string (a one-dimensional array of characters) into a list of symbols. Do you really want that?
#\h is a character object printed.
You can print them differently:
CL-USER 8 > (princ #\h)
h
CL-USER 9 > (prin1 #\h)
#\h
Let's print the list using PRINC:
CL-USER 10 > (map nil #'princ (coerce "Hello!" 'list))
Hello!
Btw., since strings, vectors and lists are sequences, you can MAP directly over the string...
CL-USER 11 > (map nil #'princ "Hello!")
Hello!
You can turn a string into a symbol with intern. You can turn a character into a string with string. Interning a lower-case string might cause it to be printed as |h| instead of h, so you'll want to string-upcase it. Putting all that together gives:
(loop for c in (coerce "http" 'list)
collecting (intern (string-upcase (string c))))
Expanding upon larsmans' answer, you can print lowercase symbols unquoted if you change the readtable:
(let ((*readtable* (copy-readtable)))
(setf (readtable-case *readtable*) :preserve)
(prin1 (loop for c in (coerce "http" 'list)
collecting (intern (string c)))))
This will print (h t t p) and return (|h| |t| |t| |p|).
You can print characters unescaped. See the variable *PRINT-ESCAPE*.
The function WRITE has a keyword parameter :ESCAPE for that:
(defun string-to-list (s)
(assert (stringp s) (s) "~s :questa non e una stringa")
(write (coerce s 'list) :escape nil)
CL-USER 11 > (string-to-list "abcd")
(a b c d)
(#\a #\b #\c #\d)
In above example the first form is printed by calling WRITE and the second form is the return value printed by the REPL.