terpri, princ & co. vs format - lisp

Chapter 9.10 of Common Lisp: A Gentle Introduction To Symbolic Computation claims:
The primitive i/o functions TERPRI, PRIN1, PRINC and PRINT were defined in Lisp 1.5 (the ancestor of all modern Lisp systems) and are still found in Common Lisp today. They are included in the Advanced Topics section as a historical note; you can get the same effect with FORMAT.
This implies that you do not need princ & co. any more and that modern code should rely only on format instead.
Are there any disadvantages to doing this? Or are there things that the other functions can do but format cannot?

These functions correspond exactly to the following FORMAT operators:
TERPRI = ~%
FRESH-LINE = ~&
PRIN1 = ~S
PRINC = ~A
PRINT = ~%~S<space>
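One way to check this correspondence, as a minimal sketch: capture what each function writes and compare it with the string that format produces. The macro name output-of and the variable *obj* are made up for this illustration. FRESH-LINE is left out because its output depends on whether the stream is already at the start of a line.

```lisp
;; OUTPUT-OF is a hypothetical helper: it runs BODY and returns
;; everything written to *standard-output* as a string.
(defmacro output-of (&body body)
  `(with-output-to-string (*standard-output*)
     ,@body))

(defparameter *obj* "hi")

;; Each comparison pairs a printing function with its FORMAT directive.
(list (string= (output-of (terpri))      (format nil "~%"))
      (string= (output-of (prin1 *obj*)) (format nil "~S" *obj*))
      (string= (output-of (princ *obj*)) (format nil "~A" *obj*))
      (string= (output-of (print *obj*)) (format nil "~%~S " *obj*)))
;; => (T T T T)
```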

You can also use the more modern write. I'm not a huge fan of format because of its terse sub language, which usually is interpreted. Note that a good implementation might be able to compile format directives to more efficient code. I use FORMAT mostly when it makes complex code shorter, but not to output plain objects or things like single carriage returns...
Common Lisp includes three or more generations of text I/O APIs:
the old s-expression printing routines
the specialized and generalized stream IO functions
the complex formatter, based on earlier Fortran and/or Multics IO formatters
the Generic Function to print objects
the pretty printer
Additionally there are semi-standard CLOS-based IO implementations like Gray Streams.
Each might have its purpose and none is going away soon...
CL-USER 54 > (let ((label "Social security number")
                   (colon ": ")
                   (social-security-number '|7537 DD 459234957324 DE|))
               (terpri)
               (princ label)
               (princ colon)
               (princ social-security-number)
               (write-char #\newline)
               (write-string label)
               (write-string colon)
               (write social-security-number :escape nil)
               (format t "~%~A~A~A" label colon social-security-number))
Social security number: 7537 DD 459234957324 DE
Social security number: 7537 DD 459234957324 DE
Social security number: 7537 DD 459234957324 DE
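The generic-function layer from the list above is not exercised by this transcript. As a rough sketch of how it fits in, a print-object method controls what princ, prin1, write, and format all emit for instances of a class. The class account and its slot owner are invented for the example.

```lisp
;; ACCOUNT and OWNER are hypothetical names used only for this sketch.
(defclass account ()
  ((owner :initarg :owner :reader owner)))

;; Every printing API above ends up calling PRINT-OBJECT for instances.
(defmethod print-object ((a account) stream)
  (print-unreadable-object (a stream :type t)
    (format stream "owner: ~A" (owner a))))

(defparameter *acct* (make-instance 'account :owner "Ada"))

;; PRINC-TO-STRING and FORMAT's ~A go through the same method.
(string= (princ-to-string *acct*)
         (format nil "~A" *acct*))
;; => T
```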

Related

What's the semantic difference between the backtick and quote symbols in Common Lisp?

I understand that both suppress evaluation of a symbol or expression. But the backtick is used for macro definitions while the apostrophe is used for symbols (among other things). What is the difference, semantically speaking, between these two notations?
Backticks allow for ,foo and ,@foo to interpolate dynamic parts into the quoted expression.
' straight up quotes everything literally.
If there are no comma parts in the expression, ` and ' can be used interchangeably.
A standard quoted form is a true constant literal, so similar lists, and lists that end with the same structure, may share structure:
'(a b c d) ; ==> (a b c d)
A backquoted structure might not be a literal. It is evaluated, because every unquote needs to be evaluated and inserted into place. This means that something like `(a ,@b ,c d) actually gets expanded to something similar to (cons 'a (append b (cons c '(d)))).
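A small check of that equivalence, as a sketch with made-up values for b and c:

```lisp
;; B and C are example bindings chosen for this illustration.
(let ((b (list 1 2))
      (c 3))
  (equal `(a ,@b ,c d)
         (cons 'a (append b (cons c '(d))))))
;; => T   (both forms build the list (A 1 2 3 D))
```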
The standard is very flexible about how implementations solve this, so if you macroexpand the expression you will see many different solutions, sometimes involving internal functions. The result, though, is well explained in the standard.
NB: Even though two separate evaluations may produce different values, the implementation is still free to share structure. In my example, '(d) has the potential to be shared, so destructively concatenating onto the result (e.g., with nconc) might produce a circular structure.
A parallel to this is that some Algol-family languages have two types of strings: one that interpolates variables and one that doesn't. E.g., in PHP:
"Hello $var"; // ==> 'Hello Shoblade'
'Hello $var'; // ==> 'Hello $var'

What do elisp expression (1+ (buffer-size)) and (+ 1 (buffer-size)) mean?

I'm very, very new to elisp and have just started learning it. I have seen the following expressions in the documentation:
(1+ (buffer-size))
(+ 1 (buffer-size))
What do they mean? As far as I know, elisp uses prefix notation, so the second one should be the correct one. But both of them can be executed without any errors. The first one is from the documentation of the point-max function.
Thanks.
The token 1+ is an identifier which denotes a symbol. This symbol has a binding as a function, and so (1+ arg) means "call the 1+ function, with the value of arg as its argument". The 1+ function returns 1 plus the value of its argument.
The syntax (+ 1 arg) is a different way to achieve that effect. Here the function is named by the symbol +. The + function receives two arguments which it adds together.
In many mainstream programming languages popular today, the tokenization rules are such that there is no difference between 1+ and 1 +: both denote a numeric constant followed by a + token. Lisp tokenization is different. Languages in the Lisp family usually support tokens that can contain digits and non-alphanumeric characters. I'm looking at the Emacs Lisp reference manual and do not see a section about the logic which the read function uses to convert printed representations to objects.
Typically, "Lispy" tokenizing behavior is something like this: a token is first scanned without regard for what kind of token it is, by accumulating characters which are valid token constituents and stopping at a character which is not a token constituent. For instance, when the input is abcde(f, the token that will be extracted is abcde. The ( character terminates the token (and stays in the input stream). Then the resulting clump of characters abcde is re-examined, classified, and converted to an object based on what it looks like, according to the rules of the given Lisp dialect.
Across Lisp dialects, we can broadly depend on a token of all alphabetic characters to denote a symbol, and a token of all digits (possibly with a leading sign) to denote an integer. 1+ has a trailing + though, which is different! It does not fit the pattern for an integer, so it is read as a symbol.
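The classification step can be observed directly from the reader. This sketch uses Common Lisp rather than Emacs Lisp (there, read-from-string returns a cons of the object and the end position), but both dialects classify these two tokens the same way:

```lisp
;; Common Lisp sketch: ask the reader what each token turns into.
(list (symbolp  (read-from-string "1+"))   ; trailing sign: a symbol
      (integerp (read-from-string "+1"))   ; leading sign: an integer
      (1+ 41)                              ; call the function named 1+
      (+ 1 41))                            ; call + with two arguments
;; => (T T 42 42)
```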

Using Emoji literals in Clojure source

On Linux with UTF-8 enabled console:
Clojure 1.6.0
user=> (def c \の)
#'user/c
user=> (str c)
"の"
user=> (def c \🍒)
RuntimeException Unsupported character: \🍒 clojure.lang.Util.runtimeException (Util.java:221)
RuntimeException Unmatched delimiter: ) clojure.lang.Util.runtimeException (Util.java:221)
I was hoping to have an emoji-rich Clojure application with little effort, but it appears I will be looking up and typing in emoji codes? Or am I missing something obvious here? 😞
Java represents Unicode characters in UTF-16. The emoji characters are "supplementary characters" and have a codepoint that cannot be represented in 16 bits.
http://www.oracle.com/technetwork/articles/javase/supplementary-142654.html
In essence, supplementary characters are represented not as chars but as ints and there are special apis for dealing with them.
One way is with (Character/toChars 128516) - this returns a char array that you can convert to a string to print: (apply str (Character/toChars 128516)). Or you can create a String from an array of codepoint ints directly with (String. (int-array [128516]) 0 1). Depending on all the various things between Java/Clojure and your eyeballs, that may or may not do what you want.
The format api supports supplementary characters so that may be easiest, however it takes an int so you'll need a cast: (format "Smile! %c" (int 128516)).
Thanks to Clojure’s extensible reader tags, you can create Unicode literals quite easily yourself.
We already know that not all of Unicode can be represented as char literals; that the preferred representation of Unicode characters on the JVM is int; and that a string literal can hold any Unicode character in a way that’s also convenient for humans to read.
So, a tagged literal #u "🍒" that reads as an int would make an excellent Unicode character literal!
Set up a reader function for the new tagged literal in *data-readers*:
(defn read-codepoint
  [^String s]
  {:pre [(= 1 (.codePointCount s 0 (.length s)))]}
  (.codePointAt s 0))
(set! *data-readers* (assoc *data-readers* 'u #'read-codepoint))
With that in place, the reader reads such literals as code point integers:
#u"🍒" ; => 127826
(Character/getName #u"🍒") ; => "CHERRIES"
‘Reader tags without namespace qualifiers are reserved for Clojure’, says the documentation … #u is short but perhaps not the most responsible choice.

What exactly is a symbol in lisp/scheme?

For the love of the almighty I have yet to understand the purpose of the symbol 'iamasymbol. I understand numbers, booleans, strings... variables. But symbols are just too much for my little imperative-thinking mind to take. What exactly do I use them for? How are they supposed to be used in a program? My grasp of this concept is just fail.
In Scheme and Racket, a symbol is like an immutable string that happens to be interned so that symbols can be compared with eq? (fast, essentially pointer comparison). Symbols and strings are separate data types.
One use for symbols is lightweight enumerations. For example, one might say a direction is either 'north, 'south, 'east, or 'west. You could of course use strings for the same purpose, but it would be slightly less efficient. Using numbers would be a bad idea; represent information in as obvious and transparent a manner as possible.
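A minimal sketch of such an enumeration, written in Common Lisp rather than Scheme; the function name opposite is invented for the example:

```lisp
;; OPPOSITE is a hypothetical helper; the four direction symbols
;; act as the members of an ad-hoc enumeration.
(defun opposite (direction)
  (ecase direction          ; ECASE signals an error for any other value
    (north 'south)
    (south 'north)
    (east  'west)
    (west  'east)))

(opposite 'north)           ; => SOUTH
(eq 'north 'north)          ; => T, interned symbols are the same object
```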
For another example, SXML is a representation of XML using lists, symbols, and strings. In particular, strings represent character data and symbols represent element names. Thus the XML <em>hello world</em> would be represented by the value (list 'em "hello world"), which can be more compactly written '(em "hello world").
Another use for symbols is as keys. For example, you could implement a method table as a dictionary mapping symbols to implementation functions. To call a method, you look up the symbol that corresponds to the method name. Lisp/Scheme/Racket makes that really easy, because the language already has a built-in correspondence between identifiers (part of the language's syntax) and symbols (values in the language). That correspondence makes it easy to support macros, which implement user-defined syntactic extensions to the language. For example, one could implement a class system as a macro library, using the implicit correspondence between "method names" (a syntactic notion defined by the class system) and symbols:
(send obj meth arg1 arg2)
=>
(apply (lookup-method obj 'meth) obj (list arg1 arg2))
(In other Lisps, what I've said is mostly truish, but there are additional things to know about, like packages and function vs variable slots, IIRC.)
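The expansion above can be made concrete with a toy version in Common Lisp. This is only a sketch under simplifying assumptions: there is a single global method table and no classes, so lookup-method ignores the object, and the names *method-table* and scale are invented for the example.

```lisp
;; One global table mapping method-name symbols to functions (an assumption
;; of this sketch; a real class system would dispatch on the object).
(defvar *method-table* (make-hash-table :test #'eq))

(defun lookup-method (obj name)
  (declare (ignore obj))
  (or (gethash name *method-table*)
      (error "No method named ~S" name)))

;; SEND turns the identifier METH in the source into the symbol 'METH.
(defmacro send (obj meth &rest args)
  (let ((receiver (gensym "RECEIVER")))   ; evaluate OBJ only once
    `(let ((,receiver ,obj))
       (apply (lookup-method ,receiver ',meth) ,receiver (list ,@args)))))

(setf (gethash 'scale *method-table*)
      (lambda (point factor)
        (mapcar (lambda (x) (* x factor)) point)))

(send (list 1 2) scale 10)   ; => (10 20)
```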
A symbol is an object with a simple string representation that (by default) is guaranteed to be interned; i.e., any two symbols that are written the same are the same object in memory (reference equality).
Why do Lisps have symbols? Well, it's largely an artifact of the fact that Lisps embed their own syntax as a data type of the language. Compilers and interpreters use symbols to represent identifiers in a program; since Lisp allows you to represent a program's syntax as data, it provides symbols because they're part of the representation.
What are they useful for apart from that? Well, a few things:
Lisp is commonly used to implement embedded domain-specific languages. Many of the techniques used for that come from the compiler world, so symbols are a useful tool here.
Macros in Common Lisp usually involve dealing with symbols in more detail than this answer provides. (Though in particular, generation of unique identifiers for macro expansions requires being able to generate a symbol that's guaranteed never to be equal to any other.)
Fixed enumeration types are better implemented as symbols than strings, because symbols can be compared by reference equality.
There are many data structures you can construct where you can get a performance benefit from using symbols and reference equality.
Symbols in lisp are human-readable identifiers. They are all singletons. So if you declare 'foo somewhere in your code and then use 'foo again, it will point to the same place in memory.
Sample use: different symbols can represent different pieces on a chessboard.
From Structure and Interpretation of Computer Programs Second Edition by Harold Abelson and Gerald Jay Sussman 1996:
In order to manipulate symbols we need a new element in our language: the ability to quote a data object. Suppose we want to construct the list (a b). We can’t accomplish this with (list a b), because this expression constructs a list of the values of a and b rather than the symbols themselves. This issue is well known in the context of natural languages, where words and sentences may be regarded either as semantic entities or as character strings (syntactic entities). The common practice in natural languages is to use quotation marks to indicate that a word or a sentence is to be treated literally as a string of characters. For instance, the first letter of “John” is clearly “J.” If we tell somebody “say your name aloud,” we expect to hear that person’s name. However, if we tell somebody “say ‘your name’ aloud,” we expect to hear the words “your name.” Note that we are forced to nest quotation marks to describe what somebody else might say.
We can follow this same practice to identify lists and symbols that are to be treated as data objects rather than as expressions to be evaluated. However, our format for quoting differs from that of natural languages in that we place a quotation mark (traditionally, the single quote symbol ’) only at the beginning of the object to be quoted. We can get away with this in Scheme syntax because we rely on blanks and parentheses to delimit objects. Thus, the meaning of the single quote character is to quote the next object.
Now we can distinguish between symbols and their values:
(define a 1)
(define b 2)
(list a b)
(1 2)
(list 'a 'b)
(a b)
(list 'a b)
(a 2)
Lists containing symbols can look just like the expressions of our language:
(* (+ 23 45) (+ x 9))
(define (fact n) (if (= n 1) 1 (* n (fact (- n 1)))))
A symbol is just a special name for a value. The value could be anything, but the symbol is used to refer to the same value every time, and this sort of thing is used for fast comparisons. As you say you are imperative-thinking, they are like numerical constants in C, and this is how they are usually implemented (internally stored numbers).
To illustrate the point made by Luis Casillas, it might be useful to observe how symbols eval differently than strings.
The example below is for mit-scheme (Release 10.1.10). For convenience, I use this function as eval:
(define my-eval (lambda (x) (eval x (scheme-report-environment 5))))
A symbol can easily evaluate to the value or function it names:
(define s 2) ;Value: s
(my-eval "s") ;Value: "s"
(my-eval s) ;Value: 2
(define a '+) ;Value: a
(define b "+") ;Value: b
(my-eval a) ;Value: #[arity-dispatched-procedure 12]
(my-eval b) ;Value: "+"
((my-eval a) 2 3) ;Value: 5
((my-eval b) 2 3) ;ERROR: The object "+" is not applicable.

Dates and times in Emacs Lisp

I understand emacs lisp is great for handling dates and times, but does it have a function to convert strings to internal representation of integers using formats like %Y, %m, %d, %H, %M, %S, and so on? And also, in the emacs reference manual, it says that times are lists of two or three integers, but is there a more formal specification or description? ~ Thanks ~
Edit: Thanks for the responses - but guess I was wondering if there was a function that does format-time-string in reverse (like parse-time-string but with structure specifications for the input string)?
Edit2: I guess the answer is that there is nothing built in... but a partial implementation has been implemented here.
(defun encode-time-string (string)
  (apply #'encode-time (parse-time-string string)))
The internal representation may change; I think it'd be better to use the provided documented API (encode-time, decode-time, etc) to access it.
The time is returned by most time related functions as a list of three integers. The first has the most significant 16 bits of the seconds, while the second has the least significant 16 bits. The third integer gives the microsecond count.
The microsecond count is zero on systems that do not provide resolution finer than a second.
As for the rest of your question have a look at this section of the manual in case you missed it.
If you need to parse a date (without a time), such as "July 7, 2022", then parse-time-string will return nil for the first 3 values (seconds, minutes, hours). To avoid Wrong type argument: fixnump, nil, this possibility should be checked for:
(defun encode-time-string (string)
  (let* ((dt (parse-time-string string))
         (dtt (if (car dt)
                  dt
                (append '(0 0 0) (nthcdr 3 dt)))))
    (apply #'encode-time dtt)))