Racket - recursive contracts

I was trying to make a recursive contract for my binary tree struct:
(struct node (l r))
(struct leaf (val))
(define (tree-of val)
  (or/c (struct/c leaf val) (struct/c node (tree-of val) (tree-of val))))
(define/contract (id-tree t)
  (-> (tree-of symbol?) (tree-of symbol?))
  t)
(id-tree (leaf 'a))
It seems, though, that my contract leads to an infinite loop, and I'm not sure why. First of all, shouldn't it stop after or/c receives any positive value (from (struct/c leaf val) in this case)?
Even if it checks the second predicate, (leaf 'a) is obviously not a node, so why would it recursively call tree-of again?

In some sense, there are two phases: contract computing and contract checking. Your example doesn't terminate in the contract computing phase.
Suppose you attach (or/c <a> <b>) to a value x. or/c is simply a normal function, so under call-by-value (which is what Racket has), both <a> and <b> will be computed.
If nothing goes wrong, <a> and <b> should evaluate to contract values va and vb respectively. The contract checking then starts by testing x against va. If it fails, then it tests x against vb.
The problem with your example is that the process of computing contract values doesn't even terminate. No checking has even occurred yet by that point.
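To see this in isolation, here is a minimal sketch reusing the structs from the question; merely evaluating the contract expression diverges, before any value is ever checked:
#lang racket
(struct node (l r))
(struct leaf (val))

(define (tree-of val)
  (or/c (struct/c leaf val)
        (struct/c node (tree-of val) (tree-of val))))

;; Building the node contract requires evaluating (tree-of val) twice
;; more, recursively, so the following call never returns:
;; (tree-of symbol?)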
To accomplish what you want to do, use flat-rec-contract:
(define (tree-of/c val)
  (flat-rec-contract tree-of
    (struct/c leaf val)
    (struct/c node tree-of tree-of)))
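Putting it together with the original definitions, the program now terminates; a quick sketch of the repaired example:
#lang racket
(struct node (l r))
(struct leaf (val))

(define (tree-of/c val)
  (flat-rec-contract tree-of
    (struct/c leaf val)
    (struct/c node tree-of tree-of)))

(define/contract (id-tree t)
  (-> (tree-of/c symbol?) (tree-of/c symbol?))
  t)

(id-tree (leaf 'a))                  ; => #<leaf>
(id-tree (node (leaf 'a) (leaf 'b))) ; => #<node>
Here flat-rec-contract delays the recursive references: tree-of inside the body names the contract being defined rather than re-invoking a function, so no infinite unfolding happens while the contract is being computed.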

Related

What is atom in LISP?

I would like to have a clear understanding of what an 'atom' is in LISP.
According to LispWorks, an atom is 'any object that is not a cons.'
But this definition is not clear enough for me.
For example, in the code below:
(cadr (caddar (cddddr L)))
Is 'L' an atom? On the one hand, L is not an atom, because it is cons, because it is the list (if we are talking about object, which is associated with the symbol L).
On the other hand, if we are talking about 'L' itself (not about its content, but about the symbol 'L'), it is an atom, because it is not a cons.
I've tried to call the function 'atom':
(atom L) => NIL
(atom `L) => T
but still I have no clue... Please, help!
So the final question: in the code above, 'L' is an atom, or not?
P.S. I'm asking this question due to LISP course at my university, where we have a definition of 'simple expression' - it is an expression, which is atom or function call of one or two atomic parameters. Therefore I wonder if expression (cddddr L) is simple, which depends on whether 'L' is atomic parameter or not.
Your Lisp course's private definition of "simple expression" is almost certainly rooted purely in syntax. The idea of "atomic parameter" means that it's not a compound expression. It probably has nothing to do with the run-time value!
Thus, I'm guessing, these are simple expressions:
(+ 1 2)
42
"abc"
whereas these are not:
(+ 1 (* 3 4)) ;; (* 3 4) is not an atomic parameter
(+ a b c) ;; parameters atomic, but more than two
(foo) ;; not simple: fewer than one parameter, not "one or two"
In light of the last counterexample, it would probably behoove them to revise their definition.
On the one hand, L is not an atom, because it is cons, because it is the list (if we are talking about object, which is associated with the symbol L).
You are talking here about the meaning of the code being executed, its semantics. L here stands for a value, which is a list in your tests. At runtime you can inspect values and ask about their types.
On the other hand, if we are talking about 'L' itself (not about its content, but about the symbol 'L'), it is an atom, because it is not a cons.
Here you are looking at the types of the values that make up the syntax of your code, how it is being represented before even being evaluated (or compiled). You are manipulating a tree of symbols, one of them being L. In the source code, this is a symbol. It has no meaning by itself other than being a name.
Code is data
Lisp makes it easy to represent source code using values in the language itself, and easy to manipulate fragments of code at one point to build code that is executed later. This is often called homoiconicity, though it is somewhat a touchy word because people don't always think the definition is precise enough to be useful. Another saying is "code is data", something that most language designers and programmers will agree to be true.
Lisp code can be built at runtime as follows (> is the prompt of the REPL, what follows is the result of evaluation):
> (list 'defun 'foo (list 'l) (list 'car 'l))
(DEFUN FOO (L) (CAR L))
The resulting form happens to be valid Common Lisp code, not just a generic list of values. If you evaluate it with (eval *), you will define a function named FOO that takes the first element of some list L.
NB. In Common Lisp the asterisk * is bound in the REPL to the last value being successfully returned.
Usually you don't build code like that, the Lisp reader turns a stream of characters into such a tree. For example:
> (read-from-string "(defun foo (l) (car l))")
(DEFUN FOO (L) (CAR L))
But the reader is called also implicitly in the REPL (that's the R in the acronym).
In such a tree of symbols, L is a symbol.
Evaluation model
When you call function FOO after it has been defined, you are evaluating the body of FOO in a context where L is bound to some value. And the rule for evaluating a symbol is to lookup the value it is bound to, and return that. This is the semantics of the code, which the runtime implements for you.
If you are using a simple interpreter, maybe the symbol L is present somewhere at runtime and its binding is looked up. Usually the code is not interpreted like that; it is possible to analyze it and transform it in an efficient way during compilation. The result of compilation probably does not manipulate symbols anymore here, it just manipulates CPU registers and memory.
In all cases, asking for the type of L at this point is by definition of the semantics just asking the type for whatever value is bound to L in the context it appears.
An atom is anything that is not a cons cell
Really, the definition of atom is no more complex than that. A value in the language that is not a cons-cell is called an atom. This encompasses numbers, strings, everything.
Sometimes you evaluate a tree of symbols that happens to be code, but then the same rule applies.
Simple expressions
P.S. I'm asking this question due to LISP course at my university, where we have a definition of 'simple expression' - it is an expression, which is atom or function call of one or two atomic parameters. Therefore I wonder if expression (cddddr L) is simple, which depends on whether 'L' is atomic parameter or not.
In that course you are writing functions that analyze code. You are given a Lisp value and must decide if it is a simple expression or not.
You are not interested in any particular interpretation of the value being given; at no point are you going to traverse the value, see a symbol and try to resolve it to a value: you are checking if the syntax is a valid simple expression or not.
Note also that the definitions in your course might be a bit different than the ones from any particular existing flavor (this is not necessarily Common Lisp, or Scheme, but a toy LISP dialect). Follow in priority the definitions from your course.
Imagine we have a predicate which tells us if an object is not a number:
(not-number-p 3) -> NIL
(not-number-p "string") -> T
(let ((foo "another string"))
  (not-number-p foo)) -> T
(not-number-p '(1 2 3)) -> T
(not-number-p (first '(1 2 3))) -> NIL
We can define that as:
(defun not-number-p (object)
  (not (numberp object)))
Above is just the opposite of NUMBERP.
NUMBERP -> T if object is a number
NOT-NUMBER-P -> NIL if object is a number
Now imagine we have a predicate NOT-CONS-P, which tells us if an object is not a CONS cell.
(not-cons-p '(1 . 2)) -> NIL
(let ((c '(1 . 2)))
  (not-cons-p c)) -> NIL
(not-cons-p 3) -> T
(let ((n 4))
  (not-cons-p n)) -> T
(not-cons-p NIL) -> T
(not-cons-p 'NIL) -> T
(not-cons-p 'a-symbol) -> T
(not-cons-p #\space) -> T
The function NOT-CONS-P can be defined as:
(defun not-cons-p (object)
  (if (consp object)
      NIL
      T))
Or shorter:
(defun not-cons-p (object)
  (not (consp object)))
The function NOT-CONS-P is traditionally called ATOM in Lisp.
In Common Lisp every object which is not a cons cell is called an atom. The function ATOM is a predicate.
See the Common Lisp HyperSpec: Function Atom
Your question:
(cadr (caddar (cddddr L)))
Is 'L' an atom?
How would we know that? L is a variable. What is the value of L?
(let ((L 10))
  (atom L)) -> T
(let ((L (cons 1 2)))
  (atom L)) -> NIL
(atom L) answers this question:
-> is the value of L an atom?
(atom L) does not answer this question:
-> is L an atom? L is a variable, and in a function call the value of L is passed to the function ATOM.
If you want to ask if the symbol L is an atom, then you need to quote the symbol:
(atom 'L) -> T
(atom (quote L)) -> T
Symbols are atoms. Actually everything is an atom, with the exception of cons cells.

How to convert cond statements that produces a boolean value into an expression involving only not, and and or

I am learning about racket/scheme and came across an online resource that said, if a function written using a cond gives true or false, it can be rewritten using only not, and, and or. I have worked out some simple examples where I was able to convert cond statements into statements involving only not, and, and or. My question is whether there is a way that the logic can be "seen" right away when converting between these two types of statements. I understand that it is not always practical to convert every cond statement into a combination of not's, and's, and or's, but I am interested in learning about the logic behind the process of converting.
Thanks in advance.
(If something about the question does not make sense, leave a comment and I will try clarifying what I want to understand)
All conditional expressions (and not only those evaluating to true/false) can be rewritten using only boolean combinators. This is because of how logical operators are evaluated in Scheme/Racket. For instance, logically (and a b) would be true if both a and b are true, and otherwise false. But in Racket, the result of (and a b) is b if both a and b are truthy, and otherwise false. That is, evaluation proceeds to the right until either the last argument or a falsy value is encountered. At that point, evaluation stops and that value (which could be a boolean but needn't be) is returned. It's because and and or don't simply produce boolean output that they can be used to stand in for conditional expressions.
E.g.
(if #t 'hello 'bye) ;=> hello
(or (and #t 'hello) 'bye) ;=> hello
(if #f 'hello 'bye) ;=> bye
(or (and #f 'hello) 'bye) ;=> bye
(cond [#f 'hello]
      [#f 'bye]
      [#t 'aloha]) ;=> aloha
(or (and #f 'hello)
    (and #f 'bye)
    (and #t 'aloha)) ;=> aloha
But you wouldn't usually want to use them that way since they're hard to read. As a general guideline, use if and cond in most cases, rather than elementary boolean operators. If you only care about taking action on a positive or negative result of the conditional, then you could use when or unless. If you do care about handling both positive and negative results, but one of them is a boolean result such as this example:
(if (positive? n)
    #t
    (even? n))
... then this would be a case where a boolean operator would be preferable, like so:
(or (positive? n) (even? n))
If both arms of the if conditional are boolean values, like this:
(if (> n 3)
    #t
    #f)
... then just replace the entire conditional expression with the condition itself:
(> n 3)
Otherwise, stick to if and cond.
Once you convert a cond to nested ifs, you can always turn it into and or and not like this:
(if A B C) --> (or (and A B) (and (not A) C))
However, if you do this blindly, you will get a much more complicated expression than what you could get, so I would add a couple more transformations you can use:
(if A B #f) --> (and A B)
(if A B #t) --> (or (not A) B)
(if A #f C) --> (and (not A) C)
(if A #t C) --> (or A C)
(note: the or above might return a different truthy value other than #t, making it technically different but equivalent when used as a boolean)
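For instance, memv returns a truthy sublist rather than #t, so the two forms differ in the exact value produced:
(if (memv 5 '(1 3 5)) #t 'nope) ;=> #t
(or (memv 5 '(1 3 5)) 'nope)    ;=> '(5), truthy but not #t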
Another thing I should note is that sometimes you can transform a multi-branch cond into and or not without transforming into ifs first. For example a 3-branch cond:
(cond [A B]
      [C D]
      [else E])
-->
(or (and A B)
    (and (not A) C D)
    (and (not A) (not C) E))
Or a 4-branch cond:
(cond [A B]
      [C D]
      [E F]
      [else G])
-->
(or (and A B)
    (and (not A) C D)
    (and (not A) (not C) E F)
    (and (not A) (not C) (not E) G))
Each and corresponds to a cond-branch, and each cond-branch's and has nots in it for every previous condition, in addition to its own condition.
A more generic rule you can apply, for a cond with branches [Q_1 A_1] through [Q_n A_n]:
(cond [Q_1 A_1]
      ...
      [Q_n A_n]
      [else E])
-->
(or (and Q_1 A_1)
    ...
    (and (not Q_1) ... (not Q_(i-1)) Q_i A_i)
    ...
    (and (not Q_1) ... (not Q_n) E))
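As a concrete check of the 3-branch pattern (the predicate names here are made up for illustration):
(define (small-or-even? n)
  (cond [(< n 10) #t]
        [(even? n) #t]
        [else #f]))

;; the same predicate, rewritten with the pattern above:
(define (small-or-even?* n)
  (or (and (< n 10) #t)
      (and (not (< n 10)) (even? n) #t)
      (and (not (< n 10)) (not (even? n)) #f)))

(small-or-even? 13)  ;=> #f
(small-or-even?* 13) ;=> #f
(small-or-even? 12)  ;=> #t
(small-or-even?* 12) ;=> #t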
First of all, you need to desugar the cond form into nested if expressions, which is trivial.
After that, you can rewrite the if conditionals into boolean operators. You can look into a manual of propositional logic to learn this.
Btw. It is forbidden to paste your homework on stack overflow.

racket contract dependency evaluation twice?

#lang racket

(module inside racket
  (provide
   (contract-out
    [dummy (->i ([x (lambda (x) (begin (displayln 0) #t))]
                 [y (x) (lambda (y) (begin (displayln 1) #t))]
                 [z (x y) (lambda (z) (begin (displayln 2) #t))])
                any)]))
  (define (dummy x y z) #t))

(require 'inside)

(dummy 1 2 3)
The output is
0
0
1
1
2
#t
It's unclear to me why having x and y as dependencies would require the corresponding guard to fire again.
The doc of ->i http://docs.racket-lang.org/reference/function-contracts.html#%28form._%28%28lib.racket%2Fcontract%2Fbase..rkt%29.-~3ei%29%29 doesn't seem to mention this behavior.
Anyone can shed some light on this?
This was as confusing to me as it was to you, so I took the opportunity to ask this question on the Racket mailing list. What follows is an attempt to summarize what I found.
The ->i combinator produces a dependent contract that uses the indy blame semantics presented in the paper Correct Blame for Contracts. The key idea presented in the paper is that, with dependent contracts, there can actually be three parties that might need to be blamed for contract violations.
With normal function contracts, there are two potentially guilty parties. The first is the most obvious one, which is the caller. For example:
> (define/contract (foo x)
    (integer? . -> . string?)
    (number->string x))
> (foo "hello")
foo: contract violation
  expected: integer?
  given: "hello"
  in: the 1st argument of
      (-> integer? string?)
  contract from: (function foo)
  blaming: anonymous-module
  (assuming the contract is correct)
The second potential guilty party is the function itself; that is, the implementation might not match the contract:
> (define/contract (bar x)
    (integer? . -> . string?)
    x)
> (bar 1)
bar: broke its own contract
  promised: string?
  produced: 1
  in: the range of
      (-> integer? string?)
  contract from: (function bar)
  blaming: (function bar)
  (assuming the contract is correct)
Both of these cases are pretty obvious. However, the ->i contract introduces a third potential guilty party: the contract itself.
Since ->i contracts can execute arbitrary expressions at contract attachment time, it’s possible for them to violate themselves. Consider the following contract:
(->i ([mk-ctc (integer? . -> . contract?)]
      [val (mk-ctc) (mk-ctc "hello")])
     [result any/c])
This is a somewhat silly contract, but it’s easy to see that it’s a naughty one. It promises to only call mk-ctc with integers, but the dependent expression (mk-ctc "hello") calls it with a string! It would obviously be wrong to blame the calling function, since it has no control over the invalid contract, but it might also be wrong to blame the contracted function, since the contract could be defined in complete isolation from the function it is attached to.
For an illustration of this, consider a multi-module example:
#lang racket

(module m1 racket
  (provide ctc)
  (define ctc
    (->i ([f (integer? . -> . integer?)]
          [v (f) (λ (v) (> (f v) 0))])
         [result any/c])))

(module m2 racket
  (require (submod ".." m1))
  (provide (contract-out [foo ctc]))
  (define (foo f v)
    (f #f)))
(require 'm2)
In this example, the ctc contract is defined in the m1 submodule, but the function that uses the contract is defined in a separate submodule, m2. There are two possible blame scenarios here:
The foo function is obviously invalid, since it applies f to #f, despite the contract specifying (integer? . -> . integer?) for that argument. You can see this in practice by calling foo:
> (foo add1 0)
foo: broke its own contract
  promised: integer?
  produced: #f
  in: the 1st argument of
      the f argument of
      (->i
       ((f (-> integer? integer?))
        (v (f) (λ (v) (> (f v) 0))))
       (result any/c))
  contract from: (anonymous-module m2)
  blaming: (anonymous-module m2)
  (assuming the contract is correct)
I’ve emphasized the spot in the contract error that includes blame information, and you can see that it blames m2, which makes sense. This isn’t the interesting case, since it’s the second potential blame party mentioned in the non-dependent case.
However, the ctc contract is actually a little bit wrong! Note that the contract on v applies f to v, but it never checks that v is an integer. For this reason, if v is something else, f will be called in an invalid way.1 You can see this behavior by giving a non-integral value for v:
> (foo add1 "hello")
foo: broke its own contract
  promised: integer?
  produced: "hello"
  in: the 1st argument of
      the f argument of
      (->i
       ((f (-> integer? integer?))
        (v (f) (λ (v) (> (f v) 0))))
       (result any/c))
  contract from: (anonymous-module m1)
  blaming: (anonymous-module m1)
  (assuming the contract is correct)
The top of the contract error is the same (Racket provides the same “broke its own contract” message for these kinds of contract violations), but the blame information is different! It now blames m1, which is the actual source of the contract. This is the indy blame party.
This distinction is why the contracts have to be applied twice. They are applied once with each distinct blame party's information: first with the contract blame, then with the function blame.
Technically, this could be avoided for flat contracts, since flat contracts will never signal a contract violation after the initial contract attachment process. However, the ->i combinator currently does not implement any such optimization, since it probably wouldn’t have a significant impact on performance, and the contract implementation is already fairly complex (though if someone wanted to implement it, it would likely be accepted).
In general, though, contracts are expected to be stateless and idempotent (flat contracts are expected to be simple predicates), so there’s not really any guarantee that this won’t happen, and ->i just uses that to implement its fine-grained blame information.
1. As it turns out, the ->d contract combinator doesn’t catch this issue at all, so add1 ends up raising a contract violation here. This is why ->i was created, and it’s why ->i is favored over ->d.

Type Predicates for Function Types in Typed/Racket

I'm at the early stages of designing a framework and am fooling around with typed/racket. Suppose I have the following types:
(define-type Calculate-with-one-number (-> Number Number))
(define-type Calculate-with-two-numbers (-> Number Number Number))
And I want to have a function that dispatches on type:
(: dispatcher (-> (U Calculate-with-one-number Calculate-with-two-numbers) Number))
(define (dispatcher f args)
  (cond [(Calculate-with-one-number? f)
         (do-something args)]
        [(Calculate-with-two-numbers? f)
         (do-something-else args)]
        [else 42]))
How do I create the type-predicates Calculate-with-one-number? and Calculate-with-two-numbers? in Typed/Racket? For non-function predicates I can use define-predicate. But it's not clear how to implement predicates for function types.
Since I am self answering, I'm taking the liberty to clarify the gist of my question in light of the discussion of arity as a solution. The difference in arity was due to my not considering its implications when specifying the question.
The Problem
In #lang typed/racket, as in many Lisps, functions, or more properly procedures, are first-class datatypes.
By default, #lang racket types procedures by arity and any additional specificity in argument types must be done by contract. In #lang typed/racket procedures are typed both by arity and by the types of their arguments and return values due to the language's "baked-in contracts".
Math as an example
The Typed Racket Guide provides an example using define-type to define a procedure type:
(define-type NN (-> Number Number))
This allows specifying a procedure more succinctly:
;; Takes two numbers, returns a number
(define-type 2NN (-> Number Number Number))
(: trigFunction1 2NN)
(define (trigFunction1 x s)
(* s (cos x)))
(: quadraticFunction1 2NN)
(define (quadraticFunction1 x b)
(let ((x1 x))
(+ b (* x1 x1))))
The Goal
In a domain like mathematics, it would be nice to work with more abstract procedure types because knowing that a function is cyclical between upper and lower bounds (like cos) versus having only one bound (e.g. our quadratic function) versus asymptotic (e.g. a hyperbolic function) provides for clearer reasoning about the problem domain. It would be nice to have access to abstractions something like:
(define-type Cyclic2NN (-> Number Number Number))
(define-type SingleBound2NN (-> Number Number Number))
(: trigFunction1 Cyclic2NN)
(define (trigFunction1 x s)
(* s (cos x)))
(: quadraticFunction1 SingleBound2NN)
(define (quadraticFunction1 x b)
(let ((x1 x))
(+ b (* x1 x1))))
(: playTone (-> Cyclic2NN))
(define (playTone waveform)
...)
(: rabbitsOnFarmGraph (-> SingleBound2NN))
(define (rabbitsOnFarmGraph populationSize)
...)
Alas, define-type does not deliver this level of granularity when it comes to procedures. Even moreover, the brief false hope that we might easily wring such type differentiation for procedures manually using define-predicate is dashed by:
Evaluates to a predicate for the type t, with the type (Any -> Boolean : t). t may not contain function types, or types that may refer to mutable data such as (Vectorof Integer).
Fundamentally, types have uses beyond static checking and contracts. As first class members of the language, we want to be able to dispatch our finer grained procedure types. Conceptually, what is needed are predicates along the lines of Cyclic2NN? and SingleBound2NN?. Having only arity for dispatch using case-lambda just isn't enough.
Guidance from Untyped Racket
Fortunately, Lisps are domain specific languages for writing Lisps once we peel back the curtain to reveal the wizard, and in the end we can get what we want. The key is to come at the issue the other way and ask "How can we use the predicates typed/racket gives us for procedures?"
Structures are Racket's user defined data types and are the basis for extending its type system. Structures are so powerful that even in the class based object system, "classes and objects are implemented in terms of structure types."
In #lang racket structures can be applied as procedures by giving the #:property keyword prop:procedure followed by a procedure for its value. The documentation provides two examples:
The first example specifies a field of the structure to be applied as a procedure. Obviously, at least once it has been pointed out, that field must hold a value that evaluates to a procedure.
> ;; #lang racket
> (struct annotated-proc (base note)
    #:property prop:procedure
    (struct-field-index base))
> (define plus1 (annotated-proc
                 (lambda (x) (+ x 1))
                 "adds 1 to its argument"))
> (procedure? plus1)
#t
> (annotated-proc? plus1)
#t
> (plus1 10)
11
> (annotated-proc-note plus1)
"adds 1 to its argument"
In the second example an anonymous procedure [lambda] is provided directly as part of the property value. The lambda takes an operand in the first position which is resolved to the value of the structure being used as a procedure. This allows accessing any value stored in any field of the structure including those which evaluate to procedures.
> ;; #lang racket
> (struct greeter (name)
    #:property prop:procedure
    (lambda (self other)
      (string-append
       "Hi " other
       ", I'm " (greeter-name self))))
> (define joe-greet (greeter "Joe"))
> (greeter-name joe-greet)
"Joe"
> (joe-greet "Mary")
"Hi Mary, I'm Joe"
> (joe-greet "John")
"Hi John, I'm Joe
Applying it to typed/racket
Alas, neither syntax works with struct as implemented in typed/racket. The problem, it seems, is that the static type checker as currently implemented cannot both define the structure and resolve its signature as a procedure at the same time. The right information does not appear to be available at the right phase when using typed/racket's struct special form.
To get around this, typed/racket provides define-struct/exec which roughly corresponds to the second syntactic form from #lang racket less the keyword argument and property definition:
(define-struct/exec name-spec ([f : t] ...) [e : proc-t])

  name-spec = name
            | (name parent)

Like define-struct, but defines a procedural structure. The procedure e is used as the value for prop:procedure, and must have type proc-t.
Not only does it give us strongly typed procedural forms, it's a bit more elegant than the keyword syntax found in #lang racket. Example code to resolve the question as restated here in this answer is:
#lang typed/racket

(define-type 2NN (-> Number Number Number))

(define-struct/exec Cyclic2NN
  ((f : 2NN))
  ((lambda (self x s)
     ((Cyclic2NN-f self) x s))
   : (-> Cyclic2NN Number Number Number)))

(define-struct/exec SingleBound2NN
  ((f : 2NN))
  ((lambda (self x s)
     ((SingleBound2NN-f self) x s))
   : (-> SingleBound2NN Number Number Number)))

(define trigFunction1
  (Cyclic2NN
   (lambda (x s)
     (* s (cos x)))))

(define quadraticFunction1
  (SingleBound2NN
   (lambda (x b)
     (let ((x1 x))
       (+ b (* x1 x1))))))
The defined procedures are strongly typed in the sense that:
> (SingleBound2NN? trigFunction1)
- : Boolean
#f
> (SingleBound2NN? quadraticFunction1)
- : Boolean
#t
All that remains is writing a macro to simplify specification.
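As a starting point, here is a minimal sketch of such a macro. The name define-tagged-2NN and the format-id approach are my own, not from the original answer, and it assumes the accessor generated by define-struct/exec picks up the lexical context of the struct name, so hygiene details may need adjustment:
#lang typed/racket
(require (for-syntax racket/base racket/syntax))

(define-type 2NN (-> Number Number Number))

;; Expands into a define-struct/exec of the same shape as the ones
;; written out by hand above.
(define-syntax (define-tagged-2NN stx)
  (syntax-case stx ()
    [(_ Name)
     (with-syntax ([Name-f (format-id #'Name "~a-f" #'Name)])
       #'(define-struct/exec Name
           ((f : 2NN))
           ((lambda (self x s) ((Name-f self) x s))
            : (-> Name Number Number Number))))]))

(define-tagged-2NN Cyclic2NN)
(define-tagged-2NN SingleBound2NN)

(define trigFunction1 (Cyclic2NN (lambda (x s) (* s (cos x)))))
(SingleBound2NN? trigFunction1) ; => #f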
In the general case, what you want is impossible due to how types are implemented in Racket. Racket has contracts, which are run-time wrappers that guard parts of a program from other parts. A function contract is a wrapper that treats the function as a black box - a contract of the form (-> number? number?) can wrap any function and the new wrapper function first checks that it receives one number? and then passes it to the wrapped function, then checks that the wrapped function returns a number?. This is all done dynamically, every single time the function is called. Typed Racket adds a notion of types that are statically checked, but since it can provide and require values to and from untyped modules, those values are guarded with contracts that represent their type.
In your function dispatcher, you accept a function f dynamically, at run time and then want to do something based on what kind of function you got. But functions are black boxes - contracts don't actually know anything about the functions they wrap, they just check that they behave properly. There's no way to tell if dispatcher was given a function of the form (-> number? number?) or a function of the form (-> string? string?). Since dispatcher can accept any possible function, the functions are black boxes with no information about what they accept or promise. dispatcher can only assume the function is correct with a contract and try to use it. This is also why define-type doesn't make a predicate automatically for function types - there's no way to prove a function has the type dynamically, you can only wrap it in a contract and assume it behaves.
The exception to this is arity information - all functions know how many arguments they accept. The procedure-arity function will give you this information. So while you can't dispatch on function types at run-time in general, you can dispatch on function arity. This is what case-lambda does - it makes a function that dispatches based on the number of arguments it receives:
(: dispatcher (case-> [-> Calculate-with-one-number Number Void]
                      [-> Calculate-with-two-numbers Number Number Void]))
(define dispatcher
  (case-lambda
    [([f : Calculate-with-one-number]
      [arg : Number])
     (do-something arg)]
    [([f : Calculate-with-two-numbers]
      [arg1 : Number]
      [arg2 : Number])
     (do-something-else arg1 arg2)]))
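With hypothetical stand-ins for do-something and do-something-else (names made up here), the arity-based dispatch can be exercised like this, as a self-contained sketch:
#lang typed/racket

(define-type Calculate-with-one-number (-> Number Number))
(define-type Calculate-with-two-numbers (-> Number Number Number))

(: do-something (-> Number Void))
(define (do-something arg) (displayln (list 'one-arg arg)))

(: do-something-else (-> Number Number Void))
(define (do-something-else a b) (displayln (list 'two-args a b)))

(: dispatcher (case-> [-> Calculate-with-one-number Number Void]
                      [-> Calculate-with-two-numbers Number Number Void]))
(define dispatcher
  (case-lambda
    [([f : Calculate-with-one-number] [arg : Number])
     (do-something arg)]
    [([f : Calculate-with-two-numbers] [arg1 : Number] [arg2 : Number])
     (do-something-else arg1 arg2)]))

(dispatcher add1 1) ; prints (one-arg 1)
(dispatcher + 1 2)  ; prints (two-args 1 2)
Note that the dispatch here is on the number of arguments passed to dispatcher itself, not on any run-time inspection of f.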

scheme continuations for dummies

For the life of me, I can't understand continuations. I think the problem stems from the fact that I don't understand what they are for. All the examples that I've found in books or online are very trivial. They make me wonder, why would anyone even want continuations?
Here's a typical impractical example, from TSPL, which I believe is a well-recognized book on the subject. In English, they describe the continuation as "what to do" with the result of a computation. OK, that's sort of understandable.
Then, the second example given:
(call/cc
 (lambda (k)
   (* 5 (k 4)))) => 4
How does this make any sense?? k isn't even defined! How can this code be evaluated, when (k 4) can't even be computed? Not to mention, how does call/cc know to rip out the argument 4 to the inner most expression and return it? What happens to (* 5 .. ?? If this outermost expression is discarded, why even write it?
Then, a "less" trivial example stated is how to use call/cc to provide a nonlocal exit from a recursion. That sounds like a flow-control directive, i.e. like break/return in an imperative language, and not a computation.
And what is the purpose of going through these motions? If somebody needs the result of a computation, why not just store it and recall it later, as needed?
Forget about call/cc for a moment. Every expression/statement, in any programming language, has a continuation - which is, what you do with the result. In C, for example,
x = (1 + (2 * 3));
printf ("Done");
has the continuation of the math assignment being printf(...); the continuation of (2 * 3) is 'add 1; assign to x; printf(...)'. Conceptually the continuation is there whether or not you have access to it. Think for a moment what information you need for the continuation - the information is 1) the heap memory state (in general), 2) the stack, 3) any registers and 4) the program counter.
So continuations exist but usually they are only implicit and can't be accessed.
In Scheme, and a few other languages, you have access to the continuation. Essentially, behind your back, the compiler+runtime bundles up all the information needed for a continuation, stores it (generally in the heap) and gives you a handle to it. The handle you get is the function 'k' - if you call that function you will continue exactly after the call/cc point. Importantly, you can call that function multiple times and you will always continue after the call/cc point.
Let's look at some examples:
> (+ 2 (call/cc (lambda (cont) 3)))
5
In the above, the result of call/cc is the result of the lambda which is 3. The continuation wasn't invoked.
Now let's invoke the continuation:
> (+ 2 (call/cc (lambda (cont) (cont 10) 3)))
12
By invoking the continuation we skip anything after the invocation and continue right at the call/cc point. With (cont 10) the continuation returns 10 which is added to 2 for 12.
Now let's save the continuation.
> (define add-2 #f)
> (+ 2 (call/cc (lambda (cont) (set! add-2 cont) 3)))
5
> (add-2 10)
12
> (add-2 100)
102
By saving the continuation we can use it as we please to 'jump back to' whatever computation followed the call/cc point.
Often continuations are used for a non-local exit. Think of a function that is going to return a list unless there is some problem at which point '() will be returned.
(define (hairy-list-function list)
  (call/cc
   (lambda (cont)
     ;; process the list ...
     (when (a-problem-arises? ...)
       (cont '()))
     ;; continue processing the list ...
     value-to-return)))
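Here is a small, runnable instance of that pattern in Racket (list-product is a made-up name): multiplying the elements of a list, bailing out with 0 as soon as a 0 is seen, skipping all the pending multiplications.
#lang racket
(define (list-product xs)
  (call/cc
   (lambda (exit)
     (let loop ([xs xs])
       (cond [(null? xs) 1]
             [(zero? (car xs)) (exit 0)] ; non-local exit
             [else (* (car xs) (loop (cdr xs)))])))))

(list-product '(1 2 3 4)) ; => 24
(list-product '(1 0 3 4)) ; => 0, without performing any multiplication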
Here is text from my class notes: http://tmp.barzilay.org/cont.txt. It is based on a number of sources, and is much extended. It has motivations, basic explanations, more advanced explanations for how it's done, and a good number of examples that go from simple to advanced, and even some quick discussion of delimited continuations.
(I tried to play with putting the whole text here, but as I expected, 120k of text is not something that makes SO happy.)
TL;DR: continuations are just captured GOTOs, with values, more or less.
The exampe you ask about,
(call/cc
 (lambda (k)
   ;;;;;;;;;;;;;;;;
   (* 5 (k 4))   ;; body of code
   ;;;;;;;;;;;;;;;;
   )) => 4
can be approximately translated into e.g. Common Lisp, as
(prog (k retval)
   (setq k (lambda (x)       ;; capture the current continuation:
             (setq retval x) ;;   set! the return value
             (go EXIT)))     ;;   and jump to exit point
   (setq retval              ;; get the value of the last expression,
         (progn              ;; as usual, in the
           ;;;;;;;;;;;;;;;;
           (* 5 (funcall k 4)) ;; body of code
           ;;;;;;;;;;;;;;;;
           ))
 EXIT                        ;; the goto label
   (return retval))
This is just an illustration; in Common Lisp we can't jump back into the PROG tagbody after we've exited it the first time. But in Scheme, with real continuations, we can. If we set some global variable inside the body of function called by call/cc, say (setq qq k), in Scheme we can call it at any later time, from anywhere, re-entering into the same context (e.g. (qq 42)).
The point is, the body of the call/cc form may contain an if or a cond expression. It can call the continuation only in some cases, and in others return normally, evaluating all expressions in the body of code and returning the last one's value, as usual. There can be deep recursion going on there. By calling the captured continuation an immediate exit is achieved.
So we see here that k is defined. It is defined by the call/cc call. When (call/cc g) is called, it calls its argument with the current continuation: (g the-current-continuation). the current-continuation is an "escape procedure" pointing at the return point of the call/cc form. To call it means to supply a value as if it were returned by the call/cc form itself.
So the above results in
((lambda (k) (* 5 (k 4))) the-current-continuation) ==>
(* 5 (the-current-continuation 4)) ==>
;; to call the-current-continuation means to return the value from
;; the call/cc form, so, jump to the return point, and return the value:
4
I won't try to explain all the places where continuations can be useful, but I hope that I can give brief examples of the main places where I have found continuations useful in my own experience. Rather than speaking about Scheme's call/cc, I'd focus attention on continuation passing style. In some programming languages, variables can be dynamically scoped, and in languages without dynamic scoping, boilerplate with global variables (assuming that there are no issues of multi-threaded code, etc.) can be used. For instance, suppose there is a list of currently active logging streams, *logging-streams*, and that we want to call function in a dynamic environment where *logging-streams* is augmented with logging-stream-x. In Common Lisp we can do
(let ((*logging-streams* (cons logging-stream-x *logging-streams*)))
  (funcall function))
If we don't have dynamically scoped variables, as in Scheme, we can still do
(let ((old-streams *logging-streams*))
  (set! *logging-streams* (cons logging-stream-x *logging-streams*))
  (let ((result (function)))
    (set! *logging-streams* old-streams)
    result))
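As an aside, Racket has this mechanism built in as parameters, so the save/rebind/restore dance above can be written directly. A small sketch, where logging-streams is a made-up parameter rather than the CL special variable:
#lang racket
(define logging-streams (make-parameter '()))

(define (show-streams)
  (displayln (logging-streams)))

;; parameterize installs the new value for the dynamic extent of its body:
(parameterize ([logging-streams (cons 'stream-x (logging-streams))])
  (show-streams)) ; prints (stream-x)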
Now let's assume that we're actually given a cons-tree whose non-nil leaves are logging-streams, all of which should be in *logging-streams* when function is called. We've got two options:
We can flatten the tree, collect all the logging streams, extend *logging-streams*, and then call function.
We can, using continuation passing style, traverse the tree, gradually extending *logging-streams*, finally calling function when there is no more tree to traverse.
Option 2 looks something like
(defparameter *logging-streams* '())

(defun extend-streams (stream-tree continuation)
  (cond
    ;; a null leaf
    ((null stream-tree)
     (funcall continuation))
    ;; a non-null leaf
    ((atom stream-tree)
     (let ((*logging-streams* (cons stream-tree *logging-streams*)))
       (funcall continuation)))
    ;; a cons cell
    (t
     (extend-streams (car stream-tree)
                     #'(lambda ()
                         (extend-streams (cdr stream-tree)
                                         continuation))))))
With this definition, we have
CL-USER> (extend-streams '((a b) (c (d e)))
                         #'(lambda ()
                             (print *logging-streams*)))
=> (E D C B A)
Now, was there anything useful about this? In this case, probably not. Some minor benefits might be that extend-streams is tail-recursive, so we don't have a lot of stack usage, though the intermediate closures make up for it in heap space. We do have the fact that the eventual continuation is executed in the dynamic scope of any intermediate stuff that extend-streams set up. In this case, that's not all that important, but in other cases it can be.
Being able to abstract away some of the control flow, and to have non-local exits, or to be able to pick up a computation somewhere from a while back, can be very handy. This can be useful in backtracking search, for instance. Here's a continuation passing style propositional calculus solver for formulas where a formula is a symbol (a propositional literal), or a list of the form (not formula), (and left right), or (or left right).
(defun fail ()
  '(() () fail))

(defun satisfy (formula
                &optional
                (positives '())
                (negatives '())
                (succeed #'(lambda (ps ns retry) `(,ps ,ns ,retry)))
                (retry 'fail))
  ;; succeed is a function of three arguments: a list of positive
  ;; literals, a list of negative literals, and a retry function.
  ;; retry is a function of zero arguments, and is used to `try again'
  ;; from the last place that a choice was made.
  (if (symbolp formula)
      (if (member formula negatives)
          (funcall retry)
          (funcall succeed (adjoin formula positives) negatives retry))
      (destructuring-bind (op left &optional right) formula
        (case op
          ((not)
           (satisfy left negatives positives
                    #'(lambda (negatives positives retry)
                        (funcall succeed positives negatives retry))
                    retry))
          ((and)
           (satisfy left positives negatives
                    #'(lambda (positives negatives retry)
                        (satisfy right positives negatives succeed retry))
                    retry))
          ((or)
           (satisfy left positives negatives
                    succeed
                    #'(lambda ()
                        (satisfy right positives negatives
                                 succeed retry))))))))
If a satisfying assignment is found, then succeed is called with three arguments: the list of positive literals, the list of negative literals, and function that can retry the search (i.e., attempt to find another solution). For instance:
CL-USER> (satisfy '(and p (not p)))
(NIL NIL FAIL)
CL-USER> (satisfy '(or p q))
((P) NIL #<CLOSURE (LAMBDA #) {1002B99469}>)
CL-USER> (satisfy '(and (or p q) (and (not p) r)))
((R Q) (P) FAIL)
The second case is interesting, in that the third result is not FAIL, but some callable function that will try to find another solution. In this case, we can see that (or p q) is satisfiable by making either p or q true:
CL-USER> (destructuring-bind (ps ns retry) (satisfy '(or p q))
           (declare (ignore ps ns))
           (funcall retry))
((Q) NIL FAIL)
That would have been very difficult to do if we weren't using a continuation passing style where we can save the alternative flow and come back to it later. Using this, we can do some clever things, like collect all the satisfying assignments:
(defun satisfy-all (formula &aux (assignments '()) retry)
  (setf retry #'(lambda ()
                  (satisfy formula '() '()
                           #'(lambda (ps ns new-retry)
                               (push (list ps ns) assignments)
                               (setf retry new-retry))
                           'fail)))
  (loop while (not (eq retry 'fail))
        do (funcall retry)
        finally (return assignments)))
CL-USER> (satisfy-all '(or p (or (and q (not r)) (or r s))))
(((S) NIL)  ; make S true
 ((R) NIL)  ; make R true
 ((Q) (R))  ; make Q true and R false
 ((P) NIL)) ; make P true
We could change the loop a bit and get just n assignments, up to some n, or variations on that theme. Oftentimes continuation passing style is not needed, or can make code hard to maintain and understand, but in the cases where it is useful, it can make some otherwise very difficult things fairly easy.