Binding a self reference via macros - racket

The project I am working on defines some complex structures that receive messages and run within their own thread. The structures are user-defined and transformed via macros to threads and runtime stuff. Roughly speaking we can say that a complex structure consists of some behaviour that implements the logic, and a procedure for spawning an instance of the behaviour. In the code below I have vastly simplified the situation, where a behaviour defined by the create-thread-behaviour macro is a simple thunk that can be spawned via the spawn macro. I'd like to implement the ability for (an instance of) a behaviour to send messages to itself via a self parameter that would be bound to (current-thread) (~ the thread that is running the behaviour).
I've tried to rig something up using syntax-parameterize but for some reason cannot get it to work. The code below implements a simple application that should clarify what I want to achieve - the special point of interest is the (unimplemented) <self> reference towards the bottom.
#lang racket
(require (for-syntax syntax/parse))
(define-syntax (create-thread-behaviour stx)
(syntax-parse stx
[(_ body:expr ...+)
#'(λ () body ...)]))
(define-syntax (spawn stx)
(syntax-parse stx
[(_ behaviour:id)
#'(thread behaviour)]))
(define behaviour
(create-thread-behaviour
(let loop ()
(define message (thread-receive))
(printf "message: ~a~n" message)
(thread-send <self> "And this is crazy.")
(loop))))
(define instance (spawn behaviour))
(thread-send instance "Hey I just met you")
So the thing with syntax parameters that I tried is the following, which raises the self-defined "can only be used in a behaviour" error. I know I have correctly used syntax parameters before, but perhaps I have just been looking at the problem for too long.
(require racket/stxparam)
(define-syntax-parameter self
(lambda (stx) (raise-syntax-error (syntax-e stx) "can only be used in a behaviour")))
(define-syntax (spawn stx)
(syntax-parse stx
[(_ behaviour:id)
#'(thread
(lambda ()
(syntax-parameterize ([self #'(current-thread)])
(behaviour))))]))

You’re right that syntax parameters seem like the right tool for the job here. However, two separate problems with how you use them are causing the error. Let’s take them one at a time.
First of all, syntax parameters are semantically just syntax transformers, as you can see by your initial use of define-syntax-parameter, which binds the syntax parameter to a function. Your use of syntax-parameterize, in contrast, binds the syntax parameter to a piece of syntax, which is wrong. Instead, you need to bind it to a syntax transformer as well.
An easy way to achieve the behavior you’re looking for is to use the make-variable-like-transformer function from syntax/transformer, which makes a syntax transformer that, as the name would imply, behaves like a variable. More generally, though, it actually produces a transformer that behaves like an expression, which (current-thread) is. For that reason, your use of syntax-parameterize should actually look like this:
(require (for-syntax syntax/transformer))
(syntax-parameterize ([self (make-variable-like-transformer #'(current-thread))])
(behaviour))
This will avoid the “bad syntax” errors when attempting to use self after it has been parameterized.
However, there’s another problem in your code, which is that it appears to use a syntax parameter like a normal, non-syntax parameter, when syntax parameters don’t work like that. Normal parameters are effectively dynamically scoped, so if self were a normal parameter, using parameterize to wrap (behaviour) would adjust self within the dynamic extent of the call to behaviour.
Syntax parameters, however, don’t work like that. In fact, they can’t: Racket is syntactically a lexically scoped language, so you really can’t have a dynamic syntax binding. All syntax transformers are expanded at compile time, so adjusting a binding during the dynamic extent of a call is impossible. Syntax parameters are entirely lexically scoped; they simply hygienically adjust a binding within a particular scope. In that sense, they are really just like let, except that they adjust an existing binding rather than produce a new one.
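To make the difference concrete, here is a minimal sketch that contrasts the two kinds of parameter. The names who, greet, current-who and greet/dynamic are made up for this illustration and are not part of the question's code.

```racket
#lang racket
(require racket/stxparam)

;; Hypothetical syntax parameter, used only for this illustration.
(define-syntax-parameter who (lambda (stx) #'"nobody"))

;; This use of `who` is expanded right here, outside any syntax-parameterize,
;; so it is fixed to the default transformer at compile time.
(define (greet) who)

(define lexical-result
  (syntax-parameterize ([who (lambda (stx) #'"inside")])
    ;; Only the `who` written textually inside this form is affected.
    (list who (greet))))

;; An ordinary run-time parameter, for comparison.
(define current-who (make-parameter "nobody"))
(define (greet/dynamic) (current-who))

(define dynamic-result
  (parameterize ([current-who "inside"])
    ;; The adjustment is visible during the whole dynamic extent of the body.
    (list (current-who) (greet/dynamic))))
```

Here lexical-result is '("inside" "nobody"), because the call to greet runs code that was expanded outside the syntax-parameterize form, while dynamic-result is '("inside" "inside").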
With this consideration in mind, it becomes clear that putting the syntax-parameterize form in spawn can’t really work, because behaviour is lexically defined outside of spawn. You could just move the use of syntax-parameterize to create-thread-behaviour, but now there’s another problem, which is that this wouldn’t work:
(define (behavior-impl)
(define message (thread-receive))
(printf "message: ~a~n" message)
(thread-send self "And this is crazy.")
(behavior-impl))
(define behaviour
  (create-thread-behaviour
    (behavior-impl)))
Now, once again, self is used outside of the lexical extent of syntax-parameterize, so it won’t be bound.
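For the case that does work, where self is written textually inside the body handed to the macro, a minimal sketch of moving syntax-parameterize into create-thread-behaviour could look like the following. The remaining counter is an assumption added only so that the example terminates, and spawn is replaced by a direct call to thread to keep it short.

```racket
#lang racket
(require racket/stxparam
         (for-syntax syntax/parse syntax/transformer))

(define-syntax-parameter self
  (lambda (stx)
    (raise-syntax-error #f "can only be used in a behaviour" stx)))

(define-syntax (create-thread-behaviour stx)
  (syntax-parse stx
    [(_ body:expr ...+)
     ;; The parameterization wraps the body itself, so every `self`
     ;; written inside the body is within its lexical extent.
     #'(λ ()
         (syntax-parameterize
             ([self (make-variable-like-transformer #'(current-thread))])
           body ...))]))

(define behaviour
  (create-thread-behaviour
   ;; `remaining` is only here to stop after two messages.
   (let loop ([remaining 2])
     (define message (thread-receive))
     (printf "message: ~a~n" message)
     (when (> remaining 1)
       (thread-send self "And this is crazy.")
       (loop (sub1 remaining))))))

(define instance (thread behaviour))
(thread-send instance "Hey I just met you")
(thread-wait instance)
```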
You’ve mentioned that this is a simplified example of what you’re actually doing, so maybe your real example requires a more complicated solution. If so, you may just need to require that self is only used within the lexical extent of create-thread-behaviour. However, your current use of self is remarkably simple, and in fact, it never changes: it’s always (current-thread). For that reason, you could actually just ditch syntax parameters entirely and define self directly:
(define-syntax self (make-variable-like-transformer #'(current-thread)))
Now self will work everywhere as a variable-looking reference to the result of calling current-thread. This might be what you actually want, since the value of self is then determined at run time by whichever thread evaluates the reference (no syntax parameters are involved), but it still looks like a variable instead of a function call.
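As a rough sketch of that last variant, the following program uses self inside an ordinary function that is defined outside of any behaviour macro. The function name reply-once is made up for the illustration.

```racket
#lang racket
(require (for-syntax syntax/transformer))

;; `self` expands to (current-thread) wherever it is used as an expression.
(define-syntax self (make-variable-like-transformer #'(current-thread)))

;; An ordinary function, not wrapped in any macro; `self` still works,
;; because (current-thread) is evaluated by whichever thread runs this code.
(define (reply-once)
  (define message (thread-receive))
  (printf "message: ~a~n" message)
  (thread-send self "And this is crazy.")
  (printf "message: ~a~n" (thread-receive)))

(define instance (thread reply-once))
(thread-send instance "Hey I just met you")
(thread-wait instance)
```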

Related

Dynamically binding free identifiers in a function body

Consider a very simple actor language where an actor defines some local state and some methods that can be invoked by sending messages to the actor. In its implementation, one such method of the actor can be transformed into a function that defines the formal parameters of the method and accepts the current local state of the actor. Calling the method returns the new local state.
Binding the formal parameters in the body is no problem, but binding the local state seems to be more difficult. In the example at the end of the code below, in the body of the save method the a will remain unbound, despite an a (a different a) being bound by the generated evaluate-body function in the METHOD macro. The critical point in the code sample below is thus the METHOD macro, more specifically the evaluate-body function (which is where binding should happen, implying that my program design is reasonable)
Is there a way to hygienically bind this arbitrary set of free identifiers (currently only containing a, but it may be anything, really)?
#lang racket
(require (for-syntax syntax/parse))
(require racket/stxparam)
(struct actor (local-state methods))
(struct method (name formal-parameters body))
(define-syntax-parameter local-state-variables #f)
(define-syntax (ACTOR stx)
(syntax-parse stx
[(_ (LOCAL_STATE state-variable ...) method:expr ...+)
#'(syntax-parameterize ([local-state-variables '(state-variable ...)])
; For the sake of simplicity, an actor is currently a list of message handlers
(actor
(make-list (length '(state-variable ...)) (void))
(list method ...)))]))
(define-syntax (METHOD stx)
(syntax-parse stx
[(_ (name:id formal-parameter:id ...) body:expr ...+)
(with-syntax ([(local-state-variable ...) (syntax-parameter-value #'local-state-variables)])
#'(method
'name
'(formal-parameter ...)
(λ (formal-parameter ... #:local-state [current-state '()])
; the "a" that will be bound here is different from the free identifier "a" in the body
(define (evaluate-body local-state-variable ...)
body ...
(list local-state-variable ...))
(apply evaluate-body current-state))))]))
(ACTOR (LOCAL_STATE a)
(METHOD (save new-a)
; "a" is an unbound identifier
(set! a new-a)))
In order for the local state variables to have the proper lexical context, you need to store them as identifiers, not symbols. That is, in the result of the ACTOR macro, you need to change the syntax-parameterize to this:
#'(syntax-parameterize ([local-state-variables #'(state-variable ...)])
#| rest of the template (unchanged)... |#)
Note the replacement of quote/' with syntax/#'. This will store the identifiers with their lexical context instead of as symbols.
The next step is to properly introduce them within the METHOD macro. To do this, you just need to apply syntax-local-introduce to the value of the syntax parameter, which will add the macro introduction scope to the identifiers. You can also replace with-syntax with syntax-parse’s #:with clause to simplify things slightly, so the overall macro becomes this:
(define-syntax (METHOD stx)
(syntax-parse stx
[(_ (name:id formal-parameter:id ...) body:expr ...+)
#:with (local-state-variable ...)
(syntax-local-introduce (syntax-parameter-value #'local-state-variables))
#'(method #| rest of the template (unchanged)... |#)]))
This will work.
The reason syntax-local-introduce is needed here might be a little confusing, but the most intuitive way to think about it is by considering the “sets of scopes” hygiene model that Racket currently uses. In order for macro-introduced bindings to not conflict with user-defined bindings, each piece of syntax returned by a syntax transformer has a fresh scope attached to it, a scope that will never be attached to anything written by the user. Of course, some of the syntax in the result is syntax provided by the user, so the macroexpander needs to ensure it doesn’t attach the fresh scope to those syntax objects.
It’s not possible, in general, to figure out which syntax objects should be considered provided by the user since macro authors can “bend” hygiene and create new syntax objects from other ones. The solution, fortunately, is simple and elegant: just attach the macro introduction scope to all syntax objects provided by the user before handing them off to the macro, then flip the scopes on all pieces of syntax in the result. This way, the user-provided syntax objects will not have the macro introduction scope after the flipping occurs.
The syntax-local-introduce function lets you flip this special scope manually. In this case, since the value of local-state-variables should be treated like an input to the macro, but it isn’t automatically given the macro introduction scope by the macroexpander (since it isn’t a direct input to the macro), you have to add the scope yourself. That way, the macroexpander will remove the scope after the macro is expanded, and the identifier will end up with the proper lexical context.
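Putting the two changes together with the unchanged parts of the question's code gives roughly the following program. The names the-actor and save-method and the final call are assumptions added only to exercise the generated method; they are not part of the original design.

```racket
#lang racket
(require (for-syntax syntax/parse)
         racket/stxparam)

(struct actor (local-state methods))
(struct method (name formal-parameters body))

(define-syntax-parameter local-state-variables #f)

(define-syntax (ACTOR stx)
  (syntax-parse stx
    [(_ (LOCAL_STATE state-variable ...) method:expr ...+)
     ;; Store identifiers (with lexical context), not symbols.
     #'(syntax-parameterize ([local-state-variables #'(state-variable ...)])
         (actor
          (make-list (length '(state-variable ...)) (void))
          (list method ...)))]))

(define-syntax (METHOD stx)
  (syntax-parse stx
    [(_ (name:id formal-parameter:id ...) body:expr ...+)
     ;; Treat the stored identifiers like macro input by flipping the
     ;; introduction scope on them manually.
     #:with (local-state-variable ...)
     (syntax-local-introduce (syntax-parameter-value #'local-state-variables))
     #'(method
        'name
        '(formal-parameter ...)
        (λ (formal-parameter ... #:local-state [current-state '()])
          (define (evaluate-body local-state-variable ...)
            body ...
            (list local-state-variable ...))
          (apply evaluate-body current-state)))]))

(define the-actor
  (ACTOR (LOCAL_STATE a)
         (METHOD (save new-a)
                 (set! a new-a))))

;; Hypothetical driver code: invoke the first method by hand.
(define save-method (first (actor-methods the-actor)))
(define new-state
  ((method-body save-method) 42 #:local-state (actor-local-state the-actor)))
```

If the scopes line up as described, new-state should be the list '(42), since the a assigned in the method body is the same a that evaluate-body binds.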

How to call other macros from a Chicken Scheme macro?

I'm trying to move from Common Lisp to Chicken Scheme, and having plenty of problems.
My current problem is this: How can I write a macro (presumably using define-syntax?) that calls other macros?
For example, in Common Lisp I could do something like this:
(defmacro append-to (var value)
`(setf ,var (append ,var ,value)))
(defmacro something-else ()
  (let ((values (list)))
    (append-to values '(1))))
Whereas in Scheme, the equivalent code doesn't work:
(define-syntax append-to
(syntax-rules ()
((_ var value)
(set! var (append var value)))))
(define-syntax something-else
(syntax-rules ()
((_)
(let ((values (list)))
(append-to values '(1))))))
The append-to macro cannot be called from the something-else macro. I get an error saying the append-to "variable" is undefined.
According to all the information I've managed to glean from Google and other sources, macros are evaluated in a closed environment without access to other code. Essentially, nothing else exists - except built-in Scheme functions and macros - when the macro is evaluated. I have tried using er-macro-transformer, syntax-case (which is now deprecated in Chicken anyway) and even the procedural-macros module.
Surely the entire purpose of macros is that they are built upon other macros, to avoid repeating code. If macros must be written in isolation, they're pretty much useless, to my mind.
I have investigated other Scheme implementations, and had no more luck. Seems it simply cannot be done.
Can someone help me with this, please?
It looks like you're confusing expansion-time with run-time. The syntax-rules example you give will expand to the let+set, which means the append will happen at runtime.
syntax-rules simply rewrites input to given output, expanding macros until there's nothing more to expand. If you want to actually perform some computation at expansion time, the only way to do that is with a procedural macro (this is also what happens in your defmacro CL example).
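To illustrate the pure rewriting case, here is a small sketch written in Racket rather than CHICKEN (so it is a stand-in, and CHICKEN's behaviour may differ in details). The variable is named vals instead of values to avoid shadowing the built-in values procedure.

```racket
#lang racket

(define-syntax append-to
  (syntax-rules ()
    [(_ var value)
     (set! var (append var value))]))

;; The template simply contains a use of append-to. The expander rewrites
;; something-else first, then rewrites the append-to use it finds in the
;; result. The append itself happens at run time.
(define-syntax something-else
  (syntax-rules ()
    [(_)
     (let ([vals (list)])
       (append-to vals '(1))
       vals)]))

(define result (something-else))
```

Here result is '(1). No computation was performed during expansion; the macros only produced a let containing a set!.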
In Scheme, evaluation levels are strictly separated (this makes separate compilation possible), so a procedure can use macros, but the macros themselves can't use the procedures (or macros) defined in the same piece of code. You can load procedures and macros from a module for use in procedural macros by using use-for-syntax. There's limited support for defining things to run at syntax expansion time by wrapping them in begin-for-syntax.
See for example this SO question or this discussion on the ikarus-users mailing list. Matthew Flatt's paper "Composable and Compilable Macros" explains the theory behind this in more detail.
The "phase separation" thinking is relatively new in the Scheme world (note that the Flatt paper is from 2002), so you'll find quite a few people in the Scheme community who are still a bit confused about it. The reason it's "new" (even though Scheme has had macros for a long long time) is that procedural macros have only become part of the standard since R6RS (and reverted in R7RS because syntax-case is rather controversial), so the need to rigidly specify them hasn't been an issue until now. For more "traditional" Lispy implementations of Scheme, where compile-time and run-time are all mashed together, this was never an issue; you can just run code whenever.
To get back to your example, it works fine if you separate the phases correctly:
(begin-for-syntax
(define-syntax append-to
(ir-macro-transformer
(lambda (e i c)
(let ((var (cadr e))
(val (caddr e)))
`(set! ,var (append ,var ,val)))))) )
(define-syntax something-else
(ir-macro-transformer
(lambda (e i c)
(let ((vals (list 'print)))
(append-to vals '(1))
vals))))
(something-else) ; Expands to (print 1)
If you put the definition of append-to in a module of its own, and you use-for-syntax it, that should work as well. This will also allow you to use the same module both in the macros you define in a body of code as well as in the procedures, by simply requiring it both in a use and a use-for-syntax expression.

Can runtime information be used during macro expansion in racket?

Say I have a hash table at runtime that has strings as keys. Can a macro have access to this information and build a let expression from it?
(define env (hash 'a 123 'b 321))
(magic-let env (+ a b)) ; 444
I know I can hack around with identifier-binding by replacing non-defined identifiers with a lookup in the hash table, but then shadowing will not work as in a normal let.
Tagging scheme too as I assume its macro system is similar.
No, you can’t do that. At least not the way you describe.
The general reason why you cannot access runtime values within macros is simple: macros are fully expanded at compile time. When your program is compiled, the runtime values simply do not exist. A program can be compiled, and the bytecode can be placed on another computer, which will run it weeks later. Macro-expansion has already happened. No matter what happens at runtime, the program isn’t going to change.
This guarantee turns out to be incredibly important for a multitude of reasons, but that’s too general a discussion for this question. One specific point is worth discussing, though: why bindings themselves need to be static.
In Racket, as long as you are within a module (i.e. not at the top-level/REPL), all bindings can be resolved statically, at compile-time. This is a very useful property in other programming languages, mostly because the compiler can generate much more efficiently optimized code, but it is especially important in Racket or Scheme. This is because of how the macro system operates: in a language with hygienic macros, scope is complicated.
This is actually a very good thing—it is robust enough to support very complex systems that would be much harder to manage without hygiene—but it introduces some constraints:
Since every binding can be a macro or a runtime value, the binding needs to be known ahead of time in order to perform program expansion. The compiler needs to know if it needs to perform macro expansion or simply emit a variable reference.
Additionally, scoping rules are much more intricate because macro-introduced bindings live in their own scope. Because of this, binding scopes do not need to be strictly lexical.
Your magic-let could not work quite as you describe because the compiler could not possibly deduce the bindings for a and b statically. However, all is not lost: you could hook into #%top, a magical identifier introduced by the expander when encountering an unbound identifier. You could use this to replace unbound values with a hash lookup, and you could use syntax parameters to adjust #%top hygienically within each magic-let. Here’s an example:
#lang racket
(require (rename-in racket/base [#%top base-#%top])
racket/stxparam)
(define-syntax-parameter #%top (make-rename-transformer #'base-#%top))
(define-syntax-rule (magic-let env-expr expr ...)
(let ([env env-expr])
(syntax-parameterize ([#%top (syntax-rules ()
[(_ . id) (hash-ref env 'id)])])
(let () expr ...))))
(magic-let (hash 'a 123 'b 321) (+ a b)) ; => 444
Of course, keep in mind that this would replace all unbound identifiers with hash lookups. The effects of this are twofold. First of all, it will not shadow identifiers that are already bound:
(let ([a 1])
(magic-let (hash 'a 2)
a)) ; => 1
This is probably for the best, just to keep things semi-sane. It also means that the following would raise a runtime exception, not a compile-time error:
(magic-let (hash 'a 123) (+ a b))
; hash-ref: no value found for key
; key: 'b
I wouldn’t recommend doing this, as it goes against a lot of the Racket philosophy, and it would likely cause some hard-to-find bugs. There’s probably a better way to solve your problem without abusing things like #%top. Still, it is possible, if you really want it.
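One possible alternative, assuming the set of names is known when the code is written, is to list the identifiers explicitly so that the bindings stay static. The macro name let-from-hash below is made up for this sketch.

```racket
#lang racket

;; Hypothetical macro: the identifiers are given statically, and only
;; their values are looked up in the hash table at run time.
(define-syntax-rule (let-from-hash env-expr (id ...) body ...)
  (let ([env env-expr])
    (let ([id (hash-ref env 'id)] ...)
      body ...)))

(define sum
  (let-from-hash (hash 'a 123 'b 321) (a b)
    (+ a b)))

;; Because this expands to an ordinary let, shadowing behaves as usual.
(define shadowed
  (let ([a 1])
    (let-from-hash (hash 'a 2) (a)
      a)))
```

Here sum is 444 and shadowed is 2, since the inner binding of a shadows the outer one just as a normal let would.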

Scheme syntax-rules - Difference in variable bindings between (let) and (define)

The R5RS spec states that as part of the requirements for a macro defined using syntax-rules:
If a macro transformer inserts a free reference to an identifier, the reference refers to the binding that was visible where the transformer was specified, regardless of any local bindings that may surround the use of the macro.
I am trying to understand how this works in practice. So for example, if I have the following code:
(define var 'original)
(define-syntax test-var
(syntax-rules (var)
((_ var)
var)
((_ pattern-var)
'no-match)))
I would expect the following, if executed immediately after, to evaluate to original, which it does:
(test-var var)
And I would expect this one to be no-match since the var introduced into scope prior to test-var does not match the binding of var at macro definition:
(let ((var 1)) (test-var var))
However the following example has me puzzled:
(define var 'new-var)
(test-var var)
In Chicken Scheme, this evaluates to new-var. I would have expected it to be no-match for the same reasons as the previous (let) example. I thought that perhaps this was an issue with using define twice, but the result is still new-var even if I use (set! var 'new-var)
Does anyone have any insight as to what is going on here? What should happen per R5RS?
This is the usual trick that Schemes have when dealing with redefinitions on the REPL -- treating them as a mutation for the existing binding. So the second define is not really creating a new binding, and instead it just set!s the existing one.

Arbitrary computation in Scheme macro

Scheme macros, at least the syntax-case variety, are said to allow arbitrary computation on the code to be transformed. However (both in the general case and in the specific case I'm currently looking at) this requires the computation to be specified in terms of recursive functions. When I try various variants of this, I get e.g.
main.scm:32:71: compile: unbound identifier in module (in the transformer environment, which does not include the run-time definition) in: expand-vars
(The implementation is Racket, if it matters.)
The upshot seems to be that you can't define named functions until after macro processing.
I suppose I could resort to the Y combinator, but I figure it's worth asking first whether there's a better approach?
Yes, the fact that you're using Racket matters -- in Racket, there is something that is called "phase separation", which means that the syntax level cannot use runtime functions. For example, this:
#lang racket
(define (bleh) #'123)
(define-syntax (foo stx)
(bleh))
(foo)
will not work since bleh is bound at run time and is not available at the syntax level. Instead, it should be
(define-for-syntax (bleh) #'123)
or
(begin-for-syntax (define (bleh) #'123))
or moved as an internal definition to the macro body, or moved to its own module and required using (require (for-syntax "bleh.rkt")).
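Since the question was about recursive helper functions, here is a minimal sketch of a named recursive function that a macro calls during expansion. The names count-leaves and leaf-count are made up for this illustration.

```racket
#lang racket

(begin-for-syntax
  ;; A named, recursive helper that exists at expansion time (phase 1).
  (define (count-leaves datum)
    (cond
      [(pair? datum) (+ (count-leaves (car datum))
                        (count-leaves (cdr datum)))]
      [(null? datum) 0]
      [else 1])))

(define-syntax (leaf-count stx)
  (syntax-case stx ()
    [(_ form)
     ;; The count is computed while expanding; only the resulting
     ;; number is left in the compiled program.
     #`(quote #,(count-leaves (syntax->datum #'form)))]))

(define n (leaf-count (a (b c) d)))
```

Here n is 4, and no Y combinator is needed: the helper simply has to be defined at the same phase as the macro that calls it.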