How do I write a new #%datum function to catch all strings? - racket

I want to write a new racket language that catches and deals with strings in some special way. I've written the following example code:
#lang racket
(provide #%top #%app #%top-interaction #%module-begin
(rename-out [datum #%datum]))
(define big-string "")
(define (add-string x)
(set! big-string (string-append big-string x)))
(define-syntax (datum stx)
(syntax-case stx ()
[(_ . x)
#'(if (string? x)
(#%datum . (add-string x))
(#%datum . x))]))
It gives me an out of memory error when I try to use the target language. Is it recursively calling itself? I would have thought that hygiene would prevent that.
Perhaps the problem is that #%datum returns syntax, rather than datums?

First let's look at the problem with version of datum above.
Let's say the program contains the string "a".
The expander sees the string "a" and turns it in to (#%datum . "a").
Since #%datum is bound to datum defined as:
(define-syntax (datum stx)
(syntax-case stx ()
[(_ . x)
#'(if (string? x)
(#%datum . (add-string x))
(#%datum . x))]))
The syntax (#%datum . "a") will expand to
(if (string? "a")
(#%datum . (add-string "a"))
(#%datum . "a"))
The expander will then begin to expand the above expression.
When it comes to the first "a" it will expand it into (#%datum . "a")
which becomes (datum . "a") which then becomes another copy of
(if (string? "a")
(#%datum . (add-string "a"))
(#%datum . "a"))
etc.
The intention of the datum were to use two different expansions either (add-string x) or (#%datum . x). However since the output of datum is #'(if (string? x) ...) the if isn't evaluated at compile time, but at runtime.
The solution is to move the if.
#lang racket
(provide #%top #%app #%top-interaction #%module-begin
(rename-out [datum #%datum]))
(define big-string "")
(define (add-string x)
(set! big-string (string-append big-string x))
x)
(define-syntax (datum stx)
(syntax-case stx ()
[(_ . x)
(if (string? (syntax-e #'x))
#'(add-string (#%datum . x))
#'(#%datum . x))]))
Besides moving the if I have altered the add-string consequence.
Note: If a macro use expands into a use of the same macro,
then you are likely to run into this infinite expansion problem.
The easiest way to find the culprit is to use the macro stepper.
Turn the "Macro hiding:" setting to "Disable". Then step until you
see the loop.

Related

Writing a `define-let` macro, with hygiene

I'm trying to write a define-let macro in racket, which "saves" the header of a (let ((var value) ...) ...) , namely just the (var value) ... part, and allows re-using it later on.
The code below works as expected:
#lang racket
;; define-let allows saving the header part of a let, and re-use it later
(define-syntax (define-let stx1)
(syntax-case stx1 ()
[(_ name [var value] ...)
#`(define-syntax (name stx2)
(syntax-case stx2 ()
[(_ . body)
#`(let ([#,(datum->syntax stx2 'var) value] ...)
. body)]))]))
;; Save the header (let ([x "works]) ...) in the macro foo
(define-let foo [x "works"])
;; Use the header, should have the same semantics as:
;; (let ([x "BAD"])
;; (let ([x "works])
;; (displayln x))
(let ([x "BAD"])
(foo (displayln x))) ;; Displays "works".
The problem is that the macro breaks hygiene: as shown in the example below, the variable y, declared in a define-let which is produced by a macro, should be a new, uninterned symbol, due to hygiene, but it manages to leak out of the macro, and it is erroneously accessible in (displayln y).
;; In the following macro, hygiene should make y unavailable
(define-syntax (hygiene-test stx)
(syntax-case stx ()
[(_ name val)
#'(define-let name [y val])]))
;; Therefore, the y in the above macro shouldn't bind the y in (displayln y).
(hygiene-test bar "wrong")
(let ((y "okay"))
(bar (displayln y))) ;; But it displays "wrong".
How can I write the define-let macro so that it behaves like in the first example, but also preserves hygiene when the identifier is generated by a macro, giving "okay" in the second example?
Following the cue "syntax-parameter" from Chris, here is an one solution:
#lang racket
(require racket/stxparam
(for-syntax syntax/strip-context))
(define-syntax (define-let stx1)
(syntax-case stx1 ()
[(_ name [var expr] ...)
(with-syntax ([(value ...) (generate-temporaries #'(expr ...))])
#`(begin
(define-syntax-parameter var (syntax-rules ()))
...
(define value expr)
...
(define-syntax (name stx2)
(syntax-case stx2 ()
[(_ . body)
(with-syntax ([body (replace-context #'stx1 #'body)])
#'(syntax-parameterize ([var (syntax-id-rules () [_ value])] ...)
. body))]))))]))
(define-let foo [x "works"])
(let ([x "BAD"])
(foo (displayln x))) ; => works
(let ([x "BAD"])
(foo
(let ([x "still works"])
(displayln x)))) ; => still works
UPDATE
This solution passes the additional test in the comments.
The new solution transfers the context of the body to
the variables to be bound.
#lang racket
(require (for-syntax syntax/strip-context))
(define-syntax (define-let stx1)
(syntax-case stx1 ()
[(_ name [var expr] ...)
#`(begin
(define-syntax (name stx2)
(syntax-case stx2 ()
[(_ . body)
(with-syntax ([(var ...) (map (λ (v) (replace-context #'body v))
(syntax->list #'(var ...)))])
#'(let ([var expr] ...)
. body))])))]))
(define-let foo [x "works"])
(let ([x "BAD"])
(foo (displayln x))) ; => works
(let ([x "BAD"])
(foo
(let ([x "still works"])
(displayln x)))) ; => still works
(let ([z "cool"])
(foo (displayln z))) ; => cool

Using Syntax Parameters in Racket

I'm trying to define a macro that generates an anonymous function taking one argument named it, for succinctness, so that instead of
(λ (it) body)
I can write
(λλ body)
(In other words, (λλ body) transforms to (λ (it) body))
(define-syntax-parameter it #f)
(define-syntax λλ
(syntax-rules ()
((_ body)
(λ (x) (syntax-parameterize ((it x)) body)))))
(λλ (< it 0)) ; For testing
I get operators.rkt:13:28: ?: literal data is not allowed; no #%datum syntax transformer is bound in the transformer environment in: #f at (define-syntax-parameter if #f), but as far as I can tell, this is exactly like the example given in racket's doc for how to use define-syntax-parameter. I can suppress the error by replacing #f with a function (I used member, but not for any real reason), but after doing that, I get operators.rkt:17:38: x: identifier used out of context in: x. What am I doing wrong?
You left out the syntax-id-rules part in the example. It's the part that specifies that it should expand to x. Alternatively, you can use make-rename-transformer:
#lang racket
(require racket/stxparam)
(define-syntax-parameter it #f)
(define-syntax λλ
(syntax-rules ()
((_ body)
(λ (x) (syntax-parameterize ([it (make-rename-transformer #'x)]) body)))))
((λλ (< it 0)) 5)
((λλ (< it 0)) -5)
=>
#f
#t
Syntax parameters are not the only way to implement the macro you have in mind. A simpler (IMO) way is to just use datum->syntax to inject the identifier it:
(define-syntax (λλ stx)
(syntax-case stx ()
((_ body ...)
(with-syntax ((it (datum->syntax stx 'it)))
#'(λ (it) body ...)))))
To use your example:
(define my-negative? (λλ (< it 0)))
(my-negative? -1) ;; => #t

Racket Macro to auto-define functions given a list

I want to auto-generate a bunch of test functions from a list. The advantage being I can change the list (e.g. by reading in a CSV data table) and the program will auto-generate different tests on the next program execution.
For example, say I am trying to identify oxyanions in a string containing a chemical formula.
My list may be something like:
(define *oxyanion-tests*
; name cation
(list (list "aluminate" "Al")
(list "borate" "B")
(list "gallate" "Ga")
(list "germanate" "Ge")
(list "phosphate" "P")
(list "sulfate" "S")
(list "silicate" "Si")
(list "titanate" "Ti")
(list "vanadate" "V")
(list "stannate" "Sn")
(list "carbonate" "C")
(list "molybdate" "Mo")
(list "tungstate" "W")))
I'm reasonably confident that the chemical formula contains one of these oxyanions if there is a cation followed by an oxygen within parentheses (e.g. "(C O3)" ), or if the cation is followed by 2 or more oxygens (e.g. "C O3"). Note that this isn't perfect, since it will miss hypochlorite anions (e.g. "Cl O"), but it's good enough for my application.
(define ((*ate? elem) s-formula)
(or (regexp-match? (regexp (string-append "\\(" elem "[0-9.]* O[0-9.]*\\)")) s-formula)
(regexp-match? (regexp (string-append "(^| )" elem "[0-9.]* O[2-9][0-9.]*")) s-formula)))
I think I need a macro to do this, but I don't really understand how they work from reading the documentation. I'm asking here so that I have a good example to look at that is immediately useful to me.
Here is what I kind of think the macro should look like, but it doesn't work and I don't really have a mental model for figuring out how to fix it.
(require (for-syntax racket))
(define-syntax-rule (define-all/ate? oxyanion-tests)
(for ([test oxyanion-tests])
(match test
[(list name cation) (syntax->datum (syntax (define ((string->symbol (string-append name "?")) s-formula)
((*ate? cation) s-formula))))])))
Thanks for any guidance you can give me!
P.S. Here are a few tests that should pass:
(define-all/ate? *oxyanion-tests*)
(module+ test
(require rackunit)
(check-true (borate? "B O3"))
(check-true (carbonate? "C O3"))
(check-true (silicate? "Si O4")))
I see a couple of errors in your code:
Your *oxyanion-tests* is a runtime value, but you need its values to use as function name identifiers, so it must be available at compile time.
The syntax around the result of syntax-rules is implicit. So with syntax-rules, you only get the macro template language (see the docs for syntax for more info). Thus you can't do the datum->syntax that you are trying to do. You have to use syntax-case instead, which allows you to use all of Racket to compute the syntax objects you want.
Here's what I came up with:
#lang racket
(require (for-syntax racket/syntax)) ; for format-id
(define-for-syntax *oxyanion-tests*
; name cation
(list (list "aluminate" "Al")
(list "borate" "B")
(list "gallate" "Ga")
(list "germanate" "Ge")
(list "phosphate" "P")
(list "sulfate" "S")
(list "silicate" "Si")
(list "titanate" "Ti")
(list "vanadate" "V")
(list "stannate" "Sn")
(list "carbonate" "C")
(list "molybdate" "Mo")
(list "tungstate" "W")))
(define ((*ate? elem) s-formula)
(or (regexp-match?
(regexp (string-append "\\(" elem "[0-9.]* O[0-9.]*\\)"))
s-formula)
(regexp-match?
(regexp (string-append "(^| )" elem "[0-9.]* O[2-9][0-9.]*"))
s-formula)))
(define-syntax (define-all/ate? stx)
(syntax-case stx ()
[(_)
(let ([elem->fn-id
(λ (elem-str)
(format-id
stx "~a?"
(datum->syntax stx (string->symbol elem-str))))])
(with-syntax
([((ate? cation) ...)
(map
(λ (elem+cation)
(define elem (car elem+cation))
(define cation (cadr elem+cation))
(list (elem->fn-id elem) cation))
*oxyanion-tests*)])
#`(begin
(define (ate? sform) ((*ate? cation) sform))
...)))]))
(define-all/ate?)
(module+ test
(require rackunit)
(check-true (borate? "B O3"))
(check-true (carbonate? "C O3"))
(check-true (silicate? "Si O4")))
The key is the elem->fn-id function, which turns a string into a function identifier. It uses datum->syntax with stx as the context, meaning the defined function will be available in the context where the macro is invoked.

Racket Macro Ellipsis Syntax

I have a macro that's working when one argument is passed, and I'd like to expand it to accept n number of arguments using ..., but I'm having trouble figuring out the syntax.
The macro accepts either custom syntax, ie, key:val key:val, or it accepts a procedure.
For example: (3 different usages)
(schema-properties [(name:first-name type:string)])
(schema-properties [(name:age type:number required:#t)])
(schema-properties [(my-custom-fn arg1 arg2 arg3)])
Definition:
(define-syntax (schema-properties stx)
(syntax-parse stx
[(_ [(prop:expr ...)])
(with-syntax ([prop0 (make-prop-hash #'(prop ...))])
#'(list prop0))]))
(define-for-syntax (make-prop-hash stx)
(with-syntax ([(props ...) stx])
(if (regexp-match #px":"
(symbol->string (car (syntax->datum #'(props ...)))))
#'(pairs->hash 'props ...)
#'(props ...))))
This works, in that it checks the prop:expr syntax for the presense of ":", and if it exists, passes it to the function (pairs->hash 'props ...), otherwise, it just invokes it (props ...).
Now, I'd like to be able to pass in:
(schema-properties [(name:first-name type:string)
(name:last-name type:string)
(my-fn arg1 arg2 arg3)])
and have it work the same way. But I'm currently in ellipsis hell and my brain is no longer working correctly.
Any insights are appreciated.
Recommendation: use helper functions to help deal with nesting. Your schema-properties macro knows how to deal with one level of nesting, and you want to apply that to multiple clauses. It's the same principle as when we deal with lists of things: have a helper to deal with the thing, and then apply that across your list. It helps cut down complexity.
For your code, we can do it like this:
#lang racket
(require (for-syntax syntax/parse))
(define-syntax (schema-properties stx)
(syntax-parse stx
[(_ [clause ...])
(with-syntax ([(transformed-clauses ...)
(map handle-clause (syntax->list #'(clause ...)))])
#'(list transformed-clauses ...))]))
;; handle-clause: clause-stx -> stx
(define-for-syntax (handle-clause a-clause)
(syntax-parse a-clause
[(prop:expr ...)
(make-prop-hash #'(prop ...))]))
(define-for-syntax (make-prop-hash stx)
(with-syntax ([(props ...) stx])
(if (regexp-match #px":"
(symbol->string (car (syntax->datum #'(props ...)))))
#'(pairs->hash 'props ...)
#'(props ...))))
;;; Let's try it out. I don't know what your definition of pairs->hash is,
;;; but it probably looks something like this:
(define (pairs->hash . pairs)
(define ht (make-hash))
(for ([p pairs])
(match (symbol->string p)
[(regexp #px"([-\\w]+):([-\\w]+)"
(list _ key value))
(hash-set! ht key value)]))
ht)
(schema-properties [(name:first-name type:string)
(name:last-name type:string)
(list 1 2 3)])
Another recommendation: use syntax classes to help deal with nesting:
First, define a syntax class that recognizes key:value identifiers (and makes their component strings available as key and value attributes):
(begin-for-syntax
(define-syntax-class key-value-id
#:attributes (key value)
(pattern x:id
#:do [(define m (regexp-match "^([^:]*):([^:]*)$"
(symbol->string (syntax-e #'x))))]
#:fail-unless m #f
#:with (_ key value) m)))
Now define a clause as either a sequence of those (to be handled one way) or anything else (to be treated as an expression, which must produce a procedure). The code attribute contains the interpretation of each kind of clause.
(begin-for-syntax
(define-syntax-class clause
#:attributes (code)
(pattern (x:key-value-id ...)
#:with code #'(make-immutable-hash '((x.key . x.value) ...)))
(pattern proc
#:declare proc (expr/c #'(-> any))
#:with code #'(proc.c))))
Now the macro just puts the pieces together:
(define-syntax (schema-properties stx)
(syntax-parse stx
[(_ [c:clause ...])
#'(list c.code ...)]))

Is it possible to just print the string that was passed into the Scheme macro?

I am working on a language translator in guile scheme, and need to handle the basic case, where you're trying to convert a single word.
(define var 5)
(translate var)
This should return the string var and not the number 5.
How do I do this using R5RS Scheme macros (the define-syntax style)?
Edit:
I'm translating from Scheme to Coffeescript.
(define-syntax translate
(syntax-rules ()
[(_ v) 'v]))
And if you want a string:
(define-syntax translate
(syntax-rules ()
[(_ v) (symbol->string 'v)]))
Hopefully Guile's compiler is smart enough to fold the resulting expression so it essentially becomes a constant string.
With syntax-case and its guard support:
(define-syntax translate
(lambda (stx)
(syntax-case stx ()
[(_ v) (identifier? #'v)
#'(symbol->string 'v)]
[(_ v) (number? (syntax-e #'v))
#'(number->string v)])))
(I've used square brackets for easy comparison with Eli's answer, however, it's not my usual style. ;-))
But if you're using syntax-case, then you can just as well do the conversion at the syntax level instead of producing code that does it at runtime:
(define-syntax translate
(lambda (stx)
(syntax-case stx ()
[(_ v) (identifier? #'v)
(datum->syntax stx (symbol->string (syntax->datum #'v)))]
[(_ v) (number? (syntax-e #'v))
(datum->syntax stx (number->string (syntax->datum #'v)))])))
The main thing here is that the macro code is now plain scheme, for example, you could abstract the common parts into a helper:
(define-syntax translate
(lambda (stx)
(define (rewrap convert x)
(datum->syntax stx (convert (syntax->datum x))))
(syntax-case stx ()
[(_ v) (identifier? #'v) (rewrap symbol->string #'v)]
[(_ v) (number? (syntax-e #'v)) (rewrap number->string #'v)])))
Along the same lines, if this macro is so simple, then there's no real need for syntax-case, other than pulling out the subexpression:
(define-syntax translate
(lambda (stx)
(syntax-case stx ()
[(_ v) (let ([d (syntax->datum #'v)])
(datum->syntax
stx
((cond [(number? d) number->string]
[(symbol? d) symbol->string])
d)))])))
Note, BTW, that there is no magic in syntax-case -- and in the case of this simple pattern, you could just pull out the value yourself:
(define-syntax translate
(lambda (stx)
(let ([d (cadr (syntax->datum #'v))])
(datum->syntax
stx
((cond [(number? d) number->string]
[(symbol? d) symbol->string])
d)))))
There is some boilerplate stuff that syntax-case does that this last version loses:
If you use the macro in an unexpected way like (translate) then this version will throw an error about cadr instead of a more comprehensible syntax error
Similarly, if you use (translate 1 2) then this version will just silently ignore the 2 instead of an error.
And if it's used with something that is neither an identifier nor a number (eg, (translate (+ 1 2))) then this will depend on the unspecified value that cond returns rather than throwing a syntax error.
The other answers are useful enough already, but I thought I'd just point out that it's possible to generalize this technique in a very useful ways: macros to print out expressions and their results for debugging:
(define-syntax log-expr
(syntax-rules ()
((_ expr)
(let ((result expr))
(write (quote expr))
(display " evaluates to ")
(write result)
(newline)
result))))