extend racket syntax for backus-naur-forms - racket

I have a set of derivation rules implemented in racket. We can assume that there aren't any optional , which means there are no rules containing pipes (in BNF): ::= |
In racket, I have got something like this:
(define *rules*
'((S . ("b" "a"))
(B . ("a"))
(C . (S B))))
Please note that terminal symbols are implemented in the form of racket strings, nonterminal symbols in the form of racket symbols. Now, I'd like to import this rules from another racket file containing rules in backus naur syntax:
S ::= ba
B ::= a
C ::= SB
(capital letter = nonterminal)
Therefore, I need to extend the racket syntax. I have no idea how to handle that. Can you help me? It shouldn't be that much code ...

I think you're looking to parse a file that's written using the BNF syntax, and produce an s-expression version; is that right?
If so, it shouldn't be hard. In particular, the format implied by your question is that every line is of the form
<NT> :: = [<NT>|<T>]*
... which you could take apart like this:
#lang racket
;; COPYRIGHT 2012 John B. Clements (clements#brinckerhoff.org)
;; Licensed under the Apache License, version 2.
;; (You're free to use it, but your source code has to include
;; my authorship.)
(require rackunit)
(define example
(list "S ::= ba"
"B ::= a"
"C ::= SB"))
;; parse a single line:
;; string -> (list/c symbol? (listof (or/c string? symbol?)))
(define (parse-line l)
(match (regexp-match #px"^([A-Z]) ::= ([A-Za-z]*)$")
[(list _ lhs rhses)
(list lhs (map parse-char (string->list rhses)))]))
;; parse a single char:
;; char -> (or/c symbol? string?)
(define (parse-char ch)
.. oops! out of time. You'll have to write this part yourself... )
(check-expect (map parse-line example)
'((S ("b" "a"))
(B ("a"))
(C (S B))))
Oops! I see a bug in there. No problem, you'll figure it out. Gotta run....

Related

What is the use of `,` and `,#` in Racket?

I'm new to Racket and I was hoping to get more insights in the these two operators: , & ,#.
There's very little documentation of these new operators, however, to my understanding the former (,) unquotes everything if its is followed by a list. And the latter (,#) splices the values.
For example if the following is typed in the Dr. Racket interpreter:
(define scores '(1 3 2))
(define pets '(dog cat))
and then the following query is made:
`(,scores ,#pets)
this would yield : '((1 3 2) dog cat)
It would be appreciated if I could get more details, definitions and more examples about these operators.
Thanks in advance.
A single quote followed by the written representation of a value
will produce that value:
Example:
'(1 x "foo")
will produce a value that prints as (1 x "foo").
Suppose now that I don't want a literal symbol x in the list.
I have a variable x in my program, and I want to insert
the value to which x is bound.
To mark that I want the value of x rather than the symbol x,
I insert a comma before x:
'(1 ,x "foo")
It won't work as-is though - I now get a value that has a literal comma as well as a symbol x. The problem is that quote does not know about the comma convention.
Backtick or backquote knows about the comma-convention, so that will give the correct result:
> `(1 ,x "foo")
(1 3 "foo") ; if the value of x is 3
Now let's say x is the list (a b).
> `(1 ,x "foo")
(1 (a b) "foo") ; if the value of x is (a b)
This looks as expected. But what if I wanted (1 a b "foo")
as the result? We need a way so show "insert the elements of a list".
That's where ,# comes into the picture.
> `(1 ,#x "foo")
(1 a b "foo") ; if the value of x is (a b)
They are "reader abbreviations" or "reader macros". They are introduced in the section of the Racket guide on quasiquotation. To summarize:
`e reads as (quasiquote e)
,e reads as (unquote e)
,#e reads as (unquote-splicing e)
Because Racket's printer uses the same abbreviations by default, it can be confusing to test this yourself. Here are a few examples that should help:
> (equal? (list 'unquote 'abc) (read (open-input-string ",abc")))
#t
> (writeln (read (open-input-string ",abc")))
(unquote abc)
A more exhaustive description of the Racket reader is in the section on The Reader in the Racket Reference. A list of reader abbreviations is in the Reading Quotes subsection.

What is the difference between #\ , ' and #'?

In Common Lisp, given that "a" is simply a character, what is the difference between #\a, 'a #'a?
My question comes from the tutorialspoint.com tutorial on Lisp. At one point the tutorial introduces:
; a character array with all initial elements set to a
; is a string actually
(write(make-array 10 :element-type 'character :initial-element #\a))
(terpri)
; a two dimensional array with initial values a
(setq myarray (make-array '(2 2) :initial-element 'a :adjustable t))
(write myarray)
(terpri)
With the output:
"aaaaaaaaaa"
#2A((A A) (A A))
#' is not included in this example but I'm including it in the question because it can be confusing as well. 🙂
Thank you very much! 😊
To start, a is not "simply a character." The Lisp reader parses #\a as the character literal a, which is an object in Common Lisp. Note that #\a and #\A are different character objects.
When the Lisp reader encounters a single quote, the expression following the single quote is not evaluated. Specifically, 'a is treated as (quote a), where quote returns its argument unevaluated. Now, a is a symbol, so 'a evaluates to that symbol. But the Lisp reader upcases most characters it reads by default, so 'a really evaluates to the symbol A. The good news is that whether you type a or A, the Lisp reader will read A (unless you mess with the readtable), and both 'a and 'A evaluate to the symbol A.
When the Lisp reader encounters #'a, the entire expression is treated as (function a), which when evaluated returns the function associated with the name a. But, note that it is an error to use function, and by extension #', on an identifier that does not denote a function.
To clarify this last part a bit, consider the following REPL interaction:
CL-USER> (defvar a 1)
A
CL-USER> a
1
CL-USER> #'a
The function COMMON-LISP-USER::A is undefined.
[Condition of type UNDEFINED-FUNCTION]
Here the variable a is defined and given the value 1, but when we try to access the function denoted by a we get an error message because there is no such function. Continuing:
; Evaluation aborted on #<UNDEFINED-FUNCTION A {1002DDC303}>.
CL-USER> (defun a (x) x)
A
CL-USER> (a 'b)
B
CL-USER> a
1
CL-USER> #'a
#<FUNCTION A>
Now we have defined a function named a that simply returns its argument. You can see that when we call a with an argument 'b we get the expected result: (a 'b) --> b. But, then when we evaluate a alone we still get 1. Symbols in Common Lisp are objects that have, among other cells, value cells and function cells. After the above interaction, the symbol a now has 1 in its value cell, and it has the function we have defined in its function cell. When the symbol a is evaluated the value cell is accessed, but when (function a) or #'a is evaluated, the function cell is accessed. You can see above that when #'a is evaluated, the function we defined is returned, and the REPL prints #<FUNCTION A> to show this.
As an aside, I wouldn't recommend using Tutorialspoint to learn Common Lisp. Glancing over the site, right away I see this:
LISP expressions are case-insensitive, cos 45 or COS 45 are same.
This is just wrong. And, Lisp is not written in all-caps. None of this inspires faith. Instead, find a good book. There are some recommendations on the common-lisp tag-info page.
#\
This is to introduce a character.
CL-USER> #\a
#\a
CL-USER> (character 'a)
#\A
CL-USER> (character "a")
#\a
'
This is quote, to quote and not evaluate things and construct object literals.
CL-USER> a
=> error: the variable a is unbound.
CL-USER> 'a
A
CL-USER> (inspect 'a)
The object is a SYMBOL.
0. Name: "A"
1. Package: #<PACKAGE "COMMON-LISP-USER">
2. Value: "unbound"
3. Function: "unbound"
4. Plist: NIL
> q
CL-USER> (equal (list 1 2) (quote (1 2))) ;; aka '(1 2)
T ;; but watch out with object literals constructed with quote, prefer constructor functions.
and #'
This is sharpsign-quote to reference a function.
CL-USER> #'a
=> error: The function COMMON-LISP-USER::A is undefined.
CL-USER> (defun a () (print "hello A"))
A
CL-USER> (a)
"hello A"
"hello A"
CL-USER> #'a
#<FUNCTION A>
CL-USER> (function a)
#<FUNCTION A>
One can ask Lisp to describe the data objects you've mentioned.
If we look at the expressions:
CL-USER 13 > (dolist (object (list '#\a ''a '#'a))
(terpri)
(describe object)
(terpri))
#\a is a CHARACTER
Name "Latin-Small-Letter-A"
Code 97
(QUOTE A) is a LIST
0 QUOTE
1 A
(FUNCTION A) is a LIST
0 FUNCTION
1 A
NIL
If we look at the evaluated expressions:
CL-USER 5 > (dolist (object (list #\a 'a #'a))
(terpri)
(describe object)
(terpri))
#\a is a CHARACTER
Name "Latin-Small-Letter-A"
Code 97
A is a SYMBOL
NAME "A"
VALUE #<unbound value>
FUNCTION #<interpreted function A 422005BD54>
PLIST NIL
PACKAGE #<The COMMON-LISP-USER package, 73/256 internal, 0/4 external>
#<interpreted function A 422005BD54> is a TYPE::INTERPRETED-FUNCTION
CODE (LAMBDA (B)
A)

Create suffixed numbers Racket

I'm trying to experiment with what I can do in Racket, and I want to suffix numbers with letters.
For this example, I'd simply like to represent 10000 as 10K, and 1000000 as 1M.
Is there way (with macros or otherwise) that I can expand 1M to:
(* 1 1000000)
Or something to that effect?
In Racket, things like 10K are identifiers, which normally would refer to variables. There are two ways to make them into numbers:
1: redefine what "undefined" identifiers mean
You can redefine what to do on an undefined identifier by defining a #%top macro.
#lang racket
(require syntax/parse/define
(only-in racket [#%top old-#%top]))
(define-syntax-parser #%top
[(_ . x:id)
#:when (id-has-a-k-at-the-end? #'x)
(transform-id-into-number #'x)]
[(_ . x)
#'(old-#%top . x)])
However, this has a subtle problem. If there are any identifiers or variables in your program with K's on the end, they could override any numbers that were written that way. You would need to be careful not to accidentally override something that was intended to be a number.
2: make a reader extension that turns them into numbers instead of identifiers
This will take more time, but it's closer to the "right way" to do this, since it avoids conflicts when variables happen to have K's on the end.
One of the easier ways to extend the reader is with a readtable. You can make a function that extends a readtable like this:
;; Readtable -> Readtable
(define (extend-readtable orig-rt)
;; Char InputPort Any Nat Nat Nat -> Any
(define (rt-proc char in src ln col pos)
....)
...
(make-readtable orig-rt
#f 'non-terminating-macro rt-proc
...))
To use this to define a #lang language, you need to put the reader implementation in your-language/lang/reader.rkt. Here that's number-with-k/lang/reader.rkt, where the number-with-k directory is installed as a single-collection package (raco pkg install path/to/number-with-k).
number-with-k/lang/reader.rkt
#lang racket
(provide (rename-out [-read read]
[-read-syntax read-syntax]
[-get-info get-info]))
(require syntax/readerr
syntax/module-reader)
;; Readtable -> Readtable
(define (extend-readtable orig-rt)
;; Char InputPort Any Nat Nat Nat -> Any
(define (rt-proc char in src ln col pos)
....)
...
(make-readtable orig-rt
#f 'non-terminating-macro rt-proc))
;; [X ... -> Y] -> [X ... -> Y]
(define ((wrap-reader rd) . args)
(parameterize ([current-readtable (extend-readtable (current-readtable))])
(apply rd args)))
(define-values [-read -read-syntax -get-info]
(make-meta-reader 'number-with-k
"language path"
lang-reader-module-paths
wrap-reader
wrap-reader
identity))
The main work goes into filling in the .... holes in the extend-readtable function. For example, you can make it recognize identifiers that end with K like this:
;; Readtable -> Readtable
(define (extend-readtable orig-rt)
;; Char InputPort Any Nat Nat Nat -> Any
(define (rt-proc char in src ln col pos)
(define v (read-syntax/recursive src in char orig-rt #f))
(cond
[(and (identifier? v) (id-has-a-k-at-the-end? v))
(transform-id-into-number v)]
[else
v]))
(make-readtable orig-rt
#f 'non-terminating-macro rt-proc))
Once this is done, and you have the number-with-k directory installed as a package, you should be able to use #lang number-with-k like this:
#lang number-with-k racket
(+ 12K 15)
; => 12015
The simplest to is to define the suffixes you need.
(define K 1000)
(define M 1000000)
Then write (* 3.14 M) for 3.14 millions.
As others mention, Racket supports scientific notation 3.14E6 is also 3.14 million.
Yet another alternative is to define functions K, M etc like:
(define (M x) (* x 1000000))
Then you can write
(M 3.14)
to mean 3.14 million.
Racket does already have built in support for this, kind of, via scientific notation:
1e6 ; 1000000.0 ("1M")
2e7 ; 20000000.0

Removing racket's default reader procedure from the readtable for the character |

I'm trying to write a racket reader extension procedure that disables the special treatment of the pipe character |.
I have two files: mylang/lang/reader.rkt to implement the lang reader and mylang/testing.rkt to try it out. I've run raco pkg install --link to install the lang.
Here is reader.rkt:
#lang s-exp syntax/module-reader
racket
#:read my-read
#:read-syntax my-read-syntax
(define (parse-pipe char in srcloc src linum colnum)
#'\|)
(define my-readtable
(make-readtable #f #\| 'terminating-macro parse-pipe))
(define (my-read-syntax src in)
(parameterize ((current-readtable my-readtable))
(read-syntax src in)))
(define (my-read in)
(syntax->datum
(my-read-syntax #f in)))
With testing.rkt like this:
#lang mylang
(define | 3)
(+ 3 2)
runs and produces 5 as expected. But this next one doesn't:
#lang mylang
(define |+ 3)
(+ |+ 2)
complaining that define: bad syntax (multiple expressions after identifier) in: (define \| + 3) which is reasonable since parse-pipe produces a syntax object, not a string, so it terminates the reading of the symbol prematurely.
One thing I can do is keep reading until the end of the symbol, but that is hackish at best because I would be reimplementing symbol parsing and it wouldn't fix the case that the symbol has the pipe char in the middle, or if | is inside a string, etc.
What I would like to do is remove the default reader procedure for | from the readtable, but I don't know how/if it can be done.
OK, I've found a way. The documentation for make-readtable says:
char like-char readtable — causes char to be parsed in the same way
that like-char is parsed in readtable, where readtable can be #f to
indicate the default readtable.
so I can make the reader read | like a normal character like a with:
(define my-readtable
(make-readtable #f #\| #\a #f))
And it works
(define hawdy|+ "hello")
(string-append hawdy|+ "|world")
; => "hello|world"

in clojure language what <'a> really is

actually i am trying to perfectly understand clojure and particularly symbols
(def a 1)
(type a)
;;=>java.lang.Long
(type 'a)
;;=>clojure.lang.Symbol
I know that type is a function so its arguments get evaluated first so i perfectly understand why the code above work this way .In the flowing code i decided to delay the evaluation using macro
(defmacro m-type [x] (type x))
(m-type a)
;;==>clojure.lang.Symbol
and i am fine with that but what i fail to uderstand is this:
(m-type 'a)
;;=>clojure.lang.Cons
why the type of 'a is a cons
the character ' is interpreted by the clojure reader as a reader-macro which expands to a list containing the symbol quote followed by whatever follows the ', so in your call to (m-type 'a) the 'a is expanding to:
user> (macroexpand-1 ''a)
(quote a)
then calling type on the list (quote a) which is a Cons.
This may be a bit more clear if we make the m-type macro print the arguments as it sees them while it is evaluating:
user> (defmacro m-type [x] (println "x is " x) (type x))
#'user/m-type
user> (m-type 'a)
x is (quote a)
clojure.lang.Cons