The Racket docs indicate that Racket has separate forms for: require, load, include, and import. Many other languages only contain one of these and are generally used synonymously (although obviously with language specific differences such as #include in C, and import in Java).
Since Racket has all four of these, what is the difference between each one and which should I be using in general? Also if each has a specific use, when should I use an alternative type? Also, this question seems to indicate that require (paired with provide) is preferred, why?
1. Require
You are correct, the default one you want is almost always require (paired with a provide). These two forms go hand in hand with Racket's modules and allows you to more easily determine what variables should be scoped in which files. For example, the following file defines three variables, but only exports 2.
#lang racket ; a.rkt
(provide b c)
(define a 1)
(define b 2)
(define c 3)
As per the Racket style guide, the provide should ideally be the first form in your file after the #lang so that you can easily tell what a module provides. There are a few cases where this isn't possible, but you probably won't encounter those until you start making your own Racket libraries you intend for public distribution. Personally, I still put a file's require before its provide, but I do sometimes get flack for it.
In a repl, or another module, you can now require this file and see the variables it provides.
Welcome to Racket v6.12.
> (require "a.rkt")
> c
3
> b
2
> a
; a: undefined;
; cannot reference undefined identifier
; [,bt for context]
There are ways to get around this, but this serves as a way for a module to communicate what its explicit exports are.
2. Load
This is a more dynamic variant of require. In general you should not use it, and instead use dynamic-require when you need to load a module dynamically. In this case, load is effectively a primitive that require uses behind the scenes. If you are explicitly looking to emulate the top level however (which, to be clear, you almost never do), then load is a fine option. Although in those rare cases, I would still direct you to the racket/load language. Which interacts exactly as if each form was entered into the repl directly.
#lang racket/load
(define x 5)
(displayln x) ; => prints 5
(define x 6)
(displayln x) ; => prints 6
3. Include
Include is similar to #include in C. There are even fewer cases where you should use it. The include form grabs the s-expression syntax of the given path, and puts it directly in the file where the include form was. At first, this can appear as a nice solution to allow you to split up a single module into multiple files, or to have a module 'piece' you want to put in multiple files. There are however better ways to do both of those things without using include, that also don't come with the confusing side effects you get with include.1 One thing to remember if you still insist on using import, is that the file you are importing probably shouldn't have a #lang line unless you are explicitly wanting to embed a submodule. (In which case you will also need to have a require form in addition to the include).
4. Import
Finally, import is not actually a core part of Racket, but is part of its unit system. Units are similar in some ways to modules, but allow for circular dependencies (unit A can depend on Unit B while Unit B depends on Unit A). They have fallen out of favor in recent years due to the syntactic overhead they have.
Also unlike the other forms import (and additionally export), take signatures, and relies on an external linker to decide which actual units should get linked together. Units are themselves a complex topic and deserve their own question on how to create and link them.
Conclusion (tldr)
TLDR; Use require and provide. They are the best supported and easiest to understand. The other forms do have their uses, but should be thought of 'advanced uses' only.
1These side effects are the same as you would expect for #include in C. Such as order being important, and also with expressions getting intermixed in very unpredictable ways.
Related
I am implementing an interpreter that codegen to another language using Racket. As a novice I'm trying to avoid macros to the extent that I can ;) Hence I came up with the following "interpreter":
(define op (open-output-bytes))
(define (interpret arg)
(define r
(syntax-case arg (if)
[(if a b) #'(fprintf op "if (~a) {~a}" a b)]))
; other cases here
(eval r))
This looks a bit clumsy to me. Is there a "best practice" for doing this? Am I doing a totally crazy thing here?
Short answer: yes, this is a reasonable thing to do. The way in which you do it is going to depend a lot on the specifics of your situation, though.
You're absolutely right to observe that generating programs as strings is an error-prone and fragile way to do it. Avoiding this, though, requires being able to express the target language at a higher level, in order to circumvent that language's parser.
Again, it really has a lot to do with the language that you're targeting, and how good a job you want to do. I've hacked together things like this for generating Python myself, in a situation where I knew I didn't have time to do things right.
EDIT: oh, you're doing Python too? Bleah! :)
You have a number of different choices. Your cleanest choice is to generate a representation of Python AST nodes, so you can either inject them directly or use existing serialization. You're going to ask me whether there are libraries for this, and ... I fergits. I do believe that the current Python architecture includes ... okay, yes, I went and looked, and you're in good shape. Python's "Parser" module generates ASTs, and it looks like the AST module can be constructed directly.
https://docs.python.org/3/library/ast.html#module-ast
I'm guessing your cleanest path would be to generate JSON that represents these AST modules, then write a Python stub that translates these to Python ASTs.
All of this assumes that you want to take the high road; there's a broad spectrum of in-between approaches involving simple generalizations of python syntax (e.g.: oh, it looks like this kind of statement has a colon followed by an indented block of code, etc.).
If your source language shares syntax with Racket, then use read-syntax to produce a syntax-object representing the input program. Then use recursive descent using syntax-case or syntax-parse to discern between the various constructs.
Instead of printing directly to an output port, I recommend building a tree of elements (strings, numbers, symbols etc). The last step is then to print all the elements of the tree. Representing the output using a tree is very flexible and allows you to handle sub expressions out of order. It also allows you to efficiently concatenate output from different sources.
Macros are not needed.
I want to keep a list of required modules of a particular module (let's say the current-module).
I feel like there are some other options (such as parsing the module?) that could be tried, but I started playing with the idea of shadowing (require) and adding the required items to a hash-table with the module-name. The problem is I cannot figure how to write a syntax definition for it.
Although not working, a function definition equivalent would be like below:
(define require-list (make-hash))
(define require
(lambda vals
; add vals to hash-table with key (current-namespace)
(let ([cn (current-namespace)])
(hash-set! require-list cn
(append vals (hash-ref require-list cn))))
(require vals)))
.. it seems the last line call should be modified to call the original (require) as well?
A correct version or a pointer to how to do it, or any other way of achieving the original goal highly appreciated.
If you just want to get a list of imports for a particular module, there is a convenient built-in called module->imports that will do what you want. It will return a mapping between phase levels and module imports—phase levels higher than 0 indicate imports used at compile-time for use in macro expansion.
> (require racket/async-channel)
> (module->imports 'racket/async-channel)
'((0
#<module-path-index:(racket/base)>
#<module-path-index:(racket/contract/base)>
#<module-path-index:(racket/contract/combinator)>
#<module-path-index:(racket/generic)>))
Note that the module in question must be included into the current namespace in order for module->imports to work, which require or dynamic-require will both do.
This will inspect the information known by the compiler, so it will find all static imports for a particular module. However, the caveats about dynamic requires mentioned by John Clements still apply: those can be dynamically performed at runtime and therefore will not be detected by module->imports.
Short short version:
Have you tried turning on the module browser?
Short version:
You're going to need a macro for this, and
It's not going to be a complete solution
The existing require is not a function; it's a language form, implemented as a macro. This is because the compiler needs to collect the same information you do, and therefore the required modules must be known at compile time.
The right way to do this--as you suggest--is definitely to leverage the existing parsing. If you expand the module and then walk the resulting tree, you should be able to find
everything you need. The tree will be extremely large, but will contain (many instances of) a relatively small number of primitives, so writing this traversal shouldn't be too hard. There will however be a lot of fiddling involved in setting up namespace anchors etc. in order to get the expansion to happen in the first place.
Regarding your original idea: you can definitely create a macro that shadows require. You're going to want to define it in another file and rename it on the way out so that your macro can refer to the original require. Also, the require form has a bunch of interesting subforms, and coming up with a macro that tries to handle all of these subforms will be tricky. If you're looking at writing a macro, though, you're already thinking about an 80% solution, so maybe this won't bother you.
Finally: there are forms that perform dynamic module evaluation, and so you can't ever know for sure all of the modules that might be required, though you could potentially annotate these forms (or maybe shadow the dynamic module-loading function) to see these as they happen.
Also, it's worth mentioning that you may get more precise answers on the Racket mailing list.
#lang racket/base
(module x scribble/text
#(display 123))
It seems like #lang statements are not valid in nested sub-modules, and the expanded module version above is missing something:
error: module: no #%module-begin binding in the module's language
update:
looks like this more or less works, but is there a better way? does scribble do something with output ports that isn't being handled?
#lang racket/base
(module x scribble/text/lang
(#%module-begin
#reader scribble/reader #list{
hi
#(+ 1 456)
}))
First, your code has a redundant #%module-begin which can be removed.
#lang does several things -- one is control the semantics of the file
by determining the set of initially imported bindings, and that's
something that the module form has done before #lang came up. With
submodules, it became possible to use module for parts of a file too.
However, #lang can also determine the reader that parses the file, and
that's not possible to do with submodules, so you're stuck with only one
toplevel #lang to set the parser for the whole file.
(Sidenote: There is a technical reason for that. A #lang reader reads
the rest of the file until it reaches an eof value, so a nested
#lang would require getting an eof value before getting the end of
the file, or adding a new kind of eof-like value. That means that it's
a change that should be done carefully -- it's possible to do, of
course, but the need didn't come up often enough. Hopefully it will, in
the future.)
But in your case you don't want a completely new concrete syntax, just
an extension for s-expressions -- and an extension that was chosen to
have a minimal impact on regular code. So in almost all cases it's fine
to just enable the #-form syntax for the whole file, and then use
#-forms where you want it. Since it's just an alternative way for
reading sexprs, you can even use that with module, leading to this
code that doesn't need to use #reader:
#lang at-exp racket/base
#module[x scribble/text/lang]{
hi
#(+ 1 456)
}
(require 'x)
One thing that is a bit strange here is using scribble/text/lang and
not just scribble/text. Usually, #lang foo is exactly the same as
(module x foo ...) after reading the code with foos reader. But in
the case of the scribble/text language there is another difference:
using it as a #lang makes the semantics of the module body be "output
each thing". The idea is that as a language you'll want to spit out
mostly-text files, but as a library you'll want to write code in it
and do the printout yourself.
Since this code uses module, using scribble/text means that you're
not getting the spit-all-out functionality, which is why you need to
explicitly switch to scribble/text/lang. But you could have instead
just do the spitting yourself using the language's output, which would
give you this code:
#lang at-exp racket/base
(module x racket/base
(require scribble/text)
(output #list{
hi
#(+ 1 456)}))
(require 'x)
Note that scribble/text is not used as a language here, since it
doesn't provide enough stuff to be one when used (outside of a #lang).
(Which you've found out, leading to that redundant #%module-begin...)
This version is slightly more verbose, but I'm guessing that it makes
more sense in your case, since using it for some part of the code means
that you want to use it as a library.
Finally, if you really don't want to read the whole file with the #
syntax, only some parts, then the #reader that you've found is
perfectly fine. (And this is made simple with scribble/text that
treats lists as concatenated outputs, so you need just one wrapper for
each chunk of text.)
I have seen one answer of How does Lisp let you redefine the language itself?
Stack Overflow question (answered by Noah Lavine):
Macros aren't quite a complete redefinition of the language, at least as far as I know (I'm actually a Schemer; I could be wrong), because there is a restriction. A macro can only take a single subtree of your code, and generate a single subtree to replace it. Therefore you can't write whole-program-transforming macros, as cool as that would be.
After reading this I am curious about whether there are "whole-program-transforming macros" in Lisp or Scheme (or some other language).
If not then why?
It is not useful and never required?
Same thing could be achieved by some other ways?
It is not possible to implement it even in Lisp?
It is possible, but not tried or implemented ever?
Update
One kind of use case
e.g.
As in stumpwm code
here are some functions all in different lisp source files
uses a dynamic/global defvar variable *screen-list* that is defined in primitives.lisp , but used in screen.lisp, user.lisp, window.lisp.
(Here each files have functions, class, vars related to one aspect or object)
Now I wanted to define these functions under the closure where
*screen-list* variable available by let form, it should not be
dynamic/global variable, But without moving these all functions into
one place (because I do not want these functions to lose place from their
related file)
So that this variable will be accessible to only these functions.
Above e.g. equally apply to label and flet, so that it will further possible
that we could make it like that only required variable, function will be available,
to those who require it.
Note one way might be
implement and use some macro defun_with_context for defun where first argument is
context where let, flet variables definend.
But apart from it could it be achieved by reader-macro as
Vatine and Gareth Rees answered.
You quoted Noah Lavine as saying:
A macro can only take a single subtree of your code, and generate a single subtree to replace it
This is the case for ordinary macros, but reader macros get access to the input stream and can do whatever they like with it.
See the Hyperspec section 2.2 and the set-macro-character function.
In Racket, you can implement whole-program-transforming macros. See the section in the documentation about defining new languages. There are many examples of this in Racket, for example the lazy language and Typed Racket.
Off the top of my head, a few approaches:
First, you can. Norvig points out that:
We can write a compiler as a set of macros.
so you can transform an entire program, if you want to. I've only seen it done rarely, because typically the intersection between "things you want to do to every part of your program" and "things that you need macro/AST-type transformations for" is a pretty small set. One example is Parenscript, which transforms your Lisp code ("an extended subset of CL") into Javascript. I've used it to compile entire files of Lisp code into Javascript which is served directly to web clients. It's not my favorite environment, but it does what it advertises.
Another related feature is "advice", which Yegge describes as:
Great systems also have advice. There's no universally accepted name for this feature. Sometimes it's called hooks, or filters, or aspect-oriented programming. As far as I know, Lisp had it first, and it's called advice in Lisp. Advice is a mini-framework that provides before, around, and after hooks by which you can programmatically modify the behavior of some action or function call in the system.
Another is special variables. Typically macros (and other constructs) apply to lexical scope. By declaring a variable to be special, you're telling it to apply to dynamic scope (I think of it as "temporal scope"). I can't think of any other language that lets you (the programmer) choose between these two. And, apart from the compiler case, these two really span the space that I'm interested in as a programmer.
A typical approach is to write your own module system. If you just want access to all the code, you can have some sort of pre-processor or reader extension wrap source files with your own module annotation. If you then write your own require or import form, you will ultimately be able to see all the code in scope.
To get started, you could write your own module form that lets you define several functions which you then compile in some clever way before emitting optimized code.
There's always the choice of using compiler macros (they can do whole-function transformation based on a lew of criteria, but shouldn't change the value returned, as that would be confusing).
There's reader macros, they transform the input "as it is read" (or "before it is read", if you prefer). I haven't done much large-scale reader-macro hacking, but I have written some code to allow elisp sourec to be (mostly) read in Common Lisp, with quite a few subtle differences in syntactic sugar between the two.
I believe those sorts of macros are called code-walking macros. I haven't implemented a code walker myself, so I am not familiar with the limits.
In Common LISP, at least, you may wrap top-level forms in PROGN and they still retain their status as top-level forms (see CLTL2, section 5.3). Therefore, the limitation of a macro generating a single subtree is not much of a limitation since it could wrap any number of resulting subtrees within PROGN. This makes whole-program macros quite possible.
E.g.
(my-whole-program-macro ...)
= expands to =>
(progn
(load-system ...)
(defvar ...)
(defconstant ...)
(defmacro ...)
(defclass ...)
(defstruct ...)
(defun ...)
(defun ...)
...
)
My problem isn't with the built-in eval procedure but how to create a simplistic version of it. Just for starters I would like to be able to take this in '(+ 1 2) and have it evaluate the expression + where the quote usually takes off the evaluation.
I have been thinking about this and found a couple things that might be useful:
Unquote: ,
(quasiquote)
(apply)
My main problem is regaining the value of + as a procedure and not a symbol. Once I get that I think I should just be able to use it with the other contents of the list.
Any tips or guidance would be much appreciated.
Firstly, if you're doing what you're doing, you can't go wrong reading at least the first chapter of the Metalinguistic Abstraction section of Structure and Interpretation of Computer Programs.
Now for a few suggestions from myself.
The usual thing to do with a symbol for a Scheme (or, indeed, any Lisp) interpreter is to look it up in some sort of "environment". If you're going to write your own eval, you will likely want to provide your own environment structures to go with it. The one thing for which you could fall back to the Scheme system you're building your eval on top of is the initial environment containing bindings for things like +, cons etc.; this can't be achieved in a 100% portable way, as far as I know, due to various Scheme systems providing different means of getting at the initial environment (including the-environment special form in MIT Scheme and interaction-environment in (Petite) Chez Scheme... and don't ask me why this is so), but the basic idea stays the same:
(define (my-eval form env)
(cond ((self-evaluating? form) form)
((symbol? form)
;; note the following calls PCS's built-in eval
(if (my-kind-of-env? env)
(my-lookup form env)
;; apparently we're dealing with an environment
;; from the underlying Scheme system, so fall back to that
;; (note we call the built-in eval here)
(eval form env)))
;; "applicative forms" follow
;; -- special forms, macro / function calls
...))
Note that you will certainly want to check whether the symbol names a special form (lambda and if are necessary -- or you could use cond in place of if -- but you're likely to want more and possibly allow for extentions to the basic set, i.e. macros). With the above skeleton eval, this would have to take place in what I called the "applicative form" handlers, but you could also handle this where you deal with symbols, or maybe put special form handlers first, followed by regular symbol lookup and function application.