Update: this question's premise is now obsolete. lein-cljsbuild has been deprecated in favor of reader conditionals, and there have been many other namespace/macro enhancements in ClojureScript. See the updated answer below.

Let's say that I have a library xx with namespace xx.core, and I'm writing it in pure Clojure, intending to target both Clojure and ClojureScript. The de facto way to do this seems to be using lein-cljsbuild's crossovers and conditional comments. So far, so good.
Let’s say xx has a bunch of vars that I want its users, both in Clojure and ClojureScript, to be able to use. These can be split into three sorts of vars.
Macros
Functions / other vars that depend on no macro in xx (I’ll call these type-1 vars)
Functions / other vars that just happen to depend on a macro in xx (I'll call these type-2 vars; all three kinds are sketched below)
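For concreteness, here is a hypothetical sketch of the three kinds (unless, add, and add-unless-neg are made-up names):

(defmacro unless [test & body]        ;; a macro
  `(when-not ~test ~@body))

(defn add [a b]                       ;; type-1: depends on no macro in xx
  (+ a b))

(defn add-unless-neg [a b]            ;; type-2: its implementation uses a macro
  (unless (or (neg? a) (neg? b))
    (+ a b)))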
However, since ClojureScript requires macros to be separated from regular .cljs namespaces in their own special .clj namespaces, all macros have to be isolated away from all other vars in xx.core.
But the implementations of some of those other vars—the type-2 vars—incidentally depend on those macros!
(To be sure, only macros seem to be accessible via ClojureScript's :use-macros or :require-macros. I just tested this: I tried simply keeping everything (macros, type-1 vars, and type-2 vars) inside a single xx/core.clj file and referring to it in a ClojureScript test file with (:use-macros [xx.core :only […]]). The compiler then emits a WARNING: Use of undeclared Var message for each non-macro var in xx.core that the ClojureScript file refers to.)
What do people tend to do in this situation? The way I see it, the only thing I can do is split the library's public API into three namespaces: one for type-1 vars, one for macros, and one for type-2 vars. Something like xx.core, xx.macro, and xx.util?
Of course, this sort of stinks, since now any user of xx (whether in Clojure or ClojureScript) has to know whether each var (of which there may be dozens) happens to depend on a macro in its implementation, and thus which namespace it belongs to. This wouldn't be necessary if I targeted Clojure only. Is this really how it is right now, if I want to target both Clojure and ClojureScript?
This question's premise was largely obsoleted several years ago. With this update, I'm doing my part not to pollute the web with outdated information.
ClojureScript still, unlike Clojure, usually compiles macros in a compilation stage separate from the runtime, and there is still a good deal of incidental complexity. However, the situation has vastly improved thanks to several enhancements.
Since version 1.7, released in 2015, Clojure and ClojureScript have supported reader conditionals, which enable macros and functions to be defined in the same .cljc file for Clojure, ClojureScript, Clojure CLR, or all three: #?(:clj …, :cljs …, :cljr …, :default …). This alone mitigates much of the problem.
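For example, here is a minimal .cljc sketch; xx.core and the var names (now-ms, unless, log-unless) are hypothetical:

(ns xx.core
  #?(:cljs (:require-macros [xx.core])))    ;; let the cljs side load its own macros

(defn now-ms []                             ;; a function with a platform-specific body
  #?(:clj  (System/currentTimeMillis)
     :cljs (.getTime (js/Date.))))

#?(:clj                                     ;; macros are still compiled by Clojure
   (defmacro unless [test & body]
     `(when-not ~test ~@body)))

(defn log-unless [test msg]                 ;; a function whose body uses the macro
  (unless test (println msg)))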
In addition, ClojureScript itself has gained several enhancements to ns that remove much of the remaining incidental complexity for users of a namespace. They are now documented in Differences from Clojure, § Namespaces. They include implicit macro loading, inline macro specification, and auto-aliasing of clojure namespaces:
Implicit macro loading: If a namespace is required or used, and that namespace itself requires or uses macros from its own namespace, then the macros will be implicitly required or used using the same specifications. Furthermore, in this case, macro vars may be included in a :refer or :only spec. This oftentimes leads to simplified library usage, such that the consuming namespace need not be concerned about explicitly distinguishing between whether certain vars are functions or macros. For example:
(ns testme.core (:require [cljs.test :as test :refer [test-var deftest]]))

will result in test/is resolving properly, along with the test-var function and the deftest macro being available unqualified.
Inline macro specification: As a convenience, :require can be given either :include-macros true or :refer-macros [syms…]. Both desugar into forms which explicitly load the matching Clojure file containing macros. (This works independently of whether the namespace being required internally requires or uses its own macros.) For example:
(ns testme.core
  (:require [foo.core :as foo :refer [foo-fn] :include-macros true]
            [woz.core :as woz :refer [woz-fn] :refer-macros [apple jax]]))

is sugar for

(ns testme.core
  (:require [foo.core :as foo :refer [foo-fn]]
            [woz.core :as woz :refer [woz-fn]])
  (:require-macros [foo.core :as foo]
                   [woz.core :as woz :refer [apple jax]]))
Auto-aliasing clojure namespaces: If a non-existing clojure.* namespace is required or used and a matching cljs.* namespace exists, the cljs.* namespace will be loaded and an alias will be automatically established from the clojure.* namespace to the cljs.* namespace. For example:
(ns testme.core (:require [clojure.test]))
will be automatically converted to
(ns testme.core (:require [cljs.test :as clojure.test]))
Lastly, ClojureScript now has a second target: bootstrapped, self-hosted ClojureScript, a.k.a. CLJS-in-CLJS. In contrast to the CLJS-on-JVM compiler, the bootstrapped ClojureScript compiler actually can compile macros! Separation is still enforced in source files, but its REPL can run macros intermingled with functions just fine.
Mike Fikes has written a series of valuable articles on these and other issues of Clojure–ClojureScript portability while these features were being developed. These include:
“Portable Macro Musing”, 2015
“Messing with Macros at the REPL”, 2015
“ClojureScript Macro-Functions”, 2015
“ClojureScript Macro Tower and Loop”, 2015
“ClojureScript Macros Calling Functions”, 2016
“ClojureScript Macro Sugar”, 2016
“Collapsing Macro Tower”, 2016
Even in 2017, it is exciting to watch ClojureScript continue to mature.
It seems you understand the situation properly :)
regarding: "any user of xx (both in Clojure or ClojureScript) has to know whether each var (of which there may be dozens) happens to depend on a macro in its implementation, and which namespace it thus belongs to."
You can add two more namespaces, api.clj and api.cljs which include the proper namspaces for each of the vars for that api and take some of the pain out of that decision. It seems that this area is still quite new though.
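For instance, on the Clojure side, a hypothetical xx/api.clj could re-expose vars from the split namespaces (xx.api, type1-fn, and type2-fn are made-up names); xx/api.cljs would do the same, adding :require-macros for the macro namespace:

(ns xx.api
  (:require [xx.core :as core]
            [xx.util :as util]))

(def type1-fn core/type1-fn)    ;; re-export a type-1 function
(def type2-fn util/type2-fn)    ;; re-export a macro-dependent type-2 function

Note that re-exporting with def loses metadata such as :arglists, and macros can't be re-exported this way, so macro users would still require xx.macro directly.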
I am learning some Typed Racket at the moment, and I have a somewhat philosophical dilemma:

Racket claims to be a language development framework, and Typed Racket is one such language implemented on top of it. The documentation mentions that, because types are used, the compiler can now do more/better optimizations.
The concrete question:
Where do these optimizations happen?
1) In the compile/expand part (which is "programmable" as part of the language-building framework)
-or-
2) further down the line, in the (bytecode) optimizer (which is written in C and not directly modifiable via the framework)?

If 2) is true, does that mean the type information is lost after the compile/expand stage and later "rebuilt/guessed" by the optimizer, or has the intermediate representation been altered to accommodate the type information and inform later stages about it?

The reason I am asking this specific question is that I want to get a feeling for how general the Racket language framework really is, i.e. whether it is also viable for statically typed languages without any modifications in the backend, versus the type system being only a front-end thing, while the code at runtime is still dynamically typed (but statically checked, of course).
Thank you.
Typed Racket's optimizations occur during macro expansion. To see for yourself, you can change #lang typed/racket to #lang typed/racket #:no-optimize, which shows Typed Racket is in complete control of what optimizations are applied.
The optimizations consist of using type information to replace various uses of certain procedures with their unsafe equivalents. The unsafe procedures perform no runtime checks on the types of their arguments and cause undefined behavior (read: segfaults) if used incorrectly. You can find out more in the documentation section entitled Optimization in Typed Racket.
The exposure of the unsafe variants of procedures is what really makes it possible for user-defined languages to implement these optimizations. For example, if you wrote your own language with a type system that could prove vectors were never accessed with out-of-bounds indices, you could replace uses of vector-ref with unsafe-vector-ref.
There are similar optimizations that occur at the bytecode level, but these mostly apply when the JIT can infer type information that's not visible at macro expansion time. These are not user-controlled, but you don't have to rely on them.
I've been working with the Scala language for a few months and have already created a couple of projects in Scala. I've found the Scala REPL (at least its IntelliJ worksheet implementation) quite convenient for quick development: I can write code, see what it does, and it's nice. But I only do this for functions, not the whole program. I can't start my application and change it on the spot, or at least I don't know how (so if you do know, you're welcome to give me a piece of advice).

Several days ago my associate told me about the Clojure REPL. He uses Emacs for development, and he can change code on the spot and see the results without restarting. For example, he starts the process, and if he changes the implementation of a function, his code changes its behavior without a restart. I would like to have the same thing with the Scala language.

P.S. I want to discuss neither which language is better nor whether functional programming is better than object-oriented programming. I want to find a good solution. If Clojure is the better language for the task, then so be it.
The short answer is that Clojure was designed to use a very simple, single-pass compiler which reads and compiles a single s-expression or form at a time. For better or worse there is no global type information, no global type inference and no global analysis or optimization. Clojure uses clojure.lang.Var instances to create global bindings through a series of hashmaps from textual symbols to transactional values. def forms all create bindings at global scope in this global binding map. So where in Scala a "function" (method) will be resolved to an instance or static method on a given JVM class, in Clojure a "function" (def) is really just a reference to an entry in the table of var bindings. When a function is invoked, there isn't a static link to another class; instead the var is referenced by symbolic name, then dereferenced to get an instance of a clojure.lang.IFn object, which is then invoked.
This layer of indirection means that it is possible to re-evaluate only a single definition at a time, and that re-evaluation becomes globally visible to all clients of the re-defined var.
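A minimal REPL sketch of that indirection (f and g are made-up names):

(defn f [] :old)
(defn g [] (f))       ;; g calls f through the var #'f, not a static link
(g)                   ;; => :old
(defn f [] :new)      ;; re-evaluating f swaps the var's root binding
(g)                   ;; => :new, with no recompilation of g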
In comparison, when a definition in Scala changes, scalac must reload the changed file, macroexpand, type-infer, type-check, and compile. Then, due to the semantics of classloading on the JVM, scalac must also reload all classes which depend on methods in the class which changed. Also, all values which are instances of the changed class become trash.
Both approaches have their strengths and weaknesses. Obviously Clojure's approach is simpler to implement; however, it pays an ongoing performance cost due to continual function lookup operations, to say nothing of correctness concerns due to the lack of static types and what have you. This is arguably suitable for contexts in which lots of change is happening in a short timeframe (interactive development) but less suitable for contexts where code is mostly static (deployment, hence Oxcart). Some work I did suggests that the slowdown on Clojure programs from the lack of static method linking is on the order of 16-25%. This is not to call Clojure slow or Scala fast; they just have different priorities.
Scala chooses to do more work up front so that the compiled application will perform better which is arguably more suitable for application deployment when little or no reloading will take place, but proves a drag when you want to make lots of small changes.
Some material I have on hand about compiling Clojure code, more or less chronological by publication date, since Nicholas influenced my GSoC work a lot:
Clojure Compilation [Nicholas]
Clojure Compilation: Full Disclojure [Nicholas]
Why is Clojure bootstrapping so slow? [Nicholas]
Oxcart and Clojure [me]
Of Oxen, Carts and Ordering [me]
Which I suppose leaves me in the unhappy place of simply saying "I'm sorry, Scala wasn't designed for that the way Clojure was" with regard to code hot swapping.
I'm having trouble reloading multimethods when developing in Emacs with a SLIME REPL.
Redefining the defmethod forms works fine, but if I change the dispatch function, I don't seem to be able to reload the defmulti form. Specifically, I added or removed dispatch function parameters.
As a workaround I've been able to ns-unmap the multimethod var, reload the defmulti form, and then reload all the defmethod forms.
Presumably this is a "limitation" of the way Clojure implements multimethods, i.e. we're sacrificing some dynamism for execution speed, but are there any idioms or development practices that help work around this?
The short answer is that your way of dealing with this is exactly correct. If you find yourself updating a multimethod in order to change the dispatch function particularly frequently, (1) I think that's unusual :-), (2) you could write a suite of functions / macros to help with the reloading. I sketch two untested (!) macros to help with (2) further below.
Why?
First, however, a brief discussion of the "why". Dispatch function lookup for a multimethod as currently implemented requires no synchronization -- the dispatch fn is stored in a final field of the MultiFn object. This of course means that you cannot just change the dispatch function for a given multimethod -- you have to recreate the multimethod itself. That, as you point out, necessitates re-registration of all previously defined methods, which is a hassle.
The current behaviour lets you reload namespaces with defmethod forms in them without losing all your methods at the cost of making it slightly more cumbersome to replace the actual multimethod when that is indeed what you want to do.
If you really wanted to, the dispatch fn could be changed via reflection, but that has problematic semantics, particularly in multi-threaded scenarios (see Java Language Specification 17.5.3 for information on reflective updates to final fields after construction).
Hacks (non-reflective)
One approach to (2) would be to automate re-adding the methods after redefinition with a macro along the lines of (untested)
(defmacro redefmulti [multifn & defmulti-tail]
  `(let [mt# (methods ~multifn)]
     (ns-unmap (.ns (var ~multifn)) '~multifn)
     (defmulti ~multifn ~@defmulti-tail)
     (doseq [[dispval# meth#] mt#]
       (.addMethod ~multifn dispval# meth#))))
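Intended usage would look something like this (area is a made-up multimethod, and the macro above is untested, so treat this as a sketch):

(defmulti area :shape)
(defmethod area :circle [{:keys [r]}] (* Math/PI r r))

;; change the dispatch fn while keeping the registered :circle method
(redefmulti area (fn [shape] (:shape shape)))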
An alternative design would use a macro called, say, with-method-reregistration, taking a seqable of multifn names and a body and promising to reregister the methods after executing the body; here's a sketch (again, untested):
(defmacro with-method-reregistration [multifns & body]
  `(let [mts# (doall (zipmap [~@(map (partial list 'var) multifns)]
                             (map methods [~@multifns])))]
     ~@body
     (doseq [[v# mt#] mts#
             [dispval# meth#] mt#]
       (.addMethod @v# dispval# meth#))))
You'd use it to say (with-method-reregistration [my-multi-1 my-multi-2] (require 'ns1 'ns2 :reload)). Not sure this is worth the loss of clarity.
I am working on a Clojure project and I often find myself writing Clojure macros for DSLs, but I was watching a Clojure video about how a company uses Clojure in their real work, and the speaker said that in practical use they do not use macros for their DSLs; they only use macros to add a little syntactic sugar. Does this mean I should write my DSL using standard functions and then add a few macros at the end?
Update:
After reading the many varied (and entertaining) responses to this question, I have realized that the answer is not as clear-cut as I first thought, for many reasons:
There are many different types of API in an application (internal, external)
There are many types of user of the API (business user who just wants to get something done fast, Clojure expert)
Is the macro there to hide boilerplate code?
I will go away and think about the question more deeply, but thanks for your answers, as they have given me lots to think about. Also, I noticed that Paul Graham thinks the opposite of the Christophe video and believes macros should be a large part of the codebase (25%):
http://www.paulgraham.com/avg.html
To some extent I believe this depends on the use / purpose of your DSL.
If you are writing a library-like DSL to be used in Clojure code and want it to be used in a functional way, then I would prefer functions over macros. Functions are "nice" for Clojure users because they can be composed dynamically into higher order functions etc. For example, you are writing a functional web framework like Ring.
If you are writing an imperative DSL that will be used pretty independently of other Clojure code and you have decided that you definitely don't need higher-order functions, then the usage will be pretty similar and you can choose whichever makes the most sense. For example, you might be creating some kind of business rules engine.
If you are writing a specialised DSL that needs to produce highly performant code, then you will probably want to use macros most of the time, since they will be expanded at compile time for maximum efficiency. For example, you're writing some graphics code that needs to expand to exactly the right sequence of OpenGL calls.
Yes!
Write functions whenever possible. Never write a macro when a function will do. If you write too many macros, you end up with something that is much harder to extend. Macros, for example, can't be applied or passed around (see the snippet below).
Christophe Grand: (not= DSL macros)
http://clojure.blip.tv/file/4522250/
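To illustrate that point, a quick sketch: and is a macro, so it cannot be handed to a higher-order function, while a function wrapper composes fine:

;; (reduce and [true true false])
;; => CompilerException: Can't take value of a macro: #'clojure.core/and

(reduce #(and %1 %2) [true true false])   ;; => false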
No!
Don't be afraid of using macros extensively. Always write a macro when in doubt. Functions are inferior for implementing DSLs: they shift the burden onto the runtime, whereas macros allow you to do much of the heavyweight computation at compile time. Just think of the difference between implementing, say, an embedded Prolog as an interpreter function versus as a macro which compiles Prolog into some form of a WAM (Warren Abstract Machine).
And do not listen to those who say that "macros can't be applied or passed around"; this argument is entirely a strawman. Those people are advocating interpreters over compilers, which is simply ridiculous.
A couple of tips on how to implement DSLs using macros:
Do it in stages. Define a long chain of languages from your DSL down to the underlying Clojure. Keep each transform as simple as possible; this way you'll be able to easily maintain and debug your DSL compiler. (A tiny staged example follows this list.)
Prepare a toolbox of DSL components that you will reuse when implementing your DSLs. It should include target languages of different semantics (e.g., untyped eager functional, which is Clojure itself; untyped lazy functional; first-order logic; typed imperative; Hindley-Milner typed eager functional; dataflow; etc.). With macros it is trivial to combine properties of all these target semantics seamlessly.
Maintain a set of compiler-building tools. It should include parser generators (useful even if your DSLs are entirely in S-expressions), term rewriting engines, pattern matching engines, implementations for some common algorithms on graphs (e.g., graph colouring), etc.
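As a minimal illustration of the staged approach (simplify and calc are made-up names), here is a two-stage toy DSL in which constant folding happens entirely at compile time:

(defn simplify [expr]                      ;; stage 1: source-to-source rewrite
  (if (and (seq? expr) (= '+ (first expr)))
    (let [args (map simplify (rest expr))
          {consts true, others false} (group-by number? args)]
      (if (seq others)
        `(+ ~(apply + consts) ~@others)
        (apply + consts)))
    expr))

(defmacro calc [expr]                      ;; stage 2: emit the simplified form
  (simplify expr))

;; (calc (+ 1 2 x)) expands to (+ 3 x); the fold costs nothing at runtime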
Here's an example of a DSL in Haskell that uses functions rather than macros:
http://contracts.scheming.org/
Here is a video of Simon Peyton Jones giving a talk about this implementation:
http://ulf.wiger.net/weblog/2008/02/29/simon-peyton-jones-composing-contracts-an-adventure-in-financial-engineering/
Leverage the characteristics of Clojure and FP before going down the path of implementing your own language. I think SK-logic's tips give you a good indication of what is needed to implement a full-blown language. There are times when it's worth the effort, but those are rare.
I want to provide multiple implementations of a message reader/writer. What is the best approach?
Here is some pseudo-code of what I'm currently thinking:
just have a set of functions that all implementations must provide and leave it up to the caller to hold onto the right streams
(ns x-format)
(defn read-message [stream] ...)
(defn write-message [stream message] ...)
return a map with two closures holding onto the stream
(ns x-format)
(defn make-formatter [socket]
  {:read  (fn [] (.read (.getInputStream socket)))
   :write (fn [message] (.write (.getOutputStream socket) message))})
something else?
I think the first option is better. It's more extensible, depending on how these objects are going to be used. It's easier to add or change a function that works on an existing object if the functions and objects are separate. In Clojure there usually isn't much reason to bundle functions together with the objects they work on, unless you really want to hide implementation details from users of your code.
If you're writing an interface for which you expect many implementations, also consider using multimethods. You can have the default implementation throw a "not implemented" exception, to force implementors to implement your interface; a sketch follows.
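A minimal sketch of that idea, assuming messages are represented as maps carrying a :format key (the key and the :json implementation are made up):

(defmulti read-message :format)

(defmethod read-message :default [formatter]
  (throw (UnsupportedOperationException.
          (str "read-message not implemented for " (:format formatter)))))

;; an implementation registers itself for its own dispatch value:
(defmethod read-message :json [formatter]
  ;; read and parse a message from (:stream formatter) here
  )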
As Gutzofter said, if the only reason you're considering the second option is to allow people not to have to pass a parameter on every function call, you could consider having all of your functions use some var as the default socket object and writing a with-socket macro which uses binding to set that var's value. See the built-in printing functions, which default to using the value of *out* as the output stream, and with-out-str, which binds *out* to a string writer, as a Clojure example.
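That approach might look like this (*socket* and with-socket are made-up names):

(def ^:dynamic *socket* nil)              ;; rebindable default socket

(defmacro with-socket [socket & body]
  `(binding [*socket* ~socket]
     ~@body))

(defn read-message []
  (.read (.getInputStream *socket*)))

;; usage: (with-socket s (read-message))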
This article may interest you; it compares and contrasts some OOP idioms with Clojure equivalents.
I think that read-message and write-message are utility functions. What you need to do is encapsulate your functions in a with- macro (or macros). See with-output-to-string in Common Lisp to see what I mean.
Edit:
When you use a with- macro, you can have error handling and resource allocation in the macro expansion.
I'd go with the first option and make all those functions multimethods.