which clojure library interface design is best? - lisp

I want to provide multiple implementations of a message reader/writer. What is the best approach?
Here is some pseudo-code of what I'm currently thinking:
just have a set of functions that all implementations must provide and leave it up to the caller to hold onto the right streams
(ns x-format)
(read-message [stream] ...)
(write-message [stream message] ...)
return a map with two closed functions holding onto the stream
(ns x-format)
(defn make-formatter [socket]
{:read (fn [] (.read (.getInputStream socket))))
:write (fn [message] (.write (.getOutputStream socket) message)))})
something else?

I think the first option is better. It's more extensible, depending how these objects are going to be used. It's easier to add or change a new function that works on an existing object if the functions and objects are separate. In Clojure there usually isn't much reason to bundle functions along with the objects they work on, unless you really want to hide implementation details from users of your code.
If you're writing an interface for which you expect many implementations, consider using multimethods also. You can have the default throw a "not implemented" exception, to force implementors to implement your interface.
As Gutzofter said, if the only reason you're considering the second option is to allow people not to have to type a parameter on every function call, you could consider having all of your functions use some var as the default socket object and writing a with-socket macro which uses binding to set that var's value. See the builtin printing methods which default to using the value of *out* as the output stream, and with-out-str which binds *out* to a string writer, as a Clojure example.
This article may interest you; it compares and contrasts some OOP idioms with Clojure equivalents.

I think that read-message and write-message are utility functions. What you need to do is encapsulate your functions in a with- macro(s). See 'with-output-to-string' in common lisp to see what I mean.
Edit:
When you use a with- macro you can have error handling and resource allocation in the macro expansion.

I'd go with the first option and make all those functions multimethods.

Related

Can monads replace macros when writing a redis json driver?

In this blog entry by Jim Duey - he provides a list of reasons that you'd want to use monads. One his reasons is this:
So what are some clues that a monadic solution is possible? It seems to me that anytime you're copying and pasting code to define a new function that's similar to an existing one, there might be a monad lurking.
This is actually quite similar to the justification for using a Clojure style macro.
In this presentation by Aaron Bedra - he talks about a use case for Macros when writing a redis driver to generate json. On slide 66 - he shows an example of this.
(defmacro defcommand
[name params]
(let [p (parameters params)]
`(defn ~name ~params
My question is - can Monads be used to solve this problem of duplicate code when generating json for redis calls instead of a macro?
Assumptions
I understand it is more idiomatic to choose a macro over a monad in Clojure. For the purpose of this question I'm choosing to ignore what is idiomatic and just look at what is possible.
anytime you're copying and pasting code to define a new function
that's similar to an existing one
I would create a higher order function to get rid of copy paste, as simple as that.
Now, Macro or Monad or even a simple higher order function solves a single purpose but at different conceptual levels. The purpose is to "abstract away a common pattern in a system".
The different conceptual levels at which you can see a pattern appearing in a system are:
Same chunk of code - Use function / higher order function
Same chunk of code but that require some sort of "compile time" preprocessing to make it abstract - Macro
Pattern of "A value inside a box with some context" and it fits with the monad laws - Monads. (Maybe, IO, List - all these are API patterns where we see a value inside a box with some context)
I know it sounds very "abstract", but once you practice enough thinking about how to abstract general patterns in your system, you will eventually get the intuition about which tool to use for which pattern.
Monads and Macros operate at different levels of abstraction.
I think of macros as "code monkeys": anytime I find myself writing a lot of boilerplate, I take a moment and think whether macros can help.
Monads on the other hand are very powerful abstractions generally used in order to isolate and compose side effects in a purely functional setting. (I'm not an expert on Monads so take this with a grain of salt).
The bottom line is that I think they are solving fundamentally different problems.
If we take the example you provided, writing a Redis driver, any function accessing the network is a candidate for an abstraction through monads - in this case, both the IO and Maybe (or Either) monads. But not because of repetition, but rather because your functions have side effects.
Now this is where Monads in clojure can get harder than it's worth: If you want to compose both monads, you need to use monad transformers. Obviously this is not enforced by clojure since it's a dynamic language so one can argue debugging composed monadic code in a dynamic language can be very hard. A static type system such as Haskell's would be of great help here.
So while a redis driver can definitely be written in a monadic style, I believe the point Aaron is making in the presentation is avoiding repetition, and for that, I believe macros are a better fit in Clojure.
I hope this is helpful.
EDIT:
I should also node that monads in Clojure would be hardly useful without the Haskell inspired do notation which is essentially made possible by macros, providing further proof that macros are at a different level of abstraction.

Reloading multimethods via Slime

I'm having trouble reloading multimethods when developing in Emacs with a Slime repl.
Redefining the defmethod forms works fine, but if I change the dispatch function I don't seem to be able to reload the defmulti form. I think I specifically added or removed dispatch function parameters.
As a workaround I've been able to ns-unmap the multimethod var, reload the defmulti form, and then reload all the defmethod forms.
Presumably this is a "limitation" of the way Clojure implements multimethods, i.e. we're sacrificing some dynamism for execution speed, but are there any idioms or development practises that help workaround this?
The short answer is that your way of dealing with this is exactly correct. If you find yourself updating a multimethod in order to change the dispatch function particularly frequently, (1) I think that's unusual :-), (2) you could write a suite of functions / macros to help with the reloading. I sketch two untested (!) macros to help with (2) further below.
Why?
First, however, a brief discussion of the "why". Dispatch function lookup for a multimethod as currently implemented requires no synchronization -- the dispatch fn is stored in a final field of the MultiFn object. This of course means that you cannot just change the dispatch function for a given multimethod -- you have to recreate the multimethod itself. That, as you point out, necessitates re-registration of all previously defined methods, which is a hassle.
The current behaviour lets you reload namespaces with defmethod forms in them without losing all your methods at the cost of making it slightly more cumbersome to replace the actual multimethod when that is indeed what you want to do.
If you really wanted to, the dispatch fn could be changed via reflection, but that has problematic semantics, particularly in multi-threaded scenarios (see Java Language Specification 17.5.3 for information on reflective updates to final fields after construction).
Hacks (non-reflective)
One approach to (2) would be to automate re-adding the methods after redefinition with a macro along the lines of (untested)
(defmacro redefmulti [multifn & defmulti-tail]
`(let [mt# (methods ~multifn)]
(ns-unmap (.ns (var ~multifn)) '~multifn)
(defmulti ~multifn ~#defmulti-tail)
(doseq [[dispval# meth#] mt#]
(.addMethod ~multifn dispval# meth#))))
An alternative design would use a macro called, say, with-method-reregistration, taking a seqable of multifn names and a body and promising to reregister the methods after executing the body; here's a sketch (again, untested):
(defmacro with-method-reregistration [multifns & body]
`(let [mts# (doall (zipmap ~(map (partial list 'var) multifns)
(map methods ~multifns))))]
~#body
(doseq [[v# mt#] mts#
[dispval# meth#] mt#]
(.addMethod #v# dispval# meth#))))
You'd use it to say (with-method-reregistration [my-multi-1 my-multi-2] (require :reload 'ns1 ns2)). Not sure this is worth the loss of clarity.

Would the ability to declare Lisp functions 'pure' be beneficial?

I have been reading a lot about Haskell lately, and the benefits that it derives from being a purely functional language. (I'm not interested in discussing monads for Lisp) It makes sense to me to (at least logically) isolate functions with side-effects as much as possible. I have used setf and other destructive functions plenty, and I recognize the need for them in Lisp and (most of) its derivatives.
Here we go:
Would something like (declare pure) potentially help an optimizing compiler? Or is this a moot point because it already knows?
Would the declaration help in proving a function or program, or at least a subset that was declared as pure? Or is this again something that is unnecessary because it's already obvious to the programmer and compiler and prover?
If for nothing else, would it be useful to a programmer for the compiler to enforce purity for functions with this declaration and add to the readability/maintainablity of Lisp programs?
Does any of this make any sense? Or am I too tired to even think right now?
I'd appreciate any insights here. Info on compiler implementation or provability is welcome.
EDIT
To clarify, I didn't intend to restrict this question to Common Lisp. It clearly (I think) doesn't apply to certain derivative languages, but I'm also curious if some features of other Lisps may tend to support (or not) this kind of facility.
You have two answers but neither touch on the real problem.
First, yes, it would obviously be good to know that a function is pure. There's a ton of compiler level things that would like to know that, as well as user level things. Given that lisp languages are so flexible, you could twist things a bit: instead of a "pure" declaration that asks the compiler to try harder or something, you just make the declaration restrict the code in the definition. This way you can guarantee that the function is pure.
You can even do that with additional supporting facilities -- I mentioned two of them in a comment I made to johanbev's answer: add the notion of immutable bindings and immutable data structures. I know that in Common Lisp these are very problematic, especially immutable bindings (since CL loads code by "side-effecting" it into place). But such features will help simplifying things, and they're not inconceivable (see for example the Racket implementation that has immutable pairs and other data structures, and has immutable bindings.
But the real question is what can you do in such restricted functions. Even a very simple looking problem would be infested with issues. (I'm using Scheme-like syntax for this.)
(define-pure (foo x)
(cons (+ x 1) (bar)))
Seems easy enough to tell that this function is indeed pure, it doesn't do anything . Also, seems that having define-pure restrict the body and allow only pure code would work fine in this case, and will allow this definition.
Now start with the problems:
It's calling cons, so it assumes that it is also known to be pure. In addition, as I mentioned above, it should rely on cons being what it is, so assume that the cons binding is immutable. Easy, since it's a known builtin. Do the same with bar, of course.
But cons does have a side effect (even if you're talking about Racket's immutable pairs): it allocates a new pair. This seems like a minor and ignorable point, but, for example, if you allow such things to appear in pure functions, then you won't be able to auto-memoize them. The problem is that someone might rely on every foo call returning a new pair -- one that is not-eq to any other existing pair. Seems that to make it fine you need to further restrict pure functions to deal not only with immutable values, but also values where the constructor doesn't always create a new value (eg, it could hash-cons instead of allocate).
But that code also calls bar -- so no you need to make the same assumptions on bar: it must be known as a pure function, with an immutable binding. Note specifically that bar receives no arguments -- so in that case the compiler could not only require that bar is a pure function, it could also use that information and pre-compute its value. After all, a pure function with no inputs could be reduced to a plain value. (Note BTW that Haskell doesn't have zero-argument functions.)
And that brings another big issue in. What if bar is a function of one input? In that case you'd have an error, and some exception will get thrown ... and that's no longer pure. Exceptions are side-effects. You now need to know the arity of bar in addition to everything else, and you need to avoid other exceptions. Now, how about that input x -- what happens if it isn't a number? That will throw an exception too, so you need to avoid it too. This means that you now need a type system.
Change that (+ x 1) to (/ 1 x) and you can see that not only do you need a type system, you need one that is sophisticated enough to distinguish 0s.
Alternatively, you could re-think the whole thing and have new pure arithmetic operations that never throw exceptions -- but with all the other restrictions you're now quite a long way from home, with a language that is radically different.
Finally, there's one more side-effect that remains a PITA: what if the definition of bar is (define-pure (bar) (bar))? It certainly is pure according to all of the above restrictions... But diverging is a form of a side effect, so even this is no longer kosher. (For example, if you did make your compiler optimize nullary functions to values, then for this example the compiler itself would get stuck in an infinite loop.) (And yes, Haskell doesn't deal with that, it doesn't make it less of an issue.)
Given a Lisp function, knowing if it is pure or not is undecidable in general. Of course, necessary conditions and sufficient conditions can be tested at compile time. (If there are no impure operations at all, then the function must be pure; if an impure operation gets executed unconditionally, then the function must be impure; for more complicated cases, the compiler could try to prove that the function is pure or impure, but it will not succeed in all cases.)
If the user can manually annotate a function as pure, then the compiler could either (a.) try harder to prove that the function is pure, ie. spend more time before giving up, or (b.) assume that it is and add optimizations which would not be correct for impure functions (like, say, memoizing results). So, yes, annotating functions as pure could help the compiler if the annotations are assumed to be correct.
Apart from heuristics like the "trying harder" idea above, the annotation would not help to prove stuff, because it's not giving any information to the prover. (In other words, the prover could just assume that the annotation is always there before trying.) However, it could make sense to attach to pure functions a proof of their purity.
The compiler could either (a.) check if pure functions are indeed pure at compile time, but this is undecidable in general, or (b.) add code to try to catch side effects in pure functions at runtime and report those as an error. (a.) would probably be helpful with simple heuristics (like "an impure operation gets executed unconditionally), (b.) would be useful for debug.
No, it seems to make sense. Hopefully this answer also does.
The usual goodies apply when we can assume purity and referential
transparency. We can automatically memoize hotspots. We can
automatically parallelize computation. We can deal away with a lot of
race conditions. We can also use structure sharing with data that we
know cannot be modified, for instance the (quasi) primitive ``cons()''
does not need to copy the cons-cells in the list it's consing to.
These cells are not affected in any way by having another cons-cell
pointing to it. This example is kinda obvious, but compilers are often
good performers in figuring out more complex structure sharing.
However, actually determining if a lambda (a function) is pure or has
referential transparency is very tricky in Common Lisp. Remember that
a funcall (foo bar) start by looking at (symbol-function foo). So in
this case
(defun foo (bar)
(cons 'zot bar))
foo() is pure.
The next lambda is also pure.
(defun quux ()
(mapcar #'foo '(zong ding flop)))
However, later on we can redefine foo:
(let ((accu -1))
(defun foo (bar)
(incf accu)))
The next call to quux() is no longer pure! The old pure foo() has been
redefined to an impure lambda. Yikes. This example is maybe somewhat
contrived but it's not that uncommon to lexically redefine some
functions, for instance with a let block. In that case it's not
possible to know what would happen at compile time.
Common Lisp has a very dynamic semantic, so actually being
able to determine control flow and data flow ahead of time (for
instance when compiling) is very hard, and in most useful cases
entirely undecidable. This is quite typical of languages with dynamic
type systems. There is a lot of common idioms in Lisp you cannot use
if you must use static typing. It's mainly these that fouls any
attempt to do much meaningful static analysis. We can do it for primitives
like cons and friends. But for lambdas involving other things than
primitives we are in much deeper water, especially in the cases where
we need to look at complex interplay between functions. Remember that
a lambda is only pure if all the lambdas it calls are also pure.
On the top of my head, it could be possible, with some deep macrology,
to do away with the redefinition problem. In a sense, each lambda gets
an extra argument which is a monad that represents the entire state of
the lisp image (we can obviously restrict ourselves to what the function
will actually look at). But it's probably more useful to be able do
declare purity ourselves, in the sense that we promise the compiler
that this lambda is indeed pure. The consequences if it isn't is then
undefined, and all sorts of mayhem could ensue...

Scheme R5RS: pass by reference

I'm trying to emulate a stack in scheme. I'm using DrScheme and I select the language R5RS. I need to create functions that pop, push, and peek. But i'm having trouble figuring out how to pass by reference. I've read some information about boxes, but they are not supported in R5RS. Is there any other way to pass by reference?
Short answer: don't use r5rs; just use the native language. In current versions of DrRacket, that language is called "racket". Here's a program that uses boxes:
#lang racket
(define b (box 234))
(set-box! b 333)
(unbox b)
FWIW: Greg's answer is more purely functional than mine, but it would be a mistake to believe that mutable structures are not available in DrRacket (nee DrScheme).
Finally finally, you're misusing the term "call by reference". Boxes are just mutable structures, and a call-by-value language (such as racket, r5rs, java, etc.) can mutate these structures just fine.
Instead of passing "by reference", which is something you might do in an imperative language, Scheme encourages you to think in a functional sense. This means that your push operation, for example, would take two parameters:
a stack
a new element
and return a new stack that contains the new element in combination with the rest of the existing stack. Similarly, the pop operation would take a stack and return one with the top element gone, and peek would return the value of the top element.
As it turns out, lists in Scheme work almost exactly like stacks. The following mappings will help you get started:
push - cons
pop - rest
peek - first

Why to use empty parentheses in Scala if we can just use no parentheses to define a function which does not need any arguments?

As far as I understand, in Scala we can define a function with no parameters either by using empty parentheses after its name, or no parentheses at all, and these two definitions are not synonyms. What is the purpose of distinguishing these 2 syntaxes and when should I better use one instead of another?
It's mostly a question of convention. Methods with empty parameter lists are, by convention, evaluated for their side-effects. Methods without parameters are assumed to be side-effect free. That's the convention.
Scala Style Guide says to omit parentheses only when the method being called has no side-effects:
http://docs.scala-lang.org/style/method-invocation.html
Other answers are great, but I also think it's worth mentioning that no-param methods allow for nice access to a classes fields, like so:
person.name
Because of parameterless methods, you could easily write a method to intercept reads (or writes) to the 'name' field without breaking calling code, like so
def name = { log("Accessing name!"); _name }
This is called the Uniform Access Principal
I have another light to bring to the usefulness of the convention encouraging an empty parentheses block in the declaration of functions (and thus later in calls to them) with side effects.
It is with the debugger.
If one add a watch in a debugger, such as, say, process referring for the example to a boolean in the focused debug context, either as a variable view, or as a pure side-effect free function evaluation, it creates a nasty risk for your later troubleshooting.
Indeed, if the debugger keeps that watch as a try-to-evaluate thing whenever you change the context (change thread, move in the call stack, reach another breakpoint...), which I found to be at least the case with IntelliJ IDEA, or Visual Studio for other languages, then the side-effects of any other process function possibly found in any browsed scope would be triggered...
Just imagine the kind of puzzling troubleshooting this could lead to if you do not have that warning just in mind, because of some innocent regular naming. If the convention were enforced, with my example, the process boolean evaluation would never fall back to a process() function call in the debugger watches; it might just be allowed in your debugger to explicitly access the () function putting process() in the watches, but then it would be clear you are not directly accessing any attribute or local variables, and fallbacks to other process() functions in other browsed scopes, if maybe unlucky, would at the very least be very less surprising.