I'm trying to implement the ideas from http://thinkrelevance.com/blog/2013/06/04/clojure-workflow-reloaded into my codebase.
I have a dao layer, where I now need to pass in a database in order to avoid global state. One thing that is throwing me off is the phrase:
Any function which needs one of these components has to take it as a
parameter. This isn't as burdensome as it might seem: each function
gets, at most, one extra argument providing the "context" in which it
operates. That context could be the entire system object, but more
often will be some subset. With judicious use of lexical closures, the
extra arguments disappear from most code.
Where should I use closures in order to avoid passing global state for every call? One example would be to create an init function in the dao layer, something like this:
(defprotocol Persistable
  (collection-name [this]))

(def save nil)

(defn init [{:keys [db]}]
  (alter-var-root #'save
                  (fn [_]
                    (fn [obj]
                      (mc/insert-and-return db (collection-name obj) obj WriteConcern/SAFE)))))
This way I can initiate my dao layer from the system/start function like this:
(defn start
  [{:keys [db] :as system}]
  (let [d (-> db
              (mc/connect)
              (mc/get-db "my-test"))]
    (dao/init {:db d})
    (assoc system :db d)))
This works, but it feels a bit icky. Is there a better way? If possible I would like to avoid forcing clients of my dao layer to pass a database every time they use a function.
You can use higher-order functions to represent your DAO layer; that's the crux of functional programming: using functions to represent small to large parts of your system. So you have a higher-order function which takes the DB connection as a parameter and returns another function which you can use to call various operations like save, delete, etc. on the database. Below is one such example:
(defn db-layer [db-connection]
  (let [db-operations {:save   (fn [obj] (save db-connection obj))
                       :delete (fn [obj] (delete db-connection obj))
                       :query  (fn [q] (query db-connection q))}]
    (fn [operation & params]
      (-> (db-operations operation) (apply params)))))
Usage of DB layer:
(let [my-db (create-database)
      db-layer-fn (db-layer my-db)]
  (db-layer-fn :save "abc")
  (db-layer-fn :delete "abc"))
This is just an example of how higher-order functions let you create a context for another set of functions. You can take this concept even further by combining it with other Clojure features such as protocols.
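For instance, here is a rough sketch of the protocol variant (DaoOps and make-dao are illustrative names; save and delete are the same placeholders used above): the connection is closed over once, and callers pass around the resulting value instead of the raw connection.

(defprotocol DaoOps
  (save!   [this obj])
  (delete! [this obj]))

(defn make-dao [db-connection]
  ;; reify closes over db-connection, so callers never have to pass it
  (reify DaoOps
    (save!   [_ obj] (save db-connection obj))
    (delete! [_ obj] (delete db-connection obj))))

;; usage:
(let [dao (make-dao (create-database))]
  (save! dao "abc")
  (delete! dao "abc"))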
I have 2 files: enviro.clj and point.clj; both in the same folder.
I want to import point.clj into enviro.clj.
enviro.clj:
(ns game-of-life.enviro
  (:require [game-of-life.point :as point]))
(defrecord Enviro [cells dims])

(defn create-dead-enviro [width height]
  (Enviro.
    (replicate (* width height) :dead)
    (point/Point. width height)))
point.clj:
(ns game-of-life.point)
; A 2D point representing a coordinate, or any pair of numbers
(defrecord Point [x y])
With this set-up though, Intellij (with Cursive) is saying that it can't resolve point/Point. inside of create-dead-enviro. It does however suggest importing it. If I allow it to auto-fix it, it changes the top of enviro.clj to:
(ns game-of-life.enviro
  (:require [game-of-life.point :as point])
  (:import (game_of_life.point Point)))
From what I've read though, import is only for Java interop to import a Java class; it's not used to "import" a Clojure namespace.
What am I missing here?
Edit
Still no. I changed enviro.clj to:
(ns game-of-life.enviro
  (:require [game-of-life.point :as point]))

(defrecord Enviro [cells dims])

(defn create-dead-enviro [width height]
  (Enviro.
    (replicate (* width height) :dead)
    (->Point width height)))
And I'm still getting a "cannot resolve" error.
This is a bit of specialness around records, and only records: you need to import the generated class if you want to use the (RecordName. args) Java interop constructor form.
If you use the ->Enviro helper function you don't need to add the extra import.
user> (defrecord Enviro [cells dims])
user.Enviro
user> (->Enviro 1 2)
#user.Enviro{:cells 1, :dims 2}
and it's a bit more Clojure-ish to do it that way anyway.
Record types are a quick way to define a named type for interacting with Java libraries that expect one. They are also slightly faster for field access than maps. When using records, keep in mind that if you dissoc one of the declared fields, the result silently stops being a record and reverts to a plain map (adding extra keys and removing them again is fine). In general, use records when you know you need them for Java interop, or in very tightly optimized code that you have already carefully benchmarked (I have never seen this in practice). They have some value for documentation as well.
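A quick REPL check of that caveat (record? has been in clojure.core since 1.6):

user> (defrecord Point [x y])
user.Point
user> (record? (assoc (->Point 1 2) :z 3))   ; extra key added and kept
true
user> (record? (dissoc (->Point 1 2) :x))    ; declared field removed
false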
Here is an example of using the ->RecordName function instead of the Java interop form.
user> (ns game-of-life.point)
nil
game-of-life.point> (defrecord Point [x y])
game_of_life.point.Point
game-of-life.point> (in-ns 'user)
#namespace[user]
user> (require '[game-of-life.point :as point])
nil
user> (point/->Point 1 2)
#game_of_life.point.Point{:x 1, :y 2}
Because the Java interop form refers to a named class generated outside the usual namespace conventions, you need to import that class if you use the ClassName. constructor or an explicit call to new to create your record object. If you use the automatically created ->ClassName function, you don't need to use import.
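For completeness, the interop constructor also works once the generated class is imported (continuing the REPL session above):

user> (import 'game_of_life.point.Point)
game_of_life.point.Point
user> (Point. 1 2)
#game_of_life.point.Point{:x 1, :y 2}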
I'm trying to implement a get-database function that retrieves the database reference from Monger the first time it is called, remembers the value in an atom and returns it directly on subsequent calls. My current code looks like this:
(def database (atom nil))

(defn get-database
  []
  (compare-and-set! database nil
                    (let [db (:db (mg/connect-via-uri (System/getenv "MONGOLAB_URI")))] db))
  @database)
The problem is that the let clause seems to be evaluated even if compare-and-set! returns false (i.e. database is not nil). Is there some way to get this to evaluate lazily so I don't incur the penalty of retrieving the Monger connection, or is this approach fundamentally misguided?
The problem here is that compare-and-set! is a function, so all of its arguments are evaluated before the function itself is called.
The typical approach that I take for the use case of caching and re-using some expensive-to-compute value is with a delay:
Takes a body of expressions and yields a Delay object that will invoke the body only the first time it is forced (with force or deref/@), and will cache the result and return it on all subsequent force calls. See also - realized?
In your case:
(def database (delay (:db (mg/connect-via-uri (System/getenv "MONGOLAB_URI")))))
Now you can just say @database any time you want to get a reference to the database, and the connection will get initialized the first time your code actually dereferences the delay. You could wrap the deref inside a get-database function if you'd like, but this isn't necessary.
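If you do want to keep a get-database call in the rest of your code, the wrapper is a one-liner:

(defn get-database [] @database)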
I've run into a problem that a third-party library needs to act on a class as if it was finalized. After some reading I understand the motivation behind this mechanism, but I don't really know how it functions.
Example:
(make-instance 'expression :op '+ :left 'nan :right 'nan)

(defmethod normalize-expression ((this expression))
  (optima:match this
    ((optima::or (expression :left 'nan) (expression :right 'nan)) 'nan)
    ((expression :op op :left x :right y) (funcall op x y))))
Unless I add the first line, the function will not compile, giving me this error:
; caught ERROR:
; (during macroexpansion of (SB-PCL::%DEFMETHOD-EXPANDER NORMALIZE-EXPRESSION ...))
; SB-MOP:CLASS-SLOTS called on #<STANDARD-CLASS EXPRESSION>, which is not yet finalized.
; See also:
; AMOP, Generic Function SB-MOP:CLASS-SLOTS
optima is a pattern-matching library; the (expression :op op ...) patterns match instances of the class expression. I don't know the details, but it looks like optima needs to know which accessors are defined for this class, and that information is not available until the class is finalized. So, is there any way to sidestep the finalization problem?
The class will not be extended (at least not in this project, and it's not being planned). It doesn't hurt that much to create a dummy instance... it is just an ugly solution, so I hoped to find a better one. Also, perhaps, I'd get some more info on finalization, which is good too :)
Forgetting to ensure class finalization seems to be a quite common mistake when using the MOP.
In Lisp, classes are defined in two "phases":
Direct class definition
Effective class definition
Direct class definition is isomorphic to the defclass form. It has the class name, the names of the superclasses, and the list of direct slots (i.e., slots defined on this particular class but not on its superclasses).
Effective class definition contains all the information needed by the compiler/interpreter: the list of all class slots (including those defined on superclasses), the class instance layout, references to accessor methods, etc.
The process of transforming a direct class definition into an effective class definition is called class finalization. Since CLOS supports redefining classes, finalization may happen multiple times for a class. One of the reasons finalization is delayed is that a class may be defined before its superclasses are.
Regarding your particular problem: it seems that optima:match should ensure that the class is finalized before trying to list its slots. This can be done with two functions: class-finalized-p (to check whether the class still needs finalization) and finalize-inheritance (to actually perform the finalization). Or you can use the utility function closer-mop:ensure-finalized (closer-mop is a library for portable use of the CLOS MOP).
E.g.,:
(c2mop:ensure-finalized (find-class 'expression))
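If you prefer to spell the check out yourself, the same thing can be done with the two functions mentioned above, again via closer-mop's portable wrappers:

(let ((class (find-class 'expression)))
  (unless (c2mop:class-finalized-p class)
    (c2mop:finalize-inheritance class)))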
In common-lisp, I want to implement a kind of reference system like this:
Suppose that I have:
(defclass reference () ((host) (port) (file)))
and also I have:
(defun fetch-remote-value (reference) ...) which fetches and deserializes a lisp object.
How could I intervene in the evaluation process so as whenever a reference object is being evaluated, the remote value gets fetched and re-evaluated again to produce the final result?
EDIT:
A more elaborate description of what I want to accomplish:
Using cl-store I serialize Lisp objects and send them to a remote file (or DB, or anything) to be saved. Upon successful storage I keep the host, port and file in a reference object. I would like, whenever eval gets called on a reference object, to first retrieve the object and then call eval on the retrieved value. Since a reference can also be serialized inside other (parent) objects or aggregate types, modifying eval would give me recursive remote-reference resolution for free, so I don't have to traverse the loaded object and resolve its child references myself.
EDIT:
Since objects always evaluate to themselves, my question is a bit wrongly posed. Essentially what I would like to do is:
intercept the evaluation of symbols, so that when a symbol's value is an object of type REFERENCE, the result of the symbol evaluation is not the object itself but the result of (fetch-remote-value object).
In short: you cannot do this, except by rewriting the function eval and modifying your Lisp's compiler. The rules of evaluation are fixed by the Lisp standard.
Edit After reading the augmented question, I don't think you can achieve full transparency for your references here. In a scenario like
(defclass foo () ((reference :accessor ref)))

(ref some-foo)
The result of the call to ref is simply a value; it will not be considered for evaluation regardless of its type.
Of course, you could define your accessors in a way, which does the resolution transparently:
(defmacro defresolver (name class slot)
  `(defmethod ,name ((inst ,class))
     (fetch-remote-reference (slot-value inst ',slot))))

(defresolver foo-reference foo reference)
Edit You can (sort of) hook into the symbol resolution mechanism of Common Lisp using symbol macros:
(defmacro let-with-resolution (bindings &body body)
  `(symbol-macrolet
       ,(mapcar #'(lambda (form) (list (car form) `(fetch-aux ,(cadr form)))) bindings)
     ,@body))

(defmethod fetch-aux ((any t)) any)
(defmethod fetch-aux ((any reference)) (fetch-remote-reference any))
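A hypothetical use, assuming *ref* already holds a reference object: inside the body, every read of r expands to (fetch-aux *ref*), so the remote value is fetched wherever r appears.

(let-with-resolution ((r *ref*))
  (print r))   ; actually evaluates (print (fetch-aux *ref*))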
However, now things become pretty arcane; and the variables are no longer variables, but magic symbols, which merely look like variables. For example, modifying the content of a variable "bound" by this macro is not possible. The best you can do with this approach is to provide a setf expansion for fetch-aux, which modifies the original place.
Although libraries for lazy evaluation and object persistence bring you part of the way, Common Lisp does not provide a portable way to implement fully transparent persistent values. Lazy or persistent values still have to be forced explicitly.
The MOP can be used to implement lazy or persistent objects, though, with the slot values transparently forced. It would take a change in the internals of a Common Lisp implementation to provide general transparency, so that you could do e.g. (+ p 5) with p potentially holding a persistent or lazy value.
It is not possible to directly change the evaluation mechanisms. You would need to write a compiler for your code to something else. Kind of an embedded language.
On the CLOS level there are several ways to deal with it:
Two examples:
write functions that dispatch on the reference object:
(defmethod move ((object reference) position)
  (move (dereference object) position))

(defmethod move ((object automobile) position)
  ...)
This gets ugly and might be automated with a macro.
CHANGE-CLASS
CLOS objects already have an indirection, because they can change their class. Even though they may change their class, they keep their identity. CHANGE-CLASS destructively modifies the instance.
So it would be possible to pass around reference objects and, at some point, load the data, change the reference object's class to some other class, and set its slots accordingly. This change of class needs to be triggered somewhere in the code.
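A minimal sketch of that idea (the resolved-value class and its value slot are made up for illustration):

(defclass resolved-value ()
  ((value :initarg :value :reader value)))

(defmethod resolve-reference ((ref reference))
  ;; Fetch the remote data, then destructively turn the reference
  ;; into a resolved object while keeping its identity.
  (change-class ref 'resolved-value :value (fetch-remote-value ref)))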
One way to have it triggered automagically might be an error handler that catches some kind of error involving reference objects.
I would add a layer on top of your deserialize mechanism that dispatches based on the type of the incoming data.
I'm running up against a problem in understanding the CLOS way of handling file access within a class. In c++ I would be able to do this:
class Foo {
    Foo (string filename);    // opens the file (my_file) requested by the filename
    ~Foo ();                  // close the file
    FILE * my_file;           // a persistent file-handle
    DataStruct my_data;       // some data
    void ParseData ();        // will perform some function on the file and populate my_data
    DataStruct * GetData () { return &my_data; }  // accessor to the data
};
What I'd like to point out is that ParseData() will be called multiple times, and each time a new block of data will be parsed from the file and my_data will be altered.
I'm trying to perform the same trick in CLOS - create all the generic methods to parse the data, load the file, read headers, etc. as well as the class definition which I have as:
(defclass data-file ()
  ((filename :initarg :filename :accessor filename)
   (file :accessor file)
   (frame :accessor frame)))
In the "constructor" (i.e. initialize-instance) I open the file just as my c++ idiom. Then I have access to the data and I can parse the data as before. However, I'm told that using a "destructor" or (finalize) method to close the file is not idiomatic CLOS for handling this type of situation where I need the file to be around so I can access it outside of my data-file methods.
I'm going to define a function that loads a data-file, and then performs a series of analyses with its data, and then hopefully close it. What's a way to go about doing this? (I'm assuming a macro or some type of closure would work in here, but I'm not familiar enough with the lisp way to decide what is needed or how to implement it).
One option is to have the stream as a slot instead of the filename, and then scope it with WITH-OPEN-FILE:
(with-open-file (stream file)
  (let ((foo (make-instance 'foo :stream stream)))
    (frob foo)
    (... other processing of foo ...)))
Then your stream will be closed automatically.
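One way the foo class might look under that approach (slot and accessor names are illustrative):

(defclass foo ()
  ((stream :initarg :stream :accessor foo-stream)))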
I think I would lean towards making classes only to store complete authoritative data (what you call DataStruct?).
You don't really need a special class for "loading + storage of another class". Plus, that way has the unspoken invariant that my_data holds the data of my_file up to the current seek position, which seems a bit strange to my eye.
Put another way: what does Foo do? Given a filename, it loads data, and gives you a DataStruct. That sounds like a function to me. If you need to be able to run it in a thread, or fire events between loading records, a class is the natural way to do it in C++, but you don't need a class for those things in Lisp.
Also, remember that you don't need to use DEFCLASS in order to use generic methods in CLOS.
I don't know what the structure of your data is, but in similar situations I've made a parse-one-chunk function that takes a stream and returns one record, and then created a complete Foo instance inside a loop in a with-open-file. If the stream is never needed outside the scope of a with-open-file expansion, you never need to worry about closing it.
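As a rough sketch of that shape (parse-one-chunk is hypothetical and assumed to return nil at end of file):

(defun load-records (filename)
  (with-open-file (stream filename)
    (loop for record = (parse-one-chunk stream)
          while record
          collect record)))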