Lazy evaluation with compare-and-set - mongodb

I'm trying to implement a get-database function that retrieves the database reference from Monger the first time it is called, remembers the value in an atom and returns it directly on subsequent calls. My current code looks like this:
(def database (atom nil))
(defn get-database
[]
(compare-and-set! database nil
(let [db (:db (mg/connect-via-uri (System/getenv "MONGOLAB_URI")))] db))
#database)
The problem is that the let clause seems to be evaluated even if compare-and-set! returns false (i.e. database is not nil). Is there some way to get this to evaluate lazily so I don't incur the penalty of retrieving the Monger connection, or is this approach fundamentally misguided?

The problem here is that compare-and-set! is a function, so evaluating it will evaluate all the parameters before the function is called.
The typical approach that I take for the use case of caching and re-using some expensive-to-compute value is with a delay:
Takes a body of expressions and yields a Delay object that will
invoke the body only the first time it is forced (with force or deref/#), and
will cache the result and return it on all subsequent force
calls. See also - realized?
In your case:
(def database (delay (:db (mg/connect-via-uri (System/getenv "MONGOLAB_URI")))))
Now you can just say #database any time you want to get a reference to the database, and the connection will get initialized the first time your code actually causes the delay to be dereferenced. You could wrap the call to dereference the delay inside a get-database function if you'd like, but this isn't necessary.

Related

When should we not use "for loops" in scala?

Twitter's Effective Scala says:
"for provides both succinct and natural expression for looping and aggregation. It is especially useful when flattening many sequences. The syntax of for belies the underlying mechanism as it allocates and dispatches closures. This can lead to both unexpected costs and semantics; for example
for (item <- container) {
if (item != 2)
return
}
may cause a runtime error if the container delays computation, making the return nonlocal!
For these reasons, it is often preferrable to call foreach, flatMap, map, and filter directly — but do use fors when they clarify."
I don't understand why there can be runtime errors here.
The Twitter manual is warning that the code inside of the for may be bound inside a closure, which is (in this case) a function running inside of an execution environment that is built to capture the local environment present at the call site. You can read more about closures here - http://en.wikipedia.org/wiki/Closure_%28computer_programming%29.
So the for construct may package all of the code inside the loop into a separate function, bind the particular item and the the local environment to the closure (so that all references to variables outside of the loop are still valid inside the function), and then execute that function somewhere else, ie in another, hidden call frame.
This becomes a problem for your return, which is supposed to immediately exit the current call frame. But if your closure is executing elsewhere, then the closure function becomes the target for the return, rather than the actual surrounding method, which is what you probably meant.
Simon has a very nice answer regarding the return problem. I would like to add that in general using for loops is simply bad Scala style when foreach, map, etc would do. The exception is when you want nested foreach, map, etc like behaviour and in this case a for loop, rather, to be specific, a "for comprehension" would be appropriate.
E.g.
// Good, better than for loop
myList.foreach(println)
// Ok but might be easier to understsnd the for comprehension
(1 to N).map(i => (1 to M).map((i, _)))
// Equivilent to above and some say easier to read
for {
i <- (1 to N)
j <- (1 to M)
} yield (i, j)

Lisp Function Interpretation

I am reading a book and I am confused on what the following code does:
(defmethod execute ((o ORDER) (l SIMUL) (e MARKETUPDATE))
(values
(list (make-TRADE :timestamp (timestamp e)
:price (price e)
:quantity (orderquantity o)))
NIL))
The source to which I got this function says that it returns two values. My question is what the body does. From my understanding, the line 3-5 creates a list with :timestamp, :price, :quantity. Am I correct? What about values, the second line? Does it return this variable too? Any summary would help. Thanks
This is a method for a generic function, specializing on arguments of types order, simul, and marketupdate.
It returns 2 values:
A list of length 1 created by the eponymous function list, which contains a single object of, presumably, type trade (probably - but not necessarily - created by a defstruct), which has slots timestamp, price, and quantity.
Symbol nil.
You can probably access the slots of the trade using functions trade-timestamp &c (unless the defstruct form is non-trivial or trade is not defined by a defstruct at all).
Why the result of make-trade is wrapped in a list is hard to guess without more context, but I'd guess that an execute can be split into N trades in some scenarios.
I suspect your confusion arises almost entire because this is the first time you have encountered a use of values. Common Lisp allows functions to return multiple values. That's slightly similar to how any language allows functions to receive multiple parameters.
These multiple return values are quite efficiently implemented. Most newbies encounter multiple values for the first time on the integer division functions, which will return a remainder as their second value. Hash table look ups will return a second value to indicate if the key was actually in the table, since the value stored for the key might be nil.
In your example the second value is NIL, presumably other execute methods might return something more interesting - for example where in the update Q the order was place, or an error code if something goes wrong. Of course checking out the manual for values will be fraught with educational values(sic).
This function is a method returning two values by using the keyword values. Have a look at CLOS to better understand object orientation and at "values" for the way of returning more than one value.

Avoiding global state in DAO layer of a Clojure application

I'm trying to implement the ideas from http://thinkrelevance.com/blog/2013/06/04/clojure-workflow-reloaded into my codebase.
I have a dao layer, where I now need to pass in a database in order to avoid global state. One thing that is throwing me off is the phrase:
Any function which needs one of these components has to take it as a
parameter. This isn't as burdensome as it might seem: each function
gets, at most, one extra argument providing the "context" in which it
operates. That context could be the entire system object, but more
often will be some subset. With judicious use of lexical closures, the
extra arguments disappear from most code.
Where should I use closures in order to avoid passing global state for every call? One example would be to create an init function in the dao layer, something like this:
(defprotocol Persistable
(collection-name [this]))
(def save nil)
(defn init [{:keys [db]}]
(alter-var-root #'save (fn [_] (fn [obj] (mc/insert-and-return db (collection-name obj) obj WriteConcern/SAFE)))))
This way I can initiate my dao layer from the system/start function like this:
(defn start
[{:keys [db] :as system}]
(let [d (-> db
(mc/connect)
(mc/get-db "my-test"))]
(dao/init d)
(assoc system :db d)))
This works, but it feels a bit icky. Is there a better way? If possible I would like to avoid forcing clients of my dao layer to have to pass a database every time it uses a function.
You can use higher order function to represent your DAO layer - that's the crux of functional programming, using functions to represent small to large parts of your system. So you have a higher order function which takes in the DB connection as param and return you another function which you can use to call various operations like save, delete etc on the database. Below is one such example:
(defn db-layer [db-connection]
(let [db-operations {:save (fn [obj] (save db-connection obj))
:delete (fn [obj] (delete db-connection obj))
:query (fn [query] (query db-connection query))}]
(fn [operation & params]
(-> (db-operations operation) (apply params)))))
Usage of DB layer:
(let [my-db (create-database)
db-layer-fn (db-layer my-db)]
(db-layer-fn :save "abc")
(db-layer-fn :delete "abc"))
This is just an example of how higher order functions can allow you to sort of create a context for another set of functions. You can take this concept even further by combining it with other Clojure features like protocols.

In common-lisp how can i override/change evaluation behaviour for a specific type of object?

In common-lisp, I want to implement a kind of reference system like this:
Suppose that I have:
(defclass reference () ((host) (port) (file)))
and also I have:
(defun fetch-remote-value (reference) ...) which fetches and deserializes a lisp object.
How could I intervene in the evaluation process so as whenever a reference object is being evaluated, the remote value gets fetched and re-evaluated again to produce the final result?
EDIT:
A more elaborate description of what I want to accomplish:
Using cl-store I serialize lisp objects and send them to a remote file(or db or anything) to be saved. Upon successful storage I keep the host,port and file in a reference object. I would like, whenever eval gets called on a reference object, to first retrieve the object, and then call eval on the retrieved value. Since a reference can be also serialized in other (parent) objects or aggregate types, I can get free recursive remote reference resolution by modyfing eval so i dont have to traverse and resolve the loaded object's child references myself.
EDIT:
Since objects always evaluate to themselves, my question is a bit wrongly posed. Essentially what I would like to do is:
I would like intercept the evaluation of symbols so that when their value is an object of type REFERENCE then instead of returning the object as the result of the symbol evaluation, to return the result of (fetch-remote-value object) ?
In short: you cannot do this, except by rewriting the function eval and modifying your Lisp's compiler. The rules of evaluation are fixed Lisp standard.
Edit After reading the augmented question, I don't think, that you can achieve full transperency for your references here. In a scenario like
(defclass foo () (reference :accessor ref))
(ref some-foo)
The result of the call to ref is simply a value; it will not be considered for evaluation regardless of its type.
Of course, you could define your accessors in a way, which does the resolution transparently:
(defmacro defresolver (name class slot)
`(defmethod ,name ((inst ,class))
(fetch-remote-reference (slot-value inst ',slot))))
(defresolver foo-reference foo reference)
Edit You can (sort of) hook into the symbol resolution mechanism of Common Lisp using symbol macros:
(defmacro let-with-resolution (bindings &body body)
`(symbol-macrolet ,(mapcar #'(lambda (form) (list (car form) `(fetch-aux ,(cadr form)))) bindings) ,#body))
(defmethod fetch-aux ((any t)) any)
(defmethod fetch-aux ((any reference)) (fetch-remote-reference any))
However, now things become pretty arcane; and the variables are no longer variables, but magic symbols, which merely look like variables. For example, modifying the content of a variable "bound" by this macro is not possible. The best you can do with this approach is to provide a setf expansion for fetch-aux, which modifies the original place.
Although libraries for lazy evaluatione and object persistence bring you part of the way, Common Lisp does not provide a portable way to implement fully transparent persistent values. Lazy or persistent values still have to be explicitly forced.
MOP can be used to implement lazy or persistent objects though, with the slot values transparently forced. It would take a change in the internals of the Common Lisp implementations to provide general transparency, so you could do e.g. (+ p 5) with p potentially holding a persistent or lazy value.
It is not possible to directly change the evaluation mechanisms. You would need to write a compiler for your code to something else. Kind of an embedded language.
On the CLOS level there are several ways to deal with it:
Two examples:
write functions that dispatch on the reference object:
(defmethod move ((object reference) position)
(move (dereference reference) position))
(defmethod move ((object automobile) position)
...))
This gets ugly and might be automated with a macro.
CHANGE-CLASS
CLOS objects already have an indirection, because they can change their class. Even though they may change their class, they keep their identity. CHANGE-CLASS is destructively modifying the instance.
So that would make it possible to pass around reference objects and at some point load the data, change the reference object to some other class and set the slots accordingly. This changing the class needs to be triggered somewhere in the code.
One way to have it automagically triggered might be an error handler that catches some kinds of errors involving reference object.
I would add a layer on top of your deserialize mechanism that dispatches based on the type of the incoming data.

lisp file pointers in classes

I'm running up against a problem in understanding the CLOS way of handling file access within a class. In c++ I would be able to do this:
class Foo {
Foo (string filename); // opens the file (my_file) requested by the filename
~Foo (); // close the file
FILE * my_file; // a persistent file-handle
DataStruct my_data; // some data
void ParseData (); // will perform some function on the file and populate my_data
DataStruct * GetData () { return &my_data; } // accessor to the data
};
What I'd like to point out is that PraseData() will be called multiple times, and each time a new block of data will be parsed from the file and my_data will be altered.
I'm trying to perform the same trick in CLOS - create all the generic methods to parse the data, load the file, read headers, etc. as well as the class definition which I have as:
(defclass data-file ()
((filename :initarg :filename :accessor filename)
(file :accessor file)
(frame :accessor frame)))
In the "constructor" (i.e. initialize-instance) I open the file just as my c++ idiom. Then I have access to the data and I can parse the data as before. However, I'm told that using a "destructor" or (finalize) method to close the file is not idiomatic CLOS for handling this type of situation where I need the file to be around so I can access it outside of my data-file methods.
I'm going to define a function that loads a data-file, and then performs a series of analyses with its data, and then hopefully close it. What's a way to go about doing this? (I'm assuming a macro or some type of closure would work in here, but I'm not familiar enough with the lisp way to decide what is needed or how to implement it).
One option is to have the stream as a slot instead of the filename, and then scope it with WITH-OPEN-FILE:
(with-open-file (stream file)
(let ((foo (make-instance 'foo :stream stream)))
(frob foo)
(...other processing of foo...)))
Then your stream will be closed automatically.
I think I would lean towards making classes only to store complete authoritative data (what you call DataStruct?).
You don't really need a special class for "loading + storage of another class". Plus, that way has the unspoken invariant that my_data holds the data of my_file up to the current seek position, which seems a bit strange to my eye.
Put another way: what does Foo do? Given a filename, it loads data, and gives you a DataStruct. That sounds like a function to me. If you need to be able to run it in a thread, or fire events between loading records, a class is the natural way to do it in C++, but you don't need a class for those things in Lisp.
Also, remember that you don't need to use DEFCLASS in order to use generic methods in CLOS.
I don't know what the structure of your data is, but in similar situations I've made a parse-one-chunk function that takes a stream and returns one record, and then create a complete Foo instance inside a loop in a with-open-file. If the stream is never needed outside the scope of a with-open-file expansion, you never need to worry about closing it.