Does Scala have a lazy evaluating wrapper?

I want to return a wrapper/holder for a result that I want to compute only once and only if the result is actually used. Something like:
def getAnswer(question: Question): Lazy[Answer] = ???
println(getAnswer(q).value)
This should be pretty easy to implement using lazy val:
import scala.util.Try

class Lazy[T](f: () => T) {
  private lazy val _result = Try(f())
  def value: T = _result.get
}
But I'm wondering if there's already something like this baked into the standard API.
A quick search pointed at Streams and DelayedLazyVal but neither is quite what I'm looking for.
Streams do memoize the stream elements, but it seems like the first element is computed at construction:
def compute(): Int = { println("computing"); 1 }
val s1 = compute() #:: Stream.empty
// computing is printed here, before doing s1.take(1)
In a similar vein, DelayedLazyVal starts computing upon construction, and it even requires an execution context:
val dlv = new DelayedLazyVal(() => 1, { println("started") })
// immediately prints out "started"

There's scalaz.Need which I think you'd be able to use for this.
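For example, a minimal sketch assuming scalaz is on the classpath (Need is scalaz's call-by-need wrapper, so the thunk runs at most once and only when .value is first demanded; String stands in for the question's Question/Answer types):
import scalaz.Need

def getAnswer(question: String): Need[String] =
  Need { println("computing"); s"answer to $question" } // nothing is computed yet

val answer = getAnswer("q")
println(answer.value) // "computing" is printed here, on first access
println(answer.value) // memoized: no second "computing"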

Scala parallelization method

Having a look at the forall method implementation, how could this method be parallel?
def forall(p: A => Boolean): Boolean = {
  var result = true
  breakable {
    for (x <- this)
      if (!p(x)) { result = false; break }
  }
  result
}
As I understand it, for better parallelization one should avoid vars and prefer vals. How is this supposed to work in parallel if I use forall?
You are looking at a non-parallel version of forall.
A parallel version looks like this (provided by a parallel collection, in this case ParIterableLike):
def forall(pred: T => Boolean): Boolean = {
  tasksupport.executeAndWaitResult(
    new Forall(pred, splitter assign new DefaultSignalling with VolatileAbort))
}
To get a parallel collection, insert a .par, for example:
List.range(1, 10).par.forall(n => n % 2 == 0)
If you use an IDE like IntelliJ, just "zoom" into that forall and you'll see the code above. Otherwise, here is the source of ParIterableLike (thanks @Kigyo for sharing the URL).
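To see the short-circuiting from the outside, here is a rough sketch (the AtomicInteger counter is just for illustration; chunks already in flight still finish, so the exact count varies):
import java.util.concurrent.atomic.AtomicInteger

val calls = new AtomicInteger(0)
val allSmall = (1 to 1000000).par.forall { n =>
  calls.incrementAndGet()
  n < 500 // fails quickly, so the abort signal lets the remaining chunks be skipped
}
println(s"allSmall = $allSmall, predicate evaluated ${calls.get} times") // usually far fewer than 1000000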

How to create a play.api.libs.iteratee.Enumerator which inserts some data between the items of a given Enumerator?

I use the Play framework with ReactiveMongo. Most of the ReactiveMongo APIs are based on the Play Enumerator. As long as I fetch some data from MongoDB and return it "as-is" asynchronously, everything is fine. Transforming the data, like converting BSON to String with Enumerator.map, is also straightforward.
But today I faced a problem which boils down to the following code. I wasted half of the day trying to create an Enumerator that consumes items from a given Enumerator and inserts some items between them. It is important not to load all the items at once, as there could be many of them (the code example has only two items, "1" and "2"), but semantically it is similar to mkString on collections. I am sure it can be done very easily, but the best I could come up with was the code below. Very similar code that creates an Enumerator using Concurrent.broadcast serves me well for WebSockets, but here even that does not work: the HTTP response never comes back. Looking at Enumeratee, it seems to be meant to provide exactly this kind of transformation, but I could not find a way to do the trick.
P.S. I tried calling chan.eofAndEnd in Iteratee.mapDone, and chunked(enums >>> Enumerator.eof) instead of chunked(enums) - it did not help. Sometimes the response comes back, but it does not contain the correct data. What am I missing?
def trans(in: Enumerator[String]): Enumerator[String] = {
  val (res, chan) = Concurrent.broadcast[String]
  val iter = Iteratee.fold(true) { (isFirst, curr: String) =>
    if (!isFirst)
      chan.push("<-------->")
    chan.push(curr)
    false
  }
  in.apply(iter)
  res
}
def enums: Enumerator[String] = {
  val en12 = Enumerator[String]("1", "2")
  trans(en12)
  //en12 //if I comment the previous line and uncomment this, it prints "12" as expected
}
def enum = Action {
  Ok.chunked(enums)
}
Here is my solution which I believe to be correct for this type of problem. Comments are welcome:
def fill[From](
  prefix: From => Enumerator[From],
  infix: (From, From) => Enumerator[From],
  suffix: From => Enumerator[From]
)(implicit ec: ExecutionContext) = new Enumeratee[From, From] {
  override def applyOn[A](inner: Iteratee[From, A]): Iteratee[From, Iteratee[From, A]] = {
    // type of the state we will use for fold
    case class State(prev: Option[From], it: Iteratee[From, A])

    Iteratee.foldM(State(None, inner)) { (prevState, newItem: From) =>
      val toInsert = prevState.prev match {
        case None           => prefix(newItem)
        case Some(prevItem) => infix(prevItem, newItem)
      }
      for (newIt <- toInsert >>> Enumerator(newItem) |>> prevState.it)
        yield State(Some(newItem), newIt)
    } mapM {
      case State(None, it) =>
        // this is possible when our input was empty
        Future.successful(it)
      case State(Some(lastItem), it) =>
        suffix(lastItem) |>> it
    }
  }
}
// if there are missing integers between from and to, fill that gap with 0
def fillGap(from: Int, to: Int)(implicit ec: ExecutionContext) = Enumerator enumerate List.fill(to - from - 1)(0)
def fillFrom(x: Int)(input: Int)(implicit ec: ExecutionContext) = fillGap(x, input)
def fillTo(x: Int)(input: Int)(implicit ec: ExecutionContext) = fillGap(input, x)

val ints  = Enumerator(10, 12, 15)
val toStr = Enumeratee.map[Int](_.toString)
val infill = fill(
  fillFrom(5),
  fillGap,
  fillTo(20)
)
val res = ints &> infill &> toStr // res will yield 0,0,0,0,10,0,12,0,0,15,0,0,0,0
You wrote that you are working with WebSockets, so why don't you use a dedicated solution for that? What you wrote is better suited for Server-Sent Events than for WS. As I understood you, you want to filter your results before sending them back to the client? If that's correct, then you need an Enumeratee instead of an Enumerator: an Enumeratee is a from-to transformation. This is a very good piece of code showing how to use an Enumeratee. It may not be directly about what you need, but I found inspiration there for my project. Maybe once you analyze the given code you will find the best solution.
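For reference, a minimal sketch of an Enumeratee as a from-to transformation (Play 2.x iteratees; the names here are illustrative and unrelated to the separator problem above):
import play.api.libs.iteratee._
import scala.concurrent.ExecutionContext.Implicits.global

val toUpper: Enumeratee[String, String] = Enumeratee.map[String](_.toUpperCase)
val source: Enumerator[String] = Enumerator("a", "b", "c")
val shouted: Enumerator[String] = source &> toUpper // yields "A", "B", "C"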

Create Future without starting it

This is a follow-up to my previous question
Suppose I want to create a future with my function but don't want to start it immediately (i.e. I do not want to call val f = Future { ... /* my function */ }).
Now I see it can be done as follows:
val p = promise[Unit]
val f = p.future map { _ => // my function here }
Is it the only way to create a future with my function w/o executing it?
You can do something like this:
val p = Promise[Unit]()
val f = p.future
// ... some code run at a later time
p.success {
  // your function
}
LATER EDIT:
I think the pattern you're looking for can be encapsulated like this:
import scala.concurrent.{Future, Promise}

class LatentComputation[T](f: => T) {
  private val p = Promise[T]()
  def trigger() { p.success(f) }
  def future: Future[T] = p.future
}

object LatentComputation {
  def apply[T](f: => T) = new LatentComputation(f)
}
You would use it like this:
val comp = LatentComputation {
  // your code to be executed later
}
val f = comp.future
// somewhere else in the code
comp.trigger()
You could always defer creation with a closure: you won't get the future object right away, but you get a handle you can call later.
type DeferredComputation[T, R] = T => Future[R]

def deferredCall[T, R](futureBody: T => R): DeferredComputation[T, R] =
  t => future { futureBody(t) }

def deferredResult[R](futureBody: => R): DeferredComputation[Unit, R] =
  _ => future { futureBody }
If you are getting too fancy with execution control, maybe you should be using actors instead?
Or, perhaps, you should be using a Promise instead of a Future: a Promise can be passed on to others, while you keep it to "fulfill" it at a later time.
It's also worth giving a plug to Promise.completeWith.
You already know how to use p.future onComplete mystuff.
You can trigger that from another future using p completeWith f.
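A minimal sketch of that pattern, assuming an implicit execution context is in scope (the names here are illustrative):
import scala.concurrent.{Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global

val p = Promise[Int]()
p.future.onComplete(result => println(s"got $result")) // the "mystuff" handler: runs once p is completed

val f = Future { 21 * 2 }
p.completeWith(f) // p completes when f does, triggering the onComplete above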
You can also define a function that creates and returns the Future, and then call it:
val double = (value: Int) => {
  val f = Future { Thread.sleep(1000); value * 2 }
  f.onComplete(x => println(s"Future return: $x"))
  f
}
println("Before future.")
double(2)
println("After future is called, but as the future takes 1 sec to run, it will be printed before.")
I used this to execute futures in batches of n, something like:
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// The function that returns the future.
val double = (i: Int) => {
  val future = Future {
    println(s"Start task $i")
    Thread.sleep(1000)
    i * 2
  }
  future.onComplete(_ => {
    println(s"Task $i ended")
  })
  future
}

val numbers = 1 to 20
numbers
  .map(i => (i, double))
  .grouped(5)
  .foreach(batch => {
    val result = Await.result(Future.sequence(batch.map { case (i, callback) => callback(i) }), 5.minutes)
    println(result)
  })
Or just use regular methods that return futures, and fire them in sequence using something like a for comprehension (sequential call-site evaluation).
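A minimal sketch of that approach (step1 and step2 are hypothetical; because they are defs, each Future is only created, and therefore only starts, when the for comprehension reaches it):
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

def step1(): Future[Int] = Future { 1 }
def step2(x: Int): Future[Int] = Future { x + 1 }

val result: Future[Int] =
  for {
    a <- step1()  // step2 has not started yet
    b <- step2(a) // starts only after step1 completes
  } yield b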
This is a well-known problem with the standard library's Future: it is designed in such a way that it is not referentially transparent, since it evaluates eagerly and memoizes its result. In most use cases this is totally fine, and Scala developers rarely need to create a non-evaluated future.
Take the following program:
val x = Future(...); f(x, x)
It is not the same program as
f(Future(...), Future(...))
because in the first case the future is evaluated once and in the second case it is evaluated twice.
There are libraries which provide the necessary abstractions to work with referentially transparent asynchronous tasks, whose evaluation is deferred and not memoized unless explicitly requested by the developer:
Scalaz Task
Monix Task
fs2
If you are looking to use Cats, Cats Effect works nicely with both Monix and fs2.
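As a minimal sketch, assuming Monix 3.x is on the classpath (the Scalaz Task and fs2 variants look similar): a Task only describes the computation, and it is re-evaluated on every run unless you memoize it.
import monix.eval.Task
import monix.execution.Scheduler.Implicits.global

val task = Task { println("running"); 42 } // nothing is printed yet
// ... later, when you actually want it to run:
val future = task.runToFuture // "running" is printed here
val again  = task.runToFuture // and again here: no memoization unless task.memoize is used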
This is a bit of a hack, since it has nothing to do with how Future works, but just adding lazy would suffice:
lazy val f = Future { ... // my function}
but note that this is sort of a type change as well, because wherever you reference it you will need to declare the reference as lazy too, or it will be executed at that point.
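A tiny illustration of that caveat, assuming an implicit execution context is in scope:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

lazy val f = Future { println("started"); 42 } // nothing runs yet
lazy val alsoLazy = f // still nothing runs
val forced = f // "started" is printed here: a plain reference kicks the Future off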

Repeating function call until we'll get non-empty Option result in Scala

A very newbie question in Scala: how do I do "repeat a function until something is returned that meets my criteria" in Scala?
Given that I have a function that I'd like to call until it returns a result, for example, defined like this:
def tryToGetResult: Option[MysteriousResult]
I've come up with this solution, but I really feel that it is ugly:
var res: Option[MysteriousResult] = None
do {
  res = tryToGetResult
} while (res.isEmpty)
doSomethingWith(res.get)
or, equivalently ugly:
var res: Option[MysteriousResult] = None
while (res.isEmpty) {
  res = tryToGetResult
}
doSomethingWith(res.get)
I really feel like there is a solution without var and without so much hassle around manual checking whether Option is empty or not.
For comparison, Java alternative that I see seems to be much cleaner here:
MysteriousResult tryToGetResult(); // returns null if no result yet
MysteriousResult res;
while ((res = tryToGetResult()) == null);
doSomethingWith(res);
To add insult to injury, if we don't need to doSomethingWith(res) and just need to return it from this function, Scala vs Java looks like this:
Scala
def getResult: MysteriousResult = {
  var res: Option[MysteriousResult] = None
  do {
    res = tryToGetResult
  } while (res.isEmpty)
  res.get
}
Java
MysteriousResult getResult() {
  while (true) {
    MysteriousResult res = tryToGetResult();
    if (res != null) return res;
  }
}
You can use Stream's continually method to do precisely this:
val res = Stream.continually(tryToGetResult).flatMap(_.toStream).head
Or (possibly more clearly):
val res = Stream.continually(tryToGetResult).dropWhile(!_.isDefined).head
One advantage of this approach over explicit recursion (besides the concision) is that it's much easier to tinker with. Say for example that we decided that we only wanted to try to get the result a thousand times. If a value turns up before then, we want it wrapped in a Some, and if not we want a None. We just add a few characters to our code above:
Stream.continually(tryToGetResult).take(1000).flatMap(_.toStream).headOption
And we have what we want. (Note that the Stream is lazy, so even though the take(1000) is there, if a value turns up after three calls to tryToGetResult, it will only be called three times.)
Performing side effects like this makes me die a little inside, but how about this?
scala> import scala.annotation.tailrec
import scala.annotation.tailrec
scala> @tailrec
     | def lookupUntilDefined[A](f: => Option[A]): A = f match {
     |   case Some(a) => a
     |   case None => lookupUntilDefined(f)
     | }
lookupUntilDefined: [A](f: => Option[A])A
Then call it like this
scala> def tryToGetResult(): Option[Int] = Some(10)
tryToGetResult: ()Option[Int]
scala> lookupUntilDefined(tryToGetResult())
res0: Int = 10
You may want to give lookupUntilDefined an additional parameter so it can stop eventually in case f is never defined.
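A sketch of that suggestion (the maxTries parameter and the Option return type are illustrative additions, not part of the original answer):
import scala.annotation.tailrec

@tailrec
def lookupUntilDefined[A](f: => Option[A], maxTries: Int): Option[A] =
  if (maxTries <= 0) None // give up instead of looping forever
  else f match {
    case Some(a) => Some(a)
    case None => lookupUntilDefined(f, maxTries - 1)
  }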

Scala Parallel Collections- How to return early?

I have a list of possible input values:
val inputValues = List(1,2,3,4,5)
I have a really long-to-compute function that gives me a result:
def reallyLongFunction( input: Int ) : Option[String] = { ..... }
Using scala parallel collections, I can easily do
inputValues.par.map( reallyLongFunction( _ ) )
to get all the results, in parallel. The problem is, I don't really want all the results; I only want the FIRST result. As soon as one of my inputs is a success, I want my output and I want to move on with my life. This approach does a lot of extra work.
So how do I get the best of both worlds? I want to
Get the first result that returns something from my long function
Stop all my other threads from useless work.
Edit -
I solved it like a dumb Java programmer by having
@volatile var done = false;
which is set and checked inside my reallyLongFunction. This works, but does not feel very Scala-like. I would like a better way to do this...
(Update: no, it doesn't work; it doesn't do the map.)
Would it work to do something like:
inputValues.par.find(v => reallyLongFunction(v).isDefined)
The implementation uses this:
protected[this] class Find[U >: T](pred: T => Boolean, protected[this] val pit: IterableSplitter[T]) extends Accessor[Option[U], Find[U]] {
  @volatile var result: Option[U] = None
  def leaf(prev: Option[Option[U]]) = { if (!pit.isAborted) result = pit.find(pred); if (result != None) pit.abort }
  protected[this] def newSubtask(p: IterableSplitter[T]) = new Find(pred, p)
  override def merge(that: Find[U]) = if (this.result == None) result = that.result
}
which looks pretty similar in spirit to your @volatile, except you don't have to look at it ;-)
I interpreted your question in the same way as huynhjl, but if you just want to search and discard the Nones, you could do something like this to avoid the need to repeat the computation when a suitable outcome is found:
class Computation[A, B](value: A, function: A => B) {
  lazy val result = function(value)
}

def f(x: Int) = { // your function here
  Thread.sleep(100 - x)
  if (x > 5) Some(x * 10)
  else None
}

val list = List.range(1, 20) map (i => new Computation(i, f))
val found = list.par find (_.result.isDefined)
// found is Option[Computation[Int, Option[Int]]]
val result = found map (_.result.get)
// result is Option[Int]
However find for parallel collections seems to do a lot of unnecessary work (see this question), so this might not work well, with current versions of Scala at least.
Volatile flags are used in the parallel collections (take a look at the source for find, exists, and forall), so I think your idea is a good one. It's actually better if you can include the flag in the function itself. It kills referential transparency on your function (i.e. for certain inputs your function now sometimes returns None rather than Some), but since you're discarding the stopped computations, this shouldn't matter.
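A rough sketch of the flag-inside-the-function idea (expensiveSearch is a hypothetical stand-in for the real work):
@volatile var done = false

def expensiveSearch(input: Int): Option[String] = ??? // placeholder for the real work

def reallyLongFunction(input: Int): Option[String] =
  if (done) None // someone else already succeeded: skip the work
  else {
    val result = expensiveSearch(input)
    if (result.isDefined) done = true // tell the remaining parallel tasks to give up
    result
  }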
If you're willing to use a non-core library, I think Futures would be a good match for this task. For instance:
Akka's Futures include Futures.firstCompletedOf
Twitter's Futures include Future.select
...both of which appear to enable the functionality you're looking for.
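For comparison, a minimal sketch with the standard library's Future.firstCompletedOf (the scala.concurrent analogue of the combinators above; note it completes with whichever future finishes first, successful or not, and it does not cancel the remaining computations):
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

def reallyLongFunction(input: Int): Option[String] = ??? // from the question

val inputValues = List(1, 2, 3, 4, 5)
val futures = inputValues.map(i => Future(reallyLongFunction(i)))
val first = Await.result(Future.firstCompletedOf(futures), 1.minute)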