Scala parallelization method - scala

Having a look at the forall method implementation, how could this method be parallel?
def forall(p: A => Boolean): Boolean = {
var result = true
breakable {
for (x <- this)
if (!p(x)) { result = false; break }
}
result
}
As I understand it, better parallelization means avoiding vars and preferring vals. How is forall supposed to run in parallel when its implementation relies on a var?

You are looking at a non-parallel version of forall.
A parallel version looks like this (provided by a parallel collection, in this case ParIterableLike):
def forall(pred: T => Boolean): Boolean = {
  tasksupport.executeAndWaitResult(new Forall(pred, splitter assign new DefaultSignalling with VolatileAbort))
}
To get a parallel collection, insert a .par, for example:
List.range(1, 10).par.forall(n => n % 2 == 0)
If you use an IDE like IntelliJ, just "zoom" into that forall call and you'll see the code above. Otherwise, see the source of ParIterableLike (thanks @Kigyo for sharing the URL).
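To make the idea behind the abort signalling concrete, here is a minimal sketch that uses only scala.concurrent from the standard library. It is not the library's implementation; ParForallSketch, forallPar and chunks are names made up for illustration. Each task keeps its own local var, and the only shared state is an abort flag, which is why a var as such does not prevent parallelization.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParForallSketch {
  // Hypothetical helper: checks each chunk of xs in its own Future.
  def forallPar[A](xs: Vector[A], chunks: Int)(p: A => Boolean): Boolean = {
    require(chunks > 0, "chunks must be positive")
    val aborted = new AtomicBoolean(false) // shared abort signal
    val size = math.max(1, (xs.length + chunks - 1) / chunks)
    val tasks = xs.grouped(size).toList.map { chunk =>
      Future {
        val it = chunk.iterator
        var ok = true // local to this task, never shared
        while (ok && !aborted.get() && it.hasNext) {
          if (!p(it.next())) { ok = false; aborted.set(true) }
        }
        ok
      }
    }
    // The task that found a counterexample reports false, so the result is false.
    Await.result(Future.sequence(tasks), 1.minute).forall(identity)
  }
}
```

For example, ParForallSketch.forallPar(Vector.range(1, 10), 4)(n => n % 2 == 0) stops the other chunks soon after one of them sees an odd number.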

Related

Does scala have a lazy evaluating wrapper?

I want to return a wrapper/holder for a result that I want to compute only once and only if the result is actually used. Something like:
def getAnswer(question: Question): Lazy[Answer] = ???
println(getAnswer(q).value)
This should be pretty easy to implement using lazy val:
class Lazy[T](f: () => T) {
private lazy val _result = Try(f())
def value: T = _result.get
}
But I'm wondering if there's already something like this baked into the standard API.
A quick search pointed at Streams and DelayedLazyVal but neither is quite what I'm looking for.
Streams do memoize the stream elements, but it seems like the first element is computed at construction:
def compute(): Int = { println("computing"); 1 }
val s1 = compute() #:: Stream.empty
// computing is printed here, before doing s1.take(1)
In a similar vein, DelayedLazyVal starts computing upon construction, and even requires an execution context:
val dlv = new DelayedLazyVal(() => 1, { println("started") })
// immediately prints out "started"
There's scalaz.Need which I think you'd be able to use for this.
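Without extra dependencies, the lazy val wrapper from the question already gives compute-once, on-demand behaviour. A quick sketch to check that (LazyDemo, calls, and this parameterless getAnswer are hypothetical names used only for the demo):

```scala
import scala.util.Try

object LazyDemo {
  // Same shape as the wrapper in the question.
  final class Lazy[T](f: () => T) {
    private lazy val _result = Try(f())
    def value: T = _result.get
  }

  var calls = 0 // counts how often the computation actually runs

  def getAnswer(): Lazy[Int] = new Lazy(() => { calls += 1; 42 })
}
```

Nothing runs when getAnswer() returns; the first access to value triggers the computation, and later accesses reuse the cached Try (including a cached failure, which is rethrown each time).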

Extended scope in while loops scala

In scala we can use for loops as follows:
for { a <- someCollection
b = a.someFunc} //inbody we can use b
I need similar functionality with a while loop for example:
while({ val a = someFunction
a.isDefined }) { /* do something with a in body */ }
How can I do this in scala?
EDIT
I know we can do this using a var on top and modifying it within the loop but I was looking for something more elegant.
What I wish to accomplish is as follows. The function someFunction iterates over a collection of Options and checks for an item that is not None. If it does not find such an item, it returns None. I can probably do this using
for( a <- myCollection
if a.isDefined) {}
but in this case I don't make use of my function.
You could write your own extended while function, like:
def extendedWhile[T](condFunc: => Option[T])(block: T => Unit): Unit = {
val a = condFunc
if (a.isDefined) {
block(a.get)
extendedWhile(condFunc)(block)
}
}
Which can be used as:
def rand =
if ((new scala.util.Random).nextFloat < 0.4) None
else Some("x")
extendedWhile(rand) {
x => println(x)
}
// prints out x a random amount of times
This extendedWhile function is tail recursive, so it should normally be as performant as the regular while loop.
I am not sure I like this, but one way to do this is to define the variable as 'var' outside the loop.
var a: Boolean = _;
def someFunction: Boolean = true
while({ a = someFunction; a }) {
println("hi " + a)
}
For is actually syntactic sugar for foreach, map, and flatMap.
So your code above desugars to:
someCollection.map { a => f(a) }.foreach { b => ... /* something using b */ }
Now, while is not a desugaring, but an actual imperative syntactic construct.
So the best you can do is
var a = true
while (a) {
a = someFunction (a)
}
In practice I never find myself using while, and instead use higher-order functions, like:
input.takeWhile(_ != '').foreach { //do something }
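Applied to the original question, the same higher-order style could look like the sketch below, where Iterator.continually re-evaluates the function until it returns None. whileDefined is a hypothetical helper name, not something from the standard library.

```scala
object WhileDefined {
  // Hypothetical helper: runs `body` on each value until `next` yields None.
  def whileDefined[T](next: => Option[T])(body: T => Unit): Unit =
    Iterator
      .continually(next)        // re-evaluates the by-name argument on demand
      .takeWhile(_.isDefined)   // stops at the first None
      .foreach(opt => body(opt.get)) // safe: only defined values get here
}
```

Usage would be WhileDefined.whileDefined(someFunction) { a => /* use a */ }, with no var visible at the call site.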

Repeating function call until we'll get non-empty Option result in Scala

A real newbie question: how do I "repeat a function call until the returned value meets my criteria" in Scala?
Given that I have a function that I'd like to call until it returns a result, for example, defined like this:
def tryToGetResult: Option[MysteriousResult]
I've come up with this solution, but I really feel that it is ugly:
var res: Option[MysteriousResult] = None
do {
res = tryToGetResult
} while (res.isEmpty)
doSomethingWith(res.get)
or, equivalently ugly:
var res: Option[MysteriousResult] = None
while (res.isEmpty) {
res = tryToGetResult
}
doSomethingWith(res.get)
I really feel like there is a solution without var and without so much hassle around manual checking whether Option is empty or not.
For comparison, the Java alternative seems much cleaner here:
MysteriousResult tryToGetResult(); // returns null if no result yet
MysteriousResult res;
while ((res = tryToGetResult()) == null);
doSomethingWith(res);
To add insult to injury, if we don't need to doSomethingWith(res) and we just need to return it from this function, Scala vs Java looks like this:
Scala
def getResult: MysteriousResult = {
var res: Option[MysteriousResult] = None
do {
res = tryToGetResult
} while (res.isEmpty)
res.get
}
Java
MysteriousResult getResult() {
while (true) {
MysteriousResult res = tryToGetResult();
if (res != null) return res;
}
}
You can use Stream's continually method to do precisely this:
val res = Stream.continually(tryToGetResult).flatMap(_.toStream).head
Or (possibly more clearly):
val res = Stream.continually(tryToGetResult).dropWhile(!_.isDefined).head
One advantage of this approach over explicit recursion (besides the concision) is that it's much easier to tinker with. Say for example that we decided that we only wanted to try to get the result a thousand times. If a value turns up before then, we want it wrapped in a Some, and if not we want a None. We just add a few characters to our code above:
Stream.continually(tryToGetResult).take(1000).flatMap(_.toStream).headOption
And we have what we want. (Note that the Stream is lazy, so even though the take(1000) is there, if a value turns up after three calls to tryToGetResult, it will only be called three times.)
Performing side effects like this makes me die a little inside, but how about this?
scala> import scala.annotation.tailrec
import scala.annotation.tailrec
scala> @tailrec
| def lookupUntilDefined[A](f: => Option[A]): A = f match {
| case Some(a) => a
| case None => lookupUntilDefined(f)
| }
lookupUntilDefined: [A](f: => Option[A])A
Then call it like this
scala> def tryToGetResult(): Option[Int] = Some(10)
tryToGetResult: ()Option[Int]
scala> lookupUntilDefined(tryToGetResult())
res0: Int = 10
You may want to give lookupUntilDefined an additional parameter so it can stop eventually in case f is never defined.
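A minimal sketch of that bounded variant, assuming a hypothetical maxAttempts parameter and an Option result so that giving up is visible to the caller:

```scala
import scala.annotation.tailrec

object BoundedLookup {
  // Calls f at most maxAttempts times; returns None if it never yields a value.
  @tailrec
  def lookupUntilDefined[A](maxAttempts: Int)(f: => Option[A]): Option[A] =
    if (maxAttempts <= 0) None
    else f match {
      case some @ Some(_) => some
      case None           => lookupUntilDefined(maxAttempts - 1)(f)
    }
}
```

Because f is a by-name parameter, it is evaluated again on every recursive step, and it is not called at all once the attempt budget is used up.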

How to yield a single element from for loop in scala?

Much like this question:
Functional code for looping with early exit
Say the code is
def findFirst[T](objects: List[T]):T = {
for (obj <- objects) {
if (expensiveFunc(obj) != null) return /*???*/ Some(obj)
}
None
}
How to yield a single element from a for loop like this in scala?
I do not want to use find, as proposed in the original question; I am curious whether and how it could be implemented using the for loop.
* UPDATE *
First, thanks for all the comments, but I guess I was not clear in the question. I am shooting for something like this:
val seven = for {
x <- 1 to 10
if x == 7
} return x
And that does not compile. The two errors are:
- return outside method definition
- method main has return statement; needs result type
I know find() would be better in this case; I am just learning and exploring the language. And in a more complex case with several iterators, I think finding with for can actually be useful.
Thanks, commenters. I'll start a bounty to make up for the bad posing of the question :)
If you want to use a for loop, which uses a nicer syntax than chained invocations of .find, .filter, etc., there is a neat trick. Instead of iterating over strict collections like list, iterate over lazy ones like iterators or streams. If you're starting with a strict collection, make it lazy with, e.g. .toIterator.
Let's see an example.
First let's define a "noisy" int, that will show us when it is invoked
def noisyInt(i : Int) = () => { println("Getting %d!".format(i)); i }
Now let's fill a list with some of these:
val l = List(1, 2, 3, 4).map(noisyInt)
We want to look for the first element which is even.
val r1 = for(e <- l; v = e(); if v % 2 == 0) yield v
The above line results in:
Getting 1!
Getting 2!
Getting 3!
Getting 4!
r1: List[Int] = List(2, 4)
...meaning that all elements were accessed. That makes sense, given that the resulting list contains all even numbers. Let's iterate over an iterator this time:
val r2 = (for(e <- l.toIterator; v = e(); if v % 2 == 0) yield v)
This results in:
Getting 1!
Getting 2!
r2: Iterator[Int] = non-empty iterator
Notice that the loop was executed only up to the point where it could figure out whether the result was an empty or non-empty iterator.
To get the first result, you can now simply call r2.next.
If you want a result of an Option type, use:
if(r2.hasNext) Some(r2.next) else None
Edit: your second example in this encoding is just:
val seven = (for {
x <- (1 to 10).toIterator
if x == 7
} yield x).next
...of course, you should be sure that there is always at least one solution if you're going to use .next. Alternatively, use headOption, defined for all Traversables, to get an Option[Int].
You can turn your list into a stream, so that any filters that the for-loop contains are only evaluated on demand. However, yielding from the stream will always return a stream, and what you want is, I suppose, an option. So, as a final step, you can check whether the resulting stream has at least one element, and return its head as an option. The headOption function does exactly that.
def findFirst[T](objects: List[T], expensiveFunc: T => Boolean): Option[T] =
(for (obj <- objects.toStream if expensiveFunc(obj)) yield obj).headOption
Why not do exactly what you sketched above, that is, return from the loop early? If you are interested in what Scala actually does under the hood, run your code with -print. Scala desugars the loop into a foreach and then uses an exception to leave the foreach prematurely.
So what you are trying to do is to break out of a loop once your condition is satisfied. The answer to "How do I break out of a loop in Scala?" might be what you are looking for.
Overall, a for comprehension in Scala is translated into map, flatMap and filter operations. So it will not be possible to break out of these functions unless you throw an exception.
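The standard library's scala.util.control.Breaks wraps exactly that kind of exception-based exit. A small sketch under assumptions: FirstMatch is a made-up name, and the predicate p stands in for the expensiveFunc check from the question.

```scala
import scala.util.control.Breaks.{break, breakable}

object FirstMatch {
  // Returns the first element satisfying p, leaving the for loop early.
  def findFirst[T](objects: List[T])(p: T => Boolean): Option[T] = {
    var found: Option[T] = None
    breakable {
      for (obj <- objects) {
        // break() throws a control exception that breakable catches
        if (p(obj)) { found = Some(obj); break() }
      }
    }
    found
  }
}
```

Elements after the first match are never passed to p, so an expensive check runs only as often as needed.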
If you are wondering, this is how find is implemented in LinearSeqOptimized.scala, which List inherits:
override /*IterableLike*/
def find(p: A => Boolean): Option[A] = {
var these = this
while (!these.isEmpty) {
if (p(these.head)) return Some(these.head)
these = these.tail
}
None
}
This is a horrible hack. But it would get you the result you wished for.
Idiomatically you'd use a Stream or View and just compute the parts you need.
def findFirst[T](objects: List[T]): T = {
def expensiveFunc(o : T) = // unclear what should be returned here
case class MissusedException(val data: T) extends Exception
try {
(for (obj <- objects) {
if (expensiveFunc(obj) != null) throw new MissusedException(obj)
})
objects.head // T must be returned from loop, dummy
} catch {
case MissusedException(obj) => obj
}
}
Why not something like
object Main {
def main(args: Array[String]): Unit = {
val seven = (for (
x <- 1 to 10
if x == 7
) yield x).headOption
}
}
The variable seven will be an Option holding Some(value) if some value satisfies the condition.
I hope this helps.
I think these would work as implementations without 'return':
object TakeWhileLoop extends App {
println("first non-null: " + func(Seq(null, null, "x", "y", "z")))
def func[T](seq: Seq[T]): T = if (seq.isEmpty) null.asInstanceOf[T] else
seq(seq.takeWhile(_ == null).size)
}
object OptionLoop extends App {
println("first non-null: " + func(Seq(null, null, "x", "y", "z")))
def func[T](seq: Seq[T], index: Int = 0): T = if (seq.isEmpty) null.asInstanceOf[T] else
Option(seq(index)) getOrElse func(seq, index + 1)
}
object WhileLoop extends App {
println("first non-null: " + func(Seq(null, null, "x", "y", "z")))
def func[T](seq: Seq[T]): T = if (seq.isEmpty) null.asInstanceOf[T] else {
var i = 0
def obj = seq(i)
while (obj == null)
i += 1
obj
}
}
objects.iterator.filter { obj => expensiveFunc(obj) != null }.next()
The trick is to get some lazily evaluated view on the collection, either an iterator, a Stream, or objects.view. The filter will only execute as far as needed.

Scala Parallel Collections- How to return early?

I have a list of possible input Values
val inputValues = List(1,2,3,4,5)
I have a function that takes a really long time to compute and gives me a result
def reallyLongFunction( input: Int ) : Option[String] = { ..... }
Using scala parallel collections, I can easily do
inputValues.par.map( reallyLongFunction( _ ) )
This computes all the results in parallel. The problem is that I don't really want all the results; I only want the FIRST result. As soon as one of my inputs is a success, I want my output and want to move on with my life. The map above does a lot of extra work.
So how do I get the best of both worlds? I want to
Get the first result that returns something from my long function
Stop all my other threads from useless work.
Edit -
I solved it like a dumb java programmer by having
@volatile var done = false
which is set and checked inside my reallyLongFunction. This works, but does not feel very Scala-like. I would like a better way to do this.
(Updated: no, it doesn't work, doesn't do the map)
Would it work to do something like:
inputValues.par.find({ v => reallyLongFunction(v); true })
The implementation uses this:
protected[this] class Find[U >: T](pred: T => Boolean, protected[this] val pit: IterableSplitter[T]) extends Accessor[Option[U], Find[U]] {
@volatile var result: Option[U] = None
def leaf(prev: Option[Option[U]]) = { if (!pit.isAborted) result = pit.find(pred); if (result != None) pit.abort }
protected[this] def newSubtask(p: IterableSplitter[T]) = new Find(pred, p)
override def merge(that: Find[U]) = if (this.result == None) result = that.result
}
which looks pretty similar in spirit to your @volatile except you don't have to look at it ;-)
I interpreted your question in the same way as huynhjl, but if you just want to search and discard Nones, you could do something like this to avoid the need to repeat the computation when a suitable outcome is found:
class Computation[A,B](value: A, function: A => B) {
lazy val result = function(value)
}
def f(x: Int) = { // your function here
Thread.sleep(100 - x)
if (x > 5) Some(x * 10)
else None
}
val list = List.range(1, 20) map (i => new Computation(i, f))
val found = list.par find (_.result.isDefined)
//found is Option[Computation[Int,Option[Int]]]
val result = found map (_.result.get)
//result is Option[Int]
However find for parallel collections seems to do a lot of unnecessary work (see this question), so this might not work well, with current versions of Scala at least.
Volatile flags are used in the parallel collections (take a look at the source for find, exists, and forall), so I think your idea is a good one. It's actually better if you can include the flag in the function itself. It kills referential transparency on your function (i.e. for certain inputs your function now sometimes returns None rather than Some), but since you're discarding the stopped computations, this shouldn't matter.
If you're willing to use a non-core library, I think Futures would be a good match for this task. For instance:
Akka's Futures include Futures.firstCompletedOf
Twitter's Futures include Future.select
...both of which appear to enable the functionality you're looking for.
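The standard library's scala.concurrent can express something similar without Akka or Twitter's library. The sketch below is an assumption-laden illustration, not either library's API: FirstSuccess and firstDefined are made-up names, the one-minute timeout is arbitrary, and the shared flag only stops tasks that have not started yet. A call that is already running finishes unless the function itself polls a flag, as discussed above.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FirstSuccess {
  // Returns the first Some produced by f over inputs, or None if there is none.
  def firstDefined[A, B](inputs: List[A])(f: A => Option[B]): Option[B] = {
    val done = new AtomicBoolean(false)   // tells queued tasks to skip their work
    val winner = Promise[Option[B]]()
    val tasks = inputs.map { in =>
      Future {
        if (!done.get()) f(in).foreach { b =>
          done.set(true)
          winner.trySuccess(Some(b))      // only the first success wins
        }
      }
    }
    // When every task has finished (or one has failed) without a hit, resolve to None.
    Future.sequence(tasks).onComplete(_ => winner.trySuccess(None))
    Await.result(winner.future, 1.minute)
  }
}
```

With the question's names, this would be called as FirstSuccess.firstDefined(inputValues)(reallyLongFunction).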