Akka with Frege running slower than Scala counterpart


As an exercise, I took these Scala and Java examples of Akka and ported them to Frege. The port works fine, but it runs much slower (11 s) than the Scala counterpart (540 ms).
module mmhelloworld.akkatutorialfregecore.Pi where
import mmhelloworld.akkatutorialfregecore.Akka
data PiMessage = Calculate |
Work {start :: Int, nrOfElements :: Int} |
Result {value :: Double} |
PiApproximation {pi :: Double, duration :: Duration}
data Worker = private Worker where
calculatePiFor :: Int -> Int -> Double
calculatePiFor !start !nrOfElements = loop start nrOfElements 0.0 f where
loop !curr !n !acc f = if n == 0 then acc
else loop (curr + 1) (n - 1) (f acc curr) f
f !acc !i = acc + (4.0 * fromInt (1 - (i `mod` 2) * 2) / fromInt (2 * i + 1))
onReceive :: Mutable s UntypedActor -> PiMessage -> ST s ()
onReceive actor Work{start=start, nrOfElements=nrOfElements} = do
sender <- actor.sender
self <- actor.getSelf
sender.tellSender (Result $ calculatePiFor start nrOfElements) self
data Master = private Master {
nrOfWorkers :: Int,
nrOfMessages :: Int,
nrOfElements :: Int,
listener :: MutableIO ActorRef,
pi :: Double,
nrOfResults :: Int,
workerRouter :: MutableIO ActorRef,
start :: Long } where
initMaster :: Int -> Int -> Int -> MutableIO ActorRef -> MutableIO UntypedActor -> IO Master
initMaster nrOfWorkers nrOfMessages nrOfElements listener actor = do
props <- Props.forUntypedActor Worker.onReceive
router <- RoundRobinRouter.new nrOfWorkers
context <- actor.getContext
workerRouter <- props.withRouter router >>= (\p -> context.actorOf p "workerRouter")
now <- currentTimeMillis ()
return $ Master nrOfWorkers nrOfMessages nrOfElements listener 0.0 0 workerRouter now
onReceive :: MutableIO UntypedActor -> Master -> PiMessage -> IO Master
onReceive actor master Calculate = do
self <- actor.getSelf
let tellWorker start = master.workerRouter.tellSender (work start) self
work start = Work (start * master.nrOfElements) master.nrOfElements
forM_ [0 .. master.nrOfMessages - 1] tellWorker
return master
onReceive actor master (Result newPi) = do
let (!newNrOfResults, !pi) = (master.nrOfResults + 1, master.pi + newPi)
when (newNrOfResults == master.nrOfMessages) $ do
self <- actor.getSelf
now <- currentTimeMillis ()
duration <- Duration.create (now - master.start) TimeUnit.milliseconds
master.listener.tellSender (PiApproximation pi duration) self
actor.getContext >>= (\context -> context.stop self)
return master.{pi=pi, nrOfResults=newNrOfResults}
data Listener = private Listener where
onReceive :: MutableIO UntypedActor -> PiMessage -> IO ()
onReceive actor (PiApproximation pi duration) = do
println $ "Pi approximation: " ++ show pi
println $ "Calculation time: " ++ duration.toString
actor.getContext >>= ActorContext.system >>= ActorSystem.shutdown
calculate nrOfWorkers nrOfElements nrOfMessages = do
system <- ActorSystem.create "PiSystem"
listener <- Props.forUntypedActor Listener.onReceive >>= flip system.actorOf "listener"
let constructor = Master.initMaster nrOfWorkers nrOfMessages nrOfElements listener
newMaster = StatefulUntypedActor.new constructor Master.onReceive
factory <- UntypedActorFactory.new newMaster
masterActor <- Props.fromUntypedFactory factory >>= flip system.actorOf "master"
masterActor.tell Calculate
getLine >> return () --Not to exit until done
main _ = calculate 4 10000 10000
Am I doing something wrong with Akka, or is the slowness related to laziness in Frege? For example, when I initially had fold (the strict fold) in place of loop in Worker.calculatePiFor, it took 27 s.
Dependencies:
Akka native definitions for Frege: Akka.fr
Java helper to extend Akka classes, since we cannot extend a class in Frege: Actors.java

I am not exactly familiar with Actors, but assuming that the tightest loop is indeed loop, you could avoid passing the function f as an argument.
For one, applications of passed functions cannot take advantage of the strictness of the actual passed function. Rather, code generation must assume conservatively that the passed function takes its arguments lazily and returns a lazy result.
Second, in our case you use f really just once here, so one can inline it. (This is how it is done in the scala code in the article you linked.)
Look at the code generated for the tail recursion in the following sample code that mimics yours:
test b c = loop 100 0 f
where
loop 0 !acc f = acc
loop n !acc f = loop (n-1) (acc + f (acc-1) (acc+1)) f -- tail recursion
f x y = 2*x + 7*y
We get there:
// arg$2f is the accumulator
arg$2 = arg$2f + (int)frege.runtime.Delayed.<java.lang.Integer>forced(
f_3237.apply(PreludeBase.INum_Int._minusƒ.apply(arg$2f, 1)).apply(
PreludeBase.INum_Int._plusƒ.apply(arg$2f, 1)
).result()
);
You see here that f is called lazily, which causes all the argument expressions to be computed lazily as well. Note the number of method calls this requires!
In your case the code should still be something like:
(double)Delayed.<Double>forced(f.apply(acc).apply(curr).result())
This means that two closures are built with the boxed values acc and curr, and then the result is computed: the function f gets called with the unboxed arguments, and the result gets boxed again, just to be unboxed again (forced) for the next loop iteration.
Now compare the following, where we just do not pass f but call it directly:
test b c = loop 100 0
where
loop 0 !acc = acc
loop n !acc = loop (n-1) (acc + f (acc-1) (acc+1))
f x y = 2*x + 7*y
We get:
arg$2 = arg$2f + f(arg$2f - 1, arg$2f + 1);
Much better!
Finally, in the case above we can do without a function call at all:
loop n !acc = loop (n-1) (acc + f) where
f = 2*x + 7*y
x = acc-1
y = acc+1
And this gets:
final int y_3236 = arg$2f + 1;
final int x_3235 = arg$2f - 1;
...
arg$2 = arg$2f + ((2 * x_3235) + (7 * y_3236));
Please try this out and let us know what happens. The main boost in performance should come from not passing f, whereas the inlining will probably be done in the JIT anyway.
The additional cost with fold is probably because you also had to create some list before applying it.
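For comparison, here is a minimal Scala sketch of the same loop shape with the series term computed inline, so that no function value is passed into the loop. It is only an analogy: the names `PiLoop` and `loop` are illustrative, and the Scala compiler generates different code than Frege does.

```scala
import scala.annotation.tailrec

object PiLoop {
  // Sums nrOfElements terms of the Leibniz series, starting at index `start`.
  // The term is written directly in the recursive call instead of being
  // supplied as a function argument.
  def calculatePiFor(start: Int, nrOfElements: Int): Double = {
    @tailrec
    def loop(i: Int, remaining: Int, acc: Double): Double =
      if (remaining == 0) acc
      else loop(i + 1, remaining - 1, acc + 4.0 * (1 - (i % 2) * 2) / (2 * i + 1))
    loop(start, nrOfElements, 0.0)
  }
}
```

Summing the first million terms with `PiLoop.calculatePiFor(0, 1000000)` should land close to `math.Pi`.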

Related

How to compose curried functions in Scala

Is it possible compose functions in Scala that are curried? For example:
def a(s1: String)(s2: String): Int = s1.length + s2.length
def b(n: Int): Boolean = n % 2 == 0
def x : String => String => Boolean = a andThen b
x("blabla")("foo")
Edit :
I've found a way of doing it in Haskell :
a :: String -> String -> Int
a s1 s2 = length s1 + length s2
b :: Int -> Bool
b n = mod n 2 == 0
c :: String -> String -> Bool
c = curry (b . (uncurry a))

This should work:
def x = a _ andThen (_ andThen b)
The first _ avoids invoking a and makes it into a function value. This value is of type String=>String=>Int, i.e. a function that takes String and returns String=>Int.
The argument to the andThen method is a function that takes the result of the original function and modifies it. So in this case it requires a function that takes String=>Int and returns a new value, a function String=>Boolean. We can fabricate this new function by using andThen on the original function. This takes the result of a and composes it with the new function b.
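A minimal sketch that puts the pieces together, assuming Scala 2 syntax for `a _`; the wrapper object `CurriedCompose` is only there to make the snippet self-contained:

```scala
object CurriedCompose {
  def a(s1: String)(s2: String): Int = s1.length + s2.length
  def b(n: Int): Boolean = n % 2 == 0

  // `a _` turns the method into a String => (String => Int) value.
  // The outer andThen receives the inner String => Int function and
  // composes it with b, yielding String => Boolean.
  val x: String => String => Boolean = (a _).andThen(_.andThen(b))
}
```

Here `CurriedCompose.x("blabla")("foo")` adds the lengths 6 and 3 and checks whether 9 is even.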

Why this function call in Scala is not optimized away?

I'm running this program with Scala 2.10.3:
object Test {
def main(args: Array[String]) {
def factorial(x: BigInt): BigInt =
if (x == 0) 1 else x * factorial(x - 1)
val N = 1000
val t = new Array[Long](N)
var r: BigInt = 0
for (i <- 0 until N) {
val t0 = System.nanoTime()
r = r + factorial(300)
t(i) = System.nanoTime()-t0
}
val ts = t.sortWith((x, y) => x < y)
for (i <- 0 to 10)
print(ts(i) + " ")
println("*** " + ts(N/2) + "\n" + r)
}
}
The call to the pure function factorial with a constant argument is evaluated on every loop iteration (a conclusion based on the timing results). Shouldn't the optimizer reuse the result after the first call?
I'm using Scala IDE for Eclipse. Are there any compiler optimization flags that may produce more efficient code?

Scala is not a purely functional language, so without an effect system it cannot know that factorial is pure (for example, it doesn't "know" anything about the multiplication of big ints).
You need to add your own memoization approach here. Most simply add a val f300 = factorial(300) outside your loop.
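Both options can be sketched as follows. The names `HoistAndMemoize`, `sumHoisted`, `memoize` and `cachedFactorial` are hypothetical, and the cache shown is a simple, non-thread-safe assumption rather than a library facility:

```scala
import scala.collection.mutable

object HoistAndMemoize {
  def factorial(x: BigInt): BigInt =
    if (x == 0) 1 else x * factorial(x - 1)

  // Option 1: compute the loop-invariant value once, outside the loop.
  def sumHoisted(n: Int): BigInt = {
    val f300 = factorial(300)
    var r: BigInt = 0
    for (_ <- 0 until n) r = r + f300
    r
  }

  // Option 2: wrap a function so that results are cached per argument.
  // Only calls made through the returned wrapper hit the cache; the
  // recursive calls inside factorial itself are not cached.
  def memoize[A, B](f: A => B): A => B = {
    val cache = mutable.HashMap.empty[A, B]
    a => cache.getOrElseUpdate(a, f(a))
  }

  val cachedFactorial: BigInt => BigInt = memoize((x: BigInt) => factorial(x))
}
```

With the hoisted value, the loop body is reduced to one BigInt addition per iteration.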
Here is a question about memoization.

Convert normal recursion to tail recursion

I was wondering if there is some general method to convert a "normal" recursion with foo(...) + foo(...) as the last call to a tail-recursion.
For example (scala):
def pascal(c: Int, r: Int): Int = {
if (c == 0 || c == r) 1
else pascal(c - 1, r - 1) + pascal(c, r - 1)
}
A general solution for functional languages to convert recursive function to a tail-call equivalent:
A simple way is to wrap the non tail-recursive function in the Trampoline monad.
def pascalM(c: Int, r: Int): Trampoline[Int] = {
if (c == 0 || c == r) Trampoline.done(1)
else for {
a <- Trampoline.suspend(pascalM(c - 1, r - 1))
b <- Trampoline.suspend(pascalM(c, r - 1))
} yield a + b
}
// column 5 of row 10; the column must not exceed the row
val result = pascalM(5, 10).run
So the pascal function is not a recursive function anymore. However, the Trampoline monad is a nested structure of the computation that need to be done. Finally, run is a tail-recursive function that walks through the tree-like structure, interpreting it, and finally at the base case returns the value.
A paper by Rúnar Bjarnason on the subject of trampolines: Stackless Scala With Free Monads
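The same idea can be tried without any extra dependency. The sketch below swaps in the standard library's `scala.util.control.TailCalls` (Scala 2.11 or later) instead of the `Trampoline` type used above; the names `PascalTrampolined` and `pascalT` are illustrative:

```scala
import scala.util.control.TailCalls._

object PascalTrampolined {
  // Each recursive call is wrapped in tailcall, so it is suspended on the
  // heap instead of growing the call stack.
  def pascalT(c: Int, r: Int): TailRec[Int] =
    if (c == 0 || c == r) done(1)
    else for {
      a <- tailcall(pascalT(c - 1, r - 1))
      b <- tailcall(pascalT(c, r - 1))
    } yield a + b

  // result runs the suspended computation in a loop.
  def pascal(c: Int, r: Int): Int = pascalT(c, r).result
}
```

This keeps the stack flat, but it still performs the same number of recursive steps as the naive version, so it is not faster.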

In cases where there is a simple modification to the value of a recursive call, that operation can be moved to the front of the recursive function. The classic example of this is Tail recursion modulo cons, where a simple recursive function in this form:
def recur[A](...):List[A] = {
...
x :: recur(...)
}
which is not tail recursive, is transformed into
def recur[A](...): List[A] = {
def consRecur(..., consA: A): List[A] = {
consA :: ...
...
consRecur(..., ...)
}
...
consRecur(...,...)
}
Alexlv's example is a variant of this.
This is such a well known situation that some compilers (I know of Prolog and Scheme examples but Scalac does not do this) can detect simple cases and perform this optimisation automatically.
Problems that combine multiple calls to recursive functions have no such simple solution. The TRMC optimisation is useless here, as you are simply moving the first recursive call to another non-tail position. The only way to reach a tail-recursive solution is to remove all but one of the recursive calls; how to do this is entirely context dependent and requires finding an entirely different approach to solving the problem.
As it happens, in some ways your example is similar to the classic Fibonacci sequence problem; in that case the naive but elegant doubly-recursive solution can be replaced by one which loops forward from the 0th number.
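// Naive doubly-recursive version
def fib (n: Long): Long = n match {
case 0 | 1 => n
case _ => fib( n - 2) + fib( n - 1 )
}
// Tail-recursive version that loops forward from the 0th number.
// next is passed by value; a by-name parameter would build a chain of nested thunks.
def fib (n: Long): Long = {
def loop(current: Long, next: Long, iteration: Long): Long = {
if (n == iteration)
current
else
loop(next, current + next, iteration + 1)
}
loop(0, 1, 0)
}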
For the Fibonacci sequence, this is the most efficient approach (a streams based solution is just a different expression of this solution that can cache results for subsequent calls). Now, you can also solve your problem by looping forward from c0/r0 (well, c0/r2) and calculating each row in sequence; the difference is that you need to cache the entire previous row. So while this has a similarity to fib, it differs dramatically in the specifics and is also significantly less efficient than your original, doubly-recursive solution.
Here's an approach for your pascal triangle example which can calculate pascal(30,60) efficiently:
def pascal(column: Long, row: Long):Long = {
type Point = (Long, Long)
type Points = List[Point]
type Triangle = Map[Point,Long]
def above(p: Point) = (p._1, p._2 - 1)
def aboveLeft(p: Point) = (p._1 - 1, p._2 - 1)
def find(ps: Points, t: Triangle): Long = ps match {
// Found the ultimate goal
case (p :: Nil) if t contains p => t(p)
// Found an intermediate point: pop the stack and carry on
case (p :: rest) if t contains p => find(rest, t)
// Hit a triangle edge, add it to the triangle
case ((c, r) :: _) if (c == 0) || (c == r) => find(ps, t + ((c,r) -> 1))
// Triangle contains (c - 1, r - 1)...
case (p :: _) if t contains aboveLeft(p) => if (t contains above(p))
// And it contains (c, r - 1)! Add to the triangle
find(ps, t + (p -> (t(aboveLeft(p)) + t(above(p)))))
else
// Does not contain(c, r -1). So find that
find(above(p) :: ps, t)
// If we get here, we don't have (c - 1, r - 1). Find that.
case (p :: _) => find(aboveLeft(p) :: ps, t)
}
require(column >= 0 && row >= 0 && column <= row)
(column, row) match {
case (c, r) if (c == 0) || (c == r) => 1
case p => find(List(p), Map())
}
}
It's efficient, but I think it shows how ugly complex recursive solutions can become as you deform them to become tail recursive. At this point, it may be worth moving to a different model entirely. Continuations or monadic gymnastics might be better.
You want a generic way to transform your function. There isn't one. There are helpful approaches, that's all.

I don't know how theoretical this question is, but a recursive implementation won't be efficient even with tail-recursion. Try computing pascal(30, 60), for example. I don't think you'll get a stack overflow, but be prepared to take a long coffee break.
Instead, consider using a Stream or memoization:
val pascal: Stream[Stream[Long]] =
(Stream(1L)
#:: (Stream from 1 map { i =>
// compute row i
(1L
#:: (pascal(i-1) // take the previous row
sliding 2 // and add adjacent values pairwise
collect { case Stream(a,b) => a + b }).toStream
++ Stream(1L))
}))

The accumulator approach
def pascal(c: Int, r: Int): Int = {
def pascalAcc(acc:Int, leftover: List[(Int, Int)]):Int = {
if (leftover.isEmpty) acc
else {
val (c1, r1) = leftover.head
// Edge.
if (c1 == 0 || c1 == r1) pascalAcc(acc + 1, leftover.tail)
// Safe checks.
else if (c1 < 0 || r1 < 0 || c1 > r1) pascalAcc(acc, leftover.tail)
// Add 2 other points to accumulator.
else pascalAcc(acc, (c1 , r1 - 1) :: ((c1 - 1, r1 - 1) :: leftover.tail ))
}
}
pascalAcc(0, List ((c,r) ))
}
It does not overflow the stack, but as Aaron mentioned, it is not fast for large rows and columns.

Yes, it's possible. Usually it's done with the accumulator pattern: an internally defined function takes one additional argument, the so-called accumulator, which carries the intermediate result. Take counting the length of a list as an example.
For example normal recursive version would look like this:
def length[A](xs: List[A]): Int = if (xs.isEmpty) 0 else 1 + length(xs.tail)
That's not a tail-recursive version. To eliminate the final addition we have to accumulate the value as we go, for example with the accumulator pattern:
def length[A](xs: List[A]) = {
def inner(ys: List[A], acc: Int): Int = {
if (ys.isEmpty) acc else inner(ys.tail, acc + 1)
}
inner(xs, 0)
}
It's a bit longer to code, but I think the idea is clear. Of course you can do it without the inner function, but in that case the caller has to provide the initial acc value manually.

I'm pretty sure it's not possible in the simple way you're looking for the general case, but it would depend on how elaborate you permit the changes to be.
A tail-recursive function must be re-writable as a while-loop, but try implementing, for example, a Fractal Tree using while-loops. It's possible, but you need to use an array or collection to store the state for each point, which substitutes for the data otherwise stored in the call stack.
It's also possible to use trampolining.

It is indeed possible. The way I'd do this is to begin with List(1) and keep recursing till you get to the row you want.
Worth noticing that you can optimize it: if c==0 or c==r the value is one, and to calculate let's say column 3 of the 100th row you still only need to calculate the first three elements of the previous rows.
A working tail recursive solution would be this:
import scala.annotation.tailrec
def pascal(c: Int, r: Int): Int = {
@tailrec
def pascalAcc(c: Int, r: Int, acc: List[Int]): List[Int] = {
if (r == 0) acc
else pascalAcc(c, r - 1,
// from let's say 1 3 3 1 builds 0 1 3 3 1 0 , takes only the
// subset that matters (if asking for col c, no cols after c are
// used) and uses sliding to build (0 1) (1 3) (3 3) etc.
(0 +: acc :+ 0).take(c + 2)
.sliding(2, 1).map { x => x.reduce(_ + _) }.toList)
}
if (c == 0 || c == r) 1
else pascalAcc(c, r, List(1))(c)
}
The annotation @tailrec makes the compiler check that the function is actually tail recursive.
It could probably be optimized further: since the rows are symmetric, if c > r/2 then pascal(c, r) == pascal(r - c, r). That is left to the reader ;)

Scala - can 'for-yield' clause yields nothing for some condition?

In Scala language, I want to write a function that yields odd numbers within a given range. The function prints some log when iterating even numbers. The first version of the function is:
def getOdds(N: Int): Traversable[Int] = {
val list = new mutable.MutableList[Int]
for (n <- 0 until N) {
if (n % 2 == 1) {
list += n
} else {
println("skip even number " + n)
}
}
return list
}
If I omit printing logs, the implementation becomes very simple:
def getOddsWithoutPrint(N: Int) =
for (n <- 0 until N if (n % 2 == 1)) yield n
However, I don't want to miss the logging part. How do I rewrite the first version more compactly? It would be great if it can be rewritten similar to this:
def IWantToDoSomethingSimilar(N: Int) =
for (n <- 0 until N) if (n % 2 == 1) yield n else println("skip even number " + n)

def IWantToDoSomethingSimilar(N: Int) =
for {
n <- 0 until N
if n % 2 != 0 || { println("skip even number " + n); false }
} yield n
Using filter instead of a for expression would be slightly simpler though.
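A minimal sketch of that filter-based variant, with the hypothetical names `OddsWithFilter` and `getOdds`; the predicate logs as a side effect when it rejects an even number:

```scala
object OddsWithFilter {
  // Keeps odd numbers; logs each even number that is skipped.
  def getOdds(n: Int): Seq[Int] =
    (0 until n).filter { i =>
      if (i % 2 == 1) true
      else {
        println("skip even number " + i)
        false
      }
    }
}
```

Because `filter` on a range is strict, the log lines are printed in order while the range is traversed.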

If you want to keep your processing sequential (handling odds and evens in order, not separately), you can use something like this (edited):
def IWantToDoSomethingSimilar(N: Int) =
(for (n <- (0 until N)) yield {
if (n % 2 == 1) {
Option(n)
} else {
println("skip even number " + n)
None
}
// Flatten transforms the Seq[Option[Int]] into Seq[Int]
}).flatten
EDIT, following the same concept, a shorter solution :
def IWantToDoSomethingSimilar(N: Int) =
(0 until N) map {
case n if n % 2 == 0 => println("skip even number "+ n)
case n => n
} collect {case i:Int => i}

If you want to dig into a functional approach, something like the following is a good starting point.
First some common definitions:
// use scalaz 7
import scalaz._, Scalaz._
// transforms a function returning either E or B into a
// function returning an optional B and optionally writing a log of type E
def logged[A, E, B, F[_]](f: A => E \/ B)(
implicit FM: Monoid[F[E]], FP: Pointed[F]): (A => Writer[F[E], Option[B]]) =
(a: A) => f(a).fold(
e => Writer(FP.point(e), None),
b => Writer(FM.zero, Some(b)))
// helper for fixing the log storage format to List
def listLogged[A, E, B](f: A => E \/ B) = logged[A, E, B, List](f)
// shorthand for a String logger with List storage
type W[+A] = Writer[List[String], A]
Now all you have to do is write your filtering function:
def keepOdd(n: Int): String \/ Int =
if (n % 2 == 1) \/.right(n) else \/.left(n + " was even")
You can try it instantly:
scala> List(5, 6) map(keepOdd)
res0: List[scalaz.\/[String,Int]] = List(\/-(5), -\/(6 was even))
Then you can use the traverse function to apply your function to a list of inputs, and collect both the logs written and the results:
scala> val x = List(5, 6).traverse[W, Option[Int]](listLogged(keepOdd))
x: W[List[Option[Int]]] = scalaz.WriterTFunctions$$anon$26@503d0400
// unwrap the results
scala> x.run
res11: (List[String], List[Option[Int]]) = (List(6 was even),List(Some(5), None))
// we may even drop the None-s from the output
scala> val (logs, results) = x.map(_.flatten).run
logs: List[String] = List(6 was even)
results: List[Int] = List(5)

I don't think this can be done easily with a for comprehension. But you could use partition.
def getOdds(N:Int) = {
val (evens, odds) = 0 until N partition { x => x % 2 == 0 }
evens foreach { x => println("skipping " + x) }
odds
}
EDIT: To avoid printing the log messages after the partitioning is done, you can change the first line of the method like this:
val (evens, odds) = (0 until N).view.partition { x => x % 2 == 0 }

Implementing sequences of sequences in F#

I am trying to expose a 2 dimensional array as a sequence of sequences on an object(to be able to do Seq.fold (fun x -> Seq.fold (fun ->..) [] x) [] mytype stuff specifically)
Below is a toy program that exposes the identical functionality.
From what I understand there is a lot going on here. First off, IEnumerable has an ambiguous overload and requires a type annotation to specify which IEnumerable you mean.
But then there can be issues with unit as well requiring additional help:
type blah =
class
interface int seq seq with
member self.GetEnumerator () : System.Collections.Generic.IEnumerable<System.Collections.Generic.IEnumerable<(int*int)>> =
seq{ for i = 0 to 10 do
yield seq { for j=0 to 10 do
yield (i,j)} }
end
Is there some way of getting the above code to work as intended(return a seq<seq<int>>) or am I missing something fundamental?

Well for one thing, GetEnumerator() is supposed to return IEnumerator<T> not IEnumerable<T>...
This will get your sample code to compile.
type blah =
interface seq<seq<(int * int)>> with
member self.GetEnumerator () =
(seq { for i = 0 to 10 do
yield seq { for j=0 to 10 do
yield (i,j)} }).GetEnumerator()
interface System.Collections.IEnumerable with
member self.GetEnumerator () =
(self :> seq<seq<(int * int)>>).GetEnumerator() :> System.Collections.IEnumerator

How about:
let toSeqOfSeq (array:array<array<_>>) = array |> Seq.map (fun x -> x :> seq<_>)
But this works with an array of arrays, not a two-dimensional array. Which do you want?

What are you really out to do? A seq of seqs is rarely useful. All collections are seqs, so you can just use an array of arrays, a la
let myArrayOfArrays = [|
for i = 0 to 9 do
yield [|
for j = 0 to 9 do
yield (i,j)
|]
|]
let sumAllProds = myArrayOfArrays |> Seq.fold (fun st a ->
st + (a |> Seq.fold (fun st (x,y) -> st + x*y) 0) ) 0
printfn "%d" sumAllProds
if that helps...

module Array2D =
// Converts 2D array 'T[,] into seq<seq<'T>>
let toSeq (arr : 'T [,]) =
let f1,f2 = Array2D.base1 arr , Array2D.base2 arr
let t1,t2 = f1 + Array2D.length1 arr - 1 , f2 + Array2D.length2 arr - 1
seq {
for i in f1 .. t1 do
yield seq {
for j in f2 .. t2 do
yield Array2D.get arr i j }}
let myArray2D : string[,] = array2D [["a1"; "b1"; "c1"]; ["a2"; "b2"; "c2"]]
printf "%A" (Array2D.toSeq myArray2D)