Scala lazy val explanation - scala

I am taking the Functional Programming in Scala course on Coursera and I am having a hard time understanding this code snippet -
def sqrtStream(x: Double): Stream[Double] = {
def improve(guess: Double): Double = (guess+ x/ guess) / 2
lazy val guesses: Stream[Double] = 1 #:: (guesses map improve)
guesses
}
This method finds 10 approximations of the square root of 4, in increasing order of accuracy, when I call sqrtStream(4).take(10).toList.
Can someone explain the evaluation strategy of guesses here? My doubt is: what value of guesses is substituted when the second element of guesses is computed?

Let's start from simplified example:
scala> lazy val a: Int = a + 5
a: Int = <lazy>
scala> a
stack overflow here, because of infinite recursion
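So evaluating a re-enters its own initializer indefinitely. The recursion only stops if the self-reference is deferred and never forced during initialization, like here: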
scala> def f(f:() => Any) = 0 //takes function with captured a - returns constant 0
f: (f: () => Any)Int
scala> lazy val a: Int = f(() => a) + 5
a: Int = <lazy>
scala> a
res4: Int = 5 // 0 + 5
You may replace def f(f: () => Any) = 0 with the by-name version def f(f: => Any) = 0, so that the definition looks as if a itself were passed to f: lazy val a: Int = f(a) + 5.
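A minimal sketch of the by-name variant, assuming it is compiled inside an object (ByNameLazy is a hypothetical name used only for illustration). Because f never forces its argument, the self-reference inside the initializer is never evaluated:

```scala
object ByNameLazy {
  // By-name parameter: the argument expression is only evaluated if `ignored` is used,
  // and this function never uses it.
  def f(ignored: => Any): Int = 0

  // The initializer mentions `a`, but only inside the by-name argument,
  // so initialization finishes without recursing.
  lazy val a: Int = f(a) + 5
}
```

Reading ByNameLazy.a should then yield 5, the same as the explicit () => a version.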
Streams use the same mechanism: guesses map improve is passed as a by-name parameter (a lambda referring to the lazy val guesses is stored inside the Stream, but not evaluated until the tail is requested), so it's like lazy val guesses = #::(1, () => guesses map improve). When you call guesses.head, the tail is not evaluated; guesses.tail lazily returns Stream(improve(1), ?), guesses.tail.tail is Stream(improve(improve(1)), ?), and so on.
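The lazy evaluation and memoization described above can be checked with a small counting sketch. CountingSqrt and improveCalls are hypothetical names introduced for illustration, and the code assumes the scala.collection.immutable.Stream used in the question (deprecated since Scala 2.13 in favour of LazyList, whose map is lazier, so the counts may differ there):

```scala
// Hypothetical helper: same definition as sqrtStream, plus a counter on improve.
final class CountingSqrt(x: Double) {
  var improveCalls = 0 // incremented every time improve actually runs

  private def improve(guess: Double): Double = {
    improveCalls += 1
    (guess + x / guess) / 2
  }

  // The tail `guesses map improve` is by-name: it is not evaluated until requested.
  lazy val guesses: Stream[Double] = 1.0 #:: (guesses map improve)
}

object CountingSqrtDemo {
  def main(args: Array[String]): Unit = {
    val c = new CountingSqrt(4)
    println(c.guesses.head) // 1.0, the tail has not been touched
    println(c.improveCalls) // 0
    println(c.guesses(2))   // 2.05, forces two more elements
    println(c.improveCalls) // 2
    println(c.guesses(2))   // 2.05 again, read from the memoized cells
    println(c.improveCalls) // still 2
  }
}
```

The counter staying at 2 on the second access is the point: the stream object is never replaced, its cells are simply filled in once.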

The value of guesses is not substituted. A stream is like a list, but its elements are evaluated only when they are needed, and then they are stored, so the next time you access them no evaluation is necessary. The reference to the stream itself does not change.
On top of the example Αλεχει wrote, there is a nice explanation in Scala API:
http://www.scala-lang.org/api/current/index.html#scala.collection.immutable.Stream

You can easily find out what's going on by modifying the map function, as described in the scaladoc example:
scala> def sqrtStream(x: Double): Stream[Double] = {
| def improve(guess: Double): Double = (guess + x / guess) / 2
| lazy val guesses: Stream[Double] = 1 #:: (guesses map {n =>
| println(n, improve(n))
| improve(n)
| })
| guesses
| }
sqrtStream: (x: Double)Stream[Double]
The output is:
scala> sqrtStream(4).take(10).toList
(1.0,2.5)
(2.5,2.05)
(2.05,2.000609756097561)
(2.000609756097561,2.0000000929222947)
(2.0000000929222947,2.000000000000002)
(2.000000000000002,2.0)
(2.0,2.0)
(2.0,2.0)
(2.0,2.0)
res0: List[Double] = List(1.0, 2.5, 2.05, 2.000609756097561, 2.0000000929222947, 2.000000000000002, 2.0, 2.0, 2.0, 2.0)

Related

Scala Generic Type slow

I need to write a comparison method that works for Int, String, or Char. Using AnyVal did not work, as it has no methods for < and > comparison.
However, typing it as Ordered turns out to be significantly slower. Are there better ways to achieve this? The plan is to implement a generic binary sort, and I found that generic typing decreases performance.
def sample1[T <% Ordered[T]](x:T) = { x < (x) }
def sample2(x:Ordered[Int]) = { x < 1 }
def sample3(x:Int) = { x < 1 }
val start1 = System.nanoTime
sample1(5)
println(System.nanoTime - start1)
val start2 = System.nanoTime
sample2(5)
println(System.nanoTime - start2)
val start3 = System.nanoTime
sample3(5)
println(System.nanoTime - start3)
val start4 = System.nanoTime
sample3(5)
println(System.nanoTime - start4)
val start5 = System.nanoTime
sample2(5)
println(System.nanoTime - start5)
val start6 = System.nanoTime
sample1(5)
println(System.nanoTime - start6)
The results show:
Sample1:696122
Sample2:45123
Sample3:13947
Sample3:5332
Sample2:194438
Sample1:497992
Am I handling generics incorrectly? Or should I use the old Java approach with a Comparator in this case, as in this sample:
object C extends Comparator[Int] {
override def compare(a:Int, b:Int):Int = {
a - b
}
}
def sample4[T](a:T, b:T, x:Comparator[T]) {x.compare(a,b)}
The Scala equivalent of Java Comparator is Ordering. One of the main differences is that, if you don't provide one manually, a suitable Ordering can be injected implicitly by the compiler. By default, this will be done for Byte, Int, Float and other primitives, for any subclass of Ordered or Comparable, and for some other obvious cases.
Also, Ordering provides method definitions for all the main comparison methods as extension methods, so you can write the following:
import Ordering.Implicits._
def sample5[T : Ordering](a: T, b: T) = a < b
def run() = sample5(1, 2)
As of Scala 2.12, those extension operations (i.e., a < b) involve wrapping the operand in a temporary Ordering#Ops object, so the code will be slower than with a Comparator. The difference is small in most real cases, but still significant if you care about micro-optimisations.
But you can use an alternative syntax to define an implicit Ordering[T] parameter and invoke methods on the Ordering object directly.
Actually even the generated bytecode for the following two methods will be identical (except for the type of the third argument, and potentially the implementation of the respective compare methods):
def withOrdering[T](x: T, y: T)(implicit cmp: Ordering[T]) = {
cmp.compare(x, y) // also supports other methods, like `cmp.lt(x, y)`
}
def withComparator[T](x: T, y: T, cmp: Comparator[T]) = {
cmp.compare(x, y)
}
In practice the runtime on my machine is the same, when invoking these methods with Int arguments.
So, if you want to compare types generically in Scala, you should usually use Ordering.
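As a rough illustration of that recommendation, here is a sketch of a generic binary search that calls the implicit Ordering directly. The question mentions a generic binary algorithm but shows no code for it, so GenericSearch and binarySearch are hypothetical names, and the input is assumed to be sorted according to the same Ordering:

```scala
object GenericSearch {
  // Returns the index of `target` in the sorted sequence `xs`, or None if it is absent.
  // Comparison goes through `ord.compare` directly, avoiding the Ordering#Ops wrapper.
  def binarySearch[T](xs: IndexedSeq[T], target: T)(implicit ord: Ordering[T]): Option[Int] = {
    @scala.annotation.tailrec
    def loop(lo: Int, hi: Int): Option[Int] =
      if (lo > hi) None
      else {
        val mid = lo + (hi - lo) / 2 // avoids overflow of (lo + hi)
        val c = ord.compare(target, xs(mid))
        if (c == 0) Some(mid)
        else if (c < 0) loop(lo, mid - 1)
        else loop(mid + 1, hi)
      }
    loop(0, xs.length - 1)
  }
}
```

The same method then works for Int, String, and Char, since the compiler supplies an Ordering for each, e.g. GenericSearch.binarySearch(Vector(1, 3, 5, 7), 5).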
Do not run micro-benchmarks this way if you want results that resemble what you will see in a production environment.
First of all, you need to warm up the JVM. After that, run your test as the average of many iterations. You also need to prevent possible JVM optimizations caused by constant data. E.g.
def sample1[T <% Ordered[T]](x:T) = { x < (x) }
def sample2(x:Ordered[Int]) = { x < 1 }
def sample3(x:Int) = { x < 1 }
import scala.util.Random // needed for Random below
val r = new Random()
def measure(f: => Unit): Long = {
val start1 = System.nanoTime
f
System.nanoTime - start1
}
val n = 1000000
(1 to n).map(_ => measure {val k = r.nextInt();sample1(k)})
(1 to n).map(_ => measure {val k = r.nextInt();sample2(k)})
(1 to n).map(_ => measure {val k = r.nextInt();sample3(k)})
val avg1 = (1 to n).map(_ => measure {val k = r.nextInt();sample1(k)}).sum / n
println(avg1)
val avg2 = (1 to n).map(_ => measure {val k = r.nextInt();sample2(k)}).sum / n
println(avg2)
val avg3 = (1 to n).map(_ => measure {val k = r.nextInt();sample3(k)}).sum / n
println(avg3)
I got results that look fairer to me:
134
92
83
This book could give some light on performance tests.

How to take a constant Integer as input in Partial Function? [closed]

I am new to Scala. I have a use case where I want to define a partial function that adds three numbers, one of which is a constant while the other two are passed as inputs, and then define another method that takes the partial function as input and returns the cube of its result.
Well... that depends on where your constant is coming from.
Choice 1 - Your function forms a closure with a constant present in scope.
val yourConstant = 10
val pf: PartialFunction[(Int, Int), Int] = {
case (x, y) => x + y + yourConstant
}
pf((5, 10))
Choice 2 - Your function has a local constant.
val pf: PartialFunction[(Int, Int), Int] = {
case (x, y) => x + y + 10
}
pf((5, 10))
Also, as many others pointed out, this does not look like a use case for a partial function. Are you sure that you want a partial function and not a partially applied function?
If you were looking for a partially applied function, then:
// first you need a curried function
// Curried functions are functions that take their parameters in steps, building intermediate functions.
def normalDef(c: Int)(x: Int, y: Int): Int = c + y + x
// normalDef: normalDef[](val c: Int)(val x: Int,val y: Int) => Int
// now you can "partially apply" this "curried" function to get your partially applied function
val addTo10PartiallyApplied = normalDef(10) _
// addTo10PartiallyApplied: (Int, Int) => Int = $Lambda$1240/1924827254@46202553
val total = addTo10PartiallyApplied(1, 2)
// total: Int = 13
The following partial function adds 12345 to the sum of the two numbers in the tuple passed to it:
scala> val addConstantTo: PartialFunction[(Int, Int), Int] = {
| case (a, b) => a + b + 12345
| }
addConstantTo: PartialFunction[(Int, Int),Int] = <function1>
scala> addConstantTo((12, 34))
res4: Int = 12391
This expands on the concept by programmatically defining a partial function that adds any given number to the sum of the elements of a tuple:
scala> def addTo(c: Int): PartialFunction[(Int, Int), Int] = {
| case (a, b) => a + b + c
| }
addTo: (c: Int)PartialFunction[(Int, Int),Int]
scala> val pf = addTo(3)
pf: PartialFunction[(Int, Int),Int] = <function1>
scala> pf((1, 2))
res5: Int = 6
Let that sink in for a bit :)
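The question also asks for a second method that takes the partial function and returns the cube of its result. A minimal sketch of one way to read that requirement follows. cubeOf and positiveOnly are hypothetical names, and returning an Option for inputs where the partial function is undefined is an assumption, not something the question specifies:

```scala
object CubeOfPartial {
  // Same shape as addTo above: adds the constant c to the sum of the pair.
  def addTo(c: Int): PartialFunction[(Int, Int), Int] = {
    case (a, b) => a + b + c
  }

  // A partial function that is genuinely partial: only defined for positive inputs.
  val positiveOnly: PartialFunction[(Int, Int), Int] = {
    case (a, b) if a > 0 && b > 0 => a + b + 10
  }

  // Applies pf if it is defined at `args` and cubes the result; None otherwise.
  def cubeOf(pf: PartialFunction[(Int, Int), Int])(args: (Int, Int)): Option[Int] =
    pf.lift(args).map(n => n * n * n)
}
```

For example, cubeOf(addTo(3))((1, 2)) cubes 1 + 2 + 3 = 6, giving Some(216), while cubeOf(positiveOnly)((-1, 2)) gives None because the guard does not match.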

Monte Carlo calculation of Pi in Scala

Suppose I would like to calculate Pi with Monte Carlo simulation as an exercise.
I am writing a function, which picks a point in a square (0, 1), (1, 0) at random and tests if the point is inside the circle.
import scala.math._
import scala.util.Random
def circleTest() = {
val (x, y) = (Random.nextDouble, Random.nextDouble)
sqrt(x*x + y*y) <= 1
}
Then I am writing a function, which takes as arguments the test function and the number of trials and returns the fraction of the trials in which the test was found to be true.
def monteCarlo(trials: Int, test: () => Boolean) =
(1 to trials).map(_ => if (test()) 1 else 0).sum * 1.0 / trials
... and I can calculate Pi
monteCarlo(100000, circleTest) * 4
Now I wonder if the monteCarlo function can be improved. How would you write monteCarlo so that it is both efficient and readable?
For example, since the number of trials is large, is it worth using a view or iterator instead of Range(1, trials), and reduce instead of map and sum?
It's worth noting that Random.nextDouble is side-effecting—when you call it it changes the state of the random number generator. This may not be a concern to you, but since there are already five answers here I figure it won't hurt anything to add one that's purely functional.
First you'll need a random number generation monad implementation. Luckily NICTA provides a really nice one that's integrated with Scalaz. You can use it like this:
import com.nicta.rng._, scalaz._, Scalaz._
val pointInUnitSquare = Rng.choosedouble(0.0, 1.0) zip Rng.choosedouble(0.0, 1.0)
val insideCircle = pointInUnitSquare.map { case (x, y) => x * x + y * y <= 1 }
def mcPi(trials: Int): Rng[Double] =
EphemeralStream.range(0, trials).foldLeftM(0) {
case (acc, _) => insideCircle.map(_.fold(1, 0) + acc)
}.map(_ / trials.toDouble * 4)
And then:
scala> val choosePi = mcPi(10000000)
choosePi: com.nicta.rng.Rng[Double] = com.nicta.rng.Rng$$anon$3@16dd554f
Nothing's been computed yet—we've just built up a computation that will generate our value randomly when executed. Let's just execute it on the spot in the IO monad for the sake of convenience:
scala> choosePi.run.unsafePerformIO
res0: Double = 3.1415628
This won't be the most performant solution, but it's good enough that it may not be a problem for many applications, and the referential transparency may be worth it.
A Stream-based version, as another alternative. I think this is quite clear.
def monteCarlo(trials: Int, test: () => Boolean) =
Stream
.continually(if (test()) 1.0 else 0.0)
.take(trials)
.sum / trials
(sum isn't specialised for streams, but its implementation in TraversableOnce just calls foldLeft, which is specialised and "allows GC to collect along the way." So .sum evaluates the elements one at a time without holding on to the whole stream, and the trials are not all kept in memory at once.)
I see no problem with the following recursive version:
def monteCarlo(trials: Int, test: () => Boolean) = {
def bool2double(b: Boolean) = if (b) 1.0d else 0.0d
@scala.annotation.tailrec
def recurse(n: Int, sum: Double): Double =
if (n <= 0) sum / trials
else recurse(n - 1, sum + bool2double(test()))
recurse(trials, 0.0d)
}
And a foldLeft version, too:
def monteCarloFold(trials: Int, test: () => Boolean) =
(1 to trials).foldLeft(0.0d)((s,i) => s + (if (test()) 1.0d else 0.0d)) / trials
This is more memory efficient than the map version in the question.
Using tail recursion might be an idea:
def recMonteCarlo(trials: Int, currentSum: Double, test:() => Boolean):Double = trials match {
case 0 => currentSum
case x =>
val nextSum = currentSum + (if (test()) 1.0 else 0.0)
recMonteCarlo(trials-1, nextSum, test)
}
def monteCarlo(trials: Int, test:() => Boolean) = {
val monteSum = recMonteCarlo(trials, 0, test)
monteSum / trials
}
Using aggregate on a parallel collection, like this,
def monteCarlo(trials: Int, test: () => Boolean) = {
val pr = (1 to trials).par
val s = pr.aggregate(0)( (a,_) => a + (if (test()) 1 else 0), _ + _)
s * 4.0 / trials
}
where partial results are summed up in parallel with other test calculations.
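The question also asks whether an iterator would help. Here is a sketch of that option, which evaluates the trials one at a time and never materialises a collection. MonteCarloIterator is a hypothetical name, and passing a seeded Random into circleTest is an assumption made here only so that runs are reproducible:

```scala
import scala.util.Random

object MonteCarloIterator {
  // Fraction of trials for which `test` returned true. Iterator.fill evaluates
  // test() lazily, once per element, and count consumes the elements as it goes.
  def monteCarlo(trials: Int, test: () => Boolean): Double =
    Iterator.fill(trials)(test()).count(b => b).toDouble / trials

  // Same test as in the question, but drawing from an explicit generator.
  def circleTest(rng: Random): () => Boolean = () => {
    val x = rng.nextDouble()
    val y = rng.nextDouble()
    x * x + y * y <= 1
  }

  def main(args: Array[String]): Unit =
    println(monteCarlo(100000, circleTest(new Random(42))) * 4)
}
```

Comparing squared distances also drops the sqrt call, which does not change the outcome of the test.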

Updating a 2d table of counts

Suppose I want a Scala data structure that implements a 2-dimensional table of counts that can change over time (i.e., individual cells in the table can be incremented or decremented). What should I be using to do this?
I could use a 2-dimensional array:
val x = Array.fill[Int](2, 3)(0)
x(1)(2) += 1
But Arrays are mutable, and I guess I should slightly prefer immutable data structures.
So I thought about using a 2-dimensional Vector:
val x = Vector.fill[Int](2, 3)(0)
// how do I update this? I want to write something like val newX : Vector[Vector[Int]] = x.add((1, 2), 1)
// but I'm not sure how
But I'm not sure how to get a new vector with only a single element changed.
What's the best approach?
Best depends on what your criteria are. The simplest immutable variant is to use a map from (Int,Int) to your count:
var c = (for (i <- 0 to 99; j <- 0 to 99) yield (i,j) -> 0).toMap
Then you access your values with c(i,j) and set them with c += ((i,j) -> n); c += ((i,j) -> (c(i,j)+1)) is a little bit annoying, but it's not too bad.
Faster is to use nested Vectors--by about a factor of 2 to 3, depending on whether you tend to re-set the same element over and over or not--but it has an ugly update method:
var v = Vector.fill(100,100)(0)
v(82)(49) // Easy enough
v = v.updated(82, v(82).updated(49, v(82)(49)+1)) // Ouch!
Faster yet (by about 2x) is to have only one vector which you index into:
var u = Vector.fill(100*100)(0)
u(82*100 + 49) // Um, you think I can always remember to do this right?
u = u.updated(82*100 + 49, u(82*100 + 49)+1) // Well, that's actually better
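The index arithmetic for the single-vector layout can be hidden behind a small immutable wrapper. This is only a sketch: Counts, zeros, and add are hypothetical names, and the bounds check is an addition that the snippet above does not have:

```scala
// Immutable 2D table of counts stored row-major in one flat Vector.
final case class Counts(rows: Int, cols: Int, cells: Vector[Int]) {
  private def index(i: Int, j: Int): Int = {
    require(i >= 0 && i < rows && j >= 0 && j < cols, s"($i, $j) is out of bounds")
    i * cols + j
  }

  def apply(i: Int, j: Int): Int = cells(index(i, j))

  // Returns a new table; the receiver is left unchanged.
  def add(i: Int, j: Int, delta: Int): Counts = {
    val k = index(i, j)
    copy(cells = cells.updated(k, cells(k) + delta))
  }
}

object Counts {
  def zeros(rows: Int, cols: Int): Counts =
    Counts(rows, cols, Vector.fill(rows * cols)(0))
}
```

With this, Counts.zeros(100, 100).add(82, 49, 1) replaces the hand-written 82*100 + 49.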
If you don't need immutability and your table size isn't going to change, just use an array as you've shown. It's ~200x faster than the fastest vector solution if all you're doing is incrementing and decrementing an integer.
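For the mutable case, a sketch of a fixed-size table backed by a single array might look like the following. CountTable is a hypothetical name, and using one flat array where the question uses a nested one is a design choice made here, not part of the answer's measurements:

```scala
// Mutable, fixed-size 2D table of counts backed by one flat Array[Int].
final class CountTable(val rows: Int, val cols: Int) {
  private val cells = new Array[Int](rows * cols) // all zeros initially

  private def index(i: Int, j: Int): Int = {
    require(i >= 0 && i < rows && j >= 0 && j < cols, s"($i, $j) is out of bounds")
    i * cols + j
  }

  def apply(i: Int, j: Int): Int = cells(index(i, j))

  // Increments (or, with a negative delta, decrements) one cell in place.
  def add(i: Int, j: Int, delta: Int): Unit = {
    val k = index(i, j)
    cells(k) += delta
  }
}
```

Usage would be along the lines of val t = new CountTable(100, 100); t.add(82, 49, 1); t(82, 49).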
If you want to do this in a very general and functional (but not necessarily performant) way, you can use lenses. Here's an example of how you could use Scalaz 7's implementation, for example:
import scalaz._
def at[A](i: Int): Lens[Seq[A], A] = Lens.lensg(a => a.updated(i, _), (_(i)))
def at2d[A](i: Int, j: Int) = at[Seq[A]](i) andThen at(j)
And a little bit of setup:
val table = Vector.tabulate(3, 4)(_ + _)
def show[A](t: Seq[Seq[A]]) = t.map(_ mkString " ") mkString "\n"
Which gives us:
scala> show(table)
res0: String =
0 1 2 3
1 2 3 4
2 3 4 5
We can use our lens like this:
scala> show(at2d(1, 2).set(table, 9))
res1: String =
0 1 2 3
1 2 9 4
2 3 4 5
Or we can just get the value at a given cell:
scala> val v: Int = at2d(2, 3).get(table)
v: Int = 5
Or do a lot of more complex things, like apply a function to a particular cell:
scala> show(at2d(2, 2).mod(((_: Int) * 2), table))
res8: String =
0 1 2 3
1 2 3 4
2 3 8 5
And so on.
There isn't a built-in method for this, perhaps because it would require the Vector to know that it contains Vectors, or Vectors of Vectors, etc., whereas most methods are generic. It would also require a separate method for each number of dimensions, because you need to specify a co-ordinate argument for each dimension.
However, you can add these yourself; the following will take you up to 4D, although you could just add the bits for 2D if that's all you need:
object UpdatableVector {
implicit def vectorToUpdatableVector2[T](v: Vector[Vector[T]]) = new UpdatableVector2(v)
implicit def vectorToUpdatableVector3[T](v: Vector[Vector[Vector[T]]]) = new UpdatableVector3(v)
implicit def vectorToUpdatableVector4[T](v: Vector[Vector[Vector[Vector[T]]]]) = new UpdatableVector4(v)
class UpdatableVector2[T](v: Vector[Vector[T]]) {
def updated2(c1: Int, c2: Int)(newVal: T) =
v.updated(c1, v(c1).updated(c2, newVal))
}
class UpdatableVector3[T](v: Vector[Vector[Vector[T]]]) {
def updated3(c1: Int, c2: Int, c3: Int)(newVal: T) =
v.updated(c1, v(c1).updated2(c2, c3)(newVal))
}
class UpdatableVector4[T](v: Vector[Vector[Vector[Vector[T]]]]) {
def updated4(c1: Int, c2: Int, c3: Int, c4: Int)(newVal: T) =
v.updated(c1, v(c1).updated3(c2, c3, c4)(newVal))
}
}
In Scala 2.10 you don't need the implicit defs and can just add the implicit keyword to the class definitions.
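A sketch of what that looks like for the 2D case, assuming Scala 2.10 or later. The enclosing object is named UpdatableVector2D here (a hypothetical name) only to keep it apart from the object defined above:

```scala
object UpdatableVector2D {
  // The implicit class replaces the separate implicit def plus wrapper class.
  implicit class UpdatableVector2[T](v: Vector[Vector[T]]) {
    def updated2(c1: Int, c2: Int)(newVal: T): Vector[Vector[T]] =
      v.updated(c1, v(c1).updated(c2, newVal))
  }
}
```

After import UpdatableVector2D._, a call such as Vector.fill(2, 2)(0).updated2(1, 1)(42) resolves through the implicit class.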
Test:
import UpdatableVector._
val v2 = Vector.fill(2,2)(0)
val r2 = v2.updated2(1,1)(42)
println(r2) // Vector(Vector(0, 0), Vector(0, 42))
val v3 = Vector.fill(2,2,2)(0)
val r3 = v3.updated3(1,1,1)(42)
println(r3) // etc
Hope that's useful.

call a def within a block

Is there any way to call a def from within a block?
def factor (n: Int) : Int = if (n == 0 ) 1 else n * factor(n-1)
val i = 1000
var sum = 0
i.toString.foreach ( x => sum += factor(x.toInt) )
In the end I want to get the sum of the factorials of every digit.
But it seems like the def doesn't return a value; the sum is 0 every time.
How can I fix it?
Thanks!
The problem actually has nothing to do with Scala per se; your code and your def are fine. The issue is with toInt:
scala> '3'.toInt
res7: Int = 51
toInt doesn't convert the character as a decimal digit; it returns its Unicode code point (51 for '3'). The factorials of such code points are far beyond what factor can handle with Int, so the arithmetic overflows:
scala> factor(6)
res8: Int = 720
scala> factor(20)
res9: Int = -2102132736
scala> factor(100)
res10: Int = 0
So instead use (thanks to Luigi)
x.asDigit
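Putting the fix together, a minimal sketch of the digit-factorial sum might look like this. DigitFactorials and digitFactorialSum are hypothetical names, and filtering out non-digit characters (such as a leading minus sign) is an added assumption:

```scala
object DigitFactorials {
  // Same definition as in the question; safe here because digits are at most 9.
  def factor(n: Int): Int = if (n == 0) 1 else n * factor(n - 1)

  // Sums the factorials of the decimal digits of i.
  def digitFactorialSum(i: Int): Int =
    i.toString
      .filter(_.isDigit)            // skip a possible leading '-'
      .map(c => factor(c.asDigit))  // asDigit maps '3' to 3, not to 51
      .sum
}
```

For i = 1000 this gives 1! + 0! + 0! + 0! = 4, since 0! is 1.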