How tail recursion works internally in scala? - scala

I am not able to understand tail recursion.
How does the accumulator variable store the intermediate values of the recursive calls?
What is the flow of execution? Which call executes first, and why is the return value fact(n, 1)?
import scala.annotation.tailrec

def trec(n: Int): BigInt = {
  @tailrec
  def fact(x: Int, accumulator: BigInt): BigInt = {
    if (x <= 1) accumulator else fact(x - 1, x * accumulator)
  }
  fact(n, 1)
}
println(trec(5))

Recursion is supported by the way method calls are executed. Whenever you call a method, a frame for that call is pushed onto the call stack, and every method called inside it is pushed on top of that frame. Only when a method finishes is its frame popped from the stack, and its return value then becomes available to the outer method that called it.
This is why ordinary recursion works: the pending calls are kept on the stack, and each call uses the value returned by the call it made once that value has been calculated.
Tail recursion is the special case where the recursive call is the last thing done inside the method. Nothing in the current call remains to be done after the recursive call returns, so there is no need to stack one frame on top of another (which is what can cause a StackOverflowError). The Scala compiler therefore compiles a self-recursive tail call into a loop under the hood, which is more efficient. The @tailrec annotation does not trigger this conversion; it makes compilation fail if the method cannot be optimized this way.
So now I am going to try to answer your questions:
1. Each call to the method carries in its accumulator parameter the value of the factorial accumulated so far. The sequence is something like
fact(5, 1) -> fact(4, 5 * 1) -> fact(3, 4 * 5 * 1) -> fact(2, 3 * 4 * 5 * 1) -> fact(1, 2 * 3 * 4 * 5 * 1)
So in the last call the accumulator parameter already holds the result, 120.
2. The flow is: first you call the trec method with the value you want to calculate, and then this method calls its internal fact method, passing the value that you want to calculate and the initial accumulator, which in this case is the neutral element of multiplication, 1. From there the flow is the one written in step 1, following the recursion.
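To make the "converted to a loop" idea concrete, here is a hand-written sketch of a loop that behaves like the tail-recursive fact. It is only an illustration of the idea, not the compiler's actual output, and the name factLoop is made up for this example.

```scala
// Hand-written loop equivalent of the tail-recursive fact (illustrative only).
// factLoop is a hypothetical name, not something from the original code.
def factLoop(n: Int): BigInt = {
  var x = n                     // plays the role of the parameter x
  var accumulator = BigInt(1)   // plays the role of the parameter accumulator
  while (x > 1) {
    accumulator = accumulator * x   // same update as fact(x - 1, x * accumulator)
    x -= 1
  }
  accumulator
}
```

Each pass through the loop corresponds to one "recursive call": the two variables are simply overwritten with the new argument values, so no extra stack frames are needed.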

Related

Expensive flatMap() operation on streams originating from Stream.emits()

I just encountered an issue with degrading fs2 performance using a stream of strings to be written to a file via text.utf8encode. I tried to change my source to use chunked strings to increase performance, but the observation was performance degradation instead.
As far as I can see, it boils down to the following: Invoking flatMap on a stream that originates from Stream.emits() can be very expensive. Time usage seems to be exponential based on the size of the sequence passed to Stream.emits(). The code snippet below shows an example:
/*
Test done with scala 2.11.11 and fs2 version 0.10.0-M7.
*/
val rangeSize = 20000
val integers = (1 to rangeSize).toVector
// Note that the last flatMaps are just added to show extreme load for streamA.
val streamA = Stream.emits(integers).flatMap(Stream.emit(_))
val streamB = Stream.range(1, rangeSize + 1).flatMap(Stream.emit(_))
streamA.toVector // Uses approx. 25 seconds (!)
streamB.toVector // Uses approx. 15 milliseconds
Is this a bug, or should usage of Stream.emits() for large sequences be avoided?
TLDR: Allocations.
Longer answer:
Interesting question. I ran a JFR profile on both methods separately, and looked at the results. First thing which immediately attracted my eye was the amount of allocations.
[Profiler screenshot: allocations for Stream.emit]
[Profiler screenshot: allocations for Stream.range]
We can see that Stream.emit allocates a significant number of Append instances. Append is a concrete implementation of Catenable[A], which is the type used in Stream.emit to fold:
private[fs2] final case class Append[A](left: Catenable[A], right: Catenable[A]) extends Catenable[A]
This actually comes from the way Catenable[A] uses foldLeft in its implementation:
foldLeft(empty: Catenable[B])((acc, a) => acc :+ f(a))
Where :+ allocates a new Append object for each element. This means we're at least generating 20000 such Append objects.
There is also a hint in the documentation of Stream.range: it produces the range lazily, and it points to emits as the way to produce the whole sequence in one chunk, which may be bad if the range we are generating is big:
/**
  * Lazily produce the range `[start, stopExclusive)`. If you want to produce
  * the sequence in one chunk, instead of lazily, use
  * `emits(start until stopExclusive)`.
  *
  * @example {{{
  * scala> Stream.range(10, 20, 2).toList
  * res0: List[Int] = List(10, 12, 14, 16, 18)
  * }}}
  */
def range(start: Int, stopExclusive: Int, by: Int = 1): Stream[Pure,Int] =
  unfold(start){i =>
    if ((by > 0 && i < stopExclusive && start < stopExclusive) ||
        (by < 0 && i > stopExclusive && start > stopExclusive))
      Some((i, i + by))
    else None
  }
You can see that there is no additional wrapping here, only the integers that get emitted as part of the range. On the other hand, Stream.emits creates an Append object for every element in the sequence, where we have a left containing the tail of the stream, and right containing the current value we're at.
Is this a bug? I would say no, but I would definitely open this up as a performance issue to the fs2 library maintainers.
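As a rough illustration of the allocation pattern described above, here is a toy model of a structure that is built up with foldLeft and :+. It is not fs2's Catenable: the names Cat, Empty, Single, Append and countAppends are all hypothetical and exist only to show that one Append node is created per element.

```scala
// Toy model only: these types are hypothetical stand-ins, not fs2's actual Catenable.
sealed trait Cat[+A] {
  // Appending one element wraps the existing structure in a new Append node.
  def :+[B >: A](b: B): Cat[B] = Append[B](this, Single(b))
}
case object Empty extends Cat[Nothing]
final case class Single[A](value: A) extends Cat[A]
final case class Append[A](left: Cat[A], right: Cat[A]) extends Cat[A]

// Count Append nodes by walking down the left spine (a loop, to avoid deep recursion).
def countAppends[A](cat: Cat[A]): Int = {
  var node: Cat[A] = cat
  var count = 0
  var walking = true
  while (walking) {
    node match {
      case Append(l, _) =>
        count += 1
        node = l
      case _ =>
        walking = false
    }
  }
  count
}

// Same shape as the foldLeft shown above: one :+ per element.
val built = (1 to 20000).foldLeft(Empty: Cat[Int])((acc, a) => acc :+ a)
val appendNodes = countAppends(built)   // one Append per element
```

In this model, 20000 elements produce 20000 Append nodes, which matches the "at least 20000 such Append objects" estimate given earlier.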

simple recursive function error - java.lang.StackOverFlow error - output exceeds cut off limit

I am practicing a simple recursive function in Scala. This is from a book:
def calculatePower(x:Int, y:Int): Long = {
if (x>=1)
x*calculatePower(x,y-1)
else 1
}
calculatePower(2,2)
You are checking x but you are decrementing y. That means your base-case will never be reached.
Your method overflows the stack because it never terminates, so stack frames accumulate until there is no more room.
if (x>=1) x*calculatePower(x,y-1): you test whether x is greater than or equal to 1, but in the recursive call you only decrement y!
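A minimal sketch of the fix the answers point to, assuming the intent is to compute x to the power y for y >= 0: the base case tests y, the argument that is actually decremented. The name calculatePowerFixed is used only to distinguish it from the original.

```scala
// Corrected sketch: the condition now checks y, which decreases on every call.
// Assumes y >= 0; calculatePowerFixed is a name chosen for this illustration.
def calculatePowerFixed(x: Int, y: Int): Long = {
  if (y >= 1)
    x * calculatePowerFixed(x, y - 1)
  else 1
}
```

With this change calculatePowerFixed(2, 2) makes three calls (y = 2, 1, 0) and then unwinds, instead of recursing forever.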

Efficient retrieval of last and second to last element of ArrayStack in Scala?

I am using a mutable ArrayStack in Scala but do not know how to access the last element (and second to last element) efficiently (constant time) without popping the items from the stack. Is it possible to access the elements?
stack(4) // returns 5th element
stack.last // returns last element
Those operations are constant time.
stack(4) returns 5th element in constant time
As for the last element, the answer depends on which version you're using. Scala 2.11.7 still ran stack.last in linear time, because it used the TraversableLike implementation:
def last: A = {
var lst = head
for (x <- this)
lst = x
lst
}
This was fixed in version 2.12.0-M4 using the IndexedSeqOptimized trait.
Therefore, to my understanding, if you're using an older version of Scala (which was the case when the question was posted) you should use stack(stack.size - 1), which returns the last element in constant time.
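A small sketch of the index-based access suggested above, assuming a Scala version where scala.collection.mutable.ArrayStack is available (it is deprecated in 2.13 in favour of mutable.Stack). Index 0 is the top of the stack, so the "last" element in iteration order is the one pushed first.

```scala
import scala.collection.mutable

// Assumes mutable.ArrayStack exists in the Scala version in use (deprecated in 2.13).
val stack = new mutable.ArrayStack[Int]
stack.push(1)
stack.push(2)
stack.push(3)

val top          = stack(0)                // most recently pushed element
val lastElem     = stack(stack.size - 1)   // last element, by index
val secondToLast = stack(stack.size - 2)   // second to last element, by index
```

Nothing is popped here, so the stack still holds all three elements afterwards.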

For loop in scala without sequence?

So, while working my way through "Scala for the Impatient" I found myself wondering: Can you use a Scala for loop without a sequence?
For example, there is an exercise in the book that asks you to build a counter object that cannot be incremented past Integer.MAX_VALUE. In order to test my solution, I wrote the following code:
var c = new Counter
for( i <- 0 to Integer.MAX_VALUE ) c.increment()
This throws an error: sequences cannot contain more than Int.MaxValue elements.
It seems to me that means that Scala is first allocating and populating a sequence object, with the values 0 through Integer.MaxValue, and then doing a foreach loop on that sequence object.
I realize that I could do this instead:
var c = new Counter
while(c.value < Integer.MAX_VALUE ) c.increment()
But is there any way to do a traditional C-style for loop with the for statement?
In fact, 0 to N does not actually populate anything with integers from 0 to N. It instead creates an instance of scala.collection.immutable.Range, which applies its methods to all the integers generated on the fly.
The error you ran into is only because you have to be able to fit the number of elements (whether they actually exist or not) into the positive part of an Int in order to maintain the contract for the length method. 1 to Int.MaxValue works fine, as does 0 until Int.MaxValue. And the latter is what your while loop is doing anyway (to includes the right endpoint, until omits it).
Anyway, since the Scala for is a very different (much more generic) creature than the C for, the short answer is no, you can't do exactly the same thing. But you can probably do what you want with for (though maybe not as fast as you want, since there is some performance penalty).
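The length limit described above can be checked without iterating over anything. This is a small sketch; the val names are made up for the example, and the failure is wrapped in Try so that it does not depend on exactly where a given Scala version raises the error.

```scala
import scala.util.Try

// Creating these ranges does not allocate or populate any elements.
val untilRange = 0 until Int.MaxValue
val toRange    = 1 to Int.MaxValue

val untilLength = untilRange.length   // Int.MaxValue elements, fits in an Int
val toLength    = toRange.length      // also Int.MaxValue elements

// 0 to Int.MaxValue would have Int.MaxValue + 1 elements, so its length cannot be an Int.
val tooLong = Try((0 to Int.MaxValue).length)
```

Here tooLong ends up as a Failure, while the other two ranges report their length immediately.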
Wow, some nice technical answers for a simple question (which is good!) But in case anyone is just looking for a simple answer:
//start from 0, stop at 9 inclusive
for (i <- 0 until 10){
println("Hi " + i)
}
//or start from 0, stop at 9 inclusive
for (i <- 0 to 9){
println("Hi " + i)
}
As Rex pointed out, "to" includes the right endpoint, "until" omits it.
Yes and no, it depends what you are asking for. If you're asking whether you can iterate over a sequence of integers without having to build that sequence first, then yes you can, for instance using streams:
def fromTo(from : Int, to : Int) : Stream[Int] =
if(from > to) {
Stream.empty
} else {
// println("one more.") // uncomment to see when it is called
Stream.cons(from, fromTo(from + 1, to))
}
Then:
for(i <- fromTo(0, 5)) println(i)
Writing your own iterator by defining hasNext and next is another option.
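As a sketch of that iterator option: the helper below (rangeIterator is a hypothetical name) produces the integers from `from` to `to` inclusive on demand, and it stops before incrementing past `to`, so an upper bound of Int.MaxValue does not overflow.

```scala
// rangeIterator is a hypothetical helper name used only for this sketch.
def rangeIterator(from: Int, to: Int): Iterator[Int] = new Iterator[Int] {
  private var current = from
  private var exhausted = from > to   // empty when the bounds are reversed

  def hasNext: Boolean = !exhausted

  def next(): Int = {
    if (exhausted) throw new NoSuchElementException("next on empty iterator")
    val value = current
    // Stop before incrementing past `to`, so `to == Int.MaxValue` cannot overflow.
    if (current == to) exhausted = true else current += 1
    value
  }
}
```

It can be used with the for syntax in the same way as fromTo, for example for (i <- rangeIterator(0, 5)) println(i).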
If you're asking whether you can use the 'for' syntax to write a "native" loop, i.e. a loop that works by incrementing some native integer rather than iterating over values produced by an instance of an object, then the answer is, as far as I know, no. As you may know, 'for' comprehensions are syntactic sugar for a combination of calls to flatMap, filter, map and/or foreach (all defined in the FilterMonadic trait), depending on the nesting of generators and their types. You can try to compile some loop and print its compiler intermediate representation with
scalac -Xprint:refchecks
to see how they are expanded.
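For a rough idea of what that expansion looks like, here are two for expressions next to hand-written equivalents. The right-hand versions are written by hand to show the shape of the translation, not copied from compiler output, and all names are made up for the example.

```scala
// A for loop without yield corresponds to a foreach call.
def sumWithFor(n: Int): Int = {
  var total = 0
  for (i <- 0 until n) total += i
  total
}

def sumWithForeach(n: Int): Int = {
  var total = 0
  (0 until n).foreach(i => total += i)
  total
}

// Nested generators with a guard and yield correspond to flatMap, withFilter and map.
val pairsFor = for (i <- 1 to 2; j <- 1 to 2 if i != j) yield (i, j)
val pairsByHand = (1 to 2).flatMap(i => (1 to 2).withFilter(j => i != j).map(j => (i, j)))
```

In both cases the for syntax and the explicit method calls produce the same result.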
There's a bunch of these out there, but I can't be bothered googling them at the moment. The following is pretty canonical:
@scala.annotation.tailrec
def loop(from: Int, until: Int)(f: Int => Unit): Unit = {
  if (from < until) {
    f(from)
    loop(from + 1, until)(f)
  }
}
loop(0, 10) { i =>
  println("Hi " + i)
}

Mockito different range expectations

I'm using Mockito as a part of Specs in scala code and I've stumbled upon the following task:
Given an ArrayBuffer that emulates a chess board (8x8 = 64 cells): if we query the ArrayBuffer for a cell that doesn't exist (an index greater than 63 or less than 0), we should receive None. Otherwise we return Some(0) (in almost all cases) or Some(1) (in just a few specified cells).
Right now I'm thinking about spies and something that starts like:
val spiedArray = spy(new ArrayBuffer[Int])
for (x <- 1 to 8; y <- 1 to 8) {
doReturn(Some(0)).when(spiedArray).apply(x * y-1)
}
And then explicitly respecify cells with Some(1).
But how about out-of-bound cells that should return None?
Is there a simple and natural way to achieve that mocking?
The main issue here is that the specification is wrong: an ArrayBuffer cannot behave as the spec expects. Thus you must either:
Change the expected behavior
Replace ArrayBuffer with a homemade trait
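One possible shape for the second option, as a sketch only: a small trait whose lookup returns an Option, plus a hand-written test double. Board, cell and FakeBoard are hypothetical names invented for this example, and no Mockito is involved.

```scala
// Hypothetical trait: Board and cell are names invented for this sketch.
trait Board {
  def cell(index: Int): Option[Int]
}

// A simple hand-written test double: 64 cells, a few of them set to 1.
class FakeBoard(ones: Set[Int]) extends Board {
  def cell(index: Int): Option[Int] =
    if (index < 0 || index > 63) None        // out-of-bounds cells
    else if (ones.contains(index)) Some(1)   // the few specified cells
    else Some(0)                             // every other cell
}

val board = new FakeBoard(Set(3, 42))
```

A trait like this can also be mocked or spied on if needed, since its contract already matches the Some/None behavior described in the question.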