Why is the Scala library so imperative? [duplicate]

I'm new to Scala and was just reading Scala By Example. In chapter 2, the author presents two different versions of Quicksort.
One is imperative style:
def sort(xs: Array[Int]) {
  def swap(i: Int, j: Int) {
    val t = xs(i); xs(i) = xs(j); xs(j) = t
  }
  def sort1(l: Int, r: Int) {
    val pivot = xs((l + r) / 2)
    var i = l; var j = r
    while (i <= j) {
      while (xs(i) < pivot) i += 1
      while (xs(j) > pivot) j -= 1
      if (i <= j) {
        swap(i, j)
        i += 1
        j -= 1
      }
    }
    if (l < j) sort1(l, j)
    if (j < r) sort1(i, r)
  }
  sort1(0, xs.length - 1)
}
One is functional style:
def sort(xs: Array[Int]): Array[Int] = {
  if (xs.length <= 1) xs
  else {
    val pivot = xs(xs.length / 2)
    Array.concat(
      sort(xs filter (pivot >)),
      xs filter (pivot ==),
      sort(xs filter (pivot <)))
  }
}
The obvious advantage of the functional style over the imperative style is conciseness. But what about performance? Since it uses recursion, do we pay a performance penalty as we would in imperative languages like C? Or, since Scala is a hybrid language, is the "Scala way" (functional) preferred, and therefore more efficient?
Note: The author did mention the functional style does use more memory.

It depends. If you look in the Scala sources, an imperative style is often used "under the hood" in order to be performant - and in many cases it is exactly these tweaks that let you write performant functional code. So usually you can come up with a functional solution that is fast enough, but you must be careful and know what you are doing (especially concerning your data structures). E.g. the Array.concat in the second example is not nice, but probably not too bad - whereas using Lists here and concatenating them with ::: would be overkill.
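To make that remark about ::: concrete, here is a minimal sketch (my own illustration, not from the answer) of the same algorithm on Lists - every ::: rebuilds its whole left operand, so each recursive step pays extra traversals and allocations on top of the filtering:
def sortList(xs: List[Int]): List[Int] =
  if (xs.length <= 1) xs
  else {
    val pivot = xs(xs.length / 2) // even indexing into a List is O(n)
    sortList(xs filter (pivot > _)) :::
      (xs filter (pivot == _)) :::
      sortList(xs filter (pivot < _))
  }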
But that is no more than educated guessing if you don't actually measure the performance. In complex projects it's really hard to predict performance, especially as things like object creation and method calls get more and more optimized by the compiler and the JVM.
I'd suggest starting out with the functional style. If it is too slow, profile it. Usually there is a better functional solution. If not, you can use the imperative style (or a mix of both) as a last resort.

Related

Scala: IndexedSeq.newBuilder vs. ArrayBuffer

I had accepted that building an IndexedSeq in a loop should use an ArrayBuffer, followed by a conversion to a Vector via .toVector.
In an example I profiled, the CPU hotspot was in this section, so I tried an alternative: use IndexedSeq.newBuilder followed by conversion to an immutable collection via .result().
This change gave a significant performance improvement. The code looks almost the same, so it seems that using IndexedSeq.newBuilder is best practice. Is this correct? The example method is shown below, with the ArrayBuffer difference commented out.
def interleave[T](a: IndexedSeq[T], b: IndexedSeq[T]): IndexedSeq[T] = {
  val al = a.length
  val bl = b.length
  val buffer = IndexedSeq.newBuilder[T]
  //---> val buffer = new ArrayBuffer[T](al + bl)
  val commonLength = Math.min(al, bl)
  val aExtra = al - commonLength
  val bExtra = bl - commonLength
  var i = 0
  while (i < commonLength) {
    buffer += a(i)
    buffer += b(i)
    i += 1
  }
  if (aExtra > 0) {
    while (i < al) {
      buffer += a(i)
      i += 1
    }
  } else if (bExtra > 0) {
    while (i < bl) {
      buffer += b(i)
      i += 1
    }
  }
  buffer.result()
  //---> buffer.toVector
}
As to which is best practice, I guess it depends upon your requirements. Both approaches are acceptable and understandable. All things being equal, in this particular case I would favor IndexedSeq.newBuilder over ArrayBuilder (since the latter targets the creation of an Array, while the former's result is a Vector).
Just one point on benchmarking: this is a real art form, due to caching, JIT and HotSpot optimization, garbage collection, etc. One piece of software you might consider using for it is ScalaMeter. You will need to write both versions of the function that populates the final vector, and ScalaMeter will give you accurate statistics on both. ScalaMeter allows the code to warm up before taking measurements, and can also look at memory requirements as well as CPU time.
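As a rough sketch of what that looks like (assuming ScalaMeter's inline benchmarking API; the inputs here are placeholders I made up):
import org.scalameter._

val a = Vector.tabulate(10000)(identity)
val b = Vector.tabulate(10000)(_ * 2)

// Run the measured body repeatedly after a default warm-up phase.
val time = config(
  Key.exec.benchRuns -> 20
) withWarmer {
  new Warmer.Default
} measure {
  interleave(a, b)
}
println(s"average running time: $time")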
In this example informal testing did not deceive, but ScalaMeter does provide a clearer picture of performance (the line colors refer to its output chart). Building the result in an ArrayBuffer (top orange line) is definitely slower than the more direct newBuilder (blue line).
Returning the ArrayBuffer as an IndexedSeq without conversion is the fastest (green line), but of course it does not give you the true protection of an immutable collection.
Building the intermediate result in an Array (red line) falls between ArrayBuffer and newBuilder.
The "zipAll" collection method allows the interleave to be done in a more functional style:
def interleaveZipAllBuilderPat[T](a: IndexedSeq[T], b: IndexedSeq[T]): IndexedSeq[T] = {
  a.zipAll(b, null, null).foldLeft(Vector.newBuilder[T]) { (z, tp) =>
    tp match {
      case (x: T, null) => z += x      // b exhausted
      case (null, y: T) => z += y      // a exhausted
      case (x: T, y: T) => z += x += y // both still have elements
    }
  }.result()
}
The slowest are the two functional versions; they are almost the same and differ only in that one uses a pattern match and the other an if statement, so the pattern match is not what makes them slow.
The functional version is marginally worse than the direct loop if an ArrayBuffer is used to accumulate the result, but the direct loop using newBuilder is significantly faster.
If zipAll could return a builder, and if the builder were iterable, the functional style could be faster: there would be no need to produce the immutable result if the next step just requires an iteration over the elements.
So for me newBuilder is the clear winner.

Scala: How to break out of a nested for comprehension

I'm trying to write some code like the below:
import scala.collection.mutable.PriorityQueue

def kthSmallest(matrix: Array[Array[Int]], k: Int): Int = {
  val pq = new PriorityQueue[Int]() //natural ordering
  var count = 0
  for (
    i <- matrix.indices;
    j <- matrix(0).indices
  ) yield {
    pq += matrix(i)(j)
    count += 1
  } //This would yield Any!
  pq.dequeue() //kth smallest.
}
My question is that I only want to loop while count is less than k (something like takeWhile(count != k)), but as I'm also inserting elements into the priority queue in the yield, this won't work as it stands.
My other option is to write a nested loop and return once count reaches k. Is it possible to do this with yield? I could not find an idiomatic way of doing it yet. Any pointers would be helpful.
It's not idiomatic in Scala to use vars or to break out of loops. You can go for recursion, lazy evaluation, or duct-tape a break onto it, giving up some performance (just like return, break is implemented via an exception and won't perform particularly well). Here are the options broken down:
Use recursion - recursive algorithms are the analog of loops in functional languages
import scala.annotation.tailrec
import scala.collection.mutable.PriorityQueue

def kthSmallest(matrix: Array[Array[Int]], k: Int): Int = {
  val pq = new PriorityQueue[Int]() //natural ordering
  @tailrec
  def fillQueue(i: Int, j: Int, count: Int): Unit =
    if (count >= k || i >= matrix.length) ()
    else {
      pq += matrix(i)(j)
      fillQueue(
        if (j >= matrix(i).length - 1) i + 1 else i,
        if (j >= matrix(i).length - 1) 0 else j + 1,
        count + 1)
    }
  fillQueue(0, 0, 0)
  pq.dequeue() //kth smallest.
}
Use a lazy structure, as chengpohi suggested - though this doesn't sound very much like a pure function. I'd suggest using an Iterator instead of a Stream in this case, as iterators don't memoize the steps they've gone through (which might spare some memory for large matrices); see the sketch below.
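A minimal sketch of that Iterator variant (my own illustration; it assumes the same matrix, pq and k as above):
// Lazily enumerate the matrix in row-major order, stopping after k elements.
val cells = for {
  i <- matrix.indices.iterator
  j <- matrix(0).indices.iterator
} yield matrix(i)(j)
cells.take(k).foreach(pq += _)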
For those desperately willing to use break, Scala supports it in an attachable fashion (note the performance caveat mentioned above):
import scala.util.control.Breaks._

breakable {
  // loop code
  break
}
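Wired into the asker's loop, that could look like the following sketch (my own assembly of the pieces above, not code from the answer):
import scala.collection.mutable.PriorityQueue
import scala.util.control.Breaks._

def kthSmallestBreak(matrix: Array[Array[Int]], k: Int): Int = {
  val pq = new PriorityQueue[Int]()
  var count = 0
  breakable {
    // Mirrors the asker's approach; only the loop-exit mechanism differs.
    for (i <- matrix.indices; j <- matrix(0).indices) {
      pq += matrix(i)(j)
      count += 1
      if (count >= k) break() // unwinds out of the enclosing breakable block
    }
  }
  pq.dequeue()
}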
There is a way to do this using Stream's lazy evaluation. Since a for/yield expression desugars to flatMap and map calls, you can convert the for/yield to flatMap over a Stream:
matrix.indices.toStream.flatMap(i => {
  matrix(0).indices.toStream.map(j => {
    pq += matrix(i)(j)
    count += 1
  })
}).takeWhile(_ => count <= k).force // the stream must be consumed for the side effects to run
toStream converts the collection to a Stream, and since a Stream is evaluated lazily, takeWhile can test count and terminate the loop early, without evaluating the remaining elements.
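(A side note of mine, not from the answer: on Scala 2.13+ Stream is deprecated in favor of LazyList, and the same idea reads as follows.)
matrix.indices.to(LazyList).flatMap { i =>
  matrix(0).indices.to(LazyList).map { j =>
    pq += matrix(i)(j)
    count += 1
  }
}.takeWhile(_ => count <= k).force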

Solving dynamic programming problems using functional programming

After you get the basic idea, coding dynamic programming (DP) problems in imperative style is pretty straightforward, at least for simpler DP problems. It usually involves some form of table that we fill iteratively based on some formula. This is pretty much all there is to implementing bottom-up DP.
Let's take Longest Increasing Subsequence (LIS) as a simple and common example of a problem that can be solved with a bottom-up DP algorithm.
The C++ implementation is straightforward (not tested):
// Where A is the input (std::vector<int> of size n)
std::vector<int> DP(n, 0);
for (int i = 0; i < n; i++) {
  DP[i] = 1;
  for (int j = 0; j < i; j++) {
    if (A[i] > A[j]) {
      DP[i] = std::max(DP[i], DP[j] + 1);
    }
  }
}
We've just described "how to fill the DP table", which is exactly what imperative programming is.
If we decide to describe "what the DP table is", which is more of an FP way to think about it, it gets a bit more complicated.
Here's some example Scala code for this problem:
// Where A is the input
val DP = A.zipWithIndex.foldLeft(Seq[Int]()) {
  case (_, (_, 0)) => Seq(1)
  case (d, (a, _)) =>
    d :+ d.zipWithIndex.map { case (dj, j) =>
      dj + (if (a > A(j)) 1 else 0)
    }.max
}
To be quite honest, this Scala implementation doesn't seem all that idiomatic. It's just a translation of the imperative solution, with immutability added to it.
I'm curious, what is a general FP way to deal with things like this?
I don't know much about DP but I have a few observations that I hope will contribute to the conversation.
First, your example Scala code doesn't appear to solve the LIS problem. I plugged in the Van der Corput sequence as found on the Wikipedia page and did not get the designated result.
Working through the problem, this is the solution I came up with:
def lis(a: Seq[Int], acc: Seq[Int] = Seq.empty[Int]): Int =
  if (a.isEmpty) acc.length
  else if (acc.isEmpty) lis(a.tail, Seq(a.head)) max lis(a.tail)
  else if (a.head > acc.last) lis(a.tail, acc :+ a.head) max lis(a.tail, acc)
  else lis(a.tail, acc)

lis(Seq(0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)) // res0: Int = 6
This can be adjusted to return the subsequence itself, and I'm sure it can be tweaked for better performance.
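For instance, one possible adjustment along those lines (my sketch, not the answerer's code) returns the subsequence itself by keeping whichever candidate turns out longer:
def lisSeq(a: Seq[Int], acc: Seq[Int] = Seq.empty[Int]): Seq[Int] =
  if (a.isEmpty) acc
  else if (acc.isEmpty || a.head > acc.last) {
    // Either extend the current candidate with a.head or skip it,
    // then keep the longer of the two results.
    val withHead = lisSeq(a.tail, acc :+ a.head)
    val without = lisSeq(a.tail, acc)
    if (withHead.length >= without.length) withHead else without
  } else lisSeq(a.tail, acc)

lisSeq(Seq(0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)) // a subsequence of length 6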
As for memoization, it's not hard to roll your own on an as-needed basis. Here's a basic outline for memoizing any arity-2 function.
def memo[A, B, R](f: (A, B) => R): ((A, B) => R) = {
  val cache = new collection.mutable.WeakHashMap[(A, B), R]
  (a: A, b: B) => cache.getOrElseUpdate((a, b), f(a, b))
}
With this I can create a memoized version of some often-called method/function.
val myfunc = memo { (s: String, i: Int) => s.length > i }
myfunc("bsxxlakjs", 7) // res0: Boolean = true
Note: It used to be that WeakHashMap was recommended so that the cache could drop lesser used elements in memory-challenged environments. I don't know if that's still the case.
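As one more usage sketch (my own illustration, not from the answer), memo can turn a naive two-argument recursion such as binomial coefficients into a cached, DP-like version:
// lazy val lets the function refer to itself through the memo wrapper.
lazy val binom: (Int, Int) => BigInt = memo { (n, k) =>
  if (k == 0 || k == n) BigInt(1)
  else binom(n - 1, k - 1) + binom(n - 1, k)
}
binom(60, 30) // answers quickly; the uncached recursion would take ages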

Selection Sort - Functional Style with recursion

I have only recently started learning Scala and am trying to delve into functional programming. I have seen many posts on selection sort in functional style, but have not been able to fully understand all the solutions that were given. My Scala skills are still nascent.
I have written a piece of Scala code using tail recursion and would appreciate any feedback on the style. Does it look like functional programming? Is there a way to make this better or more functional?
import scala.annotation.tailrec

object FuncSelectionSort {
  /**
   * Selection Sort - Trying Functional Style
   */
  def sort(a: Array[Int]) = {
    val b: Array[Int] = new Array[Int](a.size)
    Array.copy(a, 0, b, 0, a.size)

    // Function to swap elements
    def exchange(i: Int, j: Int): Unit = {
      val k = b(i)
      b(i) = b(j)
      b(j) = k
    }

    @tailrec
    def helper(b: Array[Int], n: Int): Array[Int] = {
      if (n == b.length - 1) b
      else {
        val head = b(n)
        val minimumInTail = b.slice(n, b.length).min
        if (head > minimumInTail) {
          val minimumInTailIndex = b.slice(n, b.length).indexOf(minimumInTail)
          exchange(n, minimumInTailIndex + n)
        }
        helper(b, n + 1)
      }
    }

    helper(b, 0)
  }
}
The logic I have tried to adopt is fairly simple. I start with the first index of the array and find the minimum of the rest. But instead of passing Array.tail to the next recursion, I pass in the full array and check a slice, where each slice is one element shorter than the previous recursion's slice.
For example, for Array(10, 4, 6, 9, 3, 5):
First pass -> head = 10, slice = 4,6,9,3,5
Second pass -> head = 4, slice = 6,9,3,5
I feel it looks the same as passing the tail, but I wanted to try slicing and see if it works the same way.
Appreciate your help.
For detailed feedback on working code, you would be better off on Code Review; however, I can say one thing: in-place sorting of arrays is per se not a good example of functional programming. This is because we purists don't like mutability, as it doesn't fit together well with recursion over data - and your mixing of recursion and mutation in particular is not really good style, I'd say (and it is hard to read).
One clean variant would be to copy the full original array and use in-place selection sort implemented as normal imperative code (with loops and in-place swaps). Encapsulated in a function, this is pure to the outside. This pattern is commonly used in the standard library; cf. List.scala.
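A minimal sketch of that "pure to the outside" variant (my own illustration, not the answerer's code):
def sortedCopy(a: Array[Int]): Array[Int] = {
  val b = a.clone() // the caller's array is never touched
  for (n <- b.indices) {
    val m = (n until b.length).minBy(i => b(i)) // index of the smallest remaining element
    val t = b(n); b(n) = b(m); b(m) = t // in-place swap
  }
  b
}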
The other variant, and probably more instructive for learning immutable programming, is to use an immutable recursive algorithm over linked lists:
def sorted(a: List[Int]): List[Int] = a match {
  case Nil => Nil
  case xs => xs.min :: sorted(xs.diff(List(xs.min)))
}
From that style of programming, you'll learn much more about functional thinking (leaving aside efficiency, though). Exercise: transform that code into tail recursion.
(And actually, insertion sort works more nicely with this pattern, since you don't have to "remove" an element at every step but can build up a sorted linked list; you might try to implement that too - a possible sketch follows.)
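A possible sketch of that insertion-sort variant (again my illustration, leaving efficiency aside just as the answer does):
// Insert x into an already sorted list, keeping it sorted.
def insert(x: Int, sorted: List[Int]): List[Int] = sorted match {
  case Nil => List(x)
  case h :: t => if (x <= h) x :: sorted else h :: insert(x, t)
}

def insertionSorted(a: List[Int]): List[Int] =
  a.foldLeft(List.empty[Int])((acc, x) => insert(x, acc))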

Scala - folding on values that result from object interaction

In Scala I have a list of objects that represent points and contain x and y values. The list describes a path that goes through all these points sequentially. My question is how to use folding on that list to find the total length of the path. Or maybe there is an even better functional or Scala way to do this?
What I have come up with is this:
def distance = (0 /: wps)(Waypoint.distance(_, _))
but of course this is totally wrong, because distance returns a Float but accepts two Waypoint objects.
UPDATE:
Thanks for the proposed solutions! They are definitely interesting, but I think that this is too much functional for real-time calculations that may become heavy. So far I have come up with these lines:
val distances = for (i <- 0 until wps.size - 1) yield wps(i).distanceTo(wps(i + 1))
val distance = (0f /: distances)(_ + _)
I feel this to be a fair imperative/functional mix that is both fast and also keeps the distance values between each pair of waypoints for possible further reference, which is also a benefit in my case.
UPDATE 2: Actually, to determine what is faster, I will have to benchmark all the proposed solutions on all types of sequences.
This should work.
(wps, wps drop 1).zipped.map(Waypoint.distance).sum
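(A side note of mine, not part of the original answer: on Scala 2.13+ the .zipped method on pairs is deprecated; lazyZip expresses the same thing.)
wps.lazyZip(wps.drop(1)).map(Waypoint.distance).sum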
Don't know if fold can be used here, but try this:
wps.sliding(2).map(segment => Waypoint.distance(segment(0), segment(1))).sum
wps.sliding(2) returns an iterator over all consecutive pairs. Or, if you prefer pattern matching:
wps.sliding(2).collect{case start :: end :: Nil => Waypoint.distance(start, end)}.sum
BTW consider defining:
def distanceTo(to: Waypoint)
on the Waypoint class directly, not on the companion object, as it looks more object-oriented and will allow you to write nice DSL-like code:
point1.distanceTo(point2)
or even:
point1 distanceTo point2
wps.sliding(2).collect {
  case start :: end :: Nil => start distanceTo end
}.sum
Your comment "too much functional for real-time calculations that may become heavy" makes this interesting. Benchmarking and profiling are critical, since you don't want to write a bunch of hard-to-maintain code for the sake of performance, only to find out that it's not a performance-critical part of your application in the first place! Or, even worse, find out that your performance optimizations make things worse for your specific workload.
The best-performing implementation will depend on your specifics (How long are the paths? How many cores are on the system?), but I think blending imperative and functional approaches may give you the worst of both worlds. You could lose out on both readability and performance if you're not careful!
I would very slightly modify missingfaktor's answer to allow for performance gains from parallel collections. The fact that simply adding .par can give you a tremendous performance boost demonstrates the power of sticking with functional programming!
def distancePar(wps: collection.GenSeq[Waypoint]): Double = {
  val parwps = wps.par
  parwps.zip(parwps drop 1).map(Function.tupled(distance)).sum
}
My guess is that this would work best if you have several cores to throw at the problem and wps tends to be somewhat long. If you have few cores or short paths, parallelism will probably hurt more than it helps.
The other extreme would be a fully imperative solution. Writing imperative implementations of individual, performance critical, functions is usually acceptable, so long as you avoid shared mutable state. But once you get used to FP, you'll find this sort of function more difficult to write and maintain. And it's also not easy to parallelize.
def distanceImp(wps: collection.GenSeq[Waypoint]): Double = {
  if (wps.size <= 1) {
    0.0
  } else {
    var r = 0.0
    var here = wps.head
    var remaining = wps.tail
    while (!remaining.isEmpty) {
      r += distance(here, remaining.head)
      here = remaining.head
      remaining = remaining.tail
    }
    r
  }
}
Finally, if you're looking for a middle ground between FP and imperative, you might try recursion. I haven't profiled it, but my guess is that this will be roughly equivalent to the imperative solution in terms of performance.
def distanceRec(wps: collection.GenSeq[Waypoint]): Double = {
  @annotation.tailrec
  def helper(acc: Double, here: Waypoint, remaining: collection.GenSeq[Waypoint]): Double =
    if (remaining.isEmpty)
      acc
    else
      helper(acc + distance(here, remaining.head), remaining.head, remaining.tail)

  if (wps.size <= 1)
    0.0
  else
    helper(0.0, wps.head, wps.tail)
}
If you are doing indexing of any kind, you want to be using Vector, not List:
scala> def timed(op: => Unit) = { val start = System.nanoTime; op; (System.nanoTime - start) / 1e9 }
timed: (op: => Unit)Double
scala> val l = List.fill(100000)(1)
scala> val v = Vector.fill(100000)(1)
scala> timed { var t = 0; for (i <- 0 until l.length - 1) yield t += l(i) + l(i + 1) }
res2: Double = 16.252194583
scala> timed { var t = 0; for (i <- 0 until v.length - 1) yield t += v(i) + v(i + 1) }
res3: Double = 0.047047654
ListBuffer offers fast appends, but it doesn't offer fast random access.