Evaluating Performance of Functions - scala

The "oneToEach" function adds 1 to each element of a List[Int]. The first function is not tail recursive, whereas the latter is.
If I had a one million length List[Int] that I passed in to these 2 functions, which one would perform better? Better = faster or less resource usage.
// Add one to each element of a list
def oneToEach(l: List[Int]) : List[Int] =
l match {
case Nil => l
case x :: xs => (x + 1) :: oneToEach(xs)
}
...
def oneToEachTR(l: List[Int]) : List[Int] = {
def go(l: List[Int], acc: List[Int]) : List[Int] =
l match {
case Nil => acc
case x :: xs => go(xs, acc :+ (x + 1))
}
go(l, List[Int]())
}
If I understand, the first function has algorithmic complexity of O(n) since it's necessary to recurse through each item of the list and add 1.
For oneToEachTR, it uses the :+ operator, which, I've read, is O(n) complexity. As a result of using this operator per recursion/item in the list, does the worst-case algorithm complexity become O(2*n)?
Lastly, for the million-element List, will the latter function perform better with respect to resources since it's tail-recursive?

Regarding
For oneToEachTR, it uses the :+ operator, which, I've read, is O(n) complexity. As a result of using this operator per recursion/item in the list, does the worst-case algorithm complexity become O(2*n)?
no, it becomes O(n^2)
Tail recursion won't save a O(n^2) algorithm vs O(n) for sufficiently large n; 1 million is certainly sufficient!
Why O(n^2)?
You've got a list of n elements.
The first call to :+ will traverse 0 elements (acc is is initially empty) and append 1: 1 operation.
The second call will traverse 1 element and append 1: 2 operations
The third call.. 2 elements + append 1: 3 operations
...
The sum of all "operations" is 1 + 2 + 3 + ... + n = n(n+1)/2 = (1/2)n^2 + n/2. That's "in the order of" n^2, or O(n^2).

Related

How to pick a random value from a collection in Scala

I need a method to pick uniformly a random value from a collection.
Here is my current impl.
implicit class TraversableOnceOps[A, Repr](val elements: TraversableOnce[A]) extends AnyVal {
def pickRandomly : A = elements.toSeq(Random.nextInt(elements.size))
}
But this code instantiate a new collection, so not ideal in term of memory.
Any way to improve ?
[update] make it work with Iterator
implicit class TraversableOnceOps[A, Repr](val elements: TraversableOnce[A]) extends AnyVal {
def pickRandomly : A = {
val seq = elements.toSeq
seq(Random.nextInt(seq.size))
}
}
It may seem at first glance that you can't do this without counting the elements first, but you can!
Iterate through the sequence f and take each element fi with probability 1/i:
def choose[A](it: Iterator[A], r: util.Random): A =
it.zip(Iterator.iterate(1)(_ + 1)).reduceLeft((x, y) =>
if (r.nextInt(y._2) == 0) y else x
)._1
A quick demonstration of uniformity:
scala> ((1 to 1000000)
| .map(_ => choose("abcdef".iterator, r))
| .groupBy(identity).values.map(_.length))
res45: Iterable[Int] = List(166971, 166126, 166987, 166257, 166698, 166961)
Here's a discussion of the math I wrote a while back, though I'm afraid it's a bit unnecessarily long-winded. It also generalizes to choosing any fixed number of elements instead of just one.
Simplest way is just to think of the problem as zipping the collection with an equal-sized list of random numbers, and then just extract the maximum element. You can do this without actually realizing the zipped sequence. This does require traversing the entire iterator, though
val maxElement = s.maxBy(_=>Random.nextInt)
Or, for the implicit version
implicit class TraversableOnceOps[A, Repr](val elements: TraversableOnce[A]) extends AnyVal {
def pickRandomly : A = elements.maxBy(_=>Random.nextInt)
}
It's possible to select an element uniformly at random from a collection, traversing it once without copying the collection.
The following algorithm will do the trick:
def choose[A](elements: TraversableOnce[A]): A = {
var x: A = null.asInstanceOf[A]
var i = 1
for (e <- elements) {
if (Random.nextDouble <= 1.0 / i) {
x = e
}
i += 1
}
x
}
The algorithm works by at each iteration makes a choice: take the new element with probability 1 / i, or keep the previous one.
To understand why the algorithm choose the element uniformly at random, consider this: Start by considering an element in the collection, for example the first one (in this example the collection only has three elements).
At iteration:
Chosen with probability: 1.
Chosen with probability:
(probability of keeping the element at previous iteration) * (keeping at current iteration)
probability => 1 * 1/2 = 1/2
Chosen with probability: 1/2 * 2/3=1/3 (in other words, uniformly)
If we take another element, for example the second one:
0 (not possible to choose the element at this iteration).
1/2.
1/2*2/3=1/3.
Finally for the third one:
0.
0.
1/3.
This shows that the algorithm selects an element uniformly at random. This can be proved formally using induction.
If the collection is large enough that you care about about instantiations, here is the constant memory solution (I assume, it contains ints' but that only matters for passing initial param to fold):
collection.fold((0, 0)) {
case ((0, _), x) => (1, x)
case ((n, x), _) if (random.nextDouble() > 1.0/n) => (n+1, x)
case ((n, _), x) => (n+1, x)
}._2
I am not sure if this requires a further explanation ... Basically, it does the same thing that #svenslaggare suggested above, but in a functional way, since this is tagged as a scala question.

Convert normal recursion to tail recursion

I was wondering if there is some general method to convert a "normal" recursion with foo(...) + foo(...) as the last call to a tail-recursion.
For example (scala):
def pascal(c: Int, r: Int): Int = {
if (c == 0 || c == r) 1
else pascal(c - 1, r - 1) + pascal(c, r - 1)
}
A general solution for functional languages to convert recursive function to a tail-call equivalent:
A simple way is to wrap the non tail-recursive function in the Trampoline monad.
def pascalM(c: Int, r: Int): Trampoline[Int] = {
if (c == 0 || c == r) Trampoline.done(1)
else for {
a <- Trampoline.suspend(pascal(c - 1, r - 1))
b <- Trampoline.suspend(pascal(c, r - 1))
} yield a + b
}
val pascal = pascalM(10, 5).run
So the pascal function is not a recursive function anymore. However, the Trampoline monad is a nested structure of the computation that need to be done. Finally, run is a tail-recursive function that walks through the tree-like structure, interpreting it, and finally at the base case returns the value.
A paper from Rúnar Bjanarson on the subject of Trampolines: Stackless Scala With Free Monads
In cases where there is a simple modification to the value of a recursive call, that operation can be moved to the front of the recursive function. The classic example of this is Tail recursion modulo cons, where a simple recursive function in this form:
def recur[A](...):List[A] = {
...
x :: recur(...)
}
which is not tail recursive, is transformed into
def recur[A]{...): List[A] = {
def consRecur(..., consA: A): List[A] = {
consA :: ...
...
consrecur(..., ...)
}
...
consrecur(...,...)
}
Alexlv's example is a variant of this.
This is such a well known situation that some compilers (I know of Prolog and Scheme examples but Scalac does not do this) can detect simple cases and perform this optimisation automatically.
Problems combining multiple calls to recursive functions have no such simple solution. TMRC optimisatin is useless, as you are simply moving the first recursive call to another non-tail position. The only way to reach a tail-recursive solution is remove all but one of the recursive calls; how to do this is entirely context dependent but requires finding an entirely different approach to solving the problem.
As it happens, in some ways your example is similar to the classic Fibonnaci sequence problem; in that case the naive but elegant doubly-recursive solution can be replaced by one which loops forward from the 0th number.
def fib (n: Long): Long = n match {
case 0 | 1 => n
case _ => fib( n - 2) + fib( n - 1 )
}
def fib (n: Long): Long = {
def loop(current: Long, next: => Long, iteration: Long): Long = {
if (n == iteration)
current
else
loop(next, current + next, iteration + 1)
}
loop(0, 1, 0)
}
For the Fibonnaci sequence, this is the most efficient approach (a streams based solution is just a different expression of this solution that can cache results for subsequent calls). Now,
you can also solve your problem by looping forward from c0/r0 (well, c0/r2) and calculating each row in sequence - the difference being that you need to cache the entire previous row. So while this has a similarity to fib, it differs dramatically in the specifics and is also significantly less efficient than your original, doubly-recursive solution.
Here's an approach for your pascal triangle example which can calculate pascal(30,60) efficiently:
def pascal(column: Long, row: Long):Long = {
type Point = (Long, Long)
type Points = List[Point]
type Triangle = Map[Point,Long]
def above(p: Point) = (p._1, p._2 - 1)
def aboveLeft(p: Point) = (p._1 - 1, p._2 - 1)
def find(ps: Points, t: Triangle): Long = ps match {
// Found the ultimate goal
case (p :: Nil) if t contains p => t(p)
// Found an intermediate point: pop the stack and carry on
case (p :: rest) if t contains p => find(rest, t)
// Hit a triangle edge, add it to the triangle
case ((c, r) :: _) if (c == 0) || (c == r) => find(ps, t + ((c,r) -> 1))
// Triangle contains (c - 1, r - 1)...
case (p :: _) if t contains aboveLeft(p) => if (t contains above(p))
// And it contains (c, r - 1)! Add to the triangle
find(ps, t + (p -> (t(aboveLeft(p)) + t(above(p)))))
else
// Does not contain(c, r -1). So find that
find(above(p) :: ps, t)
// If we get here, we don't have (c - 1, r - 1). Find that.
case (p :: _) => find(aboveLeft(p) :: ps, t)
}
require(column >= 0 && row >= 0 && column <= row)
(column, row) match {
case (c, r) if (c == 0) || (c == r) => 1
case p => find(List(p), Map())
}
}
It's efficient, but I think it shows how ugly complex recursive solutions can become as you deform them to become tail recursive. At this point, it may be worth moving to a different model entirely. Continuations or monadic gymnastics might be better.
You want a generic way to transform your function. There isn't one. There are helpful approaches, that's all.
I don't know how theoretical this question is, but a recursive implementation won't be efficient even with tail-recursion. Try computing pascal(30, 60), for example. I don't think you'll get a stack overflow, but be prepared to take a long coffee break.
Instead, consider using a Stream or memoization:
val pascal: Stream[Stream[Long]] =
(Stream(1L)
#:: (Stream from 1 map { i =>
// compute row i
(1L
#:: (pascal(i-1) // take the previous row
sliding 2 // and add adjacent values pairwise
collect { case Stream(a,b) => a + b }).toStream
++ Stream(1L))
}))
The accumulator approach
def pascal(c: Int, r: Int): Int = {
def pascalAcc(acc:Int, leftover: List[(Int, Int)]):Int = {
if (leftover.isEmpty) acc
else {
val (c1, r1) = leftover.head
// Edge.
if (c1 == 0 || c1 == r1) pascalAcc(acc + 1, leftover.tail)
// Safe checks.
else if (c1 < 0 || r1 < 0 || c1 > r1) pascalAcc(acc, leftover.tail)
// Add 2 other points to accumulator.
else pascalAcc(acc, (c1 , r1 - 1) :: ((c1 - 1, r1 - 1) :: leftover.tail ))
}
}
pascalAcc(0, List ((c,r) ))
}
It does not overflow the stack but as on big row and column but Aaron mentioned it's not fast.
Yes it's possible. Usually it's done with accumulator pattern through some internally defined function, which has one additional argument with so called accumulator logic, example with counting length of a list.
For example normal recursive version would look like this:
def length[A](xs: List[A]): Int = if (xs.isEmpty) 0 else 1 + length(xs.tail)
that's not a tail recursive version, in order to eliminate last addition operation we have to accumulate values while somehow, for example with accumulator pattern:
def length[A](xs: List[A]) = {
def inner(ys: List[A], acc: Int): Int = {
if (ys.isEmpty) acc else inner(ys.tail, acc + 1)
}
inner(xs, 0)
}
a bit longer to code, but i think the idea i clear. Of cause you can do it without inner function, but in such case you should provide acc initial value manually.
I'm pretty sure it's not possible in the simple way you're looking for the general case, but it would depend on how elaborate you permit the changes to be.
A tail-recursive function must be re-writable as a while-loop, but try implementing for example a Fractal Tree using while-loops. It's possble, but you need to use an array or collection to store the state for each point, which susbstitutes for the data otherwise stored in the call-stack.
It's also possible to use trampolining.
It is indeed possible. The way I'd do this is to
begin with List(1) and keep recursing till you get to the
row you want.
Worth noticing that you can optimize it: if c==0 or c==r the value is one, and to calculate let's say column 3 of the 100th row you still only need to calculate the first three elements of the previous rows.
A working tail recursive solution would be this:
def pascal(c: Int, r: Int): Int = {
#tailrec
def pascalAcc(c: Int, r: Int, acc: List[Int]): List[Int] = {
if (r == 0) acc
else pascalAcc(c, r - 1,
// from let's say 1 3 3 1 builds 0 1 3 3 1 0 , takes only the
// subset that matters (if asking for col c, no cols after c are
// used) and uses sliding to build (0 1) (1 3) (3 3) etc.
(0 +: acc :+ 0).take(c + 2)
.sliding(2, 1).map { x => x.reduce(_ + _) }.toList)
}
if (c == 0 || c == r) 1
else pascalAcc(c, r, List(1))(c)
}
The annotation #tailrec actually makes the compiler check the function
is actually tail recursive.
It could be probably be further optimized since given that the rows are symmetric, if c > r/2, pascal(c,r) == pascal ( r-c,r).. but left to the reader ;)

Explain some scala code - beginner

I've encountered this scala code and I'm trying to work out what its doing except the fact it returns an int. I'm unsure of these three lines :
l match {
case h :: t =>
case _ => 0
Function :
def iterate(l: List[Int]): Int =
l match {
case h :: t =>
if (h > n) 0
case _ => 0
}
First, you define a function called iterate and you specified the return type as Int. It has arity 1, parameter l of type List[Int].
The List type is prominent throughout functional programming, and it's main characteristics being that it has efficient prepend and that it is easy to decompose any List into a head and tail. The head would be the first element of the list (if non-empty) and the tail would be the rest of the List(which itself is a List) - this becomes useful for recursive functions that operate on List.
The match is called pattern matching.. it's essentially a switch statement in the C-ish languages, but much more powerful - the switch restricts you to constants (at least in C it does), but there is no such restriction with match.
Now, your first case you have h :: t - the :: is called a "cons", another term from functional programming. When you create a new List from another List via a prepend, you can use the :: operator to do it.
Example:
val oldList = List(1, 2, 3)
val newList = 0 :: oldList // newList == List(0, 1, 2, 3)
In Scala, operators that end with a : are really a method of the right hand side, so 0 :: oldList is the equivalent of oldList.::(0) - the 0 :: oldList is syntactic sugar that makes it easier to read.
We could've defined oldList like
val oldList = 1 :: 2 :: 3 :: Nil
where Nil represents an empty List. Breaking this down into steps:
3 :: Nil is evaluated first, creating the equivalent of a List(3) which has head 3 and empty tail.
2 is prepended to the above list, creative a new list with head 2 and tail List(3).
1 is prepended, creating a new list with head 1 and tail List(2, 3).
The resulting List of List(1, 2, 3) is assigned to the val oldList.
Now when you use :: to pattern match you essentially decompose a List into a head and tail, like the reverse of how we created the List above. Here when you do
l match {
case h :: t => ...
}
you are saying decompose l into a head and tail if possible. If you decompose successfully, you can then use these h and t variables to do whatever you want.. typically you would do something like act on h and call the recursive function on t.
One thing to note here is that your code will not compile.. you do an if (h > n) 0 but there is no explicit else so what happens is your code looks like this to the compiler:
if (h > n) 0
else { }
which has type AnyVal (the common supertype of 0 and "nothing"), a violation of your Int guarentee - you're going to have to add an else branch with some failure value or something.
The second case _ => is like a default in the switch, it catches anything that failed the head/tail decomposition in your first case.
Your code essentially does this:
Take the l List parameter and see if it can be decomposed into a head and tail.
If it can be, compare the head against (what I assume to be) a variable in the outer scope called n. If it is greater than n, the function returns 0. (You need to add what happens if it's not greater)
If it cannot be decomposed, the function returns 0.
This is called pattern matching. It's like a switch statement, but more powerful.
Some useful resources:
http://www.scala-lang.org/node/120
http://www.codecommit.com/blog/scala/scala-for-java-refugees-part-4

Off by one with sliding?

One of the advantages of not handling collections through indices is to avoid off-by-one errors. That's certainly not the only advantage, but it is one of them.
Now, I often use sliding in some algorithms in Scala, but I feel that it usually results in something very similar to the off-by-one errors, because a sliding of m elements in a collection of size n has size n - m + 1 elements. Or, more trivially, list sliding 2 is one element shorter than list.
The feeling I get is that there's a missing abstraction in this pattern, something that would be part sliding, part something more -- like foldLeft is to reduceLeft. I can't think of what that might be, however. Can anyone help me find enlightenment here?
UPDATE
Since people are not clear one what I'm talking, let's consider this case. I want to capitalize a string. Basically, every letter that is not preceded by a letter should be upper case, and all other letters should be lower case. Using sliding, I have to special case either the first or the last letter. For example:
def capitalize(s: String) = s(0).toUpper +: s.toSeq.sliding(2).map {
case Seq(c1, c2) if c2.isLetter => if (c1.isLetter) c2.toLower else c2.toUpper
case Seq(_, x) => x
}.mkString
I’m taking Owen’s answer as an inspiration to this.
When you want to apply a simple diff() to a list, this can be seen as equivalent to the following matrix multiplication.
a = (0 1 4 3).T
M = ( 1 -1 0 0)
( 0 1 -1 0)
( 0 0 1 -1)
diff(a) = M * a = (1 3 1).T
We may now use the same scheme for general list operations, if we replace addition and multiplication (and if we generalise the numbers in our matrix M).
So, with plus being a list append operation (with flatten afterwards – or simply a collect operation), and the multiplicative equivalent being either Some(_) or None, a slide with a window size of two becomes:
M = (Some(_) Some(_) None None)
(None Some(_) Some(_) None)
(None None Some(_) Some(_))
slide(a) = M “*” a = ((0 1) (1 4) (4 3)).T
Not sure, if this is the kind of abstraction you’re looking for, but it would be a generalisation on a class of operations which change the number of items.
diff or slide operations of order m for an input of length n will need to use Matrixes of size n-m+1 × n.
Edit: A solution could be to transform List[A] to List[Some[A]] and then to prepend or append (slideLeft or slideRight) these with None. That way you could handle all the magic inside the map method.
list.slideLeft(2) {
case Seq(Some(c1), Some(c2)) if c2.isLetter => if (c1.isLetter) c2.toLower else c2.toUpper
case Seq(_, Some(x)) => x
}
I run into this problem all the time in python/R/Matlab where you diff() a vector and then can't line it up with the original one! It is very frustrating.
I think what's really missing is that the vector only hold the dependent variables, and assumes that you, the programmer, are keeping track of the independent variables, ie the dimension that the collection ranges over.
I think the way to solve this is to have the language to some degree keep track of independent variables; perhaps statically through types, or dynamically by storing them along with the vector. Then it can check the independent axes, make sure they line up, or, I don't know if this is possible, shuffle things around to make them line up.
That's the best I've thought of so far.
EDIT
Another way of thinking about this is, why does your collection have order? Why is it not just a Set? The order means something, but the collection doesn't keep track of that -- it's basically using sequential position (which is about as informative as numerical indices) to proxy for the real meaning.
EDIT
Another consequence would be that transformations like sliding actually represent two transformations, one for the dependent variables, and one for their axis.
In your example, I think the code is made more complex because, you basically want to do a map but working with sliding which introduces edge conditions in a way that doesn't work nicely. I think a fold left with an accumulator that remembers the relevant state may be easier conceptually:
def capitalize2(s: String) = (("", true) /: s){ case ((res, notLetter), c) =>
(res + (if (notLetter) c.toUpper else c.toLower), !c.isLetter)
}._1
I think this could be generalized so that notLetter could remember n elements where n is the size of the sliding window.
The transformation you're asking for inherently reduces the size of the data. Sorry--there's no other way to look at it. tail also gives you off-by-one errors.
Now, you might say--well, fine, but I want a convenience method to maintain the original size.
In that case, you might want these methods on List:
initializedSliding(init: List[A]) = (init ::: this).sliding(1 + init.length)
finalizedSliding(tail: List[A]) = (this ::: tail).sliding(1 + tail.length)
which will maintain your list length. (You can envision how to generalize to non-lists, I'm sure.)
This is the analog to fold left/right in that you supply the missing data in order to perform a pairwise (or more) operation on every element of the list.
The off by one problem you describe reminds me in the boundary condition issue in digital signal processing. The problem occurs since the data (list) is finite. It doesn't occur for infinite data (stream). In digital signal processing the issues is remedied by extending the finite signal to an infinite one. This can be done in various ways like repeating the data or repeating the data and reversing it on every repetition (like it is done for the discrete cosine transform).
Borrowing from these approached for sliding would lead to an abstraction which does not exhibit the off by one problem:
(1::2::3::Nil).sliding(2)
would yield
(1,2), (2,3), (3,1)
for circular boundary conditions and
(1,2), (2,3), (3,2)
for circular boundary conditions with reversal.
Off-by-one errors suggest that you are trying to put the original list in one-to-one correspondence with the sliding list, but something strange is going on, since the sliding list has fewer elements.
The problem statement for your example can be roughly phrased as: "Uppercase every character if it (a) is the first character, or (b) follows a letter character". As Owen points, the first character is a special case, and any abstraction should respect this. Here's a possibility,
def slidingPairMap[A, B](s: List[A], f1: A => B, f2: (A, A) => B): List[B] = s match {
case Nil => Nil
case x :: _ => f1(x) +: s.sliding(2).toList.map { case List(x, y) => f2(x, y) }
}
(not the best implementation, but you get the idea). This generalizes to sliding triples, with off-by-two errors, and so on. The type of slidingPairMap makes it clear that special casing is being done.
An equivalent signature could be
def slidingPairMap[A, B](s: List[A], f: Either[A, (A, A)] => B): List[B]
Then f could use pattern matching to figure out if it's working with the first element, or with a subsequent one.
Or, as Owen says in the comments, why not make a modified sliding method that gives information about whether the element is first or not,
def slidingPairs[A](s: List[A]): List[Either[A, (A, A)]]
I guess this last idea is isomorphic to what Debilski suggests in the comments: pad the beginning of the list with None, wrap all the existing elements with Some, and then call sliding.
I realize this is an old question but I just had a similar problem and I wanted to solve it without having to append or prepend anything, and where it would handle the last elements of the sequence in a seamless manner. The approach I came up with is a slidingFoldLeft. You have to handle the first element as a special case (like some others mentioned, for capitalize, it is a special case), but for the end of the sequence you can just handle it like other cases. Here is the implementation and some silly examples:
def slidingFoldLeft[A, B] (seq: Seq[A], window: Int)(acc: B)(
f: (B, Seq[A]) => B): B = {
if (window > 0) {
val iter = seq.sliding(window)
iter.foldLeft(acc){
// Operate normally
case (acc, next) if iter.hasNext => f(acc, next)
// It's at the last <window> elements of the seq, handle current case and
// call recursively with smaller window
case (acc, next) =>
slidingFoldLeft(next.tail, window - 1)(f(acc, next))(f)
}
} else acc
}
def capitalizeAndQuestionIncredulously(s: String) =
slidingFoldLeft(s.toSeq, 2)("" + s(0).toUpper) {
// Normal iteration
case (acc, Seq(c1, c2)) if c1.isLetter && c2.isLetter => acc + c2.toLower
case (acc, Seq(_, c2)) if c2.isLetter => acc + c2.toUpper
case (acc, Seq(_, c2)) => acc + c2
// Last element of string
case (acc, Seq(c)) => acc + "?!"
}
def capitalizeAndInterruptAndQuestionIncredulously(s: String) =
slidingFoldLeft(s.toSeq, 3)("" + s(0).toUpper) {
// Normal iteration
case (acc, Seq(c1, c2, _)) if c1.isLetter && c2.isLetter => acc + c2.toLower
case (acc, Seq(_, c2, _)) if c2.isLetter => acc + c2.toUpper
case (acc, Seq(_, c2, _)) => acc + c2
// Last two elements of string
case (acc, Seq(c1, c2)) => acc + " (commercial break) " + c2
// Last element of string
case (acc, Seq(c)) => acc + "?!"
}
println(capitalizeAndQuestionIncredulously("hello my name is mAtthew"))
println(capitalizeAndInterruptAndQuestionIncredulously("hello my name is mAtthew"))
And the output:
Hello My Name Is Matthew?!
Hello My Name Is Matthe (commercial break) w?!
I would prepend None after mapping with Some(_) the elements; note that the obvious way of doing it (matching for two Some in the default case, as done in the edit by Debilski) is wrong, as we must be able to modify even the first letter. This way, the abstraction respects the fact that simply sometimes there is no predecessor. Using getOrElse(false) ensures that a missing predecessor is treated as having failed the test.
((None +: "foo1bar".toSeq.map(Some(_))) sliding 2).map {
case Seq(c1Opt, Some(c2)) if c2.isLetter => if (c1Opt.map(_.isLetter).getOrElse(false)) c2.toLower else c2.toUpper
case Seq(_, Some(x)) => x
}.mkString
res13: String = "Foo1Bar"
Acknowledgments: the idea of mapping the elements with Some(_) did come to me through Debilski's post.
I'm not sure if this solves your concrete problem, but we could easily imagine a pair of methods e.g. slidingFromLeft(z: A, size: Int) and slidingToRight(z: A, size: Int) (where A is collection's element type) which, when called on e.g.
List(1, 2, 3, 4, 5)
with arguments e.g. (0, 3), should produce respectively
List(0, 0, 1), List(0, 1, 2), List(1, 2, 3), List(2, 3, 4), List(3, 4, 5)
and
List(1, 2, 3), List(2, 3, 4), List(3, 4, 5), List(4, 5, 0), List(5, 0, 0)
This is the sort of problem nicely-suited to an array-oriented functional language like J. Basically, we generate a boolean with a one corresponding to the first letter of each word. To do this, we start with a boolean marking the spaces in a string. For example (lines indented three spaces are inputs; results are flush with left margin; "NB." starts a comment):
str=. 'now is the time' NB. Example w/extra spaces for interest
]whspc=. ' '=str NB. Mark where spaces are=1
0 0 0 1 1 0 0 1 1 0 0 0 1 1 1 1 0 0 0 0
Verify that (*.-.) ("and not") returns one only for "1 0":
]tt=. #:i.4 NB. Truth table
0 0
0 1
1 0
1 1
(*.-.)/"1 tt NB. Apply to 1-D sub-arrays (rows)
0 0 1 0 NB. As hoped.
Slide our tacit function across pairs in the boolean:
2(*.-.)/\whspc NB. Apply to 2-ples
0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0
But this doesn't handle the edge condition of the initial letter, so force a one into the first position. This actually helps as the reduction of 2-ples left us one short. Here we compare lengths of the original boolean and the target boolean:
#whspc
20
#1,2(*.-.)/\whspc
20
1,2(*.-.)/\whspc
1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0
We get uppercase by using the index into the lowercase vector to select from the uppercase vector (after defining these two vectors):
'lc uc'=. 'abcdefghijklmnopqrstuvwxyz';'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
(uc,' '){~lc i. str
NOW IS THE TIME
Check that insertion by boolean gives correct result:
(1,2(*.-.)/\whspc) } str,:(uc,' '){~lc i. str
Now Is The Time
Now is the time to combine all this into one statement:
(1,2(*.-.)/\' '=str) } str,:(uc,' '){~lc i. str
Now Is The Time

Selection sort in functional Scala

I'm making my way through "Programming in Scala" and wrote a quick implementation of the selection sort algorithm. However, since I'm still a bit green in functional programming, I'm having trouble translating to a more Scala-ish style. For the Scala programmers out there, how can I do this using Lists and vals rather than falling back into my imperative ways?
http://gist.github.com/225870
As starblue already said, you need a function that calculates the minimum of a list and returns the list with that element removed. Here is my tail recursive implementation of something similar (as I believe foldl is tail recursive in the standard library), and I tried to make it as functional as possible :). It returns a list that contains all the elements of the original list (but kindof reversed - see the explanation below) with the minimum as a head.
def minimum(xs: List[Int]): List[Int] =
(List(xs.head) /: xs.tail) {
(ys, x) =>
if(x < ys.head) (x :: ys)
else (ys.head :: x :: ys.tail)
}
This basically does a fold, starting with a list containing of the first element of xs If the first element of xs is smaller than the head of that list, we pre-append it to the list ys. Otherwise, we add it to the list ys as the second element. And so on recursively, we've folded our list into a new list containing the minimum element as a head and a list containing all the elements of xs (not necessarily in the same order) with the minimum removed, as a tail. Note that this function does not remove duplicates.
After creating this helper function, it's now easy to implement selection sort.
def selectionSort(xs: List[Int]): List[Int] =
if(xs.isEmpty) List()
else {
val ys = minimum(xs)
if(ys.tail.isEmpty)
ys
else
ys.head :: selectionSort(ys.tail)
}
Unfortunately this implementation is not tail recursive, so it will blow up the stack for large lists. Anyway, you shouldn't use a O(n^2) sort for large lists, but still... it would be nice if the implementation was tail recursive. I'll try to think of something... I think it will look like the implementation of a fold.
Tail Recursive!
To make it tail recursive, I use quite a common pattern in functional programming - an accumulator. It works a bit backward, as now I need a function called maximum, which basically does the same as minimum, but with the maximum element - its implementation is exact as minimum, but using > instead of <.
def selectionSort(xs: List[Int]) = {
def selectionSortHelper(xs: List[Int], accumulator: List[Int]): List[Int] =
if(xs.isEmpty) accumulator
else {
val ys = maximum(xs)
selectionSortHelper(ys.tail, ys.head :: accumulator)
}
selectionSortHelper(xs, Nil)
}
EDIT: Changed the answer to have the helper function as a subfunction of the selection sort function.
It basically accumulates the maxima to a list, which it eventually returns as the base case. You can also see that it is tail recursive by replacing accumulator by throw new NullPointerException - and then inspect the stack trace.
Here's a step by step sorting using an accumulator. The left hand side shows the list xs while the right hand side shows the accumulator. The maximum is indicated at each step by a star.
64* 25 12 22 11 ------- Nil
11 22 12 25* ------- 64
22* 12 11 ------- 25 64
11 12* ------- 22 25 64
11* ------- 12 22 25 64
Nil ------- 11 12 22 25 64
The following shows a step by step folding to calculate the maximum:
maximum(25 12 64 22 11)
25 :: Nil /: 12 64 22 11 -- 25 > 12, so it stays as head
25 :: 12 /: 64 22 11 -- same as above
64 :: 25 12 /: 22 11 -- 25 < 64, so the new head is 64
64 :: 22 25 12 /: 11 -- and stays so
64 :: 11 22 25 12 /: Nil -- until the end
64 11 22 25 12
You should have problems doing selection sort in functional style, as it is an in-place sort algorithm. In-place, by definition, isn't functional.
The main problem you'll face is that you can't swap elements. Here's why this is important. Suppose I have a list (a0 ... ax ... an), where ax is the minimum value. You need to get ax away, and then compose a list (a0 ... ax-1 ax+1 an). The problem is that you'll necessarily have to copy the elements a0 to ax-1, if you wish to remain purely functional. Other functional data structures, particularly trees, can have better performance than this, but the basic problem remains.
here is another implementation of selection sort (generic version).
def less[T <: Comparable[T]](i: T, j: T) = i.compareTo(j) < 0
def swap[T](xs: Array[T], i: Int, j: Int) { val tmp = xs(i); xs(i) = xs(j); xs(j) = tmp }
def selectiveSort[T <: Comparable[T]](xs: Array[T]) {
val n = xs.size
for (i <- 0 until n) {
val min = List.range(i + 1, n).foldLeft(i)((a, b) => if (less(xs(a), xs(b))) a else b)
swap(xs, i, min)
}
}
You need a helper function which does the selection. It should return the minimal element and the rest of the list with the element removed.
I think it's reasonably feasible to do a selection sort in a functional style, but as Daniel indicated, it has a good chance of performing horribly.
I just tried my hand at writing a functional bubble sort, as a slightly simpler and degenerate case of selection sort. Here's what I did, and this hints at what you could do:
define bubble(data)
if data is empty or just one element: return data;
otherwise, if the first element < the second,
return first element :: bubble(rest of data);
otherwise, return second element :: bubble(
first element :: (rest of data starting at 3rd element)).
Once that's finished recursing, the largest element is at the end of the list. Now,
define bubblesort [data]
apply bubble to data as often as there are elements in data.
When that's done, your data is indeed sorted. Yes, it's horrible, but my Clojure implementation of this pseudocode works.
Just concerning yourself with the first element or two and then leaving the rest of the work to a recursed activity is a lisp-y, functional-y way to do this kind of thing. But once you've gotten your mind accustomed to that kind of thinking, there are more sensible approaches to the problem.
I would recommend implementing a merge sort:
Break list into two sub-lists,
either by counting off half the elements into one sublist
and the rest in the other,
or by copying every other element from the original list
into either of the new lists.
Sort each of the two smaller lists (recursion here, obviously).
Assemble a new list by selecting the smaller from the front of either sub-list
until you've exhausted both sub-lists.
The recursion is in the middle of that, and I don't see a clever way of making the algorithm tail recursive. Still, I think it's O(log-2) in time and also doesn't place an exorbitant load on the stack.
Have fun, good luck!
Thanks for the hints above, they were very inspiring. Here's another functional approach to the selection sort algorithm. I tried to base it on the following idea: finding a max / min can be done quite easily by min(A)=if A=Nil ->Int.MaxValue else min(A.head, min(A.tail)). The first min is the min of a list, the second the min of two numbers. This is easy to understand, but unfortunately not tail recursive. Using the accumulator method the min definition can be transformed like this, now in correct Scala:
def min(x: Int,y: Int) = if (x<y) x else y
def min(xs: List[Int], accu: Int): Int = xs match {
case Nil => accu
case x :: ys => min(ys, min(accu, x))
}
(This is tail recursive)
Now a min version is needed which returns a list leaving out the min value. The following function returns a list whose head is the min value, the tail contains the rest of the original list:
def minl(xs: List[Int]): List[Int] = minl(xs, List(Int.MaxValue))
def minl(xs: List[Int],accu:List[Int]): List[Int] = xs match {
// accu always contains min as head
case Nil => accu take accu.length-1
case x :: ys => minl(ys,
if (x<accu.head) x::accu else accu.head :: x :: accu.tail )
}
Using this selection sort can be written tail recursively as:
def ssort(xs: List[Int], accu: List[Int]): List[Int] = minl(xs) match {
case Nil => accu
case min :: rest => ssort(rest, min::accu)
}
(reverses the order). In a test with 10000 list elements this algorithm is only about 4 times slower than the usual imperative algorithm.
Even though, when coding Scala, I'm used to prefer functional programming style (via combinators or recursion) over imperative style (via variables and iterations), THIS TIME, for this specific problem, old school imperative nested loops result in simpler and more performant code.
I don't think falling back to imperative style is a mistake for certain classes of problems, such as sorting algorithms which usually transform the input buffer in place rather than resulting to a new collection.
My solution is:
package bitspoke.algo
import scala.math.Ordered
import scala.collection.mutable.Buffer
abstract class Sorter[T <% Ordered[T]] {
// algorithm provided by subclasses
def sort(buffer : Buffer[T]) : Unit
// check if the buffer is sorted
def sorted(buffer : Buffer[T]) = buffer.isEmpty || buffer.view.zip(buffer.tail).forall { t => t._2 > t._1 }
// swap elements in buffer
def swap(buffer : Buffer[T], i:Int, j:Int) {
val temp = buffer(i)
buffer(i) = buffer(j)
buffer(j) = temp
}
}
class SelectionSorter[T <% Ordered[T]] extends Sorter[T] {
def sort(buffer : Buffer[T]) : Unit = {
for (i <- 0 until buffer.length) {
var min = i
for (j <- i until buffer.length) {
if (buffer(j) < buffer(min))
min = j
}
swap(buffer, i, min)
}
}
}
As you can see, to achieve parametric polymorphism, rather than using java.lang.Comparable, I preferred scala.math.Ordered and Scala View Bounds rather than Upper Bounds. That's certainly works thanks to Scala Implicit Conversions of primitive types to Rich Wrappers.
You can write a client program as follows:
import bitspoke.algo._
import scala.collection.mutable._
val sorter = new SelectionSorter[Int]
val buffer = ArrayBuffer(3, 0, 4, 2, 1)
sorter.sort(buffer)
assert(sorter.sorted(buffer))
A simple functional program for selection-sort in Scala
def selectionSort(list:List[Int]):List[Int] = {
#tailrec
def selectSortHelper(list:List[Int], accumList:List[Int] = List[Int]()): List[Int] = {
list match {
case Nil => accumList
case _ => {
val min = list.min
val requiredList = list.filter(_ != min)
selectSortHelper(requiredList, accumList ::: List.fill(list.length - requiredList.length)(min))
}
}
}
selectSortHelper(list)
}
You may want to try replacing your while loops with recursion, so, you have two places where you can create new recursive functions.
That would begin to get rid of some vars.
This was probably the toughest lesson for me, trying to move more toward FP.
I hesitate to show solutions here, as I think it would be better for you to try first.
But, if possible you should be using tail-recursion, to avoid problems with stack overflows (if you are sorting a very, very large list).
Here is my point of view on this problem: SelectionSort.scala
def selectionsort[A <% Ordered[A]](list: List[A]): List[A] = {
def sort(as: List[A], bs: List[A]): List[A] = as match {
case h :: t => select(h, t, Nil, bs)
case Nil => bs
}
def select(m: A, as: List[A], zs: List[A], bs: List[A]): List[A] =
as match {
case h :: t =>
if (m > h) select(m, t, h :: zs, bs)
else select(h, t, m :: zs, bs)
case Nil => sort(zs, m :: bs)
}
sort(list, Nil)
}
There are two inner functions: sort and select, which represents two loops in original algorithm. The first function sort iterates through the elements and call select for each of them. When the source list is empty it return bs list as result, which is initially Nil. The sort function tries to search for maximum (not minimum, since we build result list in reversive order) element in source list. It suppose that maximum is head by the default and then just replace it with a proper value.
This is 100% functional implementation of Selection Sort in Scala.
Here is my solution
def sort(list: List[Int]): List[Int] = {
#tailrec
def pivotCompare(p: Int, l: List[Int], accList: List[Int] = List.empty): List[Int] = {
l match {
case Nil => p +: accList
case x :: xs if p < x => pivotCompare(p, xs, accList :+ x)
case x :: xs => pivotCompare(x, xs, accList :+ p)
}
}
#tailrec
def loop(list: List[Int], accList: List[Int] = List.empty): List[Int] = {
list match {
case x :: xs =>
pivotCompare(x, xs) match {
case Nil => accList
case h :: tail => loop(tail, accList :+ h)
}
case Nil => accList
}
}
loop(list)
}