Matching with custom combinations/operators - scala

I know that you can pattern match on lists like this:
val list = List(1,2,3)
list match {
case head::tail => head
case _ => //whatever
}
so I started to wonder how this works. If I understand correctly, :: is just an operator, so what's to stop me from doing something like
4 match {
case x + 2 => x //I would expect x=2 here
}
If there is a way to create this kind of functionality, how is it done; if not, then why?

Pattern matching takes the input and decomposes it with an unapply function. So in your case, unapply(4) would have to return the two numbers that sum to 4. However, there are many pairs that sum to 4, so the function wouldn't know what to do.
What you need is for the 2 to be accessible to the unapply function somehow. A special case class that stores the 2 would work for this:
case class Sum(addto: Int) {
def unapply(i: Int) = Some(i - addto)
}
val Sum2 = Sum(2)
val Sum2(x) = 5 // x = 3
(It would be nice to be able to do something like val Sum(2)(y) = 5 for compactness, but Scala doesn't allow parameterized extractors; see here.)
[EDIT: This is a little silly, but you could actually do the following too:
val `2 +` = Sum(2)
val `2 +`(y) = 5 // y = 3
]
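To connect this back to the question: once the offset is stored in an extractor instance, that instance can be used as a pattern inside a match. The following is a minimal sketch that reuses the Sum class from above. The names SumExtractorDemo, PlusTwo and minusTwo are made up for illustration.

```scala
object SumExtractorDemo {
  // Same idea as the Sum class above: the instance remembers the offset.
  case class Sum(addto: Int) {
    def unapply(i: Int): Option[Int] = Some(i - addto)
  }

  // Hypothetical name: an extractor instance that "undoes" adding 2.
  val PlusTwo = Sum(2)

  // Plays the role of `4 match { case x + 2 => x }` from the question.
  def minusTwo(n: Int): Int = n match {
    case PlusTwo(x) => x
  }
}
```

With this, SumExtractorDemo.minusTwo(4) evaluates to 2, which is the x the question was hoping for.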
EDIT: The reason the head::tail thing works is that there is exactly one way to split the head from the tail of a list.
There's nothing inherently special about :: versus +: you could use + if you had a predetermined idea of how you wanted it to break a number. For example, if you wanted + to mean "split in half", then you could do something like:
object + {
def unapply(i: Int) = Some((i - i/2, i/2))
}
and use it like:
scala> val a + b = 4
a: Int = 2
b: Int = 2
scala> val c + d = 5
c: Int = 3
d: Int = 2
EDIT: Finally, this explains that, when pattern matching, A op B means the same thing as op(A,B), which makes the syntax look nice.

Matching with case head :: tail uses an infix operation pattern of the form p1 op p2 which gets translated to op(p1, p2) before doing the actual matching. (See API for ::)
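The translation from p1 op p2 to op(p1, p2) can be seen directly with ::, since both spellings are accepted in a pattern. A small sketch; the object and method names are made up for illustration:

```scala
object InfixPatternDemo {
  // Infix operation pattern: head :: tail
  def headInfix(xs: List[Int]): Option[Int] = xs match {
    case head :: _ => Some(head)
    case Nil       => None
  }

  // The same pattern written as a constructor pattern on the :: class.
  def headPrefix(xs: List[Int]): Option[Int] = xs match {
    case ::(head, _) => Some(head)
    case Nil         => None
  }
}
```

Both methods behave identically, which is why an object named + with a suitable unapply can be used infix in the same way.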
The problem with + is the following:
It is easy to add an extractor object that does the matching:
object + {
  def unapply(value: Int): Option[(Int, Int)] = // ...
}
However, unapply can return only one result per input value. For example:
object + {
  def unapply(value: Int): Option[(Int, Int)] = value match {
    case 0 => Some((0, 0))
    case 4 => Some((3, 1))
    case _ => None
  }
}
Now this works:
0 match { case x + 0 => x } // returns 0
also this
4 match { case x + 1 => x } // returns 3
But this won't match, and you cannot change that, because unapply has already committed to (3, 1) for the input 4:
4 match { case x + 2 => x } // does not match
No problem for ::, though, because it is always defined what is head and what is tail of a list.

There are two ::s (pronounced "cons") in Scala. One is the operator on Lists and the other is a class, which represents a non-empty list characterized by a head and a tail. So head :: tail is a constructor pattern, which has nothing to do with the operator.

Related

Calculating the mean of elements in a list in Scala

I'm trying to write a method that calculates the mean of the elements in a given List in Scala. Here's my code:
def meanElements(list: List[Float]): Float = {
list match {
case x :: tail => (x + meanElements(tail))/(list.length)
case Nil => 0
}
}
When I call meanElements(List(10,12,14)), the result I get is not 12. Can someone help?
You can simply do it using inbuilt functions:
scala> def mean(list:List[Int]):Int =
| if(list.isEmpty) 0 else list.sum/list.size
mean: (list: List[Int])Int
scala> mean(List(10,12,14))
res1: Int = 12
The formula is not correct. The mean of the tail has to be weighted by the length of the tail, so it should be:
case x :: tail => (x + meanElements(tail) * tail.length) / list.length
But this implementation is performing a lot of divisions and multiplications.
It would be better to split the computation of the mean to two steps,
calculating the sum first,
and then dividing by list.length.
That is, something more like this:
def meanElements(list: List[Float]): Float = sum(list) / list.length
Where sum is a helper function you have to implement.
If you don't want to expose its implementation,
then you can define it in the body of meanElements.
(Or as @ph88 pointed out,
it could be as simple as list.reduce(_ + _).)
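A minimal sketch of the two-step approach, with the sum helper defined in the body of meanElements. The helper's accumulator parameter and the choice of 0 for an empty list (taken from the question's Nil case) are assumptions of this sketch:

```scala
import scala.annotation.tailrec

object MeanDemo {
  def meanElements(list: List[Float]): Float = {
    // Hypothetical helper: tail-recursive sum with an accumulator.
    @tailrec
    def sum(rest: List[Float], acc: Float): Float = rest match {
      case x :: tail => sum(tail, acc + x)
      case Nil       => acc
    }
    // Mirrors the question's choice of 0 for the empty list.
    if (list.isEmpty) 0f else sum(list, 0f) / list.length
  }
}
```

The division now happens exactly once, after the whole list has been summed.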

Scala code analyzer targets case variable names that are identical to the outer matched variables - "suspicious shadowing"

In the following code snippet in which the outer match vars (x,y) are case matched by (xx,yy):
scala> val (x,y) = (1,2)
x: Int = 1
y: Int = 2
scala> (x,y) match {
| case (xx:Int, yy:Int) => println(s"x=$x xx=$xx")
| }
x=1 xx=1
We could have also written that code as follows:
scala> (x,y) match {
| case (x:Int, y:Int) => println(s"x=$x y=$y")
| }
x=1 y=2
In this latter case the Scala Code Analyzers will inform us:
Suspicious shadowing by a Variable Pattern
OK. But is there any situation where we could end up actually misusing the inner variable (x or y) in place of the original outer match variables?
It seems this is purely stylistic, with no actual possibility for bugs? If bugs are possible, I would be interested to learn what they could be.
This could be confusing:
val x = Some(1)
val y = Some(2)
(x, y) match {
case (Some(x), Some(y)) => println(s"x=$x y=$y")
}
x and y have different types depending on whether you are inside or outside of the match. If this code wasn't using simply Option, and was several lines longer, it could be rather difficult to reason about.
Could any bugs arise from this? None that I can think of that aren't horribly contrived. You could for example, mistake one for another.
val list = List(1,2,3)
list match {
case x :: y :: list => list // List(3) and not List(1,2,3)
case x :: list => list // List with 1 element, should the outer list have size 2
case _ => list // Returns the outer list when empty
}
Not to mention what a horrible mess that is. Within the match, list sometimes refers to an inner symbol, and sometimes the outer list.
It's just code that's unnecessarily complicated to read and understand; there are no special bugs that could happen.

Refactoring a small Scala function

I have this function to compute the distance between two n-dimensional points using Pythagoras' theorem.
def computeDistance(neighbour: Point) = math.sqrt(coordinates.zip(neighbour.coordinates).map {
case (c1: Int, c2: Int) => math.pow(c1 - c2, 2)
}.sum)
The Point class (simplified) looks like:
class Point(val coordinates: List[Int])
I'm struggling to refactor the method so it's a little easier to read, can anybody help please?
Here's another way that makes the following three assumptions:
The length of the list is the number of dimensions for the point
Each List is correctly ordered, i.e. List(x, y) or List(x, y, z). We do not know how to handle List(x, z, y)
All lists are of equal length
import math.{pow, sqrt}
def computeDistance(other: Point): Double = sqrt(
coordinates.zip(other.coordinates)
.flatMap(i => List(pow(i._2 - i._1, 2)))
.fold(0.0)(_ + _)
)
The obvious disadvantage here is that we don't have any safety around list length. The quick fix for this is to simply have the function return an Option[Double] like so:
def computeDistance(other: Point): Option[Double] =
  if (other.coordinates.length != coordinates.length) None
  else Some(sqrt(coordinates.zip(other.coordinates)
    .flatMap(i => List(pow(i._2 - i._1, 2)))
    .fold(0.0)(_ + _)
  ))
I'd be curious if there is a type safe way to ensure equal list length.
EDIT
It was politely pointed out to me that flatMap(x => List(foo(x))) is equivalent to map(foo), which I forgot to refactor when I was originally playing with this. Here is a slightly cleaner version with map instead of flatMap:
def computeDistance(other: Point): Double = sqrt(
coordinates.zip(other.coordinates)
.map(i => pow(i._2 - i._1, 2))
.fold(0.0)(_ + _)
)
Most of your problem is that you're trying to do math with really long variable names. It's almost always painful. There's a reason why mathematicians use single letters. And assign temporary variables.
Try this:
class Point(val coordinates: List[Int]) { def c = coordinates }
import math._
def d(p: Point) = {
val delta = for ((a,b) <- (c zip p.c)) yield pow(a-b, 2)
sqrt(delta.sum)
}
Consider type aliases and case classes, like this,
type Coord = List[Int]
case class Point(val c: Coord) {
def distTo(p: Point) = {
val z = (c zip p.c).par
val pw = z.aggregate(0.0) ( (a,v) => a + math.pow( v._1-v._2, 2 ), _ + _ )
math.sqrt(pw)
}
}
so that for any two points, for instance,
val p = Point( (1 to 5).toList )
val q = Point( (2 to 6).toList )
we have that
p distTo q
res: Double = 2.23606797749979
Note method distTo uses aggregate on a parallelised collection of tuples, and combines the partial results by the last argument (summation). For high dimensional points this may prove more efficient than the sequential counterpart.
For simplicity of use, consider also implicit classes, as suggested in a comment above,
implicit class RichPoint(val c: Coord) extends AnyVal {
def distTo(d: Coord) = Point(c) distTo Point(d)
}
Hence
List(1,2,3,4,5) distTo List(2,3,4,5,6)
res: Double = 2.23606797749979

Implementing a recursive function using pattern matching

I want to rewrite a recursive function using pattern matching instead of if-else statements, but I am getting (correct) warning messages that some parts of the code are unreachable. In fact, I am getting wrong logic evaluation.
The function I am trying to re-write is:
def pascal(c: Int, r: Int): Int =
if (c == 0)
1
else if (c == r)
1
else
pascal(c - 1, r - 1) + pascal(c, r - 1)
This function works as expected. I re-wrote it as follows using pattern matching but now the function is not working as expected:
def pascal2 (c : Int, r : Int) : Int = c match {
case 0 => 1
case r => 1
case _ => pascal2(c - 1, r - 1) + pascal2(c, r - 1)
}
Where am I going wrong?
Main:
println("Pascal's Triangle")
for (row <- 0 to 10) {
for (col <- 0 to row)
print(pascal(col, row) + " ")
println()
}
The following statement "shadows" the variable r:
case r =>
That is to say, the "r" in that case statement is not, in fact, the "r" that you have defined above. It is its own "r", a fresh variable that is bound to the value of "c", because a lower-case name in a pattern tells Scala to match any value and bind it to that name.
Hence, what you really want is:
def pascal2(c: Int, r: Int): Int = c match{
case 0 => 1
case _ if c == r => 1
case _ => pascal2(c-1, r-1) + pascal2(c, r-1)
}
This is not, however, tail recursive.
I fully agree with @wheaties and advise you to follow his directions. For the sake of completeness I want to point out a few alternatives.
Alternative 1
You could write your own unapply:
def pascal(c: Int, r: Int): Int = {
object MatchesBoundary {
def unapply(i: Int) = if (i==0 || i==r) Some(i) else None
}
c match {
case MatchesBoundary(_) => 1
case _ => pascal(c-1, r-1) + pascal(c, r-1)
}
}
I would not claim that it improves readability in this case a lot. Just want to show the possibility to combine semantically similar cases (with identical/similar case bodies), which may be useful in more complex examples.
Alternative 2
There is also a possible solution, which exploits the fact that Scala's syntax for pattern matching only treats lower case variables as variables for a match. The following example shows what I mean by that:
def pascal(c: Int, r: Int): Int = {
val BoundaryL = 0
val BoundaryR = r
c match {
case BoundaryL => 1
case BoundaryR => 1
case _ => pascal(c-1, r-1) + pascal(c, r-1)
}
}
Since BoundaryL and BoundaryR start with upper case letters they are not treated as variables, but are used directly as matching object. Therefore the above works (while changing them to boundaryL and boundaryR would not, which btw also gives compiler warnings). This means that you could get your example to work simply by replacing r by R. Since this is a rather ugly solution I mention it only for educational purposes.
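A related option that is not mentioned above is to wrap the name in backticks. Scala then treats it as a reference to the existing value instead of a new pattern variable. A minimal sketch; the names PascalDemo and pascal3 are made up for illustration:

```scala
object PascalDemo {
  def pascal3(c: Int, r: Int): Int = c match {
    case 0   => 1
    // Backticks: compare c against the parameter r instead of binding a new r.
    case `r` => 1
    case _   => pascal3(c - 1, r - 1) + pascal3(c, r - 1)
  }
}
```

This keeps the original parameter names and avoids both the guard and the upper-case renaming.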

Selection sort in functional Scala

I'm making my way through "Programming in Scala" and wrote a quick implementation of the selection sort algorithm. However, since I'm still a bit green in functional programming, I'm having trouble translating to a more Scala-ish style. For the Scala programmers out there, how can I do this using Lists and vals rather than falling back into my imperative ways?
http://gist.github.com/225870
As starblue already said, you need a function that calculates the minimum of a list and returns the list with that element removed. Here is my tail recursive implementation of something similar (as I believe foldl is tail recursive in the standard library), and I tried to make it as functional as possible :). It returns a list that contains all the elements of the original list (but kind of reversed, see the explanation below) with the minimum as a head.
def minimum(xs: List[Int]): List[Int] =
(List(xs.head) /: xs.tail) {
(ys, x) =>
if(x < ys.head) (x :: ys)
else (ys.head :: x :: ys.tail)
}
This basically does a fold, starting with a list containing the first element of xs. If the next element x of xs is smaller than the head of that list, we prepend it to the list ys. Otherwise, we insert it into ys as the second element. And so on: once the fold is done, we have a new list with the minimum element as its head and all the other elements of xs (not necessarily in the same order) as its tail. Note that this function does not remove duplicates.
After creating this helper function, it's now easy to implement selection sort.
def selectionSort(xs: List[Int]): List[Int] =
if(xs.isEmpty) List()
else {
val ys = minimum(xs)
if(ys.tail.isEmpty)
ys
else
ys.head :: selectionSort(ys.tail)
}
Unfortunately this implementation is not tail recursive, so it will blow up the stack for large lists. Anyway, you shouldn't use a O(n^2) sort for large lists, but still... it would be nice if the implementation was tail recursive. I'll try to think of something... I think it will look like the implementation of a fold.
Tail Recursive!
To make it tail recursive, I use quite a common pattern in functional programming: an accumulator. It works a bit backward, as now I need a function called maximum, which does basically the same as minimum but with the maximum element. Its implementation is exactly the same as minimum, but using > instead of <.
def selectionSort(xs: List[Int]) = {
def selectionSortHelper(xs: List[Int], accumulator: List[Int]): List[Int] =
if(xs.isEmpty) accumulator
else {
val ys = maximum(xs)
selectionSortHelper(ys.tail, ys.head :: accumulator)
}
selectionSortHelper(xs, Nil)
}
EDIT: Changed the answer to have the helper function as a subfunction of the selection sort function.
It basically accumulates the maxima to a list, which it eventually returns as the base case. You can also see that it is tail recursive by replacing accumulator by throw new NullPointerException - and then inspect the stack trace.
Here's a step by step sorting using an accumulator. The left hand side shows the list xs while the right hand side shows the accumulator. The maximum is indicated at each step by a star.
64* 25 12 22 11 ------- Nil
11 22 12 25* ------- 64
22* 12 11 ------- 25 64
11 12* ------- 22 25 64
11* ------- 12 22 25 64
Nil ------- 11 12 22 25 64
The following shows a step by step folding to calculate the maximum:
maximum(25 12 64 22 11)
25 :: Nil /: 12 64 22 11 -- 25 > 12, so it stays as head
25 :: 12 /: 64 22 11 -- same as above
64 :: 25 12 /: 22 11 -- 25 < 64, so the new head is 64
64 :: 22 25 12 /: 11 -- and stays so
64 :: 11 22 25 12 /: Nil -- until the end
64 11 22 25 12
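The two traces above can be reproduced with the following sketch. The maximum helper is only described in words in this answer, so its body here is an assumption: it mirrors minimum with > in place of <, and uses foldLeft, which is the method form of /:. The object and helper names are made up for illustration.

```scala
import scala.annotation.tailrec

object AccumulatorSelectionSort {
  // Assumes xs is non-empty, like minimum above.
  // Returns the maximum as head, followed by the remaining elements.
  def maximum(xs: List[Int]): List[Int] =
    xs.tail.foldLeft(List(xs.head)) { (ys, x) =>
      if (x > ys.head) x :: ys
      else ys.head :: x :: ys.tail
    }

  def selectionSort(xs: List[Int]): List[Int] = {
    // Moves the current maximum onto the accumulator on every step.
    @tailrec
    def helper(rest: List[Int], accumulator: List[Int]): List[Int] =
      if (rest.isEmpty) accumulator
      else {
        val ys = maximum(rest)
        helper(ys.tail, ys.head :: accumulator)
      }
    helper(xs, Nil)
  }
}
```

The @tailrec annotation makes the compiler reject the helper if it ever stops being tail recursive, which is a more direct check than inspecting a stack trace.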
You should have problems doing selection sort in functional style, as it is an in-place sort algorithm. In-place, by definition, isn't functional.
The main problem you'll face is that you can't swap elements. Here's why this is important. Suppose I have a list (a0 ... ax ... an), where ax is the minimum value. You need to get ax away, and then compose a list (a0 ... ax-1 ax+1 an). The problem is that you'll necessarily have to copy the elements a0 to ax-1, if you wish to remain purely functional. Other functional data structures, particularly trees, can have better performance than this, but the basic problem remains.
Here is another implementation of selection sort (generic version).
def less[T <: Comparable[T]](i: T, j: T) = i.compareTo(j) < 0
def swap[T](xs: Array[T], i: Int, j: Int): Unit = { val tmp = xs(i); xs(i) = xs(j); xs(j) = tmp }
def selectiveSort[T <: Comparable[T]](xs: Array[T]): Unit = {
val n = xs.size
for (i <- 0 until n) {
val min = List.range(i + 1, n).foldLeft(i)((a, b) => if (less(xs(a), xs(b))) a else b)
swap(xs, i, min)
}
}
You need a helper function which does the selection. It should return the minimal element and the rest of the list with the element removed.
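One possible shape for such a helper, as a sketch rather than a definitive implementation. The names SelectMinDemo and selectMin are made up, and returning an Option for the empty list is an assumption of this sketch:

```scala
object SelectMinDemo {
  // Hypothetical helper: the minimum plus the list without one occurrence of it.
  def selectMin(xs: List[Int]): Option[(Int, List[Int])] =
    if (xs.isEmpty) None
    else {
      val m = xs.min
      // Split at the first occurrence of the minimum and drop that occurrence.
      val (before, fromMin) = xs.span(_ != m)
      Some((m, before ::: fromMin.tail))
    }

  // Not tail recursive: each call waits for the sorted rest.
  def selectionSort(xs: List[Int]): List[Int] = selectMin(xs) match {
    case Some((m, rest)) => m :: selectionSort(rest)
    case None            => Nil
  }
}
```

Only one occurrence of the minimum is removed per step, so duplicates survive the sort.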
I think it's reasonably feasible to do a selection sort in a functional style, but as Daniel indicated, it has a good chance of performing horribly.
I just tried my hand at writing a functional bubble sort, as a slightly simpler and degenerate case of selection sort. Here's what I did, and this hints at what you could do:
define bubble(data)
if data is empty or just one element: return data;
otherwise, if the first element < the second,
return first element :: bubble(rest of data);
otherwise, return second element :: bubble(
first element :: (rest of data starting at 3rd element)).
Once that's finished recursing, the largest element is at the end of the list. Now,
define bubblesort [data]
apply bubble to data as often as there are elements in data.
When that's done, your data is indeed sorted. Yes, it's horrible, but my Clojure implementation of this pseudocode works.
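For readers following along in Scala, the pseudocode above might translate roughly as follows. This is a sketch of the pseudocode, not the Clojure implementation mentioned, and the object name is made up:

```scala
object BubbleDemo {
  // One pass: the largest element ends up at the end of the list.
  def bubble(data: List[Int]): List[Int] = data match {
    case first :: second :: rest =>
      if (first < second) first :: bubble(second :: rest)
      else second :: bubble(first :: rest)
    case _ => data // empty or a single element
  }

  // Apply bubble once per element of data.
  def bubbleSort(data: List[Int]): List[Int] =
    data.foldLeft(data)((acc, _) => bubble(acc))
}
```

The foldLeft ignores the elements themselves and only uses them to count the passes.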
Just concerning yourself with the first element or two and then leaving the rest of the work to a recursed activity is a lisp-y, functional-y way to do this kind of thing. But once you've gotten your mind accustomed to that kind of thinking, there are more sensible approaches to the problem.
I would recommend implementing a merge sort:
Break list into two sub-lists,
either by counting off half the elements into one sublist
and the rest in the other,
or by copying every other element from the original list
into either of the new lists.
Sort each of the two smaller lists (recursion here, obviously).
Assemble a new list by selecting the smaller from the front of either sub-list
until you've exhausted both sub-lists.
The recursion is in the middle of that, and I don't see a clever way of making the algorithm tail recursive. Still, I think it's O(n log n) in time and also doesn't place an exorbitant load on the stack.
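The merge sort steps described above can be sketched in Scala as follows. This version splits by counting off half the elements; the accumulator in merge is an addition of this sketch, used so that the merge step itself is tail recursive, and all names are made up for illustration:

```scala
import scala.annotation.tailrec

object MergeSortDemo {
  // Repeatedly take the smaller front element of the two sorted lists.
  @tailrec
  def merge(left: List[Int], right: List[Int], acc: List[Int] = Nil): List[Int] =
    (left, right) match {
      case (Nil, _) => acc.reverse ::: right
      case (_, Nil) => acc.reverse ::: left
      case (l :: ls, r :: rs) =>
        if (l <= r) merge(ls, right, l :: acc)
        else merge(left, rs, r :: acc)
    }

  // The recursion "in the middle": sort both halves, then merge them.
  def mergeSort(xs: List[Int]): List[Int] =
    if (xs.length <= 1) xs
    else {
      val (a, b) = xs.splitAt(xs.length / 2)
      merge(mergeSort(a), mergeSort(b))
    }
}
```

Only mergeSort itself recurses non-tail-recursively, and since each call halves the list, its depth grows with the logarithm of the list length.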
Have fun, good luck!
Thanks for the hints above, they were very inspiring. Here's another functional approach to the selection sort algorithm. I tried to base it on the following idea: finding a max / min can be done quite easily by min(A)=if A=Nil ->Int.MaxValue else min(A.head, min(A.tail)). The first min is the min of a list, the second the min of two numbers. This is easy to understand, but unfortunately not tail recursive. Using the accumulator method the min definition can be transformed like this, now in correct Scala:
def min(x: Int, y: Int): Int = if (x<y) x else y
def min(xs: List[Int], accu: Int): Int = xs match {
case Nil => accu
case x :: ys => min(ys, min(accu, x))
}
(This is tail recursive)
Now a min version is needed which returns a list leaving out the min value. The following function returns a list whose head is the min value, the tail contains the rest of the original list:
def minl(xs: List[Int]): List[Int] = minl(xs, List(Int.MaxValue))
def minl(xs: List[Int],accu:List[Int]): List[Int] = xs match {
// accu always contains min as head
case Nil => accu take accu.length-1
case x :: ys => minl(ys,
if (x<accu.head) x::accu else accu.head :: x :: accu.tail )
}
Using this selection sort can be written tail recursively as:
def ssort(xs: List[Int], accu: List[Int]): List[Int] = minl(xs) match {
case Nil => accu
case min :: rest => ssort(rest, min::accu)
}
(reverses the order). In a test with 10000 list elements this algorithm is only about 4 times slower than the usual imperative algorithm.
Even though, when coding Scala, I usually prefer functional programming style (via combinators or recursion) over imperative style (via variables and iterations), THIS TIME, for this specific problem, old school imperative nested loops result in simpler and more performant code.
I don't think falling back to imperative style is a mistake for certain classes of problems, such as sorting algorithms, which usually transform the input buffer in place rather than producing a new collection.
My solution is:
package bitspoke.algo
import scala.math.Ordered
import scala.collection.mutable.Buffer
abstract class Sorter[T <% Ordered[T]] {
// algorithm provided by subclasses
def sort(buffer : Buffer[T]) : Unit
// check if the buffer is sorted
// (>= so that equal neighbours still count as sorted)
def sorted(buffer : Buffer[T]) = buffer.isEmpty || buffer.view.zip(buffer.tail).forall { t => t._2 >= t._1 }
// swap elements in buffer
def swap(buffer : Buffer[T], i:Int, j:Int): Unit = {
val temp = buffer(i)
buffer(i) = buffer(j)
buffer(j) = temp
}
}
class SelectionSorter[T <% Ordered[T]] extends Sorter[T] {
def sort(buffer : Buffer[T]) : Unit = {
for (i <- 0 until buffer.length) {
var min = i
for (j <- i until buffer.length) {
if (buffer(j) < buffer(min))
min = j
}
swap(buffer, i, min)
}
}
}
As you can see, to achieve parametric polymorphism, I preferred scala.math.Ordered and Scala view bounds over java.lang.Comparable and upper bounds. This works thanks to Scala's implicit conversions of primitive types to rich wrappers.
You can write a client program as follows:
import bitspoke.algo._
import scala.collection.mutable._
val sorter = new SelectionSorter[Int]
val buffer = ArrayBuffer(3, 0, 4, 2, 1)
sorter.sort(buffer)
assert(sorter.sorted(buffer))
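Since view bounds (<%) are no longer supported in Scala 3, the same nested-loop algorithm can also be written against an implicit Ordering. This is a sketch under that assumption, not a drop-in replacement for the Sorter hierarchy above; the names OrderingSelectionSort and sortInPlace are made up:

```scala
import scala.collection.mutable

object OrderingSelectionSort {
  // In-place selection sort; the element order comes from an implicit Ordering.
  def sortInPlace[T](buffer: mutable.Buffer[T])(implicit ord: Ordering[T]): Unit = {
    for (i <- 0 until buffer.length) {
      // Find the index of the smallest element in buffer(i ..).
      var min = i
      for (j <- i + 1 until buffer.length)
        if (ord.lt(buffer(j), buffer(min))) min = j
      // Swap it into position i.
      val tmp = buffer(i)
      buffer(i) = buffer(min)
      buffer(min) = tmp
    }
  }
}
```

Usage mirrors the client program above: create an ArrayBuffer, call OrderingSelectionSort.sortInPlace(buffer), and the buffer is sorted in place.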
A simple functional program for selection-sort in Scala
import scala.annotation.tailrec
def selectionSort(list:List[Int]):List[Int] = {
@tailrec
def selectSortHelper(list:List[Int], accumList:List[Int] = List[Int]()): List[Int] = {
list match {
case Nil => accumList
case _ => {
val min = list.min
val requiredList = list.filter(_ != min)
selectSortHelper(requiredList, accumList ::: List.fill(list.length - requiredList.length)(min))
}
}
}
selectSortHelper(list)
}
You may want to try replacing your while loops with recursion, so, you have two places where you can create new recursive functions.
That would begin to get rid of some vars.
This was probably the toughest lesson for me, trying to move more toward FP.
I hesitate to show solutions here, as I think it would be better for you to try first.
But, if possible you should be using tail-recursion, to avoid problems with stack overflows (if you are sorting a very, very large list).
Here is my point of view on this problem: SelectionSort.scala
def selectionsort[A <% Ordered[A]](list: List[A]): List[A] = {
def sort(as: List[A], bs: List[A]): List[A] = as match {
case h :: t => select(h, t, Nil, bs)
case Nil => bs
}
def select(m: A, as: List[A], zs: List[A], bs: List[A]): List[A] =
as match {
case h :: t =>
if (m > h) select(m, t, h :: zs, bs)
else select(h, t, m :: zs, bs)
case Nil => sort(zs, m :: bs)
}
sort(list, Nil)
}
There are two inner functions, sort and select, which represent the two loops of the original algorithm. The first function, sort, iterates through the elements and calls select for each of them. When the source list is empty, it returns the bs list as the result, which is initially Nil. The select function searches for the maximum (not the minimum, since we build the result list in reverse order) element in the source list. It supposes that the maximum is the head by default and then replaces it with a larger value whenever it finds one.
This is a 100% functional implementation of selection sort in Scala.
Here is my solution
import scala.annotation.tailrec
def sort(list: List[Int]): List[Int] = {
@tailrec
def pivotCompare(p: Int, l: List[Int], accList: List[Int] = List.empty): List[Int] = {
l match {
case Nil => p +: accList
case x :: xs if p < x => pivotCompare(p, xs, accList :+ x)
case x :: xs => pivotCompare(x, xs, accList :+ p)
}
}
@tailrec
def loop(list: List[Int], accList: List[Int] = List.empty): List[Int] = {
list match {
case x :: xs =>
pivotCompare(x, xs) match {
case Nil => accList
case h :: tail => loop(tail, accList :+ h)
}
case Nil => accList
}
}
loop(list)
}