Functional programming, Scala map and fold left [closed] - scala

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
What are some good tutorials on fold left?
Original question, restored from deletion to provide context for other answers:
I am trying to implement a method for finding the boudning box of rectangle, circle, location and the group which all extends Shape. Group is basically an array of Shapes
abstract class Shape
case class Rectangle(width: Int, height: Int) extends Shape
case class Location(x: Int, y: Int, shape: Shape) extends Shape
case class Circle(radius: Int) extends Shape
case class Group(shape: Shape*) extends Shape
I got the bounding box computed for all three except the Group one. So now for the bounding box method I know I should be using map and fold left for Group, but I just can't find out the exact syntax of creating it.
object BoundingBox {
def boundingBox(s: Shape): Location = s match {
case Circle(c)=>
new Location(-c,-c,s)
case Rectangle(_, _) =>
new Location(0, 0, s)
case Location(x, y, shape) => {
val b = boundingBox(shape)
Location(x + b.x, y + b.y, b.shape)
}
case Group(shapes # _*) => ( /: shapes) { } // i dont know how to proceed here.
}
}
Group bounding box is basically the smallest bounding box with all the shapes enclosed.

Now that you've edited to ask an almost completely different question, I'll give a different answer. Rather than point to a tutorial on maps and folds, I'll just give one.
In Scala, you first need to know how to create an anonymous function. It goes like so, from most general to more specific:
(var1: Type1, var2: Type2, ..., varN: TypeN) => /* output */
(var1, var2, ..., varN) => /* output, if types can be inferred */
var1 => /* output, if type can be inferred and N=1 */
Here are some examples:
(x: Double, y: Double, z: Double) => Math.sqrt(x*x + y*y + z*z)
val f:(Double,Double)=>Double = (x,y) => x*y + Math.exp(-x*y)
val neg:Double=>Double = x => -x
Now, the map method of lists and such will apply a function (anonymous or otherwise) to every element of the map. That is, if you have
List(a1,a2,...,aN)
f:A => B
then
List(a1,a2,...,aN) map (f)
produces
List( f(a1) , f(a2) , ..., f(aN) )
There are all sorts of reasons why this might be useful. Maybe you have a bunch of strings and you want to know how long each is, or you want to make them all upper case, or you want them backwards. If you have a function that does what you want to one element, map will do it to all elements:
scala> List("How","long","are","we?") map (s => s.length)
res0: List[Int] = List(3, 4, 3, 3)
scala> List("How","capitalized","are","we?") map (s => s.toUpperCase)
res1: List[java.lang.String] = List(HOW, CAPITALIZED, ARE, WE?)
scala> List("How","backwards","are","we?") map (s => s.reverse)
res2: List[scala.runtime.RichString] = List(woH, sdrawkcab, era, ?ew)
So, that's map in general, and in Scala.
But what if we want to collect our results? That's where fold comes in (foldLeft being the version that starts on the left and works right).
Suppose we have a function f:(B,A) => B, that is, it takes a B and an A, and combines them to produce a B. Well, we could start with a B, and then feed our list of A's into it one at a time, and at the end of it all, we'd have some B. That's exactly what fold does. foldLeft does it starting from the left end of the list; foldRight starts from the right. That is,
List(a1,a2,...,aN) foldLeft(b0)(f)
produces
f( f( ... f( f(b0,a1) , a2 ) ... ), aN )
where b0 is, of course, your initial value.
So, maybe we have a function that takes an int and a string, and returns the int or the length of the string, whichever is greater--if we folded our list using that, it would tell us the longest string (assuming that we start with 0). Or we could add the length to the int, accumulating values as we go.
Let's give it a try.
scala> List("How","long","is","longest?").foldLeft(0)((i,s) => i max s.length)
res3: Int = 8
scala> List("How","long","is","everyone?").foldLeft(0)((i,s) => i + s.length)
res4: Int = 18
Okay, fine, but what if we want to know who is the longest? One way (perhaps not the best, but it illustrates a useful pattern well) is to carry along both the length (an integer) and the leading contender (a string). Let's give that a go:
scala> List("Who","is","longest?").foldLeft((0,""))((i,s) =>
| if (i._1 < s.length) (s.length,s)
| else i
| )
res5: (Int, java.lang.String) = (8,longest?)
Here, i is now a tuple of type (Int,String), and i._1 is the first part of that tuple (an Int).
But in some cases like this, using a fold isn't really want we want. If we want the longer of two strings, the most natural function would be one like max:(String,String)=>String. How do we apply that one?
Well, in this case, there is a default "shortest" case, so we could fold the string-max function starting with "". But a better way is to use reduce. As with fold, there are two versions, one that works from the left, the other which works from the right. It takes no initial value, and requires a function f:(A,A)=>A. That is, it takes two things and returns one of the same type. Here's an example with a string-max function:
scala> List("Who","is","longest?").reduceLeft((s1,s2) =>
| if (s2.length > s1.length) s2
| else s1
| )
res6: java.lang.String = longest?
Now, there are just two more tricks. First, the following two mean the same thing:
list.foldLeft(b0)(f)
(b0 /: list)(f)
Notice how the second is shorter, and it sort of gives you the impression that you're taking b0 and doing something to the list with it (which you are). (:\ is the same as foldRight, but you use it like so: (list :\ b0) (f)
Second, if you only refer to a variable once, you can use _ instead of the variable name and omit the x => part of the anonymous function declaration. Here are two examples:
scala> List("How","long","are","we?") map (_.length)
res7: List[Int] = List(3, 4, 3, 3)
scala> (0 /: List("How","long","are","we","all?"))(_ + _.length)
res8: Int = 16
At this point, you should be able to create functions and map, fold, and reduce them using Scala. Thus, if you know how your algorithm should work, it should be reasonably straightforward to implement it.

The basic algorithm would go like this:
shapes.tail.foldLeft(boundingBox(shapes.head)) {
case (box, shape) if box contains shape => box
case (box, shape) if shape contains box => shape
case (box, shape) => boxBounding(box, shape)
}
Now you have to write contains and boxBounding, which is a pure algorithms problem more than a language problem.
If the shapes all had the same center, implementing contains would be easier. It would go like this:
abstract class Shape { def contains(s: Shape): Boolean }
case class Rectangle(width: Int, height: Int) extends Shape {
def contains(s: Shape): Boolean = s match {
case Rectangle(w2, h2) => width >= w2 && height >= h2
case Location(x, y, s) => // not the same center
case Circle(radius) => width >= radius && height >= radius
case Group(shapes # _*) => shapes.forall(this.contains(_))
}
}
case class Location(x: Int, y: Int, shape: Shape) extends Shape {
def contains(s: Shape): Boolean = // not the same center
}
case class Circle(radius: Int) extends Shape {
def contains(s: Shape): Boolean = s match {
case Rectangle(width, height) => radius >= width && radius >= height
case Location(x, y) => // not the same center
case Circle(r2) => radius >= r2
case Group(shapes # _*) => shapes.forall(this.contains(_))
}
}
case class Group(shapes: Shape*) extends Shape {
def contains(s: Shape): Boolean = shapes.exists(_ contains s)
}
As for boxBounding, which takes two shapes and combine them, it will usually be a rectangle, but can be a circle under certain circunstances. Anyway, it is pretty straight-forward, once you have the algorithm figured out.

A bounding box is usually a rectangle. I don't think a circle located at (-r,-r) is the bounding box of a circle of radius r....
Anyway, suppose you have a bounding box b1 and another b2 and a function combineBoxes that computes the bounding box of b1 and b2.
Then if you have a non-empty set of shapes in your group, you can use reduceLeft to compute the whole bounding box of a list of bounding boxes by combining them two at a time until only one giant box remains. (The same idea can be used to reduce a list of numbers to a sum of numbers by adding them in pairs. And it's called reduceLeft because it works left to right across the list.)
Suppose that blist is a list of bounding boxes of each shape. (Hint: this is where map comes in.) Then
val bigBox = blist reduceLeft( (box1,box2) => combineBoxes(box1,box2) )
You'll need to catch the empty group case separately, however. (Since it has a no well-defined bounding box, you don't want to use folds; folds are good for when there is a default empty case that makes sense. Or you have to fold with Option, but then your combining function has to understand how to combine None with Some(box), which is probably not worth it in this case--but very well might be if you were writing production code that needs to elegantly handle various sorts of empty list situations.)

Related

Partially applied/curried function vs overloaded function

Whilst I understand what a partially applied/curried function is, I still don't fully understand why I would use such a function vs simply overloading a function. I.e. given:
def add(a: Int, b: Int): Int = a + b
val addV = (a: Int, b: Int) => a + b
What is the practical difference between
def addOne(b: Int): Int = add(1, b)
and
def addOnePA = add(1, _:Int)
// or currying
val addOneC = addV.curried(1)
Please note I am NOT asking about currying vs partially applied functions as this has been asked before and I have read the answers. I am asking about currying/partially applied functions VS overloaded functions
The difference in your example is that overloaded function will have hardcoded value 1 for the first argument to add, i.e. set at compile time, while partially applied or curried functions are meant to capture their arguments dynamically, i.e. at run time. Otherwise, in your particular example, because you are hardcoding 1 in both cases it's pretty much the same thing.
You would use partially applied/curried function when you pass it through different contexts, and it captures/fills-in arguments dynamically until it's completely ready to be evaluated. In FP this is important because many times you don't pass values, but rather pass functions around. It allows for higher composability and code reusability.
There's a couple reasons why you might prefer partially applied functions. The most obvious and perhaps superficial one is that you don't have to write out intermediate functions such as addOnePA.
List(1, 2, 3, 4) map (_ + 3) // List(4, 5, 6, 7)
is nicer than
def add3(x: Int): Int = x + 3
List(1, 2, 3, 4) map add3
Even the anonymous function approach (that the underscore ends up expanding out to by the compiler) feels a tiny bit clunky in comparison.
List(1, 2, 3, 4) map (x => x + 3)
Less superficially, partial application comes in handy when you're truly passing around functions as first-class values.
val fs = List[(Int, Int) => Int](_ + _, _ * _, _ / _)
val on3 = fs map (f => f(_, 3)) // partial application
val allTogether = on3.foldLeft{identity[Int] _}{_ compose _}
allTogether(6) // (6 / 3) * 3 + 3 = 9
Imagine if I hadn't told you what the functions in fs were. The trick of coming up with named function equivalents instead of partial application becomes harder to use.
As for currying, currying functions often lets you naturally express transformations of functions that produce other functions (rather than a higher order function that simply produces a non-function value at the end) which might otherwise be less clear.
For example,
def integrate(f: Double => Double, delta: Double = 0.01)(x: Double): Double = {
val domain = Range.Double(0.0, x, delta)
domain.foldLeft(0.0){case (acc, a) => delta * f(a) + acc
}
can be thought of and used in the way that you actually learned integration in calculus, namely as a transformation of a function that produces another function.
def square(x: Double): Double = x * x
// Ignoring issues of numerical stability for the moment...
// The underscore is really just a wart that Scala requires to bind it to a val
val cubic = integrate(square) _
val quartic = integrate(cubic) _
val quintic = integrate(quartic) _
// Not *utterly* horrible for a two line numerical integration function
cubic(1) // 0.32835000000000014
quartic(1) // 0.0800415
quintic(1) // 0.015449626499999999
Currying also alleviates a few of the problems around fixed function arity.
implicit class LiftedApply[A, B](fOpt: Option[A => B]){
def ap(xOpt: Option[A]): Option[B] = for {
f <- fOpt
x <- xOpt
} yield f(x)
}
def not(x: Boolean): Boolean = !x
def and(x: Boolean)(y: Boolean): Boolean = x && y
def and3(x: Boolean)(y: Boolean)(z: Boolean): Boolean = x && y && z
Some(not _) ap Some(false) // true
Some(and _) ap Some(true) ap Some(true) // true
Some(and3 _) ap Some(true) ap Some(true) ap Some(true) // true
By having curried functions, we've been able to "lift" a function to work on Option for as many arguments as we need. If our logic functions had not been curried, then we would have had to have separate functions to lift A => B to Option[A] => Option[B], (A, B) => C to (Option[A], Option[B]) => Option[C], (A, B, C) => D to (Option[A], Option[B], Option[C]) => Option[D] and so on for all the arities we cared about.
Currying also has some other miscellaneous benefits when it comes to type inference and is required if you have both implicit and non-implicit arguments for a method.
Finally, the answers to this question list out some more times you might want currying.

Scala code analyzer targets case variable names that are identical to the outer matched varables - "suspicous shadowing"

In the following code snippet in which the outer match vars (x,y) are case matched by (xx,yy):
scala> val (x,y) = (1,2)
x: Int = 1
y: Int = 2
scala> (x,y) match {
| case (xx:Int, yy:Int) => println(s"x=$x xx=$xx")
| }
x=1 xx=1
We could have also written that code as follows:
scala> (x,y) match {
| case (x:Int, y:Int) => println(s"x=$x y=$y")
| }
x=1 y=2
In this latter case the Scala Code Analyzers will inform us:
Suspicious shadowing by a Variable Pattern
OK. But is there any situation where we could end up actually misusing the inner variable (x or y) in place of the original outer match variables?
It seems this is purely stylistic? No actual possibility for bugs? If so i would be interested to learn what the bugs could be.
This could be confusing:
val x = Some(1)
val y = Some(2)
(x, y) match {
case (Some(x), Some(y)) => println(s"x=$x y=$y")
}
x and y have different types depending on whether you are inside or outside of the match. If this code wasn't using simply Option, and was several lines longer, it could be rather difficult to reason about.
Could any bugs arise from this? None that I can think of that aren't horribly contrived. You could for example, mistake one for another.
val list = List(1,2,3)
list match {
case x :: y :: list => list // List(3) and not List(1,2,3)
case x :: list => list // List with 1 element, should the outer list have size 2
case _ => list // Returns the outer list when empty
}
Not to mention what a horrible mess that is. Within the match, list sometimes refers to an inner symbol, and sometimes the outer list.
It's just code that's unnecessarily complicated to read and understand, there are no special bugs that could happen.

Refactoring a small Scala function

I have this function to compute the distance between two n-dimensional points using Pythagoras' theorem.
def computeDistance(neighbour: Point) = math.sqrt(coordinates.zip(neighbour.coordinates).map {
case (c1: Int, c2: Int) => math.pow(c1 - c2, 2)
}.sum)
The Point class (simplified) looks like:
class Point(val coordinates: List[Int])
I'm struggling to refactor the method so it's a little easier to read, can anybody help please?
Here's another way that makes the following three assumptions:
The length of the list is the number of dimensions for the point
Each List is correctly ordered, i.e. List(x, y) or List(x, y, z). We do not know how to handle List(x, z, y)
All lists are of equal length
def computeDistance(other: Point): Double = sqrt(
coordinates.zip(other.coordinates)
.flatMap(i => List(pow(i._2 - i._1, 2)))
.fold(0.0)(_ + _)
)
The obvious disadvantage here is that we don't have any safety around list length. The quick fix for this is to simply have the function return an Option[Double] like so:
def computeDistance(other: Point): Option[Double] = {
if(other.coordinates.length != coordinates.length) {
return None
}
return Some(sqrt(coordinates.zip(other.coordinates)
.flatMap(i => List(pow(i._2 - i._1, 2)))
.fold(0.0)(_ + _)
))
I'd be curious if there is a type safe way to ensure equal list length.
EDIT
It was politely pointed out to me that flatMap(x => List(foo(x))) is equivalent to map(foo) , which I forgot to refactor when I was originally playing w/ this. Slightly cleaner version w/ Map instead of flatMap :
def computeDistance(other: Point): Double = sqrt(
coordinates.zip(other.coordinates)
.map(i => pow(i._2 - i._1, 2))
.fold(0.0)(_ + _)
)
Most of your problem is that you're trying to do math with really long variable names. It's almost always painful. There's a reason why mathematicians use single letters. And assign temporary variables.
Try this:
class Point(val coordinates: List[Int]) { def c = coordinates }
import math._
def d(p: Point) = {
val delta = for ((a,b) <- (c zip p.c)) yield pow(a-b, dims)
sqrt(delta.sum)
}
Consider type aliases and case classes, like this,
type Coord = List[Int]
case class Point(val c: Coord) {
def distTo(p: Point) = {
val z = (c zip p.c).par
val pw = z.aggregate(0.0) ( (a,v) => a + math.pow( v._1-v._2, 2 ), _ + _ )
math.sqrt(pw)
}
}
so that for any two points, for instance,
val p = Point( (1 to 5).toList )
val q = Point( (2 to 6).toList )
we have that
p distTo q
res: Double = 2.23606797749979
Note method distTo uses aggregate on a parallelised collection of tuples, and combines the partial results by the last argument (summation). For high dimensional points this may prove more efficient than the sequential counterpart.
For simplicity of use, consider also implicit classes, as suggested in a comment above,
implicit class RichPoint(val c: Coord) extends AnyVal {
def distTo(d: Coord) = Point(c) distTo Point(d)
}
Hence
List(1,2,3,4,5) distTo List(2,3,4,5,6)
res: Double = 2.23606797749979

Scala Code — I fail to understand [closed]

This question is unlikely to help any future visitors; it is only relevant to a small geographic area, a specific moment in time, or an extraordinarily narrow situation that is not generally applicable to the worldwide audience of the internet. For help making this question more broadly applicable, visit the help center.
Closed 10 years ago.
I've got part of code from friend and I'm trying to understand it and write it in some other way. "gotowe" is a sorted list of ("2011-12-22",-600.00) elements
val wartosci = gotowe.foldLeft (List(initial_ballance)){
case ((h::t), x) => (x._2 + h)::h::t
case _ => Nil
}.reverse
That is quite okay but how with this usage of foldLeft? (I've put all extra necessary lines):
val max = wartosci.max
val min = wartosci.min
val wychylenie = if(math.abs(min)>max){math.abs(min)}else{max}
def scale(x: Double) =
(x / wychylenie) * 500
def point(x: Double) =
{val z:Int = (500 - x).toInt
z}
val (points, _) = wartosci.foldLeft(("", 1)){case ((a, i), h) => (a + " " + (i * 4) + "," + point(scale(h)), i + 1)}
when I print points I've got a list of values, and don't know why not something like pairs of values
There are a couple of concepts at work here, which we'll examine in turn to work out what's going on:
foldLeft
Pattern matching
Let's first look at the definition of foldLeft:
def foldLeft [B] (z: B)(f: (B, A) ⇒ B) : B
Applies a binary operator to a start value and all elements of this list, going left to right.
Returns the result of inserting op between consecutive elements of this list, going left to right with the start value z on the left: op(...op(z, x1), x2, ..., xn) where x1,..., xn are the elements of this list.
So, in your example we're taking a list of Tuple2[String, Float] (or something like that) and folding it into the value z, which in this case is a List containing one element, initial_balance.
Now, our f in this case is the code inside the braces. It uses pattern matching to compose a partial function from the pair (b,a) - where in this case b is the 'cumulative result' and a is the next item in the list. This is the crux of what a fold does - it collapses the list into a value, using specific rules governing how to add each element at a time.
What is pattern matching / a partial function? Pattern matching is a very powerful technique for conditioning on and extracting things from input data. We give it something to look for - the case part of the expression - and tell it how to deal with it following the =>. The power of this is that the case expression doesn't just match, say, numbers or specific strings as might the switch statement in java, but can match, for example, Lists of a certain length, or email addresses, or specific tuples. Even more, you can use it to automatically get certain parts of the match - the domain of the email address, the third element of the list etc.
We'll look at the first pattern:
case ((h::t), x) => (x._2 + h)::h::t
The left hand side (before the =>) is used to match the value we're looking for and extract the specific pieces we care about. In this case, we're looking for a tuple where the first element is a list consisting of a head (h) and a tail(t), and the second element is just the next element of the list. The h::t is an extractor pattern - it's matching the object ::(h,t) which constructs a List by prepending h onto an existing List t.
When we've matched this, we follow the instructions to the right of the => to fold x into the cumulative value. To do this, we take the right hand side of the date/value tuple (the ._2), add it to the last value in the list (the head), and then push itself on to the head of the list. You'll notice this is using the same syntax as we used in the pattern match - using :: to prepend elements to a List.
The effect in this case is to create a running total of what's going on.
The second case doesn't really do much - it's a catch all case, but as this is being used in a fold it should never get called - we're always going to return something that looks like ((h::t), x).
Finally, we reverse the whole thing! So what we're left with is a list of balances after each transaction, running from oldest to youngest.
This is quite simple. It's just the matter of the assignment. You have this:
val (points, _) = wartosci.foldLeft(("", 1)){...}
What is inside {...} is not relevant. The first parameter of foldLeft will determine the type of its result. Since it is ("", 1), it will return a (String, Int) tuple.
Now, you assign it to (points, _). An assignment like this is also a pattern match. It is like you had written this:
var tmp: (String, Int) = _
val tmp: (String, Int) = wartosci.foldLeft(("", 1)){...} match {
case (x, y) => tmp = (x, y)
}
val points = tmp._1
So, points only gets assigned the String.

How to perform pattern matching with vararg case classes?

I have a set of case classes like this
abstract class Shape
case class Rectangle(width: Int, height: Int) extends Shape
case class Location(x: Int, y: Int, shape: Shape) extends Shape
case class Circle(radius: Int) extends Shape
case class Group(shape: Shape*) extends Shape
where basically Group is an array of shapes. I need to define a size method for computing sizes
for rectangle, circle and location its straightforward just return one. But i am having difficulty for Group.
object size extends Shape{
def size(s: Any) : Int = s match {
case Rectangle(x,y) => 1
case Group // how to do it? Also having case Group(shape : Shape*) gives an error
case Circle(r) => 1
case Location(x,y,shape) => 1
}
}
I know for Group i need to use map and fold left, but i really cant create a logic for it.
Thanks
Either of these will work, the second is probably preferred if a little weird at first glance. See 8.1.9 Pattern Sequences from the Scala Reference.
case g: Group => g.shape.map(size(_)).sum
case Group(ss # _*) => ss.map(size(_)).sum
This is using Scala 2.8. sum may not work on older versions.
The syntax for vararg pattern matching is somewhat strange.
def size(s: Shape) : Int = s match{
case Rectangle(x,y) => 1
case Circle(r) => 1
case Location(x,y,shape) => 1
case Group(shapes # _*) => (0 /: shapes) { _ + size(_) }
}
Note that in the last line, you sum up the sizes of all sub-shapes starting with zero using the /:-notation for folds.
How folds work: Folds accumulate the elements of a sequence using a given function.
So in order to compute the sum of a list, we would write (Haskell-style)
fold (\total element -> total + element) 0 list
which would combine all elements of the list with the given addition function starting with 0 (and therefore compute the sum).
In Scala, we can write it this way:
(0 /: list) { (total, element) => total + element }
which can be simplified to
(0 /: list) { _ + _ }
The first step is figuring out what you mean. The two most obvious choices are the total area covered by all the shapes, and the minimum rectangle containing them all. If for circles you return the actual area, they you probably have to go with the actual area.
There's no closed-form way to answer this. I might consider throwing a thousand random darts at a minimum enclosing rectangle and estimating the area as the percentage of darts that hit an occupied point. Is an estimate an acceptable response?
Are you guaranteed that all the shapes will be circles and rectangles? You might be able to cobble together a solution that would work for them. If Shapes might be extended further, then that won't work.
For Location size should drill down to get size since shape could be group which causes a higher count
case Location(x,y,shape) => size(shape)
That is if size is the number of shapes in the Shape
case g: Group => g.shape.map(size(_)).sum
case Group(ss # *) => ss.map(size()).sum
both of these gives the error value sum is not a member of Seq[Int]
However this oen works
case Group(shapes # _*) => (0 /: shapes) { _ + size(_)