Why does Scala require pattern variables to be linear?

Scala requires pattern variables to be linear, i.e. a pattern variable may not occur more than once in a pattern. Thus, this example does not compile:
def tupleTest(tuple: (Int, Int)) = tuple match {
case (a, a) => a
case _ => -1
}
But you can use two pattern variables and a guard to check equality instead:
def tupleTest(tuple: (Int, Int)) = tuple match {
case (a, b) if a == b => a
case _ => -1
}
So why does Scala require pattern variables to be linear? Are there any cases that cannot be transformed like this?
Edit
It is easy to transform the first example into the second (Scala to Scala). Of all occurrences of a variable v in the pattern, take the one that is evaluated first and bind it to v. For each other occurrence, introduce a fresh variable whose name is not used in the current scope, and for each such variable v' add a guard v == v'. This is what a programmer would write by hand, so it is just as efficient. Is there any problem with this approach? I'd like to see an example that cannot be transformed like this.

Because case (a, b) essentially assigns _._1 to val a and _._2 to val b (at least you can view it like that). With case (a, a), you cannot assign both _._1 and _._2 to the same val a.
What you want to do would actually look like
case (a, `a`) => ???
since Scala uses backticks to match against an existing identifier. Unfortunately that still doesn't work, because a is not in scope inside the pattern itself, only in the guard and after the => (it would have been fun though; I also hate writing case (a, b) if a == b =>). The reason is probably just that a compiler supporting this is harder to write.
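For what it's worth, here is a small sketch of the workarounds that do compile: the guard from the question, and backticks used against a value that is already bound before the pattern is matched. The names LinearPatternDemo, tupleTestNested and bothEqual are made up for illustration.

```scala
object LinearPatternDemo {
  // Guard version from the question: two distinct variables plus an equality check.
  def tupleTest(tuple: (Int, Int)): Int = tuple match {
    case (a, b) if a == b => a
    case _                => -1
  }

  // Backticks work once a is already bound, e.g. in a nested match.
  def tupleTestNested(tuple: (Int, Int)): Int = tuple match {
    case (a, rest) =>
      rest match {
        case `a` => a
        case _   => -1
      }
  }

  // Backticks against a method parameter: matches only if both
  // components are equal to the value passed in as expected.
  def bothEqual(expected: Int, tuple: (Int, Int)): Boolean = tuple match {
    case (`expected`, `expected`) => true
    case _                        => false
  }
}
```

The nested version is essentially the guard written out by hand, which is why the guard is usually the nicer choice.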

Related

Fold method using List as accumulator

To find prime factors of a number I was using this piece of code :
def primeFactors(num: Long): List[Long] = {
val exists = (2L to math.sqrt(num).toLong).find(num % _ == 0)
exists match {
case Some(d) => d :: primeFactors(num/d)
case None => List(num)
}
}
but then I found a cool and more functional approach using this code:
def factors(n: Long): List[Long] = (2 to math.sqrt(n).toInt)
.find(n % _ == 0).fold(List(n)) ( i => i.toLong :: factors(n / i))
Earlier I was using foldLeft or fold simply to get the sum of a list or for other simple calculations, but here I can't understand how fold works and how it breaks out of the recursive function. Can somebody please explain how fold works here?
Option's fold
If you look at the signature of Option's fold function, it takes two parameters:
def fold[B](ifEmpty: => B)(f: A => B): B
It applies f to the value of the Option if it is not empty. If the Option is empty, it simply returns the result of ifEmpty (this is the termination condition for the recursion).
So in your case, i => i.toLong :: factors(n / i) is f, which is evaluated if the Option is not empty, while List(n) is the termination condition.
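To make that concrete, here is a sketch that puts the fold version next to an explicit match; the two should behave the same. The names OptionFoldDemo and factorsMatch are just for illustration, and I use n.toDouble so the square root does not rely on an implicit Long-to-Double widening.

```scala
object OptionFoldDemo {
  // fold: first argument is the "empty" result, second handles Some(d).
  def factors(n: Long): List[Long] =
    (2L to math.sqrt(n.toDouble).toLong)
      .find(n % _ == 0)
      .fold(List(n))(d => d :: factors(n / d))

  // The same logic written as a pattern match.
  def factorsMatch(n: Long): List[Long] =
    (2L to math.sqrt(n.toDouble).toLong).find(n % _ == 0) match {
      case Some(d) => d :: factorsMatch(n / d) // a divisor exists: recurse
      case None    => List(n)                  // no divisor: n is prime, stop
    }
}
```

For example, factors(12) finds 2, recurses on 6, finds 2 again, recurses on 3, finds no divisor and returns List(3), so the overall result is List(2, 2, 3).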
fold used for collection / iterators
The other fold that you are talking about, for getting the sum of a collection, comes from TraversableOnce and has a signature like:
def foldLeft[B](z: B)(op: (B, A) => B): B
Here, z is the starting value (for a sum it's 0) and op is a binary operator which is applied to the accumulated result and each value of the collection, from left to right.
So both folds differ in their implementation.
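A quick sketch of the collection foldLeft, for comparison. Note that the operator passed to foldLeft does not have to be associative; subtraction shows that the direction matters. The object name FoldLeftDemo is only for illustration.

```scala
object FoldLeftDemo {
  val xs = List(1, 2, 3)

  val sum       = xs.foldLeft(0)(_ + _)   // ((0 + 1) + 2) + 3
  val leftDiff  = xs.foldLeft(10)(_ - _)  // ((10 - 1) - 2) - 3
  val rightDiff = xs.foldRight(10)(_ - _) // 1 - (2 - (3 - 10))

  // The accumulator type can differ from the element type.
  val digits = xs.foldLeft("")((acc, x) => acc + x.toString)
}
```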

What's the reasoning behind adding the "case" keyword to Scala?

Apart from:
case class A
... case which is quite useful?
Why do we need to use case in match? Wouldn't:
x match {
y if y > 0 => y * 2
_ => -1
}
... be much prettier and concise?
Or why do we need to use case when a function takes a tuple? Say, we have:
val z = List((1, -1), (2, -2), (3, -3)).zipWithIndex
Now, isn't:
z map { case ((a, b), i) => a + b + i }
... way uglier than just:
z map (((a, b), i) => a + b + i)
...?
First, as we know, it is possible to put several statements in the same case without any separator other than a line break, like:
x match {
case y if y > 0 => y * 2
println("test")
println("test2") // these 3 statements belong to the same "case"
}
If case were not needed, the compiler would have to find another way to know where one case ends and the next begins.
For example:
x match {
y if y > 0 => y * 2
_ => -1
}
How would the compiler know whether _ => -1 belongs to the first case or starts the next one?
Moreover, how would the compiler know that => introduces the body of the current case rather than a function literal?
The compiler would certainly need some way of delimiting cases, for example curly braces (or anything else):
x match {
{y if y > 0 => y * 2}
{_ => -1} // confusing with literal function notation
}
And surely the solution Scala currently provides, the case keyword, is a lot more readable and understandable than some other separator such as the curly braces in my example.
Adding to @Mik378's answer:
When you write this: (a, b) => something, you are defining an anonymous Function2 - a function that takes two parameters.
When you write this: case (a, b) => something, you are defining a pattern-matching anonymous function that takes one parameter and matches it against a pair. Depending on the expected type, it is treated as a Function1 or a PartialFunction.
So you need the case keyword to differentiate between these two.
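A small sketch of that distinction, with explicit type annotations so the difference is visible. The names (CaseFunctionDemo, add2, addPair, doubled, sums) are invented for the example.

```scala
object CaseFunctionDemo {
  // Function2: two separate parameters.
  val add2: (Int, Int) => Int = (a, b) => a + b

  // Function1 whose single parameter is a pair, destructured with case.
  val addPair: ((Int, Int)) => Int = { case (a, b) => a + b }

  // The same syntax with a PartialFunction as the expected type.
  val doubled: PartialFunction[Int, Int] = { case y if y > 0 => y * 2 }

  // The example from the question: each element is ((Int, Int), Int).
  val z    = List((1, -1), (2, -2), (3, -3)).zipWithIndex
  val sums = z.map { case ((a, b), i) => a + b + i }
}
```

Here doubled.isDefinedAt(-1) is false, which is what lets collect skip non-matching elements.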
The second issue, anonymous functions that avoid the case, is a matter of debate:
https://groups.google.com/d/msg/scala-debate/Q0CTZNOekWk/z1eg3dTkCXoJ
Also: http://www.scala-lang.org/old/node/1260
For the first issue, the choice is whether you allow a block or an expression on the RHS of the arrow.
In practice, I find that shorter case bodies are usually preferable, so I can certainly imagine your alternative syntax resulting in crisper code.
Consider one-line methods. You write:
def f(x: Int) = 2 * x
then you need to add a statement. I don't know if the IDE is able to auto-add parens.
def f(x: Int) = { val res = 2*x ; res }
That seems no worse than requiring the same syntax for case bodies.
To review, a case clause is case Pattern Guard => body.
Currently, body is a block, or a sequence of statements and a result expression.
If body were an expression, you'd need braces for multiple statements, like a function.
I don't think => results in ambiguities since function literals don't qualify as patterns, unlike literals like 1 or "foo".
One snag might be: { case foo => ??? } is a "pattern matching anonymous function" (SLS 8.5). Obviously, if the case is optional or eliminated, then { foo => ??? } is ambiguous. You'd have to distinguish case clauses for anon funs (where case is required) and case clauses in a match.
One counter-argument for the current syntax is that, in an intuition deriving from C, you always secretly hope that your match will compile to a switch table. In that metaphor, the cases are labels to jump to, and a label is just the address of a sequence of statements.
The alternative syntax might encourage a more inlined approach:
x match {
C => c(x)
D => d(x)
_ => ???
}
@inline def c(x: X) = ???
//etc
In this form, it looks more like a dispatch table, and the match body recalls the Map syntax, Map(a -> 1, b -> 2), that is, a tidy simplification of the association.
One of the key aspects of code readability is the words that grab your attention. For example, return grabs your attention when you see it because you know that it is such a decisive action (breaking out of the function and possibly sending a value back to the caller).
Another example is break--not that I like break, but it gets your attention.
I would agree with @Mik378 that case in Scala is more readable than the alternatives. Besides the compiler confusion he mentions, it gets your attention.
I am all for concise code, but there is a line between concise and illegible. I will gladly make the trade of 4n characters (where n is the number of cases) for the substantial readability that I get in return.

Binary operator with Option arguments

In scala, how do I define addition over two Option arguments? Just to be specific, let's say they're wrappers for Int types (I'm actually working with maps of doubles but this example is simpler).
I tried the following but it just gives me an error:
def addOpt(a:Option[Int], b:Option[Int]) = {
a match {
case Some(x) => x.get
case None => 0
} + b match {
case Some(y) => y.get
case None => 0
}
}
Edited to add:
In my actual problem, I'm adding two maps which are standins for sparse vectors. So the None case returns Map[Int, Double] and the + is actually a ++ (with the tweak at stackoverflow.com/a/7080321/614684)
Monoids
You might find life becomes a lot easier when you realize that you can stand on the shoulders of giants and take advantage of common abstractions and the libraries built to use them. To this end, this question is basically about dealing with monoids (see related questions below for more about this) and the library in question is called scalaz.
Using scalaz FP, this is just:
def add(a: Option[Int], b: Option[Int]) = ~(a |+| b)
What is more, this works on any monoid M:
def add[M: Monoid](a: Option[M], b: Option[M]) = ~(a |+| b)
Even more usefully, it works on any number of them placed inside a Foldable container:
def add[M: Monoid, F: Foldable](as: F[Option[M]]) = ~as.asMA.sum
Note that some rather useful monoids, aside from the obvious Int, String, Boolean are:
Map[A, B: Monoid]
A => (B: Monoid)
Option[A: Monoid]
In fact, it's barely worth the bother of extracting your own method:
scala> some(some(some(1))) #:: some(some(some(2))) #:: Stream.empty
res0: scala.collection.immutable.Stream[Option[Option[Option[Int]]]] = Stream(Some(Some(Some(1))), ?)
scala> ~res0.asMA.sum
res1: Option[Option[Int]] = Some(Some(3))
Some related questions
Q. What is a monoid?
A monoid is a type M for which there exists an associative binary operation (M, M) => M and an identity I under this operation, such that mplus(m, I) == m == mplus(I, m) for all m of type M
Q. What is |+|?
This is just scalaz shorthand (or ASCII madness, ymmv) for the mplus binary operation
Q. What is ~?
It is a unary operator meaning "or identity" which is retrofitted (using scala's implicit conversions) by the scalaz library onto Option[M] if M is a monoid. Obviously a non-empty option returns its contents; an empty option is replaced by the monoid's identity.
Q. What is asMA.sum?
A Foldable is basically a datastructure which can be folded over (like foldLeft, for example). Recall that foldLeft takes a seed value and an operation to compose successive computations. In the case of summing a monoid, the seed value is the identity I and the operation is mplus. You can hence call asMA.sum on a Foldable[M : Monoid]. You might need to use asMA because of the name clash with the standard library's sum method.
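If you don't want to pull in scalaz, the idea can be sketched with a tiny hand-rolled type class. To be clear, this is not scalaz's API: the Monoid trait, its zero/plus methods, the instances and addOpt below are all hypothetical names written just for this illustration.

```scala
object MonoidSketch {
  // Hypothetical minimal monoid: an identity and an associative operation.
  trait Monoid[M] {
    def zero: M
    def plus(a: M, b: M): M
  }

  implicit val intMonoid: Monoid[Int] = new Monoid[Int] {
    def zero: Int = 0
    def plus(a: Int, b: Int): Int = a + b
  }

  // A Map is a monoid if its values are: merge keys, combining shared ones.
  implicit def mapMonoid[K, V](implicit ev: Monoid[V]): Monoid[Map[K, V]] =
    new Monoid[Map[K, V]] {
      def zero: Map[K, V] = Map.empty
      def plus(a: Map[K, V], b: Map[K, V]): Map[K, V] =
        b.foldLeft(a) { case (acc, (k, v)) =>
          acc.updated(k, acc.get(k).fold(v)(ev.plus(_, v)))
        }
    }

  // "Or identity": a missing operand is replaced by the monoid's zero.
  def addOpt[M](a: Option[M], b: Option[M])(implicit ev: Monoid[M]): M =
    ev.plus(a.getOrElse(ev.zero), b.getOrElse(ev.zero))
}
```

With this in scope, addOpt(Some(1), Some(2)) gives 3, and the same method adds two optional maps key by key, which is close to the sparse-vector case mentioned in the question.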
Some References
Slides and Video of a talk I gave which gives practical examples of using monoids in the wild
def addOpts(xs: Option[Int]*) = xs.flatten.sum
This will work for any number of inputs.
If they both default to 0 you don't need pattern matching:
def addOpt(a:Option[Int], b:Option[Int]) = {
a.getOrElse(0) + b.getOrElse(0)
}
(Repeating comment above in an answer as requested)
You don't extract the content of the option the proper way. When you match with case Some(x), x is already the value inside the option (of type Int), so you don't call get on it. Just do
case Some(x) => x
Anyway, if you want content or default, a.getOrElse(0) is more convenient
def addOpt(ao: Option[Int], bo: Option[Int]) =
for {
a <- ao
b <- bo
} yield a + b
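Note that the answers above do not all behave the same way when an argument is None. A small side-by-side sketch (the names AddOptDemo, addOptDefault and addOptStrict are mine, chosen to tell the variants apart):

```scala
object AddOptDemo {
  // Missing operands count as 0; the result is always an Int.
  def addOptDefault(a: Option[Int], b: Option[Int]): Int =
    a.getOrElse(0) + b.getOrElse(0)

  // For-comprehension: the result is None as soon as either operand is None.
  def addOptStrict(ao: Option[Int], bo: Option[Int]): Option[Int] =
    for {
      a <- ao
      b <- bo
    } yield a + b

  // Varargs version: flatten drops the Nones, sum adds up what is left.
  def addOpts(xs: Option[Int]*): Int = xs.flatten.sum
}
```

So addOptDefault(Some(1), None) is 1, while addOptStrict(Some(1), None) is None. Which one you want depends on whether a missing value should mean "zero" or "unknown".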

difference between foldLeft and reduceLeft in Scala

I have learned the basic difference between foldLeft and reduceLeft
foldLeft:
initial value has to be passed
reduceLeft:
takes first element of the collection as initial value
throws exception if collection is empty
Is there any other difference ?
Any specific reason to have two methods with similar functionality?
A few things to mention here, before giving the actual answer:
Your question doesn't have anything to do with left; it's rather about the difference between reducing and folding.
The difference is not in the implementation at all; just look at the signatures.
The question doesn't have anything to do with Scala in particular; it's rather about two concepts of functional programming.
Back to your question:
Here is the signature of foldLeft (could also have been foldRight for the point I'm going to make):
def foldLeft [B] (z: B)(f: (B, A) => B): B
And here is the signature of reduceLeft (again the direction doesn't matter here)
def reduceLeft [B >: A] (f: (B, A) => B): B
These two look very similar and thus caused the confusion. reduceLeft is a special case of foldLeft (which by the way means that you sometimes can express the same thing by using either of them).
When you call reduceLeft say on a List[Int] it will literally reduce the whole list of integers into a single value, which is going to be of type Int (or a supertype of Int, hence [B >: A]).
When you call foldLeft say on a List[Int] it will fold the whole list (imagine rolling a piece of paper) into a single value, but this value doesn't have to be even related to Int (hence [B]).
Here is an example:
def listWithSum(numbers: List[Int]) = numbers.foldLeft((List.empty[Int], 0)) {
(resultingTuple, currentInteger) =>
(currentInteger :: resultingTuple._1, currentInteger + resultingTuple._2)
}
This method takes a List[Int] and returns a Tuple2[List[Int], Int] or (List[Int], Int). It computes the sum and returns a tuple of the list of integers and its sum. By the way, the list is returned reversed, because we used foldLeft instead of foldRight.
Watch One Fold to rule them all for a more in depth explanation.
reduceLeft is just a convenience method. It is equivalent to
list.tail.foldLeft(list.head)(_)
foldLeft is more generic, you can use it to produce something completely different than what you originally put in. Whereas reduceLeft can only produce an end result of the same type or super type of the collection type. For example:
List(1,3,5).foldLeft(0) { _ + _ }
List(1,3,5).foldLeft(List[String]()) { (a, b) => b.toString :: a }
The foldLeft will apply the closure with the last folded result (first time using initial value) and the next value.
reduceLeft on the other hand will first combine two values from the list and apply those to the closure. Next it will combine the rest of the values with the cumulative result. See:
List(1,3,5).reduceLeft { (a, b) => println("a " + a + ", b " + b); a + b }
If the list is empty foldLeft can present the initial value as a legal result. reduceLeft on the other hand does not have a legal value if it can't find at least one value in the list.
For reference, calling reduceLeft on an empty container throws the following exception:
java.lang.UnsupportedOperationException: empty.reduceLeft
Reworking the code to use
myList.foldLeft("") { (a, b) => a + b }
is one potential option. Another is to use the reduceLeftOption variant which returns an Option wrapped result.
myList reduceLeftOption {(a,b) => a+b} match {
case None => // handle no result as necessary
case Some(v) => println(v)
}
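The three behaviours on an empty list can be checked side by side. This is only a sketch; EmptyReduceDemo and the val names are made up, and Try is used just to capture the exception instead of letting it propagate.

```scala
import scala.util.Try

object EmptyReduceDemo {
  val empty = List.empty[Int]

  // foldLeft has a seed, so an empty list simply yields the seed.
  val folded: Int = empty.foldLeft(0)(_ + _)

  // reduceLeft has nothing to start from and throws.
  val reduced: Try[Int] = Try(empty.reduceLeft(_ + _))

  // reduceLeftOption signals the empty case with None instead.
  val reducedOpt: Option[Int] = empty.reduceLeftOption(_ + _)
}
```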
The basic reason they are both in Scala standard library is probably because they are both in Haskell standard library (called foldl and foldl1). If reduceLeft wasn't, it would quite often be defined as a convenience method in different projects.
From Functional Programming Principles in Scala (Martin Odersky):
The function reduceLeft is defined in terms of a more general function, foldLeft.
foldLeft is like reduceLeft but takes an accumulator z, as an additional parameter, which is returned when foldLeft is called on an empty list:
(List(x1, ..., xn) foldLeft z)(op) = (...(z op x1) op ...) op xn
[as opposed to reduceLeft, which throws an exception when called on an empty list.]
The course (see lecture 5.5) provides abstract definitions of these functions, which illustrates their differences, although they are very similar in their use of pattern matching and recursion.
abstract class List[T] { ...
def reduceLeft(op: (T,T)=>T) : T = this match{
case Nil => throw new Error("Nil.reduceLeft")
case x :: xs => (xs foldLeft x)(op)
}
def foldLeft[U](z: U)(op: (U,T)=>U): U = this match{
case Nil => z
case x :: xs => (xs foldLeft op(z, x))(op)
}
}
Note that foldLeft returns a value of type U, which is not necessarily the same as the element type T, whereas reduceLeft returns a value of the element type T.
To really understand what you are doing with fold/reduce,
check this: http://wiki.tcl.tk/17983
It is a very good explanation. Once you get the concept of fold,
reduce follows from it together with the answer above:
list.tail.foldLeft(list.head)(_)
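Spelled out as a method, that one-liner might look like the sketch below. The helper name reduceViaFold is hypothetical; like reduceLeft, it fails on an empty list, because head and tail are not defined there.

```scala
object ReduceAsFold {
  // reduceLeft expressed with foldLeft: the head is the seed, the tail is folded.
  def reduceViaFold[A](list: List[A])(op: (A, A) => A): A =
    list.tail.foldLeft(list.head)(op)
}
```

For List(1, 3, 5) and subtraction, both this and reduceLeft compute (1 - 3) - 5.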
Scala 2.13.3, Demo:
val names = List("Foo", "Bar")
println("ReduceLeft: "+ names.reduceLeft(_+_))
println("ReduceRight: "+ names.reduceRight(_+_))
println("Fold: "+ names.fold("Other")(_+_))
println("FoldLeft: "+ names.foldLeft("Other")(_+_))
println("FoldRight: "+ names.foldRight("Other")(_+_))
outputs:
ReduceLeft: FooBar
ReduceRight: FooBar
Fold: OtherFooBar
FoldLeft: OtherFooBar
FoldRight: FooBarOther

Why doesn't Option have a fold method?

I wonder why scala.Option doesn't have a method fold like this defined:
fold(ifSome: A => B , ifNone: => B)
equivalent to
map(ifSome).getOrElse(ifNone)
Is there nothing better than using map + getOrElse?
I personally find methods like cata that take two closures as arguments are often overdoing it. Do you really gain in readability over map + getOrElse? Think of a newcomer to your code: What will they make of
opt cata { x => x + 1, 0 }
Do you really think this is clearer than
opt map { x => x + 1 } getOrElse 0
In fact I would argue that neither is preferable over the good old
opt match {
case Some(x) => x + 1
case None => 0
}
As always, there's a limit where additional abstraction does not give you benefits and turns counter-productive.
It was finally added in Scala 2.10, with the signature fold[B](ifEmpty: => B)(f: A => B): B.
Unfortunately, this signature has a common drawback: for each call, B is inferred from the ifEmpty argument alone, whose type is in practice often too narrow. For example (a correct version is already in the standard library; this is just for demonstration):
def toList[A](x: Option[A]) = x.fold(Nil)(_ :: Nil)
Scala will infer B to be Nil.type instead of desired List[A] and complain about f not returning Nil.type. Instead, you need one of
x.fold[List[A]](Nil)(_ :: Nil)
x.fold(Nil: List[A])(_ :: Nil)
This makes fold not quite equivalent to corresponding match.
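Here are the working variants collected in one place, as a sketch. The object and method names are mine; the third variant, using List.empty[A] as the seed, is an additional option that avoids the annotation altogether.

```scala
object OptionFoldInference {
  // Does not compile in Scala 2 according to the explanation above,
  // because B is inferred as Nil.type:
  // def toList[A](x: Option[A]) = x.fold(Nil)(_ :: Nil)

  // Fix 1: give B explicitly.
  def toList1[A](x: Option[A]): List[A] = x.fold[List[A]](Nil)(_ :: Nil)

  // Fix 2: ascribe the wider type to the ifEmpty argument.
  def toList2[A](x: Option[A]): List[A] = x.fold(Nil: List[A])(_ :: Nil)

  // Fix 3: start from an expression that already has type List[A].
  def toList3[A](x: Option[A]): List[A] = x.fold(List.empty[A])(_ :: Nil)
}
```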
You can do:
opt.foldLeft(els)((_, y) => fun(y))
or
(els /: opt)((_, y) => fun(y))
(Both solutions evaluate els eagerly, by value, which might not be what you want. Thanks to Rex Kerr for pointing that out.)
Edit:
But what you really want is Scalaz’s catamorphism cata (basically a fold which not only handles the Some value but also maps the None part, which is what you described)
opt.cata(fun, els)
defined as (where value is the pimped option value)
def cata[X](some: A => X, none: => X): X = value match {
case None => none
case Some(a) => some(a)
}
which is equivalent to opt.map(some).getOrElse(none).
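If you only want cata and not the rest of Scalaz, you could add it yourself with an implicit class. This is just a sketch under that assumption; CataSketch and OptionCata are made-up names and this is not how Scalaz itself wires it up.

```scala
object CataSketch {
  // Enrich Option with a cata method; none is by-name, so it is only
  // evaluated when the option is actually empty.
  implicit class OptionCata[A](private val value: Option[A]) extends AnyVal {
    def cata[X](some: A => X, none: => X): X = value match {
      case None    => none
      case Some(a) => some(a)
    }
  }
}
```

After import CataSketch._, Option(1).cata(_ + 1, 0) evaluates to 2 and Option.empty[Int].cata(_ + 1, 0) evaluates to 0.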
Although I should remark that you should only use cata when it is the ‘more natural’ way of expressing it. There are many cases where a simple map–getOrElse suffices, especially when it involves potentially chaining lots of maps. (Though you could also chain the funs with function composition, of course – it depends on whether you want to focus on the function composition or the value transformation.)
As mentioned by Debilski, you can use Scalaz's OptionW.cata or fold. As Jason commented, named parameters make this look nice:
opt.fold { ifSome = _ + 1, ifNone = 0 }
Now, if the value you want in the None case is mzero for some Monoid[M] and you have a function f: A => M for the Some case, you can do this:
opt foldMap f
So,
opt map (_ + 1) getOrElse 0
becomes
opt foldMap (_ + 1)
Personally, I think Option should have an apply method which would be the catamorphism. That way you could just do this:
opt { _ + 1, 0 }
or
opt { some = _ + 1, none = 0 }
In fact, this would be nice to have for all algebraic data structures.