How to use Scala reduceLeft on case classes? - scala

I understand how to use reduceLeft on simple lists of integers, but my attempts to use it on case class objects fail.
Assume I have:
case class LogMsg(time:Int, cat:String, msg:String)
val cList = List(LogMsg(1,"a", "bla"), LogMsg(2,"a", "bla"), LogMsg(4,"b", "bla"))
and I want to find the largest difference in time between LogMsgs.
I want to do something like:
cList.reduceLeft((a, b) => b.time - a.time)
which of course doesn't work.
The first iteration of reduceLeft compares the first two elements, which are both of type LogMsg. After that it compares the next element (LogMsg) with the result of the first iteration (Int).
Do I just have the syntax wrong or should I be doing this another way?

I'd probably do something like this:
(cList, cList.tail).zipped.map((a, b) => b.time - a.time).max
You'll need to check beforehand that cList has at least 2 elements.
reduceLeft can't be used to return the largest difference, because its result always has the element type of the List you're reducing (or a supertype of it), i.e. LogMsg in this case, and you're asking for an Int.
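As a rough sketch of the same pairwise idea that needs no separate length check, one could zip the list with its own tail and use maxOption, assuming Scala 2.13 or later. The helper name maxGap is made up for illustration:

```scala
case class LogMsg(time: Int, cat: String, msg: String)

// Hypothetical helper: largest difference between consecutive messages,
// or None when there are fewer than two messages.
def maxGap(msgs: List[LogMsg]): Option[Int] =
  msgs
    .zip(msgs.drop(1))                      // pairs (m0, m1), (m1, m2), ...
    .map { case (a, b) => b.time - a.time } // an Int per pair
    .maxOption                              // None for an empty list of pairs

val cList = List(LogMsg(1, "a", "bla"), LogMsg(2, "a", "bla"), LogMsg(4, "b", "bla"))
val largest = maxGap(cList)
```

The intermediate map is what turns the LogMsg pairs into Ints, which is the step reduceLeft alone cannot do.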

My try:
cList.sliding(2).map(t => t(1).time - t(0).time).max
Another one that came into my mind: since LogMsg is a case class, we can take advantage of pattern matching:
cList.sliding(2).collect {
  case List(LogMsg(a, _, _), LogMsg(b, _, _)) => b - a
}.max

I would recommend using foldLeft, which is like reduceLeft but lets you supply the initial value of the result.
val head::tail = cList
tail.foldLeft((head.time, 0)) ((a,b) => (b.time, math.max(a._2,b.time-a._1)))._2
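Note that the snippet above seeds the running maximum with 0, so it reports 0 when every difference is negative, and the head::tail pattern fails on an empty list. A possible variant, only a sketch with a made-up name maxGapFold, seeds the fold with the first real difference instead:

```scala
case class LogMsg(time: Int, cat: String, msg: String)

// Hypothetical helper: the accumulator is (previous time, best difference so far).
def maxGapFold(msgs: List[LogMsg]): Option[Int] = msgs match {
  case first :: second :: rest =>
    val start = (second.time, second.time - first.time)
    val (_, best) = rest.foldLeft(start) { case ((prev, bestSoFar), m) =>
      (m.time, math.max(bestSoFar, m.time - prev))
    }
    Some(best)
  case _ => None // fewer than two messages: no difference to report
}

val cList = List(LogMsg(1, "a", "bla"), LogMsg(2, "a", "bla"), LogMsg(4, "b", "bla"))
val decreasing = List(LogMsg(5, "a", "bla"), LogMsg(3, "a", "bla"), LogMsg(2, "a", "bla"))
```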

Related

Why does Scala require pattern variables to be linear?

Scala requires pattern variables to be linear, i.e. a pattern variable may not occur more than once in a pattern. Thus, this example does not compile:
def tupleTest(tuple: (Int, Int)) = tuple match {
case (a, a) => a
case _ => -1
}
But you can use two pattern variables and a guard to check equality instead:
def tupleTest(tuple: (Int, Int)) = tuple match {
case (a, b) if a == b => a
case _ => -1
}
So why does Scala require pattern variables to be linear? Are there any cases that can not be transformed like this?
Edit
It is easy to transform the first example into the second (Scala to Scala). Of all occurrences of a variable v in the pattern, take the one that is evaluated first and bind it to v. For each other occurrence, introduce a new variable with a name that is not used in the current scope, and for each of those variables v' add a guard v == v'. This is what a programmer would write by hand, so the efficiency is the same. Is there any problem with this approach? I'd like to see an example that cannot be transformed like this.
Because case (a, b) is basically assigning val a to _._1 and val b to _._2 (at least you can view it like that). In case of case (a, a), you cannot assign val a to both _._1 and _._2.
Actually, what you want would have looked like
case (a, `a`) => ???
since Scala uses backticks to match against an existing identifier. Unfortunately that still doesn't work, because a only becomes visible after the => (it would have been fun, though; I also hate writing case (a, b) if a == b =>). The reason is probably just that it is harder to write a compiler that supports it.
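To make the two options concrete, here is a small sketch. The guard version is the one from the question; matchesExpected is a made-up helper showing that backticks do work once the identifier is already bound outside the pattern:

```scala
// Guard version: two fresh variables plus an equality check.
def tupleTest(tuple: (Int, Int)): Int = tuple match {
  case (a, b) if a == b => a
  case _ => -1
}

// Backticks compare against a value that is already in scope
// (here the method parameter `expected`), instead of binding a new variable.
def matchesExpected(expected: Int, tuple: (Int, Int)): Boolean = tuple match {
  case (`expected`, `expected`) => true
  case _ => false
}
```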

Scala: reduceLeft with String

I have a list of Integers and I want to make a String of it.
var xs = List(1,2,3,4,5)
(xs foldLeft "") (_+_) // String = 12345
With foldLeft it works perfectly, but my question is: does it also work with reduceLeft? And if so, how?
It cannot work this way with reduceLeft. Informally you can view reduceLeft as a special case of foldLeft where the accumulated value is of the same type as the collection's elements. Because in your case the element type is Int and the accumulated value is String, there is no way to use reduceLeft in the way you used foldLeft.
However in this specific case you can simply convert all your Int elements to String up front, and then reduce:
scala> xs.map(_.toString) reduceLeft(_+_)
res5: String = 12345
Note that this will throw an exception if the list is empty. This is another difference with foldLeft, which handles the empty case just fine (because it has an explicit starting value).
This is also less efficient because we create a whole new collection (of strings) just to reduce it on the spot.
All in all, foldLeft is a much better choice here.
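For comparison, a short sketch of the options side by side. mkString is not mentioned above but is part of the standard collections, and reduceLeftOption is one way to avoid the exception on an empty list:

```scala
val xs = List(1, 2, 3, 4, 5)

// foldLeft: the accumulator is a String, the elements are Ints.
val viaFold: String = xs.foldLeft("")(_ + _)

// mkString concatenates the string form of every element.
val viaMkString: String = xs.mkString

// map + reduceLeftOption: returns None instead of throwing on an empty list.
val viaReduce: Option[String] = xs.map(_.toString).reduceLeftOption(_ + _)

val emptyFold: String = List.empty[Int].foldLeft("")(_ + _)
val emptyReduce: Option[String] = List.empty[Int].map(_.toString).reduceLeftOption(_ + _)
```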
It takes a little bit of work to make sure the types are understood correctly. Expanding them, though, you could use something like:
(xs reduceLeft ((a: Any, b: Int) => a + b.toString)).toString

What's the reasoning behind adding the "case" keyword to Scala?

Apart from:
case class A
... case which is quite useful?
Why do we need to use case in match? Wouldn't:
x match {
y if y > 0 => y * 2
_ => -1
}
... be much prettier and concise?
Or why do we need to use case when a function takes a tuple? Say, we have:
val z = List((1, -1), (2, -2), (3, -3)).zipWithIndex
Now, isn't:
z map { case ((a, b), i) => a + b + i }
... way uglier than just:
z map (((a, b), i) => a + b + i)
...?
First, as we know, a single case can contain several statements with no separator other than a line break, like this:
x match {
case y if y > 0 => y * 2
println("test")
println("test2") // these 3 statements belong to the same "case"
}
If case were not needed, the compiler would have to find another way to know where one case ends and the next begins.
For example:
x match {
y if y > 0 => y * 2
_ => -1
}
How would the compiler know whether _ => -1 belongs to the first case or starts the next one?
Moreover, how would the compiler know whether the => sign introduces a function literal or the body of the current case?
The compiler would certainly need some syntax like this to isolate the cases
(using curly braces, or anything else):
x match {
{y if y > 0 => y * 2}
{_ => -1} // easily confused with function literal notation
}
And surely the solution Scala currently provides, the case keyword, is a lot more readable and understandable than some other means of separation such as the curly braces in my example.
Adding to @Mik378's answer:
When you write this: (a, b) => something, you are defining an anonymous Function2 - a function that takes two parameters.
When you write this: case (a, b) => something, you are defining an anonymous PartialFunction that takes one parameter and matches it against a pair.
So you need the case keyword to differentiate between these two.
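A small sketch of that distinction, using the list from the question. The value names add, addPair and positiveDouble are made up for illustration:

```scala
val z = List((1, -1), (2, -2), (3, -3)).zipWithIndex

// Two parameters: an ordinary Function2.
val add: (Int, Int) => Int = (a, b) => a + b

// One parameter that happens to be a pair: the case literal destructures it.
val addPair: ((Int, Int)) => Int = { case (a, b) => a + b }

// map passes one element at a time, so the nested tuple needs the case form.
val sums = z.map { case ((a, b), i) => a + b + i }

// With a PartialFunction type the same syntax also records where it is defined.
val positiveDouble: PartialFunction[Int, Int] = { case y if y > 0 => y * 2 }
val doubled = List(-1, 2, 3).collect(positiveDouble)
```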
The second issue, anonymous functions that avoid the case, is a matter of debate:
https://groups.google.com/d/msg/scala-debate/Q0CTZNOekWk/z1eg3dTkCXoJ
Also: http://www.scala-lang.org/old/node/1260
For the first issue, the choice is whether you allow a block or an expression on the RHS of the arrow.
In practice, I find that shorter case bodies are usually preferable, so I can certainly imagine your alternative syntax resulting in crisper code.
Consider one-line methods. You write:
def f(x: Int) = 2 * x
then you need to add a statement. I don't know if the IDE is able to auto-add parens.
def f(x: Int) = { val res = 2*x ; res }
That seems no worse than requiring the same syntax for case bodies.
To review, a case clause is case Pattern Guard => body.
Currently, body is a block, or a sequence of statements and a result expression.
If body were an expression, you'd need braces for multiple statements, like a function.
I don't think => results in ambiguities since function literals don't qualify as patterns, unlike literals like 1 or "foo".
One snag might be: { case foo => ??? } is a "pattern matching anonymous function" (SLS 8.5). Obviously, if the case is optional or eliminated, then { foo => ??? } is ambiguous. You'd have to distinguish case clauses for anon funs (where case is required) and case clauses in a match.
One counter-argument for the current syntax is that, in an intuition deriving from C, you always secretly hope that your match will compile to a switch table. In that metaphor, the cases are labels to jump to, and a label is just the address of a sequence of statements.
The alternative syntax might encourage a more inlined approach:
x match {
C => c(x)
D => d(x)
_ => ???
}
@inline def c(x: X) = ???
//etc
In this form, it looks more like a dispatch table, and the match body recalls the Map syntax, Map(a -> 1, b -> 2), that is, a tidy simplification of the association.
One of the key aspects of code readability is the words that grab your attention. For example,
return grabs your attention when you see it because you know that it is such a decisive action (breaking out of the function and possibly sending a value back to the caller).
Another example is break--not that I like break, but it gets your attention.
I would agree with @Mik378 that case in Scala is more readable than the alternatives. Besides the compiler confusion he mentions, it gets your attention.
I am all for concise code, but there is a line between concise and illegible. I will gladly make the trade of 4n characters (where n is the number of cases) for the substantial readability that I get in return.

Idiomatic form of dealing with un-initialized var

I'm coding up my first Scala script to get a feel for the language, and I'm a bit stuck as to the best way to achieve something.
My situation is the following, I have a method which I need to call N times, this method returns an Int on each run (might be different, there's a random component to the execution), and I want to keep the best run (the smallest value returned on these runs).
Now, coming from a Java/Python background, I would simply initialize the variable with null/None, and compare in the if, something like:
best = None
for...
    result = executionOfThings()
    if(best is None or result < best):
        best = result
And that's that (pardon for the semi-python pseudo-code).
Now, on Scala, I'm struggling a bit. I've read about the usage of Option and pattern matching to achieve the same effect, and I guess I could code up something like (this was the best I could come up with):
best match {
case None => best = Some(res)
case Some(x) if x > res => best = Some(res)
case _ =>
}
I believe this works, but I'm not sure if it's the most idiomatic way of writing it. It's clear enough, but a bit verbose for such a simple "use-case".
Anyone that could shine a functional light on me?
Thanks.
For this particular problem, not in general, I would suggest initializing with Int.MaxValue as long as you're guaranteed that N >= 1. Then you just
if (result < best) best = result
You could also, with best as an option,
best = best.filter(_ <= result).orElse( Some(result) )
if the optionality is important (e.g. it is possible that N == 0, and you don't take a distinct path through the code in that case). This is a more general way to deal with optional values that may get replaced: use filter to keep the non-replaced cases, and orElse to fill in the replacement if needed.
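As a sketch of that filter/orElse pattern for keeping the smallest value: the current best survives the filter only when it is no larger than the new result, otherwise orElse supplies the replacement. The name keepSmaller and the sample results are made up for illustration:

```scala
// Hypothetical helper: keep `best` if it is still the smallest, else take `result`.
def keepSmaller(best: Option[Int], result: Int): Option[Int] =
  best.filter(_ <= result).orElse(Some(result))

// Stand-in for the values returned by N runs.
val results = List(7, 3, 9, 5)

// Starting from None covers the N == 0 case without a special path.
val best: Option[Int] = results.foldLeft(Option.empty[Int])(keepSmaller)
val noRuns: Option[Int] = List.empty[Int].foldLeft(Option.empty[Int])(keepSmaller)
```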
Just use the min function:
(for (...) yield executionOfThings()).min
Example:
((1 to 5).map (x => 4 * x * x - (x * x * x))).min
edit: adjusted to @user-unknown's suggestion
I would suggest you rethink your whole computation to be more functional. You mutate state, which should be avoided. I can think of a recursive version of your code:
def calcBest[A](xs: List[A])(f: A => Int): Int = {
  // inner helper that carries the best value seen so far
  def loop(xs: List[A], best: Int = Int.MaxValue): Int = xs match {
    // will match an empty list
    case Nil => best
    // x will hold the head of the list and rest the rest ;-)
    case x :: rest => loop(rest, math.min(f(x), best))
  }
  loop(xs)
}
callable with calcBest(List(7,5,3,8,2))(_*2) // => res0: Int = 4
With this you have no mutable state at all.
Another way would be to use foldLeft on the list:
list.foldLeft(Int.MaxValue) { case (best,x) => math.min(calculation(x),best) }
foldLeft takes an initial B and a function (B, A) => B, and returns a B. The { case (best, x) => ... } literal is just a pattern-matching way of writing that two-argument function.
Both ways are equivalent. The first one is probably faster, the second is more readable. Both traverse a list, call a function on each value, and return the smallest, which from your snippet is what you want, right?
I thought I would offer another idiomatic solution. You can use Iterator.continually to create an infinite-length iterator that's lazily evaluated, take(N) to limit the iterator to N elements, and use min to find the winner.
Iterator.continually { executionOfThings() }.take(N).min
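A runnable sketch of this, with a seeded random generator standing in for the real computation. The body of executionOfThings and the value of n are assumptions; minOption (Scala 2.13+) covers the case of zero runs:

```scala
import scala.util.Random

// Stand-in for the real method: any Int-returning computation would do.
val rng = new Random(42)
def executionOfThings(): Int = rng.nextInt(100)

val n = 10

// Lazily produce results, stop after n of them, keep the smallest.
val best: Int = Iterator.continually(executionOfThings()).take(n).min

// With zero runs, min would throw; minOption returns None instead.
val noRuns: Option[Int] = Iterator.continually(executionOfThings()).take(0).minOption
```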

difference between foldLeft and reduceLeft in Scala

I have learned the basic difference between foldLeft and reduceLeft
foldLeft:
initial value has to be passed
reduceLeft:
takes first element of the collection as initial value
throws exception if collection is empty
Is there any other difference ?
Any specific reason to have two methods with similar functionality?
Few things to mention here, before giving the actual answer:
Your question doesn't have anything to do with left, it's rather about the difference between reducing and folding
The difference is not the implementation at all, just look at the signatures.
The question doesn't have anything to do with Scala in particular, it's rather about the two concepts of functional programming.
Back to your question:
Here is the signature of foldLeft (could also have been foldRight for the point I'm going to make):
def foldLeft [B] (z: B)(f: (B, A) => B): B
And here is the signature of reduceLeft (again the direction doesn't matter here)
def reduceLeft [B >: A] (f: (B, A) => B): B
These two look very similar and thus caused the confusion. reduceLeft is a special case of foldLeft (which by the way means that you sometimes can express the same thing by using either of them).
When you call reduceLeft say on a List[Int] it will literally reduce the whole list of integers into a single value, which is going to be of type Int (or a supertype of Int, hence [B >: A]).
When you call foldLeft say on a List[Int] it will fold the whole list (imagine rolling a piece of paper) into a single value, but this value doesn't have to be even related to Int (hence [B]).
Here is an example:
def listWithSum(numbers: List[Int]) = numbers.foldLeft((List.empty[Int], 0)) {
(resultingTuple, currentInteger) =>
(currentInteger :: resultingTuple._1, currentInteger + resultingTuple._2)
}
This method takes a List[Int] and returns a Tuple2[List[Int], Int], which can also be written (List[Int], Int). It calculates the sum and returns a tuple with a list of integers and its sum. By the way, the list is returned backwards, because we used foldLeft instead of foldRight.
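To see the ordering effect, here is the same method next to a foldRight variant. This is only a sketch; listWithSumOrdered is a made-up name, and note that foldRight passes its arguments in the opposite order (element first, accumulator second):

```scala
// foldLeft walks left to right and prepends, so the list comes out reversed.
def listWithSum(numbers: List[Int]): (List[Int], Int) =
  numbers.foldLeft((List.empty[Int], 0)) { case ((acc, sum), n) =>
    (n :: acc, sum + n)
  }

// foldRight walks right to left and prepends, so the original order is kept.
def listWithSumOrdered(numbers: List[Int]): (List[Int], Int) =
  numbers.foldRight((List.empty[Int], 0)) { case (n, (acc, sum)) =>
    (n :: acc, sum + n)
  }
```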
Watch One Fold to rule them all for a more in depth explanation.
reduceLeft is just a convenience method. It is equivalent to
list.tail.foldLeft(list.head)(_)
foldLeft is more generic, you can use it to produce something completely different than what you originally put in. Whereas reduceLeft can only produce an end result of the same type or super type of the collection type. For example:
List(1,3,5).foldLeft(0) { _ + _ }
List(1,3,5).foldLeft(List[String]()) { (a, b) => b.toString :: a }
The foldLeft will apply the closure with the last folded result (first time using initial value) and the next value.
reduceLeft, on the other hand, first applies the closure to the first two values of the list. It then applies it to the cumulative result and each remaining value in turn. See:
List(1,3,5).reduceLeft { (a, b) => println("a " + a + ", b " + b); a + b }
If the list is empty foldLeft can present the initial value as a legal result. reduceLeft on the other hand does not have a legal value if it can't find at least one value in the list.
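The argument pairs each method hands to the closure can be recorded instead of printed. A small sketch, with the buffer names made up for illustration:

```scala
import scala.collection.mutable.ListBuffer

val reduceSteps = ListBuffer.empty[(Int, Int)]
val reduced = List(1, 3, 5).reduceLeft { (a, b) =>
  reduceSteps += ((a, b)) // first call sees the first two elements
  a + b
}

val foldSteps = ListBuffer.empty[(Int, Int)]
val folded = List(1, 3, 5).foldLeft(0) { (a, b) =>
  foldSteps += ((a, b)) // first call sees the initial value and the first element
  a + b
}
```

reduceLeft makes one call fewer than foldLeft, because the first element plays the role of the initial value.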
For reference, reduceLeft will error if applied to an empty container with the following error.
java.lang.UnsupportedOperationException: empty.reduceLeft
Reworking the code to use
myList.foldLeft("") {(a,b) => a+b} // assuming myList is a List[String]
is one potential option. Another is to use the reduceLeftOption variant which returns an Option wrapped result.
myList reduceLeftOption {(a,b) => a+b} match {
case None => // handle no result as necessary
case Some(v) => println(v)
}
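The three behaviours on an empty list can be compared directly. This sketch assumes myList is a List[String]; wrapping the failing call in Try is just a way to observe the exception without aborting:

```scala
import scala.util.Try

val myList = List.empty[String]

// foldLeft falls back to its initial value.
val viaFold: String = myList.foldLeft("")(_ + _)

// reduceLeftOption reports the missing result as None.
val viaOption: Option[String] = myList.reduceLeftOption(_ + _)

// reduceLeft throws UnsupportedOperationException on an empty list.
val viaReduce: Try[String] = Try(myList.reduceLeft(_ + _))

val message: String = viaOption match {
  case None    => "no result"
  case Some(v) => v
}
```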
The basic reason they are both in Scala standard library is probably because they are both in Haskell standard library (called foldl and foldl1). If reduceLeft wasn't, it would quite often be defined as a convenience method in different projects.
From Functional Programming Principles in Scala (Martin Odersky):
The function reduceLeft is defined in terms of a more general function, foldLeft.
foldLeft is like reduceLeft but takes an accumulator z, as an additional parameter, which is returned when foldLeft is called on an empty list:
(List (x1, ..., xn) foldLeft z)(op) = (...(z op x1) op ...) op xn
[as opposed to reduceLeft, which throws an exception when called on an empty list.]
The course (see lecture 5.5) provides abstract definitions of these functions, which illustrates their differences, although they are very similar in their use of pattern matching and recursion.
abstract class List[T] { ...
  def reduceLeft(op: (T, T) => T): T = this match {
    case Nil => throw new Error("Nil.reduceLeft")
    case x :: xs => (xs foldLeft x)(op)
  }
  def foldLeft[U](z: U)(op: (U, T) => U): U = this match {
    case Nil => z
    case x :: xs => (xs foldLeft op(z, x))(op)
  }
}
Note that foldLeft returns a value of type U, which is not necessarily the same as the element type T, whereas reduceLeft returns a value of the element type T.
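The course definitions live inside its own List class, so they cannot be pasted into a REPL as they are. A rough standalone equivalent over the standard List might look like this, with the names myFoldLeft and myReduceLeft made up for illustration:

```scala
// Same recursion as the course's foldLeft, written as a free function.
def myFoldLeft[T, U](xs: List[T])(z: U)(op: (U, T) => U): U = xs match {
  case Nil     => z
  case y :: ys => myFoldLeft(ys)(op(z, y))(op)
}

// reduceLeft is foldLeft seeded with the head, so the result type is T.
def myReduceLeft[T](xs: List[T])(op: (T, T) => T): T = xs match {
  case Nil     => throw new UnsupportedOperationException("Nil.reduceLeft")
  case y :: ys => myFoldLeft(ys)(y)(op)
}

val reduced = myReduceLeft(List(1, 2, 3))(_ - _)               // (1 - 2) - 3
val folded = myFoldLeft(List(1, 2, 3))(10)(_ - _)              // ((10 - 1) - 2) - 3
val asText = myFoldLeft(List(1, 2, 3))("")((acc, n) => acc + n) // U = String, T = Int
```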
To really understand what you are doing with fold/reduce, check this: http://wiki.tcl.tk/17983
It is a very good explanation. Once you get the concept of fold, reduce follows from the answer above:
list.tail.foldLeft(list.head)(_)
Scala 2.13.3, Demo:
val names = List("Foo", "Bar")
println("ReduceLeft: "+ names.reduceLeft(_+_))
println("ReduceRight: "+ names.reduceRight(_+_))
println("Fold: "+ names.fold("Other")(_+_))
println("FoldLeft: "+ names.foldLeft("Other")(_+_))
println("FoldRight: "+ names.foldRight("Other")(_+_))
outputs:
ReduceLeft: FooBar
ReduceRight: FooBar
Fold: OtherFooBar
FoldLeft: OtherFooBar
FoldRight: FooBarOther