count occurrences of elements [duplicate] - scala

This question already has answers here:
Scala how can I count the number of occurrences in a list
(17 answers)
Closed 5 years ago.
Counting all elements in a list is a one-liner in Haskell:
count xs = toList (fromListWith (+) [(x, 1) | x <- xs])
Here is an example usage:
*Main> count "haskell scala"
[(' ',1),('a',3),('c',1),('e',1),('h',1),('k',1),('l',3),('s',2)]
Can this function be expressed so elegantly in Scala as well?

scala> "haskell scala".groupBy(identity).mapValues(_.size).toSeq
res1: Seq[(Char, Int)] = ArrayBuffer((e,1), (s,2), (a,3), ( ,1), (l,3), (c,1), (h,1), (k,1))

Recall group from the Data.List library,
group :: [a] -> [[a]]
giving us:
map (head &&& length) . group . sort
a list-friendly and relatively "naive" implementation.

Another implementation:
def count[A](xs: Seq[A]): Seq[(A, Int)] = xs.distinct.map(x => (x, xs.count(_ == x)))

Going for a literal translation, let's try this:
// Implementing this one in Scala
def fromSeqWith[A, B](s: Seq[(A, B)])(f: (B, B) => B) =
s groupBy (_._1) mapValues (_ map (_._2) reduceLeft f)
def count[A](xs: Seq[A]) = fromSeqWith(xs map (_ -> 1))(_+_).toSeq
Scala's groupBy makes this more complex than it needs to be -- there have been calls for groupWith or groupInto, but they didn't make Odersky's standard for standard library inclusion.

Related

Function takes two List[Int] arguments and produces a List[Int]. SCALA [duplicate]

This question already has answers here:
Scala - Combine two lists in an alternating fashion
(4 answers)
Closed 3 years ago.
The elements of the resulting list should alternate between the elements of the arguments. Assume that the two arguments have the same length.
USE RECURSION
My code as follows
val finalString = new ListBuffer[Int]
val buff2= new ListBuffer[Int]
def alternate(xs:List[Int], ys:List[Int]):List[Int] = {
while (xs.nonEmpty) {
finalString += xs.head + ys.head
alternate(xs.tail,ys.tail)
}
return finalString.toList
}
EXPECTED RESULT:
alternate ( List (1 , 3, 5) , List (2 , 4, 6)) = List (1 , 2, 3, 4, 6)
As far for the output, I don't get any output. The program is still running and cannot be executed.
Are there any Scala experts?
There are a few problems with the recursive solutions suggested so far (including yours, which would actually work, if you replace while with if): appending to end of list is a linear operation, making the whole thing quadratic (taking a .length of a list too, as well ас accessing elements by index), don't do that; also, if the lists are long, a recursion may require a lot of space on the stack, you should be using tail-recursion whenever possible.
Here is a solution that is free of both those problems: it builds the output backwards, by prepending elements to the list (constant time operation) rather than appending them, and reverses the result at the end. It is also tail-recursive: the recursive call is the last operation in the function, which allows the compiler to convert it into a loop, so that it will only use a single stack frame for execution regardless of the size of the lists.
#tailrec
def alternate(
a: List[Int],
b: List[Int],
result: List[Int] = Nil
): List[Int] = (a,b) match {
case (Nil, _) | (_, Nil) => result.reversed
case (ah :: at, bh :: bt) => alternate(at, bt, bh :: ah :: result)
}
(if the lists are of different lengths, the whole thing stops when the shortest one ends, and whatever is left in the longer one is thrown out. You may want to modify the first case (split it into two, perhaps) if you desire a different behavior).
BTW, your own solution is actually better than most suggested here: it is actually tail recursive (or rather can be made one if you add else after your if, which is now while), and appending to ListBuffer isn't actually as bad as to a List. But using mutable state is generally considered "code smell" in scala, and should be avoided (that's one of the main ideas behind using recursion instead of loops in the first place).
Condition xs.nonEmpty is true always so you have infinite while loop.
Maybe you meant if instead of while.
A more Scala-ish approach would be something like:
def alternate(xs: List[Int], ys: List[Int]): List[Int] = {
xs.zip(ys).flatMap{case (x, y) => List(x, y)}
}
alternate(List(1,3,5), List(2,4,6))
// List(1, 2, 3, 4, 5, 6)
A recursive solution using match
def alternate[T](a: List[T], b: List[T]): List[T] =
(a, b) match {
case (h1::t1, h2::t2) =>
h1 +: h2 +: alternate(t1, t2)
case _ =>
a ++ b
}
This could be more efficient at the cost of clarity.
Update
This is the more efficient solution:
def alternate[T](a: List[T], b: List[T]): List[T] = {
#annotation.tailrec
def loop(a: List[T], b: List[T], res: List[T]): List[T] =
(a, b) match {
case (h1 :: t1, h2 :: t2) =>
loop(t1, t2, h2 +: h1 +: res)
case _ =>
a ++ b ++ res.reverse
}
loop(a, b, Nil)
}
This retains the original function signature but uses an inner function that is an efficient, tail-recursive implementation of the algorithm.
You're accessing variables from outside the method, which is bad. I would suggest something like the following:
object Main extends App {
val l1 = List(1, 3, 5)
val l2 = List(2, 4, 6)
def alternate[A](l1: List[A], l2: List[A]): List[A] = {
if (l1.isEmpty || l2.isEmpty) List()
else List(l1.head,l2.head) ++ alternate(l1.tail, l2.tail)
}
println(alternate(l1, l2))
}
Still recursive but without accessing state from outside the method.
Assuming both lists are of the same length, you can use a ListBuffer to build up the alternating list. alternate is a pure function:
import scala.collection.mutable.ListBuffer
object Alternate extends App {
def alternate[T](xs: List[T], ys: List[T]): List[T] = {
val buffer = new ListBuffer[T]
for ((x, y) <- xs.zip(ys)) {
buffer += x
buffer += y
}
buffer.toList
}
alternate(List(1, 3, 5), List(2, 4, 6)).foreach(println)
}

What's the diff between reduceLeft and reduceRight in Scala?

What's the diff between reduceLeft and reduceRight in Scala?
val list = List(1, 0, 0, 1, 1, 1)
val sum1 = list reduceLeft {_ + _}
val sum2 = list reduceRight {_ + _}
println { sum2 == sum2 }
In my snippet sum1 = sum2 = 4, so the order does not matter here.
When do they produce the same result
As Lionel already pointed out, reduceLeft and reduceRight only produce the same result if the function you are using to combine the elements is associative (this isn't always true, see my note at the bottom). For instance when running reduceLeft and reduceRight on Seq(1,2,3) with the function (a: Int, b: Int) => a - b you get a different result.
scala> Seq(1,2,3)
res0: Seq[Int] = List(1, 2, 3)
scala> res0.reduceLeft(_ - _)
res5: Int = -4
scala> res0.reduceRight(_ - _)
res6: Int = 2
Why this happens can be made clear if we look at how each of the functions is applied over the list.
For reduceRight this is what the calls look like if we were to unwrap them.
(1 - (2 - 3))
(1 - (-1))
2
For reduceLeft the sequence is built up starting from the left,
((1 - 2) - 3)
((-1) - 3)
(-4)
Tail Recursion
Further because reduceLeft is implemented using Tail Recursion, it will not stack overflow when operating on very large collections (possibly even infinite). reduceRight is not tail recursive, so given a collection of large enough size, it will produce a stack overflow.
For instance, on my machine if I run the following I get an Out of Memory error,
scala> (0 to 100000000).reduceRight(_ - _)
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.Integer.valueOf(Integer.java:832)
at scala.runtime.BoxesRunTime.boxToInteger(BoxesRunTime.java:65)
at scala.collection.immutable.Range.apply(Range.scala:61)
at scala.collection.IndexedSeqLike$Elements.next(IndexedSeqLike.scala:65)
at scala.collection.Iterator$class.foreach(Iterator.scala:742)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
at scala.collection.TraversableOnce$class.reversed(TraversableOnce.scala:99)
at scala.collection.AbstractIterator.reversed(Iterator.scala:1194)
at scala.collection.TraversableOnce$class.reduceRight(TraversableOnce.scala:197)
at scala.collection.AbstractIterator.reduceRight(Iterator.scala:1194)
at scala.collection.IterableLike$class.reduceRight(IterableLike.scala:85)
at scala.collection.AbstractIterable.reduceRight(Iterable.scala:54)
... 20 elided
But if I compute with reduceLeft I don't get the OOM,
scala> (0 to 100000000).reduceLeft(_ - _)
res16: Int = -987459712
You might get slightly different results on your system, depending your JVM default memory settings.
Prefer left versions
So, because of tail recursion, if you know that reduceLeft and reduceRight will produce the same value, you should prefer the reduceLeft variant. This generally holds true of the other left/right functions, such as foldRight and foldLeft (which are just more general versions of reduceRight and reduceLeft).
When do they really always produce the same result
A small note about reduceLeft and reduceRight and the Associative Property of the function you are using. I said that reduceRight and reduceLeft only produce the same results if the operator is associative. This isn't always true for all collection types. That is somewhat of another topic though, so consult the ScalaDoc, but in short the function you are reducing with needs to be both commutative and associative in order to get the same results for all collection types.
Reduce left doesn't always equal the same result as reduce right. Consider an asymmetric function on your array.
Assuming the same result, performance is one obvious difference
See performance-characteristics
The data structure is build with constant access time to head and tail. Iterating backwards will perform worse for large lists.
The best way to know the difference is read the source code in library/scala/collection/LinearSeqOptimized.scala:
def reduceLeft[B >: A](f: (B, A) => B): B =
......
tail.foldLeft[B](head)(f)
def reduceRight[B >: A](op: (A, B) => B): B =
......
op(head, tail.reduceRight(op))
def foldLeft[B](z: B)(f: (B, A) => B): B = {
var acc = z
var these = this
while (!these.isEmpty) {
acc = f(acc, these.head)
these = these.tail
}
acc
}
The above is some key part of the code, and you can see reduceLeft is based on foldLeft , while reduceRight is implemented via recursion.
I guess reduceLeft has a better performance.

for-expression to flatMap Conversion

The following for-expression seems intuitive to me. Take each item in List(1), then map over List("a"), and then return a List[(Int, String)].
scala> val x = for {
| a <- List(1)
| b <- List("a")
| } yield (a,b)
x: List[(Int, String)] = List((1,a))
Now, converting it to a flatMap, it seems less clear to me. If I understand correctly, I need to call flatMap first since I'm taking the initial List(1), and then applying a function to convert from A => List[B].
scala> List(1).flatMap(a => List("a").map(b => (a,b) ))
res0: List[(Int, String)] = List((1,a))
After using the flatMap, it seemed necessary to use a map since I needed to go from A => B.
But, as the number of items increases in the for-expression (say 2 to 3 items), how do I know whether to use a map or flatMap when converting from for-expression to flatMap?
In using the for comprehension you always flatMap until the last value that you extract which you map. So if you have three items:
for {
a <- List("a")
b <- List("b")
c <- List("c")
} yield (a, b, c)
It would be the same as:
List("a").flatMap(a => List("b").flatMap(b => List("c").map(c => (a, b, c))))
If you look at the signature of flatMap it's A => M[B]. So as we add elements to the for comprehension we need to flatMap them in since we continue to add M[B] to the comprehension. When we get to the last element, there's nothing left to add so we use map since we just want to go from A => B. Hope that makes sense, if not take you should watch some of the videos in the Reactive Programming class on Coursera as they go over this quite a bit.

Invoke a method using a tuple as the parameter list [duplicate]

This question already has answers here:
scala tuple unpacking
(5 answers)
Closed 7 years ago.
I wonder what is the best way to do this.
val foo = Some("a")
val bar = Some(2)
def baz(a: String, b: Int) = if((b % 2) == 0) Some(a+","+b) else None
(x zip y) flatMap baz //does not compile of course
(x zip y) flatMap { x => baz(x._1, x._2) } //ugly
I would presume that Odersky et al. have another trick up theirs sleeve to reduce the noise in this example.
So the question is how to fight the clutter here assuming you are not allowed to change the implementation of baz (e.g. def baz(a: (String Int))).
This question has already been answered here: scala tuple unpacking
First, make foo to a function by partial application, then call tupled with your parameter list:
(foo _).tupled(myTuple)
The most common way to cleanly unpack anything is with patterns or for-comprehensions:
for ((a,b) <- (x zip y); c <- baz(a,b)) yield(c)

why does filter have to be defined for pattern matching in a for loop in scala?

To create a new class that can be used in a Scala for comprehension, it seems that all you have to do is define a map function:
scala> class C[T](items: T*) {
| def map[U](f: (T) => U) = this.items.map(f)
| }
defined class C
scala> for (x <- new C(1 -> 2, 3 -> 4)) yield x
res0: Seq[(Int, Int)] = ArrayBuffer((1,2), (3,4))
But that only works for simple for loops where there is no pattern matching on the left hand side of <-. If you try to pattern match there, you get a complaint that the filter method is not defined:
scala> for ((k, v) <- new C(1 -> 2, 3 -> 4)) yield k -> v
<console>:7: error: value filter is not a member of C[(Int, Int)]
for ((k, v) <- new C(1 -> 2, 3 -> 4)) yield k -> v
Why is filter required to implement the pattern matching here? I would have thought Scala would just translate the above loop into the equivalent map call:
scala> new C(1 -> 2, 3 -> 4).map{case (k, v) => k -> v}
res2: Seq[(Int, Int)] = ArrayBuffer((1,2), (3,4))
But that seems to work fine, so the for loop must be translated into something else. What is it translated into that needs the filter method?
The short answer: according to the Scala specs, you shouldn't need to define a 'filter' method for the example you gave, but there is an open bug that means it is currently required.
The long answer: the desugaring algorithm applied to for comprehensions is described in the Scala language specification. Let's start with section 6.19 "For Comprehensions and For Loops" (I'm looking at version 2.9 of the specification):
In a first step, every generator p <- e, where p is not irrefutable (§8.1) for the type of e is replaced by p <- e.withFilter { case p => true; case _ => false }
The important point for your question is whether the pattern in the comprehension is "irrefutable" for the given expression or not. (The pattern is the bit before the '<-'; the expression is the bit afterwards.) If it is "irrefutable" then the withFilter will not be added, otherwise it will be needed.
Fine, but what does "irrefutable" mean? Skip ahead to section 8.1.14 of the spec ("Irrefutable Patterns"). Roughly speaking, if the compiler can prove that the pattern cannot fail when matching the expression then the pattern is irrefutable and the withFilter call will not be added.
Now your example that works as expected is the first type of irrefutable pattern from section 8.1.14, a variable pattern. So the first example is easy for the compiler to determine that withFilter is not required.
Your second example is potentially the third type of irrefutable pattern, a constructor pattern. Trying to match (k,v) which is Tuple2[Any,Any] against a Tuple2[Int,Int] (see section 8.1.6 and 8.1.7 from the specification)) succeeds since Int is irrefutable for Any. Therefore the second pattern is also irrefutable and doesn't (shouldn't) need a withFilter method.
In Daniel's example, Tuple2[Any,Any] isn't irrefutable against Any, so the withFilter calls gets added.
By the way, the error message talks about a filter method but the spec talks about withFilter - it was changed with Scala 2.8, see this question and answer for the gory details.
See the difference:
scala> for ((k, v) <- List(1 -> 2, 3 -> 4, 5)) yield k -> v
res22: List[(Any, Any)] = List((1,2), (3,4))
scala> List(1 -> 2, 3 -> 4, 5).map{case (k, v) => k -> v}
scala.MatchError: 5