Scala - How to group elements by occurrence? - scala

Is there a function in scala that groups all elements of a list by the number of these occurrences?
For example, I have this list:
val x = List("c", "b", "b", "c", "a", "d", "c")
And I want to get a new list like that:
x = List((3, "c"), (2, "b"), (1, "a"), (1, "d"))

You can first count the occurrences of each element and then reverse the resulting tuples:
List("c", "b", "b", "c", "a", "d", "c")
.groupBy(identity).mapValues(_.size) // Map(b -> 2, d -> 1, a -> 1, c -> 3)
.toList // List((b,2), (d,1), (a,1), (c,3))
.map{ case (k, v) => (v, k) } // List((2,b), (1,d), (1,a), (3,c))
You don't specifically mention a notion of order for the output, but if this was a requirement, this solution would need to be adapted.

Try this to get exactly what you want in the order you mentioned. (ie., order preserved in the List while taking counts):
x.distinct.map(v=>(x.filter(_==v).size,v))
In SCALA REPL:
scala> val x = List("c", "b", "b", "c", "a", "d", "c")
x: List[String] = List(c, b, b, c, a, d, c)
scala> x.distinct.map(v=>(x.filter(_==v).size,v))
res225: List[(Int, String)] = List((3,c), (2,b), (1,a), (1,d))
scala>

Related

Splitting a Scala List into parts using a given list of part sizes.[Partitioning]

I got two lists:
val list1:List[Int] = List(5, 2, 6)
val list2:List[Any] = List("a", "b", "c", "d", "e", "f", "g", "h", "i", "j","k")
such that list1.sum >= list2.size
I want a list of lists formed with elements in list2 consecutively
with the sizes mentioned in list1.
For example:
if list1 is List(5,2,4) the result I want is:
List(List("a", "b", "c", "d", "e"),List("f", "g"),List("h", "i", "j","k"))
if list1 is List(5,4,6) the result I want is:
List(List("a", "b", "c", "d", "e"),List("f", "g","h", "i"),List("j","k"))
How can I do that with concise code.
Turn list2 into an Iterator then map over list1.
val itr = list2.iterator
list1.map(itr.take(_).toList)
//res0: List[List[Any]] = List(List(a, b, c, d, e), List(f, g), List(h, i, j, k))
update:
While this appears to give the desired results, it has been pointed out elsewhere that reusing the iterator is actually unsafe and its behavior is not guaranteed.
With some modifications a safer version can be achieved.
val itr = list2.iterator
list1.map(List.fill(_)(if (itr.hasNext) Some(itr.next) else None).flatten)
-- or --
import util.Try
val itr = list2.iterator
list1.map(List.fill(_)(Try(itr.next).toOption).flatten)
you can slice on list2 based on the size you get from list1,
def main(args: Array[String]): Unit = {
val groupingList: List[Int] = List(5, 2, 4)
val data = List("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k")
//0, 5
//5, (5 + 2)
//5 + 2, (5 + 2 + 4)
val grouped = groupingList.zipWithIndex.map { case (_, index) =>
val start = groupingList.slice(0, index).sum
val end = groupingList.slice(0, index + 1).sum
data.slice(start, end)
}
println(grouped)
}
result: List(List(a, b, c, d, e), List(f, g), List(h, i, j, k))
Also read: How slice works
i like show it using ex
list1=List("one","two","three")
List[String] = List(one, two, three)
list2=List("red","green","blue")
List[String] = List(red, green, blue)
list1::list2
List[java.io.Serializable] = List(List(one, two, three), red, green, blue)
var listSum=list1::list2
List[java.io.Serializable] = List(List(one, two, three), red, green, blue)
listSum
List[java.io.Serializable] = List(List(one, two, three), red, green, blue)
we can "::" for insert one list to another list in Scala
Just found by me, even the following code worked, but the code posted by jwvh appears more concise than this, and I understand making list1 a Vector makes more sense regarding performance of element access in the following code:
list1.scan(0)(_+_).sliding(2).toList.map(x=>list2.drop(x(0)).take(x(1)-x(0)))
or
list1.scan(0)(_+_).zip(list1).map(x=>list2.drop(x._1).take(x._2))
list1.scan(0)(_+_).sliding(2).map(x=>list2.slice(x(0),x(1))).toList

Scala number of occurences of element in list of lists

To make it simple, let's imagine I have the following input:
List(List("A", "A"), List("A", "B"), List("B", "C"), List("B", "C"))
How would it be possible to group the elements inside the lists in such fashion so that I would know how many lists are they in. For example, following the output of a mapValues function just to illustrate what I mean, the result of the previous input should be something like:
Map("A" -> 2, "B" -> 3, "C" -> 2)
Just to be sure I made clear what I mean, a way to interpret the result would be to say that "A" is present in 2 of the sub-lists (regardless of how many times it appears inside of a particular sub-list), "B" is present in 3 of the sub-lists and "C" is in 2. I just want a way to map how many different sub-lists each of the individual elements are present in.
Disregarding performance, this would work:
val list = List(List("A", "A"), List("A", "B"), List("B", "C"), List("B", "C"))
val elements = list.flatten.distinct
elements.map(el => el -> list.count(_.contains(el))).toMap
You can also use fold operation to like this
list.flatten.foldLeft(Map.empty[String, Int])((map, word) => map + (word -> (map.getOrElse(word,0) + 1)))
//scala> res2: scala.collection.immutable.Map[String,Int] = Map(A -> 3, B -> 3, C -> 2)

Behaviour of Options inside for comprehension is Scala

Two newbie questions.
It seems that for comprehension knows about Options and can skip automatically None and unwrap Some, e.g.
val x = Map("a" -> List(1,2,3), "b" -> List(4,5,6), "c" -> List(7,8,9))
val r = for {map_key <- List("WRONG_KEY", "a", "b", "c")
map_value <- x get map_key } yield map_value
outputs:
r: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9))
Where do the Options go? Can someone please shed some light on how does this work? Can we always rely on this behaviour?
The second things is why this does not compile?
val x = Map("a" -> List(1,2,3), "b" -> List(4,5,6), "c" -> List(7,8,9))
val r = for {map_key <- List("WRONG_KEY", "a", "b", "c")
map_value <- x get map_key
list_value <- map_value
} yield list_value
It gives
Error:(57, 26) type mismatch;
found : List[Int]
required: Option[?]
list_value <- map_value
^
Looking at the type of the first example, I am not sure why we need to have an Option here?
For comprehensions are converted into calls to sequence of map or flatMap calls. See here
Your for loop is equivalent to
List("WRONG_KEY", "a", "b", "c").flatMap(
map_key => x.get(map_key).flatMap(map_value => map_value)
)
flatMap in Option is defined as
#inline final def flatMap[B](f: A => Option[B]): Option[B]
So it is not allowed to pass List as argument as you are notified by compiler.
I think the difference is due to the way for comprehensions are expanded into map() and flatMap method calls within the Seq trait.
For conciseness, lets define some variables:
var keys = List("WRONG_KEY", a, b, c)
Your first case is equivalent to:
val r = keys.flatMap(x.get(_))
whereas your second case is equivalent to:
val r= keys.flatMap(x.get(_).flatMap{ case y => y })
I think the issue is that Option.flatMap() should return an Option[], which is fine in the first case, but is not consistent in the second case with what the x.get().flatMap is passed, which is a List[Int].
These for-comprehension translation rules are explained in further detail in chapter 7 of "Programming Scala" by Wampler & Payne.
Maybe this small difference, setting parenthesis and calling flatten, makes it clear:
val r = for {map_key <- List("WRONG_KEY", "a", "b", "c")
| } yield x get map_key
r: List[Option[List[Int]]] = List(None, Some(List(1, 2, 3)), Some(List(4, 5, 6)), Some(List(7, 8, 9)))
val r = (for {map_key <- List("WRONG_KEY", "a", "b", "c")
| } yield x get map_key).flatten
r: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9))
That's equivalent to:
scala> List("WRONG_KEY", "a", "b", "c").map (x get _)
res81: List[Option[List[Int]]] = List(None, Some(List(1, 2, 3)), Some(List(4, 5, 6)), Some(List(7, 8, 9)))
scala> List("WRONG_KEY", "a", "b", "c").map (x get _).flatten
res82: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9))
The intermediate value (map_key) vanished as _ in the second block.
You are mixing up two different monads (List and Option) inside the for statement. This sometimes works as expected, but not always. In any case, you can trasform options into lists yourself:
for {
map_key <- List("WRONG_KEY", "a", "b", "c")
list_value <- x get map_key getOrElse Nil
} yield list_value

Scala collections: array groupBy and return array indexes for each group

I have an array, something like that:
val a = Array("a", "c", "c", "z", "c", "b", "a")
and I want to get a map with keys of all different values of this array and values with a collection of relevant indexes for each such group, i.e. for a given array the answer would be:
Map(
"a" -> Array(0, 6),
"b" -> Array(5),
"c" -> Array(1, 2, 4),
"z" -> Array(3)
)
Surprisingly, it proved to be somewhat more complicated that I've anticipated. The best I've came so far with is:
a.zipWithIndex.groupBy {
case(cnt, idx) => cnt
}.map {
case(cnt, arr) => (cnt, arr.map {
case(k, v) => v
}
}
which is not either concise or easy to understand. Any better ideas?
Your code can be rewritten as oneliner, but it looks ugly.
as.zipWithIndex.groupBy(_._1).mapValues(_.map(_._2))
Another way is to use mutable.MultiMap
import collection.mutable.{ HashMap, MultiMap, Set }
val as = Array("a", "c", "c", "z", "c", "b", "a")
val mm = new HashMap[String, Set[Int]] with MultiMap[String, Int]
and then just add every binding
as.zipWithIndex foreach (mm.addBinding _).tupled
//mm = Map(z -> Set(3), b -> Set(5), a -> Set(0, 6), c -> Set(1, 2, 4))
finally you can convert it mm.toMap if you want immutable version.
Here's a version with foldRight. I think it's reasonably clear.
val a = Array("a", "c", "c", "z", "c", "b", "a")
a
.zipWithIndex
.foldRight(Map[String, List[Int]]())
{case ((e,i), m)=> m updated (e, i::m.getOrElse(e, Nil))}
//> res0: scala.collection.immutable.Map[String,List[Int]] = Map(a -> List(0, 6)
//| , b -> List(5), c -> List(1, 2, 4), z -> List(3))
Another version using foldLeft and an immutable Map with default value:
val a = Array("a", "c", "c", "z", "c", "b", "a")
a.zipWithIndex.foldLeft(Map[String, List[Int]]().withDefaultValue(Nil))( (m, p) => m + ((p._1, p._2 +: m(p._1))))
// res6: scala.collection.immutable.Map[String,List[Int]] = Map(a -> List(6, 0), c -> List(4, 2, 1), z -> List(3), b -> List(5))
Starting in Scala 2.13, we can use the new groupMap which (as its name suggests) is a one-pass equivalent of a groupBy and a mapping over grouped items:
// val a = Array("a", "c", "c", "z", "c", "b", "a")
a.zipWithIndex.groupMap(_._1)(_._2)
// Map("z" -> Array(3), "b" -> Array(5), "a" -> Array(0, 6), "c" -> Array(1, 2, 4))
This:
zips each item with its index, giving (item, index) tuples
groups elements based on their first tuple part (_._1) (group part of groupMap)
maps grouped values to their second tuple part (_._2 i.e. their index) (map part of groupMap)

Create Map holding a # of appearance for each List element [duplicate]

This question already has answers here:
Count occurrences of each element in a List[List[T]] in Scala
(3 answers)
Closed 9 years ago.
I'm new to Scala.
If I have the following List:
val ls = List("a", "a", "a", "b", "b", "c")
how can I create a Map that holds an number of appearances for every element in the list?
For example the Map for the list above should be:
Map("a" -> 3, "b" -> 2, "c" -> 1)
list.foldLeft(Map[String, Int]() withDefaultValue 0) { (m, x) => m + (x -> (m(x) + 1)) }
snippet in action:
scala> val list = List("a", "a", "b", "c", "c", "a")
list: List[String] = List(a, a, b, c, c, a)
scala> list.foldLeft(Map[String, Int]() withDefaultValue 0) { (m, x) => m + (x -> (1 + m(x))) }
res1: scala.collection.immutable.Map[String,Int] = Map(a -> 3, b -> 1, c -> 2)
(directly based on Count occurrences of each element in a List[List[T]] in Scala)
Not as efficient as Erik's foldLeft solution:
val ls = List("a", "a", "a", "b", "b", "c")
ls.groupBy(identity).mapValues(_.size)
res0: scala.collection.immutable.Map[String,Int] = Map(a -> 3, c -> 1, b -> 2)
With scalaz,
xs foldMap (x => Map(x -> 1))