combine two lists with same keys - scala

Here's a quite simple request to combine two lists as following:
scala> list1
res17: List[(Int, Double)] = List((1,0.1), (2,0.2), (3,0.3), (4,0.4))
scala> list2
res18: List[(Int, String)] = List((1,aaa), (2,bbb), (3,ccc), (4,ddd))
The desired output is as:
((aaa,0.1),(bbb,0.2),(ccc,0.3),(ddd,0.4))
I tried:
scala> (list1 ++ list2)
res23: List[(Int, Any)] = List((1,0.1), (2,0.2), (3,0.3), (4,0.4),
(1,aaa), (2,bbb), (3,ccc), (4,ddd))
But:
scala> (list1 ++ list2).groupByKey
<console>:10: error: value groupByKey is not a member of List[(Int,
Any)](list1 ++ list2).groupByKey
Any hints? Thanks!

The method you're looking for is groupBy:
(list1 ++ list2).groupBy(_._1)
If you know that for each key you have exactly two values, you can join them:
scala> val pairs = List((1, "a1"), (2, "b1"), (1, "a2"), (2, "b2"))
pairs: List[(Int, String)] = List((1,a1), (2,b1), (1,a2), (2,b2))
scala> pairs.groupBy(_._1).values.map {
| case List((_, v1), (_, v2)) => (v1, v2)
| }
res0: Iterable[(String, String)] = List((b1,b2), (a1,a2))
Another approach using zip is possible if the two lists contain the same keys in the same order:
scala> val l1 = List((1, "a1"), (2, "b1"))
l1: List[(Int, String)] = List((1,a1), (2,b1))
scala> val l2 = List((1, "a2"), (2, "b2"))
l2: List[(Int, String)] = List((1,a2), (2,b2))
scala> l1.zip(l2).map { case ((_, v1), (_, v2)) => (v1, v2) }
res1: List[(String, String)] = List((a1,a2), (b1,b2))

Here's a quick one-liner:
scala> list2.map(_._2) zip list1.map(_._2)
res0: List[(String, Double)] = List((aaa,0.1), (bbb,0.2), (ccc,0.3), (ddd,0.4))
If you are unsure why this works then read on! I'll expand it step by step:
list2.map(<function>)
The map method iterates over each value in list2 and applies your function to it. In this case each of the values in list2 is a Tuple2 (a tuple with two values). What you are wanting to do is access the second tuple value. To access the first tuple value use the ._1 method and to access the second tuple value use the ._2 method. Here is an example:
val myTuple = (1.0, "hello") // A Tuple2
println(myTuple._1) // prints "1.0"
println(myTuple._2) // prints "hello"
So what we want is a function literal that takes one parameter (the current value in the list) and returns the second tuple value (._2). We could have written the function literal like this:
list2.map(item => item._2)
We don't need to specify a type for item because the compiler is smart enough to infer it thanks to target typing. A really helpful shortcut is that we can just leave out item altogether and replace it with a single underscore _. So it gets simplified (or cryptified, depending on how you view it) to:
list2.map(_._2)
The other interesting part about this one liner is the zip method. All zip does is take two lists and combines them into one list just like the zipper on your favorite hoodie does!
val a = List("a", "b", "c")
val b = List(1, 2, 3)
a zip b // returns ((a,1), (b,2), (c,3))
Cheers!

Related

Understanding Scala code: (-_._2)

Help me understand this Scala code:
sortBy(-_._2)
I understand that the first underscore (_) is a placeholder. I understand that _2 means the second member of a Tuple.
But what does a minus (-) stand for in this code?
Reverse order (i.e. descending), you sort by minus the second field of the tuple
The underscore is an anonymous parameter, so -_ is basically the same as x => -x
Some examples in plain scala:
scala> List(1,2,3).sortBy(-_)
res0: List[Int] = List(3, 2, 1)
scala> List("a"->1,"b"->2, "c"->3).sortBy(-_._2)
res1: List[(String, Int)] = List((c,3), (b,2), (a,1))
scala> List(1,2,3).sortBy(x => -x)
res2: List[Int] = List(3, 2, 1)
Sort by sorts by ascending order as default. To inverse the order a - (Minus) can be prepended, as already explained by #TrustNoOne .
So sortBy(-_._2) sorts by the second value of a Tuple2 but in reverse order.
A longer example:
scala> Map("a"->1,"b"->2, "c"->3).toList.sortBy(-_._2)
res1: List[(String, Int)] = List((c,3), (b,2), (a,1))
is the same as
scala> Map("a"->1,"b"->2, "c"->3).toList sortBy { case (key,value) => - value }
res1: List[(String, Int)] = List((c,3), (b,2), (a,1))

Scala Iteratively build lists

As a Scala beginner I am still struggling working with immutable lists. All I am trying to do append elements to my list. Here's an example of what I am trying to do.
val list = Seq()::Nil
val listOfInts = List(1,2,3)
listOfInts.foreach {case x=>
list::List(x)
}
expecting that I would end up with a list of lists: List(List(1),List(2),List(3))
Coming from java I am used to just using list.add(new ArrayList(i)) to get the same result. Am I way off here?
Since the List is immutable you can not modify the List in place.
To construct a List of 1 item Lists from a List, you can map over the List. The difference between forEach and map is that forEach returns nothing, i.e. Unit, while map returns a List from the returns of some function.
scala> def makeSingleList(j:Int):List[Int] = List(j)
makeSingleList: (j: Int)List[Int]
scala> listOfInts.map(makeSingleList)
res1: List[List[Int]] = List(List(1), List(2), List(3))
Below is copy and pasted from the Scala REPL with added print statement to see what is happening:
scala> val list = Seq()::Nil
list: List[Seq[Nothing]] = List(List())
scala> val listOfInts = List(1,2,3)
listOfInts: List[Int] = List(1, 2, 3)
scala> listOfInts.foreach { case x=>
| println(list::List(x))
| }
List(List(List()), 1)
List(List(List()), 2)
List(List(List()), 3)
During the first iteration of the foreach loop, you are actually taking the first element of listOfInts (which is 1), putting that in a new list (which is List(1)), and then adding the new element list (which is List(List()) ) to the beginning of List(1). This is why it prints out List(List(List()), 1).
Since your list and listOfInts are both immutable, you can't change them. All you can do is perform something on them, and then return a new list with the change. In your case list::List(x) inside the loop actually doesnt do anything you can see unless you print it out.
There are tutorials on the documentation page.
There is a blurb for ListBuffer, if you swing that way.
Otherwise,
scala> var xs = List.empty[List[Int]]
xs: List[List[Int]] = List()
scala> (1 to 10) foreach (i => xs = xs :+ List(i))
scala> xs
res9: List[List[Int]] = List(List(1), List(2), List(3), List(4), List(5), List(6), List(7), List(8), List(9), List(10))
You have a choice of using a mutable builder like ListBuffer or a local var and returning the collection you build.
In the functional world, you often build by prepending and then reverse:
scala> var xs = List.empty[List[Int]]
xs: List[List[Int]] = List()
scala> (1 to 10) foreach (i => xs = List(i) :: xs)
scala> xs.reverse
res11: List[List[Int]] = List(List(1), List(2), List(3), List(4), List(5), List(6), List(7), List(8), List(9), List(10))
Given val listOfInts = List(1,2,3), and you want the final result as List(List(1),List(2),List(3)).
Another nice trick I can think of is sliding(Groups elements in fixed size blocks by passing a "sliding window" over them)
scala> val listOfInts = List(1,2,3)
listOfInts: List[Int] = List(1, 2, 3)
scala> listOfInts.sliding(1)
res6: Iterator[List[Int]] = non-empty iterator
scala> listOfInts.sliding(1).toList
res7: List[List[Int]] = List(List(1), List(2), List(3))
// If pass 2 in sliding, it will be like
scala> listOfInts.sliding(2).toList
res8: List[List[Int]] = List(List(1, 2), List(2, 3))
For more about the sliding, you can have a read about sliding in scala.collection.IterableLike.
You can simply map over this list to create a List of Lists.
It maintains Immutability and functional approach.
scala> List(1,2,3).map(List(_))
res0: List[List[Int]] = List(List(1), List(2), List(3))
Or you, can also use Tail Recursion :
#annotation.tailrec
def f(l:List[Int],res:List[List[Int]]=Nil) :List[List[Int]] = {
if(l.isEmpty) res else f(l.tail,res :+ List(l.head))
}
scala> f(List(1,2,3))
res1: List[List[Int]] = List(List(1), List(2), List(3))
In scala you have two (three, as #som-snytt has shown) options -- opt for a mutable collection (like Buffer):
scala> val xs = collection.mutable.Buffer(1)
// xs: scala.collection.mutable.Buffer[Int] = ArrayBuffer(1)
scala> xs += 2
// res10: xs.type = ArrayBuffer(1, 2)
scala> xs += 3
// res11: xs.type = ArrayBuffer(1, 2, 3)
As you can see, it works just like you would work with lists in Java. The other option you have, and in fact it's highly encouraged, is to opt to processing list functionally, that's it, you take some function and apply it to each and every element of collection:
scala> val ys = List(1,2,3,4).map(x => x + 1)
// ys: List[Int] = List(2, 3, 4, 5)
scala> def isEven(x: Int) = x % 2 == 0
// isEven: (x: Int)Boolean
scala> val zs = List(1,2,3,4).map(x => x * 10).filter(isEven)
// zs: List[Int] = List(10, 20, 30, 40)
// input: List(1,2,3)
// expected output: List(List(1), List(2), List(3))
val myList: List[Int] = List(1,2,3)
val currentResult = List()
def buildIteratively(input: List[Int], currentOutput: List[List[Int]]): List[List[Int]] = input match {
case Nil => currentOutput
case x::xs => buildIteratively(xs, List(x) :: currentOutput)
}
val result = buildIteratively(myList, currentResult).reverse
You say in your question that the list is immutable, so you do are aware that you cannot mutate it ! All operations on Scala lists return a new list. By the way, even in Java using a foreach to populate a collection is considered a bad practice. The Scala idiom for your use-case is :
list ::: listOfInts
Shorter, clearer, more functional, more idiomatic and easier to reason about (mutability make things more "complicated" especially when writing lambda expressions because it breaks the semantic of a pure function). There is no good reason to give you a different answer.
If you want mutability, probably for performance purposes, use a mutable collection such as ArrayBuffer.

Functional Creating a list based on values in Scala

I have a task to traverse a sequence of tuples and based on last value in the tuple make 1 or more copies of a case class Item. I can solve this task with foreach and Mutable List. As I'm learning FP and Scala collections could it be done more functional way with immutable collections and high order functions in Scala?
For example, input:
List[("A", 2), ("B", 3), ...]
Output:
List[Item("A"), Item("A"), Item("B"),Item("B"),Item("B"), ...]
For each tuple flatMap using List.fill[A](n: Int)(elem: ⇒ A) which produces a List of elem n times.
scala> val xs = List(("A", 2), ("B", 3), ("C", 4))
xs: List[(String, Int)] = List((A,2), (B,3), (C,4))
scala> case class Item(s: String)
defined class Item
scala> xs.flatMap(x => List.fill(x._2)(Item(x._1)))
res2: List[Item] = List(Item(A), Item(A), Item(B), Item(B), Item(B), Item(C), Item(C), Item(C), Item(C))
Using flatten for case class Item(v: String) as follows
myList.map{ case(s,n) => List.fill(n)(Item(s)) }.flatten
Also with a for comprehension like this,
for ( (s,n) <- myList ; l <- List.fill(n)(Item(s)) ) yield l
which is syntax sugar for a call to flatMap.
In addition to List.fill consider List.tabulate for initialising lists, for instance in this way,
for ( (s,n) <- myList ; l <- List.tabulate(n)(_ => Item(s)) ) yield l

How can I group the a list into tuple grouped items in Scala?

for example how can I convert
val list=(1 to 10).toList
into
List((1,2),(3,4),(5,6),(7,8),(9,10))
You may use grouped method for the List class: http://www.scala-lang.org/api/current/index.html#scala.collection.immutable.List
list.grouped(2).toList.collect { case a :: b :: Nil => (a,b) }
res1: List[(Int, Int)] = List((1,2), (3,4), (5,6), (7,8), (9,10))
collect is used to convert list of lists to list of tuples.

Different ways to define lists in Scala

In Scala, if list1 below is not a list what is it?
scala> val list1 = (1,2,3)
res11: (Int, Int, Int) = (1,2,3)
scala> val list2 = List(1,2,3)
list2: List[Int] = List(1, 2, 3)
list1 is a Tuple or more specifically a Tuple3 since it contains three elements.
Tuples can contain different types where as a List must contain elements of type you specify if it cannot be inferred.