Scala, converting multiple lists to list of tuples [duplicate] - scala

This question already has answers here:
Can I zip more than two lists together in Scala?
(11 answers)
Closed 9 years ago.
I have 3 lists like
val a = List("a", "b", "c")
val b = List(1, 2, 3)
val c = List(4, 5, 6)
I want to convert them as follows:
List(("a", 1, 4), ("b", 2, 5), ("c", 3, 6))
Please let me know how to get this result.

If you have two or three lists that you need zipped together, you can use zipped:
val a = List("a", "b", "c")
val b = List(1, 2, 3)
val c = List(4, 5, 6)
(a,b,c).zipped.toList
This results in: List((a,1,4), (b,2,5), (c,3,6))
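On Scala 2.13 and later, zipped on tuples is deprecated in favor of lazyZip. A minimal sketch of the same three-way zip, assuming Scala 2.13+ (the name zipped3 is only illustrative):

```scala
val a = List("a", "b", "c")
val b = List(1, 2, 3)
val c = List(4, 5, 6)

// lazyZip chains without building an intermediate list of pairs;
// map receives the three elements as separate arguments.
val zipped3 = a.lazyZip(b).lazyZip(c).map((x, y, z) => (x, y, z))
```

As with zipped, the result stops at the length of the shortest input.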

Should be easy to achieve:
(a zip b) zip c map {
  case ((x, y), z) => (x, y, z)
}

Use:
(a zip b) zip c map { case ((av,bv),cv) => (av,bv,cv) }
Note: This truncates the result list to the length of the shortest of a, b, c. If you'd rather have the result list padded with default values, use zipAll.
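A short sketch of the zipAll variant, using lists of unequal length. The padding values "?" and 0 and the names xs, ys, zs and padded are assumptions chosen for illustration:

```scala
val xs = List("a", "b", "c")
val ys = List(1, 2)
val zs = List(4)

// First zipAll pads the shorter of xs/ys; the second pads the pair list
// (with a default pair) or zs (with 0), whichever runs out first.
val padded = xs.zipAll(ys, "?", 0)
  .zipAll(zs, ("?", 0), 0)
  .map { case ((x, y), z) => (x, y, z) }
```

Here the result has as many elements as the longest input, with the defaults filling the gaps.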

Related

How to combine two lists into tuple list (like Cartesian Products) in Scala? [duplicate]

This question already has answers here:
Cartesian product of two lists
(3 answers)
Closed 8 months ago.
Suppose list1 = [1, 2, 3] and list2 = ["a", "b"]
I want to combine them such that I get a tuple list like so:
list3 = [(1, "a"), (1, "b"), (2, "a"), (2, "b"), (3, "a"), (3, "b")]
Is there any way to do this concisely?
You can do it fairly easily with Scala's for ... yield expression.
for {
  x <- list1
  y <- list2
} yield {
  (x, y)
}
This desugars to flatMap and map calls, so it's equivalent to
list1.flatMap { x => list2.map { y => (x, y) } }
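To check the equivalence on the lists from the question, something like the following can be pasted into a REPL or worksheet (viaFor and viaFlatMap are just illustrative names):

```scala
val list1 = List(1, 2, 3)
val list2 = List("a", "b")

// for/yield form
val viaFor = for {
  x <- list1
  y <- list2
} yield (x, y)

// explicit flatMap/map form
val viaFlatMap = list1.flatMap(x => list2.map(y => (x, y)))
```

Both values hold the six pairs in the order the question asks for.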
If you're using Scalaz, then the Cartesian product is just an application of the Applicative instance for lists. With appropriate Scalaz imports, this will suffice:
^(list1, list2) { (_, _) }
cats has similar functionality with different syntax:
import cats.syntax.all._
(list1, list2).tupled

Find common elements in a map of sequences - scala

I have something like this:
val myMap: Map[Int, Seq[Int]] = Map(1 -> Seq(1, 2, 3), 2 -> Seq(2, 3, 4), 3 -> Seq(3, 4, 5), 4 -> Seq(4, 5, 6))
I am trying to find a way to relate all the keys and their common elements in the sequence they are mapped to.
For example:
1 and 2 share (2, 3)
1 and 3 share (3)
2 and 3 share (3, 4)
2 and 4 share (4)
3 and 4 share (4, 5)
I suspect I need to use intersect, but I am not sure how to go about the problem. I am brand new to Scala and functional programming and need a little help getting started on this. I know there are probably easier ways to do this with Spark; however, I am trying to stick to plain Scala.
Any help is greatly appreciated!
Here's one way using flatMap and collect to generate the shared values from every combination of the key pairs via intersect:
val myMap: Map[Int, List[Int]] = Map(
1 -> List(1, 2, 3), 2 -> List(2, 3, 4), 3 -> List(3, 4, 5), 4 -> List(4, 5, 6)
)
val keys = myMap.keys.toList
keys.flatMap { i =>
  keys.collect {
    case j if j > i => (i, j, myMap(i) intersect myMap(j))
  }
}
// res1: List[(Int, Int, List[Int])] = List(
// (1,2,List(2, 3)),
// (1,3,List(3)),
// (1,4,List()),
// (2,3,List(3, 4)),
// (2,4,List(4)),
// (3,4,List(4, 5))
// )
The above is essentially the same as the following for comprehension:
for {
i <- keys
j <- keys
if j > i
} yield (i, j, myMap(i) intersect myMap(j))
How do you want the results returned? Do you just want to print them to STDOUT?
myMap.keys.toList.combinations(2).foreach{ case List(a,b) =>
println(s"$a,$b --> ${myMap(a) intersect myMap(b)}")
}
Pretty similar to @jwvh's solution, but with fewer lookups in the map, in case it is big:
val myMap: Map[Int, Seq[Int]] = Map(1 -> Seq(1, 2, 3), 2 -> Seq(2, 3, 4), 3 -> Seq(3, 4, 5), 4 -> Seq(4, 5, 6))
myMap.toList.combinations(2).foreach {
case List((i1, s1), (i2, s2)) =>
val ints = s1.intersect(s2)
if (ints.nonEmpty) {
println(s"$i1 and $i2 share (${ints.mkString(", ")})")
}
case _ => ???
}
Code run at Scastie.
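If the pairs are needed as a value rather than printed, the same combinations-plus-intersect idea can be wrapped in a function. This is only a sketch: the name sharedElements and the choice to sort by key and drop empty intersections are assumptions, not part of the answers above.

```scala
// Returns ((key1, key2), shared) for every pair of keys whose sequences overlap.
def sharedElements(m: Map[Int, Seq[Int]]): List[((Int, Int), Seq[Int])] =
  m.toList
    .sortBy(_._1)            // make the pair order deterministic
    .combinations(2)
    .collect {
      case List((k1, s1), (k2, s2)) if s1.intersect(s2).nonEmpty =>
        ((k1, k2), s1.intersect(s2))
    }
    .toList

val sample: Map[Int, Seq[Int]] =
  Map(1 -> Seq(1, 2, 3), 2 -> Seq(2, 3, 4), 3 -> Seq(3, 4, 5), 4 -> Seq(4, 5, 6))
val shared = sharedElements(sample)
```

For the sample map, the pair (1, 4) is absent from the result because its sequences share nothing.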

Counting number of occurrences of Array element in a RDD

I have an RDD, RDD1, of key-value pairs of type (String, Array[String]) (I will refer to them as (X, Y)), and an Array[String] called Z.
For every element in Z, I'm trying to count how many X instances have that element in their Y. I want my output as ((X, Z(i)), #ofinstances).
RDD1 = ((A, (2, 3, 4)), (B, (4, 4, 4)), (A, (4, 5)))
Z = (1, 4)
then I want to get:
(((A, 4), 2), ((B, 4), 1))
Hope that made sense.
As you can see above, I only want an element if there is at least one occurrence.
I have tried this so far:
val newRDD = RDD1.map{case(x, y) => for(i <- 0 to (z.size-1)){if(y.contains(z(i))) {((x, z(i)), 1)}}}
My output here is an RDD[Unit].
I'm not sure if what I'm asking for is even possible, or if I have to do it another way.
So it is just another word count:
val rdd = sc.parallelize(Seq(
("A", Array("2", "3", "4")),
("B", Array("4", "4", "4")),
("A", Array("4", "5"))))
val z = Array("1", "4")
To make lookups efficient, convert z to a Set:
val zs = z.toSet
val result = rdd
.flatMapValues(_.filter(zs contains _).distinct)
.map((_, 1))
.reduceByKey(_ + _)
where
_.filter(zs contains _).distinct
keeps only the values that occur in z and deduplicates them, so each (X, Y) record contributes at most one count per element of z.
result.take(2).foreach(println)
// ((B,4),1)
// ((A,4),2)
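The same filter, deduplicate and count logic can be tried without a Spark context on plain Scala collections. This is a local analogue only, not the Spark job itself; the names data, wanted and counts are illustrative.

```scala
val data = Seq(
  ("A", Array("2", "3", "4")),
  ("B", Array("4", "4", "4")),
  ("A", Array("4", "5")))
val wanted = Set("1", "4")

// One (x, y) pair per record and matching element, then count identical pairs.
val counts: Map[(String, String), Int] = data
  .flatMap { case (x, ys) => ys.filter(wanted.contains).distinct.toList.map(y => (x, y)) }
  .groupBy(identity)
  .map { case (key, hits) => key -> hits.size }
```

The element "1" never appears in any array, so no key containing it shows up in counts.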

Scala - final map after groupBy / map not sorted even when initial list sorted [duplicate]

This question already has answers here:
Why does groupBy in Scala change the ordering of a list's items?
(2 answers)
Closed 6 years ago.
This is very simple code that can also be executed inside a Scala worksheet. It is a map-reduce style approach to calculating the frequency of the numbers in the list.
I sort the list before starting the groupBy and map operations. Even so, the list.groupBy.map operation generates a map that is not sorted, neither by number nor by frequency.
//put this code in Scala worksheet
//this list is sorted and stored in the list variable
val list = List(1,2,4,2,4,7,3,2,4).sorted
//now you can see list is sorted
list
//now applying groupBy and map operation to create frequency map
val freqMap = list.groupBy(x => x) map{ case(k,v) => k-> v.length }
freqMap
groupBy doesn't guarantee any order:
val list = List(1,2,4,2,4,7,3,2,4).sorted
val freqMap = list.groupBy(x => x)
Output:
freqMap: scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(1), 2 -> List(2, 2, 2), 7 -> List(7), 3 -> List(3), 4 -> List(4, 4, 4))
groupBy takes the list and groups the elements. It builds a Map in which each key is a unique value from the list and each value is a List of all occurrences of that key in the list.
Here is the official method definition from the Scala docs:
def groupBy [K] (f: (A) ⇒ K): Map[K, Traversable[A]]
If you want to order the grouped result you can do it with ListMap:
scala> val freqMap = list.groupBy(x => x)
freqMap: scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(1), 2 -> List(2, 2, 2), 7 -> List(7), 3 -> List(3), 4 -> List(4, 4, 4))
scala> import scala.collection.immutable.ListMap
import scala.collection.immutable.ListMap
scala> ListMap(freqMap.toSeq.sortBy(_._1):_*)
res0: scala.collection.immutable.ListMap[Int,List[Int]] = Map(1 -> List(1), 2 -> List(2, 2, 2), 3 -> List(3), 4 -> List(4, 4, 4), 7 -> List(7))
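Applied to the frequency map from the question, the same idea might look like the sketch below. SortedMap is swapped in for the key-ordered case as an alternative to ListMap, and the names freq, byKey and byFreq are illustrative.

```scala
import scala.collection.immutable.{ListMap, SortedMap}

val list = List(1, 2, 4, 2, 4, 7, 3, 2, 4)
val freq: Map[Int, Int] = list.groupBy(identity).map { case (k, v) => k -> v.length }

// Ordered by number: SortedMap keeps its keys sorted.
val byKey = SortedMap(freq.toSeq: _*)

// Ordered by frequency (highest first, ties broken by number):
// ListMap preserves the insertion order of the sorted sequence.
val byFreq = ListMap(freq.toSeq.sortBy { case (k, n) => (-n, k) }: _*)
```

Sorting has to happen after groupBy, since the order of the input list does not carry over to the grouped Map.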

Zip two Arrays, always 3 elements of the first array then 2 elements of the second

I've manually built a method that takes 2 arrays and combines them to 1 like this:
a0,a1,a2,b0,b1,a3,a4,a5,b2,b3,a6,...
So I always take 3 elements of the first array, then 2 of the second one.
As I said, I built that function manually.
Now I guess I could make this a one-liner with the help of zip. The problem is that zip alone is not enough, as zip builds tuples like (a0, b0).
Of course I can flatMap this, but still not what I want:
val zippedArray: List[Float] = data1.zip(data2).toList.flatMap(t => List(t._1, t._2))
That way I'd get a List(a0, b0, a1, b1,...), still not what I want.
(I'd then use toArray for the list... it's more convenient to work with a List right now)
I thought about using take and drop but they return new data-structures instead of modifying the old one, so not really what I want.
As you can imagine, I'm not really into functional programming (yet). I do use it and I see huge benefits, but some things are so different to what I'm used to.
Consider grouping array a by 3, and array b by 2, namely
val a = Array(1,2,3,4,5,6)
val b = Array(11,22,33,44)
val g = (a.grouped(3) zip b.grouped(2)).toArray
Array((Array(1, 2, 3),Array(11, 22)), (Array(4, 5, 6),Array(33, 44)))
Then
g.flatMap { case (x,y) => x ++ y }
Array(1, 2, 3, 11, 22, 4, 5, 6, 33, 44)
Very similar to @elm's answer, but I wanted to show that you can use a lazier approach (an iterator) to avoid creating temporary structures:
scala> val a = List(1,2,3,4,5,6)
a: List[Int] = List(1, 2, 3, 4, 5, 6)
scala> val b = List(11,22,33,44)
b: List[Int] = List(11, 22, 33, 44)
scala> val groupped = a.sliding(3, 3) zip b.sliding(2, 2)
groupped: Iterator[(List[Int], List[Int])] = non-empty iterator
scala> val result = groupped.flatMap { case (a, b) => a ::: b }
result: Iterator[Int] = non-empty iterator
scala> result.toList
res0: List[Int] = List(1, 2, 3, 11, 22, 4, 5, 6, 33, 44)
Note that it stays an iterator all the way until we materialize it with toList.
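One thing to keep in mind: zip stops at the shorter of the two group iterators, so any leftover groups are dropped when the lengths do not line up. A possible sketch that keeps the leftovers by using zipAll on the iterators (the function name interleave and its parameters n and m are assumptions made for illustration):

```scala
// Takes n elements of a, then m elements of b, repeatedly;
// when one side runs out, the rest of the other side is appended.
def interleave[A](a: Seq[A], b: Seq[A], n: Int = 3, m: Int = 2): List[A] =
  a.grouped(n)
    .zipAll(b.grouped(m), Seq.empty[A], Seq.empty[A]) // pad the shorter side with empty groups
    .flatMap { case (x, y) => x ++ y }
    .toList

val even = interleave(Seq(1, 2, 3, 4, 5, 6), Seq(11, 22, 33, 44))
val uneven = interleave(Seq(1, 2, 3, 4, 5, 6, 7), Seq(11, 22, 33, 44))
```

With matching lengths this gives the same result as the answers above; with the extra element 7, it ends up at the tail of the result instead of being lost.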