spark iterate consecutive elements in list to rows - scala

I have a spark rdd with a column like
List(1, 3, 4, 8)
List(2, 3)
List(1, 5, 6)
I would like to get a new rdd with consecutive elements in each list to rows, like
(1, 3)
(3, 4)
(4, 8)
(2, 3)
(1, 5)
(5, 6)
How can I achieve this with scala?

Consider:
using a complementary (plain Scala) function with signature List[Int] => List[(Int, Int)] to achieve the desired result for the single list
and
passing this function to your RDD's flatMap method.
This complementary function may look like this:
def makeTuples(l: List[Int],
acc: List[(Int, Int)] = List.empty): List[(Int, Int)] =
l match {
case Nil | _ :: Nil => acc.reverse
case a :: b :: rest => makeTuples(b :: rest, (a, b) :: acc)
}

Related

Find combinations in Scala List of Lists

I have a List[List[Row]] (Row is a custom type, it's definition is not important here).
I want to get all the possible ways that I can select a single Row from each inner List so that the result is also a List[List[Row]].
scala> def combs[A](xss: List[List[A]]): List[List[A]] = xss match {
| case Nil => List(Nil)
| case xs::rss => for(x <- xs;cs <- combs(rss)) yield x::cs
| }
combs: [A](xss: List[List[A]])List[List[A]]
scala> combs(List(List(1, 2), List(4, 5), List(6))) foreach println
List(1, 4, 6)
List(1, 5, 6)
List(2, 4, 6)
List(2, 5, 6)

create pairs from sets

If I have unknown number of Set[Int] (or List[Int]) as an input and want to combine
i don't know size of input List[Int] for which I need to produce these tuples as a final result, what's the best way to achieve this? My code looks like below.
Ok. Since combine(xs) yields a List[List[Any]] and you have a :: combine(xs) you just insert a into the the List of all combinations. You want to combine a with each element of the possible combinations. That lead me to this solution.
You can also generalize it to lists:List[List[T]] because when you combine from lists:List[List[Int]] you will get a List[List[Int]].
def combine[T](lists: List[List[T]]): List[List[T]] = lists match {
case Nil => lists
case x :: Nil => for(a <- x) yield List(a) //needed extra case because if comb(xs) is Nil in the for loop, it won't yield anything
case x :: xs => {
val comb = combine(xs) //since all further combinations are constant, you should keep it in a val
for{
a <- x
b <- comb
} yield a :: b
}
}
Tests:
val first = List(7, 3, 1)
val second = List(2, 8)
val third = List("a","b")
combine(List(first, second))
//yields List(List(7, 2), List(7, 8), List(3, 2), List(3, 8), List(1, 2), List(1, 8))
combine(List(first, second, third))
//yields List(List(7, 2, a), List(7, 2, b), List(7, 8, a), List(7, 8, b), List(3, 2, a), List(3, 2, b), List(3, 8, a), List(3, 8, b), List(1, 2, a), List(1, 2, b), List(1, 8, a), List(1, 8, b))
I think you can also generalize this to work with other collections than List, but then you can't use pattern-match this easily and you have to work via iterators.

Add list of tuples of integers in Scala

I wish to add a list of tuples of integers i.e. given an input list of tuples of arity k, produce a tuple of arity k whose fields are sums of corresponding fields of the tuples in the list.
Input
List( (1,2,3), (2,3,-3), (1,1,1))
Output
(4, 6, 1)
I was trying to use foldLeft, but I am not able to get it to compile. Right now, I am using a for loop, but I was looking for a more concise solution.
This can be done type safely and very concisely using shapeless,
scala> import shapeless._, syntax.std.tuple._
import shapeless._
import syntax.std.tuple._
scala> val l = List((1, 2, 3), (2, 3, -1), (1, 1, 1))
l: List[(Int, Int, Int)] = List((1,2,3), (2,3,-1), (1,1,1))
scala> l.map(_.toList).transpose.map(_.sum)
res0: List[Int] = List(4, 6, 3)
Notice that unlike solutions which rely on casts, this approach is type safe, and any type errors are detected at compile time rather than at runtime,
scala> val l = List((1, 2, 3), (2, "foo", -1), (1, 1, 1))
l: List[(Int, Any, Int)] = List((1,2,3), (2,foo,-1), (1,1,1))
scala> l.map(_.toList).transpose.map(_.sum)
<console>:15: error: could not find implicit value for parameter num: Numeric[Any]
l.map(_.toList).transpose.map(_.sum)
^
scala> val tuples = List( (1,2,3), (2,3,-3), (1,1,1))
tuples: List[(Int, Int, Int)] = List((1,2,3), (2,3,-3), (1,1,1))
scala> tuples.map(t => t.productIterator.toList.map(_.asInstanceOf[Int])).transpose.map(_.sum)
res0: List[Int] = List(4, 6, 1)
Type information is lost when calling productIterator on Tuple3 so you have to convert from Any back to an Int.
If the tuples are always going to contain the same type I would suggest using another collection such as List. The Tuple is better suited for disparate types. When you have the same types and don't lose the type information by using productIterator the solution is more elegant.
scala> val tuples = List(List(1,2,3), List(2,3,-3), List(1,1,1))
tuples: List[List[Int]] = List(List(1, 2, 3), List(2, 3, -3), List(1, 1, 1))
scala> tuples.transpose.map(_.sum)
res1: List[Int] = List(4, 6, 1)
scala> val list = List( (1,2,3), (2,3,-3), (1,1,1))
list: List[(Int, Int, Int)] = List((1,2,3), (2,3,-3), (1,1,1))
scala> list.foldRight( (0, 0, 0) ){ case ((a, b, c), (a1, b1, c1)) => (a + a1, b + b1, c + c1) }
res0: (Int, Int, Int) = (4,6,1)

my combinations function returns an empty list

I am working on S-99: Ninety-Nine Scala Problems and already stuck at question 26.
Generate the combinations of K distinct objects chosen from the N elements of a list.
After wasting a couple hours, I decided to peek at a solution written in Haskell:
combinations :: Int -> [a] -> [[a]]
combinations 0 _ = [ [] ]
combinations n xs = [ y:ys | y:xs' <- tails xs
, ys <- combinations (n-1) xs']
It looks pretty straightforward so I decided to translate into Scala. (I know that's cheating.) Here's what I got so far:
def combinations[T](n: Int, ls: List[T]): List[List[T]] = (n, ls) match {
case (0, _) => List[List[T]]()
case (n, xs) => {
for {
y :: xss <- allTails(xs).reverse
ys <- combinations((n - 1), xss)
} yield y :: ys
}
}
My helper function:
def allTails[T](ls: List[T]): List[List[T]] = {
ls./:(0, List[List[T]]())((acc, c) => {
(acc._1 + 1, ls.drop(acc._1) :: acc._2)
})._2 }
allTails(List(0, 1, 2, 3)).reverse
//> res1: List[List[Int]] = List(List(0, 1, 2, 3), List(1, 2, 3), List(2, 3), List(3))
However, my combinations returns an empty list. Any idea?
Other solutions with explanation are very welcome as well. Thanks
Edit: The description of the question
Generate the combinations of K distinct objects chosen from the N elements of a list.
In how many ways can a committee of 3 be chosen from a group of 12 people? We all know that there are C(12,3) = 220 possibilities (C(N,K) denotes the well-known binomial coefficient). For pure mathematicians, this result may be great. But we want to really generate all the possibilities.
Example:
scala> combinations(3, List('a, 'b, 'c, 'd, 'e, 'f))
res0: List[List[Symbol]] = List(List('a, 'b, 'c), List('a, 'b, 'd), List('a, 'b, 'e), ...
As Noah pointed out, my problem is for of an empty list doesn't yield. However, the hacky work around that Noah suggested is wrong. It adds an empty list to the result of every recursion step. Anyway, here is my final solution. I changed the base case to "case (1, xs)". (n matches 1)
def combinations[T](n: Int, ls: List[T]): List[List[T]] = (n, ls) match {
case (1, xs) => xs.map(List(_))
case (n, xs) => {
val tails = allTails(xs).reverse
for {
y :: xss <- allTails(xs).reverse
ys <- combinations((n - 1), xss)
} yield y :: ys
}
}
//combinations(3, List(1, 2, 3, 4))
//List(List(1, 2, 3), List(1, 2, 4), List(1, 3, 4), List(2, 3, 4))
//combinations(2, List(0, 1, 2, 3))
//List(List(0, 1), List(0, 2), List(0, 3), List(1, 2), List(1, 3), List(2, 3))
def allTails[T](ls: List[T]): List[List[T]] = {
ls./:(0, List[List[T]]())((acc, c) => {
(acc._1 + 1, ls.drop(acc._1) :: acc._2)
})._2
}
//allTails(List(0,1,2,3))
//List(List(3), List(2, 3), List(1, 2, 3), List(0, 1, 2, 3))
You made a mistake when translating the Haskell version here:
case (0, _) => List[List[T]]()
This returns an empty list. Whereas the Haskell version
combinations 0 _ = [ [] ]
returns a list with a single element, and that element is an empty list.
This is essentially saying that there is one way to choose zero items, and that is important because the code builds on this case recursively for the cases where we choose more items. If there were no ways to select zero items, then there would also be no ways to select one item and so on. That's what's happening in your code.
If you fix the Scala version to do the same as the Haskell version:
case (0, _) => List(List[T]())
it works as expected.
Your problem is using the for comprehension with lists. If the for detects an empty list, then it short circuits and returns an empty list instead of 'cons'ing your head element. Here's an example:
scala> for { xs <- List() } yield println("It worked!") // This never prints
res0: List[Unit] = List()
So, a kind of hacky work around for your combinations function would be:
def combinations[T](n: Int, ls: List[T]): List[List[T]] = (n, ls) match {
case (0, _) => List[List[T]]()
case (n, xs) => {
val tails = allTails(xs).reverse
println(tails)
for {
y :: xss <- tails
ys <- Nil :: combinations((n - 1), xss) //Now we're sure to keep evaulating even with an empty list
} yield y :: ys
}
}
scala> combinations(2, List(1, 2, 3))
List(List(1, 2, 3), List(2, 3), List(3))
List(List(2, 3), List(3))
List(List(3))
List()
res5: List[List[Int]] = List(List(1), List(1, 2), List(1, 3), List(2), List(2, 3), List(3))
One more way of solving it.
def combinations[T](n: Int, ls: List[T]): List[List[T]] = {
var ms: List[List[T]] = List[List[T]]();
val len = ls.size
if (n > len)
throw new Error();
else if (n == len)
List(ls)
else if (n == 1)
ls map (a => List(a))
else {
for (i <- n to len) {
val take: List[T] = ls take i;
val temp = combinations(n - 1, take.init) map (a => take.last :: a)
ms = ms ::: temp
}
ms
}
}
So combinations(2, List(1, 2, 3)) gives: List[List[Int]] = List(List(2, 1), List(3, 1), List(3, 2))

Scala - convert List of Lists into a single List: List[List[A]] to List[A]

What's the best way to convert a List of Lists in scala (2.9)?
I have a list:
List[List[A]]
which I want to convert into
List[A]
How can that be achieved recursively? Or is there any other better way?
List has the flatten method. Why not use it?
List(List(1,2), List(3,4)).flatten
> List(1,2,3,4)
.flatten is obviously the easiest way, but for completeness you should also know about flatMap
val l = List(List(1, 2), List(3, 4))
println(l.flatMap(identity))
and the for-comprehension equivalent
println(for (list <- l; x <- list) yield x)
flatten is obviously a special case of flatMap, which can do so much more.
Given the above example, I'm not sure you need recursion. Looks like you want List.flatten instead.
e.g.
scala> List(1,2,3)
res0: List[Int] = List(1, 2, 3)
scala> List(4,5,6)
res1: List[Int] = List(4, 5, 6)
scala> List(res0,res1)
res2: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6))
scala> res2.flatten
res3: List[Int] = List(1, 2, 3, 4, 5, 6)
If your structure can be further nested, like:
List(List(1, 2, 3, 4, List(5, 6, List(7, 8))))
This function should give you the desire result:
def f[U](l: List[U]): List[U] = l match {
case Nil => Nil
case (x: List[U]) :: tail => f(x) ::: f(tail)
case x :: tail => x :: f(tail)
}
You don't need recursion but you can use it if you want:
def flatten[A](list: List[List[A]]):List[A] =
if (list.length==0) List[A]()
else list.head ++ flatten(list.tail)
This works like flatten method build into List. Example:
scala> flatten(List(List(1,2), List(3,4)))
res0: List[Int] = List(1, 2, 3, 4)
If you want to use flatmap, here is the the way
Suppose that you have a List of List[Int] named ll, and you want to flat it to List,
many people already gives you the answers, such as flatten, that's the easy way. I assume that you are asking for using flatmap method. If it is the case, here is the way
ll.flatMap(_.map(o=>o))