How to convert a List[List[Long]] to a List[List[Int]]? - scala

What's the best way to convert a List[List[Long]] to a List[List[Int]] in Scala?
For example, given the following list of type List[List[Long]]
val l: List[List[Long]] = List(List(11, 10, 11, 10, 11), List(8, 19, 24, 0, 2))
how can it be converted to List[List[Int]]?

You can also use the cats library for this and compose the List functors:
import cats.Functor
import cats.implicits._
import cats.data._
val l: List[List[Long]] = List(List(11, 10, 11, 10, 11), List(8, 19, 24, 0, 2))
Functor[List].compose[List].map(l)(_.toInt)
//or
Nested(l).map(_.toInt).value
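There is also a cast-based approach in plain Scala, but it is unsafe and best avoided. Because of type erasure the cast itself succeeds, yet the elements remain boxed Long values, so reading one as an Int throws a ClassCastException at runtime: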
val res:List[List[Int]] = l.asInstanceOf[List[List[Int]]]
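To illustrate why the cast is unsafe, here is a minimal sketch in plain Scala. The names `cast` and `attempt` are hypothetical and only serve the illustration. The cast compiles and runs, but the first use of an element as an Int fails.

```scala
import scala.util.Try

val l: List[List[Long]] = List(List(11L, 10L), List(8L, 19L))

// Succeeds: element types are erased at runtime, so nothing is checked here.
val cast: List[List[Int]] = l.asInstanceOf[List[List[Int]]]

// Fails: the element is still a boxed java.lang.Long, so unboxing it
// as an Int raises a ClassCastException.
val attempt: Try[Int] = Try(cast.head.head + 1)

println(attempt.isFailure)
```

The explicit `map(_.map(_.toInt))` conversion avoids this because it builds new lists that really contain Int values.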

Try l.map(_.map(_.toInt)) like so
val l: List[List[Long]] = List(List(11, 10, 11, 10, 11), List(8, 19, 24, 0, 2))
l.map(_.map(_.toInt))
which should give
res2: List[List[Int]] = List(List(11, 10, 11, 10, 11), List(8, 19, 24, 0, 2))

Do this only if you are completely sure that no value overflows Int, because toInt silently wraps values outside the Int range.
val l1: List[List[Long]] = List(List(11, 10, 11, 10, 11), List(8, 19, 24, 0, 2))
val l2: List[List[Int]] = l1.map(list => list.map(long => long.toInt))
(Basically, every time you want to transform a List into another List, use map).
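If overflow is a real possibility, one option is to check the range before converting. The following is a minimal sketch, not a definitive implementation; `toIntsChecked` is a hypothetical helper name, and returning `None` on the first out-of-range value is just one possible policy.

```scala
// Hypothetical helper: converts only when every value fits into an Int.
def toIntsChecked(xss: List[List[Long]]): Option[List[List[Int]]] =
  if (xss.forall(_.forall(_.isValidInt))) Some(xss.map(_.map(_.toInt)))
  else None

val ok  = toIntsChecked(List(List(11L, 10L), List(8L)))
val bad = toIntsChecked(List(List(3000000000L)))

println(ok)   // Some(List(List(11, 10), List(8)))
println(bad)  // None

// For comparison, the unchecked conversion wraps around silently.
println(3000000000L.toInt) // -1294967296
```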

This can be achieved with a simple transformation on the collection using the map function.
map works by applying a function to each element of the list. In your case the lists are nested, so you need to apply map twice, as in the example below:
val x : List[List[Long]] = List(List(11, 10, 11, 10, 11), List(8, 19, 24, 0, 2))
println(x)
val y :List[List[Int]]= x.map(a => a.map(_.toInt))
println(y)
Output :
List(List(11, 10, 11, 10, 11), List(8, 19, 24, 0, 2))
List(List(11, 10, 11, 10, 11), List(8, 19, 24, 0, 2))

Related

Spark - Remove intersecting elements between two array type columns

I have dataframe like this
+---------+--------------------+----------------------------+
| Name| rem1| quota |
+---------+--------------------+----------------------------+
|Customer_3|[258, 259, 260, 2...|[1, 2, 3, 4, 5, 6, 7,..500]|
|Customer_4|[18, 19, 20, 27, ...|[1, 2, 3, 4, 5, 6, 7,..500]|
|Customer_5|[16, 17, 51, 52, ...|[1, 2, 3, 4, 5, 6, 7,..500]|
|Customer_6|[6, 7, 8, 9, 10, ...|[1, 2, 3, 4, 5, 6, 7,..500]|
|Customer_7|[0, 30, 31, 32, 3...|[1, 2, 3, 4, 5, 6, 7,..500]|
I would like to remove the values listed in rem1 from quota and store the result in a new column. I have tried:
val dfleft = dfpci_remove2.withColumn("left",$"quota".filter($"rem1"))
<console>:123: error: value filter is not a member of org.apache.spark.sql.ColumnName
Please advise.
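You can't call filter on a column like that. Instead, you can write a UDF as below. Note the argument order: diff removes the elements of its second argument from the first, so quota comes first.
val filterList = udf((a: Seq[Int], b: Seq[Int]) => a diff b)
df.withColumn("left", filterList($"quota", $"rem1"))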
This should give you the expected result.
Hope this helps!
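The behaviour of diff inside the UDF can be checked on plain Scala collections, without Spark. This is a small sketch with made-up values standing in for one row of quota and rem1.

```scala
// Made-up sample values standing in for one row of the DataFrame.
val quota = Seq(1, 2, 3, 4, 5, 6, 7)
val rem1  = Seq(2, 4, 6)

// Elements of quota that are not in rem1.
val left = quota diff rem1

// Swapping the arguments answers a different question:
// elements of rem1 that are not in quota.
val swapped = rem1 diff quota

println(left)    // List(1, 3, 5, 7)
println(swapped) // List()
```

On Spark 2.4 or later, the built-in array_except function may remove the need for a UDF, though unlike diff it also drops duplicates from the result.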

Scala-Making a map from two lists

I have two lists like
val lst1 = List(1, 1, 1, 1, 1, 2, 2, 3, 3, 3)
val lst2 = List(12, 13, 12, 15, 16, 21, 23, 30, 32, 13)
I would like to make a map like this while the order of values in lst2 does not change in the map:
Map(1 -> (12, 13, 12, 15, 16), 2 -> (21 , 23), 3 -> (30, 32, 13))
How can I do that?
Here's one approach using zip and groupBy:
(lst1 zip lst2).groupBy(_._1).mapValues(_.map(_._2))
// res1: scala.collection.immutable.Map[Int,List[Int]] = Map(
// 2 -> List(21, 23), 1 -> List(12, 13, 12, 15, 16), 3 -> List(30, 32, 13)
// )
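Assuming Scala 2.13 or later, where mapValues on a Map is deprecated and returns a lazy view, the same grouping can be sketched with groupMap, which groups and transforms in one step. The name `grouped` is only for illustration.

```scala
val lst1 = List(1, 1, 1, 1, 1, 2, 2, 3, 3, 3)
val lst2 = List(12, 13, 12, 15, 16, 21, 23, 30, 32, 13)

// Group the pairs by the key from lst1 and keep only the value from lst2.
// Within each group the values stay in their original lst2 order.
val grouped: Map[Int, List[Int]] = (lst1 zip lst2).groupMap(_._1)(_._2)

println(grouped(1)) // List(12, 13, 12, 15, 16)
```

The order of the keys in the resulting Map is not guaranteed, only the order of the values inside each group.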

Expression of type Seq[Unit] does not conform to expected type Seq[DataFrame] in Scala

My function returns finalDF, a sequence of data frames. In the loop shown below, map returns a Seq[DataFrame], which is stored in finalDF so it can be returned to the caller. In some cases there is further processing, so I would like to store the filtered data frame for each iteration and pass it to the next loop.
How do I do it? If I try to assign it to a temporary val, I get an error saying that an expression of type Seq[Unit] does not conform to the expected type Seq[DataFrame].
var finalDF: Seq[DataFrame] = null
for (i <- 0 until stop) {
  finalDF = strataCount(i).map(x => {
    df.filter(df(cols(i)) === x)
    //how to get the above data frame to pass on to the next computation?
  })
}
Regards
Maybe this is helpful:
val finalDF: Seq[DataFrame] = (0 until stop).flatMap(i => strataCount(i).map(x => df.filter(df(cols(i)) === x))).toSeq
flatMap flattens the Seq of Seqs into a single Seq.
(0 until stop) iterates from 0 up to, but not including, stop, matching the original loop. flatMap flattens the inner lists, for example:
scala> (0 to 20).flatMap(i => List(i))
res0: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
scala> (0 to 20).map(i => List(i)).flatten
res1: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
For two counters, you could nest the flatMap calls like this:
(0 until stop).flatMap(j => {
(0 until stop).flatMap(i => strataCount(i).map(x => df.filter(df(cols(i)) === x)))
}).toSeq
Alternatively, try a for/yield comprehension; see: Scala for/yield syntax
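The Seq[Unit] error typically appears when the last statement in the block passed to map is a val definition, because a block that ends in a definition has type Unit. The sketch below shows the pattern with a plain List standing in for the DataFrame. The names `data`, `strata`, and `filtered` are hypothetical.

```scala
// Hypothetical stand-ins: a List instead of a DataFrame,
// and a Vector of value lists instead of strataCount.
val data   = List(1, 2, 3, 4, 5, 6)
val strata = Vector(List(1, 2), List(4))

val result: Seq[List[Int]] =
  for {
    i <- 0 until strata.length
    x <- strata(i)
  } yield {
    // Keep the intermediate result in a val for further processing...
    val filtered = data.filter(_ > x)
    // ...and end the block with the value itself, so the block's type
    // is List[Int] rather than Unit.
    filtered
  }

println(result)
```

If the final `filtered` line were omitted, the block would end with the val definition and the comprehension would produce a sequence of Unit values instead.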

Concatenating multiple lists in Scala

I have a function called generateList and a concat function as follows. It essentially concatenates the lists returned by generateList, with i starting at 24 and ending at 1.
def concat(i: Int, l: List[(String, Int)]) : List[(String, Int)] = {
if (i==1) l else l ::: concat(i-1, generateList(signs, i))
}
val all = concat(23, generateList(signs, 24))
I can convert this to tail recursion, but I am curious whether there is a more idiomatic Scala way of doing this.
There are many ways to do this with the built-in methods available on List.
Here is one approach that uses foldRight:
(1 to 24).foldRight(List.empty[(String, Int)])((i, l) => l ::: generateList(signs, i))
Starting from an empty list, foldRight walks the range from 24 down to 1 and appends the result of generateList(signs, i) to the accumulated list at each step.
Here is one way to do this:
val signs = ""
def generateList(s: String, n: Int) = n :: n * 2 :: Nil
scala> (24 to 1 by -1) flatMap (generateList(signs, _))
res2: scala.collection.immutable.IndexedSeq[Int] = Vector(24, 48, 23, 46, 22, 44, 21, 42, 20, 40, 19, 38, 18, 36, 17, 34, 16, 32, 15, 30, 14, 28, 13, 26, 12, 24, 11, 22, 10, 20, 9, 18, 8, 16, 7, 14, 6, 12, 5, 10, 4, 8, 3, 6, 2, 4, 1, 2)
What you want to do is to map the list with x => generateList(signs, x) function and then concatenate the results, i.e. flatten the list. This is just what flatMap does.
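As a quick check that the flatMap and foldRight formulations agree, here is a self-contained sketch. The `generateList` body and the value of `signs` are hypothetical stand-ins, since the original definitions are not shown in the question.

```scala
// Hypothetical stand-ins for the asker's definitions.
val signs = "x"
def generateList(s: String, n: Int): List[(String, Int)] =
  List((s, n), (s, n * 2))

// Map each i from 24 down to 1 to a list, then flatten.
val viaFlatMap: List[(String, Int)] =
  (24 to 1 by -1).toList.flatMap(generateList(signs, _))

// foldRight visits 24 first, so appending yields the same order.
val viaFold: List[(String, Int)] =
  (1 to 24).foldRight(List.empty[(String, Int)])((i, acc) => acc ::: generateList(signs, i))

println(viaFlatMap == viaFold) // true
```

The flatMap version is usually preferable, since repeatedly appending with ::: copies the growing accumulator on every step.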

How do you group elements from Enumerator[A] to Enumerator[Seq[A]]?

I have elements from an Enumerator[A], and want to group/batch the elements to get an Enumerator[Seq[A]]. Here's code I wrote which groups A to Seq[A], but doesn't produce an Enumerator[Seq[A]].
val batchSize = 1000
dogsEnumerator
.run(
Iteratee.fold1[Dog, Vector[Dog]](Future.successful(Vector[Dog]())){
(r, c) =>
if (r.size > batchSize)
processBatch(r).map(_ => Vector[Dog]())
else
Future.successful(r :+ c)
}.map(_ => ())
)
This can be done pretty straightforwardly with the help of some of the Enumeratee combinators:
import play.api.libs.iteratee._
def batch[A](n: Int): Enumeratee[A, List[A]] = Enumeratee.grouped(
Enumeratee.take(n) &>> Iteratee.getChunks[A]
)
We can then use this enumeratee to transform any enumerator into a new enumerator of lists:
val intsEnumerator = Enumerator(1 to 40: _*)
intsEnumerator.through(batch(7)).run(Iteratee.foreach(println))
This will print the following:
List(1, 2, 3, 4, 5, 6, 7)
List(8, 9, 10, 11, 12, 13, 14)
List(15, 16, 17, 18, 19, 20, 21)
List(22, 23, 24, 25, 26, 27, 28)
List(29, 30, 31, 32, 33, 34, 35)
List(36, 37, 38, 39, 40)
As expected.
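For comparison, when the elements are already in memory rather than coming from an Enumerator, the standard library's grouped method gives the same batching. This is only an analogue on plain collections, not a replacement for the streaming Enumeratee above.

```scala
// Batch an in-memory sequence into groups of at most 7 elements.
val batches: List[List[Int]] = (1 to 40).toList.grouped(7).toList

batches.foreach(println)
```

As with the enumeratee, the last batch is shorter when the total is not a multiple of the batch size.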