Reducing iterator based on value - scala

I have an iterator as such:
Iterator(List(1, 2012), List(2, 2015), List(5, 2017), List(7, 2020))
I'm trying to return an iterator, but with the values slightly changed. The values for all multiples of 5 must be added to the previous row. So the result would be:
Iterator(List(1, 2012), List(2, 4032), List(7, 2020))
I've tried using the following method:
val a = Iterator(List(1, 2012), List(2, 2015), List(5, 2017), List(7, 2020))
val aTransformed = a.reduce((x,y) => if (y(0)%5 == 0) List(x(0),x(1)+y(1)) else x)
but it gives me the final value val aTransformed: List[Int] = List(1, 4029)
What can I do to get an iterator in my desired format? Is there a method to just check the previous/next row without folding it all into one final value?
I know this is possible by converting the iterator to a List, traversing, mutating and converting back to an iterator, but is there a more elegant solution?
Edit for clarification:
Consecutive multiples of 5 will get collated into one sum
Iterator(List(1, 2012), List(2, 2015), List(5, 2017), List(10, 2025))
should become
Iterator(List(1, 2012), List(2, 6057))

Since we can't directly get the last element from an Iterator, we need a buffer to store the last element. After the calculation, we check the buffer state and append it to the final result.
Here I append an empty List[Int] to the iterator as a sentinel element to simplify the check step.
def convert(xs: Iterator[List[Int]]): Iterator[List[Int]] = {
  // accumulator: (rows already emitted, buffered row that may still absorb multiples of 5)
  val res = (xs ++ Iterator(List[Int]())).foldLeft((Iterator[List[Int]](), List[Int]()))((x, y) => {
    if (y.nonEmpty && y(0) % 5 == 0) {
      if (x._2.nonEmpty) {
        (x._1, List(x._2(0), x._2(1) + y(1)))
      } else {
        (x._1, y)
      }
    } else {
      if (x._2.nonEmpty) {
        (x._1 ++ Iterator(x._2), y)
      } else {
        (x._1, y)
      }
    }
  })
  res._1
}
scala> val xs1 = Iterator(List(1, 2012), List(2, 2015), List(5, 2017), List(7, 2020))
val xs1: Iterator[List[Int]] = <iterator>
scala> val xs2 = Iterator(List(1, 2012), List(2, 2015), List(5, 2017), List(10, 2025))
val xs2: Iterator[List[Int]] = <iterator>
scala> convert(xs1)
val res44: Iterator[List[Int]] = <iterator>
scala> res44.toList
val res45: List[List[Int]] = List(List(1, 2012), List(2, 4032), List(7, 2020))
scala> convert(xs2)
val res47: Iterator[List[Int]] = <iterator>
scala> res47.toList
val res48: List[List[Int]] = List(List(1, 2012), List(2, 6057))

The following is one possible way to get the expected result. I haven't checked every edge case.
// itr is the input Iterator[List[Int]]
val interResult = itr.foldLeft((List.empty[List[Int]], List.empty[Int])) { (acc, curr) =>
  if (curr.size != 2)
    acc // skip rows that are not (key, value) pairs
  else if (acc._2.isEmpty)
    (acc._1, curr)
  else if (curr.headOption.exists(_ % 5 == 0))
    (acc._1, List(acc._2.head, acc._2.last + curr.last))
  else
    (acc._1 :+ acc._2, curr)
}
interResult._1 :+ interResult._2
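Both answers above fold over the whole input before returning anything. If laziness matters, a buffered iterator lets you peek at the next row without consuming it. The following is only a sketch: the name mergeMultiplesOfFive is made up for illustration, and it assumes every row is a two-element List(key, value).

```scala
// Hypothetical helper (not from the answers above): merges each row whose key
// is a multiple of 5 into the row before it, consuming the input lazily.
// Assumes every row has exactly two elements: List(key, value).
def mergeMultiplesOfFive(rows: Iterator[List[Int]]): Iterator[List[Int]] = {
  val it = rows.buffered // BufferedIterator exposes `head` to peek at the next row
  new Iterator[List[Int]] {
    def hasNext: Boolean = it.hasNext
    def next(): List[Int] = {
      var current = it.next()
      // absorb every directly following row whose key is a multiple of 5
      while (it.hasNext && it.head.head % 5 == 0) {
        val row = it.next()
        current = List(current.head, current(1) + row(1))
      }
      current
    }
  }
}
```

Like convert, this emits a leading multiple-of-5 row unchanged, since there is no previous row to add it to.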


scala - if/else in a for loop

If I use the two for loops like this I get a List[List[Int]], but how can I get a List[Int]?
I don't know how I could write an if/else statement in only one for loop. Can someone help me?
def example: (List[(Int, Int)], Int,Int) => List[Int] ={
(list, p, counter) =>
if (counter >=0)
for(x<-list(i._1); if ( x._1 ==p))yield x._2
for(x<-list(i._1); if ( x._1 !=p))yield example((x._1,x._2+i._2):: Nil,p,counter-1)
else { ....}
First off, as written, the code you posted is not even a valid definition. If you have something that works but returns a different type than what is desired, post that working code.
That being said, if you have List[List[Int]] and want a List[Int], the method for that is flatten
scala> val nestedList = List(List(1, 2), List(3, 4), List(5, 6))
nestedList: List[List[Int]] = List(List(1, 2), List(3, 4), List(5, 6))
scala> val flattenedList = nestedList.flatten
flattenedList: List[Int] = List(1, 2, 3, 4, 5, 6)
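As for putting the if/else into a single loop: if/else is an expression in Scala, so it can sit directly in the yield. The sketch below does not reconstruct the code from the question; the data and the two branches are invented purely for illustration.

```scala
// Illustrative data only; `pairs` and `p` are assumptions, not the asker's values.
val pairs = List((1, 10), (2, 20), (1, 30))
val p = 1

// One generator, with if/else as the yielded expression.
// Each branch yields a List[Int], so the result is a List[List[Int]] ...
val nested: List[List[Int]] =
  for ((k, v) <- pairs) yield if (k == p) List(v) else List(k, v)

// ... which flatten turns into a List[Int].
val flat: List[Int] = nested.flatten

// flatMap does the mapping and the flattening in one step.
val direct: List[Int] =
  pairs.flatMap { case (k, v) => if (k == p) List(v) else List(k, v) }
```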

Remove one element from Scala List

For example, if I have a list of List(1,2,1,3,2), and I want to remove only one 1, so that I get List(2,1,3,2). It would also be fine if the other 1 were removed instead.
My solution is:
scala> val myList = List(1,2,1,3,2)
myList: List[Int] = List(1, 2, 1, 3, 2)
scala> myList.patch(myList.indexOf(1), List(), 1)
res7: List[Int] = List(2, 1, 3, 2)
But I feel like I am missing a simpler solution, if so what am I missing?
Surely not simpler:
def rm(xs: List[Int], value: Int): List[Int] = xs match {
  case `value` :: tail => tail             // first match: drop it and stop
  case x :: tail => x :: rm(tail, value)   // keep x and keep searching
  case _ => Nil
}
scala> val xs = List(1, 2, 1, 3)
xs: List[Int] = List(1, 2, 1, 3)
scala> rm(xs, 1)
res21: List[Int] = List(2, 1, 3)
scala> rm(rm(xs, 1), 1)
res22: List[Int] = List(2, 3)
scala> rm(xs, 2)
res23: List[Int] = List(1, 1, 3)
scala> rm(xs, 3)
res24: List[Int] = List(1, 2, 1)
You can use zipWithIndex and filter out the index you want to drop.
scala> val myList = List(1,2,1,3,2)
myList: List[Int] = List(1, 2, 1, 3, 2)
scala> myList.zipWithIndex.filter(_._2 != 0).map(_._1)
res1: List[Int] = List(2, 1, 3, 2)
The filter followed by map can be written as a single collect:
scala> myList.zipWithIndex.collect { case (elem, index) if index != 0 => elem }
res2: List[Int] = List(2, 1, 3, 2)
To remove the first occurrence of an element, you can split at the first occurrence, drop the element and merge back.
list.span(_ != 1) match { case (before, atAndAfter) => before ::: atAndAfter.drop(1) }
The following is an expanded version:
val list = List(1, 2, 1, 3, 2)
// split AT the first occurrence
val elementToRemove = 1
val (beforeFirstOccurance, atAndAfterFirstOccurance) = list.span(_ != elementToRemove)
beforeFirstOccurance ::: atAndAfterFirstOccurance.drop(1) // shouldBe List(2, 1, 3, 2)
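The span approach can be wrapped into a small generic helper, and the standard diff method gives a one-line alternative. This is a sketch; removeFirst is a name chosen here for illustration and is not a library method.

```scala
// Hypothetical helper wrapping the span-based approach.
def removeFirst[A](xs: List[A], elem: A): List[A] = {
  val (before, atAndAfter) = xs.span(_ != elem)
  // drop(1) on an empty list is a no-op, so a missing element leaves xs unchanged
  before ::: atAndAfter.drop(1)
}

val myList = List(1, 2, 1, 3, 2)
val viaSpan = removeFirst(myList, 1)

// diff computes a multiset difference: each element of the argument
// removes one occurrence from the receiver.
val viaDiff = myList.diff(List(1))
```

Both leave the list untouched when the element is not present.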
How to remove an item from a list in Scala having only its index?
How should I remove the first occurrence of an object from a list in Scala?
List is immutable, so you can’t delete elements from it, but you can filter out the elements you don’t want while you assign the result to a new variable:
scala> val originalList = List(5, 1, 4, 3, 2)
originalList: List[Int] = List(5, 1, 4, 3, 2)
scala> val newList = originalList.filter(_ > 2)
newList: List[Int] = List(5, 4, 3)
Rather than continually assigning the result of operations like this to a new variable, you can declare your variable as a var and reassign the result of the operation back to itself:
scala> var x = List(5, 1, 4, 3, 2)
x: List[Int] = List(5, 1, 4, 3, 2)
scala> x = x.filter(_ > 2)
x: List[Int] = List(5, 4, 3)

Error: type mismatch flatMap

I am new to Spark programming and Scala, and I am not able to understand the difference between map and flatMap.
I tried the code below, expecting both calls to work, but got an error.
scala> val b = List("1","2", "4", "5")
b: List[String] = List(1, 2, 4, 5)
scala> b.map(x => (x,1))
res2: List[(String, Int)] = List((1,1), (2,1), (4,1), (5,1))
scala> b.flatMap(x => (x,1))
<console>:28: error: type mismatch;
found : (String, Int)
required: scala.collection.GenTraversableOnce[?]
b.flatMap(x => (x,1))
As per my understanding, flatMap makes an RDD into a collection for a String/Int RDD.
I was thinking that in this case both should work without any error. Please let me know where I am making the mistake.
You need to look at the signatures of these methods:
def map[U: ClassTag](f: T => U): RDD[U]
map takes a function from type T to type U and returns an RDD[U].
On the other hand, flatMap:
def flatMap[U: ClassTag](f: T => TraversableOnce[U]): RDD[U]
Expects a function taking type T to a TraversableOnce[U], which is a trait Tuple2 doesn't implement, and returns an RDD[U]. Generally, you use flatMap when you want to flatten a collection of collections, i.e. if you had an RDD[List[List[Int]]] and you wanted to produce an RDD[List[Int]], you could flatMap it using identity.
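The same rule holds for ordinary Scala collections, so it can be tried without a Spark context. A minimal sketch on plain List values (the value names are made up here): the function passed to flatMap has to return a collection, so the tuple is wrapped in a List.

```scala
val b = List("1", "2", "4", "5")

// map: the function returns one value per element (here a tuple)
val pairs: List[(String, Int)] = b.map(x => (x, 1))

// flatMap: the function must return a collection; wrapping the tuple
// in a one-element List makes the call type-check
val samePairs: List[(String, Int)] = b.flatMap(x => List((x, 1)))

// flatMap with identity removes one level of nesting
val nested = List(List(List(1, 2), List(3)), List(List(4)))
val lessNested: List[List[Int]] = nested.flatMap(identity)
```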
map(func) Return a new distributed dataset formed by passing each element of the source through a function func.
flatMap(func) Similar to map, but each input item can be mapped to 0 or more output items (so func should return a Seq rather than a single item).
The following example might be helpful.
scala> val b = List("1", "2", "4", "5")
b: List[String] = List(1, 2, 4, 5)
scala> b.map(x=>Set(x,1))
res69: List[scala.collection.immutable.Set[Any]] =
List(Set(1, 1), Set(2, 1), Set(4, 1), Set(5, 1))
scala> b.flatMap(x=>Set(x,1))
res70: List[Any] = List(1, 1, 2, 1, 4, 1, 5, 1)
scala> b.flatMap(x=>List(x,1))
res71: List[Any] = List(1, 1, 2, 1, 4, 1, 5, 1)
scala> b.flatMap(x=>List(x+1))
res75: List[String] = List(11, 21, 41, 51) // concat
scala> val x = sc.parallelize(List("aa bb cc dd", "ee ff gg hh"), 2)
scala> val y = x.map(x => x.split(" ")) // split(" ") returns an array of words
scala> y.collect
res0: Array[Array[String]] = Array(Array(aa, bb, cc, dd), Array(ee, ff, gg, hh))
scala> val y = x.flatMap(x => x.split(" "))
scala> y.collect
res1: Array[String] = Array(aa, bb, cc, dd, ee, ff, gg, hh)
The function passed to map returns a U, whereas the function passed to flatMap returns a TraversableOnce[U] (that is, a collection).
val b = List("1", "2", "4", "5")
val mapRDD = b.map { input => (input, 1) }
mapRDD.foreach(f => println(f._1 + " " + f._2))
val flatmapRDD = b.flatMap { input => List((input, 1)) }
flatmapRDD.foreach(f => println(f._1 + " " + f._2))
map does a 1-to-1 transformation, while flatMap converts a list of lists to a single list:
scala> val b = List(List(1,2,3), List(4,5,6), List(7,8,90))
b: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 90))
scala> b.map(x => (x,1))
res1: List[(List[Int], Int)] = List((List(1, 2, 3),1), (List(4, 5, 6),1), (List(7, 8, 90),1))
scala> b.flatMap(x => x)
res2: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 90)
Also, flatMap is useful for filtering out None values if you have a list of Options:
scala> val c = List(Some(1), Some(2), None, Some(3), Some(4), None)
c: List[Option[Int]] = List(Some(1), Some(2), None, Some(3), Some(4), None)
scala> c.flatMap(x => x)
res3: List[Int] = List(1, 2, 3, 4)
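For a list of Options, flatMap(x => x) is the same as flatten. The Option trick is also handy when the mapping function itself can fail. A small sketch; the raw list and the parsing step are assumptions added for illustration.

```scala
import scala.util.Try

val c: List[Option[Int]] = List(Some(1), Some(2), None, Some(3), Some(4), None)

// flatMap with the identity function and flatten give the same result
val viaFlatMap: List[Int] = c.flatMap(x => x)
val viaFlatten: List[Int] = c.flatten

// Illustrative input: strings that may or may not be numbers.
val raw = List("1", "x", "3")
// Try(...).toOption is None when toInt throws, so bad entries are dropped
val parsed: List[Int] = raw.flatMap(s => Try(s.toInt).toOption)
```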

Grouping a list

I want to group elements of a list such as :
val lst = List(1,2,3,4,5)
On transformation it should return a new list as:
val newlst = List(List(1), List(1,2), List(1,2,3), List(1,2,3,4), List(1,2,3,4,5))
You can do it this way:
(1 to lst.size map lst.take).toList should do it.
Not as pretty or short as others, but gotta have some tail recursion for the soul:
def createFromElements(list: List[Int]): List[List[Int]] = {
  // p accumulates the prefixes in reverse order; its head is the longest prefix so far
  def createFromElements(l: List[Int], p: List[List[Int]]): List[List[Int]] =
    l match {
      case x :: xs =>
        createFromElements(xs, (p.headOption.getOrElse(List()) ++ List(x)) :: p)
      case Nil => p.reverse
    }
  createFromElements(list, Nil)
}
And now:
scala> createFromElements(List(1,2,3,4,5))
res10: List[List[Int]] = List(List(1), List(1, 2), List(1, 2, 3), List(1, 2, 3, 4), List(1, 2, 3, 4, 5))
Doing a foldLeft seems to be more efficient, though ugly:
(lst.foldLeft((List[List[Int]](), List[Int]()))((x, y) => {
  val z = x._2 :+ y // current prefix extended by the next element
  (x._1 :+ z, z)
}))._1
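Two shorter ways to build the same list of prefixes with standard collection methods, offered as alternative sketches rather than as what the answers above intended: scanLeft keeps every intermediate accumulator, and inits enumerates prefixes from longest to shortest.

```scala
val lst = List(1, 2, 3, 4, 5)

// scanLeft yields List(), List(1), List(1, 2), ...; tail drops the empty seed
val viaScan: List[List[Int]] = lst.scanLeft(List.empty[Int])(_ :+ _).tail

// inits yields the full list first and the empty list last,
// so reverse it and drop the leading empty list
val viaInits: List[List[Int]] = lst.inits.toList.reverse.tail
```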

Scala: grouped from right?

In Scala, grouped works from left to right.
val list = List(1,2,3,4,5)
list.grouped(2).toList
// => List[List[Int]] = List(List(1, 2), List(3, 4), List(5))
But what if I want:
=> List[List[Int]] = List(List(1), List(2, 3), List(4, 5))
Well I know this works:
It seems not efficient, however.
Then you could implement it by yourself:
def rgrouped[T](xs: List[T], n: Int): List[List[T]] = {
  val diff = xs.length % n
  if (diff == 0) xs.grouped(n).toList else {
    val (head, toGroup) = xs.splitAt(diff)
    // the short leading group, followed by all the full-width groups
    head :: toGroup.grouped(n).toList
  }
}
Quite ugly, but should work.
Here is my attempt:
def rightGrouped[T](ls:List[T], s:Int): List[List[T]] = {
  val a = ls.length%s match {
    case 0 => ls.grouped(s).toList
    case x => List(ls.take(x)) ++ ls.takeRight(ls.length-x).grouped(s)
  }
  a
}
scala> rightGrouped(List(1,2,3,4,5),3)
res6: List[List[Int]] = List(List(1, 2), List(3, 4, 5))
I initially tried without pattern matching, but it was wrong when the list length was an exact multiple of the group size: it produced an empty leading group.
val ls = List(1,2,3,4,5,6)
val s = 3
val x = ls.length % s
List(ls.take(x)) ++ ls.takeRight(ls.length-x).grouped(s)
List(List(), List(1, 2, 3), List(4, 5, 6))
val l = List(list.head) :: list.tail.grouped(2).toList
After @gzm0 pointed out my mistake, I have fixed the solution, though it works only for n=2:
def group2[T](list: List[T]): List[List[T]] = {
  (list.size % 2 == 0) match {
    case true => list.grouped(2).toList
    case false => List(list.head) :: list.tail.grouped(2).toList
  }
}
List(List(1), List(2, 3), List(4, 5))
List(List(1, 2), List(3, 4), List(5, 6))
Staying consistent with idiomatic use of the Scala Collections Library such that it also works on things like String, here's an implementation.
def groupedRight[T](seqT: Seq[T], width: Int): Iterator[Seq[T]] =
  if (width > 0) {
    val remainder = seqT.length % width
    if (remainder == 0)
      seqT.grouped(width)
    else
      (seqT.take(remainder) :: seqT.drop(remainder).grouped(width).toList).iterator
  }
  else
    throw new IllegalArgumentException(s"width [$width] must be greater than 0")
val x = groupedRight(List(1,2,3,4,5,6,7), 3).toList
// => val x: List[Seq[Int]] = List(List(1), List(2, 3, 4), List(5, 6, 7))
val sx = groupedRight("12345", 3).toList
// => val sx: List[Seq[Char]] = List(12, 345)
val sx = groupedRight("12345", 3).toList.map(_.mkString)
// => val sx: List[String] = List(12, 345)
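For comparison, the grouping can also be done by reversing, grouping from the left, and reversing back. This is a sketch of that idea only; groupedFromRight is a name made up here, and the extra reversals cost additional passes over the list compared with the remainder-based versions above.

```scala
// Hypothetical helper: group from the right by reversing twice.
def groupedFromRight[A](xs: List[A], n: Int): List[List[A]] =
  xs.reverse          // 5, 4, 3, 2, 1
    .grouped(n)       // (5, 4), (3, 2), (1)
    .map(_.reverse)   // (4, 5), (2, 3), (1)
    .toList
    .reverse          // (1), (2, 3), (4, 5)

val byTwo = groupedFromRight(List(1, 2, 3, 4, 5), 2)
val byThree = groupedFromRight(List(1, 2, 3, 4, 5, 6, 7), 3)
```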