How to access the second element in a Sequence in Scala

val k = Seq((0,1),(1,2),(2,3),(3,4))
k: Seq[(Int, Int)] = List((0,1), (1,2), (2,3), (3,4))
If I have the above statement and I need to do addition at even positions and subtraction at odd positions, how can I access the elements? To be clear:
(0,1) has to become (0,(1+2))
(1,2) has to become (1,(1-2))
(2,3) has to become (2,(3+4))
(3,4) has to become (3,(3-4))

Do you mean something like this?
val transformed = k.grouped(2).flatMap {
  case Seq((i, x), (j, y)) => Seq((i, x + y), (j, x - y))
}
transformed.toList
// List[(Int, Int)] = List((0,3), (1,-1), (2,7), (3,-1))
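Note that the match above assumes an even-length sequence; for odd lengths, grouped(2) emits a final one-element group, which would throw a MatchError. A sketch that passes a trailing unpaired element through unchanged:

val transformed = k.grouped(2).flatMap {
  case Seq((i, x), (j, y)) => Seq((i, x + y), (j, x - y))
  case single => single // odd-length tail: keep the last pair as-is
}.toList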

Related

How can I split a list of tuples in Scala?

I have this list in Scala (which in reality has length 500):
List((1,List(1,2,3)), (2,List(1,2,3)), (3, List(1,2,3)))
What could I do so that I can make a new list which contains the following:
List((1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3))
Basically I want to have a new list of tuples which will contain the first element of the old tuple paired with each element of the list inside the tuple. I am not sure how to start implementing this, which is why I have posted no code to show my attempt. I am really sorry, but I can't grasp this. I appreciate any help you can provide.
Exactly the same as @Andriy's answer, but using a for comprehension. In the end it is exactly the same, but more readable IMHO:
val result = for {
  (x, ys) <- xs
  y <- ys
} yield (x, y) // You can also use x -> y
(Again, I would recommend you follow a tutorial; this is a basic exercise, and if you understood how map and flatMap work you shouldn't have any problem with it.)
scala> val xs = List((1,List(1,2,3)), (2,List(1,2,3)), (3, List(1,2,3)))
xs: List[(Int, List[Int])] = List((1,List(1, 2, 3)), (2,List(1, 2, 3)), (3,List(1, 2, 3)))
scala> xs.flatMap { case (x, ys) => ys.map(y => (x, y)) }
res0: List[(Int, Int)] = List((1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3))
It's probably worth mentioning that the solution by Andriy Plokhotnyuk can also be re-written as a for-comprehension:
val list = List((1,List(1,2,3)), (2,List(1,2,3)), (3, List(1,2,3)))
val pairs = for {
  (n, nestedList) <- list
  m <- nestedList
} yield (n, m)
assert(pairs == List((1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)))
The compiler will effectively re-write the for-comprehension to a flatMap/map chain as described in another answer.
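For reference, the desugared form looks roughly like this:

val pairs = list.flatMap { case (n, nestedList) =>
  nestedList.map(m => (n, m))
}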

How to create a two-dimensional list using a for loop in Scala?

If I write the following code in Scala I get a one-dimensional list, as such:
scala> for (a <- (1 to 2).toList; b <- (1 to 3).toList) yield (a, b)
res1: List[(Int, Int)] = List((1,1), (1,2), (1,3), (2,1), (2,2), (2,3))
But I'm expecting:
List(List((1,1), (1,2), (1,3)), List((2,1), (2,2), (2,3)))
Is it possible to do this using a for loop in scala or is some kind of other construct needed?
You could do it with two nested for comprehensions:
for (n <- (1 to 4).toList) yield (for (m <- ('a' to 'c').toList) yield (n, m))
You could also use map:
(1 to 4).toList.map(n => ('a' to 'c').toList.map(m => (n, m)))
{for (i <- 1 to 2; j <- 1 to 3) yield (i,j)}.grouped(3).toList
You get a List of Vectors, but that's fine; Vector is usually preferred in most circumstances: fast random access, append, prepend, and updates. You can forgo converting with toList until the end. If you are OK with Vector, Tom's answer is really good and you can refactor down to:
(1 to 4).map(n => ('a' to 'c').map(m => (n, m)))
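If the dimensions are known up front, List.tabulate can also build the nested structure directly; a sketch using the ranges from the original question:

List.tabulate(2, 3)((i, j) => (i + 1, j + 1))
// List(List((1,1), (1,2), (1,3)), List((2,1), (2,2), (2,3)))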

N-to-N matching in Scala

I have two lists, namely
val a = List(1,2,3)
val b = List(4,5)
I want to perform N to N bipartite mapping and want to get output
List((1,4),(1,5),(2,4),(2,5),(3,4),(3,5))
How can I do this?
You can use a for comprehension to achieve your goal:
val A = List(1,2,3)
val B = List(4,5)
val result = for (a <- A; b <- B) yield {
  (a, b)
}
The output is:
result:List[(Int, Int)] = List((1,4), (1,5), (2,4), (2,5), (3,4), (3,5))
Consider also
a.flatMap(x => b.map(y => (x,y)))
though it is not as concise as a for comprehension.

For comprehension with multiple generators to handle a Seq

I'm new to Scala, and I want to make a Seq[(Int,Int)] unique by its first component. My code is as follows:
val seq = Seq((1,1), (0,1), (2,1), (0, 1), (3,1), (2,1))
val prev = -1
val uniqueSeq = for(tuple <- seq.sortBy(_._1) if !tuple._1.equals(prev); prev = tuple._1) yield tuple
but why is the result
uniqueSeq: Seq[(Int, Int)] = List((0,1), (0,1), (1,1), (2,1), (2,1), (3,1))
I would take a different approach:
It is a good idea to group them first. Then you can get the head of each of the groups:
// Group by the first component, then take the head of each group.
// Note that groupBy does not guarantee the order of the resulting groups.
seq.groupBy {
  case (x, _) => x
}.map {
  case (_, head :: _) => head
}.toList
The prev in prev = tuple._1 is a completely different variable from val prev = -1! Note that it compiles even though the first prev is a val, i.e. immutable (it can't be changed).
If you want to use this approach, you can:
val seq = Seq((1,1), (0,1), (2,1), (0, 1), (3,1), (2,1))
var prev = -1
val uniqueSeq = for(tuple <- seq.sortBy(_._1) if !tuple._1.equals(prev)) yield { prev = tuple._1; tuple }
but it isn't the idiomatic approach in Scala.
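A sketch of one idiomatic alternative using foldLeft, assuming the goal is to keep the first tuple seen for each key after sorting:

val uniqueSeq = seq.sortBy(_._1).foldLeft(List.empty[(Int, Int)]) {
  case (acc, t) if acc.headOption.exists(_._1 == t._1) => acc // same key as previous: skip
  case (acc, t) => t :: acc                                   // new key: keep
}.reverse
// List((0,1), (1,1), (2,1), (3,1))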
Alexey already explained the mistake you're making with the prev variable.
A more idiomatic implementation of what you're trying to do (if I got it right) is:
val seq = Seq((1,1), (0,1), (2,1), (0, 1), (3,1), (2,1))
seq.sortBy(_._1).reverse.toMap.toList.sortBy(_._1) // List((0,1), (1,1), (2,1), (3,1))
The caveat is that going through a Map the duplicate keys disappear; note also that a Map does not guarantee iteration order, hence the final sortBy.
The reverse is necessary, since the last occurrence of a "key" is the one preserved in the Map, and we want to keep the first.
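On Scala 2.13+, distinctBy expresses this directly and sidesteps the Map caveats; a minimal sketch:

seq.sortBy(_._1).distinctBy(_._1) // List((0,1), (1,1), (2,1), (3,1))

distinctBy keeps the first occurrence of each key, so no reverse is needed.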

Spark: produce RDD[(X, X)] of all possible combinations from RDD[X]

Is it possible in Spark to implement the .combinations function from the Scala collections?
/** Iterates over combinations.
  *
  * @return An Iterator which traverses the possible n-element combinations of this $coll.
  * @example `"abbbc".combinations(2) = Iterator(ab, ac, bb, bc)`
  */
For example, how can I get from RDD[X] to RDD[List[X]] or RDD[(X,X)] for combinations of size 2? And let's assume that all values in the RDD are unique.
Cartesian product and combinations are two different things: the Cartesian product creates an RDD of size n^2, while combinations create an RDD of size n choose 2 (where n is the number of elements).
val rdd = sc.parallelize(1 to 5)
val combinations = rdd.cartesian(rdd).filter { case (a, b) => a < b }
combinations.collect()
Note this will only work if an ordering is defined on the elements, since we use <. This one only works for choosing two, but it can easily be extended by making sure the relationship a < b holds for all a and b in each generated tuple, as sketched below.
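For instance, a sketch extending the same idea to combinations of three, chaining cartesian and strengthening the guard (note the n^3 intermediate size):

val triples = rdd.cartesian(rdd).cartesian(rdd)
  .filter { case ((a, b), c) => a < b && b < c }
  .map { case ((a, b), c) => (a, b, c) }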
This is supported natively by a Spark RDD with the cartesian transformation.
e.g.:
val rdd = sc.parallelize(1 to 5)
val cartesian = rdd.cartesian(rdd)
cartesian.collect
Array[(Int, Int)] = Array((1,1), (1,2), (1,3), (1,4), (1,5),
(2,1), (2,2), (2,3), (2,4), (2,5),
(3,1), (3,2), (3,3), (3,4), (3,5),
(4,1), (4,2), (4,3), (4,4), (4,5),
(5,1), (5,2), (5,3), (5,4), (5,5))
As discussed, cartesian will give you n^2 elements of the cartesian product of the RDD with itself.
This algorithm computes the combinations (n, 2) of an RDD without having to compute the n^2 elements first. (String is used as the element type here; generalizing to a type T takes some plumbing with ClassTags that would obscure the purpose.)
This is probably less time efficient than cartesian + filtering, due to the iterative count and take actions that force the computation of the RDD, but it is more space efficient, as it calculates only the C(n,2) = n!/(2!(n-2)!) = n(n-1)/2 elements instead of the n^2 of the Cartesian product.
import org.apache.spark.rdd._
def combs(rdd: RDD[String]): RDD[(String, String)] = {
  // assumes a SparkContext `sc` is in scope (e.g. in spark-shell)
  val count = rdd.count // cache the count instead of re-triggering the action
  if (count < 2) {
    sc.makeRDD[(String, String)](Seq.empty)
  } else if (count == 2) {
    val values = rdd.collect
    sc.makeRDD[(String, String)](Seq((values(0), values(1))))
  } else {
    val elem = rdd.take(1)
    val elemRdd = sc.makeRDD(elem)
    val subtracted = rdd.subtract(elemRdd)        // all elements except `elem`
    val comb = subtracted.map(e => (elem(0), e))  // pair `elem` with every other element
    comb.union(combs(subtracted))                 // recurse on the remainder
  }
}
This creates all combinations (n, 2) and works for any RDD, without requiring any ordering on the elements of the RDD.
val rddWithIndex = rdd.zipWithIndex
rddWithIndex.cartesian(rddWithIndex).filter { case (a, b) => a._2 < b._2 }.map { case (a, b) => (a._1, b._1) }
a._2 and b._2 are the indices, while a._1 and b._1 are the elements of the original RDD.
Example:
Note that no ordering is defined on the maps here.
val m1 = Map('a' -> 1, 'b' -> 2)
val m2 = Map('c' -> 3, 'a' -> 4)
val m3 = Map('e' -> 5, 'c' -> 6, 'b' -> 7)
val rdd = sc.makeRDD(Array(m1, m2, m3))
val rddWithIndex = rdd.zipWithIndex
rddWithIndex.cartesian(rddWithIndex).filter { case (a, b) => a._2 < b._2 }.map { case (a, b) => (a._1, b._1) }.collect
Output:
Array((Map(a -> 1, b -> 2),Map(c -> 3, a -> 4)), (Map(a -> 1, b -> 2),Map(e -> 5, c -> 6, b -> 7)), (Map(c -> 3, a -> 4),Map(e -> 5, c -> 6, b -> 7)))