How to destructure a tuple into multiple tuples in Scala

I have a file containing the following lines:
4 3 2
5 6 7
9 8 2
I am splitting the input by tab and then want to break each line's content into 2 pieces.
How can I convert each line of input into 2 separate tuples, as follows?
(4 3 2) = (4 1 2) & (3 1 2)

I am assuming that:
each tab-separated line consists of three elements
elements within each line are separated with exactly one space character
each line needs to be converted to a tuple of tuples
If I got any of these wrong (e.g. there can be more than 3 elements in each row, or you need different structures than tuples), the code can easily be adapted.
val file = "4 3 2\t5 6 7\t9 8 2"
val lines = file.split("\t").map(line => line.split(" ").toList)
val newLines = lines.map({
case a :: b :: c :: Nil => ((a, "1", c), (b, "1", c))
})
// foreach, not map: we only want the side effect of printing
newLines.foreach(println)
// ((4,1,2),(3,1,2))
// ((5,1,7),(6,1,7))
// ((9,1,2),(8,1,2))
EDIT:
This answer was based on the logic that you wrote initially in your question and which said that you want this kind of map: ((a b c) => (a 1 c) (b 1 c)). I can see now that you removed that part so I'm not sure if the logic in my solution is right, but now that you have the basic skeleton you can modify as you need.
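If some rows might not have exactly three elements, the pattern match above would throw a MatchError. Here is a minimal sketch that skips such rows instead. The skipping policy, the helper name splitRow and the extra malformed row in the sample input are assumptions made for illustration, not something the question specifies.

```scala
// Hypothetical helper: turns "a b c" into ((a, "1", c), (b, "1", c)),
// and returns None for rows that do not have exactly three fields.
def splitRow(row: String): Option[((String, String, String), (String, String, String))] =
  row.trim.split(" ").toList match {
    case a :: b :: c :: Nil => Some(((a, "1", c), (b, "1", c)))
    case _                  => None
  }

// Same sample data as above, plus one malformed row (four fields) at the end.
val file = "4 3 2\t5 6 7\t9 8 2\tthis row is bad"
val pairs = file.split("\t").toList.flatMap(row => splitRow(row))
pairs.foreach(println)
```

Returning an Option keeps the decision about malformed rows in one place; swapping None for an exception or a default value would be a one-line change.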

Related

How to remove duplicates from a list without using built-in library methods such as distinct, groupBy(identity), toSet, etc.

I want to write a Scala program that takes command-line args as list input and produces the output list without duplicates.
I want to know the custom implementation of this without using any libraries.
Input : 4 3 7 2 8 4 2 7 3
Output :4 3 7 2 8
val x = List(4, 3, 7, 2, 8, 4, 2, 7, 3)
// Append (rather than prepend) so the result keeps first-occurrence order: List(4, 3, 7, 2, 8)
x.foldLeft(List[Int]())((l, v) => if (l.contains(v)) l else l :+ v)
If you can't use contains, you can replace it with another fold:
x.foldLeft(List[Int]())((l, v) => if (l.foldLeft(false)((found, c) => found || c == v)) l else l :+ v)
Here's a way you could do this using recursion. I've tried to lay it out in a way that's easiest to explain:
import scala.annotation.tailrec
@tailrec
def getIndividuals(in: List[Int], out: List[Int] = List.empty): List[Int] = {
if(in.isEmpty) out
else if(!out.contains(in.head)) getIndividuals(in.tail, out :+ in.head)
else getIndividuals(in.tail, out)
}
val list = List(1, 2, 3, 4, 5, 4, 3, 5, 6, 0, 7)
val list2 = List(1)
val list3 = List()
val list4 = List(3, 3, 3, 3)
getIndividuals(list) // List(1, 2, 3, 4, 5, 6, 0, 7)
getIndividuals(list2) // List(1)
getIndividuals(list3) // List()
getIndividuals(list4) // List(3)
This function takes two parameters, in and out, and iterates through every element in the in List until it's empty (by calling itself with the tail of in). Once in is empty, the function outputs the out List.
If the out List doesn't contain the value of in you are currently looking at, the function calls itself with the tail of in and with that value of in added on to the end of the out List.
If out does contain the value of in you are currently looking at, it just calls itself with the tail of in and the current out List.
Note: This is an alternative to the fold method that Arnon proposed. I personally would write a function like mine and then maybe refactor it into a fold function if necessary. I don't naturally think in a functional, fold-y way so laying it out like this helps me picture what's going on as I'm trying to work out the logic.
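The same recursion can also be written with pattern matching instead of isEmpty/head/tail. This is only a sketch of one possible variant: the name dedupe and the choice to prepend to the accumulator and reverse once at the end (instead of appending with :+) are my own, not taken from the answers above.

```scala
import scala.annotation.tailrec

// `seen` is built in reverse order because prepending to a List is cheap;
// it is reversed once when the input is exhausted.
@tailrec
def dedupe(in: List[Int], seen: List[Int] = Nil): List[Int] = in match {
  case Nil                                 => seen.reverse
  case head :: tail if seen.contains(head) => dedupe(tail, seen)         // duplicate: skip it
  case head :: tail                        => dedupe(tail, head :: seen) // first occurrence: keep it
}

val deduped = dedupe(List(4, 3, 7, 2, 8, 4, 2, 7, 3))
println(deduped.mkString(" ")) // 4 3 7 2 8
```

Like the versions above, this still scans the accumulator for every element, so it is quadratic in the worst case.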

How to convert list values into Key, value format?

Input
List(List(1, 2, 3, 4),List(5, 6, 6, 8),List(2,4,5,0))
expected output
(1: 2)
(5: 6)
(2: 4)
I tried with the below code
val res = input.map(x => println(x(0)+ " "+x(1)+" "+x(2)+" " +x(3)))
but the output looks like this:
1 2 3 4
5 6 6 8
2 4 5 0
You can follow this approach
val input = List(List(1, 2, 3, 4),List(5, 6, 6, 8),List(2,4,5,0))
val res0 = input.map(x => x match
{case y :: ys => (y -> ys) }
).toMap
val res1 = res0.foreach{x => println(x._1 + ": " + x._2.mkString(","))}
This prints the output below (res1 itself is just Unit, because foreach returns nothing):
1: 2,3,4
5: 6,6,8
2: 4,5,0
res1: Unit = ()
Please let me know if this answers your question.
You can combine pattern matching and string interpolation:
val result = input.map {
case key :: values => s"$key: ${values.mkString(",")}"
}.mkString("\n")
println(result)
case key :: values matches the first element of the list (your key) and the rest of elements in the list (your values).
mkString(separator) joins the elements of the list into a string using the given separator.
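Both answers above print every remaining value after the key. If the goal is literally the expected output from the question, (1: 2) and so on, a possible sketch is to match only the first two elements. Skipping inner lists with fewer than two elements is an assumption of this sketch.

```scala
val input = List(List(1, 2, 3, 4), List(5, 6, 6, 8), List(2, 4, 5, 0))

// Take the first element as the key and the second as the value, ignore the rest.
// collect silently drops inner lists that have fewer than two elements.
val pairs = input.collect { case key :: value :: _ => s"($key: $value)" }
pairs.foreach(println)
// (1: 2)
// (5: 6)
// (2: 4)
```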

For loop to create tuples of adjacent elements

I have an array
[1,2,2,3,4,6,2,4,6,8,2,3,5]
I want to iterate over this array using a for loop to get a collection of tuples of adjacent elements. How should I code in Scala?
Expected output :
1-2|2-2|2-3|3-4|4-6|6-2|2-4|4-6|6-8|8-2|2-3|3-5
If you want the output in the form 1-2|2-2|2-3|3-4|..., as you mentioned in your comment, you can try the following:
val arr = Array(1,2,2,3,4,6,2,4,6,8,2,3,5)
//here first separate array elements by - then whole array by |
val str = arr.sliding(2).map(_.mkString("-")).mkString("|")
print(str)
//output
//1-2|2-2|2-3|3-4|4-6|6-2|2-4|4-6|6-8|8-2|2-3|3-5
Scala has a sliding method for exactly this:
scala> val arr = Array(1,2,2,3,4,6,2,4,6,8,2,3,5)
arr: Array[Int] = Array(1, 2, 2, 3, 4, 6, 2, 4, 6, 8, 2, 3, 5)
scala> arr.sliding(2).foreach(tuple => println(tuple.mkString(" ")))
1 2
2 2
2 3
3 4
4 6
6 2
2 4
4 6
6 8
8 2
2 3
3 5
scala> arr.sliding(2).map(tuple => tuple.mkString("-")).mkString("|")
res10: String = 1-2|2-2|2-3|3-4|4-6|6-2|2-4|4-6|6-8|8-2|2-3|3-5
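The sliding answers produce small arrays rather than actual tuples. Since the question asks for a for loop and a collection of tuples, here is a minimal sketch of that, next to an index-free equivalent using zip. The names byLoop, byZip and joined are only illustrative.

```scala
val arr = Array(1, 2, 2, 3, 4, 6, 2, 4, 6, 8, 2, 3, 5)

// For loop over indices: pair each element with its right-hand neighbour.
val byLoop = for (i <- 0 until arr.length - 1) yield (arr(i), arr(i + 1))

// Same result without indices: zip the array with its own tail.
// (arr.tail throws on an empty array, so this assumes at least one element.)
val byZip = arr.zip(arr.tail)

// Format the tuples the way the question asks.
val joined = byLoop.map { case (a, b) => s"$a-$b" }.mkString("|")
println(joined)
```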

Spark - Combinations without repetition

I am trying to generate all combinations of the lines of a text file, without repetition.
Example:
1
2
2
1
1
Result:
Line 1 with line 2 = (1,2)
Line 1 with line 3 = (1,2)
Line 1 with line 4 = (1,1)
Line 1 with line 5 = (1,1)
Line 2 with line 3 = (2,2)
Line 2 with line 4 = (2,1)
Line 2 with line 5 = (2,1)
Line 3 with line 4 = (2,1)
Line 3 with line 5 = (2,1)
Line 4 with line 5 = (1,1)
or
Considering (x,y), if (x != y) 0 else 1:
0
0
1
1
1
0
0
0
0
1
I have the following code:
def processCombinations(rdd: RDD[String]) = {
rdd.mapPartitions({ partition => {
var previous: String = null;
if (partition.hasNext)
previous = partition.next
for (element <- partition) yield {
if (previous == element)
"1"
else
"0"
}
}
})
}
The piece of code above is doing the combinations of the first element of my RDD, in other words: (1,2) (1,2) (1,1) (1,1).
The problem is: This code ONLY works with ONE PARTITION. I'd like to make this work with many partitions, how could I do that?
It's not very clear exactly what you want as output, but this reproduces your first example, and translates directly to Spark. It generates combinations, but only where the index of the first element in the original list is less than the index of the second, which is I think what you're asking for.
val r = List(1,2,2,1,1)
val z = r.zipWithIndex
z.flatMap(x=>z.map(y=>(x,y))).collect{case(x,y) if x._2 < y._2 => (x._1, y._1)}
//List((1,2), (1,2), (1,1), (1,1), (2,2), (2,1), (2,1), (2,1), (2,1), (1,1))
or, as a for-comprehension
for (x<-z; y<-z; if x._2 < y._2) yield (x._1, y._1)
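The second form of output in the question (1 when both lines are equal, 0 otherwise) follows from the same index-based pairing. A minimal sketch in plain Scala collections, not Spark; the name flags is illustrative:

```scala
val r = List(1, 2, 2, 1, 1)
val z = r.zipWithIndex

// For every pair of positions i < j, emit 1 if the two values are equal, else 0.
val flags = for ((x, i) <- z; (y, j) <- z if i < j) yield if (x == y) 1 else 0
println(flags.mkString(" ")) // 0 0 1 1 1 0 0 0 0 1
```

On an RDD the same shape could presumably be built from zipWithIndex, cartesian and a filter on the indices, which would not depend on how the data is partitioned. That is untested here, and a full cartesian product grows quadratically with the number of lines.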
This code calculates the combinations without repetition using recursion. It takes 3 arguments: the number of elements per combination, the list of elements, and an accumulator holding the combination built so far (initially empty).
It works as follows for the list 1, 2, 3, 4, 5: it takes the first 4 elements as the first combination. Then it generates another combination with 5, the last element of the list. When no elements are left in the list, it moves one position back (to the third position) and takes the next element to generate more combinations from there: 1, 2, "4", 5. This is repeated recursively for all elements of the list.
def combinator[A](n: Int, list: List[A], acc: List[A]): List[List[A]] = {
if (n == 0)
List(acc.reverse)
else if (list == Nil)
List()
else
combinator(n - 1, list.tail, list.head :: acc) ::: combinator(n, list.tail, acc)
}
combinator(4, List(1, 2, 3, 4, 5), List()).foreach(println)
// List(1, 2, 3, 4)
// List(1, 2, 3, 5)
// List(1, 2, 4, 5)
// List(1, 3, 4, 5)
// List(2, 3, 4, 5)
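Applied to the question's data with n = 2, the same function should reproduce the pairs listed in the question, since it picks elements by position rather than by value. A short sketch (the function is repeated so that the snippet is self-contained; the conversion to tuples is my addition):

```scala
def combinator[A](n: Int, list: List[A], acc: List[A]): List[List[A]] =
  if (n == 0) List(acc.reverse)
  else if (list.isEmpty) List()
  else combinator(n - 1, list.tail, list.head :: acc) ::: combinator(n, list.tail, acc)

// Every combination has exactly two elements here, so the pattern always matches.
val pairs = combinator(2, List(1, 2, 2, 1, 1), Nil).map { case List(a, b) => (a, b) }
println(pairs)
// List((1,2), (1,2), (1,1), (1,1), (2,2), (2,1), (2,1), (2,1), (2,1), (1,1))
```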

Make a tuple of three integers in Scala

I have a problem where I need to make a tuple of three elements. Let's suppose that I have a list, and I managed to write tuples of two elements:
val list = (1 to 10).toList
val map1 = list.foldLeft(Map.empty[Int,String])( (map, value) => map + (value -> value.toString) )
Map(5 -> 5, 10 -> 10, 1 -> 1, 6 -> 6, 9 -> 9, 2 -> 2, 7 -> 7, 3 -> 3, 8 -> 8, 4 -> 4)
I want to make a tuple of three elements. How can I do that?
I tried this code:
val map1 = list.foldLeft(Map.empty[Int,String])( (map, value, s) => map + (value -> value.toString -> value.toString) )
Map(5 -> 5 -> 5, 10 -> 10-> 10, 1 -> 1-> 1, 6 -> 6-> 6, 9 -> 9-> 9, 2 -> 2-> 2, 7 -> 7-> 7, 3 -> 3-> 3, 8 -> 8-> 8, 4 -> 4-> 4)
-> is just syntactic sugar for a pair (a tuple of two items). The universal notation for tuples of any arity is a comma-delimited list in parentheses. E.g. (1,2,3) is a tuple of three integers, whereas the expression 1 -> 2 -> 3 from your example desugars to ((1,2),3), which is a tuple of a tuple of two ints and an int.
What you're trying to achieve with your code simply doesn't make any sense. A Map can be constructed from a list of pairs, treating the first element of the tuple as a key and the second as a value. Tuples of any other arities are not supported and wouldn't make sense in that case. You can however construct collections of other types (e.g., a List) containing tuples of any arities.
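A minimal sketch of the two options this leaves: a List of 3-tuples, or a Map whose values are themselves tuples. The names triples, asMap and nested are illustrative only.

```scala
val list = (1 to 10).toList

// A List can hold tuples of any arity, here Tuple3.
val triples: List[(Int, String, String)] =
  list.map(v => (v, v.toString, v.toString))

// A Map is still built from pairs, but the value of each pair can be a tuple.
val asMap: Map[Int, (String, String)] =
  list.map(v => (v, (v.toString, v.toString))).toMap

// -> associates to the left, so this is a pair whose first element is a pair.
val nested: ((Int, Int), Int) = 1 -> 2 -> 3

println(triples.take(2)) // List((1,1,1), (2,2,2))
println(asMap(3))        // (3,3)
println(nested)          // ((1,2),3)
```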
In general, to convert a range into a sequence of Tuple3 you could do something like this:
(0 to 10) map (x=>(x,x*2,x+10))
res0: scala.collection.immutable.IndexedSeq[(Int, Int, Int)] = Vector((0,0,10), (1,2,11), (2,4,12), (3,6,13), (4,8,14), (5,10,15), (6,12,16), (7,14,17), (8,16,18), (9,18,19), (10,20,20))
To join 2 Seqs as a Tuple2 you zip them:
(1 to 5) zip (10 to 15)
res3: scala.collection.immutable.IndexedSeq[(Int, Int)] = Vector((1,10), (2,11), (3,12), (4,13), (5,14))
Scala has built-in support for zipping up to arity 3 (note that zipped is deprecated since Scala 2.13 in favor of lazyZip):
((0 to 3),(4 to 6),(7 to 9)).zipped.toList
res6: List[(Int, Int, Int)] = List((0,4,7), (1,5,8), (2,6,9))
If you need to do something similar to higher arities there's product-collections:
(0 to 3) flatZip (4 to 6) flatZip (7 to 9) flatZip (10 to 12)
res7: org.catch22.collections.immutable.CollSeq4[Int,Int,Int,Int] =
CollSeq((0,4,7,10),
(1,5,8,11),
(2,6,9,12))
And finally there's shapeless which does lots of cool things but has a moderate learning curve.