Scala: unpack a tuple into case class arguments and additionally zip two sequences

I want to transform Seq[(String, Seq[Char])] into Seq[UnpackedObject], but I don't know how to unpack a pair of Chars (A, B) into separate case class arguments.
I basically want to create s3 out of s1 and s2 such that:
Seq(("aaa", "A", "B"), ("bbb", "B", "C"), ("ccc", "C", "D"), ("ddd", "D", "D"))
Hence I am trying to use a case class, but:
problem 1: unpacking the tuple into two arguments;
problem 2: the last element with "D", "D" <-- I don't know how to solve it.
val s1 = Seq("aaa", "bbb", "ccc", "ddd")
val s2 = ('A' to 'D').sliding(2).toSeq
val pairs = (s1, s2).zipped.map { case (a, b) => UnpackedObject(a, b) }
case class UnpackedObject(a: String, b: Char, c: Char)
The above is my code so far.

zipped does not fail on sequences of different lengths; it stops at the shorter one. Here s2 has length 3 and s1 has length 4, so the last element of s1 is silently dropped. You need to add one element to s2 to get s3:
val s1 = Seq("aaa", "bbb", "ccc", "ddd")
val s2 = ('A' to 'D').sliding(2).toSeq :+ Seq('D', 'D')
// ('A' to 'D').sliding(2) will return just
// Seq(Seq('A', 'B'), Seq('B', 'C'), Seq('C', 'D'))
val pairs = (s1, s2).zipped.map { case (a, b) => (a, b.head, b.last) }
// will return Seq((aaa,A,B), (bbb,B,C), (ccc,C,D), (ddd,D,D))
If you need to create UnpackedObject instances, just call the tupled apply function of the case class:
val objects = (s1, s2).zipped.map { case (a, b) => (a, b.head, b.last) }
.map((UnpackedObject.apply _).tupled)
// will return
// Seq(
// UnpackedObject(aaa,A,B), UnpackedObject(bbb,B,C),
// UnpackedObject(ccc,C,D), UnpackedObject(ddd,D,D))

b is not a tuple but an indexed sequence, so:
val pairs = (s1, s2).zipped.map { case (a, b) => UnpackedObject(a, b(0), b(1)) }
or
val pairs = (s1, s2).zipped.map { case (a, b) => UnpackedObject(a, b.head, b.last) }
As for the second point, using tuples you can do:
val s1 = Seq("aaa", "bbb", "ccc", "ddd")
val s2 = ('A' to 'D').zip(('B' to 'D')) :+ ('D', 'D')
val pairs = (s1, s2).zipped.map { case (a, b) => UnpackedObject(a, b._1, b._2) }
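The two fixes above can also be combined in one step. The following is only a sketch, assuming Scala 2.13 or later: it pads the character range with a copy of its last element so that sliding(2) yields four windows, and it destructures each window with a Seq(b, c) pattern instead of head/last. It uses plain zip because zipped on tuples is deprecated since 2.13; the names chars and UnpackExample are illustrative only.

```scala
object UnpackExample {
  case class UnpackedObject(a: String, b: Char, c: Char)

  val s1 = Seq("aaa", "bbb", "ccc", "ddd")

  // Pad with a copy of the last character so sliding(2) yields
  // as many windows as s1 has elements: AB, BC, CD, DD.
  val chars = ('A' to 'D').toVector
  val s2 = (chars :+ chars.last).sliding(2).toSeq

  // Seq(b, c) destructures each two-element window directly in the pattern.
  val objects = s1.zip(s2).map { case (a, Seq(b, c)) => UnpackedObject(a, b, c) }
}
```

Because the padding is derived from chars.last, the same code should keep working if the range changes, as long as it has one element per entry of s1.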

Related

Reverse a Map which has A Set as its value using HOF

I am trying to reverse a map that has a String as the key and a set of numbers as its value.
My goal is to create a list of tuples, each holding a number and the list of strings that had that number in their value set.
I have this so far:
def flipMap(toFlip: Map[String, Set[Int]]): List[(Int, List[String])] = {
toFlip.flatMap(_._2).map(x => (x, toFlip.keys.toList)).toList
}
but it is only assigning every String to every Int
val map = Map(
"A" -> Set(1,2),
"B" -> Set(2,3)
)
should produce:
List((1, List(A)), (2, List(A, B)), (3, List(B)))
but is producing:
List((1, List(A, B)), (2, List(A, B)), (3, List(A, B)))
This works too, but it may not be exactly what you need; some conversions may be required to get the exact data type:
toFlip.foldLeft(Map.empty[Int, Set[String]]) {
case (acc, (key, numbersSet)) =>
numbersSet.foldLeft(acc) {
(updatingMap, newNumber) =>
updatingMap.updatedWith(newNumber) {
case Some(existingSet) => Some(existingSet + key)
case None => Some(Set(key))
}
}
}
I used Set to avoid duplicate key insertions in the inner collection, and Map instead of the outer List for better lookup.
You can do something like this:
def flipMap(toFlip: Map[String, Set[Int]]): List[(Int, List[String])] =
toFlip
.toList
.flatMap {
case (key, values) =>
values.map(value => value -> key)
}.groupMap(_._1)(_._2)
.view
.mapValues(_.distinct)
.toList
Note, I personally would return a Map instead of a List
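A minimal sketch of what a Map-returning variant could look like, assuming Scala 2.13 or later for groupMap. The name flipToMap is hypothetical and not part of the original answer; the inner collection is a Set here so that duplicates disappear without a separate distinct call.

```scala
object FlipMapExample {
  // Hypothetical name: same idea as flipMap, but the result stays a Map.
  def flipToMap(toFlip: Map[String, Set[Int]]): Map[Int, Set[String]] =
    toFlip.toList
      .flatMap { case (key, values) => values.map(value => value -> key) }
      .groupMap(_._1)(_._2)  // Map[Int, List[String]]
      .view
      .mapValues(_.toSet)    // converting to Set drops duplicate keys
      .toMap

  val flipped = flipToMap(Map("A" -> Set(1, 2), "B" -> Set(2, 3)))
}
```

With the map from the question, flipped associates 1 with Set(A), 2 with Set(A, B) and 3 with Set(B).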
Or, if you have cats in scope:
import cats.implicits._ // provides combineAll and the Monoid instances for Map and Set
def flipMap(toFlip: Map[String, Set[Int]]): Map[Int, Set[String]] =
toFlip.view.flatMap {
case (key, values) =>
values.map(value => Map(value -> Set(key)))
}.toList.combineAll
// both scala2 & scala3
scala> map.flatten{ case(k, s) => s.map(v => (k, v)) }.groupMapReduce{ case(k, v) => v }{case(k, v) => List(k)}{ _ ++ _ }
val res0: Map[Int, List[String]] = Map(1 -> List(A), 2 -> List(A, B), 3 -> List(B))
// scala3 only
scala> map.flatten((k, s) => s.map(v => (k, v))).groupMapReduce((k, v) => v)((k, v) => List(k))( _ ++ _ )
val res1: Map[Int, List[String]] = Map(1 -> List(A), 2 -> List(A, B), 3 -> List(B))

Scala - Merge two lists of tuples by common elements

How can I merge two lists of tuples in a way that simulates Chasles' relation?
(a, b), (b, c) => (a, c)
Here is an example:
val l1 = List(("Dan", "b"), ("Dan","a"), ("Bart", "c"))
val l2 = List(("a", "1"), ("c", "1"), ("b", "3"), ("a", "2"))
Expected result would be:
val result = List(("Dan", "3"), ("Dan", "1"), ("Dan", "2"), ("Bart", "1"))
You basically want to consider all pairs of one element from the first list and one from the second and keep those where the "b" elements match.
In other words, we want to map over l1 and, inside that map, map over l2, meaning we consider all the pairs of an element from each list, so something like:
l1.map(x => l2.map(y => (x,y)))
That's not quite right, though, since we now have a List[List[((String, String),(String,String))]]--we needed to flatmap:
l1.flatMap(x => l2.map(y => (x,y)))
Now we have to filter to keep just the pairs we want and tidy up:
l1.flatMap(x => l2.map(y => (x,y)))
.filter{ case ((_,y),(b,_)) => y == b }
.map {case ((x, _),(_,c)) => (x,c) }
which gives us
List((Dan,3), (Dan,1), (Dan,2), (Bart,1))
That was kind of an ugly mess, so let's tidy it up a bit. We can filter l2 inside the original flatMap and build the result there, so we don't have to juggle the tuple of tuples:
l1.flatMap{ case (x,y) =>
l2.filter{ case (b, _) => y == b}
.map{ case (_, c) => (x, c)} }
This is one of those cases where it's easier to read a for comprehension:
for {
(x, y) <- l1
(b, c) <- l2
if y == b
} yield (x,c)
For each tuple in l1 you can filter l2 to select the tuples with the matching first element:
def join[A, B, C](l1: List[(A, B)], l2: List[(B, C)]): List[(A, C)] = {
for {
(key, subkey) <- l1
value <- l2.collect { case (`subkey`, value) => value }
} yield key -> value
}
You could also convert l2 into a Map beforehand for better selection performance:
def join[A, B, C](l1: List[(A, B)], l2: List[(B, C)]): List[(A, C)] = {
val valuesMap = l2.groupBy(_._1)
for {
(key, subkey) <- l1
(_, value) <- valuesMap.getOrElse(subkey, Nil)
} yield key -> value
}
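As a quick check, the Map-based join can be run against the lists from the question. This is only a sketch that wraps the function in an illustrative object (JoinExample is not a name from the answers); the result keeps the order of l1 and, within each key, the order of l2.

```scala
object JoinExample {
  def join[A, B, C](l1: List[(A, B)], l2: List[(B, C)]): List[(A, C)] = {
    // Index l2 by its first component once, instead of scanning it per element.
    val valuesMap = l2.groupBy(_._1)
    for {
      (key, subkey) <- l1
      (_, value)    <- valuesMap.getOrElse(subkey, Nil)
    } yield key -> value
  }

  val l1 = List(("Dan", "b"), ("Dan", "a"), ("Bart", "c"))
  val l2 = List(("a", "1"), ("c", "1"), ("b", "3"), ("a", "2"))

  val result = join(l1, l2)
  // result: List((Dan,3), (Dan,1), (Dan,2), (Bart,1))
}
```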

Transforming one record into multiple records

If the format of the input is
(x1,(a,b,c,List(key1, key2)))
(x2,(a,b,c,List(key3)))
and I would like to achieve this output
(key1,(a,b,c,x1))
(key2,(a,b,c,x1))
(key3,(a,b,c,x2))
Here is the code:
var hashtags = joined_d.map(x => (x._1, (x._2._1._1, x._2._2, x._2._1._4, getHashTags(x._2._1._4))))
var hashtags_keys = hashtags.map(x => if(x._2._4.size == 0) (x._1, (x._2._1, x._2._2, x._2._3, 0)) else
x._2._4.map(y => (y, (x._2._1, x._2._2, x._2._3, 1))))
The function getHashTags() returns a list. If the list is not empty, we want to use each element in the list as the new key. How should I work around this issue?
With rdd created as:
val rdd = sc.parallelize(
Seq(
("x1",("a","b","c",List("key1", "key2"))),
("x2", ("a", "b", "c", List("key3")))
)
)
You can use flatMap like this:
rdd.flatMap{ case (x, (a, b, c, list)) => list.map(k => (k, (a, b, c, x))) }.collect
// res12: Array[(String, (String, String, String, String))] =
// Array((key1,(a,b,c,x1)),
// (key2,(a,b,c,x1)),
// (key3,(a,b,c,x2)))
Here's one way to do it:
val rdd = sc.parallelize(Seq(
("x1", ("a", "b", "c", List("key1", "key2"))),
("x2", ("a", "b", "c", List("key3")))
))
val rdd2 = rdd.flatMap{
case (x, (a, b, c, l)) => l.map( (_, (a, b, c, x) ) )
}
rdd2.collect
// res1: Array[(String, (String, String, String, String))] = Array((key1,(a,b,c,x1)), (key2,(a,b,c,x1)), (key3,(a,b,c,x2)))
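Neither answer covers the empty-list case mentioned in the question. The sketch below uses plain Scala collections rather than an RDD so that it runs without Spark, and it assumes (this is a guess at the intent, not something the question states precisely) that a record with no keys should be kept under its original key. The x3 record and the names in FlattenKeysExample are hypothetical; the same pattern-matching function should carry over to rdd.flatMap.

```scala
object FlattenKeysExample {
  type Record = (String, (String, String, String, List[String]))

  val records: List[Record] = List(
    ("x1", ("a", "b", "c", List("key1", "key2"))),
    ("x2", ("a", "b", "c", List("key3"))),
    ("x3", ("a", "b", "c", Nil)) // hypothetical record with no keys
  )

  // Assumption: a record whose key list is empty is kept under its original key.
  val rekeyed: List[(String, (String, String, String, String))] =
    records.flatMap {
      case (x, (a, b, c, Nil))  => List((x, (a, b, c, x)))
      case (x, (a, b, c, keys)) => keys.map(k => (k, (a, b, c, x)))
    }
}
```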

Zip two lists of different lengths with default element to fill

Assume we have the following lists of different size:
val list1 = List("a", "b", "c")
val list2 = List("x", "y")
Now I want to merge these 2 lists and create a new list with the string elements being concatenated:
val desiredResult = List("ax", "by", "c")
I tried
val wrongResult = (list1, list2).zipped map (_ + _)
as proposed here, but this doesn't work as intended, because zip discards those elements of the longer list that can't be matched.
How can I solve this problem? Is there a way to zip the lists and give a "default element" (like the empty string in this case) if one list is longer?
The method you are looking for is .zipAll:
scala> val list1 = List("a", "b", "c")
list1: List[String] = List(a, b, c)
scala> val list2 = List("x", "y")
list2: List[String] = List(x, y)
scala> list1.zipAll(list2, "", "")
res0: List[(String, String)] = List((a,x), (b,y), (c,""))
.zipAll takes 3 arguments:
the iterable to zip with
the default value if this (the collection .zipAll is called on) is shorter
the default value if the other collection is shorter
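To get from the zipped pairs to the concatenated strings the question asks for, a map over the pairs is enough. A small sketch, with ZipAllExample and merged as illustrative names:

```scala
object ZipAllExample {
  val list1 = List("a", "b", "c")
  val list2 = List("x", "y")

  // The empty string is the filler for whichever list runs out first,
  // so concatenation leaves the unmatched element unchanged.
  val merged = list1.zipAll(list2, "", "").map { case (a, b) => a + b }
  // merged: List(ax, by, c)
}
```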
The API-based zipAll is the way to go, yet you can implement it yourself (as an exercise), for instance as follows:
implicit class OpsSeq[A](val xs: Seq[A]) extends AnyVal {
// B is a method type parameter so that it is inferred from ys at the call site
def zipAll2[B](ys: Seq[B], xDefault: A, yDefault: B) = {
val xs2 = xs ++ Seq.fill(ys.size-xs.size)(xDefault)
val ys2 = ys ++ Seq.fill(xs.size-ys.size)(yDefault)
xs2.zip(ys2)
}
}
Hence for instance
Seq(1,2).zipAll2(Seq(3,4,5),10,20)
List((1,3), (2,4), (10,5))
and
list1.zipAll2(list2, "", "")
List((a,x), (b,y), (c,""))
A recursive version,
def zipAll3[A,B](xs: Seq[A], ys: Seq[B], xd: A, yd: B): Seq[(A,B)] = {
(xs,ys) match {
case (Seq(), Seq()) => Seq()
case (x +: xss, Seq()) => (x,yd) +: zipAll3(xss, Seq(), xd, yd)
case (Seq(), y +: yss) => (xd,y) +: zipAll3(Seq(), yss, xd, yd)
case (x +: xss, y +: yss) => (x,y) +: zipAll3(xss, yss, xd, yd)
}
}
with default xd and default yd values.

What's the idiomatic way to map producing 0 or 1 results per entry?

What's the idiomatic way to call map over a collection producing 0 or 1 result per entry?
Suppose I have:
val data = Array("A", "x:y", "d:e")
What I'd like as a result is:
val target = Array(("x", "y"), ("d", "e"))
(drop anything without a colon, split on colon and return tuples)
So in theory I think I want to do something like:
val attempt1 = data.map( arg => {
arg.split(":", 2) match {
case Array(l,r) => (l, r)
case _ => (None, None)
}
}).filter( _._1 != None )
What I'd like to do is avoid the need for the any-case and get rid of the filter.
I could do this by pre-filtering (but then I have to test the regex twice):
val attempt2 = data.filter( _.contains(":") ).map( arg => {
val Array(l,r) = arg.split(":", 2)
(l,r)
})
Last, I could use Some/None and flatMap, which does get rid of the need to filter, but is it what most Scala programmers would expect?
val attempt3 = data.flatMap( arg => {
arg.split(":", 2) match {
case Array(l,r) => Some((l,r))
case _ => None
}
})
It seems to me like there'd be an idiomatic way to do this in Scala, is there?
With a Regex extractor and collect :-)
scala> val R = "(.+):(.+)".r
R: scala.util.matching.Regex = (.+):(.+)
scala> Array("A", "x:y", "d:e") collect {
| case R(a, b) => (a, b)
| }
res0: Array[(String, String)] = Array((x,y), (d,e))
Edit:
If you want a map, you can do:
scala> val x: Map[String, String] = Array("A", "x:y", "d:e").collect { case R(a, b) => (a, b) }.toMap
x: Map[String,String] = Map(x -> y, d -> e)
If performance is a concern, on Scala 2.12 and earlier you can use collection.breakOut as shown below to avoid creating an intermediate array (breakOut was removed in Scala 2.13):
scala> val x: Map[String, String] = Array("A", "x:y", "d:e").collect { case R(a, b) => (a, b) } (collection.breakOut)
x: Map[String,String] = Map(x -> y, d -> e)