I have a nested tuple structure like (String,(String,Double)) and I want to transform it to (String,String,Double). I have various kinds of nested tuple, and I don't want to transform each manually. Is there any convenient way to do that?
If you use shapeless, this is exactly what you need, I think.
There is no flatten on a tuple. But if you know the structure, you can do something like this:
implicit def flatten1[A, B, C](t: ((A, B), C)): (A, B, C) = (t._1._1, t._1._2, t._2)
implicit def flatten2[A, B, C](t: (A, (B, C))): (A, B, C) = (t._1, t._2._1, t._2._2)
These flatten tuples of any element types. Declaring them implicit, as here, is optional; without the keyword you call flatten1 or flatten2 explicitly. They only produce three-element results. You can flatten tuples like:
(1, ("hello", 42.0)) => (1, "hello", 42.0)
(("test", 3.7f), "hi") => ("test", 3.7f, "hi")
A tuple with more than one level of nesting cannot be flattened completely, because the return type has only three elements:
((1, (2, 3)),4) => (1, (2, 3), 4)
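To make that limitation concrete, here is a minimal sketch with the two conversions written as plain (non-implicit) methods, plus one extra method for a single deeper shape. The names FlattenByShape and flattenDeep are hypothetical; with this approach every additional nesting shape needs its own method.

```scala
object FlattenByShape {
  // Left-nested pair: ((A, B), C) => (A, B, C)
  def flatten1[A, B, C](t: ((A, B), C)): (A, B, C) = (t._1._1, t._1._2, t._2)

  // Right-nested pair: (A, (B, C)) => (A, B, C)
  def flatten2[A, B, C](t: (A, (B, C))): (A, B, C) = (t._1, t._2._1, t._2._2)

  // Hypothetical extra method for one specific deeper shape: ((A, (B, C)), D)
  def flattenDeep[A, B, C, D](t: ((A, (B, C)), D)): (A, B, C, D) = t match {
    case ((a, (b, c)), d) => (a, b, c, d)
  }
}
```

Calling FlattenByShape.flatten1 on ((1, (2, 3)), 4) still leaves the inner pair in place, while flattenDeep handles exactly that shape and nothing else.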
Not sure about the efficiency of this, but you can convert a tuple to a List with tuple.productIterator.toList, then flatten the nested lists:
scala> val tuple = ("top", ("nested", 42.0))
tuple: (String, (String, Double)) = (top,(nested,42.0))
scala> tuple.productIterator.map({
| case (item: Product) => item.productIterator.toList
| case (item: Any) => List(item)
| }).toList.flatten
res0: List[Any] = List(top, nested, 42.0)
Complementing the answer above:
Paste this utility code:
import shapeless._
import ops.tuple.FlatMapper
import syntax.std.tuple._
trait LowPriorityFlatten extends Poly1 {
implicit def default[T] = at[T](Tuple1(_))
}
object flatten extends LowPriorityFlatten {
implicit def caseTuple[P <: Product](implicit lfm: Lazy[FlatMapper[P, flatten.type]]) =
at[P](lfm.value(_))
}
then you are able to flatten any nested tuple:
scala> val a = flatten(((1,2),((3,4),(5,(6,(7,8))))))
a: (Int, Int, Int, Int, Int, Int, Int, Int) = (1,2,3,4,5,6,7,8)
Note that this solution does not preserve user-defined case classes: a case class is itself a Product, so it is flattened into its fields, and in the example below only the wrapped String values appear in the output.
scala> val b = flatten(((Cat("c"), Dog("d")), Cat("c")))
b: (String, String, String) = (c,d,c)
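This behaviour can be reproduced with the standard library alone. In the sketch below, Cat and Dog are assumed to be single-field case classes (the post does not show their definitions), and fieldsOf is a hypothetical helper. It shows that a case class instance is a Product whose elements are its constructor fields, which is presumably why a Product-based flattener descends into it.

```scala
object CaseClassIsProduct {
  // Assumed definitions; the original post does not show them.
  case class Cat(name: String)
  case class Dog(name: String)

  // A case class instance is a Product whose elements are its constructor fields.
  def fieldsOf(p: Product): List[Any] = p.productIterator.toList
}
```

For example, CaseClassIsProduct.fieldsOf(CaseClassIsProduct.Cat("c")) yields a one-element list holding the String "c".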
In my opinion simple pattern matching would work
scala> val motto = (("dog", "food"), "tastes good")
val motto: ((String, String), String) = ((dog,food),tastes good)
scala> motto match {
| case ((it, really), does) => (it, really, does)
| }
val res0: (String, String, String) = (dog,food,tastes good)
Or if you have a collection of such tuples:
scala> val motto = List(
| (("dog", "food"), "tastes good")) :+ (("cat", "food"), "tastes bad")
val motto: List[((String, String), String)] = List(((dog,food),tastes good), ((cat,food),tastes bad))
scala> motto.map {
| case ((one, two), three) => (one, two, three)
| }
val res2: List[(String, String, String)] = List((dog,food,tastes good), (cat,food,tastes bad))
I think it would be convenient even if you have several cases.
Related
I have data like this in an RDD:
RDD[((Int, Int, Int), ((Int, Int), Int))]
as:
(((9,679,16),((2,274),1)), ((250,976,13),((2,218),1)))
I want output as :
((9,679,16,2,274,1),(250,976,13,2,218,1))
After Joining 2 rdds with:
val joinSale = salesTwo.join(saleFinal)
I got that result set. I tried the following code.
joinSale.flatMap(x => x).take(100).foreach(println)
I have tried map/flatMap but couldn't do it. Any ideas on how to implement a scenario like this? Thanks in advance.
You can do this with pattern matching in scala. Simply wrap your tuple modification logic within a map similar to the below:
val mappedJoinSale = joinSale.map { case ((a, b, c), ((d, e), f)) => (a, b, c, d, e, f) }
Using your example, we have:
scala> val example = sc.parallelize(Array(((9,679,16),((2,274),1)), ((250,976,13),((2,218),1))))
example: org.apache.spark.rdd.RDD[((Int, Int, Int), ((Int, Int), Int))] = ParallelCollectionRDD[0] at parallelize at <console>:12
scala> val mapped = example.map { case ((a, b, c), ((d, e), f)) => (a, b, c, d, e, f) }
mapped: org.apache.spark.rdd.RDD[(Int, Int, Int, Int, Int, Int)] = MappedRDD[1] at map at <console>:14
scala> mapped.take(2).foreach(println)
...
(9,679,16,2,274,1)
(250,976,13,2,218,1)
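The same destructuring works on ordinary Scala collections, which makes it easy to try without a Spark context. The sketch below uses a plain List in place of the RDD (an assumption made for illustration; the object and value names are hypothetical). It also hints at why flatMap(x => x) did not help: flatMap expects the function to return a collection, and a tuple is not one.

```scala
object FlattenJoinedRows {
  // Same element shape as the joined RDD: ((Int, Int, Int), ((Int, Int), Int))
  val rows: List[((Int, Int, Int), ((Int, Int), Int))] =
    List(((9, 679, 16), ((2, 274), 1)), ((250, 976, 13), ((2, 218), 1)))

  // The nested pattern names every leaf, then rebuilds a flat 6-tuple.
  val flat: List[(Int, Int, Int, Int, Int, Int)] =
    rows.map { case ((a, b, c), ((d, e), f)) => (a, b, c, d, e, f) }
}
```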
You could also create a generic tuple flattener using the marvelous shapeless library as follows:
import shapeless._
import shapeless.ops.tuple
trait LowLevelFlatten extends Poly1 {
implicit def anyFlat[T] = at[T](x => Tuple1(x))
}
object concat extends Poly2 {
implicit def atTuples[T1, T2](implicit prepend: tuple.Prepend[T1, T2]): Case.Aux[T1, T2, prepend.Out] =
at[T1,T2]((t1,t2) => prepend(t1,t2))
}
object flatten extends LowLevelFlatten {
implicit def tupleFlat[T, M](implicit
mapper: tuple.Mapper.Aux[T, flatten.type, M],
reducer: tuple.LeftReducer[M, concat.type]
): Case.Aux[T, reducer.Out] =
at[T](t => reducer(mapper(t)))
}
Now, in any code that imports shapeless._, you can use it as:
joinSale.map(flatten)
val list = List((1,2), (3,4))
list.map(tuple => {
val (a, b) = tuple
do_something(a,b)
})
// the previous can be shortened as follows
list.map{ case(a, b) =>
do_something(a,b)
}
// similarly, how can I shorten this (and avoid declaring the 'tuple' variable)?
def f(tuple: (Int, Int)) {
val (a, b) = tuple
do_something(a,b)
}
// here are two ways, but they are still not very short,
// and neither avoids declaring the 'tuple' variable
def f(tuple: (Int, Int)) {
tuple match {
case (a, b) => do_something(a,b)
}
}
def f(tuple: (Int, Int)): Unit = tuple match {
case (a, b) => do_something(a,b)
}
Use tupled
scala> def doSomething = (a: Int, b: Int) => a + b
doSomething: (Int, Int) => Int
scala> doSomething.tupled((1, 2))
res0: Int = 3
scala> def f(tuple: (Int, Int)) = doSomething.tupled(tuple)
f: (tuple: (Int, Int))Int
scala> f((1,2))
res1: Int = 3
scala> f(1,2) // this is due to scala auto-tupling
res2: Int = 3
tupled is defined for every FunctionN with N >= 2, and returns a function expecting the parameters wrapped in a tuple.
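When do_something is a method rather than a function value, it has no tupled member of its own. A minimal sketch, assuming a two-argument method named add as a stand-in for do_something: convert the method to a function value first, then call tupled on it or use Function.tupled from the standard library.

```scala
object TupledSketch {
  // A two-argument method, standing in for do_something (hypothetical body).
  def add(a: Int, b: Int): Int = a + b

  // Methods have no .tupled; a function value with the same signature does.
  val addFn: (Int, Int) => Int = add
  val addTupled: ((Int, Int)) => Int = addFn.tupled

  // Function.tupled from the standard library performs the same conversion.
  val addTupled2: ((Int, Int)) => Int = Function.tupled(addFn)

  // The tupled function can be passed straight to map over a list of pairs.
  val sums: List[Int] = List((1, 2), (3, 4)).map(addTupled)
}
```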
While this might look like a trivial suggestion, the f function can be further simplified by just using _1 and _2 on the tuple.
def f(tuple: (Int, Int)): Unit =
do_something(tuple._1, tuple._2)
Obviously by doing this you're affecting readability (some meta-information about the meaning of the 1st and 2nd parameter of the tuple is removed) and should you wish to use elements of the tuple somewhere else in the f method you will need to extract them again.
Though for many uses this might be still the easiest, shortest and most intuitive alternative.
If I understand correctly you are trying to pass a tuple to a method with 2 args?
def f(tuple: (Int,Int)) = do_something(tuple._1, tuple._2)
By more readable, I mean giving variable names instead of using _1 and _2 on the tuple.
In this case, it's a good idea to use a case class instead of a tuple, especially since it only takes one line:
case class IntPair(a: Int, b: Int)
def f(pair: IntPair) = do_something(pair.a, pair.b)
If you get (Int, Int) from external code which can't be changed (or you don't want to change), you could add a method converting from a tuple to IntPair.
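As a sketch of that conversion, a helper on the companion object is enough. The names fromTuple and doSomething are hypothetical; doSomething stands in for do_something, whose body is not given in the question.

```scala
object IntPairSketch {
  case class IntPair(a: Int, b: Int)

  object IntPair {
    // Hypothetical helper: build an IntPair from a plain tuple.
    def fromTuple(t: (Int, Int)): IntPair = IntPair(t._1, t._2)
  }

  // Stand-in for do_something, which the question leaves undefined.
  def doSomething(a: Int, b: Int): Int = a + b

  // The fields now carry names instead of _1 and _2.
  def f(pair: IntPair): Int = doSomething(pair.a, pair.b)
}
```

External tuples can then be adapted at the boundary, for example IntPairSketch.f(IntPairSketch.IntPair.fromTuple((1, 2))).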
Another option: {(a: Int, b: Int) => a + b}.tupled.apply(tuple). Unfortunately, {case (a: Int, b: Int) => a + b}.apply(tuple) doesn't work.
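A likely reason the second form fails is that a bare { case ... } block has no expected type, so the compiler cannot tell which function type to build from it. As a sketch, ascribing the function type explicitly lets the pattern-matching literal be applied directly (the object and value names are hypothetical):

```scala
object CaseLiteralSketch {
  val tuple: (Int, Int) = (1, 2)

  // The ascription supplies the expected type ((Int, Int)) => Int,
  // so the case block can be typed as a function and applied at once.
  val sum: Int = ({ case (a, b) => a + b }: ((Int, Int)) => Int).apply(tuple)
}
```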
I've recently come across a problem. I'm trying to flatten "tail-nested" tuples in a compiler-friendly way, and I've come up with the code below:
implicit def FS[T](x: T): List[T] = List(x)
implicit def flatten[T,V](x: (T,V))(implicit ft: T=>List[T], fv: V=>List[T]) =
ft(x._1) ++ fv(x._2)
This above code works well for flattening tuples I am calling "tail-nested" like the ones below.
flatten((1,2)) -> List(1,2)
flatten((1,(2,3))) -> List(1,2,3)
flatten((1,(2,(3,4)))) -> List(1,2,3,4)
However, I seek to make my solution more robust. Consider a case where I have a list of these higher-kinded "tail-nested" tuples.
val l = List( (1,2), (1,(2,3)), (1,(2,(3,4))) )
The inferred type signature of this would be List[(Int, Any)] and this poses a problem for an operation such as map, which would fail with:
error: No implicit view available from Any => List[Int]
This error makes sense to me because of the nature of my recursive implicit chain in the flatten function. However, I was wondering: is there any way I can make my method of flattening the tuples more robust so that higher order functions such as map mesh well with it?
EDIT:
As Bask.ws pointed out, the Product trait offers potential for a nice solution. The below code illustrates this:
def flatten(p: Product): List[_] = p.productIterator.toList.flatMap {x => x match {
case pr: Product => flatten(pr)
case _ => List(x)
}}
The result type of this new flatten call is always List[Any]. My problem would be solved if there was a way to have the compiler tighten this bound a bit. In parallel to my original question, does anyone know if it is possible to accomplish this?
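One way to recover a narrower element type, sketched under the assumption that every leaf is expected to be an Int, is to keep the recursive Product-based flatten and narrow the result at runtime with collect. This gives no compile-time safety: leaves that are not Int are silently dropped. The name flattenInts is hypothetical.

```scala
object NarrowedFlatten {
  // Recursive flatten over Product, as in the question's edit; the result is List[Any].
  def flatten(p: Product): List[Any] = p.productIterator.toList.flatMap {
    case pr: Product => flatten(pr)
    case x           => List(x)
  }

  // Narrow List[Any] to List[Int] by keeping only the Int leaves.
  def flattenInts(p: Product): List[Int] = flatten(p).collect { case i: Int => i }

  val l: List[(Int, Any)] = List((1, 2), (1, (2, 3)), (1, (2, (3, 4))))
  val result: List[List[Int]] = l.map(flattenInts)
}
```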
UPD Compile-time fail solution added
I have one solution that may suit you. The types of your first 3 examples are resolved at compile time: Int, Tuple2[Int, Int], Tuple2[Int, Tuple2[Int, Int]]. For your example with the list, you have a heterogeneous list with actual type List[(Int, Any)], and you have to resolve the second type at runtime (or perhaps it can be done with a macro). So you may want to actually write implicit def flatten[T](x: (T,Any)), as your error suggests.
Here is the fast solution. It gives a couple of warnings, but it works nicely:
implicit def FS[T](x: T): List[T] = List(x)
implicit def FP[T](x: Product): List[T] = {
val res = (0 until x.productArity).map(i => x.productElement(i) match {
case p: Product => FP[T](p)
case e: T => FS(e)
case _ => sys.error("incorrect element")
})
res.toList.flatten
}
implicit def flatten[T](x: (T,Any))(implicit ft: T=>List[T], fp: Product =>List[T]) =
ft(x._1) ++ (x._2 match {
case p: Product => fp(p)
case t: T => ft(t)
})
val l = List( (1,2), (1,(2,3)), (1,(2,(3,4))) )
scala> l.map(_.flatten)
res0: List[List[Int]] = List(List(1, 2), List(1, 2, 3), List(1, 2, 3, 4))
UPD
I have researched the problem a little more and found a simple solution that builds a homogeneous list and can fail at compile time. It is fully typed, without Any or match, and it looks like the compiler now resolves the nested implicits correctly.
case class InfiniteTuple[T](head: T, tail: Option[InfiniteTuple[T]] = None) {
def flatten: List[T] = head +: tail.map(_.flatten).getOrElse(Nil)
}
implicit def toInfiniteTuple[T](x: T): InfiniteTuple[T] = InfiniteTuple(x)
implicit def toInfiniteTuple2[T, V](x: (T, V))(implicit ft: V => InfiniteTuple[T]): InfiniteTuple[T] =
InfiniteTuple(x._1, Some(ft(x._2)))
def l: List[InfiniteTuple[Int]] = List( (1,2), (1,(2,3)), (1,(2,(3,4)))) //OK
def c: List[InfiniteTuple[Int]] = List( (1,2), (1,(2,3)), (1,(2,(3,"44"))))
//Compile-time error
//<console>:11: error: No implicit view available from (Int, (Int, java.lang.String)) => InfiniteTuple[Int]
Then you can implement any flatten you want. For example, the one above:
scala> l.map(_.flatten)
res0: List[List[Int]] = List(List(1, 2), List(1, 2, 3), List(1, 2, 3, 4))
I have some financial data gathered in a List[(Int, Double)], like this:
val snp = List((2001, -13.0), (2002, -23.4))
With this, I wrote a formula that would transform the list, through map, into another list (to demonstrate investment grade life insurance), where losses below 0 are converted to 0, and gains above 15 are converted to 15, like this:
case class EiulLimits(lower:Double, upper:Double)
def eiul(xs: Seq[(Int, Double)], limits:EiulLimits): Seq[(Int, Double)] = {
xs.map(item => (item._1,
if (item._2 < limits.lower) limits.lower
else if (item._2 > limits.upper) limits.upper
else item._2))
}
Is there any way to extract the tuple's values inside this, so I don't have to use the clunky _1 and _2 notation?
List((1,2),(3,4)).map { case (a,b) => ... }
The case keyword invokes the pattern matching/unapply logic.
Note the use of curly braces instead of parens after map
And a slower but shorter quick rewrite of your code:
case class EiulLimits(lower: Double, upper: Double) {
def apply(x: Double) = List(x, lower, upper).sorted.apply(1)
}
def eiul(xs: Seq[(Int, Double)], limits: EiulLimits) = {
xs.map { case (a,b) => (a, limits(b)) }
}
Usage:
scala> eiul(List((1, 1.0), (3, 3.0), (4, 4.0), (9, 9.0)), EiulLimits(3.0, 7.0))
res7: Seq[(Int, Double)] = List((1,3.0), (3,3.0), (4,4.0), (9,7.0))
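If the sort-based clamp is too slow for large inputs, the same bounds can be applied with math.max and math.min. This is a sketch of an alternative, not the answer's own code; the method name clamp and the pattern variable names are hypothetical.

```scala
object EiulClampSketch {
  case class EiulLimits(lower: Double, upper: Double) {
    // Clamp x into [lower, upper] without allocating or sorting a list.
    def clamp(x: Double): Double = math.max(lower, math.min(upper, x))
  }

  // Pattern matching names the tuple's parts instead of using _1 and _2.
  def eiul(xs: Seq[(Int, Double)], limits: EiulLimits): Seq[(Int, Double)] =
    xs.map { case (year, gain) => (year, limits.clamp(gain)) }
}
```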
scala> val snp = List((2001, -13.0), (2002, -23.4))
snp: List[(Int, Double)] = List((2001,-13.0), (2002,-23.4))
scala> snp.map {case (_, x) => x}
res2: List[Double] = List(-13.0, -23.4)
scala> snp.map {case (x, _) => x}
res3: List[Int] = List(2001, 2002)
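When both columns are needed at once, the standard library's unzip does the two extractions in a single call. A small sketch (the names years and gains are hypothetical):

```scala
object UnzipSketch {
  val snp: List[(Int, Double)] = List((2001, -13.0), (2002, -23.4))

  // unzip splits a list of pairs into a pair of lists.
  val (years, gains) = snp.unzip
}
```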
'map' preserves the number of elements, so using it on a Tuple seems sensible.
My attempts so far:
scala> (3,4).map(_*2)
error: value map is not a member of (Int, Int)
(3,4).map(_*2)
^
scala> (3,4).productIterator.map(_*2)
error: value * is not a member of Any
(3,4).productIterator.map(_*2)
^
scala> (3,4).productIterator.map(_.asInstanceOf[Int]*2)
res4: Iterator[Int] = non-empty iterator
scala> (3,4).productIterator.map(_.asInstanceOf[Int]*2).toList
res5: List[Int] = List(6, 8)
It looks quite painful... And I haven't even begun to try to convert it back to a tuple.
Am I doing it wrong? Could the library be improved?
In general, the element types of a tuple aren't the same, so map doesn't make sense. You can define a function to handle the special case, though:
scala> def map[A, B](as: (A, A))(f: A => B) =
as match { case (a1, a2) => (f(a1), f(a2)) }
map: [A,B](as: (A, A))(f: (A) => B)(B, B)
scala> val p = (1, 2)
p: (Int, Int) = (1,2)
scala> map(p){ _ * 2 }
res1: (Int, Int) = (2,4)
You could use the Pimp My Library pattern to call this as p.map(_ * 2).
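A minimal sketch of that enrichment with an implicit class follows. The method is called mapBoth here, a hypothetical name chosen to avoid clashing with any map member that tuples may already have in newer Scala versions; on older versions it could simply be named map.

```scala
object PairSyntax {
  // Enrichment for pairs whose two elements share a type.
  implicit class PairOps[A](val pair: (A, A)) {
    // Apply the same function to both elements.
    def mapBoth[B](f: A => B): (B, B) = (f(pair._1), f(pair._2))
  }

  // The implicit class is in scope here, so the call reads like a method on the tuple.
  val doubled: (Int, Int) = (1, 2).mapBoth(_ * 2)
}
```

Elsewhere, import PairSyntax._ brings the extra method into scope.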
UPDATE
Even when the types of the elements are not the same, Tuple2[A, B] is a Bifunctor, which can be mapped with the bimap operation.
scala> import scalaz._
import scalaz._
scala> import Scalaz._
import Scalaz._
scala> val f = (_: Int) * 2
f: (Int) => Int = <function1>
scala> val g = (_: String) * 2
g: (String) => String = <function1>
scala> f <-: (1, "1") :-> g
res12: (Int, String) = (2,11)
UPDATE 2
http://gist.github.com/454818
shapeless supports mapping and folding over tuples via an intermediary HList representation.
Sample REPL session:
scala> import shapeless._ ; import Tuples._
import shapeless._
import Tuples._
scala> object double extends (Int -> Int) (_*2)
defined module double
scala> (3, 4).hlisted.map(double).tupled
res0: (Int, Int) = (6,8)
Where the elements of the tuple are of different types you can map with a polymorphic function with type-specific cases,
scala> object frob extends Poly1 {
| implicit def caseInt = at[Int](_*2)
| implicit def caseString = at[String]("!"+_+"!")
| implicit def caseBoolean = at[Boolean](!_)
| }
defined module frob
scala> (23, "foo", false, "bar", 13).hlisted.map(frob).tupled
res1: (Int, String, Boolean, String, Int) = (46,!foo!,true,!bar!,26)
Update
As of shapeless 2.0.0-M1 mapping over tuples is supported directly. The above examples now look like this,
scala> import shapeless._, poly._, syntax.std.tuple._
import shapeless._
import poly._
import syntax.std.tuple._
scala> object double extends (Int -> Int) (_*2)
defined module double
scala> (3, 4) map double
res0: (Int, Int) = (6,8)
scala> object frob extends Poly1 {
| implicit def caseInt = at[Int](_*2)
| implicit def caseString = at[String]("!"+_+"!")
| implicit def caseBoolean = at[Boolean](!_)
| }
defined module frob
scala> (23, "foo", false, "bar", 13) map frob
res1: (Int, String, Boolean, String, Int) = (46,!foo!,true,!bar!,26)
The map function takes an A => B and returns an F[B].
def map[A, B](f: A => B) : F[B]
As retronym wrote, Tuple2[A, B] is a Bifunctor, so you can look for the bimap function in scalaz or cats.
bimap is a function that maps both sides of the tuple:
def bimap[A, B, C, D](fa: A => C, fb: B => D): Tuple2[C, D]
Because Tuple2[A, B] holds two values and map can transform only one of them (by convention the right one), you can get the same effect from bimap by passing identity for the left side and your function for the right side of the tuple.
(3, 4).bimap(identity, _ * 2)
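For readers without scalaz or cats on the classpath, the idea can be sketched with a plain standard-library function. This standalone bimap is an illustration with a hypothetical signature (it takes the tuple as an explicit first argument), not the libraries' own definition.

```scala
object BimapSketch {
  // Map each side of a pair with its own function.
  def bimap[A, B, C, D](t: (A, B))(fa: A => C, fb: B => D): (C, D) =
    (fa(t._1), fb(t._2))

  // Leave the left side untouched and double the right side.
  val rightOnly: (Int, Int) = bimap((3, 4))(identity, _ * 2)

  // Different element types on each side, each with its own function.
  val both: (Int, String) = bimap((1, "1"))(_ * 2, _ * 2)
}
```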