Working with tuples in Scala

I want to do something like this (simplified quite heavily):
((1, 2, 3, 4, 5, 6), (6, 5, 4, 3, 2, 1)).zipped map (_ + _)
Ignore the actual values of the integers (although it's important that these are 6-tuples, actually :)). Essentially, I want to use this fairly regularly in a function which maintains a Map[String, (Int, Int, Int, Int, Int, Int)] when an existing element is updated.
As it is, Scala spits this out at me:
<console>:6: error: could not find implicit value for parameter w1: ((Int, Int, Int, Int, Int, Int)) => scala.collection.TraversableLike[El1,Repr1]
((1, 2, 3, 4, 5, 6), (6, 5, 4, 3, 2, 1)).zipped
If I use Seqs instead of tuples, everything works fine, but I want to enforce an arity of 6 in the type system (I'll probably type Record = (Int, Int, Int, Int, Int, Int) as a quick refactor shortly).
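For reference, the Seq version that does work (but gives up the arity guarantee):
(Seq(1, 2, 3, 4, 5, 6), Seq(6, 5, 4, 3, 2, 1)).zipped map (_ + _)
// Seq[Int] = List(7, 7, 7, 7, 7, 7)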
Can anyone offer some advice on what I'm doing wrong/why Scala won't deal with the code above? I thought it might work if I used a 2- or 3-arity tuple, seeing as Scala defines Tuple2 and Tuple3s (I understand that scaling tuple functions across an arbitrary n-arity is difficult), but I get the same error.
Thanks in advance for any help offered :).

You only want to map over tuples which have identical types--otherwise the map wouldn't make sense--but Tuple doesn't contain that in its type signature. But if you're willing to do a little work, you can set it up so that tuples work the way you requested:
Groundwork:
class TupTup6[A,B](a: (A,A,A,A,A,A), b: (B,B,B,B,B,B)) {
  def op[C](f: (A,B) => C) = ( f(a._1,b._1), f(a._2,b._2), f(a._3,b._3),
                               f(a._4,b._4), f(a._5,b._5), f(a._6,b._6) )
}
implicit def enable_tuptup6[A,B](ab: ((A,A,A,A,A,A), (B,B,B,B,B,B))) =
  new TupTup6(ab._1, ab._2)
Usage:
scala> ((1,2,3,4,5,6) , (6,5,4,3,2,1)) op { _ + _ }
res0: (Int, Int, Int, Int, Int, Int) = (7,7,7,7,7,7)

Here's a little inspiration I had:
class TupleZipper[T <: Product](t1: T) {
  private def listify(p: Product) = p.productIterator.toList
  def zipWith(t2: T) = (listify(t1), listify(t2)).zipped
}
implicit def mkZipper[T <: Product](t1: T) = new TupleZipper(t1)
// ha ha, it's arity magic
scala> ((1, 2, 3, 4, 5, 6)) zipWith ((6, 5, 4, 3, 2))
<console>:8: error: type mismatch;
found : (Int, Int, Int, Int, Int)
required: (Int, Int, Int, Int, Int, Int)
((1, 2, 3, 4, 5, 6)) zipWith ((6, 5, 4, 3, 2))
^
scala> ((1, 2, 3, 4, 5, 6)) zipWith ((6, 5, 4, 3, 2, 1))
res1: (List[Any], List[Any])#Zipped[List[Any],Any,List[Any],Any] = scala.Tuple2$Zipped@42e934e
scala> res1 map ((x, y) => x.asInstanceOf[Int] + y.asInstanceOf[Int])
res2: List[Int] = List(7, 7, 7, 7, 7, 7)
Yes, a bunch of Anys comes out the other end. Not real thrilling but not a lot you can do when you try to force yourself on Tuples this way.
Edit: oh, and of course the type system gives you the full monty here.
scala> ((1, 2, 3, 4, 5, 6)) zipWith ((6, 5, 4, 3, 2, "abc"))
<console>:8: error: type mismatch;
found : java.lang.String("abc")
required: Int
((1, 2, 3, 4, 5, 6)) zipWith ((6, 5, 4, 3, 2, "abc"))
^
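If you'd rather get a 6-tuple back than a List[Int], a hand-rolled conversion is enough -- a sketch, assuming the zipped result always has exactly six elements:
def toTuple6(xs: List[Int]): (Int, Int, Int, Int, Int, Int) =
  (xs(0), xs(1), xs(2), xs(3), xs(4), xs(5))
toTuple6(res2)   // (7,7,7,7,7,7)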

import scala.collection._
type Record = (Int, Int, Int, Int, Int, Int)
implicit def toIterable(r: Record) = new Iterable[Int] {
  def iterator = r.productIterator.asInstanceOf[Iterator[Int]]
}
implicit def cbf[From <: Iterable[Int]] = new generic.CanBuildFrom[From, Int, Record] {
  def apply(from: From) = apply
  def apply = new mutable.Builder[Int, Record] {
    var array = Array.ofDim[Int](6)
    var i = 0
    def +=(elem: Int) = {
      array(i) = elem
      i += 1
      this
    }
    def clear() = i = 0
    def result() = (array(0), array(1), array(2), array(3), array(4), array(5))
  }
}
usage:
scala> ((1, 2, 3, 4, 5, 6), (6, 5, 4, 3, 2, 1)).zipped.map{_ + _}
res1: (Int, Int, Int, Int, Int, Int) = (7,7,7,7,7,7)

Tuple2#zipped won't help you out here; it works when the contained elements are TraversableLike/IterableLike, which tuples aren't.
You'll probably want to define your own sumRecord function that takes two Records and returns their sum:
def sumRecord(a: Record, b: Record) = new Record(
  a._1 + b._1,
  a._2 + b._2,
  a._3 + b._3,
  a._4 + b._4,
  a._5 + b._5,
  a._6 + b._6
)
Then to use it with a Pair[Record, Record]:
val p : Pair[Record, Record] = ...
val summed = sumRecord(p._1, p._2)
Sure, there are abstractions available, but since Record is going to be fixed throughout your design, they have little value.
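For the Map[String, Record] case from the question, a sketch of how sumRecord slots in (updateRecord is just a hypothetical name):
def updateRecord(m: Map[String, Record], key: String, delta: Record): Map[String, Record] =
  m.updated(key, m.get(key).map(sumRecord(_, delta)).getOrElse(delta))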

short solution:
type Record = (Int, Int, Int, Int, Int, Int)
implicit def toList(r: Record) = r.productIterator.asInstanceOf[Iterator[Int]].toList
implicit def toTuple(l: List[Int]): Record = (l(0), l(1), l(2), l(3), l(4), l(5))
usage:
scala> ((1,2,3,4,5,6), (6,5,4,3,2,1)).zipped map {_ + _}: Record
res0: (Int, Int, Int, Int, Int, Int) = (7,7,7,7,7,7)

You can now easily achieve this with shapeless, this way:
import shapeless._
import shapeless.syntax.std.tuple._
val a = (1, 2, 3, 4, 5, 6)
val b = (6, 5, 4, 3, 2, 1)
object sum extends Poly1 {
  implicit def f = use((t: (Int, Int)) => t._1 + t._2)
}
val r = a.zip(b) map sum // r is a (Int, Int, Int, Int, Int, Int)
The drawback is the weird syntax you have to use to express the sum function, but everything is type-safe and type-checked.
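For reference, if the definitions above compile against your shapeless version, the result is statically a 6-tuple of Ints:
val check: (Int, Int, Int, Int, Int, Int) = r   // (7,7,7,7,7,7)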

As an update to Rex Kerr's answer: starting from Scala 2.10 you can use implicit classes, syntactic sugar that makes that solution even shorter.
implicit class TupTup6[A,B](x: ((A,A,A,A,A,A), (B,B,B,B,B,B))) {
  def op[C](f: (A,B) => C) = (
    f(x._1._1, x._2._1),
    f(x._1._2, x._2._2),
    f(x._1._3, x._2._3),
    f(x._1._4, x._2._4),
    f(x._1._5, x._2._5),
    f(x._1._6, x._2._6) )
}
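Usage stays exactly the same as with the implicit conversion version:
scala> ((1,2,3,4,5,6) , (6,5,4,3,2,1)) op { _ + _ }
res0: (Int, Int, Int, Int, Int, Int) = (7,7,7,7,7,7)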

You get the error because you treat the tuple as a collection.
Is it possible for you to use lists instead of tuples? Then the calculation is simple:
scala> List(1,2,3,4,5,6).zip(List(1,2,3,4,5,6)).map(x => x._1 + x._2 )
res6: List[Int] = List(2, 4, 6, 8, 10, 12)
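If you still want the arity-6 guarantee at the boundaries, you can also convert to and from the tuple yourself. A rough sketch along those lines (add is a hypothetical helper, not part of the answer above):
type Record = (Int, Int, Int, Int, Int, Int)
def add(a: Record, b: Record): Record = {
  val sums = a.productIterator.zip(b.productIterator).map {
    case (x, y) => x.asInstanceOf[Int] + y.asInstanceOf[Int]
  }.toList
  (sums(0), sums(1), sums(2), sums(3), sums(4), sums(5))
}
add((1, 2, 3, 4, 5, 6), (6, 5, 4, 3, 2, 1))   // (7,7,7,7,7,7)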

Related

scala case class from sequence

case class Foo(a: Int, b: Int, c: Int)
val s = Seq(1, 2, 3)
val t = (1, 2, 3)
I know I can create a case class from a tuple:
Foo.tupled(t)
but how can I create a case class from a sequence? I have ~10 integer elements in the sequence.
One option is to add a corresponding apply factory method to the companion object, something like so:
object Foo {
  def apply(xs: Seq[Int]): Option[Foo] = {
    xs match {
      case Seq(a, b, c) => Some(Foo(a, b, c))
      case _            => None
    }
  }
}
Foo(s) // : Option[Foo] = Some(value = Foo(a = 1, b = 2, c = 3))
How about this?
case class Foo(xs: Int*)
val a = Foo(1,2,3)
val b = Foo(1,2,3,4,5)
val c = Foo((1 to 10).toList: _*)
println(a.xs) // Seq[Int](1,2,3)
println(c.xs) // Seq[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

How to pass custom function to reduceByKey of RDD in scala

My requirement is to find the maximum of each group in an RDD.
I tried the below:
scala> val x = sc.parallelize(Array(Array("A",3), Array("B",5), Array("A",6)))
x: org.apache.spark.rdd.RDD[Array[Any]] = ParallelCollectionRDD[0] at parallelize at <console>:27
scala> x.collect
res0: Array[Array[Any]] = Array(Array(A, 3), Array(B, 5), Array(A, 6))
scala> x.filter(math.max(_,_))
<console>:30: error: wrong number of parameters; expected = 1
x.filter(math.max(_,_))
^
I also tried the below:
Option 1:
scala> x.filter((x: Int, y: Int) => { math.max(x,y)} )
<console>:30: error: type mismatch;
found : (Int, Int) => Int
required: Array[Any] => Boolean
x.filter((x: Int, y: Int) => { math.max(x,y)} )
Option 2:
scala> val myMaxFunc = (x: Int, y: Int) => { math.max(x,y)}
myMaxFunc: (Int, Int) => Int = <function2>
scala> myMaxFunc(56,12)
res10: Int = 56
scala> x.filter(myMaxFunc(_,_) )
<console>:32: error: wrong number of parameters; expected = 1
x.filter(myMaxFunc(_,_) )
How to get this right ?
I can only guess, but probably you want to do:
val rdd = sc.parallelize(Array(("A", 3), ("B", 5), ("A", 6)))
val max = rdd.reduceByKey(math.max)
println(max.collect().toList) // List((B,5), (A,6))
Instead of "How to get this right ?" you should have explained what your expected result is. I think you made a few mistakes:
using filter instead of reduceByKey (why??)
reduceByKey only works on PairRDDs, so you need tuples instead of Array[Any] (which is a bad type anyways)
you do not need to write your own wrapper function for math.max, you can just use it as-is
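If your input really does start as an RDD[Array[Any]] like in the question, one way to get to a pair RDD first (a sketch; it assumes every array holds a String key followed by an Int value):
val pairs = x.map { case Array(k: String, v: Int) => (k, v) }   // RDD[(String, Int)]
val maxPerKey = pairs.reduceByKey(math.max(_, _))
maxPerKey.collect()   // e.g. Array((B,5), (A,6))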

Update multiple values in a sequence

To get a sequence with one value updated, one can use
seq.updated(index, value)
I want to set a new value for a range of elements. Is there a library function for that? I currently use the following function:
def updatedSlice[A](seq: List[A], ind: Iterable[Int], value: A): List[A] =
  if (ind.isEmpty) seq
  else updatedSlice(seq.updated(ind.head, value), ind.tail, value)
Besides the need to write a function, this seems inefficient, and it also works only for lists rather than for arbitrary subclasses of Seq or for Strings. So,
is there a method that performs it?
how can I parametrize the function to take (and return) some subclass of Seq[A]?
To my knowledge there's no combinator that directly provides this functionality.
For the Seq part, well, it works only for List because you're taking a List as a parameter. Take a Seq, return a Seq and you already have one less problem.
Moreover, your implementation throws an IndexOutOfBoundsException if ind contains an index greater than or equal to the seq length.
Here's an alternative implementation (which uses a Set for O(1) contains):
def updatedAtIndexes[A](seq: Seq[A], ind: Set[Int], value: A): Seq[A] = seq.zipWithIndex.map {
  case (el, i) if ind.contains(i) => value
  case (el, _) => el
}
Example
updatedAtIndexes(List(1, 2, 3, 4, 5), Set(0, 2), 42) // List(42, 2, 42, 4, 5)
You can even make it prettier with a simple implicit class:
implicit class MyPimpedSeq[A](seq: Seq[A]) {
  def updatedAtIndexes(ind: Set[Int], value: A): Seq[A] = seq.zipWithIndex.map {
    case (el, i) if ind.contains(i) => value
    case (el, _) => el
  }
}
Examples
List(1, 2, 3, 4).updatedAtIndexes(Set(0, 2), 42) // List(42, 2, 42, 4)
Vector(1, 2, 3).updatedAtIndexes(Set(1, 2, 3), 42) // Vector(1, 42, 42)
No one at a computer has said:
scala> (1 to 10).toSeq patch (3, (1 to 5), 3)
res0: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3, 1, 2, 3, 4, 5, 7, 8, 9, 10)
Save your green checks for @Marth.
Note they're still working on it.
https://issues.scala-lang.org/browse/SI-8474
Which says something about less-frequently-used API.
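To spell out how patch maps onto the ask: it replaces a contiguous range, so setting three consecutive elements to one value looks like this (a quick sketch, not from the original answer):
(1 to 10).toSeq patch (3, Seq.fill(3)(0), 3)   // Vector(1, 2, 3, 0, 0, 0, 7, 8, 9, 10)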
Update: I glanced at the question a second time and saw that I misread it, oh well:
scala> implicit class x[A](as: Seq[A]) {
     |   def updatedAt(is: collection.Traversable[Int], a: A) = {
     |     (as /: is) { case (xx, i) => xx updated (i, a) } } }
defined class x
scala> (1 to 10) updatedAt (Seq(3,6,9), 0)
res9: Seq[Int] = Vector(1, 2, 3, 0, 5, 6, 0, 8, 9, 0)
Just a relaxing round of golf.
Update: s/relaxing/annoying
Looks like it needs more type params, but I don't have a time slice for it.
scala> implicit class slicer[A, B[_] <: Seq[_]](as: B[A]) {
     |   def updatedAt[That <: B[_]](is: Traversable[Int], a: A)(implicit cbf: CanBuildFrom[B[A], A, That]) =
     |     (as /: is) { case (x,i) => x updated[A,That] (i,a) }}
<console>:15: error: type arguments [A,That] conform to the bounds of none of the overloaded alternatives of
value updated: [B >: _$1, That](index: Int, elem: B)(implicit bf: scala.collection.generic.CanBuildFrom[Seq[_$1],B,That])That <and> [B >: A, That](index: Int, elem: B)(implicit bf: scala.collection.generic.CanBuildFrom[Repr,B,That])That
(as /: is) { case (x,i) => x updated[A,That] (i,a) }}
^
Who even knew updated was overloaded?
My new favorite Odersky quote:
I played with it until it got too tedious.

why does scala not match implicitly on tuples?

You can do the following in Ruby:
l = [[1, 2], [3, 4], [5, 6]]
m = l.map {|(a, b)| a+b}
but you cannot do the following in Scala:
val a = List((1, 2), (3, 4), (5, 6))
a.map((f, s) => f + s)
<console>:9: error: wrong number of parameters; expected = 1
a.map((f, s) => f + s)
instead you have to do this:
a.map { case (f, s) => f + s }
I find this rather wordy. Since Scala defines a "tuple" type, I was expecting it to also provide the syntactic sugar on top of it to match implicitly like the above. Is there some deep reason why this kind of matching is not supported? Is there a more elegant way of doing this?
the reason
The reason is that the syntax you are trying to use already has a meaning. It is used when the higher order function expects a two-argument function. For example, with reduce or fold:
List(1,2,3).reduce((a,b) => a+b)
a solution
A cleaner way can be achieved by defining your own implicit class:
import scala.collection.generic.CanBuildFrom
import scala.collection.GenTraversableLike
implicit class EnrichedWithMapt2[A, B, Repr](val self: GenTraversableLike[(A, B), Repr]) extends AnyVal {
  def mapt[R, That](f: (A, B) => R)(implicit bf: CanBuildFrom[Repr, R, That]) = {
    self.map(x => f(x._1, x._2))
  }
}
Then you can do:
val a = List((1, 2), (3, 4), (5, 6))
a.mapt((f, s) => f + s) // List(3, 7, 11)
other alternatives
There are some other tricks you can do, like using tupled, but they don't really help you with the situation you described:
val g = (f: Int, s: Int) => f + s
a.map(g.tupled)
Or just
a.map(((f: Int, s: Int) => f + s).tupled)
More alternatives
If you have more time, you can even do the following with implicits:
val _1 = { t: { def _1: Int } => t._1 }
val _2 = { t: { def _2: Int } => t._2 }
implicit class HighLevelPlus[A](t: A => Int) {
  def +(other: A => Int) = { a: A => t(a) + other(a) }
  def *(other: A => Int) = { a: A => t(a) * other(a) }
}
val a = List((1, 2), (3, 4), (5, 6))
a map _1 + _2
Another possibility, with monads and the keywords for and yield:
val a = List((1, 2), (3, 4), (5, 6))
for((i, j) <- a) yield i + j
but this one might not be the solution you prefer.
Is there some deep reason why this kind of matching is not supported?
Yes, because in Scala, the syntax
(f, s) =>
means an anonymous function that takes 2 arguments.
Is there a more elegant way of doing this?
(Somewhat tongue-in-cheek answer.) Use Haskell, where \(f, s) -> actually means a function that takes a tuple as an argument.
You can use the _1 and _2 members of the tuple for the desired effect:
scala> val a = List((1, 2), (3, 4), (5, 6))
a: List[(Int, Int)] = List((1,2), (3,4), (5,6))
scala> a.map(x => x._1 + x._2)
res2: List[Int] = List(3, 7, 11)

Use 4 (or N) collections to yield only one value at a time (1xN) (i.e. zipped for tuple4+)

scala> val a = List(1,2)
a: List[Int] = List(1, 2)
scala> val b = List(3,4)
b: List[Int] = List(3, 4)
scala> val c = List(5,6)
c: List[Int] = List(5, 6)
scala> val d = List(7,8)
d: List[Int] = List(7, 8)
scala> (a,b,c).zipped.toList
res6: List[(Int, Int, Int)] = List((1,3,5), (2,4,6))
Now:
scala> (a,b,c,d).zipped.toList
<console>:12: error: value zipped is not a member of (List[Int], List[Int], List[Int], List[Int])
(a,b,c,d).zipped.toList
^
I've searched for this elsewhere, including this one and this one, but no conclusive answer.
I want to do the following or similar:
for((itemA,itemB,itemC,itemD) <- (something)) yield itemA + itemB + itemC + itemD
Any suggestions?
Short answer:
for (List(w,x,y,z) <- List(a,b,c,d).transpose) yield (w,x,y,z)
// List[(Int, Int, Int, Int)] = List((1,3,5,7), (2,4,6,8))
Why you want them as tuples, I'm not sure, but a slightly more interesting case would be when your lists are of different types, and for example, you want to combine them into a list of objects:
case class Person(name: String, age: Int, height: Double, weight: Double)
val names = List("Alf", "Betty")
val ages = List(22, 33)
val heights = List(111.1, 122.2)
val weights = List(70.1, 80.2)
val persons: List[Person] = ???
Solution 1: using transpose, as above:
for {
  List(name: String, age: Int, height: Double, weight: Double) <-
    List(names, ages, heights, weights).transpose
} yield Person(name, age, height, weight)
Here, we need the type annotations in the List extractor, because transpose gives a List[List[Any]].
Solution 2: using iterators:
val namesIt = names.iterator
val agesIt = ages.iterator
val heightsIt = heights.iterator
val weightsIt = weights.iterator
for { name <- names }
yield Person(namesIt.next, agesIt.next, heightsIt.next, weightsIt.next)
Some people would avoid iterators because they involve mutable state and so are not "functional". But they're easy to understand if you come from the Java world and might be suitable if what you actually have are already iterators (input streams etc).
Shameless plug-- product-collections does something similar:
scala> a flatZip b flatZip c flatZip d
res0: org.catch22.collections.immutable.CollSeq4[Int,Int,Int,Int] =
CollSeq((1,3,5,7),
(2,4,6,8))
scala> res0(0) //first row
res1: Product4[Int,Int,Int,Int] = (1,3,5,7)
scala> res0._1 //first column
res2: Seq[Int] = List(1, 2)
val g = List(a,b,c,d)
val result = ( g.map(x=>x(0)), g.map(x=>x(1) ) )
result : (List(1, 3, 5, 7),List(2, 4, 6, 8))
Basically, zipped assists Tuple2 and Tuple3:
http://www.scala-lang.org/api/current/index.html#scala.runtime.Tuple3Zipped
So if you want a 'Tuple4Zipped', you have to make it yourself.
Good luck.
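Rolling your own helper for arity 4 is straightforward, though. Here's a minimal sketch (zipped4 is just an assumed name, not something from the standard library):
implicit class Zipped4[A, B, C, D](t: (List[A], List[B], List[C], List[D])) {
  def zipped4: List[(A, B, C, D)] =
    t._1.zip(t._2).zip(t._3).zip(t._4).map { case (((a, b), c), d) => (a, b, c, d) }
}

(a, b, c, d).zipped4   // List((1,3,5,7), (2,4,6,8))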
I found a possible solution, although it's too imperative for my taste:
val a = List(1,2)
val b = List(3,4)
val c = List(5,6)
val d = List(7,8)
val g: List[Tuple4[Int,Int,Int,Int]] =
  a.zipWithIndex.map { case (value, index) => (value, b(index), c(index), d(index)) }
zipWithIndex lets me index into all the other collections. However, I'm sure there's a better way to do this. Any suggestions?
Previous attempts included:
Ryan LeCompte's zipMany, or transpose.
However, both yield a List, not a Tuple4, which is not as convenient to work with since I can't name the variables.
transpose is already built into the standard library and doesn't require higher-kinded imports, so it's preferable, but not ideal.
I also, incorrectly, tried the following example with Shapeless:
scala> import Traversables._
import Tuples._
import Traversables._
import Tuples._
import scala.language.postfixOps
scala> val a = List(1,2)
a: List[Int] = List(1, 2)
scala> val b = List(3,4)
b: List[Int] = List(3, 4)
scala> val c = List(5,6)
c: List[Int] = List(5, 6)
scala> val d = List(7,8)
d: List[Int] = List(7, 8)
scala> val x = List(a,b,c,d).toHList[Int :: Int :: Int :: Int :: HNil] map tupled
x: Option[(Int, Int, Int, Int)] = None