I have a function:
def mapIt(id: Int): F[String]
What is the best way to map a collection using it, if the result is used in constructor of a case class? I currently do something like this:
List(1, 2, 3, 4).map( id => mapIt(id).map(SomeCaseClass(id, _))).toList.sequence
You can turn an F[String] into an F[SomeCaseClass] with a plain map. If F only has to be some Monad, the compiler can pick List for it, using ListInstances.catsStdInstancesForList from cats.instances.list. After that, all you need is to flatMap your list of Int into a list of SomeCaseClass:
import cats.Monad
import cats.instances.list._
import scala.language.higherKinds
case class SomeCaseClass(id: Int, name: String)
def mapIt[F[_]: Monad](id: Int): F[String] = Monad[F].pure((id + 10).toString)
List(1, 2, 3, 4).flatMap(id => mapIt(id).map(x => SomeCaseClass(id, x)))
//List(
// SomeCaseClass(1,11),
// SomeCaseClass(2,12),
// SomeCaseClass(3,13),
// SomeCaseClass(4,14)
//)
More detailed explanation
in function mapIt:
def mapIt[F[_]: Monad](id: Int): F[String]
the compiler expects some F[_] that has an implicit Monad instance.
When you call mapIt inside flatMap, F is not given explicitly, so the compiler looks for a Monad[F] in implicit scope:
List(1, 2, 3, 4).flatMap{
id => mapIt(id) // looking for a Monad[F] for some F[_]
}
and finds it in the imported instances:
import cats.instances.list._
From ListInstances:
implicit val catsStdInstancesForList: Traverse[List] with Alternative[List] with Monad[List] with CoflatMap[List] =
new Traverse[List] with Alternative[List] with Monad[List] with CoflatMap[List] { ... }
After that, it is ordinary Monad usage: map turns each String into a SomeCaseClass, and flatMap flattens the resulting lists into a single List of SomeCaseClass.
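If the goal is to keep an arbitrary effect F instead of collapsing it to List, traverse is one possible alternative to map followed by sequence. This is a minimal sketch, assuming cats is on the classpath. The body of mapIt and the choice of Option as F are illustrative assumptions, not part of the question:

```scala
import cats.Applicative
import cats.implicits._

case class SomeCaseClass(id: Int, name: String)

// Hypothetical stand-in for the real mapIt: lifts a derived String into F.
def mapIt[F[_]: Applicative](id: Int): F[String] =
  Applicative[F].pure((id + 10).toString)

// traverse maps every id and collects all results inside a single F (here Option).
val result: Option[List[SomeCaseClass]] =
  List(1, 2, 3, 4).traverse(id => mapIt[Option](id).map(SomeCaseClass(id, _)))
```

With these assumptions, result is a Some wrapping the four case class instances, so the effect stays on the outside and the list on the inside.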
Related
If A has the Ordered[A] trait, I'd like to be able to have code that works like this
val collection: List[List[A]] = ... // construct a list of lists of As
val sorted = collection sort { _ < _ }
and get something where the lists have been sorted in lexicographic order. Of course, just because A has the trait Ordered[A] doesn't mean that List[A] has the trait Ordered[List[A]]. Presumably, however, the 'scala way' to do this is with an implicit def.
How do I implicitly convert a List[A] to a Ordered[List[A]], assuming A has the trait Ordered[A] (so that the code above just works)?
I have in mind using the lexicographic ordering on List[A] objects, but I'd like code that can be adapted to other orderings.
Inspired by Ben Lings' answer, I managed to work out what seems like the simplest way to sort lists lexicographically: add the line
import scala.math.Ordering.Implicits._
before doing your List[Int] comparison, to ensure that the implicit function infixOrderingOps is in scope.
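A small sketch of what that import enables, using a made-up list of lists as sample data:

```scala
import scala.math.Ordering.Implicits._

// Sample data, chosen only for illustration.
val collection: List[List[Int]] =
  List(List(1, 3), List(1, 2), List(0), Nil, List(2))

// `<` on List[Int] comes from infixOrderingOps plus the derived ordering for sequences.
val sorted: List[List[Int]] = collection.sortWith(_ < _)
```

Here sorted comes out in lexicographic order, with the empty list first.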
(11 minutes ago I actually didn't know how to do this, I hope it's considered okay to answer my own question.)
implicit def List2OrderedList[A <% Ordered[A]](list1: List[A]): Ordered[List[A]] = {
new Ordered[List[A]] {
def compare(list2: List[A]): Int = {
for((x,y) <- list1 zip list2) {
val c = x compare y
if(c != 0) return c
}
return list1.size - list2.size
}
}
}
An important thing to note here is the 'view bound' A <% Ordered[A], which ensures that A needn't itself be an Ordered[A], just that there's a way to do this conversion. Happily, the Scala library's object Predef has an implicit conversion from Ints to RichInts, which in particular are Ordered[Int]s.
The rest of the code is just implementing lexicographic ordering.
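The same lexicographic comparison can also be sketched as an Ordering instance, which avoids both the view bound and the return inside the loop. The name lexicographic is made up for this example:

```scala
// Hypothetical helper: builds a lexicographic Ordering[List[A]] from an Ordering[A].
def lexicographic[A](implicit ord: Ordering[A]): Ordering[List[A]] =
  new Ordering[List[A]] {
    def compare(xs: List[A], ys: List[A]): Int = {
      // Lazily compare element pairs and stop at the first non-zero result.
      val firstDifference = xs.iterator.zip(ys.iterator)
        .map { case (x, y) => ord.compare(x, y) }
        .find(_ != 0)
      // If one list is a prefix of the other, the shorter list sorts first.
      firstDifference.getOrElse(Integer.compare(xs.size, ys.size))
    }
  }

val lists = List(List(1, 3), List(1, 2), List(0), Nil, List(2))
val sortedLists = lists.sorted(lexicographic[Int])
```

Passing the ordering explicitly to sorted keeps the example free of implicit conversions. It could equally be declared as an implicit value.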
Inspired by Ben Lings' answer, I wrote my own version of sort:
def sort[A : Ordering](coll: Seq[Iterable[A]]) = coll.sorted
which is equivalent to:
def sort[A](coll: Seq[Iterable[A]])(implicit ordering: Ordering[A]) = coll.sorted
Note that ordering is implicitly converted to Ordering[Iterable[A]].
Examples:
scala> def sort[A](coll: Seq[Iterable[A]])(implicit ordering: Ordering[A]) = coll.sorted
sort: [A](coll: Seq[Iterable[A]])(implicit ordering: Ordering[A])Seq[Iterable[A]]
scala> val coll = List(List(1, 3), List(1, 2), List(0), Nil, List(2))
coll: List[List[Int]] = List(List(1, 3), List(1, 2), List(0), List(), List(2))
scala> sort(coll)
res1: Seq[Iterable[Int]] = List(List(), List(0), List(1, 2), List(1, 3), List(2))
It was asked how to supply your own comparison function (say, _ > _ instead of _ < _). It suffices to use Ordering.fromLessThan:
scala> sort(coll)(Ordering.fromLessThan(_ > _))
res4: Seq[Iterable[Int]] = List(List(), List(2), List(1, 3), List(1, 2), List(0))
Ordering.by allows you to map your value into another type for which there is already an Ordering instance. Given that tuples are also ordered, this can be useful for lexicographical comparison of case classes.
To make an example, let's define a wrapper of an Int, apply Ordering.by(_.v), where _.v extracts the underlying value, and show that we obtain the same result:
scala> case class Wrap(v: Int)
defined class Wrap
scala> val coll2 = coll.map(_.map(Wrap(_)))
coll2: List[List[Wrap]] = List(List(Wrap(1), Wrap(3)), List(Wrap(1), Wrap(2)), List(Wrap(0)), List(), List(Wrap(2)))
scala> sort(coll2)(Ordering.by(_.v))
res6: Seq[Iterable[Wrap]] = List(List(), List(Wrap(0)), List(Wrap(1), Wrap(2)), List(Wrap(1), Wrap(3)), List(Wrap(2)))
Finally, let's do the same thing on a case class with more members, reusing the comparators for Tuples:
scala> case class MyPair(a: Int, b: Int)
defined class MyPair
scala> val coll3 = coll.map(_.map(MyPair(_, 0)))
coll3: List[List[MyPair]] = List(List(MyPair(1,0), MyPair(3,0)), List(MyPair(1,0), MyPair(2,0)), List(MyPair(0,0)), List(), List(MyPair(2,0)))
scala> sort(coll3)(Ordering.by(x => (x.a, x.b)))
res7: Seq[Iterable[MyPair]] = List(List(), List(MyPair(0,0)), List(MyPair(1,0), MyPair(2,0)), List(MyPair(1,0), MyPair(3,0)), List(MyPair(2,0)))
EDIT:
My definition of sort above is deprecated in 2.13:
warning: method Iterable in object Ordering is deprecated (since 2.13.0):
Iterables are not guaranteed to have a consistent order; if using a type
with a consistent order (e.g. Seq), use its Ordering (found in the
Ordering.Implicits object)
Use instead:
def sort[A](coll: Seq[Seq[A]])(implicit ordering: Ordering[A]) = {
import Ordering.Implicits._
coll.sorted
}
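A short usage sketch of this Seq-based sort, with made-up sample data. It also shows that a custom element ordering can still be passed explicitly:

```scala
def sort[A](coll: Seq[Seq[A]])(implicit ordering: Ordering[A]): Seq[Seq[A]] = {
  import Ordering.Implicits._
  coll.sorted
}

// Sample data, chosen only for illustration.
val coll = List(List(1, 3), List(1, 2), List(0), Nil, List(2))

val ascending = sort(coll)
// Reverse the element ordering; the empty list is still a prefix of everything.
val descending = sort(coll)(Ordering[Int].reverse)
```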
In 2.8, you should be able to just do collection.sorted. sorted takes an implicit Ordering parameter. Any type that implements Ordered has a corresponding Ordering (thanks to the implicit conversion Ordering.ordered). There is also the implicit Ordering.Iterable that makes an Iterable[T] have an Ordering if T has an Ordering.
However, if you try this it doesn't work:
scala> def sort[A <: Ordered[A]](coll: List[List[A]]) = coll.sorted
<console>:5: error: could not find implicit value for parameter ord: Ordering[List[A]]
def sort[A <: Ordered[A]](coll: List[List[A]]) = coll.sorted
^
You need to explicitly specify that you want the Ordering[Iterable[A]]:
def sort[A <: Ordered[A]](coll: List[List[A]]) = coll.sorted[Iterable[A]]
I'm not sure why the compiler can't find the Ordering[Iterable[A]] if the element type of the collection is List[A].
Inspired by Daniel's comment, here is a recursive version:
implicit def toOrdered[A <% Ordered[A]](list1: List[A]): Ordered[List[A]] = {
@scala.annotation.tailrec
def c(list1:List[A], list2:List[A]): Int = {
(list1, list2) match {
case (Nil, Nil) => 0
case (x::xs, Nil) => 1
case (Nil, y::ys) => -1
case (x::xs, y::ys) => (x compare y) match {
case 0 => c(xs, ys)
case i => i
}
}
}
new Ordered[List[A]] {
def compare(list2: List[A]): Int = c(list1, list2)
}
}
With respect to the comment:
I used to think it's more a matter of taste. Sometimes it's easier to verify correctness on a recursive function, and certainly your version is short enough that there is no compelling reason to prefer recursive.
I was intrigued by the performance implications though. So I tried to benchmark it: see http://gist.github.com/468435. I was surprised to see that the recursive version is faster (assuming I did the benchmark correctly). The results still hold true for lists of about length 10.
Just because I already implemented this another way, here is a non-recursive version that does not use return:
new Ordering[Seq[String]]() {
override def compare(x: Seq[String], y: Seq[String]): Int = {
x.zip(y).foldLeft(None: Option[Int]){ case (r, (v, w)) =>
if(r.isDefined){
r
} else {
val comp = v.compareTo(w)
if(comp == 0) None
else Some(comp)
}
}.getOrElse(x.size.compareTo(y.size))
}
}
Say for example, I have a simple case class
case class Foo(k:String, v1:String, v2:String)
Can I get Spark to recognise this as a tuple for the purposes of something like this, without converting to a tuple in, say, a map or keyBy step?
val rdd = sc.parallelize(List(Foo("k", "v1", "v2")))
// Swap values
rdd.mapValues(v => (v._2, v._1))
I don't even care if it loses the original case class after such an operation. I've tried the following with no luck. I'm fairly new to Scala, am I missing something?
case class Foo(k:String, v1:String, v2:String)
extends Tuple2[String, (String, String)](k, (v1, v2))
edit: In the above snippet the case class extends Tuple2, but this does not produce the desired effect: the RDD class and its functions do not treat it like a tuple, so PairRDDFunctions such as mapValues, values, reduceByKey, etc. are not available.
Extending TupleN isn't a good idea for a number of reasons, with one of the best being the fact that it's deprecated, and on 2.11 it's not even possible to extend TupleN with a case class. Even if you make your Foo a non-case class, defining it on 2.11 with -deprecation will show you this: "warning: inheritance from class Tuple2 in package scala is deprecated: Tuples will be made final in a future version.".
If what you care about is convenience of use and you don't mind the (almost certainly negligible) overhead of the conversion to a tuple, you can enrich a RDD[Foo] with the syntax provided by PairRDDFunctions with a conversion like this:
import org.apache.spark.rdd.{ PairRDDFunctions, RDD }
case class Foo(k: String, v1: String, v2: String)
implicit def fooToPairRDDFunctions
(rdd: RDD[Foo]): PairRDDFunctions[String, (String, String)] =
new PairRDDFunctions(
rdd.map {
case Foo(k, v1, v2) => k -> (v1, v2)
}
)
And then:
scala> val rdd = sc.parallelize(List(Foo("a", "b", "c"), Foo("d", "e", "f")))
rdd: org.apache.spark.rdd.RDD[Foo] = ParallelCollectionRDD[6] at parallelize at <console>:34
scala> rdd.mapValues(_._1).first
res0: (String, String) = (a,b)
The reason your version with Foo extending Tuple2[String, (String, String)] doesn't work is that RDD.rddToPairRDDFunctions targets an RDD[Tuple2[K, V]] and RDD isn't covariant in its type parameter, so an RDD[Foo] isn't a RDD[Tuple2[K, V]]. A simpler example might make this clearer:
case class Box[A](a: A)
class Foo(k: String, v: String) extends Tuple2[String, String](k, v)
class PairBoxFunctions(box: Box[(String, String)]) {
def pairValue: String = box.a._2
}
implicit def toPairBoxFunctions(box: Box[(String, String)]): PairBoxFunctions =
new PairBoxFunctions(box)
And then:
scala> Box(("a", "b")).pairValue
res0: String = b
scala> Box(new Foo("a", "b")).pairValue
<console>:16: error: value pairValue is not a member of Box[Foo]
Box(new Foo("a", "b")).pairValue
^
But if you make Box covariant…
case class Box[+A](a: A)
class Foo(k: String, v: String) extends Tuple2[String, String](k, v)
class PairBoxFunctions(box: Box[(String, String)]) {
def pairValue: String = box.a._2
}
implicit def toPairBoxFunctions(box: Box[(String, String)]): PairBoxFunctions =
new PairBoxFunctions(box)
…everything's fine:
scala> Box(("a", "b")).pairValue
res0: String = b
scala> Box(new Foo("a", "b")).pairValue
res1: String = b
You can't make RDD covariant, though, so defining your own implicit conversion to add the syntax is your best bet. Personally I'd probably choose to do the conversion explicitly, but this is a relatively un-horrible use of implicit conversions.
Not sure if I get your question right, but let's say you have a case class
import org.apache.spark.rdd.RDD
case class DataFormat(id: Int, name: String, value: Double)
val data: Seq[(Int, String, Double)] = Seq(
(1, "Joe", 0.1),
(2, "Mike", 0.3)
)
val rdd: RDD[DataFormat] = (
sc.parallelize(data).map(x=>DataFormat(x._1, x._2, x._3))
)
// Print all data
rdd.foreach(println)
// Print only names
rdd.map(x=>x.name).foreach(println)
I'm working on a Finatra app where I have a Future[Seq[A]] (from Finagle call). However, I need to convert that to a new object that contains Seq[A], for example
case class Container[A](s: Seq[A])
That would result in a Future[Container[A]].
While I can perform map on Future[Seq[A]], it's unclear how to arrive at Future[Container[A]].
.map over the Future with Container.apply:
scala> case class Container[A](s: Seq[A])
defined class Container
scala> val f = Future.successful(List(1, 2, 3))
f: scala.concurrent.Future[List[Int]] = scala.concurrent.impl.Promise$KeptPromise@3f363cf5
scala> f.map(Container.apply)
res2: scala.concurrent.Future[Container[Int]] = scala.concurrent.impl.Promise$DefaultPromise@3d829787
scala> res2.value
res3: Option[scala.util.Try[Container[Int]]] = Some(Success(Container(List(1, 2, 3))))
f.map(Container.apply) is short for f.map(Container.apply(_)), which in turn is short for f.map(v => Container.apply(v)).
Note that you could also write it f.map(Container(_)), whichever you prefer.
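For reference, a self-contained sketch of the same idea with the imports spelled out. It uses scala.concurrent.Future, as in the session above; a Finagle (com.twitter.util) Future offers a similar map, but that part is an assumption and is not shown here:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

case class Container[A](s: Seq[A])

val f: Future[Seq[Int]] = Future.successful(Seq(1, 2, 3))

// Both spellings wrap the eventual Seq in a Container.
val wrapped: Future[Container[Int]] = f.map(Container(_))
val wrappedExplicit: Future[Container[Int]] = f.map(v => Container(v))

// Blocking here only to inspect the value in a small example.
val container: Container[Int] = Await.result(wrapped, 1.second)
```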
I am a newbie to Scala and I am trying to understand collections. I have a sample Scala code in which a method is defined as follows:
override def write(records: Iterator[Product2[K, V]]): Unit = {...}
From what I understand, this function is passed an argument records, which is an Iterator of Product2[K, V]. What I don't understand is whether Product2 is a user-defined class or a built-in data structure. Moreover, how do I explore the key-value contents of Product2, and how do I iterate over them?
Chances are Product2 is a built-in class. You can easily check this in a modern IDE (just hover over it with Ctrl pressed) or by inspecting the file header: if there are no related imports, like some.custom.package.Product2, it's built-in.
What is Product2 and where is it defined? You can easily find out such things using Scala's ScalaDoc.
In the case of the built-in class, you can treat it like a tuple of 2 elements (in fact Tuple2 extends Product2, as you may see below), which has ._1 and ._2 accessor methods.
scala> val x: Product2[String, Int] = ("foo", 1)
// x: Product2[String,Int] = (foo,1)
scala> x._1
// res0: String = foo
scala> x._2
// res1: Int = 1
See How should I think about Scala's Product classes? for more.
Iteration is also hassle free, for example here is the map operation:
scala> val xs: Iterator[Product2[String, Int]] = List("foo" -> 1, "bar" -> 2, "baz" -> 3).iterator
xs: Iterator[Product2[String,Int]] = non-empty iterator
scala> val keys = xs.map(kv => kv._1)
keys: Iterator[String] = non-empty iterator
scala> val keys = xs.map(kv => kv._1).toList
keys: List[String] = List(foo, bar, baz)
scala> xs
res2: Iterator[Product2[String,Int]] = empty iterator
Keep in mind, though, that once an iterator has been consumed, it transitions to an empty state and can't be reused.
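If the pairs are needed more than once, one option is to materialize the iterator into a List first. A minimal sketch with made-up sample data:

```scala
// Sample data, chosen only for illustration.
val records: Iterator[Product2[String, Int]] =
  List("foo" -> 1, "bar" -> 2, "baz" -> 3).iterator

// Consume the iterator exactly once, then reuse the resulting List freely.
val materialized: List[Product2[String, Int]] = records.toList

val keys: List[String] = materialized.map(_._1)
val values: List[Int] = materialized.map(_._2)
```

After toList the original iterator is exhausted, but materialized can be traversed as often as needed. This trades memory for convenience, so it may not suit very large inputs.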
Product2 is just two values of type K and V.
use it like this:
write(List((1, "one"), (2, "two")).iterator)
the prototype can also be written like: override def write(records: Iterator[(K, V)]): Unit = {...}
To access the values k of type K and v of type V:
override def write(records: Iterator[(K, V)]): Unit = {
// foreach, not map: Iterator.map is lazy, so w would never be called
records.foreach{case (k, v) => w(k, v)}
}
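A runnable sketch of such a write method that keeps the Product2 signature and reads the pair through _1 and _2. The written buffer is a hypothetical sink standing in for whatever the real method does with each pair:

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical sink, used here only so the effect of write can be observed.
val written = ListBuffer.empty[String]

def write[K, V](records: Iterator[Product2[K, V]]): Unit =
  // foreach runs eagerly; a bare map on an Iterator would do nothing in a Unit method.
  records.foreach { kv => written += s"${kv._1}=${kv._2}" }

write(List((1, "one"), (2, "two")).iterator)
```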
Is there a way in Scala to convert a List[Int] to java.util.List[java.lang.Integer]?
I'm interfacing with Java (Thrift).
JavaConversions supports List --> java.util.List, and implicits exist between Int --> java.lang.Integer, but from what I can tell I would still need an extra pass to manually do the conversion:
val y = List(1)
val z: java.util.List[Integer] = asList(y) map { (x: Int) => x : java.lang.Integer }
Apparently you need both conversions. However, you can group them in a single implicit conversion:
implicit def toIntegerList( lst: List[Int] ) =
seqAsJavaList( lst.map( i => i:java.lang.Integer ) )
Example:
scala> def sizeOf( lst: java.util.List[java.lang.Integer] ) = lst.size
scala> sizeOf( List(1,2,3) )
res5: Int = 3
Because the elements of a List[Int] are already boxed as java.lang.Integer at runtime, you can cast directly to java.util.List[java.lang.Integer]. It will save you an O(n) operation and some implicit stuff.
import collection.JavaConversions._
class A {
def l() = asList(List(1,2)).asInstanceOf[java.util.List[java.lang.Integer]]
}
Then you can use from Java like this:
A a = new A();
java.util.List<Integer> l = a.l();
Note that on 2.9.0, I get a deprecation warning on asList (use seqAsJavaList instead).
Did you try:
val javalist = collection.JavaConversions.asJavaList (y)
I'm not sure, whether you need a conversion Int=>Integer or Int=>int here. Can you try it out?
Update:
The times, they are a-changing. Today you'll get a deprecation warning for that code. Use instead:
import scala.collection.JavaConverters._
val y = List (1)
> y: List[Int] = List(1)
val javalist = (y).asJava
> javalist: java.util.List[Int] = [1]
This doesn't have an implicit at the outmost layer, but I like this generic approach and have implemented it for a couple of types of collections (List, Map).
import java.util.{List => JList}
import scala.collection.JavaConverters._
def scalaList2JavaList[A, B](scalaList: List[A])
(implicit a2bConversion: A => B): JList[B] =
(scalaList map a2bConversion).asJava
Since an implicit conversion from Int to Integer is part of standard lib, usage in this case would just look like this:
scalaList2JavaList[Int, Integer](someScalaList)
In the other direction!
(since I have these available anyway as they were my original implementations...)
import java.util.{List => JList}
import scala.collection.JavaConversions._
def javaList2ScalaList[A, B](javaList: JList[A])
(implicit a2bConversion: A => B): List[B] =
javaList.toList map a2bConversion
Usage:
javaList2ScalaList[Integer, Int](someJavaList)
This can then be re-used for all lists so long as an implicit conversion of the contained type is in scope.
(And in case you're curious, here is my implementation for map...)
def javaMap2ScalaMap[A, B, C, D](javaMap: util.Map[A, B])(implicit a2cConversion: A => C, b2dConversion: B => D): Map[C, D] =
javaMap.toMap map { case (a, b) => (a2cConversion(a), b2dConversion(b)) }
Starting Scala 2.13, the standard library includes scala.jdk.CollectionConverters which provides Scala to Java implicit collection conversions.
Which we can combine with java.lang.Integer::valueOf to convert Scala's Int to Java's Integer:
import scala.jdk.CollectionConverters._
List(1, 2, 3).map(Integer.valueOf).asJava
// java.util.List[Integer] = [1, 2, 3]
I was trying to pass a Map[String, Double] to a Java method. But the problem was that JavaConversions converts the Map to a Java Map, but leaves the Scala Double as is, instead of converting it to java.lang.Double. After a few hours of searching I found Alvaro Carrasco's answer (https://stackoverflow.com/a/40683561/1612432); it is as simple as doing:
val scalaMap = // Some Map[String, Double]
val javaMap = scalaMap.mapValues(Double.box)
After this, javaMap is a Map[String, java.lang.Double]. Then you can pass this to a java function that expects a Map<String, Double> and thanks to implicit conversions the Scala Map will be converted to java.util.Map
In your case it would be the same, but with Int.box:
val y = List(1)
val javay = y.map(Int.box)
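To end up with an actual java.util.List, the boxed list can then be converted explicitly. A minimal sketch, assuming Scala 2.13 or later for scala.jdk.CollectionConverters:

```scala
import scala.jdk.CollectionConverters._

val y = List(1, 2, 3)

// Int.box turns each Int into a java.lang.Integer; asJava wraps the Scala List.
val javay: java.util.List[java.lang.Integer] = y.map(Int.box).asJava
```

The result can be passed to a Java method that expects a List<Integer>.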