I'm having trouble getting this implicit conversion to work properly.
I keep getting these errors:
[error]
found: (scala.collection.immutable.Map[O,scala.collection.immutable.Seq[D]], O) => scala.collection.immutable.Map[O,scala.collection.immutable.Seq[D]]
required:
(Object, Object) => Object
at (operToDocsMap: Map[O, Seq[D]], operator: O) =>
[error] type mismatch;
found : Object
required: scala.collection.immutable.Map[O,scala.collection.immutable.Seq[D]]
at .fold(operatorToDocsMap){
My code:
object ModelTypes {
  trait Document
  trait DocumentOperator {
    def operatesOn[D <: Document](document: D): Boolean
  }

  class Documents[D <: Document](docs: Seq[D]) {
    def groupByOperator[O <: DocumentOperator](operators: Seq[O]): Map[O, Seq[D]] = {
      docs.foldLeft(Map[O, Seq[D]]()) {
        (operatorToDocsMap: Map[O, Seq[D]], document: D) =>
          operators
            .filter(_.operatesOn(document))
            .fold(operatorToDocsMap) {
              (operToDocsMap: Map[O, Seq[D]], operator: O) =>
                operToDocsMap + (operator -> (document +: operToDocsMap.getOrElse(operator, Seq())))
            }
      }
    }
  }

  implicit def documentsConverter[D <: Document](docs: Seq[D]) =
    new Documents(docs)
}
Is it something wrong with the type bounds?
Any help is greatly appreciated.
The problem is not the type bounds. fold (unlike foldLeft) requires its start value to be a supertype of the collection's element type, so folding a Seq[O] with a Map[O, Seq[D]] accumulator forces the compiler to widen both to Object, which is exactly what the errors show. Below is a more idiomatic way to achieve your requirement. This gives you the grouping between operators and documents without the convoluted nested folds.
class Documents[D <: Document](docs: Seq[D]) {
def groupByOperator[O <: DocumentOperator](operators: Seq[O]): Map[O, Seq[D]] = {
val operatorDoc =
for {
doc <- docs
operator <- operators if operator.operatesOn(doc)
} yield (operator -> doc)
operatorDoc
.groupBy({ case (x, _) => x })
.mapValues(_.map({ case (_, x) => x }))
}
}
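For example, assuming the trait definitions and the implicit documentsConverter from the question are in scope, with a hypothetical Email/SubjectFilter pair just to show the call site:

case class Email(subject: String) extends Document
case class SubjectFilter(keyword: String) extends DocumentOperator {
  def operatesOn[D <: Document](document: D): Boolean = document match {
    case Email(subject) => subject.contains(keyword)
    case _              => false
  }
}

val docs      = Seq(Email("invoice due"), Email("weekly report"))
val operators = Seq(SubjectFilter("invoice"), SubjectFilter("report"))
// The implicit conversion turns Seq[Email] into Documents[Email]:
val grouped: Map[SubjectFilter, Seq[Email]] = docs.groupByOperator(operators)
// roughly: Map(SubjectFilter(invoice) -> Seq(Email(invoice due)),
//              SubjectFilter(report)  -> Seq(Email(weekly report)))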
I am trying to write some Scala helper functions for Flink map and filter operations that redirect their errors to a dead letter queue.
However, I'm struggling with Scala's type erasure, which prevents me from making them generic. The implementation of mapWithDeadLetterQueue below does not compile.
sealed trait ProcessingResult[T]
case class ProcessingSuccess[T,U](result: U) extends ProcessingResult[T]
case class ProcessingError[T: TypeInformation](errorMessage: String, exceptionClass: String, stackTrace: String, sourceMessage: T) extends ProcessingResult[T]
object FlinkUtils {
// https://stackoverflow.com/questions/1803036/how-to-write-asinstanceofoption-in-scala
implicit class Castable(val obj: AnyRef) extends AnyVal {
def asInstanceOfOpt[T <: AnyRef : ClassTag] = {
obj match {
case t: T => Some(t)
case _ => None
}
}
}
def mapWithDeadLetterQueue[T: TypeInformation,U: TypeInformation](source: DataStream[T], func: (T => U)): (DataStream[U], DataStream[ProcessingError[T]]) = {
val mapped = source.map(x => {
val result = Try(func(x))
result match {
case Success(value) => ProcessingSuccess(value)
case Failure(exception) => ProcessingError(exception.getMessage, exception.getClass.getName, exception.getStackTrace.mkString("\n"), x)
}
} )
val mappedSuccess = mapped.flatMap((x: ProcessingResult[T]) => x.asInstanceOfOpt[ProcessingSuccess[T,U]]).map(x => x.result)
val mappedFailure = mapped.flatMap((x: ProcessingResult[T]) => x.asInstanceOfOpt[ProcessingError[T]])
(mappedSuccess, mappedFailure)
}
}
I get:
[error] FlinkUtils.scala:35:36: overloaded method value flatMap with alternatives:
[error] [R](x$1: org.apache.flink.api.common.functions.FlatMapFunction[Product with Serializable with ProcessingResult[_ <: T],R], x$2: org.apache.flink.api.common.typeinfo.TypeInformation[R])org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator[R] <and>
[error] [R](x$1: org.apache.flink.api.common.functions.FlatMapFunction[Product with Serializable with ProcessingResult[_ <: T],R])org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator[R]
[error] cannot be applied to (ProcessingResult[T] => Option[ProcessingSuccess[T,U]])
[error] val mappedSuccess = mapped.flatMap((x: ProcessingResult[T]) => x.asInstanceOfOpt[ProcessingSuccess[T,U]]).map(x => x.result)
Is there a way to make this work?
OK, I'm going to answer my own question. I made a couple of mistakes:
First of all, I accidentally imported the Java DataStream class instead of the Scala DataStream class (this happens all the time). The Java variant obviously doesn't accept a Scala lambda for map/filter/flatMap.
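In other words, make sure the Scala variant is in scope:

// wrong (Java API, rejects Scala lambdas):
// import org.apache.flink.streaming.api.datastream.DataStream
// right (Scala API, also brings the implicit TypeInformation machinery):
import org.apache.flink.streaming.api.scala._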
Second, sealed traits are not supported by Flink's serialisation. There is a project that should solve this, but I haven't tried it yet.
Solution: instead of a sealed trait I used a simple case class with two Options (a bit less expressive, but it still works):
case class ProcessingError[T](errorMessage: String, exceptionClass: String, stackTrace: String, sourceMessage: T)
case class ProcessingResult[T: TypeInformation, U: TypeInformation](result: Option[U], error: Option[ProcessingError[T]])
Then I could get everything working like so:
import scala.reflect.ClassTag
import scala.util.{Failure, Success, Try}
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.streaming.api.scala._

object FlinkUtils {
  def mapWithDeadLetterQueue[T: TypeInformation: ClassTag, U: TypeInformation: ClassTag]
      (source: DataStream[T], func: (T => U)):
      (DataStream[U], DataStream[ProcessingError[T]]) = {
    implicit val typeInfo = TypeInformation.of(classOf[ProcessingResult[T, U]])
    val mapped = source.map((x: T) => {
      val result = Try(func(x))
      result match {
        case Success(value) => ProcessingResult[T, U](Some(value), None)
        case Failure(exception) => ProcessingResult[T, U](None, Some(
          ProcessingError(exception.getMessage, exception.getClass.getName,
            exception.getStackTrace.mkString("\n"), x)))
      }
    })
    val mappedSuccess = mapped.flatMap((x: ProcessingResult[T, U]) => x.result)
    val mappedFailure = mapped.flatMap((x: ProcessingResult[T, U]) => x.error)
    (mappedSuccess, mappedFailure)
  }
}
The flatMap and filter variants look very similar, but they use a ProcessingResult[T, List[T]] and a ProcessingResult[T, T] respectively.
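Based on that description, the filter variant might look roughly like this (a sketch following the same pattern, not the exact code; it assumes the same imports and the Flink Scala API implicits):

def filterWithDeadLetterQueue[T: TypeInformation: ClassTag]
    (source: DataStream[T], func: (T => Boolean)):
    (DataStream[T], DataStream[ProcessingError[T]]) = {
  implicit val typeInfo = TypeInformation.of(classOf[ProcessingResult[T, T]])
  val mapped = source.map((x: T) => {
    Try(func(x)) match {
      // predicate passed: keep the element
      case Success(true)      => ProcessingResult[T, T](Some(x), None)
      // predicate failed: drop the element, but it is not an error either
      case Success(false)     => ProcessingResult[T, T](None, None)
      case Failure(exception) => ProcessingResult[T, T](None, Some(
        ProcessingError(exception.getMessage, exception.getClass.getName,
          exception.getStackTrace.mkString("\n"), x)))
    }
  })
  val kept   = mapped.flatMap((x: ProcessingResult[T, T]) => x.result)
  val errors = mapped.flatMap((x: ProcessingResult[T, T]) => x.error)
  (kept, errors)
}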
I use the functions like this:
val (result, errors) = FlinkUtils.filterWithDeadLetterQueue(input, (x: MyMessage) => {
x.`type` match {
case "something" => throw new Exception("how how how")
case "something else" => false
case _ => true
}
})
I have a JSON serializer for Tuples. It first reflects on the Tuple and builds a function that, given a Tuple of some arity, returns a list of (TypeAdapter, value) pairs. (A TypeAdapter is a type-specific thing that renders the value.) It looks like this:
def extractTuple(p: Product): (Product) => List[(TypeAdapter[_], Any)] = {
  val reflected = reflectOnTuple(tupleClass) // extract a bunch of reflected metadata
  val tupleFieldInfo = reflected.tupleFieldInfo
  tupleFieldInfo.size match {
    case 1 =>
      (p: Product) =>
        List( (getTypeAdapterFor(tupleFieldInfo(0)), p.asInstanceOf[Tuple1[_]]._1) )
    case 2 =>
      (p: Product) =>
        List( (getTypeAdapterFor(tupleFieldInfo(0)), p.asInstanceOf[Tuple2[_, _]]._1),
              (getTypeAdapterFor(tupleFieldInfo(1)), p.asInstanceOf[Tuple2[_, _]]._2) )
    // ... and so on up to Tuple22
  }
}
In the JSON serializer I have a writeTuple() function that is shown below. In theory it should work as-is, but... I'm getting compile errors on fieldValue saying it is of type Any when what is expected is ?1.T.
A TypeAdapter looks like:
trait TypeAdapter[T] {
def write[WIRE](
t: T,
writer: Writer[WIRE],
out: mutable.Builder[WIRE, WIRE]): Unit
}
class JsonWriter() {
def writeTuple[T](t: T, writeFn: (Product) => List[(TypeAdapter[_], Any)], out: mutable.Builder[JSON, JSON]): Unit = {
out += "[".asInstanceOf[JSON]
var first = true
writeFn(t.asInstanceOf[Product]).foreach { case (fieldTA, fieldValue) =>
if (first)
first = false
else
out += ",".asInstanceOf[JSON]
fieldTA.write(fieldValue, this, out) // <<-- this blows up (compile) on fieldValue because it's type Any, not some specific field Type
}
out += "]".asInstanceOf[JSON]
}
}
How can I convince my TypeAdapter that the field is the correct type?
Try using a type pattern with a type variable tp (lower-case, so the compiler binds it as a type variable):
class JsonWriter() extends Writer[JSON] {
def writeTuple[T](t: T, writeFn: (Product) => List[(TypeAdapter[_], Any)], out: mutable.Builder[JSON, JSON]): Unit = {
...
writeFn(t.asInstanceOf[Product]).foreach { case (fieldTA: TypeAdapter[tp], fieldValue) =>
...
fieldTA.write(fieldValue.asInstanceOf[tp], this, out)
}
...
}
}
or use T instead:
class JsonWriter() extends Writer[JSON] {
def writeTuple[T](t: T, writeFn: (Product) => List[(TypeAdapter[_], Any)], out: mutable.Builder[JSON, JSON]): Unit = {
...
writeFn(t.asInstanceOf[Product]).foreach { case (fieldTA: TypeAdapter[T], fieldValue) =>
...
fieldTA.write(fieldValue.asInstanceOf[T], this, out)
}
...
}
}
It seems you don't actually use T in writeTuple[T]. Maybe you can try a monomorphic def writeTuple(t: Product, .... Also, if you want to specify that (TypeAdapter[_], Any) is actually (TypeAdapter[U], U), with the same U in both elements of the tuple, you can try the existential type (TypeAdapter[U], U) forSome { type U }. So try:
class JsonWriter() extends Writer[JSON] {
def writeTuple(t: Product, writeFn: (Product) => List[(TypeAdapter[U], U) forSome {type U}], out: mutable.Builder[JSON, JSON]): Unit = {
out += "[".asInstanceOf[JSON]
var first = true
writeFn(t).foreach { case (fieldTA, fieldValue) =>
if (first)
first = false
else
out += ",".asInstanceOf[JSON]
fieldTA.write(fieldValue, this, out)
}
out += "]".asInstanceOf[JSON]
}
}
Dmytro, your answer is very close! Unfortunately, neither the 1st nor the 2nd option compiled. I think the 3rd would have worked just fine, except... I'm actually using Dotty, not Scala 2, and Dotty eliminated existential types.
So I tried the following, and it worked. It involved modifying TypeAdapter, as it knows its type:
trait TypeAdapter[T] {
type tpe = T
def write[WIRE](
t: T,
writer: Writer[WIRE],
out: mutable.Builder[WIRE, WIRE]): Unit
inline def castAndWrite[WIRE](
v: Any,
writer: Writer[WIRE],
out: mutable.Builder[WIRE, WIRE]): Unit =
write(v.asInstanceOf[tpe], writer, out)
}
Calling castAndWrite() from JsonWriter lets the correctly-typed write() be invoked.
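For illustration, the writeTuple loop can then stay existential and delegate the cast to the adapter. This is a sketch based on the snippets above (with extends Writer[JSON] as in the earlier answer):

class JsonWriter() extends Writer[JSON] {
  def writeTuple(t: Product, writeFn: (Product) => List[(TypeAdapter[_], Any)],
      out: mutable.Builder[JSON, JSON]): Unit = {
    out += "[".asInstanceOf[JSON]
    var first = true
    writeFn(t).foreach { case (fieldTA, fieldValue) =>
      if (first) first = false
      else out += ",".asInstanceOf[JSON]
      fieldTA.castAndWrite(fieldValue, this, out) // the cast to the adapter's own type happens inside
    }
    out += "]".asInstanceOf[JSON]
  }
}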
I'm trying to define an HKT in Scala (a generic stream), and I'm not sure why I'm getting a type mismatch error while trying to implement the exists method:
Here's my code so far
sealed trait SmartStream[+A]
case object Nil extends SmartStream[Nothing]
case class Cons[+A](h : () => A, t : () => SmartStream[A]) extends SmartStream[A]
object SmartStream {
def nil[A] : SmartStream[A] = Nil
def cons[A](h : => A, t : => SmartStream[A]) : SmartStream[A] = {
lazy val g = h
lazy val u = t
Cons(() => g, () => u)
}
def apply[A](as: A*) : SmartStream[A] = {
if (as.isEmpty) nil
else cons( as.head, apply(as.tail: _*))
}
def exists[A](p : A => Boolean) : Boolean = {
this match {
case Nil => false
case Cons(h, t) => p(h()) || t().exists(p)
}
}
}
The error I'm getting is:
ScalaFiddle.scala:21: error: pattern type is incompatible with expected type;
found : ScalaFiddle.this.Nil.type
required: ScalaFiddle.this.SmartStream.type
case Nil => false
^
ScalaFiddle.scala:22: error: constructor cannot be instantiated to expected type;
found : ScalaFiddle.this.Cons[A]
required: ScalaFiddle.this.SmartStream.type
case Cons(h, t) => p(h()) || t().exists(p)
^
Thanks in advance!
You're putting exists() in the SmartStream object (i.e. singleton). That means this is type SmartStream.type and can never be anything else.
If you move exists() to the trait, and remove the type parameter, things will compile.
sealed trait SmartStream[+A] {
def exists(p : A => Boolean) : Boolean = {
this match {
case Nil => false
case Cons(h, t) => p(h()) || t().exists(p)
}
}
}
There may be other deficiencies in the design, but at least this compiles.
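For example:

val s = SmartStream(1, 2, 3)
s.exists(_ > 2) // true: stops at the first matching element thanks to ||
s.exists(_ > 5) // false: traverses the whole stream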
I tried to implement foldLeft on a LinkedList in two ways: one curried (foldLeft2) and one not (foldLeft):
import scala.annotation.tailrec

sealed trait LinkedList[+E] {
#tailrec
final def foldLeft[A](accumulator: A, f: (A, E) => A): A = {
this match {
case Node(h, t) => {
val current = f(accumulator, h)
t.foldLeft(current, f)
}
case Empty => accumulator
}
}
#tailrec
final def foldLeft2[A](accumulator: A)(f: (A, E) => A): A = {
this match {
case Node(h, t) => {
val current = f(accumulator, h)
t.foldLeft2(current)(f)
}
case Empty => accumulator
}
}
}
But when I use foldLeft, it seems I need to declare the type for accumulator and item, but for foldLeft2, I don't. Can someone explain why that is?
import org.specs2.mutable.Specification

class LinkedListSpecification extends Specification {
"linked list" should {
"foldLeft correctly" in {
val original = LinkedList(1,2,3,4)
original.foldLeft(0, (acc: Int, item: Int) => acc + item) === 10
}
}
"linked list" should {
"foldLeft2 correctly" in {
val original = LinkedList(1,2,3,4)
original.foldLeft2(0)((acc, item) => acc + item) === 10
}
}
}
This is because type inference in Scala works left-to-right across parameter lists.
Thus, in the second version (foldLeft2), it is able to infer the type A as Int from the first parameter list before it continues to the next one, where it then expects a function (Int, E) => Int.
In the first version (foldLeft), it tries to infer A from both parameters (accumulator and f) at once, and it complains about the anonymous function you are passing because A has not been inferred yet.
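For example, pinning A explicitly in the uncurried version also lets the compiler infer the lambda's parameter types:

// A = Int is known before the lambda is typechecked, so no annotations needed:
original.foldLeft[Int](0, (acc, item) => acc + item) === 10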
I am trying to implement an implicit function mapMetered that wraps map and functions exactly like it in terms of returning the correct type. I tried this:
implicit class MeteredGenTraversablePimp[T, C[T] <: GenTraversable[T]](trav: C[T]) {
def foreachMetered(m: Meter)(f: T => Unit) =
m.foreach(trav)(f)
def mapMetered[B, That](m: Meter)(f: (T) => B)(
implicit bf: CanBuildFrom[C[T], B, That]
): That = {
m.start()
try {
trav.map { x =>
val z = f(x)
m.item_processed()
z
} (bf)
} finally { m.finish() }
}
}
But when I try this I get an error:
[info] Compiling 1 Scala source to /Users/benwing/devel/textgrounder/target/classes...
[error] /Users/benwing/devel/textgrounder/src/main/scala/opennlp/textgrounder/util/metering.scala:223: type mismatch;
[error] found : scala.collection.generic.CanBuildFrom[C[T],B,That]
[error] required: scala.collection.generic.CanBuildFrom[scala.collection.GenTraversable[T],B,That]
[error] } (bf)
[error] ^
[error] one error found
There are similar Stack Overflow questions, including one from Daniel Sobral, who suggests writing (trav: C[T] with GenTraversableLike[T]), but this doesn't fix the problem.
The Repr parameter in CanBuildFrom and in the *Like collection types is already there to represent the most precise type of the collection.
The solution to your problem is to wrap a GenTraversableLike[A,Repr] instead of a C[T].
The compiler will infer the exact type as Repr, and everything will work flawlessly:
implicit class MeteredGenTraversablePimp[A, Repr](trav: GenTraversableLike[A,Repr]) {
def foreachMetered(m: Meter)(f: A => Unit) = {
m.start()
try {
trav.foreach{ e => f( e ); m.item_processed() }
} finally {
m.finish()
}
}
def mapMetered[B, That](m: Meter)(f: A => B)(
implicit bf: CanBuildFrom[Repr, B, That]
): That = {
    m.start()
    try {
      trav.map { x: A =>
        val z: B = f(x)
        m.item_processed()
        z
      }(bf)
    } finally { m.finish() } // finish once, after the whole traversal, as in foreachMetered
}
}
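Usage then resolves Repr to the concrete collection type, so the result type is preserved. For instance, assuming a Meter with the interface used above (its construction is hypothetical here):

val m: Meter = new Meter() // hypothetical; construction depends on your Meter class
// Repr is inferred as List[Int], so the result is a List[Int], not a GenTraversable:
val doubled: List[Int] = List(1, 2, 3).mapMetered(m)(_ * 2)          // List(2, 4, 6)
val lengths: Vector[Int] = Vector("a", "bb").mapMetered(m)(_.length) // Vector(1, 2)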