flink scala map with dead letter queue - scala

i am trying to make some scala functions that would help making flink map and filter operations that redirect their error to a dead letter queue.
However, i'm struggling with scala's type erasure which prevents me from making them generic. The implementation of mapWithDeadLetterQueue below does not compile.
sealed trait ProcessingResult[T]
case class ProcessingSuccess[T,U](result: U) extends ProcessingResult[T]
case class ProcessingError[T: TypeInformation](errorMessage: String, exceptionClass: String, stackTrace: String, sourceMessage: T) extends ProcessingResult[T]
object FlinkUtils {
// https://stackoverflow.com/questions/1803036/how-to-write-asinstanceofoption-in-scala
implicit class Castable(val obj: AnyRef) extends AnyVal {
def asInstanceOfOpt[T <: AnyRef : ClassTag] = {
obj match {
case t: T => Some(t)
case _ => None
}
}
}
def mapWithDeadLetterQueue[T: TypeInformation,U: TypeInformation](source: DataStream[T], func: (T => U)): (DataStream[U], DataStream[ProcessingError[T]]) = {
val mapped = source.map(x => {
val result = Try(func(x))
result match {
case Success(value) => ProcessingSuccess(value)
case Failure(exception) => ProcessingError(exception.getMessage, exception.getClass.getName, exception.getStackTrace.mkString("\n"), x)
}
} )
val mappedSuccess = mapped.flatMap((x: ProcessingResult[T]) => x.asInstanceOfOpt[ProcessingSuccess[T,U]]).map(x => x.result)
val mappedFailure = mapped.flatMap((x: ProcessingResult[T]) => x.asInstanceOfOpt[ProcessingError[T]])
(mappedSuccess, mappedFailure)
}
}
I get:
[error] FlinkUtils.scala:35:36: overloaded method value flatMap with alternatives:
[error] [R](x$1: org.apache.flink.api.common.functions.FlatMapFunction[Product with Serializable with ProcessingResult[_ <: T],R], x$2: org.apache.flink.api.common.typeinfo.TypeInformation[R])org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator[R] <and>
[error] [R](x$1: org.apache.flink.api.common.functions.FlatMapFunction[Product with Serializable with ProcessingResult[_ <: T],R])org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator[R]
[error] cannot be applied to (ProcessingResult[T] => Option[ProcessingSuccess[T,U]])
[error] val mappedSuccess = mapped.flatMap((x: ProcessingResult[T]) => x.asInstanceOfOpt[ProcessingSuccess[T,U]]).map(x => x.result)
Is there a way to make this work ?

Ok, i'm going to answer my own question. I made a couple of mistakes:
First of all, i accidentically included the java DataStream class instead of the scala DataStream class (this happens all the time). The java variant obviously doesn't accept a scala lambda for map/filter/flatmap
Second, sealed traits are not supported by flinks serialisation. There is a project that should solve it but I didn't try it yet.
Solution: first i didn't use sealed trait but a simple case class with two Options (bit less expressive, but still works):
case class ProcessingError[T](errorMessage: String, exceptionClass: String, stackTrace: String, sourceMessage: T)
case class ProcessingResult[T: TypeInformation, U: TypeInformation](result: Option[U], error: Option[ProcessingError[T]])
Then, i could have everything working like so:
object FlinkUtils {
def mapWithDeadLetterQueue[T: TypeInformation: ClassTag,U: TypeInformation: ClassTag]
(source: DataStream[T], func: (T => U)):
(DataStream[U], DataStream[ProcessingError[T]]) = {
implicit val typeInfo = TypeInformation.of(classOf[ProcessingResult[T,U]])
val mapped = source.map((x: T) => {
val result = Try(func(x))
result match {
case Success(value) => ProcessingResult[T, U](Some(value), None)
case Failure(exception) => ProcessingResult[T, U](None, Some(
ProcessingError(exception.getMessage, exception.getClass.getName,
exception.getStackTrace.mkString("\n"), x)))
}
} )
val mappedSuccess = mapped.flatMap((x: ProcessingResult[T,U]) => x.result)
val mappedFailure = mapped.flatMap((x: ProcessingResult[T,U]) => x.error)
(mappedSuccess, mappedFailure)
}
}
the flatMap and filter functions look very similar, but they use a ProcessingResult[T,List[T]] and a ProcessingResult[T,T] respectively.
I use the functions like this:
val (result, errors) = FlinkUtils.filterWithDeadLetterQueue(input, (x: MyMessage) => {
x.`type` match {
case "something" => throw new Exception("how how how")
case "something else" => false
case _ => true
}
})

Related

How to find class parameter datatype at runtime in scala

import scala.reflect.runtime.universe
import scala.reflect.runtime.universe._
def getType[T: TypeTag](obj: T) = typeOf[T]
case class Thing(
val id: Int,
var name: String
)
val thing = Thing(1, "Apple")
val dataType = getType(thing).decl(TermName("id")).asTerm.typeSignature
dataType match {
case t if t =:= typeOf[Int] => println("I am Int")
case t if t =:= typeOf[String] => println("String, Do some stuff")
case _ => println("Absurd")
}
Not able to digest why result is Absurd instead of I am int?
My aim is to know data-type of class parameter at runtime and match it to predefined types.
Both current dataType and typeOf[Int] are printed as Int but if you do showRaw you'll see why they don't match
showRaw(dataType) // NullaryMethodType(TypeRef(ThisType(scala), scala.Int, List()))
showRaw(typeOf[Int]) // TypeRef(ThisType(scala), scala.Int, List())
The thing is that just the type Int and the type of nullary method returning Int are different types.
Try to add .resultType
val dataType = getType(thing).decl(TermName("id")).asTerm.typeSignature.resultType
dataType match {
case t if t =:= typeOf[Int] => println("I am Int")
case t if t =:= typeOf[String] => println("String, Do some stuff")
case _ => println("Absurd")
} // I am Int
It's also worth mentioning that .decl(TermName("id")) returns getter symbol, it's .decl(TermName("id ")) (with a blank space) that returns field symbol. So alternatively you can do with a blank space in the symbol name and without .resultType
val dataType = getType(thing).decl(TermName("id ")).asTerm.typeSignature
I'll add to #TomerShetah's answer that if the goal is "pattern matching" all fields of a case class then this can be done also at compile time (mostly) with Shapeless:
import shapeless.Poly1
import shapeless.syntax.std.product._
object printTypes extends Poly1 {
implicit val int: Case.Aux[Int, Unit] = at(t => println(s"I am Int: $t"))
implicit val string: Case.Aux[String, Unit] = at(t => println(s"String, Do some stuff: $t"))
implicit def default[V]: Case.Aux[V, Unit] = at(t => println(s"Absurd: $t"))
}
thing.toHList.map(printTypes)
// I am Int: 1
// String, Do some stuff: Apple
https://scastie.scala-lang.org/DmytroMitin/N4Idk4KcRumQJZE2CHC0yQ
#Dmytrio answer is a great explanation why the reflection didn't work as you expected.
I can understand from your question, that what you are trying to do, is actually pattern match all variables you have in a case class. Please consider doing it in the following way:
case class Thing(id: Int, name: String)
val thing = Thing(1, "Apple")
thing.productIterator.foreach {
case t: Int => println(s"I am Int: $t")
case t: String => println(s"String, Do some stuff: $t")
case t => println(s"Absurd: $t")
}
Code run at Scastie.

scala lower bound type "with" another type?

Here is the code
class Result[A] {
def map[B](f: (Result[A] => Result[B]), xResult: Result[A]) = {
xResult match {
case Success(x) => Success(f(xResult))
case Failure(errs) => Failure(errs)
}
}
def apply[B](fResult: Result[A], xResult: Result[B]) = {
(fResult, xResult) match {
case (Success(f), Success(x)) => Success((f, x))
case (Failure(errs), Success(a)) => Failure(errs)
case (Success(a), Failure(errs)) => Failure(errs)
case (Failure(errs), Failure(errs2)) => Failure(List(errs, errs2))
}
}
}
case class Success[A](a: A) extends Result[A] {}
case class Failure[A](a: A) extends Result[A] {}
def createCustomerId(id: Int) = {
if (id > 0)
Success(id)
else
Failure("CustomerId must be positive")
}
Here are the questions
1) result from type inference from method createCustomer is like this
Product with Serializable with Result[_ >: Int with String]
I dont have trouble with the "Product with Serializable" stuff, the thing im wondering is how to do something meaningfull with the result, and just out of curiosity how to initialize a type like that.
suppose i have another method as shown below
def createCustomer(customerId: Result[_ >: Int with String], email: Result[String]) = {
how can i read/do something with "customerId" argument
}
how to initialize a type like that
val x: Result[Int with String] = ???
notes:
i know my Result class doesnt have any generic "getter"
I know that perhaps things could be easier in method "createCustomerId" if when id > 0 i returned Success(id.ToString), but i really want to know whats going on with this lower type and how i could take advantage (if possible) in the future
The problem is that your type definition states that both Success and Failure take the same argument type. However, you actually want to put some other type of argument into Failure while still keeping covariance. Here is the simplest fix:
class Result[A] {
def map[B](f: (Result[A] => Result[B]), xResult: Result[A]) = {
xResult match {
case Success(x) => Success(f(xResult))
case Failure(errs) => Failure(errs)
}
}
def apply[B](fResult: Result[A], xResult: Result[B]) = {
(fResult, xResult) match {
case (Success(f), Success(x)) => Success((f, x))
case (Failure(errs), Success(a)) => Failure(errs)
case (Success(a), Failure(errs)) => Failure(errs)
case (Failure(errs), Failure(errs2)) => Failure(List(errs, errs2))
}
}
}
case class Success[A](a: A) extends Result[A] {}
case class Failure[A, B](a: B) extends Result[A] {}
object TestResult extends App {
def createCustomerId(id: Int): Result[Int] = {
if (id > 0)
Success(id)
else
Failure("CustomerId must be positive")
}
println(createCustomerId(0))
println(createCustomerId(1))
}
prints:
Failure(CustomerId must be positive)
Success(1)
If you don't annotate return type for createCustomerId with Result[Int] compiler will infer Result[_ <: Int] with Product with Serializable which is probably not ideal but not too bad.

Extract class from Option[T] when in None clause

Assuming you have the following code
trait T {
}
case class First(int:Int) extends T
case class Second(int:Int) extends T
val a:Option[T] = Option(First(3))
val b:Option[Second] = None
def test[A](a:Option[A])(implicit manifest:Manifest[Option[A]]) = {
a match {
case Some(z) => println("we have a value here")
case None => a match {
case t:Option[First] => println("instance of first class")
case s:Option[Second] => println("instance of second class")
}
}
}
test(b)
How would you extract what type the enclosing A is if the option happens to be a None. I have attempted to do various combinations of manifests, but none of them seem to work, each time it complains about the types being eliminated with erasure, i.e.
non-variable type argument First in type pattern Option[First] is unchecked since it is eliminated by erasure
You can use a type tag, a modern replacement for Manifest:
import scala.reflect.runtime.universe._
trait T
case class First(int:Int) extends T
case class Second(int:Int) extends T
def test[A: TypeTag](a: Option[A]) = {
a match {
case Some(_) => println("we have a value here")
case None => typeOf[A] match {
case t if t =:= typeOf[First] => println("instance of First")
case t if t =:= typeOf[Second] => println("instance of Second")
case t => println(s"unexpected type $t")
}
}
}
Example
val a = Option(First(3)) // we have a value here
val b: Option[First] = None // instance of First
val c: Option[Second] = None // instance of Second
val d: Option[Int] = None // unexpected type Int
By the way, you are interested in the type of A (which is being eliminated by erasure), so even with manifests you need one on A and not on Option[A].

Scala - Resolving Templated Type

I have a templated class, and depending on the type of T, I want to print something different.
class Clazz[T](fn: T => String) {}
Ideally, I would like to do something like pattern match T (which I can't do):
T match {
case x:Int => println("function from int to string")
case _ => //... and so forth
}
I tried:
class Clazz[T](fn: T => String) {
def resolve():String = fn match {
case f:(Int => String) => println("function from int to string")
case _ => //...etc.
}
}
and the error message I got was that:
non-variable type argument Int in type pattern Int => String is unchecked since it is eliminated by erasure
Is there a way to work around this?
I would use TypeTag so I can compare T to other types without losing any type info:
import scala.reflect.runtime.universe._
class Clazz[T: TypeTag](fn: T => String) {
val resolve = typeOf[T] match {
case t if t =:= typeOf[Int] => "function from Int to String"
case t if t =:= typeOf[Byte] => ..
}
}
But if I just wanted to print out something different from toString, I would go with a type class:
trait Print[T] {
def toText(t: T): String
def apply(t: T) = print(toText(t))
}
def print[T: Print](t: T) = implicitly[Print[T]].apply(t)
implicit object PrintInt extends Print[Int] {
def toText(t: Int) = s"Int: $t"
}
implicit object PrintByte extends Print[Byte] {
def toText(t: Byte) = s"Byte: $t"
}
PrintInt(3) //=> Int: 3
print(3) //=> Int: 3
3.print is even possible if you add this:
implicit class Printable[T:Print](t: T) {
def print = implicitly[Print[T]].apply(t)
}
3.print //=> Int: 3

ClassTag based pattern matching fails for primitives

I thought the following would be the most concise and correct form to collect elements of a collection which satisfy a given type:
def typeOnly[A](seq: Seq[Any])(implicit tag: reflect.ClassTag[A]): Seq[A] =
seq.collect {
case tag(t) => t
}
But this only works for AnyRef types, not primitives:
typeOnly[String](List(1, 2.3, "foo")) // ok. List(foo)
typeOnly[Double](List(1, 2.3, "foo")) // fail. List()
Obviously the direct form works:
List(1, 2.3, "foo") collect { case d: Double => d } // ok. List(2.3)
So there must be a (simple!) way to fix the above method.
It's boxed in the example, right?
scala> typeOnly[java.lang.Double](vs)
res1: Seq[Double] = List(2.3)
Update: The oracle was suitably cryptic: "boxing is supposed to be invisible, plus or minus". I don't know if this case is plus or minus.
My sense is that it's a bug, because otherwise it's all an empty charade.
More Delphic demurring: "I don't know what the given example is expected to do." Notice that it is not specified, expected by whom.
This is a useful exercise in asking who knows about boxedness, and what are the boxes? It's as though the compiler were a magician working hard to conceal a wire which keeps a playing card suspended in midair, even though everyone watching already knows there has to be a wire.
scala> def f[A](s: Seq[Any])(implicit t: ClassTag[A]) = s collect {
| case v if t.runtimeClass.isPrimitive &&
| ScalaRunTime.isAnyVal(v) &&
| v.getClass.getField("TYPE").get(null) == t.runtimeClass =>
| v.asInstanceOf[A]
| case t(x) => x
| }
f: [A](s: Seq[Any])(implicit t: scala.reflect.ClassTag[A])Seq[A]
scala> f[Double](List(1,'a',(),"hi",2.3,4,3.14,(),'b'))
res45: Seq[Double] = List(2.3, 3.14)
ScalaRunTime is not supported API -- the call to isAnyVal is just a match on the types; one could also just check that the "TYPE" field exists or
Try(v.getClass.getField("TYPE").get(null)).map(_ == t.runtimeClass).getOrElse(false)
But to get back to a nice one-liner, you can roll your own ClassTag to handle the specially-cased extractions.
Version for 2.11. This may not be bleeding edge, but it's the recently cauterized edge.
object Test extends App {
implicit class Printable(val s: Any) extends AnyVal {
def print = Console println s.toString
}
import scala.reflect.{ ClassTag, classTag }
import scala.runtime.ScalaRunTime
case class Foo(s: String)
val vs = List(1,'a',(),"hi",2.3,4,Foo("big"),3.14,Foo("small"),(),null,'b',null)
class MyTag[A](val t: ClassTag[A]) extends ClassTag[A] {
override def runtimeClass = t.runtimeClass
/*
override def unapply(x: Any): Option[A] = (
if (t.runtimeClass.isPrimitive && (ScalaRunTime isAnyVal x) &&
x.getClass.getField("TYPE").get(null) == t.runtimeClass)
Some(x.asInstanceOf[A])
else super.unapply(x)
)
*/
override def unapply(x: Any): Option[A] = (
if (t.runtimeClass.isPrimitive) {
val ok = x match {
case _: java.lang.Integer => runtimeClass == java.lang.Integer.TYPE
//case _: java.lang.Double => runtimeClass == java.lang.Double.TYPE
case _: java.lang.Double => t == ClassTag.Double // equivalent
case _: java.lang.Long => runtimeClass == java.lang.Long.TYPE
case _: java.lang.Character => runtimeClass == java.lang.Character.TYPE
case _: java.lang.Float => runtimeClass == java.lang.Float.TYPE
case _: java.lang.Byte => runtimeClass == java.lang.Byte.TYPE
case _: java.lang.Short => runtimeClass == java.lang.Short.TYPE
case _: java.lang.Boolean => runtimeClass == java.lang.Boolean.TYPE
case _: Unit => runtimeClass == java.lang.Void.TYPE
case _ => false // super.unapply(x).isDefined
}
if (ok) Some(x.asInstanceOf[A]) else None
} else if (x == null) { // let them collect nulls, for example
if (t == ClassTag.Null) Some(null.asInstanceOf[A]) else None
} else super.unapply(x)
)
}
implicit def mytag[A](implicit t: ClassTag[A]): MyTag[A] = new MyTag(t)
// the one-liner
def g[A](s: Seq[Any])(implicit t: ClassTag[A]) = s collect { case t(x) => x }
// this version loses the "null extraction", if that's a legitimate concept
//def g[A](s: Seq[Any])(implicit t: ClassTag[A]) = s collect { case x: A => x }
g[Double](vs).print
g[Int](vs).print
g[Unit](vs).print
g[String](vs).print
g[Foo](vs).print
g[Null](vs).print
}
For 2.10.x, an extra line of boilerplate because implicit resolution is -- well, we won't say it's broken, we'll just say it doesn't work.
// simplified version for 2.10.x
object Test extends App {
implicit class Printable(val s: Any) extends AnyVal {
def print = Console println s.toString
}
case class Foo(s: String)
val vs = List(1,'a',(),"hi",2.3,4,Foo("big"),3.14,Foo("small"),(),null,'b',null)
import scala.reflect.{ ClassTag, classTag }
import scala.runtime.ScalaRunTime
// is a ClassTag for implicit use in case x: A
class MyTag[A](val t: ClassTag[A]) extends ClassTag[A] {
override def runtimeClass = t.runtimeClass
override def unapply(x: Any): Option[A] = (
if (t.runtimeClass.isPrimitive && (ScalaRunTime isAnyVal x) &&
(x.getClass getField "TYPE" get null) == t.runtimeClass)
Some(x.asInstanceOf[A])
else t unapply x
)
}
// point of the exercise in implicits is the type pattern.
// there is no need to neutralize the incoming implicit by shadowing.
def g[A](s: Seq[Any])(implicit t: ClassTag[A]) = {
implicit val u = new MyTag(t) // preferred as more specific
s collect { case x: A => x }
}
s"Doubles? ${g[Double](vs)}".print
s"Ints? ${g[Int](vs)}".print
s"Units? ${g[Unit](vs)}".print
s"Strings? ${g[String](vs)}".print
s"Foos? ${g[Foo](vs)}".print
}
Promoting a comment:
#WilfredSpringer Someone heard you. SI-6967