DSE 5.0 Custom Codecs in Scala - IntelliJ does not compile

I am trying to write a custom codec to convert Cassandra columns of type timestamp to org.joda.time.DateTime.
I am building my project with sbt version 0.13.13.
I wrote a test that serializes and deserializes a DateTime object. When I run the test via the command line with sbt "test:testOnly *DateTimeCodecTest", the project builds and the test passes.
However, if I try to build the project inside IntelliJ, I receive the following error:
Error:(17, 22) overloaded method constructor TypeCodec with alternatives:
(x$1: com.datastax.driver.core.DataType,x$2: shade.com.datastax.spark.connector.google.common.reflect.TypeToken[org.joda.time.DateTime])com.datastax.driver.core.TypeCodec[org.joda.time.DateTime] <and>
(x$1: com.datastax.driver.core.DataType,x$2: Class[org.joda.time.DateTime])com.datastax.driver.core.TypeCodec[org.joda.time.DateTime]
cannot be applied to (com.datastax.driver.core.DataType, com.google.common.reflect.TypeToken[org.joda.time.DateTime])
object DateTimeCodec extends TypeCodec[DateTime](DataType.timestamp(), TypeToken.of(classOf[DateTime]).wrap()) {
Here is the codec:
import java.nio.ByteBuffer
import com.datastax.driver.core.exceptions.InvalidTypeException
import com.datastax.driver.core.{ DataType, ProtocolVersion, TypeCodec }
import com.google.common.reflect.TypeToken
import org.joda.time.{ DateTime, DateTimeZone }
/**
* Provides serialization between Cassandra types and org.joda.time.DateTime
*
* Reference for writing custom codecs in Scala:
* https://www.datastax.com/dev/blog/writing-scala-codecs-for-the-java-driver
*/
object DateTimeCodec extends TypeCodec[DateTime](DataType.timestamp(), TypeToken.of(classOf[DateTime]).wrap()) {

  override def serialize(value: DateTime, protocolVersion: ProtocolVersion): ByteBuffer = {
    if (value == null) return null
    val millis: Long = value.getMillis
    TypeCodec.bigint().serializeNoBoxing(millis, protocolVersion)
  }

  override def deserialize(bytes: ByteBuffer, protocolVersion: ProtocolVersion): DateTime = {
    val millis: Long = TypeCodec.bigint().deserializeNoBoxing(bytes, protocolVersion)
    new DateTime(millis).withZone(DateTimeZone.UTC)
  }

  // Do we need a formatter?
  override def format(value: DateTime): String = value.getMillis.toString

  // Do we need a formatter?
  override def parse(value: String): DateTime = {
    try {
      if (value == null ||
          value.isEmpty ||
          value.equalsIgnoreCase("NULL")) throw new Exception("Cannot produce a DateTime object from empty value")
      // Do we need a formatter?
      else new DateTime(value)
    } catch {
      // TODO: Determine the more specific exception that would be thrown in this case
      case e: Exception =>
        throw new InvalidTypeException(s"""Cannot parse DateTime from "$value"""", e)
    }
  }
}
and here is the test:
import com.datastax.driver.core.ProtocolVersion
import org.joda.time.{ DateTime, DateTimeZone }
import org.scalatest.FunSpec
class DateTimeCodecTest extends FunSpec {

  describe("Serialization") {

    it("should serialize between Cassandra types and org.joda.time.DateTime") {
      val now = new DateTime().withZone(DateTimeZone.UTC)
      val result = DateTimeCodec.deserialize(
        // TODO: Verify correct ProtocolVersion for DSE 5.0
        DateTimeCodec.serialize(now, ProtocolVersion.V4), ProtocolVersion.V4
      )
      assertResult(now)(result)
    }
  }
}
I make extensive use of the debugger within IntelliJ, as well as the ability to quickly run a single test with a few hotkeys. Losing the ability to compile within the IDE is almost as bad as losing the ability to compile at all. Any help would be appreciated, and I am more than happy to provide any additional information about my project / environment if anyone needs it.
Edit, update:
The project compiles within IntelliJ if I provide an instance of com.google.common.reflect.TypeToken as opposed to shade.com.datastax.spark.connector.google.common.reflect.TypeToken.
However, this breaks the build within sbt.

You must create a default constructor for DateTimeCodec.

I resolved the issue.
The issue stemmed from conflicting versions of spark-cassandra-connector on the classpath. Both shaded and unshaded versions of the dependency were on the classpath, and removing the shaded dependency fixed the issue.
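For anyone hitting the same symptom, below is a rough sketch of how one might find and drop the duplicate artifact in an sbt 0.13 build; the module name being filtered is a placeholder, not the actual artifact from this project:
// build.sbt sketch. List the resolved classpath first, from the sbt shell:
//   show compile:dependencyClasspath
// then keep only one of the conflicting connector artifacts:
libraryDependencies := libraryDependencies.value.filterNot { module =>
  module.name.contains("connector-shaded") // placeholder name for the shaded duplicate
}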

Related

How to use Avro serialization for scala case classes with Flink 1.7?

We've got a Flink job written in Scala using case classes (generated from avsc files by avrohugger) to represent our state. We would like to use Avro for serialising our state so that state migration will work when we update our models. We understood that since Flink 1.7, Avro serialization is supported out of the box. We added the flink-avro module to the classpath, but when restoring from a saved snapshot we notice that it's still trying to use Kryo serialization. Relevant code snippet:
case class Foo(id: String, timestamp: java.time.Instant)
val env = StreamExecutionEnvironment.getExecutionEnvironment
val conf = env.getConfig
conf.disableForceKryo()
conf.enableForceAvro()
val rawDataStream: DataStream[String] = env.addSource(MyFlinkKafkaConsumer)
val parsedDataSteam: DataStream[Foo] = rawDataStream.flatMap(new JsonParser[Foo])
// do something useful with it
env.execute("my-job")
When performing a state migration on Foo (e.g. by adding a field and deploying the job) I see that it tries to deserialize using Kryo, which obviously fails. How can I make sure Avro serialization is being used?
UPDATE
Found out about https://issues.apache.org/jira/browse/FLINK-10897, so POJO state serialization with Avro is only supported from 1.8, AFAIK. I tried it using the latest RC of 1.8 with a simple WordCount POJO that extends SpecificRecord:
/** MACHINE-GENERATED FROM AVRO SCHEMA. DO NOT EDIT DIRECTLY */
import scala.annotation.switch
case class WordWithCount(var word: String, var count: Long) extends
    org.apache.avro.specific.SpecificRecordBase {

  def this() = this("", 0L)

  def get(field$: Int): AnyRef = {
    (field$: @switch) match {
      case 0 => {
        word
      }.asInstanceOf[AnyRef]
      case 1 => {
        count
      }.asInstanceOf[AnyRef]
      case _ => new org.apache.avro.AvroRuntimeException("Bad index")
    }
  }

  def put(field$: Int, value: Any): Unit = {
    (field$: @switch) match {
      case 0 => this.word = {
        value.toString
      }.asInstanceOf[String]
      case 1 => this.count = {
        value
      }.asInstanceOf[Long]
      case _ => new org.apache.avro.AvroRuntimeException("Bad index")
    }
    ()
  }

  def getSchema: org.apache.avro.Schema = WordWithCount.SCHEMA$
}
object WordWithCount {
  val SCHEMA$ = new org.apache.avro.Schema.Parser().parse(
    "{\"type\":\"record\",\"name\":\"WordWithCount\",\"fields\":" +
      "[{\"name\":\"word\",\"type\":\"string\"}," +
      "{\"name\":\"count\",\"type\":\"long\"}]}")
}
This, however, also didn’t work out of the box. We then tried to define our own type information using flink-avro’s AvroTypeInfo but this fails because Avro looks for a SCHEMA$ property (SpecificData:285) in the class and is unable to use Java reflection to identify the SCHEMA$ in the Scala companion object.
I could never get reflection to work due to Scala's fields being private under the hood. AFAIK the only solution is to update Flink to use avro's non-reflection-based constructors in AvroInputFormat (compare).
In a pinch, other than using Java, one could fall back to Avro's GenericRecord, and maybe use avro4s to generate them from avrohugger's Standard format (note that avro4s will generate its own schema from the generated Scala types).
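For illustration, here is a minimal sketch of that fallback with avro4s; AvroSchema and RecordFormat are the avro4s entry points, while the simplified Foo below and the exact usage are assumptions and may differ between avro4s versions:
import com.sksamuel.avro4s.{ AvroSchema, RecordFormat }
import org.apache.avro.generic.GenericRecord

// Simplified stand-in for the avrohugger-generated case class.
case class Foo(id: String, count: Long)

object GenericRecordFallback {
  val schema = AvroSchema[Foo]   // org.apache.avro.Schema derived from the case class
  val format = RecordFormat[Foo] // converts between Foo and GenericRecord

  def toRecord(foo: Foo): GenericRecord = format.to(foo)
  def fromRecord(record: GenericRecord): Foo = format.from(record)
}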

Why Scala Enumeration does not work in Apache Zeppelin but it works in maven

Enumeration works as expected when I use it in a Maven project (with the same Scala version).
object t {
  object DashStyle extends Enumeration {
    val Solid, ShortDash = Value
  }
  def f(style: DashStyle.Value) = println(style)
  def main(args: Array[String]) = f(DashStyle.Solid)
}
But when it runs in Apache Zeppelin (Zeppelin 0.6, Spark 1.6, Scala 2.10, Java 1.8):
object DashStyle extends Enumeration {
  val Solid, ShortDash = Value
}
def f(style: DashStyle.Value) = println(style)
f(DashStyle.Solid)
It reports the following error, even though it says the found and required types are exactly the same:
<console>:130: error: type mismatch;
found : DashStyle.Value
required: DashStyle.Value
f(DashStyle.Solid)
Why and how should I use it?
I figured out the trick to solve this issue. In Apache Zeppelin (or the Scala REPL), an Enumeration or a sealed trait/object should be wrapped in an object rather than defined directly at the root scope.
The reason it works in Maven is that I had already put it inside an object.
Define the enumeration in an object in a Zeppelin paragraph:
object t {
  object DashStyle extends Enumeration {
    val Solid, ShortDash = Value
  }
  def f(style: DashStyle.Value) = println(style)
}
Then use it in a Zeppelin paragraph:
import t._
f(DashStyle.Solid)

Having a custom QueryStringBindable in the scope with Play 2.4.3

I would like to have java.sql.Date and Option[java.sql.Date] as query parameters in my Play Scala project, which don't come as defaults with the Play framework. The Play version I'm using is 2.4.3. I have the following (rough) class.
package binders

import java.sql.Date
import org.joda.time.format.ISODateTimeFormat
import play.api.mvc.QueryStringBindable

object CustomBinders {

  val dateFormat = ISODateTimeFormat.date()

  implicit def dateBinder: QueryStringBindable[Date] = new QueryStringBindable[Date] {

    def bind(key: String, params: Map[String, Seq[String]]): Option[Either[String, Date]] = {
      val dateString: Option[Seq[String]] = params.get(key)
      try {
        Some(Right(new Date(dateFormat.parseDateTime(dateString.get.head).getMillis)))
      } catch {
        case e: IllegalArgumentException => Option(Left(dateString.get.head))
      }
    }

    def unbind(key: String, value: Date): String = {
      dateFormat.print(value.getTime)
    }
  }
}
Then in Build.scala I have
import play.sbt.routes.RoutesKeys
object Build extends Build {
  RoutesKeys.routesImport += "binders.CustomBinders.dateBinder"
  RoutesKeys.routesImport += "binders.CustomBinders.optionDateBinder"
However, if I define a query parameter with Option[Date], for example, I get an error:
No QueryString binder found for type Option[java.sql.Date]. Try to implement an implicit QueryStringBindable for this type.
So they obviously aren't in scope. How should I define the binders so that they are in scope? I can't find the 2.4 documentation for this, and the 2.5 documentation doesn't say anything about needing to add them to Build.scala.
So apparently Build.scala wasn't the right place, even though some documentation tells you to put it there. With the following in build.sbt:
routesImport += "binders.CustomBinders._"
the project compiles just fine. I fixed some faults in the binder from the original post as well.
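For reference, the optionDateBinder imported via routesImport could look something like the sketch below, placed inside CustomBinders next to dateBinder; the exact shape is an assumption, and Play also ships a generic Option binder that can pick up an implicit QueryStringBindable[Date] in scope:
// Sketch, assuming it lives inside CustomBinders so the imports above apply.
implicit def optionDateBinder: QueryStringBindable[Option[Date]] =
  new QueryStringBindable[Option[Date]] {
    def bind(key: String, params: Map[String, Seq[String]]): Option[Either[String, Option[Date]]] =
      params.get(key) match {
        case None    => Some(Right(None)) // parameter absent: bind successfully to None
        case Some(_) => dateBinder.bind(key, params).map(either => either.right.map(Some(_)))
      }
    def unbind(key: String, value: Option[Date]): String =
      value.map(dateBinder.unbind(key, _)).getOrElse("")
  }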

AspectJ can't work on Scala function literal?

I have the following Scala class and annotated AspectJ class:
package playasjectj
import org.aspectj.lang.annotation.Pointcut
import org.aspectj.lang.annotation.Aspect
import org.aspectj.lang.annotation.Before
class Entity {
  def foo(p: String): String = {
    return p
  }

  def bar(handler: (String, String, Long) => String): Unit = {
    handler("first", "second", 100L)
  }
}

object Entity {
  def main(args: Array[String]) {
    val inst = new Entity
    inst.foo("we are the champion")
    val handler = (first: String, second: String, value: Long) => {
      first + second + ":" + value
    }
    inst.bar(handler)
  }
}

@Aspect
class EntityAspect {

  @Pointcut("execution(* foo(String)) && target(playasjectj.Entity) && args(p)")
  def pointcut_foo(p: String): Unit = {}

  @Pointcut("execution(* bar(scala.Function3<String,String,Long,String>)) && target(playasjectj.Entity) && args(handler)")
  def pointcut_bar(handler: (String, String, Long) => String): Unit = {}

  @Before("pointcut_foo(p)")
  def beforeAdvice_foo(p: String): Unit = {
    println("before advice foo: " + p)
  }

  @Before("pointcut_bar(handler)")
  def beforeAdvice_bar(handler: (String, String, Long) => String): Unit = {
    println("before advice bar:")
  }
}
The advice on foo works well, but the advice on bar doesn't. There are no errors; the execution of bar simply isn't caught.
[AppClassLoader@14dad5dc] info AspectJ Weaver Version 1.8.5 built on Thursday Jan 29, 2015 at 01:03:58 GMT
[AppClassLoader@14dad5dc] info register classloader sun.misc.Launcher$AppClassLoader@14dad5dc
[AppClassLoader@14dad5dc] info using configuration /Users/grant/programming/java/workspace/playasjectj/bin/META-INF/aop.xml
[AppClassLoader@14dad5dc] info register aspect playasjectj.EntityAspect
before advice foo: we are the champion
Does anyone know how to solve the problem? I guess it is related to how Scala transforms the tuple into a Java class.
Solved it myself. The problem is not the function literal but the type Long. If I use java.lang.Long there instead, it works. For generic type parameters, AspectJ expects a reference type, not a primitive, and Scala numeric types like Int and Long (and even Boolean) are equivalent to Java primitive types.
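Based on that finding, the corrected pointcut would look roughly like this (a sketch of the change described above, not verified beyond what the answer states):
// Use the boxed java.lang.Long in the Function3 generic signature.
@Pointcut("execution(* bar(scala.Function3<String,String,java.lang.Long,String>)) && target(playasjectj.Entity) && args(handler)")
def pointcut_bar(handler: (String, String, Long) => String): Unit = {}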

Scala SBT: possible bug?

When I "sbt run" the following code,
package com.example
import java.io.ObjectInputStream
import java.io.ObjectOutputStream
import java.io.FileInputStream
import java.io.FileOutputStream
object SimpleFailure extends App {

  case class MyClass(a: String, b: Int, c: Double)

  def WriteObjectToFile[A](obj: A, filename: String) {
    val output = new ObjectOutputStream(new FileOutputStream(filename, false))
    output.writeObject(obj)
  }

  def ReadObjectFromFile[A](filename: String)(implicit m: Manifest[A]): A = {
    val obj = new ObjectInputStream(new FileInputStream(filename)) readObject

    obj match {
      case a if m.erasure.isInstance(a) => a.asInstanceOf[A]
      case _ => { sys.error("Type not what was expected when reading from file") }
    }
  }

  val orig = MyClass("asdf", 42, 2.71)
  val filename = "%s/delete_me.spckl".format(System.getProperty("user.home"))
  WriteObjectToFile(List(orig), filename)

  val loaded = try {
    ReadObjectFromFile[List[MyClass]](filename)
  } catch { case e => e.printStackTrace; throw e }

  println(loaded(0))
}
I get the following exception:
java.lang.ClassNotFoundException: com.example.SimpleFailure$MyClass
However, I can run the code fine in Eclipse with the Scala plugin. Is this an SBT bug? Interestingly, the problem only comes up when wrapping MyClass in a List (see how "orig" is wrapped in a List in the WriteObjectToFile call). If I don't wrap in a List, everything works fine.
Put this in your build.sbt or project file:
fork in run := true
The problem seems to be with the classloader that gets used when sbt loads your code. ObjectInputStream describes its default classloader resolution, which walks the stack. Normally, this ends up finding the loader associated with the program in question, but in this case it ends up using the wrong one.
I was able to work around this by including the following class in my code, and using it instead of ObjectInputStream directly.
package engine;

import java.io.InputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

class LocalInputStream extends ObjectInputStream {

  LocalInputStream(InputStream in) throws IOException {
    super(in);
  }

  @Override
  protected Class<?> resolveClass(ObjectStreamClass desc)
      throws ClassNotFoundException {
    return Class.forName(desc.getName(), false,
        this.getClass().getClassLoader());
  }
}
This overrides the resolveClass method, and always uses one associated with this particular class. As long as this class is the one that is part of your app, this should work.
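A usage sketch, assuming LocalInputStream is made accessible to the calling code (for example by making it public, or by keeping the caller in the same package):
// Swap LocalInputStream in wherever ObjectInputStream was used for reading.
val in = new LocalInputStream(new FileInputStream(filename))
val loaded = in.readObject().asInstanceOf[List[MyClass]]
in.close()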
BTW, this is faster than requiring fork in run, and it also works with the Play framework, which currently doesn't support forking in dev mode.
I was able to reproduce this too using sbt 0.10.1 and scalaVersion := "2.9.0-1". You should probably just report it on GitHub or bring it up on the mailing list.