Scala Data Modeling and Generics

Scala Data Modeling and Generics - scala

I'm using the Play Framework and Squeryl to make a fairly basic front end for a database, but I know I'm rewriting too much code. I have different models to represent data in my db, and they all do the same six functions
object ModelType{
def add(model:ModelType):Option[ModelType] = Option(AppDB.tablename.insert(model))
def remove(id: Long) = AppDB.tablename.delete(id)
def getAll():List[ModelType] = from(AppDB.tablename)(model => select(model) orderBy(model.aDifferentFieldForEachModel)) toList
def toJson(model:ModelType):JsValue ={
Json.toJson(
Map("field" -> Json.toJson(model.field))
)
}
def allToJson() = {
val json:List[JsValue] = getAll.map{toJson(_)}
Json.toJson(json.toSeq)
}
def validate(different values for each model) = // is fairly different for each one. Validates the submitted fields from a user
}
So I'm using case classes for each of the models, and using an accompanying object for these commands. How can I use generics or traits in scala to make my life easier and not type all of these methods out every time?
EDIT: Mostly solved with gzm0's answer, but the problem is now how would I implement getAll in the trait? I want to be able to save some variable for each model that resembles the model.aDifferentFieldForEachModel as above.

You could try the following:
trait ModelOps[T] {
def table: AppDB.Table // not sure about type
def order: AppDB.OrderByPredicate // not sure about type
def toJson(model: T): JsValue
def add(model: T): Option[T] = Option(AppDB.categories.insert(model))
def remove(id: Long) = AppDB.categories.delete(id)
def getAll(): List[T] = from(table)(model => select(model) orderBy(order)) toList
def allToJson() = {
val json:List[JsValue] = getAll.map{toJson(_)}
Json.toJson(json.toSeq)
}
}
Then you can for each model type:
object ModelType extends ModelOps[ModelType] {
def table = AppDB.examples
def order = yourPredicate
def toJson(model:ModelType):JsValue = {
Json.toJson(Map("field" -> Json.toJson(model.field)))
}
def validate(different values for each model) = // is fairly different for each one. Validates the submitted fields from a user
}
UPDATE About the true type of AppDB.OrderByPredicate:
Calling select on PrimitiveTypeMode returns a SelectState. On this SelectState, you will call orderBy which takes a List[BaseQueryYield#O] (or multiple of those in the same argument list). Hence you should define:
def order(model: T): List[BaseQueryYield#O]
and
def getAll() = from(table)(model => select(model) orderBy(order(model))) toList
By the way, BaseQueryYield#O resolves to ExpressionNode.

Related

Handling loads of different message-types at runtime in an elegant way

In order to be able to handle large amounts of different request types I created a .proto file like this:
message Message
{
string typeId = 1;
bytes message = 2;
}
I added the typeId so that one knows what actual protobuf bytes represents. (Self-describing)
Now my problem is handling that different "concrete types" in an elegant way. (Note: All works fine if I simple use a switch-case-like approach!)
I thought about a solution like this:
1) Have a trait the different handlers have to implement, e.g.:
trait Handler[T]
{
def handle(req: T): Any
}
object TestHandler extends Handler[Test]
{
override def handle(req: Test): String =
{
s"A success, $req has been handled by TestHandler
}
}
object OtherHandler extends Handler[Other]
{
override def handle(req: Other): String =
{
s"A success, $req has been handled by OtherHandler
}
}
2) provide some kind of registry to query the right handler for a given message:
val handlers = Map(
Test -> TestHandler,
Other -> OtherHandler
)
3) If a request comes in it identifies itself, so we need another Mapper:
val reqMapper = Map(
"Test" -> Test
"Other" -> Other
)
4) If a request comes in, handle it:
val request ...
// Determine the requestType
val requestType = reqMapper(request.type)
// Find the correct handler for the requestType
val handler = handlers(requestType)
// Parse the actual request
val actualRequest = requestType.parse(...) // type of actualRequest can only be Test or Other in our little example
Now, until here everything looks fine and dandy, but then this line breaks my whole world:
handler.handle(actualRequest)
It leads to:
type mismatch; found : com.trueaccord.scalapb.GeneratedMessage with Product with com.trueaccord.scalapb.Message[_ >: tld.test.proto.Message.Test with tld.test.proto.Message.Other <: com.trueaccord.scalapb.GeneratedMessage with Product] with com.trueaccord.lenses.Updatable[_ >: tld.test.proto.Message.Other with tld.test.proto.Message.Test <: com.trueaccord.scalapb.GeneratedMessage with Product]{def companion: Serializable} required: _1
As far as I understand - PLEASE CORRECT ME HERE IF AM WRONG - the compiler cannot be sure here, that actualRequest is "handable" by a handler. That means it lacks the knowledge that the actualRequest is definitely somewhere in that mapper AND ALSO that there is a handler for it.
It's basically implicit knowledge a human would get, but the compiler cannot infer.
So, that being said, how can I overcome that situation elegantly?

your types are lost when you use a normal Map. for eg
object Test{}
object Other{}
val reqMapper = Map("Test" -> Test,"Other" -> Other)
reqMapper("Test")
res0: Object = Test$#5bf0fe62 // the type is lost here and is set to java.lang.Object
the most idomatic way to approach this is to use pattern matching
request match {
case x: Test => TestHandler(x)
case x: Other => OtherHandler(x)
case _ => throw new IllegalArgumentException("not supported")
}
if you still want to use Maps to store your type to handler relation consider HMap provided by Shapeless here
Heterogenous maps
Shapeless provides a heterogenous map which supports an arbitrary
relation between the key type and the corresponding value type,

I settled for this solution for now (basically thesamet's, a bit adapted for my particular use-case)
trait Handler[T <: GeneratedMessage with Message[T], R]
{
implicit val cmp: GeneratedMessageCompanion[T]
def handle(bytes: ByteString): R = {
val msg: T = cmp.parseFrom(bytes.newInput())
handler(msg)
}
def apply(t: T): R
}
object Test extends Handler[Test, String]
{
override def apply(t: Test): String = s"$t received and handled"
override implicit val cmp: GeneratedMessageCompanion[Test] = Test.messageCompanion
}

One trick you can use is to capture the companion object as an implicit, and combine the parsing and handling in a single function where the type is available to the compiler:
case class Handler[T <: GeneratedMessage with Message[T]](handler: T => Unit)(implicit cmp: GeneratedMessageCompanion[T]) {
def handle(bytes: ByteString): Unit = {
val msg: T = cmp.parseFrom(bytes.newInputStream)
handler(t)
}
}
val handlers: Map[String, Handler[_]] = Map(
"X" -> Handler((x: X) => Unit),
"Y" -> Handler((x: Y) => Unit)
)
// To handle the request:
handlers(request.typeId).handle(request.message)
Also, take a look at any.proto which defines a structure very similar to your Message. It wouldn't solve your problem, but you can take advantage of it's pack and unpack methods.

Access Spark broadcast variable in different classes

I am broadcasting a value in Spark Streaming application . But I am not sure how to access that variable in a different class than the class where it was broadcasted.
My code looks as follows:
object AppMain{
def main(args: Array[String]){
//...
val broadcastA = sc.broadcast(a)
//..
lines.foreachRDD(rdd => {
val obj = AppObject1
rdd.filter(p => obj.apply(p))
rdd.count
}
}
object AppObject1: Boolean{
def apply(str: String){
AnotherObject.process(str)
}
}
object AnotherObject{
// I want to use broadcast variable in this object
val B = broadcastA.Value // compilation error here
def process(): Boolean{
//need to use B inside this method
}
}
Can anyone suggest how to access broadcast variable in this case?

There is nothing particularly Spark specific here ignoring possible serialization issues. If you want to use some object it has to be available in the current scope and you can achieve this the same way as usual:
you can define your helpers in a scope where broadcast is already defined:
{
...
val x = sc.broadcast(1)
object Foo {
def foo = x.value
}
...
}
you can use it as a constructor argument:
case class Foo(x: org.apache.spark.broadcast.Broadcast[Int]) {
def foo = x.value
}
...
Foo(sc.broadcast(1)).foo
method argument
case class Foo() {
def foo(x: org.apache.spark.broadcast.Broadcast[Int]) = x.value
}
...
Foo().foo(sc.broadcast(1))
or even mixed-in your helpers like this:
trait Foo {
val x: org.apache.spark.broadcast.Broadcast[Int]
def foo = x.value
}
object Main extends Foo {
val sc = new SparkContext("local", "test", new SparkConf())
val x = sc.broadcast(1)
def main(args: Array[String]) {
sc.parallelize(Seq(None)).map(_ => foo).first
sc.stop
}
}

Just a short take on performance considerations that were introduced earlier.
Options proposed by zero233 are indeed very elegant way of doing this kind of things in Scala. At the same time it is important to understand implications of using certain patters in distributed system.
It is not the best idea to use mixin approach / any logic that uses enclosing class state. Whenever you use a state of enclosing class within lambdas Spark will have to serialize outer object. This is not always true but you'd better off writing safer code than one day accidentally blow up the whole cluster.
Being aware of this, I would personally go for explicit argument passing to the methods as this would not result in outer class serialization (method argument approach).

you can use classes and pass the broadcast variable to classes
your psudo code should look like :
object AppMain{
def main(args: Array[String]){
//...
val broadcastA = sc.broadcast(a)
//..
lines.foreach(rdd => {
val obj = new AppObject1(broadcastA)
rdd.filter(p => obj.apply(p))
rdd.count
})
}
}
class AppObject1(bc : Broadcast[String]){
val anotherObject = new AnotherObject(bc)
def apply(str: String): Boolean ={
anotherObject.process(str)
}
}
class AnotherObject(bc : Broadcast[String]){
// I want to use broadcast variable in this object
def process(str : String): Boolean = {
val a = bc.value
true
//need to use B inside this method
}
}

Scala: Return multiple data types from function

This is somewhat of a theoretical question but something I might want to do. Is it possible to return multiple data data types from a Scala function but limit the types that are allowed? I know I can return one type by specifying it, or I can essentially allow any data type by not specifying the return type, but I would like to return 1 of 3 particular data types to preserve a little bit of type safety. Is there a way to write an 'or' in the return type like:
def myFunc(input:String): [Int || String] = { ...}
The main context for this is trying to write universal data loading script. Some of my users use Spark, some Scalding, and who knows what will be next. I want my users to be able to use a generic loading script that might return a RichPipe, RDD, or some other data format depending on the framework they are using, but I don't want to throw type safety completely out the window.

You can use the Either type provided by the Scala Library.
def myFunc(input:String): Either[Int, String] = {
if (...)
Left(42) // return an Int
else
Right("Hello, world") // return a String
}
You can use more than two types by nesting, for instance Either[A,Either[B,C]].

As already noted in comments you'd better use Either for this task, but if you really want it, you can use implicits
object IntOrString {
implicit def fromInt(i: Int): IntOrString = new IntOrString(None, Some(i))
implicit def fromString(s: String): IntOrString = new IntOrString(Some(s), None)
}
case class IntOrString(str: Option[String], int: Option[Int])
implicit def IntOrStringToInt(v: IntOrString): Int = v.int.get
implicit def IntOrStringToStr(v: IntOrString): String = v.str.get
def myFunc(input:String): IntOrString = {
if(input.isEmpty) {
1
} else {
"test"
}
}
val i: Int = myFunc("")
val s: String = myFunc("123")
//exception
val ex: Int = myFunc("123")

I'd make the typing by the user less implicit and more explicit. Here are three examples:
def loadInt(input: String): Int = { ... }
def loadString(input: String): String = { ... }
That's nice and simple. Alternatively, we can have a function that returns the appropriate curried function using an implicit context:
def loader[T]()(implicit context: String): String => T = {
context match {
case "RDD" => loadInt _ // or loadString _
}
}
Then the user would:
implicit val context: String = "RDD" // simple example
val loader: String => Int = loader()
loader(input)
Alternatively, can turn it into an explicit parameter:
val loader: String => Int = loader("RDD")

Scala Reflection to update a case class val

I'm using scala and slick here, and I have a baserepository which is responsible for doing the basic crud of my classes.
For a design decision, we do have updatedTime and createdTime columns all handled by the application, and not by triggers in database. Both of this fields are joda DataTime instances.
Those fields are defined in two traits called HasUpdatedAt, and HasCreatedAt, for the tables
trait HasCreatedAt {
val createdAt: Option[DateTime]
}
case class User(name:String,createdAt:Option[DateTime] = None) extends HasCreatedAt
I would like to know how can I use reflection to call the user copy method, to update the createdAt value during the database insertion method.
Edit after #vptron and #kevin-wright comments
I have a repo like this
trait BaseRepo[ID, R] {
def insert(r: R)(implicit session: Session): ID
}
I want to implement the insert just once, and there I want to createdAt to be updated, that's why I'm not using the copy method, otherwise I need to implement it everywhere I use the createdAt column.

This question was answered here to help other with this kind of problem.
I end up using this code to execute the copy method of my case classes using scala reflection.
import reflect._
import scala.reflect.runtime.universe._
import scala.reflect.runtime._
class Empty
val mirror = universe.runtimeMirror(getClass.getClassLoader)
// paramName is the parameter that I want to replacte the value
// paramValue is the new parameter value
def updateParam[R : ClassTag](r: R, paramName: String, paramValue: Any): R = {
val instanceMirror = mirror.reflect(r)
val decl = instanceMirror.symbol.asType.toType
val members = decl.members.map(method => transformMethod(method, paramName, paramValue, instanceMirror)).filter {
case _: Empty => false
case _ => true
}.toArray.reverse
val copyMethod = decl.declaration(newTermName("copy")).asMethod
val copyMethodInstance = instanceMirror.reflectMethod(copyMethod)
copyMethodInstance(members: _*).asInstanceOf[R]
}
def transformMethod(method: Symbol, paramName: String, paramValue: Any, instanceMirror: InstanceMirror) = {
val term = method.asTerm
if (term.isAccessor) {
if (term.name.toString == paramName) {
paramValue
} else instanceMirror.reflectField(term).get
} else new Empty
}
With this I can execute the copy method of my case classes, replacing a determined field value.

As comments have said, don't change a val using reflection. Would you that with a java final variable? It makes your code do really unexpected things. If you need to change the value of a val, don't use a val, use a var.
trait HasCreatedAt {
var createdAt: Option[DateTime] = None
}
case class User(name:String) extends HasCreatedAt
Although having a var in a case class may bring some unexpected behavior e.g. copy would not work as expected. This may lead to preferring not using a case class for this.
Another approach would be to make the insert method return an updated copy of the case class, e.g.:
trait HasCreatedAt {
val createdAt: Option[DateTime]
def withCreatedAt(dt:DateTime):this.type
}
case class User(name:String,createdAt:Option[DateTime] = None) extends HasCreatedAt {
def withCreatedAt(dt:DateTime) = this.copy(createdAt = Some(dt))
}
trait BaseRepo[ID, R <: HasCreatedAt] {
def insert(r: R)(implicit session: Session): (ID, R) = {
val id = ???//insert into db
(id, r.withCreatedAt(??? /*now*/))
}
}
EDIT:
Since I didn't answer your original question and you may know what you are doing I am adding a way to do this.
import scala.reflect.runtime.universe._
val user = User("aaa", None)
val m = runtimeMirror(getClass.getClassLoader)
val im = m.reflect(user)
val decl = im.symbol.asType.toType.declaration("createdAt":TermName).asTerm
val fm = im.reflectField(decl)
fm.set(??? /*now*/)
But again, please don't do this. Read this stackoveflow answer to get some insight into what it can cause (vals map to final fields).

Scala Pickling: Writing a custom pickler / unpickler for nested structures

I'm trying to write a custom SPickler / Unpickler pair to work around some the current limitations of scala-pickling.
The data type I'm trying to pickle is a case class, where some of the fields already have their own SPickler and Unpickler instances.
I'd like to use these instances in my custom pickler, but I don't know how.
Here's an example of what I mean:
// Here's a class for which I want a custom SPickler / Unpickler.
// One of its fields can already be pickled, so I'd like to reuse that logic.
case class MyClass[A: SPickler: Unpickler: FastTypeTag](myString: String, a: A)
// Here's my custom pickler.
class MyClassPickler[A: SPickler: Unpickler: FastTypeTag](
implicit val format: PickleFormat) extends SPickler[MyClass[A]] with Unpickler[MyClass[A]] {
override def pickle(
picklee: MyClass[A],
builder: PBuilder) {
builder.beginEntry(picklee)
// Here we save `myString` in some custom way.
builder.putField(
"mySpecialPickler",
b => b.hintTag(FastTypeTag.ScalaString).beginEntry(
picklee.myString).endEntry())
// Now we need to save `a`, which has an implicit SPickler.
// But how do we use it?
builder.endEntry()
}
override def unpickle(
tag: => FastTypeTag[_],
reader: PReader): MyClass[A] = {
reader.beginEntry()
// First we read the string.
val myString = reader.readField("mySpecialPickler").unpickle[String]
// Now we need to read `a`, which has an implicit Unpickler.
// But how do we use it?
val a: A = ???
reader.endEntry()
MyClass(myString, a)
}
}
I would really appreciate a working example.
Thanks!

Here is a working example:
case class MyClass[A](myString: String, a: A)
Note that the type parameter of MyClass does not need context bounds. Only the custom pickler class needs the corresponding implicits:
class MyClassPickler[A](implicit val format: PickleFormat, aTypeTag: FastTypeTag[A],
aPickler: SPickler[A], aUnpickler: Unpickler[A])
extends SPickler[MyClass[A]] with Unpickler[MyClass[A]] {
private val stringUnpickler = implicitly[Unpickler[String]]
override def pickle(picklee: MyClass[A], builder: PBuilder) = {
builder.beginEntry(picklee)
builder.putField("myString",
b => b.hintTag(FastTypeTag.ScalaString).beginEntry(picklee.myString).endEntry()
)
builder.putField("a",
b => {
b.hintTag(aTypeTag)
aPickler.pickle(picklee.a, b)
}
)
builder.endEntry()
}
override def unpickle(tag: => FastTypeTag[_], reader: PReader): MyClass[A] = {
reader.hintTag(FastTypeTag.ScalaString)
val tag = reader.beginEntry()
val myStringUnpickled = stringUnpickler.unpickle(tag, reader).asInstanceOf[String]
reader.endEntry()
reader.hintTag(aTypeTag)
val aTag = reader.beginEntry()
val aUnpickled = aUnpickler.unpickle(aTag, reader).asInstanceOf[A]
reader.endEntry()
MyClass(myStringUnpickled, aUnpickled)
}
}
In addition to the custom pickler class, we also need an implicit def which returns a pickler instance specialized for concrete type arguments:
implicit def myClassPickler[A: SPickler: Unpickler: FastTypeTag](implicit pf: PickleFormat) =
new MyClassPickler

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Scala Data Modeling and Generics - scala

Related

Handling loads of different message-types at runtime in an elegant way

Access Spark broadcast variable in different classes

Scala: Return multiple data types from function

Scala Reflection to update a case class val

Scala Pickling: Writing a custom pickler / unpickler for nested structures

Categories

Resources