Squeryl: Typed Primary keys - scala

I would like to define my Primary Keys as specific types - not just Long or String
For example
case class Project(
var id: ProjectId = 0,
One advantage of this is that if I accidentally compare different kinds of keys, the compiler will pick it up.
Obviously this gives the compile error
overriding method id in trait KeyedEntity of type => Long;
variable id has incompatible type
Are there any examples where this type of approach has been successfully implemented?
Appendix - a draft of what ProjectId could be
trait SelfType[T] {
  val self: T
}

class Content_typeId(val self: Int) extends SelfType[Int]
class ProjectId(val self: Long) extends SelfType[Long]

object ProjectId {
  implicit def baseToType(self: Long) = new ProjectId(self)
  implicit def typeToBase(higherSelf: ProjectId): Long = higherSelf.self
}
Thanks
Brent

Yup, it can be done, but you are going to want to upgrade to Squeryl 0.9.6. The latest available is RC3 at the moment. There are 2 changes that you'll want to take advantage of:
You no longer need to extend KeyedEntity. Instead, you can define an implicit KeyedEntityDef that Squeryl will use to determine which field(s) of your object constitute the primary key.
Squeryl 0.9.6 allows you to extend what types are supported using type classes.
RC3 is very stable, I'm using it myself in several production projects, but there is no official documentation for these features yet. You can find examples on the list, where I see you've also posted this question.
I'd also suggest looking at both PrimitiveTypeMode (which shows the process of exposing a TEF for a type) and PrimitiveTypeSupport (which is where the TypedExpressionFactory instances are defined). KeyedEntity itself is supported with a KeyedEntityDef by Squeryl, and looking at that code may be helpful as well.
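For illustration, here's a rough sketch of what the KeyedEntityDef route might look like for a wrapped id type. This assumes 0.9.6's KeyedEntityDef shape (getId, isPersisted, idPropertyName); ProjectId is from the question above, everything else is hypothetical:

import org.squeryl.KeyedEntityDef

case class Project(var id: ProjectId, var name: String)

object Project {
  // Tell Squeryl which field is the key and when a row counts as persisted.
  implicit val projectKED: KeyedEntityDef[Project, ProjectId] =
    new KeyedEntityDef[Project, ProjectId] {
      def getId(p: Project): ProjectId = p.id
      def isPersisted(p: Project): Boolean = p.id.self > 0
      def idPropertyName: String = "id"
    }
}

You would still need to register a TypedExpressionFactory for ProjectId (see PrimitiveTypeSupport) so Squeryl knows how to map it to a JDBC Long.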


Creating Spark Dataframes from regular classes

I have always seen that, when we are using a map function, we can create a DataFrame from an RDD using a case class like below:
case class filematches(
  row_num: Long,
  matches: Long,
  non_matches: Long,
  non_match_column_desc: Array[String]
)
newrdd1.map(x => filematches(x._1, x._2, x._3, x._4)).toDF()
This works great as we all know!!
I was wondering why we specifically need case classes here.
We should be able to achieve the same effect using normal classes with parameterized constructors (since the parameters will be vals and not private):
class filematches1(
  val row_num: Long,
  val matches: Long,
  val non_matches: Long,
  val non_match_column_desc: Array[String]
)
newrdd1.map(x => new filematches1(x._1, x._2, x._3, x._4)).toDF
Here, I am using the new keyword to instantiate the class.
Running the above gives me the error:
error: value toDF is not a member of org.apache.spark.rdd.RDD[filematches1]
I am sure I am missing some key concept about case classes vs. regular classes here, but I have not been able to find it yet.
To resolve the error
value toDF is not a member of org.apache.spark.rdd.RDD[...]
you should move your case class definition out of the function where you are using it. You can refer to http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Spark-Scala-Error-value-toDF-is-not-a-member-of-org-apache/td-p/29878 for more detail.
On your other query: case classes are syntactic sugar, and they provide the following additional things.
Case classes are different from general classes. They are especially used when creating immutable objects.
They have a default apply function which is used as a constructor to create objects (so less code).
All the variables in a case class are vals by default, hence immutable, which is a good thing in the Spark world as all RDDs are immutable.
An example of a case class is
case class Book(name: String)
val book1 = Book("test")
You cannot change the value of book1.name as it is immutable, and you do not need to say new Book() to create the object here.
The class variables are public by default, so you don't need setters and getters.
Moreover, when comparing two objects of case classes, their structure is compared instead of their references.
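To illustrate that structural comparison (plain Scala, runnable in a REPL):

case class Book(name: String)
val a = Book("test")
val b = Book("test")
a == b // true: case classes compare field values

class PlainBook(val name: String)
new PlainBook("test") == new PlainBook("test") // false: plain classes compare references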
Edit: Spark uses the following class to infer the schema.
Code link:
https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
If you check the schemaFor function (lines 719 to 791), you'll see it converts Scala types to Catalyst types. I think the case for handling non-case classes in schema inference has not been added yet, so every time you try to use a non-case class with schema inference, it falls through to the default case and hence gives the error Schema for type $other is not supported.
Hope this helps

Pureconfig Typesafe Config with sealed abstract case class

I am trying to incorporate PureConfig in my use case for type-safe configurations. I have been successful in mapping HOCON .conf files to case class types. However, to constrain my types with no side effects on the object definition side (i.e., suppress the default apply() and copy()), I am using the following definition approach for the case class:
sealed abstract case class someConfig(name: String)

object someConfig {
  def apply(name: String): Option[someConfig] = {
    if (Option(name).isDefined && name.nonEmpty) {
      Some(new someConfig(name) {})
    } else {
      None
    }
  }
}
To support Option[_] types, I am considering an implicit ConfigReader. This approach seems to work, though it leaves a bit more for me to address around config-key-to-object mapping and instantiation.
The examples I have found so far don't seem to address this need, although I do see the use of Option[_] on object members. I have tried to walk through the code samples in the PureConfig git repo.
Could someone suggest an approach where Option[T] could be supported, where T is a composite custom type, and where I don't have to deal with member-variable-name-to-config-key mapping, etc.? In other words, avoid unnecessary boilerplate!
Because you've hidden the constructor for your class in order to channel validation through apply, you'll have to construct a ConfigReader manually. I believe that's about as simple as putting this in your companion object:
implicit val configReader =
  pureconfig.ConfigReader.fromNonEmptyStringOpt[someConfig](apply)
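Loading could then look something like this. A sketch only: AppConfig and the some-config key are hypothetical, loadConfig is the pre-0.12 entry point (newer versions use ConfigSource), and PureConfig 0.10+ needs the generic.auto import for derivation:

// application.conf (hypothetical): some-config = "projectX"
import pureconfig.generic.auto._

case class AppConfig(someConfig: someConfig)

val result = pureconfig.loadConfig[AppConfig]
// result: Either[ConfigReaderFailures, AppConfig]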
Alternatively, you could name the class implementing your abstract type, in which case PureConfig's automatic derivation for sealed families would magically create the ConfigReader for you.

Scala immutability in persistent storage with Squeryl

As I was reading the Play for Scala book, I came across something odd that it explained. This is the relevant snippet:
There's something strange going on, though. If you're using immutable
classes—which vanilla case classes are—you might be worried when you
discover that Squeryl updates your object's supposedly immutable id
field when you insert the object. That means that if you execute the
following code,
val myImmutableObject = Product(0, 5010255079763,
"plastic coated blue", "standard paperclip, coated with blue plastic")
Database.productsTable.insert(myImmutableObject)
println(myImmutableObject)
the output will unexpectedly be something like: Product(13,
5010255079763, "plastic coated blue", "standard paperclip, coated with
blue plastic"). This can lead to bad situations if the rest of your
code expects an instance of one of your model classes to never change.
In order to protect yourself from this sort of stuff, we recommend you
change the insert methods we showed you earlier into this:
def insert(product: Product): Product = inTransaction {
  val defensiveCopy = product.copy
  productsTable.insert(defensiveCopy)
}
My question is, given that the product class is defined like this:
import org.squeryl.KeyedEntity
case class Product(
  id: Long,
  ean: Long,
  name: String,
  description: String) extends KeyedEntity[Long]
and the Database object is defined like this:
import org.squeryl.Schema
import org.squeryl.PrimitiveTypeMode._
object Database extends Schema {
  val productsTable = table[Product]("products")
  ...
  on(productsTable) { p => declare {
    p.id is(autoIncremented)
  }}
}
How then is it possible that a case class field declared as a val can be changed? Is Squeryl using reflection of some sort to change the field, or is the book mistaken somehow?
I am not able to run the examples to verify what the case might be, but perhaps someone who has used Squeryl can give an answer.
You can check the definition of table method for yourself:
https://github.com/squeryl/squeryl/blob/master/src/main/scala/org/squeryl/Schema.scala#L345
It's a generic function which does use reflection to instantiate the Table object bound to the given case class. Functions are first-class citizens in Scala, so they can be assigned to a val just like anything else.
The last fragment is also an anonymous function, which maps a given argument to some modification defined for it.
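To make the reflection point concrete, here is a minimal demonstration (not Squeryl's actual code) that the JVM will let you overwrite a val field of a case class through java.lang.reflect:

case class Product(id: Long, name: String)

object ReflectionDemo extends App {
  val p = Product(0L, "paperclip")
  // A val compiles to a private final field; reflection can still set it.
  val f = p.getClass.getDeclaredField("id")
  f.setAccessible(true)
  f.setLong(p, 13L)
  println(p) // prints Product(13,paperclip)
}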

Can a Scala method from a base-class be renamed?

I'm rather new to Scala, and I am trying to use lift-squeryl-record in Lift. Scala is 2.8.1 and Lift is 2.3. My problem is that I wanted to use (Mega)ProtoUser from Record, but it conflicts with lift-squeryl-record.
I followed the instruction of:
lift-squeryl-record example
which did not use ProtoUser, and tried to define my user like this:
trait AbstractUser[MyType <: AbstractUser[MyType]] extends
  ProtoUser[MyType] with Record[MyType] with KeyedRecord[Long] {
NB: KeyedRecord is from package net.liftweb.squerylrecord, not net.liftweb.record
Then I get the following error:
overriding lazy value id in trait ProtoUser of type net.liftweb.record.field.LongField[MyType]; method id in trait KeyedRecord of type => Long needs `override' modifier
This is because KeyedRecord and ProtoUser each define a differing id method. Since I do not control the code of either trait, is there any "Scala" way around it, like renaming one of the methods? I really don't want to have to choose between the two. :(
No, you cannot rename methods in a subclass. If there are two conflicting method signatures from parent types, you will need to resort to another pattern, such as indirection through delegation (http://en.wikipedia.org/wiki/Delegation_pattern):
trait AbstractUser[MyType <: AbstractUser[MyType]] extends ProtoUser[MyType] {
  def record: Record[MyType] with KeyedRecord[Long]
}
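In general terms, the delegation idea looks like this. A self-contained sketch with made-up names (not Lift's API): inherit one parent, hold the other as a member, and expose its conflicting method under a new name:

trait Base { def id: Long }
trait Other { def id: String }

class User(other: Other) extends Base {
  def id: Long = 42L             // Base's id, implemented directly
  def otherId: String = other.id // Other's id, renamed via delegation
}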

What are type classes in Scala useful for?

As I understand it from this blog post, "type classes" in Scala are just a "pattern" implemented with traits and implicit adapters.
As the blog says, if I have a trait A and an adapter B -> A, then I can invoke a function that requires an argument of type A with an argument of type B, without invoking the adapter explicitly.
I found it nice, but not particularly useful. Could you give a use case/example that shows what this feature is useful for?
One use case, as requested...
Imagine you have a list of things, could be integers, floating point numbers, matrices, strings, waveforms, etc. Given this list, you want to add the contents.
One way to do this would be to have some Addable trait that must be inherited by every single type that can be added together, or an implicit conversion to an Addable if dealing with objects from a third party library that you can't retrofit interfaces to.
This approach quickly becomes overwhelming when you also want to begin adding other such operations that can be done to a list of objects. It also doesn't work well if you need alternatives (for example, does adding two waveforms concatenate them, or overlay them?). The solution is ad-hoc polymorphism, where you can pick and choose behaviour to be retrofitted to existing types.
For the original problem then, you could implement an Addable type class:
trait Addable[T] {
  def zero: T
  def append(a: T, b: T): T
}
//yup, it's our friend the monoid, with a different name!
You can then create implicit subclassed instances of this, corresponding to each type that you wish to make addable:
implicit object IntIsAddable extends Addable[Int] {
  def zero = 0
  def append(a: Int, b: Int) = a + b
}
implicit object StringIsAddable extends Addable[String] {
  def zero = ""
  def append(a: String, b: String) = a + b
}
//etc...
The method to sum a list then becomes trivial to write...
def sum[T](xs: List[T])(implicit addable: Addable[T]) =
  xs.foldLeft(addable.zero)(addable.append)

//or the same thing, using context bounds:
def sum[T : Addable](xs: List[T]) = {
  val addable = implicitly[Addable[T]]
  xs.foldLeft(addable.zero)(addable.append)
}
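With the instances above in scope, usage is then just (REPL-style):

sum(List(1, 2, 3))       // 6
sum(List("a", "b", "c")) // "abc"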
The beauty of this approach is that you can supply an alternative definition of some typeclass, either controlling the implicit you want in scope via imports, or by explicitly providing the otherwise implicit argument. So it becomes possible to provide different ways of adding waveforms, or to specify modulo arithmetic for integer addition. It's also fairly painless to add a type from some 3rd-party library to your typeclass.
Incidentally, this is exactly the approach taken by the 2.8 collections API. Though the sum method is defined on TraversableLike instead of on List, and the type class is Numeric (it also contains a few more operations than just zero and append).
Reread the first comment there:
A crucial distinction between type classes and interfaces is that for class A to be a "member" of an interface it must declare so at the site of its own definition. By contrast, any type can be added to a type class at any time, provided you can provide the required definitions, and so the members of a type class at any given time are dependent on the current scope. Therefore we don't care if the creator of A anticipated the type class we want it to belong to; if not we can simply create our own definition showing that it does indeed belong, and then use it accordingly. So this not only provides a better solution than adapters, in some sense it obviates the whole problem adapters were meant to address.
I think this is the most important advantage of type classes.
Also, they properly handle the cases where the operations don't take an argument of the type we are dispatching on, or take more than one. E.g., consider this type class:
case class Default[T](default: T)

object Default {
  implicit def IntDefault: Default[Int] = Default(0)
  implicit def OptionDefault[T]: Default[Option[T]] = Default(None)
  ...
}
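A small usage sketch (the default helper is made up for illustration); note there is no argument of type T to dispatch on:

def default[T](implicit d: Default[T]): T = d.default

default[Int]         // 0
default[Option[Int]] // None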
I think of type classes as the ability to add type-safe metadata to a class.
So you first define a class to model the problem domain and then think of metadata to add to it. Things like Equals, Hashable, Viewable, etc. This creates a separation of the problem domain and the mechanics to use the class and opens up subclassing because the class is leaner.
Beyond that, you can add type class instances anywhere in scope, not just where the class is defined, and you can change implementations. For example, if I calculate a hash code for a Point class by using Point#hashCode, then I'm limited to that specific implementation, which may not create a good distribution of values for the specific set of Points I have. But if I use Hashable[Point], then I may provide my own implementation.
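A minimal sketch of what such a Hashable type class might look like (hypothetical names; REPL-style, Scala 2.12+ for the SAM syntax):

trait Hashable[T] { def hash(t: T): Int }

case class Point(x: Int, y: Int)

// A custom distribution, independent of Point#hashCode.
implicit val pointHashable: Hashable[Point] =
  (p: Point) => 41 * p.x + p.y

def hashOf[T](t: T)(implicit h: Hashable[T]): Int = h.hash(t)

hashOf(Point(1, 2)) // 43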
[Updated with example]
As an example, here's a use case I had last week. In our product there are several cases of Maps containing containers as values. E.g., Map[Int, List[String]] or Map[String, Set[Int]]. Adding to these collections can be verbose:
map += key -> (value :: map.getOrElse(key, List()))
So I wanted to have a function that wraps this so I could write
map +++= key -> value
The main issue is that the collections don't all have the same methods for adding elements: some have '+' while others have ':+'. I also wanted to retain the efficiency of adding elements to a list, so I didn't want to use fold/map, which create new collections.
The solution is to use type classes:
trait Addable[C, CC] {
  def add(c: C, cc: CC): CC
  def empty: CC
}

object Addable {
  implicit def listAddable[A] = new Addable[A, List[A]] {
    def empty = Nil
    def add(c: A, cc: List[A]) = c :: cc
  }
  implicit def addableAddable[A, Add](implicit cbf: CanBuildFrom[Add, A, Add]) = new Addable[A, Add] {
    def empty = cbf().result
    def add(c: A, cc: Add) = (cbf(cc) += c).result
  }
}
Here I defined a type class Addable that can add an element C to a collection CC. I have two default implementations: for Lists, using ::, and for other collections, using the builder framework. Using this type class then looks like:
class RichCollectionMap[A, C, B[_], M[X, Y] <: collection.Map[X, Y]](map: M[A, B[C]])(implicit adder: Addable[C, B[C]]) {
  def updateSeq[That](a: A, c: C)(implicit cbf: CanBuildFrom[M[A, B[C]], (A, B[C]), That]): That = {
    val pair = (a -> adder.add(c, map.getOrElse(a, adder.empty)))
    (map + pair).asInstanceOf[That]
  }
  def +++[That](t: (A, C))(implicit cbf: CanBuildFrom[M[A, B[C]], (A, B[C]), That]): That = updateSeq(t._1, t._2)(cbf)
}
implicit def toRichCollectionMap[A, C, B[_], M[X, Y] <: collection.Map[X, Y]](map: M[A, B[C]])(implicit adder: Addable[C, B[C]]) = new RichCollectionMap(map)
The special bit is using adder.add to add the elements and adder.empty to create new collections for new keys.
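Usage then looks something like this (a sketch, assuming the definitions above compile as shown on Scala 2.x with CanBuildFrom):

var m = Map.empty[Int, List[String]]
m +++= 1 -> "a"
m +++= 1 -> "b"
// m == Map(1 -> List("b", "a")): "b" was consed onto the existing list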
To compare, without type classes I would have had three options:
1. Write a method per collection type, e.g., addElementToSubList and addElementToSet, etc. This creates a lot of boilerplate in the implementation and pollutes the namespace.
2. Use reflection to determine if the sub-collection is a List / Set. This is tricky as the map is empty to begin with (of course Scala helps here also with Manifests).
3. Have a poor man's type class by requiring the user to supply the adder, so something like addToMap(map, key, value, adder), which is plain ugly.
Yet another resource I find helpful is this blog post, which describes typeclasses: Monads Are Not Metaphors.
Search the article for "typeclass"; it should be the first match. In it, the author provides an example of a Monad typeclass.
The forum thread "What makes type classes better than traits?" makes some interesting points:
Typeclasses can very easily represent notions that are quite difficult to represent in the presence of subtyping, such as equality and ordering.
Exercise: create a small class/trait hierarchy and try to implement .equals on each class/trait in such a way that the operation over arbitrary instances from the hierarchy is properly reflexive, symmetric, and transitive.
Typeclasses allow you to provide evidence that a type outside of your "control" conforms with some behavior.
Someone else's type can be a member of your typeclass.
You cannot express "this method takes/returns a value of the same type as the method receiver" in terms of subtyping, but this (very useful) constraint is straightforward using typeclasses. This is the f-bounded types problem (where an F-bounded type is parameterized over its own subtypes).
All operations defined on a trait require an instance; there is always a this argument. So you cannot define, for example, a fromString(s: String): Foo method on trait Foo in such a way that you can call it without an instance of Foo.
In Scala this manifests as people desperately trying to abstract over companion objects.
But it is straightforward with a typeclass, as illustrated by the zero element in this monoid example.
Typeclasses can be defined inductively; for example, if you have a JsonCodec[Woozle] you can get a JsonCodec[List[Woozle]] for free.
The example above illustrates this for "things you can add together".
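JsonCodec and Woozle come from the quoted thread; here's a minimal sketch of that induction step (REPL-style, Scala 2.12+ for the SAM syntax):

trait JsonCodec[A] { def encode(a: A): String }

object JsonCodec {
  implicit val intCodec: JsonCodec[Int] = (i: Int) => i.toString
  // Induction step: a codec for A yields a codec for List[A] "for free".
  implicit def listCodec[A](implicit c: JsonCodec[A]): JsonCodec[List[A]] =
    (xs: List[A]) => xs.map(c.encode).mkString("[", ",", "]")
}

implicitly[JsonCodec[List[Int]]].encode(List(1, 2, 3)) // "[1,2,3]"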
One way to look at type classes is that they enable retroactive extension or retroactive polymorphism. There are a couple of great posts by Casual Miracles and Daniel Westheide that show examples of using Type Classes in Scala to achieve this.
Here's a post on my blog that explores various methods in Scala of retroactive supertyping, a kind of retroactive extension, including a typeclass example.
I don't know of any use case other than ad-hoc polymorphism, which is explained here in the best way possible.
Both implicits and typeclasses are used for type conversion. The major use case for both of them is to provide ad-hoc polymorphism, i.e., adding behavior to classes that you can't modify while still getting inheritance-style polymorphism. In the case of implicits, you could use either an implicit def or an implicit class (which is your wrapper class, but hidden from the client). Typeclasses are more powerful, as they can add functionality to an already existing inheritance chain (e.g., Ordering[T] in Scala's sort functions).
For more detail you can see https://lakshmirajagopalan.github.io/diving-into-scala-typeclasses/
In Scala, type classes:
- Enable ad-hoc polymorphism
- Are statically typed (i.e. type-safe)
- Are borrowed from Haskell
- Solve the expression problem
- Allow behavior to be extended
  - at compile-time
  - after the fact
  - without changing/recompiling existing code
Scala implicits:
- The last parameter list of a method can be marked implicit
- Implicit parameters are filled in by the compiler
- In effect, you require evidence from the compiler
  … such as the existence of a type class in scope
- You can also specify parameters explicitly, if needed
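A small illustration of those bullet points, using only the standard library (max3 is a made-up helper; REPL-style):

// The last parameter list is implicit; the compiler fills in the evidence.
def max3[T](a: T, b: T, c: T)(implicit ord: Ordering[T]): T =
  ord.max(ord.max(a, b), c)

max3(1, 5, 3)                                // 5: compiler supplies Ordering[Int]
max3("a", "c", "b")(Ordering.String.reverse) // "a": evidence passed explicitly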
The example below extends the String class with a new method via a type-class-style implicit, even though String is final :)
/**
 * Created by nihat.hosgur on 2/19/17.
 */
case class PrintTwiceString(original: String) {
  def printTwice = original + original
}

object TypeClassString extends App {
  implicit def stringToString(s: String) = PrintTwiceString(s)
  val name: String = "Nihat"
  name.printTwice
}
This is an important difference (needed for functional programming). Consider inc :: Num a => a -> a: the a received is the same a that is returned; this cannot be expressed with subtyping.
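In Scala terms, a quick sketch of the same idea using the standard Numeric type class (REPL-style):

// Whatever concrete A comes in is exactly what goes out.
def inc[A](a: A)(implicit num: Numeric[A]): A = num.plus(a, num.one)

inc(41)  // 42 (Int in, Int out)
inc(1.5) // 2.5 (Double in, Double out)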
I like to use type classes as a lightweight, Scala-idiomatic form of dependency injection that still works with circular dependencies yet doesn't add a lot of code complexity. I recently rewrote a Scala project from the Cake Pattern to type classes for DI and achieved a 59% reduction in code size.
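A sketch of that style with hypothetical names (not the actual project code; Scala 2.12+ for the SAM syntax): dependencies are requested as implicit evidence instead of being wired through a Cake-pattern trait stack.

trait Logging { def info(msg: String): Unit }
trait UserRepo { def find(id: Long): Option[String] }

object Wiring {
  implicit val consoleLogging: Logging = (msg: String) => println(s"INFO: $msg")
  implicit val inMemoryRepo: UserRepo = {
    val users = Map(1L -> "Brent")
    (id: Long) => users.get(id)
  }
}

// Dependencies are implicit parameters rather than mixed-in traits.
def greet(id: Long)(implicit log: Logging, repo: UserRepo): String = {
  val name = repo.find(id).getOrElse("stranger")
  log.info(s"greeting user $id")
  s"Hello, $name"
}

object Demo extends App {
  import Wiring._
  println(greet(1L)) // Hello, Brent
}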