Allocation of Function Literals in Scala - scala

I have a class that represents sales orders:
class SalesOrder(val f01:String, val f02:Int, ..., f50:Date)
The fXX fields are of various types. I am faced with the problem of creating an audit trail of my orders. Given two instances of the class, I have to determine which fields have changed. I have come up with the following:
class SalesOrder(val f01:String, val f02:Int, ..., val f50:Date){
def auditDifferences(that:SalesOrder): List[String] = {
def diff[A](fieldName:String, getField: SalesOrder => A) =
if(getField(this) != getField(that)) Some(fieldName) else None
val diffList = diff("f01", _.f01) :: diff("f02", _.f02) :: ...
:: diff("f50", _.f50) :: Nil
diffList.flatten
}
}
I was wondering what the compiler does with all the _.fXX functions: are they instanced just once (statically), and can be shared by all instances of my class, or will they be instanced every time I create an instance of my class?
My worry is that, since I will use a lot of SalesOrder instances, it may create a lot of garbage. Should I use a different approach?

One clean way of solving this problem would be to use the standard library's Ordering type class. For example:
class SalesOrder(val f01: String, val f02: Int, val f03: Char) {
def diff(that: SalesOrder) = SalesOrder.fieldOrderings.collect {
case (name, ord) if !ord.equiv(this, that) => name
}
}
object SalesOrder {
val fieldOrderings: List[(String, Ordering[SalesOrder])] = List(
"f01" -> Ordering.by(_.f01),
"f02" -> Ordering.by(_.f02),
"f03" -> Ordering.by(_.f03)
)
}
And then:
scala> val orderA = new SalesOrder("a", 1, 'a')
orderA: SalesOrder = SalesOrder#5827384f
scala> val orderB = new SalesOrder("b", 1, 'b')
orderB: SalesOrder = SalesOrder#3bf2e1c7
scala> orderA diff orderB
res0: List[String] = List(f01, f03)
You almost certainly don't need to worry about the perfomance of your original formulation, but this version is (arguably) nicer for unrelated reasons.

Yes, that creates 50 short lived functions. I don't think you should be worried unless you have manifest evidence that that causes a performance problem in your case.
But I would define a method that transforms SalesOrder into a Map[String, Any], then you would just have
trait SalesOrder {
def fields: Map[String, Any]
}
def diff(a: SalesOrder, b: SalesOrder): Iterable[String] = {
val af = a.fields
val bf = b.fields
af.collect { case (key, value) if bf(key) != value => key }
}
If the field names are indeed just incremental numbers, you could simplify
trait SalesOrder {
def fields: Iterable[Any]
}
def diff(a: SalesOrder, b: SalesOrder): Iterable[String] =
(a.fields zip b.fields).zipWithIndex.collect {
case ((av, bv), idx) if av != bv => f"f${idx + 1}%02d"
}

Related

Nested Scala case classes to/from CSV

There are many nice libraries for writing/reading Scala case classes to/from CSV files. I'm looking for something that goes beyond that, which can handle nested cases classes. For example, here a Match has two Players:
case class Player(name: String, ranking: Int)
case class Match(place: String, winner: Player, loser: Player)
val matches = List(
Match("London", Player("Jane",7), Player("Fred",23)),
Match("Rome", Player("Marco",19), Player("Giulia",3)),
Match("Paris", Player("Isabelle",2), Player("Julien",5))
)
I'd like to effortlessly (no boilerplate!) write/read matches to/from this CSV:
place,winner.name,winner.ranking,loser.name,loser.ranking
London,Jane,7,Fred,23
Rome,Marco,19,Giulia,3
Paris,Isabelle,2,Julien,5
Note the automated header line using the dot "." to form the column name for a nested field, e.g. winner.ranking. I'd be delighted if someone could demonstrate a simple way to do this (say, using reflection or Shapeless).
[Motivation. During data analysis it's convenient to have a flat CSV to play around with, for sorting, filtering, etc., even when case classes are nested. And it would be nice if you could load nested case classes back from such files.]
Since a case-class is a Product, getting the values of the various fields is relatively easy. Getting the names of the fields/columns does require using Java reflection.
The following function takes a list of case-class instances and returns a list of rows, each is a list of strings. It is using a recursion to get the values and headers of child case-class instances.
def toCsv(p: List[Product]): List[List[String]] = {
def header(c: Class[_], prefix: String = ""): List[String] = {
c.getDeclaredFields.toList.flatMap { field =>
val name = prefix + field.getName
if (classOf[Product].isAssignableFrom(field.getType)) header(field.getType, name + ".")
else List(name)
}
}
def flatten(p: Product): List[String] =
p.productIterator.flatMap {
case p: Product => flatten(p)
case v: Any => List(v.toString)
}.toList
header(classOf[Match]) :: p.map(flatten)
}
However, constructing case-classes from CSV is far more involved, requiring to use reflection for getting the types of the various fields, for creating the values from the CSV strings and for constructing the case-class instances.
For simplicity (not saying the code is simple, just so it won't be further complicated), I assume that the order of columns in the CSV is the same as if the file was produced by the toCsv(...) function above.
The following function starts by creating a list of "instructions how to process a single CSV row" (the instructions are also used to verify that the column headers in the CSV matches the the case-class properties). The instructions are then used to recursively produce one CSV row at a time.
def fromCsv[T <: Product](csv: List[List[String]])(implicit tag: ClassTag[T]): List[T] = {
trait Instruction {
val name: String
val header = true
}
case class BeginCaseClassField(name: String, clazz: Class[_]) extends Instruction {
override val header = false
}
case class EndCaseClassField(name: String) extends Instruction {
override val header = false
}
case class IntField(name: String) extends Instruction
case class StringField(name: String) extends Instruction
case class DoubleField(name: String) extends Instruction
def scan(c: Class[_], prefix: String = ""): List[Instruction] = {
c.getDeclaredFields.toList.flatMap { field =>
val name = prefix + field.getName
val fType = field.getType
if (fType == classOf[Int]) List(IntField(name))
else if (fType == classOf[Double]) List(DoubleField(name))
else if (fType == classOf[String]) List(StringField(name))
else if (classOf[Product].isAssignableFrom(fType)) BeginCaseClassField(name, fType) :: scan(fType, name + ".")
else throw new IllegalArgumentException(s"Unsupported field type: $fType")
} :+ EndCaseClassField(prefix)
}
def produce(instructions: List[Instruction], row: List[String], argAccumulator: List[Any]): (List[Instruction], List[String], List[Any]) = instructions match {
case IntField(_) :: tail => produce(tail, row.drop(1), argAccumulator :+ row.head.toString.toInt)
case StringField(_) :: tail => produce(tail, row.drop(1), argAccumulator :+ row.head.toString)
case DoubleField(_) :: tail => produce(tail, row.drop(1), argAccumulator :+ row.head.toString.toDouble)
case BeginCaseClassField(_, clazz) :: tail =>
val (instructionRemaining, rowRemaining, constructorArgs) = produce(tail, row, List.empty)
val newCaseClass = clazz.getConstructors.head.newInstance(constructorArgs.map(_.asInstanceOf[AnyRef]): _*)
produce(instructionRemaining, rowRemaining, argAccumulator :+ newCaseClass)
case EndCaseClassField(_) :: tail => (tail, row, argAccumulator)
case Nil if row.isEmpty => (Nil, Nil, argAccumulator)
case Nil => throw new IllegalArgumentException("Not all values from CSV row were used")
}
val instructions = BeginCaseClassField(".", tag.runtimeClass) :: scan(tag.runtimeClass)
assert(csv.head == instructions.filter(_.header).map(_.name), "CSV header doesn't match target case-class fields")
csv.drop(1).map(row => produce(instructions, row, List.empty)._3.head.asInstanceOf[T])
}
I've tested this using:
case class Player(name: String, ranking: Int, price: Double)
case class Match(place: String, winner: Player, loser: Player)
val matches = List(
Match("London", Player("Jane", 7, 12.5), Player("Fred", 23, 11.1)),
Match("Rome", Player("Marco", 19, 13.54), Player("Giulia", 3, 41.8)),
Match("Paris", Player("Isabelle", 2, 31.7), Player("Julien", 5, 16.8))
)
val csv = toCsv(matches)
val matchesFromCsv = fromCsv[Match](csv)
assert(matches == matchesFromCsv)
Obviously this should be optimized and hardened if you ever want to use this for production...

Immutable paired instances of Case Classes in Scala?

I'm trying to model a relationship which can be reversed. For example, the reverse of North might be South. The reverse of Left might be Right. I'd like to use a case class to represent my relationships. I found a similar solution that uses case Objects here, but it's not quite what I want, here.
Here's my non-functional code:
case class Relationship(name: String, opposite:Relationship)
def relationshipFactory(nameA:String, nameB:String): Relationship = {
lazy val x:Relationship = Relationship(nameA, Relationship(nameB, x))
x
}
val ns = relationshipFactory("North", "South")
ns // North
ns.opposite // South
ns.opposite.opposite // North
ns.opposite.opposite.opposite // South
Can this code be changed so that:
It dosen't crash
I can create these things on demand as pairs.
If you really want to build graphs of immutable objects with circular dependencies, you have to declare opposite as def, and (preferably) throw one more lazy val into the mix:
abstract class Relationship(val name: String) {
def opposite: Relationship
}
object Relationship {
/** Factory method */
def apply(nameA: String, nameB: String): Relationship = {
lazy val x: Relationship = new Relationship(nameA) {
lazy val opposite = new Relationship(nameB) {
def opposite = x
}
}
x
}
/** Extractor */
def unapply(r: Relationship): Option[(String, Relationship)] =
Some((r.name, r.opposite))
}
val ns = Relationship("North", "South")
println(ns.name)
println(ns.opposite.name)
println(ns.opposite.opposite.name)
println(ns.opposite.opposite.opposite.name)
You can quickly convince yourself that nothing bad happens if you run a few million rounds on this circle of circular dependencies:
// just to demonstrate that it doesn't blow up in any way if you
// call it hundred million times:
// Should be "North"
println((1 to 100000000).foldLeft(ns)((r, _) => r.opposite).name)
It indeed prints "North". It doesn work with case classes, but you can always add your own extractors, so this works:
val Relationship(x, op) = ns
val Relationship(y, original) = op
println(s"Extracted x = $x y = $y")
It prints "North" and "South" for x and y.
However, the more obvious thing to do would be to just save both components of a relation, and add opposite as a method that constructs the opposite pair.
case class Rel(a: String, b: String) {
def opposite: Rel = Rel(b, a)
}
Actually, this is already implemented in the standard library:
scala> val rel = ("North", "South")
rel: (String, String) = (North,South)
scala> rel.swap
res0: (String, String) = (South,North)
you have cyclic dependencies, this won't work. One option is to do:
case class Relationship(name: String)
and have a setter to specify the opposite. The factory would then do:
def relationshipFactory(nameA:String, nameB:String): Relationship = {
val x:Relationship = Relationship(nameA)
val opposite = Relationship(nameB)
x.setOpposite(opposite)
opposite.setOpposite(x)
x
}
another option:
case class Relationship(name: String) {
lazy val opposite = Utils.computeOpposite(this)
}
and have the opposite logic on the Utils object
yet another option: probably you don't want several South instances, so you should use case objects or enums (more on that at http://pedrorijo.com/blog/scala-enums/)
Using enums you can use pattern matching to do that logic without no overhead

Filter list elements based on another list elements

I have 2 Lists: lista and listb. For each element in lista, I want to check if a_type of each element is in b_type of listb. If true, get the b_name for corresponding b_type and construct an object objc. And, then I should return the list of of constructed objc.
Is there a way to do this in Scala and preferably without any mutable collections?
case class obja = (a_id: String, a_type: String)
case class objb = (b_id: String, b_type: String, b_name: String)
case class objc = (c_id: String, c_type: String, c_name: String)
val lista: List[obja] = List(...)
val listb: List[objb] = List(...)
def getNames(alist: List[obja], blist: List[objb]): List[objc] = ???
Lookup in lists requires traversal in O(n) time, this is inefficient. Therefore, the first thing you do is to create a map from b_type to b_name:
val bTypeToBname = listb.map(b => (b.b_type, b_name)).toMap
Then you iterate through lista, look up in the map whether there is a corresponding b_name for a given a.a_type, and construct the objc:
val cs = for {
a <- lista
b_name <- bTypeToBname.get(a.a_type)
} yield objc(a.a_id, a.a_type, b_name)
Notice how Scala for-comprehensions automatically filter those cases for which bTypeToBname(a.a_type) isn't defined: then the corresponding a is simply skipped. This because we use bTypeToBname.get(a.a_type) (which returns an Option), as opposed to calling bTypeToBname(a.a_type) directly (this would lead to a NoSuchElementException). As far as I understand, this filtering is exactly the behavior you wanted.
case class A(aId: String, aType: String)
case class B(bId: String, bType: String, bName: String)
case class C(cId: String, cType: String, cName: String)
def getNames(aList: List[A], bList: List[B]): List[C] = {
val bMap: Map[String, B] = bList.map(b => b.bType -> b)(collection.breakOut)
aList.flatMap(a => bMap.get(a.aType).map(b => C(a.aId, a.aType, b.bName)))
}
Same as Andrey's answer but without comprehension so you can see what's happening inside.
// make listb into a map from type to name for efficiency
val bs = listb.map(b => b.b_type -> b_name).toMap
val listc: Seq[objc] = lista
.flatMap(a => // flatmap to exclude types not in listb
bs.get(a.a_type) // get an option from blist
.map(bName => objc(a.a_id, a.a_type, bName)) // if there is a b name for that type, make an objc
)

Scala Macros: Checking for a certain annotation

Thanks to the answers to my previous question, I was able to create a function macro such that it returns a Map that maps each field name to its value of a class, e.g.
...
trait Model
case class User (name: String, age: Int, posts: List[String]) extends Model {
val numPosts: Int = posts.length
...
def foo = "bar"
...
}
So this command
val myUser = User("Foo", 25, List("Lorem", "Ipsum"))
myUser.asMap
returns
Map("name" -> "Foo", "age" -> 25, "posts" -> List("Lorem", "Ipsum"), "numPosts" -> 2)
This is where Tuples for the Map are generated (see Travis Brown's answer):
...
val pairs = weakTypeOf[T].declarations.collect {
case m: MethodSymbol if m.isAccessor =>
val name = c.literal(m.name.decoded)
val value = c.Expr(Select(model, m.name))
reify(name.splice -> value.splice).tree
}
...
Now I want to ignore fields that have #transient annotation. How would I check if a method has a #transient annotation?
I'm thinking of modifying the snippet above as
val pairs = weakTypeOf[T].declarations.collect {
case m: MethodSymbol if m.isAccessor && !m.annotations.exists(???) =>
val name = c.literal(m.name.decoded)
val value = c.Expr(Select(model, m.name))
reify(name.splice -> value.splice).tree
}
but I can't find what I need to write in exists part. How would I get #transient as an Annotation so I could pass it there?
Thanks in advance!
The annotation will be on the val itself, not on the accessor. The easiest way to access the val is through the accessed method on MethodSymbol:
def isTransient(m: MethodSymbol) = m.accessed.annotations.exists(
_.tpe =:= typeOf[scala.transient]
)
Now you can just write the following in your collect:
case m: MethodSymbol if m.isAccessor && !isTransient(m) =>
Note that the version of isTransient I've given here has to be defined in your macro, since it needs the imports from c.universe, but you could factor it out by adding a Universe argument if you're doing this kind of thing in several macros.

Selection Sort Generic type implementation

I worked my way implementing a recursive version of selection and quick sort,i am trying to modify the code in a way that it can sort a list of any generic type , i want to assume that the generic type supplied can be converted to Comparable at runtime.
Does anyone have a link ,code or tutorial on how to do this please
I am trying to modify this particular code
'def main (args:Array[String]){
val l = List(2,4,5,6,8)
print(quickSort(l))
}
def quickSort(x:List[Int]):List[Int]={
x match{
case xh::xt =>
{
val (first,pivot,second) = partition(x)
quickSort (first):::(pivot :: quickSort(second))
}
case Nil => {x}
}
}
def partition (x:List[Int])=
{
val pivot =x.head
var first:List[Int]=List ()
var second : List[Int]=List ()
val fun=(i:Int)=> {
if (i<pivot)
first=i::first
else
second=i::second
}
x.tail.foreach(fun)
(first,pivot,second)
}
enter code here
def main (args:Array[String]){
val l = List(2,4,5,6,8)
print(quickSort(l))
}
def quickSort(x:List[Int]):List[Int]={
x match{
case xh::xt =>
{
val (first,pivot,second) = partition(x)
quickSort (first):::(pivot :: quickSort(second))
}
case Nil => {x}
}
}
def partition (x:List[Int])=
{
val pivot =x.head
var first:List[Int]=List ()
var second : List[Int]=List ()
val fun=(i:Int)=> {
if (i<pivot)
first=i::first
else
second=i::second
}
x.tail.foreach(fun)
(first,pivot,second)
} '
Language: SCALA
In Scala, Java Comparator is replaced by Ordering (quite similar but comes with more useful methods). They are implemented for several types (primitives, strings, bigDecimals, etc.) and you can provide your own implementations.
You can then use scala implicit to ask the compiler to pick the correct one for you:
def sort[A]( lst: List[A] )( implicit ord: Ordering[A] ) = {
...
}
If you are using a predefined ordering, just call:
sort( myLst )
and the compiler will infer the second argument. If you want to declare your own ordering, use the keyword implicit in the declaration. For instance:
implicit val fooOrdering = new Ordering[Foo] {
def compare( f1: Foo, f2: Foo ) = {...}
}
and it will be implicitly use if you try to sort a List of Foo.
If you have several implementations for the same type, you can also explicitly pass the correct ordering object:
sort( myFooLst )( fooOrdering )
More info in this post.
For Quicksort, I'll modify an example from the "Scala By Example" book to make it more generic.
class Quicksort[A <% Ordered[A]] {
def sort(a:ArraySeq[A]): ArraySeq[A] =
if (a.length < 2) a
else {
val pivot = a(a.length / 2)
sort (a filter (pivot >)) ++ (a filter (pivot == )) ++
sort (a filter(pivot <))
}
}
Test with Int
scala> val quicksort = new Quicksort[Int]
quicksort: Quicksort[Int] = Quicksort#38ceb62f
scala> val a = ArraySeq(5, 3, 2, 2, 1, 1, 9, 39 ,219)
a: scala.collection.mutable.ArraySeq[Int] = ArraySeq(5, 3, 2, 2, 1, 1, 9, 39, 21
9)
scala> quicksort.sort(a).foreach(n=> (print(n), print (" " )))
1 1 2 2 3 5 9 39 219
Test with a custom class implementing Ordered
scala> case class Meh(x: Int, y:Int) extends Ordered[Meh] {
| def compare(that: Meh) = (x + y).compare(that.x + that.y)
| }
defined class Meh
scala> val q2 = new Quicksort[Meh]
q2: Quicksort[Meh] = Quicksort#7677ce29
scala> val a3 = ArraySeq(Meh(1,1), Meh(12,1), Meh(0,1), Meh(2,2))
a3: scala.collection.mutable.ArraySeq[Meh] = ArraySeq(Meh(1,1), Meh(12,1), Meh(0
,1), Meh(2,2))
scala> q2.sort(a3)
res7: scala.collection.mutable.ArraySeq[Meh] = ArraySeq(Meh(0,1), Meh(1,1), Meh(
2,2), Meh(12,1))
Even though, when coding Scala, I'm used to prefer functional programming style (via combinators or recursion) over imperative style (via variables and iterations), THIS TIME, for this specific problem, old school imperative nested loops result in simpler code for the reader. I don't think falling back to imperative style is a mistake for certain classes of problems (such as sorting algorithms which usually transform the input buffer (like a procedure) rather than resulting to a new sorted one
Here it is my solution:
package bitspoke.algo
import scala.math.Ordered
import scala.collection.mutable.Buffer
abstract class Sorter[T <% Ordered[T]] {
// algorithm provided by subclasses
def sort(buffer : Buffer[T]) : Unit
// check if the buffer is sorted
def sorted(buffer : Buffer[T]) = buffer.isEmpty || buffer.view.zip(buffer.tail).forall { t => t._2 > t._1 }
// swap elements in buffer
def swap(buffer : Buffer[T], i:Int, j:Int) {
val temp = buffer(i)
buffer(i) = buffer(j)
buffer(j) = temp
}
}
class SelectionSorter[T <% Ordered[T]] extends Sorter[T] {
def sort(buffer : Buffer[T]) : Unit = {
for (i <- 0 until buffer.length) {
var min = i
for (j <- i until buffer.length) {
if (buffer(j) < buffer(min))
min = j
}
swap(buffer, i, min)
}
}
}
As you can see, rather than using java.lang.Comparable, I preferred scala.math.Ordered and Scala View Bounds rather than Upper Bounds. That's certainly works thanks to many Scala Implicit Conversions of primitive types to Rich Wrappers.
You can write a client program as follows:
import bitspoke.algo._
import scala.collection.mutable._
val sorter = new SelectionSorter[Int]
val buffer = ArrayBuffer(3, 0, 4, 2, 1)
sorter.sort(buffer)
assert(sorter.sorted(buffer))