Scala: asserting on a case class while ignoring an unknown parameter

Let's say I have the case class below, which maps directly to a db table, where the id is generated randomly on the creation of a new row.
case class Test(id: UUID, name: String)
Looking at the tests right now, I need to retrieve a row from Test and compare it with
val test1 = (...., "testName")
However, I don't have the first parameter, since it's created randomly, and I would like to ignore it somehow...
I tried doing
test1 = (_, "testName")
but it's not valid.
Is there any way in Scala to ignore a case class parameter?
Thanks!

Assuming we have
case class Test(id: UUID, name: String)
Here's a function that tests two instances of Test for equality, ignoring the id field.
def myEquality(a: Test, b: Test): Boolean =
  a == b.copy(id = a.id)
We can't explicitly tell Scala to ignore a field, but we can most certainly mock that field to be the correct value. And since these are case classes (i.e. immutable), we can't mess up any other unrelated data structures by doing this simple trick.
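For example, a quick sketch of the trick in action (the UUID values here are placeholders; the point is that they differ):
import java.util.UUID

val fromDb = Test(UUID.randomUUID(), "testName")   // id assigned on row creation
val expected = Test(UUID.randomUUID(), "testName") // id is a throwaway placeholder
myEquality(fromDb, expected) // true: the ids differ but are ignored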

To answer the question posed, the answer is no. Case class instances are defined by the values of their fields. They do not have the property of identity like normal classes. So instantiating a case class with a missing parameter is not possible.

Related

Creating Spark Dataframes from regular classes

I have always seen that, when we are using a map function, we can create a dataframe from an rdd using a case class like below:
case class filematches(
  row_num: Long,
  matches: Long,
  non_matches: Long,
  non_match_column_desc: Array[String]
)
newrdd1.map(x => filematches(x._1, x._2, x._3, x._4)).toDF()
This works great as we all know!!
I was wondering why we specifically need case classes here.
We should be able to achieve the same effect using normal classes with parameterized constructors (as the fields will be vals and not private):
class filematches1(
  val row_num: Long,
  val matches: Long,
  val non_matches: Long,
  val non_match_column_desc: Array[String]
)
newrdd1.map(x => new filematches1(x._1, x._2, x._3, x._4)).toDF
Here, I am using the new keyword to instantiate the class.
Running the above gives me the error:
error: value toDF is not a member of org.apache.spark.rdd.RDD[filematches1]
I am sure I am missing some key concept about case classes vs regular classes here but have not been able to find it yet.
To resolve the error
value toDF is not a member of org.apache.spark.rdd.RDD[...]
you should move your case class definition out of the function where you are using it. You can refer to http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Spark-Scala-Error-value-toDF-is-not-a-member-of-org-apache/td-p/29878 for more detail.
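A minimal sketch of that fix (newrdd1 and its tuple shape are assumptions carried over from the question; import spark.implicits._ is what brings toDF into scope):
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Defined at the top level (or inside an object), not inside a method:
case class filematches(
  row_num: Long,
  matches: Long,
  non_matches: Long,
  non_match_column_desc: Array[String]
)

object Example {
  def run(spark: SparkSession, newrdd1: RDD[(Long, Long, Long, Array[String])]): Unit = {
    import spark.implicits._ // provides the toDF implicit conversion
    val df = newrdd1.map(x => filematches(x._1, x._2, x._3, x._4)).toDF()
    df.show()
  }
}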
On your other query: case classes are syntactic sugar, and they provide the following additional things.
Case classes are different from general classes. They are specially used for creating immutable objects.
They have a default apply function which is used as a constructor to create objects (so less code).
All the variables in a case class are vals by default, and hence immutable, which is a good thing in the Spark world, as all RDDs are immutable.
An example of a case class is
case class Book(name: String)
val book1 = Book("test")
You cannot change the value of book1.name as it is immutable, and you do not need to say new Book() to create an object here.
The class variables are public by default, so you don't need setters and getters.
Moreover, when comparing two case class objects, their structure is compared instead of their references.
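For example, continuing the Book case class from above:
val bookA = Book("test")
val bookB = Book("test")
bookA == bookB // true: compared field by field, not by reference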
Edit: Spark uses the following class to infer the schema.
Code Link :
https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
If you look at the schemaFor function (lines 719 to 791), you can see it converts Scala types to Catalyst types. I think the case to handle non-case classes for schema inference has not been added yet, so every time you try to use a non-case class with schema inference, it falls through to the default case and hence gives the error Schema for type $other is not supported.
Hope this helps

Scala case class with argument type change at runtime

Scala Issue:
JSON data is extracted and stored into a case class; the time string data needs to be converted to a sql Timestamp for the
Spark dataframe, and to a Java/Joda Date for the Salat DAO/Mongo DB store.
And neither supports the other's format.
Currently we are using two case classes for the same data:
case class A(a: Int, b: String, time: java.sql.Timestamp)
case class B(a: Int, b: String, time: java.util.Date)
So a JSON extractor method populates one of the above two case classes based on the store type (Spark/Mongo).
Is there a better way to handle this? (A composite class is one way, but again it gets too nested.)
Do note, the case classes can even be nested (A containing C and D, which in turn can have time arguments within them).
I would think about the application domain first. Timestamp or Date is an implementation detail depending on your data store.
My suggested solution would be
case class MyDomainObject(a: Int, b: String, time: java.time.Instant)

object MyDomainObject {
  def fromMongoObject(o: MyDomainMongoObject): MyDomainObject = ???
  def fromSparkObject(o: MyDomainSparkObject): MyDomainObject = ???
}
(NOTE, I picked java.time.Instant as an example; you can choose whatever time representation you prefer)
And the classes/functions that deal with Mongo/Spark will extract objects respectively into MyDomainMongoObject and MyDomainSparkObject, which are then converted using the methods in the companion object. This way you keep your domain clean by thinking only about one particular type of time representation, but every datastore adapter can use its own type.
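A minimal sketch of what those conversions might look like (MyDomainMongoObject and MyDomainSparkObject are hypothetical store-specific shapes, invented here for illustration; the toInstant conversions are standard JDK methods):
import java.time.Instant

// Hypothetical store-specific representations:
case class MyDomainSparkObject(a: Int, b: String, time: java.sql.Timestamp)
case class MyDomainMongoObject(a: Int, b: String, time: java.util.Date)

case class MyDomainObject(a: Int, b: String, time: Instant)

object MyDomainObject {
  def fromMongoObject(o: MyDomainMongoObject): MyDomainObject =
    MyDomainObject(o.a, o.b, o.time.toInstant) // java.util.Date#toInstant
  def fromSparkObject(o: MyDomainSparkObject): MyDomainObject =
    MyDomainObject(o.a, o.b, o.time.toInstant) // java.sql.Timestamp#toInstant
}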

Generically populate the same field in all case classes which have a particular timestamp property

I have case classes which have an Option[java.sql.Timestamp] field.
case class BooRow(id: Int, createdTs: Option[java.sql.Timestamp])
case class FooRow(id: Int, name: String, deptId: Int, createdTs: Option[java.sql.Timestamp])
case class BarRow(id: Int, name: String)
I am looking for a generic solution, which is also compile-time safe, that observes an incoming case class instance and, if it finds createdTs: Option[java.sql.Timestamp] in it, populates it.
My original thought was to look at shapeless, and I had posted this question. Gabriele's answer is right in the context of that previous question, but maybe I wasn't clear enough there, because I am not sure how to "catch" the "implicit not found" error for that shapeless solution (for a case class which doesn't have a createdTs field). The generic function should populate createdTs if it is found in the case class instance it was passed; if it doesn't find createdTs in the instance, it should do nothing.
I have tagged slick for this question because these are slick-generated case classes (not the reason for tagging), but I assume this can be a common need (generically populating such timestamp columns) when dealing with slick/databases in general. To clarify, I am not looking for db-based solutions (like triggers etc.) but for something at the application level, and without using structural types.
Edit: Providing a default value for createdTs in the case class declaration is not a desired solution. That would stamp the wrong timestamp when a case class instance is created for an "update" query.
Depending on how you are expecting to arrive at instances of your case classes, one option might be to ensure the createdTs parameter is given a default value on creation:
case class BooRow(
  id: Int,
  createdTs: Option[java.sql.Timestamp] = Some(new java.sql.Timestamp((new java.util.Date).getTime))
)
However, it isn't clear from your question whether this can cover your particular use case.
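For illustration, this is how the default behaves (which is also the caveat raised in the question's edit):
val row = BooRow(id = 1) // createdTs defaults to the time of construction
// For an "update" query built later, this default would stamp the
// construction time, not the update time.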

Copying almost identical objects in Scala (aka. How to package models for REST)?

I have several objects that closely (but not perfectly) mirror other objects in Scala. For example, I have a PackagedPerson that has all of the same fields as the PersonModel object, plus some. (The PackagedPerson adds in several fields from other entities, things that are not on the PersonModel object).
Generally, the PackagedPerson is used for transmitting a "package" of person-related things over REST, or receiving changes back (again over REST).
When preparing these transactions, I have a pack method, such as:
def pack(p: PersonModel): PackagedPerson
After all the preamble is out of the way (for instance, loading optional, extra objects that will be included in the package), I create a PackagedPerson from the PersonModel and "everything else:"
new PackagedPerson(
  p.id, p.name,            // these (and more) from the model object
  x.profilePicture, y.etc  // these from elsewhere
)
In many cases, the model object has quite a few fields. My question is: how can I minimize repetitive code?
In a way it's like unapply and apply except that there are "extra" parameters, so what I really want is something like this:
new PackagedPerson(p.unapply(), x.profilePicture, y.etc)
But obviously that won't work. Any ideas? What other approaches have you taken for this? I very much want to keep my REST-compatible "transport objects" separate from the model objects. Sometimes this "packaging" is not necessary, but sometimes there is too much delta between what goes over the wire, and what gets stored in the database. Trying to use a single object for both gets messy fast.
You could use LabelledGeneric from shapeless.
You can convert between a case class and its generic representation.
import shapeless.LabelledGeneric
import shapeless.record._
import shapeless.syntax.singleton._

case class Person(id: Int, name: String)
case class PackagedPerson(id: Int, name: String, age: Int)

def packagePerson(person: Person, age: Int): PackagedPerson = {
  val personGen = LabelledGeneric[Person]
  val packPersonGen = LabelledGeneric[PackagedPerson]
  // turn a Person into a generic representation (a labelled record)
  val rec = personGen.to(person)
  // add the age field to the record
  // and turn the updated record into a PackagedPerson
  packPersonGen.from(rec + ('age ->> age))
}
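Usage would look something like this (values invented for illustration):
val person = Person(1, "Alice")
val packaged = packagePerson(person, 30)
// packaged == PackagedPerson(1, "Alice", 30)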
Probably the order of the fields of your two case classes won't correspond as nicely as in my simple example. If that is the case, shapeless can reorder your fields using Align. Look at this brilliant answer to another question.
You can try Java/Scala reflection. Create a method that accepts a person model, all other models and model-free parameters:
def pack(p: PersonModel, others: Seq[Model], freeParams: (String, Any)*): PackedPerson
In the method, you reflectively obtain PackedPerson's constructor and see what arguments go there. Then you (reflectively) iterate over the fields of PersonModel, the other models, and the free args: if there's a field whose name and type are the same as one of the constructor params, you save it. Then you invoke the PackedPerson constructor reflectively using the saved args.
Keep in mind though that, in Scala versions before 2.11, a case class could contain only up to 22 constructor params.
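A rough sketch of that reflective approach (PersonModel and PackedPerson are hypothetical stand-ins; error handling is omitted, and params are matched by name only):
import scala.reflect.runtime.{universe => ru}

case class PersonModel(id: Int, name: String)
case class PackedPerson(id: Int, name: String, profilePicture: String)

def pack[T: ru.TypeTag](model: Any, freeParams: (String, Any)*): T = {
  val mirror = ru.runtimeMirror(getClass.getClassLoader)
  val tpe = ru.typeOf[T]
  val ctor = tpe.decl(ru.termNames.CONSTRUCTOR).asMethod
  val free = freeParams.toMap

  // For each constructor parameter, take the value from the free params,
  // or fall back to the model's field of the same name (case class fields
  // are exposed as no-arg getter methods).
  val args = ctor.paramLists.flatten.map { param =>
    val name = param.name.toString
    free.getOrElse(name, model.getClass.getMethod(name).invoke(model))
  }

  mirror.reflectClass(tpe.typeSymbol.asClass)
    .reflectConstructor(ctor)(args: _*)
    .asInstanceOf[T]
}

// Usage:
// pack[PackedPerson](PersonModel(1, "Ann"), "profilePicture" -> "pic.png")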

Getting field names and types from a case class (with Option)

Assuming we have a model of something, represented as a case class, like so:
case class User(firstName: String, lastName: String, age: Int, planet: Option[Planet])
sealed abstract class Planet
case object Earth extends Planet
case object Mars extends Planet
case object Venus extends Planet
Essentially, I want to be able, either by use of reflection or macros, to get the field names of the User case class, as well as the types represented by the fields. This also includes Option, i.e. in the example provided, I need to be able to differentiate between an Option[Planet] and just a Planet.
In Scala'ish pseudocode, something like this:
val someMap = createTypedMap[User] // assume createTypedMap is some function which returns a map of field names to types
someMap.foreach { case (fieldName, someType) =>
  val statement = someType match {
    case String => s"$fieldName happened to be a string"
    case Int => s"$fieldName happened to be an integer"
    case Planet => s"$fieldName happened to be a planet"
    case Option[Planet] => s"$fieldName happened to be an optional planet"
    case _ => s"unknown type for $fieldName"
  }
  println(statement)
}
I am aware that you can't do things like case Option[Planet], since it gets removed by Scala's erasure; however, even when using TypeTags, I am unable to write code that does what I am trying to do, and that could possibly deal with other types as well (like Either[SomeError, String]).
We are currently using the latest version of Scala (2.11.2), so any solution that uses TypeTags, ClassTags, or macros would be more than enough.
Option is a type-parametrized type (Option[T]). At runtime, unless you have structured your code to use type tags, you have no means to distinguish between an Option[String] and an Option[Int], due to type erasure (this is true for all type-parametrized types).
Nonetheless, you can discriminate between an Option[*] and a Planet. Just keep in mind the first issue.
Through reflection, getting all the "things" inside a class is easy. For example, say you only want the getters (you can apply other kinds of filters; there are A LOT of them, and not all behave as expected when inheritance is part of the process, so you'll need to experiment a little):
import scala.reflect.runtime.{universe => ru}

val fieldSymbols = ru.typeOf[User].members.collect {
  case m: ru.MethodSymbol if m.isGetter => m
}
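From those symbols you can inspect each getter's returnType, which (unlike runtime values) keeps its type arguments, so Option[Planet] can be told apart from Planet. A hypothetical continuation, mirroring the pseudocode from the question:
fieldSymbols.foreach { m =>
  val fieldName = m.name.toString
  val tpe = m.returnType // the full type, type arguments included
  val statement =
    if (tpe =:= ru.typeOf[String]) s"$fieldName happened to be a string"
    else if (tpe =:= ru.typeOf[Int]) s"$fieldName happened to be an integer"
    else if (tpe =:= ru.typeOf[Planet]) s"$fieldName happened to be a planet"
    else if (tpe =:= ru.typeOf[Option[Planet]]) s"$fieldName happened to be an optional planet"
    else s"unknown type for $fieldName"
  println(statement)
}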
Another option you have, if you are calling the code on instances rather than on classes, is to go through every method, call it, assign the result to a variable, and then test the type of the variable. This assumes that you are only calling methods that don't alter the state of the instance.
You have a lot of options; it's time for you to find the best one for your needs.