Scala - Domain Objects with lots of fields - scala

This might be a stupid question but I am relatively new to Scala so please bear with me. I am trying to model a domain object for a Spark job in Scala, which reflects the data structure of the source record and contains more than 100 fields. I am trying to figure out the best way to model this as I don't feel comfortable simply adding all the fields to a single case class. I thought about grouping closely associated fields into nested case classes but then I read in a few places that nesting case classes is not recommended. I would appreciate some input on what would be the best approach.
Edit: In response to Alvaro's comments:
So in essence we are saying that this is not recommended:
case class Product(name: String,
                   desc: String,
                   productGroup: String) {
  case class ProductPack(packType: String,
                         packQuantity: Int,
                         packQuantityUnit: String,
                         packUnitPrice: Float)
}
While this would be fine:
case class Product(name: String,
                   desc: String,
                   productGroup: String,
                   productPack: ProductPack) {
}
case class ProductPack(packType: String,
                       packQuantity: Int,
                       packQuantityUnit: String,
                       packUnitPrice: Float) {
}

Your update is correct.
Another alternative: If a case class mostly makes sense in the context of another concept, sometimes I define the case class inside a companion to the concept:
case class Product(
  name: String,
  desc: String,
  productGroup: String,
  productPack: Product.Pack
)
object Product {
  case class Pack(
    packType: String,
    packQuantity: Int,
    packQuantityUnit: String,
    packUnitPrice: Float
  )
}
That should also be fine. The class is contained in an object, but it is not "nested" in the Product class.
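A minimal usage sketch of the companion-object layout, wrapped in a hypothetical ProductExample object so it can be pasted as one unit. The field values are made up for illustration; only the Product and Product.Pack definitions come from the answer above.

```scala
object ProductExample {
  case class Product(
    name: String,
    desc: String,
    productGroup: String,
    productPack: Product.Pack
  )

  object Product {
    // Pack lives in the companion, so its type is simply Product.Pack
    // and does not depend on any particular Product instance.
    case class Pack(
      packType: String,
      packQuantity: Int,
      packQuantityUnit: String,
      packUnitPrice: Float
    )
  }

  // Illustrative values only.
  val pack = Product.Pack("box", 12, "pieces", 1.5f)
  val product = Product("Pencil", "HB graphite pencil", "Stationery", pack)

  // Grouped fields can be updated with nested copy calls.
  val discounted =
    product.copy(productPack = product.productPack.copy(packUnitPrice = 1.25f))
}
```

With 100+ source fields, each group of closely related fields would become one such small case class, and the top-level class would hold one field per group.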

Related

scala refactor 2 almost identical case classes

I have the following case classes in scala:
case class Brand(brand: String, country: String)
object Brand {
  def func(data: (String, String)): Brand = ???
}
case class Manufacturer(manufacturer: String, country: String)
object Manufacturer {
  def func(data: (String, String)): Manufacturer = ???
}
As you can see, the problem here is that Brand and Manufacturer actually have the same structure, but they share different field names. So the same functions will need to be implemented twice, just with different output type. The logic is the same.
Is there a good way to refactor this code?
I prefer not to combine them into one case class such as
case class Attribute(attribute: String, country: String)
Thanks.
You can use Brand.tupled instead of Brand.func, this removes the need for this function.
tupled is a method available on the companion of a case class; it takes a tuple as input and creates an instance of the case class.
For instance, Brand.tupled(("a","b")) will give Brand("a","b").
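One caveat, as far as I know: in Scala 2, when you write the companion object yourself (as in the question), the companion no longer extends Function2, so Brand.tupled may not compile. A sketch that should work either way is to go through apply explicitly. The wrapper object TupledExample and the names toBrand and toManufacturer are hypothetical, as are the sample values.

```scala
object TupledExample {
  case class Brand(brand: String, country: String)
  case class Manufacturer(manufacturer: String, country: String)

  // Eta-expand apply into a two-argument function, then turn it into
  // a function that takes a single (String, String) tuple.
  val toBrand: ((String, String)) => Brand = (Brand.apply _).tupled
  val toManufacturer: ((String, String)) => Manufacturer =
    (Manufacturer.apply _).tupled

  val brand = toBrand(("Acme", "US"))
  val manufacturer = toManufacturer(("Acme Works", "DE"))
}
```

Both functions have the same shape, so generic code can take a ((String, String)) => A parameter instead of duplicating func in every companion.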

What's the advantage of using case classes? [duplicate]

Sometimes I see people using standalone case classes for general purposes, instead of pattern matching, for example,
case class Employee(id: Int, name: String, age: Int, city: String)
What's the advantage of using case classes like this over normal classes?
class Employee(id: Int, name: String, age: Int, city: String)
class Employee(id: Int, name: String, age: Int, city: String)
When you declare a class like the one above, the parameters are only constructor parameters, not members that can be accessed from outside the class. To make them fields, you need to add val before each one. In a case class, they are members (public vals) by default.
Besides that, a case class gets toString, hashCode and equals methods by default.
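A small sketch of the difference. PlainEmployee is a hypothetical name used here so that both variants can live side by side; the values are illustrative.

```scala
object CaseClassExample {
  // Plain class: id and name are constructor parameters only.
  class PlainEmployee(id: Int, name: String)

  // Case class: id and name are public vals, and equals, hashCode
  // and toString are generated from them.
  case class Employee(id: Int, name: String)

  val a = Employee(1, "Ann")
  val b = Employee(1, "Ann")

  val structuralEquality = a == b // compares field values
  val referenceEquality =
    new PlainEmployee(1, "Ann") == new PlainEmployee(1, "Ann") // compares references
  val rendered = a.toString
  val fieldAccess = a.name // would not compile on PlainEmployee
}
```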

Same case class different validation

What I'm trying to do in Scala 2.11 and akka is have one case class but two different validations based on which route is being hit.
For example, let's consider the case class below
case class User(_id: String, name: String, age: Int, address: String)
Now while the /create route is hit, I don't need _id but I need all the other fields.
But while /update route is hit, I need the _id and the fields that are to be updated (which could be one or all three)
Only declaring Option doesn't serve the purpose because then my /create route goes for a toss.
Even extending case classes doesn't really work seamlessly (there's too much code duplication).
I would love it if something like this were possible:
case class User(_id: String, name: String, age: Int, address: String)
case class SaveUser() extends User {
require(name.nonEmpty)
require(age.nonEmpty)
require(address.nonEmpty)
}
case class UpdateUser() extends User {
require(_id.nonEmpty)
}
Is there an elegant solution to this? Or do I have to create two identical case classes?
My suggestion would be to encode different case classes for different requirements. If you insist on sharing code between these two cases, a possible solution would be to parameterize the case class:
case class User[Id[_], Param[_]](_id: Id[String], name: Param[String], age: Param[Int], address: Param[String])
Then you define an alias for the Identity type constructor and your two uses of the case class
type Identity[T] = T
type SaveUser = User[Option, Identity]
type UpdateUser = User[Identity, Option]
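To make this concrete, here is a sketch of how the two aliases could be instantiated. The wrapper object UserExample and the sample values are hypothetical. The higherKinds import is only needed on older compilers such as 2.11; I am assuming it is harmless on newer ones.

```scala
import scala.language.higherKinds

object UserExample {
  case class User[Id[_], Param[_]](
    _id: Id[String],
    name: Param[String],
    age: Param[Int],
    address: Param[String]
  )

  type Identity[T] = T
  type SaveUser = User[Option, Identity]   // _id optional, everything else required
  type UpdateUser = User[Identity, Option] // _id required, everything else optional

  // /create: no _id yet, all other fields are plain values.
  val create: SaveUser = User[Option, Identity](None, "Ann", 30, "1 Main St")

  // /update: _id is mandatory, only age is being changed here.
  val update: UpdateUser = User[Identity, Option]("42", None, Some(31), None)
}
```

The type arguments are spelled out at the call sites because inference through the Identity alias tends to be unreliable.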

Scala idiom for partial models?

I am writing a HTTP REST API and I want strongly typed model classes in Scala e.g. if I have a car model Car, I want to create the following RESTful /car API:
1) For POSTs (create a new car):
case class Car(manufacturer: String,
name: String,
year: Int)
2) For PUTs (edit existing car) and GETs, I want to tag along an id too:
case class Car(id: Long,
manufacturer: String,
name: String,
year: Int)
3) For PATCHes (partial edit existing car), I want this partial object:
case class Car(id: Long,
manufacturer: Option[String],
name: Option[String],
year: Option[Int])
But keeping 3 models for essentially the same thing is redundant and error prone (e.g. if I edit one model, I have to remember to edit the other models).
Is there a typesafe way to maintain all 3 models? I am okay with answers that use macros too.
I did manage to combine the first two ones as following
trait Id {
val id: Long
}
type PersistedCar = Car with Id
I would go with something like that
trait Update[T] {
  def patch(obj: T): T
}
case class Car(manufacturer: String, name: String, year: Int)
case class CarUpdate(manufacturer: Option[String],
                     name: Option[String],
                     year: Option[Int]) extends Update[Car] {
  override def patch(car: Car): Car = Car(
    manufacturer.getOrElse(car.manufacturer),
    name.getOrElse(car.name),
    year.getOrElse(car.year)
  )
}
sealed trait Request
case class Post[T](obj: T) extends Request
case class Put[T](id: Long, obj: T) extends Request
case class Patch[T, U <: Update[T]](patch: U) extends Request
With Post and Put everything is straightforward. Patch is a bit more complicated. I'm pretty sure the CarUpdate class could be generated automatically with macros.
If you update your Car model, you definitely will not forget about patch, because it will fail at compile time. However, these two models look too "copy-paste-like".
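A usage sketch of the patch method, wrapped in a hypothetical PatchExample object. The None defaults on CarUpdate are my addition for convenience and are not part of the code above; the car values are made up.

```scala
object PatchExample {
  trait Update[T] {
    def patch(obj: T): T
  }

  case class Car(manufacturer: String, name: String, year: Int)

  // Assumption: default values of None, so callers only name the
  // fields they actually want to change.
  case class CarUpdate(
    manufacturer: Option[String] = None,
    name: Option[String] = None,
    year: Option[Int] = None
  ) extends Update[Car] {
    // Each field falls back to the stored value when not supplied.
    override def patch(car: Car): Car = Car(
      manufacturer.getOrElse(car.manufacturer),
      name.getOrElse(car.name),
      year.getOrElse(car.year)
    )
  }

  val stored = Car("Tesla", "Model S", 2014)
  val patched = CarUpdate(year = Some(2015)).patch(stored)
}
```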
You could represent your models as Shapeless records, then the id is simply one more field on the front, and the mapping to/from options can be done generically using ordinary Shapeless type-level programming techniques. It should also be possible to generically serialize/deserialize such things to JSON (I have done this in the past, but the relevant code belongs to a previous employer). But you would definitely be pushing the boundaries and doing complex type-level programming; I don't think mature library solutions with this approach exist yet.
Actually I managed to solve this using a little library that I wrote:
https://github.com/pathikrit/metarest
Using the above library, this simply becomes:
import com.github.pathikrit.MetaRest._
@MetaRest case class Car(
  @get @put id: Long,
  @get @post @put @patch manufacturer: String,
  @get @post @put @patch name: String,
  @get @post @put @patch year: Int
)
I agree with Paul's comment: yes, you would have a lot of duplicated fields, but that is because you are decoupling the external representation of the fields from the internal one, which is a good thing if you want to change your internal representation without changing the API. That said, if I understood correctly that you want a single representation, a possible way to achieve it could be:
case class CarAllRepresentationsInOne(
id: Option[Long] = None,
manufacturer: Option[String] = None,
name: Option[String] = None,
year: Option[Int] = None)
Since every field defaults to None, you can instantiate this case class from all the routes. The only disadvantages are that you have to use named parameters during instantiation and check for None wherever the fields are used.
But I would strongly recommend having different types for your internal representation and for each possible external request resource. It may seem like duplication of code at the beginning, but the way you model cars inside your world should be kept separate from the resources used by the external world, in order to decouple them and allow you to change the internal representation without changing the API contract with the outside when new needs arise.
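A rough sketch of what that separation could look like for the create route. The names CreateCarRequest and toCar are hypothetical and only for illustration; the id would normally come from the persistence layer.

```scala
object DecouplingExample {
  // Internal representation: free to change without touching the API.
  final case class Car(id: Long, manufacturer: String, name: String, year: Int)

  // External representation for POST: no id, since the server assigns it.
  final case class CreateCarRequest(manufacturer: String, name: String, year: Int) {
    // The explicit conversion is the single place where the two models meet.
    def toCar(id: Long): Car = Car(id, manufacturer, name, year)
  }

  val created = CreateCarRequest("Tesla", "Model S", 2014).toCar(id = 1L)
}
```

If a field is added to Car, the conversion stops compiling until it is updated, which covers the "forgot to edit the other model" concern from the question.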

Scala case class inheritance

I have an application based on Squeryl. I define my models as case classes, mostly since I find it convenient to have copy methods.
I have two models that are strictly related. The fields are the same, many operations are in common, and they are to be stored in the same DB table. But there is some behaviour that only makes sense in one of the two cases, or that makes sense in both cases but is different.
Until now I only have used a single case class, with a flag that distinguishes the type of the model, and all methods that differ based on the type of the model start with an if. This is annoying and not quite type safe.
What I would like to do is factor the common behaviour and fields in an ancestor case class and have the two actual models inherit from it. But, as far as I understand, inheriting from case classes is frowned upon in Scala, and is even prohibited if the subclass is itself a case class (not my case).
What are the problems and pitfalls I should be aware of when inheriting from a case class? Does it make sense in my case to do so?
My preferred way of avoiding case class inheritance without code duplication is somewhat obvious: create a common (abstract) base class:
abstract class Person {
def name: String
def age: Int
// address and other properties
// methods (ideally only accessors since it is a case class)
}
case class Employer(val name: String, val age: Int, val taxno: Int)
extends Person
case class Employee(val name: String, val age: Int, val salary: Int)
extends Person
If you want to be more fine-grained, group the properties into individual traits:
trait Identifiable { def name: String }
trait Locatable { def address: String }
// trait Ages { def age: Int }
case class Employer(val name: String, val address: String, val taxno: Int)
extends Identifiable
with Locatable
case class Employee(val name: String, val address: String, val salary: Int)
extends Identifiable
with Locatable
Since this is an interesting topic to many, let me shed some light here.
You could go with the following approach:
// You can mark it as 'sealed'. Explained later.
sealed trait Person {
  def name: String
}
case class Employee(
  override val name: String,
  salary: Int
) extends Person
case class Tourist(
  override val name: String,
  bored: Boolean
) extends Person
Yes, you have to duplicate the fields. If you don't, it simply would not be possible to implement correct equality among other problems.
However, you don't need to duplicate methods/functions.
If the duplication of a few properties is that much of an importance to you, then use regular classes, but remember that they don't fit FP well.
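To illustrate the point about not duplicating methods, here is a sketch where the shared behaviour lives in the trait. The greeting method, the wrapper object and the sample values are hypothetical.

```scala
object SharedBehaviourExample {
  sealed trait Person {
    def name: String
    // Shared behaviour is written once, in terms of the abstract member.
    def greeting: String = s"Hello, $name"
  }

  // The field is repeated in each case class, the method is not.
  case class Employee(name: String, salary: Int) extends Person
  case class Tourist(name: String, bored: Boolean) extends Person

  val people: List[Person] =
    List(Employee("Jack", 50000), Tourist("Jill", bored = true))
  val greetings: List[String] = people.map(_.greeting)
}
```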
Alternatively, you could use composition instead of inheritance:
case class Employee(
person: Person,
salary: Int
)
// In code:
val employee = ...
println(employee.person.name)
Composition is a valid and a sound strategy that you should consider as well.
In case you wonder what a sealed trait means: it is something that can be extended only in the same file. That is, the two case classes above have to be in the same file as the trait. This allows for exhaustive compiler checks:
val x: Person = Employee(name = "Jack", salary = 50000)
x match {
  case Employee(name, _) => println(s"I'm $name!")
}
Gives a warning:
warning: match is not exhaustive!
missing combination Tourist
Which is really useful. Now you won't forget to deal with the other types of Persons (people). This is essentially what the Option class in Scala does.
If that does not matter to you, then you could make it non-sealed and throw the case classes into their own files. And perhaps go with composition.
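For comparison, a sketch of a match that covers every subtype and therefore compiles without the exhaustiveness warning. The describe function and its output strings are made up for illustration.

```scala
object ExhaustiveMatchExample {
  sealed trait Person {
    def name: String
  }
  case class Employee(name: String, salary: Int) extends Person
  case class Tourist(name: String, bored: Boolean) extends Person

  // Both subtypes are handled. Removing a case, or adding a third
  // subtype in this file, would bring the compiler warning back.
  def describe(p: Person): String = p match {
    case Employee(name, salary) => s"$name earns $salary"
    case Tourist(name, bored) =>
      if (bored) s"$name is bored" else s"$name is sightseeing"
  }
}
```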
case classes are perfect for value objects, i.e. objects that don't change any properties and can be compared with equals.
But implementing equals in the presence of inheritance is rather complicated. Consider two classes:
class Point(x: Int, y: Int)
and
class ColoredPoint(x: Int, y: Int, c: Color) extends Point(x, y)
By the usual definition, ColoredPoint(1, 4, red) should be equal to Point(1, 4); they are the same point, after all. So ColoredPoint(1, 4, blue) should also be equal to Point(1, 4), right? But of course ColoredPoint(1, 4, red) should not equal ColoredPoint(1, 4, blue), because they have different colors. There you go: transitivity, one basic property of the equality relation, is broken.
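The problem can be reproduced with hand-written equals methods. This is only a sketch of one naive implementation, not how case classes generate equals; the color is a plain String here because Color is not defined in the answer.

```scala
object EqualityExample {
  class Point(val x: Int, val y: Int) {
    override def equals(other: Any): Boolean = other match {
      case p: Point => x == p.x && y == p.y
      case _        => false
    }
    override def hashCode: Int = 31 * x + y
  }

  class ColoredPoint(x: Int, y: Int, val color: String) extends Point(x, y) {
    override def equals(other: Any): Boolean = other match {
      case cp: ColoredPoint => super.equals(cp) && color == cp.color
      case p: Point         => super.equals(p) // ignore color against a plain Point
      case _                => false
    }
  }

  val red = new ColoredPoint(1, 4, "red")
  val plain = new Point(1, 4)
  val blue = new ColoredPoint(1, 4, "blue")

  val redEqualsPlain = red == plain
  val plainEqualsBlue = plain == blue
  val redEqualsBlue = red == blue // differs from the two above: not transitive
}
```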
update
You can solve a lot of these problems by inheriting from traits, as described in another answer. An even more flexible alternative is often to use type classes. See What are type classes in Scala useful for? or http://www.youtube.com/watch?v=sVMES4RZF-8
In these situations I tend to use composition instead of inheritance i.e.
sealed trait IVehicle // tagging trait
case class Vehicle(color: String) extends IVehicle
case class Car(vehicle: Vehicle, doors: Int) extends IVehicle
val vehicle: IVehicle = ...
vehicle match {
  case Car(Vehicle(color), doors) => println(s"$color car with $doors doors")
  case Vehicle(color) => println(s"$color vehicle")
}
Obviously you can use a more sophisticated hierarchy and matches, but hopefully this gives you an idea. The key is to take advantage of the nested extractors that case classes provide.
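A self-contained version of the same idea, with the match turned into a function that returns a String so the result can be checked. The describe name and the wrapper object are hypothetical.

```scala
object CompositionExample {
  sealed trait IVehicle // tagging trait
  case class Vehicle(color: String) extends IVehicle
  case class Car(vehicle: Vehicle, doors: Int) extends IVehicle

  // The nested pattern Car(Vehicle(color), doors) reaches into the
  // composed Vehicle without any inheritance between the case classes.
  def describe(v: IVehicle): String = v match {
    case Car(Vehicle(color), doors) => s"$color car with $doors doors"
    case Vehicle(color)             => s"$color vehicle"
  }
}
```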