How to create a Scala case class instance from a Map instance - scala

I want to create a Scala case class instance whose field values come from a map. Here is the case class:
case class UserFeature(uid: String = null,
age: String = null,
marriageStatus: String = null,
consumptionAbility: String = null,
LBS: String = null,
interest1: String = null,
interest2: String = null,
interest3: String = null,
interest4: String = null,
interest5: String = null,
kw1: String = null,
kw2: String = null,
kw3: String = null,
topic1: String = null,
topic2: String = null,
topic3: String = null,
appIdInstall: String = null,
appIdAction: String = null,
ct: String = null,
os: String = null,
carrier: String = null,
house: String = null
)
Suppose the map instance is:
Map("uid" -> "4564131",
"age" -> "5",
"ct" -> "bk7755")
How can I apply the keys and values of the map to the fields and values of the case class?

It is not a good idea to use null to represent missing string values. Use Option[String] instead.
case class UserFeature(uid: Option[String] = None,
age: Option[String] = None,
marriageStatus: Option[String] = None,
...
Once you have done that, you can use get on the map to retrieve the value.
UserFeature(map.get("uid"), map.get("age"), map.get("marriageStatus") ...)
Values that are present in the map will be Some(value) and missing values will be None. The Option class has lots of useful methods for processing optional values in a safe way.
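As a minimal sketch of what those Option methods look like in practice, the snippet below uses a small map and a few common operations (getOrElse, flatMap, map, fold). The object name OptionBasics and the sample values are assumptions made for illustration only.

```scala
import scala.util.Try

object OptionBasics {
  val features: Map[String, String] = Map("uid" -> "4564131", "age" -> "5")

  // Map#get returns Some(value) for a present key and None otherwise.
  val age: Option[String]   = features.get("age")
  val house: Option[String] = features.get("house")

  // Supply a fallback only at the point where a plain String is required.
  val houseOrUnknown: String = house.getOrElse("unknown")

  // Transform the value if present; a failed parse also becomes None.
  val ageNextYear: Option[Int] = age.flatMap(a => Try(a.toInt).toOption).map(_ + 1)

  // fold handles both cases in one expression: default first, then the function.
  val description: String = age.fold("age missing")(a => s"age=$a")
}
```

None of these calls can throw a NullPointerException, which is the main practical gain over null defaults.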

You can do UserFeature(uid = map_var("uid"), age = map_var("age"), ct = map_var("ct")), assuming the variable holding the Map is map_var and all of those keys are present. Note that map_var("uid") throws a NoSuchElementException if the key is missing.

Synthesizing the other two answers, I would convert all the Strings in UserFeature that you're defaulting to null (which you should basically never use in Scala unless interacting with poorly-written Java code requires it, and even then use it as little as possible) to Option[String]. I leave that search-and-replace out of the answer.
Then you can do:
object UserFeature {
def apply(map: Map[String, String]): UserFeature =
UserFeature(map.get("uid"), map.get("age") ...)
}
Which lets you use:
val someMap: Map[String, String] = ...
val userFeature = UserFeature(someMap)
With the change to Option[String], there will be some other changes that need to be made in your codebase. https://danielwestheide.com/blog/2012/12/19/the-neophytes-guide-to-scala-part-5-the-option-type.html is a good tutorial for how to deal with Option.
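For a complete, compilable illustration of this approach, here is a sketch using a hypothetical three-field class, UserFeatureLite, in place of the full UserFeature (the remaining fields would follow the same pattern). The factory is named fromMap here purely as an illustrative choice; an apply overload as shown above works the same way.

```scala
// Hypothetical, shortened stand-in for UserFeature with only three fields.
final case class UserFeatureLite(
  uid: Option[String] = None,
  age: Option[String] = None,
  ct: Option[String] = None
)

object UserFeatureLite {
  // Each field is looked up by name; missing keys become None.
  def fromMap(m: Map[String, String]): UserFeatureLite =
    UserFeatureLite(uid = m.get("uid"), age = m.get("age"), ct = m.get("ct"))
}

object UserFeatureLiteDemo {
  val full: UserFeatureLite =
    UserFeatureLite.fromMap(Map("uid" -> "4564131", "age" -> "5", "ct" -> "bk7755"))

  // Keys that are absent from the map simply stay None.
  val partial: UserFeatureLite =
    UserFeatureLite.fromMap(Map("uid" -> "4564131"))
}
```

Using named arguments in fromMap keeps the mapping readable and guards against accidentally swapping two fields of the same type.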

Related

Fetch all entities from case class and convert them to string

I have a case class like below:
case class Class1(field1: String,
field2: Option[String] = None,
var var1: Option[String] = None,
var var2: Option[Boolean] = None,
var var3: Option[Double] = None
)
The list of variables is a bit longer. Now I want to convert all the fields of the class to strings: the Option wrapper should be dropped, and Boolean, Double and other numeric values should be converted to String. My first approach was:
def anyOptionalToString(class1Dataset: Dataset[Class1]): DataFrame = {
val ds1 = class1Dataset.map { class1 =>
(
class1.field1,
class1.field2.getOrElse(""),
class1.var1.getOrElse(""),
class1.var2.getOrElse(false),
class1.var3.getOrElse(-1.0)
)
}
ds1.toDF()
}
Is there a way to convert them without naming every field, for example with some kind of loop?
What I would do is create a new Seq containing the defaults you want to have. Let's say:
val defaults = Seq("", "", "", false, -1)
Now, we can use the productIterator to iterate over the existing elements, and choose whether we want to use the existing value, or the default:
val c1 = Class1("f1", Some("f2"), None, Some(true), Some(3))
c1.productIterator.zip(defaults.iterator).map {
case (None, default) => default
case (Some(value), _) => value
case (value, _) => value
}.map(_.toString)
The resulting type of the code above is Iterator[String]. The code can be run on Scastie.
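The same productIterator technique can be packaged as a reusable helper. The sketch below is plain Scala without Spark; Record is a hypothetical stand-in for Class1, and the helper name toStrings is an assumption for illustration.

```scala
// Hypothetical record type, standing in for Class1 from the question.
final case class Record(
  name: String,
  nick: Option[String] = None,
  active: Option[Boolean] = None,
  score: Option[Double] = None
)

object ProductToStrings {
  // One default per field, in declaration order.
  val defaults: Seq[Any] = Seq("", "", false, -1.0)

  // Works for any case class, because every case class is a Product.
  def toStrings(p: Product, defaults: Seq[Any]): List[String] =
    p.productIterator.zip(defaults.iterator).map {
      case (None, default)  => default // missing optional value: use the default
      case (Some(value), _) => value   // present optional value: unwrap it
      case (value, _)       => value   // non-optional field: keep as is
    }.map(_.toString).toList

  val example: List[String] =
    toStrings(Record("f1", None, Some(true), Some(3.5)), defaults)
  // example == List("f1", "", "true", "3.5")
}
```

One caveat: zip stops at the shorter of the two iterators, so a defaults sequence with fewer entries than the class has fields silently drops the trailing fields.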

Scala Option if exists then set multiple vals if None set same multiple vals to empty string

I have an Option employee object. From the employee I want to get the name, department, address, number, age, or anything else if it exists, but if it is None I want to set the name, department, and everything else to "".
I would like to just do like in Java:
if (employee.isDefined) {
val name = employee.get.getEmployeName
val department = employee.get.getDepartment
val address = employee.get.getAddress
val number = employee.get.getNumber
val age = employee.get.getAge
} else {
val name, department, address, number, age = ""
}
but I learned it does not work like that. It looks like I would need another employee object, set its values like this, and then access it later:
if (employee.isDefined) {
emp.setName(employee.get.getEmployeName)
emp.setDepartment(employee.get.getDepartment)
...
} else {
emp.setName("")
emp.setDepartment("")
...
}
I also experimented with tuples:
val employeeInfo = employee match {
case Some(emp) => (emp.getEmployeName, emp.getDepartment, emp.getAddress,
emp.getNumber, emp.getAge)
case None => ("", "", "", "", "")
}
val name = employeeInfo._1
val department = employeeInfo._2
val address = employeeInfo._3
...
Are these methods okay? Or are there any better ways to do this? Thanks for the help
.getOrElse() is the usual means of extracting a value from an Option while specifying a default if the option is None.
In your case, however, it is the container of many values that might be None. For that I'd recommend .fold().
case class Employee(empName : String
,dept : String
,addr : String
,num : String
,age : String)
val employee: Option[Employee] =
Some(Employee("Jo","mkt","21A","55","44"))
//or None
val name = employee.fold("")(_.empName)
val department = employee.fold("")(_.dept)
val address = employee.fold("")(_.addr)
val number = employee.fold("")(_.num)
val age = employee.fold("")(_.age)
But I have to agree with the comments from @sinanspd, your overall design is questionable at best.
This is how I would tackle this specific operation:
val (name, department, address, number, age) =
employee.fold(("", "", "", "", "")) { e =>
(e.getEmployeName, e.getDepartment, e.getAddress, e.getNumber, e.getAge)
}
But as suggested in the comments, it is worth looking at the overall design. For example it may be better to keep the values optional:
val employeeData: Option[(String, String, String, String, String)] =
employee.map{ e =>
(e.getEmployeName, e.getDepartment, e.getAddress, e.getNumber, e.getAge)
}
This allows you to tell whether a value is "" because employee was None or because the value in the Employee object was "". And you would probably define a different class to represent this restricted set of employee data to make the code cleaner and clearer.
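A dedicated class for the restricted data could look roughly like the sketch below. The names Employee, EmployeeSummary, and the chosen fields are hypothetical; the real Employee in the question is a Java-style class with getters.

```scala
// Stand-in for the Employee type from the question (hypothetical fields).
final case class Employee(name: String, department: String, address: String)

// Restricted view holding only the fields the caller needs.
final case class EmployeeSummary(name: String, department: String)

object EmployeeSummary {
  val empty: EmployeeSummary = EmployeeSummary("", "")

  def from(e: Employee): EmployeeSummary = EmployeeSummary(e.name, e.department)
}

object EmployeeSummaryDemo {
  val present: Option[Employee] = Some(Employee("Jo", "mkt", "21A"))
  val absent: Option[Employee]  = None

  // Keep the optionality for as long as possible...
  val summary: Option[EmployeeSummary] = present.map(EmployeeSummary.from)

  // ...and fall back to the empty defaults only where a plain value is required.
  val orEmpty: EmployeeSummary =
    absent.map(EmployeeSummary.from).getOrElse(EmployeeSummary.empty)
}
```

Named fields replace the positional _1, _2 tuple accessors, and the Option stays visible in the type until a default is genuinely needed.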

How to represent nulls in DataSets consisting of list of case classes

I have a case class
final case class FieldStateData(
job_id: String = null,
job_base_step_id: String = null,
field_id: String = null,
data_id: String = null,
data_value: String = null,
executed_unit: String = null,
is_doc: Boolean = null,
mime_type: String = null,
filename: String = null,
filesize: BigInt = null,
caption: String = null,
executor_id: String = null,
executor_name: String = null,
executor_email: String = null,
created_at: BigInt = null
)
I want to use it as the element type of a Dataset[FieldStateData] that is eventually inserted into a database. All columns need to be nullable. How would I represent null for types descended from AnyVal, such as numbers and Booleans, rather than for strings? I thought about using Option[Boolean] or something similar, but will that be unboxed automatically during insertion or when it is used in a SQL query?
Also note that the above code is not correct. Boolean types are not nullable. It's just an example.
You are correct to use the Option monad in the case class. Spark unboxes the field on read.
import org.apache.spark.sql.{Encoder, Encoders, Dataset}
final case class FieldStateData(job_id: Option[String],
job_base_step_id: Option[String],
field_id: Option[String],
data_id: Option[String],
data_value: Option[String],
executed_unit: Option[String],
is_doc: Option[Boolean],
mime_type: Option[String],
filename: Option[String],
filesize: Option[BigInt],
caption: Option[String],
executor_id: Option[String],
executor_name: Option[String],
executor_email: Option[String],
created_at: Option[BigInt])
implicit val fieldCodec: Encoder[FieldStateData] = Encoders.product[FieldStateData]
val ds: Dataset[FieldStateData] = spark.read.source_name.as[FieldStateData]
When you write the Dataset back into the database, None becomes a null value and Some(x) becomes the value x.
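The None-to-null correspondence can be illustrated in plain Scala, independent of Spark. The sketch below is not Spark's implementation; it only shows the kind of conversion that happens at a nullable boundary. The object and method names are assumptions for illustration.

```scala
object NullableBoundary {
  // Reference types: Option(x) turns null into None and any other value into Some.
  def fromNullable(s: String): Option[String] = Option(s)

  // scala.Boolean cannot be null, so a nullable Boolean has to arrive as the boxed Java type.
  def fromNullableBoolean(b: java.lang.Boolean): Option[Boolean] =
    Option(b).map(_.booleanValue())

  // Going the other way: None becomes null, Some(x) becomes the boxed x.
  def toNullableBoolean(b: Option[Boolean]): java.lang.Boolean =
    b.map(x => Boolean.box(x)).orNull
}
```

This is why Option[Boolean] or Option[BigInt] is the natural model for a nullable column: absence lives in the type and never in the value itself.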

Scala immutable container class extended with mixins

I'd like a container class that I can extend with some number of traits to contain groups of default vals that can later be changed in an immutable way. The traits will hold certain simple pieces of data that go together so that creating the class with a couple of traits will create an object with several collections of default values.
Then I'd like to be able to modify any of the vals immutably by copying the object while changing one new value at a time.
The class might have something like the following:
class Defaults(val string: String = "string", val int: Int = 1)
Then other traits like this
trait MoreDefaults{
val long: Long = 1L
}
Then I'd like to mix them in when instantiating, to build the particular set of defaults I need
var d = new Defaults with MoreDefaults
and later do something like:
if (someFlag) d = d.copy( long = 1412341234L )
You can do something like this with a single case class but I run out of params at 22. But I'll have a bunch of groupings of defaults I'd like to mixin depending on the need, then allow changes to any of them (class defined or trait defined) in an immutable way.
I can stick a copy method in the Defaults class like this:
def copy(
string: String = string,
int: Int = int): Defaults = {
new Defaults(string, int)
}
then do something like
var d = new Defaults
if (someFlag) d = d.copy(int = 234234)
Question: this works for values in the base class, but I can't figure out how to extend it to the mixin traits. Ideally d.copy would work on all vals defined by the class and the traits together. Overloading is troublesome too, since the vals are mainly Strings, but all of the val names will be unique in any mix of class and traits, or it is an error.
Using only classes, I can get some of this functionality by having a base Defaults class and then extending it with another class that has its own non-overloaded copyMoreDefaults function. This is really ugly, and I hope a Scala expert will see it and have a good laugh before setting me straight. It does work, though.
class Defaults(
val string: String = "one",
val boolean: Boolean = true,
val int: Int = 1,
val double: Double = 1.0d,
val long: Long = 1l) {
def copy(
string: String = string,
boolean: Boolean = boolean,
int: Int = int,
double: Double = double,
long: Long = long): Defaults = {
new Defaults(string, boolean, int, double, long)
}
}
class MoreDefaults(
string: String = "one",
boolean: Boolean = true,
int: Int = 1,
double: Double = 1.0d,
long: Long = 1l,
val string2: String = "string2") extends Defaults (
string,
boolean,
int,
double,
long) {
def copyMoreDefaults(
string: String = string,
boolean: Boolean = boolean,
int: Int = int,
double: Double = double,
long: Long = long,
string2: String = string2): MoreDefaults = {
new MoreDefaults(string, boolean, int, double, long, string2)
}
}
Then the following works:
var d = new MoreDefaults
if (someFlag) d = d.copyMoreDefaults(string2 = "new string2")
This method will be a mess if the parameters of Defaults are ever changed! All the derived classes will have to be updated. There must be a better way.
Strictly speaking, I don't think I'm answering your question; rather, I'm suggesting an alternative solution. So you're having problems with large case classes, e.g.
case class Fred(a: Int = 1, b: Int = 2, ... too many params ... )
What I would do is organize the params into more case classes:
case class Bar(a: Int = 1, b: Int = 2)
case class Foo(c: Int = 99, d: Int = 200)
// etc
case class Fred(bar: Bar = Bar(), foo: Foo = Foo(), ... etc)
Then, when you want to copy and change, say, one of the values of Foo, you do:
val myFred: Fred = Fred()
val fredCopy: Fred = myFred.copy(foo = myFred.foo.copy(d = 300))
and you need not even define the copy functions, you get them for free.
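Here is a compilable version of the idea with the elided parameters simply left out, so Fred has only the two groups shown above. The object FredDemo and the helper withD are hypothetical names added for illustration.

```scala
final case class Bar(a: Int = 1, b: Int = 2)
final case class Foo(c: Int = 99, d: Int = 200)

// Each group of related defaults is its own small case class.
final case class Fred(bar: Bar = Bar(), foo: Foo = Foo())

object FredDemo {
  val original: Fred = Fred()

  // Nested copy: rebuild only the group that changes, then the outer object.
  val updated: Fred = original.copy(foo = original.foo.copy(d = 300))

  // A small helper hides the nesting when the same update is needed often.
  def withD(fred: Fred, d: Int): Fred = fred.copy(foo = fred.foo.copy(d = d))
}
```

The original instance is left untouched by copy, and the groups that are not mentioned are carried over unchanged.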

Case Classes with optional fields in Scala

For example, I have this case class:
case class Student (firstName : String, lastName : String)
If I use this case class, is it possible to make supplying data for some of its fields optional? For example, I'd like to do this:
val student = new Student(firstName = "Foo")
Thanks!
If you just want to omit the second parameter without supplying a default value for it, I suggest you use an Option.
case class Student(firstName: String, lastName: Option[String] = None)
Now you might create instances this way:
Student("Foo")
Student("Foo", None) // equal to the one above
Student("Foo", Some("Bar")) // Some(...) is necessary to supply a lastName
To make it usable the way you wanted, I will add an implicit conversion:
import scala.language.implicitConversions
object Student {
implicit def string2Option(s: String): Option[String] = Some(s)
}
Now you can call it in any of these ways:
import Student._
Student("Foo")
Student("Foo", None)
Student("Foo", Some("Bar"))
Student("Foo", "Bar")
You were close:
case class Student (firstName : String = "John", lastName : String = "Doe")
val student = Student(firstName = "Foo")
Another possibility is partially applied function:
case class Student (firstName : String, lastName : String)
val someJohn = Student("John", _: String)
//someJohn: String => Student = <function1>
val johnDoe = someJohn("Doe")
//johnDoe: Student = Student(John,Doe)
And to be complete, you can create some default object and then change some field:
val johnDeere = johnDoe.copy(lastName="Deere")
//johnDeere: Student = Student(John,Deere)
I see two ways this is normally done.
1. default parameters
case class Student (firstName : String, lastName : String = "")
Student("jeypijeypi") // Student(jeypijeypi,)
2. alternative constructors
case class Student (firstName : String, lastName : String)
object Student {
def apply(firstName: String): Student = new Student(firstName,"")
}
Student("jeypijeypi") // Student(jeypijeypi,)
Which one is better depends slightly on the circumstances. The latter gives you more freedom: you can make any parameter(s) optional, or even change their order (not recommended). Default parameters do not strictly have to be at the end of the parameter list, but when they are not, callers must use named arguments for the parameters that follow. You can also combine these two ways.
Note: within the alternative constructors you need new to point the compiler to the actual constructor. Normally new is not used with case classes.
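Combining the two ways might look like the following sketch: a default parameter covers the common case, and a named factory in the companion object covers a differently shaped input. The factory fromFullName and its parsing rule are hypothetical, chosen only to illustrate the combination; a named method is used instead of a second apply so it cannot clash with the default-parameter form.

```scala
final case class Student(firstName: String, lastName: String = "")

object Student {
  // Hypothetical alternative constructor: splits "First Last" on the first run of whitespace.
  def fromFullName(fullName: String): Student =
    fullName.trim.split("\\s+", 2) match {
      case Array(first, last) => Student(first, last)
      case Array(first)       => Student(first) // falls back to the default lastName
    }
}
```

Both Student("Foo") and Student.fromFullName("Foo Bar") are then available to callers.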