Optional boolean parameters in Scala

I've lately been working on a DSL-style library wrapper over Apache POI functionality and have run into a challenge I can't seem to find a good solution for.
One of the goals of the library is to give the user the ability to build a spreadsheet model as a collection of immutable objects, e.g.
val headerStyle = CellStyle(fillPattern = CellFill.Solid, fillForegroundColor = Color.AquaMarine, font = Font(bold = true))
val italicStyle = CellStyle(font = Font(italic = true))
with the following assumptions:
The user can optionally specify any parameter (meaning you can create a CellStyle with no parameters at all, as well as with the full list of explicitly specified parameters);
If a parameter hasn't been specified explicitly by the user, it is considered undefined, and the default environment value (the default for the format we're converting to) will be used.
The 2nd point is important, as I want to convert this data model into multiple formats, e.g. the default font in Excel doesn't have to be the same as the default font in an HTML browser (and if the user doesn't define the font family explicitly, I'd like them to see the data using those defaults).
To deal with these requirements I've used a variation of the null pattern described here: Pattern for optional-parameters in Scala using null, also suggested here: Scala default parameters and null (below is a simplified example).
object ModelObject {
  def apply(modelParam: String = null): ModelObject = ModelObject(
    modelParam = Option(modelParam)
  )
}

case class ModelObject private (modelParam: Option[String])
Since null is used only internally in the companion object and is very localized, I decided to accept the null sacrifice for the sake of the simplicity of the solution. The pattern works well with all reference types.
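For reference, here is roughly how the pattern behaves in the REPL (a sketch, assuming the definitions above):

scala> ModelObject("value")
res0: ModelObject = ModelObject(Some(value))

scala> ModelObject()
res1: ModelObject = ModelObject(None)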
However, null cannot be assigned to Scala's primitive value types. This is an especially big problem with Boolean, for which I effectively consider 3 states (true, false and undefined). Wanting to provide an interface where the user is still able to write bold = true, I decided to reach for the Java wrappers, which accept nulls.
object ModelObject {
  def apply(boolParam: java.lang.Boolean = null): ModelObject = ModelObject(
    boolParam = Option(boolParam).map(_.booleanValue)
  )
}

case class ModelObject private (boolParam: Option[Boolean])
This, however, doesn't feel right, and I've been wondering whether there is a better approach to the problem. I've been thinking about defining a union type (with an additional object denoting the undefined value): How to define "type disjunction" (union types)?, but since the undefined state shouldn't be used explicitly, the parameter type the IDE exposes to the user is going to be very confusing (ideally I'd like it to be Boolean).
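For illustration, a minimal sketch of the union-type idea I mean (the names are made up), which shows why the exposed parameter type gets confusing:

// hypothetical three-state parameter type; Undefined denotes "not specified"
sealed trait BoolParam
case class Defined(value: Boolean) extends BoolParam
case object Undefined extends BoolParam

// the IDE would now expose bold: BoolParam instead of bold: Boolean,
// and the user would have to write:
// CellStyle(font = Font(bold = Defined(true)))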
Is there any better approach to the problem?
Further information:
More DSL API examples: https://github.com/norbert-radyk/spoiwo/blob/master/examples/com/norbitltd/spoiwo/examples/quickguide/SpoiwoExamples.scala
Sample implementation of the full class: https://github.com/norbert-radyk/spoiwo/blob/master/src/main/scala/com/norbitltd/spoiwo/model/CellStyle.scala

You can use a variation of the pattern I described here: How to provide helper methods to build a Map
To sum it up, you can use some helper generic class to represent optional arguments (much like an Option).
abstract sealed class OptArg[+T] {
  def toOption: Option[T]
}

object OptArg {
  implicit def autoWrap[T](value: T): OptArg[T] = SomeArg(value)
  implicit def toOption[T](arg: OptArg[T]): Option[T] = arg.toOption
}

case class SomeArg[+T](value: T) extends OptArg[T] {
  def toOption = Some(value)
}

case object NoArg extends OptArg[Nothing] {
  val toOption = None
}
You can simply use it like this:
scala> case class ModelObject(boolParam: OptArg[Boolean] = NoArg)
defined class ModelObject
scala> ModelObject(true)
res12: ModelObject = ModelObject(SomeArg(true))
scala> ModelObject()
res13: ModelObject = ModelObject(NoArg)
However, as you can see, OptArg now leaks into the ModelObject class itself (boolParam is typed as OptArg[Boolean] instead of Option[Boolean]).
Fixing this (if it is important to you) just requires defining a separate factory, as you have done yourself:
scala> :paste
// Entering paste mode (ctrl-D to finish)
case class ModelObject private (boolParam: Option[Boolean])

object ModelObject {
  def apply(boolParam: OptArg[Boolean] = NoArg): ModelObject = new ModelObject(
    boolParam = boolParam.toOption
  )
}
// Exiting paste mode, now interpreting.
defined class ModelObject
defined module ModelObject
scala> ModelObject(true)
res22: ModelObject = ModelObject(Some(true))
scala> ModelObject()
res23: ModelObject = ModelObject(None)
UPDATE: The advantage of using this pattern over simply defining several overloaded apply methods, as shown by @drexin, is that in the latter case the number of overloads grows very fast with the number of arguments (2^N). If ModelObject had 4 parameters, that would mean 16 overloads to write by hand!
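To make the explosion concrete, here is a sketch of the overload approach for just 2 optional parameters (names are illustrative), which already needs 2^2 = 4 overloads:

case class ModelObject private (a: Option[Boolean], b: Option[String])

object ModelObject {
  def apply(): ModelObject = new ModelObject(None, None)
  def apply(a: Boolean): ModelObject = new ModelObject(Some(a), None)
  def apply(b: String): ModelObject = new ModelObject(None, Some(b))
  def apply(a: Boolean, b: String): ModelObject = new ModelObject(Some(a), Some(b))
}

Note that this only disambiguates because the parameter types differ; two optional parameters of the same type could not be told apart by overloading at all.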


Implicit object works inline but not when it is imported

I am using avro4s to help with avro serialization and deserialization.
I have a case class that includes Timestamps, and I need those Timestamps to be converted to nicely formatted strings before I publish the records to Kafka; the default encoder converts my Timestamps to Longs. I read in the avro4s readme that I needed to write a custom decoder and encoder.
Here is my case class:
case class MembershipRecordEvent(id: String,
                                 userHandle: String,
                                 planId: String,
                                 teamId: Option[String] = None,
                                 note: Option[String] = None,
                                 startDate: Timestamp,
                                 endDate: Option[Timestamp] = None,
                                 eventName: Option[String] = None,
                                 eventDate: Timestamp)
I have written the following encoder:
Test.scala
def test() = {
  implicit object MembershipRecordEventEncoder extends Encoder[MembershipRecordEvent] {
    override def encode(t: MembershipRecordEvent, schema: Schema) = {
      val dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss")
      val record = new GenericData.Record(schema)
      record.put("id", t.id)
      record.put("userHandle", t.userHandle)
      record.put("teamId", t.teamId.orNull)
      record.put("note", t.note.orNull)
      record.put("startDate", dateFormat.format(t.startDate))
      record.put("endDate", if (t.endDate.isDefined) dateFormat.format(t.endDate.get) else null)
      record.put("eventName", t.eventName.orNull)
      record.put("eventDate", dateFormat.format(t.eventDate))
      record
    }
  }
  val recordInAvro2 = Encoder[MembershipRecordEvent].encode(testRecord, AvroSchema[MembershipRecordEvent]).asInstanceOf[GenericRecord]
  println(recordInAvro2)
}
If I declare my implicit object inline, as I did above, it creates the GenericRecord I am looking for just fine. I then tried to move the implicit object out to its own file, wrapped in an object, and import Implicits._ to use my custom encoder.
Implicits.scala
object Implicits {
  implicit object MembershipRecordEventEncoder extends Encoder[MembershipRecordEvent] {
    override def encode(t: MembershipRecordEvent, schema: Schema) = {
      val dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss")
      val record = new GenericData.Record(schema)
      record.put("id", t.id)
      record.put("userHandle", t.userHandle)
      record.put("teamId", t.teamId.orNull)
      record.put("note", t.note.orNull)
      record.put("startDate", dateFormat.format(t.startDate))
      record.put("endDate", if (t.endDate.isDefined) dateFormat.format(t.endDate.get) else null)
      record.put("eventName", t.eventName.orNull)
      record.put("eventDate", dateFormat.format(t.eventDate))
      record
    }
  }
}
Test.scala
import Implicits._
val recordInAvro2 = Encoder[MembershipRecordEvent].encode(testRecord, AvroSchema[MembershipRecordEvent]).asInstanceOf[GenericRecord]
println(recordInAvro2)
It fails to use my encoder (it doesn't hit my breakpoints). I have tried a myriad of things to figure out why, to no avail.
How can I correctly import an implicit object?
Is there a simpler solution to encode my case class's Timestamps to Strings without writing an encoder for the entire case class?
TL;DR
As suggested in one of the comments above, you can place it in the companion object.
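For instance, a minimal sketch of that suggestion: the encoder below is the same one as above, just relocated into the case class's companion object, where implicit search finds it via implicit scope without any import.

object MembershipRecordEvent {
  implicit object MembershipRecordEventEncoder extends Encoder[MembershipRecordEvent] {
    override def encode(t: MembershipRecordEvent, schema: Schema) = {
      val dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss")
      val record = new GenericData.Record(schema)
      record.put("id", t.id)
      record.put("userHandle", t.userHandle)
      record.put("teamId", t.teamId.orNull)
      record.put("note", t.note.orNull)
      record.put("startDate", dateFormat.format(t.startDate))
      record.put("endDate", if (t.endDate.isDefined) dateFormat.format(t.endDate.get) else null)
      record.put("eventName", t.eventName.orNull)
      record.put("eventDate", dateFormat.format(t.eventDate))
      record
    }
  }
}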
The longer version:
Probably you have another encoder that is used instead of the one you defined in Implicits.
I'll quote some phrases from WHERE DOES SCALA LOOK FOR IMPLICITS?
When a value of a certain name is required, lexical scope is searched for a value with that name. Similarly, when an implicit value of a certain type is required, lexical scope is searched for a value with that type.
Any such value which can be referenced with its “simple” name, without selecting from another value using dotted syntax, is an eligible implicit value.
There may be more than one such value because they have different names.
In that case, overload resolution is used to pick one of them. The algorithm for overload resolution is the same used to choose the reference for a given name, when more than one term in scope has that name. For example, println is overloaded, and each overload takes a different parameter type. An invocation of println requires selecting the correct overloaded method.
In implicit search, overload resolution chooses a value among more than one that have the same required type. Usually this entails selecting a narrower type or a value defined in a subclass relative to other eligible values.
The rule that the value must be accessible using its simple name means that the normal rules for name binding apply.
In summary, a definition for x shadows a definition in an enclosing scope. But a binding for x can also be introduced by local imports. Imported symbols can’t override definitions of the same name in an enclosing scope. Similarly, wildcard imports can’t override an import of a specific name, and names in the current package that are visible from other source files can’t override imports or local definitions.
These are the normal rules for deciding what x means in a given context, and also determine which value x is accessible by its simple name and is eligible as an implicit.
This means that an implicit in scope can be disabled by shadowing it with a term of the same name.
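A quick sketch of that shadowing effect (a made-up example, not from the quoted post):

object ShadowingDemo {
  implicit val fallback: String = "found it"

  def ok: String = implicitly[String] // compiles: fallback is eligible by its simple name

  def shadowed: String = {
    val fallback = 0 // a non-implicit term with the same name shadows the outer implicit
    // implicitly[String] // would NOT compile here: fallback no longer names the implicit
    "no implicit visible here"
  }
}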
Now I'll state the companion object logic:
Implicit syntax can avoid the import tax, which of course is a “sin tax,” by leveraging “implicit scope”, which depends on the type of the implicit instead of imports in lexical scope.
When an implicit of type T is required, implicit scope includes the companion object of T:
When an F[T] is required, implicit scope includes both the companion of F and the companion of the type argument, e.g., object C for F[C].
In addition, implicit scope includes the companions of the base classes of F and C, including package objects, such as p for p.F.
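To illustrate the companion-object rule with a self-contained example (made up, not from the quoted post):

// the implicit lives in the companion, i.e. in implicit scope,
// so the call site needs no import at all
case class Meters(value: Double)

object Meters {
  implicit val ordering: Ordering[Meters] = Ordering.by(_.value)
}

val sorted = List(Meters(3.0), Meters(1.0), Meters(2.0)).sorted // uses Meters.ordering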

DataSet/DataStream of type class interface

I am just experimenting with the use of Scala type classes within Flink. I have defined the following type class interface:
trait LikeEvent[T] {
  def timestamp(payload: T): Int
}
Now, I want to consider a DataSet of LikeEvent[_] like this:
// existing classes that need to be adapted/normalized (without touching them)
case class Log(ts: Int, severity: Int, message: String)
case class Metric(ts: Int, name: String, value: Double)

// create instances for the raw events
object EventInstance {
  implicit val logEvent = new LikeEvent[Log] {
    def timestamp(log: Log): Int = log.ts
  }
  implicit val metricEvent = new LikeEvent[Metric] {
    def timestamp(metric: Metric): Int = metric.ts
  }
}

// add ops to the raw event classes (regular class)
object EventSyntax {
  implicit class Event[T: LikeEvent](val payload: T) {
    val le = implicitly[LikeEvent[T]]
    def timestamp: Int = le.timestamp(payload)
  }
}
The following app runs just fine:
// set up the execution environment
val env = ExecutionEnvironment.getExecutionEnvironment

// underlying (raw) events
val events: DataSet[Event[_]] = env.fromElements(
  Metric(1586736000, "cpu_usage", 0.2),
  Log(1586736005, 1, "invalid login"),
  Log(1586736010, 1, "invalid login"),
  Log(1586736015, 1, "invalid login"),
  Log(1586736030, 2, "valid login"),
  Metric(1586736060, "cpu_usage", 0.8),
  Log(1586736120, 0, "end of world"),
)

// count events per hour
val eventsPerHour = events
  .map(new GetMinuteEventTuple())
  .groupBy(0).reduceGroup { g =>
    val gl = g.toList
    val (hour, count) = (gl.head._1, gl.size)
    (hour, count)
  }

eventsPerHour.print()
This prints the expected output:
(0,5)
(1,1)
(2,1)
However, if I modify the syntax object like this:
// couldn't make it work with Flink!
// add ops to the raw event classes (case class)
object EventSyntax2 {
  case class Event[T: LikeEvent](payload: T) {
    val le = implicitly[LikeEvent[T]]
    def timestamp: Int = le.timestamp(payload)
  }
  implicit def fromPayload[T: LikeEvent](payload: T): Event[T] = Event(payload)
}
I get the following error:
type mismatch;
 found   : org.apache.flink.api.scala.DataSet[Product with Serializable]
 required: org.apache.flink.api.scala.DataSet[com.salvalcantara.fp.EventSyntax2.Event[_]]
So, guided by the message, I do the following change:
val events: DataSet[Event[_]] = env.fromElements[Event[_]](...)
After that, the error changes to:
could not find implicit value for evidence parameter of type org.apache.flink.api.common.typeinfo.TypeInformation[com.salvalcantara.fp.EventSyntax2.Event[_]]
I cannot understand why EventSyntax2 results in these errors, whereas EventSyntax compiles and runs well. Why is using a case class wrapper in EventSyntax2 more problematic than using a regular class as in EventSyntax?
Anyway, my question is twofold:
How can I solve my problem with EventSyntax2?
What would be the simplest way to achieve my goals? Here, I am just experimenting with the type class pattern for the sake of learning, but definitely a more object-oriented approach (based on subtyping) looks simpler to me. Something like this:
// Define trait
trait Event {
  def timestamp: Int
  def payload: Product with Serializable // Any case class
}

// Metric adapter (similar for Log)
object MetricAdapter {
  implicit class MetricEvent(val payload: Metric) extends Event {
    def timestamp: Int = payload.ts
  }
}
And then simply use val events: DataSet[Event] = env.fromElements(...) in the main.
Note that List of classes implementing a certain typeclass poses a similar question, but it considers a simple Scala List instead of a Flink DataSet (or DataStream). The focus of my question is on using the type class pattern within Flink to somehow consider heterogeneous streams/datasets, and whether it really makes sense or one should just clearly favour a regular trait in this case and inherit from it as outlined above.
BTW, you can find the code here: https://github.com/salvalcantara/flink-events-and-polymorphism.
Short answer: Flink cannot derive TypeInformation in scala for wildcard types
Long answer:
Both of your questions are really asking, what is TypeInformation, how is it used, and how is it derived.
TypeInformation is Flink's internal type system, which it uses to serialize data when it is shuffled across the network and stored in a state backend (when using the DataStream API).
Serialization is a major performance concern in data processing, so Flink contains specialized serializers for common data types and patterns. Out of the box, in its Java stack, it supports all JVM primitives, POJOs, Flink tuples, some common collection types, and Avro. The type of your class is determined using reflection, and if it does not match a known type it falls back to Kryo.
In the Scala API, type information is derived using implicits. All methods on the Scala DataSet and DataStream APIs have their generic parameters annotated for the implicit as a type class:
def map[T: TypeInformation]
This TypeInformation can be provided manually, like any type class, or derived using a macro that is imported from Flink:
import org.apache.flink.api.scala._
This macro decorates the Java type stack with support for Scala tuples, Scala case classes, and some common Scala standard library types. I say decorator because it can and will fall back to the Java stack if your class is not one of those types.
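For completeness, here is a hedged sketch of the "provided manually" route for a type the macro cannot derive; this forces the generic, Kryo-backed path (assuming the EventSyntax2 definitions from the question):

import org.apache.flink.api.common.typeinfo.TypeInformation

// explicitly supply TypeInformation for the wildcard type; Flink treats it
// as a GenericType and serializes it with Kryo
implicit val eventTypeInfo: TypeInformation[EventSyntax2.Event[_]] =
  TypeInformation.of(classOf[EventSyntax2.Event[_]])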
So why does version 1 work?
Because it is an ordinary class that the type stack cannot match, so it resolves it to a generic type and returns a Kryo-based serializer. You can test this from the console and see that it returns a generic type:
scala> implicitly[TypeInformation[EventSyntax.Event[_]]]
res2: org.apache.flink.api.common.typeinfo.TypeInformation[com.salvalcantara.fp.EventSyntax.Event[_]] = GenericType<com.salvalcantara.fp.EventSyntax.Event>
Version 2 does not work because it recognizes the type as a case class and then works to recursively derive TypeInformation instances for each of its members. This is not possible for wildcard types, which are different from Any, and so derivation fails.
In general, you should not use Flink with heterogeneous types because it will not be able to derive efficient serializers for your workload.

Trying to keep types unwrapped when using refined

I am trying to use refined to create smart constructors based on primitives and avoid wrapping, since the same types might be used in large collections. Am I doing this right? It seems to work, but it's a bit boilerplaty:
type ONE_Pred = MatchesRegex[W....
type ONE = String @@ ONE_Pred
type TWO_Pred = OneOf[...
type TWO = String @@ TWO_Pred
and then
case class C(one: ONE, two: TWO)

object C {
  def apply(one: String, two: String): Either[String, C] =
    (
      refineT[ONE_Pred](one),
      refineT[TWO_Pred](two)
    ).mapN(C.apply)
}
Refined has a mechanism for creating companion-like objects that have some utilities already defined:
type ONE = String @@ MatchesRegex["\\d+"]
object ONE extends RefinedTypeOps[ONE, String]
Note that:
You don't need the predicate type to be mentioned separately;
It works for both shapeless tags and refined's own newtypes. The flavor you get is based on the structure of type ONE.
You get:
ONE("literal") as alternative to refineMT/refineMV
ONE.from(string) as alternative to refineT/refineV
ONE.unapply so you can do string match { case ONE(taggedValue) => ... }
ONE.unsafeFrom for when your only option is to throw an exception.
With these "companions", it's possible to write much simpler code with no need to mention any predicate types:
object C {
  def apply(one: String, two: String): Either[String, C] =
    (ONE.from(one), TWO.from(two)).mapN(C.apply)
}
(example in scastie, using 2.13 with native literal types)
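As a small usage note (a sketch assuming the ONE companion above), the generated unapply also makes matching on raw strings pleasant:

"12345" match {
  case ONE(one) => println(s"valid: $one") // one is already typed as ONE
  case other    => println(s"not a ONE: $other")
}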

Missing scodec.Codec[Command] implicit because of class with non-value fields

I'm trying to use discriminators in an existing project, and something is wrong with my classes, I guess.
Consider this scodec example. If I change TurnLeft and its codec to
sealed class TurnLeft(degrees: Int) extends Command {
  def getDegrees: Int = degrees
}

implicit val leftCodec: Codec[TurnLeft] = uint8or16.xmap[TurnLeft](v => new TurnLeft(v), _.getDegrees)
I get
Error:(x, x) could not find Lazy implicit value of type scodec.Codec[Command]
val codec: Codec[Either[UnrecognizedCommand, Command]] = discriminatorFallback(unrecognizedCodec, Codec[Command])
It all works if I make the degrees field a val. I suspect it's something tricky with shapeless. What should I do to make it work?
Sample project that demonstrates the issue is here.
shapeless's Generic is defined for "case-class-like" types. To a first approximation, a case-class-like type is one whose values can be deconstructed to its constructor parameters, which can then be used to reconstruct an equal value, i.e.
case class Foo ...
val foo = Foo(...)
val fooGen = Generic[Foo]
assert(fooGen.from(fooGen.to(foo)) == foo)
Case classes with a single constructor parameter list meet this criterion, whereas classes which don't have public (lazy) vals for their constructor parameters, or a companion with a matching apply/unapply, do not.
The implementation of Generic is fairly permissive, and will treat (lazy) val members which correspond to constructor parameters (by type and order) as being equivalent to accessible constructor arguments, so the closest to your example that we can get would be something like this,
sealed class TurnLeft(degrees: Int) extends Command {
  val getDegrees: Int = degrees
}
scala> Generic[TurnLeft]
res0: shapeless.Generic[TurnLeft]{type Repr = Int :: HNil } = ...
In this case getDegrees is treated as the accessor for the single Int constructor parameter.
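A small sketch exercising the derived instance (assuming the val-based TurnLeft above and import shapeless._), showing the deconstruct/reconstruct round trip that Generic requires:

val gen  = Generic[TurnLeft] // Repr = Int :: HNil
val turn = new TurnLeft(90)
val repr = gen.to(turn)      // 90 :: HNil
val back = gen.from(repr)    // reconstructs an equal TurnLeft
assert(back.getDegrees == 90)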

implicit conversions that add properties to a type, rather than to an instance of a type

I was reading through some older Scala posts to better understand type classes, and I ran across this one that seemed quite useful, but the example seems to have gotten stale.
Can someone help me figure out the correct way to do what Philippe intended? Here is the code:
trait Default[T] { def value: T }

implicit object DefaultInt extends Default[Int] {
  def value = 42
}

implicit def listsHaveDefault[T: Default] = new Default[List[T]] {
  def value = implicitly[Default[T]].value :: Nil
}
default[List[List[Int]]]
When copy/pasted and run in the REPL, I get this:
scala> default[List[List[Int]]]
<console>:18: error: not found: value default
default[List[List[Int]]]
^
This doesn't have anything to do with the Scala version. If you read @Philippe's answer, you will notice that the default method simply isn't defined anywhere. That will not work in any Scala version.
It should look something like this:
def default[T: Default] = implicitly[Default[T]].value
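With that definition in place, the REPL session should look like this (expected output, given the implicits above):

scala> default[Int]
res0: Int = 42

scala> default[List[List[Int]]]
res1: List[List[Int]] = List(List(42))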
Thanks to Jorg for his answer, which, in conjunction with this blog post, helped me figure out what is going on here. Hopefully my additional answer will help others who have been confused by this.
My mental picture of type classes is that they are a means by which the author of a library can imbue that library with the desirable trait of retroactive extensibility.
On the other hand, there is yet another technique for ad hoc polymorphism: implicits with wrapper classes. You would use this latter technique when you are the consumer of a library that has some type which is missing methods you find useful.
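For contrast, a minimal sketch of that wrapper-class technique (a made-up example, not from Philippe's post):

// consumer-side extension: adds a method to String without touching String itself
implicit class RichString(val s: String) extends AnyVal {
  def shout: String = s.toUpperCase + "!"
}

// "hello".shout // evaluates to "HELLO!"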
I am going to try to extend Philippe's example a bit to illustrate my understanding of how type classes could be used by a library designer. I am not that experienced with Scala, so if anyone notices something that is not correct in my understanding, please do correct me! ;^)
// 'Default' defines a trait that represents the ability to manufacture
// defaults for a particular type 'T'.
//
trait Default[T] { def value: T }

// Define a default for Int... this is an 'object implicit' definition.
//
implicit object DefaultInt extends Default[Int] {
  def value = 42
}

// Define a default for Lists that contain instances of a 'T'.
// Interestingly, this is a method, not an object implicit definition. I
// guess that is to enable the 'on the fly' creation of default values for
// lists of any old type... So I guess that's why we have
// a 'method implicit' rather than an 'object implicit' definition.
implicit def listsHaveDefault[T: Default] = new Default[List[T]] {
  def value = implicitly[Default[T]].value :: Nil
}

// Here is the meat of the library... a method to make a message based on
// some arbitrary seed String, where that method is parameterized by 'T',
// a type chosen by the library user. That 'T' can be one of the
// types for which implicits are already defined by the base library
// (T = Int, and List[T]), or a type for which the library user has
// defined an implicit.
//
// So the library is open for extension.
//
def makeMsg[T](msg: String)(implicit something: Default[T]): String = {
  msg + something.value
}
Given the above code, I can create a message for the type List[List[Int]], or for Int:
makeMsg[List[List[Int]]]("moocow-")
makeMsg[Int]("dogbox")
And I get this result:
res0: String = moocow-List(List(42))
res1: String = dogbox42
If I want to override the default implicit value for a given type, I can do so like this:
makeMsg[Int]("moocow-")(something=new Object with Default[Int] { def value = 33344 } )
And I get this result:
res3: String = moocow-33344