Circe Scala - Encode & Decode Map[] and case classes - scala

I'm trying to create an encoder and decoder for a case class I have:
case class Road(id: String, light: RoadLight, names: Map[String, String])
RoadLight is a Java enum:
public enum RoadLight {
    red, yellow, green
}
I have tried semi-automatic encoding and decoding by defining implicit encoders and decoders.
I started with the Map[String, String] type:
implicit val namesDecoder: Decoder[Map[String, String]] = deriveDecoder[Map[String, String]]
implicit val namesEncoder: Encoder[Map[String, String]] = deriveEncoder[Map[String, String]]
But I got an error for both of them!
1:
could not find Lazy implicit value of type io.circe.generic.decoding.DerivedDecoder[A]
2: Error: not enough arguments for method deriveDecoder: (implicit decode: shapeless.Lazy[io.circe.generic.decoding.DerivedDecoder[A]])io.circe.Decoder[A].
Unspecified value parameter decode.
implicit val namesDecoder: Decoder[Map[String, String]] = deriveDecoder
I've done everything by the book but can't understand what's wrong. I'm not even trying to parse the case class, only the map, and even that doesn't work.
Any ideas? Thanks!

The Scaladoc says:
/**
 * Semi-automatic codec derivation.
 *
 * This object provides helpers for creating [[io.circe.Decoder]] and [[io.circe.ObjectEncoder]]
 * instances for case classes, "incomplete" case classes, sealed trait hierarchies, etc.
 */
Map is not a case class or part of a sealed trait hierarchy, so semi-automatic derivation does not apply to it.
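For Map[String, String] specifically, no derivation is needed at all: circe-core already provides map instances via Decoder.decodeMap / Encoder.encodeMap (backed by KeyDecoder/KeyEncoder for the keys). A minimal sketch:

import io.circe.parser.decode
import io.circe.syntax._

// Map[String, String] codecs come from circe-core, no semiauto needed
decode[Map[String, String]]("""{ "en": "Main St" }""") // Right(Map("en" -> "Main St"))
Map("en" -> "Main St").asJson.noSpaces                  // {"en":"Main St"}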
See also:
https://github.com/circe/circe/issues/216
Encode Map[String, MyCaseClass] into Seq[String, String] using circe
Circe and Scala's Enumeration type

circe-generic does not create codecs for Java enums, only for Scala product and sum types. But rolling your own for RoadLight is not hard, and once you have that, you get the map codec for free.
The code below works:
import io.circe.{Decoder, Encoder}

object RoadLightCodecs {
  implicit val decRl: Decoder[RoadLight] = Decoder.decodeString.emap {
    case "red"    => Right(RoadLight.red)
    case "yellow" => Right(RoadLight.yellow)
    case "green"  => Right(RoadLight.green)
    case s        => Left(s"Unrecognised traffic light $s")
  }

  implicit val encRl: Encoder[RoadLight] = Encoder.encodeString.contramap(_.toString)

  implicit val decodeMap: Decoder[Map[String, RoadLight]] = Decoder.decodeMap[String, RoadLight]
  implicit val encodeMap: Encoder[Map[String, RoadLight]] = Encoder.encodeMap[String, RoadLight]
}
So what we have done is write codecs for the basic types and then use them to build the bigger map codec.
Now as far as I am aware, there aren't any libraries that do this automatically for Java enums, although it should theoretically be possible to write one. But using combinators on basic codecs to build up more complex ones works great and scales well.
EDIT: I had a play at auto-deriving Java enum codecs and you can almost do it:
def decodeEnum[E <: Enum[E]](values: Array[E]): Decoder[E] = Decoder.decodeString.emap { str =>
  values.find(_.toString.toLowerCase == str)
    .fold[Either[String, E]](Left(s"Value $str does not map correctly"))(Right(_))
}

def encodeEnum[E <: Enum[E]]: Encoder[E] =
  Encoder.encodeString.contramap(_.toString.toLowerCase)

implicit val roadLightDecoder: Decoder[RoadLight] = decodeEnum[RoadLight](RoadLight.values())
implicit val roadLightEncoder: Encoder[RoadLight] = encodeEnum[RoadLight]
So encodeEnum could be made automatic (you could make the def itself implicit instead of defining the val at the end), but the decoder needs to be given the enum's values (which I see no way of obtaining automatically from the type alone), so you need to pass those when creating the codec.
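With the RoadLight codecs above in scope, the Road case class from the question can then be derived with semiauto as usual (the Map[String, String] field is handled by circe-core out of the box). A minimal sketch, with roadDecoder/roadEncoder as illustrative names:

import io.circe.{Decoder, Encoder}
import io.circe.generic.semiauto.{deriveDecoder, deriveEncoder}
import RoadLightCodecs._ // brings the RoadLight instances into scope

// roadDecoder/roadEncoder are just example names
implicit val roadDecoder: Decoder[Road] = deriveDecoder[Road]
implicit val roadEncoder: Encoder[Road] = deriveEncoder[Road]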

Related

Mixing dependent types and 'concrete' types in Scala 3

I'm fairly new to Scala in general, and Scala 3 in particular, and I'm trying to write some code that deals with transparently encoding + decoding values before they are passed to another library.
Basically, I need to map a set of types like Ints to a counterpart in the underlying library. The code I've written is too verbose to replicate here in full, but here's a minimal example demonstrating the kind of thing, using a higher-kinded Encoder type that encapsulates encoding values into types which depend on the values' original types:
trait Encoder[T] {
  type U
  def encode(v: T): U
}

object Encoder {
  given Encoder[Int] with {
    override type U = String
    override def encode(v: Int): String = v.toString
  }
}

case class Value[T : Encoder](v: T) {
  val encoder: Encoder[T] = summon[Encoder[T]]
}
I also need to be able to write functions that deal with specific types of Value and which have 'concrete' return types. Like this:
def doStuff(v1: Value[Int]): String = {
  v1.encoder.encode(v1.v)
}
However, even though in this case v1.encoder.encode does indeed return a String, I get an error:
-- [E007] Type Mismatch Error: -------------------------------------------------
2 | v1.encoder.encode(v1.v)
| ^^^^^^^^^^^^^^^^^^^^^^^
| Found: v1.encoder.U
| Required: String
What can I do differently to solve this error? Really appreciate any pointers to help a newbie out šŸ™
Answering the question in the comments
Is there any sensible way I tell the compiler that Iā€™m only interested in Values with Encoders that encode to String?
You can force Value to remember its encoder's result type with an extra type argument.
case class Value[T, R](v: T)(
  using val encoder: Encoder[T],
        val eqv: encoder.U =:= R,
)
The encoder is the same as your encoder, just moved to the using list so we can use it in implicit resolution.
eqv is a proof that R (our type parameter) is equivalent to the encoder's U type.
Then doStuff can take a Value[Int, String]
def doStuff(v1: Value[Int, String]): String = {
  v1.eqv(v1.encoder.encode(v1.v))
}
Let's be clear about what's happening here. v1.encoder.encode(v1.v) returns an encoder.U. Scala isn't smart enough to know what that is. However, we also have a proof that encoder.U is equal to String, and that proof can be used to convert an encoder.U to a String. And that's exactly what =:=.apply does.
We have to do this back in the case class because you've already lost the type information by the time we hit doStuff. Only the case class (which instantiates the implicit encoder) knows what the result type is, so we need to expose it there.
If you have other places in your codebase where you don't care about the result type, you can fill in a type parameter R for it, or use a wildcard Value[Int, ?].
I would also suggest giving Match Types a try if we are only talking about Scala 3 here.
import scala.util.Try

type Encoder[T] = T match
  case Int    => String
  case String => Either[Throwable, Int]

case class Value[T](v: T):
  def encode: Encoder[T] = v match
    case u: Int    => u.toString
    case u: String => Try(u.toInt).toEither

object Main extends App:
  val (v1, v2) = (Value(1), Value(2))

  def doStuff(v: Value[Int]): String =
    v.encode

  println(doStuff(v1) + doStuff(v2)) // 12
  println(Value(v1.encode).encode)   // Right(1)

How to normalise a Union Type (T | Option[T])?

I have the following case class:
case class Example[T](
  obj: Option[T] | T = None,
)
This allows me to construct it like Example(myObject) instead of Example(Some(myObject)).
To work with obj I need to normalise it to Option[T]:
lazy val maybeIn = obj match
  case o: Option[T] => o
  case o: T => Some(o)
But this gives a warning: the type test for Option[T] cannot be checked at runtime.
I also tried TypeTest, but I got warnings there too, and the solutions I found look really complicated; see https://stackoverflow.com/a/69608091/2750966
Is there a better way to achieve this pattern in Scala 3?
I don't know about Scala 3, but you could simply do this:
case class Example[T](v: Option[T] = None)

object Example {
  def apply[T](t: T): Example[T] = Example(Some(t))
}
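A quick usage sketch of that overload:

Example(42)        // Example(Some(42)) via the extra apply
Example[Int](None) // Example(None) via the generated case-class apply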
One could also go for an implicit conversion, given the specific use case of the OP:
import scala.language.implicitConversions

case class Optable[Out](value: Option[Out])

object Optable {
  implicit def fromOpt[T](o: Option[T]): Optable[T] = Optable(o)
  implicit def fromValue[T](v: T): Optable[T] = Optable(Some(v))
}

case class SomeOpts(i: Option[Int], s: Option[String])

object SomeOpts {
  def apply(i: Optable[Int], s: Optable[String]): SomeOpts = SomeOpts(i.value, s.value)
}

println(SomeOpts(15, Some("foo")))
We have a specialized Option-like type for this purpose: OptArg (in Scala 2, but it should be easily portable to 3):
import com.avsystem.commons._

def gimmeLotsOfParams(
  intParam: OptArg[Int] = OptArg.Empty,
  strParam: OptArg[String] = OptArg.Empty
): Unit = ???

gimmeLotsOfParams(42)
gimmeLotsOfParams(strParam = "foo")
It relies on an implicit conversion so you have to be a little careful with it, i.e. don't use it as a drop-in replacement for Option.
The implementation of OptArg is simple enough that if you don't want external dependencies then you can probably just copy it into your project or some kind of "commons" library.
EDIT: the following answer is incorrect. As of Scala 3.1, flow analysis is only able to check for nullability. More information is available in the Scala book.
I think that the already given answer is probably better suited for the use case you proposed (exposing an API that can take a simple value and normalize it to an Option).
However, the question in the title is still interesting and I think it makes sense to address it.
What you are observing is a consequence of type parameters being erased at runtime, i.e. they only exist during compilation, while matching happens at runtime, once those have been erased.
However, the Scala compiler is able to perform flow analysis for union types. Intuitively I'd say there's probably a way to make it work in pattern matching (as you did), but you can make it work for sure using an if and isInstanceOf (not as clean, I agree):
case class Example[T](
  obj: Option[T] | T = None
) {
  lazy val maybeIn =
    if (obj.isInstanceOf[Option[_]]) {
      obj
    } else {
      Some(obj)
    }
}
You can play around with this code here on Scastie.
Here is the announcement from 2019 when flow analysis was added to the compiler.

How to derive a decoder semiautomatically for a list of some type with Circe?

I have an implicit class that decodes the server's response into JSON and later into the right case class, to avoid repeating calls to .as and .getOrElse all over the tests:
import io.circe.{Decoder, Json}
import io.circe.parser.decode

implicit class RouteTestResultBody(testResult: RouteTestResult) {
  def body: String = bodyOf(testResult)

  def decodedBody[T](implicit d: Decoder[T]): T =
    decode[Json](body)
      .fold(err => throw new Exception(s"Body is not a valid JSON: $body"), identity)
      .as[T]
      .getOrElse(throw new Exception(s"JSON doesn't have the right shape: $body"))
}
Of course, it relies on us passing a decoder:
import io.circe.generic.semiauto.deriveDecoder

val result: RouteTestResult = ...
result.decodedBody(deriveDecoder[SomeType[AnotherType]])
It works most of the time, but fails when the response is a list:
result.decodedBody(deriveDecoder[List[SomeType]])
// throws "JSON doesn't have the right shape"
How can I semiautomatically derive a decoder for a list with specific types inside?
The terminology here is unfortunately overloaded, in that we use "deriving" in two senses:
Providing an instance for e.g. List[A] given an instance for A.
Providing an instance for a case class or sealed trait hierarchy given instances for all member types.
This problem isn't specific to Circe, or even Scala. In writing about Circe I generally try to avoid referring to the first kind of instance generation as "derivation" at all, and to refer to the second kind as "generic derivation" to emphasize that we're generating instances via a generic representation of the algebraic data type.
The fact that we sometimes use the same word to refer to both kinds of type class instance generation is a problem because they're typically very distinct mechanisms in Scala. In Circe the thing that provides an encoder or decoder instance for List[A] given one for A is a method in the type class companion object. For example, in the object Decoder in circe-core we have a method like this:
implicit def decodeList[A](implicit decodeA: Decoder[A]): Decoder[List[A]] = ...
Because this method definition is in the Decoder companion object, if you ask for an implicit Decoder[List[A]] in a context where you have an implicit Decoder[A], the compiler will find and use decodeList. You don't need any imports or extra definitions. For example:
scala> case class Foo(i: Int)
class Foo
scala> import io.circe.Decoder, io.circe.parser
import io.circe.Decoder
import io.circe.parser
scala> implicit val decodeFoo: Decoder[Foo] = Decoder[Int].map(Foo(_))
val decodeFoo: io.circe.Decoder[Foo] = io.circe.Decoder$$anon$1#6e992c05
scala> parser.decode[List[Foo]]("[1, 2, 3]")
val res0: Either[io.circe.Error,List[Foo]] = Right(List(Foo(1), Foo(2), Foo(3)))
If we desugared the implicit machinery here, it would look like this:
scala> parser.decode[List[Foo]]("[1, 2, 3]")(Decoder.decodeList(decodeFoo))
val res1: Either[io.circe.Error,List[Foo]] = Right(List(Foo(1), Foo(2), Foo(3)))
Note that we could replace the first kind of derivation with the second, and it would still compile:
scala> import io.circe.generic.semiauto.deriveDecoder
import io.circe.generic.semiauto.deriveDecoder
scala> parser.decode[List[Foo]]("[1, 2, 3]")(deriveDecoder[List[Foo]])
val res2: Either[io.circe.Error,List[Foo]] = Left(DecodingFailure(CNil, List()))
This compiles because Scala's List is an algebraic data type that has a generic representation that circe-generic can create an instance for. The decoding fails for this input, though, since this representation doesn't result in the encoding we expect. We can derive the corresponding encoder to see what this encoding looks like:
scala> import io.circe.Encoder, io.circe.generic.semiauto.deriveEncoder
import io.circe.Encoder
import io.circe.generic.semiauto.deriveEncoder
scala> implicit val encodeFoo: Encoder[Foo] = Encoder[Int].contramap(_.i)
val encodeFoo: io.circe.Encoder[Foo] = io.circe.Encoder$$anon$1#2717857a
scala> deriveEncoder[List[Foo]].apply(List(Foo(1), Foo(2)))
val res3: io.circe.Json =
{
  "::" : [
    1,
    2
  ]
}
So we're actually seeing the :: case class for List, which is basically never what we want.
If you need to provide a Decoder[List[Foo]] explicitly, the solution is to use either the Decoder.apply "summoner" method, or to call Decoder.decodeList explicitly:
scala> Decoder[List[Foo]]
val res4: io.circe.Decoder[List[Foo]] = io.circe.Decoder$$anon$44#5d40f590
scala> Decoder.decodeList[Foo]
val res5: io.circe.Decoder[List[Foo]] = io.circe.Decoder$$anon$44#2f936a01
scala> Decoder.decodeList(decodeFoo)
val res6: io.circe.Decoder[List[Foo]] = io.circe.Decoder$$anon$44#7f525e05
These all provide exactly the same instance, and which you should choose is a matter of taste.
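So, back in the question's decodedBody helper, a minimal sketch (with SomeType standing in for the OP's element type and someTypeDecoder as an illustrative name) is to put an instance for the element type in scope and let Decoder.decodeList do the rest:

import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder

// SomeType and result are the names from the question above
implicit val someTypeDecoder: Decoder[SomeType] = deriveDecoder[SomeType]

result.decodedBody[List[SomeType]] // resolved via Decoder.decodeList(someTypeDecoder)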
As a footnote, I've thought about special-casing List in circe-generic so that deriveDecoder[List[X]] doesn't compile, since it's approximately never what you want (but seems like it might be, especially because of the confusing way we talk about instance derivation). I typically don't like the idea of having special cases like that, but I think in this case it might be the right thing to do, since this question comes up a lot.

Polymorphism with Spark / Scala, Datasets and case classes

We are using Spark 2.x with Scala for a system that has 13 different ETL operations. Seven of them are relatively simple, each driven by a single domain class; they differ primarily in this class and in some nuances of how the load is handled.
A simplified version of the load class is as follows. For the purposes of this example, say there are seven pizza toppings being loaded; here's Pepperoni:
object LoadPepperoni {
  def apply(inputFile: Dataset[Row],
            historicalData: Dataset[Pepperoni],
            mergeFun: (Pepperoni, PepperoniRaw) => Pepperoni): Dataset[Pepperoni] = {
    val sparkSession = SparkSession.builder().getOrCreate()
    import sparkSession.implicits._

    val rawData: Dataset[PepperoniRaw] = inputFile.rdd.map { case row: Row =>
      PepperoniRaw(
        weight = row.getAs[String]("weight"),
        cost = row.getAs[String]("cost")
      )
    }.toDS()

    val validatedData: Dataset[PepperoniRaw] = ??? // validate the data
    val dedupedRawData: Dataset[PepperoniRaw] = ??? // deduplicate the data

    val dedupedData: Dataset[Pepperoni] = dedupedRawData.rdd.map { case datum: PepperoniRaw =>
      Pepperoni(value = ???, key1 = ???, key2 = ???)
    }.toDS()

    val joinedData = dedupedData.joinWith(historicalData,
      historicalData.col("key1") === dedupedData.col("key1") &&
        historicalData.col("key2") === dedupedData.col("key2"),
      "right_outer"
    )

    joinedData.map { case (hist, delta) =>
      if ( /* some condition */ ) {
        hist.copy(value = /* some transformation */ )
      }
    }.flatMap(list => list).toDS()
  }
}
In other words, the class performs a series of operations on the data. The operations are mostly the same and always in the same order, but they can vary slightly per topping, as can the mapping from "raw" to "domain" and the merge function.
To do this for seven toppings (i.e. Mushroom, Cheese, etc.), I would rather not simply copy/paste the class and change all of the names, because the structure and logic are common to all loads. Instead I'd rather define a generic "Load" class with generic types, like this:
object Load {
  def apply[R, D](inputFile: Dataset[Row],
                  historicalData: Dataset[D],
                  mergeFun: (D, R) => D): Dataset[D] = {
    val sparkSession = SparkSession.builder().getOrCreate()
    import sparkSession.implicits._

    val rawData: Dataset[R] = inputFile.rdd.map { case row: Row =>
    ...
And for each class-specific operation such as mapping from "raw" to "domain", or merging, have a trait or abstract class that implements the specifics. This would be a typical dependency injection / polymorphism pattern.
But I'm running into a few problems. As of Spark 2.x, encoders are only provided for native types and case classes, and there is no way to generically identify a class as a case class. So the inferred toDS() and other implicit functionality is not available when using generic types.
Also as mentioned in this related question of mine, the case class copy method is not available when using generics either.
I have looked into other design patterns common with Scala and Haskell such as type classes or ad-hoc polymorphism, but the obstacle is the Spark Dataset basically only working on case classes, which can't be abstractly defined.
It seems that this would be a common problem in Spark systems but I'm unable to find a solution. Any help appreciated.
The implicit conversion that enables .toDS is:
implicit def rddToDatasetHolder[T](rdd: RDD[T])(implicit arg0: Encoder[T]): DatasetHolder[T]
(from https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.SQLImplicits)
You are exactly correct that there's no implicit value in scope for the needed Encoders now that you've made your apply method generic, so this conversion can't happen. But you can simply accept them as implicit parameters (Encoder[R] for the intermediate raw Dataset and Encoder[D] for the result):
object Load {
  def apply[R, D](inputFile: Dataset[Row],
                  historicalData: Dataset[D],
                  mergeFun: (D, R) => D)(implicit encR: Encoder[R], encD: Encoder[D]): Dataset[D] = {
    ...
Then, when you call the load with specific types, the compiler should be able to find Encoders for those types. Note that you will have to import sparkSession.implicits._ in the calling context as well.
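For example, a minimal sketch of a call site, assuming PepperoniRaw and Pepperoni are case classes:

val sparkSession = SparkSession.builder().getOrCreate()
import sparkSession.implicits._ // provides Encoders for case classes like PepperoniRaw and Pepperoni

// inputFile, historicalPepperoni and mergePepperoni are placeholders for your actual inputs
val loaded: Dataset[Pepperoni] =
  Load[PepperoniRaw, Pepperoni](inputFile, historicalPepperoni, mergePepperoni)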
Edit: a similar approach would be to enable the implicit newProductEncoder[T <: Product](implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[T]): Encoder[T] to work by bounding your type (apply[R, D <: Product]) and accepting an implicit JavaUniverse.TypeTag[D] as a parameter.

implement conversion parameters function with scala

I'm trying to implement something like a clever parameter-converter function in Scala.
Basically, my program needs to read parameters from a properties file, so obviously they are all strings, and I would like to convert each parameter to a specific type that I pass as a parameter.
This is the implementation I started coding:
def getParam[T](key: String, value: String, paramClass: T): Any = {
  value match {
    paramClass match {
      case i if i == Int => value.trim.toInt
      case b if b == Boolean => value.trim.toBoolean
      case _ => value.trim
    }
  }
  /* Exception handling is missing at the moment */
}
Usage:
val convertedInt = getParam("some.int.property.key", "10", Int)
val convertedBoolean = getParam("some.boolean.property.key", "true", Boolean)
val plainString = getParam("some.string.property.key", "value", String)
Points to note:
For now my program needs just three main types: String, Int and Boolean; if possible I would like to extend this to more types later.
This is not clever, because I need to match explicitly against every possible target type; I would like a more reflection-like approach.
This code doesn't work; it gives me the compile error "object java.lang.String is not a value" when I try to convert (actually no conversion happens, because the property values come in as String).
Can anyone help me? I'm quite a newbie in Scala and maybe I'm missing something.
The Scala approach to the problem you are trying to solve is context bounds. Given a type T you can require an object like ParamMeta[T], which will do all the conversions for you. So you can rewrite your code to something like this:
trait ParamMeta[T] {
  def apply(v: String): T
}

def getParam[T](key: String, value: String)(implicit meta: ParamMeta[T]): T =
  meta(value.trim)

implicit case object IntMeta extends ParamMeta[Int] {
  def apply(v: String): Int = v.toInt
}
// and so on

getParam[Int](/* ... */, "127") // = 127
There is no need to throw exceptions at all! If you supply an unsupported type as the getParam type argument, the code will not even compile. You can rewrite the signature of getParam using the syntactic sugar for context bounds, T: Bound, which requires an implicit value Bound[T]; you will then need to use implicitly[Bound[T]] to access that value (because there is no parameter name for it).
Also, this code does not use reflection at all, because the compiler searches for an implicit value of type ParamMeta[Int], finds it in object IntMeta, and rewrites the call as getParam[Int](..., "127")(IntMeta), so all the required values are resolved at compile time.
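For illustration, a minimal sketch of getParam written with that context-bound sugar (same behaviour as the version above, just without a named implicit parameter):

// replaces the explicit implicit-parameter version shown above
def getParam[T: ParamMeta](key: String, value: String): T =
  implicitly[ParamMeta[T]].apply(value.trim)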
If you feel that writing those case objects is too much boilerplate, and you are sure that you will not need another method in these objects in the future (for example, to convert T back to String), you can simplify the declarations like this:
case class ParamMeta[T](f: String => T) {
  def apply(s: String): T = f(s)
}

implicit val stringMeta: ParamMeta[String] = ParamMeta(identity)
implicit val intMeta: ParamMeta[Int] = ParamMeta(_.toInt)
To avoid importing them every time you use getParam, you can declare these implicits in the companion object of the ParamMeta trait/case class, and Scala will pick them up automatically.
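A minimal sketch of that companion-object placement (the Boolean instance is added here purely as an example of covering another type):

case class ParamMeta[T](f: String => T) {
  def apply(s: String): T = f(s)
}

object ParamMeta {
  // instances in the companion object are found without imports
  implicit val stringMeta: ParamMeta[String] = ParamMeta(identity)
  implicit val intMeta: ParamMeta[Int] = ParamMeta(_.toInt)
  implicit val booleanMeta: ParamMeta[Boolean] = ParamMeta(_.toBoolean) // example extra type
}

getParam[Boolean]("some.boolean.property.key", "true") // = true, with no extra imports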
As for the original match approach, you can pass an implicit ClassTag[T] to your function, so you will be able to match on classes. You do not need to create any ClassTag values yourself, as the compiler will supply them automatically. Here is a simple example of how to do class matching:
import scala.reflect.{ClassTag, classTag}

def test[T: ClassTag] = classTag[T].runtimeClass match {
  case x if x == classOf[Int] => "I'm an int!"
  case x if x == classOf[String] => "I'm a string!"
}

println(test[Int])
println(test[String])
However, this approach is less flexible than the ParamMeta one, and ParamMeta should be preferred.
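For completeness, a hypothetical sketch of the original getParam rewritten with ClassTag; note that it still returns Any, which is part of why the ParamMeta approach is preferable:

import scala.reflect.{ClassTag, classTag}

// hypothetical ClassTag-based variant, for illustration only
def getParamByClass[T: ClassTag](key: String, value: String): Any =
  classTag[T].runtimeClass match {
    case c if c == classOf[Int] => value.trim.toInt
    case c if c == classOf[Boolean] => value.trim.toBoolean
    case _ => value.trim
  }

getParamByClass[Int]("some.int.property.key", "10") // 10, but statically typed as Any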