Creating a `Decoder` for arbitrary JSON

Creating a `Decoder` for arbitrary JSON - scala

I am building a GraphQL endpoint for an API using Finch, Circe and Sangria. The variables that come through in a GraphQL query are basically an arbitrary JSON object (let's assume there's no nesting). So for example, in my test code as Strings, here are two examples:
val variables = List(
"{\n \"foo\": 123\n}",
"{\n \"foo\": \"bar\"\n}"
)
The Sangria API expects a type for these of Map[String, Any].
I've tried a bunch of ways but have so far been unable to write a Decoder for this in Circe. Any help appreciated.

The Sangria API expects a type for these of Map[String, Any]
This is not true. Variables for an execution in sangria can be of an arbitrary type T, the only requirement that you have an instance of InputUnmarshaller[T] type class for it. All marshalling integration libraries provide an instance of InputUnmarshaller for correspondent JSON AST type.
This means that sangria-circe defines InputUnmarshaller[io.circe.Json] and you can import it with import sangria.marshalling.circe._.
Here is a small and self-contained example of how you can use circe Json as a variables:
import io.circe.Json
import sangria.schema._
import sangria.execution._
import sangria.macros._
import sangria.marshalling.circe._
val query =
graphql"""
query ($$foo: Int!, $$bar: Int!) {
add(a: $$foo, b: $$bar)
}
"""
val QueryType = ObjectType("Query", fields[Unit, Unit](
Field("add", IntType,
arguments = Argument("a", IntType) :: Argument("b", IntType) :: Nil,
resolve = c ⇒ c.arg[Int]("a") + c.arg[Int]("b"))))
val schema = Schema(QueryType)
val vars = Json.obj(
"foo" → Json.fromInt(123),
"bar" → Json.fromInt(456))
val result: Future[Json] =
Executor.execute(schema, query, variables = vars)
As you can see in this example, I used io.circe.Json as variables for an execution. The execution would produce following result JSON:
{
"data": {
"add": 579
}
}

Here's a decoder that works.
type GraphQLVariables = Map[String, Any]
val graphQlVariablesDecoder: Decoder[GraphQLVariables] = Decoder.instance { c =>
val variablesString = c.downField("variables").focus.flatMap(_.asString)
val parsedVariables = variablesString.flatMap { str =>
val variablesJsonObject = io.circe.jawn.parse(str).toOption.flatMap(_.asObject)
variablesJsonObject.map(j => j.toMap.transform { (_, value: Json) =>
val transformedValue: Any = value.fold(
(),
bool => bool,
number => number.toDouble,
str => str,
array => array.map(_.toString),
obj => obj.toMap.transform((s: String, json: Json) => json.toString)
)
transformedValue
})
}
parsedVariables match {
case None => left(DecodingFailure(s"Unable to decode GraphQL variables", c.history))
case Some(variables) => right(variables)
}
}
We basically parse the JSON, turn it into a JsonObject, then transform the values within the object fairly simplistically.

Although the above answers work for the specific case of Sangria, I'm interested in the original question: What's the best approach in Circe (which generally assumes that all types are known up front) for dealing with arbitrary chunks of Json?
It's fairly common when encoding/decoding Json that 95% of the Json is specified, but the last 5% is some type of "additional properties" chunk which can be any Json object.
Solutions I've played with:
Encode/Decode the free-form chunk as Map[String,Any]. This means you'll have to introduce implicit encoders/decoders for Map[String, Any], which can be done, but is dangerous as that implicit can be pulled into places you didn't intend.
Encode/Decode the free-form chunk as Map[String, Json]. This is the easiest approach and is supported out of the box in Circe. But now the Json serialization logic has leaked out into your API (often you'll want to keep the Json stuff completely wrapped, so you can swap in other non-json formats later).
Encode/Decode to a String, where the string is required to be a valid Json chunk. At least you haven't locked your API into a specific Json library, but it doesn't feel very nice to have to ask your users to create Json chunks in this manual way.
Create a custom trait hierarchy to hold the data (e.g sealed trait Free; FreeInt(i: Int) extends Free; FreeMap(m: Map[String, Free] extends Free; ...). Now you you can create specific encoders/decoders for it. But what you've really done is replicate the Json type hierarchy that already exists in Circe.
I'm leaning more toward option 3. Since it's the most flexible, and will introduce the least dependencies in the API. But none of them are entirely satisfying. Any other ideas?

Related

How to return successfully parsed rows that converted into my case class

I have a file, each row is a json array.
I reading each line of the file, and trying to convert the rows into a json array, and then for each element I am converting to a case class using json spray.
I have this so far:
for (line <- source.getLines().take(10)) {
val jsonArr = line.parseJson.convertTo[JsArray]
for (ele <- jsonArr.elements) {
val tryUser = Try(ele.convertTo[User])
}
}
How could I convert this entire process into a single line statement?
val users: Seq[User] = source.getLines.take(10).map(line => line.parseJson.convertTo[JsonArray].elements.map(ele => Try(ele.convertTo[User])
The error is:
found : Iterator[Nothing]

Note: I used Scala 2.13.6 for all my examples.
There is a lot to unpack in these few lines of code. First of all, I'll share some code that we can use to generate some meaningful input to play around with.
object User {
import scala.util.Random
private def randomId: Int = Random.nextInt(900000) + 100000
private def randomName: String = Iterator
.continually(Random.nextInt(26) + 'a')
.map(_.toChar)
.take(6)
.mkString
def randomJson(): String = s"""{"id":$randomId,"name":"$randomName"}"""
def randomJsonArray(size: Int): String =
Iterator.continually(randomJson()).take(size).mkString("[", ",", "]")
}
final case class User(id: Int, name: String)
import scala.util.{Try, Success, Failure}
import spray.json._
import DefaultJsonProtocol._
implicit val UserFormat = jsonFormat2(User.apply)
This is just some scaffolding to define some User domain object and come up with a way to generate a JSON representation of an array of such objects so that we can then use a JSON library (spray-json in this case) to parse it back into what we want.
Now, going back to your question. This is a possible way to massage your data into its parsed representation. It may not fit 100% what your are trying to do, but there's some nuance in the data types involved and how they work:
val parsedUsers: Iterator[Try[User]] =
for {
line <- Iterator.continually(User.randomJsonArray(4)).take(10)
element <- line.parseJson.convertTo[JsArray].elements
} yield Try(element.convertTo[User])
First difference: notice that I use the for comprehension in a form in which the "outcome" of an iteration is not a side effect (for (something) { do something }) but an actual value for (something) yield { return a value }).
Second difference: I explicitly asked for an Iterator[Try[User]] rather than a Seq[User]. We can go very down into a rabbit hole on the topic of why the types are what they are here, but the simple explanation is that a for ... yield expression:
returns the same type as the one in the first line of the generation -- if you start with a val ns: Iterator[Int]; for (n<- ns) ... you'll get an iterator at the end
if you nest generators, they need to be of the same type as the "outermost" one
You can read more on for comprehensions on the Tour of Scala and the Scala Book.
One possible way of consuming this is the following:
for (user <- parsedUsers) {
user match {
case Success(user) => println(s"parsed object $user")
case Failure(error) => println(s"ERROR: '${error.getMessage}'")
}
As for how to turn this into a "one liner", for comprehensions are syntactic sugar applied by the compiler which turns every nested call into a flatMap and the final one into map, as in the following example (which yields an equivalent result as the for comprehension above and very close to what the compiler does automatically):
val parsedUsers: Iterator[Try[User]] = Iterator
.continually(User.randomJsonArray(4))
.take(10)
.flatMap(line =>
line.parseJson
.convertTo[JsArray]
.elements
.map(element => Try(element.convertTo[User]))
)
One note that I would like to add is that you should be mindful of readability. Some teams prefer for comprehensions, others manually rolling out their own flatMap/map chains. Coders discretion is advised.
You can play around with this code here on Scastie (and here is the version with the flatMap/map calls).

json4s extractor for partial fields (when we care to preserve the original Json)

When parsing a Json String, there are cases where we are interested in parsing just a subset of the fields (the ones we know we need) and delay any semantic parsing/mapping of the rest of the fields.
E.g. we want inspect the key in a key-value Event for (say) load-balancing, but at that point in time we don't have enough info to completely parse the values (we assume that some application further down the road will know what to do with the values):
val json = """{ "key": {"a": true, "b": 123}, "value": [1,2,3,4] }"""
case class Key(a: Boolean, b: Int)
case class Event(key: Key, value: ???) // value can be any valid Json
import org.json4s._
import org.json4s.native.JsonMethods._
implicit val formats = DefaultFormats
val parsedJson = parse(json)
parsedJson.extract[Event]
The question is how to represent the fields that we don't yet know (or care) to parse? What do we add as the type for value?
Note: One solution is to change the event to be case class Event(key: Key). This will work, but it will completely ignore the value, which we ideally want to keep so we can dispatch it properly to another service. So this solution will not work for us.

parse() parses JSON into json4s AST and returns JValue.
AFAIK You can not parse JSON partially because parsing to AST includes JSON validation and for that you need to parse the whole JSON string to AST tree.
But you can partially extract from AST. Here you have two options.
First. Make value field a JValue to defer extracting. You can do it later by calling extract() on this JValue instance.
case class Event(key: Key, value: JValue)
val event = parsedJson.extract[Event]
val value = event.value.extract[Seq[Int]]
Second. Split Event to two classes and extract them separately.
case class EventKey(key: Key)
case class EventValue(value: Seq[Int])
val parsedJson = parse(json)
val eventKey = parsedJson.extract[EventKey]
val eventValue = parsedJson.extract[EventValue]

Decoding a sealed trait in Argonaut based on JSON structure?

Given the following example:
sealed trait Id
case class NewId(prefix: String, id: String) extends Id
case class RevisedId(prefix: String, id: String, rev: String) extends Id
case class User(key: Id, name: String)
val json = """
{
"key": {
"prefix": "user",
"id": "Rt01",
"rev": "0-1"
},
"name": "Bob Boberson"
}
"""
implicit val CodecUser: CodecJson[User] = casecodec2(User.apply, User.unapply)("key", "name")
implicit val CodecId: CodecJson[Id] = ???
json.decodeOption[User]
I need to write a CodecJson for Id that will decode an object when it has the proper structure.
Adding a discriminator field of some sort is a common suggestion for this, but I don't want to change the JSON I'm already producing/consuming with spray-json and json4s.
In those libraries your encoders/decoders are basically just PartialFunction[JValue, A] and PartialFunction[A, JValue]. If your value isn't defined in the domain it's a failure. It's a really simple, elegant solution I think. In addition to that you've got Extractors for the JSON types, so it's easy to match an object on fields/structure.
Rapture goes a step further, making field order unimportant and ignoring the presence of non-matching fields, so you could just do something like:
case json"""{ "prefix": $prefix, "id": $id, "rev": $rev }""" =>
RevisedId(prefix, id, rev)
That's really simple/powerful.
I'm having trouble figuring out how to do something similar with argonaut. This is the best I've come up with so far:
val CodecNewId = casecodec2(NewId.apply, NewId.unapply)("prefix", "id")
val CodecRevisedId = casecodec3(RevisedId.apply, RevisedId.unapply)("prefix", "id", "rev")
implicit val CodecId: CodecJson[Id] =
CodecJson.derived[Id](
EncodeJson {
case id: NewId => CodecNewId(id)
case id: IdWithRev => RevisedId(id)
},
DecodeJson[Id](c => {
val q = RevisedId(c).map(a => a: Id)
q.result.fold(_ => CodecNewId(c).map(a => a: Id), _ => q)
})
)
So there's a few problems with that. I have to define extra codecs I don't intend to use. Instead of using the case-class extractors in the EncodeJson for CodecJson[Id], I'm delegating to the other encoders I've defined. Just just doesn't feel very straight-forward or clean for classes that only have 2 or 3 fields.
The code for the DecodeJson section is also pretty messy. Aside from an extra type-cast in the ifEmpty side of the fold it's identical to the code in DecodeJson.|||.
Does anyone have a more idiomatic way to write a basic codec for Sum-types in argonaut that doesn't require a discriminator and can instead match on the structure of the json?

This is the best I've been able to come up with. It doesn't have that fundamental feel of elegance that the partial functions do, but it is a good bit more terse and easier to decipher than my first attempt.
val CodecNewId = casecodec2(NewId.apply, NewId.unapply)("prefix", "id")
val CodecRevisedId = casecodec3(RevisedId.apply, RevisedId.unapply)("prefix", "id", "rev")
implicit val CodecId: CodecJson[Id] = CodecJson(
{
case id: NewId => CodecNewId(id)
case id: RevisedId => CodecRevisedId(id)
},
(CodecRevisedId ||| CodecNewId.map(a => a: Id))(_))
We're still using the "concrete" codecs for each sub-type. But we've gotten away from the CodecJson.derive call, we don't need to wrap our encode function in an EncodeJson, and we can map our DecodeJson function instead of type-casting so we can go back to using ||| instead of copying it's implementation, which makes the code a lot more readable.
This is definitely a usable solution, if not quite what I'd hoped for.

Map[String,Any] to compact json string using json4s

I am currently extracting some metrics from different data sources and storing them in a map of type Map[String,Any] where the key corresponds to the metric name and the value corresponds to the metric value. I need this to be more or less generic, which means that values types can be primitive types or lists of primitive types.
I would like to serialize this map to a JSON-formatted string and for that I am using json4s library. The thing is that it does not seem possible and I don't see a possible solution for that. I would expect something like the following to work out of the box :)
val myMap: Map[String,Any] = ... // extract metrics
val json = myMap.reduceLeft(_ ~ _) // create JSON of metrics
Navigating through source code I've seen json4s provides implicit conversions in order to transform primitive types to JValue's and also to convert Traversable[A]/Map[String,A]/Option[A] to JValue's (under the restriction of being available an implicit conversion from A to JValue, which I understand it actually means A is a primitive type). The ~ operator offers a nice way of constructing JObject's out of JField's, which is just a type alias for (String, JValue).
In this case, map values type is Any, so implicit conversions don't take place and hence the compiler throws the following error:
value ~ is not a member of (String, Any)
[error] val json = r.reduceLeft(_ ~ _)
Is there a solution for what I want to accomplish?

Since you are actually only looking for the JSON string representation of myMap, you can use the Serialization object directly. Here is a small example (if using the native version of json4s change the import to org.json4s.native.Serialization):
EDIT: added formats implicit
import org.json4s.jackson.Serialization
implicit val formats = org.json4s.DefaultFormats
val m: Map[String, Any] = Map(
"name "-> "joe",
"children" -> List(
Map("name" -> "Mary", "age" -> 5),
Map("name" -> "Mazy", "age" -> 3)
)
)
// prints {"name ":"joe","children":[{"name":"Mary","age":5},{"name":"Mazy","age":3}]}
println(Serialization.write(m))

json4s has method for it.
pretty(render(yourMap))

Serializing case class with trait mixin using json4s

I've got a case class Game which I have no trouble serializing/deserializing using json4s.
case class Game(name: String,publisher: String,website: String, gameType: GameType.Value)
In my app I use mapperdao as my ORM. Because Game uses a Surrogate Id I do not have id has part of its constructor.
However, when mapperdao returns an entity from the DB it supplies the id of the persisted object using a trait.
Game with SurrogateIntId
The code for the trait is
trait SurrogateIntId extends DeclaredIds[Int]
{
def id: Int
}
trait DeclaredIds[ID] extends Persisted
trait Persisted
{
#transient
private var mapperDaoVM: ValuesMap = null
#transient
private var mapperDaoDetails: PersistedDetails = null
private[mapperdao] def mapperDaoPersistedDetails = mapperDaoDetails
private[mapperdao] def mapperDaoValuesMap = mapperDaoVM
private[mapperdao] def mapperDaoInit(vm: ValuesMap, details: PersistedDetails) {
mapperDaoVM = vm
mapperDaoDetails = details
}
.....
}
When I try to serialize Game with SurrogateIntId I get empty parenthesis returned, I assume this is because json4s doesn't know how to deal with the attached trait.
I need a way to serialize game with only id added to its properties , and almost as importantly a way to do this for any T with SurrogateIntId as I use these for all of my domain objects.
Can anyone help me out?

So this is an extremely specific solution since the origin of my problem comes from the way mapperDao returns DOs, however it may be helpful for general use since I'm delving into custom serializers in json4s.
The full discussion on this problem can be found on the mapperDao google group.
First, I found that calling copy() on any persisted Entity(returned from mapperDao) returned the clean copy(just case class) of my DO -- which is then serializable by json4s. However I did not want to have to remember to call copy() any time I wanted to serialize a DO or deal with mapping lists, etc. as this would be unwieldy and prone to errors.
So, I created a CustomSerializer that wraps around the returned Entity(case class DO + traits as an object) and gleans the class from generic type with an implicit manifest. Using this approach I then pattern match my domain objects to determine what was passed in and then use Extraction.decompose(myDO.copy()) to serialize and return the clean DO.
// Entity[Int, Persisted, Class[T]] is how my DOs are returned by mapperDao
class EntitySerializer[T: Manifest] extends CustomSerializer[Entity[Int, Persisted, Class[T]]](formats =>(
{PartialFunction.empty} //This PF is for extracting from JSON and not needed
,{
case g: Game => //Each type is one of my DOs
implicit val formats: Formats = DefaultFormats //include primitive formats for serialization
Extraction.decompose(g.copy()) //get plain DO and then serialize with json4s
case u : User =>
implicit val formats: Formats = DefaultFormats + new LinkObjectEntitySerializer //See below for explanation on LinkObject
Extraction.decompose(u.copy())
case t : Team =>
implicit val formats: Formats = DefaultFormats + new LinkObjectEntitySerializer
Extraction.decompose(t.copy())
...
}
The only need for a separate serializer is in the event that you have non-primitives as parameters of a case class being serialized because the serializer can't use itself to serialize. In this case you create a serializer for each basic class(IE one with only primitives) and then include it into the next serializer with objects that depend on those basic classes.
class LinkObjectEntitySerializer[T: Manifest] extends CustomSerializer[Entity[Int, Persisted, Class[T]]](formats =>(
{PartialFunction.empty},{
//Team and User have Set[TeamUser] parameters, need to define this "dependency"
//so it can be included in formats
case tu: TeamUser =>
implicit val formats: Formats = DefaultFormats
("Team" -> //Using custom-built representation of object
("name" -> tu.team.name) ~
("id" -> tu.team.id) ~
("resource" -> "/team/") ~
("isCaptain" -> tu.isCaptain)) ~
("User" ->
("name" -> tu.user.globalHandle) ~
("id" -> tu.user.id) ~
("resource" -> "/user/") ~
("isCaptain" -> tu.isCaptain))
}
))
This solution is hardly satisfying. Eventually I will need to replace mapperDao or json4s(or both) to find a simpler solution. However, for now, it seems to be the fix with the least amount of overhead.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse