Recursively traverse JSON with circe-optics - Scala

I have a JSON document with a complex structure, something like this:
{
  "a": "aa",
  "b": "bb",
  "c": [
    "aaa",
    "bbb"
  ],
  "d": {
    "e": "ee",
    "f": "ff"
  }
}
And I want to uppercase all string values. The documentation says:
root.each.string.modify(_.toUpperCase)
but only the top-level string values are updated.
How can I make circe-optics traverse all string values recursively? The JSON structure is not known in advance.
From the comments:
I expect all string values to be uppercased, not only the top-level ones:
{
  "a": "AA",
  "b": "BB",
  "c": [
    "AAA",
    "BBB"
  ],
  "d": {
    "e": "EE",
    "f": "FF"
  }
}

Here is a partial solution: it is not fully recursive, but it solves the issue for the json in your example. Each additional .each descends exactly one more level, so this approach only reaches a fixed depth:
val level1UpperCase = root.each.string.modify(_.toUpperCase)
val level2UpperCase = root.each.each.string.modify(_.toUpperCase)
val uppered = (level1UpperCase andThen level2UpperCase)(json.right.get)
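If the nesting depth is unknown, a plain recursive rewrite with Json.fold from core circe avoids the fixed-depth limitation. A minimal sketch (the helper name upperStrings is mine; it assumes the same parsed json Either as above):

import io.circe.Json

// Recursively uppercase every string value, descending into arrays and objects.
def upperStrings(j: Json): Json = j.fold(
  Json.Null,                                               // null is unchanged
  b => Json.fromBoolean(b),                                // booleans unchanged
  n => Json.fromJsonNumber(n),                             // numbers unchanged
  s => Json.fromString(s.toUpperCase),                     // uppercase string leaves
  arr => Json.fromValues(arr.map(upperStrings)),           // recurse into arrays
  obj => Json.fromJsonObject(obj.mapValues(upperStrings))  // recurse into object values
)

val upperedAll = upperStrings(json.right.get)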

The following might be a new way to do this. Adding it here for completeness.
import io.circe.Json
import io.circe.parser.parse
import io.circe.optics.JsonOptics._
import monocle.function.Plated

val json = parse(
  """
    |{
    |  "a": "aa",
    |  "b": "bb",
    |  "c": [
    |    "aaa",
    |    {"k": "asdads"}
    |  ],
    |  "d": {
    |    "e": "ee",
    |    "f": "ff"
    |  }
    |}
    |""".stripMargin).right.get
// Plated.transform rewrites every node in the tree, bottom-up, using the
// Plated[Json] instance provided by io.circe.optics.JsonOptics.
val transformed = Plated.transform[Json] { j =>
  j.asString match {
    case Some(s) => Json.fromString(s.toUpperCase) // rewrite string leaves
    case None    => j                              // leave everything else untouched
  }
}(json)
println(transformed.spaces2)
gives
{
  "a" : "AA",
  "b" : "BB",
  "c" : [
    "AAA",
    {
      "k" : "ASDADS"
    }
  ],
  "d" : {
    "e" : "EE",
    "f" : "FF"
  }
}
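The same Plated pattern works for any leaf rewrite, not just strings. For instance, a sketch that doubles every number in a document (a hypothetical input, reusing the imports above):

// Double every numeric value anywhere in the tree; non-numbers pass through.
val doubled = Plated.transform[Json] { j =>
  j.asNumber.flatMap(_.toBigDecimal) match {
    case Some(n) => Json.fromBigDecimal(n * 2)
    case None    => j
  }
}(parse("""{"a": 1, "b": [2, 3]}""").right.get)
// doubled: {"a": 2, "b": [4, 6]}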

Related

Scala play json - update all values with the same key

Let's say I have a JsValue in the form:
{
  "businessDetails" : {
    "name" : "Business",
    "phoneNumber" : "+44 0808 157 0192"
  },
  "employees" : [
    {
      "name" : "Employee 1",
      "phoneNumber" : "07700 900 982"
    },
    {
      "name" : "Employee 2",
      "phoneNumber" : "+44(0)151 999 2458"
    }
  ]
}
I was wondering if there is a way to update every value belonging to a key with a certain name inside a JsValue, regardless of its complexity?
Ideally I'd like to map over every phone number to ensure that a "(0)" is removed if there is one.
I have come across play-json-zipper's updateAll, but I'm getting unresolved dependency issues when adding the library to my sbt project.
Any help either adding the play-json-zipper library or implementing this in plain play-json would be much appreciated.
Thanks!
From what I can see on the play-json-zipper project page, you might have forgotten to add the resolver: resolvers += "mandubian maven bintray" at "http://dl.bintray.com/mandubian/maven"
If that doesn't help and you would like to proceed with a custom implementation: play-json does not provide a folding or traversing API over JsValue out of the box, but it can be implemented recursively as follows:
/**
 * JSON path from the root. Int - index in an array, String - field name.
 */
type JsPath = Seq[Either[Int, String]]
type JsEntry = (JsPath, JsValue)
type JsTraverse = PartialFunction[JsEntry, JsValue]

implicit class JsPathOps(underlying: JsPath) {
  def isEndsWith(field: String): Boolean = underlying.lastOption.contains(Right(field))
  def isEndsWith(index: Int): Boolean = underlying.lastOption.contains(Left(index))
  def /(field: String): JsPath = underlying :+ Right(field)
  def /(index: Int): JsPath = underlying :+ Left(index)
}
implicit class JsValueOps(underlying: JsValue) {
  /**
   * Traverses the underlying json, applying the partially defined function `f`
   * only to scalar values: null, boolean, number or string.
   *
   * @param f partial function from (path, value) to the replacement value
   * @return updated json
   */
  def traverse(f: JsTraverse): JsValue = {
    def traverseRec(prefix: JsPath, value: JsValue): JsValue = {
      val lifted: JsValue => JsValue = value => f.lift(prefix -> value).getOrElse(value)
      value match {
        case JsNull => lifted(JsNull)
        case boolean: JsBoolean => lifted(boolean)
        case number: JsNumber => lifted(number)
        case string: JsString => lifted(string)
        case array: JsArray =>
          val updatedArray = array.value.zipWithIndex.map {
            case (arrayValue, index) => traverseRec(prefix / index, arrayValue)
          }
          JsArray(updatedArray)
        case `object`: JsObject =>
          val updatedFields = `object`.fieldSet.toSeq.map {
            case (field, fieldValue) => field -> traverseRec(prefix / field, fieldValue)
          }
          JsObject(updatedFields)
      }
    }
    traverseRec(Nil, underlying)
  }
}
which can be used as follows:
val json =
s"""
|{
| "businessDetails" : {
| "name" : "Business",
| "phoneNumber" : "+44(0) 0808 157 0192"
| },
| "employees" : [
| {
| "name" : "Employee 1",
| "phoneNumber" : "07700 900 982"
| },
| {
| "name" : "Employee 2",
| "phoneNumber" : "+44(0)151 999 2458"
| }
| ]
|}
|""".stripMargin
val updated = Json.parse(json).traverse {
case (path, JsString(phone)) if path.isEndsWith("phoneNumber") => JsString(phone.replace("(0)", ""))
}
println(Json.prettyPrint(updated))
which produces the desired result:
{
"businessDetails" : {
"name" : "Business",
"phoneNumber" : "+44 0808 157 0192"
},
"employees" : [ {
"name" : "Employee 1",
"phoneNumber" : "07700 900 982"
}, {
"name" : "Employee 2",
"phoneNumber" : "+44151 999 2458"
} ]
}
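Since the partial function receives the full path, matches can be as broad or as precise as you need. For example, a sketch that uppercases every "name" value anywhere in the document, using the same traverse:

val upperNames = Json.parse(json).traverse {
  case (path, JsString(name)) if path.isEndsWith("name") => JsString(name.toUpperCase)
}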
Hope this helps!

spark: how to merge rows to array of jsons

Input:
id1   id2   name  value        epid
"xxx" "yyy" "EAN" "5057723043" "1299"
"xxx" "yyy" "MPN" "EVBD"       "1299"
I want:
{ "id1": "xxx",
"id2": "yyy",
"item_specifics": [
{
"name": "EAN",
"value": "5057723043"
},
{
"name": "MPN",
"value": "EVBD"
},
{
"name": "EPID",
"value": "1299"
}
]
}
I tried the following two solutions from "How to aggregate columns into json array?" and "how to merge rows into column of spark dataframe as valid json to write it in mysql":
pi_df.groupBy(col("id1"), col("id2"))
  //.agg(collect_list(to_json(struct(col("name"), col("value"))).alias("item_specifics"))) // => not working
  .agg(collect_list(struct(col("name"), col("value"))).alias("item_specifics"))
But I got:
{ "name": "EAN", "value": "5057723043", "EPID": "1299", "id1": "xxx", "id2": "yyy" }
How do I fix this? Thanks
For Spark < 2.4
You can create 2 dataframes, one with name and value and other with epic as name and epic value as value and union them together. Then aggregate them as collect_set and create a json. The code should look like this.
// Creating test data (assumes a SparkSession `spark`, with
// import spark.implicits._ and import org.apache.spark.sql.functions._ in scope)
val df = Seq(("xxx", "yyy", "EAN", "5057723043", "1299"), ("xxx", "yyy", "MPN", "EVBD", "1299"))
  .toDF("id1", "id2", "name", "value", "epid")
df.show(false)
+---+---+----+----------+----+
|id1|id2|name|value |epid|
+---+---+----+----------+----+
|xxx|yyy|EAN |5057723043|1299|
|xxx|yyy|MPN |EVBD |1299|
+---+---+----+----------+----+
val df1 = df.withColumn("map", struct(col("name"), col("value")))
  .select("id1", "id2", "map")
val df2 = df.withColumn("map", struct(lit("EPID").as("name"), col("epid").as("value")))
  .select("id1", "id2", "map")
val jsonDF = df1.union(df2).groupBy("id1", "id2")
  .agg(collect_set("map").as("item_specifics"))
  .withColumn("json", to_json(struct("id1", "id2", "item_specifics")))
jsonDF.select("json").show(false)
+---------------------------------------------------------------------------------------------------------------------------------------------+
|json |
+---------------------------------------------------------------------------------------------------------------------------------------------+
|{"id1":"xxx","id2":"yyy","item_specifics":[{"name":"MPN","value":"EVBD"},{"name":"EAN","value":"5057723043"},{"name":"EPID","value":"1299"}]}|
+---------------------------------------------------------------------------------------------------------------------------------------------+
For Spark 2.4+
Spark 2.4 provides an array_union method, which might make this possible without the union of two dataframes. I haven't tried it though.
val jsonDF = df.withColumn("map1", struct(col("name"), col("value")))
  .withColumn("map2", struct(lit("epid").as("name"), col("epid").as("value")))
  .groupBy("id1", "id2")
  .agg(collect_set("map1").as("item_specifics1"),
    collect_set("map2").as("item_specifics2"))
  .withColumn("item_specifics", array_union(col("item_specifics1"), col("item_specifics2")))
  .withColumn("json", to_json(struct("id1", "id2", "item_specifics"))) // serialize the merged column
You're pretty close. I believe you're looking for something like this:
val pi_df2 = pi_df.withColumn("name", lit("EPID")).
  withColumnRenamed("epid", "value").
  select("id1", "id2", "name", "value")

pi_df.select("id1", "id2", "name", "value").
  union(pi_df2).withColumn("item_specific", struct(col("name"), col("value"))).
  groupBy(col("id1"), col("id2")).
  agg(collect_list(col("item_specific")).alias("item_specifics")).
  write.json(...)
The union should bring back epid into item_specifics
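To inspect the result before writing it out, a hedged variant of the same pipeline that renders the aggregate with to_json instead of write.json (assuming org.apache.spark.sql.functions._ is imported):

pi_df.select("id1", "id2", "name", "value").
  union(pi_df2).
  withColumn("item_specific", struct(col("name"), col("value"))).
  groupBy(col("id1"), col("id2")).
  agg(collect_list(col("item_specific")).alias("item_specifics")).
  withColumn("json", to_json(struct(col("id1"), col("id2"), col("item_specifics")))).
  select("json").
  show(false)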
Here is what you need to do:
import scala.util.parsing.json.JSONObject
import scala.collection.mutable.WrappedArray

// Define a udf that assembles the output json as a string
val jsonFun = udf((id1: String, id2: String, item_specifics: WrappedArray[Map[String, String]], epid: String) => {
  // Add epid to the item_specifics entries
  val item_withEPID = item_specifics :+ Map("epid" -> epid)
  // Rewrite each entry as {"name": ..., "value": ...} and join into a json array string
  val item_specificsArray = item_withEPID
    .map(m => Array(Map("name" -> m.keys.toSeq(0), "value" -> m.values.toSeq(0))))
    .map(m => m.map(mi => JSONObject(mi).toString().replace("\\", "")))
    .flatten
    .mkString("[", ",", "]")
  // Add id1 and id2 to the output json
  val m = Map("id1" -> id1, "id2" -> id2, "item_specifics" -> item_specificsArray.toSeq)
  JSONObject(m).toString().replace("\\", "")
})

val pi_df = Seq(("xxx", "yyy", "EAN", "5057723043", "1299"), ("xxx", "yyy", "MPN", "EVBD", "1299"))
  .toDF("id1", "id2", "name", "value", "epid")

// Add epid to the group-by columns, otherwise it is not available after grouping and aggregation
val df = pi_df.groupBy(col("id1"), col("id2"), col("epid"))
  .agg(collect_list(map(col("name"), col("value")) as "map").as("item_specifics"))
  .withColumn("item_specifics", jsonFun($"id1", $"id2", $"item_specifics", $"epid"))
df.show(false)
df.show(false)
scala> df.show(false)
+---+---+----+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|id1|id2|epid|item_specifics |
+---+---+----+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|xxx|yyy|1299|{"id1" : "xxx", "id2" : "yyy", "item_specifics" : [{"name" : "MPN", "value" : "EVBD"},{"name" : "EAN", "value" : "5057723043"},{"name" : "epid", "value" : "1299"}]}|
+---+---+----+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Content of the item_specifics column (the output json):
{
  "id1": "xxx",
  "id2": "yyy",
  "item_specifics": [{
    "name": "MPN",
    "value": "EVBD"
  }, {
    "name": "EAN",
    "value": "5057723043"
  }, {
    "name": "epid",
    "value": "1299"
  }]
}

SCALA: Reading a JSON file with the path provided

I have a scenario where I will be getting different JSON results from multiple APIs, and I need to read a specific value from each response.
For instance, my JSON response is as below, and I need to read the value of lat. I don't want a hard-coded approach: the user should be able to provide the node to read in some other json or txt file:
{
  "name" : "Watership Down",
  "location" : {
    "lat" : 51.235685,
    "long" : -1.309197
  },
  "residents" : [ {
    "name" : "Fiver",
    "age" : 4,
    "role" : null
  }, {
    "name" : "Bigwig",
    "age" : 6,
    "role" : "Owsla"
  } ]
}
You can get a key out of the json using the built-in Scala JSON parser, as below. I'm defining a function to get the lat; you can make it generic per your need, so that you just have to change the function.
import scala.util.parsing.json.JSON
val json =
"""
|{
| "name" : "Watership Down",
| "location" : {
| "lat" : 51.235685,
| "long" : -1.309197
| },
| "residents" : [ {
| "name" : "Fiver",
| "age" : 4,
| "role" : null
| }, {
| "name" : "Bigwig",
| "age" : 6,
| "role" : "Owsla"
| } ]
|}
""".stripMargin
val jsonObject = JSON.parseFull(json).get.asInstanceOf[Map[String, Any]]
val latLambda: Map[String, Any] => Option[Double] = _.get("location")
  .map(_.asInstanceOf[Map[String, Any]]("lat").toString.toDouble)
assert(latLambda(jsonObject) == Some(51.235685))
The expanded version of the function:
val latitudeLambda = new Function[Map[String, Any], Double] {
  override def apply(input: Map[String, Any]): Double = {
    input("location").asInstanceOf[Map[String, Any]]("lat").toString.toDouble
  }
}
Make the function generic so that once you know what key you want from the JSON, just change the function and apply the JSON.
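Since the question asks for a user-supplied path rather than a hard-coded key, here is a minimal sketch of a generic lookup on top of the same parsed Map (the helper lookup and its dot-separated path format are my assumptions; it handles object keys only, not array indices):

// Walk a dot-separated path like "location.lat" through nested Maps.
// Returns None if any segment is missing or not an object.
def lookup(json: Map[String, Any], path: String): Option[Any] =
  path.split('.').foldLeft(Option[Any](json)) {
    case (Some(m: Map[String, Any] @unchecked), key) => m.get(key)
    case _ => None
  }

assert(lookup(jsonObject, "location.lat") == Some(51.235685))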
Hope it helps. But there are nicer APIs out there, like the Play JSON library, where you can simply use:
import play.api.libs.json._
val jsonVal = Json.parse(json)
val lat = (jsonVal \ "location" \ "lat").get
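One caveat on that last line: .get throws if the path is missing, so the more typical extraction is as[Double] or asOpt[Double]:

val lat: Double = (jsonVal \ "location" \ "lat").as[Double]             // 51.235685
val maybeLat: Option[Double] = (jsonVal \ "location" \ "lat").asOpt[Double] // None if absent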

PlayFramework: how to transform each element of a JSON array

Given the following JSON...
{
  "values" : [
    "one",
    "two",
    "three"
  ]
}
... how do I transform it like this in Scala/Play?
{
  "values" : [
    { "elem": "one" },
    { "elem": "two" },
    { "elem": "three" }
  ]
}
It's easy with Play's JSON Transformers:
import play.api.libs.json._ // brings in Json, JsArray, JsResult and the __ path root

val json = Json.parse(
  """{
    |  "somethingOther": 5,
    |  "values" : [
    |    "one",
    |    "two",
    |    "three"
    |  ]
    |}
  """.stripMargin
)

// transform the array of strings to an array of objects
val valuesTransformer = __.read[JsArray].map {
  case JsArray(values) =>
    JsArray(values.map { e => Json.obj("elem" -> e) })
}

// update the "values" field in the original json
val jsonTransformer = (__ \ 'values).json.update(valuesTransformer)

// carry out the transformation
val transformedJson = json.transform(jsonTransformer)
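transform returns a JsResult rather than the JSON itself, so a quick sketch of unpacking it:

// JsSuccess carries the rewritten document; JsError carries the validation errors.
transformedJson match {
  case JsSuccess(updated, _) => println(Json.prettyPrint(updated))
  case JsError(errors)       => println(s"transformation failed: $errors")
}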
You can use Play's JSON APIs:
import play.api.libs.json._

val json = Json parse """
{
  "values" : [
    "one",
    "two",
    "three"
  ]
}
"""

val newArray = json \ "values" match {
  case JsArray(values) => values.map { v => JsObject(Seq("elem" -> v)) }
}

// or Json.stringify if you don't need readability
val str = Json.prettyPrint(JsObject(Seq("values" -> JsArray(newArray))))
Output:
{
  "values" : [ {
    "elem" : "one"
  }, {
    "elem" : "two"
  }, {
    "elem" : "three"
  } ]
}
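Note that the bare match above is non-exhaustive: it throws a MatchError when "values" is missing or is not an array. A hedged safer variant using asOpt:

// Falls back to an empty sequence instead of throwing.
val newArraySafe = (json \ "values").asOpt[JsArray] match {
  case Some(JsArray(values)) => values.map(v => JsObject(Seq("elem" -> v)))
  case None                  => Seq.empty[JsObject]
}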

Creating a json in Play/Scala with same keys but different values

Here's what I want to achieve:
{ "user-list" : {
"user" : [
"username" : "foo"
},
{
"username" : "bar"
}
]
}
}
I'm using the Play framework and Scala.
Thanks!
As previous commenters already pointed out, it is not obvious how to help you, given that your json code is invalid (try JSONLint) and that we don't know where it comes from (a string? (case) classes from a database? literals?) or what you want to do with it.
Valid json code close to yours would be:
{
  "user-list": {
    "user": [
      { "username": "foo" },
      { "username": "bar" }
    ]
  }
}
Depending on how much additional information your structure contains, the following might be sufficient (V1):
{
  "user-list": [
    { "username": "foo" },
    { "username": "bar" }
  ]
}
Or even (V2):
{ "user-list": ["foo", "bar"] }
Following the Play documentation, you should be able to generate V1 with:
import play.api.libs.json.Json
import play.api.libs.json.Json.toJson

val jsonObject = Json.toJson(
  Map(
    "user-list" -> Seq(
      toJson(Map("username" -> toJson("foo"))),
      toJson(Map("username" -> toJson("bar")))
    )
  )
)
and V2 with:
val jsonObject = Json.toJson(
  Map(
    "user-list" -> Seq(toJson("foo"), toJson("bar"))
  )
)
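A more direct alternative sketch using Play's builder syntax, Json.obj and Json.arr, which produces the same V1 shape without intermediate Maps:

import play.api.libs.json.Json

// Nested objects and arrays are built inline; String values are lifted implicitly.
val v1 = Json.obj(
  "user-list" -> Json.arr(
    Json.obj("username" -> "foo"),
    Json.obj("username" -> "bar")
  )
)
println(Json.prettyPrint(v1))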