Scala Play JSON - update all values with the same key

Let's say I have a JsValue in the form:
{
  "businessDetails" : {
    "name" : "Business",
    "phoneNumber" : "+44 0808 157 0192"
  },
  "employees" : [
    {
      "name" : "Employee 1",
      "phoneNumber" : "07700 900 982"
    },
    {
      "name" : "Employee 2",
      "phoneNumber" : "+44(0)151 999 2458"
    }
  ]
}
I was wondering: is there a way to update every value belonging to a key with a certain name inside a JsValue, regardless of its complexity?
Ideally I'd like to map over every phone number to ensure that a (0) is removed if there is one.
I have come across play-json-zipper updateAll but I'm getting unresolved dependency issues when adding the library to my sbt project.
Any help either adding the play-json-zipper library or implementing this in ordinary play-json would be much appreciated.
Thanks!

From what I can see on the play-json-zipper project page, you might have forgotten to add the resolver resolvers += "mandubian maven bintray" at "http://dl.bintray.com/mandubian/maven".
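For reference, the relevant build.sbt entries might look like the sketch below; the resolver is the one from the project page, while the module organization and version are assumptions you should verify against that page:

// build.sbt -- resolver from the play-json-zipper project page;
// the module coordinates and version below are assumed, not verified.
resolvers += "mandubian maven bintray" at "http://dl.bintray.com/mandubian/maven"

libraryDependencies += "com.mandubian" %% "play-json-zipper" % "1.2"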
If that doesn't help and you would like to proceed with a custom implementation: play-json does not provide a folding or traversing API over JsValue out of the box, but it can be implemented recursively as follows:
/**
 * JSON path from the root. Int - index in array, String - field
 */
type JsPath = Seq[Either[Int, String]]
type JsEntry = (JsPath, JsValue)
type JsTraverse = PartialFunction[JsEntry, JsValue]

implicit class JsPathOps(underlying: JsPath) {
  def isEndsWith(field: String): Boolean = underlying.lastOption.contains(Right(field))
  def isEndsWith(index: Int): Boolean = underlying.lastOption.contains(Left(index))
  def /(field: String): JsPath = underlying :+ Right(field)
  def /(index: Int): JsPath = underlying :+ Left(index)
}

implicit class JsValueOps(underlying: JsValue) {

  /**
   * Traverse the underlying json based on the given partially defined function `f`, applied only to scalar values,
   * like null, string or number.
   *
   * @param f function
   * @return updated json
   */
  def traverse(f: JsTraverse): JsValue = {
    def traverseRec(prefix: JsPath, value: JsValue): JsValue = {
      val lifted: JsValue => JsValue = value => f.lift(prefix -> value).getOrElse(value)
      value match {
        case JsNull => lifted(JsNull)
        case boolean: JsBoolean => lifted(boolean)
        case number: JsNumber => lifted(number)
        case string: JsString => lifted(string)
        case array: JsArray =>
          val updatedArray = array.value.zipWithIndex.map {
            case (arrayValue, index) => traverseRec(prefix / index, arrayValue)
          }
          JsArray(updatedArray)
        case `object`: JsObject =>
          val updatedFields = `object`.fieldSet.toSeq.map {
            case (field, fieldValue) => field -> traverseRec(prefix / field, fieldValue)
          }
          JsObject(updatedFields)
      }
    }
    traverseRec(Nil, underlying)
  }
}
which can be used as follows:
val json =
  s"""
     |{
     |  "businessDetails" : {
     |    "name" : "Business",
     |    "phoneNumber" : "+44(0) 0808 157 0192"
     |  },
     |  "employees" : [
     |    {
     |      "name" : "Employee 1",
     |      "phoneNumber" : "07700 900 982"
     |    },
     |    {
     |      "name" : "Employee 2",
     |      "phoneNumber" : "+44(0)151 999 2458"
     |    }
     |  ]
     |}
     |""".stripMargin

val updated = Json.parse(json).traverse {
  case (path, JsString(phone)) if path.isEndsWith("phoneNumber") => JsString(phone.replace("(0)", ""))
}
println(Json.prettyPrint(updated))
which produces the desired result:
{
  "businessDetails" : {
    "name" : "Business",
    "phoneNumber" : "+44 0808 157 0192"
  },
  "employees" : [ {
    "name" : "Employee 1",
    "phoneNumber" : "07700 900 982"
  }, {
    "name" : "Employee 2",
    "phoneNumber" : "+44151 999 2458"
  } ]
}
Hope this helps!

Related

Decoding a nested json using circe

Hi, I am trying to write a decoder for a nested JSON using circe in Scala 3 but can't quite figure out how. The JSON I have looks something like this:
[{
  "id" : "something",
  "clientId" : "something",
  "name" : "something",
  "rootUrl" : "something",
  "baseUrl" : "something",
  "surrogateAuthRequired" : someBoolean,
  "enabled" : someBoolean,
  "alwaysDisplayInConsole" : someBoolean,
  "clientAuthenticatorType" : "client-secret",
  "redirectUris" : [
    "/realms/WISEMD_V2_TEST/account/*"
  ],
  "webOrigins" : [
  ],
  .
  .
  .
  .
  "protocolMappers" : [
    {
      "id" : "some Id",
      "name" : "something",
      "protocol" : "something",
      "protocolMapper" : "something",
      "consentRequired" : someBoolean,
      "config" : {
        "claim.value" : "something",
        "userinfo.token.claim" : "someBoolean",
        "id.token.claim" : "someBoolean",
        "access.token.claim" : "someBoolean",
        "claim.name" : "something",
        "jsonType.label" : "something",
        "access.tokenResponse.claim" : "something"
      }
    },
    {
      "id" : "some Id",
      "name" : "something",
      "protocol" : "something",
      "protocolMapper" : "something",
      "consentRequired" : someBoolean,
      "config" : {
        "claim.value" : "something",
        "userinfo.token.claim" : "someBoolean",
        "id.token.claim" : "someBoolean",
        "access.token.claim" : "someBoolean",
        "claim.name" : "something",
        "jsonType.label" : "something",
        "access.tokenResponse.claim" : "something"
      },
      .
      .
    }
  ]
}]
What I want is for my decoder to produce a list of protocolMappers with name and claim.value, something like List(ProtocolMappers("something", Configs("something")), ProtocolMappers("something", Configs("something"))).
The case class I have consists of just the needed keys and looks something like this:
case class ClientsResponse(
  id: String,
  clientId: String,
  name: String,
  enabled: Boolean,
  alwaysDisplayInConsole: Boolean,
  redirectUris: Seq[String],
  directAccessGrantsEnabled: Boolean,
  publicClient: Boolean,
  access: Access,
  protocolMappers: List[ProtocolMappers]
)

case class ProtocolMappers(
  name: String,
  config: Configs
)

case class Configs(
  claimValue: String
)
And my decoder looks something like this:
given clientsDecoder: Decoder[ClientsResponse] = new Decoder[ClientsResponse] {
  override def apply(x: HCursor) =
    for {
      id <- x.downField("id").as[Option[String]]
      clientId <- x.downField("clientId").as[Option[String]]
      name <- x.downField("name").as[Option[String]]
      enabled <- x.downField("enabled").as[Option[Boolean]]
      alwaysDisplayInConsole <- x
        .downField("alwaysDisplayInConsole")
        .as[Option[Boolean]]
      redirectUris <- x.downField("redirectUris").as[Option[Seq[String]]]
      directAccessGrantsEnabled <- x
        .downField("directAccessGrantsEnabled")
        .as[Option[Boolean]]
      publicClient <- x.downField("publicClient").as[Option[Boolean]]
      access <- x.downField("access").as[Option[Access]]
      protocolMapper <- x.downField("protocolMappers").as[Option[List[ProtocolMappers]]]
    } yield ClientsResponse(
      id.getOrElse(""),
      clientId.getOrElse(""),
      name.getOrElse(""),
      enabled.getOrElse(false),
      alwaysDisplayInConsole.getOrElse(false),
      redirectUris.getOrElse(Seq()),
      directAccessGrantsEnabled.getOrElse(false),
      publicClient.getOrElse(false),
      access.getOrElse(Access(false, false, false)),
      protocolMapper.getOrElse(List(ProtocolMappers("", Configs(""))))
    )
}

given protocolMapperDecoder: Decoder[ProtocolMappers] = new Decoder[ProtocolMappers] {
  override def apply(x: HCursor) =
    for {
      protocolName <- x.downField("protocolMappers").downField("name").as[Option[String]]
      configs <- x.downField("protocolMappers").downField("config").as[Option[Configs]]
      claimValue <- x.downField("protocolMappers").downField("config").downField("claim.value").as[Option[String]]
    } yield ProtocolMappers(protocolName.getOrElse(""), configs.getOrElse(Configs("")))
}

given configsDecoder: Decoder[Configs] = new Decoder[Configs] {
  override def apply(x: HCursor) =
    for {
      claimValue <- x.downField("protocolMappers").downField("config").downField("claim.value").as[Option[String]]
    } yield Configs(claimValue.getOrElse(""))
}
but it just returns empty strings. Can you please help me figure out how to do this?
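One thing worth checking (a sketch, not a verified answer): protocolMapperDecoder and configsDecoder call downField("protocolMappers") again, but the cursor they receive is already positioned at a single element of the protocolMappers array (or at its config object), so those paths resolve to nothing and every Option falls back to the empty-string defaults. Element-relative decoders, assuming the case classes above, might look like this:

import io.circe.{Decoder, HCursor}

// Sketch: read fields relative to the cursor each decoder receives,
// i.e. one element of "protocolMappers" or one "config" object.
given configsDecoder: Decoder[Configs] = new Decoder[Configs] {
  override def apply(c: HCursor) =
    for {
      claimValue <- c.downField("claim.value").as[Option[String]]
    } yield Configs(claimValue.getOrElse(""))
}

given protocolMapperDecoder: Decoder[ProtocolMappers] = new Decoder[ProtocolMappers] {
  override def apply(c: HCursor) =
    for {
      name   <- c.downField("name").as[Option[String]]
      config <- c.downField("config").as[Option[Configs]]
    } yield ProtocolMappers(name.getOrElse(""), config.getOrElse(Configs("")))
}

With these in scope, the outer x.downField("protocolMappers").as[Option[List[ProtocolMappers]]] should populate the list instead of returning the defaults.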

Recursive traverse JSON with circe-optics

I have a json with complex structure. Something like this:
{
  "a":"aa",
  "b":"bb",
  "c":[
    "aaa",
    "bbb"
  ],
  "d":{
    "e":"ee",
    "f":"ff"
  }
}
And I want to uppercase all string values. The documentation says:
root.each.string.modify(_.toUpperCase)
But only root values are updated, as expected.
How to make circe-optics traverse all string values recursively?
JSON structure is unknown in advance.
Here is the example on Scastie.
via comments:
I am expecting all string values uppercased, not only root values:
{
  "a":"AA",
  "b":"BB",
  "c":[
    "AAA",
    "BBB"
  ],
  "d":{
    "e":"EE",
    "f":"FF"
  }
}
Here is a partial solution: it is not fully recursive, but it will solve the issue with the JSON from your example:
val level1UpperCase = root.each.string.modify(s => s.toUpperCase)
val level2UpperCase = root.each.each.string.modify(s => s.toUpperCase)
val uppered = (level1UpperCase andThen level2UpperCase)(json.right.get)
The following might be a new way to do this. Adding it here for completeness.
import io.circe.Json
import io.circe.parser.parse
import io.circe.optics.JsonOptics._
import monocle.function.Plated

val json = parse(
  """
    |{
    |  "a":"aa",
    |  "b":"bb",
    |  "c":[
    |    "aaa",
    |    {"k": "asdads"}
    |  ],
    |  "d":{
    |    "e":"ee",
    |    "f":"ff"
    |  }
    |}
    |""".stripMargin).right.get

val transformed = Plated.transform[Json] { j =>
  j.asString match {
    case Some(s) => Json.fromString(s.toUpperCase)
    case None => j
  }
}(json)
println(transformed.spaces2)
gives
{
  "a" : "AA",
  "b" : "BB",
  "c" : [
    "AAA",
    {
      "k" : "ASDADS"
    }
  ],
  "d" : {
    "e" : "EE",
    "f" : "FF"
  }
}

SCALA: Reading JSON file with the path provided

I have a scenario where I will be getting different JSON results from multiple APIs, and I need to read a specific value from each response.
For instance, my JSON response is as below. Now I need a format, provided by the user, by which I can read the value of lat. I don't want a hard-coded approach for this; the user can provide the node to read in some other JSON or text file:
{
  "name" : "Watership Down",
  "location" : {
    "lat" : 51.235685,
    "long" : -1.309197
  },
  "residents" : [ {
    "name" : "Fiver",
    "age" : 4,
    "role" : null
  }, {
    "name" : "Bigwig",
    "age" : 6,
    "role" : "Owsla"
  } ]
}
You can get a key of the JSON using the Scala JSON parser as below. I'm defining a function to get the lat, which you can make generic as per your need, so that you just need to change the function.
import scala.util.parsing.json.JSON
val json =
  """
    |{
    |  "name" : "Watership Down",
    |  "location" : {
    |    "lat" : 51.235685,
    |    "long" : -1.309197
    |  },
    |  "residents" : [ {
    |    "name" : "Fiver",
    |    "age" : 4,
    |    "role" : null
    |  }, {
    |    "name" : "Bigwig",
    |    "age" : 6,
    |    "role" : "Owsla"
    |  } ]
    |}
  """.stripMargin

val jsonObject = JSON.parseFull(json).get.asInstanceOf[Map[String, Any]]

val latLambda: (Map[String, Any] => Option[Double]) = _.get("location")
  .map(_.asInstanceOf[Map[String, Any]]("lat").toString.toDouble)

assert(latLambda(jsonObject) == Some(51.235685))
The expanded version of the function:

val latitudeLambda = new Function[Map[String, Any], Double] {
  override def apply(input: Map[String, Any]): Double = {
    input("location").asInstanceOf[Map[String, Any]]("lat").toString.toDouble
  }
}
Make the function generic so that once you know which key you want from the JSON, you just change the function and apply it to the JSON; a sketch of such a generic lookup follows.
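A sketch of such a generic lookup over the parsed Map[String, Any], assuming the user supplies a dot-separated path such as "location.lat" (the helper name readPath is made up for illustration):

// Walk the parsed JSON map along a dot-separated path.
def readPath(json: Map[String, Any], path: String): Option[Any] =
  path.split('.').foldLeft(Option[Any](json)) {
    case (Some(map: Map[String, Any] @unchecked), key) => map.get(key)
    case _                                             => None
  }

assert(readPath(jsonObject, "location.lat") == Some(51.235685))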
Hope it helps. But there are nicer APIs out there, like the Play JSON library. You can simply use:
import play.api.libs.json._
val jsonVal = Json.parse(json)
val lat = (jsonVal \ "location" \ "lat").get
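If the node to read is supplied by the user (for example as "location.lat" in a separate config or text file), the lookup can also be built dynamically; a minimal sketch, assuming a dot-separated path string:

// Fold a user-supplied, dot-separated path over Play JSON's \ operator.
def readPath(json: JsValue, path: String): Option[JsValue] =
  path.split('.').foldLeft(Option(json)) {
    case (Some(js), key) => (js \ key).toOption
    case (None, _)       => None
  }

readPath(jsonVal, "location.lat") // Some(JsNumber(51.235685))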

How to convert column to vector type?

I have an RDD in Spark where the objects are based on a case class:
ExampleCaseClass(user: User, stuff: Stuff)
I want to use Spark's ML pipeline, so I convert this to a Spark data frame. As part of the pipeline, I want to transform one of the columns into a column whose entries are vectors. Since I want the length of that vector to vary with the model, it should be built into the pipeline as part of the feature transformation.
So I attempted to define a Transformer as follows:
class MyTransformer extends Transformer {
  val uid = ""
  val num: IntParam = new IntParam(this, "", "")
  def setNum(value: Int): this.type = set(num, value)
  setDefault(num -> 50)

  def transform(df: DataFrame): DataFrame = {
    ...
  }

  def transformSchema(schema: StructType): StructType = {
    val inputFields = schema.fields
    StructType(inputFields :+ StructField("colName", ???, true))
  }

  def copy(extra: ParamMap): Transformer = defaultCopy(extra)
}
How do I specify the DataType of the resulting field (i.e. fill in the ???)? It will be a Vector of some simple class (Boolean, Int, Double, etc). It seems VectorUDT might have worked, but that's private to Spark. Since any RDD can be converted to a DataFrame, any case class can be converted to a custom DataType. However I can't figure out how to manually do this conversion, otherwise I could apply it to some simple case class wrapping the vector.
Furthermore, if I specify a vector type for the column, will VectorAssembler correctly process the vector into separate features when I go to fit the model?
Still new to Spark and especially to the ML Pipeline, so appreciate any advice.
import org.apache.spark.ml.linalg.SQLDataTypes.VectorType

def transformSchema(schema: StructType): StructType = {
  val inputFields = schema.fields
  StructType(inputFields :+ StructField("colName", VectorType, true))
}
In Spark 2.1, VectorType makes VectorUDT publicly available:
package org.apache.spark.ml.linalg

import org.apache.spark.annotation.{DeveloperApi, Since}
import org.apache.spark.sql.types.DataType

/**
 * :: DeveloperApi ::
 * SQL data types for vectors and matrices.
 */
@Since("2.0.0")
@DeveloperApi
object SQLDataTypes {

  /** Data type for [[Vector]]. */
  val VectorType: DataType = new VectorUDT

  /** Data type for [[Matrix]]. */
  val MatrixType: DataType = new MatrixUDT
}
import org.apache.spark.mllib.linalg.{Vector, Vectors}

case class MyVector(vector: Vector)

val vectorDF = Seq(
  MyVector(Vectors.dense(1.0, 3.4, 4.4)),
  MyVector(Vectors.dense(5.5, 6.7))
).toDF
vectorDF.printSchema
root
|-- vector: vector (nullable = true)
println(vectorDF.schema.fields(0).dataType.prettyJson)
{
  "type" : "udt",
  "class" : "org.apache.spark.mllib.linalg.VectorUDT",
  "pyClass" : "pyspark.mllib.linalg.VectorUDT",
  "sqlType" : {
    "type" : "struct",
    "fields" : [ {
      "name" : "type",
      "type" : "byte",
      "nullable" : false,
      "metadata" : { }
    }, {
      "name" : "size",
      "type" : "integer",
      "nullable" : true,
      "metadata" : { }
    }, {
      "name" : "indices",
      "type" : {
        "type" : "array",
        "elementType" : "integer",
        "containsNull" : false
      },
      "nullable" : true,
      "metadata" : { }
    }, {
      "name" : "values",
      "type" : {
        "type" : "array",
        "elementType" : "double",
        "containsNull" : false
      },
      "nullable" : true,
      "metadata" : { }
    } ]
  }
}
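On the second part of the question: as far as I know, VectorAssembler accepts numeric, boolean and vector-typed input columns and concatenates the elements of vector columns into the assembled feature vector, so a column of the new org.apache.spark.ml.linalg vector type should work. A small sketch (the column names here are hypothetical):

import org.apache.spark.ml.feature.VectorAssembler

// Sketch: combine an existing ml-vector column with a numeric column.
// The elements of "vector" are flattened into the output "features" vector.
val assembler = new VectorAssembler()
  .setInputCols(Array("vector", "weight"))
  .setOutputCol("features")

// assembler.transform(df) would append the assembled "features" column.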

Array query in Scala and ReactiveMongo?

I have a MongoDB collection whose documents look like this:
{
  "name" : "fabio",
  "items" : [
    {
      "id" : "1",
      "word" : "xxxx"
    },
    {
      "id" : "2",
      "word" : "yyyy"
    }
  ]
}
Now, given one name and one id, I want to retrieve "name" and the corresponding "word".
I query it like this and it seems to work:
val query = BSONDocument("name" -> name, "items.id" -> id)
But then, how do I access the value of "word"? I can get the name using this reader:
implicit object UserReader extends BSONDocumentReader[User] {
  def read(doc: BSONDocument): User = {
    val name = doc.getAs[String]("name").get
    // how do I retrieve the value of "word"?
    User(id, word)
  }
}
But I am very confused about "word".
Additionally, because I am only interested in two fields, how should I filter the query? The following doesn't seem to work.
val filter = BSONDocument("name" -> 1, "items.$.word" -> 1)
Thanks for your help!
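Not a verified answer, just a sketch of one way this could look, assuming a ReactiveMongo version where getAs is available (as in the reader above) and that User(name, word) is the intended shape: with a positional projection the returned document keeps only the matched element of "items", so "word" can be read from its head, and the projection itself uses "items.$" rather than "items.$.word".

implicit object UserReader extends BSONDocumentReader[User] {
  def read(doc: BSONDocument): User = {
    val name = doc.getAs[String]("name").get
    // With the "items.$" projection only the matched sub-document remains in "items".
    val word = doc.getAs[List[BSONDocument]]("items")
      .flatMap(_.headOption)
      .flatMap(_.getAs[String]("word"))
      .getOrElse("")
    User(name, word)
  }
}

// Positional projection: keep "name" and only the matched "items" element.
val filter = BSONDocument("name" -> 1, "items.$" -> 1)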