How can I access the nested values in a JsValue? (Scala)

I have the following JSON string:
{
  "results" : [
    {
      "address_components" : [
        {
          "long_name" : "1600",
          "short_name" : "1600",
          "types" : [ "street_number" ]
        },
        {
          "long_name" : "Amphitheatre Parkway",
          "short_name" : "Amphitheatre Pkwy",
          "types" : [ "route" ]
        },
        {
          "long_name" : "Mountain View",
          "short_name" : "Mountain View",
          "types" : [ "locality", "political" ]
        }
      ]
    }
  ]
}
I parse it using the following statement
val payload = Json.parse(results)
I then get the following results
payload: play.api.libs.json.JsValue = {"results":[{"address_components":[{"long_name":"1600","short_name":"1600","types":["street_number"]},{"long_name":"Amphitheatre Parkway","short_name":"Amphitheatre Pkwy","types":["route"]},{"long_name":"Mountain View","short_name":"Mountain View","types":["locality","political"]}]}
When I try to run this command
val extract = (payload \ "results.address_components")
I get the following output
play.api.libs.json.JsLookupResult = JsUndefined('results.address_components' is undefined on object
How do I access the "address_components" element?

I'd recommend using case classes, along the lines of GamingFelix's suggestion (an alternative to that is below the line in my answer), but if you're wondering why your example didn't work, here's why:
You didn't traverse the array properly. Given the following JSON:
{
  "foo": [
    {
      "bar": 123,
      "baz": "abc"
    },
    {
      "bar": 456,
      "baz": "def"
    }
  ]
}
you'd get all of the bar values like this: Json.parse(result) \ "foo" \\ "bar" (i.e., use a double backslash, which looks the field up in the current value and all of its descendants). You can chain these together too, like (Json.parse(result) \\ "field1").map(_ \\ "field2"), etc.
So in your example, if you want all of the long_name fields you'd do it like this:
import play.api.libs.json.Json
val json = Json.parse(
"""{"results" : [{"address_components" : [{"long_name" : "1600","short_name" : "1600","types" : [ "street_number" ]},{"long_name" : "Amphitheatre Parkway","short_name" : "Amphitheatre Pkwy","types" : [ "route" ]},{"long_name" : "Mountain View","short_name" : "Mountain View","types" : [ "locality", "political" ]}]},{"address_components" : [{"long_name" : "16400","short_name" : "1600","types" : [ "street_number" ]},{"long_name" : "Amphitheatre 5Parkway","short_name" : "Amphitheatre Pkwy","types" : [ "route" ]},{"long_name" : "Mounta6in View","short_name" : "Mountain View","types" : [ "locality", "political" ]}]}]}"""
)
(json \ "results" \\ "address_components").map(_ \\ "long_name")
This returns an array of arrays, since you're going through multiple arrays to get your value. If you put a .foreach(_.foreach(println)) on the end, you'll get this:
"1600"
"Amphitheatre Parkway"
"Mountain View"
"16400"
"Amphitheatre 5Parkway"
"Mounta6in View"
See it working on Scastie.
If you're curious, GamingFelix's suggestion to almost combine the two approaches isn't really the best: it's easier to follow what's going on if you just convert the entire thing into case classes instead. Also, if you put the formatter for each case class in a companion object, instead of keeping everything in one formatter object, you won't need to explicitly import the formatters every time you need them (Scala will find them automatically).
This is how I'd do it:
import play.api.libs.json.{Json, OFormat}
val json = Json.parse(
"""{"results" : [{"address_components" : [{"long_name" : "1600","short_name" : "1600","types" : [ "street_number" ]},{"long_name" : "Amphitheatre Parkway","short_name" : "Amphitheatre Pkwy","types" : [ "route" ]},{"long_name" : "Mountain View","short_name" : "Mountain View","types" : [ "locality", "political" ]}]},{"address_components" : [{"long_name" : "16400","short_name" : "1600","types" : [ "street_number" ]},{"long_name" : "Amphitheatre 5Parkway","short_name" : "Amphitheatre Pkwy","types" : [ "route" ]},{"long_name" : "Mounta6in View","short_name" : "Mountain View","types" : [ "locality", "political" ]}]}]}"""
)
case class MyClass(results: Seq[Result])
object MyClass {
implicit val format: OFormat[MyClass] = Json.format[MyClass]
}
case class Result(address_components: Seq[AddressComponent])
object Result {
implicit val format: OFormat[Result] = Json.format[Result]
}
case class AddressComponent(long_name: String, short_name: String, types: Seq[String])
object AddressComponent {
implicit val format: OFormat[AddressComponent] = Json.format[AddressComponent]
}
val model = json.as[MyClass] // `.as` is unsafe, use `.asOpt` or `.validate` or something like that in a real scenario
model.results.flatMap(_.address_components.map(_.long_name)) // get values out like you normally would in a class
Here's that working on Scastie.
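Incidentally, the "Scala will find them automatically" part works because implicits defined in a type's companion object are part of that type's implicit scope. Here is a Play-free sketch of the mechanism, using a hypothetical Fmt trait as a stand-in for Play's OFormat:

```scala
// Hypothetical stand-in for a JSON formatter typeclass (not Play's OFormat)
trait Fmt[A] { def describe: String }

// Any function that takes an implicit Fmt[A], like Json-derived readers do
def formatterOf[A](implicit f: Fmt[A]): String = f.describe

case class AddressComponent(long_name: String)

object AddressComponent {
  // Defined in the companion object, so it's in AddressComponent's
  // implicit scope: callers never need to import it explicitly.
  implicit val fmt: Fmt[AddressComponent] =
    new Fmt[AddressComponent] { def describe = "AddressComponent" }
}

formatterOf[AddressComponent] // resolves without any import
```

This is exactly why the companion-object layout above saves you the explicit formatter imports.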

Since you're using the Play API, you can use implicits and case classes.
I usually use this website: https://json2caseclass.cleverapps.io/
to turn the JSON into Scala case classes, and then I create implicits for the case classes.
Here's how I do it:
import play.api.libs.json._
case class Address_components(
long_name: String,
short_name: String,
types: List[String]
)
case class Results(
address_components: List[Address_components]
)
case class RootJsonObject(
results: List[Results]
)
implicit val addressComponentsFormat: Format[Address_components] = Json.format[Address_components]
implicit val resultsFormat: Format[Results] = Json.format[Results]
implicit val rootJsonObjectFormat: Format[RootJsonObject] = Json.format[RootJsonObject]
Then you can easily parse the JSON and access the values:
val test = Json.parse("""{
  "results" : [
    {
      "address_components" : [
        {
          "long_name" : "1600",
          "short_name" : "1600",
          "types" : [ "street_number" ]
        },
        {
          "long_name" : "Amphitheatre Parkway",
          "short_name" : "Amphitheatre Pkwy",
          "types" : [ "route" ]
        },
        {
          "long_name" : "Mountain View",
          "short_name" : "Mountain View",
          "types" : [ "locality", "political" ]
        }
      ]
    }
  ]
}""")
println(test.validate[RootJsonObject])
val mylist = (test \ "results").get.as[List[Results]]
mylist.foreach(println)
I also created a Scalafiddle for you if you want to test it out in the browser:
https://scalafiddle.io/sf/J5dDfFo/6

Related

Join data-frame based on value in list of WrappedArray

I have to join two Spark data-frames in Scala based on a custom function. Both data-frames have the same schema.
Sample Row of data in DF1:
{
  "F1" : "A",
  "F2" : "B",
  "F3" : "C",
  "F4" : [
    {
      "name" : "N1",
      "unit" : "none",
      "count" : 50.0,
      "sf1" : "val_1",
      "sf2" : "val_2"
    },
    {
      "name" : "N2",
      "unit" : "none",
      "count" : 100.0,
      "sf1" : "val_3",
      "sf2" : "val_4"
    }
  ]
}
Sample Row of data in DF2:
{
  "F1" : "A",
  "F2" : "B",
  "F3" : "C",
  "F4" : [
    {
      "name" : "N1",
      "unit" : "none",
      "count" : 80.0,
      "sf1" : "val_5",
      "sf2" : "val_6"
    },
    {
      "name" : "N2",
      "unit" : "none",
      "count" : 90.0,
      "sf1" : "val_7",
      "sf2" : "val_8"
    },
    {
      "name" : "N3",
      "unit" : "none",
      "count" : 99.0,
      "sf1" : "val_9",
      "sf2" : "val_10"
    }
  ]
}
RESULT of Joining these sample rows:
{
  "F1" : "A",
  "F2" : "B",
  "F3" : "C",
  "F4" : [
    {
      "name" : "N1",
      "unit" : "none",
      "count" : 80.0,
      "sf1" : "val_5",
      "sf2" : "val_6"
    },
    {
      "name" : "N2",
      "unit" : "none",
      "count" : 100.0,
      "sf1" : "val_3",
      "sf2" : "val_4"
    },
    {
      "name" : "N3",
      "unit" : "none",
      "count" : 99.0,
      "sf1" : "val_9",
      "sf2" : "val_10"
    }
  ]
}
The result is:
a full outer join based on the values of "F1", "F2" and "F3", plus
a merge of "F4" keeping unique nodes (using name as the id) with the max value of "count"
I am not very familiar with Scala and have been struggling with this for more than a day now. Here is what I have gotten to so far:
val df1 = sqlContext.read.parquet("stack_a.parquet")
val df2 = sqlContext.read.parquet("stack_b.parquet")
val df4 = df1.toDF(df1.columns.map(_ + "_A"):_*)
val df5 = df2.toDF(df1.columns.map(_ + "_B"):_*)
val df6 = df4.join(df5, df4("F1_A") === df5("F1_B") && df4("F2_A") === df5("F2_B") && df4("F3_A") === df5("F3_B"), "outer")
def joinFunction(r:Row) = {
//Need the real-deal here!
//print(r(3)) //-->Any = WrappedArray([..])
//also considering parsing as json to do the processing but not sure about the performance impact
//val parsed = JSON.parseFull(r.json) //then play with parsed
r.toSeq
}
val finalResult = df6.rdd.map(joinFunction)
finalResult.collect
I was planning to add the custom merge logic in joinFunction but I am struggling to convert the WrappedArray/Any class to something I can work with.
Any inputs on how to do the conversion or the join in a better way will be very helpful.
Thanks!
Edit (7 Mar, 2021)
The full-outer join actually has to be performed only on "F1".
Hence, using @werner's answer, I am doing:
val df1_a = df1.toDF(df1.columns.map(_ + "_A"):_*)
val df2_b = df2.toDF(df2.columns.map(_ + "_B"):_*)
val finalResult = df1_a.join(df2_b, df1_a("F1_A") === df2_b("F1_B"), "full_outer")
.drop("F1_B")
.withColumn("F4", joinFunction(col("F4_A"), col("F4_B")))
.drop("F4_A", "F4_B")
.withColumn("F2", when(col("F2_A").isNull, col("F2_B")).otherwise(col("F2_A")))
.drop("F2_A", "F2_B")
.withColumn("F3", when(col("F3_A").isNull, col("F3_B")).otherwise(col("F3_A")))
.drop("F3_A", "F3_B")
But I am getting this error. What am I missing?
You can implement the merge logic with the help of a UDF:
//case class to define the schema of the udf's return value
case class F4(name: String, unit: String, count: Double, sf1: String, sf2: String)
val joinFunction = udf((a: Seq[Row], b: Seq[Row]) =>
(a ++ b).map(r => F4(r.getAs[String]("name"),
r.getAs[String]("unit"),
r.getAs[Double]("count"),
r.getAs[String]("sf1"),
r.getAs[String]("sf2")))
//group the elements from both arrays by name
.groupBy(_.name)
//take the element with the max count from each group
.map { case (_, d) => d.maxBy(_.count) }
.toSeq)
//join the two dataframes
val finalResult = df1.withColumnRenamed("F4", "F4_A").join(
df2.withColumnRenamed("F4", "F4_B"), Seq("F1", "F2", "F3"), "full_outer")
//call the merge function
.withColumn("F4", joinFunction('F4_A, 'F4_B))
//drop the intermediate columns
.drop("F4_A", "F4_B")
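The per-name merge inside the UDF can also be understood on its own, Spark-free. Here is a minimal sketch on plain Scala collections, using a simplified Node case class (a hypothetical stand-in for the F4 struct, with unit omitted for brevity) and the values from the sample rows in the question:

```scala
// Spark-free sketch of the "F4" merge rule: keep unique nodes by name,
// and for duplicates keep the element with the larger "count".
case class Node(name: String, count: Double, sf1: String, sf2: String)

def mergeF4(a: Seq[Node], b: Seq[Node]): Seq[Node] =
  (a ++ b)
    .groupBy(_.name)                                  // unique nodes, keyed by name
    .map { case (_, group) => group.maxBy(_.count) }  // max "count" wins per group
    .toSeq
    .sortBy(_.name)                                   // deterministic order for display

// Values taken from the sample rows in the question:
val df1Row = Seq(Node("N1", 50.0, "val_1", "val_2"), Node("N2", 100.0, "val_3", "val_4"))
val df2Row = Seq(Node("N1", 80.0, "val_5", "val_6"), Node("N2", 90.0, "val_7", "val_8"),
                 Node("N3", 99.0, "val_9", "val_10"))

mergeF4(df1Row, df2Row)
// N1 keeps count 80.0 (from DF2), N2 keeps 100.0 (from DF1), N3 comes through unchanged
```

The udf above does the same thing, only on Spark Row values instead of a case class.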

MongoDB delete specific nested element in object array

I want to remove this
"lightControl" : 75
from this
{
  "_id" : "dfdfwef-fdfd-fd94284o-aafg",
  "name" : "Testing",
  "serialNumber" : "fhidfhd8ghfd",
  "description" : "",
  "point" : {
    "type" : "Point",
    "coordinates" : [
      10.875447277532754,
      20.940549069378634
    ]
  },
  "ancestors" : [ ],
  "metadata" : {
    "measurement" : {
      "high" : "40000.0",
      "medium" : "25000.0"
    },
    "digitalTwin" : { },
    "emails" : [
      ""
    ],
    "lastMeasurementDate" : "2010-03-04T11:32:06.691Z",
    "lightControl" : 75
  },
  "tags" : [ ],
  "createdAt" : ISODate("2019-12-07T15:22:10.988Z"),
  "updatedAt" : ISODate("2020-03-08T15:38:21.736Z"),
  "_class" : "com.test.demo.api.model.Device"
}
All I want is for this specific id, to completely delete the lightControl element from metadata. I have tried $pull, but I am probably missing something. Any ideas? Thank you in advance!
Your lightControl is not in an array, so $pull does not apply here. For a nested property, use $unset with dot notation wrapped in double quotes:
MongoDB shell:
db.getCollection('yourCollectionName').update(
  { "_id" : "dfdfwef-fdfd-fd94284o-aafg" },
  { $unset: { "metadata.lightControl" : "" } }
)
In case you have a list of objects with _id(s) instead of updating directly, assuming the Node.js client for MongoDB is used:
// import { MongoClient, ObjectID } from "mongodb"; // ES6
const { MongoClient, ObjectID } = require("mongodb");
const client = await MongoClient.connect(...); // inside an async function; connection details elided
const coll = client.db("yourDbName").collection("yourCollectionName");
const objs = []; // <-- The array with _id(s)
for (let i = 0; i < objs.length; i++) {
  const id = ObjectID(objs[i]["_id"]);
  await coll.updateOne({ _id: id }, { $unset: { "metadata.lightControl": "" } });
}

Adding elements to JSON array using circe and scala

I have a JSON string as the following:
{
  "cars": {
    "Nissan": [
      {"model":"Sentra", "doors":4},
      {"model":"Maxima", "doors":4},
      {"model":"Skyline", "doors":2}
    ],
    "Ford": [
      {"model":"Taurus", "doors":4},
      {"model":"Escort", "doors":4}
    ]
  }
}
I would like to add a new car brand (in addition to Nissan and Ford) using circe in Scala.
How can I do it?
Thank you in advance.
You can modify JSON using cursors. One of the possible solutions:
import io.circe._, io.circe.parser._
val cars: String = """
{
  "cars": {
    "Nissan": [
      {"model":"Sentra", "doors":4},
      {"model":"Maxima", "doors":4},
      {"model":"Skyline", "doors":2}
    ],
    "Ford": [
      {"model":"Taurus", "doors":4},
      {"model":"Escort", "doors":4}
    ]
  }
}"""
val carsJson = parse(cars).getOrElse(Json.Null)
val teslaJson: Json = parse("""
{
  "Tesla": [
    {"model":"Model X", "doors":5}
  ]
}""").getOrElse(Json.Null)
val carsCursor = carsJson.hcursor
val newJson = carsCursor.downField("cars").withFocus(_.deepMerge(teslaJson)).top
Here we just go down to the cars field, "focus" on it, and pass a function for modifying the JSON value; deepMerge does the merging here.
newJson will look as follows:
Some({
  "cars" : {
    "Tesla" : [
      {
        "model" : "Model X",
        "doors" : 5
      }
    ],
    "Nissan" : [
      {
        "model" : "Sentra",
        "doors" : 4
      },
      {
        "model" : "Maxima",
        "doors" : 4
      },
      {
        "model" : "Skyline",
        "doors" : 2
      }
    ],
    "Ford" : [
      {
        "model" : "Taurus",
        "doors" : 4
      },
      {
        "model" : "Escort",
        "doors" : 4
      }
    ]
  }
})
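If you're curious what deepMerge does under the hood, here is a simplified model of recursive object merging on plain Scala Maps. This illustrates the semantics only; it is not circe's actual implementation:

```scala
// Simplified model of JSON object deep-merging using nested Maps.
// Keys from the right-hand map are added; when both sides hold an
// object (Map), the merge recurses; otherwise the right side wins.
def deepMerge(a: Map[String, Any], b: Map[String, Any]): Map[String, Any] =
  b.foldLeft(a) { case (acc, (key, rhs)) =>
    (acc.get(key), rhs) match {
      case (Some(lhs: Map[String, Any] @unchecked), r: Map[String, Any] @unchecked) =>
        acc.updated(key, deepMerge(lhs, r))
      case _ =>
        acc.updated(key, rhs)
    }
  }

val existing = Map("Nissan" -> List("Sentra", "Maxima", "Skyline"),
                   "Ford"   -> List("Taurus", "Escort"))
val tesla    = Map("Tesla"  -> List("Model X"))

deepMerge(existing, tesla)
// keeps Nissan and Ford, adds Tesla alongside them
```

This is why focusing on the cars object and merging teslaJson into it adds the new brand without disturbing the existing ones.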

Grails-Mongo Check Contains in Domain's List Criteria Query

I'm using Grails 3.3.5 with GORM version 6.1.9.
In my application I've created the domain as follows:
class Camera {
    String cameraId
    List<String> typesInclude
    static constraints = {
        typesInclude nullable: false
    }
}
Now I've added some records to the camera collection:
db.camera.find().pretty()
{
    "_id" : NumberLong(1),
    "version" : NumberLong(0),
    "typesInclude" : [
        "T1"
    ],
    "cameraId" : "cam1"
}
{
    "_id" : NumberLong(2),
    "version" : NumberLong(0),
    "typesInclude" : [
        "T2"
    ],
    "cameraId" : "cam2"
}
{
    "_id" : NumberLong(3),
    "version" : NumberLong(0),
    "typesInclude" : [
        "T2",
        "T3"
    ],
    "cameraId" : "cam3"
}
Now when I try to get a Camera by type (like T2), I'm unable to get results using the following function:
def getCameraListByType(String type) {
    def cameraInstanceList = Camera.createCriteria().list {
        ilike("typesInclude", "%${type}%")
    }
    return cameraInstanceList
}
Any help would be appreciated.
I wouldn't use criteria queries with Mongo, as they barely reflect the document-oriented paradigm.
Use native queries instead, as they are far more powerful:
def getCameraListByType(String type) {
    Camera.collection.find( [ typesInclude: [ $regex: /$type/, $options: 'i' ] ] ).collect { it as Camera }
}

Mongodb Search nested array elements

I have the data below and would like to search aclpermissions where any of the elements (CRT, READ, DLT, UPD) matches an array of inputs.
The following query
db.AMSAppACL.find({"aclpermissions.READ" : {'$in': ['58dc0bea0cd182789fc62fab']}}).pretty();
only searches the READ element. Is there a way to search all the elements without resorting to $or queries and aggregation?
{
  "_id" : ObjectId("5900d6abb9eb284a78f5a350"),
  "_class" : "com.debopam.amsapp.model.AMSAppACL",
  "attrUniqueCode" : "USER",
  "attributeVersion" : 1,
  "aclpermissions" : {
    "CRT" : [
      "58dc0bd70cd182789fc62faa"
    ],
    "READ" : [
      "58dc0bd70cd182789fc62faa",
      "58dc0bea0cd182789fc62fab"
    ],
    "UPD" : [
      "58dc0bd70cd182789fc62faa"
    ],
    "DLT" : [
      "58dc0bd70cd182789fc62faa"
    ]
  },
  "orgHierachyIdentifier" : "14",
  "orgid" : 14,
  "createDate" : ISODate("2017-04-26T17:19:39.026Z"),
  "lastModifiedDate" : ISODate("2017-04-26T17:19:39.026Z"),
  "createdBy" : "appadmin",
  "lastModifiedBy" : "appadmin"
}
You should try updating the aclpermissions part of the schema from dynamic keys to an array of labeled key/value pairs:
"aclpermissions" : [
    { "k" : "CRT",  "v" : [ "58dc0bd70cd182789fc62faa" ] },
    { "k" : "READ", "v" : [ "58dc0bd70cd182789fc62faa", "58dc0bea0cd182789fc62fab" ] },
    ...
]
Now you can update the query from the post to something like:
db.AMSAppACL.find({"aclpermissions.v" : {'$in': ['58dc0bea0cd182789fc62fab']}}).pretty();