How can I read a JSON file into a Map, using Scala. I've been trying to accomplish this but the JSON I am reading is nested JSon and I have not found a way to easily extract the JSON into keys because of that. Scala seems to be wanting to also convert the nested JSON String into an object. Instead, I want the nested JSON as a String "value". I am hoping someone can clarify or give me a hint on how I might do this.
My JSON source might look something like this:
{
"authKey": "34534645645653455454363",
"member": {
"memberId": "whatever",
"firstName": "Jon",
"lastName": "Doe",
"address": {
"line1": "Whatever Rd",
"city": "White Salmon",
"state": "WA",
"zip": "98672"
},
"anotherProp": "wahtever",
}
}
I want to extract this JSON into a Map of 2 keys without drilling into the nested JSON. Is this possible? Once I have the Map, my intention is to add the key-values to my POST request headers, like so:
val sentHeaders = Map("Content-Type" -> "application/javascript",
"Accept" -> "text/html", "authKey" -> extractedValue,
"member" -> theMemberInfoAsStringJson)
http("Custom headers")
.post("myUrl")
.headers(sentHeaders)
Since the question is tagged 'gatling', behind the curtains this lib depends on Jackson/fasterxml for JSON processing, so we can make use of it.
There is no way to retrieve a nested structured part of JSON as String directly, but with very few additional code the result can still be achieved.
So, having the input JSON:
val json = """{
| "authKey": "34534645645653455454363",
| "member": {
| "memberId": "whatever",
| "firstName": "Jon",
| "lastName": "Doe",
| "address": {
| "line1": "Whatever Rd",
| "city": "White Salmon",
| "state": "WA",
| "zip": "98672"
| },
| "anotherProp": "wahtever"
| }
|}""".stripMargin
A Jackson's ObjectMapper can be created and configured for use in Scala:
// import com.fasterxml.jackson.module.scala.DefaultScalaModule
val mapper = new ObjectMapper().registerModule(DefaultScalaModule)
To parse the input json easily, a dedicated case class is useful:
case class SrcJson(authKey: String, member: Any) {
val memberAsString = mapper.writeValueAsString(member)
}
We also include val memberAsString in it, which will contain our target JSON string, obtained through a reverse conversion from initially parsed member which actually is a Map.
Now, to parse the input json:
val parsed = mapper.readValue(json, classOf[SrcJson])
The references parsed.authKey and parsed.memberAsString will contain the researched values.
have a look at the scala play library - it has support for handling JSON. From what you describe, it should be pretty straightforward to read in the JSON and get the string value from any desired node.
Scala Play - JSON
Related
Using Azure Data Factory and a data transformation flow. I have a csv that contains a column with a json object string, below an example including the header:
"Id","Name","Timestamp","Value","Metadata"
"99c9347ab7c34733a4fe0623e1496ffd","data1","2021-03-18 05:53:00.0000000","0","{""unit"":""%""}"
"99c9347ab7c34733a4fe0623e1496ffd","data1","2021-03-19 05:53:00.0000000","4","{""jobName"":""RecipeB""}"
"99c9347ab7c34733a4fe0623e1496ffd","data1","2021-03-16 02:12:30.0000000","state","{""jobEndState"":""negative""}"
"99c9347ab7c34733a4fe0623e1496ffd","data1","2021-03-19 06:33:00.0000000","23","{""unit"":""kg""}"
Want to store the data in a json like this:
{
"id": "99c9347ab7c34733a4fe0623e1496ffd",
"name": "data1",
"values": [
{
"timestamp": "2021-03-18 05:53:00.0000000",
"value": "0",
"metadata": {
"unit": "%"
}
},
{
"timestamp": "2021-03-19 05:53:00.0000000",
"value": "4",
"metadata": {
"jobName": "RecipeB"
}
}
....
]
}
The challenge is that metadata has dynamic content, meaning, that it will be always a json object but the content can vary. Therefore I cannot define a schema. Currently the column "metadata" on the sink schema is defined as object, but whenever I run the transformation I run into an exception:
Conversion from ArrayType(StructType(StructField(timestamp,StringType,false),
StructField(value,StringType,false), StructField(metadata,StringType,false)),true) to ArrayType(StructType(StructField(timestamp,StringType,true),
StructField(value,StringType,true), StructField(metadata,StructType(StructField(,StringType,true)),true)),false) not defined
We can get the output you expected, we need the expression to get the object Metadata.value.
Please ref my steps, here's my source:
Derived column expressions, create a JSON schema to convert the data:
#(id=Id,
name=Name,
values=#(timestamp=Timestamp,
value=Value,
metadata=#(unit=substring(split(Metadata,':')[2], 3, length(split(Metadata,':')[2])-6))))
Sink mapping and output data preview:
The key is that your matadata value is an object and may have different schema and content, may be 'value' or other key. We only can manually build the schema, it doesn't support expression. That's the limit.
We can't achieve that within Data Factory.
HTH.
sample json payload:
'{
"Stub1": "XXXXX",
"Stub2": "XXXXX-3047-4ed3-b73b-83fbcc0c2aa9",
"Code": "CodeX",
"people": [
{
"ID": "XXXXX-6425-EA11-A94A-A08CFDCA6C02"
"customer": {
"Id": 173,
"Account": 275,
"AFile": "tel"
},
"products": [
{
"product": 1,
"type": "A",
"stub1": "XXXXX-42E1-4A13-8190-20C2DE39C0A5",
"Stub2": "XXXXX-FC4F-41AB-92E7-A408E7F4C632",
"stub3": "XXXXX-A2B4-4ADF-96C5-8F3CDCF5821D",
"Stub4": "XXXXX-1948-4B3C-987F-B5EC4D6C2824"
},
{
"product": 2,
"type": "B",
"stub1": "XXXXX-42E1-4A13-8190-20C2DE39C0A5",
"Stub2": "XXXXX-FC4F-41AB-92E7-A408E7F4C632",
"stub3": "XXXXX-A2B4-4ADF-96C5-8F3CDCF5821D",
"Stub4": "XXXXX-1948-4B3C-987F-B5EC4D6C2824"
}
]
}
]
}'
I am working on a POST call. Is there any way to feed multiple json files as a payload in Gatling. I am using body(RawFileBody("file.json")) as json here.
This works fine for a single json file. I want to check response for multiple json files. Is there any way we can parametrize this and get response against multiple json files.
As far as I can see, there's a couple of ways you could do this.
Use a JSON feeder (https://gatling.io/docs/current/session/feeder#json-feeders). This would need your multiple JSON files to be in a single file, with the root element being a JSON array. Essentially you'd put the JSON objects you have inside an array inside a single JSON file
Create a Scala Iterator and have the names of the JSON files you're going to use in it. e.g:
val fileNames = Iterator("file1.json", "file2.json)
// and later, in your scenario
body(RawFileBody(fileNames.next())
Note that this method cannot be used across users, as the iterator will initialize separately for each user. You'd have to use repeat or something similar to send multiple files as a single user.
You could do something similar by maintaining the file names as a list inside Gatling's session variable, but this session would still not be shared between different users you inject into your scenario.
I have some json like below, when I loaded this json some fields is string of json,
How to parse this json using spark scala and look for the key words I am looking for in that json
{"main":"{\"payload\": { \"mode\": [\"Node\"], \"currentSatate\": \"Ready\", \"Previousstate\": \"slow\", \"trigger\": [\"11\", \"12\"], \"AllStates\": [\"Ready\", \"slow\", \"fast\", \"new\"],\"UnusedStates\": [\"slow\", \"new\"],\"Percentage\": \"70\",\"trigger\": [\"11\"]}"}
{"main":"{\"payload\": {\"trigger\": [\"11\", \"22\"],\"mode\": [\"None\"],\"cangeState\": \"Open\"}}"}
{"main":"{\"payload\": { \"trigger\": [\"23\", \"45\"], \"mode\": [\"Edge\"], \"node.postions\": [\"12\", \"23\", \"45\", \"67\"], \"node.names\": [\"aa\", \"bb\", \"cc\", \"dd\"]}}" }
This is how its looking after loading in to data frame
val df = spark.read.json("<pathtojson")
df.show(false)
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|main |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|{"payload": { "mode": ["Node"], "currentSatate": "Ready", "Previousstate": "slow", "trigger": ["11", "12"], "AllStates": ["Ready", "slow", "fast", "new"],"UnusedStates": ["slow", "new"],"Percentage": "70","trigger": ["11"]}|
|{"payload": {"trigger": ["11", "22"],"mode": ["None"],"cangeState": "Open"}} |
|{"payload": { "trigger": ["23", "45"], "mode": ["Edge"], "node.postions": ["12", "23", "45", "67"], "node.names": ["aa", "bb", "cc", "dd"]}} |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Since my json filed is different for all the 3 json strings , is there a way to match define 3 case class and match
I know only matching to one class
val mapper = new ObjectMapper() with ScalaObjectMapper
mapper.registerModule(DefaultScalaModule)
val parsedJson = mapper.readValue[classname](jsonstring)
is there a way to create a multiple matching case class and match to any particular class ?
You are using Spark SQL, the first thing you have to do is to turn it into a dataset, and then use the spark's methods to deal with them. Don't use Json, all over the place (e.g., like in Play). The first task is to turn it into a dataset.
You could turn the serialize a Json into a case class:
val jsonFilePath: String = "/whatever/data.json"
val myDataSet = sparkSession.read.json(jsonFilePath).as[StudentRecord]
Then here you have the dataset for StudentRecord. So, you can now use the spark's groupBy method to get the data of the column you want from the dataset:
myDataSet.groupBy("whateverTable.whateverColumn").max() //could be min(), count(), etc...
Extra Note: Your Json, should "cleaned up" a little. For example, if it is within your program you can use the multi line way of declaring your Json, and then you don't need to use escape character all over the place:
val myJson: String =
"""
{
}
""".stripMargin
If it is in the file, then the Json you wrote is not correct. So first, make sure you have a syntactically correct Json to work on.
I am having difficulties getting multiple datasets out of my database with RestTemplate. I have many routines that extract a single row, with a format like:
IndicatorModel indicatorModel = restTemplate.getForObject(URL + id,
IndicatorModel.class);
and they work fine. However, if I try to extract a set of data, such as:
Map<String, List<S_ServiceCoreTypeModel>> coreTypesMap =
restTemplate.getForObject(URL + id, Map.class);
this returns values in a
Map<String, LinkedHashMap<>>
format. Is there an easy way to return a List<> or Set<> in the desired format?
Fundamentally the issue is that your Java object model does not match the structure of your json document. You are attempting to deserialize a single json element into a java List. Your JSON document looks like:
{
"serviceCoreTypes":[
{
"serviceCoreType":{
"name":"ALL",
"description":"All",
"dateCreated":"2016-06-23 14:46:32.09",
"dateModified":"2016-06-23 14:46:32.09",
"deleted":false,
"id":1
}
},
{
"serviceCoreType":{
"name":"HSI",
"description":"High-speed Internet",
"dateCreated":"2016-06-23 14:47:31.317",
"dateModified":"2016-06-23 14:47:31.317",
"deleted":false,
"id":2
}
}
]
}
But you cannot turn a serviceCoreTypes into a List, you can only turn a Json Array into a List. For instance if you removed the unnecessary wrapper elements from your json and your input document looked like:
[
{
"name": "ALL",
"description": "All",
"dateCreated": "2016-06-23 14:46:32.09",
"dateModified": "2016-06-23 14:46:32.09",
"deleted": false,
"id": 1
},
{
"name": "HSI",
"description": "High-speed Internet",
"dateCreated": "2016-06-23 14:47:31.317",
"dateModified": "2016-06-23 14:47:31.317",
"deleted": false,
"id": 2
}
]
You should be able to then deserialize THAT into a List< S_ServiceCoreTypeModel>. Alternately if you cannot change the json structure, you could create a Java object model that models the json document by creating some wrapper classes. Something like:
class ServiceCoreTypes {
List<ServiceCoreType> serviceCoreTypes;
...
}
class ServiceCoreTypeWrapper {
ServiceCoreType serviceCoreType;
...
}
class ServiceCoreType {
String name;
String description;
...
}
I'm assuming you don't actually mean database, but instead a restful service as you're using RestTemplate
The problem you're facing is that you want to get a Collection back, but the getForObject method can only take in a single type parameter and cannot figure out what the type of the returned collection is.
I'd encourage you to consider using RestTemplate.exchange(...)
which should allow you request for and receive back a collection type.
I have a solution that works, for now at least. I would prefer a solution such as the one proposed by Ben, where I can get the HTTP response body as a list of items in the format I chose, but at least here I can extract each individual item from the JSON node. The code:
S_ServiceCoreTypeModel endModel;
RestTemplate restTemplate = new RestTemplate();
JsonNode node = restTemplate.getForObject(URL, JsonNode.class);
JsonNode allNodes = node.get("serviceCoreTypes");
JsonNode oneNode = allNodes.get(1);
ObjectMapper objectMapper = new ObjectMapper();
endModel = objectMapper.readValue(oneNode.toString(), S_ServiceCoreTypeModel.class);
If anyone has thoughts on how to make Ben's solution work, I would love to hear it.
I am using BigQuery from Scala. I tried the sample Scala code to call Google bigQuery API
Scala:
val queryInfo: QueryRequest =
new QueryRequest().setQuery(s"SELECT * FROM $PROJECT_ID:$dataSetId.$tableId;")
val queryRequest: Bigquery#Jobs#Query =
bigquery.jobs().query(PROJECT_ID, queryInfo)
val queryResponse: QueryResponse =
queryRequest.execute()
Above BQ returns:
{
"jobComplete":true,
"jobReference":{
"jobId":"job_xxx",
"projectId":"xxx"
},
"kind":"bigquery#queryResponse",
"rows":[{"f":[{"v":"1"},{"v":"1364206559422"}]}],
"schema": {
"fields":[
{"mode":"NULLABLE","name":"id","type":"STRING"},
{"mode":"NULLABLE","name":"timestamp","type":"INTEGER"}
]
},
"totalRows":"1",
"pageToken":"xxxx"
}
Please help me parse the values from above the results in JSON Format or change the query to return the result of the format like this:
{"id": "1", "timestamp": "1364206559422"}
I like lift json.
Look at the lotto example, it's straight forward with case classes