Jolt transformation: remove duplicates from JSON (list elements)

I am looking for suggestions on how to remove duplicates from a JSON string similar to the sample below:
Sample input:
{
"profile": true,
"address": [
"12",
"23"
],
"zipCodes": [
"12345",
"56789",
"12345",
"56789"
],
"phoneNumber": [
"87857",
"927465",
"274894",
"87857"
],
"userName": [
"ABC",
"PQR",
"ABC"
],
"enableEmailNot": "No"
}
Expected Output :
{
"profile": true,
"zipCodes": [
"12345",
"56789"
],
"phoneNumber": [
"87857",
"927465",
"274894"
],
"userName": [
"ABC",
"PQR"
],
"address": [
"12",
"23"
],
"enableEmailNot": "No"
}
Thanks for the help

You can use the following spec:
[
{
// exchange key-value pairs while separating lists and non-lists by prefixing non-lists with "xx"
"operation": "shift",
"spec": {
"profile": "xx.&",
"*": {
"*": {
"$": "&2.@(0)"
}
},
"enableEmailNot": "xx.&"
}
},
{
// pick the first components of lists
"operation": "cardinality",
"spec": {
"*": {
"*": "ONE"
}
}
},
{
// reform the lists through use of "$" operator
"operation": "shift",
"spec": {
"xx": {
"*": "&"
},
"*": {
"*": {
"$": "&2[]"
}
}
}
}
]
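As a way to cross-check the expected output outside of Jolt, the same de-duplication can be sketched in plain Python. This is not part of the Jolt spec; the function name `dedupe_lists` is made up for illustration, and the sketch assumes list elements are hashable (strings here).

```python
def dedupe_lists(doc):
    """Return a copy of doc where each list value keeps only first occurrences."""
    result = {}
    for key, value in doc.items():
        if isinstance(value, list):
            # dict.fromkeys preserves insertion order, so the first
            # occurrence of each element survives (elements must be hashable).
            result[key] = list(dict.fromkeys(value))
        else:
            # Non-list values (booleans, strings) are copied unchanged.
            result[key] = value
    return result


sample = {
    "profile": True,
    "address": ["12", "23"],
    "zipCodes": ["12345", "56789", "12345", "56789"],
    "phoneNumber": ["87857", "927465", "274894", "87857"],
    "userName": ["ABC", "PQR", "ABC"],
    "enableEmailNot": "No",
}

deduped = dedupe_lists(sample)
print(deduped["phoneNumber"])  # ['87857', '927465', '274894']
```

Like the spec, the sketch keeps the first occurrence of each value and leaves non-list fields untouched.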

Related

Moving one array into json objects defined in another array - JOLT Transformation

I have my input defined as
{
"lineId": "1",
"Collection": {
"services": [
{
"Code": "TB",
"Type": [
"Data"
]
},
{
"Code": "MAGTB",
"Type": [
"Data"
]
}
]
},
"promotions": [
{
"Id": "1"
},
{
"Id": "2"
}
]
}
I would like to get my output as
{
"lineId": "1",
"Collection": {
"services": [
{
"Code": "TB",
"Type": [
"Data"
],
"promotions": [
{
"Id": "1"
},
{
"Id": "2"
}
]
},
{
"Code": "TB2",
"Type": [
"Data"
],
"promotions": [
{
"Id": "1"
},
{
"Id": "2"
}
]
}
]
}
}
Any help would be appreciated.
I am new to JOLT, and I'm having trouble navigating to the second array from inside the first.
Incomplete transformation that I tried:
[
{
"operation": "shift",
"spec": {
"Collection": {
"services": {
"*": "Collection.services[].&",
"@(3,lineId)": "lineId",
"@(3,promotions)":{
"*":
}
}
}
}
}
]
edit: Tried this now
[
{
"operation": "shift",
"spec": {
"Collection": {
"services": {
"*": "Collection.services[]",
// "*": "&",
"@(3,lineId)": "lineId",
"@(3,promotions)": {
"*": {
"Id": "Id"
}
}
}
}
}
}
]
I just have to figure out a way to move the Id list inside the objects in services array.
edit2:
[
{
"operation": "shift",
"spec": {
"Collection": {
"services": {
"*": {
"*": "Collection.services[&1].&",
"@(3,lineId)": "Collection.services[&1].lineId",
"@(3,promotions)": "Collection.services[&1].promotions"
}
}
}
}
}
]
I think this is the spec you're looking for?
[
{
"operation": "shift",
"spec": {
"lineId": "lineId",
"Collection": {
"services": {
"*": {
"@(3,promotions)": "Collection.services[&1].promotions",
"*": "Collection.services[&1].&"
}
}
}
}
}
]
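For comparison, here is a rough plain-Python sketch of the same reshaping: copy the top-level promotions array into every element of Collection.services and drop it from the top level. The function name `attach_promotions` is hypothetical, and this is only a reference for checking output, not how Jolt works internally.

```python
import copy


def attach_promotions(doc):
    """Copy the top-level promotions list into each service object."""
    services = [
        # Each service gets its own deep copy so later edits do not leak
        # between services.
        {**service, "promotions": copy.deepcopy(doc["promotions"])}
        for service in doc["Collection"]["services"]
    ]
    return {"lineId": doc["lineId"], "Collection": {"services": services}}


line = {
    "lineId": "1",
    "Collection": {
        "services": [
            {"Code": "TB", "Type": ["Data"]},
            {"Code": "MAGTB", "Type": ["Data"]},
        ]
    },
    "promotions": [{"Id": "1"}, {"Id": "2"}],
}

moved = attach_promotions(line)
print([s["Code"] for s in moved["Collection"]["services"]])  # ['TB', 'MAGTB']
```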

JOLT Nested if condition

I have this input JSON:
{
"user": "123456",
"product": "television",
"category": "electronics",
"tag": "summer"
}
And this transformation:
[
{
"operation": "shift",
"spec": {
"product": {
"@(1,product)": "item",
"@(1,user)": {
"#2": "userBias"
}
},
"user": {
"@(1,user)": "user"
},
"category": {
"#category": "rules.[0].name",
"@(1,category)": "rules.[0].values[0]"
},
"tag": {
"rules": "rules",
"#tag": "rules.[1].name",
"@(2,tag)": "rules.[1].values[0]"
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"userBias?": "=toInteger"
}
}
]
Which works fine and produces the following JSON:
{
"item": "television",
"userBias": 2,
"user": "123456",
"rules": [
{
"name": "category",
"values": [
"electronics"
]
},
{
"name": "tag",
"values": [
"summer"
]
}
]
}
If from the input though I delete "category": "electronics" so it becomes:
{
"user": "123456",
"product": "television",
"tag": "summer"
}
Then I get back the following result:
{
"item": "television",
"userBias": 2,
"user": "123456",
"rules": [
null,
{
"name": "tag",
"values": [
"summer"
]
}
]
}
The problem with the above is that it contains a null element inside the array, and I do not know how to get rid of it. I have tried recursivelySquashNulls, but it does not work.
Basically, what I am looking for is this: if both category and tag exist, then tag should go to rules[1]; if only tag exists, then tag should go to rules[0].
Thanks in advance,
giannis
The null appears because you map the tag and category elements individually to fixed indices. Instead, handle them together under the wildcard ("*") key, which matches all the remaining keys, such as
[
{
"operation": "shift",
"spec": {
"product": {
"@(1,product)": "item",
"@(1,user)": {
"#2": "userBias"
}
},
"user": {
"@(1,user)": "user"
},
"*": {
"$": "r[0].&.name",
"@(1,&)": "r[0].&.values[]"
}
}
},
{
"operation": "shift",
"spec": {
"*": "&",
"r": {
"*": { "*": "rules" }
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"userBias?": "=toInteger"
}
}
]
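The idea behind the wildcard approach, building the rules array only from the keys that are actually present, can be illustrated with a small Python sketch. It is only a reference for the intended output; `build_rules` and its `fixed_keys` parameter are hypothetical names, and the constant 2 for userBias mirrors the literal used in the spec.

```python
def build_rules(doc, fixed_keys=("product", "user")):
    """Turn every key other than the fixed ones into a rule entry."""
    out = {"item": doc["product"], "userBias": 2, "user": doc["user"]}
    # Rules are appended in input order, so a missing key never leaves
    # a gap (null) in the resulting list.
    out["rules"] = [
        {"name": key, "values": [value]}
        for key, value in doc.items()
        if key not in fixed_keys
    ]
    return out


with_category = build_rules(
    {"user": "123456", "product": "television",
     "category": "electronics", "tag": "summer"}
)
without_category = build_rules(
    {"user": "123456", "product": "television", "tag": "summer"}
)

print(without_category["rules"])  # [{'name': 'tag', 'values': ['summer']}]
```

When category is absent, tag lands at index 0 of rules, which is the behaviour asked for in the question.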
Result 1:
Result 2 (input without the "category": "electronics" pair):

Add array inside object with same key in jolt spec

I would like to move each entry of the recommendations array inside the investor that has the same row_id.
Original Json
{
"investors": [
{
"row_id": 1,
"name": "AAAA"
},
{
"row_id": 2,
"name": "BBBB"
}
],
"recommendations": [
{
"row_id": "1",
"title": "ABC"
},
{
"row_id": "2",
"title": "CDE"
}
]
}
I've tried a lot of specs at https://jolt-demo.appspot.com with no success
Specs tried...
[{
"operation": "shift",
"spec": {
"investors": {
"*": "investors[]"
},
"recommendations": {
"@": "recommendations[]"
}
}
}]
Desired Json
{
"investors": [
{
"row_id": 1,
"name": "AAAA",
"recommendations":[{
"row_id": "1",
"title": "ABC"
}]
},
{
"row_id": 2,
"name": "BBBB",
"recommendations":[{
"row_id": "2",
"title": "CDE"
}]
}
]
}
This can be done with a two-stage shift.
The first shift groups everything based on row_id.
(I'd suggest running the first shift on its own to see what the output is.)
The second shift uses that grouped output and formats the results.
[
{
"operation": "shift",
"spec": {
"*": {
"*": {
"row_id": {
"*": {
"@2": "&.&4"
}
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"investors": "investors.[#2]",
"recommendations": "investors.[#2].recommendations[]"
}
}
}
]
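The group-then-format idea can also be written out in plain Python as a reference. This is a sketch, not Jolt: `nest_recommendations` is a made-up name, and row_id values are compared as strings here because the sample input uses numbers for investors and strings for recommendations.

```python
def nest_recommendations(doc):
    """Attach each recommendation to the investor with the same row_id."""
    # Stage 1: group recommendations by row_id (normalised to a string).
    by_row = {}
    for rec in doc["recommendations"]:
        by_row.setdefault(str(rec["row_id"]), []).append(rec)

    # Stage 2: rebuild the investors list with the matching group attached.
    investors = [
        {**inv, "recommendations": by_row.get(str(inv["row_id"]), [])}
        for inv in doc["investors"]
    ]
    return {"investors": investors}


original = {
    "investors": [
        {"row_id": 1, "name": "AAAA"},
        {"row_id": 2, "name": "BBBB"},
    ],
    "recommendations": [
        {"row_id": "1", "title": "ABC"},
        {"row_id": "2", "title": "CDE"},
    ],
}

nested = nest_recommendations(original)
print(nested["investors"][0]["recommendations"])
# [{'row_id': '1', 'title': 'ABC'}]
```

One choice made in this sketch: an investor with no matching recommendation gets an empty list rather than no recommendations key at all.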

JOLT spec for ternary operation

I am trying to write a Jolt spec that converts the following input into the expected output shown below.
INPUT :
{
"city": "Seattle",
"state": "WA",
"country": "US",
"date": "10/20/2018",
"userList": [
{
"name": "David",
"age": "22",
"sex": "M",
"company": "good"
},
{
"name": "Tom",
"age": "30",
"sex": "M",
"company": "good"
},
{
"name": "Annie",
"age": "25",
"sex": "F",
"company": "bad"
},
{
"name": "Aaron",
"age": "27",
"sex": "M",
"company": "bad"
}
]
}
EXPECTED OUTPUT:
{
"users" : [ {
"date" : "10/20/2018",
"username" : "David",
"age" : "22",
"sex" : "M",
"organization" : "good"
}, {
"date" : "10/20/2018",
"username" : "Tom",
"age" : "30",
"sex" : "M",
"organization" : "good"
} ],
"Date" : "10/20/2018",
"State" : "WA",
"Country" : "US"
}
I want to filter out all the elements in the list where company = bad or sex = F. Alternatively, keep only the elements where company = good and sex = M.
I need help removing elements from the list based on these specific conditions.
Is Jolt recommended for such data-driven conversions?
The Spec that I have written so far is
[{
"operation": "shift",
"spec": {
"userList": {
"*": {
"name": "users.[&1].username",
"age": "users.[&1].age",
"sex": "users.[&1].sex",
"company": "users.[&1].organization",
"@(2,date)": "users.[&1].date"
}
},
"date": "Date",
"state": "State",
"country": "Country"
}
}
]
OK, here is the process I followed to arrive at a solution for the "keep all M + good" case:
1. Copy the top-level elements as-is, without affecting the scope.
2. For the users content, go to the first field used for filtering ("users -> * -> sex -> M") and copy all fields in that case.
3. Do the same for the other field used for filtering.
4. Steps 2 and 3 use a map to avoid "null" entries in the array, so the last step switches back to an array.
[{
"operation": "shift",
"spec": {
"userList": {
"*": {
"name": "users.[&1].username",
"age": "users.[&1].age",
"sex": "users.[&1].sex",
"company": "users.[&1].organization",
"@(2,date)": "users.[&1].date"
}
},
"date": "Date",
"state": "State",
"country": "Country"
}
}
,
{
"operation": "shift",
"spec": {
"*": "&",
"users": {
"*": {
"sex": {
"M": {
"@(2,username)": "users.&3.username",
"@(2,age)": "users.&3.age",
"@(2,sex)": "users.&3.sex",
"@(2,organization)": "users.&3.organization",
"@(2,date)": "users.&3.date"
}
}
}
}
}
}
,
{
"operation": "shift",
"spec": {
"*": "&",
"users": {
"*": {
"organization": {
"good": {
"@(2,username)": "users.&3.username",
"@(2,age)": "users.&3.age",
"@(2,sex)": "users.&3.sex",
"@(2,organization)": "users.&3.organization",
"@(2,date)": "users.&3.date"
}
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": "&",
"users": {
"*": "users[]"
}
}
}
]
For the other case, you just have to change the filter rules to do nothing when the rule matches and to copy the fields in all other cases (*). That gives the following spec (I did it only for the first field; updating it for the second field should not be an issue):
[{
"operation": "shift",
"spec": {
"userList": {
"*": {
"name": "users.[&1].username",
"age": "users.[&1].age",
"sex": "users.[&1].sex",
"company": "users.[&1].organization",
"@(2,date)": "users.[&1].date"
}
},
"date": "Date",
"state": "State",
"country": "Country"
}
}
,
{
"operation": "shift",
"spec": {
"*": "&",
"users": {
"*": {
"sex": {
"F": null,
"*": {
"@(2,username)": "users.&3.username",
"@(2,age)": "users.&3.age",
"@(2,sex)": "users.&3.sex",
"@(2,organization)": "users.&3.organization",
"@(2,date)": "users.&3.date"
}
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": "&",
"users": {
"*": "users[]"
}
}
}
]
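To make the two filtering variants above concrete, here is a plain-Python sketch of both: keeping users that match M and good, and dropping users that match F or bad. It is only a reference for the expected output, not a Jolt feature; the names `transform`, `keep_good_male`, and `drop_bad_or_female` are invented for this example.

```python
def transform(doc, keep):
    """Rename user fields, add the date, and keep users accepted by keep()."""
    users = [
        {
            "date": doc["date"],
            "username": u["name"],
            "age": u["age"],
            "sex": u["sex"],
            "organization": u["company"],
        }
        for u in doc["userList"]
        if keep(u)
    ]
    return {"users": users, "Date": doc["date"],
            "State": doc["state"], "Country": doc["country"]}


def keep_good_male(u):
    # "Keep" variant: both conditions must hold.
    return u["sex"] == "M" and u["company"] == "good"


def drop_bad_or_female(u):
    # "Drop" variant: reject when either exclusion rule matches.
    return not (u["company"] == "bad" or u["sex"] == "F")


source = {
    "city": "Seattle", "state": "WA", "country": "US", "date": "10/20/2018",
    "userList": [
        {"name": "David", "age": "22", "sex": "M", "company": "good"},
        {"name": "Tom", "age": "30", "sex": "M", "company": "good"},
        {"name": "Annie", "age": "25", "sex": "F", "company": "bad"},
        {"name": "Aaron", "age": "27", "sex": "M", "company": "bad"},
    ],
}

kept = transform(source, keep_good_male)
dropped = transform(source, drop_bad_or_female)
print([u["username"] for u in kept["users"]])  # ['David', 'Tom']
```

On this sample both variants give the same users. They would differ if sex or company could take a third value, since the "drop" variant would let such a user through while the "keep" variant would not.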

Combining two arrays and transformation using Jolt

I am having a tough time using a Jolt mapping to transform the input into the required form.
Input JSON
{
"data": [
{
"name": "Alcohol",
"collection_id": 123,
"properties": [
{
"name": "Tax",
"property_id": "00001"
},
{
"name": "Expenditure",
"property_id": "00002"
}
],
"attributes": [
{
"name": "alcohol_tax",
"attribute_id": "00011"
},
{
"name": "alcohol_expenditure",
"attribute_id": "00022"
}
]
}
]
}
Output JSON
[
{
"name": "Alcohol",
"collection_id": 123,
"details": [{
"property_name": "Tax",
"property_id": "00001",
"attribute_id": "00011"
},
{
"property_name": "Expenditure",
"property_id": "00002",
"attribute_id": "00022"
}
]
}
]
I have tried a couple of ways to combine the arrays using a few rules but with little success.
One of the rules
[{
"operation": "shift",
"spec": {
"data": {
"*": {
"name": "&1.name",
"collection_id": "&1.collection_id",
"attributes": {
"*": {
"attribute_id": "&1.attribute_id[]"
}
},
"properties": {
"*": {
"name": "&1.myname[]",
"property_id": "&1.property_id[]"
}
}
}
}
}
}]
is adding all the attributes and properties to all the collections. I do not know why this happens, as I thought &1.property_id[] would only add items in that particular collection to the array and not all collections. Any help or clues on why this is happening would be truly appreciated.
See the solution below:
[0] creates the wrapping array.
[&1] uses the position within the respective arrays, so the results are combined in details. The important part is the wrapping square brackets, so that it is treated as an array index rather than a literal key.
[
{
"operation": "shift",
"spec": {
"data": {
"*": {
"name": "[0].name",
"collection_id": "[0].collection_id",
"attributes": {
"*": {
"attribute_id": "[0].details.[&1].attribute_id"
}
},
"properties": {
"*": {
"name": "[0].details.[&1].name",
"property_id": "[0].details.[&1].property_id"
}
}
}
}
}
}
]
Produces the following:
[
{
"name": "Alcohol",
"collection_id": 123,
"details": [
{
"attribute_id": "00011",
"name": "Tax",
"property_id": "00001"
},
{
"attribute_id": "00022",
"name": "Expenditure",
"property_id": "00002"
}
]
}
]
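The index-based pairing that [&1] performs can be mirrored in plain Python with zip. The sketch below is a reference only; `merge_details` is a hypothetical name, and it emits the property_name key from the question's desired output, whereas the spec above writes that field as name. It also produces one output entry per element of data, which is a choice of this sketch.

```python
def merge_details(doc):
    """Pair properties and attributes by position within each collection."""
    merged = []
    for item in doc["data"]:
        details = [
            {
                "property_name": prop["name"],
                "property_id": prop["property_id"],
                "attribute_id": attr["attribute_id"],
            }
            # zip pairs the i-th property with the i-th attribute and
            # stops at the shorter list.
            for prop, attr in zip(item["properties"], item["attributes"])
        ]
        merged.append({
            "name": item["name"],
            "collection_id": item["collection_id"],
            "details": details,
        })
    return merged


payload = {
    "data": [
        {
            "name": "Alcohol",
            "collection_id": 123,
            "properties": [
                {"name": "Tax", "property_id": "00001"},
                {"name": "Expenditure", "property_id": "00002"},
            ],
            "attributes": [
                {"name": "alcohol_tax", "attribute_id": "00011"},
                {"name": "alcohol_expenditure", "attribute_id": "00022"},
            ],
        }
    ]
}

combined = merge_details(payload)
print(combined[0]["details"][0]["attribute_id"])  # 00011
```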