Groovy: Retrieve a value from a JSON based on an object - rest

So I have a JSON that looks like this
{
"_embedded" : {
"userTaskDtoList" : [
{
"userTaskId" : 8,
"userTaskDefinitionId" : "JJG",
"userRoleId" : 8,
"workflowId" : 9,
"criticality" : "MEDIUM",
**"dueDate"** : "2021-09-29T09:04:37Z",
"dueDateFormatted" : "Tomorrow 09:04",
"acknowledge" : false,
"key" : 8,
},
{
"userTaskId" : 10,
"userTaskDefinitionId" : "JJP",
"userRoleId" : 8,
"workflowId" : 11,
"criticality" : "MEDIUM",
**"dueDate"** : "2021-09-29T09:06:44Z",
"dueDateFormatted" : "Tomorrow 09:06",
"acknowledge" : false,
"key" : 10,
},
{
"userTaskId" : 12,
"userTaskDefinitionId" : "JJD",
"userRoleId" : 8,
"workflowId" : 13,
"criticality" : "MEDIUM",
**"dueDate"** : "2021-09-29T09:59:07Z",
"dueDateFormatted" : "Tomorrow 09:59",
"acknowledge" : false,
"key" : 12,
}
]
}
}
It's a response from a REST request. What I need is to extract the value of the "dueDate" key ONLY from a specific object and make some validations with it. I'm trying to use Groovy to do this.
The only thing I've managed to do is this:
import groovy.json.*
def response = context.expand( '${user tasks#Response}' )
def data = new JsonSlurper().parseText(response)
idValue = data._embedded.userTaskDtoList.dueDate
Which returns all 3 of the values from the "dueDate" key in the response.
I was thinking that maybe I could target a certain object based on another key; for instance, retrieve only the "dueDate" value from the object with "userTaskId" : 12.
How could I do this?
Any help would be greatly appreciated.

You can find the record of interest, then just grab the dueDate from it:
data._embedded.userTaskDtoList.find { it.userTaskId == 12 }.dueDate
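If the matching task might be absent, a null-safe variant avoids a NullPointerException, and the ISO-8601 string can then be parsed for whatever validations you need. A minimal sketch (the userTaskId value and the date check are just placeholders):
import groovy.json.JsonSlurper
import java.time.Instant

def response = context.expand( '${user tasks#Response}' )   // same as in the question
def data = new JsonSlurper().parseText(response)

// ?. keeps dueDate null instead of throwing if no task with that id exists
def dueDate = data._embedded.userTaskDtoList.find { it.userTaskId == 12 }?.dueDate

// example validations (placeholders): the task exists and its due date lies in the future
assert dueDate != null
assert Instant.parse(dueDate).isAfter(Instant.now())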

Related

Adding new property (in the latest array item) in mongodb embedded document

I have a mongo document like below:
{
"_id" : ObjectId("59ccb655071d4c2ceebe190c"),
"session_deatils" : [
{
"session_start" : 1,
"session_complete" : 1,
"started_at" : ISODate("2017-09-28T08:42:19.770Z"),
"event_count" : 2
},
{
"session_start" : 1,
"session_complete" : 1,
"started_at" : ISODate("2017-09-28T08:53:08.618Z"),
"event_count" : 1
},
{
"session_start" : 1,
"session_complete" : 1,
"started_at" : ISODate("2017-09-28T09:19:42.726Z")
}
],
"session_id" : "12312312313123",
}
I want to add a new field and value, like "event_count", to the latest item in session details, which is:
{
"session_start" : 1,
"session_complete" : 1,
"started_at" : ISODate("2017-09-28T09:19:42.726Z")
}
and after the update the array element should look like below:
{
"session_start" : 1,
"session_complete" : 1,
"started_at" : ISODate("2017-09-28T09:19:42.726Z"),
"event_count" : 3
}
I tried like below:
collection.update({"session_deatils.started_at": datetime.datetime(2017, 9, 28, 9, 22, 3, 459000)}, {"$set":{"session_deatils.event_count":3}})
which adds the new property to the parent document instead.
Is there a way I can achieve that?
thanks in advance
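One way to do this (a sketch, assuming pymongo 3+; the client, database, and collection names are placeholders) is the positional $ operator, which applies the $set to the array element matched by the query instead of to the parent document:
import datetime
from pymongo import MongoClient

# placeholder connection details
collection = MongoClient()["mydb"]["sessions"]

# match the latest element by its started_at, then use the positional $ operator
# so event_count is added inside that array element, not on the parent document
collection.update_one(
    {"session_deatils.started_at": datetime.datetime(2017, 9, 28, 9, 19, 42, 726000)},
    {"$set": {"session_deatils.$.event_count": 3}}
)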

Why does MapReduce return undefined for a field that exists in document?

I am trying to debug a strange issue I am running into while running mapreduce on a collection:
For reference, here's a single document from the collection:
{
"_id" : "ITOUXFWgvWs",
"source" : "youtube",
"insert_datetime" : ISODate("2017-04-06T22:27:43.598Z"),
"processed" : false,
"raw" : {
"id" : "ITOUXFWgvWs",
"etag" : "\"m2yskBQFythfE4irbTIeOgYYfBU/hiQtS6aptLlqxTpsYp1EJIRcoZo\"",
"snippet" : {
"publishedAt" : ISODate("2017-04-06T13:25:28Z"),
"title" : "Alarm.com: The Only Smart Home App You Need",
"channelId" : "UC_HZfoZUP36STk7SrtKYH4g",
"description" : "All these new connected devices are awesome, but wouldn’t it be great if you could use one app for the entire connected home? It can all come together with Alarm.com.",
"categoryId" : "28",
"channelTitle" : "Alarm.com",
"thumbnails" : {
"default" : {
"height" : 90,
"width" : 120,
"url" : "https://i.ytimg.com/vi/ITOUXFWgvWs/default.jpg"
},
"standard" : {
"height" : 480,
"width" : 640,
"url" : "https://i.ytimg.com/vi/ITOUXFWgvWs/sddefault.jpg"
},
"high" : {
"height" : 360,
"width" : 480,
"url" : "https://i.ytimg.com/vi/ITOUXFWgvWs/hqdefault.jpg"
},
"medium" : {
"height" : 180,
"width" : 320,
"url" : "https://i.ytimg.com/vi/ITOUXFWgvWs/mqdefault.jpg"
},
"maxres" : {
"height" : 720,
"width" : 1280,
"url" : "https://i.ytimg.com/vi/ITOUXFWgvWs/maxresdefault.jpg"
}
},
"liveBroadcastContent" : "none",
"localized" : {
"title" : "Alarm.com: The Only Smart Home App You Need",
"description" : "All these new connected devices are awesome, but wouldn’t it be great if you could use one app for the entire connected home? It can all come together with Alarm.com."
}
},
"contentDetails" : {
"duration" : "PT37S",
"dimension" : "2d",
"definition" : "hd",
"licensedContent" : false,
"projection" : "rectangular",
"caption" : "false"
},
"kind" : "youtube#video",
"statistics" : {
"likeCount" : "0",
"dislikeCount" : "0",
"favoriteCount" : "0",
"viewCount" : "32"
},
"uploaded" : ISODate("2017-04-06T13:25:28Z")
}
}
I am literally following the mapReduce debug steps from the official MongoDB documentation.
Here's what my mapreduce script looks like:
var map = function() {
emit("1", this._id);
};
var emit = function(key, value) {
print("emit");
print("key: " + key + " value: " + tojson(value));
}
var myDoc = db.getCollection("abc").find({}).limit(1);
map.apply(myDoc);
And it always produces a result like this:
MongoDB shell version: 2.4.6
connecting to: test
emit
key: 1 value: undefined
I expect the script to emit the _id since it clearly exists in the document, but it doesn't.
What might be the cause of this?
find() always returns a cursor.
Replace it with findOne()
var myDoc = db.getCollection("abc").findOne({});
Or store the documents in an Array using toArray()
var myDoc = db.getCollection("abc").find({}).limit(1).toArray()[0];

How to filter by _id in MongoDB using Pig

I have a mongo documents like this:
db.activity_days.findOne()
{
"_id" : ObjectId("54b4ee617acf9ce0440a3185"),
"aca" : 0,
"ca" : 0,
"cbdw" : true,
"day" : ISODate("2014-12-10T00:00:00Z"),
"dm" : 0,
"fbc" : 0,
"go" : 2500,
"gs" : [ ],
"its" : [
{
"_id" : ObjectId("551ac8d44f9f322e2b055d3a"),
"at" : 2000,
"atn" : "Running",
"cas" : 386.514909469507,
"dis" : 2.788989730832084,
"du" : 1472,
"ibr" : false,
"ide" : false,
"lcs" : false,
"pt" : 0,
"rpt" : 0,
"src" : 1001,
"stp" : 0,
"tcs" : [ ],
"ts" : 1418257729,
"u_at" : ISODate("2015-01-13T00:32:10.954Z")
}
],
"po" : 0,
"se" : 0,
"st" : 0,
"tap3c" : [ ],
"tzo" : -21600,
"u_at" : ISODate("2015-01-13T00:32:10.952Z"),
"uid" : ObjectId("545eb753ae9237b1df115649")
}
I want to use Pig to filter a specific _id range. In mongo I can write the query like this:
db.activity_days.find({_id:{$gt:ObjectId("54a48e000000000000000000"),$lt:ObjectId("54cd6c800000000000000000")}})
But I don't know how to write this in Pig. Does anyone know?
You could try using the mongo-hadoop connector for Pig; see mongo-hadoop: Usage with Pig.
Once you REGISTER the JARs (core, pig, and the Java driver), e.g. REGISTER /path-to/mongo-hadoop-pig-<version>.jar;, you could run the following via grunt:
SET mongo.input.query '{"_id":{"\$gt":{"\$oid":"54a48e000000000000000000"},"\$lt":{"\$oid":"54cd6c800000000000000000"}}}';
rangeActivityDay = LOAD 'mongodb://localhost:27017/database.collection' USING com.mongodb.hadoop.pig.MongoLoader();
DUMP rangeActivityDay;
You may want to use LIMIT before dumping the data as well.
The above was tested using: mongo-java-driver-3.0.0-rc1.jar, mongo-hadoop-pig-1.4.0.jar, mongo-hadoop-core-1.4.0.jar and MongoDB v3.0.9
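For example, to limit the relation before dumping it (relation names as above):
limited = LIMIT rangeActivityDay 10;
DUMP limited;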

Errors while creating a collection in MongoDB

I am new to MongoDB. I am not able to create a collection. It gives a message in the mongo shell: Display all 169 possibilities? (y or n). The code is:
db.Lead.insert(
{ LeadID: 1,
MasterAccountID: 100,
LeadName: 'Sarah',
LeadEmailID : 'sarah#hmail.com',
LeadPhoneNumber : '2132155445',
Details : [{ StateID: 1,
TaskID : 1,
Assigned By : 1001,
TimeStamp : '10:00:00',
StatusID : 1 }
]
}
)
Not sure what the issue is. Please help me out with the same.
Regards.
Apart from the fact that there is a space in Assigned By, everything looks good.
I am able to insert it properly:
> db.Lead.find().pretty()
{
"_id" : ObjectId("517ebe75278e0557fd167eb7"),
"LeadID" : 1,
"MasterAccountID" : 100,
"LeadName" : "Sarah",
"LeadEmailID" : "sarah#hmail.com",
"LeadPhoneNumber" : "2132155445",
"Details" : [
{
"StateID" : 1,
"TaskID" : 1,
"AssignedBy" : 1001,
"TimeStamp" : "10:00:00",
"StatusID" : 1
}
]
}
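For reference, here is the insert as it parses cleanly in the shell, with the space removed from the key (quoting "Assigned By" would also work):
db.Lead.insert({
    LeadID: 1,
    MasterAccountID: 100,
    LeadName: 'Sarah',
    LeadEmailID: 'sarah#hmail.com',
    LeadPhoneNumber: '2132155445',
    Details: [{
        StateID: 1,
        TaskID: 1,
        AssignedBy: 1001,      // was "Assigned By"; unquoted keys cannot contain spaces
        TimeStamp: '10:00:00',
        StatusID: 1
    }]
})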

Migrating from MongoDB to HBase

Hi, I am very new to the HBase database. I downloaded some Twitter data and stored it in MongoDB. Now I need to move that data into HBase to speed up Hadoop processing, but I am not able to create its schema. Here is the Twitter data in JSON format:
{
"_id" : ObjectId("512b71e6e4b02a4322d1c0b0"),
"id" : NumberLong("306044618179506176"),
"source" : "Facebook",
"user" : {
"name" : "Dada Bhagwan",
"location" : "India",
"url" : "http://www.dadabhagwan.org",
"id" : 191724440,
"protected" : false,
"timeZone" : null,
"description" : "Founder of Akram Vignan - Practical Spiritual Science of Self Realization",
"screenName" : "dadabhagwan",
"geoEnabled" : false,
"profileImageURL" : "http://a0.twimg.com/profile_images/1647956820/M_DSC_0034_normal.jpg",
"biggerProfileImageURL" : "http://a0.twimg.com/profile_images/1647956820/M_DSC_0034_bigger.jpg",
"profileImageUrlHttps" : "https://si0.twimg.com/profile_images/1647956820/M_DSC_0034_normal.jpg",
"profileImageURLHttps" : "https://si0.twimg.com/profile_images/1647956820/M_DSC_0034_normal.jpg",
"biggerProfileImageURLHttps" : "https://si0.twimg.com/profile_images/1647956820/M_DSC_0034_bigger.jpg",
"miniProfileImageURLHttps" : "https://si0.twimg.com/profile_images/1647956820/M_DSC_0034_mini.jpg",
"originalProfileImageURLHttps" : "https://si0.twimg.com/profile_images/1647956820/M_DSC_0034.jpg",
"followersCount" : 499,
"profileBackgroundColor" : "EEE4C1",
"profileTextColor" : "333333",
"profileLinkColor" : "990000",
"lang" : "en",
"profileSidebarFillColor" : "FCF9EC",
"profileSidebarBorderColor" : "CBC09A",
"profileUseBackgroundImage" : true,
"showAllInlineMedia" : false,
"friendsCount" : 1,
"favouritesCount" : 0,
"profileBackgroundImageUrl" : "http://a0.twimg.com/profile_background_images/396759326/dadabhagwan-twitter.jpg",
"profileBackgroundImageURL" : "http://a0.twimg.com/profile_background_images/396759326/dadabhagwan-twitter.jpg",
"profileBackgroundImageUrlHttps" : "https://si0.twimg.com/profile_background_images/396759326/dadabhagwan-twitter.jpg",
"profileBannerURL" : null,
"profileBannerRetinaURL" : null,
"profileBannerIPadURL" : null,
"profileBannerIPadRetinaURL" : null,
"miniProfileImageURL" : "http://a0.twimg.com/profile_images/1647956820/M_DSC_0034_mini.jpg",
"originalProfileImageURL" : "http://a0.twimg.com/profile_images/1647956820/M_DSC_0034.jpg",
"utcOffset" : -1,
"contributorsEnabled" : false,
"status" : null,
"createdAt" : NumberLong("1284700143000"),
"profileBannerMobileURL" : null,
"profileBannerMobileRetinaURL" : null,
"profileBackgroundTiled" : false,
"statusesCount" : 1713,
"verified" : false,
"translator" : false,
"listedCount" : 6,
"followRequestSent" : false,
"descriptionURLEntities" : [ ],
"urlentity" : {
"url" : "http://www.dadabhagwan.org",
"start" : 0,
"end" : 26,
"expandedURL" : "http://www.dadabhagwan.org",
"displayURL" : "http://www.dadabhagwan.org"
},
"rateLimitStatus" : null,
"accessLevel" : 0
},
"contributors" : [ ],
"geoLocation" : null,
"place" : null,
"favorited" : false,
"retweet" : false,
"retweetedStatus" : null,
"retweetCount" : 0,
"userMentionEntities" : [ ],
"retweetedByMe" : false,
"currentUserRetweetId" : -1,
"possiblySensitive" : false,
"urlentities" : [
{
"url" : "http://t.co/gR1GohGjaj",
"start" : 113,
"end" : 135,
"expandedURL" : "http://fb.me/2j2HKHJrM",
"displayURL" : "fb.me/2j2HKHJrM"
}
],
"hashtagEntities" : [ ],
"mediaEntities" : [ ],
"truncated" : false,
"inReplyToStatusId" : -1,
"text" : "Spiritual Quote of the Day :\n\n‘I am Chandubhai’ is an illusion itself and from that are \nkarmas charged. When... http://t.co/gR1GohGjaj",
"inReplyToUserId" : -1,
"inReplyToScreenName" : null,
"createdAt" : NumberLong("1361801697000"),
"rateLimitStatus" : null,
"accessLevel" : 0
}
How should I divide the data into columns and column families here? I thought of making one "twitter" column family that contains source, geoLocation, place, retweet, etc., and another "user" column family that contains name, location, etc. (the user's data), i.e. a new column family for each inner-level sub-document.
Is this approach correct? And how would I differentiate the urlentity for the "user" column family from the one for the "twitter" column family?
And how do I handle keys that contain a list of sub-documents (e.g. urlentity)?
There are many ways to model this in HBase, ranging from storing everything in a single column to having a different table for each sub-entity, with several other tables for "indexing".
Generally speaking, you model the data in HBase based on your read and write access patterns. For example, column families are stored in different files on disk, so one reason to divide data into two column families is if there are many cases where you need data from one and not the other.
There's a good presentation about HBase schema design by Ian Varley from HBaseCon 2012; you can find the slides here and the video here.
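As a rough sketch of the two-column-family layout from the question (table name, row key, and column qualifiers are illustrative only), in the HBase shell it could look like:
create 'tweets', 'twitter', 'user'

# row key: the tweet id; each column family holds the fields of the corresponding sub-document
put 'tweets', '306044618179506176', 'twitter:source', 'Facebook'
put 'tweets', '306044618179506176', 'twitter:retweetCount', '0'
put 'tweets', '306044618179506176', 'user:name', 'Dada Bhagwan'
put 'tweets', '306044618179506176', 'user:screenName', 'dadabhagwan'

get 'tweets', '306044618179506176'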