MongoDB: aggregate document references into subgroups

I have a set of BSON documents in a MongoDB collection that I want to process into cascading sub-documents. There are thousands of duplicate ids, differentiated only by their timestamps, and the documents form a type hierarchy subsystem -> component -> subcomponent; they should be attached to each other based on source. This is a sample of what I currently have:
[{
"id": 1001,
"type": "subsystem",
"time": 1454767739,
"payload": {
"key": "..."
}
},
{
"id": 1001,
"type": "subsystem",
"time": 1454767539,
"payload": {
"key": "..."
}
},
{
"id": 2001,
"type": "component",
"time": 1454767778,
"source": 1001,
"payload": {
"key": "..."
}
},
{
"id": 5002,
"type": "subcomponent",
"time": 1454767779,
"source": 2001,
"payload": {
"key": "..."
}
},
{
"id": 5003,
"type": "subcomponent",
"time": 1454767798,
"source": 2001,
"payload": {
"key": "..."
}
}]
What I am trying to do is this:
1. match all elements with a timestamp lower than a maxTime value
2. group all elements by (unique) id, but only use the newest value ($max: $time) for $$ROOT (every id is unique now)
3. group all elements by the attribute source, as an array attached to the element with that id (or _id)
I've achieved 1.) and 2.) and already have this group:
[{
"_id": 1001,
"reftime": 1454767739,
"element": [
{
"id": 1001,
"type": "subsystem",
"time": 1454767739,
"payload": {
"key": "..."
}
}
]
},
{
"_id": 2001,
"reftime": 1454767778,
"element": [
{
"id": 2001,
"type": "component",
"time": 1454767778,
"source": 1001,
"payload": {
"key": "..."
}
}
]
},
{
"_id": 5002,
"reftime": 1454767779,
"element": [
{
"id": 5002,
"type": "subcomponent",
"time": 1454767779,
"source": 2001,
"payload": {
"key": "..."
}
}
]
},
{
"_id": 5003,
"reftime": 1454767798,
"element": [
{
"id": 5003,
"type": "subcomponent",
"time": 1454767798,
"source": 2001,
"payload": {
"key": "..."
}
}
]
}]
But what I want is the elements to be structured according to their links in the source value. So it would look something like this:
{
"_id": 1001,
"reftime": 1454767739,
"element": [
{
"id": 1001,
"type": "subsystem",
"time": 1454767739,
"payload": {
"key": "..."
}
}
],
"attached": [
{
"_id": 2001,
"reftime": 1454767778,
"element": [
{
"id": 2001,
"type": "component",
"time": 1454767778,
"source": 1001,
"payload": {
"key": "..."
}
}
],
"attached": [
{
"_id": 5002,
"reftime": 1454767779,
"element": [
{
"id": 5002,
"type": "subcomponent",
"time": 1454767779,
"source": 2001,
"payload": {
"key": "..."
}
}
]
},
{
"_id": 5003,
"reftime": 1454767798,
"element": [
{
"id": 5003,
"type": "subcomponent",
"time": 1454767798,
"source": 2001,
"payload": {
"key": "..."
}
}
]
}
]
}
]
}
Is this possible? I'm struggling to form a group of everything that has a source and to combine it with the group from sample 2.). Or, if that would help, I could introduce a source: 0 for every subsystem and group by that under a single root element.
Any ideas or insights? I'm currently stuck at this point. Thank you!
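Steps 1 and 2 already yield one document per id, so one option is to do step 3 in application code after the pipeline returns. The following is a minimal sketch of that idea in plain JavaScript, not an aggregation stage. The helper name attachBySource is hypothetical, and the sample groups are trimmed copies of the documents above with payload omitted. Unlike the desired output, every node gets an attached array, which is simply empty for leaves.

```javascript
// Hypothetical helper: nests the de-duplicated groups by their "source" link.
// Assumes each group has the shape produced by steps 1 and 2:
// { _id, reftime, element: [ { id, type, time, source? } ] }
function attachBySource(groups) {
  // Index a shallow copy of every group by _id, each with an empty child list.
  const byId = new Map();
  for (const group of groups) {
    byId.set(group._id, { ...group, attached: [] });
  }
  // Hang each node under the node its source points to; nodes without a
  // known source become roots.
  const roots = [];
  for (const node of byId.values()) {
    const parent = byId.get(node.element[0].source);
    if (parent) {
      parent.attached.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}

const groups = [
  { _id: 1001, reftime: 1454767739, element: [{ id: 1001, type: "subsystem", time: 1454767739 }] },
  { _id: 2001, reftime: 1454767778, element: [{ id: 2001, type: "component", time: 1454767778, source: 1001 }] },
  { _id: 5002, reftime: 1454767779, element: [{ id: 5002, type: "subcomponent", time: 1454767779, source: 2001 }] },
  { _id: 5003, reftime: 1454767798, element: [{ id: 5003, type: "subcomponent", time: 1454767798, source: 2001 }] },
];

const tree = attachBySource(groups);
console.log(JSON.stringify(tree, null, 2));
```

Because the lookup goes through a Map, the nesting takes one pass over the groups regardless of how deep the hierarchy is, and no artificial source: 0 root is needed.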

Related

How to parse JSON data with Swift Class?

I want to parse this JSON into a Swift class using Codable. This can be done easily with a struct, but how do I do the same with a class?
{
"id": "0001",
"type": "donut",
"name": "Cake",
"ppu": 0.55,
"batters":
{
"batter":
[
{ "id": "1001", "type": "Regular" },
{ "id": "1002", "type": "Chocolate" },
{ "id": "1003", "type": "Blueberry" },
{ "id": "1004", "type": "Devil's Food" }
]
},
"topping":
[
{ "id": "5001", "type": "None" },
{ "id": "5002", "type": "Glazed" },
{ "id": "5005", "type": "Sugar" },
{ "id": "5007", "type": "Powdered Sugar" },
{ "id": "5006", "type": "Chocolate with Sprinkles" },
{ "id": "5003", "type": "Chocolate" },
{ "id": "5004", "type": "Maple" }
]
}
You can use https://quicktype.io/ to generate Swift files from a JSON. It will do most of the work for you. You may need to do some cleanup, but overall it does a great job generating the object.

Filter for one attribute (array) for one of its value (json)

Having the following record
{
"name": "Festões Plástico, 12mt x 17cm - Festas Populares",
"categories": [
"Festas",
"Casamentos",
"Decorações"
],
"hierarchicalCategories": {
"lvl0": "Festas",
"lvl1": "Festas > Casamentos",
"lvl2": "Festas > Casamentos > Decorações"
},
"description": "",
"brand": "Misterius",
"price": 14.94,
"stock": "Disponível",
"prices": [
{
"value": 12,
"type": "specificValue",
"family": "fatos",
"subfamily": "example"
},
{
"value": 13,
"type": "specificValue13",
"family": "fatos13",
"subfamily": "example13"
},
{
"value": 14,
"type": "specificValue14",
"family": "fatos14",
"subfamily": "example14"
},
{
"value": 15,
"type": "specificValue15",
"family": "fatos15",
"subfamily": "example15"
},
{
"value": 16,
"type": "specificValue16",
"family": "fatos16",
"subfamily": "example16"
}
],
"color": [
{
"name": "Amarelo",
"label": "Amarelo,#FFFF00",
"hexa": "#FFFF00"
},
{
"name": "Azul",
"label": "Azul,#0000FF",
"hexa": "#0000FF"
},
{
"name": "Branco",
"label": "Branco,#FFFFFF",
"hexa": "#FFFFFF"
},
{
"name": "Laranja",
"label": "Laranja,#FFA500",
"hexa": "#FFA500"
},
{
"name": "Verde Escuro",
"label": "Verde Escuro,#006400",
"hexa": "#006400"
},
{
"name": "Vermelho",
"label": "Vermelho,#FF0000",
"hexa": "#FF0000"
}
],
"specialcategorie": "",
"reference": "3546",
"rating": 0,
"free_shipping": false,
"popularity": 0,
"objectID": "30"
}
Searching for "Festas Populares" returns the record and its attributes. Is it possible to also filter one array attribute, such as "prices", so that only one of its JSON objects is returned? For example "prices.type"="specificValue14" and "family"="fatos14" and "family"="fatos" and "subfamily"="example":
{
"value": 14,
"type": "specificValue14",
"family": "fatos14",
"subfamily": "example14"
}
The returned record would be:
{
"name": "Festões Plástico, 12mt x 17cm - Festas Populares",
"categories": [
"Festas",
"Casamentos",
"Decorações"
],
"hierarchicalCategories": {
"lvl0": "Festas",
"lvl1": "Festas > Casamentos",
"lvl2": "Festas > Casamentos > Decorações"
},
"description": "",
"brand": "Misterius",
"price": 14.94,
"stock": "Disponível",
"prices": [
{
"value": 14,
"type": "specificValue14",
"family": "fatos14",
"subfamily": "example14"
}
],
"color": [
{
"name": "Amarelo",
"label": "Amarelo,#FFFF00",
"hexa": "#FFFF00"
},
{
"name": "Azul",
"label": "Azul,#0000FF",
"hexa": "#0000FF"
},
{
"name": "Branco",
"label": "Branco,#FFFFFF",
"hexa": "#FFFFFF"
},
{
"name": "Laranja",
"label": "Laranja,#FFA500",
"hexa": "#FFA500"
},
{
"name": "Verde Escuro",
"label": "Verde Escuro,#006400",
"hexa": "#006400"
},
{
"name": "Vermelho",
"label": "Vermelho,#FF0000",
"hexa": "#FF0000"
}
],
"specialcategorie": "",
"reference": "3546",
"rating": 0,
"free_shipping": false,
"popularity": 0,
"objectID": "30"
}
For some context: a product can have multiple prices associated with it, for example one for a specific user, or a discounted one on a day when a campaign is running. In those cases I want to filter the prices associated with the product/record.
No, this is not possible with Algolia. Records are always returned with the attributes specified inside attributesToRetrieve. These attributes are returned in full.
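Since the attribute comes back in full, the array can still be narrowed down on the client once the hit has been received. Here is a minimal sketch of that workaround in plain JavaScript; it does not call any Algolia API. The helper name filterPrices and the shortened sample hit are assumptions made for illustration.

```javascript
// Hypothetical helper: returns a copy of the record whose prices array keeps
// only the entries matching every key/value pair in `wanted`.
function filterPrices(record, wanted) {
  const matches = (price) =>
    Object.entries(wanted).every(([key, value]) => price[key] === value);
  return { ...record, prices: record.prices.filter(matches) };
}

// Shortened stand-in for a hit returned by the search.
const hit = {
  objectID: "30",
  prices: [
    { value: 12, type: "specificValue", family: "fatos", subfamily: "example" },
    { value: 13, type: "specificValue13", family: "fatos13", subfamily: "example13" },
    { value: 14, type: "specificValue14", family: "fatos14", subfamily: "example14" },
  ],
};

const filtered = filterPrices(hit, { type: "specificValue14", family: "fatos14" });
console.log(filtered.prices);
```

The original hit is left untouched, so the full price list stays available if another filter has to be applied later.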

Nested grouping of array

There are three master collections: category, subcategory and criteria. I will be building frameworks from any possible combination of category, subcategory and criteria, which will be stored as shown below.
A framework document holds a list of criteriaConfigs as embedded objects, each of which contains a single category, subcategory and criteria object. You can think of criteriaConfig as the equivalent of a link table in MySQL.
[
{
"id": "592bc3059f3ad715002b2331",
"name": "Framework1",
"description": "framework 1 for testing",
"criteriaConfigs": [
{
"id": "592bc3059f3ad715002b232f",
"category": {
"id": "591c2f5faa187956b2d0fb39",
"name": "category1",
"description": "category1",
"deleted": false,
"createdDate": 1495019359558
},
"subCategory": {
"id": "591c2f5faa187956b2d0fb83",
"name": "subCat1",
"description": "subCat1"
},
"criteria": {
"id": "591c2f5faa187956b2d0fbad",
"name": "criteria1",
"measure": "Action"
}
},
{
"id": "592bc3059f3ad715002b232e",
"category": {
"id": "591c2f5faa187956b2d0fb37",
"name": "Process",
"description": "Enagagement"
},
"subCategory": {
"id": "591c2f5faa187956b2d0fb81",
"name": "COMM / BRANDING",
"description": "COMM / BRANDING"
},
"criteria": {
"id": "591c2f5faa187956b2d0fba9",
"name": "Company representative forgets about customer on hold",
"measure": ""
}
} ]
},
{
"id": "592bc3059f3ad715002b2332",
"name": "Framework2",
"description": "framework 2 for testing",
"criteriaConfigs": [
{
"id": "592bc3059f3ad715002b232f",
"category": {
"id": "591c2f5faa187956b2d0fb39",
"name": "category1",
"description": "category1"
},
"subCategory": {
"id": "591c2f5faa187956b2d0fb83",
"name": "subCat1",
"description": "subCat1"
},
"criteria": {
"id": "591c2f5faa187956b2d0fbad",
"name": "criteria1",
"measure": "Action"
}
}
]
}
]
I need a view in which each framework contains its list of categories, each category contains the list of subcategories added to it, and each subcategory contains its list of criteria, all scoped to a single framework.
Expected result:
[
{
"id": "f1",
"name": "Framework1",
"description": "framework 1 for testing",
"categories": [
{
"id": "c2",
"name": "category2",
"description": "category2",
"subCategories": [
{
"id": "sb1",
"name": "subCat1",
"description": "subCat1",
"criterias": [
{
"id": "cr1",
"name": "criteria1",
"measure": "Action"
},
{
"id": "cr2",
"name": "criteria2",
"measure": "Action"
},
{
"id": "cr3",
"name": "criteria3",
"measure": "Action"
}]
},
{
"id": "sb2",
"name": "subCat2",
"description": "subCat2",
"criterias": [
{
"id": "cr1",
"name": "criteria1",
"measure": "Action"
},
{
"id": "cr4",
"name": "criteria4",
"measure": "Action"
}]
}]
},
{
"id": "c1",
"name": "category1",
"description": "category1",
"subCategories": [
{
"id": "sb3",
"name": "subCat3",
"description": "subCat3",
"criterias": [
{
"id": "cr1",
"name": "criteria1",
"measure": "Action"
},
{
"id": "cr2",
"name": "criteria2",
"measure": "Action"
}
]},
{
"id": "sb2",
"name": "subCat2",
"description": "subCat2",
"criterias": [
{
"id": "cr1",
"name": "criteria1",
"measure": "Action"
},
{
"id": "cr4",
"name": "criteria4",
"measure": "Action"
}]
}
]
}]
},
{
"id": "f2",
"name": "Framework2",
"description": "framework 2 for testing",
"categories": [
{
"id": "c2",
"name": "category2",
"description": "category2",
"subCategories": [
{
"id": "sb4",
"name": "subCat5",
"description": "subCat5",
"criterias": [
{
"id": "cr1",
"name": "criteria1",
"measure": "Action"
},
{
"id": "cr3",
"name": "criteria3",
"measure": "Action"
}]
},
{
"id": "sb2",
"name": "subCat2",
"description": "subCat2",
"criterias": [
{
"id": "cr1",
"name": "criteria1",
"measure": "Action"
},
{
"id": "cr4",
"name": "criteria4",
"measure": "Action"
}]
}]
},
{
"id": "c1",
"name": "category1",
"description": "category1",
"subCategories": [
{
"id": "sb3",
"name": "subCat3",
"description": "subCat3",
"criterias": [
{
"id": "cr1",
"name": "criteria1",
"measure": "Action"
},
{
"id": "cr2",
"name": "criteria2",
"measure": "Action"
}
]},
{
"id": "sb2",
"name": "subCat2",
"description": "subCat2",
"criterias": [
{
"id": "cr1",
"name": "criteria1",
"measure": "Action"
},
{
"id": "cr4",
"name": "criteria4",
"measure": "Action"
}]
}
]
}]
}
]
Note: the category documents don't hold any reference to subcategories, and likewise the subcategory documents don't hold any reference to criteria, because they are generic master data; a framework is created dynamically from a combination of them.
If you want to try to do all the work in the aggregation, you could group first by subcategory, then by category like:
db.collection.aggregate([
  // one document per criteriaConfigs entry
  {$unwind: "$criteriaConfigs"},
  {$project: {
    _id: 0,
    category: "$criteriaConfigs.category",
    subCategory: "$criteriaConfigs.subCategory",
    criteria: "$criteriaConfigs.criteria"
  }},
  // collect the criteria of each (category, subCategory) pair
  {$group: {
    _id: {category: "$category", subCategory: "$subCategory"},
    criteria: {$addToSet: "$criteria"}
  }},
  // collect the subcategories of each category
  {$group: {
    _id: {category: "$_id.category"},
    subCategories: {$addToSet: {
      subCategory: "$_id.subCategory",
      criteria: "$criteria"
    }}
  }},
  {$project: {
    _id: 0,
    category: "$_id.category",
    subCategories: "$subCategories"
  }}
])
Depending on how you plan to use the returned data, it may be more efficient to return each unique combination:
db.collection.aggregate([
  {$unwind: "$criteriaConfigs"},
  // one group per unique (category, subCategory, criteria) name triple
  {$group: {
    _id: {
      category: "$criteriaConfigs.category.name",
      subCategory: "$criteriaConfigs.subCategory.name",
      criteria: "$criteriaConfigs.criteria.name"
    }
  }},
  {$project: {
    _id: 0,
    category: "$_id.category",
    subCategory: "$_id.subCategory",
    criteria: "$_id.criteria"
  }}
])
I'm not sure from your question what shape you are expecting the return data to have, so you may need to adjust for that.
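If the per-framework nesting from the question is what is needed, another option is to build it in application code after fetching the framework documents. The following is a sketch of that idea in plain JavaScript rather than an aggregation pipeline. The helper names findOrAdd and nestFramework are hypothetical, and the sample document uses shortened ids in the style of the expected result.

```javascript
// Hypothetical helper: returns the entry of `list` with the same id as `props`,
// adding a copy of `props` with an empty `childKey` array if none exists yet.
function findOrAdd(list, props, childKey) {
  let entry = list.find((e) => e.id === props.id);
  if (!entry) {
    entry = { ...props, [childKey]: [] };
    list.push(entry);
  }
  return entry;
}

// Hypothetical helper: turns the flat criteriaConfigs list of one framework
// into categories -> subCategories -> criterias.
function nestFramework(framework) {
  const categories = [];
  for (const cfg of framework.criteriaConfigs) {
    const category = findOrAdd(categories, cfg.category, "subCategories");
    const subCategory = findOrAdd(category.subCategories, cfg.subCategory, "criterias");
    // Skip a criteria that is already listed under this subcategory.
    if (!subCategory.criterias.some((c) => c.id === cfg.criteria.id)) {
      subCategory.criterias.push({ ...cfg.criteria });
    }
  }
  const { criteriaConfigs, ...rest } = framework;
  return { ...rest, categories };
}

const framework = {
  id: "f1",
  name: "Framework1",
  criteriaConfigs: [
    { id: "a", category: { id: "c1", name: "category1" }, subCategory: { id: "sb1", name: "subCat1" }, criteria: { id: "cr1", name: "criteria1" } },
    { id: "b", category: { id: "c1", name: "category1" }, subCategory: { id: "sb1", name: "subCat1" }, criteria: { id: "cr2", name: "criteria2" } },
    { id: "c", category: { id: "c2", name: "category2" }, subCategory: { id: "sb2", name: "subCat2" }, criteria: { id: "cr1", name: "criteria1" } },
  ],
};

const nested = nestFramework(framework);
console.log(JSON.stringify(nested, null, 2));
```

Matching is done on id, so two copies of the same category that differ only in extra fields (as category1 does between Framework1 and Framework2 in the sample data) still end up in one entry.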

Mongodb query inside array with single ObjectID

Here is the JSON data I am trying to process. I am trying to get the messages between two dates. The data has already been imported into MongoDB.
{
"items": [
{
"date": "2017-04-06T09:46:20.387420+00:00",
"from": {
"id": 4624534,
"links": {
"self": "https://api.hipchat.com/v2/user/4624534"
},
"mention_name": "holy",
"name": "holy god",
"version": "Y1ML0DRJ"
},
"id": "38f90558-2a23-458b-b87b-88dbdf997f7a",
"mentions": [],
"message": "ping",
"type": "message"
},
{
"date": "2017-04-08T04:30:44.240163+00:00",
"from": {
"id": 4624534,
"links": {
"self": "https://api.hipchat.com/v2/user/4624534"
},
"mention_name": "holy",
"name": "holy god",
"version": "Y1ML0DRJ"
},
"id": "822b81e0-8077-41d7-bc50-fc9e4eba7d9e",
"mentions": [],
"message": "https://twitter.com/",
"type": "message"
},
{
"attach_to": "822b81e0-8077-41d7-bc50-fc9e4eba7d9e",
"card": "{\"style\": \"link\", \"description\": \"From breaking news and entertainment to sports and politics, get the full story with all the live commentary.\", \"format\": \"medium\", \"url\": \"https://twitter.com/i/hello\", \"title\": \"Twitter. It's what's happening.\", \"id\": \"https://twitter.com/i/hello\", \"validation\": {\"safehtmls\": [\"activity.html\"], \"safeurls\": [\"url\", \"images.image\", \"images.image-small\", \"images.image-big\", \"icon.url\", \"icon.url#2x\", \"icon\", \"thumbnail.url#2x\", \"thumbnail.url\"]}, \"type\": \"link\", \"thumbnail\": {\"url\": \"https://pbs.twimg.com/ext_tw_video_thumb/850335753108324353/pu/img/T8cV-7bGbbguiRGV.jpg\", \"width\": 599, \"type\": \"image/jpeg\", \"height\": 337}, \"icon\": {\"url\": \"https://abs.twimg.com/a/1491551685/img/t1/favicon.svg\", \"type\": \"image\"}}",
"color": "gray",
"date": "2017-04-08T04:30:44.825185+00:00",
"from": "Link",
"id": "7ccaf2b9-09bb-45ac-a025-c93f1f7df745",
"mentions": [],
"message": "\n\n\n<p><b>Twitter. It's what's happening.</b></p>\n\n\n<p>From breaking news and entertainment to sports and politics, get the full story with all the live commentary.</p>\n\n",
"message_format": "html",
"notification_sender": {
"client_id": "888aec94-afee-45d8-89f7-ae077fcc4a7c",
"id": "hipchat-clinky",
"type": "addon"
},
"type": "notification"
},
{
"date": "2017-04-08T09:39:00.468858+00:00",
"from": {
"id": 4624534,
"links": {
"self": "https://api.hipchat.com/v2/user/4624534"
},
"mention_name": "abcholy",
"name": "holy god",
"version": "Y1ML0DRJ"
},
"id": "8a0de0e0-c312-490e-afcc-b0b16404cd67",
"mentions": [],
"message": "second message",
"type": "message"
},
{
"date": "2017-04-11T15:32:39.367744+00:00",
"from": {
"id": 4624534,
"links": {
"self": "https://api.hipchat.com/v2/user/4624534"
},
"mention_name": "abcholy",
"name": "holy god",
"version": "Y1ML0DRJ"
},
"id": "4c1a090c-cb71-4548-8f96-a03ec2f3fb3b",
"mentions": [],
"message": "https://gist.github.com/abcholy/6d86352e73eab21cdd4fe78b37bd5aa0",
"type": "message"
},
{
"date": "2017-04-11T15:33:42.730696+00:00",
"from": {
"id": 4624534,
"links": {
"self": "https://api.hipchat.com/v2/user/4624534"
},
"mention_name": "abcholy",
"name": "holy god",
"version": "Y1ML0DRJ"
},
"id": "a42b5267-937b-4de5-8c51-de4625742a4a",
"mentions": [],
"message": "hello",
"type": "message"
}
],
"links": {
"self": "https://api.hipchat.com/v2/room/3452990/history"
},
"maxResults": 100,
"startIndex": 0
}
Now, I entered this mongo query:
db.data.find( {"_id" : ObjectId("58ee59f7f35120aaba26cff0")},{ items: { $elemMatch: { "date": {$gte:"2017-04-08T04:30:44.240163+00:00"} } } } )
But it returns just a single item, the first match. If I try $lte, it also returns a single item, but I want all the items that fall within the date condition. How can I achieve that?
Your date field is a string, so $gte and $lte compare it as text, character by character, rather than as a date. Convert it to an ISODate or another date type before using $gte or $lte on it.
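It is also worth knowing that the $elemMatch projection operator only ever returns the first matching array element, which is why a single item comes back whichever comparison is used. As an illustration of comparing real dates and keeping every match, here is a minimal sketch in plain JavaScript that filters the items array after the document has been fetched. The helper names toDate and itemsBetween are hypothetical, and the sample items are shortened copies of the data above.

```javascript
// Parses the timestamp strings from the question. The regex trims the
// microseconds to milliseconds, the precision a JavaScript Date stores.
function toDate(text) {
  return new Date(text.replace(/(\.\d{3})\d+/, "$1"));
}

// Hypothetical helper: returns every item whose date lies between
// `from` and `to`, both inclusive.
function itemsBetween(items, from, to) {
  const start = toDate(from);
  const end = toDate(to);
  return items.filter((item) => {
    const date = toDate(item.date);
    return date >= start && date <= end;
  });
}

const items = [
  { date: "2017-04-06T09:46:20.387420+00:00", message: "ping" },
  { date: "2017-04-08T04:30:44.240163+00:00", message: "https://twitter.com/" },
  { date: "2017-04-08T09:39:00.468858+00:00", message: "second message" },
  { date: "2017-04-11T15:33:42.730696+00:00", message: "hello" },
];

const found = itemsBetween(items, "2017-04-08T00:00:00+00:00", "2017-04-09T00:00:00+00:00");
console.log(found.map((item) => item.message));
```

The same idea can be pushed into the database once the dates are stored as real date values, but that depends on the schema and server version, so it is left out of this sketch.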

What is the date format from the Stash API?

In the JSON response below, what is the date format of createdDate and updatedDate? I am not sure how to work backwards to find out which format the API uses for dates. I couldn't find this anywhere in the documentation.
{
"size": 1,
"limit": 25,
"isLastPage": true,
"values": [
{
"id": 101,
"version": 1,
"title": "Talking Nerdy",
"description": "It’s a kludge, but put the tuple from the database in the cache.",
"state": "OPEN",
"open": true,
"closed": false,
"createdDate": 1359075920,
"updatedDate": 1359085920,
"fromRef": {
"id": "refs/heads/feature-ABC-123",
"repository": {
"slug": "my-repo",
"name": null,
"project": {
"key": "PRJ"
}
}
},
"toRef": {
"id": "refs/heads/master",
"repository": {
"slug": "my-repo",
"name": null,
"project": {
"key": "PRJ"
}
}
},
"locked": false,
"author": {
"user": {
"name": "tom",
"emailAddress": "tom@example.com",
"id": 115026,
"displayName": "Tom",
"active": true,
"slug": "tom",
"type": "NORMAL"
},
"role": "AUTHOR",
"approved": true
},
"reviewers": [
{
"user": {
"name": "jcitizen",
"emailAddress": "jane@example.com",
"id": 101,
"displayName": "Jane Citizen",
"active": true,
"slug": "jcitizen",
"type": "NORMAL"
},
"role": "REVIEWER",
"approved": true
}
],
"participants": [
{
"user": {
"name": "dick",
"emailAddress": "dick@example.com",
"id": 3083181,
"displayName": "Dick",
"active": true,
"slug": "dick",
"type": "NORMAL"
},
"role": "PARTICIPANT",
"approved": false
},
{
"user": {
"name": "harry",
"emailAddress": "harry@example.com",
"id": 99049120,
"displayName": "Harry",
"active": true,
"slug": "harry",
"type": "NORMAL"
},
"role": "PARTICIPANT",
"approved": true
}
],
"link": {
"url": "http://link/to/pullrequest",
"rel": "self"
},
"links": {
"self": [
{
"href": "http://link/to/pullrequest"
}
]
}
}
],
"start": 0
}
Just making a note that in my case it is a UNIX timestamp in milliseconds, so I have to remove the last three digits (divide by 1000). E.g. the data looks like this:
"createdDate":1555621993000
If interpreted as a UNIX timestamp in seconds, that would be 09/12/51265 @ 4:16am (UTC).
By removing the three trailing zeroes I get 1555621993, which is the correct time 04/18/2019 @ 9:13pm (UTC).
Your mileage may vary but that was a key discovery for me :)
It looks like a UNIX timestamp.
https://en.wikipedia.org/wiki/Unix_time
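To check either reading quickly, the two sample values from this thread can be converted in JavaScript, whose Date constructor takes milliseconds since the Unix epoch. This is a small sketch: the helper name epochToDate and its 1e11 cut-off for telling seconds from milliseconds are assumptions made for illustration, not something the Stash API documents.

```javascript
// Hypothetical helper: converts a Unix timestamp to a Date.
// Assumption: values of 1e11 or more are already in milliseconds (1e11
// seconds would be past the year 5000), smaller values are in seconds.
function epochToDate(value) {
  const millis = value >= 1e11 ? value : value * 1000;
  return new Date(millis);
}

// Value from the JSON in the question (seconds).
console.log(epochToDate(1359075920).toISOString()); // 2013-01-25T01:05:20.000Z

// Value from the answer above (milliseconds).
console.log(epochToDate(1555621993000).toISOString()); // 2019-04-18T21:13:13.000Z
```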