Using mongodb aggregate to match documents and get all values for a field - mongodb

In Mongo 3.4
I have a collection with documents in the format:
Type1:
{
"Level1": {
"#version": "genR",
"#revision": "aux",
"Level2": {
"container": {
"type": "ARRAY",
"categories": [
{
"category": [
{
"Type": "STRING",
"Value": "Currency"
},
{
"Type": "STRING",
"Value": "EUR"
}
]
},
{
"category": [
{
"Type": "STRING",
"Value": "Portfolio"
},
{
"Type": "STRING",
"Value": "ABCDEF"
}
]
},
]
}
}
}
}
Type 2:
{
"Level1": {
"#version": "genR",
"#revision": "aux",
"Level2": {
"container": {
"type": "ARRAY",
"categories": [
{
"category": [
{
"Type": "STRING",
"Value": "Currency"
},
{
"Type": "STRING",
"Value": "EUR"
}
]
},
{
"category": [
{
"Type": "STRING",
"Value": "Portfolio"
},
{
"Type": "STRING",
"Value": "ABCDEF"
}
]
},
{
"category": [
{
"Type": "STRING",
"Value": "Short Description"
},
{
"Type": "STRING",
"Value": "Cash Only"
}
]
},
]
}
}
}
}
How do i write an aggregate statement so that I get ALL the Currency Values, ONLY from the documents where Portfolio matches a certain value.
I have been using pymongo's aggregate framework as below:
pipeline = [{"$unwind":"$Level1.Level2.container.categories"},{"$unwind":"$Level1.Level2.container.categories.category"},{"$match":{"Level1.Level2.container.categories.category.Value":"Portfolio"}}]
pprint(db.command('aggregate',collection,pipeline=pipeline))
But no results. Pymongo is a little confusing. Even if someone can point the general approach, it would really help.
The expected response assuming 4 matching documents (each with varying number of category items) is:
{'Currency': [{'Level1': {'Level2': {'container': {'categories': {'category': {'Value': 'EUR'}}}}}},
{'Level1': {'Level2': {'container': {'categories': {'category': {'Value': 'EUR'}}}}}},
{'Level1': {'Level2': {'container': {'categories': {'category': {'Value': 'USD'}}}}}},
{'Level1': {'Level2': {'container': {'categories': {'category': {'Value': 'EUR'}}}}}}]}

Your structure is not ideal but you can use below query.
The below $match stage $ands two conditions. Looks in category array ($elemMatch) under categories ($elemMatch) array for elements satisfying both ($all) Portfolio match with ABCDEF value condition followed by condition for element with Currrency value.
$unwind stage is to break down the categories followed by $match to keep the Currency category embedded array documents.
$unwind stage is to break down the category followed by $match to remove the Currency Value embedded document.
Final two stages is to $group + $push the remaining data into embedded array and $project the Currency value.
You can run one stage at a time to view the intermediate output for better understanding.
db.collection.aggregate(
{ $match :
{ $and :
[
{ "Level1.Level2.container.categories":
{ $elemMatch:
{ "category":
{ $all:
[
{ $elemMatch : { "Type": "STRING", "Value": "Portfolio" } },
{ $elemMatch : { "Type": "STRING", "Value": "ABCDEF" } }
]
}
}
}
},
{ "Level1.Level2.container.categories":
{ $elemMatch:
{ "category":
{ $elemMatch : { "Type": "STRING", "Value": "Currency" } }
}
}
}
]
}
},
{ $unwind : "$Level1.Level2.container.categories" },
{ $match : { "Level1.Level2.container.categories.category.Value": "Currency" } },
{ $unwind : "$Level1.Level2.container.categories.category" },
{ $match : { "Level1.Level2.container.categories.category.Value": { $ne : "Currency" } } },
{ $group: { _id: null, "Currency": { $push: "$$ROOT" } } },
{ $project: { _id: 0, "Currency.Level1.Level2.container.categories.category.Value": 1 } } )

Related

Querying a map (<String, Object>) in JSON through MongoDB

How to query a map of type Map<String, List> in JSON form, in MongoDB?
Sample JSON:
{
"WIDTH": 810,
"HEIGHT": 465,
"MODULES": {
"23": {
"XNAME": "COMP1",
"PARAMS": {
"_Klockers": {
"TYPE": "text",
"VALUE": "Klocker#3"
},
"SUBSYS": {
"TYPE": "text",
"VALUE": "2"
},
"EP": {
"TYPE": "integer",
"VALUE": "2"
}
}
},
"24": {
"XNAME": "COMP2",
"PARAMS": {
"_Rockers": {
"TYPE": "text",
"VALUE": "Rocker#3"
},
"Driver": {
"TYPE": "binary",
"VALUE": 1
},
"EP": {
"TYPE": "long",
"VALUE": "233"
}
}
},
"25": {
"XNAME": "COMP3",
"PARAMS": {
"_Mockers": {
"TYPE": "text",
"VALUE": "Mocker#3"
},
"SYSMain": {
"TYPE": "text",
"VALUE": "2342"
},
"TLP": {
"TYPE": "double",
"VALUE": "2.3"
}
}
}
}
}
Basically I want to :
List all the "XNAME" field values of all keys in "MODULES".
Expected output : {"COMP1", "COMP2", "COMP3"}
List all the "TYPE" in "PARAMS" object within each key of "MODULES".
Expected output : {"text", "text", "integer", "text", "binary", "long", "text", "text", "double"}
I am new to MongoDB and any help or redirection is appreciated.
You can use this
db.collection.aggregate([
{
$project: {//You require this as your data is dynamic
"modules": {
"$objectToArray": "$MODULES"
}
}
},
{//Destruct the array
"$unwind": "$modules"
},
{
"$project": {//Again, requires the same as keys are dynamic
"types": {
"$objectToArray": "$modules.v.PARAMS"
},
xname: "$modules.v.XNAME"
}
},
{//Destruct the types
$unwind: "$types"
},
{//Get the distinct values
$group: {
"_id": null,
"xname": {
"$addToSet": "$xname"
},
"types": {
"$addToSet": "$types.v.TYPE"
},
}
}
])

mongodb distinct query values

I have the following mongodb documents:
{
"_id": "",
"name": "example1",
"colors": [
{
"id": 1000000,
"properties": [
{
"id": "1000",
"name": "",
"value": "green"
},
{
"id": "2000",
"name": "",
"value": "circle"
}
]
} ]
}
{
"_id": "",
"name": "example2",
"colors": [
{
"id": 1000000,
"properties": [
{
"id": "1000",
"name": "",
"value": "red"
},
{
"id": "4000",
"name": "",
"value": "box"
}
]
} ]
}
I would like to get distinct queries on the value field in the array where id=1000
db.getCollection('product').distinct('colors.properties.value', {'colors.properties.id':{'$eq': 1000}})
but it returns all values in the array.
The expected Result would be:
["green", "red"]
There are a lot of way to do.
$match eliminates unwanted data
$unwind de-structure the array
$addToSet in $group gives the distinct data
The mongo script :
db.collection.aggregate([
{
$match: {
"colors.properties.id": "1000"
}
},
{
"$unwind": "$colors"
},
{
"$unwind": "$colors.properties"
},
{
$match: {
"colors.properties.id": "1000"
}
},
{
$group: {
_id: null,
distinctData: {
$addToSet: "$colors.properties.value"
}
}
}
])
Working Mongo playground

Mongo Aggregation using $Max

I have a collection that stores history, i.e. a new document is created every time a change is made to the data, I need to extract fields based on the max value of a date field, however my query keeps returning either all of the dates or requires me to push the fields into an array which make the data hard to analyze for an end-user.
Expected output as CSV:
MAX(DATE), docID, url, type
1579719200216, 12371, www.foodnetwork.com, food
1579719200216, 12371, www.cnn.com, news,
1579719200216, 12371, www.wikipedia.com, info
Sample Doc:
{
"document": {
"revenueGroup": "fn",
"metaDescription": "",
"metaData": {
"audit": {
"lastModified": 1312414124,
"clientId": ""
},
"entities": [],
"docId": 1313943,
"url": ""
},
"rootUrl": "",
"taggedImages": {
"totalSize": 1,
"list": [
{
"image": {
"objectId": "woman-reaching-for-basket",
"caption": "",
"url": "",
"height": 3840,
"width": 5760,
"owner": "Facebook",
"alt": "Woman reaching for basket"
},
"tags": {
"totalSize": 4,
"list": []
}
}
]
},
"title": "The 8 Best Food Items of 2020",
"socialTitle": "The 8 Best Food Items of 2020",
"primaryImage": {
"objectId": "woman-reaching-for-basket.jpg",
"caption": "",
"url": "",
"height": 3840,
"width": 5760,
"owner": "Hero Images / Getty Images",
"alt": "Woman reaching for basket in laundry room"
},
"subheading": "Reduce your footprint with these top-performing diets",
"citations": {
"list": []
},
"docId": 1313943,
"revisionId": "1313943_1579719200216",
"templateType": "LIST",
"documentState": {
"activeDate": 579719200166,
"state": "ACTIVE"
}
},
"url": "",
"items": {
"totalSize": "",
"list": [
{
"type": "recipe",
"data": {
"comInfo": {
"list": [
{
"type": "food",
"id": "https://www.foodnetwork.com"
}
]
},
"type": ""
},
"id": 4,
"uuid": "1313ida-qdad3-42c3-b41d-223q2eq2j"
},
{
"type": "recipe",
"data": {
"comInfo": {
"list": [
{
"type": "news",
"id": "https://www.cnn.com"
},
{
"type": "info",
"id": "https://www.wikipedia.com"
}
]
},
"type": "PRODUCT"
},
"id": 11,
"uuid": "318231jc-da12-4475-8994-283u130d32"
}
]
},
"vertical": "food"
}
Below query:
db.collection.aggregate([
{
$match: {
vertical: "food",
"document.documentState.state": "ACTIVE",
"document.templateType": "LIST"
}
},
{
$unwind: "$document.items"
},
{
$unwind: "$document.items.list"
},
{
$unwind: "$document.items.list.contents"
},
{
$unwind: "$document.items.list.contents.list"
},
{
$match: {
"document.items.list.contents.list.type": "recipe",
"document.revenueGroup": "fn"
}
},
{
$sort: {
"document.revisionId": -1
}
},
{
$group: {
_id: {
_id: {
docId: "$document.docId",
date: {$max: "$document.revisionId"}
},
url: "$document.items.list.contents.list.data.comInfo.list.id",
type: "$document.items.list.contents.list.data.comInfo.list.type"
}
}
},
{
$project: {
_id: 1
}
},
{
$sort: {
"document.items.list.contents.list.id": 1, "document.revisionId": -1
}
}
], {
allowDiskUse: true
})
First of all, you need to go through the documentation of the $group aggregation here.
you should be doing this instead:
{
$group: {
"_id": "$document.docId"
"date": {
$max: "$document.revisionId"
},
"url": {
$first: "$document.items.list.contents.list.data.comInfo.list.id"
},
"type": {
$first:"$document.items.list.contents.list.data.comInfo.list.type"
}
}
}
This will give you the required output.

Combining unique elements of arrays without $unwind

I would like to get the unique elements of all arrays in a collection. Consider the following collection
[
{
"collection": "collection",
"myArray": [
{
"name": "ABC",
"code": "AB"
},
{
"name": "DEF",
"code": "DE"
}
]
},
{
"collection": "collection",
"myArray": [
{
"name": "GHI",
"code": "GH"
},
{
"name": "DEF",
"code": "DE"
}
]
}
]
I can achieve this by using $unwind and $group like this:
db.collection.aggregate([
{
$unwind: "$myArray"
},
{
$group: {
_id: null,
data: {
$addToSet: "$myArray"
}
}
}
])
And get the output:
[
{
"_id": null,
"data": [
{
"code": "GH",
"name": "GHI"
},
{
"code": "DE",
"name": "DEF"
},
{
"code": "AB",
"name": "ABC"
}
]
}
]
However, the array "myArray" will have a lot of elements (about 6) and the number of documents passed into this stage of the pipeline will be about 600. So unwinding the array would give me a total of 3600 documents being processed. I would like to know if there's a way for me to achieve the same result without unwinding
You can use below aggregation
db.collection.aggregate([
{ "$group": {
"_id": null,
"data": { "$push": "$myArray" }
}},
{ "$project": {
"data": {
"$reduce": {
"input": "$data",
"initialValue": [],
"in": { "$setUnion": ["$$this", "$$value"] }
}
}
}}
])
Output
[
{
"_id": null,
"data": [
{
"code": "AB",
"name": "ABC"
},
{
"code": "DE",
"name": "DEF"
},
{
"code": "GH",
"name": "GHI"
}
]
}
]

Order by nested document Mongodb

I'm having multiple documents in a collection, each document has this data structure :
{
"_id": {
"$id": "5429409c9ac25ebe338b4567"
},
"data": [
{
"data_id": "70",
"info_data": [
{
"data_id": "98",
"data_index": 0,
"value": "info data"
},
{
"data_id": "99",
"data_index": 0,
"value": "some info data
}
]
},
{
"data_id": "71",
"info_data": [
{
"data_id": "98",
"data_index": 0,
"value": "some data"
},
{
"data_id": "99",
"data_index": 0,
"value": "more data"
}
]
}
]
},
{
"_id": {
"$id": "542940ac9ac25ef6358b4567"
},
"data": [
{
....
I need to conditionally sort these documents. for example I need to sort all the data.info_data documents only when the data.info_data.data_id = 98 by data.info_data.value only
So basically I need to sort an inner document that only matches to some criteria (the inner document no the external one).
I guess I need to use aggregation with unwind but I'm not sure how.