How to retrieve each single array element from mongo pipeline? - mongodb

Let's assume this is what a sample document looks like in MongoDB:
[
{
"_id": "1",
"attrib_1": "value_1",
"attrib_2": "value_2",
"months": {
"2": {
"month": "2",
"year": "2008",
"transactions": [
{
"field_1": "val_1",
"field_2": "val_2",
},
{
"field_1": "val_4",
"field_2": "val_5",
"field_3": "val_6"
},
]
},
"3": {
"month": "3",
"year": "2018",
"transactions": [
{
"field_1": "val_7",
"field_3": "val_9"
},
{
"field_1": "val_10",
"field_2": "val_11",
},
]
},
}
}
]
The desired output is something like this, (I am just showing it for months 2 & 3)
id | months | year | field_1 | field_2 | field_3
1  | 2      | 2008 | val_1   | val_2   |
1  | 2      | 2008 | val_4   | val_5   | val_6
1  | 3      | 2018 | val_7   |         | val_9
1  | 3      | 2018 | val_10  | val_11  |
My attempt:
I tried something like this in Py-Mongo,
pipeline = [
{
# some filter logic here to filter data basically first
},
{
"$addFields": {
"latest": {
"$map": {
"input": {
"$objectToArray": "$months",
},
"as": "obj",
"in": {
"all_field_1" : {"$ifNull" : ["$$obj.v.transactions.field_1", [""]]},
"all_field_2": {"$ifNull" : ["$$obj.v.transactions.field_2", [""]]},
"all_field_3": {"$ifNull" : ["$$obj.v.transactions.field_3", [""]]},
"all_months" : {"$ifNull" : ["$$obj.v.month", ""]},
"all_years" : {"$ifNull" : ["$$obj.v.year", ""]},
}
}
}
}
},
{
"$project": {
"_id": 1,
"months": "$latest.all_months",
"year": "$latest.all_years",
"field_1": "$latest.all_field_1",
"field_2": "$latest.all_field_2",
"field_3": "$latest.all_field_3",
}
}
]
# and I executed it as
my_db.collection.aggregate(pipeline, allowDiskUse=True)
The above does bring back the data, but as lists. Is there an easy way to get one row per transaction within MongoDB itself?
The above returns the data like this:
id | months | year | field_1 | field_2 | field_3
1 | ["2", "3"] | ["2008", "2018"] | [["val_1", "val_4"], ["val_7", "val_10"]] | [["val_2", "val_5"], ["", "val_11"]] | [["", "val_6"], ["val_9", ""]]
Would highly appreciate your valuable inputs regarding the same and a better way to do the same as well!
Thanks for your time.
My Mongo version is 3.4.6 and I am using PyMongo as my driver. You can see the query in action at mongo-db-playground

Doing all of this in a single aggregation query might be a bad idea; you could do it on the client side instead.
I have created a query, but it is lengthy and may cause performance issues on large data sets:
$objectToArray converts the months object to an array
$unwind deconstructs the months array
$unwind deconstructs the transactions array and stores each element's array position in the field index
$group by _id, year, month and index, and take the first transactions object into fields
$project is optional; use it to shape the response if you want. I have added it in the playground link
pipeline = [
    {"$match": {}},  # placeholder: put your filter conditions here
    # turn the months object into an array of {k, v} pairs
    {"$project": {"months": {"$objectToArray": "$months"}}},
    # one document per month
    {"$unwind": "$months"},
    # one document per transaction, keeping its array position in "index"
    {
        "$unwind": {
            "path": "$months.v.transactions",
            "includeArrayIndex": "index",
        }
    },
    {
        "$group": {
            "_id": {
                "_id": "$_id",
                "year": "$months.v.year",
                "month": "$months.v.month",
                "index": "$index",
            },
            "fields": {"$first": "$months.v.transactions"},
        }
    },
]
my_db.collection.aggregate(pipeline, allowDiskUse=True)
Playground
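Since the answer suggests that the reshaping could also happen on the client side, here is a minimal plain-Python sketch of that route. It assumes documents shaped like the sample in the question. The names flatten_months and FIELDS are hypothetical, and missing fields are filled with an empty string, as in the asker's $ifNull attempt.

```python
# Hypothetical client-side helper: one output row per transaction.
FIELDS = ("field_1", "field_2", "field_3")  # assumed column set, taken from the question


def flatten_months(doc):
    """Yield one flat dict per element of months.<key>.transactions."""
    for entry in doc.get("months", {}).values():      # roughly $objectToArray + $unwind
        for tx in entry.get("transactions", []):      # roughly the second $unwind
            row = {
                "id": doc["_id"],
                "months": entry.get("month", ""),
                "year": entry.get("year", ""),
            }
            for name in FIELDS:
                row[name] = tx.get(name, "")          # "" when the field is absent
            yield row


# Sample document copied from the question (unrelated attributes omitted).
sample = {
    "_id": "1",
    "months": {
        "2": {"month": "2", "year": "2008", "transactions": [
            {"field_1": "val_1", "field_2": "val_2"},
            {"field_1": "val_4", "field_2": "val_5", "field_3": "val_6"},
        ]},
        "3": {"month": "3", "year": "2018", "transactions": [
            {"field_1": "val_7", "field_3": "val_9"},
            {"field_1": "val_10", "field_2": "val_11"},
        ]},
    },
}

rows = list(flatten_months(sample))
for row in rows:
    print(row)
```

With PyMongo, the same helper could be applied to each document returned by a find() cursor, so the server only has to do the filtering.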

Related

mongodb query to filter the array of objects using $gte and $lte operator

My documents:
[{
"_id":"621c6e805961def3332bcf97",
"title":"monk plus",
"brand":"venture electronics",
"category":"earphones",
"variant":[
{
"price":1100,
"impedance":"16ohm"
},
{
"price":1600,
"impedance":"64ohm"
}],
"salesCount":185,
"buysCount":182,
"viewsCount":250
},
{
"_id":"621c6dab5961def3332bcf92",
"title":"nokia1",
"brand":"nokia",
"category":"mobile phones",
"variant":[
{
"price":10000,
"RAM":"4GB",
"ROM":"32GB"
},
{
"price":15000,
"RAM":"6GB",
"ROM":"64GB"
},
{
"price":20000,
"RAM":"8GB",
"ROM":"128GB"
}],
"salesCount":34,
"buysCount":21,
"viewsCount":80
}]
Expected output:
[{
_id:621c6e805961def3332bcf97
title:"monk plus"
brand:"venture electronics"
category:"earphones"
salesCount:185
viewsCount:250
variant:[
{
price:1100
impedance:"16ohm"
}]
}]
I have tried this aggregation method
[{
$match: {
'variant.price': {
$gte: 0,$lte: 1100
}
}},
{
$project: {
title: 1,
brand: 1,
category: 1,
salesCount: 1,
viewsCount: 1,
variant: {
$filter: {
input: '$variant',
as: 'variant',
cond: {
$and: [
{
$gte: ['$$variant.price',0]
},
{
$lte: ['$$variant.price',1100]
}
]
}
}
}
}}]
This method returns the expected output. My question is: is there a better approach that returns the same output? I am new to NoSQL databases, so I am curious to learn from the community. Thank you in advance. Note that in the expected output the properties of the matching document must be returned as they are; only the variant array of objects should be filtered by price.
There's nothing wrong with your aggregation pipeline, and there are other ways to do it. If you just want to return matching documents, with only the first matching array element, here's another way to do it. (The .$ syntax only returns the first match unfortunately.)
db.collection.find({
// matching conditions
"variant.price": {
"$gte": 0,
"$lte": 1100
}
},
{
title: 1,
brand: 1,
category: 1,
salesCount: 1,
viewsCount: 1,
// only return first array element that matched
"variant.$": 1
})
Try it on mongoplayground.net.
Or, if you want to use an aggregation pipeline and return all matching documents in entirety except for the filtered array, you could just "overwrite" the array with the elements you want using "$set" (or its alias "$addFields"). Doing this means you won't need to "$project" anything.
db.collection.aggregate([
{
"$match": {
"variant.price": {
"$gte": 0,
"$lte": 1100
}
}
},
{
"$set": {
"variant": {
"$filter": {
"input": "$variant",
"as": "variant",
"cond": {
"$and": [
{ "$gte": [ "$$variant.price", 0 ] },
{ "$lte": [ "$$variant.price", 1100 ] }
]
}
}
}
}
}
])
Try it on mongoplayground.net.
Your solution is good. Just make sure to apply your $match and pagination before this step for faster queries.
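To make the effect of the $match plus $filter combination easier to see, here is a rough plain-Python emulation. It is only an illustration, not MongoDB API code, and filter_variants is a hypothetical name. One assumption to be aware of: the sketch keeps a document only when a single variant element falls inside the range, which corresponds to an $elemMatch condition; the dotted "variant.price" match in the pipelines above can also be satisfied by two different elements, one per bound.

```python
def filter_variants(docs, low, high):
    """Keep documents with at least one variant priced in [low, high],
    and trim each kept document's variant list to the in-range elements."""
    result = []
    for doc in docs:
        kept = [
            v for v in doc.get("variant", [])
            if "price" in v and low <= v["price"] <= high   # the $filter condition
        ]
        if kept:                                            # the matching step
            result.append({**doc, "variant": kept})         # overwrite, like $set
    return result


# Documents abbreviated from the question.
docs = [
    {"title": "monk plus", "salesCount": 185, "variant": [
        {"price": 1100, "impedance": "16ohm"},
        {"price": 1600, "impedance": "64ohm"},
    ]},
    {"title": "nokia1", "salesCount": 34, "variant": [
        {"price": 10000, "RAM": "4GB", "ROM": "32GB"},
        {"price": 15000, "RAM": "6GB", "ROM": "64GB"},
    ]},
]

matched = filter_variants(docs, 0, 1100)
print(matched)
```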

MongoDb - Update all properties in an object using MongoShell

I have a collection with many documents containing shipping prices:
{
"_id": {
"$oid": "5f7439c3bc3395dd31ca4f19"
},
"adapterKey": "transport1",
"pricegrid": {
"10000": 23.66,
"20000": 23.75,
"30000": 23.83,
"31000": 43.5,
"40000": 44.16,
"50000": 49.63,
"60000": 50.25,
"70000": 52,
"80000": 56.62,
"90000": 59,
"100000": 62.5,
"119000": 68.85,
"149000": 80,
"159000": 87,
"179000": 94,
"199000": 100.13,
"249000": 118.5,
"299000": 138.62,
"999000": 208.63
},
"zones": [
"25"
],
"franco": null,
"tax": 20,
"doc_created": {
"$date": "2020-09-30T07:54:43.966Z"
},
"idConfig": "0000745",
"doc_modified": {
"$date": "2020-09-30T07:54:43.966Z"
}
}
In pricegrid, all the properties can be different from one grid to another.
I'd like to update all the prices in the field "pricegrid" (price * 1.03 + 1).
I tried this :
db.shipping_settings.updateMany(
{ 'adapterKey': 'transport1' },
{
$mul: { 'pricegrid.$': 1.03 },
$inc: { 'pricegrid.$': 1}
}
)
Resulting in this error :
MongoServerError: Updating the path 'pricegrid.$' would create a conflict at 'grille.$'
So I tried with only $mul (planning on doing $inc in another query) :
db.shipping_settings.updateMany(
{ 'adapterKey': 'transport1' },
{
$mul: { 'pricegrid.$': 1.03 }
}
)
But in that case, I get this error :
MongoServerError: The positional operator did not find the match needed from the query.
Could you please direct me on the correct way to write the request ?
You can use an aggregation pipeline in an update (this requires MongoDB 4.2 or later). First apply $objectToArray to pricegrid to convert it into an array of k-v pairs. Then use $map to perform the computation. Finally, use $arrayToObject to convert it back.
db.collection.updateMany({
"adapterKey": "transport1"
},
[
{
$set: {
pricegrid: {
"$objectToArray": "$pricegrid"
}
}
},
{
"$set": {
"pricegrid": {
"$map": {
"input": "$pricegrid",
"as": "p",
"in": {
"k": "$$p.k",
"v": {
"$add": [
{
"$multiply": [
"$$p.v",
1.03
]
},
1
]
}
}
}
}
}
},
{
$set: {
pricegrid: {
"$arrayToObject": "$pricegrid"
}
}
}
])
Here is the Mongo playground for your reference.
You can do it with the aggregation framework:
$objectToArray to transform the pricegrid object into an array so you can iterate over its items
$map to iterate over the array generated in the previous step
$sum and $multiply to perform the arithmetic
$arrayToObject to transform the updated array back into an object
db.collection.update({
"adapterKey": "transport1"
},
[
{
"$set": {
"pricegrid": {
"$arrayToObject": {
"$map": {
"input": {
"$objectToArray": "$pricegrid"
},
"in": {
k: "$$this.k",
v: {
"$sum": [
1,
{
"$multiply": [
"$$this.v",
1.03
]
}
]
}
}
}
}
}
}
}
],
{
"multi": true
})
Working example
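The three steps both answers rely on ($objectToArray, $map, $arrayToObject) can be pictured with a small plain-Python sketch. It runs on an ordinary dict rather than on the database, and reprice is a hypothetical name; the factor 1.03 and the surcharge 1 come from the question.

```python
def reprice(pricegrid, factor=1.03, surcharge=1):
    """Apply price * factor + surcharge to every value; keys stay untouched."""
    # like $objectToArray: {"10000": 23.66} -> [{"k": "10000", "v": 23.66}]
    pairs = [{"k": k, "v": v} for k, v in pricegrid.items()]
    # like $map: recompute v for every pair, keep k
    pairs = [{"k": p["k"], "v": p["v"] * factor + surcharge} for p in pairs]
    # like $arrayToObject: back to a single object
    return {p["k"]: p["v"] for p in pairs}


# A few entries taken from the question's pricegrid.
grid = {"10000": 23.66, "20000": 23.75, "70000": 52}
updated = reprice(grid)
print(updated)
```

Because the keys are only carried along, this works no matter which weight brackets a particular grid contains.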
I might be wrong, but it looks like there's currently no support for this feature - there's actually an open jira-issue that addresses this topic. Doesn't look like this is going to be implemented though.

MongoDB $push aggregation won't keep the right order

I tried to make a $group aggregation with MongoDB, like the following example:
"$group": {
"_id": "$test_id",
"feeling": {
"$push": "$feeling"
},
"reference_id": {
"$push": "$_id"
},
"training_start": {
"$push": "$training_start"
},
"training_duration": {
"$push": "$duration_ms"
}
}
The aggregation works fine, but the created arrays are ordered differently from each other. That is, if I compare reference_id[x] and training_start[x] in the aggregation result, the training_start value of the source document with that _id is not equal to training_start[x].
Maybe an example shows my problem more precisely:
One document after the $group aggregation:
{
_id: "string_1",
reference_id: [1, 2, 3],
training_start: [01:00:00, 02:00:00, 03:00:00] (date times)
}
Documents from source collection:
{
_id:1,
training_start: 01:00:00,
test_id: "string_1"
},
{
_id:2,
training_start: 03:00:00,
test_id: "string_1"
},
{
_id:3,
training_start: 02:00:00,
test_id: "string_1"
}
The first elements in these arrays are always in the right order. So I checked whether each grouped field has the same number of entries, using the code below. Annoyingly, the number of entries in each array is equal, so the arrays are not shifted by missing values.
"$group": {
"_id": "$test_id",
"sum": {
"$sum": {
"$cond": {
"if": {
"$lte": [
"$training_start", null
]
},
"then": 0,
"else": 1
}
}
}
}
Does anybody know if there is another way to create arrays (I already tried $addToSet) that keeps the order in which the elements were pushed? Or am I the problem?
Greetings Max

mongodb aggregation - nested group

I'm trying to perform a nested group. I have an array of documents with two keys (invoiceIndex, proceduresIndex), and I need the documents to be arranged like so:
invoices (parent) -> procedures (children)
invoices: [ // Array of invoices
{
.....
"procedures": [{}, ...] // Array of procedures
}
]
Here is a sample document
{
"charges": 226.09000000000003,
"currentBalance": 226.09000000000003,
"insPortion": "",
"currentInsPortion": "",
"claim": "notSent",
"status": "unpaid",
"procedures": {
"providerId": "9vfpjSraHzQFNTtN7",
"procedure": "21111",
"description": "One surface",
"category": "basicRestoration",
"surface": [
"m"
],
"providerName": "B Dentist",
"proceduresIndex": "0"
},
"patientId": "mE5vKveFArqFHhKmE",
"patientName": "Silvia Waterman",
"invoiceIndex": "0",
"proceduresIndex": "0"
}
Here is what I have tried
https://mongoplayground.net/p/AEBGmA32n8P
Can you try the following:
db.collection.aggregate([
{
$group: {
_id: "$invoiceIndex",
procedures: {
$push: "$procedures"
},
invoice: {
$first: "$$ROOT"
}
}
},
{
$addFields: {
"invoice.procedures": "$procedures"
}
},
{
"$replaceRoot": {
"newRoot": "$invoice"
}
}
])
I retain the invoice fields with invoice: { $first: "$$ROOT" } and keep the $push of procedures as a separate field. Then, with $addFields, I move that array of procedures into the new invoice object, and finally replace the root with it.
You shouldn't use proceduresIndex as part of _id in $group, because then you won't be able to get the set of procedures per invoiceIndex. With my $group logic it works, as you can see.
Link to mongoplayground
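The grouping logic described above can be sketched in plain Python to show what each stage contributes. This is an illustration only: group_invoices is a hypothetical name, the rows are abbreviated from the sample document, and the second procedure code ("21112") is made up for the example.

```python
def group_invoices(docs):
    """Group flat rows by invoiceIndex and nest their procedures in a list."""
    invoices = {}
    for doc in docs:
        key = doc["invoiceIndex"]                    # the $group _id
        if key not in invoices:
            # like invoice: {$first: "$$ROOT"}; copied so the input is not mutated
            invoices[key] = {**doc, "procedures": []}
        # like procedures: {$push: "$procedures"}
        invoices[key]["procedures"].append(doc["procedures"])
    # like $addFields + $replaceRoot: the invoice itself becomes the document
    return list(invoices.values())


rows = [
    {"invoiceIndex": "0", "patientName": "Silvia Waterman",
     "procedures": {"procedure": "21111", "proceduresIndex": "0"}},
    {"invoiceIndex": "0", "patientName": "Silvia Waterman",
     "procedures": {"procedure": "21112", "proceduresIndex": "1"}},  # made-up row
    {"invoiceIndex": "1", "patientName": "Silvia Waterman",
     "procedures": {"procedure": "21111", "proceduresIndex": "0"}},
]

invoices = group_invoices(rows)
print(invoices)
```

Adding proceduresIndex to the grouping key would give every row its own group here, which is why the answer advises against it.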

Get sum of Nested Array in Aggregate

Ok, I have an issue I cannot seem to solve.
I have a document like this:
{
"playerId": "43345jhiuy3498jh4358yu345j",
"leaderboardId": "5b165ca15399c020e3f17a75",
"data": {
"type": "EclecticData",
"holeScores": [
{
"type": "RoundHoleData",
"xtraStrokes": 0,
"strokes": 3,
},
{
"type": "RoundHoleData",
"xtraStrokes": 1,
"strokes": 5,
},
{
"type": "RoundHoleData",
"xtraStrokes": 0,
"strokes": 4
}
]
}
}
Now, what I am trying to accomplish is to use aggregate to sum the strokes and then sort by that sum afterwards. I am trying this:
var sortedBoard = db.collection.aggregate(
{$match: {"leaderboardId": boardId}},
{$group: {
_id: "$playerId",
played: { $sum: 1 },
strokes: {$sum: '$data.holeScores.strokes'}
}
},
{$project:{
type: "$SortBoard",
avgPoints: '$played',
sumPoints: "$strokes",
played : '$played'
}}
);
The issue here is that I do not get the correct strokes sum, since the values are inside another array.
Hope someone can help me with this and thanks in advance :-)
You need to say $sum twice:
var sortedBoard = db.collection.aggregate([
{ "$match": { "leaderboardId": boardId}},
{ "$group": {
"_id": "$playerId",
"SortBoard": { "$first": "$SortBoard" },
"played": { "$sum": 1 },
"strokes": { "$sum": { "$sum": "$data.holeScores.strokes"} }
}},
{ "$project": {
"type": "$SortBoard",
"avgPoints": "$played",
"sumPoints": "$strokes",
"played": "$played"
}}
])
The reason is that you are using it both as a way to "sum array values" and as an "accumulator" for $group.
The other thing you appear to be missing is that $group only outputs the fields you tell it to, therefore if you want to access other fields in other stages or output, you need to keep them with something like $first or another accumulator. We also appear to be missing a pipeline stage in the question anyway, but it's worth noting just to be sure.
Also note you really should wrap aggregation pipelines as an official array [], because the legacy usage is deprecated and can cause problems in some language implementations.
Returns the correct details of course:
{
"_id" : "43345jhiuy3498jh4358yu345j",
"avgPoints" : 1,
"sumPoints" : 12,
"played" : 1
}
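The two roles of $sum can be seen in a small plain-Python sketch: an inner sum over each document's holeScores array and an outer running total per player. It only mirrors the logic and is not MongoDB code; sum_strokes is a hypothetical name, and the sample document is the one from the question.

```python
from collections import defaultdict


def sum_strokes(docs):
    """Total strokes and number of documents per playerId."""
    totals = defaultdict(lambda: {"played": 0, "strokes": 0})
    for doc in docs:
        # inner $sum: add up the strokes inside this document's array
        per_doc = sum(hole["strokes"] for hole in doc["data"]["holeScores"])
        player = totals[doc["playerId"]]
        player["played"] += 1          # accumulator {$sum: 1}
        player["strokes"] += per_doc   # outer $sum as the $group accumulator
    return dict(totals)


docs = [{
    "playerId": "43345jhiuy3498jh4358yu345j",
    "data": {"holeScores": [
        {"xtraStrokes": 0, "strokes": 3},
        {"xtraStrokes": 1, "strokes": 5},
        {"xtraStrokes": 0, "strokes": 4},
    ]},
}]

board = sum_strokes(docs)
print(board)
```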