Aggregate across two collections and update documents in first collection in pymongo - mongodb

I am using pymongo. I have a collection for which I want to update fields based on values from another collection.
Here's a document from collection1:
{ _id: ObjectId("5fef7a23d0bdc785d4fc94e7"),
path: 'path1.png',
type: 'negative',
xmin: NaN,
ymin: NaN,
xmax: NaN,
ymax: NaN}
And from collection2:
{ _id: ObjectId("5fef7a24d0bdc785d4fc94e8"),
path: 'path1.png',
xmin: 200,
ymin: 200,
xmax: 300,
ymax: 300}
How do I update collection 1 so that the example document looks like:
{ _id: ObjectId("5fef7a23d0bdc785d4fc94e7"),
path: 'path1.png',
type: 'negative',
xmin: 200,
ymin: 200,
xmax: 300,
ymax: 300}

Fetch collection2 into a dict variable and use $set to update collection1, e.g.
for doc in db.collection2.find({}, {'_id': 0}):
    db.collection1.update_one({'path': doc.get('path')}, {'$set': doc})

I have found a way to output it into a separate collection, but I'm still not sure how to write it back into the same collection.
db.collection1.aggregate([
    {
        '$match': {
            'xmin': 'NaN'
        }
    }, {
        '$lookup': {
            'from': 'collection2',
            'localField': 'path',
            'foreignField': 'path',
            'as': 'inferences'
        }
    }, {
        '$project': {
            'inferences.xmin': 1,
            'inferences.ymin': 1,
            'inferences.xmax': 1,
            'inferences.ymax': 1,
            'path': 1,
            'type': 1,
            '_id': 0
        }
    }, {
        '$unwind': {
            'path': '$inferences',
            'preserveNullAndEmptyArrays': False
        }
    }, {
        '$addFields': {
            'xmin': '$inferences.xmin',
            'ymin': '$inferences.ymin',
            'xmax': '$inferences.xmax',
            'ymax': '$inferences.ymax'
        }
    }, {
        '$project': {
            'path': 1,
            'type': 1,
            'xmin': 1,
            'ymin': 1,
            'xmax': 1,
            'ymax': 1
        }
    }, {
        '$out': 'collection3'
    }
])
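One way to write the result back into the same collection may be to end the pipeline with $merge instead of $out. The following is only a sketch under assumptions that are not confirmed by the post: MongoDB 4.4 or newer (where $merge can target the collection being aggregated), missing values stored as numeric NaN rather than the string 'NaN', and at most one collection2 document per path. The name build_merge_pipeline is made up for illustration.

```python
def build_merge_pipeline(target="collection1", source="collection2"):
    """Build a pipeline that copies box coordinates from `source` into `target`."""
    return [
        # Assumption: missing coordinates are stored as numeric NaN, not the string 'NaN'.
        {"$match": {"xmin": float("nan")}},
        {"$lookup": {
            "from": source,
            "localField": "path",
            "foreignField": "path",
            "as": "inferences",
        }},
        # Drops documents that have no match in `source`.
        {"$unwind": "$inferences"},
        # Keep _id so $merge can find the original document again.
        {"$project": {
            "_id": 1,
            "xmin": "$inferences.xmin",
            "ymin": "$inferences.ymin",
            "xmax": "$inferences.xmax",
            "ymax": "$inferences.ymax",
        }},
        # Merge the projected fields into the matching document; never insert new ones.
        {"$merge": {
            "into": target,
            "on": "_id",
            "whenMatched": "merge",
            "whenNotMatched": "discard",
        }},
    ]


pipeline = build_merge_pipeline()
# With a live connection this would be run as: db.collection1.aggregate(pipeline)
```

Because whenMatched is "merge", fields that are not projected (path, type) stay untouched in the target document.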

Related

Aggregate is returning (some) duplicate items with same ID

Our Mongo replication servers crashed because of too much traffic and a query that wasn't optimized. That has all been solved and everything is working correctly; we just have one problem we can't figure out.
While our primary server was down, some items got added. That went fine, but the weird thing is that the items that were added in the crash window are now returned twice in our aggregate queries.
If I run a find() query, the duplicate doesn't show up. How is this even possible?
Here's our aggregate query:
`
[
{
'$match': { is_active: true, is_removed: { $ne: true }, _id: { $in: ['390195122352164864'] } }
},
{ '$sort': { 'list_ranking.default': -1 } },
{ '$limit': 48 },
{
'$lookup': {
from: 'items',
localField: 'parent_id',
foreignField: '_id',
as: 'parent'
}
},
{
'$lookup': {
from: 'items_type',
localField: 'item_type',
foreignField: 'key',
as: 'type'
}
},
{
'$lookup': {
from: 'items_rarity',
localField: 'item_rarity',
foreignField: 'key',
as: 'rarity'
}
},
{
'$lookup': {
from: 'items_serie',
localField: 'item_serie',
foreignField: 'key',
as: 'serie'
}
},
{
'$lookup': {
from: 'items_set',
localField: 'item_set',
foreignField: 'key',
as: 'sets'
}
},
{
'$lookup': {
from: 'items_introduction',
localField: 'item_introduction',
foreignField: 'key',
as: 'introduction'
}
},
{
'$lookup': {
from: 'items_background',
localField: 'item_background',
foreignField: 'key',
as: 'background'
}
},
{
'$lookup': {
from: 'items_set',
localField: 'parent.item_set',
foreignField: 'key',
as: 'parentSets'
}
},
{ '$unwind': { path: '$parent', preserveNullAndEmptyArrays: true } },
{ '$unwind': { path: '$type', preserveNullAndEmptyArrays: true } },
{ '$unwind': { path: '$rarity', preserveNullAndEmptyArrays: true } },
{ '$unwind': { path: '$serie', preserveNullAndEmptyArrays: true } },
{ '$unwind': { path: '$sets', preserveNullAndEmptyArrays: true } },
{
'$unwind': { path: '$introduction', preserveNullAndEmptyArrays: true }
},
{
'$unwind': { path: '$background', preserveNullAndEmptyArrays: true }
},
{
'$unwind': { path: '$parentSets', preserveNullAndEmptyArrays: true }
},
{
'$project': {
slug: 1,
'parent.slug': 1,
name: 1,
'parent.name': 1,
description: 1,
'parent.description': 1,
key: 1,
'parent.key': 1,
icon: 1,
'parent.icon': 1,
featured: 1,
'parent.featured': 1,
media_id: 1,
'parent.media_id': 1,
media_type: 1,
'parent.media_type': 1,
media_uploaded_at: 1,
'parent.media_uploaded_at': 1,
media_processed_at: 1,
'parent.media_processed_at': 1,
list_order: 1,
'parent.list_order': 1,
list_ranking: 1,
'parent.list_ranking': 1,
rating_good: 1,
'parent.rating_good': 1,
rating_bad: 1,
'parent.rating_bad': 1,
estimated_available_combos: 1,
'parent.estimated_available_combos': 1,
obtained_type: 1,
'parent.obtained_type': 1,
obtained_value: 1,
'parent.obtained_value': 1,
v3_itemid: 1,
'parent.v3_itemid': 1,
v3_itemkey: 1,
'parent.v3_itemkey': 1,
v3_mediaid: 1,
'parent.v3_mediaid': 1,
is_active: 1,
'parent.is_active': 1,
is_released: 1,
'parent.is_released': 1,
is_removed: 1,
'parent.is_removed': 1,
modified_at: 1,
'parent.modified_at': 1,
created_at: 1,
'parent.created_at': 1,
'type.slug': 1,
'type.name': 1,
'type.key': 1,
'rarity.slug': 1,
'rarity.name': 1,
'rarity.key': 1,
'rarity.color': 1,
'serie.slug': 1,
'serie.name': 1,
'serie.key': 1,
'serie.color': 1,
'sets.slug': 1,
'sets.name': 1,
'sets.key': 1,
'sets.is_active': 1,
'parentSets.slug': 1,
'parentSets.name': 1,
'parentSets.key': 1,
'parentSets.is_active': 1,
'background.slug': 1,
'background.name': 1,
'background.key': 1,
'introduction.slug': 1,
'introduction.name': 1,
'introduction.key': 1,
'introduction.chapter': 1,
'introduction.season': 1,
'parent._id': 1
}
}
]
`
And this is what we're getting back:
`
[
{
_id: '390195122352164864',
slug: 'ffc-neymar-jr',
name: 'FFC Neymar Jr',
description: 'Knows a thing or two.',
key: 'character_redoasisgooseberry',
icon: 'b74a4677-e2ba-4f25-9e92-25756dafc9d2',
featured: 'b422fadb-6960-4122-9d3d-37cbc06501e8',
media_id: '04ddd281-309e-40e5-811d-767e52d84847',
media_type: 'video/mp4',
media_uploaded_at: null,
media_processed_at: 2022-12-02T07:34:43.371Z,
obtained_type: 'vbucks',
obtained_value: '1200',
rating_good: 101,
rating_bad: 6,
list_order: 6396,
list_ranking: {
default: 6807,
last_1_hr: 534,
last_24_hrs: 6807,
last_7_days: 7682
},
estimated_available_combos: 6,
is_released: true,
is_active: true,
is_removed: false,
modified_at: null,
created_at: 2022-11-30T17:36:06.643Z,
type: { slug: 'outfit', name: 'Outfit', key: 'AthenaCharacter' },
rarity: {
slug: 'rare',
name: 'Rare',
key: 'EFortRarity::Rare',
color: '28C4F2'
},
serie: {
slug: 'icon',
name: 'Icon Series',
key: 'CreatorCollabSeries',
color: '5DD6EA'
},
sets: {
is_active: false,
slug: 'set01',
name: 'Fortnite Football Club',
key: 'SphereKickGroup'
},
introduction: {
slug: 'chapter-3-season-4',
name: 'Introduced in Chapter 3, Season 4.',
key: 22,
chapter: '3',
season: '4'
}
},
{
_id: '390195122352164864',
slug: 'ffc-neymar-jr',
name: 'FFC Neymar Jr',
description: 'Knows a thing or two.',
key: 'character_redoasisgooseberry',
icon: 'b74a4677-e2ba-4f25-9e92-25756dafc9d2',
featured: 'b422fadb-6960-4122-9d3d-37cbc06501e8',
media_id: '04ddd281-309e-40e5-811d-767e52d84847',
media_type: 'video/mp4',
media_uploaded_at: null,
media_processed_at: 2022-12-02T07:34:43.371Z,
obtained_type: 'vbucks',
obtained_value: '1200',
rating_good: 101,
rating_bad: 6,
list_order: 6396,
list_ranking: {
default: 6807,
last_1_hr: 534,
last_24_hrs: 6807,
last_7_days: 7682
},
estimated_available_combos: 6,
is_released: true,
is_active: true,
is_removed: false,
modified_at: null,
created_at: 2022-11-30T17:36:06.643Z,
type: { slug: 'outfit', name: 'Outfit', key: 'AthenaCharacter' },
rarity: {
slug: 'rare',
name: 'Rare',
key: 'EFortRarity::Rare',
color: '28C4F2'
},
serie: {
slug: 'icon',
name: 'Icon Series',
key: 'CreatorCollabSeries',
color: '5DD6EA'
},
sets: {
slug: 'set01',
name: 'Fortnite Football Club',
key: 'SphereKickGroup',
is_active: false
},
introduction: {
slug: 'chapter-3-season-4',
name: 'Introduced in Chapter 3, Season 4.',
key: 22,
chapter: '3',
season: '4'
}
}
]
`
How is this possible?
Thanks,
Sam
We tried rebuilding the indexes, restarting all Mongo instances, restarting the API servers, and removing the items and adding them back.
This is the expected outcome when using $unwind: if the array holds more than one matching document, $unwind emits one copy of the whole document per array element, replacing the array with that element as an object.
If you're only expecting a single document to be returned it sounds like the issue is caused by there being more than one matching document from a $lookup.
It could be caused by any one of the lookups, but in the output you posted the two sets objects differ slightly (the field order is different), so I would start by searching items_set to see if there is more than one matching document. If there is, deleting the duplicate should solve the issue.
db.items_set.find({ key: 'SphereKickGroup' })
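To illustrate why a duplicate in a looked-up collection doubles the result, here is a small pure-Python emulation of $unwind on array fields. It is a sketch for illustration only, not MongoDB's implementation; the helper name unwind and the sample document (two items_set entries sharing one key) are hypothetical.

```python
def unwind(doc, field, preserve_empty=True):
    """Emulate $unwind on an array field: one copy of `doc` per array element."""
    value = doc.get(field)
    if not value:  # missing, null, or empty array
        if preserve_empty:
            # Mirrors preserveNullAndEmptyArrays: true; an empty array is dropped.
            yield {k: v for k, v in doc.items() if k != field or v is None}
        return
    for element in value:
        yield {**doc, field: element}


# Hypothetical state after $lookup: two items_set documents share the same key.
item = {
    "_id": "390195122352164864",
    "sets": [
        {"is_active": False, "slug": "set01", "key": "SphereKickGroup"},
        {"slug": "set01", "key": "SphereKickGroup", "is_active": False},
    ],
}

results = list(unwind(item, "sets"))
print(len(results))  # 2: the same _id appears once per matching set
```

A find() on the items collection never runs the lookup, which would explain why the duplicate only shows up in the aggregate.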

I want to filter item count added by supplier for each month in mongodb

I am new to MongoDB and I want to get the item count added by the handicraftmen for each of the months.
For example: January = 10, February = 15, and so on.
This is my model
const mongoose = require('mongoose');
const handicraftmen = require('../models/handicraftmen');
// Note: the Mongoose validator option is `required`, not `require`.
const item_model = new mongoose.Schema({
    handicraftmen_id: {
        type: mongoose.Schema.Types.ObjectId,
        ref: handicraftmen,
        required: true,
    },
    category: {
        type: String,
        required: true,
    },
    name: {
        type: String,
        required: true,
    },
    price: {
        type: Number,
        required: true,
    },
    stock: {
        type: Number,
        required: true,
    },
    description: {
        type: String,
        required: true,
    },
}, { timestamps: true });
const item = mongoose.model('Item', item_model);
module.exports = item;
And this is one item from my item list:
_id:62f5df71dabf3cd385c6beee
handicraftmen_id:62f251c28ce1837d3f275908
category:"Masks"
name:"wooden mask"
price:500
stock:8
description:"Wooden mask"
createdAt:2022-08-12T05:04:49.502+00:00
updatedAt:2022-09-02T16:23:14.885+00:00
There are a lot of items in the list, added by different handicraftmen in different months.
I want to take the items of one specific handicraftman, group them by month, and get the item count for each month.
How can I get that using a Mongoose query?
You can write an aggregation pipeline for this.
Use $match to filter documents matching handicraftmen.
Add a new field date, which is a breakdown of createdAt into its time components, using $addFields and $dateToParts.
Use $group to get counts for each month.
Convert the numeric month index to string value using $addFields again, and remove _id using $project.
Like this:
db.collection.aggregate([
{
"$match": {
handicraftmen_id: "62f251c28ce1837d3f275908"
}
},
{
"$addFields": {
"date": {
"$dateToParts": {
"date": {
"$toDate": "$createdAt"
},
}
}
}
},
{
"$group": {
"_id": "$date.month",
"count": {
"$sum": 1
}
}
},
{
"$addFields": {
"month": {
$arrayElemAt: [
[
"",
"January",
"February",
"March",
"April",
"May",
"June",
"July",
"August",
"September",
"October",
"November",
"December"
],
"$_id"
]
}
}
},
{
"$project": {
_id: 0
}
}
])
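This is the playground link. You can use the Mongoose aggregate method similarly. Note that Mongoose does not cast aggregation pipeline stages, so if handicraftmen_id is stored as an ObjectId, pass an ObjectId rather than a string in $match.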
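For readers working from Python, the steps above can be sketched as a pipeline builder. This is a simplified variant, not the answer's exact pipeline: it swaps $dateToParts for the $month operator and takes the month names from the standard library. The function name monthly_count_pipeline is hypothetical, and like the original it counts by calendar month across all years.

```python
import calendar

# calendar.month_name[0] is '' and [1..12] are the month names, which lines up
# with the 1-based month numbers MongoDB returns (English in the default locale).
MONTH_NAMES = list(calendar.month_name)


def monthly_count_pipeline(handicraftmen_id):
    """Count one handicraftman's items per calendar month of createdAt."""
    return [
        # Pass an ObjectId here if the field is stored as one; a plain string will not match it.
        {"$match": {"handicraftmen_id": handicraftmen_id}},
        # $month extracts the month number (1-12) from the date.
        {"$group": {"_id": {"$month": "$createdAt"}, "count": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
        # Translate the month number into its name and drop _id.
        {"$project": {
            "_id": 0,
            "month": {"$arrayElemAt": [MONTH_NAMES, "$_id"]},
            "count": 1,
        }},
    ]


pipeline = monthly_count_pipeline("62f251c28ce1837d3f275908")
```

With pymongo the list would be passed to the collection's aggregate() method; adding the year to the group key would keep different years apart.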

MongoDB: Is there a way to compute sum on past records?

Assume I have a dataset like :
yearMonth | amount
201908 | 100
201909 | 100
201910 | 200
201911 | 100
201912 | 200
202001 | 300
202002 | 200
Is there a way I can do a sum/accumulate on past records to get a result set like:
yearMonth | amount | balance
201908 | 100 | 100
201909 | 100 | 200
201910 | 200 | 400
201911 | 100 | 500
201912 | 200 | 700
202001 | 300 | 1000
202002 | 200 | 1200
Try the aggregation query below:
db.collection.aggregate([
/** Sorting the entire collection is not ideal, but it is needed if documents are not already ordered by 'yearMonth' */
{ $sort: { yearMonth: 1 } },
/** Group on empty & push all docs to 'data' array */
{ $group: { _id: "", data: { $push: "$$ROOT" } } },
{
$project: {
data: {
$let: {
vars: {
data: {
$reduce: {
input: "$data", /** Iterate over 'data' array & push newly formed docs to docs array */
initialValue: { amount: 0, docs: [] },
in: {
docs: {
$concatArrays: [
"$$value.docs",
[
{
_id: "$$this._id",
yearMonth: "$$this.yearMonth",
amount: "$$this.amount",
balance: {
$add: ["$$value.amount", "$$this.amount"],
},
},
],
],
},
amount: { $add: ["$$value.amount", "$$this.amount"] },
},
},
},
},
in: "$$data.docs", /** Return only 'docs' array & ignore 'amount' field */
},
},
},
},
/** unwind 'data' array(newly formed 'data' array field) */
{
$unwind: "$data",
},
/** Replace data object as new root for each document in collection */
{
$replaceRoot: {
newRoot: "$data",
},
},
]);
Test : MongoDB-Playground
Ref : aggregation-pipeline-operators
Using the mapReduce collection method with guidance from this answer you can get your desired results.
Here's a pymongo solution using the following options:
map function - this does the initial mapping of the key, value pair to be emitted (the yearMonth and amount).
reduce function - didn't need any action for this case.
out - specifies where to put the output - could be a collection or as in this case just processed inline.
scope - specifies the rolling total field - just called total
finalize - this does the actual totaling.
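Here's the Python (pymongo) code:
from pymongo import MongoClient
from bson.code import Code
# Note: Collection.map_reduce is available in PyMongo 3.x; it was removed in PyMongo 4.0.
client = MongoClient()
db = client.tst1
coll = db.mapr1
# map: emit one (yearMonth, amount) pair per document
map1 = Code('''
function () {
    emit(this.yearMonth, this.amount);
}
''')
# reduce: only invoked when a key occurs more than once; yearMonth is unique
# in the sample data, so it is never called here
reduce1 = Code('''
function (key, values) {
    return Array.sum(values);
}
''')
# finalize: adds each amount to 'total' (from scope) to build the running balance;
# this relies on keys being processed in ascending yearMonth order
fin1 = Code('''
function (key, value) {
    total += value;
    return {amount: value, balance: total};
}
''')
result = coll.map_reduce(map1, reduce1, out={'inline': 1}, scope={'total': 0}, finalize=fin1)
for doc in result['results']:
    print(f'The doc is {doc}')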
Results:
The doc is {'_id': 201908.0, 'value': {'amount': 100.0, 'balance': 100.0}}
The doc is {'_id': 201909.0, 'value': {'amount': 100.0, 'balance': 200.0}}
The doc is {'_id': 201910.0, 'value': {'amount': 200.0, 'balance': 400.0}}
The doc is {'_id': 201911.0, 'value': {'amount': 100.0, 'balance': 500.0}}
The doc is {'_id': 201912.0, 'value': {'amount': 200.0, 'balance': 700.0}}
The doc is {'_id': 202001.0, 'value': {'amount': 300.0, 'balance': 1000.0}}
The doc is {'_id': 202002.0, 'value': {'amount': 200.0, 'balance': 1200.0}}
Documents in collection:
{'_id': ObjectId('5e89c410b187b1e1abb089af'),
'amount': 100,
'yearMonth': 201908}
{'_id': ObjectId('5e89c410b187b1e1abb089b0'),
'amount': 100,
'yearMonth': 201909}
{'_id': ObjectId('5e89c410b187b1e1abb089b1'),
'amount': 200,
'yearMonth': 201910}
{'_id': ObjectId('5e89c410b187b1e1abb089b2'),
'amount': 100,
'yearMonth': 201911}
{'_id': ObjectId('5e89c410b187b1e1abb089b3'),
'amount': 200,
'yearMonth': 201912}
{'_id': ObjectId('5e89c410b187b1e1abb089b4'),
'amount': 300,
'yearMonth': 202001}
{'_id': ObjectId('5e89c410b187b1e1abb089b5'),
'amount': 200,
'yearMonth': 202002}
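On newer servers a window function may be a simpler route to the same running balance. The sketch below assumes MongoDB 5.0 or newer, where the $setWindowFields stage is available; it is an alternative to the two answers above, not part of them. The pure-Python part reproduces the expected balances for the sample data so the result can be checked without a server.

```python
from itertools import accumulate

# Server-side sketch: running sum over all documents up to the current one.
running_balance_pipeline = [
    {"$setWindowFields": {
        "sortBy": {"yearMonth": 1},
        "output": {
            "balance": {
                "$sum": "$amount",
                "window": {"documents": ["unbounded", "current"]},
            }
        },
    }}
]

# Client-side equivalent on the sample data from the question.
rows = [
    (201908, 100), (201909, 100), (201910, 200), (201911, 100),
    (201912, 200), (202001, 300), (202002, 200),
]
amounts = [amount for _, amount in sorted(rows)]
balances = list(accumulate(amounts))
print(balances)  # [100, 200, 400, 500, 700, 1000, 1200]
```

With a live connection the pipeline would be passed to the collection's aggregate() method.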

MongoDB query - unwind and match preserving null OR different value/ add a new field based on a condition

If I have the following structure:
{
_id: 1,
name: 'a',
info: []
},
{
_id: 2,
name: 'b',
info: [
{
infoID: 100,
infoData: 'my info'
}
]
},
{
_id: 3,
name: 'c',
info: [
{
infoID: 200,
infoData: 'some info 200'
},
{
infoID: 300,
infoData: 'some info 300'
}
]
}
I need a query that returns every document, showing the infoData where infoID is 100, or null if info is empty or contains only subdocuments with an infoID different from 100.
That is, I would want the following output:
{
_id: 1,
name: 'a',
infoData100: null
},
{
_id: 2,
name: 'b',
infoData100: 'my info'
},
{
_id: 3,
name: 'c',
infoData100: null
}
If I $unwind by info and $match by infoID: 100, I lose records 1 and 3.
Thanks for your responses.
Try the query below:
db.collection.aggregate([
/** Adding a new field or you can use $project instead of addFields */
{
$addFields: {
infoData100: {
$cond: [
{
$in: [100, "$info.infoID"] // Check if any of objects 'info.infoID' has value 100
},
{
// If one of them matches, take that object and assign its infoData to the 'infoData100' field
$let: {
vars: {
data: {
$arrayElemAt: [
{
$filter: {
input: "$info",
cond: { $eq: ["$$this.infoID", 100] }
}
},
0
]
}
},
in: "$$data.infoData"
}
},
null // If none has return NULL
]
}
}
}
]);
Test : MongoDB-Playground
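To make the intended behaviour easy to verify, here is a client-side Python equivalent of the $cond / $filter / $arrayElemAt logic, run on the sample documents from the question. It is only an illustration of the logic, not a replacement for the server-side query; the helper name info_data_for is hypothetical.

```python
def info_data_for(doc, info_id):
    """Return infoData of the first info entry with the given infoID, else None."""
    return next(
        (
            entry.get("infoData")
            for entry in doc.get("info", [])
            if entry.get("infoID") == info_id
        ),
        None,
    )


docs = [
    {"_id": 1, "name": "a", "info": []},
    {"_id": 2, "name": "b", "info": [{"infoID": 100, "infoData": "my info"}]},
    {"_id": 3, "name": "c", "info": [
        {"infoID": 200, "infoData": "some info 200"},
        {"infoID": 300, "infoData": "some info 300"},
    ]},
]

result = [
    {"_id": d["_id"], "name": d["name"], "infoData100": info_data_for(d, 100)}
    for d in docs
]
for row in result:
    print(row)
```

All three documents are kept; only the one containing infoID 100 gets a non-null infoData100.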

mongodb is it possible to get all keys and values of object in documents in the collection?

Is it possible to get all keys and values of an object in the documents of a collection?
I have a collection in MongoDB with a structure like
[{
_id: '55534c2e2750b4394debedd2',
selected_options: {
name: 'test',
size: 'S',
color: 'red'
}
},
{
_id: '55534c2e2750b4394debedd3',
selected_options: {
name: 'test2',
size: 'S',
color: 'red'
}
},
{
_id: '55534e087f01fa2a4d30f7f5',
selected_options: {
name: 'test3',
size: 'm',
color: 'green'
}
}
........
]
How can I get output like:
[{
name: 'name',
values: ['test', 'test2', 'test3']
},
{
name: 'size',
values: ['S', 'm']
},
{
name: 'color',
values: ['red', 'green']
}
]
I think you have no option to achieve your exact result without processing on the client side. However, you can use the Aggregation Framework to get something close to your desired output with just a single query.
db.yourCollection.aggregate([
{$group:
{
_id: null,
names: {$addToSet: '$selected_options.name'},
sizes: {$addToSet: '$selected_options.size'},
colors: {$addToSet: '$selected_options.color'},
}
},
{$project:
{_id: 0, names: 1, colors: 1, sizes: 1}
}
])
This will output the following:
{
names: ['test', 'test2', 'test3'],
sizes: ['S', 'm'],
colors: ['red', 'green']
}
Another option is running a distinct() for each field.
See the example output in the documentation page.
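On servers that support $objectToArray (MongoDB 3.4.4 or newer, as far as I know), the exact shape from the question may be reachable without hard-coding the key names. The sketch below shows such a pipeline next to a client-side Python equivalent; both are illustrative assumptions rather than part of the answer above, and collect_option_values is a made-up name. Note that $addToSet does not guarantee the order of values.

```python
from collections import defaultdict

# Server-side sketch: turn each selected_options object into key/value pairs,
# then collect the distinct values per key.
keys_and_values_pipeline = [
    {"$project": {"kv": {"$objectToArray": "$selected_options"}}},
    {"$unwind": "$kv"},
    {"$group": {"_id": "$kv.k", "values": {"$addToSet": "$kv.v"}}},
    {"$project": {"_id": 0, "name": "$_id", "values": 1}},
]


def collect_option_values(docs):
    """Client-side equivalent: distinct values per key, in first-seen order."""
    seen = defaultdict(list)
    for doc in docs:
        for key, value in doc.get("selected_options", {}).items():
            if value not in seen[key]:
                seen[key].append(value)
    return [{"name": key, "values": values} for key, values in seen.items()]


docs = [
    {"_id": "55534c2e2750b4394debedd2",
     "selected_options": {"name": "test", "size": "S", "color": "red"}},
    {"_id": "55534c2e2750b4394debedd3",
     "selected_options": {"name": "test2", "size": "S", "color": "red"}},
    {"_id": "55534e087f01fa2a4d30f7f5",
     "selected_options": {"name": "test3", "size": "m", "color": "green"}},
]

summary = collect_option_values(docs)
print(summary)
```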