Count recurrences in a collection and merge aggregation result to another collection - mongodb

I need to count the recurrences of a value in the collection A, so I do
db.collectionA.aggregate( [ { $group : { _id : "$name", count : { $sum : 1 } } } ] )
And get something like
{
"_id": "Bruce",
"count": 2
},
{
"_id": "Alfred",
"count": 3
}
Then I need to take this result and populate a field in collection B. I imagine something like a forEach, but I don't know how to implement it:
db.collectionB.findAndModify({query: {"name": forEach of the previous result},
update: {"nameRecurrences": value of the count}})

Looking at your aggregation pipeline, it can be adjusted like this:
db.collectionA.aggregate( [
{ $group : { _id : '$name', name : {$first : "$name"}, nameRecurrences :{$sum: 1 } } }, // renamed field `nameRecurrences` to match with field name in `collection-B`
{$project : {_id : 0}} ] ) // Removing _id to avoid conflicts on merge
On MongoDB 4.2 or later you can use the $merge aggregation stage to merge the result of an aggregation pipeline on one collection into another collection.
Just add the stage below as the last stage of the pipeline:
{$merge : { into: { db: "dbName", coll: "collectionB" }, on: "name", whenNotMatched: "discard"}} // Remember to create a unique index on the `name` field of `collectionB`
Since you're using MongoDB 3.6.16:
If the collection has to be created now, you can use $out (note that $out replaces the target collection's contents). But if it's an existing collection whose documents have many fields besides name and nameRecurrences, you can do the update in code instead.
Since each name has its own filter and its own update, you can take advantage of .bulkWrite() to update multiple documents in one call:
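// aggregationResult is assumed to hold the output of the pipeline above,
// e.g. db.collectionA.aggregate([...]).toArray()
let bulkArr = []
for (const doc of aggregationResult) {
  bulkArr.push({ updateOne: {
    "filter": { "name": doc.name },
    "update": { $set: { "nameRecurrences": doc.nameRecurrences } }
  } })
}
db.collectionB.bulkWrite(bulkArr)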
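As a rough sketch of the loop above, the snippet below builds the same updateOne operations from an in-memory array in plain JavaScript, so it runs without a database. The helper name `buildBulkOps` and the sample data are hypothetical and used only for illustration; in the shell the array would presumably come from the aggregation itself.

```javascript
// Hypothetical sample of what the $group/$project pipeline might return.
const aggregationResult = [
  { name: "Bruce", nameRecurrences: 2 },
  { name: "Alfred", nameRecurrences: 3 },
];

// Build one updateOne operation per aggregated name.
function buildBulkOps(results) {
  return results.map((r) => ({
    updateOne: {
      filter: { name: r.name },
      update: { $set: { nameRecurrences: r.nameRecurrences } },
    },
  }));
}

const bulkArr = buildBulkOps(aggregationResult);
console.log(JSON.stringify(bulkArr, null, 2));

// In the mongo shell, it is safer to write only when there is something to send,
// since an empty operations array is typically rejected:
// if (bulkArr.length > 0) db.collectionB.bulkWrite(bulkArr);
```

Building the operations in a separate step makes it easy to inspect them before anything is written to collectionB.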

Related

MongoDB Sorting: Equivalent Aggregation Query

I have following students collection
{ "_id" : ObjectId("5f282eb2c5891296d8824130"), "name" : "Rajib", "mark" : "1000" }
{ "_id" : ObjectId("5f282eb2c5891296d8824131"), "name" : "Rahul", "mark" : "1200" }
{ "_id" : ObjectId("5f282eb2c5891296d8824132"), "name" : "Manoj", "mark" : "1000" }
{ "_id" : ObjectId("5f282eb2c5891296d8824133"), "name" : "Saroj", "mark" : "1400" }
My requirement is to sort the collection based on the 'mark' field in descending order, but the final result should not display the 'mark' field. The result should look like:
{ "name" : "Saroj" }
{ "name" : "Rahul" }
{ "name" : "Rajib" }
{ "name" : "Manoj" }
I tried the following query and it works fine.
db.students.find({},{"_id":0,"name":1}).sort({"mark":-1})
My MongoDB version is v4.2.8. Now the question is: what is the equivalent aggregation query for the query above? I tried the following two queries, but neither gave me the desired result.
db.students.aggregate([{"$project":{"name":1,"_id":0}},{"$sort":{"mark":-1}}])
db.students.aggregate([{"$project":{"name":1,"_id":0,"mark":1}},{"$sort":{"mark":-1}}])
Why does it work in find()?
As per the cursor.sort() documentation, when a set of results is both sorted and projected, the MongoDB query engine always applies the sort first.
Why doesn't it work in aggregate()?
As per the Aggregation Pipeline documentation, the MongoDB aggregation pipeline consists of stages. Each stage transforms the documents as they pass through the pipeline. Pipeline stages do not need to produce one output document for every input document; e.g., some stages may generate new documents or filter out documents. The stages run in the order you write them, so each stage only sees what the previous stage emitted.
What you need to correct:
Change the pipeline order. If the mark field is not selected in $project, it is no longer available to later stages, so a $sort on mark placed after that $project has nothing to sort on.
db.students.aggregate([
{ "$sort": { "mark": -1 } },
{ "$project": { "name": 1, "_id": 0 } }
])
Playground: https://mongoplayground.net/p/xtgGl8AReeH
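To illustrate why the stage order matters, here is a minimal in-memory sketch in plain JavaScript. It only mimics $project and $sort with array functions (the helpers `project` and `sortByMarkDesc` are hypothetical stand-ins, not MongoDB APIs), using the four students from the question.

```javascript
const students = [
  { name: "Rajib", mark: "1000" },
  { name: "Rahul", mark: "1200" },
  { name: "Manoj", mark: "1000" },
  { name: "Saroj", mark: "1400" },
];

// Stand-in for { $project: { name: 1, _id: 0 } }: keeps only the name.
const project = (docs) => docs.map((d) => ({ name: d.name }));

// Stand-in for { $sort: { mark: -1 } }: descending by mark, ties keep their order.
const sortByMarkDesc = (docs) =>
  [...docs].sort((a, b) => (a.mark < b.mark ? 1 : a.mark > b.mark ? -1 : 0));

// Project first: mark is already gone, so the sort cannot reorder anything.
const projectThenSort = sortByMarkDesc(project(students)).map((d) => d.name);

// Sort first: mark is still present when sorting, then it is dropped.
const sortThenProject = project(sortByMarkDesc(students)).map((d) => d.name);

console.log(projectThenSort); // [ 'Rajib', 'Rahul', 'Manoj', 'Saroj' ]
console.log(sortThenProject); // [ 'Saroj', 'Rahul', 'Rajib', 'Manoj' ]
```

Only the second order produces the result asked for in the question, because the sort key still exists at the moment the sort runs.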

MongoDB: how do I check that all array entries are unique in the entire collection?

A little brainteaser for mongo users.
I have a collection of documents like
{
"_id" : ObjectId("19628f4f0545a733185b672f"),
"name" : "hello",
"items" : [
{
"itemNumber" : 12512,
"value" : "let"
},
{
"itemNumber" : 2546,
"value" : "put"
}
]
}
I need to make sure that every item's itemNumber is unique globally in the collection.
In SQL database I would have a separate table for items and the query for checking if numbers are unique would be something like
select count(1)
from (
select itemNumber, count(itemNumber) as cnt
from items
group by itemNumber) sel
where cnt>1;
A result of 0 would mean that all itemNumbers are unique. (There are probably better ways to make that check in SQL.)
With MongoDB, the only solution I can come up with is:
a) use forEach to extract all items to separate collection
b) make a simple aggregation
db.items.aggregate(
{ $group : { _id : '$itemNumber', count : {$sum : 1} } },
{ $out : "cnt" }
)
c) db.cnt.find({count: {$gt: 1}}).count()
Is there any one-query way to do it?
Performance note: the collection is about 3M documents, 2.2 KB each. I have noticed that aggregations that contain $group run seemingly forever on this collection.
How about something like this:
db.items.aggregate([
  { $unwind: "$items" },
  { $group : { _id : '$items.itemNumber', count : { $sum : 1 } } },
  { $match: { "count": { $gt: 1 } } }
])
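The three stages can be read as "flatten the arrays, count each itemNumber, keep the counts above 1". A small in-memory sketch of that logic in plain JavaScript follows; the function name `duplicateItemNumbers` and the second sample document are assumptions added for illustration.

```javascript
const docs = [
  {
    name: "hello",
    items: [
      { itemNumber: 12512, value: "let" },
      { itemNumber: 2546, value: "put" },
    ],
  },
  // Hypothetical second document that reuses itemNumber 2546.
  { name: "world", items: [{ itemNumber: 2546, value: "get" }] },
];

function duplicateItemNumbers(documents) {
  const counts = new Map();
  for (const doc of documents) {
    // Equivalent of $unwind: visit every array element individually.
    for (const item of doc.items || []) {
      // Equivalent of $group with $sum: 1, keyed by itemNumber.
      counts.set(item.itemNumber, (counts.get(item.itemNumber) || 0) + 1);
    }
  }
  // Equivalent of $match { count: { $gt: 1 } }.
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([itemNumber, count]) => ({ _id: itemNumber, count }));
}

const duplicates = duplicateItemNumbers(docs);
console.log(duplicates); // [ { _id: 2546, count: 2 } ]
```

An empty result means every itemNumber is unique across the whole collection.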

Mongo DB - how to query for id dependent on oldest date in array of a field

Let's say I have a collection called phone_audit with document entries of the following form: _id, which is the phone number, and value, containing items that always contains 2 entries (each with an id and a date).
Please see below:
{
"_id" : {
"phone_number" : "+012345678"
},
"value" : {
"items" : [
{
"_id" : "c14b4ac1db691680a3fb65320fba7261",
"updated_at" : ISODate("2016-03-14T12:35:06.533Z")
},
{
"_id" : "986b58e55f8606270f8a43cd7f32392b",
"updated_at" : ISODate("2016-07-23T11:17:53.552Z")
}
]
}
},
......
I need to get a list of _id values for every entry in that collection representing the older of the two items in each document.
So in the above - result would be [c14b4ac1db691680a3fb65320fba7261,...]
Any pointers on the type of query to execute would be very helpful, even if the exact syntax is not correct.
With aggregate(), you can $unwind value.items, $sort by updated_at, then use $first to get the oldest:
[
{
"$unwind": "$value.items"
},
{
"$sort": { "value.items.updated_at": 1 }
},
{
"$group":{
_id: "$_id.phone_number",
oldest:{$first:"$value.items"}
}
},
{
"$project":{
value_id: "$oldest._id"
}
}
]
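The intent of the pipeline, picking the item with the smallest updated_at in each document, can be sketched in memory with plain JavaScript. The function name `oldestItemIds` is hypothetical, and the sketch assumes every document has at least one item, as the question states.

```javascript
const phoneAudit = [
  {
    _id: { phone_number: "+012345678" },
    value: {
      items: [
        {
          _id: "c14b4ac1db691680a3fb65320fba7261",
          updated_at: new Date("2016-03-14T12:35:06.533Z"),
        },
        {
          _id: "986b58e55f8606270f8a43cd7f32392b",
          updated_at: new Date("2016-07-23T11:17:53.552Z"),
        },
      ],
    },
  },
];

// For each document, keep the item with the earliest updated_at and return its _id.
function oldestItemIds(docs) {
  return docs.map(
    (doc) =>
      doc.value.items.reduce((oldest, item) =>
        item.updated_at < oldest.updated_at ? item : oldest
      )._id
  );
}

const oldestIds = oldestItemIds(phoneAudit);
console.log(oldestIds); // [ 'c14b4ac1db691680a3fb65320fba7261' ]
```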

How to Avoid Duplicate Entries in MongoDb Meteor App

How can I avoid duplicate entries in MongoDB in a Meteor application?
The command db.products.find({},{"TEMPLATE_NAME": 1},{unique : true}) returns:
{ "_id" : ObjectId("5555d0a16ce3b01bb759a771"), "TEMPLATE_NAME" : "B" }
{ "_id" : ObjectId("5555d0b46ce3b01bb759a772"), "TEMPLATE_NAME" : "A" }
{ "_id" : ObjectId("5555d0c86ce3b01bb759a773"), "TEMPLATE_NAME" : "C" }
{ "_id" : ObjectId("5555d0f86ce3b01bb759a774"), "TEMPLATE_NAME" : "C" }
{ "_id" : ObjectId("5555d1026ce3b01bb759a775"), "TEMPLATE_NAME" : "A" }
{ "_id" : ObjectId("5555d1086ce3b01bb759a776"), "TEMPLATE_NAME" : "B" }
I want to retrieve only the unique template names and show them on HTML page.
Use the aggregation framework where your pipeline stages consist of the $group and $project operators respectively. The $group operator step groups the input documents by the given key and thus will return distinct documents in the result. The $project operator then reshapes each document in the stream, such as by adding new fields or removing existing fields:
db.products.aggregate([
{
"$group": {
"_id": "$TEMPLATE_NAME"
}
},
{
"$project": {
"_id": 0,
"TEMPLATE_NAME": "$_id"
}
}
])
Result:
/* 0 */
{
"result" : [
{
"TEMPLATE_NAME" : "C"
},
{
"TEMPLATE_NAME" : "A"
},
{
"TEMPLATE_NAME" : "B"
}
],
"ok" : 1
}
You could then use the meteorhacks:aggregate package to implement the aggregation in Meteor.
Add it to your app with
meteor add meteorhacks:aggregate
Then simply use the .aggregate function as shown below.
var products = new Mongo.Collection('products');
var pipeline = [
{
"$group": {
"_id": "$TEMPLATE_NAME"
}
},
{
"$project": {
"_id": 0,
"TEMPLATE_NAME": "$_id"
}
}
];
var result = products.aggregate(pipeline);
-- UPDATE --
An alternative that doesn't use aggregation is using underscore's methods to return distinct field values from the collection's find method as follows:
var distinctTemplateNames = _.uniq(Collection.find({}, {
    sort: {"TEMPLATE_NAME": 1}, fields: {"TEMPLATE_NAME": true}
}).fetch().map(function(x) {
    return x.TEMPLATE_NAME;
}), true);
This will return an array with distinct product template names ["A", "B", "C"]
You can check out some tutorials which explain the above approach in detail: Get unique values from a collection in Meteor and METEOR – DISTINCT MONGODB QUERY.
You can also use MongoDB's distinct, like this:
db.collectionName.distinct("TEMPLATE_NAME")
This query returns an array of the distinct TEMPLATE_NAME values.
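For documents that have already been fetched into memory, the same de-duplication can be sketched with a Set in plain JavaScript. This is only an illustration of what "distinct" means here, using the six products from the question, and is not a replacement for running distinct on the server.

```javascript
const products = [
  { TEMPLATE_NAME: "B" },
  { TEMPLATE_NAME: "A" },
  { TEMPLATE_NAME: "C" },
  { TEMPLATE_NAME: "C" },
  { TEMPLATE_NAME: "A" },
  { TEMPLATE_NAME: "B" },
];

// A Set keeps each value once; sorting gives a stable display order.
const distinctNames = [...new Set(products.map((p) => p.TEMPLATE_NAME))].sort();

console.log(distinctNames); // [ 'A', 'B', 'C' ]
```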

In a Mongo collection, how do you query for a specific object in an array?

I'm trying to retrieve an object from an array in mongodb. Below is my document:
{
"_id" : ObjectId("53e9b43968425b29ecc87ffd"),
"firstname" : "john",
"lastname" : "smith",
"trips" : [
{
"submitted" : 1407824585356,
"tripCategory" : "staff",
"tripID" : "1"
},
{
"tripID" : "2",
"tripCategory" : "volunteer"
},
{
"tripID" : "3",
"tripCategory" : "individual"
}
]
}
My ultimate goal is to update only when trips.submitted is absent, so I thought I could query first and see how find behaves if I used the $and query operator. So I tried this:
db.users.find({
$and: [
{ "trips.tripID": "1" },
{ "trips": { $elemMatch: { submitted: { $exists: true } } } }
]
},
{ "trips.$" : 1 } //projection limits to the FIRST matching element
)
and I get this back:
{
"_id" : ObjectId("53e9b43968425b29ecc87ffd"),
"trips" : [
{
"submitted" : 1407824585356,
"tripCategory" : "staff",
"tripID" : "1"
}
]
}
Great. This is what I want. However, when I run this query:
db.users.find({
$and: [
{ "trips.tripID": "2" },
{ "trips": { $elemMatch: { submitted: { $exists: true } } } }
]
},
{ "trips.$" : 1 } //projection limits to the FIRST matching element
)
I get the same result as the first! So I know there's something about my query that isn't correct, but I don't know what. The only thing I've changed between the queries is "trips.tripID" : "2", which, in my head, should have prompted mongo to return no results. What is wrong with my query?
If you know the array is in a specific order, you can refer to a specific index in the array like this:
db.users.find({"trips.0.submitted" : {$exists:true}})
Or you could simply element-match on both values:
db.users.find({"trips" : {$elemMatch : {"tripID" : "1",
                                        "submitted" : {$exists:true}
}}})
Your query, by contrast, is looking for a document where both are true, not an element within the trips field that holds for both.
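The difference between the two kinds of matching can be sketched in memory with plain JavaScript. The helpers `matchesSeparately` and `matchesSameElement` are hypothetical names that mimic, respectively, the $and of two independent conditions and a single $elemMatch with both conditions; the document is the one from the question.

```javascript
const user = {
  firstname: "john",
  lastname: "smith",
  trips: [
    { submitted: 1407824585356, tripCategory: "staff", tripID: "1" },
    { tripID: "2", tripCategory: "volunteer" },
    { tripID: "3", tripCategory: "individual" },
  ],
};

const hasSubmitted = (trip) =>
  Object.prototype.hasOwnProperty.call(trip, "submitted");

// Like the original $and: each condition may be satisfied by a different element.
function matchesSeparately(doc, tripID) {
  return (
    doc.trips.some((t) => t.tripID === tripID) && doc.trips.some(hasSubmitted)
  );
}

// Like a single $elemMatch: one element must satisfy both conditions.
function matchesSameElement(doc, tripID) {
  return doc.trips.some((t) => t.tripID === tripID && hasSubmitted(t));
}

console.log(matchesSeparately(user, "2"));  // true
console.log(matchesSameElement(user, "2")); // false
console.log(matchesSameElement(user, "1")); // true
```

For tripID "2" the document still matches the separate conditions, because trip "2" satisfies the first and trip "1" satisfies the second, which is why the original query returned the document.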
The output of your query is correct. Your query asks mongo to return a document that has the given tripID and the field submitted somewhere within its trips array. The document in your question satisfies both conditions for both tripIDs. You are getting the first element of the trips array because of your projection.
I have assumed you will be filtering records by the person's name and then retrieving the elements inside trips based on the field-exists criteria. The output you are expecting can be obtained using the following:
db.users.aggregate(
[
{$match:
{
"firstname" : "john",
"lastname" : "smith"
}
},
{$unwind: "$trips"},
{$match:
{
"trips.tripID": "1" ,
"trips.submitted": { $exists: true }
}
}
]
)
The aggregation pipeline works as follows. The first $match stage filters down to one document (in this case the document for john smith). The $unwind stage unwinds the specified array (trips in this case), in effect denormalizing the sub-records associated with the parent record into one document per trip. The second $match stage filters the unwound documents further to obtain the one required by your query.