Count occurrences of duplicate values - mongodb

How do I structure my MongooseJS/MongoDB query to get the total number of duplicates/occurrences of a particular field value? In other words: for every custID, the total number of documents that have that custID.
I can do this manually in command line:
db.tapwiser.find({"custID" : "12345"}, {}, {}).count();
Outputs: 1
db.tapwiser.find({"custID" : "6789"}, {}, {}).count();
Outputs: 4
I found this resource:
How to sum distinct values of a field in a MongoDB collection (utilizing mongoose)
But it requires that I specify the unique fields I want to sum.
In this case, I want to loop through all documents and sum the occurrences of each value.

All you need to do is $group your documents by custID and use the $sum accumulator operator to return "count" for each group.
db.tapwiser.aggregate(
  [
    { "$group": { "_id": "$custID", "count": { "$sum": 1 } } }
  ],
  function(err, results) {
    // Do something with the results
  }
)
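To make the shape of the result concrete, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB code: the sample `docs` array and the `countBy` helper are made up for illustration, using the custID values from the question.

```javascript
// Hypothetical sample documents; only custID matters here.
const docs = [
  { custID: "12345" },
  { custID: "6789" },
  { custID: "6789" },
  { custID: "6789" },
  { custID: "6789" },
];

// Emulates { $group: { _id: "$custID", count: { $sum: 1 } } }:
// one output entry per distinct key, with a running count.
function countBy(items, field) {
  const counts = new Map();
  for (const item of items) {
    const key = item[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return Array.from(counts, ([_id, count]) => ({ _id, count }));
}

const results = countBy(docs, "custID");
console.log(results);
// [ { _id: '12345', count: 1 }, { _id: '6789', count: 4 } ]
```

The real aggregation returns documents of the same { _id, count } shape, but the order of the groups is not guaranteed unless a $sort stage follows.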

Related

Cast String as Numeric in Find and Sort Operations

I have a mongo collection called items. I want to find the 10 highest priced items out of the active ones. My problem is the price is a string. So my question is how can I cast price as numeric and then sort the active items in descending order over price?
My current attempt gives me the highest price in alphanumeric order, i.e. 999. But I have items that are way pricier.
db.getCollection('items').find({"status": "active"})
.sort({"packet.price":-1})
.limit(10)
I tried:
sort({{$toInt:"packet.price"}:-1}),
sort({NumberInt("packet.price"):-1})
but no luck.
This is not possible with the find method, but you can use the aggregation framework:
$match to match your query condition
$addFields to convert the packet.price field to an integer using the $toInt operator (available from MongoDB 4.0)
$sort by price in descending order
$limit to 10 documents
db.getCollection('items').aggregate([
  { "$match": { "status": "active" } },
  // Overwrite packet.price with its integer value so the sort is numeric
  { "$addFields": { "packet.price": { "$toInt": "$packet.price" } } },
  { "$sort": { "packet.price": -1 } },
  { "$limit": 10 }
])
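The reason the original sort put 999 on top can be seen in a few lines of plain JavaScript. This is only an illustration with made-up prices, not MongoDB code; `parseInt` here stands in for the conversion that $toInt performs in the pipeline.

```javascript
// Hypothetical prices stored as strings, as in the question.
const prices = ["999", "10000", "250", "1200"];

// String comparison goes character by character, so "999" sorts
// above "10000" in descending order.
const asStrings = [...prices].sort((a, b) => (a < b ? 1 : a > b ? -1 : 0));

// Converting to numbers first gives the numeric order.
const asNumbers = prices
  .map((p) => parseInt(p, 10))
  .sort((a, b) => b - a);

console.log(asStrings); // [ '999', '250', '1200', '10000' ]
console.log(asNumbers); // [ 10000, 1200, 999, 250 ]
```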

Multiply rows and search query

I need to multiply a field by a number and then filter on the result.
Specifically, I need to multiply the salary in the SAL field by 12 to get the annual total, and then find the records where that total is greater than 30k.
The multiplication works, but once I have the data I can't filter it. I also tried the $match stage.
db.EMP.aggregate({$group:{_id:"$ENAME",Remuneration:{$sum:{$multiply:["$SAL","$COMM"]}}}})
db.EMP.aggregate([{$project:{total:{$multiply:["$SAL",12]}}} ,{$match:{"$total":{$gte:3000}}}] )
db.EMP.aggregate([{$project:{total:{$multiply:["$SAL",12]}}} ,{$gt:{"$total",30000}}] )
Data for MongoDB:
Your query using $match was very close, but you should use total instead of $total because as per the $match docs:
The $match query syntax is identical to the read operation query
syntax; i.e. $match does not accept raw aggregation expressions.
So your pipeline would be as follows (note that the threshold is 30000 here, not the 3000 used in your attempt):
db.EMP.aggregate([
{ $project: { total : { $multiply : ["$SAL", 12 ] } } },
{ $match: { total : { $gt: 30000 } } }
])

Display all data in a row using mongodb

I have a collection myCollection,
ID DateTime myID Cost Flag
1 '2016-07-01T00:00:00' 2048 1 'O'
2 '2016-07-02T00:00:00' 2049 2 'O'
If I write an SQL query for it:
"select DateTime, myID,Flag, min(Cost) from myCollection group by DateTime, myID"
it will display all of these fields, e.g.
DateTime, myID,Flag, min(Cost)
But in the Mongo aggregation framework I can group like this:
db.myCollection.aggregate([
  {
    $group: {
      _id: {
        DateTime: '$DateTime',
        myID: '$myID'
      },
      minCost: { $min: '$Cost' }
    }
  }
])
which returns
DateTime, myID, min(Cost)
but I also need the "Flag" field in the same query. I tried $push, but it only works for an array.
In SQL Server, the query
select DateTime, myID,Flag, min(Cost) from myCollection group by DateTime, myID
is invalid in the select list because the column Flag is not contained in either an aggregate function or the GROUP BY clause.
In MongoDB, to be able to include the field Flag in the aggregate query, you must apply an accumulator operator on the field and in this case you may either use the $first or $last accumulator operators to return the field value within the aggregation.
db.myCollection.aggregate([
  {
    "$group": {
      "_id": {
        "DateTime": '$DateTime',
        "myID": '$myID'
      },
      "minCost": { "$min": '$Cost' },
      "Flag": { "$first": '$Flag' }
    }
  }
])
In the above, the $first accumulator applied to the Flag field returns the Flag value from the first document in each group. Which document counts as first is only well defined when the documents reach $group in a defined order, for example after a $sort stage.
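To illustrate why document order matters for $first, here is a small in-memory sketch in plain JavaScript. It is not MongoDB code: the `rows` data and the `groupFirst` helper are hypothetical, and the sort on Cost plays the role that a preceding $sort stage would play in a pipeline.

```javascript
// Hypothetical rows: two documents share the same group key
// but carry different Flag values.
const rows = [
  { DateTime: "2016-07-01T00:00:00", myID: 2048, Cost: 5, Flag: "X" },
  { DateTime: "2016-07-01T00:00:00", myID: 2048, Cost: 1, Flag: "O" },
];

// Emulates $group with $min on Cost and $first on Flag.
function groupFirst(items) {
  const groups = new Map();
  for (const { DateTime, myID, Cost, Flag } of items) {
    const key = `${DateTime}|${myID}`;
    const g = groups.get(key);
    if (g === undefined) {
      // First document seen for this group: its Flag is kept.
      groups.set(key, { _id: { DateTime, myID }, minCost: Cost, Flag });
    } else {
      g.minCost = Math.min(g.minCost, Cost);
    }
  }
  return [...groups.values()];
}

const unsorted = groupFirst(rows);
// Sorting by Cost first makes the first document in each group
// the one that holds the minimum cost.
const sorted = groupFirst([...rows].sort((a, b) => a.Cost - b.Cost));

console.log(unsorted[0].Flag, sorted[0].Flag); // X O
```

Both runs report the same minCost, but only the sorted run returns the Flag that belongs to the minimum-cost document.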

Usage for MongoDB sort in array

I would like to rank the documents in the array names in descending order by their number value.
Here's part of the structure of my collection:
{
  _id: ObjectId("W"),
  var1: "X",
  var2: "Y",
  var3: "Z",
  comments: {
    names: [
      { number: 1 },
      { number: 3 },
      { number: 2 }
    ],
    field: "Y"
  }
}
But none of my queries using db.collection.find().sort( { "comments.names.number": -1 } ) work.
The desired sorted output is:
{ "_id" : ObjectId("W"), "var1" : "X", "var3" : "Z", "comments" : { "names" : [ { "number" : 3 }, { "number" : 2 }, { "number" : 1 } ], "field": "Y" } }
Can you help me?
You need to aggregate the result, as below:
Unwind the names array.
Sort the records based on comments.names.number in descending order.
Group the records based on the _id field.
Project the required structure.
Code:
db.collection.aggregate([
  { $unwind: "$comments.names" },
  { $sort: { "comments.names.number": -1 } },
  { $group: {
      "_id": "$_id",
      "var1": { $first: "$var1" },
      "var2": { $first: "$var2" },
      "var3": { $first: "$var3" },
      "field": { $first: "$comments.field" },
      "names": { $push: "$comments.names" }
  } },
  { $project: {
      "comments": { "names": "$names", "field": "$field" },
      "var1": 1, "var2": 1, "var3": 1
  } }
], { "allowDiskUse": true })
If your collection is large, you might want to add a $match stage at the beginning of the aggregation pipeline to filter records, or use allowDiskUse: true to facilitate sorting a large number of records.
db.collection.aggregate([
  { $match: { "_id": someId } },
  { $unwind: "$comments.names" },
  { $sort: { "comments.names.number": -1 } },
  { $group: {
      "_id": "$_id",
      "var1": { $first: "$var1" },
      "var2": { $first: "$var2" },
      "var3": { $first: "$var3" },
      "field": { $first: "$comments.field" },
      "names": { $push: "$comments.names" }
  } },
  { $project: {
      "comments": { "names": "$names", "field": "$field" },
      "var1": 1, "var2": 1, "var3": 1
  } }
])
What the query below does:
db.collection.find().sort( { "comments.names.number": -1 } )
is find all the documents and sort them in descending order. For each document, it takes the largest comments.names.number value in that document and uses it as the sort key for the parent document. It doesn't reorder the names array inside each parent document.
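The behaviour described above can be mimicked in plain JavaScript. This is only an in-memory illustration with hypothetical documents and a made-up `maxNumber` helper, not a call into MongoDB.

```javascript
// Hypothetical documents shaped like the one in the question.
const docs = [
  { _id: "A", comments: { names: [{ number: 1 }, { number: 3 }, { number: 2 }] } },
  { _id: "B", comments: { names: [{ number: 5 }, { number: 4 }] } },
];

// Largest number inside a document's names array.
const maxNumber = (doc) => Math.max(...doc.comments.names.map((n) => n.number));

// Emulates sort({ "comments.names.number": -1 }): documents are
// ordered by their largest array element, arrays stay untouched.
const sortedDocs = [...docs].sort((a, b) => maxNumber(b) - maxNumber(a));

console.log(sortedDocs.map((d) => d._id)); // [ 'B', 'A' ]
console.log(sortedDocs[1].comments.names.map((n) => n.number)); // [ 1, 3, 2 ]
```

Document B moves ahead of A because its largest number is 5, while the names array inside A keeps its original 1, 3, 2 order.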
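To store the array in sorted order, you need to update the document:
db.collection.update(
  { _id: 1 },
  {
    $push: {
      // A dotted field path must be quoted to be a valid key
      "comments.names": {
        $each: [],               // push nothing, just re-sort
        $sort: { number: -1 }    // descending by number
      }
    }
  }
)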
check documentation here:
http://docs.mongodb.org/manual/reference/operator/update/sort/#use-sort-with-other-push-modifiers
MongoDB queries sort the result documents based on the collection of fields specified in the sort. They do not sort arrays within a document. If you want the array sorted, you need to sort it yourself after you retrieve the document, or store the array in sorted order. See this old SO answer from Stennie.
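Sorting the array yourself after retrieval might look like the following plain JavaScript sketch. The `doc` object stands in for a document returned by a query, and `withSortedNames` is a hypothetical helper name, not part of any MongoDB API.

```javascript
// Hypothetical document as it might come back from a find() call.
const doc = {
  _id: "W",
  var1: "X",
  comments: {
    names: [{ number: 1 }, { number: 3 }, { number: 2 }],
    field: "Y",
  },
};

// Return a copy whose comments.names is sorted by number, descending.
// The input document is left unchanged.
function withSortedNames(d) {
  const names = [...d.comments.names].sort((a, b) => b.number - a.number);
  return { ...d, comments: { ...d.comments, names } };
}

const sortedDoc = withSortedNames(doc);
console.log(sortedDoc.comments.names.map((n) => n.number)); // [ 3, 2, 1 ]
```

Copying before sorting keeps the retrieved document intact, which is convenient if the same object is reused elsewhere in the application.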

MongoDB query for distinct field values that meet a conditional

I have a collection named 'sentences'. I would like a list of all the unique values of 'last_syls' where the number of entries containing that value of 'last_syls' is greater than 10.
A document in this collection looks like:
{ "_id" : ObjectId( "51dd9011cf2bee3a843f215a" ),
"last_syls" : "EY1D",
"last_word" : "maid"}
I've looked into db.sentences.distinct('last_syls'), but cannot figure out how to query based on the count for each of these distinct values.
You're going to want to use the aggregation framework:
db.sentences.aggregate([
  {
    $group: {
      _id: "$last_syls",
      count: { $sum: 1 }
    }
  },
  {
    $match: {
      count: { $gt: 10 }
    }
  }
])
This groups documents by their last_syls field with a count per group, then filters that result set to all results with a count greater than 10.