MongoDB query for distinct field values that meet a conditional - mongodb

I have a collection named 'sentences'. I would like a list of all the unique values of 'last_syls' where the number of entries containing that value of 'last_syls' is greater than 10.
A document in this collection looks like:
{ "_id" : ObjectId( "51dd9011cf2bee3a843f215a" ),
"last_syls" : "EY1D",
"last_word" : "maid"}
I've looked into db.sentences.distinct('last_syls'), but cannot figure out how to query based on the count for each of these distinct values.

You're going to want to use the aggregation framework:
db.sentences.aggregate([
{
$group: {
_id: "$last_syls",
count: { $sum: 1}
}
},
{
$match: {
count: { $gt: 10 }
}
}
])
This groups documents by their last_syls field with a count per group, then filters that result set to all results with a count greater than 10.
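To make the two stages concrete, here is a plain JavaScript sketch that mimics them on an in-memory array. It does not talk to MongoDB; the function name frequentValues and the sample documents are invented for illustration.

```javascript
// Hypothetical in-memory stand-in for the pipeline above (not a MongoDB API).
function frequentValues(docs, field, minCount) {
  // Stage 1, like $group with { $sum: 1 }: count documents per distinct value.
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  // Stage 2, like $match with { $gt: minCount }: keep only the larger groups.
  return [...counts]
    .filter(([, count]) => count > minCount)
    .map(([value, count]) => ({ _id: value, count }));
}

// Invented sample data: three "EY1D" entries and one "AE1T" entry.
const sample = [
  { last_syls: "EY1D" }, { last_syls: "AE1T" },
  { last_syls: "EY1D" }, { last_syls: "EY1D" },
];
console.log(frequentValues(sample, "last_syls", 2));
```

With a threshold of 2, only the "EY1D" group survives the filter, which is the same shape of result the $match stage leaves behind.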

Related

Get record having highest date inside nested group in Mongodb

I have a record set like the one below:
I need to write a query that, for each data type of every parent, shows the record with the highest date, i.e.
So far I have been able to create two groups, one on the parent ID and the other on the data type, but I cannot work out how to get the record with the max date.
Below is my query:
db.getCollection('Maintenance').aggregate( [{ $group :
{ _id :{ parentName: "$ParentID" , maintainancename : "$DataType" }}},
{ $group : {
_id : "$_id.parentName",
maintainancename: {
$push: {
term:"$_id.DataType"
}
}
}
}] )
You don't have to $group twice; try the aggregation query below:
db.collection.aggregate([
/** group on two fields `ParentID` & `Datatype`,
* which will leave docs with unique `ParentID + Datatype`
* & use `$max` to get max value on `Date` field in unique set of docs */
{
$group: {
_id: {
parentName: "$ParentID",
maintainancename: "$Datatype"
},
"Date": { $max: "$Date" }
}
}
])
Note: After the $group stage, you can use $project or $addFields stages to transform the fields the way you want.
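As a rough illustration of that follow-up stage, the sketch below appends a $project that flattens the compound _id. The output names parentName, dataType and latestDate are assumptions chosen for the example, not something the question specifies.

```javascript
// Sketch only: input field names follow the answer above; the output names
// parentName, dataType and latestDate are assumptions for illustration.
const latestPerTypePipeline = [
  {
    $group: {
      _id: { parentName: "$ParentID", maintainancename: "$Datatype" },
      Date: { $max: "$Date" },
    },
  },
  {
    // Flatten the compound _id into top-level fields.
    $project: {
      _id: 0,
      parentName: "$_id.parentName",
      dataType: "$_id.maintainancename",
      latestDate: "$Date",
    },
  },
];

// In the mongo shell this would be passed as:
// db.collection.aggregate(latestPerTypePipeline)
console.log(JSON.stringify(latestPerTypePipeline, null, 2));
```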

Select data where the range between two different fields contains a given number

I want to run a find query on my database for documents where an input value falls within the range defined by two fields, LOC_CEP_INI and LOC_CEP_FIM (bounds included).
Example: the user enters the number 69923994, and I use this input to search my database for all documents whose LOC_CEP_INI to LOC_CEP_FIM range contains it.
One of my documents (in this example this document is selected by the query because the input is inside the range):
{
"_id" : ObjectId("570d57de457405a61b183ac6"),
"LOC_CEP_FIM" : 69923999, //this field is number
"LOC_CEP_INI" : 69900001, // this field is number
"LOC_NO" : "RIO BRANCO",
"LOC_NU" : "00000016",
"MUN_NU" : "1200401",
"UFE_SG" : "AC",
"create_date" : ISODate("2016-04-12T20:17:34.397Z"),
"__v" : 0
}
db.collection.find( { field: { $gt: value1, $lt: value2 } } );
https://docs.mongodb.com/v3.2/reference/method/db.collection.find/
See the reference above: MongoDB supports range queries with $gt and $lt.
You have to invert your field names and query value.
db.zipcodes.find({
LOC_CEP_INI: {$gte: 69923997},
LOC_CEP_FIM: {$lte: 69923997}
});
For your query example to work, your documents would need an array property in which each item has a property named 69923997. Mongo would then check that this property's value lies between "LOC_CEP_INI" and "LOC_CEP_FIM" for each item in the array.
Also, I'm not sure whether you want LOC_CEP_INI <= 69923997 <= LOC_CEP_FIM or the contrary. The query above matches LOC_CEP_INI >= 69923997 >= LOC_CEP_FIM, so for the range described in the question you would switch the $gte and $lte conditions.
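To pin down the direction of the comparison, here is a small sketch for the case the question describes, LOC_CEP_INI <= value <= LOC_CEP_FIM. The helper names cepRangeFilter and containsCep are hypothetical; only the field names and numbers come from the question.

```javascript
// Hypothetical helper: builds the filter for LOC_CEP_INI <= value <= LOC_CEP_FIM.
function cepRangeFilter(value) {
  return {
    LOC_CEP_INI: { $lte: value },
    LOC_CEP_FIM: { $gte: value },
  };
}

// Plain JavaScript version of the same test, for checking one document locally.
function containsCep(doc, value) {
  return doc.LOC_CEP_INI <= value && value <= doc.LOC_CEP_FIM;
}

// Bounds taken from the sample document in the question.
const rioBranco = { LOC_CEP_INI: 69900001, LOC_CEP_FIM: 69923999 };
console.log(containsCep(rioBranco, 69923994)); // true
console.log(cepRangeFilter(69923994));
// In the shell: db.zipcodes.find(cepRangeFilter(69923994))
```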
db.zipcodes.find( {
"LOC_CEP_INI": { "$lte": 69900002 },
"LOC_CEP_FIM": { "$gte": 69900002 } })
Here is the logic; adapt it as needed:
Userdb.aggregate([
{ "$match": { _id: ObjectId(session._id)}},
{ $project: {
checkout_list: {
$filter: {
input: "$checkout_list",
as: "checkout_list",
cond: {
$and: [
{ $gte: [ "$$checkout_list.createdAt", new Date(date1) ] },
{ $lt: [ "$$checkout_list.createdAt", new Date(date2) ] }
]
}
}
}
}
}
])
Here I use $filter because, for some reason, a direct query on the nested data did not succeed in MongoDB.

Count occurrences of duplicate values

How do I structure my MongooseJS/MongoDB query to get total duplicates/occurrences of a particular field value? Aka: The total documents with custID of some value for all custIDs
I can do this manually in command line:
db.tapwiser.find({"custID" : "12345"}, {}, {}).count();
Outputs: 1
db.tapwiser.find({"custID" : "6789"}, {}, {}).count();
Outputs: 4
I found this resource:
How to sum distinct values of a field in a MongoDB collection (utilizing mongoose)
But it requires that I specify the unique fields I want to sum.
In this case, I want to loop through all documents, sum the occurrences of each.
All you need to do is $group your documents by custID and use the $sum accumulator operator to return "count" for each group.
db.tapwiser.aggregate(
[
{ "$group": { "_id": "$custID", "count": { "$sum": 1 } } }
], function(err, results) {
// Do something with the results
}
)
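If your Mongoose version favors promises over callbacks, the same call can be awaited instead. The sketch below assumes `model` is a Mongoose model, or at least an object whose aggregate(pipeline) returns a promise or thenable; countByCustId and fakeModel are names made up for the example.

```javascript
// Sketch: `model` is assumed to be a Mongoose model (or anything exposing an
// aggregate(pipeline) method that returns a promise or thenable).
async function countByCustId(model) {
  const results = await model.aggregate([
    { $group: { _id: "$custID", count: { $sum: 1 } } },
  ]);
  return results; // an array of { _id: <custID>, count: <number> }
}

// Stand-in object so the sketch runs without a database connection.
const fakeModel = {
  aggregate: async () => [
    { _id: "12345", count: 1 },
    { _id: "6789", count: 4 },
  ],
};
countByCustId(fakeModel).then((rows) => console.log(rows));
```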

usage for MongoDB sort in array

I would like to rank, in descending order, the documents in the names array by their number value.
Here's the relevant part of my collection's structure:
{
_id: ObjectId("W"),
var1: "X",
var2: "Y",
var3: "Z",
comments: {
names: [
{
number: 1
},
{
number: 3
},
{
number: 2
}
],
field: "Y"
}
}
but none of my requests with db.collection.find().sort( { "comments.names.number": -1 } ) work.
The desired sorted output is:
{ "_id" : ObjectId("W"), "var1" : "X", "var3" : "Z", "comments" : { "names" : [ { "number" : 3 }, { "number" : 2 }, { "number" : 1 } ], "field": "Y" } }
Can you help me?
You need to aggregate the result, as below:
Unwind the names array.
Sort the records based on comments.names.number in descending order.
Group the records based on the _id field.
Project the required structure.
Code:
db.collection.aggregate([
{$unwind:"$comments.names"},
{$sort:{"comments.names.number":-1}},
{$group:{"_id":"$_id",
"var1":{$first:"$var1"},
"var2":{$first:"$var2"},
"var3":{$first:"$var3"},
"field":{$first:"$comments.field"},
"names":{$push:"$comments.names"}}},
{$project:{"comments":{"names":"$names","field":"$field"},"var1":1,
"var2":1,"var3":1}}
],{"allowDiskUse":true})
If your collection is large, you might want to add a $match stage at the beginning of the aggregation pipeline to filter records, or use allowDiskUse: true to allow sorting a large number of records.
db.collection.aggregate([
{$match:{"_id":someId}},
{$unwind:"$comments.names"},
{$sort:{"comments.names.number":-1}},
{$group:{"_id":"$_id",
"var1":{$first:"$var1"},
"var2":{$first:"$var2"},
"var3":{$first:"$var3"},
"field":{$first:"$comments.field"},
"names":{$push:"$comments.names"}}},
{$project:{"comments":{"names":"$names","field":"$field"},"var1":1,
"var2":1,"var3":1}}
])
What the query below does:
db.collection.find().sort( { "comments.names.number": -1 } )
is find all the documents and sort them in descending order by the number field. More precisely, for each document it takes the largest comments.names.number value and then sorts the parent documents by that value. It doesn't reorder the names array inside each parent document.
You need to update the document to sort the array.
db.collection.update(
{ _id: 1 },
{
$push: {
"comments.names": {
$each: [ ],
$sort: { number: -1 }
}
}
}
)
check documentation here:
http://docs.mongodb.org/manual/reference/operator/update/sort/#use-sort-with-other-push-modifiers
MongoDB queries sort the result documents based on the collection of fields specified in the sort. They do not sort arrays within a document. If you want the array sorted, you need to sort it yourself after you retrieve the document, or store the array in sorted order. See this old SO answer from Stennie.
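For the "sort it yourself after you retrieve the document" route, a minimal sketch in plain JavaScript could look like this. The helper name withNamesSortedDesc is hypothetical, and the sample object only mirrors the shape from the question.

```javascript
// Hypothetical helper: returns a copy of the document whose comments.names
// array is ordered by `number`, largest first. The input is left untouched.
function withNamesSortedDesc(doc) {
  const names = [...doc.comments.names].sort((a, b) => b.number - a.number);
  return { ...doc, comments: { ...doc.comments, names } };
}

// Shape taken from the question; the values are placeholders.
const fetched = {
  var1: "X",
  comments: { names: [{ number: 1 }, { number: 3 }, { number: 2 }], field: "Y" },
};
console.log(JSON.stringify(withNamesSortedDesc(fetched).comments.names));
```

Copying the array before sorting keeps the retrieved document unchanged, which is convenient if the original order is still needed elsewhere.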

How to count the number of documents on date field in MongoDB

Scenario: Consider, I have the following collection in the MongoDB:
{
"_id" : "CustomeID_3723",
"IsActive" : "Y",
"CreatedDateTime" : "2013-06-06T14:35:00Z"
}
Now I want to know the count of documents created on a particular day (say, 2013-03-04).
So, I am trying to find the solution using aggregation framework.
Information:
So far I have the following query built:
collection.aggregate([
{ $group: {
_id: '$CreatedDateTime'
}
},
{ $group: {
count: { _id: null, $sum: 1 }
}
},
{ $project: {
_id: 0,
"count" :"$count"
}
}
])
Issue: The above query gives me a count, but not one based on the date alone! It also takes the time into consideration for the unique count.
Question: Considering the field holds an ISO date, can anyone tell me how to count the documents based only on the date (i.e. excluding the time)?
Replace your two groups with:
{$project:{day:{$dayOfMonth:'$CreatedDateTime'},month:{$month:'$CreatedDateTime'},year:{$year:'$CreatedDateTime'}}},
{$group:{_id:{day:'$day',month:'$month',year:'$year'}, count: {$sum:1}}}
You can read more about the date operators here: http://docs.mongodb.org/manual/reference/aggregation/#date-operators
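To see what grouping on day, month and year amounts to, here is an in-memory sketch in plain JavaScript. It does not use MongoDB; countPerDay and the sample timestamps are invented, and it assumes CreatedDateTime is a Date or an ISO 8601 string interpreted in UTC.

```javascript
// Hypothetical in-memory stand-in for the $project + $group stages above.
// Assumes CreatedDateTime is a Date or an ISO 8601 string, and uses UTC.
function countPerDay(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const d = new Date(doc.CreatedDateTime);
    // Key on the calendar date only, so the time of day is ignored.
    const key = `${d.getUTCFullYear()}-${d.getUTCMonth() + 1}-${d.getUTCDate()}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Invented timestamps: two on 6 June 2013 and one on 7 June 2013.
const created = [
  { CreatedDateTime: "2013-06-06T14:35:00Z" },
  { CreatedDateTime: "2013-06-06T09:00:00Z" },
  { CreatedDateTime: "2013-06-07T00:10:00Z" },
];
console.log(countPerDay(created));
```

Note that the aggregation date operators expect real date values, so a field stored as a string would need converting first.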