MongoDB get only matching elements from nested array

I have a collection like this:
{
_id : 123,
username : "xy",
comments : [
{
text : "hi",
postdate : 123456789
},
{
text : "hi1",
postdate : 555555555
},
{
text : "hi2",
postdate : 666666666
},
{
text : "hi3",
postdate : 987654321
}
]}
Now I want only the comments whose postdate is between 555555555 and 987654321, inclusive. I have this query, but it doesn't work:
db.post.aggregate([
{$match : {$and : [
{"_id": ObjectId("123")},
{"comments.posttime" : {$lte : 987654321}},
{"comments.posttime" : {$gte : 555555555}}
]}}
,{$unwind: "$comments"}]).pretty();
But when I try this it gets me all of the array elements. How should this be done?
Thank you!

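Use $redact to restrict the contents of the document. At each level, $$DESCEND keeps the current level and continues into its embedded documents, while $$PRUNE drops it. The $ifNull default keeps the top-level document, which has no postdate field of its own: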
db.an.aggregate([{
$redact: {
"$cond": [{
$and: [{
"$gte": [{
"$ifNull": ["$postdate", 555555555]
},
555555555
]
}, {
"$lte": [{
"$ifNull": ["$postdate", 987654321]
},
987654321
]
}]
}, "$$DESCEND", "$$PRUNE"]
}
}]).pretty()

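You have to $unwind the comments first and then $match. A $match placed before $unwind only selects whole documents, so every array element comes back; once the array is flattened, the match condition can filter individual comments. Note that the field is postdate, not posttime, and that _id in the sample is the number 123, not an ObjectId:
db.post.aggregate([
{$unwind: "$comments"},
{$match : {$and : [
{"_id": 123},
{"comments.postdate" : {$lte : 987654321}},
{"comments.postdate" : {$gte : 555555555}}
]}}
])
This gives one document per matching comment. If you want the matching comments back inside an array, add a $group stage on _id that uses $push to collect the comments.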
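The regrouping step mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the collection name post and the field names are taken from the question, and the helper filterComments is a hypothetical plain-JavaScript stand-in that mimics what the stages compute for a single document, so the result can be checked without a database.

```javascript
// Pipeline sketch: select the document, flatten the array, filter the
// elements, then rebuild the array with $group and $push.
const pipeline = [
  { $match: { _id: 123 } },
  { $unwind: "$comments" },
  { $match: { "comments.postdate": { $gte: 555555555, $lte: 987654321 } } },
  { $group: {
      _id: "$_id",
      username: { $first: "$username" },
      comments: { $push: "$comments" }
  } }
];
// In the shell this would be run as: db.post.aggregate(pipeline)

// Hypothetical helper: plain-JavaScript illustration of the same filtering.
function filterComments(doc, min, max) {
  const kept = doc.comments.filter(c => c.postdate >= min && c.postdate <= max);
  return { _id: doc._id, username: doc.username, comments: kept };
}

// Sample document from the question.
const sample = {
  _id: 123,
  username: "xy",
  comments: [
    { text: "hi", postdate: 123456789 },
    { text: "hi1", postdate: 555555555 },
    { text: "hi2", postdate: 666666666 },
    { text: "hi3", postdate: 987654321 }
  ]
};

const result = filterComments(sample, 555555555, 987654321);
console.log(result.comments.map(c => c.text)); // [ 'hi1', 'hi2', 'hi3' ]
```

One difference to keep in mind: if no comment matches, the pipeline returns no document at all, whereas the helper returns the document with an empty comments array.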

How to find duplicate records in MongoDB: which query to use?

I have the collection below and need to find the duplicate records in it. Here is one sample document; the collection holds more than 10,000 records.
/* 1 */
{
"_id" : 1814099,
"eventId" : "LAS012",
"eventName" : "CustomerTab",
"timeStamp" : ISODate("2018-12-31T20:09:09.820Z"),
"eventMethod" : "click",
"resourceName" : "CustomerTab",
"targetType" : "",
"resourseUrl" : "",
"operationName" : "",
"functionStatus" : "",
"results" : "",
"pageId" : "CustomerPage",
"ban" : "290824901",
"jobId" : "87377713",
"wrid" : "87377713",
"jobType" : "IBJ7FXXS",
"Uid" : "sc343x",
"techRegion" : "W",
"mgmtReportingFunction" : "N",
"recordPublishIndicator" : "Y",
"__v" : 0
}
We can first find one _id per eventId (the copy to keep) using
const data = await db.collection.aggregate([
{
$group: {
_id: "$eventId",
id: {
"$first": "$_id"
}
}
},
{
$group: {
_id: null,
uniqueIds: {
$push: "$id"
}
}
}
]).toArray();
aggregate() returns a cursor, so toArray() is needed to read the single result document. We can then make another query, which finds all the duplicate documents:
db.collection.find({_id: {$nin: data[0].uniqueIds}})
This finds every document that is a redundant copy of one of the kept documents.
Another way is to find the eventId values that are duplicated:
db.collection.aggregate([
{"$group" : { "_id": "$eventId", "count": { "$sum": 1 } } },
{"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } }
])
To get the duplicates themselves, group by eventId, collect the _id values of each group with $addToSet, and count the members with $sum. A $match stage on the count field with $gt: 1 then keeps only the groups that contain more than one document:
db.collection.aggregate([
{$group: {
_id: {eventId: "$eventId"},
uniqueIds: {$addToSet: "$_id"},
count: {$sum: 1}
}
},
{$match: {
count: {"$gt": 1}
}
}
]);
This assumes that eventId is meant to be unique, so any eventId shared by several documents marks a duplicate.
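To make the group-then-filter idea concrete, here is a small plain-JavaScript sketch of what the $group and $match stages compute. It runs on an in-memory array rather than a collection; the helper name findDuplicates and the second and third sample documents are hypothetical, added only for illustration.

```javascript
// Group documents by a key and keep only the keys that occur more than once.
function findDuplicates(docs, key) {
  const groups = new Map();
  for (const doc of docs) {
    const k = doc[key];
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(doc._id); // collect the _id values of each group
  }
  // Counterpart of {$match: {count: {$gt: 1}}}
  return [...groups.entries()]
    .filter(([, ids]) => ids.length > 1)
    .map(([k, ids]) => ({ _id: { [key]: k }, uniqueIds: ids, count: ids.length }));
}

const events = [
  { _id: 1814099, eventId: "LAS012" }, // from the sample document
  { _id: 1814100, eventId: "LAS013" }, // hypothetical
  { _id: 1814101, eventId: "LAS012" }  // hypothetical duplicate of LAS012
];

const duplicates = findDuplicates(events, "eventId");
console.log(JSON.stringify(duplicates));
```

For a collection of around 10,000 documents the server-side pipeline is the natural choice; the sketch only shows the shape of the result.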

MongoDB Closest Match on properties

Let's say I have a Users collection in MongoDB whose schema looks like this:
{
name: String,
sport: String,
favoriteColor: String
}
And let's say I passed in values like this to match a user on:
{ name: "Thomas", sport: "Tennis", favoriteColor:"blue" }
What I would like to do is match the user based off all those properties. However, if no user comes back, I would like to match a user on just these properties:
{sport: "Tennis", favoriteColor:"blue" }
And if no user comes back, I would like to match a user on just this property:
{ favoriteColor: "blue" }
Is it possible to do something like this in one query with Mongo? I saw the $switch condition in Mongo that will match on a case and then immediately return, but the problem is that I can't access the document it would have retrieved in the then block. It looks like you can only write strings in there.
Any suggestions on how to accomplish what I'm looking for?
Is the best thing (and only way) to just execute multiple User.find({...}) queries?
This is a good case for a MongoDB text index.
First, create a text index on those fields:
db.users.createIndex({ name: "text", sport: "text", favoriteColor: "text" });
Then search with $text, sort by the text score so that the best matches come first, and limit the number of results:
db.users.find( { $text: { $search: "Tennis blue Thomas" } }, { score: { $meta: "textScore" } } ).sort( { score: { $meta: "textScore" } } ).limit(10)
Try adding a rank to every document in an aggregation pipeline: give each field a weight, sum the weights of the fields that match, and $sort descending so that the best-matching documents come first.
name -> 1
sport -> 2
favoriteColor -> 4
With these weights, a favoriteColor match (4) always outranks a match on sport, name, or both combined (at most 3).
The aggregation pipeline:
db.col.aggregate([
{$match : {
$or : [
{"name" : {$eq :"Thomas"}},
{"sport" : {$eq : "Tennis"}},
{"favoriteColor" : {$eq : "blue"}}
]
}},
{$addFields : {
rank : {$sum : [
{$cond : [{$eq : ["$name", "Thomas"]}, 1, 0]},
{$cond : [{$eq : ["$sport", "Tennis"]}, 2, 0]},
{$cond : [{$eq : ["$favoriteColor", "blue"]}, 4, 0]}
]}
}},
{$match : {rank :{$gt : 0}}},
{$sort : {rank : -1}}
])
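The weighting scheme above can be checked with a short plain-JavaScript sketch. The weights come from the answer; the rank helper and the sample users are hypothetical and exist only to show how the sum orders the candidates.

```javascript
// Weights from the answer: favoriteColor outweighs sport and name combined.
const weights = { name: 1, sport: 2, favoriteColor: 4 };

// Sum the weights of the fields that match the wanted values.
function rank(doc, wanted) {
  return Object.keys(weights).reduce(
    (sum, field) => sum + (doc[field] === wanted[field] ? weights[field] : 0),
    0
  );
}

const wanted = { name: "Thomas", sport: "Tennis", favoriteColor: "blue" };

// Hypothetical sample users.
const users = [
  { name: "Anna", sport: "Tennis", favoriteColor: "red" },     // rank 2
  { name: "Thomas", sport: "Golf", favoriteColor: "green" },   // rank 1
  { name: "Thomas", sport: "Tennis", favoriteColor: "green" }, // rank 3
  { name: "Ben", sport: "Chess", favoriteColor: "blue" }       // rank 4
];

const ranked = users
  .map(u => ({ ...u, rank: rank(u, wanted) }))
  .filter(u => u.rank > 0)            // counterpart of {$match: {rank: {$gt: 0}}}
  .sort((a, b) => b.rank - a.rank);   // counterpart of {$sort: {rank: -1}}

console.log(ranked.map(u => `${u.name}:${u.rank}`).join(", "));
```

Here the user who matches only favoriteColor (rank 4) sorts ahead of the user who matches both name and sport (rank 3), which is the ordering the weights are meant to guarantee.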
Hopefully this query satisfies your requirement; it returns the relevant result in a single database hit. Every fallback level in the question requires favoriteColor to match, so the pipeline matches on that first, ranks each document by how many levels it satisfies (name, sport and favoriteColor; then sport and favoriteColor; then favoriteColor alone), and keeps only the documents with the highest rank:
db.collection.aggregate([
{$match : {"favoriteColor" : "blue"}},
{$addFields : {
rank : {$sum : [
{$cond : [{$and:[{$eq : ["$name", "Thomas"]},{$eq : ["$sport", "Tennis"]},{$eq : ["$favoriteColor", "blue"]}] } , 1, 0]},
{$cond : [{$and:[{$eq : ["$sport", "Tennis"]},{$eq : ["$favoriteColor", "blue"]}] } , 2, 0]},
{$cond : [{$eq : ["$favoriteColor", "blue"]} , 3, 0]}
]}
}},
{$group:{
_id:null,
doc:{$push:'$$ROOT'},
rank:{$max:'$rank'}
}},
{$unwind:'$doc'},
{$redact: {
$cond: {
if: { $eq: [ "$doc.rank", '$rank' ] },
then: "$$KEEP",
else: "$$PRUNE"
}
}},
{
$project:{
name:'$doc.name',
sport:'$doc.sport',
favoriteColor:'$doc.favoriteColor',
}}
])
Simply build the query object dynamically and use it in a $match stage of an aggregation pipeline, or pass it to find(). Create a JavaScript object and add a condition for each value that was supplied:
var query={};
if(name!=null){
query['name']={ '$eq': name};
}
if(sport!=null){
query['sport']={ '$eq': sport};
}
if(favoriteColor!=null){
query['favoriteColor']={ '$eq': favoriteColor};
}
db.collection.find(query)
This returns the documents that exactly match whichever fields were supplied.
Did you try $or? https://docs.mongodb.com/manual/reference/operator/query/or/
I used it when I wanted to check whether a username or an email exists.

MongoDB: how to write a function in a query, maybe with aggregation?

The task is: calculate the average age of the users who have more than 3 strengths listed.
One of the documents looks like this:
{
"_id" : 1.0,
"user_id" : "jshaw0",
"first_name" : "Judy",
"last_name" : "Shaw",
"email" : "jshaw0#merriam-webster.com",
"age" : 39.0,
"status" : "disabled",
"join_date" : "2016-09-05",
"last_login_date" : "2016-09-30 23:59:36 -0400",
"address" : {
"city" : "Deskle",
"province" : "PEI"
},
"strengths" : [
"star schema",
"dw planning",
"sql",
"mongo queries"
],
"courses" : [
{
"code" : "CSIS2300",
"total_questions" : 118.0,
"correct_answers" : 107.0,
"incorect_answers" : 11.0
},
{
"code" : "CSIS3300",
"total_questions" : 101.0,
"correct_answers" : 34.0,
"incorect_answers" : 67.0
}
]
}
I know I need to count how many strengths each document has, compare that count with $gt, and then calculate the average age.
However, I don't know how to combine the two functions, count and average, in one query. Do I need to use aggregation, and if so, how?
Thanks so much
Use $redact to match on the array size and $group to calculate the average:
db.collection.aggregate([{
"$redact": {
"$cond": [
{ "$gt": [{ "$size": "$strengths" }, 3] },
"$$KEEP",
"$$PRUNE"
]
}
}, {
$group: {
_id: 1,
average: { $avg: "$age" }
}
}])
The $redact stage checks whether the strengths array has more than 3 elements: it keeps ($$KEEP) each document that meets the condition and drops ($$PRUNE) the rest. See the $redact documentation.
The $group stage then simply computes the average with $avg.
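As a rough cross-check of the filter-then-average logic, the same computation can be written in plain JavaScript over an in-memory array. The helper averageAge and the second and third users are hypothetical; only the first user is taken (abridged) from the question.

```javascript
// Keep users with more than `minStrengths` strengths, then average their age.
function averageAge(users, minStrengths) {
  const selected = users.filter(
    u => Array.isArray(u.strengths) && u.strengths.length > minStrengths
  );
  if (selected.length === 0) return null; // no matching users
  const total = selected.reduce((sum, u) => sum + u.age, 0);
  return total / selected.length;
}

const users = [
  // From the question (abridged): 4 strengths, age 39
  { user_id: "jshaw0", age: 39,
    strengths: ["star schema", "dw planning", "sql", "mongo queries"] },
  // Hypothetical: only 2 strengths, so excluded from the average
  { user_id: "user_a", age: 25, strengths: ["sql", "etl"] },
  // Hypothetical: 5 strengths, age 51
  { user_id: "user_b", age: 51,
    strengths: ["sql", "etl", "olap", "python", "mongo queries"] }
];

const average = averageAge(users, 3);
console.log(average); // 45
```

The Array.isArray guard is a choice made in this sketch; in the pipeline above, $size raises an error if strengths is missing or not an array.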

How to sort sub-documents in the array field?

I'm using the MongoDB shell to fetch some results, ordered. Here's a sample:
{
"_id" : "32022",
"topics" : [
{
"weight" : 281.58551703724993,
"words" : "some words"
},
{
"weight" : 286.6695125796183,
"words" : "some more words"
},
{
"weight" : 289.8354232846977,
"words" : "wowz even more wordz"
},
{
"weight" : 305.70093587160807,
"words" : "WORDZ"
}]
}
What I want to get is the same structure, but with the "topics" array ordered by weight, descending:
{
"_id" : "32022",
"topics" : [
{
"weight" : 305.70093587160807,
"words" : "WORDZ"
},
{
"weight" : 289.8354232846977,
"words" : "wowz even more wordz"
},
{
"weight" : 286.6695125796183,
"words" : "some more words"
},
{
"weight" : 281.58551703724993,
"words" : "some words"
},
]
}
I managed to get some ordered results, but had no luck grouping them back by the _id field. Is there a way to do this?
MongoDB doesn't provide a way to do this out of the box, but there is a workaround: update your documents and use the $sort modifier of $push to sort the array in place. In the shell the method is updateMany (update_many in PyMongo):
db.collection.updateMany({}, {"$push": {"topics": {"$each": [], "$sort": {"weight": -1}}}})
You can still use the .aggregate() method like this:
db.collection.aggregate([
{"$unwind": "$topics"},
{"$sort": {"_id": 1, "topics.weight": -1}},
{"$group": {"_id": "$_id", "topics": {"$push": "$topics"}}}
])
But this is less efficient if all you want is to sort the array, so avoid it where you can.
You could always do this client side using the .sort or sorted function.
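The client-side option could look something like this in JavaScript. It is a minimal sketch: sortTopicsByWeight is a hypothetical helper name, and the document is the sample from the question.

```javascript
// Return a copy of the document with topics sorted by weight, descending.
// The original document and its topics array are left untouched.
function sortTopicsByWeight(doc) {
  const topics = [...doc.topics].sort((a, b) => b.weight - a.weight);
  return { ...doc, topics };
}

const doc = {
  _id: "32022",
  topics: [
    { weight: 281.58551703724993, words: "some words" },
    { weight: 286.6695125796183, words: "some more words" },
    { weight: 289.8354232846977, words: "wowz even more wordz" },
    { weight: 305.70093587160807, words: "WORDZ" }
  ]
};

const sorted = sortTopicsByWeight(doc);
console.log(sorted.topics.map(t => t.words));
```

Copying the array before sorting is a deliberate choice here, because Array.prototype.sort sorts in place and would otherwise modify the fetched document.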
If you don't want to update but only get documents, you can use the following query
db.test.aggregate(
[
{$unwind : "$topics"},
{$sort : {"topics.weight":-1}},
{"$group": {"_id": "$_id", "topics": {"$push": "$topics"}}}
]
)
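This works for me (note that it sorts by words rather than weight and returns one document per topic, without regrouping):
db.getCollection('mycollection').aggregate([
{$project:{topics:1}},
{$unwind:"$topics"},
{$sort :{"topics.words":1}}])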

MongoDB projection into array

The document below holds the dob of a student and the dob of each of the student's parents.
{
"_id" : ObjectId("56a31573a3b1f89cb895abd3"),
"dob" : {
"isodate" : ISODate("1996-01-21T18:30:00.000+0000")
},
"parent" : [
{
"dob" : {
"isodate" : ISODate("1956-07-21T18:30:00.000+0000")
},
"type" : "father"
},
{
"dob" : {
"isodate" : ISODate("1958-11-01T18:30:00.000+0000")
},
"type" : "mother"
}
]
}
In one of the application's use cases, it is better to receive the output in the format below:
{
"_id" : ObjectId("56a31573a3b1f89cb895abd3"),
"dob" : {
"isodate" : ISODate("1996-01-21T18:30:00.000+0000")
},
"type" : "student"
},
{
"_id" : ObjectId("56a31573a3b1f89cb895abd3"),
"dob" : {
"isodate" : ISODate("1956-07-21T18:30:00.000+0000")
},
"type" : "father"
},
{
"_id" : ObjectId("56a31573a3b1f89cb895abd3"),
"dob" : {
"isodate" : ISODate("1958-11-01T18:30:00.000+0000")
},
"type" : "mother"
}
My approach is to $project the fields into an array and then $unwind that array. However, projection doesn't allow me to create an array.
I believe $group and its accumulators cannot be used, as my operations are on the same document in the pipeline.
Is this possible?
Note: I have the flexibility to change the document design as well.
For Mongo 3.0
Here I have included a [null] array, which gives me a way to build an array in a projection using a combination of $map, $cond and $setDifference. The output of this is passed to $setUnion together with the $parent array.
db.p1.aggregate(
{ "$project": {
"allVal": {
'$setUnion': [
{"$setDifference": [
{ "$map": {
"input": [null],
"as": "type",
"in": { "$cond": [
{"$eq": ["$$type", null]},
{dob:"$dob", type:{$literal:'student'}},
null
]}
}},
[null]
]}
,
'$parent'
]
}
}},
{$unwind : '$allVal'}
)
For Mongo 3.2
This is much cleaner, as it avoids the $setDifference workaround over a [null] array.
db.p1.aggregate([
{
$project:{
parent : 1,
type: {$literal : 'student'},
'dob.isodate' : 1
}
},
{
$project:{
allValues: { $setUnion: [ [{dob:"$dob", type:'$type'}], "$parent" ] }
}
},
{
$unwind : '$allValues'
}
])
In the first projection, I add a new field called type.
In the second projection, I create a new array from two different nodes of the same document.
This solution works as of Mongo 3.2.
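For comparison, the target shape from the question can also be produced client side. The following plain-JavaScript sketch is not part of either pipeline above: flattenFamily is a hypothetical helper, and the _id is kept as a plain string so the example runs without a driver.

```javascript
// Build one record for the student and one per parent, all sharing the _id.
function flattenFamily(doc) {
  const student = { _id: doc._id, dob: doc.dob, type: "student" };
  const parents = doc.parent.map(p => ({ _id: doc._id, dob: p.dob, type: p.type }));
  return [student, ...parents];
}

// Sample document from the question; the ObjectId is written as a string here.
const family = {
  _id: "56a31573a3b1f89cb895abd3",
  dob: { isodate: new Date("1996-01-21T18:30:00.000Z") },
  parent: [
    { dob: { isodate: new Date("1956-07-21T18:30:00.000Z") }, type: "father" },
    { dob: { isodate: new Date("1958-11-01T18:30:00.000Z") }, type: "mother" }
  ]
};

const records = flattenFamily(family);
console.log(records.map(r => r.type)); // [ 'student', 'father', 'mother' ]
```

Unlike the pipelines, which nest each entry under allVal or allValues after $unwind, this sketch returns the flat records exactly as shown in the desired output.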