MongoDB aggregation and paging - mongodb

I have documents that each contain my internal id field and the date when the document was added. There can be a number of documents with the same id (different versions of the same document), but the dates will always be different for those documents. In a single query, I want to fetch only one document out of all versions of the same document (same id field): the one that was relevant at a specified date. I also want to display the results with paging (50 rows per page). So, is there any way to do this in MongoDB (query documents by some field, group them by the id field, sort by the date field and take only the first, all of this with paging)?
Please see the example. These are the documents; some are different documents, like documents A, B and C, and some are versions of the same document:
_id: 1, 2 and 3 are all versions of the same document A.
Document A {
_id : 1,
"id" : "A",
"author" : "value",
"date" : "2015-11-05"
}
Document A {
_id : 2,
"id" : "A",
"author" : "value",
"date" : "2015-11-06"
}
Document A {
_id : 3,
"id" : "A",
"author" : "value",
"date" : "2015-11-07"
}
Document B {
_id : 4,
"id" : "B",
"author" : "value",
"date" : "2015-11-06"
}
Document B {
_id : 5,
"id" : "B",
"author" : "value",
"date" : "2015-11-07"
}
Document C {
_id : 6,
"id" : "C",
"author" : "value",
"date" : "2015-11-07"
}
I want to query all documents that have "value" in the "author" field,
and from those documents fetch only one per id: the one with the latest date
as of the specified date, for example 2015-11-08. So, I expect the result to be:
_id : 3, _id : 5, _id : 6
I also want paging, for example 10 documents per page.
Thanks !!!!!

Two documents can't have the same _id. There is a unique index on _id by default.
Because of this, you need a compound _id field which includes the date:
{
"_id":{
docId: yourFormerIdValue,
date: new ISODate()
}
// other fields
}
To get the version valid at a specified date, the query becomes rather easy:
db.yourColl.find({
// use dot notation: an embedded-document filter on "_id" would require an exact match
"_id.docId": idToFind,
// get only the versions valid up to a specific date...
"_id.date": { "$lte": someISODate }
})
// ...sort the results descending...
.sort({ "_id.date": -1 })
// ...and get only the first and therefore newest entry
.limit(1)
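For the listing the question asks about (one latest version per id, filtered by author, with paging), an aggregation pipeline is one possible route. The following is only a minimal sketch under assumptions: the documents keep the "id", "author" and "date" fields from the question, "date" is stored in a form that sorts chronologically (ISO strings or dates), and the helper name buildLatestVersionPipeline, its parameters and the collection name yourColl are hypothetical.

```javascript
// Hypothetical helper: builds a pipeline that returns, for each "id", the
// newest version dated on or before asOfDate, one page at a time.
function buildLatestVersionPipeline(author, asOfDate, page, pageSize) {
  return [
    // keep only the requested author's versions that exist at asOfDate
    { $match: { author: author, date: { $lte: asOfDate } } },
    // newest version first, so that $first picks it in the $group stage
    { $sort: { date: -1 } },
    { $group: { _id: "$id", latest: { $first: "$$ROOT" } } },
    // stable order across pages, then paging (page is 1-based)
    { $sort: { _id: 1 } },
    { $skip: (page - 1) * pageSize },
    { $limit: pageSize }
  ];
}

// Possible usage in the shell (assumed collection name):
// db.yourColl.aggregate(buildLatestVersionPipeline("value", "2015-11-08", 1, 10))
```

The second $sort is there because $group does not guarantee an output order; without a fixed order, $skip and $limit could return overlapping pages.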

Related

How to filter a string field containing date to get last 10 days data in mongodb?

Below is the document structure in the collection. I need to filter on createdTime, which is of type string, to get the records from the last 10 days.
{
"_id" : 1,
"data" : {
"name" : "abc",
"createdTime" : "2021-02-19T05:13:29.351+00:00",
"payloadVersion" : "1.0"
},
"detail" : {
"event" : "XYZ",
"spocId" : 1244,
"deviceId" : "12345"
}
}
The DB name is TEST, and I tried this query:
db.TEST.find({"data.createdTime":{"gte":new Date(new Date().setDate(new Date().getDate()-10))}})
but it is not giving any results. What is the best possible solution for this without changing the type of date field from string to date?
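Two things stand out in the query above: the operator is written as "gte" without the leading "$", and a Date value is compared against a string field, which does not match across types. One possible approach that keeps the field as a string is to compare it against a cutoff string of the same shape. This is only a sketch, assuming every stored createdTime uses exactly the format shown (millisecond precision, "+00:00" offset); the helper names cutoffString and lastDaysFilter are hypothetical.

```javascript
// Hypothetical helper: ISO-8601 cutoff string in the same shape as the stored
// values ("...+00:00" instead of the "Z" that toISOString() produces).
function cutoffString(days, now) {
  const cutoff = new Date(now.getTime() - days * 24 * 60 * 60 * 1000);
  return cutoff.toISOString().replace("Z", "+00:00");
}

// Filter for the last `days` days. Strings in this fixed format compare
// lexicographically in chronological order (assumption: same format, all UTC).
function lastDaysFilter(days, now) {
  return { "data.createdTime": { $gte: cutoffString(days, now) } };
}

// Possible usage in the shell:
// db.TEST.find(lastDaysFilter(10, new Date()))
```

If the stored strings mix formats or UTC offsets, string comparison is no longer reliable and this sketch does not apply.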

Mongodb aggregate lookup return only one field of array

I have some collections for our project.
The Casts collection contains movie casts.
The Contents collection contains movie contents.
I want to run an aggregate $lookup to get information about the movie casts along with their position type.
I have removed unnecessary fields from the collection details below.
Casts details:
{
"_id" : ObjectId("5a6cf47415621604942386cd"),
"fa_name" : "",
"en_name" : "Ehsan",
"fa_bio" : "",
"en_bio" : ""
}
Contents details:
{
"_id" : ObjectId("5a6b8b734f1408137f79e2cc"),
"casts" : [
{
"_id" : ObjectId("5a6cf47415621604942386cd"),
"fa_fictionName" : "",
"en_fictionName" : "Ehsan2",
"positionType" : {
"id" : 3,
"fa_name" : "",
"en_name" : "Director"
}
},
{
"_id" : ObjectId("5a6cf47415621604942386cd"),
"fa_fictionName" : "",
"en_fictionName" : "Ehsan1",
"positionType" : {
"id" : 3,
"fa_name" : "",
"en_name" : "Writers"
}
}
],
"status" : 0,
"created" : Timestamp(1516997542, 4),
"updated" : Timestamp(1516997542, 5)
}
When I run the aggregate lookup with the query below, the newly generated lookup array contains only one cast document. Given the casts array above, I expected the lookup to return two cast entries, one per position type: the casts array holds two entries (a director and a writer), but only one cast document is returned. _casts should contain two objects, not one!
aggregate lookup query:
{$lookup:{from:"casts",localField:"casts._id",foreignField:"_id",as:"_casts"}}
result:
{
"_id" : ObjectId("5a6b8b734f1408137f79e2cc"),
"casts" : [
{
"_id" : ObjectId("5a6cf47415621604942386cd"),
"fa_fictionName" : "",
"en_fictionName" : "Ehsan2",
"positionType" : {
"id" : 3,
"fa_name" : "",
"en_name" : "Director"
}
},
{
"_id" : ObjectId("5a6cf47415621604942386cd"),
"fa_fictionName" : "",
"en_fictionName" : "Ehsan1",
"positionType" : {
"id" : 3,
"fa_name" : "",
"en_name" : "Writers"
}
}
],
"_casts" : [
{
"_id" : ObjectId("5a6cf47415621604942386cd"),
"fa_name" : "",
"en_name" : "Ehsan",
"fa_bio" : "",
"en_bio" : ""
}
],
"status" : 0,
"created" : Timestamp(1516997542, 4),
"updated" : Timestamp(1516997542, 5)
}
EDIT-1
My problem is finally solved. The only issue with this query is that it doesn't show the root document fields. That is solved as well; the final query is in EDIT-2.
query:
db.contents.aggregate([
{"$unwind":"$casts"},
{"$lookup":{"from":"casts","localField":"casts._id","foreignField":"_id","as":"casts.info"}},
{"$unwind":"$casts.info"},
{"$group":{"_id":"$_id", "casts":{"$push":"$casts"}}},
])
EDIT-2
db.contents.aggregate([
{"$unwind":"$casts"},
{"$lookup":{"from":"casts","localField":"casts._id","foreignField":"_id","as":"casts.info"}},
{"$unwind":"$casts.info"},
{$group:{"_id":"$_id", "data":{"$first":"$$ROOT"}, "casts":{"$push":"$casts"}}},
// the pushed "casts" array overwrites the single unwound element kept in $data
{$replaceRoot:{"newRoot":{"$mergeObjects":["$data",{"casts":"$casts"}]}}}
]).pretty()
This is expected behavior.
From the docs,
If your localField is an array, you may want to add an $unwind stage
to your pipeline. Otherwise, the equality condition between the
localField and foreignField is foreignField: { $in: [
localField.elem1, localField.elem2, ... ] }.
So to join each local field array element with foreign field element you have to $unwind the local array.
db.contents.aggregate([
{"$unwind":"$casts"},
{"$lookup":{"from":"casts","localField":"casts._id","foreignField":"_id","as":"_casts"}}
])
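The rule quoted from the docs can be illustrated outside MongoDB. Below is a plain-JavaScript sketch (not MongoDB code) that mimics the two behaviours on simplified data; the string id "c1" and both function names are made up for the illustration.

```javascript
// Simplified stand-ins for the two collections (ids shortened to strings).
const castsCollection = [{ _id: "c1", en_name: "Ehsan" }];
const content = {
  casts: [
    { _id: "c1", en_fictionName: "Ehsan2" },
    { _id: "c1", en_fictionName: "Ehsan1" }
  ]
};

// Array localField: the match behaves like foreignField $in [local ids],
// so each foreign document appears at most once in the result.
function lookupWithoutUnwind(doc, foreign) {
  const ids = doc.casts.map(c => c._id);
  return foreign.filter(f => ids.includes(f._id));
}

// After $unwind: one lookup per array element, so the same foreign
// document is joined once for every element that references it.
function lookupAfterUnwind(doc, foreign) {
  return doc.casts.map(c => ({ ...c, info: foreign.filter(f => f._id === c._id) }));
}
```

With this data the first function yields a single cast document, while the second yields two entries, each carrying the joined cast in info.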
Vendor Collection
Items Collection
db.items.aggregate([
{ $match:
{"item_id":{$eq:"I001"}}
},
{
$lookup:{
from:"vendor",
localField:"vendor_id",
foreignField:"vendor_id",
as:"vendor_details"
}
},
{
$unwind:"$vendor_details"
},
{
$project:{
"_id":0,
"vendor_id":0,
"vendor_details.vendor_company_description":0,
"vendor_details._id":0,
"vendor_details.country":0,
"vendor_details.city":0,
"vendor_details.website":0
}
}
]);
Output
Your Casts collection shows only 1 document. Your Contents collection, likewise, shows only 1 document.
This is 1 to 1 - not 1 to 2. Aggregate is working as designed.
The Contents document has 2 "casts." These 2 casts are sub-documents. Work with those as sub-documents, or re-design your collections. I don't like using sub-documents unless I know I will not need to use them as look-ups or join on them.
I would suggest you re-design your collection.
Your Contents collection (it makes me think of "Movies") could look like this:
_id
title
releaseDate
genre
etc.
You can create a MovieCasts collection like this:
_id
movieId (this is _id from Contents collection, above)
castId (this is _id from Casts collection, below)
Casts
_id
name
age
etc.

Find oldest/youngest post in mongodb collection

I have a mongodb collection with many fields. One field is 'date_time', which is in an ISO datetime format, Ex: ISODate("2014-06-11T19:16:46Z"), and another field is 'name'.
Given a name, how do I find out the oldest/youngest post in the collection?
Ex: If there are two posts in the collection 'data' :
[{'name' : 'John', 'date_time' : ISODate("2014-06-11T19:16:46Z")},
{'name' : 'John', 'date_time' : ISODate("2015-06-11T19:16:46Z")}]
Given the name 'John' how do I find out the oldest post in the collection i.e., the one with ISODate("2014-06-11T19:16:46Z")? Similarly for the youngest post.
Oldest:
db.posts.find({ "name" : "John" }).sort({ "date_time" : 1 }).limit(1)
Newest:
db.posts.find({ "name" : "John" }).sort({ "date_time" : -1 }).limit(1)
Index on { "name" : 1, "date_time" : 1 } to make the queries efficient.
You could aggregate it as below:
Create an index on the name and date_time fields, so that the
$match and $sort stage operations may use it.
db.t.createIndex({"name":1,"date_time":1})
$match all the records for the desired name(s).
$sort by date_time in ascending order.
$group by the name field. Use the $first operator to get the first
record of the group, which will also be the oldest. Use the $last
operator to get the last record in the group, which will also be the
newest.
To get the entire record use the $$ROOT system variable.
Code:
db.t.aggregate([
{$match:{"name":"John"}},
{$sort:{"date_time":1}},
{$group:{"_id":"$name","oldest":{$first:"$$ROOT"},
"youngest":{$last:"$$ROOT"}}}
])
Output:
{
"_id" : "John",
"oldest" : {
"_id" : ObjectId("54da62dc7f9ac597d99c182d"),
"name" : "John",
"date_time" : ISODate("2014-06-11T19:16:46Z")
},
"youngest" : {
"_id" : ObjectId("54da62dc7f9ac597d99c182e"),
"name" : "John",
"date_time" : ISODate("2015-06-11T19:16:46Z")
}
}
db.t.find().sort({ "date_time" : 1 }).limit(1).pretty()

MongoDB Group querying for Embedded Document

I have a mongo document which has structure like
{
"_id" : "THIS_IS_A_DHP_USER_ID+2014-11-26",
"_class" : "weight",
"items" : [
{
"dateTime" : ISODate("2014-11-26T08:08:38.716Z"),
"value" : 98.5
},
{
"dateTime" : ISODate("2014-11-26T08:18:38.716Z"),
"value" : 95.5
},
{
"dateTime" : ISODate("2014-11-26T08:28:38.663Z"),
"value" : 90.5
}
],
"source" : "MANUAL",
"to" : ISODate("2014-11-26T08:08:38.716Z"),
"from" : ISODate("2014-11-26T08:08:38.716Z"),
"userId" : "THIS_IS_A_DHP_USER_ID",
"createdDate" : ISODate("2014-11-26T08:38:38.776Z")
}
{
"_id" : "THIS_IS_A_DHP_USER_ID+2014-11-25",
"_class" : "weight",
"items" : [
{
"dateTime" : ISODate("2014-11-25T08:08:38.716Z"),
"value" : 198.5
},
{
"dateTime" : ISODate("2014-11-25T08:18:38.716Z"),
"value" : 195.5
},
{
"dateTime" : ISODate("2014-11-25T08:28:38.716Z"),
"value" : 190.5
}
],
"source" : "MANUAL",
"to" : ISODate("2014-11-25T08:08:38.716Z"),
"from" : ISODate("2014-11-25T08:08:38.716Z"),
"userId" : "THIS_IS_A_DHP_USER_ID",
"createdDate" : ISODate("2014-11-26T08:38:38.893Z")
}
The query I want to run on this document structure should:
1. find the documents for a particular user id,
2. unwind the embedded array,
3. group the documents by _id, while
   a. summing the items.value of the embedded array, and
   b. getting the minimum of the items.dateTime of the embedded array.
Note: I want the sum and the min as an object, i.e. { value : sum, dateTime : min of the items.dateTime }, inside an array of items.
Can this be achieved in a single aggregation call using $push or some other technique?
When you group over a particular _id, and apply aggregation operators such as $min and $sum, there exists only one record per group(_id), that holds the sum and the minimum date for that group. So there is no way to obtain a different sum and a different minimum date for the same _id, which also logically makes no sense.
What you would want to do is:
db.collection.aggregate([
{$match:{"userId":"THIS_IS_A_DHP_USER_ID"}},
{$unwind:"$items"},
{$group:{"_id":"$_id",
"values":{$sum:"$items.value"},
"dateTime":{$min:"$items.dateTime"}}}
])
But if you do not query for a particular userId, you will have multiple groups, each with its own sum and min date. Then it makes sense to accumulate all these results together in an array using the $push operator.
db.collection.aggregate([
{$unwind:"$items"},
{$group:{"_id":"$_id",
"result":{$sum:"$items.value"},
"dateTime":{$min:"$items.dateTime"}}},
{$group:{"_id":null,"result":{$push:{"value":"$result",
"dateTime":"$dateTime",
"id":"$_id"}}}},
{$project:{"_id":0,"result":1}}
])
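If the goal is the shape described in the question, with each document carrying an items array that holds a single { value, dateTime } object, a second $group with $push is one way to wrap the two results. This is a minimal sketch, assuming the field names from the question; the helper name buildWeightSummaryPipeline is hypothetical.

```javascript
// Hypothetical helper: per document, sum items.value and take the minimum
// items.dateTime, then wrap both into a one-element "items" array.
function buildWeightSummaryPipeline(userId) {
  return [
    { $match: { userId: userId } },
    { $unwind: "$items" },
    // one result per original document: the sum and the earliest dateTime
    { $group: {
        _id: "$_id",
        value: { $sum: "$items.value" },
        dateTime: { $min: "$items.dateTime" }
    } },
    // regroup by the same key so that $push builds the "items" array
    { $group: {
        _id: "$_id",
        items: { $push: { value: "$value", dateTime: "$dateTime" } }
    } }
  ];
}

// Possible usage in the shell:
// db.collection.aggregate(buildWeightSummaryPipeline("THIS_IS_A_DHP_USER_ID"))
```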
You could use the following aggregation; it may work:
db.collectionName.aggregate([
{"$unwind":"$items"},
{"$match":{"userId":"THIS_IS_A_DHP_USER_ID"}},
{"$group":{"_id":"$_id","sum":{"$sum":"$items.value"},
"minDate":{"$min":"$items.dateTime"}}}
])

MongoDB Why this error : can't append to array using string field name: comments

I have a DB structure like below:
{
"_id" : 1,
"comments" : [
{
"_id" : 2,
"content" : "xxx"
}
]
}
I push a new subdocument into the comments field, and that works:
db.test.update(
{"_id" : 1, "comments._id" : 2},
{$push : {"comments.$.comments" : {_id : 3, content:"xxx"}}}
)
After that, the document looks like this:
{
"_id" : 1,
"comments" : [
{
"_id" : 2,
"comments" : [
{
"id" : 3,
"content" : "xxx"
}
],
"content" : "xxx"
}
]
}
But when I try to push a new subdocument into the comment whose id is 3, there is an error:
db.test.update(
{"_id" : 1, "comments.comments.id" : 3},
{$push : {"comments.comments.$.comments" : {id : 4, content:"xxx"}}}
)
error message:
can't append to array using string field name: comments
Well, it makes total sense if you think about it. MongoDB has the advantage, and the disadvantage, of magically solving certain things.
When you query the database for a specific regular field like this:
{ field : "value" }
The query {field:"value"} makes total sense. It would not if the value were part of an array, but Mongo solves that for you, so if the structure is:
{ field : ["value", "anothervalue"] }
Mongo iterates through all the elements and matches "value" against the field, and you don't have to think about it. It works perfectly, but only at one level, because it is impossible to guess what you want to do if you have multiple levels.
In your case the first query works because it is exactly this single-level case:
db.test.update(
{"_id" : 1, "comments._id" : 2},
{$push : {"comments.$.comments" : {_id : 3, content:"xxx"}}}
)
It matches _id at the first level and comments._id at the second level. It gets an array as a result, but Mongo is able to resolve it.
But in the second case, think about what you are asking for. Let's isolate the where clause:
{"_id" : 1, "comments.comments.id" : 3},
"Give me the records with _id:1 from the main collection" (one doc)
"And the comments whose inner comments have an id=3" (array * array)
The first level, comments.id, is resolved easily. The second is not possible, because comments already returns an array; one more level yields an array of arrays, and it is not possible to push a document into all the records of that array.
The solution is to narrow your where clause so that it matches a unique document in comments (it could be the first one). That is not a good solution, though, because you never know the position of the document you are looking for. Using the shell, I think the only way to be accurate is to do it in two steps. Check this query, which works (it is not the solution anyway) but "solves" the multiple-array part by fixing it to the first record:
db.test.update(
{"_id" : 1, "comments.0.comments._id" : 3},
{$push : {"comments.0.comments.$.comments" : {id : 4, content:"xxx"}}}
)
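The two-step idea mentioned above could look roughly like this: read the document, work out the position of the outer comment in JavaScript, then build the update with that position written into the path. This is a sketch under assumptions: the inner comments use the key "id" as in the structure shown in the question, and the helper names findOuterIndex and buildNestedPush are hypothetical.

```javascript
// Step 1 (plain JavaScript): find the position of the outer comment that
// contains the inner comment with the given id. Returns -1 if none does.
function findOuterIndex(doc, innerId) {
  return doc.comments.findIndex(
    c => Array.isArray(c.comments) && c.comments.some(inner => inner.id === innerId)
  );
}

// Step 2: build the filter and update with the outer index written out, so
// only one positional "$" is left, for the inner array.
function buildNestedPush(doc, innerId, newComment) {
  const i = findOuterIndex(doc, innerId);
  if (i === -1) return null;
  const filter = { _id: doc._id };
  filter["comments." + i + ".comments.id"] = innerId;
  const update = { $push: {} };
  update.$push["comments." + i + ".comments.$.comments"] = newComment;
  return { filter: filter, update: update };
}

// Possible usage in the shell:
// const doc = db.test.findOne({ _id: 1 });
// const op = buildNestedPush(doc, 3, { id: 4, content: "xxx" });
// if (op) db.test.update(op.filter, op.update);
```

Because the position is computed from a document read earlier, the array could change between the two steps; the filter still requires the inner id at that position, so in that case the update matches nothing instead of writing to the wrong comment.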