Fetch all the documents has reference to a certain document in MongoDB - mongodb

I need all the documents from different collections that have reference to a certain "parent" document. Any easy workaround?
My naive approach would be to loop all the collections and loop all the documents and check the condition. Any easier / more efficient way?

My naive approach would be to loop all the collections and loop all the documents and check the condition. Any easier / more efficient way?
Well, without the data model that is hard to answer, but the reference could have an index, right? I mean you know which field is the reference? For example,
Users (parent) {
Id : ObjectId("123"),
...
}
Comments {
Id : ObjectId("abc"),
UserId : ObjectId("123"),
...
}
Questions {
Id : ObjectId("efc"),
UserId : ObjectId("123"),
...
}
Then, knowing that UserId is the reference, you can call db.Comments.find({ UserId : ObjectId("123") }) and find all references. That way, you don't have to loop through all documents. The only thing you need is a lookup table collection name -> name of the reference field

Related

MongoDB - how do I update a value in nested array/object?

I have a document in my Mongo collection which has a field with the following structure:
"_id" : "F7WNvjwnFZZ7HoKSF",
"process" : [
{
"process_id" : "wTGqVk5By32mpXadZ",
"stages" : [
{
"stage_id" : "D6Huk89DGFsd29ds7",
"completed" : "N"
},
{
"stage_id" : "Msd390vekn09nvL23",
"completed" : "N"
}
]
}
]
I need to update the value of completed where the stage_id is equal to 'D6Huk89DGFsd29ds7' - the update query will not know which object in the stages array this value of stage_id will be in.
How do I do this?
Since you have nested arrays in your object, this is bit tricky and I'm not sure if this problem can be solved with help of just one update query.
However, if you happen to know index of your matching object in first array, in your case process[0] you can write your update query like.
db.collection.update(
{"process.stages.stage_id":"D6Huk89DGFsd29ds7"},
{$set:{"process.0.stages.$.completed":"Y"}}
);
Query above will work perfect with your test case. Again, there is still possibility of having multiple objects at root level and there is no guarantee that matching object will always be at 0 index.
Solution I proposed above will fail if you have multiple children of process and if matching index of object is not zero.
However, you can achieve your goal with help of client side programming. That is find matching document, modify on client side and replace whole document with new content.
Since this approach is very in efficient, I'll suggest that you should consider altering your document structure to avoid nesting. Create another collection and move content of process array there.
In the end, I removed the outer process block, so that the process_id and stages were in the root of the document - made the process of updating easier using:
MyColl.update(
{
_id: 'F7WNvjwnFZZ7HoKSF',
"stages.stage_id": 'D6Huk89DGFsd29ds7'
},
{
$set: {"stages.$.completed": 'Y'}
}
);

Need Help on Mongo DB query

There is an existing person collection in the system which is like:
{
"_id" : ObjectId("536378bcc9ecd7046700001f"),
"engagements":{
"5407357013875b9727000111" : {
"role" : "ADMINISTRATOR",
},
"5407357013875b9727000222" : {
"role" : "DEVELOPER",
}
}
}
So that multiple user objects can have the same engagement with a specific role, I need to fire a query in this hierarchy where I can get all the persons which have a specific engagement in the engagements property of person collection.
I want to get all the persons which have
5407357013875b9727000222 in the engagements.
I know $in operator could be used but the problem is that I need to compare the keys of the sub Json engagements.
I think it's as simple as this:
db.users.find({'engagements.5407357013875b9727000222': {$exists: true}})
If you want to match against multiple engagement ids, then you'll have to use $or. Sorry, no $in for you here.
Note, however, that you need to restructure your data, as this one can't be indexed to help this concrete query. Here I assume you care about performance and this query is used often enough to have impact on the database.

How to store an ordered set of documents in MongoDB without using a capped collection

What's a good way to store a set of documents in MongoDB where order is important? I need to easily insert documents at an arbitrary position and possibly reorder them later.
I could assign each item an increasing number and sort by that, or I could sort by _id, but I don't know how I could then insert another document in between other documents. Say I want to insert something between an element with a sequence of 5 and an element with a sequence of 6?
My first guess would be to increment the sequence of all of the following elements so that there would be space for the new element using a query something like db.items.update({"sequence":{$gte:6}}, {$inc:{"sequence":1}}). My limited understanding of Database Administration tells me that a query like that would be slow and generally a bad idea, but I'm happy to be corrected.
I guess I could set the new element's sequence to 5.5, but I think that would get messy rather quickly. (Again, correct me if I'm wrong.)
I could use a capped collection, which has a guaranteed order, but then I'd run into issues if I needed to grow the collection. (Yet again, I might be wrong about that one too.)
I could have each document contain a reference to the next document, but that would require a query for each item in the list. (You'd get an item, push it onto the results array, and get another item based on the next field of the current item.) Aside from the obvious performance issues, I would also not be able to pass a sorted mongo cursor to my {#each} spacebars block expression and let it live update as the database changed. (I'm using the Meteor full-stack javascript framework.)
I know that everything has it's advantages and disadvantages, and I might just have to use one of the options listed above, but I'd like to know if there is a better way to do things.
Based on your requirement, one of the approaches could be to design your schema, in such a way that each document has the capability to hold more than one document and in itself act as a capped container.
{
"_id":Number,
"doc":Array
}
Each document in the collection will act as a capped container, and the documents will be stored as array in the doc field. The doc field being an array, will maintain the order of insertion.
You can limit the number of documents to n. So the _id field of each container document will be incremental by n, indicating the number of documents a container document can hold.
By doing these you avoid adding extra fields to the document, extra indices, unnecessary sorts.
Inserting the very first record
i.e when the collection is empty.
var record = {"name" : "first"};
db.col.insert({"_id":0,"doc":[record]});
Inserting subsequent records
Identify the last container document's _id, and the number of
documents it holds.
If the number of documents it holds is less than n, then update the
container document with the new document, else create a new container
document.
Say, that each container document can hold 5 documents at most,and we want to insert a new document.
var record = {"name" : "newlyAdded"};
// using aggregation, get the _id of the last inserted container, and the
// number of record it currently holds.
db.col.aggregate( [ {
$group : {
"_id" : null,
"max" : {
$max : "$_id"
},
"lastDocSize" : {
$last : "$doc"
}
}
}, {
$project : {
"currentMaxId" : "$max",
"capSize" : {
$size : "$lastDocSize"
},
"_id" : 0
}
// once obtained, check if you need to update the last container or
// create a new container and insert the document in it.
} ]).forEach( function(check) {
if (check.capSize < 5) {
print("updating");
// UPDATE
db.col.update( {
"_id" : check.currentMaxId
}, {
$push : {
"doc" : record
}
});
} else {
print("inserting");
//insert
db.col.insert( {
"_id" : check.currentMaxId + 5,
"doc" : [ record ]
});
}
})
Note that the aggregation, runs on the server side and is very efficient, also note that the aggregation would return you a document rather than a cursor in versions previous to 2.6. So you would need to modify the above code to just select from a single document rather than iterating a cursor.
Inserting a new document in between documents
Now, if you would like to insert a new document between documents 1 and 2, we know that the document should fall inside the container with _id=0 and should be placed in the second position in the doc array of that container.
so, we make use of the $each and $position operators for inserting into specific positions.
var record = {"name" : "insertInMiddle"};
db.col.update(
{
"_id" : 0
}, {
$push : {
"doc" : {
$each : [record],
$position : 1
}
}
}
);
Handling Over Flow
Now, we need to take care of documents overflowing in each container, say we insert a new document in between, in container with _id=0. If the container already has 5 documents, we need to move the last document to the next container and do so till all the containers hold documents within their capacity, if required at last we need to create a container to hold the overflowing documents.
This complex operation should be done on the server side. To handle this, we can create a script such as the one below and register it with mongodb.
db.system.js.save( {
"_id" : "handleOverFlow",
"value" : function handleOverFlow(id) {
var currDocArr = db.col.find( {
"_id" : id
})[0].doc;
print(currDocArr);
var count = currDocArr.length;
var nextColId = id + 5;
// check if the collection size has exceeded
if (count <= 5)
return;
else {
// need to take the last doc and push it to the next capped
// container's array
print("updating collection: " + id);
var record = currDocArr.splice(currDocArr.length - 1, 1);
// update the next collection
db.col.update( {
"_id" : nextColId
}, {
$push : {
"doc" : {
$each : record,
$position : 0
}
}
});
// remove from original collection
db.col.update( {
"_id" : id
}, {
"doc" : currDocArr
});
// check overflow for the subsequent containers, recursively.
handleOverFlow(nextColId);
}
}
So that after every insertion in between , we can invoke this function by passing the container id, handleOverFlow(containerId).
Fetching all the records in order
Just use the $unwind operator in the aggregate pipeline.
db.col.aggregate([{$unwind:"$doc"},{$project:{"_id":0,"doc":1}}]);
Re-Ordering Documents
You can store each document in a capped container with an "_id" field:
.."doc":[{"_id":0,","name":"xyz",...}..]..
Get hold of the "doc" array of the capped container of which you want
to reorder items.
var docArray = db.col.find({"_id":0})[0];
Update their ids so that after sorting the order of the item will change.
Sort the array based on their _ids.
docArray.sort( function(a, b) {
return a._id - b._id;
});
update the capped container back, with the new doc array.
But then again, everything boils down to which approach is feasible and suits your requirement best.
Coming to your questions:
What's a good way to store a set of documents in MongoDB where order is important?I need to easily insert documents at an arbitrary
position and possibly reorder them later.
Documents as Arrays.
Say I want to insert something between an element with a sequence of 5 and an element with a sequence of 6?
use the $each and $position operators in the db.collection.update() function as depicted in my answer.
My limited understanding of Database Administration tells me that a
query like that would be slow and generally a bad idea, but I'm happy
to be corrected.
Yes. It would impact the performance, unless the collection has very less data.
I could use a capped collection, which has a guaranteed order, but then I'd run into issues if I needed to grow the collection. (Yet
again, I might be wrong about that one too.)
Yes. With Capped Collections, you may lose data.
An _id field in MongoDB is a unique, indexed key similar to a primary key in relational databases. If there is an inherent order in your documents, ideally you should be able to associate a unique key to each document, with the key value reflecting the order. So while preparing your document for insertion, explicitly add an _id field as this key (if you do not, mongo creates it automatically with a BSON objectid).
As far as retrieving the results are concerned, MongoDB does not guarantee the order of return documents unless you explicitly use .sort() . If you do not use .sort(), the results are usually returned in natural order (order of insertion).Again, there is no guarantee on this behavior.
I'd advise you to override _id with your order while inserting, and use a sort while retrieving. Since _id is a necessary and auto-indexed entity, you will not be wasting any space defining a sort key, and storing the index for it.
For abitrary sorting of any collection, you'll need a field to sort it on. I call mine "sequence".
schema:
{
_id: ObjectID,
sequence: Number,
...
}
db.items.ensureIndex({sequence:1});
db.items.find().sort({sequence:1})
Here is a link to some general sorting database answers that may be relevant:
https://softwareengineering.stackexchange.com/questions/195308/storing-a-re-orderable-list-in-a-database/369754
I suggest going with Floating point solution - adding a position column:
Use a floating-point number for the position column.
You can then reorder the list changing only the position column in the "moved" row.
If your user wants to position "red" after "blue" but before "yellow" Then you just need to calculate
red.position = ((yellow.position - blue.position) / 2) + blue.position
After a few re-positions in the same place (Cuttin in half every time) - you might reach a wall - it's better that if you reach a certain threshold - to resort the list.
When retrieving it you can simply say col.sort() to get it sorted and no need for any client-side code (Like in the case of a Linked list solution)

Composition in mongo query (SQL sub query equivalent)

I have a couple of collections, for example;
members
id
name
//other fields we don't care about
emails
memberid
//other fields we don't care about
I want to delete the email for a given member. In SQL I could use a nested query, something like;
delete emails
where memberid in (select id from members where name = "evanmcdonnal")
In mongo I'm trying something like this;
db.emails.remove( {"memberid":db.members.find( {"name":"evanmcdonnal"}, {id:1, _id:0} ) )
But it returns no results. So I took the nested query and ran it on it's own. The issue I believe is that it returns;
{
"id":"myMadeUpId"
}
Which - assuming inner queries execute first - gives me a query of;
db.emails.remove( {"memberid":{ "id""myMadeUpId"} )
When really I just want the value of id. I've tried using dictionary and dot notation to access the value of id with no luck. Is there a way to do this that is similar to my attempted query above?
Let's see how you'd roughly translate
delete emails where memberid in (select id from members where name = "evanmcdonnal")
into a set of mongo shell operations. You can use:
db.members.find({ "name" : "evanmcdonnal" }, { "id" : 1 }).forEach(function(doc) {
db.emails.remove({ "memberid" : doc.id });
});
However, this does one remove query for each result document from members. You could push the members result ids into an array and use $in:
var plzDeleteIds = db.members.find({ "name" : "evanmcdonnal" }, { "id" : 1 }).toArray();
db.emails.remove({ "memberid" : { "$in" : plzDeleteIds } });
but that could be a problem if plzDeleteIds gets very, very large. You could batch. In all cases we need to do multiple requests to the database because we are querying multiple collections, which always requires multiple operations in MongoDB (as of 2.6, anyway).
The more idiomatic way to do this type of thing in MongoDB is to store the member information you need in the email collection on the email documents, possibly as a subdocument. I couldn't say exactly if and how you should do this since you've given only a bit of your data model that has, apparently, been idealized.
As forEach() way didn't work for me i solved this using:
var plzDeleteIds = db.members.find({ "name" : "evanmcdonnal" }, { "id" : 1 }).toArray();
var aux = plzDeleteIds["0"];
var aux2 = aux.map(function(u) { return u.name; } );
db.emails.remove({ "memberid" : { "$in" : aux2 } });
i hope it help!
I do not believe what you are asking for is possible. MongoDB queries talk to just one collection -- there is no syntax to go cross-collection.
However, what about the following:
The name in members does not seem to be unique. If you were to delete emails from the "emails' collection using name as the search attribute, you might have a problem. Why not store the actual email address in the email collection? And store email address again in the members collection. When your user logs in, you will have retrieved his member record -- including the email address. When you want to delete his emails, you already have his email and you can do:
db.emails.remove({emailAddress: theActualAddress))
Does that work?

Reference an _id in a Subdocument in another Collection Mongodb

I am developing an application with mongodb and nodejs
I should also mention that I am new to both so please help me through this question
my database has a collection categories and then in each category I am storing products in subdocument
just like below :
{
_id : ObjectId(),
name: String,
type: String,
products : [{
_id : ObjectId(),
name : String,
description : String,
price : String
}]
});
When it comes to store the orders in database the orders collection will be like this:
{
receiver : String,
status : String,
subOrders : [
{
products :[{
productId : String,
name : String,
price : String,
status : String
}],
tax : String,
total : String,
status : String,
orderNote : String
}
]
}
As you can see we are storing _id of products which is a subdocument of categories in orders
when storing there is no issue obviously, when it comes to fetch these data if we just need the limited field like name or price there will be no issue as well, but if later on we need some extra fields from products like description,... they are not stored in orders.
My question is this:
Is there any easy way to access other fields of products apart from loop through the whole categories in mongodb, namely I need a sample code for querying the description of a product by only having its _id in mongodb?
or our design and implementation was wrong and I have to re-design it from scratch and separate the products from categories into another collection?
please don't put links to websites or weblogs that generally talks about mongodb and its collections implementations unless they focus on a very similar issue to mine
thanks in advance
I'd assume that you'd want to return as many product descriptions as matched the current list of products, so first, there isn't a query to return only matching array elements. Using $elemMatch you can return a specific element or the first match, but not only matching array elements. However, $elemMatch can also be used as a projection operator.
db.categories({ "products._id" : "PID1" },
{ $elemMatch : { "products._id" : "PID1" },
"products._id" : 1,
"products.description" : 1})
You'd definitely want to index the "products._id" field to achieve reasonable performance.
You might consider instead creating a products collection where each document contains a category identifier, much like you would in a relational database. This is a common pattern in MongoDb when embedding doesn't make sense, or complicates queries and aggregations.
Assuming that is true:
You'll need to load the data from the second collection manually. There are no joins in MognoDb. You might consider using $in which takes a list of values for a field and loads all matching documents.
Depending on the driver you're using to access MongoDb, you should be able to use the projection feature of find, which can limit the fields returned for a document to just those you've specified.
As product descriptions ardently likely to change frequently, you might also consider caching the values for a period on the client (like a web server for example).
db.products.find({ _id: { $in : [ 'PID1', 'PID2'] } }, { description : 1 })