using an Object (subdocument) with varying fields as _id - mongodb

Our (edX) original Mongo persistence representation uses a bson dictionary (aka object or subdocument) as the _id value (see, mongo/base.py). This id is missing a field.
Can some documents' _id values have more subfields than others without totally screwing up indexing?
What's the best way to handle existing documents without the additional field? Remove and replace them? Try to query w/ new _id format and if fails fall over to query w/o the new field? Try to query with both new and old _id format in one query?
To be more specific, the current format is
{'_id': {
'tag': 'i4x', // yeah, it's always this fixed value
'org': your_school_x,
'course': a_catalog_number,
'category': the_xblock_type,
'name': uniquifier_within_course
}}
I need to add 'run': the_session_or_term_for_course_run or 'course_id': org/course/run.

Documents within a collection need not have values for _id that are of the same structure. Hence, it is perfectly acceptable to have the following documents within a collection:
> db.foo.find()
{ "_id" : { "a" : 1 } }
{ "_id" : { "a" : 1, "b" : 2 } }
{ "_id" : { "c" : 1, "b" : 2 } }
Note that because the index is on only _id, only queries that specify a value for _id will use the index:
db.foo.find({_id:1}) // will use the index on _id
db.foo.find({_id:{state:"Alaska"}) // will use the index on _id
db.foo.find({"_id.a":1}) // will NOT use the index on _id
Note also that only a complete match of the "value" of _id will return a document. So this returns no documents for the collection above:
db.foo.find({_id:{c:1}})
Hence, for your case, you are welcome to add fields to the object that is the value for the _id key. And it does not matter that all documents have a different structure. But if you are hoping to query the collection by_id and have it be efficient, you are going to need to add indexes for all relevant sub parts that might be used in isolation. That is not super efficient.
_id is no different than any other key in this regard.

Related

MongoDB match a subdocument inside array (not positional reference)

My MongoDB has a key-value pair structure, inside my document has a data field which is an array that contains many subdocuments of two fields: name and value.
How do I search for a subdocument e.g ( {"name":"position", "value":"manager"}) and also multiple (e.g. {"name":"age", "value" : {$ge: 30}})
EDIT: I am not looking for a specific subdocument as I mentioned in title (not positional reference), rather, I want to retrieve the entire document but I need it to match the two subdocuments exactly.
Here are 2 queries to find the following record:
{
"_id" : ObjectId("sometobjectID"),
"data" : [
{
"name" : "position",
"value" : "manager"
}
]
}
// Both value and name (in the same record):
db.demo.find({$elemMatch: {"value": "manager", "name":"position"}})
// Both value and name (not necessarily in the same record):
db.demo.find({"data.value": "manager", "data.name":"position"})
// Just value:
db.demo.find({"data.value": "manager"})
Note how the . is used, this works for all subdocuments, even if they are in an array.
You can use any operator you like here, including $gte
edit
$elemMatch added to answer because of #Veeram's response
This answer explains the difference between $elemMatch and .

Return MongoDB documents that don't contain specific inner array items

How can I return a set of documents, each not containing a specific item in an inner array?
My data scheme is:
Posts:
{
"_id" : ObjectId("57f91ec96241783dac1e16fe"),
"votedBy" : [
{
"userId" : "101",
"vote": 1
},
{
"userId" : "202",
"vote": 2
}
],
"__v" : NumberInt(0)
}
I want to return a set of posts, non of which contain a given userId in any of the votedBy array items.
The official documentation implies that this is possible:
MongoDB documentation: Field with no specific array index
Though it returns an empty set (for the more simple case of finding a document with a specific array item).
It seems like I have to know the index for a correct set of results, like:
votedBy.0.userId.
This Question is the closest I found, with this solution (Applied on my scheme):
db.collection.find({"votedBy": { $not: {$elemMatch: {userId: 101 } } } })
It works fine if the only inner document in the array matches the one I wish not to return, but in the example case I specified above, the document returns, because it finds the userId=202 inner document.
Just to clarify: I want to return all the documents, that NONE of their votedBy array items have the given userId.
I also tried a simpler array, containing only the userId's as an array of Strings, but still, each of them receives an Id and the search process is just the same.
Another solution I tried is using a different collection for uservotes, and applying a lookup to perform a SQL-similar join, but it seems like there is an easier way.
I am using mongoose (node.js).
User $ne on the embedded userId:
db.collection.find({'votedBy.userId': {$ne: '101'}})
It will filter all the documents with at least one element of userId = "101"

Override existing Docs in production MongoDB

I have recently changed one of my fields from object to array of objects.
In my production I have only 14 documents with this field, so I decided to change those fields.
Is there any best practices to do that?
As it is in my production I need to do it in a best way possible?
I got the document Id's of those collections.like ['xxx','yyy','zzz',...........]
my doc structure is like
_id:"xxx",option1:{"op1":"value1","op2":"value2"},option2:"some value"
and I want to change it like(converting object to array of objects)
_id:"xxx",option1:[{"op1":"value1","op2":"value2"},
{"op1":"value1","op2":"value2"}
],option2:"some value"
Can I use upsert? If so How to do it?
Since you need to create the new value of the field based on the old value, you should retrieve each document with a query like
db.collection.find({ "_id" : { "in" : [<array of _id's>] } })
then iterate over the results and $set the value of the field to its new value:
db.collection.find({ "_id" : { "in" : [<array of _id's>] } }).forEach(function(doc) {
oldVal = doc.option1
newVal = compute_newVal_from_oldVal(oldVal)
db.collection.update({ "_id" : doc._id }, { "$set" : { "option" : newVal } })
})
The document structure is rather schematic, so I omitted putting in actual code to create newVal from oldVal.
Since it is an embedded document type you could use push query
db.collectionname.update({_id:"xxx"},{$push:{option1:{"op1":"value1","op2":"value2"}}})
This will create document inside embedded document.Hope it helps

Updating multiple MongoDB records in Sails.js

I need to update multiple records in mongodb.
From frontend logic , i got the array of id's as below.
ids: [ [ '530ac94c9ff87b5215a0d6e6', '530ac89a7345edc214618b25' ] ]
I have an array of ids as above , i need to update the folder field for all the records in that array.
I tried passing the id's to mongodb query as below , but still that doesn't work.
Post.native(function(err, collection) {
collection.update({
_id : {
"$in" : ids
}
}, { folder : 'X'}, {
multi : true
}, function(err, result) {
console.log(result);
});
});
Please help.
There seem to be two possible problems.
1) your ids array is not an array of ids, it's an array which has a single element which is itself an array, which has two elements. An array of ids would be `[ 'idvalue1', 'idvalue2']
2) your id values inside of arrays are strings - is that how you are storing your "_id" values? If they are ObjectId() type then they are not a string but a type ObjectId("stringhere") which is not the same type and won't be equal to "stringhere".
There is no reason to use the native method in this case. Just do:
Post.update({id : ids}, {folder : 'X'}).exec(console.log);
Waterline automatically does an "in" query when you set a criteria property to an array, and Sails-Mongo automatically translates "id" to "_id" and handles ObjectId translation for you.
Those strings look like the string representation of mongod ObjectIds, so probably what you want to do is turn them into ObjectIds before querying. Assuming you've corrected your problem with the extra level of nesting in the array, that is:
ids = ['530ac94c9ff87b5215a0d6e6', '530ac89a7345edc214618b25']
Then you want to do something like this:
oids = []
for (var i in ids)
oids.push(ObjectId(ids[i]))
db.c.find({_id: {$in: oids}})
Does that fix your problem?

Get position of selected document in collection [mongoDB]

How to get position (index) of selected document in mongo collection?
E.g.
this document: db.myCollection.find({"id":12345})
has index 3 in myCollection
myCollection:
id: 12340, name: 'G'
id: 12343, name: 'V'
id: 12345, name: 'A'
id: 12348, name: 'N'
If your requirement is to find the position of the document irrespective of any order, that is not
possible as MongoDb does not store the documents in specific order.
However,if you want to know the index based on some field, say _id , you can use this method.
If you are strictly following auto increments in your _id field. You can count all the documents
that have value less than that _id, say n , then n + 1 would be index of the document based on _id.
n = db.myCollection.find({"id": { "$lt" : 12345}}).count() ;
This would also be valid if documents are deleted from the collection.
As far as I know, there is no single command to do this, and this is impossible in general case (see Derick's answer). However, using count() for a query done on an ordered id value field seems to work. Warning: this assumes that there is a reliably ordered field, which is difficult to achieve in a concurrent writer case. In this example _id is used, however this will only work with a single writer case.:
MongoDB shell version: 2.0.1
connecting to: test
> use so_test
switched to db so_test
> db.example.insert({name: 'A'})
> db.example.insert({name: 'B'})
> db.example.insert({name: 'C'})
> db.example.insert({name: 'D'})
> db.example.insert({name: 'E'})
> db.example.insert({name: 'F'})
> db.example.find()
{ "_id" : ObjectId("4fc5f040fb359c680edf1a7b"), "name" : "A" }
{ "_id" : ObjectId("4fc5f046fb359c680edf1a7c"), "name" : "B" }
{ "_id" : ObjectId("4fc5f04afb359c680edf1a7d"), "name" : "C" }
{ "_id" : ObjectId("4fc5f04dfb359c680edf1a7e"), "name" : "D" }
{ "_id" : ObjectId("4fc5f050fb359c680edf1a7f"), "name" : "E" }
{ "_id" : ObjectId("4fc5f053fb359c680edf1a80"), "name" : "F" }
> db.example.find({_id: ObjectId("4fc5f050fb359c680edf1a7f")})
{ "_id" : ObjectId("4fc5f050fb359c680edf1a7f"), "name" : "E" }
> db.example.find({_id: {$lte: ObjectId("4fc5f050fb359c680edf1a7f")}}).count()
5
>
This should also be fairly fast if the queried field is indexed. The example is in mongo shell, but count() should be available in all driver libs as well.
This might be very slow but straightforward method. Here you can pass as usual query. Just I am looping all the documents and checking if condition to match the record. Here I am checking with _id field. You can use any other single field or multiple fields to check it.
var docIndex = 0;
db.url_list.find({},{"_id":1}).forEach(function(doc){
docIndex++;
if("5801ed58a8242ba30e8b46fa"==doc["_id"]){
print('document position is...' + docIndex);
return false;
}
});
There is no way that MongoDB can return this as it does not keep documents in order in the database, just like MySQL f.e. doesn't name row numbers.
The ObjectID trick from jhonkola will only work if only one client creates new elements, as the ObjectIDs are generated on the client side, with the first part being a timestamp. There is no guaranteed order if different clients talk to the same server. Still, I would not rely on this.
I also don't quite understand what you are trying to do though, so perhaps mention that in your question? I can then update the answer.
Restructure your collection to include the position of any entry i.e {'id': 12340, 'name': 'G', 'position': 1} then when searching the database collection(myCollection) using the desired position as a query
The queries I use that return the entire collection all use sort to get a reproducible order, find.sort.forEach works with the script above to get the correct index.