Adding new property to each document in a large collection - mongodb

Using the mongodb shell, I'm trying to add a new property to each document in a large collection. The collection (Listing) has an existing property called Address. I'm simply trying to add a new property called LowerCaseAddress which can be used for searching so that I don't need to use a case-insensitive regex for address matching, which is slow.
Here is the script I tried to use in the shell:
for( var c = db.Listing.find(); c.hasNext(); ) {
var listing = c.next();
db.Listing.update( { LowerCaseAddress: listing.Address.toLowerCase() });
}
It ran for ~6 hours and then my PC crashed. Is there a better way to add a new property to each document in a large collection (~4 million records)?

Your JavaScript didn't work, but the code below does. I don't know how long it takes for 4 million records, though.
db.Listing.find().forEach(function(item){
db.Listing.update({_id: item._id}, {$set: { LowerCaseAddress: item.Address.toLowerCase() }})
})
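With around 4 million records, one update call per document also means one round trip per document. A possible refinement, sketched here under assumptions, is to group the updates and send them with bulkWrite in batches. The helper name buildLowerCaseOps, the flush function, and the batch size of 1000 are illustrative and not from the answer above. Documents without a string Address are skipped, on the assumption that some may lack the field. The database part is guarded so the helper can also be tried outside the shell.

```javascript
// Hypothetical helper: turn documents into updateOne operations for bulkWrite.
function buildLowerCaseOps(docs) {
  return docs
    .filter(function (doc) { return typeof doc.Address === "string"; })
    .map(function (doc) {
      return {
        updateOne: {
          filter: { _id: doc._id },
          update: { $set: { LowerCaseAddress: doc.Address.toLowerCase() } }
        }
      };
    });
}

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  var pending = [];
  // Send the buffered documents as one unordered bulk write, if any qualify.
  var flush = function () {
    var ops = buildLowerCaseOps(pending);
    if (ops.length > 0) {
      db.Listing.bulkWrite(ops, { ordered: false });
    }
    pending = [];
  };
  // Fetch only the field that is needed (plus _id, returned by default).
  db.Listing.find({}, { Address: 1 }).forEach(function (doc) {
    pending.push(doc);
    if (pending.length === 1000) {
      flush();
    }
  });
  flush(); // send the final, partially filled batch
}
```

Projecting only Address keeps each fetched document small, and the final flush covers the case where the collection size is not a multiple of the batch size.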

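You can use updateMany for this too. A plain update document cannot read item.Address, because no item variable exists there; on MongoDB 4.2 or later, an aggregation-pipeline update with $toLower can reference the field directly:
try {
db.<collection>.updateMany( {}, [ { $set: { LowerCaseAddress: { $toLower: "$Address" } } } ] );
} catch(e) {
print(e);
}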

Related

MongoDB find() doesn't work

Why doesn't db.find work? The console.log gets undefined...
var course = (db.courses.find({ _id: mongo.helper.toObjectID(param.course)}));
console.log(course.body)
find() selects documents in a collection and returns a cursor to the selected documents, so you can't read a field such as body directly from its return value.
You need to use a callback to get the records matching the query.
The code below gives the result as an array:
db.courses.find({ _id: mongo.helper.toObjectID(param.course)}).toArray(function(err, result)
{
console.log(result[0]); // will give you the matched record.
})

Inserting multiple documents into mongodb using one call in Meteor

In the mongo shell, it is possible to insert an array of documents with one call. In a Meteor project, I have tried using
MyCollection = new Mongo.Collection("my_collection")
documentArray = [{"one": 1}, {"two": 2}]
MyCollection.insert(documentArray)
However, when I check my_collection from the mongo shell, it shows that only one document has been inserted, and that document contains the entire array as if it had been a map:
db.my_collection.find({})
{ "_id" : "KPsbjZt5ALZam4MTd", "0" : { "one" : 1 }, "1" : { "two" : 2} }
Is there a Meteor call that I can use to add a series of documents all at once, or must I use a technique such as the one described here?
I imagine that inserting multiple documents in a single call would optimize performance on the client side, where the new documents would become available all at once.
You could use the bulk API to do the bulk insert on the server side. Manipulate the array using the forEach() method and within the loop insert the document using bulk insert operations which are simply abstractions on top of the server to make it easy to build bulk operations.
Note that for MongoDB servers older than 2.6 the API will downconvert the operations. However, a 100% downconversion is not possible, so there might be some edge cases where it cannot correctly report the right numbers.
You can get raw access to the collection and database objects in the npm MongoDB driver through rawCollection and rawDatabase methods on Mongo.Collection
MyCollection = new Mongo.Collection("my_collection");
if (Meteor.isServer) {
Meteor.startup(function () {
Meteor.methods({
insertData: function() {
var bulkOp = MyCollection.rawCollection().initializeUnorderedBulkOp(),
counter = 0,
documentArray = [{"one": 1}, {"two": 2}];
documentArray.forEach(function(data) {
bulkOp.insert(data);
counter++;
// Send to server in batches of 1000 insert operations
if (counter % 1000 == 0) {
// Execute every 1000 operations, then re-initialize the bulk operation
bulkOp.execute(function(e, result) {
// do something with result
});
bulkOp = MyCollection.rawCollection().initializeUnorderedBulkOp();
}
});
// Clean up queues
if (counter % 1000 != 0){
bulkOp.execute(function(e, result) {
// do something with result
});
}
}
});
});
}
I'm currently using the mikowals:batch-insert package.
Your code would work then with one small change:
MyCollection = new Mongo.Collection("my_collection");
documentArray = [{"one": 1}, {"two": 2}];
MyCollection.batchInsert(documentArray);
The one drawback of this I've noticed is that it doesn't honor simple-schema.
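For reference, the single document seen in the question is consistent with the array being handled as one plain object whose keys are the array indexes. The sketch below only illustrates that effect in plain JavaScript. It does not call Meteor; Object.assign stands in for whatever copying happens internally (an assumption), and insertOne is a hypothetical stand-in for a real insert call.

```javascript
var documentArray = [{ "one": 1 }, { "two": 2 }];

// Copying an array into a plain object keeps its indexes as string keys.
var asSingleDocument = Object.assign({}, documentArray);
console.log(JSON.stringify(asSingleDocument));

// Inserting element by element yields one document per array entry.
// `insertOne` is a hypothetical stand-in that just records what it receives.
var inserted = [];
function insertOne(doc) {
  inserted.push(doc);
}
documentArray.forEach(insertOne);
console.log(inserted.length);
```

In a Meteor method, the same loop could call MyCollection.insert(doc) for each element, at the cost of one insert per document.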

duplicate mongo record in same collection

In mongo I have a collection with records. These records are very complex. Now I would like to duplicate one of them.
I can easily select the one
mongo> var row = db.barfoo.find({"name":"bar"});
Now I actually don't know what to do. I don't know what is in row because I cannot find a way to print its content. How can I change specific properties and finally insert this modified row again?
mongo> db.barfoo.insert(row);
thnx
You must change the _id value by generating a new one:
var row = db.barfoo.findOne({"name":"bar"});
row._id = ObjectId();
db.barfoo.insert(row);
Good Luck!
I am going to assume that you're working directly inside the mongo shell.
Once you have your document (not a row :P ), you'd modify the properties in the same way you would a normal JavaScript object:
var doc = db.barfoo.findOne( { "name": "bar" } );
doc.name = "Mr Bar";
Note that the find() command returns a cursor, so if you're looking to extract a single document, you should use the findOne() function. This function returns a single document.
If you are interested in duplicating numerous documents, you can use the find() function and iterate over the cursor to retrieve each document:
db.barfoo.find( { "name": "bar" } ).forEach( function( doc ){
doc.name = "Mr Bar";
} );
After you change the relevant properties, you can use the insert/save methods to persist the data back to mongo. Don't forget to change/delete the _id attribute so that you'll actually create a new document.
As a side note, in order to view the contents of an object in the mongo shell, you can use the print() function. If you want a more visually appealing output, you could use printjson().
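The duplicate-and-modify steps described above can be sketched as a small helper. The name cloneForInsert and the sample document are hypothetical; the copy is shallow, so nested arrays and objects are shared with the original until they are replaced.

```javascript
// Hypothetical helper: copy a document, apply changes, and drop its _id.
function cloneForInsert(doc, changes) {
  var copy = Object.assign({}, doc, changes); // shallow copy, then overrides
  delete copy._id; // without an _id, MongoDB assigns a fresh one on insert
  return copy;
}

var original = { _id: "abc123", name: "bar", tags: ["x", "y"] };
var duplicate = cloneForInsert(original, { name: "Mr Bar" });
console.log(JSON.stringify(duplicate));

// In the mongo shell the copy could then be stored with:
// db.barfoo.insert(duplicate);
```

The original object is left untouched, so it can be compared with the copy before anything is written back.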

Update with expression instead of value

I am totally new to MongoDB... I am missing a "newbie" tag, so the experts would not have to see this question.
I am trying to update all documents in a collection using an expression. The query I was expecting to solve this was:
db.QUESTIONS.update({}, { $set: { i_pp : i_up * 100 - i_down * 20 } }, false, true);
That, however, results in the following error message:
ReferenceError: i_up is not defined (shell):1
At the same time, the database did not have any problem with eating this one:
db.QUESTIONS.update({}, { $set: { i_pp : 0 } }, false, true);
Do I have to do this one document at a time or something? That just seems excessively complicated.
Update
Thank you Sergio Tulentsev for telling me that it does not work. Now, I am really struggling with how to do this. I offer 500 Profit Points to the helpful soul, who can write this in a way that MongoDB understands. If you register on our forum I can add the Profit Points to your account there.
I just came across this while searching for the MongoDB equivalent of SQL like this:
update t
set c1 = c2
where ...
Sergio is correct that you can't reference another property as a value in a straight update. However, db.c.find(...) returns a cursor and that cursor has a forEach method:
Queries to MongoDB return a cursor, which can be iterated to retrieve
results. The exact way to query will vary with language driver.
Details below focus on queries from the MongoDB shell (i.e. the
mongo process).
The shell find() method returns a cursor object which we can then iterate to retrieve specific documents from the result. We use
hasNext() and next() methods for this purpose.
for( var c = db.parts.find(); c.hasNext(); ) {
print( c.next());
}
Additionally in the shell, forEach() may be used with a cursor:
db.users.find().forEach( function(u) { print("user: " + u.name); } );
So you can say things like this:
db.QUESTIONS.find({}, {_id: true, i_up: true, i_down: true}).forEach(function(q) {
db.QUESTIONS.update(
{ _id: q._id },
{ $set: { i_pp: q.i_up * 100 - q.i_down * 20 } }
);
});
to update them one at a time without leaving MongoDB.
If you're using a driver to connect to MongoDB then there should be some way to send a string of JavaScript into MongoDB; for example, with the Ruby driver you'd use eval:
connection.eval(%q{
db.QUESTIONS.find({}, {_id: true, i_up: true, i_down: true}).forEach(function(q) {
db.QUESTIONS.update(
{ _id: q._id },
{ $set: { i_pp: q.i_up * 100 - q.i_down * 20 } }
);
});
})
Other languages should be similar.
// the only difference is to make the update look like an aggregation pipeline
db.table.updateMany({}, [{
$set: {
col3:{"$sum":["$col1","$col2"]}
},
}]
)
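Applied to the formula from the question, the pipeline form of updateMany might look like the sketch below. It assumes MongoDB 4.2 or later (where an update can take an aggregation pipeline) and that i_up and i_down are numeric. The plainJs function is a hypothetical cross-check of the same arithmetic, not part of any API.

```javascript
// Pipeline-style update expressing: i_pp = i_up * 100 - i_down * 20
var pipeline = [
  {
    $set: {
      i_pp: {
        $subtract: [
          { $multiply: ["$i_up", 100] },
          { $multiply: ["$i_down", 20] }
        ]
      }
    }
  }
];

// Hypothetical cross-check of the same formula in plain JavaScript.
function plainJs(q) {
  return q.i_up * 100 - q.i_down * 20;
}

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.QUESTIONS.updateMany({}, pipeline);
}
```

Because the expression is evaluated on the server for each document, no per-document loop is needed in the shell.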
You can't use expressions in updates. Or, rather, you can't use expressions that depend on fields of the document. Simple self-contained math expressions are fine (e.g. 2 * 2).
If you want to set a new field for all documents that is a function of other fields, you have to loop over them and update manually. Multi-update won't help here.
Rha7 gave a good idea, but the code above does not work without defining a temporary variable.
This sample code calculates an approximate age (leap years are ignored) from the 'birthday' field and stores it in the age field of every document that does not have one yet:
db.employers.find({age: {$exists: false}}).forEach(function(doc){
var new_age = parseInt((ISODate() - doc.birthday)/(3600*1000*24*365));
db.employers.update({_id: doc._id}, {$set: {age: new_age}});
});
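To show how approximate the calculation above is, here is the same arithmetic as a standalone function with fixed dates. The name approximateAge and the sample dates are illustrative; Math.floor replaces parseInt here.

```javascript
var MS_PER_YEAR = 3600 * 1000 * 24 * 365; // a 365-day year, leap days ignored

// Hypothetical helper mirroring the calculation in the answer above.
function approximateAge(birthday, now) {
  return Math.floor((now - birthday) / MS_PER_YEAR);
}

var now = new Date("2020-01-01");
var ageA = approximateAge(new Date("1990-01-01"), now);
var ageB = approximateAge(new Date("2000-01-02"), now);
console.log(ageA, ageB);
```

The first case gives 30, which matches the calendar age. The second gives 20 even though the 20th birthday is still one day away, because the leap days accumulated since 2000 push the quotient just past 20.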
Example to remove "00" from the beginning of a caller id:
db.call_detail_records_201312.find(
{ destination: /^001/ },
{ "destination": true }
).forEach(function(row){
db.call_detail_records_201312.update(
{ _id: row["_id"] },
{ $set: {
destination: row["destination"].replace(/^001/, '1')
}
}
)
});

Identify last document from MongoDB find() result set

I'm trying to 'stream' data from a node.js/MongoDB instance to the client using websockets. It is all working well.
But how do I identify the last document in the result? I'm using node-mongodb-native to connect to MongoDB from node.js.
A simplified example:
collection.find({}, {}, function(err, cursor) {
if (err) sys.puts(err.message);
cursor.each(function(err, doc) {
client.send(doc);
});
});
Since a MongoDB ObjectId contains its creation date, you can sort by _id in descending order and then use limit(1):
db.collection.find().sort( { _id : -1 } ).limit(1);
Note: I am not familiar with node.js at all. The command above is a mongo shell command, and I suppose you can easily rewrite it for node.js.
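As a small illustration of the creation date carried by an ObjectId, the sketch below decodes it from the hexadecimal string form. The helper name objectIdToDate and the sample id are illustrative; the first 4 bytes of an ObjectId are a timestamp in seconds since the Unix epoch.

```javascript
// Hypothetical helper: read the creation time embedded in an ObjectId.
// The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
function objectIdToDate(hex) {
  var seconds = parseInt(hex.substring(0, 8), 16);
  return new Date(seconds * 1000);
}

var created = objectIdToDate("507f1f77bcf86cd799439011");
console.log(created.toISOString());
```

The timestamp has one-second resolution, so sorting by _id reflects creation order only approximately for documents created within the same second. In the mongo shell, ObjectId values expose getTimestamp() for the same purpose.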
Say I have a companies collection. The snippet below gives me the last document in the collection.
db.companies.find({},{"_id":1}).skip(db.companies.find().count()-1);
Code cannot rely on _id, as it may not always follow a specific pattern if it's a user-defined value.
Use sort and limit. If you want to use a cursor:
var last = null;
var findCursor = collection.find({}).cursor();
findCursor.on("data", function(data) {
last = data;
...
});
findCursor.on("end", function(data) {
// last result in last
....
});
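When the client needs to know which document is the last one at the moment it is sent, rather than after the stream has ended, one option is a single item of lookahead. The sketch below is an assumption-laden illustration: withLastFlag is a hypothetical helper, and a plain array stands in for the documents arriving from a query.

```javascript
// Hypothetical helper: yield [item, isLast] pairs using one item of lookahead.
function* withLastFlag(iterable) {
  var iterator = iterable[Symbol.iterator]();
  var current = iterator.next();
  while (!current.done) {
    var next = iterator.next(); // peek ahead to see whether anything follows
    yield [current.value, next.done];
    current = next;
  }
}

// Stand-in for documents arriving from a query.
var docs = [{ n: 1 }, { n: 2 }, { n: 3 }];
var sent = [];
for (var pair of withLastFlag(docs)) {
  sent.push({ doc: pair[0], last: pair[1] });
}
console.log(JSON.stringify(sent));
```

The trade-off is that each document is held back until the next one (or the end) is known, so delivery lags by one item.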