mongodb aggregate query isn't returning proper sum on using $sum - mongodb

I have a collection students with documents in the following format:-
{
_id:"53fe74a866455060e003c2db",
name:"sam",
subject:"maths",
marks:"77"
}
{
_id:"53fe79cbef038fee879263d2",
name:"ryan",
subject:"bio",
marks:"82"
}
{
_id:"53fe74a866456060e003c2de",
name:"tony",
subject:"maths",
marks:"86"
}
I want to get the count of total marks of all the students with subject = "maths". So I should get 163 as sum.
db.students.aggregate([{ $match : { subject : "maths" } },
{ "$group" : { _id : "$subject", totalMarks : { $sum : "$marks" } } }])
Now I should get the following result-
{"result":[{"_id":"53fe74a866455060e003c2db", "totalMarks":163}], "ok":1}
But I get-
{"result":[{"_id":"53fe74a866455060e003c2db", "totalMarks":0}], "ok":1}
Can someone point out what I might be doing wrong here?

Your current schema has the marks field data type as string and you need an integer data type for your aggregation framework to work out the sum. On the other hand, you can use MapReduce to calculate the sum since it allows the use of native JavaScript methods like parseInt() on your object properties in its map functions. So overall you have two choices.
Option 1: Update Schema (Change Data Type)
The first would be to change the schema or add another field in your document that has the actual numerical value not the string representation. If your collection document size is relatively small, you could use a combination of the mongodb's cursor find(), forEach() and update() methods to change your marks schema:
db.student.find({ "marks": { "$type": 2 } }).snapshot().forEach(function(doc) {
db.student.update(
{ "_id": doc._id, "marks": { "$type": 2 } },
{ "$set": { "marks": parseInt(doc.marks) } }
);
});
For relatively large collection sizes, your db performance will be slow and it's recommended to use mongo bulk updates for this:
MongoDB versions >= 2.6 and < 3.2:
var bulk = db.student.initializeUnorderedBulkOp(),
counter = 0;
db.student.find({"marks": {"$exists": true, "$type": 2 }}).forEach(function (doc) {
bulk.find({ "_id": doc._id }).updateOne({
"$set": { "marks": parseInt(doc.marks) }
});
counter++;
if (counter % 1000 === 0) {
// Execute per 1000 operations
bulk.execute();
// re-initialize every 1000 update statements
bulk = db.student.initializeUnorderedBulkOp();
}
})
// Clean up remaining operations in queue
if (counter % 1000 !== 0) bulk.execute();
MongoDB version 3.2 and newer:
var ops = [],
cursor = db.student.find({"marks": {"$exists": true, "$type": 2 }});
cursor.forEach(function (doc) {
ops.push({
"updateOne": {
"filter": { "_id": doc._id } ,
"update": { "$set": { "marks": parseInt(doc.marks) } }
}
});
if (ops.length === 1000) {
db.student.bulkWrite(ops);
ops = [];
}
});
if (ops.length > 0) db.student.bulkWrite(ops);
Option 2: Run MapReduce
The second approach would be to rewrite your query with MapReduce where you can use the JavaScript function parseInt().
In your MapReduce operation, define the map function that process each input document. This function maps the converted marks string value to the subject for each document, and emits the subject and converted marks pair. This is where the JavaScript native function parseInt() can be applied. Note: in the function, this refers to the document that the map-reduce operation is processing:
var mapper = function () {
var x = parseInt(this.marks);
emit(this.subject, x);
};
Next, define the corresponding reduce function with two arguments keySubject and valuesMarks. valuesMarks is an array whose elements are the integer marks values emitted by the map function and grouped by keySubject.
The function reduces the valuesMarks array to the sum of its elements.
var reducer = function(keySubject, valuesMarks) {
return Array.sum(valuesMarks);
};
db.student.mapReduce(
mapper,
reducer,
{
out : "example_results",
query: { subject : "maths" }
}
);
With your collection, the above will put your MapReduce aggregation result in a new collection db.example_results. Thus, db.example_results.find() will output:
/* 0 */
{
"_id" : "maths",
"value" : 163
}

Possible causes your sum is being returned 0 are :
The field you are summing up is not an integer but a string.
Make sure the field contains numeric values.
You are using wrong syntax of $sum.
db.c1.aggregate([{
$group: {
_id: "$item",
price: {
$sum: "$price"
},
count: {
$sum: 1
}
}
}])
Make sure you use "$price" and not "price".
One of the most silly mistake due to which this error occurs is:
Use of space or tab inside the quotes while specifying field name.
Example - "$price " won't work !!! But, "$price" would work.

Related

Using $sum on a existent field returns a value of 0 [duplicate]

I have a collection students with documents in the following format:-
{
_id:"53fe74a866455060e003c2db",
name:"sam",
subject:"maths",
marks:"77"
}
{
_id:"53fe79cbef038fee879263d2",
name:"ryan",
subject:"bio",
marks:"82"
}
{
_id:"53fe74a866456060e003c2de",
name:"tony",
subject:"maths",
marks:"86"
}
I want to get the count of total marks of all the students with subject = "maths". So I should get 163 as sum.
db.students.aggregate([{ $match : { subject : "maths" } },
{ "$group" : { _id : "$subject", totalMarks : { $sum : "$marks" } } }])
Now I should get the following result-
{"result":[{"_id":"53fe74a866455060e003c2db", "totalMarks":163}], "ok":1}
But I get-
{"result":[{"_id":"53fe74a866455060e003c2db", "totalMarks":0}], "ok":1}
Can someone point out what I might be doing wrong here?
Your current schema has the marks field data type as string and you need an integer data type for your aggregation framework to work out the sum. On the other hand, you can use MapReduce to calculate the sum since it allows the use of native JavaScript methods like parseInt() on your object properties in its map functions. So overall you have two choices.
Option 1: Update Schema (Change Data Type)
The first would be to change the schema or add another field in your document that has the actual numerical value not the string representation. If your collection document size is relatively small, you could use a combination of the mongodb's cursor find(), forEach() and update() methods to change your marks schema:
db.student.find({ "marks": { "$type": 2 } }).snapshot().forEach(function(doc) {
db.student.update(
{ "_id": doc._id, "marks": { "$type": 2 } },
{ "$set": { "marks": parseInt(doc.marks) } }
);
});
For relatively large collection sizes, your db performance will be slow and it's recommended to use mongo bulk updates for this:
MongoDB versions >= 2.6 and < 3.2:
var bulk = db.student.initializeUnorderedBulkOp(),
counter = 0;
db.student.find({"marks": {"$exists": true, "$type": 2 }}).forEach(function (doc) {
bulk.find({ "_id": doc._id }).updateOne({
"$set": { "marks": parseInt(doc.marks) }
});
counter++;
if (counter % 1000 === 0) {
// Execute per 1000 operations
bulk.execute();
// re-initialize every 1000 update statements
bulk = db.student.initializeUnorderedBulkOp();
}
})
// Clean up remaining operations in queue
if (counter % 1000 !== 0) bulk.execute();
MongoDB version 3.2 and newer:
var ops = [],
cursor = db.student.find({"marks": {"$exists": true, "$type": 2 }});
cursor.forEach(function (doc) {
ops.push({
"updateOne": {
"filter": { "_id": doc._id } ,
"update": { "$set": { "marks": parseInt(doc.marks) } }
}
});
if (ops.length === 1000) {
db.student.bulkWrite(ops);
ops = [];
}
});
if (ops.length > 0) db.student.bulkWrite(ops);
Option 2: Run MapReduce
The second approach would be to rewrite your query with MapReduce where you can use the JavaScript function parseInt().
In your MapReduce operation, define the map function that process each input document. This function maps the converted marks string value to the subject for each document, and emits the subject and converted marks pair. This is where the JavaScript native function parseInt() can be applied. Note: in the function, this refers to the document that the map-reduce operation is processing:
var mapper = function () {
var x = parseInt(this.marks);
emit(this.subject, x);
};
Next, define the corresponding reduce function with two arguments keySubject and valuesMarks. valuesMarks is an array whose elements are the integer marks values emitted by the map function and grouped by keySubject.
The function reduces the valuesMarks array to the sum of its elements.
var reducer = function(keySubject, valuesMarks) {
return Array.sum(valuesMarks);
};
db.student.mapReduce(
mapper,
reducer,
{
out : "example_results",
query: { subject : "maths" }
}
);
With your collection, the above will put your MapReduce aggregation result in a new collection db.example_results. Thus, db.example_results.find() will output:
/* 0 */
{
"_id" : "maths",
"value" : 163
}
Possible causes your sum is being returned 0 are :
The field you are summing up is not an integer but a string.
Make sure the field contains numeric values.
You are using wrong syntax of $sum.
db.c1.aggregate([{
$group: {
_id: "$item",
price: {
$sum: "$price"
},
count: {
$sum: 1
}
}
}])
Make sure you use "$price" and not "price".
One of the most silly mistake due to which this error occurs is:
Use of space or tab inside the quotes while specifying field name.
Example - "$price " won't work !!! But, "$price" would work.

MongoDB convert string type to float type

Following the suggestions over here MongoDB: How to change the type of a field? I tried to update my collection to change the type of field and its value.
Here is the update query
db.MyCollection.find({"ProjectID" : 44, "Cost": {$exists: true}}).forEach(function(doc){
if(doc.Cost.length > 0){
var newCost = doc.Cost.replace(/,/g, '').replace(/\$/g, '');
doc.Cost = parseFloat(newCost).toFixed(2);
db.MyCollection.save(doc);
} // End of If Condition
}) // End of foreach
upon completion of the above query, when I run the following command
db.MyCollection.find({"ProjectID" : 44},{Cost:1})
I still have Cost field as string.
{
"_id" : ObjectId("576919b66bab3bfcb9ff0915"),
"Cost" : "11531.23"
}
/* 7 */
{
"_id" : ObjectId("576919b66bab3bfcb9ff0916"),
"Cost" : "13900.64"
}
/* 8 */
{
"_id" : ObjectId("576919b66bab3bfcb9ff0917"),
"Cost" : "15000.86"
}
What am I doing wrong here?
Here is the sample document
/* 2 */
{
"_id" : ObjectId("576919b66bab3bfcb9ff0911"),
"Cost" : "$7,100.00"
}
/* 3 */
{
"_id" : ObjectId("576919b66bab3bfcb9ff0912"),
"Cost" : "$14,500.00"
}
/* 4 */
{
"_id" : ObjectId("576919b66bab3bfcb9ff0913"),
"Cost" : "$12,619.00"
}
/* 5 */
{
"_id" : ObjectId("576919b66bab3bfcb9ff0914"),
"Cost" : "$9,250.00"
}
The problem is that toFixed returns an String, not a Number. Then your are just updating the document with a new, and different String.
Example from Mongo Shell:
> number = 2.3431
2.3431
> number.toFixed(2)
2.34
> typeof number.toFixed(2)
string
If you want a 2 decimals number you must parse it again with something like:
db.MyCollection.find({"ProjectID" : 44, "Cost": {$exists: true}}).forEach(function(doc){
if(doc.Cost.length > 0){
var newCost = doc.Cost.replace(/,/g, '').replace(/\$/g, '');
var costString = parseFloat(newCost).toFixed(2);
doc.Cost = parseFloat(costString);
db.MyCollection.save(doc);
} // End of If Condition
}) // End of foreach
Follow this pattern to convert a currency field of string type to a float. You need to query all the documents in the collection that have the Cost field type string. To do so you would need to take advantage of using the Bulk API for bulk updates. These offer better performance as you will be sending the operations to the server in batches of say 1000, which gives you a better performance as you are not sending every request to the server, but just once in every 1000 requests.
The following demonstrates this approach, the first example uses the Bulk API available in MongoDB versions >= 2.6 and < 3.2. It updates all
the documents in the collection by changing all the Cost fields to floating value fields:
var bulk = db.MyCollection.initializeUnorderedBulkOp(),
counter = 0;
db.MyCollection.find({
"Cost": { "$exists": true, "$type": 2 }
}).forEach(function (doc) {
var newCost = Number(doc.Cost.replace(/[^0-9\.]+/g,""));
bulk.find({ "_id": doc._id }).updateOne({
"$set": { "Cost": newCost }
});
counter++;
if (counter % 1000 == 0) {
bulk.execute(); // Execute per 1000 operations
// re-initialize every 1000 update statements
bulk = db.MyCollection.initializeUnorderedBulkOp();
}
})
// Clean up remaining operations in queue
if (counter % 1000 != 0) { bulk.execute(); }
The next example applies to the new MongoDB version 3.2 which has since deprecated the Bulk API and provided a newer set of apis using bulkWrite().
It uses the same cursors as above but creates the arrays with the bulk operations using the same forEach() cursor method to push each bulk write document to the array. Because write commands can accept no more than 1000 operations, you will need to group your operations to have at most 1000 operations and re-intialise the array when loop hit the 1000 iteration:
var cursor = db.MyCollection.find({ "Cost": { "$exists": true, "$type": 2 } }),
bulkUpdateOps = [];
cursor.forEach(function(doc){
var newCost = Number(doc.Cost.replace(/[^0-9\.]+/g,""));
bulkUpdateOps.push({
"updateOne": {
"filter": { "_id": doc._id },
"update": { "$set": { "Cost": newCost } }
}
});
if (bulkUpdateOps.length == 1000) {
db.MyCollection.bulkWrite(bulkUpdateOps);
bulkUpdateOps = [];
}
});
if (bulkUpdateOps.length > 0) { db.MyCollection.bulkWrite(bulkUpdateOps); }
Since mongoDB version 4.2, It can be done entirely inside one mongoDB query using Updates with Aggregation Pipeline:
db.collection.updateMany(
{Cost: {$exists: true}},
[{$set: {
Cost: {
$toDouble: {
$reduce: {
input: {$split: [{$substr: ["$Cost", 1, {$strLenCP: "$Cost"}]}, ","]},
initialValue: "",
in: {$concat: ["$$value", "$$this"]}
}
}
}
}}]
)
See how it works on the playground example

Mongo : How to convert all entries using a long timeStamp to an ISODate?

I have a current Mongo database with the accumulated entries/fields
{
name: "Fred Flintstone",
age : 34,
timeStamp : NumberLong(14283454353543)
}
{
name: "Wilma Flintstone",
age : 33,
timeStamp : NumberLong(14283454359453)
}
And so on...
Question : I want to convert all entries in the database to their corresponding ISODate instead - How does one do this?
Desired Result :
{
name: "Fred Flintstone",
age : 34,
timeStamp : ISODate("2015-07-20T14:50:32.389Z")
}
{
name: "Wilma Flintstone",
age : 33,
timeStamp : ISODate("2015-07-20T14:50:32.389Z")
}
Things I've tried
>db.myCollection.find().forEach(function (document) {
document["timestamp"] = new Date(document["timestamp"])
//Not sure how to update this document from here
db.myCollection.update(document) //?
})
Using the aggregation pipeline for update operations, simply run the following update operation:
db.myCollection.updateMany(
{ },
[
{ $set: {
timeStamp: {
$toDate: '$timeStamp'
}
} },
]
])
With you initial attempt, you were almost there, you just need to call the save() method on the modified document to update it since the method uses either the insert or the update command. In the above instance, the document contains an _id fieldand thus the save() method is equivalent to an update() operation with the upsert option set to true and the query predicate on the _id field:
db.myCollection.find().snapshot().forEach(function (document) {
document["timestamp"] = new Date(document["timestamp"]);
db.myCollection.save(document)
})
The above is similar to explicitly calling the update() method as you had previously attempted:
db.myCollection.find().snapshot().forEach(function (document) {
var date = new Date(document["timestamp"]);
var query = { "_id": document["_id"] }, /* query predicate */
update = { /* update document */
"$set": { "timestamp": date }
},
options = { "upsert": true };
db.myCollection.update(query, update, options);
})
For relatively large collection sizes, your db performance will be slow and it's recommended to use mongo bulk updates for this:
MongoDB versions >= 2.6 and < 3.2:
var bulk = db.myCollection.initializeUnorderedBulkOp(),
counter = 0;
db.myCollection.find({"timestamp": {"$not": {"$type": 9 }}}).forEach(function (doc) {
bulk.find({ "_id": doc._id }).updateOne({
"$set": { "timestamp": new Date(doc.timestamp") }
});
counter++;
if (counter % 1000 === 0) {
// Execute per 1000 operations
bulk.execute();
// re-initialize every 1000 update statements
bulk = db.myCollection.initializeUnorderedBulkOp();
}
})
// Clean up remaining operations in queue
if (counter % 1000 !== 0) bulk.execute();
MongoDB version 3.2 and newer:
var ops = [],
cursor = db.myCollection.find({"timestamp": {"$not": {"$type": 9 }}});
cursor.forEach(function (doc) {
ops.push({
"updateOne": {
"filter": { "_id": doc._id } ,
"update": { "$set": { "timestamp": new Date(doc.timestamp") } }
}
});
if (ops.length === 1000) {
db.myCollection.bulkWrite(ops);
ops = [];
}
});
if (ops.length > 0) db.myCollection.bulkWrite(ops);
It seems that there are some cumbersome things happening in mongo when trying to instantiate Date objects from NumberLong values. Mainly becasue the NumberLong values are converted to wrong representations and the fallback to current date is used.
I was fighting 2 days with mongo and finally I found the solution. The key is to convert NumberLong to Double ... and pass double values to Date constructor.
Here is the solution that uses bulb operations and work for me ...
(lastIndexedTimestamp is the collection field that is migrated to ISODate and stored in lastIndexed field. A temporary collection is created, and it is renamed to the original value in the end.)
db.annotation.aggregate( [
{ $project: {
_id: 1,
lastIndexedTimestamp: 1,
lastIndexed: { $add: [new Date(0), {$add: ["$lastIndexedTimestamp", 0]}]}
}
},
{ $out : "annotation_new" }
])
//drop annotation collection
db.annotation.drop();
//rename annotation_new to annotation
db.annotation_new.renameCollection("annotation");

Replace a word from a string

I have mongodb documents with a field like this:
Image : http://static14.com/p/Inc.5-Black-Sandals-5131-2713231-7-zoom.jpg
How can I replace the zoom part in the string value with some other text in order to get:
Image : http://static14.com/p/Inc.5-Black-Sandals-5131-2713231-7-product2.jpg
You could use mongo's forEach() cursor method to do an atomic update with the $set operator :
db.collection.find({}).snapshot().forEach(function(doc) {
var updated_url = doc.Image.replace('zoom', 'product2');
db.collection.update(
{"_id": doc._id},
{ "$set": { "Image": updated_url } }
);
});
Given a very large collection to update, you could speed up things a little bit with bulkWrite and restructure your update operations to be sent in bulk as:
var ops = [];
db.collection.find({}).snapshot().forEach(function(doc) {
ops.push({
"updateOne": {
"filter": { "_id": doc._id },
"update": { "$set": { "Image": doc.Image.replace('zoom', 'product2') } }
}
});
if ( ops.length === 500 ) {
db.collection.bulkWrite(ops);
ops = [];
}
})
if ( ops.length > 0 )
db.collection.bulkWrite(ops);
db.myCollection.update({image: 'http://static14.com/p/Inc.5-Black-Sandals-5131-2713231-7-zoom.jpg'}, {$set: {image : 'http://static14.com/p/Inc.5-Black-Sandals-5131-2713231-7-product2.jpg'}})
If you need to do this multiple times to multiple documents, you need to iterate them with a function. See here: MongoDB: Updating documents using data from the same document
Nowadays,
starting Mongo 4.2, db.collection.updateMany (alias of db.collection.update) can accept an aggregation pipeline, finally allowing the update of a field based on its own value.
starting Mongo 4.4, the new aggregation operator $replaceOne makes it very easy to replace part of a string.
// { "Image" : "http://static14.com/p/Inc.5-Black-Sandals-5131-2713231-7-zoom.jpg" }
// { "Image" : "http://static14.com/p/Inc.5-Black-Sandals-5131-2713231-7-boom.jpg" }
db.collection.updateMany(
{ "Image": { $regex: /zoom/ } },
[{
$set: { "Image": {
$replaceOne: { input: "$Image", find: "zoom", replacement: "product2" }
}}
}]
)
// { "Image" : "http://static14.com/p/Inc.5-Black-Sandals-5131-2713231-7-product2.jpg" }
// { "Image" : "http://static14.com/p/Inc.5-Black-Sandals-5131-2713231-7-boom.jpg" }
The first part ({ "Image": { $regex: /zoom/ } }) is just there to make the query faster by filtering which documents to update (the ones containing "zoom")
The second part ($set: { "Image": {...) is the update aggregation pipeline (note the squared brackets signifying the use of an aggregation pipeline):
$set is a new aggregation operator (Mongo 4.2) which in this case replaces the value of a field.
The new value is computed with the new $replaceOne operator. Note how Image is modified directly based on the its own value ($Image).

How to change the type of a field?

I am trying to change the type of a field from within the mongo shell.
I am doing this...
db.meta.update(
{'fields.properties.default': { $type : 1 }},
{'fields.properties.default': { $type : 2 }}
)
But it's not working!
The only way to change the $type of the data is to perform an update on the data where the data has the correct type.
In this case, it looks like you're trying to change the $type from 1 (double) to 2 (string).
So simply load the document from the DB, perform the cast (new String(x)) and then save the document again.
If you need to do this programmatically and entirely from the shell, you can use the find(...).forEach(function(x) {}) syntax.
In response to the second comment below. Change the field bad from a number to a string in collection foo.
db.foo.find( { 'bad' : { $type : 1 } } ).forEach( function (x) {
x.bad = new String(x.bad); // convert field to string
db.foo.save(x);
});
Convert String field to Integer:
db.db-name.find({field-name: {$exists: true}}).forEach(function(obj) {
obj.field-name = new NumberInt(obj.field-name);
db.db-name.save(obj);
});
Convert Integer field to String:
db.db-name.find({field-name: {$exists: true}}).forEach(function(obj) {
obj.field-name = "" + obj.field-name;
db.db-name.save(obj);
});
Starting Mongo 4.2, db.collection.update() can accept an aggregation pipeline, finally allowing the update of a field based on its own value:
// { a: "45", b: "x" }
// { a: 53, b: "y" }
db.collection.updateMany(
{ a : { $type: 1 } },
[{ $set: { a: { $toString: "$a" } } }]
)
// { a: "45", b: "x" }
// { a: "53", b: "y" }
The first part { a : { $type: 1 } } is the match query:
It filters which documents to update.
In this case, since we want to convert "a" to string when its value is a double, this matches elements for which "a" is of type 1 (double)).
This table provides the code representing the different possible types.
The second part [{ $set: { a: { $toString: "$a" } } }] is the update aggregation pipeline:
Note the squared brackets signifying that this update query uses an aggregation pipeline.
$set is a new aggregation operator (Mongo 4.2) which in this case modifies a field.
This can be simply read as "$set" the value of "a" to "$a" converted "$toString".
What's really new here, is being able in Mongo 4.2 to reference the document itself when updating it: the new value for "a" is based on the existing value of "$a".
Also note "$toString" which is a new aggregation operator introduced in Mongo 4.0.
In case your cast isn't from double to string, you have the choice between different conversion operators introduced in Mongo 4.0 such as $toBool, $toInt, ...
And if there isn't a dedicated converter for your targeted type, you can replace { $toString: "$a" } with a $convert operation: { $convert: { input: "$a", to: 2 } } where the value for to can be found in this table:
db.collection.updateMany(
{ a : { $type: 1 } },
[{ $set: { a: { $convert: { input: "$a", to: 2 } } } }]
)
For string to int conversion.
db.my_collection.find().forEach( function(obj) {
obj.my_value= new NumberInt(obj.my_value);
db.my_collection.save(obj);
});
For string to double conversion.
obj.my_value= parseInt(obj.my_value, 10);
For float:
obj.my_value= parseFloat(obj.my_value);
db.coll.find().forEach(function(data) {
db.coll.update({_id:data._id},{$set:{myfield:parseInt(data.myfield)}});
})
all answers so far use some version of forEach, iterating over all collection elements client-side.
However, you could use MongoDB's server-side processing by using aggregate pipeline and $out stage as :
the $out stage atomically replaces the existing collection with the
new results collection.
example:
db.documents.aggregate([
{
$project: {
_id: 1,
numberField: { $substr: ['$numberField', 0, -1] },
otherField: 1,
differentField: 1,
anotherfield: 1,
needolistAllFieldsHere: 1
},
},
{
$out: 'documents',
},
]);
To convert a field of string type to date field, you would need to iterate the cursor returned by the find() method using the forEach() method, within the loop convert the field to a Date object and then update the field using the $set operator.
Take advantage of using the Bulk API for bulk updates which offer better performance as you will be sending the operations to the server in batches of say 1000 which gives you a better performance as you are not sending every request to the server, just once in every 1000 requests.
The following demonstrates this approach, the first example uses the Bulk API available in MongoDB versions >= 2.6 and < 3.2. It updates all
the documents in the collection by changing all the created_at fields to date fields:
var bulk = db.collection.initializeUnorderedBulkOp(),
counter = 0;
db.collection.find({"created_at": {"$exists": true, "$type": 2 }}).forEach(function (doc) {
var newDate = new Date(doc.created_at);
bulk.find({ "_id": doc._id }).updateOne({
"$set": { "created_at": newDate}
});
counter++;
if (counter % 1000 == 0) {
bulk.execute(); // Execute per 1000 operations and re-initialize every 1000 update statements
bulk = db.collection.initializeUnorderedBulkOp();
}
})
// Clean up remaining operations in queue
if (counter % 1000 != 0) { bulk.execute(); }
The next example applies to the new MongoDB version 3.2 which has since deprecated the Bulk API and provided a newer set of apis using bulkWrite():
var bulkOps = [];
db.collection.find({"created_at": {"$exists": true, "$type": 2 }}).forEach(function (doc) {
var newDate = new Date(doc.created_at);
bulkOps.push(
{
"updateOne": {
"filter": { "_id": doc._id } ,
"update": { "$set": { "created_at": newDate } }
}
}
);
})
db.collection.bulkWrite(bulkOps, { "ordered": true });
To convert int32 to string in mongo without creating an array just add "" to your number :-)
db.foo.find( { 'mynum' : { $type : 16 } } ).forEach( function (x) {
x.mynum = x.mynum + ""; // convert int32 to string
db.foo.save(x);
});
What really helped me to change the type of the object in MondoDB was just this simple line, perhaps mentioned before here...:
db.Users.find({age: {$exists: true}}).forEach(function(obj) {
obj.age = new NumberInt(obj.age);
db.Users.save(obj);
});
Users are my collection and age is the object which had a string instead of an integer (int32).
You can easily convert the string data type to numerical data type.
Don't forget to change collectionName & FieldName.
for ex : CollectionNmae : Users & FieldName : Contactno.
Try this query..
db.collectionName.find().forEach( function (x) {
x.FieldName = parseInt(x.FieldName);
db.collectionName.save(x);
});
I need to change datatype of multiple fields in the collection, so I used the following to make multiple data type changes in the collection of documents. Answer to an old question but may be helpful for others.
db.mycoll.find().forEach(function(obj) {
if (obj.hasOwnProperty('phone')) {
obj.phone = "" + obj.phone; // int or longint to string
}
if (obj.hasOwnProperty('field-name')) {
obj.field-name = new NumberInt(obj.field-name); //string to integer
}
if (obj.hasOwnProperty('cdate')) {
obj.cdate = new ISODate(obj.cdate); //string to Date
}
db.mycoll.save(obj);
});
demo change type of field mid from string to mongo objectId using mongoose
Post.find({}, {mid: 1,_id:1}).exec(function (err, doc) {
doc.map((item, key) => {
Post.findByIdAndUpdate({_id:item._id},{$set:{mid: mongoose.Types.ObjectId(item.mid)}}).exec((err,res)=>{
if(err) throw err;
reply(res);
});
});
});
Mongo ObjectId is just another example of such styles as
Number, string, boolean that hope the answer will help someone else.
I use this script in mongodb console for string to float conversions...
db.documents.find({ 'fwtweaeeba' : {$exists : true}}).forEach( function(obj) {
obj.fwtweaeeba = parseFloat( obj.fwtweaeeba );
db.documents.save(obj); } );
db.documents.find({ 'versions.0.content.fwtweaeeba' : {$exists : true}}).forEach( function(obj) {
obj.versions[0].content.fwtweaeeba = parseFloat( obj.versions[0].content.fwtweaeeba );
db.documents.save(obj); } );
db.documents.find({ 'versions.1.content.fwtweaeeba' : {$exists : true}}).forEach( function(obj) {
obj.versions[1].content.fwtweaeeba = parseFloat( obj.versions[1].content.fwtweaeeba );
db.documents.save(obj); } );
db.documents.find({ 'versions.2.content.fwtweaeeba' : {$exists : true}}).forEach( function(obj) {
obj.versions[2].content.fwtweaeeba = parseFloat( obj.versions[2].content.fwtweaeeba );
db.documents.save(obj); } );
And this one in php)))
foreach($db->documents->find(array("type" => "chair")) as $document){
$db->documents->update(
array('_id' => $document[_id]),
array(
'$set' => array(
'versions.0.content.axdducvoxb' => (float)$document['versions'][0]['content']['axdducvoxb'],
'versions.1.content.axdducvoxb' => (float)$document['versions'][1]['content']['axdducvoxb'],
'versions.2.content.axdducvoxb' => (float)$document['versions'][2]['content']['axdducvoxb'],
'axdducvoxb' => (float)$document['axdducvoxb']
)
),
array('$multi' => true)
);
}
The above answers almost worked but had a few challenges-
Problem 1: db.collection.save no longer works in MongoDB 5.x
For this, I used replaceOne().
Problem 2: new String(x.bad) was giving exponential number
I used "" + x.bad as suggested above.
My version:
let count = 0;
db.user
.find({
custID: {$type: 1},
})
.forEach(function (record) {
count++;
const actualValue = record.custID;
record.custID = "" + record.custID;
console.log(`${count}. Updating User(id:${record._id}) from old id [${actualValue}](${typeof actualValue}) to [${record.custID}](${typeof record.custID})`)
db.user.replaceOne({_id: record._id}, record);
});
And for millions of records, here are the output (for future investigation/reference)-