How to iterate queries in mongo shell - mongo-shell

Surely this is simple, but I cannot figure it out. In the mongo shell I have the following command:
db.getCollection("CollectionName")
.findAndModify({query: {"Property.0.Element": {"$type" : 1}},
update: {$set: {"Property.0.Element":""}}
})
If I run this command several times, eventually it returns null and I know that I have changed all of the fields that I wanted to change. If however I run:
for(j = 0; j < 50;j++) {
var loc = "Property."+j+".Element";
db.getCollection("ShelbyCoAssessorDeepStaging")
.findAndModify({query: {loc : {"$type" : 1}},
update: {$set: {loc:""}}
})
}
Then I get null returned, but none of the values are actually changed. Why is this? Note: I am running this in Studio 3T's IntelliShell against an Atlas cluster at version 3.6.6.

You are trying to use dynamic keys for a query object, but a plain JavaScript object literal does not substitute a variable's value for a key name. Consider the following example on the MongoDB shell or Studio 3T's IntelliShell:
> var fieldName = "foo"
> var query = { fieldName: 1 }
> printjsononeline(query)
{ "fieldName" : 1 }
In this small example, we create a variable that contains the name of the key that we want to set inside a query object. JavaScript does not treat an identifier in key position as a variable: the identifier itself becomes the key name, whether or not it is quoted.
If you want to have dynamic key names in JavaScript, you can use the simple bracket notation (or "member index access"):
> var fieldName = "foo"
> var query = {}
> query[fieldName] = 1
1
> printjsononeline(query)
{ "foo" : 1 }
If we apply this to your example, we get the following code to be run on the MongoDB shell or Studio 3T IntelliShell:
for (var j = 0; j < 50; j++) {
var loc = "Property."+j+".Element";
var theQuery = {};
theQuery[loc] = {"$type" : 1};
var theUpdate = {$set: {}};
theUpdate["$set"][loc] = "";
db.getCollection("ShelbyCoAssessorDeepStaging")
.findAndModify({query: theQuery,
update: theUpdate
})
}
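As an alternative sketch, a shell whose JavaScript engine supports ES2015 computed property names can build the same two objects inline. The helper name buildElementOps is hypothetical, and whether your shell accepts this syntax is an assumption worth checking first:

```javascript
// Hypothetical helper: builds the query and update documents for one array index.
// Assumes an engine that supports ES2015 computed property names ([expr]: value).
function buildElementOps(j) {
  var loc = "Property." + j + ".Element";
  return {
    query: { [loc]: { "$type": 1 } },  // match documents where the element is a double (BSON type 1)
    update: { "$set": { [loc]: "" } }  // replace that element with an empty string
  };
}

var ops = buildElementOps(3);
// ops.query is { "Property.3.Element": { "$type": 1 } }
```

Inside the loop, the result could be passed straight to findAndModify(buildElementOps(j)). Keep in mind that findAndModify changes at most one document per call, so each index may still need repeated calls until null comes back.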

Related

Insert data in mongodb using javascript loop

I want to insert data in a mongodb collection with a javascript loop.
Here is my code :
var collection = db.DATA.find();
for (var i = 0; i < 3; i++) {
db.PARSED_DATA.insert(collection[i].folders);
}
I get this error:
2017-08-04T10:17:06.390+0200 E QUERY [thread1] TypeError: collection[i] is undefined :
#(shell):2:9
When I execute these commands in the mongo shell:
var collection = db.DATA.find();
db.PARSED_DATA.insert(collection[0].folders);
db.PARSED_DATA.insert(collection[1].folders);
db.PARSED_DATA.insert(collection[2].folders);
I get the expected result, and I still don't understand why my for loop doesn't work.
EDIT:
When I execute this inside Robo 3T, I get the expected result:
var collection = db.DATA.find().toArray();
for (var i = 0; i < 3; i++) {
db.PARSED_DATA.insert(collection[i].folders);
}
But when I execute it from my bash script with
mongo <myip>:27017 < db_parse.js
I get the error again.
I managed to get what I expected with these files:
db_parse.sh
mongo <myip>:27017 < db_parse.js
db_parse.js
use <db_name>
var collection = db.DATA.find();
while(collection.hasNext()){
db.PARSED_DATA.insert(collection.next().folders);
}
I used the MongoDB manual for this.

How to compare all documents in two collections with millions of doc and write the diff in a third collection in MongoDB

I have two collections (coll_1, coll_2) with a million documents each.
These two collections are created by running two versions of the same code against the same data source, so both have the same number of documents. A document in one collection may be missing a field or sub-document, or hold different values, compared with its counterpart, but corresponding documents share the same primary_key_id, which is indexed.
I have this JavaScript function saved on the db to get the diff:
db.system.js.save({
_id: "diffJSON", value:
function(obj1, obj2) {
var result = {};
for (key in obj1) {
if (obj2[key] != obj1[key]) result[key] = obj2[key];
if (typeof obj2[key] == 'array' && typeof obj1[key] == 'array')
result[key] = arguments.callee(obj1[key], obj2[key]);
if (typeof obj2[key] == 'object' && typeof obj1[key] == 'object')
result[key] = arguments.callee(obj1[key], obj2[key]);
}
return result;
}
});
It runs fine like this:
diffJSON(testObj1, testObj2);
Question: How can I run diffJSON on coll1 and coll2, and write the diffJSON result into coll3 along with primary_key_id?
I am new to MongoDB, and I understand that joins don't work the way they do in an RDBMS, so I wonder whether I have to copy the two documents being compared into a single collection and then run the diffJSON function.
Also, most of the time (say 90%) the documents in the two collections will be identical. I only need to know about the 10% of docs that have any diff.
Here is a simple example document:
(but real doc is around 15k in size, just so you know the scale)
var testObj1 = { test:"1",test1: "2", tt:["td","ax"], tr:["Positive"] ,tft:{test:["a"]}};
var testObj2 = { test:"1",test1: "2", tt:["td","ax"], tr:["Negative"] };
If you know a better way to diff the documents, please feel free to suggest.
You can use a simple shell script to achieve this. First, create a file named script.js and paste this code into it:
// load previously saved diffJSON() function
db.loadServerScripts();
// get all the document from collection coll1
var cursor = db.coll1.find();
if (cursor != null && cursor.hasNext()) {
// iterate over the cursor
while (cursor.hasNext()){
var doc1 = cursor.next();
// get the doc with the same _id from coll2
var id = doc1._id;
var doc2 = db.coll2.findOne({_id: id});
// compute the diff
var diff = diffJSON(doc2, doc1);
// if there is a difference between the two objects
if ( Object.keys(diff).length > 0 ) {
diff._id = id;
// insert the diff in coll3 with the same _id
db.coll3.insert(diff);
}
}
}
In this script I assume that your primary_key is the _id field.
Then execute it from your shell like this:
mongo --host hostName --port portNumber databaseName < script.js
where databaseName is the name of the database containing the collections coll1 and coll2.
For these sample documents (I just added an _id field to your docs):
var testObj1 = { _id: 1, test:"1",test1: "2", tt:["td","ax"], tr:["Positive"] ,tft:{test:["a"]}};
var testObj2 = { _id: 1, test:"1",test1: "2", tt:["td","ax"], tr:["Negative"] };
the script will save the following doc in coll3:
{ "_id" : 1, "tt" : { }, "tr" : { "0" : "Positive" } }
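The empty "tt" entry in that output comes from recursing into two equal arrays, and the tft field goes unreported because only the keys of the first argument are visited. A minimal sketch of a variant that skips equal nested values and visits keys from both documents follows. The name diffDocs is hypothetical (it is not the diffJSON from the question), reporting a removed field as null is my own choice, and the sketch assumes documents made only of primitives, arrays, and plain nested objects; types such as Date or ObjectId would need their own comparison.

```javascript
// Hypothetical helper, not the diffJSON from the question.
// Returns the fields of newDoc that differ from oldDoc; equal nested values are omitted.
function diffDocs(oldDoc, newDoc) {
  var result = {};

  // Collect the keys of both documents so added and removed fields are both seen.
  var keys = Object.keys(oldDoc);
  Object.keys(newDoc).forEach(function (k) {
    if (keys.indexOf(k) === -1) { keys.push(k); }
  });

  keys.forEach(function (key) {
    var a = oldDoc[key];
    var b = newDoc[key];
    var bothContainers = a !== null && typeof a === "object" &&
                         b !== null && typeof b === "object";
    if (bothContainers) {
      // Arrays and nested objects: recurse, keep the entry only if something differs.
      var nested = diffDocs(a, b);
      if (Object.keys(nested).length > 0) { result[key] = nested; }
    } else if (a !== b) {
      // Assumption: a field missing from newDoc is reported as null.
      result[key] = (b === undefined) ? null : b;
    }
  });
  return result;
}

var testObj1 = { _id: 1, test: "1", test1: "2", tt: ["td", "ax"], tr: ["Positive"], tft: { test: ["a"] } };
var testObj2 = { _id: 1, test: "1", test1: "2", tt: ["td", "ax"], tr: ["Negative"] };
var diff = diffDocs(testObj2, testObj1);
// diff is { tr: { "0": "Positive" }, tft: { test: ["a"] } }
```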
This solution builds upon the one proposed by felix (I don't have the necessary reputation to comment on his). I made a few small changes to his script that bring important performance improvements:
// load previously saved diffJSON() function
db.loadServerScripts();
// get all the document from collection coll1 and coll2
var cursor1 = db.coll1.find().sort({'_id': 1});
var cursor2 = db.coll2.find().sort({'_id': 1});
if (cursor1 != null && cursor1.hasNext() && cursor2 != null && cursor2.hasNext()) {
// iterate over the cursor
while (cursor1.hasNext() && cursor2.hasNext()){
var doc1 = cursor1.next();
var doc2 = cursor2.next();
var pk = doc1._id
// compute the diff
var diff = diffJSON(doc2, doc1);
// if there is a difference between the two objects
if ( Object.keys(diff).length > 0 ) {
diff._id = pk;
// insert the diff in coll3 with the same _id
db.coll3.insert(diff);
}
}
}
Two cursors fetch all the entries from both collections, sorted by the primary key. This is the important part and brings most of the performance improvement. Because the documents arrive sorted by primary key, the n-th document from each cursor refers to the same key, provided both collections contain exactly the same set of keys.
This way we avoid making a call to coll2 for each document in coll1. It might seem insignificant, but we're talking about 1 million calls, which put a lot of stress on the database.
Another important assumption is that the primary key field is _id. If that's not the case, it is crucial to have a unique index on the primary key field. Otherwise, the script might mismatch documents with the same primary key.
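If the two collections could ever differ in which keys they contain, pairing documents purely by position would drift out of step. A minimal sketch of a merge-style walk that pairs documents only when their _id values match is shown here. Both arrayCursor and pairById are hypothetical names, arrayCursor is only a stand-in for a sorted cursor so the logic can be tried without a database, and the comparison assumes _id values that work with < and === (numbers or strings, not ObjectId).

```javascript
// Hypothetical stand-in for a sorted cursor: exposes hasNext()/next() over an array.
function arrayCursor(docs) {
  var i = 0;
  return {
    hasNext: function () { return i < docs.length; },
    next: function () { return docs[i++]; }
  };
}

// Walks two _id-sorted cursors in step and calls onPair only for matching _id values.
function pairById(cursor1, cursor2, onPair) {
  var doc1 = cursor1.hasNext() ? cursor1.next() : null;
  var doc2 = cursor2.hasNext() ? cursor2.next() : null;
  while (doc1 !== null && doc2 !== null) {
    if (doc1._id === doc2._id) {
      onPair(doc1, doc2);
      doc1 = cursor1.hasNext() ? cursor1.next() : null;
      doc2 = cursor2.hasNext() ? cursor2.next() : null;
    } else if (doc1._id < doc2._id) {
      // doc1 has no counterpart in the second cursor: skip it.
      doc1 = cursor1.hasNext() ? cursor1.next() : null;
    } else {
      // doc2 has no counterpart in the first cursor: skip it.
      doc2 = cursor2.hasNext() ? cursor2.next() : null;
    }
  }
}

var pairedIds = [];
pairById(
  arrayCursor([{ _id: 1 }, { _id: 2 }, { _id: 4 }]),
  arrayCursor([{ _id: 1 }, { _id: 3 }, { _id: 4 }]),
  function (d1, d2) { pairedIds.push(d1._id); }
);
// pairedIds is [1, 4]
```

In the script, the callback would compute the diff for each matched pair and insert it into coll3.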

mongodb embedded document search

I have a MongoDB collection. My job is to increase the "rating" of a particular document inside the "ratings" key. I can do it with the following command in the mongo shell:
db.amitava1.update({_id:1},{"$inc":{"ratings.0.rating":1 } } )
Here, 0 accesses the first document in "ratings". But I need to use a variable in place of 0. The following does not work:
x = 0;
db.amitava1.update({_id:1},{"$inc":{"ratings.x.rating":1 } } );
Any help would be greatly appreciated. Thanks.
Try a template string to substitute the value of x into the ratings.x.rating path, then use that string as a computed key:
> var x = 0;
> var str = `ratings.${x}.rating`;
> db.amitava1.update({_id:1}, {$inc: {[str]: 1}})

MongoDB: filter records by checking if subfield keys include a specified set

I have records in a MongoDB collection with the following structure:
{
'field1': {
'a': 3,
'b': 1,
'c': 4,
...
}
}
I want to find all records for which the keys in field1 are in the following set: ['a','b'].
How can I structure a MongoDB query which will do this?
I found this post describing how to find all records which have a particular subfield. I would like to do the same, but testing for multiple subfields.
Thanks!
EDIT: I am aware I could write a query of the following form:
{'$and': [{'field1.a': {'$exists': true}}, {'field1.b': {'$exists': true}}]}
However, I would like to find a way to pass in a list of the subfield keys I'm looking for, instead of adding another $exists for each additional key.
I don't know of any MongoDB query that would be able to do this automatically. However, if you're willing to take advantage of the JavaScript available in the mongo shell, you can generate the $and query dynamically, which could be helpful. For example, given your sample data, you could do something like the following:
var q = { "$and" : [] };
var arr = ["a", "b", "c"];
for (var i = 0; i < arr.length; i++) {
var field = "field1." + arr[i];
var clause = {};
clause[field] = { "$exists" : true };
q["$and"].push(clause);
}
db.collection.find(q);
This would definitely be easier to run than editing the query manually every time you add a key.
[EDIT]
Note that you do not need an explicit $and in the query; you can just separate the clauses with commas. From the documentation:
MongoDB provides an implicit AND operation when specifying a comma separated list of expressions. Using an explicit AND with the $and operator is necessary when the same field or operator has to be specified in multiple expressions.
This means that you can generate a simpler query as follows:
var q = {};
var arr = ["a", "b", "c"];
for (var i = 0; i < arr.length; i++) {
var field = "field1." + arr[i];
q[field] = { "$exists" : true };
}
db.collection.find(q);
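To pass in the list of subfield keys, as the question asks, the loop can be wrapped in a small function. This is only a sketch; buildExistsQuery and its parameter names are hypothetical:

```javascript
// Hypothetical helper: builds an implicit-AND query that requires
// every listed subfield of `parent` to exist.
function buildExistsQuery(parent, keys) {
  var q = {};
  keys.forEach(function (k) {
    q[parent + "." + k] = { "$exists": true };
  });
  return q;
}

var existsQuery = buildExistsQuery("field1", ["a", "b"]);
// existsQuery is { "field1.a": { "$exists": true }, "field1.b": { "$exists": true } }
```

The result can then be handed to db.collection.find(existsQuery).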

Dynamic mongo projection - a projection that uses a field in the document to determine the projection

Say I have an object like this:
{default: 'x',
types: {
x:1,
y:2,
z:3
}
}
Is it possible to select just types.x (ie a projection of {"types.x":1}) without knowing that x is the default beforehand? Making two queries is clearly possible and not what I'm looking for.
Unfortunately, this is not available yet as part of the aggregation framework. However, according to this JIRA ticket, it is currently "planned but not scheduled". The only way of doing this currently is by using the map/reduce functionality. If you want to go ahead and use that, it would mean doing something like the following:
Map each document by _id and emit the appropriate key.
Since there will be only one value per key, the reduce function will not get called, but you still need to initialise the variable you use for the reduce function. You can use an empty function or an empty string.
Run map/reduce, saving the results in a collection of your choice.
In the mongo shell, it looks something like this:
var mapper = function() {
var typeValue = this.types[this.default];
emit(this._id, typeValue);
};
var reducer = "";
db.types.mapReduce(mapper, reducer, { out : "results" } );
If you then query the results collection, you will get something like the following:
> db.results.find();
{ "_id" : ObjectId("53d21a270dcfb83c7dba8da9"), "value" : 1 }
If you want to know what the default value was, you can modify the mapper function in order to return the key as a value as well. It would look something like this:
var mapper = function() {
var typeValue = this.types[this.default],
typeKey = "types." + this.default;
emit(this._id, { key : typeKey, val : typeValue } );
};
When run, it would produce results that look as follows:
> db.results.find().pretty();
{
"_id" : ObjectId("53d21a270dcfb83c7dba8da9"),
"value" : {
"key" : "types.x",
"val" : 1
}
}
Note that this is probably a much more convoluted solution than you might want, but it's the only way to do this using MongoDB without adding more logic to your application.
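Before running the job against a large collection, the mapper logic can be checked locally. This sketch assumes nothing about the server: emit here is a local stand-in for the function MongoDB provides inside map functions, and sampleDoc is a hypothetical document shaped like the one in the question.

```javascript
// Local stand-in for the emit() that MongoDB supplies inside map functions.
var emitted = [];
function emit(key, value) {
  emitted.push({ _id: key, value: value });
}

// Same mapper as in the answer: looks up the default type and emits its key and value.
var mapper = function () {
  var typeValue = this.types[this.default],
      typeKey = "types." + this.default;
  emit(this._id, { key: typeKey, val: typeValue });
};

// Hypothetical sample document; the map phase runs the mapper with `this` bound to each document.
var sampleDoc = { _id: 1, default: "x", types: { x: 1, y: 2, z: 3 } };
mapper.call(sampleDoc);
// emitted is now [ { _id: 1, value: { key: "types.x", val: 1 } } ]
```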