Decode this MongoDB query - mongodb

I have a MongoDB group query. I understand most of it, but the 'if' condition in the 'reduce' function is confusing me. I am not sure what its purpose is.
db.users.group({
"initial": {
"countstar": 0
},
"reduce": function(obj, prev) {
if (true != null) if (true instanceof Array) prev.countstar += true.length;
else prev.countstar++;
},
"cond": {
"location": null
}
});
I know what the 'initial' parameter does. I also know what the 'cond' parameter is doing. However, the whole bit inside the 'reduce' parameter is confusing.
Also, what is the equivalent SQL?

It's easier to understand if we reformat it:
db.users.group({
    "initial": { "countstar": 0 },
    "reduce": function(obj, prev) {
        if (true != null) {
            if (true instanceof Array)
                prev.countstar += true.length;
            else
                prev.countstar++;
        }
    },
    "cond": { "location": null }
});
Something's definitely wrong with this, because true is a boolean value in JavaScript, so the first condition always holds, the second condition never holds, and true.length is undefined. As written, the function therefore just increments countstar once per matching document. It looks like true should instead be some property of obj. In that case, what the code does, sort of, is sum the lengths of whatever array property true is supposed to refer to, counting non-array values as length-1 arrays. Where did this code come from? Does it actually work and give you a meaningful result? What's the result that you want to get from this code? I think you should throw the function out. You can do the same thing with aggregation and it will work better. Since you shouldn't use this code and I'm not competent with SQL, I won't try to translate it into SQL.
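To make the guess about the intended behaviour concrete, here is a minimal sketch in plain JavaScript (no database involved). The property name tags is purely hypothetical, standing in for whatever true replaced, and runGroup is only a rough imitation of how group() folds documents into the initial value, not MongoDB's implementation.

```javascript
// Hypothetical stand-in for the property that `true` replaced in the original.
const FIELD = "tags";

// Same logic as the reformatted reduce, but reading obj[FIELD] instead of `true`.
function reduce(obj, prev) {
  const value = obj[FIELD];
  if (value != null) {
    if (value instanceof Array) {
      prev.countstar += value.length; // arrays contribute their length
    } else {
      prev.countstar++;               // any other non-null value counts as 1
    }
  }
}

// Rough imitation of group() for a single group: fold every document into the initial value.
function runGroup(docs) {
  const result = { countstar: 0 };
  docs.forEach(function (doc) { reduce(doc, result); });
  return result;
}

const sampleDocs = [
  { tags: ["a", "b", "c"] }, // array: adds 3
  { tags: "solo" },          // non-array: adds 1
  { tags: null },            // null: skipped
  { name: "no tags" }        // missing property: undefined != null is false, so skipped
];
const groupResult = runGroup(sampleDocs);
console.log(groupResult.countstar);
```

With a real property in place of true, the null check starts to matter: documents where the property is null or missing no longer contribute to the count.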

Related

Why is a MongoDB update() slower than find() when no documents are matched and can this be changed?

Consider the following two approaches of (maybe) updating a bunch of records:
Approach 1 ("find and maybe update"):
let ids = db.getCollection("users").find({
"status.lastActivity": {"$lte": timeoutDate}
}, {
"fields": {"_id": 1}
}).fetch().map(doc => {
doc = doc._id;
return doc
});
if (ids.length) {
db.getCollection("users").update({
"_id": {"$in": ids}
}, {
"$set": {
"status.idle": true
}
}, {
"multi": true
});
}
Approach 2 ("directly update"):
db.getCollection("users").update({
"status.lastActivity": {"$lte": timeoutDate}
}, {
"$set": {
"status.idle": true
}
}, {
"multi": true
});
And now to keep it simple let's assume that there are never users with a smaller status.lastActivity than timeoutDate (so ids is also always an empty array).
In that case I get significantly better performance with Approach 1: it takes 0.1 to 2 ms, while Approach 2 takes 40 to 80 ms.
My question now is, why is that the case? I would have assumed MongoDB is 'clever' enough to do things similar to Approach 1 under the hood when I actually use Approach 2 and doesn't waste resource when there actually is no record matched by the selector...
And can I change it somehow so that it would work that way? Or have I maybe some kind of wrong configuration which is causing this and I could get rid of? Because obviously writing things like in Approach 2 would be leaner...
Is this in JS? db.getCollection("users").find( looks like it should return a Promise, and promises don't have length, so the update code that's gated by ids.length would never run.

Is there a mongo query that returns a boolean if any documents match a query? [duplicate]

I want to return true if a userID already exists in my collection and false otherwise. I have this function, but it always returns True.
def alreadyExists(newID):
    if db.mycollection.find({'UserIDS': { "$in": newID}}):
        return True
    else:
        return False
How could I get this function to only return true if a user id already exists?
Note: This answer is outdated. More recent versions of MongoDB can use the far more efficient method db.collection.countDocuments. See the answer by Xavier Guihot for a better solution.
find doesn't return a boolean value, it returns a cursor. To check if that cursor contains any documents, use the cursor's count method:
if db.mycollection.find({'UserIDS': { "$in": newID}}).count() > 0:
If newID is not an array you should not use the $in operator. You can simply do find({'UserIDS': newID}).
Starting Mongo 4.0.3/PyMongo 3.7.0, we can use count_documents:
if db.collection.count_documents({ 'UserIDS': newID }, limit = 1) != 0:
    # do something
Used with the optional parameter limit, this provides a way to find if there is at least one matching occurrence.
Limiting the number of matching occurrences makes the collection scan stop as soon as a match is found instead of going through the whole collection.
Note that this can also be written as follows, since a non-zero count is treated as true in a Python condition:
if db.collection.count_documents({ 'UserIDS': newID }, limit = 1):
    # do something
In earlier versions of Mongo/PyMongo, count could be used (deprecated and replaced by count_documents in Mongo 4):
if db.collection.count({ 'UserIDS': newID }, limit = 1) != 0:
    # do something
If you're using Motor, find() doesn't do any communication with the database, it merely creates and returns a MotorCursor:
http://motor.readthedocs.org/en/stable/api/motor_collection.html#motor.MotorCollection.find
Since the MotorCursor is not None, Python considers it a "true" value so your function returns True. If you want to know if at least one document exists that matches your query, try find_one():
from tornado import gen
@gen.coroutine
def alreadyExists(newID):
    doc = yield db.mycollection.find_one({'UserIDS': { "$in": newID}})
    # Returning a value from a generator needs Python 3.3+; on Python 2 use: raise gen.Return(bool(doc))
    return bool(doc)
Notice you need a "coroutine" and "yield" to do I/O with Tornado. You could also use a callback:
def alreadyExists(newID, callback):
    db.mycollection.find_one({'UserIDS': { "$in": newID}}, callback=callback)
For more on callbacks and coroutines, see the Motor tutorial:
http://motor.readthedocs.org/en/stable/tutorial.html
If you're using PyMongo and not Motor, it's simpler:
def alreadyExists(newID):
    return bool(db.mycollection.find_one({'UserIDS': { "$in": newID}}))
Final note, MongoDB's $in operator takes a list of values. Is newID a list? Perhaps you just want:
find_one({'UserIDS': newID})
One liner solution in mongodb query
db.mycollection.find({'UserIDS': { "$in": newID}}).count() > 0 ? true : false
return db.mycollection.find({'UserIDS': newID}).count() > 0
This worked for me
result = num.find({"num": num}, { "_id": 0 })
if result.count() > 0:
    return
else:
    num.insert({"num": num, "DateTime": DateTime })

MongoDB Query Help - query on values of any key in a sub-object

I want to perform a query on this collection to determine which documents have any keys in things that match a certain value. Is this possible?
I have a collection of documents like:
{
"things": {
"thing1": "red",
"thing2": "blue",
"thing3": "green"
}
}
EDIT: for conciseness
If you don't know what the keys will be and you need it to be interactive, then you'll need to use the (notoriously performance challenged) $where operator like so (in the shell):
db.test.find({$where: function() {
    for (var field in this.settings) {
        if (this.settings[field] == "red") return true;
    }
    return false;
}})
If you have a large collection, this may be too slow for your purposes, but it's your only option if your set of keys is unknown.
MongoDB 3.6 Update
You can now do this without $where by using the $objectToArray aggregation operator:
db.test.aggregate([
// Project things as a key/value array, along with the original doc
{$project: {
array: {$objectToArray: '$things'},
doc: '$$ROOT'
}},
// Match the docs with a field value of 'red'
{$match: {'array.v': 'red'}},
// Re-project the original doc
{$replaceRoot: {newRoot: '$doc'}}
])
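To see what the three stages are doing, the same idea can be imitated in plain JavaScript on an in-memory list. This is only a sketch: objectToArray and findByAnyValue are hypothetical helper names, and the sample documents are made up; the real work would still be done by the server.

```javascript
// Plain-JavaScript imitation of {$objectToArray: '$things'}: one {k, v} pair per key.
function objectToArray(obj) {
  return Object.keys(obj || {}).map(function (key) {
    return { k: key, v: obj[key] };
  });
}

// Imitates project -> match -> replaceRoot for an in-memory list of documents.
function findByAnyValue(docs, wanted) {
  return docs
    .map(function (doc) {
      return { array: objectToArray(doc.things), doc: doc };
    })
    .filter(function (row) {
      // Like {'array.v': wanted}: true if any pair has the wanted value.
      return row.array.some(function (pair) { return pair.v === wanted; });
    })
    .map(function (row) { return row.doc; });
}

const thingDocs = [
  { _id: 1, things: { thing1: "red", thing2: "blue", thing3: "green" } },
  { _id: 2, things: { thing1: "yellow" } }
];
const redDocs = findByAnyValue(thingDocs, "red");
console.log(redDocs.map(function (d) { return d._id; }));
```

The key point is that once the sub-object is turned into an array of key/value pairs, the key names no longer need to be known in advance.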
I'd suggest a schema change so that you can actually do reasonable queries in MongoDB.
From:
{
"userId": "12347",
"settings": {
"SettingA": "blue",
"SettingB": "blue",
"SettingC": "green"
}
}
to:
{
"userId": "12347",
"settings": [
{ name: "SettingA", value: "blue" },
{ name: "SettingB", value: "blue" },
{ name: "SettingC", value: "green" }
]
}
Then, you could index on "settings.value", and do a query like:
db.settings.ensureIndex({ "settings.value" : 1})
db.settings.find({ "settings.value" : "blue" })
The change really is simple ..., as it moves the setting name and setting value to fully indexable fields, and stores the list of settings as an array.
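If existing documents need converting, the reshaping itself is small. A rough sketch in plain JavaScript follows; settingsToArray is a hypothetical helper name, not a MongoDB function, and it only transforms an object in memory. How you read and write the documents is up to you.

```javascript
// Hypothetical one-off conversion helper; not part of any MongoDB API.
function settingsToArray(doc) {
  const converted = Object.assign({}, doc); // shallow copy, leaves the input untouched
  converted.settings = Object.keys(doc.settings).map(function (name) {
    return { name: name, value: doc.settings[name] };
  });
  return converted;
}

const oldDoc = {
  userId: "12347",
  settings: { SettingA: "blue", SettingB: "blue", SettingC: "green" }
};
const newDoc = settingsToArray(oldDoc);
console.log(JSON.stringify(newDoc.settings));
```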
If you can't change the schema, you could try @JohnnyHK's solution, but be warned that it's basically worst case in terms of performance and it won't work effectively with indexes.
Sadly, none of the previous answers address the fact that mongo can contain nested values in arrays or nested objects.
THIS IS THE CORRECT QUERY:
{$where: function() {
    var deepIterate = function (obj, value) {
        for (var field in obj) {
            if (obj[field] == value) {
                return true;
            }
            var found = false;
            if (typeof obj[field] === 'object') {
                found = deepIterate(obj[field], value);
                if (found) { return true; }
            }
        }
        return false;
    };
    return deepIterate(this, "573c79aef4ef4b9a9523028f")
}}
Since typeof returns 'object' for both arrays and nested objects, the query recurses into every nested element until it finds a key with the given value.
You can check previous answers with a nested value and the results will be far from desired.
Stringifying the whole object instead is a hit on performance: the match has to scan the entire string, and it creates a copy of the object as a string in RAM (inefficient both because the query uses more RAM and because the function context already has the object loaded).
The query itself can work with objectId, string, int and any basic javascript type you wish.
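The recursive search can be tried outside of $where as an ordinary function, which makes it easy to check against a nested document. This is a sketch only: the sample document below is invented for illustration, and the function body is the same logic as above, slightly condensed.

```javascript
// Same recursive search as in the $where body, written as a standalone function.
function deepIterate(obj, value) {
  for (var field in obj) {
    if (obj[field] == value) {
      return true;
    }
    // typeof is 'object' for nested objects, arrays and null;
    // for...in over null simply does not iterate, so null is harmless here.
    if (typeof obj[field] === "object" && deepIterate(obj[field], value)) {
      return true;
    }
  }
  return false;
}

// Made-up document with the value buried inside an array of sub-documents.
const nestedDoc = {
  owner: { ids: [ { ref: "573c79aef4ef4b9a9523028f" } ] },
  tags: ["x", null]
};
console.log(deepIterate(nestedDoc, "573c79aef4ef4b9a9523028f"));
console.log(deepIterate(nestedDoc, "missing"));
```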

Using IF/ELSE in map reduce

I am trying to make a simple map/reduce function on one of my MongoDB database collections.
I get data but it looks wrong. I am unsure about the Map part. Can I use IF/ELSE in this way?
UPDATE
I want to get the number of authors that own the files. In other words, how many of the authors own the uploaded files and, thus, how many authors have no files.
The objects in the collection look like this:
{
"_id": {
"$id": "4fa8efe33a34a40e52800083d"
},
"file": {
"author": "john",
"type": "mobile",
"status": "ready"
}
}
The map / reduce looks like this:
$map = new MongoCode ("function() {
if (this.file.type != 'mobile' && this.file.status == 'ready') {
if (!this.file.author) {
return;
}
emit (this.file.author, 1);
}
}");
$reduce = new MongoCode ("function( key , values) {
var count = 0;
for (index in values) {
count += values[index];
}
return count;
}");
$this->cimongo->command (array (
"mapreduce" => "files",
"map" => $map,
"reduce" => $reduce,
"out" => "statistics.photographer_count"
)
);
The map part looks ok to me. I would slightly change the reduce part.
values.forEach(function(v) {
    count += v;
});
You should not use a for...in loop to iterate over an array; it was not meant for that. It is for enumerating an object's properties. Here is a more detailed explanation.
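A small sketch of the difference, in plain JavaScript: for...in walks every enumerable property of the array, not just the numeric indexes, so a stray property ends up in the sum. The extra property below is contrived purely to show the effect.

```javascript
// Reduce written with for...in, as in the question.
function reduceForIn(key, values) {
  var count = 0;
  for (var index in values) {
    count += values[index];
  }
  return count;
}

// Reduce written with forEach, which only visits array elements.
function reduceForEach(key, values) {
  var count = 0;
  values.forEach(function (v) {
    count += v;
  });
  return count;
}

var emitted = [1, 1, 1];
emitted.extra = 5; // contrived non-index property on the array

var sumForIn = reduceForIn("john", emitted);     // also picks up `extra`
var sumForEach = reduceForEach("john", emitted); // only the three elements
console.log(sumForIn, sumForEach);
```

On a clean array both versions give the same sum, which is why the original reduce can appear to work fine.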
Why do you think your data is wrong? What's your source data? What do you get? What do you expect to get?
I just tried your map and reduce in mongo shell and got correct (reasonable looking) results.
The other way you can do what you are doing is get rid of the inner "if" condition in the map but call your mapreduce function with appropriate query clause, for example:
db.files.mapReduce(map, reduce, {out: 'outcollection', query: {"file.author": {$exists: true}}})
or if you happen to have indexes to make the query efficient, just get rid of all ifs and run mapreduce with query:{"file.author":{$exists:true},"file.type":"mobile","file.status":"ready"} clause. Change the conditions to match the actual cases you want to sum up over.
In 2.2 (upcoming version available today as rc0) you can use the aggregation framework for this type of query rather than writing map/reduce functions, hopefully that will simplify things somewhat.

Mongo query different attributes on different rows

I have a mongo collection with the fields
visit_id, user_id, date, action 1, action 2
example:
1 u100 2012-01-01 phone-call -
2 u100 2012-01-02 - computer-check
Can I get in mongodb the user that has made both a phone-call and a computer-check no matter the time ? ( basically it's an AND on different rows )
I guess it is not possible without map/reduce work.
I see it can be done in following way:
1. First you need to run a map/reduce that produces results like this:
{
_id : "u100",
value: {
actions: [
"phone-call",
"computer-check",
"etc..."
]
}
}
2. Then you can query the above m/r result via $elemMatch.
You won't be able to do this with a single query. If this is something you're doing frequently in your application, I wouldn't recommend map/reduce; I'd recommend doing a query in MongoDB using the $or operator, and then processing the result on the client to get a unique set of user_ids.
For example:
db.users.find({$or:[{"action 1":"phone-call"}, {"action 2":"computer-check"}]})
In the future, you should save your data in a different format like the one suggested above by Andrew.
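The client-side step could look roughly like the sketch below, in plain JavaScript. It assumes the rows returned by the $or query have already been fetched into an array; usersWithBothActions is a hypothetical helper name and the sample rows are made up. Note that the $or query returns users with either action, so the client still has to keep only those that have both.

```javascript
// Keep only the user_ids that have BOTH a phone-call row and a computer-check row.
function usersWithBothActions(rows) {
  const seen = {}; // user_id -> which of the two actions have been observed
  rows.forEach(function (row) {
    const flags = seen[row.user_id] || (seen[row.user_id] = { phone: false, computer: false });
    if (row["action 1"] === "phone-call") { flags.phone = true; }
    if (row["action 2"] === "computer-check") { flags.computer = true; }
  });
  return Object.keys(seen).filter(function (id) {
    return seen[id].phone && seen[id].computer;
  });
}

// Made-up rows, shaped like the result of the $or query.
const matchedRows = [
  { visit_id: 1, user_id: "u100", "action 1": "phone-call" },
  { visit_id: 2, user_id: "u100", "action 2": "computer-check" },
  { visit_id: 3, user_id: "u200", "action 1": "phone-call" }
];
const bothUsers = usersWithBothActions(matchedRows);
console.log(bothUsers);
```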
There is the MongoDB group method that can be used for your query, comparable to an SQL group by operator.
I haven't tested this, but your query could look something similar to:
var results = db.coll.group({
    key: { user_id: true },
    cond: { $or: [ { action1: "phone-call" }, { action2: "computer-check" } ] },
    initial: { actionFlags: 0 },
    reduce: function(obj, prev) {
        if (obj.action1 == "phone-call") { prev.actionFlags |= 1; }
        if (obj.action2 == "computer-check") { prev.actionFlags |= 2; }
    },
    finalize: function(doc) {
        if (doc.actionFlags == 3) { return doc; }
        return null;
    }
});
Again, I haven't tested this; it's based on my reading of the documentation. You're grouping by the user_id (the key declaration). The rows you want to let through have either action1 == "phone-call" or action2 == "computer-check" (the cond declaration). The initial state when you start checking a particular user_id is 0 (initial). For each row you check if action1 == "phone-call" and set its flag, and check action2 == "computer-check" and set its flag (the reduce function). Once you've marked the row types, you check to make sure both flags are set. If so, keep the object, otherwise eliminate it (the finalize function).
That last part is the only part I'm unsure of, since the documentation doesn't explicitly state that you can knock out records in the finalize function. It will probably take me more time to get some test data set up to see than it would for you to see if the example above works.