Using an 'or' where clause with a projection in mongodb? - mongodb

I am new to MongoDB, so forgive me if this question has an obvious answer or is a little unclear, but I have searched for hours and have no idea why this isn't working. I want to get all documents in the users collection where the shirt is red or the age is over 20, which I can do, but I only want to print each user's id and name.
Example of document in collection:
{"_id" : ObjectId(hy65fi6152g8off589992y),
"id" : 02451,
"age" : 20,
"shirt" : "red",
"name" : "bob"}
Query I thought would work:
db.users.find({
$or:[
{"shirt" : "red"}, {"age" : {$gt:20}}
]
}
{id : 1, name: 1, _id: 0}
)
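One likely cause, assuming the query was typed exactly as shown: there is no comma between the filter document and the projection document, so the shell reports a syntax error instead of receiving two arguments. Below is a minimal sketch of the corrected argument list. So that it runs without a server, it uses findInMemory and matches, which are hypothetical stand-ins written for this example and not part of MongoDB; the sample users are made up as well.

```javascript
// Filter and projection as two separate arguments.
const filter = { $or: [{ shirt: "red" }, { age: { $gt: 20 } }] };
const projection = { id: 1, name: 1, _id: 0 };

// Hypothetical matcher: supports only $or, $gt and plain equality,
// which is all this example needs.
function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$or") return cond.some((sub) => matches(doc, sub));
    if (cond !== null && typeof cond === "object" && "$gt" in cond) {
      return doc[key] > cond.$gt;
    }
    return doc[key] === cond;
  });
}

// Hypothetical stand-in for find(): keeps matching documents and copies
// only the fields flagged with 1 in the projection.
function findInMemory(docs, query, fields) {
  return docs
    .filter((doc) => matches(doc, query))
    .map((doc) => {
      const out = {};
      for (const [key, flag] of Object.entries(fields)) {
        if (flag && key in doc) out[key] = doc[key];
      }
      return out;
    });
}

// Made-up sample data in the shape of the question's document.
const users = [
  { _id: "a", id: 2451, age: 20, shirt: "red", name: "bob" },
  { _id: "b", id: 2452, age: 35, shirt: "blue", name: "alice" },
  { _id: "c", id: 2453, age: 18, shirt: "green", name: "carol" },
];

const result = findInMemory(users, filter, projection);
console.log(result);

// Equivalent shell call, with the comma between the two arguments:
// db.users.find(filter, projection)
```

Only bob (red shirt) and alice (older than 20) are kept, and each result carries just id and name.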

Related

Use distinct() on a variable in mongodb

I'm trying several queries in MongoDB. Each document of my collection looks like this:
{
"_id" : 1,
"name" : 1,
"isReferenceProteome" : 1,
"isRepresentativeProteome" : 1,
"component" : 1,
"reference" : 1,
"upid" : 1,
"modified" : 1,
"taxonomy" : 1,
"superregnum" : 1,
"description" : 1,
"dbReference" : 1
}
The "reference" field has nested fields; one is "authorList", an array of subdocuments that each have a "name" field:
"reference" : {
"authorList" : [
{"name": "author1"},
{"name": "author2"},
{"name": "author3"} ...etc...
]
}
I have stored in a variable the result of the following query :
var testing = db.mycollection.find({'reference.authorList.30': {$exists: true}})
which holds all documents whose authorList has an element at index 30 (so at least 31 names, since array indexes start at 0).
Then I wanted to use distinct() on this variable, in order to get the distinct names of all authors:
testing.distinct("reference.authorList.name")
I tried this way because my first query returned an empty array :
db.mycollection.distinct( "reference.authorList.name", {"reference.authorList.name.30": {$exists: true}} )
I'm also trying the $where operator, but so far I only get a SyntaxError.
What am I missing?
Thanks.
Use
db.head_human_prot.distinct( "reference.authorList.name", {"reference.authorList.30": {$exists: true}} )
instead of
db.head_human_prot.distinct( "reference.authorList.name", {"reference.authorList.name.30": {$exists: true}} )
Silly me...
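To see why the index belongs directly after authorList, here is a minimal in-memory sketch in plain JavaScript. No MongoDB is involved; makeDoc and the sample data are hypothetical. In the failing path the index 30 is applied to name, which is a string inside each element, rather than to the authorList array itself.

```javascript
// Hypothetical helper: builds a document with `count` authors named
// prefix0, prefix1, ...
function makeDoc(count, prefix) {
  const authorList = [];
  for (let i = 0; i < count; i++) {
    authorList.push({ name: prefix + i });
  }
  return { reference: { authorList } };
}

const docs = [makeDoc(30, "a"), makeDoc(31, "b"), makeDoc(40, "b")];

// Mirrors {"reference.authorList.30": {$exists: true}}:
// true only when the array has an element at index 30.
const longLists = docs.filter(
  (doc) => doc.reference.authorList[30] !== undefined
);

// Mirrors distinct("reference.authorList.name") over the matched documents:
// flatten every name, then drop duplicates.
const names = [
  ...new Set(
    longLists.flatMap((doc) => doc.reference.authorList.map((a) => a.name))
  ),
];

console.log(longLists.length, names.length);
```

The 30-author document is filtered out because its last index is 29, which is the off-by-one to keep in mind when picking the index.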

Getting the correct query in Mongodb

I'm trying to get this simple query to just get a subfield out of a collection. So far I just keep getting the entire field so what should I correct to just print out the subfield I'm looking for?
I'm trying to list the titles (only) of all movies with a rank of less than 9.2 and with at least 5 votes, print the titles in alphabetical order.
This is my query so far, but it's incorrect and just returns the whole object. How can I get it to return just the rank and votes of Jungle Book? Thank you very much in advance.
db.collections.find({"title": {$exists:true}}, {"_id":0, "rank":{$lt : 9.2}})
{ "_id" : ObjectId("10"), "rank" : 6, "votes" : 8.8, "title" : "Jungle Book" }
{ "_id" : ObjectId("11"), "rank" : 8, "votes" : 8.7, "title" : "Spawn" }
You need to put the whole filter (query) in the first parameter. The second parameter is the projection: a set of flags that says which fields should be returned.
db.ratings.find({title: {$exists:true}, rank:{$lt : 9.2},
votes: {$gte : 5 } }, {_id:0, title:1}).sort({title:1})
This will return a set that looks like this:
[{"title" : "Jungle Book"}, {"title" : "Spawn"}]
If you want only the titles, and not in object form you could use "distinct" here:
db.ratings.distinct('title', {title: {$exists:true},
rank:{$lt : 9.2}, votes: {$gte : 5 } });
The distinct results often come back sorted, but as far as I know that order is not guaranteed, so don't rely on it. If you need a particular order, sort the returned array in client code or use an aggregate query.
I've run this EXACT set of code against my local install:
MongoDB shell version: 2.4.8
connecting to: test
rs0:PRIMARY> db.ratings.insert({rank:6, votes:8.8, title:"Jungle Book"});
rs0:PRIMARY> db.ratings.insert({rank:8, votes:8.7, title:"Spawn"});
rs0:PRIMARY> db.ratings.find({title: {$exists:true}, rank:{$lt : 9.2}, votes: {$gte : 5 } }, {_id:0, title:1}).sort({title:1})
{ "title" : "Jungle Book" }
{ "title" : "Spawn" }
rs0:PRIMARY> db.ratings.distinct('title', {title: {$exists:true}, rank:{$lt : 9.2}, votes: {$gte : 5 } });
[ "Jungle Book", "Spawn" ]
rs0:PRIMARY>
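For readers without a server at hand, the same filter, projection, sort and distinct steps can be sketched over a plain array. This is only an in-memory analogy in plain JavaScript, not MongoDB code; the two extra sample rows are hypothetical and exist only to show the filter rejecting something.

```javascript
// Sample data: the two rows from the transcript plus two made-up rows.
const ratings = [
  { rank: 8, votes: 8.7, title: "Spawn" },
  { rank: 6, votes: 8.8, title: "Jungle Book" },
  { rank: 9.5, votes: 9.9, title: "Rank Too High" },
  { rank: 7, votes: 3, title: "Too Few Votes" },
];

// First parameter of find(): every condition lives in the filter.
const selected = ratings.filter(
  (r) => r.title !== undefined && r.rank < 9.2 && r.votes >= 5
);

// Second parameter of find() plus sort(): keep only title, order by title.
const titleDocs = selected
  .map((r) => ({ title: r.title }))
  .sort((a, b) => (a.title < b.title ? -1 : a.title > b.title ? 1 : 0));

// distinct('title', filter): bare values, de-duplicated, sorted explicitly
// here rather than relying on any default order.
const distinctTitles = [...new Set(selected.map((r) => r.title))].sort();

console.log(titleDocs, distinctTitles);
```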

MongoDB Why this error : can't append to array using string field name: comments

I have a DB structure like below:
{
"_id" : 1,
"comments" : [
{
"_id" : 2,
"content" : "xxx"
}
]
}
I push a new subdocument into the comments field of a comment, and it works:
db.test.update(
{"_id" : 1, "comments._id" : 2},
{$push : {"comments.$.comments" : {_id : 3, content:"xxx"}}}
)
After that, the document looks like this:
{
"_id" : 1,
"comments" : [
{
"_id" : 2,
"comments" : [
{
"id" : 3,
"content" : "xxx"
}
],
"content" : "xxx"
}
]
}
But when I try to push a new subdocument into the comments field of the nested comment whose id is 3, there is an error:
db.test.update(
{"_id" : 1, "comments.comments.id" : 3},
{$push : {"comments.comments.$.comments" : {id : 4, content:"xxx"}}}
)
error message:
can't append to array using string field name: comments
Well, it makes total sense if you think about it. MongoDB has the advantage, and the disadvantage, of magically resolving certain things for you.
When you query the database for a specific regular field like this:
{ field : "value" }
The query {field:"value"} makes total sense for a plain value. It would not if the value were part of an array, but Mongo handles that for you, so if the structure is:
{ field : ["value", "anothervalue"] }
Mongo iterates through the array, matches "value" against its elements, and you don't have to think about it. It works perfectly, but only at one level, because it's impossible to guess what you want to do when there are multiple levels.
In your case the first query works because it is exactly this single-level case:
db.test.update(
{"_id" : 1, "comments._id" : 2},
{$push : {"comments.$.comments" : {_id : 3, content:"xxx"}}}
)
It matches _id at the first level and comments._id at the second level. The match involves a single array, and Mongo is able to resolve it.
But in the second case, think about what you are asking for. Let's isolate the where clause:
{"_id" : 1, "comments.comments.id" : 3},
"Give me the records from the main collection with _id:1" (one doc)
"And the comments whose inner comments have an id=3" (array * array)
The first level (comments) is resolved easily. The second is not, because comments is already an array: going one level deeper gives Mongo an array of arrays, and the single positional $ cannot say which element of which array the new document should be pushed into.
The workaround is to narrow the where clause so that it pins down a single document in the outer comments array (for example, the first one). That is not a good general solution, because you never know the position of the document you are looking for; from the shell, I think the only accurate option is to do it in two steps. The following query works (it is not a real solution) because it "solves" the multiple-array part by fixing the outer index to the first record:
db.test.update(
{"_id" : 1, "comments.0.comments._id" : 3},
{$push : {"comments.0.comments.$.comments" : {id : 4, content:"xxx"}}}
)
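The two-step idea mentioned above could look roughly like this: first read the document and locate both array positions in client code, then push using a path with explicit indexes. This is a sketch under assumptions, not the answerer's code: findNestedPath is a hypothetical helper, nested comments are assumed to be keyed by _id (the thread mixes id and _id), and the final shell call is shown only as a comment.

```javascript
// Step 1 would normally be: var doc = db.test.findOne({_id: 1});
// A literal stands in for the fetched document here.
const doc = {
  _id: 1,
  comments: [
    {
      _id: 2,
      content: "xxx",
      comments: [{ _id: 3, content: "xxx" }],
    },
  ],
};

// Hypothetical helper: returns the dotted path to the comments array of the
// nested comment with the given _id, or null when it is not found.
function findNestedPath(root, targetId) {
  for (let i = 0; i < root.comments.length; i++) {
    const inner = root.comments[i].comments || [];
    const j = inner.findIndex((c) => c._id === targetId);
    if (j !== -1) {
      return "comments." + i + ".comments." + j + ".comments";
    }
  }
  return null;
}

const path = findNestedPath(doc, 3);

// Step 2: build the update with explicit indexes instead of "$".
const update =
  path === null
    ? null
    : { $push: { [path]: { _id: 4, content: "xxx" } } };

console.log(path, JSON.stringify(update));

// In the shell this would then be sent as:
// db.test.update({_id: 1}, update)
```

One caveat of this approach: the positions are computed from a snapshot, so if another writer reorders or removes comments between the read and the update, the path can point at the wrong element.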

mongodb replace all instances of certain value for subdocument

My collection (called "workers"):
{
"_id" : "500",
"type" : "Manager",
"employees" : [{
"name" : "bob",
"id" : 101
},{
"name" : "phil",
"id" : 102
}]
}
Goal: for every _id that is a type: Manager AND that contains a subdocument that has an "id" of 102: replace 102 with 202.
Desired end result:
{
"_id" : "500",
"type" : "Manager",
"employees" : [{
"name" : "bob",
"id" : 101
},{
"name" : "phil",
"id" : 202
}]
}
I have tried:
db.workers.update({type:'Manager','employees.id':'102'},{$set:{'employees.id':'202'}},{multi:true})
I then did the following two things to verify:
db.workers.find({type: "Manager", 'employees.id': 102}).count()
I get a result of 9.
I also tried this to verify:
db.workers.find({$and: [{type: "Manager"},{"employees.id":60}]}).count()
This returned 0.
I am pretty confused at this point. Is my update wrong? Is my find wrong? Is it both? Is the '9' result wrong? Is the '0' wrong?
You need to use the $ positional update operator to update the specific element that matched your query. Your update is also using values of '102' and '202' which makes the update try and match strings when those fields are numbers.
Your update should look like this instead:
db.workers.update(
{type: 'Manager', 'employees.id': 102},
{$set: {'employees.$.id': 202}},
{multi: true})
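Both points in this answer can be illustrated with a small in-memory sketch in plain JavaScript. It is an analogy, not MongoDB code: replaceEmployeeId is a hypothetical function and the second worker is made up. It compares ids strictly, so the string '102' never equals the number 102, and it changes only the first matching array element per document, which is how the positional $ operator behaves.

```javascript
// Made-up data: one Manager from the question plus a non-Manager.
const workers = [
  {
    _id: "500",
    type: "Manager",
    employees: [
      { name: "bob", id: 101 },
      { name: "phil", id: 102 },
    ],
  },
  { _id: "501", type: "Clerk", employees: [{ name: "ann", id: 102 }] },
];

// Hypothetical stand-in for the multi update: returns how many documents
// were modified.
function replaceEmployeeId(docs, oldId, newId) {
  let modified = 0;
  for (const doc of docs) {
    if (doc.type !== "Manager") continue;
    // Strict comparison: "102" (string) does not match 102 (number).
    const idx = doc.employees.findIndex((e) => e.id === oldId);
    if (idx === -1) continue;
    // Like "employees.$.id": only the first matched element is changed.
    doc.employees[idx].id = newId;
    modified++;
  }
  return modified;
}

const stringAttempt = replaceEmployeeId(workers, "102", "202");
const numberAttempt = replaceEmployeeId(workers, 102, 202);

console.log(stringAttempt, numberAttempt);
```

The string attempt modifies nothing, while the numeric attempt updates phil's id in the Manager document and leaves the Clerk untouched.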

Listing, counting factors of unique Mongo DB values over all keys

I'm preparing a descriptive "schema" (quelle horreur) for a MongoDB I've been working with.
I used the excellent variety.js to create a list of all keys and show coverage of each key. However, in cases where the values corresponding to the keys have a small set of values, I'd like to be able to list the entire set as "available values." In R, I'd be thinking of these as the "factors" for the categorical variable, i.e., gender : ["M", "F"].
I know I could just use R + RMongo, query each variable, and basically do the same procedure I would to create a histogram, but I'd like to know the proper Mongo.query()/javascript/Map,Reduce way to approach this. I understand the db.collection.aggregate() functions are designed for exactly this.
Before asking this, I referenced:
http://docs.mongodb.org/manual/reference/aggregation/
http://docs.mongodb.org/manual/reference/method/db.collection.distinct/
How to query for distinct results in mongodb with python?
Get a list of all unique tags in mongodb
http://cookbook.mongodb.org/patterns/count_tags/
But can't quite get the pipeline order right. So, for example, if I have documents like these:
{_id : 1, "key1" : "value1", "key2": "value3"}
{_id : 2, "key1" : "value2", "key2": "value3"}
I'd like to return something like:
{"key1" : ["value1", "value2"]}
{"key2" : ["value3"]}
Or better, with counts:
{"key1" : {"value1" : 1, "value2" : 1}}
{"key2" : {"value3" : 2}}
I recognize one problem with doing this will be any values that have a wide range of different values---so, text fields, or continuous variables. Ideally, if there were more than x different possible values, it would be nice to truncate, say to no more than 20 unique values. If I find it's actually more, I'd query that variable directly.
Is this something like:
db.collection.aggregate(
{$limit: 20,
$group: {
_id: "$??varname",
count: {$sum: 1}
}})
First, how can I reference ??varname? for the name of each key?
I saw this link which had 95% of it:
Binning and tabulate (unique/count) in Mongo
with...
input data:
{ "_id" : 1, "age" : 22.34, "gender" : "f" }
{ "_id" : 2, "age" : 23.9, "gender" : "f" }
{ "_id" : 3, "age" : 27.4, "gender" : "f" }
{ "_id" : 4, "age" : 26.9, "gender" : "m" }
{ "_id" : 5, "age" : 26, "gender" : "m" }
This script:
db.collection.aggregate(
{$project: {gender:1}},
{$group: {
_id: "$gender",
count: {$sum: 1}
}})
Produces:
{"result" :
[
{"_id" : "m", "count" : 2},
{"_id" : "f", "count" : 3}
],
"ok" : 1
}
But what I don't understand is how could I do this generically for an unknown number/name of keys with a potentially large number of return values? This sample knows the key name is gender, and that the response set will be small (2 values).
If you already ran a script that outputs the names of all keys in the collection, you can generate your aggregation framework pipeline dynamically. What that means is either extending the variety.js type script or just writing your own.
Here is what it might look like in JS if passed an array called "keys" which has several non-"_id" named fields (I'm assuming top level fields and that you don't care about arrays, embedded documents, etc).
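// Names of the top-level fields to summarize (e.g. collected by variety.js).
var keys = ["key1", "key2"];
// A single $group stage; _id: null puts every document in one group.
var group = { "$group" : { "_id" : null } };
// For each key, collect its distinct values into an array named <key>List.
keys.forEach( function(f) {
    group["$group"][f + "List"] = { "$addToSet" : "$" + f };
} );
db.collection.aggregate(group);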
{
"result" : [
{
"_id" : null,
"key1List" : [
"value2",
"value1"
],
"key2List" : [
"value3"
]
}
],
"ok" : 1
}
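The question also asked for counts and for a cap of about 20 values per key. One possible extension of the same idea, offered as a sketch rather than a tested recipe, is to generate one small pipeline per key that groups by that key, counts, sorts by count and limits the output. buildCountPipelines is a hypothetical helper name, and the shell loop at the end is shown only as a comment.

```javascript
// Hypothetical helper: for each key, build a pipeline that counts how often
// each value occurs and keeps at most maxValues of the most frequent ones.
function buildCountPipelines(keys, maxValues) {
  const pipelines = {};
  keys.forEach((key) => {
    pipelines[key] = [
      { $group: { _id: "$" + key, count: { $sum: 1 } } },
      { $sort: { count: -1 } },
      { $limit: maxValues },
    ];
  });
  return pipelines;
}

const pipelines = buildCountPipelines(["key1", "key2"], 20);
console.log(JSON.stringify(pipelines, null, 2));

// In the shell, each pipeline could then be run in turn, for example:
// Object.keys(pipelines).forEach(function (k) {
//   printjson(db.collection.aggregate(pipelines[k]));
// });
```

Running one pipeline per key keeps each result small and lets the limit apply per key, at the cost of one pass over the collection for every key.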