I'm trying to write up a mongo query that finds all entries where the "steps" field has no values within an array argument.
So for example, given two entries with values:
Entry1:
steps: [3, 4]
Entry2:
steps: [3, 5]
The query should return Entry1, but not Entry2, for input array [4, 8, 10]. I'm quite new to mongo - any ideas appreciated.
You mean you have some records:
db.foo.find()
{ "_id" : 1, "steps" : [ 3, 4 ] }
{ "_id" : 2, "steps" : [ 3, 5 ] }
Then you would query:
> db.foo.find({steps:{$in:[4,8,10]}})
{ "_id" : 1, "steps" : [ 3, 4 ] }
The $in clause picks records in which any stored element matches any of the terms in the array supplied in the query.
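To make that matching rule concrete, here is a minimal plain-JavaScript sketch (not MongoDB code) that mimics how $in treats an array field. The helper name matchesIn is made up for illustration.

```javascript
// Plain-JS model of {steps: {$in: terms}} for an array field.
// matchesIn is a hypothetical helper, not part of any MongoDB API.
function matchesIn(stored, terms) {
  // A document matches when at least one stored element equals one of the terms.
  return stored.some(function (value) {
    return terms.indexOf(value) !== -1;
  });
}

var docs = [
  { _id: 1, steps: [3, 4] },
  { _id: 2, steps: [3, 5] }
];

var matchedIds = docs
  .filter(function (doc) { return matchesIn(doc.steps, [4, 8, 10]); })
  .map(function (doc) { return doc._id; });

console.log(matchedIds); // [ 1 ]
```

If the opposite is wanted (no stored element appears in the list, which is how the first sentence of the question reads), $nin is the operator to look at.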
I have a document with entry_height, exit_height (might be null).
Height is a bitcoin thing (block height). Several entries can have the same entry_height or exit_height.
I want to show a list where there is a row for entry_height and, if exit_height is filled, also show a 2nd row.
I need to order by height of both fields.
Let's say I have these entries:
1) entry_height: 1, exit_height: 5, entry_data, exit_data, ...
2) entry_height: 2, exit_height: 3, entry_data, exit_data, ...
3) entry_height: 4, exit_height: null, entry_data, null
The query result would be:
1) height: 1, entry related data...
2) height: 2, entry related data...
3) height: 3, exit related data ...
4) height: 4, entry related data...
5) height: 5, exit related data...
What indices should be set and how to read the data from the database?
Thanks.
Update: after some reading I think that entry_height and exit_height should be a single array field, say, height: [1, 5], [2, 3], [4].
That way I'd set up a multikey index. But I'm not sure how to know whether a value comes from the 1st height entry or the second.
Regarding the query, I think the only option is the aggregation framework.
What do you say?
This seems to do the trick:
var r = [
{_id:0, "entry_height":1, "exit_height":5, "entry_data":"entry_DAT1", "exit_data":"exit_DAT1"},
{_id:1, "entry_height":2, "exit_height":3, "entry_data":"entry_DAT2", "exit_data":"exit_DAT2"},
{_id:2, "entry_height":4, "entry_data":"entry_DAT3"}
];
db.foo.insert(r);
c = db.foo.aggregate([
// The Juice! "Array-ify" the doc and assign height and payload to common field names (h and d):
{$project: {x:[ {h:"$entry_height",d:"$entry_data",t:"ENTRY"}, {h:"$exit_height",d:"$exit_data",t:"EXIT"} ] }}
// The unwind creates 2 docs (in this case) for each input item
,{$unwind: "$x"}
// Toss out those items with no exit height (like _id = 2 above):
,{$match: {"x.h": {$exists: true} }}
// Finally: The sort you seek:
,{$sort: {"x.h":1}}
]);
{ "_id" : 0, "x" : { "h" : 1, "d" : "entry_DAT1", "t" : "ENTRY" } }
{ "_id" : 1, "x" : { "h" : 2, "d" : "entry_DAT2", "t" : "ENTRY" } }
{ "_id" : 1, "x" : { "h" : 3, "d" : "exit_DAT2", "t" : "EXIT" } }
{ "_id" : 2, "x" : { "h" : 4, "d" : "entry_DAT3", "t" : "ENTRY" } }
{ "_id" : 0, "x" : { "h" : 5, "d" : "exit_DAT1", "t" : "EXIT" } }
The heights are in order and you can use the t field to figure out if the d field is entry or exit data. If you have other ways of sniffing into the data then perhaps you do not need the t field.
With respect to indexing, I'm not sure what you would index to reduce the lookup space at the top of the aggregation pipeline.
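For readers who want to trace the pipeline stage by stage, here is a rough plain-JavaScript model of it (no MongoDB involved). The helper name toRows is hypothetical, and the sketch assumes that both null and a missing exit_height mean "no exit yet".

```javascript
// Plain-JS model of the $project / $unwind / $match / $sort pipeline above.
// toRows is a hypothetical helper name; no MongoDB API is used here.
function toRows(docs) {
  var rows = [];
  docs.forEach(function (doc) {
    // "$project" + "$unwind": one candidate row per height field.
    var candidates = [
      { _id: doc._id, h: doc.entry_height, d: doc.entry_data, t: "ENTRY" },
      { _id: doc._id, h: doc.exit_height, d: doc.exit_data, t: "EXIT" }
    ];
    candidates.forEach(function (row) {
      // "$match": drop rows without a height.
      // Assumption: null and undefined both count as "no height".
      if (row.h !== null && row.h !== undefined) {
        rows.push(row);
      }
    });
  });
  // "$sort": ascending by height.
  rows.sort(function (x, y) { return x.h - y.h; });
  return rows;
}

var sample = [
  { _id: 0, entry_height: 1, exit_height: 5, entry_data: "entry_DAT1", exit_data: "exit_DAT1" },
  { _id: 1, entry_height: 2, exit_height: 3, entry_data: "entry_DAT2", exit_data: "exit_DAT2" },
  { _id: 2, entry_height: 4, exit_height: null, entry_data: "entry_DAT3" }
];

toRows(sample).forEach(function (row) {
  console.log(row.h, row.t, row.d);
});
```

One thing to watch in the real pipeline: {$exists: true} also matches a field that is present but null. If documents store exit_height: null rather than omitting the field, a filter such as {"x.h": {$ne: null}} may be the safer choice.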
Let's say I have the following two documents:
{ "_id" : ObjectId("5d1faa57a370cc52f0614313"), "bucket" : [ [2, 3], [ 111, 111 ]]}
{ "_id" : ObjectId("6d1faa57a370cc52f0614311"), "bucket" : [ [2, 3], [ 999, 999 ]]}
Both documents share the [2,3] value on the bucket field.
How can I retrieve such documents from a mongodb collection? Is this even possible?
Sorry if the question is stupid, I'm a mongo newbie.
That's possible. You just need to pass the matching argument inside find().
db.getCollection('test').find({"bucket" :[2,3]})
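Why this works: when the query value is itself an array, a document matches if the field equals that array exactly or if one of the field's elements equals it. Below is a small plain-JavaScript model of that rule (not MongoDB code); sameArray and matchesArrayQuery are hypothetical helper names, and the comparison only handles flat arrays of numbers.

```javascript
// Plain-JS model of find({bucket: [2, 3]}) on an array-of-arrays field.
// sameArray and matchesArrayQuery are hypothetical helper names.
function sameArray(x, y) {
  // Order-sensitive comparison; enough for flat arrays of numbers.
  return Array.isArray(x) && Array.isArray(y) &&
    x.length === y.length &&
    x.every(function (value, i) { return value === y[i]; });
}

function matchesArrayQuery(fieldValue, queryArray) {
  // Match when the whole field equals the query array, or one element does.
  return sameArray(fieldValue, queryArray) ||
    fieldValue.some(function (element) { return sameArray(element, queryArray); });
}

var buckets = [
  { _id: "first", bucket: [[2, 3], [111, 111]] },
  { _id: "second", bucket: [[2, 3], [999, 999]] },
  { _id: "third", bucket: [[3, 2], [5, 5]] }
];

var bucketHits = buckets
  .filter(function (doc) { return matchesArrayQuery(doc.bucket, [2, 3]); })
  .map(function (doc) { return doc._id; });

console.log(bucketHits); // [ 'first', 'second' ]
```

Note that element order matters: [3, 2] is not treated as equal to [2, 3].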
We have a very big collection, with several indexes.
Since an index is basically a table with the indexed field as key, and a list of ObjectIDs as value, we were wondering if we could somehow get the key that has the highest number of Objects it points to.
For example, if we have a collection:
{ _id: 1, a : 1, b : 1 },
{ _id: 2, a : 2, b : 2 },
{ _id: 3, a : 2, b : 3 },
{ _id: 4, a : 2, b : 4 },
{ _id: 5, a : 3, b : 4 },
{ _id: 6, a : 3, b : 4 },
{ _id: 7, a : 4, b : 4 }
Where there's an index of "a".
I assume there is a table somewhere that looks like this:
index a:
"1" => [ 1 ],
"2" => [ 2, 3, 4 ],
"3" => [ 5, 6 ],
"4" => [ 7 ]
In which case we'd like to somehow query for the index value with the longest list of objects - "2".
Is something like that possible in MongoDB?
The answer is no.
Actually, indexes don't hold a list of document ObjectIds, just pointers to the data. To get a document's ObjectId, the system must go to disk to read it. That's why, if you have an index that can answer your query and you don't need _id in the result, you should always project {_id:0, key1:1, key2:1, keyX:1} so that the result can be returned from the index alone, with no need to go to disk.
Recently I wanted to filter out records that contain a certain keyword array in MongoDB, for example: I have five records that contain keywords array:
{a:[1,2]}
{a:[1,3,8]}
{a:[1,2,5]}
{a:[3,5,1]}
{a:[4,5]}
If I input the array [1,2,3,5] for search, then I want to get:
{a:[1,2]}
{a:[1,2,5]}
{a:[3,5,1]}
Each of them is a sub array of [1,2,3,5].
Any idea?
Please don't use a where clause (when possible). Thanks!
It's simple to do in MongoDB, but the harder part is preparing the data for the query. Let me explain that in order.
Simple part
You can use $in to find the matching elements in an array. Let us try
db.coll.find({a:{$in:[1,2,3,5]}})
and the result is
{ "_id" : ObjectId("4f37c41739ed13aa728e9efb"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f37c42439ed13aa728e9efc"), "a" : [ 1, 3, 8 ] }
{ "_id" : ObjectId("4f37c42c39ed13aa728e9efd"), "a" : [ 1, 2, 5 ] }
{ "_id" : ObjectId("4f37c43439ed13aa728e9efe"), "a" : [ 3, 5, 1 ] }
{ "_id" : ObjectId("4f37c43e39ed13aa728e9eff"), "a" : [ 4, 5 ] }
Oh, that's not the result we expected. That's because $in returns an item if any matching element is found (not necessarily all of them).
We can fix this by passing exact arrays to $in. For example, to find the items matching exactly the arrays {a:[1,2]}, {a:[1,2,5]} and {a:[4,5,6]}:
db.coll.find({a:{$in:[[1,2],[1,2,5],[4,5,6]]}})
you will get
{ "_id" : ObjectId("4f37c41739ed13aa728e9efb"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f37c42c39ed13aa728e9efd"), "a" : [ 1, 2, 5 ] }
That's all.
Hardest part
The really hard part is forming all the possible combinations of your input array [1,2,3,5]. You need a way to generate every combination of the source array (on your client) and pass the result to $in.
For example, this JS function gives you all the combinations of the given array:
var combine = function(a) {
    var fn = function(n, src, got, all) {
        if (n == 0) {
            if (got.length > 0) {
                all[all.length] = got;
            }
            return;
        }
        for (var j = 0; j < src.length; j++) {
            fn(n - 1, src.slice(j + 1), got.concat([src[j]]), all);
        }
        return;
    }
    var all = [];
    for (var i = 0; i < a.length; i++) {
        fn(i, a, [], all);
    }
    all.push(a);
    return all;
}
>> arr= combine([1,2,3,5])
will give you
[
    [1], [2], [3], [5],
    [1, 2], [1, 3], [1, 5], [2, 3], [2, 5], [3, 5],
    [1, 2, 3], [1, 2, 5], [1, 3, 5], [2, 3, 5],
    [1, 2, 3, 5]
]
and you can pass this arr to $in to find all the matching elements
db.coll.find({a:{$in:arr}})
will give you
{ "_id" : ObjectId("4f37c41739ed13aa728e9efb"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f37c42c39ed13aa728e9efd"), "a" : [ 1, 2, 5 ] }
Wait! It's still not returning the one remaining item we want.
Take a good look at arr: it contains only combinations. It includes [1,3,5], but the array stored in the document is [3,5,1]. So it's clear that $in compares each array in the given order (weird!).
So now you understand why the hard part is building the MongoDB query! You can change the JS combination code above to generate every permutation of each combination and pass the result to $in. That's the trick.
Since you didn't mention a language, it's hard to recommend specific permutation code, but you can find many different approaches on Stack Overflow or by searching the web.
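Since the rest of this answer is already in JavaScript, here is one possible sketch of that permutation step. It is only an illustration: permutations and orderedSubsets are hypothetical names, the code assumes the input values are distinct, and the number of candidates grows very quickly with the input length (4 values give 64 arrays, 10 values give close to ten million).

```javascript
// Returns every ordering of the items in arr.
function permutations(arr) {
  if (arr.length <= 1) {
    return [arr.slice()];
  }
  var result = [];
  arr.forEach(function (item, i) {
    // Fix one item in front, then permute the remaining items.
    var rest = arr.slice(0, i).concat(arr.slice(i + 1));
    permutations(rest).forEach(function (tail) {
      result.push([item].concat(tail));
    });
  });
  return result;
}

// Returns every non-empty subset of arr, in every order.
function orderedSubsets(arr) {
  var result = [];
  // Each bitmask from 1 to 2^n - 1 selects one non-empty subset.
  for (var mask = 1; mask < (1 << arr.length); mask++) {
    var subset = arr.filter(function (_, i) { return (mask & (1 << i)) !== 0; });
    permutations(subset).forEach(function (p) { result.push(p); });
  }
  return result;
}

var candidates = orderedSubsets([1, 2, 3, 5]);
console.log(candidates.length); // 64
```

The resulting candidates array would then be passed to $in in the same way as arr above, for example db.coll.find({a:{$in:candidates}}).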
If I understood correctly, you want to return only the objects in which every value of property a appears in the array passed to find.
Following Travis's suggestion in the comments, you need these steps:
Define a JS function that does what you want (since there's no native way to do that in MongoDB);
Save the function on the server;
Use the function within $where.
If you define the function to work only on that specific property (a, in this case), you may be able to skip step 2. However, since it can be a useful function for other properties of other documents, I defined a more generic function, which AFAIK must be saved on the server to be used (I'm new to Mongo, too).
Below are my tests in the mongo shell:
// step 1: defining the function for your specific search
only = function(property, values) {
    for (var i in property) {
        if (values.indexOf(property[i]) < 0) return false;
    }
    return true;
}
// step 2: saving it on the server
db.system.js.save( { _id : 'only', value : only } )
// step 3: using the function with $where
db.coll.find({$where: "only(this.a, [1,2,3,5])"})
With the 5 objects you provided on the question, you will obtain:
{ "_id" : ObjectId("4f3838f85594f902212eb532"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f3839075594f902212eb534"), "a" : [ 1, 2, 5 ] }
{ "_id" : ObjectId("4f38390e5594f902212eb535"), "a" : [ 3, 5, 1 ] }
The downside is performance: $where runs JavaScript against each document and cannot use an index.
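A commonly suggested way to get the same result without $where is to combine $not, $elemMatch and $nin, which asks for documents that have no element outside the allowed list. The sketch below only builds that filter as a plain object and pairs it with a plain-JS predicate expressing the intended meaning; subsetFilter and isSubsetOf are hypothetical names, and the filter should be checked against your own data before you rely on it.

```javascript
// Builds {field: {$not: {$elemMatch: {$nin: allowed}}}}:
// "no element of the array is outside the allowed list".
function subsetFilter(field, allowed) {
  var filter = {};
  filter[field] = { $not: { $elemMatch: { $nin: allowed } } };
  return filter;
}

// Plain-JS mirror of the same condition, for checking expectations locally.
function isSubsetOf(values, allowed) {
  return values.every(function (value) {
    return allowed.indexOf(value) !== -1;
  });
}

var stored = [[1, 2], [1, 3, 8], [1, 2, 5], [3, 5, 1], [4, 5]];
var kept = stored.filter(function (a) { return isSubsetOf(a, [1, 2, 3, 5]); });

console.log(JSON.stringify(kept)); // [[1,2],[1,2,5],[3,5,1]]
console.log(JSON.stringify(subsetFilter("a", [1, 2, 3, 5])));
```

In the shell this would be used as db.coll.find(subsetFilter("a", [1,2,3,5])). Be aware that a filter of this shape would also match documents where a is an empty array or is missing entirely, so an extra condition may be needed if those should be excluded.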
The post document looks like this:
{
...
comments: [{
_id:...
body:...
createDate:...
},
...
]
}
How do I get the 10 most recent comments from the collection?
If your comments are always in a predictable order (i.e. newest first, or newest last), then you can use the $slice operator to return just a subset of the full comments field when querying:
test> db.foo.save({name: "hello", comments: [1, 2, 3, 4, 5]})
test> db.foo.find({}, {comments: {$slice: 3}})
{ "_id" : ObjectId("4ec7d1c8e72da9b6f31e2528"), "name" : "hello", "comments" : [ 1, 2, 3 ] }
test> db.foo.find({}, {comments: {$slice: -3}})
{ "_id" : ObjectId("4ec7d1c8e72da9b6f31e2528"), "name" : "hello", "comments" : [ 3, 4, 5 ] }
You can read more about controlling the returned fields at http://www.mongodb.org/display/DOCS/Retrieving+a+Subset+of+Fields
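For intuition, the single-number form of $slice behaves much like JavaScript's own Array.prototype.slice. The following plain-JavaScript sketch (not MongoDB code) mirrors the two shell calls above; sliceProjection is a hypothetical helper name.

```javascript
// Plain-JS model of the {comments: {$slice: n}} projection with a single number.
// sliceProjection is a hypothetical helper name.
function sliceProjection(items, n) {
  // Positive n keeps the first n items; negative n keeps the last |n| items.
  return n >= 0 ? items.slice(0, n) : items.slice(n);
}

var comments = [1, 2, 3, 4, 5];

console.log(sliceProjection(comments, 3));  // [ 1, 2, 3 ]
console.log(sliceProjection(comments, -3)); // [ 3, 4, 5 ]
```

So if new comments are always appended to the end of the array, {comments: {$slice: -10}} would return the ten most recent ones.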
There is no way to partially select the items from an embedded document. No matter what, it will return the entire array of documents. You have to do the filtering in your application code. That's the only way.
But I recommend having a separate collection for comments. That way you can use skip and limit on the set.