Query for the indexed key value that has the most references? - mongodb

We have a very big collection, with several indexes.
Since an index is basically a table with the indexed field as key, and a list of ObjectIDs as value, we were wondering if we could somehow get the key that has the highest number of Objects it points to.
For example, if we have a collection:
{ _id: 1, a : 1, b : 1 },
{ _id: 2, a : 2, b : 2 },
{ _id: 3, a : 2, b : 3 },
{ _id: 4, a : 2, b : 4 },
{ _id: 5, a : 3, b : 4 },
{ _id: 6, a : 3, b : 4 },
{ _id: 7, a : 4, b : 4 }
where there is an index on "a".
I assume there is a table somewhere that looks like this:
index a:
"1" => [ 1 ],
"2" => [ 2, 3, 4 ],
"3" => [ 5, 6 ],
"4" => [ 7 ]
In which case we'd like to somehow query for the index value with the longest list of objects - "2".
Is something like that possible in MongoDB?

The answer is no.
Actually, indexes don't hold a list of document ObjectIds, just pointers to the data. To get a document's ObjectId, the system must go to disk to read it. That's why, if you have an index that can answer your query and you don't need _id in the result, always remember to project {_id:0, key1:1, key2:1, keyX:1} so the result can be returned from the index without going to disk.
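Although the index itself cannot be queried this way, the same answer can be computed from the documents. A minimal sketch, assuming the sample collection from the question: the pipeline array uses the standard $group, $sort and $limit stages and could be passed to aggregate() in the shell, while mostFrequentValue is a hypothetical plain-JavaScript helper that mirrors it in memory so the logic can be checked without a server.

```javascript
// In-memory copy of the sample documents from the question (illustration only).
const docs = [
  { _id: 1, a: 1, b: 1 },
  { _id: 2, a: 2, b: 2 },
  { _id: 3, a: 2, b: 3 },
  { _id: 4, a: 2, b: 4 },
  { _id: 5, a: 3, b: 4 },
  { _id: 6, a: 3, b: 4 },
  { _id: 7, a: 4, b: 4 }
];

// Aggregation pipeline that could be passed to db.<collection>.aggregate(pipeline).
const pipeline = [
  { $group: { _id: "$a", count: { $sum: 1 } } }, // count documents per value of "a"
  { $sort: { count: -1 } },                      // most frequent value first
  { $limit: 1 }                                  // keep only the top value
];

// Hypothetical helper: the same computation in plain JavaScript.
function mostFrequentValue(documents, field) {
  const counts = new Map();
  for (const doc of documents) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  let best = null;
  for (const [value, count] of counts) {
    if (best === null || count > best.count) {
      best = { _id: value, count: count };
    }
  }
  return best;
}

const top = mostFrequentValue(docs, "a");
console.log(top); // { _id: 2, count: 3 }
```

On a very large collection this still has to visit every document or index entry, so it is likely to be slow; it is a way to compute the answer, not a lookup inside the index.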

Related

MongoDB: Retrieving an entire array from a specific document

I have set up some test data in mongoDB that has the following form:
{
"_id" : ObjectId("579ab44c0f9f0dc3aeec42ab"),
"name" : "Bob",
"references" : [ 1, 2, 3, 4, 5, 6 ]
}
{
"_id" : ObjectId("579ab7a20f9f0dc3aeec42ac"),
"name" : "Jeff",
"references" : [ 11, 12, 13, 14, 15 ]
}
I want to be able to return the references array only for Bob. Currently I am able to return the complete Document for Bob with the following query:
db.test_2.find({"name" : "Bob"}, bob).pretty()
Basically the general question is how to return an array for a single document in a collection in MongoDB? If I could get any help for this that would be much appreciated!
You can add a projection document to limit the fields returned.
For example:
db.products.find( { qty: { $gt: 25 } }, { item: 1, qty: 1 } )
Take a look at the documentation:
https://docs.mongodb.com/manual/reference/method/db.collection.find/#db.collection.find
The other option would be to select the field from the given document (if you use it in a loop for example).
In any case, MongoDB will return a JSON document from which you need to extract the array.
Regards
Jony
You can do this...
db.test_2.findOne({ "name": "Bob" }, { references: 1, _id: 1 })
P.S. This is with MongoDB v4.2
db.test_2.find({ "name": "Bob" }, { "references": 1 });

Force list type in $min update operator

I have documents with the following structure:
{
"_id" : 0,
"mins" : {
"ts1" : {
"node1" : [
1,
2,
3
],
"node2" : [
4,
5,
6
]
}
}
}
I'd like to update documents by taking the component-wise minimum for an array. As MongoDB does not support $min on arrays (I think), I'm updating each index individually like so:
db.foo.updateOne(
{"_id" : 0},
{$min: {
"mins.ts3.node1.0": 1,
"mins.ts3.node1.1": 2
}}
)
This works fine but the problem is that if the document does not have the array before updating, MongoDB creates a nested document instead of an array:
{
"_id" : 0,
"mins" : {
"ts1" : {
"node1" : [
1,
2,
3
],
"node2" : [
4,
5,
6
]
},
"ts3" : {
"node1" : {
"0" : 1,
"1" : 2
}
}
}
}
Is there a way to tell MongoDB it is updating a list even if the list does not exist yet?
I'd like to avoid creating empty lists for each document as that would break my current program design.
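One possible workaround, sketched under the assumption that a read-modify-write from the client is acceptable (it is not atomic the way $min is): compute the merged array in application code and write it back with $set, which stores a real array. componentwiseMin is a hypothetical helper, not a MongoDB API.

```javascript
// Hypothetical helper: component-wise minimum of a stored array and new values.
// A missing stored array is treated as empty, so the result is always an array.
function componentwiseMin(stored, incoming) {
  const current = Array.isArray(stored) ? stored : [];
  const length = Math.max(current.length, incoming.length);
  const result = [];
  for (let i = 0; i < length; i++) {
    if (i >= current.length) {
      result.push(incoming[i]);           // only the new value exists
    } else if (i >= incoming.length) {
      result.push(current[i]);            // only the stored value exists
    } else {
      result.push(Math.min(current[i], incoming[i]));
    }
  }
  return result;
}

console.log(componentwiseMin([1, 2, 3], [0, 5])); // [ 0, 2, 3 ]
console.log(componentwiseMin(undefined, [1, 2])); // [ 1, 2 ]
```

The merged array could then be written with something like db.foo.updateOne({"_id" : 0}, {$set: {"mins.ts3.node1": merged}}), where merged is the value returned by the helper.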

MongoDB, how to use document as the smallest unit to search the document in array?

Sorry for the title, but I really do not know how to make it clear. But I can show you.
Here I have inserted two documents:
> db.test.find().pretty()
{
"_id" : ObjectId("557faa461ec825d473b21422"),
"c" : [
{
"a" : 3,
"b" : 7
}
]
}
{
"_id" : ObjectId("557faa4c1ec825d473b21423"),
"c" : [
{
"a" : 1,
"b" : 3
},
{
"a" : 5,
"b" : 9
}
]
}
>
I want to select only the first document by searching with a value, such as 4, that is greater than 'a' and smaller than 'b' of the same array element.
But when I search, I cannot get the result I want:
> db.test.find({'c.a': {$lte: 4}, 'c.b': {$gte: 4}})
{ "_id" : ObjectId("557faa461ec825d473b21422"), "c" : [ { "a" : 3, "b" : 7 } ] }
{ "_id" : ObjectId("557faa4c1ec825d473b21423"), "c" : [ { "a" : 1, "b" : 3 }, { "a" : 5, "b" : 9 } ] }
>
Because 4 is greater than "a" : 1 and smaller than "b" : 9 in the second document, even though those two values are not in the same array element, the second document is selected as well.
But I want only the first one to be selected.
I found this http://docs.mongodb.org/manual/reference/operator/query/elemMatch/#op._S_elemMatch, but it seems the example is not suitable for my situation.
You would want to use $elemMatch:
db.test.findOne({ c: {$elemMatch: {a: {$lte: 4}, b: {$gte: 4} } } })
With your query, you are searching for documents that have an object in the 'c' array with a key 'a' whose value is <= 4, and an object (not necessarily the same one) with a key 'b' whose value is >= 4.
The second record is returned because c[0].a is <= 4, and c[1].b is >= 4.
Since you specified you wanted to select only the first document, you would want to do a findOne() instead of a find().
Use $elemMatch as below:
db.test.find({"c":{"$elemMatch":{"a":{"$lte":4},"b":{"$gte":4}}}})
Or
db.test.find({"c":{"$elemMatch":{"a":{"$lte":4},"b":{"$gte":4}}}},{"c.$":1})
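The difference between the two query shapes can be illustrated in plain JavaScript. This is only an in-memory sketch of the matching rules, with simplified _id values standing in for the ObjectIds above; it does not talk to MongoDB.

```javascript
// Simplified copies of the two documents from the question.
const docs = [
  { _id: 1, c: [{ a: 3, b: 7 }] },
  { _id: 2, c: [{ a: 1, b: 3 }, { a: 5, b: 9 }] }
];
const value = 4;

// Like {'c.a': {$lte: 4}, 'c.b': {$gte: 4}}:
// each condition may be satisfied by a different array element.
const dottedMatches = docs.filter(d =>
  d.c.some(e => e.a <= value) && d.c.some(e => e.b >= value)
);

// Like {c: {$elemMatch: {a: {$lte: 4}, b: {$gte: 4}}}}:
// a single array element must satisfy both conditions.
const elemMatches = docs.filter(d =>
  d.c.some(e => e.a <= value && e.b >= value)
);

console.log(dottedMatches.map(d => d._id)); // [ 1, 2 ]
console.log(elemMatches.map(d => d._id));   // [ 1 ]
```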

Mongo: Array field has no values in given array?

I'm trying to write up a mongo query that finds all entries where the "steps" field has no values within an array argument.
So for example, given two entries with values:
Entry1:
steps: [3, 4]
Entry2:
steps: [3, 5]
The query should return entry1, but not entry 2, for input array [4, 8, 10]. I'm quite new to mongo - any ideas appreciated.
You mean you have some records:
db.foo.find()
{ "_id" : 1, "steps" : [ 3, 4 ] }
{ "_id" : 2, "steps" : [ 3, 5 ] }
Then you would query:
> db.foo.find({steps:{$in:[4,8,10]}})
{ "_id" : 1, "steps" : [ 3, 4 ] }
The $in clause picks records in which any stored element matches any of the terms in the array supplied in the query.
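The question's wording ("has no values within") could also be read as the opposite condition, which is what $nin expresses for an array field. A small in-memory sketch of both rules, using the two records above; the variable names are illustrative and nothing here calls MongoDB.

```javascript
const docs = [
  { _id: 1, steps: [3, 4] },
  { _id: 2, steps: [3, 5] }
];
const input = [4, 8, 10];

// Like {steps: {$in: input}}: at least one stored element is in the input array.
const inMatches = docs.filter(d => d.steps.some(s => input.includes(s)));

// Like {steps: {$nin: input}}: no stored element is in the input array.
const ninMatches = docs.filter(d => !d.steps.some(s => input.includes(s)));

console.log(inMatches.map(d => d._id));  // [ 1 ]
console.log(ninMatches.map(d => d._id)); // [ 2 ]
```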

Mongodb -how to find records that contain certain keywords array

Recently I wanted to filter out records that contain a certain keyword array in MongoDB, for example: I have five records that contain keywords array:
{a:[1,2]}
{a:[1,3,8]}
{a:[1,2,5]}
{a:[3,5,1]}
{a:[4,5]}
If I input the array [1,2,3,5] for search, then I want to get:
{a:[1,2]}
{a:[1,2,5]}
{a:[3,5,1]}
Each of them is a sub array of [1,2,3,5].
Any idea?
Please don't use a where clause (when possible). Thanks!
It's simple to do in MongoDB, but the harder part is preparing the data for the query. Let me explain in order.
Simple part
You can use $in to find the matching elements in an array. Let us try:
db.coll.find({a:{$in:[1,2,3,5]}})
and the result is
{ "_id" : ObjectId("4f37c41739ed13aa728e9efb"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f37c42439ed13aa728e9efc"), "a" : [ 1, 3, 8 ] }
{ "_id" : ObjectId("4f37c42c39ed13aa728e9efd"), "a" : [ 1, 2, 5 ] }
{ "_id" : ObjectId("4f37c43439ed13aa728e9efe"), "a" : [ 3, 5, 1 ] }
{ "_id" : ObjectId("4f37c43e39ed13aa728e9eff"), "a" : [ 4, 5 ] }
Oh, that's not the result we expected. That is because $in returns an item if any matching element is found (not necessarily all of them).
We can fix this by passing the exact arrays to $in. For example, if we want to find the items matching the exact arrays {a:[1,2]}, {a:[1,2,5]} and {a:[4,5,6]}:
db.coll.find({a:{$in:[[1,2],[1,2,5],[4,5,6]]}})
you will get
{ "_id" : ObjectId("4f37c41739ed13aa728e9efb"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f37c42c39ed13aa728e9efd"), "a" : [ 1, 2, 5 ] }
That's all.
Hardest part
The really hard part is forming all the possible combinations of your input array [1,2,3,5]. You need to find a way to generate every combination of the source array (on your client) and pass them to $in.
For example, this JS method will give you all the combinations of the given array:
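var combine = function(a) {
    // Collect into `all` every combination of n elements taken from src, in order.
    var fn = function(n, src, got, all) {
        if (n == 0) {
            if (got.length > 0) {
                all[all.length] = got;
            }
            return;
        }
        for (var j = 0; j < src.length; j++) {
            fn(n - 1, src.slice(j + 1), got.concat([src[j]]), all);
        }
        return;
    }
    var all = [];
    // Sizes 1 .. a.length - 1 (size 0 adds nothing).
    for (var i = 0; i < a.length; i++) {
        fn(i, a, [], all);
    }
    // Finally add the full array itself.
    all.push(a);
    return all;
}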
>> arr= combine([1,2,3,5])
will give you
[
    [1], [2], [3], [5],
    [1, 2], [1, 3], [1, 5], [2, 3], [2, 5], [3, 5],
    [1, 2, 3], [1, 2, 5], [1, 3, 5], [2, 3, 5],
    [1, 2, 3, 5]
]
and you can pass this arr to $in to find all the matching elements
db.coll.find({a:{$in:arr}})
will give you
{ "_id" : ObjectId("4f37c41739ed13aa728e9efb"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f37c42c39ed13aa728e9efd"), "a" : [ 1, 2, 5 ] }
Wait! It's still not returning the remaining expected item, [3,5,1].
Take a good look at arr: it contains only combinations. It includes [1,3,5], but the data in the document is [3,5,1]. So it's clear that $in compares whole arrays in the given order (weird!).
So now you understand that the really hard part is preparing the input for the MongoDB query. You can change the JS combination code above to also generate every permutation of each combination and pass the result to MongoDB's $in. That's the trick.
Since you didn't mention a language, it's hard to recommend specific permutation code, but you can find lots of different approaches on Stack Overflow or by searching the web.
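The permutation step described above can be sketched in JavaScript as follows. This is one possible approach, not the code the answer had in mind; orderedSubsets is a hypothetical helper name, and it assumes the input values are distinct.

```javascript
// Hypothetical helper: every ordering of every non-empty subset of `items`.
// Assumes the values in `items` are distinct.
function orderedSubsets(items) {
  const results = [];
  function extend(prefix, remaining) {
    if (prefix.length > 0) {
      results.push(prefix);
    }
    for (let i = 0; i < remaining.length; i++) {
      // Remove the chosen element and keep building on the new prefix.
      const rest = remaining.slice(0, i).concat(remaining.slice(i + 1));
      extend(prefix.concat([remaining[i]]), rest);
    }
  }
  extend([], items);
  return results;
}

const arr = orderedSubsets([1, 2, 3, 5]);
console.log(arr.length); // 64
```

The resulting arr contains [3, 5, 1] and could be passed to $in as before. The number of arrays grows factorially with the input size (64 for four values), so this is only practical for small inputs.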
If I understood correctly, you want to return only the objects in which every value of property a is in the array passed to find.
Following Travis's suggestion in the comments, you need to follow these steps:
1. Define a JS function that does what you want (since there's no native way to do that in MongoDB);
2. Save the function on the server;
3. Use the function within $where.
If you define your function for that specific property only (a, in this case), you may skip step 2. However, since it can be a useful function for other properties of other documents, I defined a more generic function, which, AFAIK, must be saved on the server to be used (I'm new to Mongo, too).
Below are my tests in the mongo shell:
// step 1: defining the function for your specific search
only = function(property, values) {
    for (var i in property) if (values.indexOf(property[i]) < 0) return false
    return true
}
// step 2: saving it on the server
db.system.js.save( { _id : 'only', value : only } )
// step 3: using the function with $where
db.coll.find({$where: "only(this.a, [1,2,3,5])"})
With the 5 objects you provided on the question, you will obtain:
{ "_id" : ObjectId("4f3838f85594f902212eb532"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f3839075594f902212eb534"), "a" : [ 1, 2, 5 ] }
{ "_id" : ObjectId("4f38390e5594f902212eb535"), "a" : [ 3, 5, 1 ] }
The downside is performance: $where runs JavaScript against every document and cannot use indexes.
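If $where is to be avoided, as the question asks, one commonly used alternative is to combine $not, $elemMatch and $nin, which reads as "no element of a is outside the allowed list". A hedged sketch: the query object below is what would be passed to db.coll.find(), and the filter underneath is only an in-memory illustration of the same rule on the five sample documents.

```javascript
const allowed = [1, 2, 3, 5];

// Query document that could be passed to db.coll.find(query).
const query = { a: { $not: { $elemMatch: { $nin: allowed } } } };

// In-memory copy of the five sample documents from the question.
const docs = [
  { a: [1, 2] },
  { a: [1, 3, 8] },
  { a: [1, 2, 5] },
  { a: [3, 5, 1] },
  { a: [4, 5] }
];

// Plain-JavaScript equivalent: keep documents with no element outside `allowed`.
const matches = docs.filter(d => !d.a.some(x => !allowed.includes(x)));

console.log(matches.map(d => d.a)); // [ [ 1, 2 ], [ 1, 2, 5 ], [ 3, 5, 1 ] ]
```

Note that a query of this shape would also match documents where a is empty or missing, so an extra condition may be needed if those should be excluded.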