I am trying to do a find request in MongoDB with the condition:
"if the element contains a list that contains exactly these elements".
It makes more sense with an example:
{
"categories" : [
[
"dogs",
"cats"
],
[
"dogs",
"octopus"
]
]
}
I want to find an element with a category containing only "dogs" and "octopus".
find({ 'categories' : ['dogs','octopus']}) finds the element.
find({ 'categories' : ['octopus','dogs']}) doesn't find it, and that's where my issue is, since I don't care about the order in the list.
The output should be all the elements with a category containing only "dogs" and "octopus".
I am not sure if it's possible, but if it's not, the two solutions I see would be to store them in alphabetical order (good, but what if I need the original order afterwards?) or to store/search all the possible orders (very ugly).
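You can use an aggregation pipeline: $unwind emits one document per inner list, and $match keeps the lists that contain both values in any order. $all on its own would also match a longer list such as ["dogs", "octopus", "cats"], so $size is added to require exactly two elements:
db.collection.aggregate([
{ "$unwind": "$categories" },
{ "$match": { "categories" : { "$all" : [ "dogs", "octopus" ], "$size" : 2 }}}
])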
This gives you the following document
{
"_id" : ObjectId("54c6685e7cdaa3f3e4dd8def"),
"categories" : [ "dogs", "octopus" ]
}
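To make the matching rule concrete, here is a minimal plain-JavaScript sketch of an order-insensitive "exactly these elements" check. It runs locally without a database. The helper name hasExactlyTheseElements is hypothetical, and the sketch assumes the wanted values are distinct. The length check plays the role of MongoDB's $size and the membership check plays the role of $all.

```javascript
// Hypothetical helper (not a MongoDB API): true when `list` holds exactly
// the values in `wanted`, in any order. Assumes `wanted` has no duplicates.
function hasExactlyTheseElements(list, wanted) {
  if (list.length !== wanted.length) return false;       // same role as $size
  return wanted.every((value) => list.includes(value));  // same role as $all
}

// Sample document from the question.
const doc = { categories: [["dogs", "cats"], ["dogs", "octopus"]] };

// Keep the inner lists that match, regardless of the order we ask in.
const matches = doc.categories.filter((category) =>
  hasExactlyTheseElements(category, ["octopus", "dogs"])
);

console.log(matches);
```

Only the second inner list passes both checks; a list with an extra element fails the length check even though it contains both values.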
I have this data model:
{
"_id": ObjectId("5f0a9c07b001406068c073c1"),
"EmailData" : [
{
"Attachments" : {
"Files" : [
{
"Name" : "a.txt"
},
{
"Name" : "b.txt"
},
{
"Name" : "c.txt"
}
]
}
}
]
}
I want to find the documents whose Name elements inside the Files array are exactly the same as a specific array.
Consider that I have this array: {"a.txt", "b.txt", "c.txt"}. I want to write a query that compares what is inside the Files element with this array. In my example the condition is met, but if the array is:
{"a.txt", "b.txt"}
it is not met. I know I have to use multiple $elemMatch, but it does not work. Is there any way to write it without aggregate?
You are describing an exact comparison. Try, for example:
{'EmailData.Attachments.Files': [{Name: 'a.txt'},{Name: 'b.txt'},{Name:'c.txt'}]}
Or to allow matching in any order:
{'EmailData.Attachments.Files': {$all: [{Name: 'a.txt'},{Name: 'b.txt'},{Name:'c.txt'}], $size: 3}}
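If the list of file names comes from user input, the any-order filter can be built programmatically. The following is a minimal sketch; buildExactFilesFilter is a hypothetical helper name, and it assumes the names are distinct and that each Files entry holds only the Name field, since each { Name: ... } entry is compared as a whole subdocument.

```javascript
// Hypothetical helper: builds a filter that matches when the Files array
// contains exactly the given names, in any order ($all plus $size).
function buildExactFilesFilter(names) {
  return {
    "EmailData.Attachments.Files": {
      $all: names.map((name) => ({ Name: name })), // every name must be present
      $size: names.length,                         // and nothing else
    },
  };
}

const filter = buildExactFilesFilter(["a.txt", "b.txt", "c.txt"]);
console.log(JSON.stringify(filter));
```

The resulting object can be passed to find() as the query document.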
I have a Mongo collection "test" which contains elements like the following (the nodes array is a set whose order is meaningful):
"test" : {
"superiorID" : 1,
"nodes" : [
{
"subID" : 2
},
{
"subID" : 1
},
{
"subID" : 3
}
]
}
or
"test" : {
"superiorID" : 4,
"nodes" : [
{
"subID" : 2
},
{
"subID" : 1
},
{
"subID" : 3
}
]
}
I am using Spring Criteria to try to build a Mongo query that returns all elements where a 'subID' equals a user-supplied id 'inputID' AND the 'superiorID' does NOT appear before 'inputID' in the nodes array (the superior id is not required to be among the sub ids at all).
So, for example, if my user input was 3, I would NOT want to pull the first document but I WOULD want to pull the second document (the first has a superior that exists in the nodes BEFORE the user-input node; the second's superior id is not equal to the user input).
I know that the $indexOfArray function exists but I don't know how to translate this to Criteria.
You can get the result you are looking for through the aggregation framework. I've made a pseudo-query to show what you should be looking for. It returns showMe = false for doc1 and showMe = true for doc2, which you could obviously match on. You do not need two $project stages for this query; I only did that to make a working query that is also easy-ish to read. This will not be a very fast query. If you want fast queries, you might want to rethink your data structure.
db.getCollection('test').aggregate([
{ "$project":
{
"superiorIndex": {"$indexOfArray" : [ "$nodes.subID","$superiorID" ]},
"inputIndex": {"$indexOfArray" : [ "$nodes.subID",3 ]},
}
},
{ "$project":
{
"showMe" :
{
$cond:
{
if: { $eq: [ "$superiorIndex", -1 ] },
then: true,
else: {$gt:[ "$superiorIndex","$inputIndex"]}
}
}
}
}
])
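The decision made by the two $project stages can be traced in plain JavaScript. This is only a local sketch of the logic, not a replacement for the pipeline; the function name showMe is borrowed from the projected field, and Array.prototype.indexOf stands in for $indexOfArray (both return -1 when the value is absent).

```javascript
// Local sketch of the pipeline's decision for one document.
function showMe(doc, inputId) {
  const subIds = doc.nodes.map((node) => node.subID);   // like "$nodes.subID"
  const superiorIndex = subIds.indexOf(doc.superiorID); // -1 if not in nodes
  const inputIndex = subIds.indexOf(inputId);
  if (superiorIndex === -1) return true;  // superior is not among the nodes
  return superiorIndex > inputIndex;      // superior must come after the input
}

// The two sample documents from the question.
const nodes = [{ subID: 2 }, { subID: 1 }, { subID: 3 }];
const doc1 = { superiorID: 1, nodes };
const doc2 = { superiorID: 4, nodes };

console.log(showMe(doc1, 3), showMe(doc2, 3)); // prints: false true
```

Note that the sketch, like the pipeline, does not check that the input id is actually present in nodes, so a separate condition on nodes.subID would still be needed for that part of the requirement.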
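db.collection.find({"nodes.2.subID": 2}) looks up the subID of the element at index 2 of the nodes field (the third element, since array indexes are zero-based) and matches documents where it equals 2. The dotted key must be quoted.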
I'm not able to optimize a distinct query using indexes.
My collection looks like this:
{
"_id" : ObjectId("592ed92296232608d00358bd"),
"measurement" : ObjectId("592ed92196232608d0034c23"),
"loc" : {
"coordinates" : [
2.65939299848366,
50.4380671935187
],
"type" : "Point"
},
"elements" : [
ObjectId("592ed92196232608d0034c24"),
ObjectId("592ed92196232608d0034c26"),
ObjectId("592ed92196232608d0034c28")
]
}
I'm trying to execute a query like
db.mycol.distinct('elements', {
$and:[
{
measurement:{
$in:[
ObjectId("592ed92196232608d0034c23"),
ObjectId("592ed92196232608d0034c24")
]
}
},
{
loc:{
$geoWithin:{
$geometry:{
type:'Polygon',
coordinates:[[
[
2.0214843750000004,
50.25071752130677
],
[
2.0214843750000004,
50.65294336725709
],
[
3.0487060546875004,
50.65294336725709
],
[
3.0487060546875004,
50.25071752130677
],
[
2.0214843750000004,
50.25071752130677
]
]]
}
}
}
}
]
})
And I have this index :
{
measurement: 1,
loc: '2dsphere',
elements: 1
}
The query plan (db.mycol.explain().distinct(...)) shows an IXSCAN, but the query is taking ages. I added the index hoping that it could use a Mongo covered query. The doc states that
all the fields in the query are part of an index,
and all the fields returned in the results are in the same index.
So I guessed I needed an index including elements. But according to the query execution time, it's not using it.
What is the best way to index a collection for such a query?
Covered queries don't work with arrays.
From the same page referred in the question:
Restrictions on Indexed Fields
An index cannot cover a query if:
any of the indexed fields in any of the documents in the collection includes an array. If an indexed field is an array, the index becomes a multi-key index and cannot support a covered query.
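As a rough illustration of the quoted restriction, the sketch below checks which indexed fields of a sample document hold an array. It is a local approximation only: indexedArrayFields is a hypothetical helper, ObjectId values are written as plain strings for simplicity, and only top-level fields are inspected.

```javascript
// Hypothetical helper: returns the indexed fields whose value is an array
// in the given document (such a field makes the index multikey).
function indexedArrayFields(doc, indexedFields) {
  return indexedFields.filter((field) => Array.isArray(doc[field]));
}

// Sample document from the question (ObjectIds shown as strings).
const sample = {
  measurement: "592ed92196232608d0034c23",
  loc: { type: "Point", coordinates: [2.65939299848366, 50.4380671935187] },
  elements: [
    "592ed92196232608d0034c24",
    "592ed92196232608d0034c26",
    "592ed92196232608d0034c28",
  ],
};

const blocking = indexedArrayFields(sample, ["measurement", "loc", "elements"]);
console.log(blocking);
```

Here elements is the field that turns the index into a multikey index, which is why adding it to the index does not produce a covered query under the rule quoted above.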
I have a very large collection (more than 800k) and I need to implement a query for autocomplete functionality (based on word beginnings only) over tags. My documents look like this:
{
"_id": "theid",
"somefield": "some value",
"tags": [
{
"name": "abc tag1",
"vote": 5
},
{
"name": "hij tag2",
"vote": 22
},
{
"name": "abc tag3",
"vote": 5
},
{
"name": "hij tag4",
"vote": 77
}
]
}
For example, if my query were for all tags that start with "ab" in documents whose "somefield" is "some value", the result would be "abc tag1", "abc tag3" (only names).
I care about the speed of the queries much more than the speed of the inserts and updates.
I assume that the aggregation framework would be the right way to go here, but what would be the best pipeline and indexes for very fast querying?
The documents are not 'tag' documents; they represent a client object and contain many more data fields that I left out for simplicity. Each client has several tags and another field (I changed its name so it won't be confused with the tags array). I need to get a set, without duplicates, of all tags that a group of clients have.
Your document structure doesn't make sense - I'm assuming tags is an array and not an object. Try queries like this
db.tags.find({ "somefield" : "some value", "tags.name" : /^abc/ })
with an index on { "somefield" : 1, "tags.name" : 1 }. MongoDB optimizes left-anchored regex queries into range queries, which can be fulfilled efficiently using an index (see the $regex docs).
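To see why a left-anchored pattern can be treated as a range, here is a conceptual sketch in plain JavaScript. prefixToRange is a hypothetical helper, not something MongoDB exposes, and it assumes a non-empty prefix whose last character can simply be incremented.

```javascript
// Hypothetical helper: the half-open string range [gte, lt) that covers
// every string starting with `prefix`.
function prefixToRange(prefix) {
  const lastCode = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(lastCode + 1);
  return { gte: prefix, lt: upper };
}

const range = prefixToRange("ab"); // { gte: "ab", lt: "ac" }

// Tag names from the sample document in the question.
const names = ["abc tag1", "hij tag2", "abc tag3", "hij tag4"];
const hits = names.filter((name) => name >= range.gte && name < range.lt);

console.log(range, hits);
```

A range like this can be answered by walking a sorted index between two bounds, whereas an unanchored pattern would have to inspect every key.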
You can get just the tags from this document structure using an aggregation pipeline:
db.tags.aggregate([
{ "$match" : { "somefield" : "some value", "tags.name" : /^abc/ } },
{ "$unwind" : "$tags" },
{ "$match" : { "tags.name" : /^abc/ } },
{ "$project" : { "_id" : 0, "tag_name" : "$tags.name" } }
])
The index only helps the first $match, so the pipeline needs the same index as the query.
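The stages of the pipeline can be followed step by step in a small local sketch. distinctTagNames is a hypothetical function, the second sample document is invented for illustration, and the Set is an addition that reflects the asker's wish for a result without duplicates (in a real pipeline a $group stage on the tag name could play that role).

```javascript
// Local sketch of the pipeline: filter documents, walk their tags,
// keep names with the prefix, and drop duplicates.
function distinctTagNames(docs, somefield, prefix) {
  const names = new Set();
  for (const doc of docs) {
    if (doc.somefield !== somefield) continue;  // document-level $match
    for (const tag of doc.tags) {               // $unwind
      if (tag.name.startsWith(prefix)) {        // tag-level $match
        names.add(tag.name);                    // $project, deduplicated
      }
    }
  }
  return [...names];
}

const docs = [
  {
    _id: "theid",
    somefield: "some value",
    tags: [
      { name: "abc tag1", vote: 5 },
      { name: "hij tag2", vote: 22 },
      { name: "abc tag3", vote: 5 },
      { name: "hij tag4", vote: 77 },
    ],
  },
  // Invented second document to show that duplicates collapse.
  { _id: "otherid", somefield: "some value", tags: [{ name: "abc tag1", vote: 1 }] },
];

const tagNames = distinctTagNames(docs, "some value", "ab");
console.log(tagNames);
```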
Here is an example of a document from the collection I am querying
meteor:PRIMARY> db.research.findOne({_id: 'Z2zzA7dx6unkzKiSn'})
{
"_id" : "Z2zzA7dx6unkzKiSn",
"_userId" : "NtE3ANq2b2PbWSEqu",
"collaborators" : [
{
"userId" : "aTPzFad8DdFXxRrX4"
}
],
"name" : "new one",
"pending" : {
"collaborators" : [ ]
}
}
I want to find all documents within this collection with either _userId: 'aTPzFad8DdFXxRrX4' or from the collaborators array, userId: 'aTPzFad8DdFXxRrX4'
So I want to look through the collection and check if the _userId field is 'aTPzFad8DdFXxRrX4'. If not, then check the collaborators array on the document and see if there is an object with userId: 'aTPzFad8DdFXxRrX4'.
Here is the query I am trying to use:
db.research.find({$or: [{_userId: 'aTPzFad8DdFXxRrX4'}, {collaborators: {$in: [{userId: 'aTPzFad8DdFXxRrX4'}]}}] })
It does not find the document and gives me a syntax error. What is my issue here? Thanks
The $in operator is basically a simplified version of $or but you really only have one argument here so you should not even need it. Use dot notation instead:
db.research.find({
'$or': [
{ '_userId': 'aTPzFad8DdFXxRrX4'},
{ 'collaborators.userId': 'aTPzFad8DdFXxRrX4'}
]
})
If you need more than one value then use $in:
db.research.find({
'$or': [
{ '_userId': 'aTPzFad8DdFXxRrX4'},
{ 'collaborators.userId': {
'$in': ['aTPzFad8DdFXxRrX4','aTPzFad8DdFXxRrX5']
}}
]
})
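What the $or with dot notation tests for each document can be written out in plain JavaScript. This is a local sketch of the matching rule only; matchesUser is a hypothetical helper and it assumes collaborators is always an array.

```javascript
// Local sketch: true when the user owns the document or appears among
// its collaborators (the two branches of the $or above).
function matchesUser(doc, userId) {
  return (
    doc._userId === userId ||                                       // first branch
    doc.collaborators.some((entry) => entry.userId === userId)     // 'collaborators.userId'
  );
}

// Sample document from the question.
const research = {
  _id: "Z2zzA7dx6unkzKiSn",
  _userId: "NtE3ANq2b2PbWSEqu",
  collaborators: [{ userId: "aTPzFad8DdFXxRrX4" }],
  name: "new one",
  pending: { collaborators: [] },
};

console.log(matchesUser(research, "aTPzFad8DdFXxRrX4"));
```

The sample document matches through its collaborators array even though its _userId belongs to someone else.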