I have a MongoDB Collection that contains documents with a nested map, similar to the following document:
{
"_id": "1",
"accounts": {
"account-id-1": { "email": "example1#example.com", ... },
"account-id-2": { "email": "example2#example.com", ... },
}
}
The accounts map contains account IDs as keys and the remaining account data as values/objects. Now I want to add an index for the email field of the nested object, but I can't do that by defining the fields as one would normally do for nested fields, e.g. accounts.account-id-1.email because the mid part (account-id-1) is different for each entry.
I have read about wildcard indexes, but it seems to me that the index expression always ends with the special wildcard symbol $** and never has it in the middle.
My question is whether it's possible to define such an index in the following way or similarly: accounts.$**.email, so that only the email field gets indexed.
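One commonly suggested workaround, which is not part of the question itself, is to store the accounts as an array of subdocuments instead of a map, so that an ordinary index on accounts.email (a multikey index) covers every entry. The sketch below only shows the reshaping step; the helper name accountsMapToArray and the accountId field name are assumptions made for illustration.

```javascript
// Hypothetical helper: turn the "accounts" map into an array of subdocuments,
// moving each map key into an "accountId" field (an assumed field name).
function accountsMapToArray(doc) {
  var accounts = Object.keys(doc.accounts || {}).map(function (accountId) {
    return Object.assign({ accountId: accountId }, doc.accounts[accountId]);
  });
  return Object.assign({}, doc, { accounts: accounts });
}

var original = {
  _id: "1",
  accounts: {
    "account-id-1": { email: "example1#example.com" },
    "account-id-2": { email: "example2#example.com" }
  }
};

var reshaped = accountsMapToArray(original);
// With this shape, an index such as
//   db.collection.createIndex({ "accounts.email": 1 })
// would index the email of every array entry.
```

The trade-off is that looking up an account by its ID becomes an array query rather than a direct key access.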
I understand that indexes have a cost in Firestore. Most of the time we simply store objects without really caring about indexes, even when we don't want most of the fields to be indexed.
If I understand correctly, every field at every level is indexed, e.g. for the following document in pseudo-JSON:
{
"root_field1": "abc" (indexed)
"root_field2": "def" (indexed)
"root_field3": {
"sub_field1": "ghi" (indexed)
"sub_field2": "jkl" (indexed)
"sub_field3": {
"inner_field1": "mno" (indexed)
"inner_field2": "pqr" (indexed)
}
}
}
Let’s assume that I have the following record
{
"name": "abc"
"birthdate": "2000-01-01"
"gender": "m"
}
Let's assume that I just want the field "name" to be indexed. One solution (A) that avoids having to specify every field is to move the other root fields into a sub-level map named unindexed, and to exclude unindexed from being indexed:
{
"name": "abc"
"unindexed": {
"birthdate": "2000-01-01"
"gender": "m"
}
}
Ideally I would like to just specify a prefix such as _ to prevent a field from being indexed, but there is no global solution for that.
{
"name": "abc"
"_birthdate": "2000-01-01"
"_gender": "m"
}
Is my solution (A) correct and is there a more elegant generic solution?
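For illustration, the reshaping behind solution (A) could be sketched as below. The helper name toStoredShape is hypothetical, and the sketch assumes the index exemption on the unindexed field is configured separately; it only shows how a flat record is split before being written.

```javascript
// Hypothetical helper: keep the fields listed in indexedFields at the root
// and move every other field under a single "unindexed" map.
function toStoredShape(record, indexedFields) {
  var stored = { unindexed: {} };
  Object.keys(record).forEach(function (key) {
    if (indexedFields.indexOf(key) !== -1) {
      stored[key] = record[key];
    } else {
      stored.unindexed[key] = record[key];
    }
  });
  return stored;
}

var stored = toStoredShape(
  { name: "abc", birthdate: "2000-01-01", gender: "m" },
  ["name"]
);
// stored.name stays at the root; birthdate and gender end up in stored.unindexed
```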
Thanks!
According to the documentation
https://cloud.google.com/firestore/docs/query-data/indexing
Add a single-field index exemption
Single-field index exemptions allow you to override automatic index
settings for specific fields in a collection. You can add a
single-field exemptions from the console:
Go to the Single Field Indexes section.
Click Add Exemption.
Enter a Collection ID and Field path.
Select new indexing settings for this field. Enable or disable
automatically updated ascending, descending, and array-contains
single-field indexes for this field.
Click Save Exemption.
I'm new to MongoDB.
Entity:
{
"sender": {
"id": <unique key inside type>,
"type": <enum value>,
},
"recipient": {
"id": <unique key inside type>,
"type": <enum value>,
},
...
}
I need to create an efficient search for the query "find entities where the sender or recipient equals a user from a collection", with paging:
foreach member in memberIHaveAccessTo:
condition ||= member == recipient || member == sender
I have read a bit about MongoDB indexes. My problem can probably be solved by storing an additional field "members", an array containing the sender and recipient, and then creating an index on this array.
Is it possible to build such an index with MongoDB?
Is MongoDB a good choice for creating indexes like this?
Some thoughts about the issues raised in the question about querying and the application of indexes on the queried fields.
(i) The $or and two indexes:
I need to create effective search by query "find entities where sender
or recipient equal to user from collection...
Your query is going to be like this:
db.test.find( { $or: [ { "sender.id": "someid" }, { "recipient.id": "someid" } ] } )
With indexes defined on "sender.id" and "recipient.id", two individual indexes, the query with the $or operator will use both the indexes.
From the docs ($or Clauses and Indexes):
When evaluating the clauses in the $or expression, MongoDB either
performs a collection scan or, if all the clauses are supported by
indexes, MongoDB performs index scans.
Running the query with an explain() and examining the query plan shows that indexes are used for both the conditions.
(ii) Index on members array:
Probably my problem can be solved by storing additional field "members"
which will be array contains sender and recipient and then create
index on this array...
With the members array field, the query will be like this:
db.test.find( { members_array: "someid" } )
When an index is defined on the members_array field, the query will use the index; the generated query plan shows the index usage. Note that an index defined on an array field is referred to as a multikey index.
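As a rough sketch of option (ii), the members_array field could be derived when a document is written, and the filter for several accessible members built with $in. Because the question says an id is only unique within its type, the sketch stores a combined "type:id" string; that key format and the helper names are assumptions, not something the answer prescribes.

```javascript
// Assumption: a member is identified by the string "type:id", because the
// question states that ids are only unique within a type.
function memberKey(member) {
  return member.type + ":" + member.id;
}

// Hypothetical helper: add the members_array field before saving an entity.
function withMembersArray(entity) {
  return Object.assign({}, entity, {
    members_array: [memberKey(entity.sender), memberKey(entity.recipient)]
  });
}

// Build a filter matching entities whose sender or recipient is any of the
// given members; $in against an array field can use the multikey index.
function buildMembersFilter(accessibleMembers) {
  return { members_array: { $in: accessibleMembers.map(memberKey) } };
}

var filter = buildMembersFilter([
  { type: "user", id: "42" },
  { type: "group", id: "7" }
]);
// e.g. db.test.find(filter).sort({ _id: 1 }).limit(20) for a first page
```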
I've tried several ways of creating an aggregation pipeline which returns just the matching entries from a document's embedded array, and have not found any practical way to do this.
Is there some MongoDB feature which would avoid my very clumsy and error-prone approach?
A document in the 'workshop' collection looks like this...
{
"_id": ObjectId("57064a294a54b66c1f961aca"),
"type": "normal",
"version": "v1.4.5",
"invitations": [],
"groups": [
{
"_id": ObjectId("57064a294a54b66c1f961acb"),
"role": "facilitator"
},
{
"_id": ObjectId("57064a294a54b66c1f961acc"),
"role": "contributor"
},
{
"_id": ObjectId("57064a294a54b66c1f961acd"),
"role": "broadcaster"
},
{
"_id": ObjectId("57064a294a54b66c1f961acf"),
"role": "facilitator"
}
]
}
Each entry in the groups array provides a unique ID so that a group member is assigned the given role in the workshop when they hit a URL with that salted ID.
Given a _id matching an entry in a groups array like ObjectId("57064a294a54b66c1f961acb"), I need to return a single record like this from the aggregation pipeline - basically returning the matching entry from the embedded groups array only.
{
"_id": ObjectId("57064a294a54b66c1f961acb"),
"role": "facilitator",
"workshopId": ObjectId("57064a294a54b66c1f961aca")
},
In this example, the workshopId has been added as an extra field to identify the parent document, but the rest should be ALL the fields from the original group entry having the matching _id.
The approach I have adopted can just about achieve this but has lots of problems and is probably inefficient (with repetition of the filter clause).
return workshopCollection.aggregate([
    {$match: {groups: {$elemMatch: {_id: groupId}}}},
    {$unwind: "$groups"},
    {$match: {"groups._id": groupId}},
    {$project: {
        _id: "$groups._id",
        role: "$groups.role",
        workshopId: "$_id"
    }}
]).toArray();
Worse, since it explicitly includes named fields from the entry, it will omit any future fields which are added to the records. I also can't generalise this lookup operation to the case of 'invitations' or other embedded named arrays unless I can know what the array entries' fields are in advance.
I have wondered if using the $ or $elemMatch operators within a $project stage of the pipeline is the right approach, but so far they have either been ignored or triggered operator validity errors when running the pipeline.
QUESTION
Is there another aggregation operator or alternative approach which would help me with this fairly mainstream problem - to return only the matching entries from a document's array?
The implementation below can handle arbitrary queries, serves results as a 'top-level document' and avoids duplicate filtering in the pipeline.
function retrieveArrayEntry(collection, arrayName, itemMatch){
    var match = {};
    match[arrayName] = {$elemMatch: itemMatch};
    var project = {};
    project[arrayName + ".$"] = true;
    return collection.findOne(
        match,
        project
    ).then(function(doc){
        if(doc !== null){
            var result = doc[arrayName][0];
            result._docId = doc._id;
            return result;
        }
        else{
            return null;
        }
    });
}
It can be invoked like so...
retrieveArrayEntry(workshopCollection, "groups", {_id:ObjectId("57064a294a54b66c1f961acb")})
However, it relies on the collection findOne(...) method instead of aggregate(...) so will be limited to serving the first matching array entry from the first matching document. Projections referencing an array match clause are apparently not possible through aggregate(...) in the same way they are through findXXX() methods.
A still more general (but confusing and inefficient) implementation allows retrieval of multiple matching documents and subdocuments. It works around the difficulty MongoDB has with syntax consistency of document and subdocument matching through the unpackMatch method, so that an incorrect 'equality' criterion e.g. ...
{greetings:{_id:ObjectId("437908743")}}
...gets transferred into the required syntax for a 'match' criterion (as discussed at Within a mongodb $match, how to test for field MATCHING , rather than field EQUALLING )...
{"greetings._id":ObjectId("437908743")}
Leading to the following implementation...
// Prefix every key of a match object with the array path, turning
// {_id: x} into {"groups._id": x} for use after $unwind.
function unpackMatch(pathPrefix, match){
    var unpacked = {};
    Object.keys(match).forEach(function(key){
        unpacked[pathPrefix + "." + key] = match[key];
    });
    return unpacked;
}
function retrieveArrayEntries(collection, arrayName, itemMatch){
    var matchDocs = {},
        projectItems = {},
        unwindItems = {},
        matchUnwoundByMap = {};
    // Select only documents containing at least one matching array entry.
    matchDocs.$match = {};
    matchDocs.$match[arrayName] = {$elemMatch: itemMatch};
    // Keep just the array (and the implicit _id) of each document.
    projectItems.$project = {};
    projectItems.$project[arrayName] = true;
    unwindItems.$unwind = "$" + arrayName;
    // After unwinding, filter again to drop the non-matching entries.
    matchUnwoundByMap.$match = unpackMatch(arrayName, itemMatch);
    return collection.aggregate([matchDocs, projectItems, unwindItems, matchUnwoundByMap]).toArray().then(function(docs){
        return docs.map(function(doc){
            var result = doc[arrayName];
            result._docId = doc._id;
            return result;
        });
    });
}
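As a possible alternative, the reshaping can be done inside the pipeline, so that every field of the matching entry is returned without being named. The sketch below builds such a pipeline with $replaceRoot and $mergeObjects, which assumes a server version that supports both stages (MongoDB 3.6 or later, as far as I know). The function name buildArrayEntryPipeline is made up for this example, and the filter is still applied twice, as in the code above.

```javascript
// Hypothetical helper: build a pipeline that returns each matching array
// entry as a top-level document, with the parent _id copied into _docId.
function buildArrayEntryPipeline(arrayName, itemMatch) {
  var elemMatch = {};
  elemMatch[arrayName] = { $elemMatch: itemMatch };

  // Same idea as unpackMatch: {_id: x} becomes {"groups._id": x}.
  var unwoundMatch = {};
  Object.keys(itemMatch).forEach(function (key) {
    unwoundMatch[arrayName + "." + key] = itemMatch[key];
  });

  return [
    { $match: elemMatch },
    { $unwind: "$" + arrayName },
    { $match: unwoundMatch },
    // Promote the unwound entry to the root and add the parent document's _id.
    { $replaceRoot: { newRoot: { $mergeObjects: ["$" + arrayName, { _docId: "$_id" }] } } }
  ];
}

var pipeline = buildArrayEntryPipeline("groups", { role: "facilitator" });
// e.g. workshopCollection.aggregate(pipeline).toArray()
```

Because the builder only depends on the array name, the same call should work for 'invitations' or any other embedded array.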
I am working with a set of documents like this:
{
name : "BCC 204",
//etc
}
I have a list of names that I want to map to their DB entries.
For example:
var names = [ "BCC 204", "STEW 101", "SMTH 123" ]
and I want to make a query like this
db.labs.find( { name : { $in: names } } );
But the $in operator does not ensure that each item in the names array matches a result in the db.
(More info, names are unique)
You can't do this in the query. $in will check that a document matches at least one entry in the array given, but it's not going to consider the entire result set. This is a concern you'll need to manage in your application. Given a list of inputs, you will need to retrieve your results then check that given_names - results.map(:name) is empty.
To put it more simply, queries match documents, which compose a result set - they don't match a result set.
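The application-side check described above could look roughly like this. The function name findMissingNames is hypothetical, and docs stands for the array returned by the $in query.

```javascript
// Return the requested names that have no matching document in the results.
function findMissingNames(names, docs) {
  var found = new Set(docs.map(function (doc) { return doc.name; }));
  return names.filter(function (name) { return !found.has(name); });
}

var names = ["BCC 204", "STEW 101", "SMTH 123"];
// docs would come from something like
//   db.labs.find({ name: { $in: names } }).toArray()
var docs = [{ name: "BCC 204" }, { name: "SMTH 123" }];

var missing = findMissingNames(names, docs);
if (missing.length > 0) {
  console.log("No lab found for: " + missing.join(", "));
}
```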
Suppose I have the following collection:
{ "_id" : ObjectId("4f1d8132595bb0e4830d15cc"),
"Data" : "[
{ "id1": "100002997235643", "from": {"name": "Joannah" ,"id": "100002997235643"} , "label" : "test" } ,
{ "id1": "100002997235644", "from": {"name": "Jon" ,"id": "100002997235644"} , "label" : "test1" }
]" ,
"stat" : "true"
}
How can I retrieve id1, name, id, label or any other element?
I am able to get the _id field and Data (the complete array), but not the inner elements in Data.
You cannot query for embedded structures. You always query for top level documents. If you want to query for individual elements from your array you will have to make those elements top level documents (so, put them in their own collection) and maintain an array of _ids in this document.
That said, unless the array becomes very large it's almost always more efficient to simply grab your entire document and find the appropriate element in your app.
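Finding the element in the application after fetching the whole document might look like the following sketch. The helper name findDataEntry is an assumption, and since the sample shows Data wrapped in quotes, the sketch also handles the case where it was stored as a JSON string.

```javascript
// Hypothetical helper: pick one entry out of a fetched document's Data array.
function findDataEntry(doc, id1) {
  // Assumption: Data may have been stored as a JSON string; parse it if so.
  var data = typeof doc.Data === "string" ? JSON.parse(doc.Data) : doc.Data;
  var matches = (data || []).filter(function (entry) {
    return entry.id1 === id1;
  });
  return matches.length > 0 ? matches[0] : null;
}

// fetched stands for the document returned by a normal top-level query.
var fetched = {
  stat: "true",
  Data: [
    { id1: "100002997235643", from: { name: "Joannah", id: "100002997235643" }, label: "test" },
    { id1: "100002997235644", from: { name: "Jon", id: "100002997235644" }, label: "test1" }
  ]
};

var entry = findDataEntry(fetched, "100002997235644");
// entry.from.name, entry.from.id and entry.label are then plain property reads
```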
I don't think you can do that. It is explained here.
If you want to access specific fields, then following MongoDB Documentation,
you could add a flag parameter to your query, but you should redesign your documents for this to be useful:
Field Selection
In addition to the query expression, MongoDB queries can take some additional arguments. For example, it's possible to request only certain fields be returned. If we just wanted the social security numbers of users with the last name of 'Smith,' then from the shell we could issue this query:
// retrieve ssn field for documents where last_name == 'Smith':
db.users.find({last_name: 'Smith'}, {'ssn': 1});
// retrieve all fields *except* the thumbnail field, for all documents:
db.users.find({}, {thumbnail:0});