MongoDB search via index of documents containing JSON - mongodb

Say I have objects in a MongoDB collection:
{
...
"json" : "{\"things\":[2494090781803658355,5114030115038563045,3035856943768375362,8931213615561493991,7574631742057150605,480863244020297489]}"
}
It's an Azure "MongoDB" so doesn't support all the features, but suppose it does.
This search will find that document:
db.coll.find({"json" : {$regex : "5114030115038563045|8931213615561493991"}})
Of course, it's scanning the whole collection to pull these records out. What's an efficient/faster way to find documents where the list of "things" contains any of a list of "things" in a query? It seems like adding a search engine like Solr or ElasticSearch would solve this, and perhaps using another Azure service, Data Lake storage, would make this more searchable, so I'm considering those options. They're outside the scope of this question though; I'd like to know if there's a Mongo-ish way to search this collection by index.

If you're storing a JSON string, the only index-backed option available to you is a text index queried with the $text operator.
If this document structure isn't set in stone, however, you might consider also storing the JSON separately as a nested subdocument (with the appropriate sanitization, of course). This would allow you to construct an index on json.things, while still storing the JSON string, and allow you to perform a query on e.g. "json.things": {$in: [ "5114030115038563045", "8931213615561493991" ]}. Note that the values passed to $in must have the same type as the stored values: strings as shown here, or NumberLong if you store the ids as 64-bit integers.
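A minimal sketch of the second suggestion, assuming the parsed copy lives in a separate, hypothetical field named parsed (the json field already holds the string). The helpers extractThings and withParsedThings are illustrative and not part of any library. The ids are kept as strings because values this large lose precision when JavaScript parses them as numbers.

```javascript
// Hypothetical helper: pull the "things" ids out of the stored JSON string
// as strings. JSON.parse would round these 19-digit values, because they
// exceed Number.MAX_SAFE_INTEGER.
function extractThings(jsonString) {
  const match = /"things"\s*:\s*\[([^\]]*)\]/.exec(jsonString);
  if (!match) return [];
  return match[1]
    .split(",")
    .map((s) => s.trim())
    .filter((s) => /^\d+$/.test(s));
}

// Build the document to store: the original string plus a queryable copy.
function withParsedThings(doc) {
  return { ...doc, parsed: { things: extractThings(doc.json) } };
}

const stored = withParsedThings({
  json: '{"things":[2494090781803658355,5114030115038563045,3035856943768375362]}',
});
console.log(stored.parsed.things);

// In the mongo shell, the index and query would then look like:
//   db.coll.createIndex({ "parsed.things": 1 })
//   db.coll.find({ "parsed.things": { $in: ["5114030115038563045", "8931213615561493991"] } })
```

Because the index is built on an array field, MongoDB creates a multikey index, so the $in query can match any single element of the array.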

Related

What is the best way to structure the database for tag based fetching in cloud database

I am confused about how to structure my documents so that I can efficiently search/fetch items by tag when needed.
The structure of each document looks like this:
{
name: "Delicious blackforest cake",
tags: ["blackforest","birthday","designer"]
...
}
{
name: "Red velvet cake",
tags: ["party","anniversary","designer"]
...
}
...
There are 32 tags in total, and I want to fetch the cakes based on tags. This is my present structure, which I feel would be inefficient for fetching.
I also want to search on both the tags and the name of the cake. For example,
if I search de
the search suggestions should be
designer cake /* This is based on tag */
Delicious blackForest cake /* This is based on actual name */
As far as I know, this is difficult to achieve in Firebase. Should I opt for MongoDB, or should I change the structure of the document?
I would like suggestions on how to search and fetch effectively given the needs stated above.
Firestore can be used for this use case. The array-contains operator can be used to query documents where tags array contains a specific value.
await colRef.where("tags", "array-contains", "tag").get()
If your use case requires finding documents that have several tags at once, you might have to use a map instead of an array. Check out "Firestore search array contains for multiple values".
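The map layout can be sketched as follows. The field name tagMap and the helpers tagsToMap and hasAllTags are assumptions made for illustration; the in-memory filter only mimics what chained equality filters on the map fields would select.

```javascript
// Convert a tags array into a map so each tag becomes its own field.
function tagsToMap(tags) {
  const map = {};
  for (const tag of tags) map[tag] = true;
  return map;
}

// In-memory stand-in for chaining one equality filter per required tag.
function hasAllTags(doc, required) {
  return required.every((tag) => doc.tagMap[tag] === true);
}

const cakes = [
  { name: "Delicious blackforest cake", tags: ["blackforest", "birthday", "designer"] },
  { name: "Red velvet cake", tags: ["party", "anniversary", "designer"] },
].map((cake) => ({ ...cake, tagMap: tagsToMap(cake.tags) }));

const matches = cakes.filter((cake) => hasAllTags(cake, ["party", "designer"]));
console.log(matches.map((cake) => cake.name));

// With Firestore, the equivalent query would chain one filter per tag, e.g.
//   colRef.where("tagMap.party", "==", true).where("tagMap.designer", "==", true)
```

Keeping the original tags array alongside the map means the single-tag array-contains query keeps working.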
MongoDB has a $all operator that can be used for this as shown below:
await collection.find({ tags: { $all: ["tag"] } }).toArray()
For full-text search, you'll get the best results from a dedicated search service, as the documentation also mentions. Although MongoDB has a $search stage (built on Apache Lucene, as far as I know), it is available only when your database is hosted on Atlas; otherwise you'll have to rely on the $text operator.
Firestore Algolia Extension should do most of the work for you and let you use all full text search capabilities of Algolia.
Additionally, if you use Algolia with Firestore, you can get even better support for filtering by tags so you won't have to use a map instead of array as mentioned earlier.

Mongodb: searching embedded documents by the '_id' field

If I have a data with a structure like this as a single document in a collection:
{
_id: ObjectId("firstid"),
"name": "sublimetest",
"child": {
_id: ObjectId("childid"),
"name": "materialtheme"
}
}
Is there a way to search for the embedded document by the id "childid"?
I ask because Mongo doesn't seem to index the _id fields of embedded documents (correct me if I am wrong here),
as this query doesn't work:
db.collection.find({_id:"childid"});
Also, please suggest any other document database that would be suitable for retrieving data structured as a tree, where the requirements are to:
query children without having to issue joins
find any node in the tree as fast as you would find the root node, as if all these nodes were stored as separate documents in a collection.
Why this is not a duplicate of the suggested question(s):
The potential duplicate queries the document using dot notation. But what if the document is nested 7 levels deep? In that case writing the query with dot notation would not be practical. What I want is for every document that has an _id field, whether top-level or nested, to be included in the _id index, so that db.collection.find({_id: "asdf"}) also considers nested documents whose _id matches "asdf". In short, it should behave as if the inner document were not nested but stored alongside the outer one.
You can use the dot notation:
db.collection.find({"child._id": ObjectId("childid")})
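For deeper nesting the dotted path gets long, but it can be generated. A small sketch, assuming every level nests under the same key child as in the example document; nestedIdFilter is a hypothetical helper. Each path queried this way would still need its own index to avoid a collection scan.

```javascript
// Build a filter for an _id nested `depth` levels down under repeated `child` keys.
function nestedIdFilter(depth, id) {
  if (!Number.isInteger(depth) || depth < 1) {
    throw new RangeError("depth must be a positive integer");
  }
  const path = Array(depth).fill("child").concat("_id").join(".");
  return { [path]: id };
}

console.log(nestedIdFilter(1, "childid"));
console.log(nestedIdFilter(3, "someid"));

// In the mongo shell the result is passed straight to find(), e.g.
//   db.collection.find(nestedIdFilter(1, ObjectId("childid")))
```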

Is there a way to implement "find or aggregate" using single operation in MongoDB?

I have a collection of simple documents like:
{tag: "...", data: {...}}
What I want to do is find a document with a given tag and make a projection from its data. But if this document cannot be found, I want to aggregate data from other documents.
I wonder if there is a way to do this with a single operation.

ElasticSearch indexing and references to other documents

I have an ElasticSearch instance indexing a MongoDB database using the river by richardwilly98
There are two types of documents that are indexed:
documents referencing users
documents representing users
When these objects are added to MongoDB, richardwilly98's river generates something like the following:
document = {'user': {"$id": "5159a004c87126641f4f9530"}}
user_document = {'_id':"5159a004c87126641f4f9530",'username':'bob'}
If I perform a search for 'bob', I'd like any documents that reference the bob document to be returned. At the moment this doesn't happen because the username field is not related to the referencing documents in any way.
Is it possible to do this? Does ElasticSearch have object references?
Thanks - let me know if I haven't been clear.
If each document belongs to no more than one user, you can index documents as children of users and then use the has_parent filter to perform the search. However, if a single document can belong to more than one user, you will have to search in two steps: first find the user, then issue another search to find the documents that reference it.
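The two-step approach can be illustrated with plain in-memory data shaped like the documents in the question. This is only a sketch of the lookup logic: the function names, the second user, and the title fields are made up for the example, and in practice each step would be a separate Elasticsearch search request.

```javascript
// Sample data shaped like the question's documents (second user and titles are invented).
const users = [
  { _id: "5159a004c87126641f4f9530", username: "bob" },
  { _id: "5159a004c87126641f4f9531", username: "alice" },
];
const documents = [
  { title: "first", user: { $id: "5159a004c87126641f4f9530" } },
  { title: "second", user: { $id: "5159a004c87126641f4f9531" } },
  { title: "third", user: { $id: "5159a004c87126641f4f9530" } },
];

// Step 1: find the ids of users matching the search term.
function findUserIds(allUsers, username) {
  return allUsers.filter((u) => u.username === username).map((u) => u._id);
}

// Step 2: find documents whose user reference is one of those ids.
function findDocumentsForUsers(allDocuments, userIds) {
  const wanted = new Set(userIds);
  return allDocuments.filter((d) => wanted.has(d.user.$id));
}

const bobIds = findUserIds(users, "bob");
const bobDocs = findDocumentsForUsers(documents, bobIds);
console.log(bobDocs.map((d) => d.title));
```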
Elasticsearch supports a parent field [1]. The MongoDB river supports custom mapping [2], so _parent can now be used.
[1] http://www.elasticsearch.org/guide/reference/mapping/parent-field/
[2] https://github.com/richardwilly98/elasticsearch-river-mongodb/issues/64

MongoDB - forcing stored value to uppercase and searching

In the SQL world I could do something to the effect of:
SELECT name FROM table WHERE UPPER(name) = UPPER('Smith');
and this would match a search for "Smith", "SMITH", "SmiTH", etc... because it forces the query and the value to be the same case.
However, MongoDB doesn't seem to have this capability without using a RegEx, which won't use indexes and would be slow for a large amount of data.
Is there a way to convert a stored value to a particular case before doing a search against it in MongoDB?
I've come across the $toUpper aggregate, but I can't figure out how that would be used in this particular case.
If there's no way to convert stored values before searching, is it possible to have MongoDB convert a value when it's created in Mongo? So when I add a document to the collection, it would force the "name" attribute to a particular case? Something like a callback in the Rails world.
It looks like there's the ability to create stored JS for MongoDB as well, similar to a Stored Procedure. Would that be a feasible solution as well?
Mostly looking for a push in the right direction; I can figure out the particular code once I know what I'm looking for, but so far I'm not even sure if my desired functionality is doable.
You have to normalize your data before storing them. There is no support for performing normalization as part of a query at runtime.
The simplest thing to do is probably to save both a case-normalized (i.e. all-uppercase) and display version of the field you want to search by. Suppose you are storing users and want to do a case-insensitive search on last name. You might store:
{
_id: ObjectId(...),
first_name: "Dan",
last_name: "Crosta",
last_name_upper: "CROSTA"
}
You can then create an index on last_name_upper, and query like:
> db.users.find({last_name_upper: "CROSTA"})
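A minimal sketch of normalizing on write and on query, assuming the application layer does the work before documents reach MongoDB. The helper names are hypothetical; the field names follow the example above.

```javascript
// Add the upper-cased search field before saving a user.
function withNormalizedLastName(user) {
  return { ...user, last_name_upper: user.last_name.toUpperCase() };
}

// Build a filter that normalizes the search term the same way.
function lastNameFilter(term) {
  return { last_name_upper: term.toUpperCase() };
}

const toStore = withNormalizedLastName({ first_name: "Dan", last_name: "Crosta" });
console.log(toStore);
console.log(lastNameFilter("crOSta"));

// In the mongo shell:
//   db.users.createIndex({ last_name_upper: 1 })
//   db.users.insertOne(toStore)
//   db.users.find(lastNameFilter("crOSta"))
```

Routing every write through one helper keeps the display value and the normalized value from drifting apart.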