MongoDB Atlas Search not working as expected with Must and Should?

I am using MongoDB Atlas Search for full-text search. When I write queries with must and should, the must query always runs much faster than the should query, and I don't know why.
As per my understanding of the documentation:
must works as an AND operator
should works as an OR operator
MongoDB schema (sample document):
{
"_id": {
"$oid": "63876f3ad75881cafe41a3e9"
},
"articleid": "b89bfa05-70b3-11ed-b775-2c59e5044e7b",
"headline": "Innovative Lessons for Rest of the World",
"subtitle": "",
"fulltext": "While the world wants to indigenize high-tech, weuses simple, local technologies to solve most of the problems.",
"pubdate": "2022-12-01",
"article_type": "print",
"date": {
"$date": "2022-12-01T00:00:00.000Z"
}
}
Now, whenever I search with must, it is really fast: I get results in 1-2 seconds from 2 million records.
[
{
"$search":{
"index":"fulltext",
"compound":{
"filter":[
{
"range":{
"path":"date",
"gte":"2023-01-30T00:00:00.000Z",
"lte":"2023-02-05T00:00:00.000Z"
}
}
],
"must":[
{
"text":{
"query":"indigenize",
"path":[
"headline",
"fulltext",
"subtitle"
]
}
},
{
"text":{
"query":"technologies",
"path":[
"headline",
"fulltext",
"subtitle"
]
}
}
]
}
}
}
]
But when I use should for the same search, it takes about 10 times longer to return results, and I don't understand why.
[
{
"$search":{
"index":"fulltext",
"compound":{
"filter":[
{
"range":{
"path":"date",
"gte":"2023-01-30T00:00:00.000Z",
"lte":"2023-02-05T00:00:00.000Z"
}
}
],
"should":[
{
"text":{
"query":"indigenize",
"path":[
"headline",
"fulltext",
"subtitle"
]
}
},
{
"text":{
"query":"technologies",
"path":[
"headline",
"fulltext",
"subtitle"
]
}
}
]
}
}
}
]
Am I missing anything? My goal is to search for multiple words with both must (AND) and should (OR). How can I query the collection above efficiently so that results come back within seconds?
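One thing that may be worth checking, offered as a hedged sketch rather than a confirmed diagnosis: in a compound query that also has filter (or must) clauses, should clauses are optional by default, so every document in the date range can match and the should clauses only affect the score. Setting minimumShouldMatch to 1 makes at least one should clause required, which is closer to OR semantics. The helper name buildCompoundSearch and its parameters are hypothetical. The index and field names come from the question. Passing Date objects to range assumes the date field is indexed as a date type.

```javascript
// Hypothetical helper: builds a $search pipeline with AND ("must") or OR ("should") semantics.
function buildCompoundSearch({ mode, terms, from, to }) {
  const paths = ["headline", "fulltext", "subtitle"];
  // One text clause per search term, each looking at the same three fields.
  const clauses = terms.map((term) => ({ text: { query: term, path: paths } }));

  const compound = {
    // Assumption: "date" is mapped as a date type, so range receives Date objects.
    filter: [{ range: { path: "date", gte: from, lte: to } }],
  };

  if (mode === "and") {
    compound.must = clauses; // every term has to match
  } else {
    compound.should = clauses;
    compound.minimumShouldMatch = 1; // at least one term has to match
  }

  return [{ $search: { index: "fulltext", compound } }];
}

const orPipeline = buildCompoundSearch({
  mode: "or",
  terms: ["indigenize", "technologies"],
  from: new Date("2023-01-30T00:00:00.000Z"),
  to: new Date("2023-02-05T00:00:00.000Z"),
});
console.log(JSON.stringify(orPipeline, null, 2));
```

The resulting array can be passed to aggregate() on the collection. Whether this removes the slowdown depends on the data and the index, so it is worth comparing timings with and without minimumShouldMatch.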

Related

MongoDB: Can't update in nested arrays

I've been trying to modify a value inside nested arrays (an array within an array), and I can't find documentation on how to do this.
My collection looks like this:
"rates": [
{
"category": "Web",
"seniorityRates": [
{
"seniority": "junior",
"rate": 100
},
{
"seniority": "intermediate",
"rate": 135
},
{
"seniority": "senior",
"rate": 165
}
]
}
]
I'm just trying to change "junior" to "beginner"; this should be simple.
Thanks to these answers:
How can I update a multi level nested array in MongoDB?
MongoDB updating fields in nested array
I've managed to write this Python code (pymongo), but it doesn't work:
result = my_coll.update_many({},
{
"$set":
{
"rates.$[].seniorityRates.$[j].seniority" : new
}
},
upsert=False,
array_filters= [
{
"j.seniority": old
}
]
)
It fails with this error: The path 'rates' must exist in the document in order to apply array updates.
It corresponds to this shell command, which doesn't work either:
db.projects.updateMany({},
{
$set:
{
"rates.$[].seniorityRates.$[j].seniority" : "debutant"
}
},
{ arrayFilters = [
{
"j.seniority": "junior"
}
]
}
)
clone(t={}){const r=t.loc||{};return e({loc:new Position("line"in r?r.line:this.loc.line,"column"in r?r.column:......)} could not be cloned
What am I doing wrong?
Any help would be much appreciated.

Another option could be the following sample:
db.collection.update({},
{
$set: {
"rates.$[].seniorityRates.$[j].seniority": "debutant"
}
},
{
arrayFilters: [
{
"j.rate": { // As per your data, you can apply the condition on the rate field to modify the level
$lte: 100
}
}
]
})
Or the actual query should work, as in this sample:
db.collection.update({},
{
$set: {
"rates.$[].seniorityRates.$[j].seniority": "debutant"
}
},
{
arrayFilters: [
{
"j.seniority": "junior"
}
]
})
The same should work in Python.

So I was just dumb here: I swapped two parameters, so I didn't have the correct collection in the Python code.
Thanks, Gibbs, for pointing out where the mistake was in the mongo command.
I will not delete this post, as it can help others learn how to write this kind of query.
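For readers who want to see the moving parts together, here is a minimal sketch. The helper names buildSeniorityRename and applyRenameLocally are hypothetical. The first returns the three arguments for updateMany. The second imitates, in plain JavaScript, what the all-positional operator $[] and the filtered positional operator $[j] do to one document. The non-empty filter is an addition, on the assumption that some documents may lack the rates array, which is one way to run into the "path 'rates' must exist" error.

```javascript
// Hypothetical helper: builds the arguments for updateMany().
function buildSeniorityRename(oldValue, newValue) {
  return {
    // Assumption: only touch documents that actually contain the old value.
    filter: { "rates.seniorityRates.seniority": oldValue },
    // $[] visits every element of rates; $[j] visits only elements matching arrayFilters.
    update: { $set: { "rates.$[].seniorityRates.$[j].seniority": newValue } },
    options: { arrayFilters: [{ "j.seniority": oldValue }] },
  };
}

// Plain-JavaScript imitation of the update's effect on a single document.
function applyRenameLocally(doc, oldValue, newValue) {
  return {
    ...doc,
    rates: doc.rates.map((rate) => ({
      ...rate,
      seniorityRates: rate.seniorityRates.map((entry) =>
        entry.seniority === oldValue ? { ...entry, seniority: newValue } : entry
      ),
    })),
  };
}

const sampleProject = {
  rates: [
    {
      category: "Web",
      seniorityRates: [
        { seniority: "junior", rate: 100 },
        { seniority: "intermediate", rate: 135 },
        { seniority: "senior", rate: 165 },
      ],
    },
  ],
};

const spec = buildSeniorityRename("junior", "beginner");
// In mongosh this would be: db.projects.updateMany(spec.filter, spec.update, spec.options)
console.log(JSON.stringify(applyRenameLocally(sampleProject, "junior", "beginner")));
```

Note that the option is written as a key with a colon (arrayFilters: [...]), not as an assignment with an equals sign.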

mongoDB - find a specific result, with a document nested in different ways

I have a small problem with my queries in MongoDB. Here is what one of my documents looks like:
{
"id": "6249c4eb85a7563e673b4360",
"collection": {
"id": "6244099cb9ebc40007ac2ce3"
},
"characters": [
{
"id": "6244074ab9ebc40007ac2cf9"
},
{
"id": "6244074ab9ebc40007ac2ce1"
}
],
}
Then, in the MongoDB shell, I try to retrieve some books by certain criteria. Here is an example:
db.comic.find({
characters: {
$elemMatch:{
_id: {
$in: [
ObjectId("6244074ab9ebc40007ac2ce5")
]
}
}
},
collection: {
$elemMatch:{
_id: {
$in: [
ObjectId("6244099cb9ebc40007ac2cfb")
]
}
}
}
}).pretty()
If I remove the "collection" part, I find the books related to the characters. However, if I put the "collection" part back, I get no results. I think this is because "collection" is not an array, but I don't know how to solve it, given that the goal is to search across several collections, hence the "$elemMatch" and the "$in".
Thanks in advance for your help, have a nice day.

You need to use "collection._id" instead of collection: { $elemMatch: { _id: ... } }, since $elemMatch is for matching an object inside an array.
So your query should look like:
db.comic.find({
characters: {
$elemMatch:{
_id: {
$in: [
ObjectId("6244074ab9ebc40007ac2ce5")
]
}
}
},
"collection._id": {
$in: [ObjectId("6244099cb9ebc40007ac2cfb")]
}
}).pretty()
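Since the stated goal is to search for several characters and several collections at once, the filter can be assembled from two lists. This is only a sketch: buildComicQuery is a hypothetical name, and the ids are expected to be converted already (for example to ObjectId values in the shell). Strings are used below purely so the snippet runs on its own.

```javascript
// Hypothetical helper: builds the find() filter from two lists of ids.
function buildComicQuery(characterIds, collectionIds) {
  return {
    // characters is an array of subdocuments, so $elemMatch applies here.
    characters: { $elemMatch: { _id: { $in: characterIds } } },
    // collection is a single subdocument, so dot notation is used instead.
    "collection._id": { $in: collectionIds },
  };
}

const comicQuery = buildComicQuery(
  ["6244074ab9ebc40007ac2ce5"],
  ["6244099cb9ebc40007ac2cfb", "6244099cb9ebc40007ac2ce3"]
);
// In the shell this would be passed as: db.comic.find(comicQuery)
console.log(JSON.stringify(comicQuery, null, 2));
```

With a single condition on the array elements, "characters._id": { $in: [...] } would match the same documents. $elemMatch matters when several conditions must hold for the same element.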

Atlas Search works too slow when using facet

I have a big collection (over 22M records, approx. 25GB) on an M10 cluster with MongoDB version 4.4.10. I set up an Atlas Search index on one field (address), and it works pretty fast when I query through the search tester. However, when I try to paginate by adding a $facet stage, it gets extremely slow compared with the query without the facet. Is there a way to optimize the facet, or to replace it with something faster? Below are the plain query and the one with the facet:
db.getCollection("users").aggregate([{
$search: {
index: 'address',
text: {
query: '7148 BIG WOODS DR',
path: {
'wildcard': '*'
}
}
}
}]);
db.getCollection("users").aggregate([{
$search: {
index: 'address',
text: {
query: '7148 BIG WOODS DR',
path: {
'wildcard': '*'
}
}
}
}, {
$facet: {
paginatedResult: [
{
$limit: 50
},
{
$skip: 0
}
],
totalCount: [
{
$count: 'total'
}
]
}
}]);

The fast and recommended way is to use facet with the $searchMeta stage, which retrieves only the metadata results for the query:
{
"$searchMeta": {
"index":"search_index_with_facet_fields",
"facet":{
"operator":{
"compound":{
"must":[
{
"text":{
"query":"red shirt",
"path":{
"wildcard":"*"
}
}
},
{
"compound":{
"filter":[
{
"text":{
"query":["clothes"],
"path":"category"
}
},
{
"text":{
"query":[
"maroon",
"blackandred",
"blackred",
"crimson",
"burgandy",
"burgundy"
],
"path":"color"
}
}
]
}
}
]
}
},
"facets":{
"brand":{
"type":"string",
"path":"brand"
},
"size":{
"type":"string",
"path":"size"
},
"color":{
"type":"string",
"path":"color"
}
}
}
}
}
Here we are fetching three facets (brand, size, and color), which need to be defined in your search index as facet fields, such as:
{
"mappings": {
"dynamic": false,
"fields": {
"category": [
{
"type": "string"
}
],
"brand": [
{
"type": "string"
},
{
"type": "stringFacet"
}
],
"size": [
{
"type": "string"
},
{
"type": "stringFacet"
}
],
"color": [
{
"type": "string"
},
{
"type": "stringFacet"
}
]
}
}
}
category is defined only as a string, since we are not using it in facets but only as a filter field.
We can also replace the filter clause with must or should, depending on our requirements.
Finally, we get the facet counts as our result.
P.S. I am also new to Mongo and arrived at this solution after a lot of searching, so please upvote if you find it useful, and let me know if you notice any error or possible improvement. Thanks!
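For the pagination part of the question, the page of results and the total count can be requested as two separate pipelines instead of one $facet stage. The sketch below makes several assumptions: buildPagedSearch is a hypothetical helper, the index name address comes from the question, and the count option of $searchMeta needs a server version that supports it, which may be newer than the 4.4.10 mentioned in the question. It also places $skip before $limit, which is the order needed once the page number is greater than zero.

```javascript
// Hypothetical helper: returns one pipeline for the page and one for the total count.
function buildPagedSearch(queryText, page, pageSize) {
  const operator = { text: { query: queryText, path: { wildcard: "*" } } };
  return {
    resultsPipeline: [
      { $search: { index: "address", ...operator } },
      { $skip: page * pageSize }, // page is zero-based
      { $limit: pageSize },
    ],
    countPipeline: [
      // Assumption: the deployment supports the count option.
      { $searchMeta: { index: "address", ...operator, count: { type: "total" } } },
    ],
  };
}

const paged = buildPagedSearch("7148 BIG WOODS DR", 0, 50);
// In the shell:
//   db.getCollection("users").aggregate(paged.resultsPipeline)
//   db.getCollection("users").aggregate(paged.countPipeline)
console.log(JSON.stringify(paged, null, 2));
```

Deep pages still pay for the skipped documents, so this mainly avoids pushing every match through $facet just to count them.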

MongoDb aggregation project onto collection

I have a problem with a huge MongoDB aggregation pipeline. I have many constraints and have simplified the problem a lot, so please don't question the goal of this query.
I have an aggregation that produces something similar to this:
[
{
"content": {
"processes": [
{
"id": "101a",
"title": "delivery"
},
{
"id": "101b",
"title": "feedback"
}
]
}
}
]
To this intermediate result I'm forced to apply a project operation in order to obtain something similar to this:
[
{
"results":
{
"titles": [
{
"id": "101a",
"value": "delivery"
},
{
"id": "101b",
"value": "feedback"
}
]
}
}
]
But applying these projections:
"results.titles.id": "$content.processes.id",
"results.titles.value": "$content.processes.title"
I obtain this:
[
{
"results":
{
"titles": {
"id": ["101a", "101b"],
"value": ["delivery", "feedback"]
}
}
}
]
The arrays are created, but not in the proper position.
Is it possible to use some operator inside the project operation to tell Mongo to create the array at the parent level?
Something like this:
"results.titles.$[x].value" : "$content.processes.value"

You can use dot notation to project the entire array:
db.col.aggregate([
{
$project: {
"results.titles": "$content.processes"
}
}
])
If you need to rename title to value, then you have to apply the $map operator:
db.col.aggregate([
{
$project: {
"results.titles": {
$map: {
input: "$content.processes",
as: "process",
in: {
id: "$$process.id",
value: "$$process.title"
}
}
}
}
}
])
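As a way to reason about what $map does here, the same reshaping can be written with Array.prototype.map in plain JavaScript. This is only an illustration of the transformation, not code that runs on the server. The function name projectTitles is hypothetical, and the sample data is copied from the question.

```javascript
// Intermediate result copied from the question.
const intermediate = [
  {
    content: {
      processes: [
        { id: "101a", title: "delivery" },
        { id: "101b", title: "feedback" },
      ],
    },
  },
];

// Plain-JavaScript equivalent of the $project + $map stage:
// each process becomes { id, value }, and the array stays an array of objects.
function projectTitles(doc) {
  return {
    results: {
      titles: doc.content.processes.map((process) => ({
        id: process.id,
        value: process.title,
      })),
    },
  };
}

const projected = intermediate.map(projectTitles);
console.log(JSON.stringify(projected, null, 2));
```

The key point is that the renaming happens per element, which is why projecting "results.titles.id" and "results.titles.value" separately produces two parallel arrays instead.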

Mongodb, find entries from name or categories from a list of defined categories

Using MongoDB, I am trying to find the entries where the name matches the user input, restricted to a defined list of categories, OR where a category from that same list matches the user input.
I want to filter the collection to only show these categories: ["major olympian", "twelve titan", "primordial deity"].
The user can search for a name within these categories, or can search by category to display all entries from the listed categories that match the input.
Before the collection was filtered this worked, but now I only want results from the filtered array of categories:
let _query = {
'$or': [
{
"name": {
"$regex": search,
"$options": "i"
}
},
{
"category": {
"$regex": search,
"$options": "i"
}
}
]
};
Here is a concrete example:
If the user types "an", it should return all entries whose name contains "an" from the categories ["major olympian", "twelve titan", "primordial deity"], as well as all the entries from major olympian and twelve titan.
Here is a sample of my collection, including one category (creature) that should never be displayed:
{
"name": "Zeus",
"greekName": "Ζεύς, Zeus",
"romanName": "Jupiter",
"description": "King of the gods, ruler of Mount Olympus, and god of the sky, weather, thunder, lightning, law, order, and justice. He is the youngest son of Cronus and Rhea. He overthrew Cronus and gained the sovereignty of heaven for himself. In art he is depicted as a regal, mature man with a sturdy figure and dark beard. His usual attributes are the royal scepter and the lightning bolt. His sacred animals include the eagle and the bull. His Roman counterpart is Jupiter, also known as Jove.",
"category": "major olympian"
},
{
"name": "Ophiogenean dragon",
"greekName": "",
"romanName": "",
"description": "a dragon that guarded Artemis' sacred grove in Mysia.",
"category": "creature",
},
{
"greekName": "Ἀχλύς (Akhlýs)",
"name": "Achlys",
"description": "The goddess of poisons and the \"Death-Mist\", and personification of misery and sadness. Said to have existed before Chaos itself.",
"category": "primordial deity",
"romanName": ""
}

You can try something like this.
const _query = {
category: {
$in: TEMP_CATEGORIES
},
'$or': [{
"name": {
"$regex": search,
"$options": "i"
}
}, {
"category": {
"$regex": search,
"$options": "i"
}
}]
};

Thanks to @Veeram's comment, I managed to make it work.
const TEMP_CATEGORIES = ["major olympian", "twelve titan", "primordial deity"];
const _query = {
'$or': [
{
$and: [
{
"name": {
"$regex": search,
"$options": "i"
}
},
{
category: { $in: TEMP_CATEGORIES }
}
],
},
{
$and: [
{
"category": {
"$regex": search,
"$options": "i"
}
},
{
category: { $in: TEMP_CATEGORIES }
}
],
},
]
};
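// Assumed wrapper: resolve and reject come from an enclosing Promise.
new Promise((resolve, reject) => {
Greek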
.find(_query)
.sort({ name: 1 })
.exec((err, greeks) => {
if (err) {
console.warn(err);
reject(err)
} else {
resolve(greeks);
}
});
});
I don't know if it is the proper way, but it appears to work. (I guess I should unit test it...)
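On the unit-testing remark, one low-cost approach is to mirror the query logic in a plain JavaScript predicate and check it against sample documents. This is a sketch under assumptions: matchesQuery and escapeRegex are hypothetical names, and escaping the user input is an addition that the original query does not perform. The predicate also shows that both versions of the query reduce to "category is in the list AND (name matches OR category matches)".

```javascript
const ALLOWED_CATEGORIES = ["major olympian", "twelve titan", "primordial deity"];

// Addition not present in the original query: treat user input as literal text.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Mirrors: { category: { $in: [...] }, $or: [{ name: /search/i }, { category: /search/i }] }
function matchesQuery(doc, search) {
  const pattern = new RegExp(escapeRegex(search), "i");
  const inAllowedCategory = ALLOWED_CATEGORIES.includes(doc.category);
  return inAllowedCategory && (pattern.test(doc.name) || pattern.test(doc.category));
}

const sampleGreeks = [
  { name: "Zeus", category: "major olympian" },
  { name: "Ophiogenean dragon", category: "creature" },
  { name: "Achlys", category: "primordial deity" },
];

// Names of the sample entries that a search for "an" would return.
const hits = sampleGreeks.filter((doc) => matchesQuery(doc, "an")).map((doc) => doc.name);
console.log(hits);
```

A predicate like this does not replace a test against a real database, but it makes the intended matching rules explicit and easy to check.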