I am trying to add a search feature on top of MongoDB, and this is the aggregation pipeline I am using:
[
{
'$search': {
'text': {
'query': 'albus',
'path': [
'first_name', 'email', 'last_name'
]
}
}
}, {
'$project': {
'_id': 1,
'first_name': 1,
'last_name': 1
}
}, {
'$limit': 5
}
]
The command returns only documents that contain exactly albus or Albus, and returns nothing for queries like alb, albu, etc. In the demo video I watched here: https://www.youtube.com/watch?time_continue=8&v=kZ77X67GUfk, the instructor was able to search by substring.
The search index I am currently using is the default dynamic one.
How would I need to change my command?
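You need to use the autocomplete operator. It takes a single path rather than an array, so wrap one autocomplete clause per field in a compound query:
{
  $search: {
    'compound': {
      // a document matches if any one of the fields matches the prefix
      'should': [
        { 'autocomplete': { 'query': 'alb', 'path': 'first_name' } },
        { 'autocomplete': { 'query': 'alb', 'path': 'email' } },
        { 'autocomplete': { 'query': 'alb', 'path': 'last_name' } }
      ]
    }
  }
}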
Mind you, first_name, email and last_name all need to be mapped as the autocomplete type in the search index, so that a name like albus is indexed as a, al, alb, albu, albus. This will noticeably increase your index size.
Another thing to consider is tweaking the maxGrams and tokenization parameters. Raising maxGrams lets very long names still work as expected, and switching tokenization from edgeGram to nGram allows substring matches such as lbu matching albus.
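To make the effect of tokenization and maxGrams more concrete, here is a rough sketch. The two helper functions only imitate how edgeGram and nGram tokenization split a term (they are illustrations, not Atlas code), and the index definition at the end is an assumed example whose field names come from the question; the minGrams and maxGrams values are placeholders to tune.

```javascript
// Illustration only: prefixes of a term, as edgeGram tokenization would index them.
function edgeGrams(term, minGrams, maxGrams) {
  const grams = [];
  for (let len = minGrams; len <= Math.min(maxGrams, term.length); len++) {
    grams.push(term.slice(0, len));
  }
  return grams;
}

// Illustration only: every substring between minGrams and maxGrams characters,
// as nGram tokenization would index them.
function nGrams(term, minGrams, maxGrams) {
  const grams = [];
  for (let start = 0; start < term.length; start++) {
    for (let len = minGrams; len <= maxGrams && start + len <= term.length; len++) {
      grams.push(term.slice(start, start + len));
    }
  }
  return grams;
}

// Assumed Atlas Search index definition mapping the three fields as autocomplete.
const autocompleteField = {
  type: 'autocomplete',
  tokenization: 'edgeGram', // switch to 'nGram' for substring matches like "lbu"
  minGrams: 1,              // placeholder value
  maxGrams: 15              // placeholder value; longer terms are cut off here
};

const indexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      first_name: autocompleteField,
      last_name: autocompleteField,
      email: autocompleteField
    }
  }
};
```

With edgeGram only prefixes are indexed, which keeps the index smaller; nGram indexes every substring in the size range, which is what makes lbu findable at the cost of a larger index.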
I am trying to do a text search of the collection called DAFacility in MongoDB Compass:
_id:62170597b3fa8994a0d9a0c8
author:"User"
organizationName:"TSTT"
eventName:"Facility Assessment After Disaster"
eventDate:2022-02-01T00:00:00.000+00:00
area:"Siparia"
disasterNature:"Earthquake"
threatLevel:"High"
surroundingDamage:"Cracked Foundations and Roads"
facilityName:"Point Lisas Main Facility"
facLocation:Array
facStatus:"Operable"
operEqu:23
inoperEqu:7
facilityDamage:"Cracked Walls and Floors"
facImage:Array
__v:0
I am trying to search the facilityDamage field so that searching for a single word from the value (e.g. searching for "Walls") returns the entire document.
I am trying to do this with MongoDB's aggregation option, using this template:
[
{
'$search': {
'index': 'string',
'text': {
'query': 'string',
'path': 'string'
}
}
}
]
I have read the documentation, which left me more confused about what goes in index, query and path.
Can you advise me on what values go into index, query and path?
Whenever I use it within Node it returns an empty array:
exports.DAEquipment_damage_search = (req, res) => {
  // aggregate() expects a flat array of stages, not an array nested in an array
  DAEquipment.aggregate([
    {
      '$match': {
        '$or': [
          { 'facilityDamage': { '$regex': '.*' + req.body.facilityDamage + '.*', '$options': 'i' } }
        ]
      }
    }
  ]).then((results) => {
    res.send(results);
    console.log(results);
  })
  .catch((e) => {
    res.send(e);
  });
};
await DAFacility.aggregate([
{
$match: {
$or: [
{"facilityDamage":{ $regex:'.*' + searchText + '.*',$options: 'i' } },
]
}
},
])
Maybe this will help you.
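As a sketch of both approaches discussed in this thread: the first two helpers build the $regex match with the user input escaped, so characters like ( or * in the search text are matched literally. The third fills in the $search template from the question: index is the name of the Atlas Search index (assumed here to be called default), query is the text to look for, and path is the field to search. The function names are made up for illustration, and $search only works on a collection that has an Atlas Search index.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// $match stage for a case-insensitive substring search on one field.
function buildRegexMatch(field, searchText) {
  return { $match: { [field]: { $regex: escapeRegex(searchText), $options: 'i' } } };
}

// $search stage with the three template values filled in.
function buildSearchStage(searchText) {
  return {
    $search: {
      index: 'default',        // assumed index name; use the name shown in Atlas
      text: {
        query: searchText,     // the word(s) to look for, e.g. 'Walls'
        path: 'facilityDamage' // the document field to search in
      }
    }
  };
}
```

Either stage can then be passed as the only element of the array given to DAFacility.aggregate, for example DAFacility.aggregate([buildRegexMatch('facilityDamage', 'Walls')]).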
When building an autocomplete aggregation pipeline using MongoDB Atlas Search indexes, how do I limit the autocomplete to only search through specific IDs?
I'm building search functionality where a user can search for people and the application should autocomplete the search, but the user should only be shown suggestions for people they are allowed to view.
My pipeline (works but needs to be filtered):
{
'compound': {
'should': [{
'autocomplete': {
'query': "John",
'path': 'firstName'
}
}, {
'autocomplete': {
'query': "Doe",
'path': 'lastName'
}
},
]
}
}
If I have an array of people IDs that the user can view, how do I apply the autocomplete search only to the people with the IDs I supply?
Something like {_id: {$in: myIdList}}
You can use the equals operator to match on an ObjectID. Specifically, add a must clause to your current compound query and nest another compound query in it that has one should clause per ObjectID you would like to filter for. Each should clause would be defined as an equals operator for a specific ObjectID. I can provide an example of what it would look like if needed. Also, just a reminder that the field that contains the ObjectIDs needs to be indexed in Atlas Search so you can use the equals operator on it.
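A minimal sketch of what that nested query could look like, assuming the allowed IDs are matched against _id and that _id is covered by the search index. The function name and parameters are made up for illustration. Note the minimumShouldMatch on the outer query: once a must clause is present, should clauses become optional unless a minimum is set.

```javascript
// Builds the $search body: autocomplete on the name fields, restricted to allowed IDs.
// allowedIds is expected to hold ObjectId values in a real application.
function buildRestrictedAutocomplete(firstName, lastName, allowedIds) {
  return {
    compound: {
      must: [{
        // nested compound: one equals clause per allowed ID, at least one must match
        compound: {
          should: allowedIds.map((id) => ({ equals: { path: '_id', value: id } })),
          minimumShouldMatch: 1
        }
      }],
      should: [
        { autocomplete: { query: firstName, path: 'firstName' } },
        { autocomplete: { query: lastName, path: 'lastName' } }
      ],
      // keep requiring a name match now that a must clause is present
      minimumShouldMatch: 1
    }
  };
}
```

The result would be used as the value of the $search stage, for example { $search: buildRestrictedAutocomplete('John', 'Doe', myIdList) }.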
I would consider adding a filter clause to your compound query like so:
{
  'compound': {
    'filter': {
      'text': {
        'query': userId,
        'path': 'userId'
      }
    },
    'should': [{
      'autocomplete': {
        'query': "John",
        'path': 'firstName'
      }
    }, {
      'autocomplete': {
        'query': "Doe",
        'path': 'lastName'
      }
    }]
  }
}
There's a great example here.
I have a schema with one field named ownerId and an array field named participantsIds. In the frontend, users can select participants. I use these ids to filter documents by querying participantsIds with the $all operator and the list of ids from the frontend. This works perfectly, except that participantsIds in the document doesn't include the ownerId. I thought about using aggregate to add a new field consisting of a list like [participantsIds, ownerId], querying against this new field with $all, and then removing the field again since it isn't needed in the frontend.
What would such a query look like, or is there a better way to achieve this behavior? I'm really lost right now, since I've been trying to implement this with mongo_dart for the last 3 hours.
This is what the schema looks like:
{
_id: ObjectId(),
title: 'Title of the Event',
startDate: '2020-09-09T00:00:00.000',
endDate: '2020-09-09T00:00:00.000',
startHour: 1,
durationHours: 1,
ownerId: '5f57ff55202b0e00065fbd10',
participantsIds: ['5f57ff55202b0e00065fbd14', '5f57ff55202b0e00065fbd15', '5f57ff55202b0e00065fbd13'],
classesIds: [],
categoriesIds: [],
roomsIds: [],
creationTime: '2020-09-10T16:42:14.966',
description: 'Some Desc'
}
Tl;dr I want to query documents with the $all operator on the participantsIds field but the ownerId should be included in this query.
What I want is instead of querying against:
participantsIds: ['5f57ff55202b0e00065fbd14', '5f57ff55202b0e00065fbd15', '5f57ff55202b0e00065fbd13']
I want to query against:
participantsIds: ['5f57ff55202b0e00065fbd14', '5f57ff55202b0e00065fbd15', '5f57ff55202b0e00065fbd13', '5f57ff55202b0e00065fbd10']
Having fun here. By the way, if you run this query frequently it's better to use Joe's answer, or even better, to maintain an "all" field on insertion.
Additional note: use a projection at the start or end of the pipeline to return only the fields you need.
https://mongoplayground.net/p/UP_-IUGenGp
db.collection.aggregate([
{
"$addFields": {
"all": {
$setUnion: [
"$participantsIds",
[
"$ownerId"
]
]
}
}
},
{
$match: {
all: {
$all: [
"5f57ff55202b0e00065fbd14",
"5f57ff55202b0e00065fbd15",
"5f57ff55202b0e00065fbd13",
"5f57ff55202b0e00065fbd10"
]
}
}
}
])
Didn't fully understand what you want to do but maybe this helps:
db.collection.find({
  ownerId: "5f57ff55202b0e00065fbd10",
  participantsIds: {
    $all: ['5f57ff55202b0e00065fbd14',
           '5f57ff55202b0e00065fbd15',
           '5f57ff55202b0e00065fbd13']
  }
})
You could use the pipeline form of update to either add the owner to the participant list or add a new consolidated field:
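// updateMany applies the pipeline to every document; update() would only touch the first match
db.collection.updateMany({}, [{ $set: {
  allParticipantsIds: { $setUnion: [
    "$participantsIds",
    ["$ownerId"]
  ] }
} }])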
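For completeness, here is a different technique from the ones above: instead of adding a combined field, the filter itself can say "each requested id is either a participant or the owner". This is only a sketch under the schema shown in the question, and the function name is made up. It produces a plain find filter, so it should also be expressible in mongo_dart, though that is not shown here.

```javascript
// Builds a filter that requires every id to appear in participantsIds or equal ownerId.
function buildAllIncludingOwner(ids) {
  return {
    $and: ids.map((id) => ({
      $or: [
        { participantsIds: id }, // matches when the array contains id
        { ownerId: id }
      ]
    }))
  };
}

const filter = buildAllIncludingOwner([
  '5f57ff55202b0e00065fbd14',
  '5f57ff55202b0e00065fbd10'
]);
// In the shell: db.collection.find(filter)
```

Because no field is computed per document, this form can use indexes on participantsIds and ownerId where they exist.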
General question about Mongo query performance and the order of query arguments.
We have a collection for storing "files" meta data which includes the file-name, and status of the file (integer code value). There will only be a small number of files in the collection with the same name (maybe a few dozen at most), however there can be thousands of files with the same status.
If there is a Mongo query structured something like this:
db.getCollection('files').find( {
'$and': [
{ 'name': 'someFileName.csv' },
{ 'status': { '$in': [ 12, 6 ] } }
]
})
...would it perform any differently than the same query formatted like this:
db.getCollection('files').find( {
'$and': [
{ 'status': { '$in': [ 12, 6 ] } },
{ 'name': 'someFileName.csv' }
]
})
Which is to say: does the order of the $and clause arguments matter? Would scenario #1 perform better than scenario #2 since theoretically the file-name search would eliminate all but a few records? Or does Mongo operate in that manner under the covers?
No, the order of the fields in a query doesn't matter.
Also, due to query fields being implicitly "anded", these would also be equivalent to:
db.getCollection('files').find( {
'status': { '$in': [ 12, 6 ] },
'name': 'someFileName.csv'
})
and
db.getCollection('files').find( {
'name': 'someFileName.csv',
'status': { '$in': [ 12, 6 ] }
})
They're all treated the same by the query planner when it determines the optimal way to execute the query.
I have a document which looks like this
{'name':'abc',
'location': 'xyz',
'social_links' : { 'facebook' : 'links',
'stackoverflow': 'links',
'quora' : 'links' ... }
}
I want to count the total number of links for each key in social_links across my collection.
Currently my code looks like this:
db.main_candidate.aggregate( [ { '$match': {'social_links.quora': {'$exists': true}}}, {'$group': { '_id' :'quora', 'count': {'$sum':1 }}}])
While this is correctly returning the counts for the specific social_link, I want to write a query which will be able to count for all the social_links in a single query instead of having to write for each specific name.
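I think that on older MongoDB versions there is no way to group the way you want without hardcoding the specific names, so you could try MapReduce. MongoDB 3.4.4 and later can turn the social_links sub-document into an array of key/value pairs with $objectToArray, which removes the need to hardcode the names.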
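On MongoDB 3.4.4 or later, a pipeline along these lines may do the counting without hardcoded names. It is a sketch based on the document shape in the question. The countLinks function is not part of MongoDB; it is a plain JavaScript equivalent added only to show what the pipeline is expected to compute.

```javascript
// Aggregation pipeline: turn social_links into [{k, v}, ...], then count per key.
const pipeline = [
  { $project: { links: { $objectToArray: '$social_links' } } },
  { $unwind: '$links' },
  { $group: { _id: '$links.k', count: { $sum: 1 } } }
];
// In the shell: db.main_candidate.aggregate(pipeline)

// In-memory equivalent of the pipeline, for illustration only.
function countLinks(docs) {
  const counts = {};
  for (const doc of docs) {
    // documents without social_links contribute nothing
    for (const name of Object.keys(doc.social_links || {})) {
      counts[name] = (counts[name] || 0) + 1;
    }
  }
  return counts;
}
```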
You should store social_links as an array instead of as a document, which makes more sense to me. Something like:
{'name':'abc',
'location': 'xyz',
'social_links' : [ { 'name':'facebook', 'link' : 'links'},
{ 'name':'quora', 'link' : 'links'},
{ 'name':'stackoverflow', 'link' : 'links'}]
}
Then you could do the following query:
db.col.aggregate([
  {
    // one output document per array element
    $unwind: "$social_links"
  },
  {
    $group: {
      _id: "$social_links.name",
      count: { $sum: 1 }
    }
  }
])