Sorting by relevance with MongoDB - mongodb

I have a collection of documents in the following form:
{ _id: ObjectId(...)
, title: "foo"
, tags: ["bar", "baz", "qux"]
}
The query should find all documents with any of these tags. I currently use this query:
{ "tags": { "$in": ["bar", "hello"] } }
And it works; all documents tagged "bar" or "hello" are returned.
However, I want to sort by relevance, i.e. the more matching tags the earlier the document should occur in the result. For example, a document tagged ["bar", "hello", "baz"] should be higher in the results than a document tagged ["bar", "baz", "boo"] for the query ["bar", "hello"]. How can I achieve this?

MapReduce and doing it client-side is going to be too slow - you should use the aggregation framework (new in MongoDB 2.2).
It might look something like this:
db.collection.aggregate([
{ $match : { "tags": { "$in": ["bar", "hello"] } } },
{ $unwind : "$tags" },
{ $match : { "tags": { "$in": ["bar", "hello"] } } },
{ $group : { _id: "$title", numRelTags: { $sum:1 } } },
{ $sort : { numRelTags : -1 } }
// optionally
, { $limit : 10 }
])
Note the first and third pipeline members look identical, this is intentional and needed. Here is what the steps do:
pass on only documents which have tag "bar" or "hello" in them.
unwind the tags array (meaning split into one document per tags element
pass on only tags exactly "bar" or "hello" (i.e. discard the rest of the tags)
group by title (it could be also by "$_id" or any other combination of original document
adding up how many tags (of "bar" and "hello") it had
sort in descending order by number of relevant tags
(optionally) limit the returned set to top 10.

You could potentially use MapReduce for something like that. You'd process each document in the Map step, figuring out how many tags match the query, and assign a score. Then you could sort based on that score.
http://www.mongodb.org/display/DOCS/MapReduce

Something that complex should be done after querying. Either server-side through db.eval (if your client supports this) or just clientside. Here's an example for what you're looking for.
It will retreive all posts with the tags you specified, then sorts them according to the amount of matches.
remove the db.eva( part and translate it to the language your client uses to query to get the clientside effect (
db.eval(function () {
var tags = ["a","b","c"];
return db.posts.find({tags:{$in:tags}}).toArray().sort(function(a,b){
var matches_a = 0;
var matches_b = 0;
a.tags.forEach(function (tag) {
for (t in tags) {
if (tag == t) {
matches_a++;
} else {
matches_b++;
}
}
});
b.tags.forEach(function(tag) {
for (t in tags) {
if (tag == t) {
matches_b++;
} else {
matches_a++;
}
}
});
return matches_a - matches_b;
});
});

Related

mongoose search text in nested object

there is a nested mongodb document that looks like the image below. I want to do a search in the body field in this document, but I could not succeed. how can i do this search. thanks in advance
I tried to search like this. Is my path correct?
const result = await this.campaignModel.find({
"sequencesPageData": {
"$elemMatch": {
"mailSteps": {
"$elemMatch": {
"mailTemplate": x
}
}
}
}
});
I did a search like this but got no results.
const result = await this.campaignModel.find({
"sequencesPageData": {
"$elemMatch": {
"mailSteps": {
"$elemMatch": {
"mailTemplate": x
}
}
}
}
});
For object fields, simply access them using dot notation. For array fields, use $elemMatch to perform the match.
db.collection.find({
"sequencesPageData.mailSteps": {
"$elemMatch": {
"mailTemplate": {
"$elemMatch": {
"subject": "asdf"
}
}
}
}
})
Mongo Playground
If you are only testing a single field in the array, there is no need to use $elemMatch, you can use dot notation:
db.collection.find({ "sequencesPageData.mailSteps.mailTemplate.body": "x" })
$elemMatch is useful to ensure that 2 or more tests are satisfied by the same array element, but since this example uses only 1 test, it works without.

mongodb query: nested elemMatch

Currently, that's my current document:
{
_id: 'd283015f-91e9-4404-9202-093c28d6a931',
referencedGeneralPractitioner: [
{
resourceType: 'practitioner',
cachedIdentifier: [
{
system: { value: 'urn:oid:1.3.6.1.4.1.19126.3' },
value: { value: '14277399B' }
}
]
}
]
}
Here, there's two nested objects arrays: referencedGeneralPractitioner[{cachedIdentifier[{}]}].
Currently, I'm getting results using this query:
{
"referencedGeneralPractitioner":{
"$elemMatch":{
"cachedIdentifier.value.value":"14277399B",
"cachedIdentifier.system.value":"urn:oid:1.3.6.1.4.1.19126.3"
}
}
}
It's getting my desired document, but I don't quite figure out if above query is which I'm really looking for.
I mean, I'm only applying $elemMatch on referencedGeneralPractitioner field array.
Is it really enought?
Should I add a nested $elemMatch on cachedIdentifier?
Any ideas?
It looks like you need to query it like this:
db.collection.find({
"referencedGeneralPractitioner.cachedIdentifier": {
"$elemMatch": {
"value.value": "14277399B",
"system.value": "urn:oid:1.3.6.1.4.1.19126.3"
}
}
})
playground
This is in case you need to find the full document having $and of both values in same element in any of the elements in the nested array , if you need to extract specific element you will need to $filter
if you need to search also based on element in the 1st array level then you need to modify as follow:
{
"referencedGeneralPractitioner": {
"$elemMatch": {
resourceType: 'practitioner',
"cachedIdentifier": {
"$elemMatch": {
"value.value": 1,
"system.value":2
}
}
}
}
}
This will give you all full documents where at same time there is resouceType:"practitioner" and { value.value:3 and system.value: 2 }
Also is important to stress that this will not gona work correctly!:
{
"referencedGeneralPractitioner":{
"$elemMatch":{
"cachedIdentifier.value.value":"14277399B",
"cachedIdentifier.system.value":"urn:oid:1.3.6.1.4.1.19126.3"
}
}
}
Since it will match false positives based on any single value in the nested elements like:
wrong playground

PyMongo query field of documents

In my DynamoDB every document has several fields, one of the fields is a document called "engines" that holds several documents (all the engines) that hold several fields, as the picture shows below:
I would like to get all the couples of (engine,definitions) that their definition date is greater than a specific date.
I tried:
cursor=collection.find(
{'engines': { "$elemMatch" :
{ "definitions" :
{'$gt': startdate} } } }
,{'engines':{'$elemMatch':1}},{'engines':{'$elemMatch':{'definitions':1}}} )
but I get:
TypeError: skip must be an instance of int
Can someone help with the query?
You've mixed up the closing } and ended up passing {'engines':{'$elemMatch':{'definitions':1}}} as a skip argument value.
I think you meant:
cursor = collection.find(
{
'engines': {
"$elemMatch": {
"definitions": {
'$gt': startdate
}
}
}
},
{
'engines': {
'$elemMatch': {
'definitions': 1
}
}
}
)

MongoDB conditionally $addToSet sub-document in array by specific field

Is there a way to conditionally $addToSet based on a specific key field in a subdocument on an array?
Here's an example of what I mean - given the collection produced by the following sample bootstrap;
cls
db.so.remove();
db.so.insert({
"Name": "fruitBowl",
"pfms" : [
{
"n" : "apples"
}
]
});
n defines a unique document key. I only want one entry with the same n value in the array at any one time. So I want to be able to update the pfms array using n so that I end up with just this;
{
"Name": "fruitBowl",
"pfms" : [
{
"n" : "apples",
"mState": 1111234
}
]
}
Here's where I am at the moment;
db.so.update({
"Name": "fruitBowl",
},{
// not allowed to do this of course
// "$pull": {
// "pfms": { n: "apples" },
// },
"$addToSet": {
"pfms": {
"$each": [
{
"n": "apples",
"mState": 1111234
}
]
}
}
}
)
Unfortunately, this adds another array element;
db.so.find().toArray();
[
{
"Name" : "fruitBowl",
"_id" : ObjectId("53ecfef5baca2b1079b0f97c"),
"pfms" : [
{
"n" : "apples"
},
{
"n" : "apples",
"mState" : 1111234
}
]
}
]
I need to effectively upsert the apples document matching on n as the unique identifier and just set mState whether or not an entry already exists. It's a shame I can't do a $pull and $addToSet in the same document (I tried).
What I really need here is dictionary semantics, but that's not an option right now, nor is breaking out the document - can anyone come up with another way?
FWIW - the existing format is a result of language/driver serialization, I didn't choose it exactly.
further
I've gotten a little further in the case where I know the array element already exists I can do this;
db.so.update({
"Name": "fruitBowl",
"pfms.n": "apples",
},{
$set: {
"pfms.$.mState": 1111234,
},
}
)
But of course that only works;
for a single array element
as long as I know it exists
The first limitation isn't a disaster, but if I can't effectively upsert or combine $addToSet with the previous $set (which of course I can't) then it the only workarounds I can think of for now mean two DB round-trips.
The $addToSet operator of course requires that the "whole" document being "added to the set" is in fact unique, so you cannot change "part" of the document or otherwise consider it to be a "partial match".
You stumbled on to your best approach using $pull to remove any element with the "key" field that would result in "duplicates", but of course you cannot modify the same path in different update operators like that.
So the closest thing you will get is issuing separate operations but also doing that with the "Bulk Operations API" which is introduced with MongoDB 2.6. This allows both to be sent to the server at the same time for the closest thing to a "contiguous" operations list you will get:
var bulk = db.so.initializeOrderedBulkOp();
bulk.find({ "Name": "fruitBowl", "pfms.n": "apples": }).updateOne({
"$pull": { "pfms": { "n": "apples" } }
});
bulk.find({ "Name": "fruitBowl" }).updateOne({
"$push": { "pfms": { "n": "apples", "state": 1111234 } }
})
bulk.execute();
That pretty much is your best approach if it is not possible or practical to move the elements to another collection and rely on "upserts" and $set in order to have the same functionality but on a collection rather than array.
I have faced the exact same scenario. I was inserting and removing likes from a post.
What I did is, using mongoose findOneAndUpdate function (which is similar to update or findAndModify function in mongodb).
The key concept is
Insert when the field is not present
Delete when the field is present
The insert is
findOneAndUpdate({ _id: theId, 'likes.userId': { $ne: theUserId }},
{ $push: { likes: { userId: theUserId, createdAt: new Date() }}},
{ 'new': true }, function(err, post) { // do the needful });
The delete is
findOneAndUpdate({ _id: theId, 'likes.userId': theUserId},
{ $pull: { likes: { userId: theUserId }}},
{ 'new': true }, function(err, post) { // do the needful });
This makes the whole operation atomic and there are no duplicates with respect to the userId field.
I hope this helpes. If you have any query, feel free to ask.
As far as I know MongoDB now (from v 4.2) allows to use aggregation pipelines for updates.
More or less elegant way to make it work (according to the question) looks like the following:
db.runCommand({
update: "your-collection-name",
updates: [
{
q: {},
u: {
$set: {
"pfms.$[elem]": {
"n":"apples",
"mState": NumberInt(1111234)
}
}
},
arrayFilters: [
{
"elem.n": {
$eq: "apples"
}
}
],
multi: true
}
]
})
In my scenario, The data need to be init when not existed, and update the field If existed, and the data will not be deleted. If the datas have these states, you might want to try the following method.
// Mongoose, but mostly same as mongodb
// Update the tag to user, If there existed one.
const user = await UserModel.findOneAndUpdate(
{
user: userId,
'tags.name': tag_name,
},
{
$set: {
'tags.$.description': tag_description,
},
}
)
.lean()
.exec();
// Add a default tag to user
if (user == null) {
await UserModel.findOneAndUpdate(
{
user: userId,
},
{
$push: {
tags: new Tag({
name: tag_name,
description: tag_description,
}),
},
}
);
}
This is the most clean and fast method in the scenario.
As a business analyst , I had the same problem and hopefully I have a solution to this after hours of investigation.
// The customer document:
{
"id" : "1212",
"customerCodes" : [
{
"code" : "I"
},
{
"code" : "YK"
}
]
}
// The problem : I want to insert dateField "01.01.2016" to customer documents where customerCodes subdocument has a document with code "YK" but does not have dateField. The final document must be as follows :
{
"id" : "1212",
"customerCodes" : [
{
"code" : "I"
},
{
"code" : "YK" ,
"dateField" : "01.01.2016"
}
]
}
// The solution : the solution code is in three steps :
// PART 1 - Find the customers with customerCodes "YK" but without dateField
// PART 2 - Find the index of the subdocument with "YK" in customerCodes list.
// PART 3 - Insert the value into the document
// Here is the code
// PART 1
var myCursor = db.customers.find({ customerCodes:{$elemMatch:{code:"YK", dateField:{ $exists:false} }}});
// PART 2
myCursor.forEach(function(customer){
if(customer.customerCodes != null )
{
var size = customer.customerCodes.length;
if( size > 0 )
{
var iFoundTheIndexOfSubDocument= -1;
var index = 0;
customer.customerCodes.forEach( function(clazz)
{
if( clazz.code == "YK" && clazz.changeDate == null )
{
iFoundTheIndexOfSubDocument = index;
}
index++;
})
// PART 3
// What happens here is : If i found the indice of the
// "YK" subdocument, I create "updates" document which
// corresponds to the new data to be inserted`
//
if( iFoundTheIndexOfSubDocument != -1 )
{
var toSet = "customerCodes."+ iFoundTheIndexOfSubDocument +".dateField";
var updates = {};
updates[toSet] = "01.01.2016";
db.customers.update({ "id" : customer.id } , { $set: updates });
// This statement is actually interpreted like this :
// db.customers.update({ "id" : "1212" } ,{ $set: customerCodes.0.dateField : "01.01.2016" });
}
}
}
});
Have a nice day !

In mongoDb, how do you remove an array element by its index?

In the following example, assume the document is in the db.people collection.
How to remove the 3rd element of the interests array by it's index?
{
"_id" : ObjectId("4d1cb5de451600000000497a"),
"name" : "dannie",
"interests" : [
"guitar",
"programming",
"gadgets",
"reading"
]
}
This is my current solution:
var interests = db.people.findOne({"name":"dannie"}).interests;
interests.splice(2,1)
db.people.update({"name":"dannie"}, {"$set" : {"interests" : interests}});
Is there a more direct way?
There is no straight way of pulling/removing by array index. In fact, this is an open issue http://jira.mongodb.org/browse/SERVER-1014 , you may vote for it.
The workaround is using $unset and then $pull:
db.lists.update({}, {$unset : {"interests.3" : 1 }})
db.lists.update({}, {$pull : {"interests" : null}})
Update: as mentioned in some of the comments this approach is not atomic and can cause some race conditions if other clients read and/or write between the two operations. If we need the operation to be atomic, we could:
Read the document from the database
Update the document and remove the item in the array
Replace the document in the database. To ensure the document has not changed since we read it, we can use the update if current pattern described in the mongo docs
You can use $pull modifier of update operation for removing a particular element in an array. In case you provided a query will look like this:
db.people.update({"name":"dannie"}, {'$pull': {"interests": "guitar"}})
Also, you may consider using $pullAll for removing all occurrences. More about this on the official documentation page - http://www.mongodb.org/display/DOCS/Updating#Updating-%24pull
This doesn't use index as a criteria for removing an element, but still might help in cases similar to yours. IMO, using indexes for addressing elements inside an array is not very reliable since mongodb isn't consistent on an elements order as fas as I know.
in Mongodb 4.2 you can do this:
db.example.update({}, [
{$set: {field: {
$concatArrays: [
{$slice: ["$field", P]},
{$slice: ["$field", {$add: [1, P]}, {$size: "$field"}]}
]
}}}
]);
P is the index of element you want to remove from array.
If you want to remove from P till end:
db.example.update({}, [
{ $set: { field: { $slice: ["$field", 1] } } },
]);
Starting in Mongo 4.4, the $function aggregation operator allows applying a custom javascript function to implement behaviour not supported by the MongoDB Query Language.
For instance, in order to update an array by removing an element at a given index:
// { "name": "dannie", "interests": ["guitar", "programming", "gadgets", "reading"] }
db.collection.update(
{ "name": "dannie" },
[{ $set:
{ "interests":
{ $function: {
body: function(interests) { interests.splice(2, 1); return interests; },
args: ["$interests"],
lang: "js"
}}
}
}]
)
// { "name": "dannie", "interests": ["guitar", "programming", "reading"] }
$function takes 3 parameters:
body, which is the function to apply, whose parameter is the array to modify. The function here simply consists in using splice to remove 1 element at index 2.
args, which contains the fields from the record that the body function takes as parameter. In our case "$interests".
lang, which is the language in which the body function is written. Only js is currently available.
Rather than using the unset (as in the accepted answer), I solve this by setting the field to a unique value (i.e. not NULL) and then immediately pulling that value. A little safer from an asynch perspective. Here is the code:
var update = {};
var key = "ToBePulled_"+ new Date().toString();
update['feedback.'+index] = key;
Venues.update(venueId, {$set: update});
return Venues.update(venueId, {$pull: {feedback: key}});
Hopefully mongo will address this, perhaps by extending the $position modifier to support $pull as well as $push.
I would recommend using a GUID (I tend to use ObjectID) field, or an auto-incrementing field for each sub-document in the array.
With this GUID it is easy to issue a $pull and be sure that the correct one will be pulled. Same goes for other array operations.
For people who are searching an answer using mongoose with nodejs. This is how I do it.
exports.deletePregunta = function (req, res) {
let codTest = req.params.tCodigo;
let indexPregunta = req.body.pregunta; // the index that come from frontend
let inPregunta = `tPreguntas.0.pregunta.${indexPregunta}`; // my field in my db
let inOpciones = `tPreguntas.0.opciones.${indexPregunta}`; // my other field in my db
let inTipo = `tPreguntas.0.tipo.${indexPregunta}`; // my other field in my db
Test.findOneAndUpdate({ tCodigo: codTest },
{
'$unset': {
[inPregunta]: 1, // put the field with []
[inOpciones]: 1,
[inTipo]: 1
}
}).then(()=>{
Test.findOneAndUpdate({ tCodigo: codTest }, {
'$pull': {
'tPreguntas.0.pregunta': null,
'tPreguntas.0.opciones': null,
'tPreguntas.0.tipo': null
}
}).then(testModificado => {
if (!testModificado) {
res.status(404).send({ accion: 'deletePregunta', message: 'No se ha podido borrar esa pregunta ' });
} else {
res.status(200).send({ accion: 'deletePregunta', message: 'Pregunta borrada correctamente' });
}
})}).catch(err => { res.status(500).send({ accion: 'deletePregunta', message: 'error en la base de datos ' + err }); });
}
I can rewrite this answer if it dont understand very well, but I think is okay.
Hope this help you, I lost a lot of time facing this issue.
It is little bit late but some may find it useful who are using robo3t-
db.getCollection('people').update(
{"name":"dannie"},
{ $pull:
{
interests: "guitar" // you can change value to
}
},
{ multi: true }
);
If you have values something like -
property: [
{
"key" : "key1",
"value" : "value 1"
},
{
"key" : "key2",
"value" : "value 2"
},
{
"key" : "key3",
"value" : "value 3"
}
]
and you want to delete a record where the key is key3 then you can use something -
db.getCollection('people').update(
{"name":"dannie"},
{ $pull:
{
property: { key: "key3"} // you can change value to
}
},
{ multi: true }
);
The same goes for the nested property.
this can be done using $pop operator,
db.getCollection('collection_name').updateOne( {}, {$pop: {"path_to_array_object":1}})