MongoDB :: Order search results depending on the search condition - mongodb

I have the following data:
[{ "name":"BS",
"keyword":"key1",
"city":"xyz"
},
{ "name":"AGS",
"keyword":"Key2",
"city":"xyz1"
},
{ "name":"QQQ",
"keyword":"key3",
"city":"xyz"
},
{ "name":"BS",
"keyword":"Keyword",
"city":"city"
}]
and I need to find the records that have name = "BS" OR keyword = "Key2", using the query
db.collection.find({"$or" : [{"name":"BS"}, {"keyword":"Key2"}]});
I need these records in the following sequence:
[{ "name":"BS",
"keyword":"key1",
"city":"xyz"
},
{ "name":"BS",
"keyword":"Keyword",
"city":"city"
},
{ "name":"AGS",
"keyword":"Key2",
"city":"xyz1"
}]
but I am getting them in the following sequence:
[{ "name":"BS",
"keyword":"key1",
"city":"xyz"
},
{ "name":"AGS",
"keyword":"Key2",
"city":"xyz1"
},
{ "name":"BS",
"keyword":"Keyword",
"city":"city"
}]
Please provide some suggestions; I have been stuck on this problem for 2 days.
Thanks

The order of results returned by MongoDB is not guaranteed unless you explicitly sort your data using the sort function. For smaller datasets you may be "lucky" in the sense that the results are always returned in the same order; for bigger datasets, and in particular on sharded MongoDB clusters, this is very unlikely. As proposed by Yathish, you need to explicitly order your results using the sort function. Based on the desired output, it seems you want to sort by name in descending order, so I have set the sorting flag to -1 for the field name.
db.collection.find({"$or" : [{"name":"BS"}, {"keyword":"Key2"}]}).sort({"name" : -1});
If you need a more complex sorting algorithm, as specified in your comment, you can convert your results to a JavaScript array and apply a custom sort function. This sort function first lists documents with a name equal to "BS" and then documents containing the keyword "Key2":
db.collection.find({
"$or": [{
"name": "BS"
}, {
"keyword": "Key2"
}]
}).toArray().sort(function(doc1, doc2) {
if (doc1.name == "BS" && doc2.keyword == "Key2") {
return -1;
} else if (doc2.name == "BS" && doc1.keyword == "Key2") {
return 1;
} else {
// a comparator must return a number, not a boolean
return doc1.name < doc2.name ? -1 : (doc1.name > doc2.name ? 1 : 0);
}
});
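The custom ordering can also be expressed as a rank per document, which keeps the comparator consistent for every pair. This is only a minimal sketch in plain JavaScript, run against the question's sample data instead of a live cursor; the helper name matchRank is made up for illustration and is not part of MongoDB.

```javascript
// Sample documents as they might come back from the $or query.
var results = [
  { name: "BS", keyword: "key1", city: "xyz" },
  { name: "AGS", keyword: "Key2", city: "xyz1" },
  { name: "BS", keyword: "Keyword", city: "city" }
];

// Hypothetical helper: 0 if the document matched on name,
// 1 if it only matched on keyword, 2 otherwise.
function matchRank(doc) {
  if (doc.name === "BS") { return 0; }
  if (doc.keyword === "Key2") { return 1; }
  return 2;
}

// Sort a copy by rank. Array.prototype.sort is stable in ES2019+ engines,
// so documents with the same rank keep the order they arrived in.
var ordered = results.slice().sort(function (a, b) {
  return matchRank(a) - matchRank(b);
});

console.log(ordered.map(function (d) { return d.name + "/" + d.keyword; }));
```

With a real cursor the array would come from .toArray() as in the answer; the ranking logic stays the same.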

Related

mongodb query to verify embedded array sequence numbers

Given a document structure as shown, where the trades array can have thousands of items, how on earth could one write a query that verifies that each item's 'startTradeId' is always one number higher than the previous item's 'endTradeId', all the way through the array? Is this even possible?
{
"name": "STOCK",
"trades": [{
"endTradeId": 41306,
"startTradeId": 41302,
...
},
{
"endTradeId": 41301,
"startTradeId": 41297,
...
},
{
"endTradeId": 41296,
"startTradeId": 41240,
...
},
...
]
}
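You can use the $where operator to run a JavaScript check over the whole array. The sample stores the newest trades first, so each item's endTradeId is compared with the startTradeId of the item before it in the array. This returns the documents whose sequence is unbroken:
db.your_collection.find( { $where : function(){ return this.trades.every(function(t, i, a){ return i === 0 || a[i - 1].startTradeId === t.endTradeId + 1; }); } });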
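The check itself can be tried outside the database first. The following is a minimal sketch in plain JavaScript, assuming the array is stored newest first as in the sample document; the function name firstGapIndex is hypothetical.

```javascript
// Returns the index of the first item that breaks the sequence, or -1 if
// every startTradeId is exactly one higher than the endTradeId of the
// item that follows it in the array (the chronologically previous trade).
function firstGapIndex(trades) {
  for (var i = 0; i + 1 < trades.length; i++) {
    if (trades[i].startTradeId !== trades[i + 1].endTradeId + 1) {
      return i + 1;
    }
  }
  return -1;
}

// Values taken from the question's sample document.
var trades = [
  { endTradeId: 41306, startTradeId: 41302 },
  { endTradeId: 41301, startTradeId: 41297 },
  { endTradeId: 41296, startTradeId: 41240 }
];

console.log(firstGapIndex(trades));
```

Returning the position of the gap rather than a boolean makes it easier to report which trade is out of sequence.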

mongodb: document with the maximum number of matched targets

I need help to solve the following issue. My collection has a "targets" field.
Each user can have 0 or more targets.
When I run my query I'd like to retrieve the document with the maximum number of matched targets.
Ex:
documents=[{
targets:{
"cluster":"01",
}
},{
targets:{
"cluster":"01",
"env":"DC",
"core":"PO"
}
},{
targets:{
"cluster":"01",
"env":"DC",
"core":"PO",
"platform":"IG"
}
}];
userTarget={
"cluster":"01",
"env":"DC",
"core":"PO"
}
You seem to be asking to return the document where the most conditions were met, and possibly not all conditions. The basic process is an $or query to return the documents that can match either of the conditions. Then you basically need a statement to calculate "how many terms" were met in the document, and return the one that matched the most.
So the combination here is an .aggregate() statement that uses the initial results from $or to calculate a score and then sort the results:
// initial targets object
var userTarget = {
"cluster":"01",
"env":"DC",
"core":"PO"
};
// Convert to $or condition
// and the calculation condition to match
var orCondition = [],
scoreCondition = [];
Object.keys(userTarget).forEach(function(key) {
var query = {},
cond = { "$cond": [{ "$eq": ["$targets." + key, userTarget[key]] },1,0] };
query["targets." + key] = userTarget[key];
orCondition.push(query);
scoreCondition.push(cond);
});
// Run aggregation
Model.aggregate(
[
// Match with condition
{ "$match": { "$or": orCondition } },
// Calculate a "score" based on matched fields
{ "$project": {
"targets": 1,
"score": {
"$add": scoreCondition
}
}},
// Sort on the greatest "score" (descending)
{ "$sort": { "score": -1 } },
// Return the first document
{ "$limit": 1 }
],
function(err,result) {
// check errors
// Remember that result is an array, even if limited to one document
console.log(result[0]);
}
)
So before processing the aggregate statement, we generate the dynamic parts of the pipeline operations based on the input in the userTarget object. The orCondition produces a $match stage like this:
{ "$match": {
"$or": [
{ "targets.cluster" : "01" },
{ "targets.env" : "DC" },
{ "targets.core" : "PO" }
]
}}
And the scoreCondition expands to an expression like this:
"score": {
"$add": [
{ "$cond": [{ "$eq": [ "$targets.cluster", "01" ] },1,0] },
{ "$cond": [{ "$eq": [ "$targets.env", "DC" ] },1,0] },
{ "$cond": [{ "$eq": [ "$targets.core", "PO" ] },1,0] }
]
}
Those are going to be used in the selection of possible documents and then for counting the terms that could match. In particular the "score" is made by evaluating each condition within the $cond ternary operator, and then either attributing a score of 1 where there was a match, or 0 where there was not a match on that field.
If desired, it would be simple to alter the logic to assign a higher "weight" to each field with a different value going towards the score depending on the deemed importance of the match. At any rate, you simply $add these score results together for each field for the overall "score".
Then it is just a simple matter of applying the $sort to the returned "score", and then using $limit to just return the top document.
It's not super efficient, since even though there is a match for all three conditions the basic question you are asking of the data cannot presume that there is, hence it needs to look at all data where "at least one" condition was a match, and then just work out the "best match" from those possible results.
Ideally, I would personally run an additional query "first" to see if all three conditions were met, and if not then look for the other cases. That still is two separate queries, and would be different from simply just pushing the "and" conditions for all fields as the first statement in $or.
So the preferred implementation, I think, should be:
1. Look for a document that matches all given field values; if there is none, then
2. Run the either/or on every field and count the condition matches.
That way, if all fields match, the first query is fastest, and you only need to fall back to the slower but required implementation shown in the listing if there was no actual result.
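The scoring idea, including the optional per-field weights mentioned above, can be sketched as a small builder. This is only an illustration: buildPipeline, localScore and the weights parameter are hypothetical names, the field name targets is taken from the question's documents, and nothing here talks to a server. localScore repeats the same arithmetic in plain JavaScript so the logic can be checked locally.

```javascript
// Build the pipeline stages from a target object.
// `weights` is an optional map of field name -> points (default 1).
function buildPipeline(userTarget, weights) {
  var orCondition = [];
  var scoreCondition = [];
  Object.keys(userTarget).forEach(function (key) {
    var query = {};
    query["targets." + key] = userTarget[key];
    orCondition.push(query);
    var points = (weights && weights[key]) || 1;
    scoreCondition.push({
      "$cond": [{ "$eq": ["$targets." + key, userTarget[key]] }, points, 0]
    });
  });
  return [
    { "$match": { "$or": orCondition } },
    { "$project": { "targets": 1, "score": { "$add": scoreCondition } } },
    { "$sort": { "score": -1 } },
    { "$limit": 1 }
  ];
}

// The same score computed in plain JavaScript for one document.
function localScore(doc, userTarget, weights) {
  return Object.keys(userTarget).reduce(function (sum, key) {
    var points = (weights && weights[key]) || 1;
    return sum + (doc.targets[key] === userTarget[key] ? points : 0);
  }, 0);
}

var userTarget = { cluster: "01", env: "DC", core: "PO" };
var documents = [
  { targets: { cluster: "01" } },
  { targets: { cluster: "01", env: "DC", core: "PO" } },
  { targets: { cluster: "01", env: "DC", core: "PO", platform: "IG" } }
];

console.log(JSON.stringify(buildPipeline(userTarget)[0]));
console.log(documents.map(function (d) { return localScore(d, userTarget); }));
```

Note that the second and third sample documents receive the same score, because the extra platform field is not part of userTarget; if that tie matters, an additional sort key would be needed.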

How can I create an index on an array field in MongoDB?

I have a MongoDB collection with data in the format of:
[
{
"data1":1,
"data2":2,
"data3":3,
"data4":4,
"horses":[
{
"opponent":{
"jockey":"MyFirstName MyLastName",
"name":"MyHorseName",
"age":4,
"sex":"g",
"scratched":"false",
"id":"1"
},
"id":"1"
},
{
"opponent":{
"jockey":"YourFirstName YourLastName",
"name":"YourHorseName",
"age":4,
"sex":"m",
"scratched":"false",
"id":"2"
},
"id":"2"
}
]
},
...
]
Executing the following query returns exactly what I need:
db.race_results.find({ "$and": [ { "horses":
{ "$elemMatch": { "$and": [
{ "opponent.name": "MyHorseName" },
{ "opponent.jockey": "MyFirstName MyLastName"}
] } }
}
]})
However, this query takes 0.5 seconds to execute with my collection (there are a lot of records).
I am trying to find out how to create an index on the horses.opponent.name field of the data. I have read the docs about multikey indexes, but I'm not sure if this is exactly what I need or not. What I need (I think) is an index on the array elements of horses, but only on the name and jockey fields. Is this possible?
Is there a way to create an index to make my specific query (the one above) any faster?
Any pointers would be greatly appreciated. I am fairly new to MongoDB, but learning fast!
The index to create is:
db.race_results.createIndex({"horses.opponent.name":1, "horses.opponent.jockey":1})
(In shells older than MongoDB 3.0 the same call is named ensureIndex.)
After creating this index, explain() for your query should report a number of scanned objects equal to the number of matched objects:
db.race_results.find( { horses: { $elemMatch: { "opponent.name": "MyHorseName", "opponent.jockey": "MyFirstName MyLastName" } } }
).explain()

MongoDB conditionally $addToSet sub-document in array by specific field

Is there a way to conditionally $addToSet based on a specific key field in a subdocument on an array?
Here's an example of what I mean - given the collection produced by the following sample bootstrap;
cls
db.so.remove({});
db.so.insert({
"Name": "fruitBowl",
"pfms" : [
{
"n" : "apples"
}
]
});
n defines a unique document key. I only want one entry with the same n value in the array at any one time. So I want to be able to update the pfms array using n so that I end up with just this;
{
"Name": "fruitBowl",
"pfms" : [
{
"n" : "apples",
"mState": 1111234
}
]
}
Here's where I am at the moment;
db.so.update({
"Name": "fruitBowl",
},{
// not allowed to do this of course
// "$pull": {
// "pfms": { n: "apples" },
// },
"$addToSet": {
"pfms": {
"$each": [
{
"n": "apples",
"mState": 1111234
}
]
}
}
}
)
Unfortunately, this adds another array element;
db.so.find().toArray();
[
{
"Name" : "fruitBowl",
"_id" : ObjectId("53ecfef5baca2b1079b0f97c"),
"pfms" : [
{
"n" : "apples"
},
{
"n" : "apples",
"mState" : 1111234
}
]
}
]
I need to effectively upsert the apples document matching on n as the unique identifier and just set mState whether or not an entry already exists. It's a shame I can't do a $pull and $addToSet in the same document (I tried).
What I really need here is dictionary semantics, but that's not an option right now, nor is breaking out the document - can anyone come up with another way?
FWIW - the existing format is a result of language/driver serialization, I didn't choose it exactly.
further
I've gotten a little further: in the case where I know the array element already exists, I can do this;
db.so.update({
"Name": "fruitBowl",
"pfms.n": "apples",
},{
$set: {
"pfms.$.mState": 1111234,
},
}
)
But of course that only works;
for a single array element
as long as I know it exists
The first limitation isn't a disaster, but if I can't effectively upsert or combine $addToSet with the previous $set (which of course I can't), then the only workarounds I can think of for now mean two DB round-trips.
The $addToSet operator of course requires that the "whole" document being "added to the set" is in fact unique, so you cannot change "part" of the document or otherwise consider it to be a "partial match".
You stumbled on to your best approach using $pull to remove any element with the "key" field that would result in "duplicates", but of course you cannot modify the same path in different update operators like that.
So the closest thing you will get is issuing separate operations, but sending them through the "Bulk Operations API" which was introduced with MongoDB 2.6. This allows both to be sent to the server at the same time, for the closest thing to a "contiguous" operations list you will get:
var bulk = db.so.initializeOrderedBulkOp();
bulk.find({ "Name": "fruitBowl", "pfms.n": "apples" }).updateOne({
"$pull": { "pfms": { "n": "apples" } }
});
bulk.find({ "Name": "fruitBowl" }).updateOne({
"$push": { "pfms": { "n": "apples", "mState": 1111234 } }
});
bulk.execute();
That pretty much is your best approach if it is not possible or practical to move the elements to another collection and rely on "upserts" and $set in order to have the same functionality but on a collection rather than array.
I have faced the exact same scenario. I was inserting and removing likes from a post.
What I did was use the mongoose findOneAndUpdate function (which is similar to the update or findAndModify functions in mongodb).
The key concept is
Insert when the field is not present
Delete when the field is present
The insert is
findOneAndUpdate({ _id: theId, 'likes.userId': { $ne: theUserId }},
{ $push: { likes: { userId: theUserId, createdAt: new Date() }}},
{ 'new': true }, function(err, post) { /* do the needful */ });
The delete is
findOneAndUpdate({ _id: theId, 'likes.userId': theUserId},
{ $pull: { likes: { userId: theUserId }}},
{ 'new': true }, function(err, post) { /* do the needful */ });
This makes each operation atomic, and there are no duplicates with respect to the userId field.
I hope this helps. If you have any questions, feel free to ask.
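As far as I know, MongoDB now (from v4.2) allows aggregation pipelines to be used for updates.
The more or less elegant way shown here does not need a pipeline, though: it uses the filtered positional operator $[elem] with arrayFilters (available since 3.6) to replace an element that already exists. It does not add the element if it is missing: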
db.runCommand({
update: "your-collection-name",
updates: [
{
q: {},
u: {
$set: {
"pfms.$[elem]": {
"n":"apples",
"mState": NumberInt(1111234)
}
}
},
arrayFilters: [
{
"elem.n": {
$eq: "apples"
}
}
],
multi: true
}
]
})
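For the "replace if present, append if missing" behaviour the question asks for, an update pipeline (MongoDB 4.2+) can rebuild the array in one statement. The sketch below only builds the stage and models its effect in plain JavaScript; the function names upsertByKey and buildUpsertStage are hypothetical, and the stage assumes it is acceptable for the updated element to move to the end of the array.

```javascript
// Plain-JavaScript model of the intended result: drop any element with the
// same key, then append the new element.
function upsertByKey(array, keyField, element) {
  var kept = (array || []).filter(function (item) {
    return item[keyField] !== element[keyField];
  });
  return kept.concat([element]);
}

// The same idea as a pipeline stage. Nothing is sent to a server here.
function buildUpsertStage(arrayField, keyField, element) {
  var set = {};
  set[arrayField] = {
    "$concatArrays": [
      { "$filter": {
        // treat a missing array as empty
        "input": { "$ifNull": ["$" + arrayField, []] },
        "cond": { "$ne": ["$$this." + keyField, element[keyField]] }
      } },
      [element]
    ]
  };
  return { "$set": set };
}

var stage = buildUpsertStage("pfms", "n", { n: "apples", mState: 1111234 });
console.log(JSON.stringify(stage));
console.log(upsertByKey([{ n: "apples" }], "n", { n: "apples", mState: 1111234 }));

// Intended use in the shell (assumption: MongoDB 4.2 or later):
// db.so.updateOne({ "Name": "fruitBowl" }, [ stage ]);
```

Because the whole array is rewritten in a single update, no second round-trip is needed, at the cost of not preserving the element's original position.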
In my scenario, the data needs to be initialized when it does not exist and updated when it does, and it is never deleted. If your data behaves the same way, you might want to try the following method.
// Mongoose, but mostly same as mongodb
// Update the tag to user, If there existed one.
const user = await UserModel.findOneAndUpdate(
{
user: userId,
'tags.name': tag_name,
},
{
$set: {
'tags.$.description': tag_description,
},
}
)
.lean()
.exec();
// Add a default tag to user
if (user == null) {
await UserModel.findOneAndUpdate(
{
user: userId,
},
{
$push: {
tags: new Tag({
name: tag_name,
description: tag_description,
}),
},
}
);
}
This is the cleanest and fastest method I have found for this scenario.
As a business analyst, I had the same problem, and I hope I have found a solution after hours of investigation.
// The customer document:
{
"id" : "1212",
"customerCodes" : [
{
"code" : "I"
},
{
"code" : "YK"
}
]
}
// The problem: I want to add dateField "01.01.2016" to customer documents whose customerCodes array contains a subdocument with code "YK" that does not yet have a dateField. The final document must be as follows:
{
"id" : "1212",
"customerCodes" : [
{
"code" : "I"
},
{
"code" : "YK" ,
"dateField" : "01.01.2016"
}
]
}
// The solution : the solution code is in three steps :
// PART 1 - Find the customers with customerCodes "YK" but without dateField
// PART 2 - Find the index of the subdocument with "YK" in customerCodes list.
// PART 3 - Insert the value into the document
// Here is the code
// PART 1
var myCursor = db.customers.find({ customerCodes:{$elemMatch:{code:"YK", dateField:{ $exists:false} }}});
// PART 2
myCursor.forEach(function(customer){
if(customer.customerCodes != null )
{
var size = customer.customerCodes.length;
if( size > 0 )
{
var iFoundTheIndexOfSubDocument= -1;
var index = 0;
customer.customerCodes.forEach( function(clazz)
{
if( clazz.code == "YK" && clazz.dateField == null )
{
iFoundTheIndexOfSubDocument = index;
}
index++;
})
// PART 3
// What happens here is: if I found the index of the
// "YK" subdocument, I create an "updates" document which
// corresponds to the new data to be inserted
//
if( iFoundTheIndexOfSubDocument != -1 )
{
var toSet = "customerCodes."+ iFoundTheIndexOfSubDocument +".dateField";
var updates = {};
updates[toSet] = "01.01.2016";
db.customers.update({ "id" : customer.id } , { $set: updates });
// For the sample document this statement is interpreted like this:
// db.customers.update({ "id" : "1212" }, { $set: { "customerCodes.1.dateField" : "01.01.2016" } });
}
}
}
});
Have a nice day !
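The index lookup (PART 2) and the update document (PART 3) can be condensed into one helper. This is a minimal sketch in plain JavaScript, not the original author's code: the name buildDateFieldUpdate is hypothetical, it uses Array.prototype.findIndex (ES2015, so it may not exist in very old shells), and unlike the loop above it picks the first matching subdocument rather than the last.

```javascript
// Returns the $set document for the first subdocument with the given code
// that has no dateField yet, or null if there is nothing to update.
function buildDateFieldUpdate(customer, code, dateValue) {
  var codes = customer.customerCodes || [];
  var index = codes.findIndex(function (entry) {
    return entry.code === code && entry.dateField == null;
  });
  if (index === -1) { return null; }
  var updates = {};
  updates["customerCodes." + index + ".dateField"] = dateValue;
  return { "$set": updates };
}

// Sample customer from the answer above.
var customer = {
  id: "1212",
  customerCodes: [{ code: "I" }, { code: "YK" }]
};

console.log(JSON.stringify(buildDateFieldUpdate(customer, "YK", "01.01.2016")));

// A server-side alternative, assuming the positional $ operator fits the
// case (it updates the first array element matched by the query):
// db.customers.update(
//   { customerCodes: { $elemMatch: { code: "YK", dateField: { $exists: false } } } },
//   { $set: { "customerCodes.$.dateField": "01.01.2016" } },
//   { multi: true });
```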

MongoDb MapReduce on child array

I've searched the internet long and hard but can't find a solution to this problem. Whilst there are lots of map-reduce examples, I'm getting confused because my document has a property which is an array of objects.
I'm pretty sure this should be easy for someone with experience, but I'm a noob at the minute.
I have a document which looks roughly like this
{
_id:guid,
clientId:guid,
reference:'abc123',
items:
[
{ _id:guid, category:'A', length:100, active:true },
{ _id:guid, category:'B', length:150, active:true },
{ _id:guid, category:'A', length:10, active:false },
{ _id:guid, category:'A', length:111, active:true },
]
}
and I want to produce this output
dateFromIdGuid(day) category countOfItems countOfActive sumOfLength
I'd like to keep the data in this format to reduce the number of write operations (there are already over 1000 writes to this collection per second and rising)
This is driving me insane so any help would be very much appreciated.
Thanks.
If you are talking about extracting a timestamp and reducing that to a discrete day from a GUID, then MongoDB is not going to be of much help to you there. You would need an external language implementation that would support such a function and implement an external mapReduce process such as with Hadoop.
It makes me wonder, though, whether we are in fact talking about a GUID or whether you actually mean an ObjectId, which would be the default value for the _id field of your document unless this has been specifically overridden to have a GUID in there.
Even if that is not true, you would be helped by adding a "timestamp" field of some sort to your document and using the correct BSON Date object type as shown below:
{
_id:guid,
"timestamp": ISODate("2014-05-27T00:00:00Z"),
"clientId":guid,
"reference":'abc123',
"items":
[
{ _id:guid, category:'A', length:100, active:true },
{ _id:guid, category:'B', length:150, active:true },
{ _id:guid, category:'A', length:10, active:false },
{ _id:guid, category:'A', length:111, active:true },
]
}
This allows you to use the MongoDB aggregation framework as it can operate on Date objects of this type in order to break down the results to discrete days:
db.collection.aggregate([
{ "$unwind": "$items" },
{ "$group": {
"_id": {
"day": { "$dayOfYear": "$timestamp" },
"category": "$items.category"
},
"countOfItems": { "$sum": 1 },
"countOfActive": {
"$sum": {
"$cond": [
"$items.active",
1,
0
]
}
},
"sumOfLength": { "$sum": "$items.length" }
}}
])
That not only gives you the results in the fastest way MongoDB can do it but that "timestamp" value is also useful for filtering queries within date ranges which is something you cannot easily do from other values.
There is also a way in the JavaScript available to MongoDB mapReduce to get the date from an ObjectId. This runs slower than the aggregation framework though:
db.collection.mapReduce(
function() {
var date = this._id.getTimestamp();
this.items.forEach(function(item) {
var day =
"" + date.getFullYear() +
"" + ( date.getMonth() + 1 ) +
"" + date.getDate();
emit(
{
day: day,
category: item.category
},
{
countOfItems: 1,
countOfActive: ( item.active ) ? 1 : 0,
sumOfLength: item.length
}
);
});
},
function( key, values ) {
var reduced = {
countOfItems: 0,
countOfActive: 0,
sumOfLength: 0
};
values.forEach(function(value) {
for ( var k in value ) {
reduced[k] += value[k];
}
});
return reduced;
},
{
"out": { "inline": 1 }
}
)
That basically does the same thing: the mapper breaks apart the array and provides grouping keys, while the reducer just sums up the values from the mapper. So even if you had to extract from GUIDs, that gives you a basic layout for a mapper and reducer in a language such as Java when using Hadoop.
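The mapper and reducer logic can be checked without a server. The sketch below is plain JavaScript under a few assumptions: _id is given as an ObjectId hex string (whose first 8 hex characters encode the creation time in seconds), the day key is zero-padded UTC, and the helper names objectIdToDate, dayKey and summarize are made up for illustration. The hex value is an arbitrary sample.

```javascript
// Creation time encoded in the first 4 bytes (8 hex chars) of an ObjectId.
function objectIdToDate(hex) {
  return new Date(parseInt(hex.substring(0, 8), 16) * 1000);
}

// Zero-padded UTC day key in the form YYYYMMDD.
function dayKey(date) {
  function pad(n) { return (n < 10 ? "0" : "") + n; }
  return "" + date.getUTCFullYear() + pad(date.getUTCMonth() + 1) + pad(date.getUTCDate());
}

// Mapper and reducer combined: one output row per (day, category).
function summarize(docs) {
  var groups = {};
  docs.forEach(function (doc) {
    var day = dayKey(objectIdToDate(doc._id));
    doc.items.forEach(function (item) {
      var key = day + "|" + item.category;
      var g = groups[key] || (groups[key] = {
        day: day, category: item.category,
        countOfItems: 0, countOfActive: 0, sumOfLength: 0
      });
      g.countOfItems += 1;
      g.countOfActive += item.active ? 1 : 0;
      g.sumOfLength += item.length;
    });
  });
  return Object.keys(groups).map(function (k) { return groups[k]; });
}

var docs = [{
  _id: "53ecfef5baca2b1079b0f97c", // arbitrary sample ObjectId hex
  items: [
    { category: "A", length: 100, active: true },
    { category: "B", length: 150, active: true },
    { category: "A", length: 10, active: false },
    { category: "A", length: 111, active: true }
  ]
}];

console.log(summarize(docs));
```

Zero-padding the month and day avoids ambiguous keys: without it, January 11 and November 1 would both produce the same string.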
Take a look at the aggregate and mapReduce manual pages for more information on options you can apply.