How to use the $max operator in MongoDB

I want to know how to use the $max operator for the following problem:
Suppose this is the data given:
{
"_id" : ,
"attributes":
{
"value1":10,
"value2":50,
"value3":70,
"value4":25,
"value5":50,
"value6":20
}
}
I have shown one document; the collection contains multiple such documents.
I want to find the maximum value3 (70 in this case) and get the "_id" corresponding to it, which can be passed to the next query as a parameter.
I have tried the following, but I am unable to get a correct answer:
db.collection.group({
    key: { "attributes.value3": true },
    reduce: function(obj, prev) {
        if (prev.maxValue < obj.attributes.value3) {
            prev.maxValue = obj.attributes.value3;
        }
    },
    initial: { maxValue: 0 }
});
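In the absence of an answer here: a common shell-side approach is to sort on the nested field and take the first document, e.g. `db.collection.find({}, { "attributes.value3": 1 }).sort({ "attributes.value3": -1 }).limit(1)`. The intended logic can be sketched in plain JavaScript (the `_id` values below are hypothetical sample data, not from the question):

```javascript
// Sketch: pick the document whose attributes.value3 is largest and return
// its _id, so it can be passed as a parameter to the next query.
const docs = [
  { _id: 1, attributes: { value3: 70 } },
  { _id: 2, attributes: { value3: 95 } },
  { _id: 3, attributes: { value3: 12 } },
];

function idOfMaxValue3(documents) {
  // reduce keeps the document with the highest attributes.value3 seen so far
  const best = documents.reduce((a, b) =>
    b.attributes.value3 > a.attributes.value3 ? b : a);
  return best._id;
}

console.log(idOfMaxValue3(docs)); // 2
```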

Related

Modify an element of an array inside an object in MongoDB

I have some documents like this:
doc = {
"tag" : "tag1",
"field" : {
"zone" :"zone1",
"arr" : [
{ vals: [-12.3,-1,0], timestamp: ""},
{ vals: [-30.40,-23.2,0], timestamp: "" }
]
}
}
I want to modify one of the elements of the array (for example, the first element, the one with index 0) of one of such documents.
I want to end it up looking like:
doc = {
"tag" : "tag1",
"field" : {
"zone" :"zone1",
"arr" : [
{ vals: [-1, -1, -1], timestamp: "the_new_timestamp"}, // this one was modified
{ vals: [-30.40, -23.2, 0], timestamp: "" }
]
}
}
I know something about find_and_modify:
db.mycollection.find_and_modify(
query = query, // you find the document of interest with this
fields = { }, // you can focus on one of the fields of your document with this
update = { "$set": data }, // you update your data with this
)
The questions that feel closer to what I want are these:
https://stackoverflow.com/a/28829203/1253729
https://stackoverflow.com/a/23554454/1253729
I've been going through them but I'm getting stuck when trying to work out the solution for my case. I don't know if I should really use the fields parameter. I don't know how to use $set correctly for my case.
I hope you could help me.
https://docs.mongodb.com/manual/reference/operator/update/positional/
This will help you!
db.mycollection.updateOne(
query, // the filter that matches the document of interest
{ $set: { "field.arr.0.timestamp": "the_new_timestamp" } }
)
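The dot-notation path `field.arr.0.timestamp` addresses the first array element by its index. A tiny plain-JavaScript helper (an illustration only, not part of any driver) shows how such a path resolves against the sample document:

```javascript
// Illustration: apply a MongoDB-style dot path the way $set interprets
// "field.arr.0.timestamp" — numeric segments index into arrays.
function setByDotPath(doc, path, value) {
  const keys = path.split(".");
  let node = doc;
  for (const key of keys.slice(0, -1)) {
    node = node[key]; // walk down; "0" indexes the array
  }
  node[keys[keys.length - 1]] = value;
}

const doc = {
  tag: "tag1",
  field: {
    zone: "zone1",
    arr: [
      { vals: [-12.3, -1, 0], timestamp: "" },
      { vals: [-30.40, -23.2, 0], timestamp: "" },
    ],
  },
};

setByDotPath(doc, "field.arr.0.timestamp", "the_new_timestamp");
console.log(doc.field.arr[0].timestamp); // "the_new_timestamp"
```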

MongoDB :: Order search results depending on search condition

I have this data:
[{ "name":"BS",
"keyword":"key1",
"city":"xyz"
},
{ "name":"AGS",
"keyword":"Key2",
"city":"xyz1"
},
{ "name":"QQQ",
"keyword":"key3",
"city":"xyz"
},
{ "name":"BS",
"keyword":"Keyword",
"city":"city"
}]
and I need to search for records which have name = "BS" OR keyword = "Key2", with the help of this query:
db.collection.find({"$or" : [{"name":"BS"}, {"keyword":"Key2"}]});
I need these records in the following sequence:
[{ "name":"BS",
"keyword":"key1",
"city":"xyz"
},
{ "name":"BS",
"keyword":"Keyword",
"city":"city"
},
{ "name":"AGS",
"keyword":"Key2",
"city":"xyz1"
}]
but I am getting them in the following sequence:
[{ "name":"BS",
"keyword":"key1",
"city":"xyz"
},
{ "name":"AGS",
"keyword":"Key2",
"city":"xyz1"
},
{ "name":"BS",
"keyword":"Keyword",
"city":"city"
}]
Please provide some suggestions; I have been stuck on this problem for two days.
Thanks
The order of results returned by MongoDB is not guaranteed unless you explicitly sort your data using the sort function. For smaller datasets you may be "lucky" in the sense that the results are always returned in the same order; however, for bigger datasets, and in particular when you have sharded Mongo clusters, this is very unlikely. As proposed by Yathish, you need to explicitly order your results using the sort function. Based on the suggested output, it seems you want to sort by name in descending order, so I have set the sorting flag to -1 for the field name.
db.collection.find({"$or" : [{"name":"BS"}, {"keyword":"Key2"}]}).sort({"name" : -1});
If you need a more complex sorting algorithm as specified in your comment, you can convert your results to a Javascript array and create a custom sort function. This sort function will first list documents with a name equal to "BS" and then documents containing the keyword "Key2"
db.data.find({
"$or": [{
"name": "BS"
}, {
"keyword": "Key2"
}]
}).toArray().sort(function(doc1, doc2) {
if (doc1.name == "BS" && doc2.keyword == "Key2") {
return -1;
} else if (doc2.name == "BS" && doc1.keyword == "Key2") {
return 1;
} else {
// a sort comparator must return a number, not a boolean
return doc1.name < doc2.name ? -1 : doc1.name > doc2.name ? 1 : 0;
}
});
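To sanity-check that comparator against the sample documents from the question, it can be run in plain JavaScript with no MongoDB instance at all:

```javascript
// The comparator applied to the question's sample result set, so the
// final order can be inspected directly.
const results = [
  { name: "BS",  keyword: "key1",    city: "xyz"  },
  { name: "AGS", keyword: "Key2",    city: "xyz1" },
  { name: "BS",  keyword: "Keyword", city: "city" },
];

results.sort(function (doc1, doc2) {
  if (doc1.name == "BS" && doc2.keyword == "Key2") {
    return -1;
  } else if (doc2.name == "BS" && doc1.keyword == "Key2") {
    return 1;
  } else {
    return doc1.name < doc2.name ? -1 : doc1.name > doc2.name ? 1 : 0;
  }
});

console.log(results.map(d => d.name)); // [ 'BS', 'BS', 'AGS' ]
```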

"too much data for sort()" on a small collection

When trying to do a find and sort on a mongodb collection I get the error below. The collection is not large at all - I have only 28 documents and I start getting this error when I cross the limit of 23 records.
The special thing about that document is that it holds a large ArrayCollection inside but I am not fetching that specific field at all, I am only trying to get a DateTime field.
db.ANEpisodeBreakdown.find({creationDate: {$exists: true}}, {creationDate: true}).limit(23).sort({creationDate: 1})
{ "$err" : "too much data for sort() with no index. add an index or specify a smaller limit", "code" : 10128 }
So the problem here is the 32MB in-memory sort limit, and you have no index that can be used for an "index only" or "covered" query to get to the result. Without that, your "big field" still gets loaded into the data to sort.
Easy to replicate:
var string = "";
for ( var n=0; n < 10000000; n++ ) {
string += 0;
}
for ( var x=0; x < 4; x++ ) {
db.large.insert({ "large": string, "date": new Date() });
sleep(1000);
}
So this query will blow up, unless you limit to 3:
db.large.find({},{ "date": 1 }).sort({ "date": -1 })
To overcome this:
Create an index on date (and other used fields) so the whole document is not loaded in your covered index query:
db.large.ensureIndex({ "date": 1 })
db.large.find({},{ "_id": 0, "date": 1 }).sort({ "date": -1 })
{ "date" : ISODate("2014-07-07T10:08:33.067Z") }
{ "date" : ISODate("2014-07-07T10:08:31.747Z") }
{ "date" : ISODate("2014-07-07T10:08:30.391Z") }
{ "date" : ISODate("2014-07-07T10:08:29.038Z") }
Alternatively, don't index, and use aggregate instead: the $project stage does not suffer the same limitation, as the document is actually reshaped before being passed to $sort.
db.large.aggregate([
{ "$project": { "_id": 0, "date": 1 }},
{ "$sort": {"date": -1 }}
])
{ "date" : ISODate("2014-07-07T10:08:33.067Z") }
{ "date" : ISODate("2014-07-07T10:08:31.747Z") }
{ "date" : ISODate("2014-07-07T10:08:30.391Z") }
{ "date" : ISODate("2014-07-07T10:08:29.038Z") }
Either way gets you the results under the limit without modifying cursor limits in any way.
Without an index, the number of results you can sort only extends over shellBatchSize, which by default is 20.
DBQuery.shellBatchSize = 23;
This should do the trick.
The problem is that projection in this particular scenario still loads the entire document; it is just sent to your application without the large array field.
As such, MongoDB is still sorting with too much data for its 32MB limit.

MongoDB mapreduce missing data with 'null' in return

So this is strange. I'm trying to use mapreduce to group datetime/metrics under a unique port:
Document layout:
{
"_id" : ObjectId("5069d68700a2934015000000"),
"port_name" : "CL1-A",
"metric" : "340.0",
"port_number" : "0",
"datetime" : ISODate("2012-09-30T13:44:00Z"),
"array_serial" : "12345"
}
and mapreduce functions:
var query = {
'array_serial' : array,
'port_name' : { $in : ports },
'datetime' : { $gte : from, $lte : to}
}
var map = function() {
emit( { portname : this.port_name } , { datetime : this.datetime,
metric : this.metric });
}
var reduce = function(key, values) {
var res = { dates : [], metrics : [], count : 0}
values.forEach(function(value){
res.dates.push(value.datetime);
res.metrics.push(value.metric);
res.count++;
})
return res;
}
var command = {
mapreduce : collection,
map : map.toString(),
reduce : reduce.toString(),
query : query,
out : { inline : 1 }
}
mongoose.connection.db.executeDbCommand(command, function(err, dbres){
if(err) throw err;
console.log(dbres.documents);
res.json(dbres.documents[0].results);
})
If a small number of records is requested, say 5 or 10, or even 60 I get all the data back I'm expecting. Larger queries return truncated values....
I just did some more testing and it seems like it's limiting the record output to 100?
This is minutely data, and when I run a query for a 24-hour period I would expect 1440 records back... I just ran it and received 80. :\
Is this expected? I'm not specifying a limit anywhere I can tell...
More data:
Query for records from 2012-10-01T23:00 - 2012-10-02T00:39 (100 minutes) returns correctly:
[
{
"_id": {
"portname": "CL1-A"
},
"value": {
"dates": [
"2012-10-01T23:00:00.000Z",
"2012-10-01T23:01:00.000Z",
"2012-10-01T23:02:00.000Z",
...cut...
"2012-10-02T00:37:00.000Z",
"2012-10-02T00:38:00.000Z",
"2012-10-02T00:39:00.000Z"
],
"metrics": [
"1596.0",
"1562.0",
"1445.0",
...cut...
"774.0",
"493.0",
"342.0"
],
"count": 100
}
}
]
...add one more minute to the query 2012-10-01T23:00 - 2012-10-02T00:39 (101 minutes) :
[
{
"_id": {
"portname": "CL1-A"
},
"value": {
"dates": [
null,
"2012-10-02T00:40:00.000Z"
],
"metrics": [
null,
"487.0"
],
"count": 2
}
}
]
the dbres.documents object shows the correct expected emitted records:
[ { results: [ [Object] ],
timeMillis: 8,
counts: { input: 101, emit: 101, reduce: 2, output: 1 },
ok: 1 } ]
...so is the data getting lost somewhere?
Rule number one of MapReduce:
Thou shall return from Reduce the exact same format that you emit with your key in Map.
Rule number two of MapReduce:
Thou shall reduce the array of values passed to reduce as many times as necessary. Reduce function may be called many times.
You've broken both of those rules in your implementation of reduce.
Your Map function is emitting key, value pairs.
key: port name (you should simply emit the name as the key, not a document)
value: a document representing three things you need to accumulate (date, metric, count)
Try this instead:
map = function() { // if you want to reduce to an array you have to emit arrays
emit ( this.port_name, { dates : [this.datetime], metrics : [this.metric], count: 1 });
}
reduce = function(key, values) { // for each key you get an array of values
var res = { dates: [], metrics: [], count: 0 }; // you must reduce them to one
values.forEach(function(value) {
res.dates = value.dates.concat(res.dates);
res.metrics = value.metrics.concat(res.metrics);
res.count += value.count; // VERY IMPORTANT reduce result may be re-reduced
})
return res;
}
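The re-reduce requirement (rule number two) can be verified outside Mongo: reducing all values in one pass must give the same totals as reducing partial results again. A plain-JavaScript check, using the reduce function above with small made-up emitted values:

```javascript
// The corrected reduce function, so its re-reduce behavior can be checked:
// reduce(k, [a, b, c]) must equal reduce(k, [reduce(k, [a, b]), c]).
const reduce = function (key, values) {
  const res = { dates: [], metrics: [], count: 0 };
  values.forEach(function (value) {
    res.dates = value.dates.concat(res.dates);
    res.metrics = value.metrics.concat(res.metrics);
    res.count += value.count; // partial results carry their own counts
  });
  return res;
};

// Three emitted values, in the shape the map function produces.
const a = { dates: ["t1"], metrics: ["1.0"], count: 1 };
const b = { dates: ["t2"], metrics: ["2.0"], count: 1 };
const c = { dates: ["t3"], metrics: ["3.0"], count: 1 };

const onePass = reduce("CL1-A", [a, b, c]);
const rereduced = reduce("CL1-A", [reduce("CL1-A", [a, b]), c]);

console.log(onePass.count, rereduced.count); // 3 3
```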
Try outputting the map-reduce data into a temporary collection instead of in memory. Maybe that is the reason. From the Mongo docs:
{ inline : 1} - With this option, no collection will be created, and
the whole map-reduce operation will happen in RAM. Also, the results
of the map-reduce will be returned within the result object. Note that
this option is possible only when the result set fits within the 16MB
limit of a single document. In v2.0, this is your only available
option on a replica set secondary.
Also, it may not be the reason, but MongoDB has a data size limitation (2GB) on a 32-bit machine.

How to count document elements inside a mongo collection with php?

I have the following structure of a mongo document:
{
"_id": ObjectId("4fba2558a0787e53320027eb"),
"replies": {
"0": {
"email": ObjectId("4fb89a181b3129fe2d000000"),
"sentDate": "2012-05-21T11:22:01.418Z"
},
"1": {
"email": ObjectId("4fb89a181b3129fe2d000000"),
"sentDate": "2012-05-21T11:22:01.418Z"
},
"2": ....
}
}
How do I count all the replies from all the documents in the collection?
Thank you!
In the following answer, I'm working with a simple data set with five replies across the collection:
> db.foo.find()
{ "_id" : ObjectId("4fba6b0c7c32e336fc6fd7d2"), "replies" : [ 1, 2, 3 ] }
{ "_id" : ObjectId("4fba6b157c32e336fc6fd7d3"), "replies" : [ 1, 2 ] }
Since we're not simply counting documents, db.collection.count() won't help us here. We'll need to resort to MapReduce to scan each document and aggregate the reply array lengths. Consider the following:
db.foo.mapReduce(
function() { emit('totalReplies', { count: this.replies.length }); },
function(key, values) {
var result = { count: 0 };
values.forEach(function(value) {
result.count += value.count;
});
return result;
},
{ out: { inline: 1 }}
);
The map function (first argument) runs across the entire collection and emits the number of replies in each document under a constant key. Mongo will then consider all emitted values and run the reduce function (second argument) a number of times to consolidate (literally reduce) the result. Hopefully the code here is straightforward. If you're new to map/reduce, one caveat is that the reduce method must be capable of processing its own output. This is explained in detail in the MapReduce docs linked above.
Note: if your collection is quite large, you may have to use another output mode (e.g. collection output); however, inline works well for small data sets.
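The map/reduce flow above can be simulated in plain JavaScript to see how the per-document counts consolidate, using the same two sample documents (3 and 2 replies):

```javascript
// Simulation of the MapReduce above: each document emits its reply count
// under one constant key, and reduce sums the emitted counts.
const docs = [
  { replies: [1, 2, 3] },
  { replies: [1, 2] },
];

// "map" phase: one emitted value per document
const emitted = docs.map(doc => ({ count: doc.replies.length }));

// "reduce" phase: consolidate all values emitted under the same key
function reduceCounts(key, values) {
  const result = { count: 0 };
  values.forEach(function (value) {
    result.count += value.count;
  });
  return result;
}

console.log(reduceCounts("totalReplies", emitted).count); // 5
```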
Lastly, if you're using MongoDB 2.1+, we can take advantage of the Aggregation Framework to avoid writing JS functions and make this even easier:
db.foo.aggregate(
{ $project: { replies: 1 }},
{ $unwind: "$replies" },
{ $group: {
_id: "result",
totalReplies: { $sum: 1 }
}}
);
Three things are happening here. First, we tell Mongo that we're interested in the replies field. Secondly, we want to unwind the array so that we can iterate over all elements across the fields in our projection. Lastly, we'll tally up results under a "result" bucket (any constant will do), adding 1 to the totalReplies result for each iteration. Executing this query will yield the following result:
{
"result" : [{
"_id" : "result",
"totalReplies" : 5
}],
"ok" : 1
}
Although I wrote the above answers with respect to the Mongo client, you should have no trouble translating them to PHP. You'll need to use MongoDB::command() to run either MapReduce or aggregation queries, as the PHP driver currently has no helper methods for either. There's currently a MapReduce example in the PHP docs, and you can reference this Google group post for executing an aggregation query through the same method.
I haven't checked your code; it might work as well. I did the following and it just works:
$replies = $db->command(
array(
"distinct" => "foo",
"key" => "replies"
)
);
$all = count($replies['values']);
I did it again using the group command of the PHP Mongo driver. It's similar to a MapReduce command.
$keys = array("replies.type" => 1); //keys for group by
$initial = array("count" => 0); //initial value of the counter
$reduce = "function (obj, prev) { prev.count += obj.replies.length; }";
$condition = array('replies' => array('$exists' => true), 'replies.type' => 'follow');
$g = $db->foo->group($keys, $initial, $reduce, $condition);
echo $g['count'];
Thanks jmikola for giving links to Mongo.
The JSON should be:
{
"_id": ObjectId("4fba2558a0787e53320027eb"),
"replies": [
{
"email": ObjectId("4fb89a181b3129fe2d000000"),
"sentDate": "2012-05-21T11:22:01.418Z"
},
{
"email": ObjectId("4fb89a181b3129fe2d000000"),
"sentDate": "2012-05-21T11:22:01.418Z"
},
{....}
]
}