Selecting all the fields in a row using mapReduce - MongoDB

I am using mongoose with Node.js, and I am using mapReduce to fetch data grouped by a field. The resulting collection contains only the grouping key for each row of the database.
I need to fetch all the fields from the database, grouped by one field and sorted by another. For example, I have a database with details of places, the fare for travelling to those places, and a few other fields. I need the data grouped by place and sorted by fare. mapReduce gets me that, but I cannot get the other fields.
Is there a way to get all the fields using mapReduce, rather than just the two fields mentioned in the example above?

I must admit I'm not sure I understand completely what you're asking.
But maybe one of the following thoughts helps you:
Either: when you iterate over your mapReduce results, you could fetch the complete document from MongoDB for each result. That would give you access to all fields in each document, at the cost of some network traffic.
Or: the value that you pass to emit(key, value) can be an object. So you could construct a value object that contains all your desired fields. Just be sure to use exactly the same object structure for your reduce method's return value.
I'll try to illustrate with an (untested) example.
map = function () {
    // Emit every field you need as part of the value object.
    emit(this.place, {
        field1: this.field1,
        field2: this.field2,
        count: 1
    });
};
reduce = function (key, values) {
    // Return the same structure that map emits.
    var result = {
        field1: values[0].field1,
        field2: values[0].field2,
        count: 0
    };
    for (var i = 0; i < values.length; i++) {
        result.count += values[i].count;
    }
    return result;
};
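If the goal is to keep every field of every document in each group, one option is to emit each document wrapped in an array and have reduce concatenate the arrays, so the reduce output keeps the same shape as the emitted values. The sketch below simulates this in plain JavaScript without a database; `runMapReduce`, the sample `places` documents, and the `fare` and `carrier` fields are hypothetical stand-ins, not part of MongoDB or mongoose.

```javascript
// Hypothetical sample documents; field names are assumptions for illustration.
const places = [
  { place: "Delhi", fare: 300, carrier: "A" },
  { place: "Agra", fare: 120, carrier: "B" },
  { place: "Delhi", fare: 150, carrier: "C" },
];

// Map step: emit the whole document, wrapped so reduce can return the same shape.
function map(doc, emit) {
  emit(doc.place, { items: [doc] });
}

// Reduce step: concatenate the item lists; output shape matches the input values.
function reduce(key, values) {
  return { items: values.reduce((all, v) => all.concat(v.items), []) };
}

// Minimal in-memory stand-in for mapReduce (not the MongoDB implementation).
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  for (const doc of docs) {
    mapFn(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  // Like MongoDB, skip reduce when a key has only one value.
  return [...groups].map(([key, values]) => ({
    _id: key,
    value: values.length > 1 ? reduceFn(key, values) : values[0],
  }));
}

const grouped = runMapReduce(places, map, reduce);

// Sort each group's documents by fare after grouping.
for (const g of grouped) {
  g.value.items.sort((a, b) => a.fare - b.fare);
}

console.log(JSON.stringify(grouped));
```

With real data, each result is still a single document subject to MongoDB's document size limit, so this approach only suits groups of modest size.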

Related

Auto increment / Sum one field of the query result?

I'll explain with an example:
I have a collection with 1 million items with Item_ID: 123, each with a different value, Item_Worth.
User x with MONEY can "buy" the items. I basically want to know how many items the user can buy, in the most elegant way.
So I have:
db.items.find({Item_ID:123},{Item_Age:1,Item_Worth:1}).sort({Item_Age:1})
-> gives me all items with Item_ID: 123, sorted by age.
I could now
iterate through all items until the sum of Item_Worth reaches User_Money, but I think this is not really efficient if the query matches 1 million items and the user might only have enough money for 1000
or
do the loop and query 1000 times
or
limit the query to 100 items, but this is very variable and could still result in a lot of loops.
So is there a query method which returns the sum of the values across the documents?
Any other efficient suggestions would be helpful.
Thanks
Better than iterating would be to do this with mapReduce to get the running total for the items, and then filter the result.
So define a mapper as follows:
var mapper = function () {
    totalWorth = totalWorth + this.Item_Worth;
    var canBuy = userMoney >= totalWorth;
    if ( canBuy ) {
        emit(
            {
                Item_ID: this.Item_ID,
                Item_Age: this.Item_Age
            },
            {
                worth: this.Item_Worth,
                totalWorth: totalWorth,
                canBuy: canBuy
            }
        );
    }
}
This adds the current item's worth to the totalWorth variable. Then a check is made to see whether the current totalWorth is still within the amount of userMoney that was passed in. If it is not, nothing is emitted, which gives you automatic filtering.
All the emitted keys are unique so just run the mapReduce as below:
db.items.mapReduce(
    mapper,
    function () {}, // reduce argument is required though not called
    {
        query: { Item_ID: 123 },
        sort: { Item_ID: 1, Item_Age: 1 },
        out: { inline: 1 },
        scope: {
            totalWorth: 0,
            userMoney: 30
        }
    }
)
Looking at the other parts of that:
query: A standard query object that you use to get your selection.
sort: Not strictly required here because you are looking at Item_Age in ascending order. But if you wanted the oldest Item_Age first, you could reverse the sort.
out: Gives you an inline object as the result, which you can use to get the matching items.
scope: Defines the global variables that the functions can access. So we provide an initial value for totalWorth and pass userMoney as the amount of money the user has available.
At the end of the day the result contains the filtered list of the items that fall under the amount of money the user can afford to purchase.
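For comparison, the running-total logic the mapper relies on can be sketched in plain JavaScript. The `countAffordable` helper and the sample items are hypothetical; the sketch assumes the items arrive already sorted by `Item_Age`, as the `sort` option would deliver them.

```javascript
// Hypothetical helper: counts how many items, taken in order, fit within userMoney.
function countAffordable(items, userMoney) {
  let totalWorth = 0;
  let count = 0;
  for (const item of items) {
    totalWorth += item.Item_Worth;
    if (totalWorth > userMoney) break; // stop as soon as the budget is exceeded
    count += 1;
  }
  return count;
}

// Sample data, assumed to be sorted by Item_Age already.
const items = [
  { Item_ID: 123, Item_Age: 1, Item_Worth: 10 },
  { Item_ID: 123, Item_Age: 2, Item_Worth: 15 },
  { Item_ID: 123, Item_Age: 3, Item_Worth: 20 },
];

console.log(countAffordable(items, 30));
```

Unlike the mapper, this loop stops at the first item that no longer fits, which is the early exit the question was hoping for when iterating over a cursor on the client.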

How to compare the size of two arrays in mongo?

I have two fields with separate arrays that have comparable data in them.
The first has a Name, and an ID. The second has a nickname.
I want to make sure that the count of the two are the same. If they are not the same, I want to know the mongoID of that document.
How would I do this?
With MapReduce it would be possible. If your document looks like:
document: { array1: [ a, b], array2: [c] }
You could write map and reduce functions like:
map = function () {
    // Emit only documents whose arrays differ in length.
    if (this.array1.length != this.array2.length)
        emit(this._id, 1);
}
reduce = function (key, values) { return key; }
For instance, to get the results inline:
db.foo.mapReduce(map,reduce,{out:{inline:1}}).results
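The condition inside map can be tried out without a database. The sketch below applies the same length comparison to a few hypothetical in-memory documents; `mismatchedIds` and the sample data are illustrative only.

```javascript
// Hypothetical documents shaped like the example above.
const docs = [
  { _id: 1, array1: ["a", "b"], array2: ["c"] },
  { _id: 2, array1: ["a"], array2: ["c"] },
  { _id: 3, array1: [], array2: ["c", "d"] },
];

// Returns the _id of every document whose two arrays differ in length.
function mismatchedIds(documents) {
  return documents
    .filter((d) => d.array1.length !== d.array2.length)
    .map((d) => d._id);
}

console.log(mismatchedIds(docs));
```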

Fetch Record from mongo db based on type and ancestry field

In MongoDB, records are stored like this:
{_id:100,type:"section",ancestry:nil,.....}
{_id:300,type:"section",ancestry:100,.....}
{_id:400,type:"problem",ancestry:100,.....}
{_id:500,type:"section",ancestry:100,.....}
{_id:600,type:"problem",ancestry:500,.....}
{_id:700,type:"section",ancestry:500,.....}
{_id:800,type:"problem",ancestry:100,.....}
I want to fetch the records in this order:
first, the record whose ancestry is nil
then all records whose parent is that first record and whose type is 'problem'
then all records whose parent is that first record and whose type is 'section'
The expected output is
{_id:100,type:"section",ancestry:nil,.....}
{_id:400,type:"problem",ancestry:100,.....}
{_id:800,type:"problem",ancestry:100,.....}
{_id:300,type:"section",ancestry:100,.....}
{_id:500,type:"section",ancestry:100,.....}
{_id:600,type:"problem",ancestry:500,.....}
{_id:700,type:"section",ancestry:500,.....}
Try this MongoDB shell command:
db.collection.find().sort({ancestry:1, type: 1})
In languages where ordered dictionaries aren't available, the sort argument may take a list of 2-tuples instead. Something like this (Python):
collection.find({}).sort([('ancestry', pymongo.ASCENDING), ('type', pymongo.ASCENDING)])
@vinipsmaker's answer is good. However, it doesn't work properly if the _ids are random numbers or if there are documents that aren't part of the tree structure. In that case, the following code would work correctly:
function getSortedItems() {
    var sorted = [];
    var ids = [ null ];
    while (ids.length > 0) {
        var cursor = db.Items.find({ ancestry: ids.shift() }).sort({ type: 1 });
        while (cursor.hasNext()) {
            var item = cursor.next();
            ids.push(item._id);
            sorted.push(item);
        }
    }
    return sorted;
}
Note that this code is not fast because db.Items.find() will be executed n times, where n is the number of documents in the tree structure.
If the tree structure is huge or you will do the sort many times, you can optimize this by using the $in operator in the query and sorting the result on the client side.
In addition, creating an index on the ancestry field will make the code quicker in either case.
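The $in idea can be sketched as one query per tree level instead of one per document. The code below runs against an in-memory array so it can be tried without a database; `findByAncestry` is a hypothetical stand-in for `db.Items.find({ ancestry: { $in: ids } })`, and `null` plays the role of nil for the root.

```javascript
// Sample data from the question, with null standing in for nil.
const items = [
  { _id: 100, type: "section", ancestry: null },
  { _id: 300, type: "section", ancestry: 100 },
  { _id: 400, type: "problem", ancestry: 100 },
  { _id: 500, type: "section", ancestry: 100 },
  { _id: 600, type: "problem", ancestry: 500 },
  { _id: 700, type: "section", ancestry: 500 },
  { _id: 800, type: "problem", ancestry: 100 },
];

// Hypothetical stand-in for a single $in query against the collection.
function findByAncestry(ids) {
  return items.filter((item) => ids.includes(item.ancestry));
}

function getSortedItemsByLevel() {
  const sorted = [];
  let levelIds = [null];
  while (levelIds.length > 0) {
    // Remember each parent's position so children keep their parents' order.
    const parentOrder = new Map(levelIds.map((id, i) => [id, i]));
    const children = findByAncestry(levelIds);
    // Client-side sort: by parent position first, then by type.
    children.sort((a, b) =>
      parentOrder.get(a.ancestry) - parentOrder.get(b.ancestry) ||
      a.type.localeCompare(b.type)
    );
    sorted.push(...children);
    levelIds = children.map((c) => c._id);
  }
  return sorted;
}

console.log(getSortedItemsByLevel().map((item) => item._id));
```

The number of queries now grows with the depth of the tree rather than with the number of documents, at the cost of doing the ordering in application code.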

How can I sort the results of a map/reduce using Doctrine ODM

I'm having an issue trying to sort after a map reduce. The collection has statistics (like scores) with users attached, and the map reduce finds each user's highest score. That portion works; the sort is then meant to order those results for a leaderboard. I've put these functions into JavaScript variables and run them in the mongo console, and a simple find().sort({'value.max.value':-1}) works fine on the resulting collection, but I can't get it to work here (my results come back unordered).
$query->map('function() {
var x = { value : parseInt(this.value), _id : this._id, date : this.postDate };
emit(this.user, { max : x });
}')
->reduce('function(key, vals) {
var res = vals[0];
for (var i=1; i<vals.length; i++)
{
if(vals[i].max.value > res.max.value)
res.max = vals[i].max;
}
return res;
}')
->sort('value.max.value', 'desc');
When you make a ->map or ->reduce call, Doctrine internally switches the query from "find" mode to "mapreduce" mode.
So you are actually running a mapReduce MongoDB command, as described in the MongoDB documentation.
This means that your sort() call is translated into the sort property of the mapReduce command, so it sorts only the input documents.
To actually sort the output you have two options:
Use the out() method in the query to write the results to a temporary collection, then query the data from there with a sort
Sort the results in PHP
The sort of a map-reduce is applied to the input documents (as fed to map), not the result. If you want to sort the results of the map-reduce you'd need to output to a collection and then perform a sorted find query on that collection (as you've successfully tried).
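Sorting on the client is a small amount of code. The sketch below is written in JavaScript rather than PHP to match the other examples; the `results` array is hypothetical and only mirrors the `{ _id, value: { max } }` shape that the reduce function in the question produces.

```javascript
// Hypothetical map-reduce output, one entry per user.
const results = [
  { _id: "alice", value: { max: { value: 40 } } },
  { _id: "bob", value: { max: { value: 95 } } },
  { _id: "carol", value: { max: { value: 70 } } },
];

// Copy the array, then order it by highest score first.
const leaderboard = [...results].sort(
  (a, b) => b.value.max.value - a.value.max.value
);

console.log(leaderboard.map((r) => r._id));
```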

MongoDB map/reduce over multiple collections?

First, the background. I used to have a collection logs and used map/reduce to generate various reports. Most of these reports were based on data from within a single day, so I always had a condition d: SOME_DATE. When the logs collection grew extremely big, inserting became extremely slow (slower than the app we were monitoring was generating logs), even after dropping lots of indexes. So we decided to have each day's data in a separate collection - logs_YYYY-mm-dd - that way indexes are smaller, and we don't even need an index on date. This is cool since most reports (thus map/reduce) are on daily data. However, we have a report where we need to cover multiple days.
And now the question. Is there a way to run a map/reduce (or more precisely, the map) over multiple collections as if it were only one?
A reduce function may be called once, with a key and all corresponding values (but only if there are multiple values for the key - it won't be called at all if there's only 1 value for the key).
It may also be called multiple times, each time with a key and only a subset of the corresponding values, and the previous reduce results for that key. This scenario is called a re-reduce. In order to support re-reduces, your reduce function should be idempotent.
There are two key features of an idempotent reduce function:
The return value of the reduce function should be in the same format as the values it takes in. So, if your reduce function accepts an array of strings, the function should return a string. If it accepts objects with several properties, it should return an object containing those same properties. This ensures that the function doesn't break when it is called with the result of a previous reduce.
Don't make assumptions based on the number of values it takes in. It isn't guaranteed that the values parameter contains all the values for the given key. So using values.length in calculations is very risky and should be avoided.
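One way to sanity-check a reduce function against these two rules is to confirm that reducing in stages gives the same answer as reducing everything at once. The sketch below does this in plain JavaScript for a hypothetical count-and-sum reducer; `reduceStats` and the sample values are illustrative.

```javascript
// Hypothetical reducer: output has the same { count, sum } shape as its inputs.
function reduceStats(key, values) {
  const out = { count: 0, sum: 0 };
  for (const v of values) {
    out.count += v.count; // add the emitted counts instead of using values.length
    out.sum += v.sum;
  }
  return out;
}

// Values as a map function might emit them for one key.
const values = [
  { count: 1, sum: 5 },
  { count: 1, sum: 7 },
  { count: 1, sum: 9 },
];

// Reduce everything in one call.
const allAtOnce = reduceStats("k", values);

// Reduce a subset first, then re-reduce that result with the remaining value.
const partial = reduceStats("k", values.slice(0, 2));
const staged = reduceStats("k", [partial, values[2]]);

console.log(JSON.stringify(allAtOnce) === JSON.stringify(staged));
```

A reducer that set `count` from `values.length` would report 2 for the staged run, which is exactly the failure the second rule warns about.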
Update: The two steps below aren't required (or even possible, I haven't checked) on the more recent MongoDB releases. It can now handle these steps for you, if you specify an output collection in the map-reduce options:
{ out: { reduce: "tempResult" } }
If your reduce function is idempotent, you shouldn't have any problems map-reducing multiple collections. Just re-reduce the results of each collection:
Step 1
Run the map-reduce on each required collection and save the results in a single, temporary collection. You can store the results using a finalize function:
finalize = function (key, value) {
db.tempResult.save({ _id: key, value: value });
}
db.someCollection.mapReduce(map, reduce, { finalize: finalize })
db.anotherCollection.mapReduce(map, reduce, { finalize: finalize })
Step 2
Run another map-reduce on the temporary collection, using the same reduce function. The map function is a simple function that selects the keys and values from the temporary collection:
map = function () {
emit(this._id, this.value);
}
db.tempResult.mapReduce(map, reduce)
This second map-reduce is basically a re-reduce and should give you the results you need.
I used the map-reduce method. Here is an example.
var mapemployee = function () {
    // Emit the same object shape that reduceF returns.
    emit(this.jobid, { Name: this.Name, Designation: null });
};
var mapdesignation = function () {
    emit(this.jobid, { Name: null, Designation: this.Designation });
};
var reduceF = function (key, values) {
    var outs = { Name: null, Designation: null };
    values.forEach(function (v) {
        // Keep the first non-null value seen for each field.
        if (outs.Name == null) {
            outs.Name = v.Name;
        }
        if (outs.Designation == null) {
            outs.Designation = v.Designation;
        }
    });
    return outs;
};
result = db.employee.mapReduce(mapemployee, reduceF, {out: {reduce: 'output'}});
result = db.designation.mapReduce(mapdesignation,reduceF, {out: {reduce: 'output'}});
Reference: http://www.itgo.me/a/x3559868501286872152/mongodb-join-two-collections