Meteor.js / MongoDB: sum over collection documents is incomplete

I have a problem and I do not know where to start. I have a button that calculates the total amount from all matching documents in a collection (i.e. for a certain month, year and branch). When I click it the first time it gives me amount "y"; the second time it returns amount "x".
My guess is that the collection's documents are not yet fully loaded the first time.
How can I solve this? Please excuse me if this has already been answered; I have looked around unsuccessfully.
When button is clicked, in template events executes:
'click #calculate': function (event, instance) {
  var _transactions = instance.Transactions().fetch();
  var capital = _transactions.reduce(function (sum, row) {
    return row.is_accountable ? sum + row.transaction_amount : sum;
  }, 0);
  instance.capital.set(capital);
}

Since you're using Blaze you might consider making this computation a helper instead of an event handler. That way (a) the user doesn't need to click to get the sum, (b) the total will always be up-to-date even if the data changes reactively.
Template.myTemplate.helpers({
  sum() {
    var s = 0;
    Transactions.find().forEach((doc) => {
      s += doc.is_accountable ? doc.transaction_amount : 0;
    });
    return s;
  }
});

Related

Cloudant View Unusual Conditional Behavior with Dates

I found some peculiar behavior and was wondering if anyone could help me understand it, so I can avoid similar issues in the future.
Creating a Cloudant view, I want to return only records with a timestamp of the current day.
I was having a hard time getting it to work, and found that the difference is whether there is a space before the end of the if condition.
See below for the working and non-working versions.
if (new Date(Date.parse(doc.details.timestamp)).setHours(0,0,0,0) === new Date().setHours(0,0,0,0) ){
Works to check the current date against the Cloudant doc date
if (new Date(Date.parse(doc.details.timestamp)).setHours(0,0,0,0) === new Date().setHours(0,0,0,0)){
Does not work to check the date against the Cloudant doc date
Full working view below for context
function (doc) {
  if (doc.details.location){
    if (new Date(Date.parse(doc.details.timestamp)).setHours(0,0,0,0) === new Date().setHours(0,0,0,0) ){
      emit(doc.details.location.toLowerCase(), { "location": doc.details.location.toLowerCase(), "status": doc.details.status, "user": doc.details.username, "time": doc.details.timestamp})
    }
  }
}
All the best,
Scott.
I suspect this may be to do with the execution date/time of the if statement rather than the space involved.
The map function needs to be deterministic regardless of execution time.
It looks like you are guessing that the map function is running at query time (so new Date would emit today's date). Instead, it runs at indexing time, so the value of new Date is whatever the datetime is when the indexing happens. As indexing runs at a different time to document insertion (sometime between insert and when the view is queried), using any form of value that changes with time will produce unpredictable results.
I suspect therefore the space is co-incidental and instead the output of new Date is changing, and so altering what's emitted into your view.
For your problem -- querying for things "today" -- I think you want to instead emit a key like [location, year, month, day]. Your map function would look like:
function (doc) {
  if (doc.details.location) {
    var l = doc.details.location.toLowerCase();
    // Date.parse returns milliseconds, so wrap it in a Date object
    var d = new Date(Date.parse(doc.details.timestamp));
    emit([l, d.getFullYear(), d.getMonth(), d.getDate()], {
      "location": l,
      "status": doc.details.status,
      "user": doc.details.username,
      "time": doc.details.timestamp
    });
  }
}
As JavaScript uses 0-based indexes for the month, to query for all items at location Bristol today, 2-Feb-2017, you'd use key=["bristol",2017,1,2] in your query to the view.
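On the query side, a client could compute today's key the same way the map function builds it. A minimal sketch (the helper name buildDayKey is my own):

```javascript
// Build the [location, year, month, day] key the map function emits.
// getMonth() is 0-based, matching the emitted key.
function buildDayKey(location, date) {
  return [
    location.toLowerCase(),
    date.getFullYear(),
    date.getMonth(),
    date.getDate()
  ];
}

// The key for Bristol on 2-Feb-2017:
var key = buildDayKey('Bristol', new Date(2017, 1, 2));
// key is ["bristol", 2017, 1, 2]
```

That array can then be passed as the key parameter when querying the view.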

Mongo: select parent document with maximum child document count, faster way?

I'm quite new to Mongo and trying to get the following query to work. It does work, but it takes a bit too long; I think I'm doing something wrong.
There are about 6000 documents in a collection parents. Each document has a certain number of children (childs is another collection, with 40000 documents in it). parents and childs are associated with each other by an attribute in the child document called parent_id. Please see the following code, which takes approximately 1 minute to execute the queries. I don't think Mongo should take that much time.
function getChildMaxDocCount() {
  var maxLen = 0;
  var bigSizeParent = null;
  db.parents.find().forEach(function (parent) {
    var currCount = db.childs.count({ parent_id: parent._id });
    if (currCount > maxLen) {
      maxLen = currCount;
      bigSizeParent = parent._id;
    }
  });
  printjson({ "maxLen": maxLen, "bigSizeParent": bigSizeParent });
}
Is there any feasible/optimal way to achieve this?
If I got you right, you want the parent with the most children. This is easy to accomplish using the aggregation framework. If each child can only have one parent, the aggregation query would look like this:
db.childs.aggregate([
  { $group: { _id: "$parent_id", children: { $sum: 1 } } },
  { $sort: { "children": -1 } },
  { $limit: 1 }
]);
Which should return a document like:
{ _id:"SomeParentId", children:15}
If a child can have more than one parent, it heavily depends on the data modeling how the query would look like.
Have a look at the aggregation framework documentation for details.
Edit: Some explanation
The aggregation pipeline takes every document it is told to process through a series of steps, in such a way that all documents are first processed through the first step and the resulting documents are fed into the next step.
Step 1: Grouping
We group all documents into new documents (virtual ones, if you want) and tell mongod to increment the field children by one for each document which has the same parent_id. Since we are referring to a field of the current document, we need to add a $ sign.
Step 2: Sorting
Now that we have a bunch of documents which hold the parent_id and the number of children this parent has, we sort it by the children field in descending (-1) order.
Step 3: Limiting
Since we are only interested in the parent_id which has the most children, we only let mongod return the first document after sorting.
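The three steps can be illustrated with a plain-JavaScript simulation over a small invented sample, mirroring what mongod does with the pipeline:

```javascript
// Invented sample data: the parent_id field of each child document.
var childs = [
  { parent_id: 'A' }, { parent_id: 'B' },
  { parent_id: 'B' }, { parent_id: 'B' },
  { parent_id: 'A' }, { parent_id: 'C' }
];

// $group: count children per parent_id
var counts = {};
childs.forEach(function (c) {
  counts[c.parent_id] = (counts[c.parent_id] || 0) + 1;
});

// $sort (descending on children) followed by $limit: 1
var top = Object.keys(counts)
  .map(function (id) { return { _id: id, children: counts[id] }; })
  .sort(function (a, b) { return b.children - a.children; })[0];
// top is { _id: "B", children: 3 }
```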

Auto increment / Sum one field of the query result?

I'll explain with an example:
I have a collection with 1 million items with ID 123, each worth a different value ("worth").
A user x with some amount of money can "buy" the items. I basically want to know, in the most elegant way, how many items the user can buy.
So I have:
db.items.find({ Item_ID: 123 }, { Item_Age: 1, Item_worth: 1 }).sort({ Item_Age: 1 })
-> gives me all items with Item_ID: 123, sorted by age.
I could now
iterate through all items until the sum of item worth equals the user's money, but somehow I think this is not really efficient if the returned list matches 1 million items and the user might only have enough money for 1000
or
do the loop and query 1000 times
or
limit the query to 100 items, but this is quite variable and could still result in a lot of loops.
So, is there a query method which returns the running sum of a value across documents?
Any other efficient suggestions would be helpful.
Thanks
Better than iterating would be to do this with mapReduce to get the running total for the items, and then filter the result.
So define a mapper as follows:
var mapper = function () {
  totalWorth = totalWorth + this.Item_Worth;
  var canBuy = userMoney >= totalWorth;
  if (canBuy) {
    emit(
      {
        Item_ID: this.Item_ID,
        Item_Age: this.Item_Age
      },
      {
        worth: this.Item_Worth,
        totalWorth: totalWorth,
        canBuy: canBuy
      }
    );
  }
};
This accumulates a variable totalWorth with the running "worth" of the items. A check is then made to see whether the current totalWorth value exceeds the amount of userMoney that was passed in. If it does, you don't emit, which gives you automatic filtering.
All the emitted keys are unique so just run the mapReduce as below:
db.items.mapReduce(
  mapper,
  function () {}, // reduce argument is required though not called
  {
    query: { Item_ID: 123 },
    sort: { Item_ID: 1, Item_Age: 1 },
    out: { inline: 1 },
    scope: {
      totalWorth: 0,
      userMoney: 30
    }
  }
)
So, looking at the other parts of that:
query: Is a standard query object you use to get your selection
sort: Is not really required, because you are looking at Item_Age in ascending order. But if you wanted the oldest Item_Age first, you could reverse the sort.
out: Gives you an inline object as the result that you can use to get the matching items.
scope: Defines the global variables that can be accessed by the functions. So we provide an initial value for totalWorth and pass the parameter value of userMoney as how much money the user has to buy.
At the end of the day the result contains the filtered list of the items that fall under the amount of money the user can afford to purchase.
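The running-total filtering the mapper performs can be sketched in plain JavaScript over invented data (items and the budget of 30 are made up):

```javascript
// In-memory sketch of the running-total filter from the mapper above.
function affordableItems(items, userMoney) {
  var totalWorth = 0;
  var result = [];
  items.forEach(function (item) {
    totalWorth += item.worth;
    if (userMoney >= totalWorth) {
      result.push(item); // the equivalent of emit()
    }
  });
  return result;
}

// With 30 to spend, only the first three items fit the budget
// (10 + 10 + 5 = 25; adding 20 would exceed it).
var picked = affordableItems(
  [{ worth: 10 }, { worth: 10 }, { worth: 5 }, { worth: 20 }],
  30
);
```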

Ordering a result set randomly in mongo

I've recently discovered that Mongo has no equivalent to SQL's "ORDER BY RAND()" in its command syntax (https://jira.mongodb.org/browse/SERVER-533)
I've seen the recommendation at http://cookbook.mongodb.org/patterns/random-attribute/ and frankly, adding a random attribute to a document feels like a hack. This won't work because this places an implicit limit to any given query I want to randomize.
The other widely given suggestion is to choose a random index to offset from. Because of the order that my documents were inserted in, that will result in one of the string fields being alphabetized, which won't feel very random to a user of my site.
I have a couple ideas on how I could solve this via code, but I feel like I'm missing a more obvious and native solution. Does anyone have a thought or idea on how to solve this more elegantly?
I have to agree: the easiest thing to do is to store a random value in each of your documents. There need not be a tremendously large range of values, either; the number you choose depends on the expected result size for your queries (1,000 to 1,000,000 distinct integers ought to be enough for most cases).
When you run your query, don't worry about the random field -- instead, index it and use it to sort. Since there is no correspondence between the random number and the document, you should get fairly random results. Note that collisions will likely result in documents being returned in natural order.
While this is certainly a hack, you have a very easy escape route: given MongoDB's schema-free nature, you can simply stop including the random field once there is support for random sort in the server. If size is an issue, you could run a batch job to remove the field from existing documents. There shouldn't be a significant change in your client code if you design it carefully.
An alternative option would be to think long and hard about the number of results that will be randomized and returned for a given query. It may not be overly expensive to simply do shuffling in client code (i.e., if you only consider the most recent 10,000 posts).
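For the client-side shuffling mentioned above, a Fisher-Yates shuffle over the fetched documents is the usual approach. A minimal sketch:

```javascript
// Fisher-Yates shuffle: returns a randomly ordered copy of the array.
function shuffle(docs) {
  var a = docs.slice(); // don't mutate the original result set
  for (var i = a.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  }
  return a;
}

var shuffled = shuffle([1, 2, 3, 4, 5]);
// shuffled contains the same five elements in a random order
```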
What you want cannot be done without picking one of the two solutions you mention. Picking a random offset is a horrible idea if your collection becomes larger than a few thousand documents. The reason for this is that the skip(n) operation takes O(n) time. In other words, the higher your random offset, the longer the query will take.
Adding a randomized field to the document is, in my opinion, the least hacky solution there is given the current feature set of MongoDB. It provides stable query times and gives you some say over how the collection is randomized (and allows you to generate a new random value after each query through a findAndModify for example). I also do not understand how this would impose an implicit limit on your queries that make use of randomization.
You can give this a try - it's fast, works with multiple documents, and doesn't require populating the rand field up front; it will eventually populate itself:
add an index on the rand field of your collection
use find and refresh, something like:
// Install packages:
// npm install mongodb async
// Add index in mongo:
// db.mycollection.ensureIndex({ rand: 1 })
var mongodb = require('mongodb')
var async = require('async')
// Find n random documents by using "rand" field.
function findAndRefreshRand (collection, n, fields, done) {
var result = []
var rand = Math.random()
// Append documents to the result based on criteria and options, if options.limit is 0 skip the call.
var appender = function (criteria, options) {
return function (done) {
if (options.limit > 0) {
collection.find(criteria, fields, options).toArray(
function (err, docs) {
if (!err && Array.isArray(docs)) {
Array.prototype.push.apply(result, docs)
}
done(err)
}
)
} else {
async.nextTick(done)
}
}
}
async.series([
// Fetch docs with uninitialized .rand.
// NOTE: You can comment out this step if all docs have initialized .rand = Math.random()
appender({ rand: { $exists: false } }, { limit: n - result.length }),
// Fetch on one side of random number.
appender({ rand: { $gte: rand } }, { sort: { rand: 1 }, limit: n - result.length }),
// Continue fetch on the other side.
appender({ rand: { $lt: rand } }, { sort: { rand: -1 }, limit: n - result.length }),
// Refresh fetched docs, if any.
function (done) {
if (result.length > 0) {
var batch = collection.initializeUnorderedBulkOp({ w: 0 })
for (var i = 0; i < result.length; ++i) {
batch.find({ _id: result[i]._id }).updateOne({ $set: { rand: Math.random() } })
}
batch.execute(done)
} else {
async.nextTick(done)
}
}
], function (err) {
done(err, result)
})
}
// Example usage
mongodb.MongoClient.connect('mongodb://localhost:27017/core-development', function (err, db) {
if (!err) {
findAndRefreshRand(db.collection('profiles'), 1024, { _id: true, rand: true }, function (err, result) {
if (!err) {
console.log(result)
} else {
console.error(err)
}
db.close()
})
} else {
console.error(err)
}
})
The other widely given suggestion is to choose a random index to offset from. Because of the order that my documents were inserted in, that will result in one of the string fields being alphabetized, which won't feel very random to a user of my site.
Why? If you have 7,000 documents and you choose three random offsets between 0 and 6,999, the chosen documents will be random, even if the collection itself is sorted alphabetically.
You could insert an id field (the $id field won't work because it's not an actual number) and use modulus math to get a random skip. If you have 10,000 records and you want a handful of results, you could randomly pick a modulus between 1 and 1000, such as 253, and then request documents where mod(id, 253) == 0; this is reasonably fast if id is indexed. Then randomly sort those results client-side. Sure, they are evenly spaced out instead of truly random, but it is close to what is desired.
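A deterministic in-memory sketch of that modulus idea (the sequential ids and the modulus 253 are invented for illustration):

```javascript
// Select the ids divisible by the chosen modulus; with sequential
// ids this picks roughly ids.length / m evenly spaced documents.
function selectByModulus(ids, m) {
  return ids.filter(function (id) { return id % m === 0; });
}

// Sequential ids 0..9999, as the answer assumes.
var ids = [];
for (var i = 0; i < 10000; i++) ids.push(i);

var sampled = selectByModulus(ids, 253);
// sampled is [0, 253, 506, ...]: 40 evenly spaced documents
```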
Both of the options seem like imperfect hacks to me: a random field will always have the same value, and skip will return the same records for the same number.
Why not sort by a random field and then skip randomly? I admit it is also a hack, but in my experience it gives a better sense of randomness.

How to map/reduce two MongoDB collections

I am new to map/reduce and trying to figure out a way to collect the following data using map/reduce instead of doing it in my (slow) application logic:
I have a collection 'projects' with a 1:n relation to a collection 'tasks'. Now I'd like to receive an array of results that gives me the project names, where the first is the project with the most tasks and the last is the project with the least tasks.
Or, even better, an array of hashes that also tells me how many tasks each project has (assuming the project name is unique):
[project_1: 23, project_2: 42, project_3: 82]
For map I tried something like:
map = function () {
emit(this.project_id, { count:1 });
}
and reduce:
reduce = function (key, values) {
var sum = 0;
values.forEach(function(doc){ sum += 1; });
return { count:sum };
}
I fired this against my tasks collection:
var mr = db.tasks.mapReduce(map, reduce, { out: "results" });
But I get incorrect results when querying:
db[mr.result].find();
I am using Mongoid on Rails and am completely lost with it. Can someone point me in the right direction?
Thanks in advance.
Felix
Looks generally right, but I spot at least one problem: The summation step in the reduce function should be
values.forEach(function (doc) { sum += doc.count; });
because the function may be reducing values that are themselves the product of a prior reduction step, and that therefore have count values > 1.
That's a common oversight, mentioned here: http://www.mongodb.org/display/DOCS/Troubleshooting+MapReduce
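A small in-memory illustration of why this matters: reduce may be called again on its own prior output, so the counts must survive a re-reduce (the batches below are invented for illustration):

```javascript
// The corrected reduce function from above.
var reduce = function (key, values) {
  var sum = 0;
  values.forEach(function (doc) { sum += doc.count; });
  return { count: sum };
};

// First pass over two batches of emitted { count: 1 } values...
var partial1 = reduce('p1', [{ count: 1 }, { count: 1 }]);               // { count: 2 }
var partial2 = reduce('p1', [{ count: 1 }, { count: 1 }, { count: 1 }]); // { count: 3 }

// ...then a re-reduce over the partial results.
var total = reduce('p1', [partial1, partial2]); // { count: 5 }
// With `sum += 1` instead, the re-reduce would return { count: 2 }, which is wrong.
```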
Hope that helps!