Group By / Sum Aggregate Query with Parse Cloud Code

I have an Inventory table in my Parse database with two relevant fields: productId and quantity. When a shipment is received, a record is created containing the productId and quantity. Similarly, when a sale occurs, an inventory record is made with the productId and quantity (which will be negative, since the inventory decreases after the sale).
I would like to run a group by / sum aggregate query on the Inventory table with Parse Cloud Code that outputs a dictionary containing unique productIds as the keys and the sum of the quantity column for those ids as the values.
I have seen a number of old posts saying that Parse does not do this, but more recent posts refer to Cloud Code such as averageStars in the Cloud Code Guide: https://parse.com/docs/cloud_code_guide
However, it seems that the Parse.Query used in averageStars has a maximum limit of 1000 records. Thus, when I sum the quantity column, I am only doing so over 1000 records rather than the whole table. Is there a way to compute the group by / sum across all the records in the Inventory table?
For example:
Inventory Table
productId quantity
Record 1: AAAAA 50
Record 2: BBBBB 40
Record 3: AAAAA -5
Record 4: BBBBB -2
Record 5: AAAAA 10
Record 6: AAAAA -7
Output dictionary:
{AAAAA: 48, BBBBB: 38}

You can use Parse.Query.each(). It has no limit, but if your class has too many entries the call will time out.
See the docs.
e.g.:
var totalQuantity = 0;
var inventoryQuery = new Parse.Query("Inventory");
inventoryQuery.each(
  function(result) {
    totalQuantity += result.get("quantity");
  }, {
    success: function() {
      // looped through everything
    },
    error: function(error) {
      // error is an instance of Parse.Error.
    }
  });
If it times out, you have to build something like this.
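The approach it links to isn't reproduced here, but the usual shape is keyset pagination: pull the class down in batches ordered by createdAt, carrying the running total from batch to batch. A rough sketch using the old Parse promise API (sumBatch is an illustrative name, not a Parse API; records sharing an identical createdAt timestamp would need extra handling):
var BATCH_SIZE = 1000; // Parse's per-query maximum

function sumBatch(lastCreatedAt, runningTotal) {
  var query = new Parse.Query("Inventory");
  query.ascending("createdAt");
  query.limit(BATCH_SIZE);
  if (lastCreatedAt) {
    // resume after the newest record of the previous batch
    query.greaterThan("createdAt", lastCreatedAt);
  }
  return query.find().then(function(results) {
    results.forEach(function(result) {
      runningTotal += result.get("quantity");
    });
    if (results.length < BATCH_SIZE) {
      return runningTotal; // last batch reached
    }
    return sumBatch(results[results.length - 1].createdAt, runningTotal);
  });
}

sumBatch(null, 0).then(function(totalQuantity) {
  // totalQuantity now covers every record in the class
});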

In case you want to see the code with the dictionary:
Parse.Cloud.define("retrieveInventory", function(request, response) {
  var productDictionary = {};
  var query = new Parse.Query("Inventory");
  query.equalTo("personId", request.params.personId);
  query.each(
    function(result) {
      var num = result.get("quantity");
      if (result.get("productId") in productDictionary) {
        productDictionary[result.get("productId")] += num;
      } else {
        productDictionary[result.get("productId")] = num;
      }
    }, {
      success: function() {
        response.success(productDictionary);
      },
      error: function(error) {
        response.error("Query failed. Error = " + error.message);
      }
    });
});
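For completeness, calling the function from a client would then look something like this ("12345" is just a placeholder personId):
Parse.Cloud.run("retrieveInventory", { personId: "12345" }, {
  success: function(productDictionary) {
    // e.g. {AAAAA: 48, BBBBB: 38}
    console.log(productDictionary);
  },
  error: function(error) {
    console.log("Error: " + error.message);
  }
});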

Max item count in Cosmos DB trigger

I'm creating a pre-trigger for a Cosmos DB container. The pre-trigger is supposed to fetch all data related to the triggering document's id. incoming_document.items always ends up as 100 when more than 100 documents are expected (the query appears to cap its results at 100). I tried setting the pageSize property to -1 in the FeedOptions parameter and using a continuation token, but it still gives me 100. How can I fix this to get the total count?
Here is a simplified version of the code (without the continuation; I used code similar to here):
function trgAddStats() {
  var context = getContext();
  var request = context.getRequest();
  var incoming_document = request.getBody();
  var container = context.getCollection();
  incoming_document.items = 1;
  var filterQuery = {
    "query": `SELECT t.customer, t.amount FROM Transactions_ds t WHERE t.customer = @customer`,
    "parameters": [{
      "name": "@customer",
      "value": incoming_document.customer
    }]
  };
  var isAccepted = container.queryDocuments(container.getSelfLink(), filterQuery, {},
    function (err, items, responseOptions) {
      if (err) throw new Error("Error: " + err.message);
      incoming_document.items += items.length;
      request.setBody(incoming_document);
    }
  );
  if (!isAccepted) throw "Unable to update transaction, abort";
}
For getting more than 100 documents in Cosmos DB, you can make use of x-ms-max-item-count.
The maximum number of items returned per query execution is controlled by the x-ms-max-item-count header.
The default for query results is 100, and it can be configured from 1 to 1000 using this header.
For more details, see Pagination of query results in the Microsoft documentation.
You can also customize the number of items per page in the Query Explorer, as shown here.
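Inside a trigger, that header surfaces as the pageSize feed option, and anything past the first page has to be fetched by passing the continuation token back in. A sketch of the counting loop under that assumption (fetchPage is an illustrative helper; the filter query is the same parameterized one as in the question):
function trgAddStats() {
  var context = getContext();
  var request = context.getRequest();
  var container = context.getCollection();
  var incoming_document = request.getBody();
  incoming_document.items = 1;

  var filterQuery = { /* same parameterized query as in the question */ };

  // fetch one page, then follow the continuation token until none remains
  function fetchPage(continuationToken) {
    var isAccepted = container.queryDocuments(
      container.getSelfLink(),
      filterQuery,
      // pageSize is the server-side equivalent of x-ms-max-item-count
      { pageSize: 1000, continuation: continuationToken },
      function (err, items, responseOptions) {
        if (err) throw new Error("Error: " + err.message);
        incoming_document.items += items.length;
        if (responseOptions.continuation) {
          fetchPage(responseOptions.continuation);
        } else {
          request.setBody(incoming_document);
        }
      }
    );
    // triggers run under a time/RU budget, so very large result sets can still abort
    if (!isAccepted) throw "Unable to update transaction, abort";
  }

  fetchPage(null);
}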

Faster way to get delta of 2 collections in Mongo

I have a complete list of catalog inventory data in Mongo.
The basic schema is:
productSku (string)
inventory (number)
This collection consists of approximately 14 million records.
I have another list of actively sold products with a similar schema.
Right now I have it as a json file.
It consists of approximately 23,000 records.
Every 5 hours the 14 million records update with the latest inventory data.
Once that happens I need to create a CSV of the latest inventory for the 23,000 active products.
I'm doing it like this:
const inventoryModel = require('../data/inventoryModel');
const activeProducts = require('./activeProducts.json');

// wrapped in an async function so the awaits below are valid
async function buildInventoryUpdate() {
  const inventoryUpdate = [];
  for (const product of activeProducts) {
    let latest = await inventoryModel.findOne({ productSku: product.sku }).exec();
    latest = latest ? latest._doc : null;
    // If there's no current inventory record for the product
    if (!latest) {
      // If there was previously an inventory greater than 0
      if (product.inventory) {
        // Set the latest inventory to zero
        inventoryUpdate.push({ sku: product.sku, inventory: 0 });
      }
    } else {
      // If there's a change in inventory
      if (latest.inventory != product.inventory) {
        inventoryUpdate.push({ sku: product.sku, inventory: latest.inventory });
      }
    }
  }
  return inventoryUpdate;
}
This gives me an array, inventoryUpdate, that I can use to create a CSV for a mass update. This works fine, but it's very slow: it takes about an hour to complete!
I was thinking about maybe adding activeProducts to Mongo as well; if I could somehow keep the execution of the logic within Mongo, this would be a lot faster. But that's beyond my current understanding and ability.
Anyone have any suggestions?
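For reference, the idea in the last paragraph could look roughly like this: import the 23,000 active products into their own collection and let an aggregation pipeline do the join server-side. A sketch assuming a mongoose model ActiveProductModel over that collection and 'inventories' as the collection behind inventoryModel (both names are assumptions; adjust to the real ones). For this to be fast, productSku on the big collection must be indexed.
// runs entirely inside MongoDB: join each active product to its
// inventory record and keep only the ones whose inventory changed
const pipeline = [
  {
    $lookup: {
      from: 'inventories',        // assumed name of the 14M-record collection
      localField: 'sku',
      foreignField: 'productSku',
      as: 'latest'
    }
  },
  // keep products with no inventory record (treated as inventory 0)
  { $unwind: { path: '$latest', preserveNullAndEmptyArrays: true } },
  {
    $project: {
      _id: 0,
      sku: 1,
      inventory: { $ifNull: ['$latest.inventory', 0] },
      changed: { $ne: [{ $ifNull: ['$latest.inventory', 0] }, '$inventory'] }
    }
  },
  { $match: { changed: true } },
  { $project: { changed: 0 } }
];

const inventoryUpdate = await ActiveProductModel.aggregate(pipeline);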

Auto incrementing an indexed field in mongodb when there are multiple concurrent requests

I am trying to auto increment an indexed field in mongodb whenever an insertion happens. I read many posts on SO (this one and mongoose-auto-increment), but I don't understand how they work. Consider the following scenario:
Suppose I want to auto increment a field counter in my collection, and the first record already exists with a counter value of 1. Now suppose three concurrent inserts happen. Since the counter value is 1, all three will try to set counter to 2. Whichever of the three gets the lock first will successfully set its counter to 2, but what about the other two operations? When they acquire the lock they will also try to set the counter value to 2, but since 2 is already taken, I guess mongoose will raise a duplicate key error.
Can anyone please tell me how the two posts above solve the concurrency problem of auto-incrementing an indexed field in mongodb?
I know I am missing some concept, but what?
Thanks.
I encountered the same problem, so I ended up building my own increment that handles concurrency, and it was quite easy! Bottom line, the fast answer: I save the document inside a try/catch loop and catch the duplicate key error on my incremented field. Here is how I implemented this with mongoose in my controller/service/model architecture:
First, I need somewhere to store the auto increments. It won't be a big collection, since I will never have more than a dozen concerned collections in a project, so I don't even need special indexes or whatever:
counter.model.js
// requires modules blabla...

// The mongoose schema for the counter collection
const CounterSchema = new Schema({
  // entity describes the concerned collection.field
  entity: {
    type: {
      collection: { type: String, required: true },
      field: { type: String, required: true }
    }, required: true, unique: true
  },
  // the actual counter for the collection.field
  counter: { type: Number, required: true }
});

// The mongoose-based function to query the counter
async function nextCount(collection, field) {
  let entityCounter = await CounterModel.findOne({
    entity: { collection, field }
  });
  let counter = entityCounter.counter + 1;
  entityCounter.counter = counter;
  await entityCounter.save();
  return counter;
}

// mongoose boilerplate
CounterSchema.statics.nextCount = nextCount;
const CounterModel = mongoose.model("counter", CounterSchema);
module.exports = CounterModel;
Then I made a service to use my counter model. The service also formats the auto increment as needed. For example, accountancy wants all client numbers to start with "411" followed by a 5-figure id, so client number 1 actually becomes 41100001:
counter.service.js
// requires modules blabla...

class CounterService {
  constructor() {}

  static async nextCount(collection, field, prefix, len) {
    // Gets the next counter from db
    const counter = await CounterModel.nextCount(collection, field);
    // Formats the counter as requested in params
    let counterString = `${counter}`;
    while (counterString.length < len) {
      counterString = `0${counterString}`;
    }
    return `${prefix}${counterString}`;
  }
}

module.exports = CounterService;
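Usage then matches the accountancy example above (inside an async function):
// "411" prefix + counter value 1 padded to 5 figures => "41100001"
const clientNum = await CounterService.nextCount("client", "num", "411", 5);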
Then here is where we handle the concurrency: in the client model (I won't put the whole client model file here, only what's needed for the explanation). Let's assume we have the client collection with a "num" field that needs the auto increment described before:
client.model.js
// ...
const { MongoError } = require("mongodb"); // needed for the duplicate key check below

const ClientSchema = new Schema({
  firstName: ...
  lastName: ...
  num: { type: String, required: true, unique: true }
});

async function addClient(clientToAdd) {
  let newClient;
  let genuineCounter = false;
  while (!genuineCounter) {
    try {
      // Gets the next increment from the counter service
      clientToAdd.num = await CounterService.nextCount("client", "num", "411", 5);
      newClient = new ClientModel(clientToAdd);
      await newClient.save();
      // if code reaches this point, no concurrency problem, we end the loop
      genuineCounter = true;
    } catch (error) {
      // If num is duplicated, an error is caught.
      // We must ensure the duplicate error comes from the num field:
      // 11000 is the mongoDB error code for a duplicate unique index,
      // and we check which field was duplicated (it could be another unique field!)
      if (error instanceof MongoError
          && error.code === 11000
          && Object.keys(error.keyValue)[0] === "num")
        genuineCounter = false;
      // For any other duplicated field or error we rethrow
      else throw error;
    }
  }
  return newClient;
}
And here we go! If two users query the counter at the same time, the second one will keep querying the counter until its key is no longer a duplicate.
A small bonus to test it: create a small module to easily fake a delay wherever you want:
delay.helper.js
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
exports.delay = delay;
// use anywhere after module import with:
// await delay(5000)
Then import this module into the counter module and fake some delay between the counter query and the counter save:
counter.model.js
// the file and nextCount function described previously, now with delay()
const { delay } = require("./delay.helper");

async function nextCount(collection, field) {
  let entityCounter = await CounterModel.findOne(...);
  await delay(5000);
  ...
  await entityCounter.save();
}
Then, from your front-end project or your api endpoint, open two identical tabs and send 2 queries in a row:
Let's say the actual counter in db is 12
Query A reads counter in db = 12
Query A waits for 5 seconds
Query B reads counter in db = still 12
Query A increments and stores the new client with num = 41100013; stores counter = 13
Query B increments and tries to store a new client, but 41100013 already exists; the error is caught and it tries again
Query B reads counter in db = 13
Query B waits for 5 seconds, then increments and stores the new client with num = 41100014, and also stores the new counter = 14
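For completeness: the packages mentioned in the question avoid the retry loop altogether by making the read-and-increment a single atomic operation with findOneAndUpdate and $inc, so two concurrent requests can never observe the same counter value. A sketch of nextCount rewritten that way (same schema as above):
async function nextCount(collection, field) {
  // $inc is applied atomically on the server, so concurrent calls
  // each get a distinct counter value and no duplicate keys occur
  const entityCounter = await CounterModel.findOneAndUpdate(
    { entity: { collection, field } },
    { $inc: { counter: 1 } },
    { new: true, upsert: true } // return the updated doc; create it if missing
  );
  return entityCounter.counter;
}
The upsert also means the counter document never has to be seeded by hand.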

How do I publish two random items from a Meteor collection?

I'm making an app where two random things from a collection are displayed to the user. Every time the user refreshes the page or clicks on a button, she would get another random pair of items.
For example, if the collection were of fruits, I'd want something like this:
apple vs banana
peach vs pineapple
banana vs peach
The code below is for the server side and it works except for the fact that the random pair is generated only once. The pair doesn't update until the server is restarted. I understand it is because generate_pair() is only called once. I have tried calling generate_pair() from one of the Meteor.publish functions but it only sometimes works. Other times, I get no items (errors) or only one item.
I don't mind publishing the entire collection and selecting random items from the client side. I just don't want to crash the browser if Items has 30,000 entries.
So to conclude, does anyone have any ideas of how to get two random items from a collection appearing on the client side?
var first_item, second_item;

// This is the best way I could find to get a random item from a Meteor collection.
// Every item in Items has a 'random_number' field with a randomly generated number between 0 and 1.
var random_item = function() {
  return Items.find({
    random_number: {
      $gt: Math.random()
    }
  }, {
    limit: 1
  });
};

// Generates a pair of items and ensures that they're not duplicates.
var generate_pair = function() {
  first_item = random_item();
  second_item = random_item();
  // Regenerate second item if it is a duplicate
  while (first_item.fetch()[0]._id === second_item.fetch()[0]._id) {
    second_item = random_item();
  }
};

generate_pair();

Meteor.publish('first_item', function() {
  return first_item;
});

// Is it good Meteor style to have two publications doing essentially the same thing?
Meteor.publish('second_item', function() {
  return second_item;
});
The problem with your approach is that subscribing to the same publication with the same arguments (no arguments in this case) over and over in the client will only subscribe you once to the server-side logic; this is because Meteor optimizes its internal Pub/Sub mechanism.
To truly discard the previous subscription and get the server-side publish code to re-execute and send two new random documents, you need to introduce a useless random argument to your publication. Your client-side code will subscribe over and over to the publication with a random number, and each time you'll get unsubscribed and resubscribed to new random documents.
Here is a full implementation of this pattern:
server/server.js
function randomItemId() {
  // get the total items count of the collection
  var itemsCount = Items.find().count();
  // get a random number (N) between [0, itemsCount - 1]
  var random = Math.floor(Random.fraction() * itemsCount);
  // choose a random item by skipping N items
  var item = Items.findOne({}, {
    skip: random
  });
  return item && item._id;
}

function generateItemIdPair() {
  // return an array of 2 random item ids
  var result = [
    randomItemId(),
    randomItemId()
  ];
  // redraw the second id until it differs from the first
  while (result[0] == result[1]) {
    result[1] = randomItemId();
  }
  return result;
}

Meteor.publish("randomItems", function(random) {
  var pair = generateItemIdPair();
  // publish the 2 items whose ids are in the random pair
  return Items.find({
    _id: {
      $in: pair
    }
  });
});
client/client.js
// every 5 seconds, subscribe to 2 new random items
Meteor.setInterval(function() {
  Meteor.subscribe("randomItems", Random.fraction(), function() {
    console.log("fetched these random items:", Items.find().fetch());
  });
}, 5000);
You'll need to meteor add random for this code to work.
A more concise variant of the same idea (CoffeeScript), sampling two ids with underscore:
Meteor.publish 'randomDocs', ->
  ids = _(Docs.find().fetch()).pluck '_id'
  randomIds = _(ids).sample 2
  Docs.find _id: $in: randomIds
Here's another approach; it uses the excellent publishComposite package to populate matches into a local (client-only) collection, so it doesn't conflict with other uses of the main collection:
if (Meteor.isClient) {
  randomDocs = new Mongo.Collection('randomDocs');
}
if (Meteor.isServer) {
  Meteor.publishComposite("randomDocs", function(select_count) {
    return {
      collectionName: "randomDocs",
      find: function() {
        let self = this;
        _.sample(baseCollection.find({}).fetch(), select_count).forEach(function(doc) {
          self.added("randomDocs", doc._id, doc);
        }, self);
        self.ready();
      }
    }
  });
}
in onCreated: this.subscribe("randomDocs", 3);
(then in a helper): return randomDocs.find({}, {limit: 3});

How to calculate count and unique count over two fields in a Mongo reduce function

I have a link tracking table with (amongst other fields) track_redirect and track_userid. I would like to output both the total count for a given link and the unique count, with duplicates counted per user id, so we can tell when someone has clicked the same link 5 times.
I've tried emitting this.track_userid in both the key and the values parts, but I can't get to grips with how to access them correctly in the reduce function.
So if I roll back to when it actually worked, I have the very simple code below, just like it would appear in a 'my first mapreduce function' example:
map
function() {
  if (this.track_redirect) {
    emit(this.track_redirect, 1);
  }
}
reduce
function(k, vals) {
  var sum = 0;
  for (var i in vals) {
    sum += vals[i];
  }
  return sum;
}
I'd like to know the correct way to emit the additional userid information and access it in the mapreduce, please. Or am I thinking about it in the wrong way?
In case it's not clear: I don't want to calculate the total clicks a userid has made, but to count the unique clicks on each url per userid, not counting any duplicate clicks a userid made on a link.
Can someone point me in the right direction please? Thanks!
You can actually pass an arbitrary object as the second parameter of the emit call. That means you can take advantage of this and store the userid in it. For example, your map function can look like this:
var mapFunc = function() {
  if (this.track_redirect) {
    var tempDoc = {};
    tempDoc[this.track_userid] = 1;
    emit(this.track_redirect, {
      users_clicked: tempDoc,
      total_clicks: 1
    });
  }
};
And your reduce function might look like this:
var reduceFunc = function(key, values) {
  var summary = {
    users_clicked: {},
    total_clicks: 0
  };
  values.forEach(function (doc) {
    summary.total_clicks += doc.total_clicks;
    // Merge the properties of 2 objects together
    // (and these are actually the userids)
    Object.extend(summary.users_clicked, doc.users_clicked);
  });
  return summary;
};
The users_clicked property of the summary object stores the id of every user as a property key; since an object can't have duplicate keys, each user is guaranteed to be stored only once. Also note that you have to be careful about the fact that some of the values passed to the reduce function can be the result of a previous reduce, and the sample code above takes that into account. You can find more about that behavior in the docs here.
In order to get the unique count, you can pass in a finalize function that gets called when the reduce phase is completed:
var finalFunc = function(key, value) {
  // Counts the keys of an object. Taken from:
  // http://stackoverflow.com/questions/18912/how-to-find-keys-of-a-hash
  var countKeys = function(obj) {
    var count = 0;
    for (var i in obj) {
      if (obj.hasOwnProperty(i)) {
        count++;
      }
    }
    return count;
  };
  return {
    redirect: key,
    total_clicks: value.total_clicks,
    unique_clicks: countKeys(value.users_clicked)
  };
};
Finally, you can execute the map reduce job like this (modify the out attribute to fit your needs):
db.users.mapReduce(mapFunc, reduceFunc, { finalize: finalFunc, out: { inline: 1 }});
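As an aside (not part of the original answer): on current MongoDB versions the same numbers can be computed with the aggregation pipeline instead of map-reduce, which is usually simpler and faster. A sketch, assuming the tracking collection is named tracking (the name is an assumption):
db.tracking.aggregate([
  // one document per (link, user) pair, counting that user's clicks on it
  { $match: { track_redirect: { $exists: true } } },
  { $group: {
      _id: { redirect: "$track_redirect", user: "$track_userid" },
      clicks: { $sum: 1 }
  } },
  // then collapse per link: sum the clicks for the total, count the pairs for the unique count
  { $group: {
      _id: "$_id.redirect",
      total_clicks: { $sum: "$clicks" },
      unique_clicks: { $sum: 1 }
  } }
])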