I'm pretty new to MongoDB and I'm trying to use the aggregation framework to get the sum of revenue for each agent.
My data looks like this:
{
"_id" : ObjectId("56ce2ce5b69c4a909eb50f22"),
"property_id" : 5594,
"reservation_id" : "3110414.1",
"arrival" : "2016-02-24",
"los" : 1,
"updated" : ISODate("2016-02-24T22:21:27.000Z"),
"offer_list" : {
"68801" : {
"pitched" : "yes",
"accepted" : "yes",
"prime_price" : "3",
"agent_price" : "",
"price_per" : "night"
},
"63839" : {
"pitched" : "yes",
"accepted" : "yes",
"prime_price" : "8",
"agent_price" : "",
"price_per" : "night"
}
},
"status" : "accepted",
"comments" : [],
"created" : ISODate("2016-02-24T22:21:18.000Z"),
"agent_id" : 50941
}
For each agent_id, I would like to get the sum of all "agent_price" values (or "prime_price" if "agent_price" is empty) multiplied by the field "los", counting only offers where "accepted" == "yes".
For the example above the expected output would be:
{"sum": 11, "agent_id": 50941}
The sum is the two "accepted" "prime_price" (8+3) times los (1) = 11.
I tried to use $unwind but it only works for a list, not object. Can anyone help on that?
I doubt it is possible with aggregation, yet should be straightforward with mapreduce:
db.collection.mapReduce(
    function() {
        // Emit one value per accepted offer, keyed by agent.
        for (var key in this.offer_list) {
            var offer = this.offer_list[key];
            if (offer.accepted == 'yes') {
                // Fall back to prime_price when agent_price is empty.
                var price = (offer.agent_price == '') ? offer.prime_price : offer.agent_price;
                emit(this.agent_id, this.los * price);
            }
        }
    },
    function(agent, values) {
        return Array.sum(values);
    },
    { out: { inline: 1 } }  // mapReduce requires an out option; inline returns the results directly
)
For real-life code, I would also add a query option for data sanitation.
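To check the arithmetic without a database, the same map and reduce logic can be sketched in plain JavaScript. This is only an illustration under assumptions: revenueByAgent is a hypothetical helper name, and the sample array copies just the fields of the question's document that the calculation needs.

```javascript
// Plain-JavaScript sketch of the map and reduce steps (no MongoDB needed).
// revenueByAgent is a hypothetical helper, not part of any MongoDB API.
function revenueByAgent(docs) {
  const totals = {};
  for (const doc of docs) {
    for (const key of Object.keys(doc.offer_list || {})) {
      const offer = doc.offer_list[key];
      if (offer.accepted !== "yes") continue;
      // Fall back to prime_price when agent_price is empty or missing.
      const price = (offer.agent_price === "" || offer.agent_price == null)
        ? offer.prime_price
        : offer.agent_price;
      // Prices are stored as strings in the sample, so convert explicitly.
      totals[doc.agent_id] = (totals[doc.agent_id] || 0) + doc.los * Number(price);
    }
  }
  return totals;
}

const sampleReservations = [{
  agent_id: 50941,
  los: 1,
  offer_list: {
    "68801": { accepted: "yes", prime_price: "3", agent_price: "" },
    "63839": { accepted: "yes", prime_price: "8", agent_price: "" }
  }
}];

const revenueTotals = revenueByAgent(sampleReservations);
console.log(revenueTotals[50941]); // 11, i.e. (3 + 8) * 1
```

Converting the price with Number() makes the string-to-number step explicit instead of relying on the implicit coercion of the * operator.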
I want to find one document, clone it, and create 100 new documents with new values for a few fields, using a shell script in MongoDB.
Below is my document
{
"_id" : ObjectId("5ef59bde562c9824176e9f20"),
"productDefinition" : {
"product" : {
"companies" : {
"company" : {
"productionformation" : {
"productNumber" : "E128",
"venderNumber" : "0470",
"venderName" : "ALPHA SERVICES LLC"
}
}
}
}
},
"executionId" : "123456"
}
After executing the shell script, I want to have 100 new documents with new values for the fields below:
"executionId" : "NewExecutionId" // This value will be Fixed for all new 100 documents
"productNumber" : "1" //This value will be increasing.. for first document 1, for second document 2, etc..
"venderNumber" : "1" //This value will be increasing.. for first document 1, for second document 2, etc..
The new documents should look like this.
First new document
{
"_id" : ObjectId("5ef59bde562c9824176e9f20"),
"productDefinition" : {
"product" : {
"companies" : {
"company" : {
"productionformation" : {
"productNumber" : "1",
"venderNumber" : "1",
"venderName" : "ALPHA SERVICES LLC"
}
}
}
}
},
"executionId" : "newExecutionId"
}
Second new document
{
"_id" : ObjectId("5ef59bde562c9824176e9f20"),
"productDefinition" : {
"product" : {
"companies" : {
"company" : {
"productionformation" : {
"productNumber" : "2",
"venderNumber" : "2",
"venderName" : "ALPHA SERVICES LLC"
}
}
}
}
},
"executionId" : "newExecutionId"
}
Third new document
{
"_id" : ObjectId("5ef59bde562c9824176e9f20"),
"productDefinition" : {
"product" : {
"companies" : {
"company" : {
"productionformation" : {
"productNumber" : "3",
"venderNumber" : "3",
"venderName" : "ALPHA SERVICES LLC"
}
}
}
}
},
"executionId" : "newExecutionId"
}
And so on for the fourth document, the fifth document, etc., up to the 100th document.
I tried this script, but it's not working.
copy = db.myCollection.find({"executionId" : "123456",
"productDefinition.product.companies.company.productionformation.productNumber" : "E128" ,
"productDefinition.product.companies.company.productionformation.venderNumber" :"0470" })
for (var i = 1; i< 101; i++){
copy.executionId = "newExecutionId";
copy.productDefinition.product.companies.company.productionformation.productNumber = i;
copy.productDefinition.product.companies.company.productionformation.venderNumber" = i;
db.myCollection.insert(copy);
}
You will need to fix the following things:
Use findOne instead of find, because findOne returns a single matching document while find returns a cursor.
Prefer let over var for the loop counter; the difference only matters if the loop body runs asynchronous DB operations, since each iteration then gets its own binding.
Create a deep copy of the matched document (the copy variable) inside the for loop body, so that each iteration does not keep updating the same object reference.
Hope it helps !
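The points above can be sketched in plain JavaScript as follows. This is a minimal sketch under assumptions: makeCopies is a hypothetical helper name, the template is typed in by hand instead of coming from findOne, and the _id is removed from each copy because inserting a second document with an existing _id fails with a duplicate key error.

```javascript
// Hypothetical helper: build `count` independent copies of a template document.
function makeCopies(template, count, newExecutionId) {
  const copies = [];
  for (let i = 1; i <= count; i++) {
    // A JSON round-trip deep-copies plain data. It does not preserve BSON
    // types such as ObjectId or Date, so it only suits documents like this one.
    const doc = JSON.parse(JSON.stringify(template));
    delete doc._id; // let MongoDB generate a fresh _id on insert
    doc.executionId = newExecutionId;
    const info = doc.productDefinition.product.companies.company.productionformation;
    info.productNumber = String(i);
    info.venderNumber = String(i);
    copies.push(doc);
  }
  return copies;
}

// Stand-in for the result of db.myCollection.findOne({...}).
const templateDoc = {
  _id: "5ef59bde562c9824176e9f20",
  productDefinition: { product: { companies: { company: { productionformation: {
    productNumber: "E128", venderNumber: "0470", venderName: "ALPHA SERVICES LLC"
  } } } } },
  executionId: "123456"
};

const clonedDocs = makeCopies(templateDoc, 100, "newExecutionId");
console.log(clonedDocs.length); // 100
// In the shell the copies could then be written with
// db.myCollection.insertMany(clonedDocs).
```

Building the array first and inserting it in one call keeps the copy logic testable on its own; inserting inside the loop would work as well.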
I need to get a count of distinct 'transactions'. The problem I'm having is that .distinct() comes back with an error because the documents are too large.
I'm not familiar with aggregation either.
I need to group the count by 'agencyID'; as you can see below, there are two different agencyIDs.
For example, I need to count the transactions where the agencyID is 01721487, and so on.
db.myCollection.distinct("bookings.transactions").length
This doesn't work, because I need to group by agencyID, and if there are too many results I get an error saying the result is too large.
{
"_id" : ObjectId("5624a610a6e6b53b158b4744"),
"agencyID" : "01721487",
"paxID" : "-530189664",
"bookings" : [
{
"bookingID" : "24232",
"transactions" : [
{
"tranID" : "001",
"invoices" : [
{
"invNum" : "1312",
"type" : "r",
"inv_date" : "20150723",
"inv_time" : "0953",
"inv_val" : -300
}
],
"tranType" : "Fee",
"tranDate" : "20150723",
"tranTime" : "0952",
"opCode" : "admin",
"udf_1" : "j s"
}
],
"acctID" : "acct11",
"agt_id" : "xy"
}
],
"title" : "",
"firstname" : "",
"surname" : "f bar"
}
I've also tried this but it didn't work for me.
Thank you for providing the data as text.
This is something you could play with:
db.kieron.aggregate([{
$unwind : "$bookings"
}, {
$match : {
"bookings.transactions" : {
$exists : true,
$not : {
$size : 0
}
}
}
}, {
$group : {
_id : "$agencyID",
count : {
$sum : {
$size : "$bookings.transactions"
}
}
}
}
])
Because transactions is an array nested inside the bookings array, we need to unwind bookings first; then we can take the size of the inner array and sum it per agencyID.
Happy reporting!
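For readers who want to see what the three pipeline stages compute, here is a plain-JavaScript sketch of the same logic. It is an illustration only: countTransactionsByAgency is a hypothetical helper, the sample documents are invented, and, like the pipeline, it counts every transaction rather than de-duplicating them.

```javascript
// Plain-JavaScript equivalent of the $unwind / $match / $group stages.
// countTransactionsByAgency is a hypothetical helper, not a MongoDB API.
function countTransactionsByAgency(docs) {
  const counts = {};
  for (const doc of docs) {
    for (const booking of doc.bookings || []) {        // like $unwind on bookings
      const n = (booking.transactions || []).length;   // like $size
      if (n === 0) continue;                           // like the $match stage
      counts[doc.agencyID] = (counts[doc.agencyID] || 0) + n; // like $group with $sum
    }
  }
  return counts;
}

// Invented sample data with two agencies.
const sampleAgencies = [
  { agencyID: "01721487", bookings: [
    { bookingID: "24232", transactions: [{ tranID: "001" }] },
    { bookingID: "24233", transactions: [{ tranID: "002" }, { tranID: "003" }] }
  ] },
  { agencyID: "09999999", bookings: [
    { bookingID: "1", transactions: [] },
    { bookingID: "2", transactions: [{ tranID: "004" }] }
  ] }
];

const transactionCounts = countTransactionsByAgency(sampleAgencies);
console.log(transactionCounts["01721487"]); // 3
console.log(transactionCounts["09999999"]); // 1
```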
This is a follow-up from this question, where I tried to solve this problem with the aggregation framework. Unfortunately, I have to wait before being able to update this particular mongodb installation to a version that includes the aggregation framework, so have had to use MapReduce for this fairly simple pivot operation.
I have input data in the format below, with multiple daily dumps:
{
"_id" : "daily_dump_2013-05-23",
"authors_who_sold_books" : [
{
"id" : "Charles Dickens",
"original_stock" : 253,
"customers" : [
{
"time_bought" : 1368627290,
"customer_id" : 9715923
}
]
},
{
"id" : "JRR Tolkien",
"original_stock" : 24,
"customers" : [
{
"date_bought" : 1368540890,
"customer_id" : 9872345
},
{
"date_bought" : 1368537290,
"customer_id" : 9163893
}
]
}
]
}
I'm after output in the following format, that aggregates across all instances of each (unique) author across all daily dumps:
{
"_id" : "Charles Dickens",
"original_stock" : 253,
"customers" : [
{
"date_bought" : 1368627290,
"customer_id" : 9715923
},
{
"date_bought" : 1368622358,
"customer_id" : 9876234
},
etc...
]
}
I have written this map function...
function map() {
for (var i in this.authors_who_sold_books)
{
author = this.authors_who_sold_books[i];
emit(author.id, {customers: author.customers, original_stock: author.original_stock, num_sold: 1});
}
}
...and this reduce function.
function reduce(key, values) {
sum = 0
for (i in values)
{
sum += values[i].customers.length
}
return {num_sold : sum};
}
However, this gives me the following output:
{
"_id" : "Charles Dickens",
"value" : {
"customers" : [
{
"date_bought" : 1368627290,
"customer_id" : 9715923
},
{
"date_bought" : 1368622358,
"customer_id" : 9876234
},
],
"original_stock" : 253,
"num_sold" : 1
}
}
{ "_id" : "JRR Tolkien", "value" : { "num_sold" : 3 } }
{
"_id" : "JK Rowling",
"value" : {
"customers" : [
{
"date_bought" : 1368627290,
"customer_id" : 9715923
},
{
"date_bought" : 1368622358,
"customer_id" : 9876234
},
],
"original_stock" : 183,
"num_sold" : 1
}
}
{ "_id" : "John Grisham", "value" : { "num_sold" : 2 } }
The even indexed documents have the customers and original_stock listed, but an incorrect sum of num_sold.
The odd indexed documents only have the num_sold listed, but it is the correct number.
Could anyone tell me what it is I'm missing, please?
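Your problem is that the value returned by the reduce function must have the same format as the values emitted by the map function (see the requirements for the reduce function for an explanation). MongoDB may call reduce more than once for the same key, feeding earlier reduce output back in as input, and it does not call reduce at all for a key with a single emitted value. That is why some of your results keep the map format and others contain only num_sold.
You need to change the code to something like the following to fix the problem: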
function map() {
    for (var i in this.authors_who_sold_books)
    {
        var author = this.authors_who_sold_books[i];
        // Emit the same shape that reduce returns, with the real number of sales.
        emit(author.id, {customers: author.customers, original_stock: author.original_stock, num_sold: author.customers.length});
    }
}
function reduce(key, values) {
    // Start from an empty value of the emitted shape and merge every input into it.
    var result = {customers: [], num_sold: 0, original_stock: (values.length ? values[0].original_stock : 0)};
    for (var i in values)
    {
        result.num_sold += values[i].num_sold;
        result.customers = result.customers.concat(values[i].customers);
    }
    return result;
}
I hope that helps.
Note the change to num_sold: author.customers.length in the map function. I think that's what you want.
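The requirement can be checked outside MongoDB with a small plain-JavaScript sketch. It assumes invented sample values and a hypothetical function name, reduceAuthor, which mirrors the corrected reduce function; the point is that reducing everything at once and reducing in two steps give the same result.

```javascript
// Mirrors the corrected reduce function; reduceAuthor is a hypothetical name.
function reduceAuthor(key, values) {
  const result = {
    customers: [],
    num_sold: 0,
    original_stock: values.length ? values[0].original_stock : 0
  };
  for (const v of values) {
    result.num_sold += v.num_sold;
    result.customers = result.customers.concat(v.customers);
  }
  return result;
}

// Invented values of the shape the corrected map function emits.
const emittedValues = [
  { customers: [{ customer_id: 1 }], original_stock: 24, num_sold: 1 },
  { customers: [{ customer_id: 2 }, { customer_id: 3 }], original_stock: 24, num_sold: 2 },
  { customers: [{ customer_id: 4 }], original_stock: 24, num_sold: 1 }
];

// Reduce everything in one call ...
const allAtOnce = reduceAuthor("JRR Tolkien", emittedValues);
// ... or reduce a partial batch first and feed its output back in.
const partialResult = reduceAuthor("JRR Tolkien", emittedValues.slice(0, 2));
const inTwoSteps = reduceAuthor("JRR Tolkien", [partialResult, emittedValues[2]]);

console.log(allAtOnce.num_sold, inTwoSteps.num_sold); // 4 4
```

With the original reduce, which returned only {num_sold: sum}, the second call would fail because the partial result has no customers array to read.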
I have some data that looks like this:
[
{
"_id" : ObjectId("4e2f2af16f1e7e4c2000000a"),
"advertisers" : [
{
"created_at" : ISODate("2011-07-26T21:02:19Z"),
"category" : "Infinity Pro Spin Air Brush",
"updated_at" : ISODate("2011-07-26T21:02:19Z"),
"lowered_name" : "conair",
"twitter_name" : "",
"facebook_page_url" : "",
"website_url" : "",
"user_ids" : [ ],
"blog_url" : "",
},
and I was thinking that a query like this would give the id of the advertiser:
var start = new Date(2011, 1, 1);
var end = new Date(2011, 12, 12);
db.agencies.find( { "created_at" : {$gte : start , $lt : end} } , { _id : 1 , program_ids : 1 , advertisers { name : 1 } } ).limit(1).toArray();
But my query didn't work. Any idea how I can add the fields inside the nested elements to my list of fields I want to get?
Thanks!
Use dot notation (e.g. advertisers.name) to query and retrieve fields from nested objects:
db.agencies.find(
    {
        "advertisers.created_at": {
            $gte: start,
            $lt: end
        }
    },
    {
        _id: 1,
        program_ids: 1,
        "advertisers.name": 1
    }
).limit(1).toArray();
Reference: Retrieving a Subset of Fields and Dot Notation
db.agencies.find(
{ "advertisers.created_at" : {$gte : start , $lt : end} } ,
{ program_ids : 1 , "advertisers.name" : 1 }
).limit(1).pretty();
MongoDB provides dot notation, which lets you look inside embedded documents and arrays of documents. Using it is as simple as joining the field names with a dot for each level you want to enter; the dotted name must be quoted.
In your case
"_id" : ObjectId("4e2f2af16f1e7e4c2000000a"),
"advertisers" : [
{
"created_at" : ISODate("2011-07-26T21:02:19Z"),
"category" : "Infinity Pro Spin Air Brush",
"updated_at" : ISODate("2011-07-26T21:02:19Z"),
"lowered_name" : "conair",
"twitter_name" : "",
"facebook_page_url" : "",
"website_url" : "",
"user_ids" : [ ],
"blog_url" : "",
},
{ ... }
If you want to go inside the array of advertisers to look at the property created_at inside each one of them, you can simply write the query with the property {'advertisers.created_at': query} as follows:
db.agencies.find( { 'advertisers.created_at' : { $gte : start , $lt : end } }, ... )
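To make the idea of descending through an array concrete, here is a simplified plain-JavaScript sketch of how a dotted path can be resolved against a document. It is not MongoDB's implementation: valuesAtPath is a hypothetical helper, the sample document is invented, and features such as numeric array indexes in the path are left out.

```javascript
// Hypothetical helper: collect every value reachable through a dotted path,
// stepping into each element whenever a field holds an array.
function valuesAtPath(doc, path) {
  let current = [doc];
  for (const part of path.split(".")) {
    const next = [];
    for (const item of current) {
      if (item === null || typeof item !== "object") continue;
      const value = item[part];
      if (value === undefined) continue;
      if (Array.isArray(value)) next.push(...value);
      else next.push(value);
    }
    current = next;
  }
  return current;
}

// Invented sample with two advertisers.
const agencyDoc = {
  _id: "4e2f2af16f1e7e4c2000000a",
  advertisers: [
    { lowered_name: "conair", created_at: new Date("2011-07-26T21:02:19Z") },
    { lowered_name: "other", created_at: new Date("2010-03-01T00:00:00Z") }
  ]
};

const rangeStart = new Date(Date.UTC(2011, 0, 1));
const rangeEnd = new Date(Date.UTC(2012, 0, 1));

const createdDates = valuesAtPath(agencyDoc, "advertisers.created_at");
// The document qualifies if at least one advertiser falls inside the range.
const agencyMatches = createdDates.some(d => d >= rangeStart && d < rangeEnd);
console.log(createdDates.length, agencyMatches); // 2 true
```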
I am trying to run a map/reduce function in mongodb where I group by 3 different fields contained in objects in my collection. I can get the map/reduce function to run, but all the emitted fields run together in the output collection. I'm not sure this is normal or not, but outputting the data for analysis takes more work to clean up. Is there a way to separate them, then use mongoexport?
Let me show you what I mean:
The fields I am trying to group by are the day, user ID (or uid) and destination.
I run these functions:
map = function() {
day = (this.created_at.getFullYear() + "-" + (this.created_at.getMonth()+1) + "-" + this.created_at.getDate());
emit({day: day, uid: this.uid, destination: this.destination}, {count:1});
}
/* Reduce Function */
reduce = function(key, values) {
var count = 0;
values.forEach(function(v) {
count += v['count'];
}
);
return {count: count};
}
/* Output Function */
db.events.mapReduce(map, reduce, {query: {destination: {$ne:null}}, out: "TMP"});
The output looks like this:
{ "_id" : { "day" : "2012-4-9", "uid" : "1234456", "destination" : "Home" }, "value" : { "count" : 1 } }
{ "_id" : { "day" : "2012-4-9", "uid" : "2345678", "destination" : "Home" }, "value" : { "count" : 1 } }
{ "_id" : { "day" : "2012-4-9", "uid" : "3456789", "destination" : "Login" }, "value" : { "count" : 1 } }
{ "_id" : { "day" : "2012-4-9", "uid" : "4567890", "destination" : "Contact" }, "value" : { "count" : 1 } }
{ "_id" : { "day" : "2012-4-9", "uid" : "5678901", "destination" : "Help" }, "value" : { "count" : 1 } }
When I attempt to use mongoexport, I can not separate day, uid, or destination by columns because the map combines the fields together.
What I would like to have would look like this:
{ { "day" : "2012-4-9" }, { "uid" : "1234456" }, { "destination" : "Home"}, { "count" : 1 } }
Is this even possible?
As an aside - I was able to make the output work by applying sed to the file and cleaning up the CSV. More work, but it worked. It would be ideal if I could get it out of mongodb in the correct format.
MapReduce only returns documents of the form {_id:some_id, value:some_value}
see: How to change the structure of MongoDB's map-reduce results?
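Since the output shape is fixed, one option is to flatten the documents in a second pass before exporting. The sketch below shows that reshaping step in plain JavaScript; flattenResult is a hypothetical helper, and the idea of writing the flattened documents to a separate collection (called TMP_flat here) is an assumption, not something from the question.

```javascript
// Hypothetical helper: merge the compound _id and the value into one flat document.
function flattenResult(doc) {
  return Object.assign({}, doc._id, doc.value);
}

// Two rows in the shape that mapReduce wrote to the TMP collection.
const mapReduceRows = [
  { _id: { day: "2012-4-9", uid: "1234456", destination: "Home" }, value: { count: 1 } },
  { _id: { day: "2012-4-9", uid: "3456789", destination: "Login" }, value: { count: 1 } }
];

const flatRows = mapReduceRows.map(flattenResult);
console.log(flatRows[0]);
// { day: '2012-4-9', uid: '1234456', destination: 'Home', count: 1 }

// In the shell, the same function could be applied per document, for example:
// db.TMP.find().forEach(function(d) { db.TMP_flat.insert(flattenResult(d)); });
// TMP_flat is an assumed collection name.
```

Each flattened document then has day, uid, destination and count as top-level fields, which map directly onto export columns.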