MongoDB MapReduce Emit Strangeness - mongodb

I'm very much a noob when it comes to MapReduce and I have been pulling my hair out over this issue. Hopefully someone can give me a hand.
My goal: get the revenue for a product and a count of the units sold.
Sample document from the Transactions collection I'm querying:
{ "_id" : ObjectId( "xxxxxxxxxx" ),
"MerchantID" : { "$ref" : "merchants",
"$id" : ObjectId( "xxxxxxxx" ) },
"TransactionSocialKey" : "xxxxxxxx",
"PurchaseComplete" : true,
"Products" : [
{ "ProductID" : { "$ref" : "products",
"$id" : ObjectId( "4ecae2b9cf72ab1f6900xxx1" ) },
"ProductPrice" : 14.99,
"ProductQuantity" : "1" },
{ "ProductID" : { "$ref" : "products",
"$id" : ObjectId( "4ecae2b9cf72ab1f690xxx2" ) },
"ProductPrice" : 14.99,
"ProductQuantity" : "1" } ],
"DateTimeCreated" : Date( 1321919161000 ) }
As you can see I have an embedded array called Products with the ProductID, Product Price, and Product Quantity.
My Map Function
map = function(){
if(this.PurchaseComplete === true){
this.Products.forEach(function(Product){
if(Product.ProductID.$id.toString() == Product_ID.toString()){
emit(Product_ID, {
"ProductQuantity" : Product.ProductQuantity,
"ProductPrice" : Product.ProductPrice,
"ProductID" : Product.ProductID.$id.toString()
});
}
});
}
}
With this, I only emit for transactions that were completed. For a completed transaction I loop through the Products array, and if Product.ProductID.$id equals the Product_ID that I set in the MapReduce scope, I emit that product.
For testing sake I've set up my Reduce function as:
reduce = function(key, Product_Transactions){
return {"Transactions" : Product_Transactions};
}
For some odd reason I'm getting this sort of result:
[results] => Array
(
[0] => Array
(
[_id] => MongoId Object
(
[$id] => 4ecae2b9cf72ab1f6900xxx1
)
[value] => Array
(
[Transactions] => Array
(
[0] => Array
(
[Transactions] => Array
(
[0] => Array
(
[ProductQuantity] => 1
[ProductPrice] => 14.99
[ProductID] => 4ecae2b9cf72ab1f6900xxx1
)
[1] => Array
(
[ProductQuantity] => 1
[ProductPrice] => 14.99
[ProductID] => 4ecae2b9cf72ab1f6900xxx1
)
It Continues…
)
)
[1] => Array
(
[ProductQuantity] => 1
[ProductPrice] => 12.74
[ProductID] => 4ecae2b9cf72ab1f6900xxx1
)
[2] => Array
(
[ProductQuantity] => 1
[ProductPrice] => 12.74
[ProductID] => 4ecae2b9cf72ab1f6900xxx1
)
)
)
)
)
I'm not sure why I'm getting this odd nested array. The emit key is always the same and never changes. I'm really lost for ideas on where to start troubleshooting. Any help or guidance would be appreciated.

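The output of map should be in the same format that reduce consumes and produces. The idea is that reduce may run in parallel and/or against partially reduced results, so a value returned by one reduce call can be passed back in as an input to another. Your reduce wraps its inputs in a new Transactions object on every call, which is why the result nests.
Here's what your code should look like (pseudo-code):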
var map = function() {
if(some condition) {
emit(product_id, {Transactions: [{ // <= note the array here!
"ProductQuantity" : Product.ProductQuantity,
"ProductPrice" : Product.ProductPrice,
"ProductID" : ID
}]})
}
};
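var reduce = function(key, vals) {
var result = {Transactions: []};
vals.forEach(function(v) {
// each value is either a map output or an earlier reduce output; both have a Transactions array
v.Transactions.forEach(function(t) {
result.Transactions.push(t);
});
});
return result;
};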
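Since the stated goal was revenue and units sold, the same rule (map output and reduce output share one shape) can be sketched with running totals instead of a growing array. This is a minimal sketch in plain JavaScript that runs outside MongoDB; the names mapProduct, reduceTotals, units and revenue are made up for illustration, and it assumes ProductQuantity is stored as a string, as in the sample document.

```javascript
// Hypothetical helper: the value the map function would emit for one product.
// ProductQuantity is a string in the sample document, so convert it first.
function mapProduct(product) {
  var quantity = parseInt(product.ProductQuantity, 10);
  return { units: quantity, revenue: quantity * product.ProductPrice };
}

// Reduce returns the same shape it receives, so its own output
// can safely be fed back in as one of the values (a "re-reduce").
function reduceTotals(key, values) {
  var result = { units: 0, revenue: 0 };
  values.forEach(function (v) {
    result.units += v.units;
    result.revenue += v.revenue;
  });
  return result;
}

var emitted = [
  { ProductQuantity: "1", ProductPrice: 14.99 },
  { ProductQuantity: "1", ProductPrice: 14.99 },
  { ProductQuantity: "1", ProductPrice: 12.74 }
].map(mapProduct);

// Reduce two values first, then re-reduce the partial result with the third.
var partial = reduceTotals("p1", emitted.slice(0, 2));
var rereduced = reduceTotals("p1", [partial, emitted[2]]);

// Reducing everything in one call gives the same totals.
var direct = reduceTotals("p1", emitted);
```

Because the partial result has the same shape as an emitted value, it makes no difference how the values are split across reduce calls.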

Related

Fetch a value from an array in a MongoDB collection

Below is my MongoDB collection
{
"_id" : ObjectId("5b2a2ee19d332d0118b26dfe"),
"Name" : "Rock",
"Job" : [
{
"Id" : ObjectId("5b2b93c63629d0271ce366ae"),
"JobName" : "abc",
"JobTrack" : [
"123"
]
}
]
}
I want to fetch both ObjectId values
my $cursor = $custColl->find(
{ 'Job.JobName' => "abc", 'Job.JobTrack' => "123" },
{ '_id' => 1, 'Job.Id' => 1 }
);
while ( my $next = $cursor->next ) {
my $CustomerId = "$next->{_id}";
my $JobId = "$next->{'Job.Id'}";
say "$CustomerId => $JobId\n";
}
The result I got from the above code is as follows:
5b2a2ee19d332d0118b26dfe =>
With this code I'm not able to get $JobId.
Assuming the query finds a document, $next is a Perl data structure that resembles your original JSON:
{
'Name' => 'Rock',
'_id' => bless( {
'value' => '5b2a2ee19d332d0118b26dfe'
}, 'MongoDB::OID' ),
'Job' => [
{
'JobName' => 'abc',
'JobTrack' => [
'123'
],
'Id' => bless( {
'value' => '5b2b93c63629d0271ce366ae'
}, 'MongoDB::OID' )
}
]
}
To get the job ID, you need to dereference that structure using Perl syntax, not MongoDB syntax:
my $JobId = "$next->{Job}[0]{Id}";
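The same distinction holds in other languages: a dotted path such as Job.Id is query syntax understood by the server, not a key in the returned document. The following is only a JavaScript analogue of the Perl fix, with the ObjectId replaced by a plain string for illustration.

```javascript
// A document shaped like the one in the question (ids shown as plain strings).
var doc = {
  Name: "Rock",
  Job: [
    { Id: "5b2b93c63629d0271ce366ae", JobName: "abc", JobTrack: ["123"] }
  ]
};

// A dotted path is not a key of the in-memory structure, so this finds nothing.
var byDottedKey = doc["Job.Id"];

// Navigating the structure step by step reaches the nested value.
var byNavigation = doc.Job[0].Id;
```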
You need to access a specific element of the Job array. Because there is only one element in your example, it must have an index of zero. In Perl that means indexing into the array reference rather than using a dotted MongoDB path:
my $JobId = "$next->{Job}[0]{Id}";

Laravel MongoDB library 'jenssegers/laravel-mongodb' hasMany relationship is not working

I am using the MongoDB library https://github.com/jenssegers/laravel-mongodb version 3.1.0-alpha in Laravel 5.3.28. I have two collections in MongoDB and I want to set up a hasMany relation between them, meaning each employee performs many tasks. I have used references and added the employee ids to the task collection.
Below is my code:
MongoDB:
1st Collection: Employee
{
"_id" : ObjectId("586ca8c71a72cb07a681566d"),
"employee_name" : "John",
"employee_description" : "test description",
"employee_email" : "john#email.com",
"updated_at" : "2017-01-04 11:45:20",
"created_at" : "2017-01-04 11:45:20"
},
{
"_id" : ObjectId("586ca8d31a72cb07a6815671"),
"employee_name" : "Carlos",
"employee_description" : "test description",
"employee_email" : "carlos#email.com",
"updated_at" : "2017-01-04 11:45:20",
"created_at" : "2017-01-04 11:45:20"
}
2nd Collection: Task
{
"_id" : ObjectId("586ccbcf1a72cb07a6815b04"),
"task_name" : "New Task",
"task_description" : "test description",
"task_status" : 1,
"task_start" : "2017-04-01 12:00:00",
"task_end" : "2017-04-01 02:00:00",
"task_created_at" : "2017-04-01 02:17:00",
"task_updated_at" : "2017-04-01 02:17:00",
"employee_id" : [
ObjectId("586ca8c71a72cb07a681566d"),
ObjectId("586ca8d31a72cb07a6815671")
]
},
{
"_id" : ObjectId("586cd3261a72cb07a6815c69"),
"task_name" : "2nd Task",
"task_description" : "test description",
"task_status" : 1,
"task_start" : "2017-04-01 12:00:00",
"task_end" : "2017-04-01 02:00:00",
"task_created_at" : "2017-04-01 02:17:00",
"task_updated_at" : "2017-04-01 02:17:00",
"employee_id" : ObjectId("586ca8c71a72cb07a681566d")
}
Laravel:
Model:
Employee:
<?php
namespace App\Models;
use Jenssegers\Mongodb\Eloquent\Model as Eloquent;
class Employee extends Eloquent {
protected $collection = 'employee';
protected $primaryKey = '_id';
public function tasks()
{
return $this->hasMany('App\Models\Task');
}
}
Laravel:
Model:
Task:
<?php
namespace App\Models;
use Jenssegers\Mongodb\Eloquent\Model as Eloquent;
class Task extends Eloquent {
protected $collection = 'task';
protected $primaryKey = '_id';
public function employees()
{
return $this->belongsTo('App\Models\Employee');
}
}
I want to get tasks assigned to the specific employee.
Controller:
public function EmployeeData($data)
{
$employees = Employee::with('tasks')->where('_id', new \MongoDB\BSON\ObjectID('586ca8d31a72cb07a6815671'))->get();
echo "<pre>";
print_r($employees);exit;
}
Output:
Illuminate\Database\Eloquent\Collection Object
(
[items:protected] => Array
(
[0] => App\Models\Employee Object
(
[connection:protected] => mongodb
[collection:protected] => lt_employees
[primaryKey:protected] => _id
[employee_id:App\Models\Employee:private] =>
[employee_name:App\Models\Employee:private] =>
[employee_description:App\Models\Employee:private] =>
[employee_email:App\Models\Employee:private] =>
[employee_created_at:App\Models\Employee:private] =>
[employee_updated_at:App\Models\Employee:private] =>
[parentRelation:protected] =>
[table:protected] =>
[keyType:protected] => int
[perPage:protected] => 15
[incrementing] => 1
[timestamps] => 1
[attributes:protected] => Array
(
[_id] => MongoDB\BSON\ObjectID Object
(
[oid] => 586ca8d31a72cb07a6815671
)
[employee_name] => Carlos
[employee_description] => test description
[employee_email] => carlos#email.com
[updated_at] => 2017-01-04 11:45:20
[created_at] => 2017-01-04 11:45:20
)
[original:protected] => Array
(
[_id] => MongoDB\BSON\ObjectID Object
(
[oid] => 586ca8d31a72cb07a6815671
)
[employee_name] => Carlos
[employee_description] => test description
[employee_email] => carlos#email.com
[updated_at] => 2017-01-04 11:45:20
[created_at] => 2017-01-04 11:45:20
)
[relations:protected] => Array
(
[tasks] => Illuminate\Database\Eloquent\Collection Object
(
[items:protected] => Array
(
)
)
)
[hidden:protected] => Array
(
)
[visible:protected] => Array
(
)
[appends:protected] => Array
(
)
[fillable:protected] => Array
(
)
[guarded:protected] => Array
(
[0] => *
)
[dates:protected] => Array
(
)
[dateFormat:protected] =>
[casts:protected] => Array
(
)
[touches:protected] => Array
(
)
[observables:protected] => Array
(
)
[with:protected] => Array
(
)
[exists] => 1
[wasRecentlyCreated] =>
)
)
)
In the output, the items of the tasks relation are empty.
Can anyone tell me whether the relation between the collections is set up correctly?
Update
I have now used belongsToMany for the relation. My models are:
In the Employee Model:
public function tasks()
{
return $this->belongsToMany('App\Models\Task');
}
In the Task Model:
public function employees()
{
return $this->belongsToMany('App\Models\Employee');
}
These are the documents:
Employee collection
{
"_id" : ObjectId("586ca8c71a72cb07a681566d"),
"employee_name" : "Carlos",
"employee_description" : "test description",
"employee_email" : "carlos#email.com",
"updated_at" : "2017-01-04 11:45:20",
"created_at" : "2017-01-04 11:45:20",
"task_ids" : [
ObjectId("586ccbcf1a72cb07a6815b04"),
ObjectId("586cd3261a72cb07a6815c69")
]
},
{
"_id" : ObjectId("586ca8d31a72cb07a6815671"),
"employee_name" : "John",
"employee_description" : "test description",
"employee_email" : "john#email.com",
"updated_at" : "2017-01-04 11:45:20",
"created_at" : "2017-01-04 11:45:20"
}
Task collection
{
"_id" : ObjectId("586ccbcf1a72cb07a6815b04"),
"task_name" : "New Task",
"task_description" : "test description",
"task_status" : 1,
"task_start" : "2017-04-01 12:00:00",
"task_end" : "2017-04-01 02:00:00",
"task_created_at" : "2017-04-01 02:17:00",
"task_updated_at" : "2017-04-01 02:17:00",
"employee_ids" : [
ObjectId("586ca8c71a72cb07a681566d"),
ObjectId("586ca8d31a72cb07a6815671")
]
},
{
"_id" : ObjectId("586cd3261a72cb07a6815c69"),
"task_name" : "2nd Task",
"task_description" : "test description",
"task_status" : 1,
"task_start" : "2017-04-01 12:00:00",
"task_end" : "2017-04-01 02:00:00",
"task_created_at" : "2017-04-01 02:17:00",
"task_updated_at" : "2017-04-01 02:17:00",
"employee_ids" : ObjectId("586ca8c71a72cb07a681566d")
}
I get the first employee with these documents:
$employee = Employee::with('tasks')->first();
dd($employee);
And I got the output with an empty relation:
Employee {#176
#connection: "mongodb"
#collection: "employee"
#primaryKey: "_id"
-employee_id: null
-employee_name: null
-employee_description: null
-employee_email: null
-employee_created_at: null
-employee_updated_at: null
#parentRelation: null
#table: null
#keyType: "int"
#perPage: 15
+incrementing: true
+timestamps: true
#attributes: array:10 [
"_id" => ObjectID {#170}
"employee_name" => "Carlos"
"employee_description" => "test description"
"employee_email" => "carlos#email.com"
"updated_at" => "2017-01-04 11:45:20"
"created_at" => "2017-01-04 11:45:20"
"task_ids" => array:2 [
0 => ObjectID {#174}
1 => ObjectID {#175}
]
]
#original: array:10 [
"_id" => ObjectID {#170}
"employee_name" => "Carlos"
"employee_description" => "test description"
"employee_email" => "carlos#email.com"
"updated_at" => "2017-01-04 11:45:20"
"created_at" => "2017-01-04 11:45:20"
"task_ids" => array:2 [
0 => ObjectID {#174}
1 => ObjectID {#175}
]
]
#relations: array:1 [
"tasks" => Collection {#173
#items: []
}
]
#hidden: []
#visible: []
#appends: []
#fillable: []
#guarded: array:1 [
0 => "*"
]
#dates: []
#dateFormat: null
#casts: []
#touches: []
#observables: []
#with: []
+exists: true
+wasRecentlyCreated: false
}
From your other question I understood that a task can belong to many employees, right? So you should be using a belongsToMany relationship in your Task model. Also, your example "task" collection shows that employee_id is an array in one document and an ObjectId in the other, when both should be arrays.
Anyway, I had a hard time figuring this out, but I've seen that you can't use hasMany as the inverse of belongsToMany, because belongsToMany creates an array of ids and hasMany doesn't work well with arrays. I would say we need something like hasManyInArray, but when I associate models through a belongsToMany relationship, the "parent" document also gets an array of ids, which leads me to think that the parent should use belongsToMany too, even though it doesn't "belong to" but actually "has". So when you associate an employee with a task like this:
$task->employees()->save($employee);
the "employee" document ends up with a "task_ids" attribute containing the id of that task. So that seems to be the way to go with Jenssegers: use belongsToMany in both models:
Laravel: Model: Employee:
<?php
namespace App\Models;
use Jenssegers\Mongodb\Eloquent\Model as Eloquent;
class Employee extends Eloquent
{
protected $collection = 'employee';
public function tasks()
{
return $this->belongsToMany(Task::class);
}
}
Laravel: Model: Task:
<?php
namespace App\Models;
use Jenssegers\Mongodb\Eloquent\Model as Eloquent;
class Task extends Eloquent
{
protected $collection = 'task';
public function employees()
{
return $this->belongsToMany(Employee::class);
}
}
And you would use this like:
// Give a task a new employee
$task->employees()->save($employee);
// Or give an employee a new task
$employee->tasks()->save($task);
The only thing to note is that when you look at the database, you will see that your employee documents have an array called "task_ids" containing the ids of the tasks each employee has. I hope this helps.
Just some side notes, you know that you don't have to define the name of the primary key on each model, right? You don't need this:
protected $primaryKey = '_id';
Also you don't have to define the name of the collection (i.e. protected $collection = 'employee';), unless you really want them to be in singular (by default they are in plural).
I got up in the middle of the night (it's 3:52 AM here), checked something on the computer, then checked SO and saw your question. I hope I answered soon enough this time; we seem to be in different timezones.
Update
These are the documents I created for testing:
employee collection
{
"_id" : ObjectId("5870ba1973b55b03d913ba54"),
"name" : "Jon",
"updated_at" : ISODate("2017-01-07T09:51:21.316Z"),
"created_at" : ISODate("2017-01-07T09:51:21.316Z"),
"task_ids" : [
"5870ba1973b55b03d913ba56"
]
},
{
"_id" : ObjectId("5870ba1973b55b03d913ba55"),
"name" : "Doe",
"updated_at" : ISODate("2017-01-07T09:51:21.317Z"),
"created_at" : ISODate("2017-01-07T09:51:21.317Z"),
"task_ids" : [
"5870ba1973b55b03d913ba56"
]
}
task collection
{
"_id" : ObjectId("5870ba1973b55b03d913ba56"),
"name" : "New Task",
"updated_at" : ISODate("2017-01-07T09:51:21.317Z"),
"created_at" : ISODate("2017-01-07T09:51:21.317Z"),
"employee_ids" : [
"5870ba1973b55b03d913ba54",
"5870ba1973b55b03d913ba55"
]
}
With these documents I get the first employee like this:
$employee = Employee::with('tasks')->first();
dd($employee);
And in the output we can see the relations attribute is an array:
Employee {#186 ▼
#collection: "employee"
#primaryKey: "_id"
// Etc.....
#relations: array:1 [▼
"tasks" => Collection {#199 ▼
#items: array:1 [▼
0 => Task {#198 ▼
#collection: "task"
#primaryKey: "_id"
// Etc....
#attributes: array:5 [▼
"_id" => ObjectID {#193}
"name" => "New Task"
"updated_at" => UTCDateTime {#195}
"created_at" => UTCDateTime {#197}
"employee_ids" => array:2 [▶]
]
}
]
}
]
}
Update 2
The belongsToMany method isn't in the file you mention because that class (i.e. Jenssegers\Mongodb\Eloquent\Model) extends Laravel's Eloquent Model class, and that's where the belongsToMany method is.
OK, so that must be why it's not working for you: the arrays have to contain strings instead of ObjectIds. Why? Because that's how the Jenssegers library works; it saves the ids as strings. I also found this behaviour strange, but that's how it works. Remember that you are supposed to relate objects using the Jenssegers library, not by creating the data manually in the database.
How can you index the ids? Just create a normal index in MongoDB on the collection that holds the array, like db.employee.createIndex({task_ids: 1}). Here's the documentation on how to create indexes: https://docs.mongodb.com/manual/reference/method/db.collection.createIndex/. You can also create indexes in migrations; see the docs on migrations, and make sure to read the Jenssegers notes on migrations too.
You can access the tasks relation like this: $employee->tasks;. You access a relation by reading a property with the same name as the method you declared the relation with, so if you have:
class Post
{
public function owner()
{
return $this->belongsTo(User::class);
}
}
You get the relation as $post->owner;. Here's the documentation on relations: https://laravel.com/docs/5.3/eloquent-relationships
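On the earlier point that the id arrays must hold strings, and that employee_ids was an array in one document but a single id in another: if manually created documents need repairing, the normalization itself is simple. The sketch below is plain JavaScript with a hypothetical FakeObjectId stand-in and a made-up normalizeIds helper; it is not part of the Jenssegers library, and with a real driver you would use whatever hex-string accessor its ObjectId type provides.

```javascript
// FakeObjectId is a hypothetical stand-in for a driver's ObjectId type;
// the only assumption is that converting it to a string yields the hex id.
function FakeObjectId(hex) { this.hex = hex; }
FakeObjectId.prototype.toString = function () { return this.hex; };

// Return the field as an array of id strings, whether it held
// a single id, an array of ids, or nothing at all.
function normalizeIds(value) {
  if (value === undefined || value === null) { return []; }
  var list = Array.isArray(value) ? value : [value];
  return list.map(function (id) { return String(id); });
}

var single = normalizeIds(new FakeObjectId("586ca8c71a72cb07a681566d"));
var many = normalizeIds([
  new FakeObjectId("586ca8c71a72cb07a681566d"),
  new FakeObjectId("586ca8d31a72cb07a6815671")
]);
var none = normalizeIds(undefined);
```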

How to apply correctly $limit and $skip in subfields?

I'm starting out with MongoDB and I'm having a lot of difficulty with the following schema.
{
"_id" : "AAA",
"events" : [
{
"event" : "001",
"time" : 1456823333
},
{
"event" : "002",
"time" : 1456828888
},
{
"event" : "003",
"time" : 1456825555
},...
]
}
I want to get the events sorted by date and apply limit and skip.
I'm using the following query:
$op = array(
array('$match' => array('_id' => $userId)),
array('$unwind' => '$events'),
array('$sort' => array('events.time' => -1)),
array('$group' => array('_id' => '$_id',
'events' => array('$push' => '$events')))
//,array('$project' => array('_id' => 1, 'events' => array('$events', 0, 3)))
//,array('$limit' => 4)
//,array('$skip' => 3)
);
$result= Mongo->aggregate('mycollection', $op);
I have tried everything with $project, $limit and $skip, but none of it works.
How should I apply the limit and skip conditions to events?
If I do not apply the limit conditions above, the result is ordered correctly.
Result:
{ "waitedMS":0,
"result":[
{
"_id":"AAA",
"events":[
{
"event":"002",
"time":1456828888,
},
{
"event":"003",
"time":1456825555,
},
{
"event":"001",
"time":1456823333,
},...
}
],
"ok":1
}
The order is correct, but I cannot limit the number of results for paging.
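No answer is recorded for this question here. One possible approach, offered as a sketch rather than a confirmed fix, is to place $skip and $limit after $sort and before $group, so that paging applies to the unwound events instead of the regrouped document. The pipeline below is in mongo shell syntax; the skip and limit values and the pageEvents helper are illustrative assumptions, and the helper only emulates the stages in plain JavaScript so the paging logic can be checked without a server.

```javascript
// Page parameters are illustrative assumptions.
var skip = 1;
var limit = 2;

// Pipeline in mongo shell syntax: page the unwound events, then regroup them.
var pipeline = [
  { $match: { _id: "AAA" } },
  { $unwind: "$events" },
  { $sort: { "events.time": -1 } },
  { $skip: skip },
  { $limit: limit },
  { $group: { _id: "$_id", events: { $push: "$events" } } }
];

// Plain JavaScript emulation of what those stages do to a single document.
function pageEvents(doc, skip, limit) {
  var sorted = doc.events.slice().sort(function (a, b) { return b.time - a.time; });
  return { _id: doc._id, events: sorted.slice(skip, skip + limit) };
}

var page = pageEvents({
  _id: "AAA",
  events: [
    { event: "001", time: 1456823333 },
    { event: "002", time: 1456828888 },
    { event: "003", time: 1456825555 }
  ]
}, skip, limit);
```

This only pages correctly when $match selects a single document, as it does here. If skip runs past the last event, no rows reach $group and the result is empty.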

how to check two fields having same values in same table in laravel 5 with mongodb

I have to check whether two fields in the same collection have the same value in Laravel 5. I am using MongoDB.
{
"id": "565d23ef5c2a4c9454355679",
"title": "Event1",
"summary": "test",
"total": NumberInt(87),
"remaining": NumberInt(87),
"status": "1"
}
I need to check whether the "total" and "remaining" fields are the same. How do I write this query in Laravel 5.1? Please help.
One approach you could take would be using the aggregation framework methods from the raw MongoDB collection object provided from the underlying driver. In the mongo shell, you would essentially run the following aggregation pipeline operation to compare the two fields and return the documents which satisfy that criteria:
db.collection.aggregate([
{
"$project": {
"isMatch": { "$eq" : ["$total", "$remaining"] }, // similar to "valueof(total) == valueof(remaining)"
"id" : 1,
"title" : 1,
"summary" : 1,
"total" : 1,
"remaining" : 1,
"status" : 1
}
},
{
"$match": { "isMatch": true } // filter to get documents that only satisfy "valueof(total) == valueof(remaining)"
}
]);
Or using the $where operator in the find() query:
db.collection.find({ "$where" : "this.total == this.remaining" })
Thus in Laravel, you can get the documents using raw expressions as follows. Note the single quotes: inside double quotes PHP would try to interpolate $project, $eq, $total and so on as variables.
$result = DB::collection('collectionName')->raw(function ($collection)
{
return $collection->aggregate(array(
array(
'$project' => array(
'id' => 1,
'title' => 1,
'summary' => 1,
'total' => 1,
'remaining' => 1,
'status' => 1,
'isMatch' => array(
'$eq' => array('$total', '$remaining')
)
)
),
array(
'$match' => array(
'isMatch' => true
)
)
));
});
In the case of $where, you can inject the expression directly into the query:
Model::whereRaw(array('$where' => 'this.total == this.remaining'))->get();
Or use the raw expression on the internal MongoCollection object underlying the query builder. Note that using the raw() method requires working with a cursor because it is a low-level call:
$result = Model::raw()->find(array('$where' => 'this.total == this.remaining'));
Collectionname::whereRaw(array('$where' => "this.field1 > this.field2"))

MongoDB Aggregation: Value in Array

We have the following test snippet in Ruby:
def self.course_overview(course_member=nil)
course_member = CourseMember.last if course_member == nil
group_global = {"$group" =>
{"_id" => { "course_id" => "$course_id",
"title" => "$title",
"place" => "$place",
"description" => "$description",
"choosen_id" => "$choosen_id",
"year" => {"$year" => "$created_at"},
"course_member_ids" => "$course_member_ids"}}
}
match_global = {"$match" => {"_id.course_member_ids" => {"$in" => "#{course_member.id}"} }}
test = CoursePlan.collection.aggregate([group_global, match_global])
return test
end
The problem is the "match_global" statement. We would like to match all documents where the course_member id appears in the course_member_ids array.
The above statement fails with the error: "...must be an array". This makes sense to me, but according to other comments on the web it should be possible this way.
Any advice? How is it possible to return the docs where the course_member id is in the array of course_member ids?
Sample CoursePlan Object:
{
"_id" : ObjectId("5371e70651a53ed5ad000055"),
"course_id" : ObjectId("5371e2e051a53ed5ad000039"),
"course_member_ids" : [
ObjectId("5371e2a751a53ed5ad00002d"),
ObjectId("5371e2b251a53ed5ad000030"),
ObjectId("5371e2bb51a53ed5ad000033")
],
"created_at" : ISODate("2014-05-13T09:33:58.042Z"),
"current_user" : "51b473bf6986aee9c0000002",
"description" : "Schulung 1 / Elektro",
"fill_out" : ISODate("2014-04-30T22:00:00.000Z"),
"place" : "TEST",
"title" : "Schulung 1",
"updated_at" : ISODate("2014-05-13T09:33:58.811Z"),
"user_ids" : [
ObjectId("51b473bf6986aee9c0000002"),
ObjectId("521d7f606986ae4826000002"),
ObjectId("521d8b3f6986aed678000007")
]
}
Since course_member_ids is an array of course member ids, you should test for equality: an equality condition on an array field matches when any element of the array equals the value. In shell syntax:
{$match:{"_id.course_member_ids":<valueYouWantToTest>}}
You don't need $in, as this query is analogous to a find where you select documents that contain the particular single value you are looking for.
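The matching rule relied on here can be illustrated with a small emulation. This is a deliberately simplified model in plain JavaScript, not MongoDB's actual matcher; equalityMatches is a made-up name, and the ids from the sample document are written as plain strings.

```javascript
// Simplified model of an equality condition on a field:
// an array field matches when any element equals the target,
// a scalar field matches when it equals the target itself.
function equalityMatches(fieldValue, target) {
  if (Array.isArray(fieldValue)) {
    return fieldValue.indexOf(target) !== -1;
  }
  return fieldValue === target;
}

// course_member_ids from the sample CoursePlan, as plain strings.
var memberIds = [
  "5371e2a751a53ed5ad00002d",
  "5371e2b251a53ed5ad000030",
  "5371e2bb51a53ed5ad000033"
];

var isMember = equalityMatches(memberIds, "5371e2b251a53ed5ad000030");
var isStranger = equalityMatches(memberIds, "51b473bf6986aee9c0000002");
```

In the real query the value must have the same type as the stored elements, so an ObjectId rather than its string form. The interpolated string "#{course_member.id}" in the question would not match the ObjectIds stored in the array.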