MongoDB Aggregation - Accessing lookup fields in project - mongodb

I access a lookup field in $project using $unwind, but this breaks access to the other nested fields from the main collection. Is there any way to access the fields from both collections in $project? I thought of merging the arrays, but I am not sure that is the right approach.
Users collection
{
"_id" : ObjectId("5a54f739fe0a00373e7ef1e8"),
"team" : {
"name" : "test",
},
"updated_at" : ISODate("2018-05-22T04:28:00Z"),
"created_at" : ISODate("2018-01-09T17:09:13Z"),
"users" : [
{
"updated_at" : ISODate("2018-11-22T11:55:22Z"),
"created_at" : ISODate("2018-01-09T17:09:13Z"),
"_id" : ObjectId("5a54f739fe0a00373e7ef1e9"),
"name" : "test",
"status" : "active",
"title" : "Engineer",
},
{
"updated_at" : ISODate("2018-11-22T11:55:22Z"),
"created_at" : ISODate("2018-01-09T17:09:13Z"),
"_id" : ObjectId("5a54f739fe0a00373e7ef1e9"),
"name" : "test1",
"status" : "passive",
"title" : "Tester",
}
]
}
Comments collection:
{
"_id" : ObjectId("6062178fc73fe806e45c9b69"),
"userId" : "5a54f739fe0a00373e7ef1e9",
"text" : "this is a test",
"status" : "1",
"timestamp" : ISODate("2021-03-29T18:08:14.317Z")
}
Pipeline
$pipeline = [['$match' => [
'users' => [
'$elemMatch' => [
'field1' => $field1,
],
]
]
],
['$unwind' => '$users'],
['$match' => [
'users.field1' => $field1,
]
],
['$addFields' => ['userId' => ['$toString' => '$userId' ]]],
['$lookup' => [
'from' => 'comments',
'localField' => 'userId',
'foreignField' => 'userId',
'as' => 'userComments'
]
],
['$unwind' => '$userComments'],
['$project' => [
'comments' => [
'$switch' => [
'branches' => [
[ 'case' => [
'$eq' => ['$userComments.status','verified']
],
'then' => 1],
[ 'case' => [
'$lte' => ['$userComments.status', '']
],
'then' => 1],
],
'default' => 0
]
],
'status' => '$users.status',
'total' => [false],
]
],
['$group' => [
'_id' => $groupBy,
'text' => ['$sum' => '$comments'],
'total' => ['$sum' => '$total'],
'completed' => ['$sum' => '$status'],
]
],
];
result
{"_id" :"categories","text": 21,"total": 100,"completed":50}

Related

laravel eloquent SUM with relation

Hi guys, I want to calculate the sum of rate for comments, grouped by type, like this:
Post::with(['comments' => function ($q) {
$q->selectRaw('type, SUM(rate) as total_rate')
->groupBy('type');
}])
I'm waiting for a result like this:
0 => array:4 [
"id" => 5
"start_date" => "2022-01-01"
"end_date" => "2022-01-31"
"comments" => array:2 [
0 => array:3 [
"type" => "personal"
"total_rate" => 44244.0
]
1 => array:3 [
"type" => "business"
"total_rate" => 22358.0
]
]
but the result is
0 => array:4 [
"id" => 5
"start_date" => "2022-01-01"
"end_date" => "2022-01-31"
"comments" => []
]

MongoDB lookup issues with performance php

I am trying to fetch data from mongoDB using lookup
collections companies
{
"id" : 1,
"company_id" : 2,
"user_group" : 1,
"company_name" : "xyz",
"created_on" : "00-00-0000"
}
collection users
{
"id" : 1,
"company_id" : 1,
"user_group" : 1,
"name" : "abcd",
"email" : "abcd#abcd.abcd"
}
{
"id" : 1,
"company_id" : 2,
"active": 1,
"user_group" : 1,
"name" : "efgh",
"email" : "efgh#efgh.efgh"
}
Query used to fetch data using php
$collection->aggregate([
['$match' => ['company_id' => 2]],
['$lookup' => [
'from' => 'users',
'localField' => 'user_group',
'foreignField' => 'user_group',
'as' => 'company_users',
]],
['$unwind' => ['path' => '$company_users', 'preserveNullAndEmptyArrays' => true]],
['$match' => ['$and' => [['company_users.company_id' => 2], ['company_users.active' => 1]]]],
['$project' => [
'_id' => false,
'company_id' => true,
'user_group' => true,
'company_users.name' => true,
'company_users.email' => true
]
]
]);
The query works correctly, but it takes much longer to retrieve data once there are more than 1000 documents.

group by date in laravel mongodb

I'm using Mongo and Laravel.
I'm getting data for periods of time, usually 30 days, and I want to group the data by day. I've tried $project with $dayOfMonth and grouping by day, but that groups the results by day of the month, and I want them ordered by day within the period.
Is there a way?
[
'$match' => [
"created_at" => [
'$gte' => new \MongoDB\BSON\UTCDateTime($thisMonth),
]
],
],
[
'$project' => [
'day' => [
'$dayOfMonth' => [
'date' => '$created_at',
]
],
]
],
[
'$group' => [
'_id' => '$day',
'count' => ['$sum' => 1],
],
],
sample:
{#940
flag::STD_PROP_LIST: false
flag::ARRAY_AS_PROPS: true
iteratorClass: "ArrayIterator"
storage: array:16 [
"_id" => ObjectId {#933
+"oid": "5b2ff00e35826377be16ff82"
}
"orderNumber" => "10000"
"userName" => "dGFraHRlLTkwMDQ2NDcyOJRbugaZlHWdMR+nCNzaUfY="
"updated_at" => UTCDateTime {#938
+"milliseconds": "1529868539000"
}
"created_at" => UTCDateTime {#939
+"milliseconds": "1529868302000"
}
]
}
You can use $dateToString to convert a date object to a string in your desired format. Use single quotes in PHP so that names such as $group and $created_at are not interpolated as PHP variables.
[ '$match' => [
'created_at' => [ '$gte' => new \MongoDB\BSON\UTCDateTime($thisMonth) ]
]],
[ '$group' => [
'_id' => [ '$dateToString' => [ 'format' => '%Y-%m-%d', 'date' => '$created_at' ] ],
'count' => [ '$sum' => 1 ]
]]
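To see why a full date string works better as a group key than the day of the month, here is a minimal plain-JavaScript sketch that needs no database. The sample timestamps and the helper names (`dayOfMonthKey`, `dateStringKey`, `countBy`) are made up for illustration; they only mimic what `$dayOfMonth` and `$dateToString` with `%Y-%m-%d` produce in UTC.

```javascript
// Hypothetical sample: one timestamp per day across a month boundary.
const createdAt = [
  new Date("2018-06-29T10:00:00Z"),
  new Date("2018-06-30T10:00:00Z"),
  new Date("2018-07-01T10:00:00Z"),
  new Date("2018-07-02T10:00:00Z"),
];

// Stand-in for $dayOfMonth: the key loses the month and the year.
function dayOfMonthKey(date) {
  return date.getUTCDate();
}

// Stand-in for $dateToString with format "%Y-%m-%d" (UTC).
function dateStringKey(date) {
  return date.toISOString().slice(0, 10);
}

// Count items per key, like { $group: { _id: key, count: { $sum: 1 } } }.
function countBy(dates, keyFn) {
  const counts = new Map();
  for (const d of dates) {
    const key = keyFn(d);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Sort the keys the way a { $sort: { _id: 1 } } stage would.
const byDayOfMonth = [...countBy(createdAt, dayOfMonthKey).keys()].sort((a, b) => a - b);
const byDateString = [...countBy(createdAt, dateStringKey).keys()].sort();

console.log(byDayOfMonth); // [ 1, 2, 29, 30 ]  (July days sort before June days)
console.log(byDateString); // [ '2018-06-29', '2018-06-30', '2018-07-01', '2018-07-02' ]
```

Because the "%Y-%m-%d" strings sort chronologically, adding a $sort on _id after the $group stage should return the days in period order; $group on its own does not guarantee any output order.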

Aggregation Multiple arrays

Hey, I'm having trouble getting my aggregation right.
I have this dataset, and the collection contains a few million other documents like it:
{
"_id": ObjectId("5757c73344ce54ae1d8b456c"),
"hostname": "Baklap4",
"timestamp": NumberLong(1465370500),
"networkList": [
{
"name": "46.243.152.13",
"openConnections": NumberLong(3)
},
{
"name": "46.243.152.50",
"openConnections": NumberLong(4)
}
],
"webserver": "nginx",
"deviceList": [
{
"deviceName": "eth0",
"receive": NumberLong(183263),
"transmit": NumberLong(781595)
},
{
"deviceName": "wlan0",
"receive": NumberLong(0),
"transmit": NumberLong(0)
}
]
}
What I want:
I'd like to get a result set with an average (of every numeric value) over all documents within each 300-second timespan.
[
[
'$match' => [
'timestamp' => ['$gte' => $todayMidnight],
'hostname' => $serverName
]
],
[
'$unwind' => '$networkList'
],
[
'$unwind' => '$deviceList'
],
[
'$group' => [
'_id' => [
'interval' => [
'$subtract' => [
'$timestamp',
[
'$mod' => ['$timestamp', 300]
]
]
],
'network' => '$networkList.name',
'device' => '$deviceList.name',
],
'openConnections' => [
'$sum' => '$networkList.openConnections'
],
'cpuLoad' => [
'$avg' => '$cpuLoad'
],
'bytesPerSecond' => [
'$avg' => '$bytesPerSecond'
],
'requestsPerSecond' => [
'$avg' => '$requestsPerSecond'
],
'webserver' => [
'$last' => '$webserver'
],
'timestamp' => [
'$max' => '$timestamp'
]
]
],
[
'$project' => [
'_id' => 0,
'timestamp' => 1,
'cpuLoad' => 1,
'bytesPerSecond' => 1,
'requestsPerSecond' => 1,
'webserver' => 1,
'openConnections' => 1,
'networkList' => '$networkList',
'deviceList' => '$_id.device',
]
],
[
'$sort' => [
'timestamp' => -1
]
]
];
Yet this doesn't give me a list of all devices with, per device, an average of received and transmitted bytes.
How would one get those?
For the given example, I was able to get the result using this mongo shell query:
var projectTime = {
$project : {
_id : 1,
hostname : 1,
timestamp : 1,
networkList : 1,
webserver : 1,
deviceList : 1,
isoDate : {
$add : [new Date(0), {
$multiply : ["$timestamp", 1000]
}
]
}
}
}
var group = {
$group : {
"_id" : {
time : {
"$add" : [{
"$subtract" : [{
"$subtract" : ["$isoDate", new Date(0)]
}, {
"$mod" : [{
"$subtract" : ["$isoDate", new Date(0)]
},
1000 * 60 * 5 // 1000 milliseconds * 60 seconds * 5 minutes
]
}
]
},
new Date(0)
]
},
"hostname" : "$hostname",
"deviceList_deviceName" : "$deviceList.deviceName",
"networkList_name" : "$networkList.name",
},
xreceive : {
$sum : "$deviceList.receive"
},
xtransmit : {
$sum : "$deviceList.transmit"
},
xopenConnections : {
$avg : "$networkList.openConnections"
},
}
}
var unwindNetworkList = {
$unwind : "$networkList"
}
var unwindSeviceList = {
$unwind : "$deviceList"
}
var match = {
$match : {
"_id.time" : ISODate("2016-06-09T08:05:00.000Z")
}
}
var finalProject = {
$project : {
_id : 0,
timestamp : "$_id.time",
hostname : "$_id.hostname",
deviceList_deviceName : "$_id.deviceList_deviceName",
networkList_name : "$_id.networkList_name",
xreceive : 1,
xtransmit : 1,
xopenConnections : 1
}
}
db.baklap.aggregate([projectTime, unwindNetworkList,
unwindSeviceList,
group,
match,
finalProject
])
db.baklap.findOne()
then output:
{
"xreceive" : NumberLong(0),
"xtransmit" : NumberLong(0),
"xopenConnections" : 4.0,
"timestamp" : ISODate("2016-06-09T08:05:00.000Z"),
"hostname" : "Baklap4",
"deviceList_deviceName" : "wlan0",
"networkList_name" : "46.243.152.50"
}
{
"xreceive" : NumberLong(183263),
"xtransmit" : NumberLong(781595),
"xopenConnections" : 4.0,
"timestamp" : ISODate("2016-06-09T08:05:00.000Z"),
"hostname" : "Baklap4",
"deviceList_deviceName" : "eth0",
"networkList_name" : "46.243.152.50"
}
{
"xreceive" : NumberLong(183263),
"xtransmit" : NumberLong(781595),
"xopenConnections" : 3.0,
"timestamp" : ISODate("2016-06-09T08:05:00.000Z"),
"hostname" : "Baklap4",
"deviceList_deviceName" : "eth0",
"networkList_name" : "46.243.152.13"
}
{
"xreceive" : NumberLong(0),
"xtransmit" : NumberLong(0),
"xopenConnections" : 3.0,
"timestamp" : ISODate("2016-06-09T08:05:00.000Z"),
"hostname" : "Baklap4",
"deviceList_deviceName" : "wlan0",
"networkList_name" : "46.243.152.13"
}
The main point is to be aware that every $unwind multiplies the rows, so the data gets a bit polluted. This can have a side effect when summing, because each value is counted more than once. An average is unaffected as long as every value is duplicated the same number of times, since (2+2+3+3)/4 is the same as (2+3)/2.
To check this, you could add x:{$push:"$$ROOT"} to the group stage and inspect the values after the pipeline has executed, as you will then have all source documents for the given data period.
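The row multiplication can be reproduced outside MongoDB. The following is a rough plain-JavaScript sketch, assuming a trimmed-down copy of the sample document; the `unwind` helper is a hypothetical stand-in for the `$unwind` stage, not a driver function.

```javascript
// Hypothetical document, trimmed down from the sample in the question.
const doc = {
  networkList: [
    { name: "46.243.152.13", openConnections: 3 },
    { name: "46.243.152.50", openConnections: 4 },
  ],
  deviceList: [
    { deviceName: "eth0", receive: 183263 },
    { deviceName: "wlan0", receive: 0 },
  ],
};

// Stand-in for $unwind: emit one row per element of the array in `field`.
function unwind(rows, field) {
  return rows.flatMap((row) =>
    row[field].map((item) => ({ ...row, [field]: item }))
  );
}

// Two unwinds give the cross product: 2 networks x 2 devices = 4 rows.
const rows = unwind(unwind([doc], "networkList"), "deviceList");

const sum = (xs) => xs.reduce((a, b) => a + b, 0);
const avg = (xs) => sum(xs) / xs.length;

const connectionsOriginal = doc.networkList.map((n) => n.openConnections);
const connectionsAfterUnwind = rows.map((r) => r.networkList.openConnections);

console.log(rows.length); // 4
console.log(sum(connectionsOriginal), sum(connectionsAfterUnwind)); // 7 14
console.log(avg(connectionsOriginal), avg(connectionsAfterUnwind)); // 3.5 3.5
```

The sum doubles because each network entry now appears once per device, while the average stays the same because every value is repeated equally often.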

MongoDB aggregation function for sum of embedded documents not working

I get an error like this:
Array
(
[errmsg] => exception: the $unwind field path must be specified as a string
[code] => 15981
[ok] => 0
)
I used the following query on the given embedded document (I want the sum of rate_number from the record structure shown below):
global $DB, $mongo;
$theObjId = new MongoId($post_id);
$collection = $mongo->getCollection('mongo_hw_posts');
$rt_sum = $collection->aggregate(
array('$unwind'=>$rate),
array('$group'=>
array(
'_id' => $theObjId
),
array(
'rate_number'=>array('$sum' =>$rate.'rate_number')
))
);
table structure
{
"_id": ObjectId("51ff3b38636e3b9803000001"),
"class_id": NumberInt(2986),
"created_by": NumberInt(1758),
"created_datetime": NumberInt(1375681336),
"deleted": NumberInt(0),
"learn": {
"0": {
"user_id": NumberInt(0),
"learn_date": NumberInt(0)
}
},
"parent_id": "0",
"post_text": "2%20C",
"post_type": "text_comment",
"rate": {
"0": {
"user_id": NumberInt(0),
"rate_date": NumberInt(0),
"rate_number": NumberInt(0)
},
"1": {
"user_id": NumberInt(1457),
"rate_date": NumberInt(1375764137),
"rate_number": NumberInt(3)
},
"2": {
"user_id": NumberInt(1619),
"rate_date": NumberInt(1375764694),
"rate_number": NumberInt(8)
}
},
"serialized_data": "",
"unique_key": "8bdddfe8137d14702b4517f7e8e88ee3",
"user_role": "student"
}
There are a few things wrong with your aggregation command.
$unwind
Instead of:
array( '$unwind' => $rate ),
You need to use:
array( '$unwind' => '$rate'),
Here '$rate' is not a PHP variable but a quoted string containing a MongoDB field path.
But you can't use the $unwind like this either, because of:
"errmsg" : "exception: Value at end of $unwind field path '$rate' must be an Array, but is a Object",
That's because rate is:
"rate": {
"0": {
"user_id": NumberInt(0),
"rate_date": NumberInt(0),
"rate_number": NumberInt(0)
},
"1": {
"user_id": NumberInt(1457),
"rate_date": NumberInt(1375764137),
"rate_number": NumberInt(3)
},
"2": {
"user_id": NumberInt(1619),
"rate_date": NumberInt(1375764694),
"rate_number": NumberInt(8)
}
}
But it needs to look like:
"rate": [
{
"user_id": NumberInt(0),
"rate_date": NumberInt(0),
"rate_number": NumberInt(0)
},
{
"user_id": NumberInt(1457),
"rate_date": NumberInt(1375764137),
"rate_number": NumberInt(3)
},
{
"user_id": NumberInt(1619),
"rate_date": NumberInt(1375764694),
"rate_number": NumberInt(8)
}
]
Otherwise, $unwind will not work. You need to change your documents for this.
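One way to think about the required document change is as a conversion from an object with numeric keys to a real array. Below is a small plain-JavaScript sketch of that conversion; `toRateArray` is a hypothetical helper name, and how you write the converted value back to the collection depends on your driver.

```javascript
// The "rate" field as it is stored now: an object keyed by "0", "1", "2".
const rateAsObject = {
  "0": { user_id: 0, rate_date: 0, rate_number: 0 },
  "1": { user_id: 1457, rate_date: 1375764137, rate_number: 3 },
  "2": { user_id: 1619, rate_date: 1375764694, rate_number: 8 },
};

// Convert to an array, ordering entries by their numeric key.
function toRateArray(rate) {
  return Object.keys(rate)
    .sort((a, b) => Number(a) - Number(b))
    .map((key) => rate[key]);
}

const rateAsArray = toRateArray(rateAsObject);

console.log(Array.isArray(rateAsArray)); // true
console.log(rateAsArray.length); // 3
console.log(rateAsArray[2].rate_number); // 8
```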
$group
Your $group is also wrong, instead of:
array('$group'=>
array(
'_id' => $theObjId
),
array(
'rate_number'=>array('$sum' =>$rate.'rate_number')
)
)
Syntax-wise, it needs to be:
array('$group'=>
array(
'_id' => $theObjId,
'rate_number'=>array('$sum' => '$rate.rate_number')
)
)
I don't understand why you have:
'_id' => $theObjId
Are you trying to only summarize the rates for one post? If that's the case, you will need to add a $match and change the $theObjId to null, something like this:
$rt_sum = $collection->aggregate(
array( '$match' => array( '_id' => $theObjId ) ),
array( '$unwind' => '$rate' ),
array( '$group'=>
array(
'_id' => null,
'rate_number' => array('$sum' => '$rate.rate_number')
)
)
);
The full example is here:
<?php
$m = new MongoClient;
$c = $m->test->so;
$c->drop();
$post_id = "51ff3b38636e3b9803000001";
$theObjId = new MongoId($post_id);
$c->insert( array(
"_id" => new MongoId("51ff3b38636e3b9803000001"),
"class_id" => 2986,
"created_by" => 1758,
"created_datetime" => 1375681336,
"deleted" => 0,
"learn" => array(
array(
"user_id" => 0,
"learn_date" => 0
)
),
"parent_id" => "0",
"post_text" => "2%20C",
"post_type" => "text_comment",
"rate" => array(
array(
"user_id" => 0,
"rate_date" => 0,
"rate_number" => 0
),
array(
"user_id" => 1457,
"rate_date" => 1375764137,
"rate_number" => 3
),
array(
"user_id" => 1619,
"rate_date" => 1375764694,
"rate_number" => 8
)
),
"serialized_data" => "",
"unique_key" => "8bdddfe8137d14702b4517f7e8e88ee3",
"user_role" => "student"
) );
$rt_sum = $c->aggregate(
array( '$match' => array( '_id' => $theObjId ) ),
array( '$unwind' => '$rate' ),
array( '$group'=>
array(
'_id' => null,
'rate_number' => array('$sum' => '$rate.rate_number')
)
)
);
var_dump ($rt_sum);
And the output is:
array(2) {
'result' =>
array(1) {
[0] =>
array(2) {
'_id' => NULL
'rate_number' => int(11)
}
}
'ok' =>
double(1)
}
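The arithmetic behind that result can be double-checked without a server. This is a plain-JavaScript sketch of the three stages ($match, $unwind, $group with $sum), assuming the array form of rate; the second post and the `sumRates` helper are invented for illustration.

```javascript
// Two posts: the one from the example plus a hypothetical second post
// that the match step should exclude.
const posts = [
  {
    _id: "51ff3b38636e3b9803000001",
    rate: [{ rate_number: 0 }, { rate_number: 3 }, { rate_number: 8 }],
  },
  { _id: "hypothetical-other-post", rate: [{ rate_number: 5 }] },
];

function sumRates(docs, postId) {
  // $match: keep only the requested post.
  const matched = docs.filter((d) => d._id === postId);
  // $unwind: one row per element of the rate array.
  const unwound = matched.flatMap((d) => d.rate.map((r) => ({ ...d, rate: r })));
  // $group with _id null and $sum over rate.rate_number.
  const total = unwound.reduce((acc, d) => acc + d.rate.rate_number, 0);
  return { _id: null, rate_number: total };
}

const result = sumRates(posts, "51ff3b38636e3b9803000001");
console.log(result); // { _id: null, rate_number: 11 }
```

The total of 0 + 3 + 8 = 11 matches the rate_number in the output above.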