MongoDB Aggregation query doesn't return all fields - mongodb

I am trying to use the aggregation function to display info in a chart. For this example, a document in the collection looks like this (excluding unnecessary fields for this query):
{
'locid' : <someid>, #Reference to a city & state collection
'collat' : <dateobj>, #a date object when this entry was saved
'pid' : <someid>, #Reference to a person collection
'pos' : <int> #Value I am interested in matching with location & date
}
So I basically start with a pid. I use this as my first $match parameter to limit the amount of data that gets thrown into the pipeline.
array(
'$match' => array(
'pid' => new \MongoId($pid)
)
),
So now that I have selected the correct pid, I tell it I only want/need certain fields:
array(
'$project' => array(
'pos' => 1,
'collat' => 1,
'locid' => 1
)
),
The second match is to say I only care about these locations right now ($ids contains an array of locid):
array(
'$match' => array(
'locid' => array('$in' => $ids)
)
),
And finally, I am saying group all the returned documents by collat and locid
array(
'$group' => array(
'_id' => array(
'locid' => '$locid',
'collat' => '$collat'
)
)
)
While the query completes OK and returns data, I am not getting the pos field back; it only returns locid and collat.
Questions
Isn't that what $project is for? I use it to tell the driver what fields I want returned?
Once I get the pos field returning as well, how can I tell the driver I only want the lowest value for each locid & collat pair? Say there are two entries for that date, location, and person, with pos values 4 and 8: I only care about pos=4.
My end goal is to create a line chart with the X-Axis as the dates (from collat) and the Y-Axis will be the pos field, and each line will plot individual locid data.
Here are the full parameters being sent to the aggregation driver.
$ops = array(
array(
'$match' => array(
'pid' => new \MongoId($pid)
)
),
array(
'$project' => array(
'pos' => 1,
'collat' => 1,
'locid' => 1
)
),
array(
'$match' => array(
'locid' => array('$in' => $ids)
)
),
array(
'$group' => array(
'_id' => array(
'locid' => '$locid',
'collat' => '$collat'
)
)
)
);
$out = $myCollection->aggregate($ops);
Update: This is how I got it to group and return pos without throwing an error. I still need to spot-check it to make sure it actually returns the correct values.
array(
'$group' => array(
'_id' => array(
'locid' => '$locid',
'collat' => '$collat'
),
array('$min' => '$pos')
)
)

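The $group stage of an aggregation query works like a SQL GROUP BY. You are telling $group which field(s) you want to group by, but you are not telling it how to aggregate the grouped information. Any field that is neither part of _id nor computed by an accumulator is dropped from the output, regardless of an earlier $project.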
The {$group} you want is probably something like:
{$group : { _id : { locid: "$locid", collat: "$collat"},
pos : {$min : "$pos"}
}
}
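In the PHP array notation used in the question, the whole pipeline could look like the sketch below. `buildMinPosPipeline` is a hypothetical helper name, and `$pid` is assumed to already be whatever ID type the collection stores (a `\MongoId` in the question). The `$project` stage is left out because `$group` alone decides which fields come out, and the two `$match` stages are merged into one.

```php
<?php
// Hypothetical helper: builds a pipeline that returns the lowest pos
// for each (locid, collat) pair of one person.
function buildMinPosPipeline($pid, array $locationIds)
{
    return array(
        // Filter first so later stages see as few documents as possible.
        array('$match' => array(
            'pid'   => $pid,
            // array_values() re-indexes the list so it is sent as an array.
            'locid' => array('$in' => array_values($locationIds)),
        )),
        // $group emits only _id plus the accumulator fields named here.
        array('$group' => array(
            '_id' => array('locid' => '$locid', 'collat' => '$collat'),
            'pos' => array('$min' => '$pos'),
        )),
    );
}

// Usage with the question's variables (needs the MongoDB driver):
// $out = $myCollection->aggregate(buildMinPosPipeline(new \MongoId($pid), $ids));
```

Each returned row should then carry `_id.locid`, `_id.collat`, and `pos`, which maps onto one point per line in the chart described in the question.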

Related

How to get distinct documents by some field in mongodb phalcon?

In mongo shell the query:
db.collections.distinct('user_id');
simply returns the distinct user_id values.
I am working with Phalcon, and there is no option like
Collections::distinct()
So how do I query distinct values in Phalcon? Please help.
Perhaps Phalcon's answer in the forum could help: https://forum.phalconphp.com/discussion/6832/option-for-function-distinct-phalconmvccollection
They suggest using aggregations:
$data = Article::aggregate(
array(
array(
'$project' => array('category' => 1)
),
array(
'$group' => array(
'_id' => array('category' => '$category'),
'id' => array('$max' => '$_id')
)
)
)
);
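If only the distinct values of one field are needed, grouping on that field alone should be enough. A minimal sketch, assuming the same `aggregate()` entry point as in the forum suggestion; `buildDistinctPipeline` and `extractDistinctValues` are hypothetical helper names, and the row shape (one array per group with an `_id` key) is an assumption about how the results are handed back.

```php
<?php
// Hypothetical helper: one group per distinct value of $field.
function buildDistinctPipeline($field)
{
    return array(
        array('$group' => array('_id' => '$' . $field)),
    );
}

// Hypothetical helper: pulls the grouped values out of the result rows.
// Assumes each row is an array with an '_id' key.
function extractDistinctValues(array $rows)
{
    $values = array();
    foreach ($rows as $row) {
        $values[] = $row['_id'];
    }
    return $values;
}

// Usage idea (needs Phalcon and a MongoDB connection):
// $rows = Collections::aggregate(buildDistinctPipeline('user_id'));
```

Note that `$group` does not promise any particular order, so sort the extracted values if order matters.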

MongoDB Query select with inner select with Group By

I am migrating my database from MySQL to MongoDB, so I have to migrate my queries as well.
In my old MySQL database I used the following query:
SELECT idValue,
Timestamp,
Value
FROM (
SELECT *
FROM Metingen2
WHERE idGroep = 123
ORDER BY idValue ASC ,
Timestamp DESC
)
AS t GROUP BY t.idValue;
Can anybody explain how I can do a select on an inner select query?
I tried the following, but with no success:
$ops = array(
array(
'$match' => array(
'idGroep' => (int)123,
)
),
array(
'$sort' => array(
"idValue" => 1,
"Timestamp" => -1,
)
),
array(
'$group' => array(
"_id" => array("idValue" => '$idValue',
),
),
),
array(
'$project' => array(
"_id" => '$_id.idValue',
"Timestamp" => '$_id.Timestamp',
"Value" => '$value',
),
),
);
$cursor_metingen2 = $collection_metingen2->aggregate($ops);
At a glance, it looks like the query first orders the content, then groups on "idValue" and returns the "first" result on each grouping boundary, after filtering on "idGroep" of course.
The inner select does not really do much here other than "filter" and "sort". The aggregation pipeline handles these differently, so what is happening in that execution is of little consequence. It's all about the results.
As such, your aggregation pipeline needs to do the same things:
$cursor = $collection_metingen2->aggregate(array(
array(
'$match' => array( 'idGroep' => 123 )
),
array(
'$sort' => array(
'idValue' => 1,
'Timestamp' => -1
)
),
array(
'$group' => array(
'_id' => '$idValue',
'Timestamp' => array( '$first' => '$Timestamp' ),
'Value' => array( '$first' => '$value' )
)
),
array(
'$sort' => array( '_id' => 1 )
)
));
Note that when you $group, every other field you include must use an "accumulator" such as $first (which should be the correct one here), and you must also include everything you want in the output.
It is a "pipeline", so the only thing that goes into a stage is what came out of the stage before it.
And of course you $sort both before grouping, to get the correct boundary values, and at the end of the pipeline. The last sort is there because $group does not guarantee keys in any order, though this may be of no consequence to whatever processes the results from here.
Also be careful with casting: unless you really need to convert to an integer type for comparison with the data (and you probably do not), you might get mismatched results. Only cast when you know you need to.
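To make the sort-then-first behaviour concrete, here is an in-memory emulation in plain PHP. It is only a sketch of the semantics, not how the server runs the pipeline; `latestPerIdValue` is a hypothetical name, the field names are taken from the question's attempt, and the sample rows are made up.

```php
<?php
// Emulates: $sort {idValue: 1, Timestamp: -1}, then $group by idValue
// taking $first of Timestamp and value.
function latestPerIdValue(array $docs)
{
    usort($docs, function ($a, $b) {
        if ($a['idValue'] !== $b['idValue']) {
            return $a['idValue'] < $b['idValue'] ? -1 : 1;
        }
        if ($a['Timestamp'] === $b['Timestamp']) {
            return 0;
        }
        // Newest first within the same idValue.
        return $a['Timestamp'] > $b['Timestamp'] ? -1 : 1;
    });

    $groups = array();
    foreach ($docs as $doc) {
        $key = $doc['idValue'];
        // $first keeps only the first document seen for each key.
        if (!isset($groups[$key])) {
            $groups[$key] = array(
                '_id'       => $key,
                'Timestamp' => $doc['Timestamp'],
                'Value'     => $doc['value'],
            );
        }
    }
    return array_values($groups);
}

// Made-up sample data for illustration.
$sample = array(
    array('idValue' => 2, 'Timestamp' => 10, 'value' => 'a'),
    array('idValue' => 1, 'Timestamp' => 5,  'value' => 'b'),
    array('idValue' => 1, 'Timestamp' => 9,  'value' => 'c'),
);
$latest = latestPerIdValue($sample);
// $latest holds one row per idValue: (1, 9, 'c') and (2, 10, 'a').
```

Without the initial sort, `$first` would simply return whichever document happened to arrive first for each key, which is why the order of the stages matters.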

Mongodb-PHP: find query with '$and' function is not working

I am working with MongoDB and PHP and want to retrieve data based on multiple conditions.
I want to retrieve only the forms whose form_status is both Active & Draft.
My query is:
$formData = $formInfo->find(array('team_id' => $_GET['id'], '$and' =>array('form_status' => 'Active','form_status' => 'Draft')));
It is not working. What is the right syntax in PHP?
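The $and operator takes a "real array" of documents as its argument. In your query the two 'form_status' keys sit in the same PHP array, so the second one overwrites the first. In PHP you wrap each condition in its own array to produce that kind of syntax: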
$formData = $formInfo->find(
array(
'team_id' => $_GET['id'],
'$and' => array(
array( 'form_status' => 'Active' ),
array( 'form_status' => 'Draft' )
)
)
);
Note that this really wouldn't make any sense unless "form_status" is actually an array itself. In that case the $all operator is a much cleaner approach:
$formData = $formInfo->find(
array(
'team_id' => $_GET['id'],
'form_status' => array(
'$all' => array( 'Active', 'Draft' )
)
)
);
And again, if this field is not an array then you really meant $or, which can be written more clearly for a single field with $in:
$formData = $formInfo->find(
array(
'team_id' => $_GET['id'],
'form_status' => array(
'$in' => array( 'Active', 'Draft' )
)
)
);
So $all is to $and what $in is to $or: each lets you test the same field against several values without specifying the full document form.
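The difference between the two operators can be illustrated with a small in-memory emulation. This is a sketch of the matching rules for plain string values only, not driver code; `matchesIn` and `matchesAll` are hypothetical names.

```php
<?php
// Emulates {field: {$in: [...]}}: true when the field (scalar or array)
// holds at least one of the candidate values.
function matchesIn($fieldValue, array $candidates)
{
    $values = is_array($fieldValue) ? $fieldValue : array($fieldValue);
    return count(array_intersect($values, $candidates)) > 0;
}

// Emulates {field: {$all: [...]}}: true only when every required value
// is present in the field.
function matchesAll($fieldValue, array $required)
{
    $values = is_array($fieldValue) ? $fieldValue : array($fieldValue);
    return count(array_diff($required, $values)) === 0;
}

$wanted = array('Active', 'Draft');
$scalarIn  = matchesIn('Draft', $wanted);                    // true
$scalarAll = matchesAll('Draft', $wanted);                   // false
$arrayAll  = matchesAll(array('Draft', 'Active'), $wanted);  // true
```

A scalar form_status can never satisfy `$all` with two different values, which is why `$in` is the operator to reach for when the field holds a single status.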

MongoDB finding nested objects that meet criteria

I have a MongoDB document that is structured like the one below. I am searching based on people.search_columns.surname and people.search_columns.givenname. So for example, when I search for the given name "Valentine", I want to get the document back, but Nicholas Barsaloux should not be included.
Data structure:
[_id] => MongoId Object (
[$id] => 53b1b1ab72f4f852140dbdc9
)
[name] => People From 1921
[people] => Array (
[0] => Array (
[name] => Barada, Valentine
[search_columns] => Array (
[surname] => Array (
[0] => Mardan,
[1] => Barada
)
[givenname] => Array (
[0] => Valentine
)
)
)
[1] => Array (
[name] => Barsaloux, Nicholas
[search_columns] => Array (
[surname] => Array (
[1] => Barsaloux
)
[givenname] => Array (
[0] => Nicholas
)
[place] => Array (
)
)
)
)
Here is the code I was working on:
$criteria = array("people" => array(
'$elemMatch' => array("givenname" => "Valentine")
));
$projection = array("people" => true);
$documents_with_results = $db->genealogical_data->find($criteria, $projection)->skip(0)->limit(5);
Currently that code returns zero results.
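Your query returns nothing because "givenname" sits under "search_columns", so the $elemMatch condition never matches. Beyond that, since the arrays are nested, the basic projection available with find cannot filter them. In order to "filter" the array contents from a document you need to "unwind" the array content first. For this you use the aggregation framework: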
$results = $db->genealogical_data->aggregate(array(
array( '$match' => array(
'people.search_columns.givenname' => 'Valentine'
)),
array( '$unwind' => '$people' ),
array( '$match' => array(
'people.search_columns.givenname' => 'Valentine'
)),
array( '$group' => array(
'_id' => '$_id',
'name' => array( '$first' => '$name' ),
'people' => array( '$push' => '$people' )
))
));
The point of the first $match stage is to reduce the input to documents that could possibly match your criteria. The second $match runs after the $unwind, where the actual "array" items in the document are "filtered" from the results.
The final $group puts the original array back to normal, minus the items that do not match the criteria.
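The effect of the three stages on a single document can be sketched in plain PHP. This is an in-memory illustration only; `filterPeopleByGivenName` is a hypothetical name and the sample document is a trimmed-down copy of the structure in the question. Unlike the pipeline, which returns only _id, name, and people, the sketch hands back the whole document.

```php
<?php
// Emulates $unwind + $match + $group for one document: keep only the
// people whose givenname list contains $givenName.
function filterPeopleByGivenName(array $document, $givenName)
{
    $kept = array();
    foreach ($document['people'] as $person) {        // $unwind
        $names = isset($person['search_columns']['givenname'])
            ? $person['search_columns']['givenname']
            : array();
        if (in_array($givenName, $names, true)) {     // second $match
            $kept[] = $person;                        // $push
        }
    }
    $document['people'] = $kept;                      // regrouped array
    return $document;
}

// Trimmed-down sample based on the question's data.
$doc = array(
    'name'   => 'People From 1921',
    'people' => array(
        array(
            'name' => 'Barada, Valentine',
            'search_columns' => array('givenname' => array('Valentine')),
        ),
        array(
            'name' => 'Barsaloux, Nicholas',
            'search_columns' => array('givenname' => array('Nicholas')),
        ),
    ),
);
$filtered = filterPeopleByGivenName($doc, 'Valentine');
```

Here `$filtered['people']` contains only the "Barada, Valentine" entry, which is the shape the question asks for.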

MongoDB keeps giving me null

I am having difficulty getting a MongoDB aggregate to work. It keeps giving me null. Please help. The code below is written in PHP. Thanks.
What I want to do is to sum up the values of 2 fields, Requests and Responses, between 2 particular dates
try {
$mongodb = new MongoClient("mongodb://ad:pass2word1#localhost");
$database = $mongodb->selectDB('backend');
$collection = new MongoCollection($database, 'RequestSummary');
$pipeline = array(
array(
'$group' => array(
'_id' => array(
'request' => array('$sum' => '$Requests'),
'response' => array('$sum' => '$Responses')
)
)
),
array(
'$match' => array(
'RequestDate' => array(
'$gte' => intval($_SESSION['range_from']),
'$lte' => intval($_SESSION['range_to'])
)
)
)
);
$collection->aggregate($pipeline);
var_dump($g);
} catch (MongoConnectionException $exc) {
echo $exc->getTraceAsString();
}
The _id of your $group can't contain aggregation operators like $sum. Those sums need to be defined as fields at the same level as _id. If you don't want to group on a specific field you can use NULL for the _id like this:
array(
'$group' => array(
'_id' => NULL,
'request' => array('$sum' => '$Requests'),
'response' => array('$sum' => '$Responses')
)
),
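Two more details in the question's code would keep it from producing a result even with the $group fixed: the $match runs after the $group, when the RequestDate field no longer exists, and the return value of aggregate() is never assigned to $g before var_dump($g). A sketch of the full pipeline with both addressed follows; `buildRequestSummaryPipeline` is a hypothetical helper name, and the integer casts assume, as the question's intval() calls do, that RequestDate is stored as an integer.

```php
<?php
// Hypothetical helper: sums Requests and Responses over a date range.
function buildRequestSummaryPipeline($from, $to)
{
    return array(
        // Filter on RequestDate first, while the field still exists.
        array('$match' => array(
            'RequestDate' => array('$gte' => (int) $from, '$lte' => (int) $to),
        )),
        // A null _id puts every matched document into a single group.
        array('$group' => array(
            '_id'      => null,
            'request'  => array('$sum' => '$Requests'),
            'response' => array('$sum' => '$Responses'),
        )),
    );
}

// Usage with the question's variables (needs the MongoDB driver):
// $g = $collection->aggregate(
//     buildRequestSummaryPipeline($_SESSION['range_from'], $_SESSION['range_to'])
// );
// var_dump($g);
```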