Nested aggregation: subgrouping results with the MongoDB aggregation framework

I have a dataset with metrics collected from a group of sensors.
My dataset looks like this:
{type: 1, display: 'foo', value: 'A'}
{type: 2, display: 'bar', value: 'B'}
{type: 2, display: 'foo', value: 'B'}
I am trying to aggregate the results and get some meaningful insights via a REST API. I am trying to produce aggregated results like this:
[{
type: 1,
displays: [
{
name: 'foo',
count: 1
}
],
values: [
{
name: 'A',
count: 1
}
],
total_count: 1
},{
type: 2,
displays: [
{
name: 'foo',
count: 1
} , {
name: 'bar',
count: 1
}
],
values: [
{
name: 'B',
count: 2
}
],
total_count: 2
}]
Summarizing the aggregated results and producing shallow results is straightforward. I am struggling, though, because I can't create the nested counters for types and displays all together.
I have tried various aggregation operators with no luck.
Basically, I can get a single grouping by type and display, like this:
db.logs.aggregate([
{
$group: {
_id: {
type: '$type',
display: '$display'
},
count: { $sum: 1 }
}
}, {
$group: {
_id: '$_id.type',
displays: {
$push: {
name: "$_id.display",
count: "$count"
}
}
}
}
]);
Any help will be highly appreciated.
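To make the target shape concrete, here is a minimal in-memory sketch in plain JavaScript that builds the nested counters from the three sample documents. It does not use the aggregation framework; the helper names countBy and summarizeByType are made up for this illustration, and the same reshaping could equally be done in the REST layer after a simpler query.

```javascript
// Sample documents copied from the question.
const logs = [
  { type: 1, display: 'foo', value: 'A' },
  { type: 2, display: 'bar', value: 'B' },
  { type: 2, display: 'foo', value: 'B' },
];

// Count how often each value of `key` occurs; returns [{ name, count }]
// in first-seen order.
function countBy(items, key) {
  const counts = new Map();
  for (const item of items) {
    counts.set(item[key], (counts.get(item[key]) || 0) + 1);
  }
  return [...counts].map(([name, count]) => ({ name, count }));
}

// Hypothetical helper: group documents by type, then build the nested
// display and value counters for each group.
function summarizeByType(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.type)) groups.set(doc.type, []);
    groups.get(doc.type).push(doc);
  }
  return [...groups].map(([type, members]) => ({
    type,
    displays: countBy(members, 'display'),
    values: countBy(members, 'value'),
    total_count: members.length,
  }));
}

const summary = summarizeByType(logs);
console.log(JSON.stringify(summary, null, 2));
```

The order of entries inside displays and values follows the order in which they first appear in the input, so it may differ from the order shown in the question.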

Related

Get only matched array object along with parent fields

I also checked the following question and tried various other things, but couldn't get it working:
Retrieve only the queried element in an object array in MongoDB collection
I have the following document sample
{
_id: ObjectId("634b08f7eb5cb6af473e3ab2"),
name: 'India',
iso_code: 'IN',
states: [
{
name: 'Karnataka',
cities: [
{
name: 'Hubli Tabibland',
pincode: 580020,
location: { type: 'point', coordinates: [Array] }
},
{
name: 'Hubli Vinobanagar',
pincode: 580020,
location: { type: 'point', coordinates: [Array] }
},
{
name: 'Hubli Bengeri',
pincode: 580023,
location: { type: 'point', coordinates: [Array] }
},
{
name: 'Kusugal',
pincode: 580023,
location: { type: 'point', coordinates: [Array] }
}
]
}
]
}
I need only the following
{
_id: ObjectId("634b08f7eb5cb6af473e3ab2"),
name: 'India',
iso_code: 'IN',
states: [
{
name: 'Karnataka',
cities: [
{
name: 'Kusugal',
pincode: 580023,
location: { type: 'point', coordinates: [Array] }
}
]
}
]
}
Following is the query that I have tried so far but it returns all the cities
db.countries.find(
{
'states.cities': {
$elemMatch: {
'name' : 'Kusugal'
}
}
},
{
'_id': 1,
'name': 1,
'states.name': 1,
'states.cities.$' : 1
}
);
I was able to achieve it with the help of aggregation.
db.countries.aggregate([
{ $match: { "states.cities.name": /Kusugal/ } },
{ $unwind: "$states" },
{ $unwind: "$states.cities" },
{ $match: { "states.cities.name": /Kusugal/ } }
]);
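1st line: $match selects only the documents that contain a city whose name matches Kusugal.
2nd & 3rd lines: $unwind emits one document per state and then one document per city, so each resulting document holds a single city (states and states.cities become single objects rather than arrays).
4th line: $match filters these unwound documents again using the same condition.
In short, an aggregation pipeline runs each stage in order, passes its output to the next stage, and returns the output of the last stage as a single result.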
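The effect of the four stages can be imitated on a plain array, which may help when reasoning about the shape of the result. This is only an in-memory sketch, not MongoDB code: the function name unwindAndMatch is hypothetical, and the sample document is trimmed down from the one in the question (location fields omitted).

```javascript
// Trimmed-down sample document based on the question.
const countries = [
  {
    name: 'India',
    iso_code: 'IN',
    states: [
      {
        name: 'Karnataka',
        cities: [
          { name: 'Hubli Bengeri', pincode: 580023 },
          { name: 'Kusugal', pincode: 580023 },
        ],
      },
    ],
  },
];

// Hypothetical in-memory imitation of the four pipeline stages.
function unwindAndMatch(docs, pattern) {
  return docs
    // Stage 1 ($match): keep documents with at least one matching city.
    .filter((doc) => doc.states.some((s) => s.cities.some((c) => pattern.test(c.name))))
    // Stage 2 ($unwind "$states"): one document per state.
    .flatMap((doc) => doc.states.map((state) => ({ ...doc, states: state })))
    // Stage 3 ($unwind "$states.cities"): one document per city.
    .flatMap((doc) =>
      doc.states.cities.map((city) => ({ ...doc, states: { ...doc.states, cities: city } })))
    // Stage 4 ($match): keep only the unwound documents whose city matches.
    .filter((doc) => pattern.test(doc.states.cities.name));
}

const matchedCities = unwindAndMatch(countries, /Kusugal/);
console.log(JSON.stringify(matchedCities, null, 2));
```

As in the pipeline, states and states.cities end up as single objects rather than arrays, which differs slightly from the output requested in the question.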

Algolia retrieve results by multiple facets

First of all, I am using Algolia JavaScript API Client V3 (Deprecated)
I have the following records
{
category: 'SEDAN',
manufacturer: 'Volkswagen',
id: '123'
},
{
category: 'COUPE',
manufacturer: 'Renault',
id: '234'
},
{
category: 'SEDAN',
manufacturer: 'Fiat',
id: '345'
},
{
category: 'COUPE',
manufacturer: 'Peugeot',
id: '456'
},
{
category: 'SUV',
manufacturer: 'Volkswagen',
id: '567'
}
I want to query Algolia and get something similar to the following json
{
categories: {
SEDAN: {
count: 2,
items: [{
Volkswagen: {
count: 1,
items: [{
id: '123'
}]
}
},
{
Fiat: {
count: 1,
items: [{
id: '345'
}]
}
}]
},
COUPE: {
count: 2,
items: [{
Renault: {
count: 1,
items: [{
id: '234'
}]
}
},
{
Peugeot: {
count: 1,
items: [{
id: '456'
}]
}
}]
},
SUV: {
count: 1,
items: [{
Volkswagen: {
count: 1,
items: [{
id: '567'
}]
}
}]
}
}
}
I have been trying to query Algolia
index
.search({
query: '',
facets: ['category', 'manufacturer'],
attributesToRetrieve: []
})
.then((result) => {
console.log(result.facets);
});
But I am not sure if it is possible to combine the facets.
Facets added to a query don't work that way. They simply return the record count for each facet value, not the actual records (https://www.algolia.com/doc/api-reference/api-parameters/facets/).
You can create filters around facets and use those to display results by facet value, but there isn't a way to build a single response JSON that is already grouped by facets like you show above (https://www.algolia.com/doc/api-reference/api-parameters/filters/).
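Since the response cannot come back pre-grouped, one option is to group the retrieved records on the client. The sketch below assumes the records have already been fetched with category, manufacturer and id available on each hit; it makes no Algolia calls, and the function name groupByFacets is hypothetical.

```javascript
// Records from the question, standing in for the hits of a search response.
const hits = [
  { category: 'SEDAN', manufacturer: 'Volkswagen', id: '123' },
  { category: 'COUPE', manufacturer: 'Renault', id: '234' },
  { category: 'SEDAN', manufacturer: 'Fiat', id: '345' },
  { category: 'COUPE', manufacturer: 'Peugeot', id: '456' },
  { category: 'SUV', manufacturer: 'Volkswagen', id: '567' },
];

// Hypothetical helper: nest hits by category, then by manufacturer,
// keeping a count at each level.
function groupByFacets(records) {
  const categories = {};
  for (const { category, manufacturer, id } of records) {
    if (!categories[category]) categories[category] = { count: 0, items: [] };
    const cat = categories[category];
    cat.count += 1;

    // Each entry in items is an object with a single manufacturer key.
    let entry = cat.items.find((item) => manufacturer in item);
    if (!entry) {
      entry = { [manufacturer]: { count: 0, items: [] } };
      cat.items.push(entry);
    }
    entry[manufacturer].count += 1;
    entry[manufacturer].items.push({ id });
  }
  return { categories };
}

const grouped = groupByFacets(hits);
console.log(JSON.stringify(grouped, null, 2));
```

This only groups the hits it is given, so with a paginated search the counts reflect the retrieved page rather than the whole index.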

mongo db - map reduce and lookup

Is it possible to perform both a map-reduce and a lookup in the same query pipeline efficiently?
Let's say I have two collections:
items: { _id, group_id, createdAt }
purchases: { _id, item_id }
I want to get the top n item groups, based on the number of purchases on the most recent x items per group.
If I had the number of purchases available in the item documents, then I could aggregate and sort, but this is not the case.
I can get the most recent x items per group as so:
let x = 3;
let map = function () {
emit(this.group_id, { items: [this] });
};
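let reduce = function (key, values) {
// Values may already be reduced results holding several items, so flatten them all.
let allItems = values.reduce((acc, v) => acc.concat(v.items), []);
return { items: getLastXItems(x, allItems) };
};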
let scope = { x };
db.items.mapReduce(map, reduce, { out: { inline: 1 }, scope }, function(err, res) {
if (err) {
...
} else {
// res is an array of { group_id, items } where items is the last x items of the group
}
});
But I'm missing the purchase count, so I can't use it to sort the groups and output the top n groups (which, by the way, I'm not even sure I can do).
I'm using this on a web server and running the query with a scope variable that depends on the user context, so I don't want to output the result to another collection; I have to do everything inline.
=== edit 1 === add data example:
Sample data could be:
// items
{ _id: '1', group_id: 'a', createdAt: 0 }
{ _id: '2', group_id: 'a', createdAt: 2 }
{ _id: '3', group_id: 'a', createdAt: 4 }
{ _id: '4', group_id: 'b', createdAt: 1 }
{ _id: '5', group_id: 'b', createdAt: 3 }
{ _id: '6', group_id: 'b', createdAt: 5 }
{ _id: '7', group_id: 'b', createdAt: 7 }
{ _id: '8', group_id: 'c', createdAt: 5 }
{ _id: '9', group_id: 'd', createdAt: 5 }
// purchases
{ _id: '1', item_id: '1' }
{ _id: '2', item_id: '1' }
{ _id: '3', item_id: '3' }
{ _id: '4', item_id: '5' }
{ _id: '5', item_id: '5' }
{ _id: '6', item_id: '6' }
{ _id: '7', item_id: '7' }
{ _id: '8', item_id: '7' }
{ _id: '9', item_id: '7' }
{ _id: '10', item_id: '3' }
{ _id: '11', item_id: '9' }
and sample result with n = 3 and x = 2 would be:
[
{ group_id: 'a', numberOfPurchasesOnLastXItems: 4 },
{ group_id: 'b', numberOfPurchasesOnLastXItems: 3 },
{ group_id: 'c', numberOfPurchasesOnLastXItems: 1 },
]
I think this can be solved with the aggregation pipeline, but I have no idea how bad this is, especially performance-wise.
My concerns are:
will the aggregation pipeline be able to benefit from indexes, on lookup and sort?
can the lookup + projection that is only used to count matching items be simplified?
Anyway, I think one solution could be:
x = 2;
n = 3;
items.aggregate([
{
$lookup: {
from: 'purchases',
localField: '_id',
foreignField: 'item_id',
as: 'purchases',
},
},
/*
after the join, the data is like {
_id: <itemId>,
group_id: <itemGroupId>,
createdAt: <itemCreationDate>,
purchases: <arrayOfPurchases>,
}
*/
{
$project: {
group_id: 1,
createdAt: 1,
purchasesCount: { $size: '$purchases' },
}
},
/*
after the projection, the data is like {
_id: <itemId>,
group_id: <itemGroupId>,
createdAt: <itemCreationDate>,
purchasesCount: <numberOfPurchases>,
}
*/
{
$sort: { createdAt: 1 }
},
{
$group: {
_id: '$group_id',
items: {
$push: '$purchasesCount',
}
}
},
/*
after the group, the data is like {
_id: <groupId>,
items: <array of number of purchases per item, sorted per item creation date>,
}
*/
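{
$project: {
// '$items' is the array built by the $group stage. A negative count takes the
// last x entries, i.e. the most recent items given the ascending createdAt sort.
numberOfPurchasesOnMostRecentItems: { $sum: { $slice: ['$items', -x] } },
}
},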
/*
after the projection, the data is like {
_id: <groupId>,
numberOfPurchasesOnMostRecentItems: <number of purchases on the last x items>,
}
*/
{
// Descending, so that $limit keeps the n groups with the most purchases.
$sort: { numberOfPurchasesOnMostRecentItems: -1 }
},
{ $limit : n }
]);
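As a way to sanity-check the intended logic independently of MongoDB, the same computation can be sketched on plain arrays using the sample data from the question. This is an in-memory illustration only: the function name topGroups is hypothetical, and it assumes "most recent" means the highest createdAt values within each group.

```javascript
// Sample data copied from the question.
const items = [
  { _id: '1', group_id: 'a', createdAt: 0 },
  { _id: '2', group_id: 'a', createdAt: 2 },
  { _id: '3', group_id: 'a', createdAt: 4 },
  { _id: '4', group_id: 'b', createdAt: 1 },
  { _id: '5', group_id: 'b', createdAt: 3 },
  { _id: '6', group_id: 'b', createdAt: 5 },
  { _id: '7', group_id: 'b', createdAt: 7 },
  { _id: '8', group_id: 'c', createdAt: 5 },
  { _id: '9', group_id: 'd', createdAt: 5 },
];
const purchases = ['1', '1', '3', '5', '5', '6', '7', '7', '7', '3', '9']
  .map((item_id, i) => ({ _id: String(i + 1), item_id }));

// Hypothetical helper mirroring the pipeline: count purchases per item,
// sum them over the x most recent items of each group, keep the top n groups.
function topGroups(allItems, allPurchases, x, n) {
  // Equivalent of $lookup + $size: number of purchases per item id.
  const purchaseCount = new Map();
  for (const p of allPurchases) {
    purchaseCount.set(p.item_id, (purchaseCount.get(p.item_id) || 0) + 1);
  }
  // Equivalent of $group: collect the items of each group.
  const groups = new Map();
  for (const item of allItems) {
    if (!groups.has(item.group_id)) groups.set(item.group_id, []);
    groups.get(item.group_id).push(item);
  }
  // Sum purchases over the x most recent items of each group.
  const totals = [...groups].map(([group_id, members]) => {
    const recent = [...members].sort((a, b) => b.createdAt - a.createdAt).slice(0, x);
    const total = recent.reduce((sum, it) => sum + (purchaseCount.get(it._id) || 0), 0);
    return { group_id, numberOfPurchasesOnLastXItems: total };
  });
  // Equivalent of the final $sort (descending) + $limit.
  return totals
    .sort((a, b) => b.numberOfPurchasesOnLastXItems - a.numberOfPurchasesOnLastXItems)
    .slice(0, n);
}

const top = topGroups(items, purchases, 2, 3);
console.log(top);
```

Running this next to the pipeline on a small dataset is a cheap way to compare results before worrying about index usage and performance.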

mongodb is it possible to get all keys and values of object in documents in the collection?

Is it possible to get all keys and values of object in documents in the collection?
I have a collection in MongoDB with a structure like:
[{
_id: '55534c2e2750b4394debedd2',
selected_options: {
name: 'test',
size: 'S',
color: 'red'
}
},
{
_id: '55534c2e2750b4394debedd3',
selected_options: {
name: 'test2',
size: 'S',
color: 'red'
}
},
{
_id: '55534e087f01fa2a4d30f7f5',
selected_options: {
name: 'test3',
size: 'm',
color: 'green'
}
}
........
]
How can I get output like this:
[{
name: 'name',
values: ['test', 'test2', 'test3']
},
{
name: 'size',
values: ['S', 'm']
},
{
name: 'color',
values: ['red', 'green']
}
]
I think there is no way to get exactly that result without some processing on the client side. However, you can use the aggregation framework to get something similar to your desired output with a single query.
db.yourCollection.aggregate([
{$group:
{
_id: null,
names: {$addToSet: '$selected_options.name'},
sizes: {$addToSet: '$selected_options.size'},
colors: {$addToSet: '$selected_options.color'},
}
},
{$project:
{_id: 0, names: 1, colors: 1, sizes: 1}
}
])
This will output the following:
{
names: ['test', 'test2', 'test3'],
sizes: ['S', 'm'],
colors: ['red', 'green']
}
Another option is to run distinct() for each field.
See the example output on the documentation page.
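For the client-side processing mentioned above, a minimal sketch in plain JavaScript could collect the distinct values per key once the documents have been fetched. The function name collectOptionValues is hypothetical; it works on whatever keys appear under selected_options, so the field names do not need to be known in advance.

```javascript
// Sample documents copied from the question.
const docs = [
  { _id: '55534c2e2750b4394debedd2', selected_options: { name: 'test', size: 'S', color: 'red' } },
  { _id: '55534c2e2750b4394debedd3', selected_options: { name: 'test2', size: 'S', color: 'red' } },
  { _id: '55534e087f01fa2a4d30f7f5', selected_options: { name: 'test3', size: 'm', color: 'green' } },
];

// Hypothetical helper: gather the distinct values seen for every key
// of selected_options, in first-seen order.
function collectOptionValues(documents) {
  const valuesByKey = new Map();
  for (const doc of documents) {
    for (const [key, value] of Object.entries(doc.selected_options || {})) {
      if (!valuesByKey.has(key)) valuesByKey.set(key, new Set());
      valuesByKey.get(key).add(value);
    }
  }
  return [...valuesByKey].map(([name, values]) => ({ name, values: [...values] }));
}

const options = collectOptionValues(docs);
console.log(JSON.stringify(options, null, 2));
```

The same reshaping could be applied to the single document returned by the $group query by mapping over its names, sizes and colors fields instead.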

Query sub-documents with an offset in MongoDB

Given the following data:
{
_id: '123',
name: 'Foobar',
friends: [
{ name: 'a' },
{ name: 'b' },
{ name: 'c' },
{ name: 'd' },
{ name: 'e' }
]
}
Is there a way to query MongoDB to return a list of friends with an offset - e.g. skip the first two friends in the array ('a' and 'b') and return only 'c', 'd' and 'e'?
I've tried to use $slice, but it seems to require a "limit" as well, e.g.
db.users.findOne({ _id: '123' }, { friends: { $slice: [2,-1] } })
This will not work, since the "limit" (-1 in the above example) needs to be a positive integer.
It isn't terribly elegant, but just provide a limit value large enough to effectively not be a limit:
db.users.findOne({ _id: '123' }, { friends: { $slice: [2,1000000000] } })
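To see why an oversized limit behaves like "no limit", the [skip, limit] form can be imitated on a plain array. This is an in-memory sketch only, covering a non-negative skip and a positive limit; the function name sliceProjection is hypothetical and is not part of any MongoDB API.

```javascript
// Sample document copied from the question.
const user = {
  _id: '123',
  name: 'Foobar',
  friends: [{ name: 'a' }, { name: 'b' }, { name: 'c' }, { name: 'd' }, { name: 'e' }],
};

// Imitates { $slice: [skip, limit] } for a non-negative skip and a positive limit:
// skip the first `skip` elements, then return at most `limit` elements.
function sliceProjection(array, skip, limit) {
  return array.slice(skip, skip + limit);
}

// A limit larger than the array simply returns everything after the skipped part.
const remainingFriends = sliceProjection(user.friends, 2, 1000000000);
console.log(remainingFriends.map((f) => f.name));
```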