MongoDB sort documents by specified element in array

I have a collection like this:
[
{
student: "a",
package: [
{name: "one", createdAt: "2021-10-12T00:00:00", active: true},
{name: "two", createdAt: "2021-10-13T00:00:00", active: false},
{name: "three", createdAt: "2021-10-14T00:00:00", active: false}
]
},
{
student: "b",
package: [
{name: "one", createdAt: "2021-10-16T00:00:00", active: true},
{name: "two", createdAt: "2021-10-17T00:00:00", active: false},
{name: "three", createdAt: "2021-10-18T00:00:00", active: false}
]
},
{
student: "c",
package: [
{name: "one", createdAt: "2021-10-10T00:00:00", active: true},
{name: "two", createdAt: "2021-10-17T00:00:00", active: false},
{name: "three", createdAt: "2021-10-18T00:00:00", active: false}
]
}
]
How can I write a MongoDB query that sorts this collection by the createdAt of the package element that has active: true?
The expected result looks like this:
[
{
student: "c",
package: [
{name: "one", createdAt: "2021-10-10T00:00:00", active: true},
...
]
},
{
student: "a",
package: [
{name: "one", createdAt: "2021-10-12T00:00:00", active: true},
...
]
},
{
student: "b",
package: [
{name: "one", createdAt: "2021-10-16T00:00:00", active: true},
...
]
},
]
Could anyone help me with this? The only idea I have is to sort in application code, but is it possible to do it with a MongoDB query?

Query
Create a sort key for each document: the latest createdAt among its active package elements (the $reduce does this by keeping the maximum date).
Sort by that key.
$unset the extra key afterwards.
*To keep the earliest date instead of the latest, change $gt to $lt and replace the initial value "0" with a maximal string such as "9" (or, if you have real dates, with a minimum or maximum date). To reverse the output order, change the sort from 1 to -1.
db.collection.aggregate([
  // Temporary key: the latest createdAt among the active packages
  {"$set": {
    "sort-key": {
      "$reduce": {
        "input": "$package",
        "initialValue": "0",
        "in": {
          "$cond": [
            {"$and": ["$$this.active", {"$gt": ["$$this.createdAt", "$$value"]}]},
            "$$this.createdAt",
            "$$value"]}}}}},
  // Order the documents by the key, then drop it
  {"$sort": {"sort-key": 1}},
  {"$unset": ["sort-key"]}
])
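If it helps to see what the $reduce is doing, here is a minimal sketch of the same logic in plain JavaScript, run outside MongoDB. The function names activeSortKey and sortByActivePackage are made up for illustration, and the data is a shortened copy of the sample from the question.

```javascript
// Mirrors the $reduce stage: keep the latest createdAt among active packages,
// starting from the same "0" initial value used in the pipeline.
function activeSortKey(packages) {
  return packages.reduce(
    (value, pkg) => (pkg.active && pkg.createdAt > value ? pkg.createdAt : value),
    "0"
  );
}

// Hypothetical helper: returns a new array ordered by the computed key, ascending.
function sortByActivePackage(students) {
  return [...students].sort((x, y) => {
    const kx = activeSortKey(x.package);
    const ky = activeSortKey(y.package);
    return kx < ky ? -1 : kx > ky ? 1 : 0;
  });
}

// Shortened sample data from the question.
const students = [
  { student: "a", package: [
    { name: "one", createdAt: "2021-10-12T00:00:00", active: true },
    { name: "two", createdAt: "2021-10-13T00:00:00", active: false } ] },
  { student: "b", package: [
    { name: "one", createdAt: "2021-10-16T00:00:00", active: true },
    { name: "two", createdAt: "2021-10-17T00:00:00", active: false } ] },
  { student: "c", package: [
    { name: "one", createdAt: "2021-10-10T00:00:00", active: true },
    { name: "two", createdAt: "2021-10-17T00:00:00", active: false } ] }
];

const sortedNames = sortByActivePackage(students).map((s) => s.student);
console.log(sortedNames.join(", ")); // c, a, b
```

The string comparison only works because every createdAt uses the same ISO-like format; with mixed formats you would want real dates.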

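You can use this aggregation query:
First, $unwind deconstructs the array so that each package becomes its own document.
Then $set converts createdAt to a Date with $toDate.
Then $sort by active, so the active package comes first.
$group rebuilds the original documents, with the package array now ordered.
Finally, $sort again by package.createdAt. Note that an ascending sort on an array field compares the smallest value in the array, so this matches the active package's date only when the active package is also the earliest one, as in the sample data.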
db.collection.aggregate([
{
"$unwind": "$package"
},
{
"$set": {
"package.createdAt": {
"$toDate": "$package.createdAt"
}
}
},
{
"$sort": {
"package.active": -1
}
},
{
"$group": {
"_id": "$_id",
"student": {
"$first": "$student"
},
"package": {
"$push": "$package"
}
}
},
{
"$sort": {
"package.createdAt": 1
}
}
])
Also, sorting works best when createdAt is a Date field; if it is stored as a string, parse it to a date first, as the $toDate stage above does.

Related

MongoDB Use logical OR in group stage

I have a MongoDB collection which looks like this:
{user: "1", likes: 111, isEnabled: true},
{user: "1", likes: 222, isEnabled: false},
{user: "2", likes: 333, isEnabled: false},
{user: "2", likes: 444, isEnabled: false},
I want to group them by user, sum up the likes, and check whether at least one of a user's documents has "isEnabled" = true (basically a logical OR).
I want this result at the end:
{user: "1", allLikes: "333", isAnyEnabled: true},
{user: "2", allLikes: "777", isAnyEnabled: false},
I can use the $sum accumulator of the group stage to sum up the likes. My problem is that there seems to be no group operator which supports logical operations on booleans. Is there any way to do this?
I tried this so far
db.myCollection.aggregate([{$group: {
"_id":"$user",
"allLikes": {
"$sum": "$likes"
},
"isAnyEnabled":{"$or":"$isEnabled"}
}}])
But it seems that $or is not supported for this: unknown group operator '$or'

You can use $anyElementTrue in a $project stage like this:
First, $group by user to sum all the likes and collect every isEnabled value into an array.
Then use $anyElementTrue to check whether at least one value is true; the result becomes isAnyEnabled.
db.collection.aggregate([
{
"$group": {
"_id": "$user",
"allLikes": {
"$sum": "$likes"
},
"enabled": {
"$push": "$isEnabled"
}
}
},
{
"$project": {
"_id": "$_id",
"allLikes": "$allLikes",
"isAnyEnabled": {
"$anyElementTrue": [
"$enabled"
]
}
}
}
])
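For reference, the same group-and-OR logic can be sketched in plain JavaScript, which makes it easy to check the expected result without a database. This is only an illustration: summarizeByUser is a made-up name, and the data is the sample from the question.

```javascript
// Group by user, sum the likes, and OR the isEnabled flags together.
function summarizeByUser(docs) {
  const byUser = new Map();
  for (const doc of docs) {
    const entry = byUser.get(doc.user) ||
      { user: doc.user, allLikes: 0, isAnyEnabled: false };
    entry.allLikes += doc.likes;
    // Logical OR across the group: true as soon as one document is enabled.
    entry.isAnyEnabled = entry.isAnyEnabled || doc.isEnabled === true;
    byUser.set(doc.user, entry);
  }
  return [...byUser.values()];
}

const likeDocs = [
  { user: "1", likes: 111, isEnabled: true },
  { user: "1", likes: 222, isEnabled: false },
  { user: "2", likes: 333, isEnabled: false },
  { user: "2", likes: 444, isEnabled: false }
];

const summary = summarizeByUser(likeDocs);
console.log(JSON.stringify(summary));
```

Note that allLikes comes out as a number here, whereas the question's expected output shows it as a string.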

Adding a new field to the N most recent docs by day in MongoDB 4.2

I have a collection of documents with this schema:
{ _id: ObjectId, userId: ObjectId, marker: string, datetime: Date, etc... }
This is a collection of markers (marker) bound to a user (userId). The date of binding is stored in the datetime field.
Each day a user can receive an arbitrary number of markers.
When fetching data from this collection, I need to add an extra boolean field called allowed, which must be true only if the record is among the N most recent records for that calendar day for that user.
For example, if initial collection looks like this and N == 2 :
{_id: ..., userId: "a", marker: "m1", datetime: "2020-01-01.10:00"}
{_id: ..., userId: "a", marker: "m2", datetime: "2020-01-02.10:00"}
{_id: ..., userId: "a", marker: "m3", datetime: "2020-01-02.11:00"}
{_id: ..., userId: "a", marker: "m4", datetime: "2020-01-02.12:00"}
{_id: ..., userId: "a", marker: "m5", datetime: "2020-01-02.13:00"}
{_id: ..., userId: "b", marker: "m1", datetime: "2020-01-01.10:00"}
{_id: ..., userId: "b", marker: "m2", datetime: "2020-01-01.11:00"}
{_id: ..., userId: "b", marker: "m3", datetime: "2020-01-01.13:00"}
{_id: ..., userId: "b", marker: "m4", datetime: "2020-01-02.11:00"}
{_id: ..., userId: "b", marker: "m5", datetime: "2020-01-02.12:00"}
{_id: ..., userId: "b", marker: "m6", datetime: "2020-01-03.10:00"}
Then final result should look like this:
{_id: ..., userId: "a", marker: "m1", datetime: "2020-01-01.10:00", allowed: true}
{_id: ..., userId: "a", marker: "m2", datetime: "2020-01-02.10:00", allowed: true}
{_id: ..., userId: "a", marker: "m3", datetime: "2020-01-02.11:00", allowed: true}
{_id: ..., userId: "a", marker: "m4", datetime: "2020-01-02.12:00", allowed: false}
{_id: ..., userId: "a", marker: "m5", datetime: "2020-01-02.13:00", allowed: false}
{_id: ..., userId: "b", marker: "m1", datetime: "2020-01-01.10:00", allowed: true}
{_id: ..., userId: "b", marker: "m2", datetime: "2020-01-01.11:00", allowed: true}
{_id: ..., userId: "b", marker: "m3", datetime: "2020-01-01.13:00", allowed: false}
{_id: ..., userId: "b", marker: "m4", datetime: "2020-01-02.11:00", allowed: true}
{_id: ..., userId: "b", marker: "m5", datetime: "2020-01-02.12:00", allowed: true}
{_id: ..., userId: "b", marker: "m6", datetime: "2020-01-03.10:00", allowed: true}
I'm using MongoDB 4.2.

Please try below queries :
Query 1:
db.markers.aggregate([
/** group docs based on userId & date(2020-01-01), push all matched docs to data */
{ $group: { _id: { userId: '$userId', datetime: { $arrayElemAt: [{ $split: ["$datetime", "."] }, 0] } }, data: { $push: '$$ROOT' } } },
/** Re-forming data field with added new field allowed for only docs where criteria is met */
{
$addFields: {
data: {
$map:
{
input: "$data",
as: "each",
/** conditional check to add new field on only docs which are 0 & 1 position of array */
in: { $cond: [{ $lte: [{ $indexOfArray: ["$data", '$$each'] }, 1] }, { $mergeObjects: ['$$each', { allowed: true }] }, { $mergeObjects: ['$$each', { allowed: false }] }] }
}
}
}
},
/** unwind data */
{ $unwind: '$data' },
/** making data object as root level doc */
{ $replaceRoot: { newRoot: "$data" } }])
Query 2:
db.markers.aggregate([
{ $group: { _id: { userId: '$userId', datetime: { $arrayElemAt: [{ $split: ["$datetime", "."] }, 0] } }, data: { $push: '$$ROOT' } } }, {
$addFields: {
data: {
$map:
{
input: "$data",
as: "each",
in: {
$cond: [{
$or: [{ $eq: [{ $arrayElemAt: ["$data", -1] }, '$$each'] }, { $eq: [{ $arrayElemAt: ["$data", -2] }, '$$each'] }]
},
{ $mergeObjects: ['$$each', { allowed: true }] },
{ $mergeObjects: ['$$each', { allowed: false }] }]
}
}
}
}
}, { $unwind: '$data' }, { $replaceRoot: { newRoot: "$data" } }])
Query 1 relies on array positions 0 and 1, so it works only if the documents reach the $group stage in exactly the order shown in the question. The data in the question is presumably a sample; in a collection with continuous writes the order can differ, and Query 1's index check would then mark the wrong documents. Query 2 instead marks the last two documents of each group.
Note: Query 2 could reuse Query 1's logic (checking indexes 0 and 1) instead of comparing objects, but only with a $sort on the datetime field as the first stage. I did not go that route because sorting the whole collection on one field would be less efficient than this.
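To make the ordering issue concrete, here is a small plain-JavaScript sketch that sorts inside each user/day group before marking, so it does not depend on insertion order. It assumes, like the queries above, that datetime is a string whose part before the "." is the calendar day; markAllowed is a made-up name. It marks the N latest records per day, whereas the sample output in the question marks the earliest N, so flip the comparison if that is what you need.

```javascript
// Mark the n most recent documents per user and calendar day.
function markAllowed(docs, n) {
  // Group by userId plus the day part of the datetime string.
  const groups = new Map();
  for (const doc of docs) {
    const day = doc.datetime.split(".")[0];
    const key = doc.userId + "|" + day;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc);
  }
  const result = [];
  for (const group of groups.values()) {
    // Newest first, so the first n entries are the most recent ones.
    const newestFirst = [...group].sort((x, y) =>
      x.datetime < y.datetime ? 1 : x.datetime > y.datetime ? -1 : 0);
    const allowed = new Set(newestFirst.slice(0, n));
    for (const doc of group) {
      result.push({ ...doc, allowed: allowed.has(doc) });
    }
  }
  return result;
}

// User "a" from the question's sample data.
const markers = [
  { userId: "a", marker: "m1", datetime: "2020-01-01.10:00" },
  { userId: "a", marker: "m2", datetime: "2020-01-02.10:00" },
  { userId: "a", marker: "m3", datetime: "2020-01-02.11:00" },
  { userId: "a", marker: "m4", datetime: "2020-01-02.12:00" },
  { userId: "a", marker: "m5", datetime: "2020-01-02.13:00" }
];

const allowedMarkers = markAllowed(markers, 2)
  .filter((d) => d.allowed)
  .map((d) => d.marker);
console.log(allowedMarkers.join(", ")); // m1, m4, m5
```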

MongoDB how to add conditional $match condition

I have the following schema
{
f1: "test",
f2: "something",
type: "A",
date: "2018-11-01T00:00:00.000Z",
deleted: false
},
{
"f1": "check",
type: "B",
deleted: false
}
Now what I want is to get all data, and if type = "A", add an extra condition to my match query, for example comparing its date with the current date.
My current query is:
db.getCollection('myCollection').aggregate([
{$match: {
"deleted": false
// I want to check if type is A then compare its date
}}
])

You could try an $or that says "either it's not type A, or the date is in range":
{$match: {
  $and: [
    {deleted: false},
    {$or: [
      {type: {$ne: 'A'}},
      {date: {$gte: ISODate("2018-01-01T00:00:00.0Z"), $lt: ISODate("2018-06-01T00:00:00.0Z")}}
    ]}
  ]
}}
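The condition above reads as "not deleted, and (not type A, or date within the range)". A rough plain-JavaScript equivalent, handy for checking the logic against a few documents, could look like this. The name matchesFilter and the extra in-range document are made up for illustration. The sample stores date as a string, so the sketch parses it; the ISODate comparison in the $match assumes date is stored as a Date in MongoDB.

```javascript
// deleted must be false; type "A" documents must also fall inside [from, to).
function matchesFilter(doc, from, to) {
  if (doc.deleted !== false) return false;
  if (doc.type !== "A") return true;
  const date = new Date(doc.date); // Invalid Date fails both comparisons below
  return date >= from && date < to;
}

const from = new Date("2018-01-01T00:00:00.000Z");
const to = new Date("2018-06-01T00:00:00.000Z");

const sampleDocs = [
  { f1: "test", f2: "something", type: "A", date: "2018-11-01T00:00:00.000Z", deleted: false },
  { f1: "check", type: "B", deleted: false },
  // Hypothetical extra document whose date falls inside the range.
  { f1: "inrange", type: "A", date: "2018-03-01T00:00:00.000Z", deleted: false }
];

const matched = sampleDocs.filter((d) => matchesFilter(d, from, to)).map((d) => d.f1);
console.log(matched.join(", ")); // check, inrange
```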

Use $match with $or condition.
db.getCollection('tests').aggregate([
{ $match: {
$or : [
{ "deleted": false, "type": "A", "date": ISODate("2018-11-01T00:00:00.000Z") },
{ "deleted": false, type: { $ne: "A" }}
]}
}
])

MongoDB Aggregation SUM Array of Arrays by object key

Okay, so I've been searching for a while but couldn't find an answer to this, and I am desperate :P
I have some documents with this syntax
{
"period": ISODate("2018-05-29T22:00:00.000+0000"),
"totalHits": 13982,
"hits": [
{
// some fields...
users: [
{
// some fields...
userId: 1,
products: [
{ productId: 1, price: 30 },
{ productId: 2, price: 30 },
{ productId: 3, price: 30 },
{ productId: 4, price: 30 },
]
},
]
}
]
}
And I want to retrieve a count of how many products (Independently of which user has them) we have on a period, an example output would be like this:
[
{
"period": ISODate("2018-05-27T22:00:00.000+0000"),
"count": 432
},
{
"period": ISODate("2018-05-28T22:00:00.000+0000"),
"count": 442
},
{
"period": ISODate("2018-05-29T22:00:00.000+0000"),
"count": 519
}
]
What is driving me crazy is the "object inside an array inside an array". I've done many aggregations, but I think they were simpler than this one, so I am a bit lost.
I am thinking about changing our document structure to a better one, but we have ~6M documents that we would need to transform, and that's just a mess... but maybe it's the only solution.
We are using MongoDB 3.2, we can't update our systems atm (I wish, but not possible).

You can use $unwind to expand your array, then use $group to sum:
db.test.aggregate([
{$match: {}},
{$unwind: "$hits"},
{$project: {_id: "$_id", period: "$period", users: "$hits.users"}},
{$unwind: "$users"},
{$project: {_id: "$_id", period: "$period", count: {$size: "$users.products"}}},
{$group: {"_id": "$period", "count": {$sum: "$count"}}}
])
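As a sanity check on what the two $unwind stages plus $size and $group compute, here is a plain-JavaScript sketch of the same count. It assumes period is represented as a string instead of an ISODate so that it runs without MongoDB; countProductsByPeriod and the second sample document are made up for illustration.

```javascript
// Count products per period across every hit and every user.
function countProductsByPeriod(docs) {
  const counts = new Map();
  for (const doc of docs) {
    let productCount = 0;
    for (const hit of doc.hits || []) {
      for (const user of hit.users || []) {
        productCount += (user.products || []).length;
      }
    }
    counts.set(doc.period, (counts.get(doc.period) || 0) + productCount);
  }
  return [...counts].map(([period, count]) => ({ period, count }));
}

const hitDocs = [
  { period: "2018-05-29T22:00:00.000Z", totalHits: 13982, hits: [
    { users: [
      { userId: 1, products: [
        { productId: 1, price: 30 }, { productId: 2, price: 30 },
        { productId: 3, price: 30 }, { productId: 4, price: 30 } ] } ] } ] },
  // Hypothetical second document for the same period.
  { period: "2018-05-29T22:00:00.000Z", totalHits: 1, hits: [
    { users: [ { userId: 2, products: [ { productId: 1, price: 30 } ] } ] } ] }
];

const productCounts = countProductsByPeriod(hitDocs);
console.log(JSON.stringify(productCounts));
```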

Why is $match not used in the Mongo Aggregation query?

As described in the mongo documentation:
https://docs.mongodb.com/manual/reference/sql-aggregation-comparison/
There is a query for the following SQL query:
SELECT cust_id,
SUM(li.qty) as qty
FROM orders o,
order_lineitem li
WHERE li.order_id = o.id
GROUP BY cust_id
And the equivalent mongo aggregation query is as follows:
db.orders.aggregate( [
{ $unwind: "$items" },
{
$group: {
_id: "$cust_id",
qty: { $sum: "$items.qty" }
}
}
] )
The query works fine, as expected. My question is: why is there no $match stage corresponding to the WHERE clause in SQL? And how does $unwind make up for the missing $match?

The comment by @Veeram is correct. The WHERE clause in the SQL is unnecessary because the items list is embedded in the orders collection, whereas in a relational database you would have both an orders table and an order_lineitem table (names taken from the description at https://docs.mongodb.com/manual/reference/sql-aggregation-comparison/)
Per the example data, you start with documents like this:
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
items: [ { sku: "xxx", qty: 25, price: 1 },
{ sku: "yyy", qty: 25, price: 1 } ]
}
When you $unwind, the items array is split into one document per element, and the rest of the fields are copied into each of those documents. If you run a query like
db.orders.aggregate([ {"$unwind": "$items"} ])
you get the output
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
items: { sku: "xxx", qty: 25, price: 1 }
},
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
items: { sku: "yyy", qty: 25, price: 1 }
}
That has flattened the items array, allowing the $group to add the items.qty field:
db.orders.aggregate([
{"$unwind": "$items"},
{"$group": {
"_id": "$cust_id",
"qty": {"$sum": "$items.qty"}
}
}])
With the output:
{ "_id": "abc123",
"qty": 50
}
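If it helps, the unwind-then-group behaviour can be imitated in plain JavaScript. This is only a sketch: unwind and sumQtyByCustomer are made-up names, the date is kept as a string, and this unwind simply skips documents whose field is not an array, which is a simplification of the real $unwind.

```javascript
// One output document per array element, with the other fields copied over.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    const values = doc[field];
    if (!Array.isArray(values)) continue; // simplification, see note above
    for (const value of values) {
      out.push({ ...doc, [field]: value });
    }
  }
  return out;
}

// Equivalent of grouping by cust_id and summing items.qty.
function sumQtyByCustomer(unwoundDocs) {
  const totals = new Map();
  for (const doc of unwoundDocs) {
    totals.set(doc.cust_id, (totals.get(doc.cust_id) || 0) + doc.items.qty);
  }
  return [...totals].map(([id, qty]) => ({ _id: id, qty }));
}

const orders = [
  { cust_id: "abc123", ord_date: "2012-11-02T17:04:11.102Z", status: "A", price: 50,
    items: [ { sku: "xxx", qty: 25, price: 1 }, { sku: "yyy", qty: 25, price: 1 } ] }
];

const unwound = unwind(orders, "items");
const totals = sumQtyByCustomer(unwound);
console.log(JSON.stringify(totals)); // [{"_id":"abc123","qty":50}]
```

The join in the SQL version is implicit here: each line item already lives inside its order, so there is nothing left to match on.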