Unable to aggregate results in MongoDB

I'm an experienced SQL user but an absolute MongoDB/JSON newbie. I'm trying to aggregate results from a couple of collections in our database and keep running into this error: uncaught exception: SyntaxError: missing : after property id :
This is the script I'm using:
db.transactions.aggregate([
{
$match:
{
$and:
[
{
"updated_at": { $gte: ISODate("2022-01-01") }
},
{
"updated_at": { $lte: ISODate("2022-03-31") }
},
]
}
},
{
$lookup:
{
from: 'clients',
localField: 'client',
foreignField: '_id',
as: 'clients'
}
},
{
$unwind: '$clients'
},
{
$addFields:
{
"client_name": "$clients.client_name"
,"client_label": "$clients.client_label"
,"client_code": "$clients.client_code"
,"client_country": "$clients.client_country"
,"client_base_currency": "$clients.client_base_currency"
,"client_invoice_currency": "$clients.client_invoice_currency"
}
},
{
$project:
{
client_name: 1
,client_label: 1
,client_code: 1
,client_country: 1
,client_base_currency: 1
,client_invoice_currency: 1
,updated_at: 1
,usd_value: 1
}
},
{
$group:
{
_id:
{ $dateToString: { "date": "$updated_at", "format": "%Y-%m" } }
,"$client_name"
,"$client_label"
,"$client_code"
,"$client_country"
,"$client_base_currency"
,"$client_invoice_currency"
,total_vol: { $sum: "$usd_value" }
}
}
])
With some Googling I've been able to put together this script, but now I'm getting stuck. I'm sure it's happening in the $group stage, as when I comment out that whole part, the query works fine.
I'm basically trying to get the equivalent of this SQL script:
select
(extract(year from t.updated_at) * 100 + extract(month from t.updated_at)) as year_month
,c.client_name
,c.client_label
,c.client_code
,c.client_country
,c.client_base_currency
,c.client_invoice_currency
,sum(t.usd_value) as total_vol
from transactions t
left join clients c
on t.client = c._id
where t.updated_at between '2022-01-01' and '2022-03-31'
group by 1,2,3,4,5,6,7
Any suggestions on how to correctly do this? I know this is fairly basic, but it's not entirely clicking yet, the whole JSON syntax.
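The syntax error most likely comes from the $group stage listing bare strings such as "$client_name" as if they were properties: in a JavaScript object literal every key needs a value, hence "missing : after property id". A minimal sketch of a $group stage, assuming the field names from the script above: all grouping keys go inside one _id document, each with its own name, and the accumulator sits beside _id. The names groupStage and year_month are illustrative and not part of the original script.

```javascript
// Sketch only: every grouping key lives inside a single, named _id document.
// `groupStage` and the key name `year_month` are illustrative assumptions.
const groupStage = {
  $group: {
    _id: {
      year_month: { $dateToString: { date: "$updated_at", format: "%Y-%m" } },
      client_name: "$client_name",
      client_label: "$client_label",
      client_code: "$client_code",
      client_country: "$client_country",
      client_base_currency: "$client_base_currency",
      client_invoice_currency: "$client_invoice_currency"
    },
    // Accumulators are siblings of _id, not members of it.
    total_vol: { $sum: "$usd_value" }
  }
};

// In the mongo shell, this object would replace the last stage of the pipeline:
// db.transactions.aggregate([ ...earlierStages, groupStage ])
```

This mirrors the SQL "group by 1,2,3,4,5,6,7": the seven keys inside _id play the role of the seven grouped columns.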

Related

mongodb/aggregation/lookup - Optimize lookup pipeline to only filter the matched record from previous pipeline and also aggregating it

DB Schema:
3 Collections - `Collection1`, `Collection2`, `Collection3`
Column Structure:
Collection1 - [ _id, rid, aid, timestamp, start_timestamp, end_timestamp ]
Collection2 - [ _id, rid, aid, task ]
Collection3 - [ _id, rid, event_type, timestamp ]
MONGO DB VERSION: 4.2.14
The problem: we need to join all three collections on rid. Collection1 is the parent source that the analysis starts from.
Collection2 contains the tasks for each rid in Collection1; we need a count of tasks per rid. Collection3 contains the event log for each rid. It is quite large, so we only need two events, EventA and EventB, for each rid found in the first pipeline stage.
I came up with the query below, but it's not working: I am not getting the min date from Collection3 for each rid matched in the previous stage.
Note: the min date of each event should be associated with the rid matched by the Collection1 filter.
Query:
db.getCollection("Collection1").aggregate([
{
$match: {
start_timestamp: {
$gte: new Date(ISODate().getTime() - 1000 * 60 * 15),
},
},
},
{
$lookup: {
from: "Collection2",
localField: "rid",
foreignField: "rid",
as: "tasks",
},
},
{
$lookup: {
from: "Collection3",
pipeline: [
{
$match: {
event: {
$in: ["EventA", "EventB"]
}
}
},
{
$group: {
_id: "$event",
timestamp: {
"$min": {
"updated": "$timestamp"
}
}
}
}
],
as: "eventlogs",
},
}
]);
Expected Output:
[
{
rid: "123",
aid: "456",
timestamp: ISODate("2022-06-03T09:46:39.609Z"),
start_timestamp: ISODate("2022-06-03T09:46:39.657Z"),
tasks: [
{ count: 5 }
],
logs: [
{
"_id": "EventA",
"timestamp": {
"updated": ISODate("2022-04-27T06:10:44.323Z")
}
},
{
"_id": "EventB",
"timestamp": {
"updated": ISODate("2022-05-05T06:36:10.271Z")
}
}
]
}
]
I need a highly optimized query that does the above quickly (assuming proper indexes are in place on the relevant columns of each collection). The query should not do a COLLSCAN over all of Collection3, since that collection is going to be quite large.
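One way to tie the event lookup to each rid, sketched under assumptions: $lookup in MongoDB 4.2 accepts let together with pipeline, so the sub-pipeline can match on the outer document's rid through $expr. The field name event_type is taken from the schema listing (the query above uses event), the variable names are illustrative, and whether an index on Collection3 (for example on rid and event_type) is actually used should be confirmed with explain(). This sketch returns the minimum timestamp directly rather than wrapped in an updated sub-document.

```javascript
// Sketch only: correlated $lookup stages using `let` + `$expr` (available in 4.2).
// Field name `event_type` follows the schema listing; adjust if the real field is `event`.
const fifteenMinutesAgo = new Date(Date.now() - 1000 * 60 * 15);

const pipeline = [
  { $match: { start_timestamp: { $gte: fifteenMinutesAgo } } },
  {
    $lookup: {
      from: "Collection2",
      let: { rid: "$rid" },
      pipeline: [
        { $match: { $expr: { $eq: ["$rid", "$$rid"] } } },
        { $count: "count" } // yields [{ count: N }], or [] when there are no tasks
      ],
      as: "tasks"
    }
  },
  {
    $lookup: {
      from: "Collection3",
      let: { rid: "$rid" },
      pipeline: [
        {
          $match: {
            $expr: { $eq: ["$rid", "$$rid"] }, // only events of the current rid
            event_type: { $in: ["EventA", "EventB"] }
          }
        },
        { $group: { _id: "$event_type", timestamp: { $min: "$timestamp" } } }
      ],
      as: "eventlogs"
    }
  }
];

// In the mongo shell:
// db.getCollection("Collection1").aggregate(pipeline)
```

The key difference from the query above is the let/$expr pair: without it, the sub-pipeline runs over all of Collection3 for every Collection1 document and returns the same global minimum each time.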

Is there a way to use find and aggregate together in MongoDB?

I'm a MongoDB beginner and I have the following problem:
I have a document format (sorry for the lack of definition) as follows in MongoDB:
And I want to query the top 10 albums of the worst genre of a decade I choose.
First, I wrote an aggregate whose last stage gives me the worst genre of the chosen decade, to use for comparison later (BDA1 being my database and album the collection I want to aggregate and find on):
BDA1.album.aggregate(
[
{
$addFields: {
release_date: {
$toDate: "$release_date"
}
}
},
{
$addFields: {
sales_amount: {
$convert: {
input: "$sales_amount",
to: "int"
}
}
}
},
{
$match: {
"release_date": {
$gte: new ISODate("2009-01-01"),
$lt: new ISODate("2021-01-01")
}
}
},
{
$unwind: {
path: "$band.band_genre",
}
},
{
$group: {
_id: "$band.band_genre",
total: {
$sum: "$sales_amount"
}
}
},
{
$sort: {
total: 1
}
},
{
$limit: 1
}
])
(Sorry for the lack of good formatting but I took the code from a pipeline I used to do the aggregation in MongoDB Compass.)
That resulted in:
But my question now is: how do I use that aggregate result in what I can only assume is a find command, where band.band_genre equals the genre I just calculated in the aggregation?
I have been searching SO and Google for a while with no results.
Any suggestions?
(If I have forgotten to mention anything you feel is important for understanding the problem, please say so and I will edit it in.)
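One possible approach, sketched for the mongo shell where cursors can be read synchronously: run the aggregate first, read the genre from the _id of its single result document, and pass it to find. The helper name topAlbumsOfWorstGenre is hypothetical. Sorting by sales_amount assumes that field is stored as a number; the pipeline above converts it, which suggests it may be a string, in which case a second aggregate with the same $convert stage would be the safer choice. The sketch also does not repeat the release_date filter.

```javascript
// Sketch only, for the mongo shell (synchronous cursors).
// `topAlbumsOfWorstGenre` is a hypothetical helper; `worstGenrePipeline` stands for
// the aggregation pipeline shown in the question.
function topAlbumsOfWorstGenre(db, worstGenrePipeline) {
  // Step 1: the pipeline ends with $limit: 1, so at most one document comes back.
  const worst = db.album.aggregate(worstGenrePipeline).toArray()[0];
  if (!worst) {
    return []; // no genre found for the chosen decade
  }

  // Step 2: reuse the computed genre (stored in _id by $group) in a plain find.
  return db.album
    .find({ "band.band_genre": worst._id })
    .sort({ sales_amount: -1 }) // assumption: numeric sales_amount
    .limit(10)
    .toArray();
}

// Usage in the shell, after `use BDA1`:
// topAlbumsOfWorstGenre(db, pipelineFromTheQuestion)
```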

translate sql query into mongodb query

I'm trying to grasp the MongoDB concepts by translating some of our SQL queries into the Mongo aggregation framework.
I have the following SQL code:
select dbo.VisitNo(u.id) as visitNo , o.id, o.PatientId, u.VisitDate
from dbo.Observation o
join dbo.ProspectiveFollowUp u on u.rootid = o.Id
order by o.PatientId
The dbo.VisitNo is implemented as:
CREATE FUNCTION dbo.VisitNo(@Id int)
RETURNS INT
AS
BEGIN
DECLARE @VisitDate date, @RootId int
SELECT @VisitDate=VisitDate, @RootId=RootId FROM dbo.ProspectiveFollowUp WHERE Id=@Id
RETURN (SELECT COUNT(1) FROM dbo.ProspectiveFollowUp WHERE RootId = @RootId AND VisitDate <= @VisitDate)
END
result:
My document in Mongo has following structure:
{
"_id",
"values":[
{
"Id",
"PatientId",
"ProspectiveFollowUp":[
"Id",
"RootId",
"VisitDate"
]
}
]
}
The values array has always one element, but that's how the data was imported. ProspectiveFollowUp has at least one record.
Creating query for retrieving the data was rather easy:
db.dbo_ObservationJSON.aggregate([
{ $unwind: '$values' },
{
$project: {
_id: 0,
Id: '$values.Id',
PatientId: '$values.PatientId',
VisitDate: '$values.ProspectiveFollowUp.VisitDate'
}
},
{ $unwind: '$VisitDate' },
{ $sort: { PatientId: 1 } }
])
The harder part is the custom function itself. I can't think outside of the T-SQL world yet, so I'm having a hard time getting this to work. I translated the function into Mongo the following way:
var id = 4
var result = db.dbo_ObservationJSON.aggregate([
{ $unwind: '$values' },
{ $unwind: '$values.ProspectiveFollowUp' },
{ $project: { Id: '$values.ProspectiveFollowUp.Id', RootId: '$values.ProspectiveFollowUp.RootId', VisitDate: '$values.ProspectiveFollowUp.VisitDate', _id:0 }},
{ $match: { Id: id }}
]).toArray()[0]
var totalResult = db.dbo_ObservationJSON.aggregate([{
$unwind: {
path: '$values'
}
}, {
$unwind: {
path: '$values.ProspectiveFollowUp'
}
}, {
$project: {
Id: '$values.ProspectiveFollowUp.Id',
RootId: '$values.ProspectiveFollowUp.RootId',
VisitDate: '$values.ProspectiveFollowUp.VisitDate'
}
}, {
$match: {
RootId: result.RootId,
VisitDate: {
$lte: result.VisitDate
}
}
},{$count: 'total'}]).toArray()[0]
But I don't know how to integrate it into the aggregation above.
Can I write the equivalent of the entire SQL query as one Mongo aggregate expression?
I finally got it to work.
db.dbo_ObservationJSON.aggregate([
{ $unwind: '$values' },
{ $unwind: { path: '$values.ProspectiveFollowUp', "includeArrayIndex": "index" } },
{
$project: {
_id: 0,
VisitNo: { $add: ['$index', 1] },
RootId: '$values.ProspectiveFollowUp.RootId',
PatientId: '$values.PatientId',
VisitDate: '$values.ProspectiveFollowUp.VisitDate'
}
},
{
$sort: {
PatientId: 1
}
}
]);
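The includeArrayIndex approach numbers visits by array position, so it matches dbo.VisitNo only when each ProspectiveFollowUp array is already ordered by VisitDate. A sketch that instead counts the follow-ups with an earlier or equal VisitDate, which is what the SQL function does, assuming all entries of one embedded array share the same RootId. The names pipeline and visits are illustrative.

```javascript
// Sketch only: VisitNo computed by counting, independent of array order.
// Assumption: all follow-ups embedded in one observation share the same RootId.
const pipeline = [
  { $unwind: "$values" },
  {
    $project: {
      _id: 0,
      Id: "$values.Id",
      PatientId: "$values.PatientId",
      visits: {
        $map: {
          input: "$values.ProspectiveFollowUp",
          as: "v",
          in: {
            VisitDate: "$$v.VisitDate",
            // Count follow-ups whose VisitDate is <= this visit's VisitDate.
            VisitNo: {
              $size: {
                $filter: {
                  input: "$values.ProspectiveFollowUp",
                  as: "w",
                  cond: { $lte: ["$$w.VisitDate", "$$v.VisitDate"] }
                }
              }
            }
          }
        }
      }
    }
  },
  { $unwind: "$visits" }, // one output document per visit
  { $sort: { PatientId: 1 } }
];

// In the mongo shell:
// db.dbo_ObservationJSON.aggregate(pipeline)
```

Each result carries the visit under the visits field (visits.VisitNo, visits.VisitDate); a final $project could flatten it if needed.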

SQL to Mongo Aggregation

Hi, I want to convert my SQL query to a Mongo aggregation.
select c.year, c.minor_category, count(c.minor_category) from Crime as c
group by c.year, c.minor_category having c.minor_category = (
Select cc.minor_category from Crime as cc where cc.year=c.year group by
cc.minor_category order by count(*) desc, cc.minor_category limit 1)
I tried do something like this:
db.crimes.aggregate({
$group: {
"_id": {
year: "$year",
minor_category :"$minor_category",
count: {$sum: "$minor_category"}
}
},
},
{
$match : {
minor_category: ?
}
})
But I'm stuck on $match, which is the equivalent of HAVING; I don't know how to write subqueries in Mongo like the one in my SQL query.
Can anybody help me?
OK, based on the confirmation above, the query below should work.
db.crime.aggregate([
{"$group":{"_id":{"year":"$year","minor":"$minor_category"},"count":{"$sum":1}}},
{"$project":{"year":"$_id.year","count":"$count","minor":"$_id.minor","document":"$$ROOT"}},
{"$sort":{"year":1,"count":-1}},
{"$group":{"_id":{"year":"$year"},"orig":{"$first":"$document"}}},
{"$project":{"_id":0,"year":"$orig._id.year","minor":"$orig._id.minor","count":"$orig.count"}}
])
This translates into the following MongoDB query:
db.crime.aggregate([{
$group: { // group by year and minor_category
_id: {
"year": "$year",
"minor_category": "$minor_category"
},
"count": { $sum: 1 } // count all documents per group
}
}, {
$sort: {
"count": -1, // sort descending by count
"_id.minor_category": 1 // and ascending by minor_category (it lives under _id after $group)
}
}, {
$group: { // now we get the highest element per year
_id: "$_id.year", // so group by year
"minor_category": { $first: "$_id.minor_category" }, // and get the first (we've sorted the data) value
"count": { $first: "$count" } // same here
}
}, {
$project: { // remove the _id field and add the others in the right order (if needed)
"_id": 0,
"year": "$_id",
"minor_category": "$minor_category",
"count": "$count"
}
}])

Find or forEach Inside Aggregate MongoDB

Let's say I have two collections in my database, which is called rumahsakit.
First collection called dim_dokter:
[{
"_id": ObjectId("58b22c79e8c1c52bf3fad997"),
"nama_dokter": "Dr. Basuki Hamzah",
"spesialisasi": "Spesialis Farmakologi Klinik",
"alamat": " Jalan Lingkar Ring Road Utara, Yogyakarta "
}, {
"_id": ObjectId("58b22c79e8c1c52bf3fad998"),
"nama_dokter": "Dr. Danie Nukman",
"spesialisasi": "Spesialis Anak",
"alamat": " Jalan Sudirman, Yogyakarta "
}, {
"_id": ObjectId("58b22c79e8c1c52bf3fad999"),
"nama_dokter": "Dr. Bambang Kurnia",
"spesialisasi": "Spesialis Mikrobiologi Klinik",
"alamat": " Jalan Ahmad Yani, Yogyakarta "
}]
Second collection called fact_perawatan:
[{
"_id": ObjectId("58b22d13e8c1c52bf3fad99a"),
"nama_pasien": "Clark",
"detail_perawatan": [{
"id_dokter": ObjectId("58b22c79e8c1c52bf3fad997"),
"jumlah_obat": 1
}, {
"id_dokter": ObjectId("58b22c79e8c1c52bf3fad998"),
"jumlah_obat": 1
}]
}]
The fact_perawatan collection has an id_dokter field that points to dim_dokter._id. I want to run an aggregation that shows the data in fact_perawatan, but with nama_dokter from dim_dokter instead of just the id_dokter.
This is my code so far:
db.fact_perawatan.aggregate([
{
$match:
{
'_id': db.fact_perawatan.find({"_id" : ObjectId("58b22d13e8c1c52bf3fad99a")})[0]._id
}
},
{
$project:
{
nama_pasien: db.fact_perawatan.find({"_id": ObjectId("58b22d13e8c1c52bf3fad99a")})[0].nama_pasien,
perawatan: [
{
dokter: db.dim_dokter.find({"_id" : db.fact_perawatan.find({"_id": ObjectId("58b22d13e8c1c52bf3fad99a")})[0].detail_perawatan[0].id_dokter})[0].nama_dokter,
},
]
}
}
])
result:
{ "_id" : ObjectId("58b22d13e8c1c52bf3fad99a"), "nama_pasien" : "Clark", "perawatan" : [ { "dokter" : "Dr. Basuki Hamzah" } ] }
This code gets the nama_dokter from dim_dokter, but only for one entry. In my case there can be up to 5 entries, and hard-coding detail_perawatan[0] to [5] is not a solution.
So, this code:
db.dim_dokter.find({"_id" : db.fact_perawatan.find({"_id": ObjectId("58b22d13e8c1c52bf3fad99a")})[0].detail_perawatan[0].id_dokter})[0].nama_dokter,
How can I make the code above loop over every entry, so that I get all the data?
Thanks
You can use a $lookup to left join with dim_dokter, followed by a $group to regroup your data in the perawatan field:
db.fact_perawatan.aggregate([{
$match: {
'_id': ObjectId("58b22d13e8c1c52bf3fad99a")
}
}, {
$unwind: "$detail_perawatan"
}, {
$lookup: {
from: "dim_dokter",
localField: "detail_perawatan.id_dokter",
foreignField: "_id",
as: "dim_dokter"
}
}, {
$unwind: "$dim_dokter"
}, {
$group: {
_id: "$_id",
nama_pasien: {
$first: "$nama_pasien"
},
perawatan: {
$push: {
"dokter": "$dim_dokter.nama_dokter"
}
}
}
}])
The first $unwind deconstructs the detail_perawatan array so that each element can be passed to $lookup afterwards.