mongodb aggregate apply a function to a field - mongodb

As part of an aggregate I need to run this transformation:
let inheritances = await db.collection('inheritance').aggregate([
{ $match: { status: 1 }}, // inheritance active
{ $project: { "_id":1, "name": 1, "time_trigger": 1, "signers": 1, "tree": 1, "creatorId": 1, "redeem": 1, "p2sh": 1 } },
{ $lookup:
{
from: "user",
let: { creatorId: { $concat: [ "secretkey", { $toString: "$creatorId" } ] }, time_trigger: "$time_trigger"},
pipeline: [
{ $match:
{ $expr:
{ $and:
[
{ $eq: [ "$_id", sha256( { $toString: "$$creatorId" } ) ] },
{ $gt: [ new Date(), { $add: [ { $multiply: [ "$$time_trigger", 24*60*60*1000 ] }, "$last_access" ] } ] },
]
}
}
},
],
as: "user"
},
},
{ $unwind: "$user" }
]).toArray()
creatorId comes from a lookup, and in order to compare it to _id I first need to do a sha256.
How can I do it?
Thanks.

External functions will not work inside the aggregation framework. The pipeline is serialized to BSON and sent to the server, where each operator is executed by its native C++ implementation; this is by design, for performance. A JavaScript call such as sha256(...) runs on the client while the pipeline object is being built, so it never sees the value of "$$creatorId".
In short, you can't do this inside the pipeline. I recommend storing the hashed value on every document as a new field; otherwise you'll have to compute it in application code before running the pipeline.
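A minimal sketch of the "store the hashed value" option. It assumes Node.js, a hex-encoded digest, and a hypothetical field named creatorHash written to each inheritance document at save time; none of these are confirmed by the question.

```javascript
const crypto = require('crypto');

// Assumption: the user _id is the hex SHA-256 of "secretkey" + creatorId.
function hashCreatorId(creatorId, secret = 'secretkey') {
  return crypto
    .createHash('sha256')
    .update(secret + String(creatorId))
    .digest('hex');
}

// Hypothetical $lookup stage that joins on the precomputed creatorHash field,
// so no external function is needed inside the pipeline.
function buildUserLookup(now = new Date()) {
  return {
    $lookup: {
      from: 'user',
      let: { creatorHash: '$creatorHash', time_trigger: '$time_trigger' },
      pipeline: [
        {
          $match: {
            $expr: {
              $and: [
                { $eq: ['$_id', '$$creatorHash'] },
                { $gt: [now, { $add: [{ $multiply: ['$$time_trigger', 24 * 60 * 60 * 1000] }, '$last_access'] }] },
              ],
            },
          },
        },
      ],
      as: 'user',
    },
  };
}
```

With this approach, hashCreatorId(doc.creatorId) would be stored as creatorHash whenever an inheritance document is inserted or its creatorId changes, and the earlier $project stage would need to keep creatorHash.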

Related

Update or Insert object in array in MongoDB

I have the following collection
{
"_id" : ObjectId("57315ba4846dd82425ca2408"),
"myarray" : [
{
userId : "8bc32153-2bea-4dd5-8487-3b65e3aa0869",
Time: ISODate("2022-09-20T04:44:46.000+00:00"),
point : 5
},
{
userId : "5020db46-3b99-4c2d-8637-921d6abe8b26",
Time: ISODate("2022-09-20T04:44:49.000+00:00"),
point : 2
},
]
}
These are my questions:
I want to push into myarray if the userId doesn't exist; if the userId already exists, I want to update its Time and point. I also have to keep only 5 elements in the array: when a 6th element arrives, I have to sort the array by Time and remove the entry with the oldest Time.
What is the best way to do this in Mongo using aggregation?
FYI, we are using Mongo 4.4.
You can achieve this with the aggregation pipeline update syntax. The strategy is to first update the matching array element (or append a new element if there is none).
Then, if the size exceeds 5, we filter out the element with the minimum Time, like so:
const userObj = {point: 5, userId: "12345"};
db.collection.updateOne(
{ ...updateCondition },
[
{
$set: {
myarray: {
$cond: [
{
$in: [
userObj.userId,
{$ifNull: ["$myarray.userId", []]}
]
},
{
$map: {
input: "$myarray",
in: {
$cond: [
{
$eq: [
"$$this.userId",
userObj.userId
]
},
{
$mergeObjects: [
"$$this",
{
Time: "$$NOW", // i used "now" time but you can swap this to your input
point: userObj.point
}
]
},
"$$this"
]
}
}
},
{
$concatArrays: [
{ $ifNull: ["$myarray", []] },
[
{
userId: userObj.userId,
point: userObj.point,
Time: "$$NOW"
}
]
]
}
]
}
}
},
{
$set: {
myarray: {
$cond: [
{
$gt: [
{
$size: "$myarray"
},
5
]
},
{
$filter: {
input: "$myarray",
cond: {
$ne: [
"$$this.Time",
{
$min: "$myarray.Time"
}
]
}
}
},
"$myarray"
]
}
}
}
])
Mongo Playground
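To see what the two $set stages do to the array, here is a plain-JavaScript model of the same logic. It is an illustration only, not driver code: the function name upsertAndTrim and its parameters are hypothetical, and Time values are assumed to be Date objects.

```javascript
// Plain-JavaScript model of the two $set stages above (illustration only).
function upsertAndTrim(myarray, userObj, now = new Date(), maxSize = 5) {
  const current = myarray || []; // $ifNull: ["$myarray", []]
  const exists = current.some((e) => e.userId === userObj.userId); // $in

  let next = exists
    ? current.map((e) =>
        e.userId === userObj.userId
          ? { ...e, Time: now, point: userObj.point } // $mergeObjects
          : e
      )
    : [...current, { userId: userObj.userId, point: userObj.point, Time: now }]; // $concatArrays

  if (next.length > maxSize) { // $gt: [{ $size: "$myarray" }, 5]
    const oldest = Math.min(...next.map((e) => e.Time.getTime())); // $min
    next = next.filter((e) => e.Time.getTime() !== oldest); // $filter with $ne
  }
  return next;
}
```

Note that, like the pipeline, this removes every element that shares the minimum Time, so two entries with an identical oldest timestamp would both be dropped.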

$switch inside a $match MONGODB

Hi, I am trying to use a MongoDB query inside TIBCO Jaspersoft Studio to create a report.
What I am trying to do is filter the data using two parameters, #orderitemuid and #ordercatuid. If I supply #orderitemuid, the query should disregard #ordercatuid. Vice versa, if I supply #ordercatuid, it should disregard #orderitemuid. There is also the option of using both parameters in the query. I used a $switch inside the $match, but I am getting an error. Below is the $match I am using:
{
$match: {
$switch: {
branches: [
{
case: { $eq: [{ $IfNull: [$P{orderitemuid}, 0] }, 0] },
then: { 'ordcat._id': { '$eq': { '$oid': $P{ordercatuid} } } },
},
{
case: { $eq: [{ $IfNull: [$P{ordercatuid}, 0] }, 0] },
then: { '_id': { '$eq': { '$oid': $P{orderitemuid} } } },
},
],
default: {
$expr: {
$and: [
{ $eq: ['_id', { '$oid': $P{orderitemuid} }] },
{ $eq: ['ordcat_id', { '$oid': $P{ordercatuid} }] },
],
},
},
},
},
}
Thank you in advance
As mentioned in the $match docs
$match takes a document that specifies the query conditions. The query syntax is identical to the read operation query syntax; i.e. $match does not accept raw aggregation expressions. ...
$switch is an aggregation expression, which means it cannot be used in a $match stage unless it is wrapped in $expr.
Wrapping it in $expr also requires you to restructure the return values a little: each branch has to be an expression that evaluates to a boolean rather than a query document, like so:
db.collection.aggregate([
{
$match: {
$expr: {
$switch: {
branches: [
{
case: {
$eq: [
{
$ifNull: [
$P{orderitemuid},
0
]
},
0
]
},
then: {
$eq: [
"$ordcat._id",
{"$oid":$P{ordercatuid}}
]
}
},
{
case: {
$eq: [
{
"$ifNull": [
$P{ordercatuid},
0
]
},
0
]
},
then: {
$eq: [
"$_id",
{"$oid":$P{orderitemuid}}
]
}
}
],
default: {
$and: [
{
$eq: [
"$_id",
{"$oid": $P{orderitemuid} }
]
},
{
$eq: [
"$ordcat._id",
{"$oid": $P{ordercatuid}}
]
}
]
}
}
}
}
}
])
Mongo Playground
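If the query can be assembled in application code rather than in a report template (an assumption; the question uses Jaspersoft's $P{} parameters), the same filtering can be sketched without $switch by adding only the conditions whose parameter is set. The helper name buildMatchStage and the toId converter are hypothetical.

```javascript
// Hypothetical helper: builds a plain query-document $match from whichever ids are set.
// toId converts a string to the id type stored in the collection (with the
// Node.js driver this would typically wrap the string in an ObjectId).
function buildMatchStage({ orderitemuid, ordercatuid }, toId = (s) => s) {
  const filter = {};
  if (orderitemuid) filter._id = toId(orderitemuid);
  if (ordercatuid) filter['ordcat._id'] = toId(ordercatuid);
  return { $match: filter };
}

// Only the category id is supplied, so the item id is ignored.
const stage = buildMatchStage({ ordercatuid: '5f36c0df6d5553e6af208cab' });
```

Unlike the $switch version, this sketch matches every document when neither parameter is supplied, and because it is a plain query document it does not need $expr.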

Mongo DB Join on Primary/Foreign Key

I have two collections: clib and mp.
The schema for clib is {name: String, type: Number} and that for mp is {clibId: String}.
Sample documents for clib:
{_id: ObjectId("6178008397be0747443a2a92"), name: "c1", type: 1}
{_id: ObjectId("6178008397be0747443a2a91"), name: "c2", type: 0}
Sample documents for mp:
{clibId: "6178008397be0747443a2a92"}
{clibId:"6178008397be0747443a2a91"}
While querying mp, I want those clibIds that have type = 0 in the clib collection.
Any ideas how this can be achieved?
One approach I can think of is to use $lookup, but that doesn't seem to be working. Also, I'm not sure whether this is an anti-pattern for MongoDB. Another approach is to copy the type from clib to mp while saving the mp document.
If I've understood correctly, you can use a pipeline like this:
The $lookup gets the documents from clib whose _id is the same as clibId (converted to an ObjectId) and whose type is 0. I've also added a $match stage so that mp documents with no match are not output.
db.mp.aggregate([
{
"$lookup": {
"from": "clib",
"let": {
"id": "$clibId"
},
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{
"$eq": [
{
"$toObjectId": "$$id"
},
"$_id"
]
},
{
"$eq": [
"$type",
0
]
}
]
}
}
}
],
"as": "result"
}
},
{
"$match": {
"result": {
"$ne": []
}
}
}
])
Example here
db.mp.aggregate([
{
$lookup: {
from: "clib",
let: {
clibId: "$clibId"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$eq: [ "$_id", { $toObjectId: "$$clibId" } ], // clibId is a string, _id is an ObjectId
}
]
}
}
},
{
$project: { type: 1, _id: 0 }
}
],
as: "clib"
}
},
{
"$unwind": "$clib"
},
{
"$match": {
"clib.type": 0
}
}
])
Test Here
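A variant of the same join, sketched under the assumption that every clibId is a valid 24-character hex string ($toObjectId raises an error otherwise). It converts the string once with $addFields and then uses the simpler localField/foreignField form of $lookup. The field name clibObjectId is made up for the example.

```javascript
const pipeline = [
  // Convert the string foreign key to an ObjectId so it can equal clib._id.
  { $addFields: { clibObjectId: { $toObjectId: '$clibId' } } },
  {
    $lookup: {
      from: 'clib',
      localField: 'clibObjectId',
      foreignField: '_id',
      as: 'clib',
    },
  },
  // Keep only mp documents whose joined clib document has type 0.
  { $match: { 'clib.type': 0 } },
  // Return just the original key.
  { $project: { clibId: 1 } },
];

// Usage (mongosh): db.mp.aggregate(pipeline)
```

Storing clibId as an ObjectId in the first place would remove the conversion step entirely.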

Optimise MongoDB aggregate query performance

I have the following DB structure:
Workspaces:
Key   Field       Index
PK    id          id
-     content     -
Projects:
Key   Field       Index
PK    id          id
FK    workspace   workspace_1
-     deleted     deleted_1
-     content     -
Items:
Key   Field       Index
PK    id          id
FK    project     project_1
-     type        _type_1
-     deleted     deleted_1
-     content     -
I need to calculate the number of items of each type for each project in a workspace, e.g. expected output:
[
{ _id: 'projectId1', itemType1Count: 100, itemType2Count: 50, itemType3Count: 200 },
{ _id: 'projectId2', itemType1Count: 40, itemType2Count: 100, itemType3Count: 300 },
....
]
After a few attempts and some debugging, I've created a query which produces the output I need:
const pipeline = [
{ $match: { workspace: 'workspaceId1' } },
{
$lookup: {
from: 'items',
let: { id: '$_id' },
pipeline: [
{
$match: {
$expr: {
$eq: ['$project', '$$id'],
},
},
},
// project only fields necessary for later pipelines to not overload
// memory and to not get `exceeded memory limit for $group` error
{ $project: { _id: 1, type: 1, deleted: 1 } },
],
as: 'items',
},
},
// Use $unwind here to optimize aggregation pipeline, see:
// https://stackoverflow.com/questions/45724785/aggregate-lookup-total-size-of-documents-in-matching-pipeline-exceeds-maximum-d
// Without $unwind we may get an `matching pipeline exceeds maximum document size` error.
// Error appears not in all requests and it's really strange and hard to debug.
{ $unwind: '$items' },
{ $match: { 'items.deleted': { $eq: false } } },
{
$group: {
_id: '$_id',
items: { $push: '$items' },
},
},
{
$project: {
_id: 1,
// Note: I have only 3 possible item types, so it's OK that it's names hardcoded.
itemType1Count: {
$size: {
$filter: {
input: '$items',
cond: { $eq: ['$$this.type', 'type1'] },
},
},
},
itemType2Count: {
$size: {
$filter: {
input: '$items',
cond: { $eq: ['$$this.type', 'type2'] },
},
},
},
itemType3Count: {
$size: {
$filter: {
input: '$items',
cond: { $eq: ['$$this.type', 'type3'] },
},
},
},
},
},
]
const counts = await Project.aggregate(pipeline)
The query works as expected, but it is very slow: with about 1000 items in one workspace, it takes about 8 seconds to complete. Any ideas on how to make it faster are appreciated.
Thanks.
Assuming your indexes are set up properly and contain the "correct" fields, we can still make some tweaks to the query itself.
Approach 1: keeping existing collection schema
db.projects.aggregate([
{
$match: {
workspace: "workspaceId1"
}
},
{
$lookup: {
from: "items",
let: {id: "$_id"},
pipeline: [
{
$match: {
$expr: {
$and: [
{$eq: ["$project","$$id"]},
{$eq: ["$deleted",false]}
]
}
}
},
// project only fields necessary for later pipelines to not overload
// memory and to not get `exceeded memory limit for $group` error
{
$project: {
_id: 1,
type: 1,
deleted: 1
}
}
],
as: "items"
}
},
// Use $unwind here to optimize aggregation pipeline, see:
// https://stackoverflow.com/questions/45724785/aggregate-lookup-total-size-of-documents-in-matching-pipeline-exceeds-maximum-d
// Without $unwind we may get an `matching pipeline exceeds maximum document size` error.
// Error appears not in all requests and it's really strange and hard to debug.
{
$unwind: "$items"
},
{
$group: {
_id: "$_id",
itemType1Count: {
$sum: {
"$cond": {
"if": {$eq: ["$items.type","type1"]},
"then": 1,
"else": 0
}
}
},
itemType2Count: {
$sum: {
"$cond": {
"if": {$eq: ["$items.type","type2"]},
"then": 1,
"else": 0
}
}
},
itemType3Count: {
$sum: {
"$cond": {
"if": {$eq: ["$items.type","type3"]},
"then": 1,
"else": 0
}
}
}
}
}
])
There are 2 major changes:
1. Moved the items.deleted: false condition into the $lookup sub-pipeline, so that fewer items documents are looked up.
2. Dropped items: { $push: '$items' }. Instead, a conditional sum is done in the later $group stage.
Here is the Mongo playground for your reference (at least for the correctness of the new query).
Approach 2: if the collection schema can be modified, we can denormalize projects.workspace into the items collection like this:
{
"_id": "i1",
"project": "p1",
"workspace": "workspaceId1",
"type": "type1",
"deleted": false
}
In this way, you can skip the $lookup. A simple $match and $group will suffice.
db.items.aggregate([
{
$match: {
"deleted": false,
"workspace": "workspaceId1"
}
},
{
$group: {
_id: "$project",
itemType1Count: {
$sum: {
"$cond": {
"if": {$eq: ["$type","type1"]},
"then": 1,
"else": 0
}
}
},
...
Here is the Mongo playground with denormalized schema for your reference.
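Both approaches repeat the same conditional $sum once per item type. As a sketch, the $group stage could be generated from a list of type names instead of being written out by hand. The helper name buildTypeCountGroup is hypothetical, and the type names 'type1' to 'type3' are taken from the question.

```javascript
// Hypothetical helper: builds a $group stage with one conditional counter per type.
function buildTypeCountGroup(idExpr, typeExpr, types) {
  const group = { _id: idExpr };
  types.forEach((type, i) => {
    // Adds 1 for every document whose type equals `type`, otherwise 0.
    group[`itemType${i + 1}Count`] = {
      $sum: { $cond: { if: { $eq: [typeExpr, type] }, then: 1, else: 0 } },
    };
  });
  return { $group: group };
}

// Denormalized schema (Approach 2): group items directly by project.
const pipeline = [
  { $match: { deleted: false, workspace: 'workspaceId1' } },
  buildTypeCountGroup('$project', '$type', ['type1', 'type2', 'type3']),
];

// Usage (mongosh): db.items.aggregate(pipeline)
```

For Approach 1 the same helper would be called with '$_id' and '$items.type'. Generating the counters also avoids copy-paste slips between the per-type blocks.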

How to convert to ObjectId and match dates on MongoDB lookup?

I am grabbing an id from a nested value in a schema, then using that to lookup the id from another table and also matching on a couple dates from that table. I've tried a regular match/lookup/unwind/match and also a lookup/let/pipeline technique. In both cases, it ignores matching on the date for some reason. What am I missing?
Here is one method for reference. I'm not sure where to put the sort either since it doesn't seem to pull $meeting out to sort on.
EXAMPLE RECORDS
PRODUCT
{
"_id" : ObjectId("5f36c0df6d5553e6af208cac"),
"items" : [
{
"paramType" : "Meeting",
"paramValue" : "5f36c0df6d5553e6af208cab"
}
],
"ownerId" : ObjectId("12345678901234567")
}
MEETING
{
"_id" : ObjectId("5f36c0df6d5553e6af208cab"),
"startDate" : ISODate("2020-08-18T10:00:00.000+0000"),
"endDate" : ISODate("2020-08-18T11:00:00.000+0000")
}
AGGREGATE
db.getCollection("products").aggregate(
[
{
$match: {
"ownerId": ObjectId("12345678901234567")
}
},
{
$unwind: "$items"
},
{
$lookup: {
from: "meetings",
let: { "meetingId": '$items.paramValue' },
pipeline: [
{
$match: {
$expr: {
$and: [
{ $eq: ["$_id", "$$meetingId"] },
{
$eq: ["meeting.startDate", {
"$gte": ["$meeting.startDate", ISODate("2020-08-01T00:00:00.000Z")]
}]
},
{
$eq: ["meeting.endDate", {
"$lte": ["$meeting.endDate", ISODate("2020-08-31T23:59:59.999Z")]
}]
}
],
},
},
},
],
as: "meeting"
}
},
{
$unwind: "$meeting"
},
{
$project: {
"_id": 1,
"items": 1,
"meeting": "$meeting"
}
},
{
$sort: {
'meeting.startDate': 1
}
},
]
);
It might be because items.paramValue is not converted to an ObjectId before the lookup, but I can't figure out how to convert it inside an aggregate. I tried this, without success:
{
$addFields: {
"convertedMeetingId": { $toObjectId: "$items.paramValue" }
}}
let: { "meetingId": "$convertedMeetingId" }
There are a few quick fixes needed in the $lookup; the rest looks good:
- In let, you can convert the meetingId string to an ObjectId right there using $toObjectId.
- In $gte and $lte you used $meeting.startDate and $meeting.endDate. They should be $startDate and $endDate, because the sub-pipeline already runs on the meetings collection.
- I am not sure why you wrapped $gte and $lte in $eq. I have removed the $eq; the comparisons work directly.
{
$lookup: {
from: "meetings",
let: { meetingId: { $toObjectId: "$items.paramValue" } },
pipeline: [
{
$match: {
$expr: {
$and: [
{ $eq: ["$_id", "$$meetingId"] },
{ $gte: ["$startDate", ISODate("2020-08-01T00:00:00.000Z")] },
{ $lte: ["$endDate", ISODate("2020-08-31T23:59:59.999Z")] }
]
}
}
}
],
as: "meeting"
}
},
Playground
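For the question about where the sort goes, here is a sketch of the whole pipeline with the corrected $lookup in place. The $sort can stay after the $unwind of "$meeting", because at that point meeting is a sub-document on each result. The function name buildMeetingPipeline is hypothetical; ownerId is assumed to be passed in as an ObjectId and the date bounds as JavaScript Date objects.

```javascript
// Hypothetical builder for the full pipeline; ownerId, from and to are supplied by the caller.
function buildMeetingPipeline(ownerId, from, to) {
  return [
    { $match: { ownerId } },
    { $unwind: '$items' },
    {
      $lookup: {
        from: 'meetings',
        // Convert the string id before comparing it with meetings._id.
        let: { meetingId: { $toObjectId: '$items.paramValue' } },
        pipeline: [
          {
            $match: {
              $expr: {
                $and: [
                  { $eq: ['$_id', '$$meetingId'] },
                  { $gte: ['$startDate', from] },
                  { $lte: ['$endDate', to] },
                ],
              },
            },
          },
        ],
        as: 'meeting',
      },
    },
    // Drops products whose meeting fell outside the date range (empty array).
    { $unwind: '$meeting' },
    { $project: { _id: 1, items: 1, meeting: 1 } },
    // meeting is now a single sub-document, so its fields can be sorted on.
    { $sort: { 'meeting.startDate': 1 } },
  ];
}

// Usage (mongosh), with the question's date range:
// db.products.aggregate(buildMeetingPipeline(ownerId,
//   new Date('2020-08-01T00:00:00.000Z'), new Date('2020-08-31T23:59:59.999Z')))
```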