Is there a quick, efficient way to duplicate the documents in a MongoDB collection based on a property? In the example below, I am trying to duplicate the documents based on jobId.
I am using Spring Boot, so an example using the Spring Boot API would be even more helpful.
Original Collection
{ _id: 1, jobId: 1, product: "A"},
{ _id: 2, jobId: 1, product: "B"},
{ _id: 3, jobId: 1, product: "C"},
After duplication
{ _id: 1, jobId: 1, product: "A"},
{ _id: 2, jobId: 1, product: "B"},
{ _id: 3, jobId: 1, product: "C"},
{ _id: 4, jobId: 2, product: "A"},
{ _id: 5, jobId: 2, product: "B"},
{ _id: 6, jobId: 2, product: "C"},
You can use the following aggregation:
db.col.aggregate([
{
$group: {
_id: null,
values: { $push: "$$ROOT" }
}
},
{
$addFields: {
size: { $size: "$values" },
range: { $range: [ 0, 3 ] }
}
},
{
$unwind: "$range"
},
{
$unwind: "$values"
},
{
$project: {
_id: { $add: [ "$values._id", { $multiply: [ "$range", "$size" ] } ] },
jobId: { $add: [ "$values.jobId", "$range" ] },
product: "$values.product",
}
},
{
$sort: {
_id: 1
}
},
{
$out: "outCollection"
}
])
The algorithm is quite simple: we iterate over two sets:
the first is defined by all items in your source collection (that's why I'm grouping by null)
the second is defined artificially by the $range operator. It determines how many copies of the collection we produce (3 in this example, including the original)
The double $unwind generates as many documents as we need. The formula for each _id is then: _id = _id + range * size. The last step just redirects the aggregation output to a collection ($out replaces that collection's contents if it already exists).
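To make the arithmetic concrete, here is a minimal plain-JavaScript sketch of what the pipeline computes, with no MongoDB involved. The helper name duplicateDocs and its copies parameter are made up for illustration; they are not part of any driver or of the Spring API.

```javascript
// Plain-JavaScript model of the pipeline's arithmetic (hypothetical helper).
function duplicateDocs(docs, copies) {
  const size = docs.length;                  // $size: "$values"
  const range = [...Array(copies).keys()];   // $range: [0, copies]
  const out = [];
  for (const r of range) {                   // $unwind: "$range"
    for (const d of docs) {                  // $unwind: "$values"
      out.push({
        _id: d._id + r * size,               // _id = _id + range * size
        jobId: d.jobId + r,                  // each copy gets the next jobId
        product: d.product,
      });
    }
  }
  return out.sort((a, b) => a._id - b._id);  // $sort: { _id: 1 }
}
```

Calling duplicateDocs with the three source documents and copies = 2 yields the six documents from the "After duplication" listing; copies = 3 corresponds to $range: [ 0, 3 ] in the pipeline and yields nine.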
I have a claim type:
type TClaim = {
insuredId: number,
treatmentInfo: { amount: number }[]
}
and a list of claims:
[
{
insuredId: 1,
treatmentInfo: [{amount: 1}, {amount: 2}]
},
{
insuredId: 1,
treatmentInfo: [{amount: 3}, {amount: 4}]
},
{
insuredId: 2,
treatmentInfo: [{amount: 1}, {amount: 2}]
}
]
I want to get the result like:
[{insuredId: 1, numberOfClaims: 2, amount: 10},{insuredId: 2, numberOfClaims: 1, amount: 3}]
I'm using the $facet operator in a MongoDB aggregation, with one sub-pipeline counting numberOfClaims and one calculating the amount for each insured. But I can't combine them to get the result that I want.
$facet: {
totalClaims: [ { $group: { _id: '$insuredId', totalClaims: { $count: {} } } } ],
amount: [
{ $unwind: { path: '$treatmentInfo' } },
{ $group: { _id: '$insuredId', amount: { $sum: '$treatmentInfo.amount' } } }
]
}
Is there a reason why you want to use $facet? - I am just curious
You just need to add a new field that sums all the amounts in the array first, and then group by insuredId. The query is pretty much self-explanatory.
db.collection.aggregate([
{
"$addFields": {
"totalAmount": {
"$sum": "$treatmentInfo.amount"
}
}
},
{
"$group": {
"_id": "$insuredId",
"numberOfClaims": {
"$sum": 1
},
"amount": {
"$sum": "$totalAmount"
}
}
}
])
Result:
[
{
"_id": 1,
"amount": 10,
"numberOfClaims": 2
},
{
"_id": 2,
"amount": 3,
"numberOfClaims": 1
}
]
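For readers who want to check the numbers by hand, the two stages can be modeled in plain JavaScript. This is only a sketch of the logic; summarizeClaims is a hypothetical helper name, and it returns insuredId instead of _id to match the shape asked for in the question.

```javascript
// Plain-JavaScript model of $addFields followed by $group (hypothetical helper).
function summarizeClaims(claims) {
  const groups = new Map();
  for (const claim of claims) {
    // $addFields: totalAmount = sum of treatmentInfo.amount for this claim
    const totalAmount = claim.treatmentInfo.reduce((sum, t) => sum + t.amount, 0);
    // $group by insuredId: count the claims and add up the per-claim totals
    const group = groups.get(claim.insuredId) ||
      { insuredId: claim.insuredId, numberOfClaims: 0, amount: 0 };
    group.numberOfClaims += 1;
    group.amount += totalAmount;
    groups.set(claim.insuredId, group);
  }
  return [...groups.values()];
}
```

In the real pipeline, the same renaming could be done with a final $project stage such as { _id: 0, insuredId: "$_id", numberOfClaims: 1, amount: 1 }.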
How do I get nested object fields in the projection and group stages of a MongoDB aggregate query?
[
{
city: "Mumbai",
meta: {
luggage: 2,
scanLuggage: 1,
upiLuggage: 1
},
cash: 10
},
{
city: "Mumbai",
meta: {
luggage: 4,
scanLuggage: 3,
upiLuggage: 1
},
cash: 24
},
]
I want to $match the above on the basis of city, and return the sum of each luggage type.
My code is as follows, but $project is not working:
City.aggregate([
{
$match: { city: 'Mumbai' }
},
{
$project: {
city: 1,
mata.luggage: 1,
meta.scanLuggage: 1,
meta.upiLuggage: 1
}
},
{
$group: {
id: city,
luggage: {$sum: '$meta.luggage'},
scanLuggage: {$sum: '$meta.scanLuggage'},
upiLuggage: {$sum: '$meta.upiLuggage'}
}
}
])
But the $project is throwing an error. I want my output to look like this:
{
city: 'Mumbai',
luggage: 6,
scanLuggage: 4,
upiLuggage: 2
}
You should put nested field paths in quotes when using them in $project, and the grouping key must be named _id (with the field referenced as "$city").
db.collection.aggregate([
{
$match: {
city: "Mumbai"
}
},
{
$project: {
city: 1,
"meta.luggage": 1,
"meta.scanLuggage": 1,
"meta.upiLuggage": 1
}
},
{
$group: {
_id: "$city",
luggage: {
$sum: "$meta.luggage"
},
scanLuggage: {
$sum: "$meta.scanLuggage"
},
upiLuggage: {
$sum: "$meta.upiLuggage"
}
}
}
])
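Note that $group returns the city under _id, not under city as in the desired output. One possible way to get that shape, shown here as a sketch and not as the only option, is a final $project that renames the key. The intermediate $project is left out in this variant because $group reads only the fields it references; the variable name pipeline is just for illustration.

```javascript
// Sketch: same $match and $group, plus a final $project that renames the key.
const pipeline = [
  { $match: { city: "Mumbai" } },
  {
    $group: {
      _id: "$city",
      luggage: { $sum: "$meta.luggage" },
      scanLuggage: { $sum: "$meta.scanLuggage" },
      upiLuggage: { $sum: "$meta.upiLuggage" },
    },
  },
  // Expose the group key as `city` and hide `_id`.
  { $project: { _id: 0, city: "$_id", luggage: 1, scanLuggage: 1, upiLuggage: 1 } },
];
// Usage in the shell: db.collection.aggregate(pipeline)
```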
I am trying to run an aggregate query in Mongo using $addFields and $match:
.aggregate([
{
$addFields: {
level: { $sum: '$members.level' },
},
},
{
$match: {
level: { $gte: level }
},
},
{
$project: {
_id: 0,
logo: 1,
name: 1,
level: 1,
id: '$_id',
joinType: 1,
countryId: 1,
minimumJoinLevel: 1,
membersCount: { $size: '$members' },
},
},
])
The issue is that level is not an indexed field; it is calculated in the query.
My question is: how can I run this query efficiently, avoiding a "COLLSCAN" and getting an "IXSCAN" execution?
Let's say I have these results:
A)
[
{_id: 1, Name: 'A', Price: 10, xx:0},
{_id: 2, Name: 'B', Price: 15, xx:0},
{_id: 3, Name: 'A', Price: 100, xx:1},
{_id: 4, Name: 'B', Price: 150, xx:1},
]
B)
[
{_id: 1, Name: 'A', Price: 10, xx:0},
{_id: 2, Name: 'B', Price: 15, xx:0},
]
I want to:
If at least one xx:1 exists, return only the xx:1 documents
If there is no xx:1, return all the xx:0 documents
Should I do a MAP & FILTER on the root docs? Or some kind of MATCH with conditionals? Or Redact?
Desired results, e.g.:
A) Removed xx:0 because xx:1 exists, so only xx:1 is returned
[
{_id: 3, Name: 'A', xx:1},
{_id: 4, Name: 'B', xx:1},
]
B) Returned only xx:0 as there are only xx:0 documents
[
{_id: 1, Name: 'A', xx:0},
{_id: 2, Name: 'B', xx:0},
]
Group the documents by the xx field and add the grouped docs to the docs array using $push.
Sort the docs by the _id field in descending order.
Limit the result to 1.
If there are documents with both xx: 0 and xx: 1 values, only the xx: 1 group would be returned since we're sorting in descending order and limiting the result to the first group. If there are no documents with xx: 1 but documents with xx: 0 exist, the first group would be xx: 0 which gets returned.
You can then use $unwind to return a document for each grouped document and $replaceRoot to lift the document to the root level.
db.collection.aggregate([
{
$group: {
_id: "$xx",
docs: {
$push: "$$ROOT",
}
}
},
{
$sort: {
_id: -1,
}
},
{
$limit: 1,
},
{
$unwind: "$docs"
},
{
$replaceRoot: {
newRoot: "$docs"
},
}
])
If there might be docs with an xx value other than 0 and 1, you should filter those out using $match before grouping the docs using $group.
db.collection.aggregate([
{
$match: {
xx: {
$in: [
0,
1
]
}
}
},
{
$group: {
_id: "$xx",
docs: {
$push: "$$ROOT",
}
}
},
{
$sort: {
_id: -1,
}
},
{
$limit: 1,
},
{
$unwind: "$docs"
},
{
$replaceRoot: {
newRoot: "$docs"
},
}
])
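The selection logic of this pipeline can be checked without a database. Below is a minimal plain-JavaScript sketch of the same steps; preferHighestXx is a hypothetical helper name, and the model assumes xx is always numeric.

```javascript
// Plain-JavaScript model of $match, $group, $sort, $limit, $unwind, $replaceRoot.
function preferHighestXx(docs) {
  const groups = new Map();
  for (const d of docs) {
    if (d.xx !== 0 && d.xx !== 1) continue;   // $match: { xx: { $in: [0, 1] } }
    if (!groups.has(d.xx)) groups.set(d.xx, []);
    groups.get(d.xx).push(d);                 // $group with $push: "$$ROOT"
  }
  if (groups.size === 0) return [];           // nothing matched
  const top = Math.max(...groups.keys());     // $sort: { _id: -1 } and $limit: 1
  return groups.get(top);                     // $unwind and $replaceRoot
}
```

With data set A this returns the documents with _id 3 and 4; with data set B it returns the documents with _id 1 and 2.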
I have these documents in my collection:
{_id: "aaaaaaaa", email: "mail1#orange.fr"},
{_id: "bbbbbbbb", email: "mail2#orange.fr"},
{_id: "cccccccc", email: "mail3#orange.fr"},
{_id: "dddddddd", email: "mail4#gmail.com"},
{_id: "eeeeeeee", email: "mail5#gmail.com"},
{_id: "ffffffff", email: "mail6#yahoo.com"}
And I would like this result:
{
result: [
{domain: "orange.fr", count: 3},
{domain: "gmail.com", count: 2},
{domain: "yahoo.com", count: 1},
]
}
I'm not sure whether you can use the aggregation framework and the $regex operator for this.
Aggregation Framework
I don't believe that you can achieve the desired result with the aggregation framework given the present document structure. If you stored the domain name in a separate field, it would become trivial:
db.items.aggregate([
{
$group:
{
_id: "$emailDomain",
count: { $sum: 1 }
}
}
])
Map-Reduce
It's possible to implement what you want using a simple map-reduce aggregation. Naturally, the performance will not be good on large collections.
Query
db.emails.mapReduce(
function() {
if (this.email) {
var parts = this.email.split('#');
emit(parts[parts.length - 1], 1);
}
},
function(key, values) {
return Array.sum(values);
},
{
out: { inline: 1 }
}
)
Output
[
{
"_id" : "gmail.com",
"value" : 2
},
{
"_id" : "yahoo.com",
"value" : 1
},
{
"_id" : "orange.fr",
"value" : 3
}
]
Aggregation Framework
From MongoDB 3.4 (released Nov 29, 2016) onward, the aggregation framework has string operators such as $indexOfBytes and $strLenBytes that make this possible:
[
{
$project: {
domain: {
$substr: ["$email", {
$indexOfBytes: ["$email", "#"]
}, {
$strLenBytes: "$email"
}]
}
}
},
{
$group: {
_id: '$domain',
count: {
$sum: 1
}
}
},
{
$sort: {
'count': -1
}
},
{
$group: {
_id: null,
result: {
$push: {
'domain': "$_id",
'count': '$count'
}
}
}
}
]
Results
{
_id: null,
result: [
{domain: "#orange.fr", count: 3},
{domain: "#gmail.com", count: 2},
{domain: "#yahoo.com", count: 1},
]
}
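The domains above still carry the leading "#" because $substr starts at the separator's index. If the bare domain is wanted, as in the question's desired result, one possible variant (a sketch, assuming every email is a string containing the separator) uses $split and $arrayElemAt, both available in 3.4. The names pipeline and domainOf are made up for illustration.

```javascript
// Sketch: take the last piece after splitting on the separator.
const pipeline = [
  { $project: { domain: { $arrayElemAt: [{ $split: ["$email", "#"] }, -1] } } },
  { $group: { _id: "$domain", count: { $sum: 1 } } },
  { $sort: { count: -1 } },
  { $group: { _id: null, result: { $push: { domain: "$_id", count: "$count" } } } },
];
// Usage in the shell: db.collection.aggregate(pipeline)

// Plain-JavaScript equivalent of the $project expression, for checking by hand.
function domainOf(email) {
  const parts = email.split("#");   // $split: ["$email", "#"]
  return parts[parts.length - 1];   // $arrayElemAt: [..., -1]
}
```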