Is there an equivalent NTILE function in MongoDB? - mongodb

I want to divide my collection into n groups with equal number of rows. Is there an equivalent to SQL's NTILE function in MongoDB?
I'd like to create a MongoDB view for my collections that will add an additional column (partition_no)

Yes there is. The $bucketAuto aggregation stage:
db.collection.aggregate(
[
{
'$bucketAuto': {
"groupBy": "$_id",
"buckets": n
}
}
]
)
This returns the structure of:
{ _id: { min: minIdOfBucket, max: maxIdOfBucket }, count: number }[]

Related

MongoDB, Panache, Quarkus: How to do aggregate, $sum and filter

I have a table in mongodb with sales transactions each containing a userId, a timestamp and a corresponding revenue value of the specific sales transaction.
Now, I would like to query these users and getting the minimum, maximum, sum and average of all transactions of all users. There should only be transactions between two given timestamps and it should only include users, whose sum of revenue is greater than a specified value.
I have composed the corresponding query in mongosh:
db.salestransactions.aggregate(
{
"$match": {
"timestamp": {
"$gte": new ISODate("2020-01-01T19:28:38.000Z"),
"$lte": new ISODate("2020-03-01T19:28:38.000Z")
}
}
},
{
$group: {
_id: { userId: "$userId" },
minimum: {$min: "$revenue"},
maximum: {$max: "$revenue"},
sum: {$sum: "$revenue"},
avg: {$avg: "$revenue"}
}
},
{
$match: { "sum": { $gt: 10 } }
}
]
)
This query works absolutely fine.
How do I implement this query in a PanacheMongoRepository using quarkus ?
Any ideas?
Thanks!
A bit late but you could do it something like this.
Define a repo
this code is in kotkin
class YourRepositoryReactive : ReactivePanacheMongoRepository<YourEntity>{
fun getDomainDocuments():List<YourView>{
val aggregationPipeline = mutableListOf<Bson>()
// create your each stage with Document.parse("stage_obj") and add to aggregates collections
return mongoCollection().aggregate(aggregationPipeline,YourView::class.java)
}
mongoCollection() automatically executes on your Entity
YourView, a call to map related properties part of your output. Make sure that this class has
#ProjectionFor(YourEntity.class)
annotation.
Hope this helps.

MongoDB: Add field to all objects in array, based on other fields on same object?

I am fairly new to MongoDB and cant seem to find a solution to this problem.
I have a database of documents that has this structure:
{
id: 1
elements: [ {elementId: 1, nr1: 1, nr2: 3}, {elementId:2, nr1:5, nr2: 10} ]
}
I am looking for a query that can add a value nr3 which is for example nr2/nr1 to all the objects in the elements array, so that the resulting document would look like this:
{
id: 1
elements: [ {elementId: 1, nr1: 1, nr2: 3, nr3:3}, {elementId:2, nr1:5, nr2: 10, nr3: 2} ]
}
So I imagine a query along the lines of this:
db.collection.updateOne({id:1}, {$set:{"elements.$[].nr3": nr2/nr1}})
But I cant find how to get the value of nr2 and nr1 of the same object in the array.
I found some similar questions on stackoverflow stating this is not possible, but they were 5+ years old, so I thought maybe they have added support for something like this.
I realize I can achieve this with first querying the document and iterate over the elements-array doing updates along the way, but for the purpose of learning I would love to see if its possible to do this in one query.
You can use update with aggregation pipeline starting from MongoDB v4.2,
$map to iterate loop of elements
divide nr2 with nr1 using $divide
merge current object and new field nr3 using $mergeObjects
db.collection.updateOne(
{ id: 1 },
[{
$set: {
elements: {
$map: {
input: "$elements",
in: {
$mergeObjects: [
"$$this",
{ nr3: { $divide: ["$$this.nr2", "$$this.nr1"] } }
]
}
}
}
}
}]
)
Playground
db.collection.update(
{ id:1},
{ "$set": { "elements.$[elem].nr3":elements.$[elem].nr2/elements.$[elem].nr1} },
{ "multi": true }
);
I guess this should work

MongoDB, right projection subfield [duplicate]

Is it possible to rename the name of fields returned in a find query? I would like to use something like $rename, however I wouldn't like to change the documents I'm accessing. I want just to retrieve them differently, something that works like SELECT COORINATES AS COORDS in SQL.
What I do now:
db.tweets.findOne({}, {'level1.level2.coordinates': 1, _id:0})
{'level1': {'level2': {'coordinates': [10, 20]}}}
What I would like to be returned is:
{'coords': [10, 20]}
So basically using .aggregate() instead of .find():
db.tweets.aggregate([
{ "$project": {
"_id": 0,
"coords": "$level1.level2.coordinates"
}}
])
And that gives you the result that you want.
MongoDB 2.6 and above versions return a "cursor" just like find does.
See $project and other aggregation framework operators for more details.
For most cases you should simply rename the fields as returned from .find() when processing the cursor. For JavaScript as an example, you can use .map() to do this.
From the shell:
db.tweets.find({},{'level1.level2.coordinates': 1, _id:0}).map( doc => {
doc.coords = doc['level1']['level2'].coordinates;
delete doc['level1'];
return doc;
})
Or more inline:
db.tweets.find({},{'level1.level2.coordinates': 1, _id:0}).map( doc =>
({ coords: doc['level1']['level2'].coordinates })
)
This avoids any additional overhead on the server and should be used in such cases where the additional processing overhead would outweigh the gain of actual reduction in size of the data retrieved. In this case ( and most ) it would be minimal and therefore better to re-process the cursor result to restructure.
As mentioned by #Neil Lunn this can be achieved with an aggregation pipeline:
And starting Mongo 4.2, the $replaceWith aggregation operator can be used to replace a document by a sub-document:
// { level1: { level2: { coordinates: [10, 20] }, b: 4 }, a: 3 }
db.collection.aggregate(
{ $replaceWith: { coords: "$level1.level2.coordinates" } }
)
// { "coords" : [ 10, 20 ] }
Since you mention findOne, you can also limit the number of resulting documents to 1 as such:
db.collection.aggregate([
{ $replaceWith: { coords: "$level1.level2.coordinates" } },
{ $limit: 1 }
])
Prior to Mongo 4.2 and starting Mongo 3.4, $replaceRoot can be used in place of $replaceWith:
db.collection.aggregate(
{ $replaceRoot: { newRoot: { coords: "$level1.level2.coordinates" } } }
)
As we know, in general, $project stage takes the field names and specifies 1 or 0/true or false to include the fields in the output or not, we also can specify the value against a field instead of true or false to rename the field. Below is the syntax
db.test_collection.aggregate([
{$group: {
_id: '$field_to_group',
totalCount: {$sum: 1}
}},
{$project: {
_id: false,
renamed_field: '$_id', // here assigning a value instead of 0 or 1 / true or false effectively renames the field.
totalCount: true
}}
])
Stages (>= 4.2)
$addFields : {"New": "$Old"}
$unset : {"$Old": 1}

Count occurrences of duplicate values

How do I structure my MongooseJS/MongoDB query to get total duplicates/occurrences of a particular field value? Aka: The total documents with custID of some value for all custIDs
I can do this manually in command line:
db.tapwiser.find({"custID" : "12345"}, {}, {}).count();
Outputs: 1
db.tapwiser.find({"custID" : "6789"}, {}, {}).count();
Outputs: 4
I found this resource:
How to sum distinct values of a field in a MongoDB collection (utilizing mongoose)
But it requires that I specify the unique fields I want to sum.
In this case, I want to loop through all documents, sum the occurrences of each.
All you need to do is $group your documents by custID and use the $sum accumulator operator to return "count" for each group.
db.tapwiser.aggregate(
[
{ "$group": { "_id": "$custID", "count": { "$sum": 1 } } }
], function(err, results) {
// Do something with the results
}
)

MongoDB query for distinct field values that meet a conditional

I have a collection named 'sentences'. I would like a list of all the unique values of 'last_syls' where the number of entries containing that value of 'last_syls' is greater than 10.
A document in this collection looks like:
{ "_id" : ObjectId( "51dd9011cf2bee3a843f215a" ),
"last_syls" : "EY1D",
"last_word" : "maid"}
I've looked into db.sentences.distinct('last_syls'), but cannot figure out how to query based on the count for each of these distinct values.
You're going to want to use the aggregation framework:
db.sentences.aggregate([
{
$group: {
_id: "$last_syls",
count: { $sum: 1}
}
},
{
$match: {
count: { $gt: 10 }
}
}
])
This groups documents by their last_syls field with a count per group, then filters that result set to all results with a count greater than 10.