Mongoose: Score query then sort by score - non text fields - mongodb

In my db, I have a collection of books.
Each has:
a count of upvotes
a count of downvotes
a count of views
I would like to sort my db by scoring as follows:
upvote: 8 points
downvote: -4 points
view: 1/2 point
So the score will be:
(NumberOfViews*(1/2)) + (NumberOfDownvotes*-4)+ (NumberOfUpvotes*8)
So if I have:
book1 = {name:'book1', views:3000,upvotes:340, downvotes:120}
book2 = {name:'book2', views:9000,upvotes:210, downvotes:620}
book3 = {name:'book3', views:7000,upvotes:6010, downvotes:2}
The score should be:
book1Score = 3740
book2Score = 3700
book3Score = 51572
And the query should output
book3,book1,book2
How can I achieve such a thing in mongoose?
Bonus: What if I want records that are more recent to rank higher than older records on that same query?
Thanks

Well I ended up doing it all inside mongoose.
I run this query every 24 hours to re-score my collection.
Book.aggregate(
[
//I match my query
{$match:query},
{
$project: {
//take the id for reference
_id: 1,
//calculate the score of the views
viewScore: {
$multiply: [ "$views", 0.5 ]
},
//calculate the score of the upvotes (upvotes and downvotes are stored as arrays here, hence $size)
upvoteScore: {
$multiply: [ {$size: '$upvotes'}, 8 ]
},
//calculate the score of the downvotes
downvoteScore: {
$multiply: [ {$size: '$downvotes'}, -4 ]
}
}
},
{
//project a second time
$project: {
//take my id for reference
_id: 1,
//get my total score
score: {
$add:['$viewScore','$upvoteScore','$downvoteScore']
},
}
},
//sort by the score.
{$sort : {'score' : -1}},
]
)
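The pipeline above uses $size, which implies that upvotes and downvotes are arrays. If they are stored as plain counts, as in the question, the score can be computed in a single stage. This is a minimal sketch under that assumption; buildScorePipeline and the weights object are illustrative names, not part of Mongoose.

```javascript
// Hypothetical helper: builds an aggregation pipeline that scores and sorts books.
// Assumes views, upvotes and downvotes are numeric fields on each document.
function buildScorePipeline(
  query = {},
  weights = { views: 0.5, upvotes: 8, downvotes: -4 }
) {
  return [
    // Keep only the documents of interest.
    { $match: query },
    {
      // Add a score field next to the existing fields.
      $addFields: {
        score: {
          $add: [
            { $multiply: ["$views", weights.views] },
            { $multiply: ["$upvotes", weights.upvotes] },
            { $multiply: ["$downvotes", weights.downvotes] },
          ],
        },
      },
    },
    // Highest score first.
    { $sort: { score: -1 } },
  ];
}

const scorePipeline = buildScorePipeline();
// With a Mongoose model (assumed here to be called Book):
// const rankedBooks = await Book.aggregate(scorePipeline);
```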

I think the best way would be to query mongoose for the list of books, then do the sorting yourself.
Something like:
// Get query results from mongoose then ...
books.sort((a,b) => {
return ((a.views*(1/2))+(a.downvotes*-4)+(a.upvotes*8))-((b.views*(1/2))+(b.downvotes*-4)+(b.upvotes*8))
});
This would sort the books in ascending order of score.
EDIT: The above answer is for sorting after you've received the query results. (I also just realized you want descending order, so just swap the operands so that the comparison is b - a.)
If you want to receive the results already sorted, you could instead calculate the score at the time you insert the book and store it as a field. Then use mongoose's Query#sort, which would look something like:
query.sort({ score: 'desc'});
More info on Query#sort: http://mongoosejs.com/docs/api.html#query_Query-sort
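For reference, the in-memory approach can be checked against the three sample books from the question. This is a sketch in plain JavaScript; the helper name score is an assumption made for illustration.

```javascript
// Sample documents from the question.
const books = [
  { name: "book1", views: 3000, upvotes: 340, downvotes: 120 },
  { name: "book2", views: 9000, upvotes: 210, downvotes: 620 },
  { name: "book3", views: 7000, upvotes: 6010, downvotes: 2 },
];

// Score formula from the question: view = 0.5, upvote = 8, downvote = -4.
const score = (b) => b.views * 0.5 + b.upvotes * 8 + b.downvotes * -4;

// Copy before sorting so the original array is left untouched.
// Comparing b to a (instead of a to b) gives descending order.
const rankedBooks = [...books].sort((a, b) => score(b) - score(a));

console.log(rankedBooks.map((b) => `${b.name}: ${score(b)}`));
// [ 'book3: 51572', 'book1: 3740', 'book2: 3700' ]
```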

Related

How to write a single query to count elements above a certain value in MongoDB

I have the following sample collection of movies:
[
{
"title":"Boots and Saddles",
"year":1909,
"cast":[],
"genres":[]
},
{
"title":"The Wooden Leg",
"year":1909,
"cast":[],
"genres":[]
},
{
"title":"The Sanitarium",
"year":1910,
"cast":["Fatty Arbuckle"],
"genres":["Comedy"]
},
{
"title":"Snow White",
"year":1916,
"cast":["Marguerite Clark"],
"genres":["Fantasy"]
},
{
"title":"Haunted Spooks",
"year":1920,
"cast":["Harold Lloyd"],
"genres":["Comedy"]
},
{
"title":"Very Truly Yours",
"year":1922,
"cast":["Shirley Mason", "lan Forrest"],
"genres":["Romance"]
}
]
I want to count the number of movies that appeared in the last 20 years (counting back from the most recent movie recorded in this collection).
I have the following query to find the year of the most recent movie (the result shows 2018):
db.movies.find({},{"_id":0, "year":1}).sort({year:-1}).limit(1)
So to find how many movies appeared in the last 20 years I wrote this:
db.movies.aggregate([{$match:{year:{$gte:1999}}},{$count:"title"}])
However, this is not very robust, because if the database is modified or updated, I will have to modify that query every time.
Is there a more elegant way to find the result?
Thank you in advance!
You can do this in a single aggregation. One option, available on MongoDB 5.0 and later, is $setWindowFields:
db.movies.aggregate([
{
$setWindowFields: {
output: {
latestMovieYear: { $max: "$year" }
}
}
},
{
$match: {
$expr: {
$gte: [ "$year", { $subtract: [ "$latestMovieYear", 20 ] } ]
}
}
},
{ $count: "movies" }
]);
$setWindowFields with no partitionBy and no window treats the whole collection as one window, so every document gets a latestMovieYear field holding the largest year in the collection.
$match with $expr keeps the movies whose year is greater than or equal to latestMovieYear minus 20. $expr is needed because a plain $match compares a field against a constant, not against another field.
$count counts the documents that remain.
Note that sorting, limiting to one document and then matching in the same pipeline would not work: after $limit only a single document is left, so the count could never exceed 1.
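The intended logic (find the latest year, then count the movies within 20 years of it) can be checked in plain JavaScript against the sample documents from the question. This sketch only emulates the computation in memory; it is not a MongoDB query.

```javascript
// Sample movies from the question (only the fields needed here).
const movies = [
  { title: "Boots and Saddles", year: 1909 },
  { title: "The Wooden Leg", year: 1909 },
  { title: "The Sanitarium", year: 1910 },
  { title: "Snow White", year: 1916 },
  { title: "Haunted Spooks", year: 1920 },
  { title: "Very Truly Yours", year: 1922 },
];

// Year of the most recent movie in the collection.
const latestMovieYear = Math.max(...movies.map((m) => m.year));

// Lower bound of the 20-year window, relative to the latest movie.
const cutoffYear = latestMovieYear - 20;

// Number of movies that fall inside the window.
const recentCount = movies.filter((m) => m.year >= cutoffYear).length;

console.log({ latestMovieYear, cutoffYear, recentCount });
// { latestMovieYear: 1922, cutoffYear: 1902, recentCount: 6 }
```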

MongoDB, Panache, Quarkus: How to do aggregate, $sum and filter

I have a collection in mongodb with sales transactions, each containing a userId, a timestamp and the revenue value of that specific sales transaction.
Now, I would like to query these transactions and get the minimum, maximum, sum and average of the revenue per user. Only transactions between two given timestamps should be considered, and only users whose sum of revenue is greater than a specified value should be included.
I have composed the corresponding query in mongosh:
db.salestransactions.aggregate(
[
{
"$match": {
"timestamp": {
"$gte": new ISODate("2020-01-01T19:28:38.000Z"),
"$lte": new ISODate("2020-03-01T19:28:38.000Z")
}
}
},
{
$group: {
_id: { userId: "$userId" },
minimum: {$min: "$revenue"},
maximum: {$max: "$revenue"},
sum: {$sum: "$revenue"},
avg: {$avg: "$revenue"}
}
},
{
$match: { "sum": { $gt: 10 } }
}
]
)
This query works absolutely fine.
How do I implement this query in a PanacheMongoRepository using quarkus ?
Any ideas?
Thanks!
A bit late, but you could do something like this.
Define a repository (this code is in Kotlin):
class YourRepositoryReactive : ReactivePanacheMongoRepository<YourEntity> {
// the reactive collection returns a Mutiny Multi (io.smallrye.mutiny.Multi), not a List
fun getDomainDocuments(): Multi<YourView> {
val aggregationPipeline = mutableListOf<Bson>()
// create each stage with Document.parse("stage_obj") and add it to aggregationPipeline
return mongoCollection().aggregate(aggregationPipeline, YourView::class.java)
}
}
mongoCollection() automatically operates on the collection of your entity.
YourView is a class that maps the properties that are part of your output. Make sure that this class has the
@ProjectionFor(YourEntity::class)
annotation.
Hope this helps.

Mongodb selecting every nth of a given sorted aggregation

I want to be able to retrieve every nth item of a given collection which is quite large (millions of records)
Here is a sample of my collection
{
_id: ObjectId("614965487d5d1c55794ad324"),
hour: ISODate("2021-09-21T17:21:03.259Z"),
searches: [
ObjectId("614965487d5d1c55794ce670")
]
}
My start of aggregation is like so
[
{
$match: {
searches: {
$in: [ObjectId('614965487d5d1c55794ce670')],
},
},
},
{ $sort: { hour: -1 } },
{ $project: { hour: 1 } },
...
]
I have tried many things including
$sample, which does not return the picks in sorted order
$skip, which becomes very slow as the number of documents to skip grows
Using _id instead of $skip, but my ids are unfortunately not created in an ordered manner
My goal is thus to retrieve the hour of every 20000th record, so that I can then make a call to retrieve data in chunks of approximately 20000 records.
I imagine it would be possible to
sort and number all the records, then keep only the first, the 20000th, the 40000th, ..., and the last
Thanks for your help and let me know if you need more information
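The "sort, number, keep every nth" idea can be sketched with $setWindowFields and $documentNumber, which exist in MongoDB 5.0 and later. This is only a sketch: buildEveryNthPipeline and the position field are hypothetical names, the stage still has to number every matching document so its speed on millions of records is untested here, and it keeps the first document of each chunk but not the very last one.

```javascript
// Hypothetical helper: builds a pipeline that keeps documents 1, n+1, 2n+1, ...
// of the sorted result, i.e. the first document of each chunk of n.
function buildEveryNthPipeline(searchId, n) {
  return [
    // Same filter as in the question.
    { $match: { searches: { $in: [searchId] } } },
    {
      // Number the documents in sort order, starting at 1.
      $setWindowFields: {
        sortBy: { hour: -1 },
        output: { position: { $documentNumber: {} } },
      },
    },
    {
      // Keep positions where (position - 1) is a multiple of n.
      $match: {
        $expr: { $eq: [{ $mod: [{ $subtract: ["$position", 1] }, n] }, 0] },
      },
    },
    { $project: { hour: 1 } },
  ];
}

// In real use searchId would be an ObjectId; a string stands in for it here.
const everyNthPipeline = buildEveryNthPipeline("614965487d5d1c55794ce670", 20000);
```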

MongoDB match the most array elements

I have a use case where I'm not sure if it can be solved with MongoDB in any reasonably efficient way.
The DB contains Consultants, consultants have a set of available weeks (array of week numbers).
I now want to filter on the consultants with the best matching overlap of a given set of weeks.
e.g. consultants:
{
_id: ....
name: "James",
weeks: [1,2,3,4,8,9,13]
}
{
_id: ....
name: "Anna",
weeks: [2,3,4,20,23]
}
Search data: [1,2,4]
The more the overlap, the higher I want to rank the consultant in the search result.
James matches all three entries, 1,2,4. Anna matches 2,4
Is this even possible using Mongo?
You can calculate a weight for each consultant as a setIntersection between your search array and weeks array:
db.consultants.aggregate([
{
$addFields: {
weight: {
$size: { $setIntersection: [ "$weeks", [1,2,4] ] }
}
}
},
{ $sort: { weight: -1 } }
])
The longer the intersection array, the more weeks matched, so you can $sort by this weight field.
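To see what the weight looks like for the two consultants from the question, the same computation can be emulated in plain JavaScript. This is an in-memory sketch of what $size over $setIntersection yields, not a MongoDB call.

```javascript
// Consultants and search weeks from the question.
const consultants = [
  { name: "James", weeks: [1, 2, 3, 4, 8, 9, 13] },
  { name: "Anna", weeks: [2, 3, 4, 20, 23] },
];
const searchWeeks = new Set([1, 2, 4]);

// weight = number of distinct weeks shared with the search set.
const weighted = consultants
  .map((c) => ({
    name: c.name,
    weight: new Set(c.weeks.filter((w) => searchWeeks.has(w))).size,
  }))
  // Largest overlap first.
  .sort((a, b) => b.weight - a.weight);

console.log(weighted);
// [ { name: 'James', weight: 3 }, { name: 'Anna', weight: 2 } ]
```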

MongoDB query for distinct field values that meet a conditional

I have a collection named 'sentences'. I would like a list of all the unique values of 'last_syls' where the number of entries containing that value of 'last_syls' is greater than 10.
A document in this collection looks like:
{ "_id" : ObjectId( "51dd9011cf2bee3a843f215a" ),
"last_syls" : "EY1D",
"last_word" : "maid"}
I've looked into db.sentences.distinct('last_syls'), but cannot figure out how to query based on the count for each of these distinct values.
You're going to want to use the aggregation framework:
db.sentences.aggregate([
{
$group: {
_id: "$last_syls",
count: { $sum: 1}
}
},
{
$match: {
count: { $gt: 10 }
}
}
])
This groups documents by their last_syls field with a count per group, then filters that result set to all results with a count greater than 10.
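The group-then-filter step can be emulated in plain JavaScript to see what the result contains. In this sketch the function name frequentValues is hypothetical, and the sample documents (apart from the "EY1D"/"maid" one from the question) are made up, with a threshold of 1 instead of 10 to keep the example small.

```javascript
// Hypothetical helper: returns the distinct values of `field` that occur
// in more than `minCount` documents, mirroring $group + $match.
function frequentValues(docs, field, minCount) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count > minCount)
    .map(([value, count]) => ({ _id: value, count }));
}

// Made-up sample data for illustration only.
const sentences = [
  { last_syls: "EY1D", last_word: "maid" },
  { last_syls: "EY1D", last_word: "paid" },
  { last_syls: "EY1D", last_word: "made" },
  { last_syls: "AH0N", last_word: "button" },
];

const frequent = frequentValues(sentences, "last_syls", 1);
console.log(frequent);
// [ { _id: 'EY1D', count: 3 } ]
```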