Meteor collection get last document of each selection - mongodb

Currently I use the following find query to get the latest document for a certain caveId:
Conditions.find({
  caveId: caveId
}, {
  sort: {diveDate: -1},
  limit: 1,
  fields: {caveId: 1, "visibility.visibility": 1, diveDate: 1}
});
How can I do the same for multiple IDs, for example using $in?
I tried the following query. The problem is that it limits the result to one document across all the matched caveIds, but the limit should apply separately to each caveId.
Conditions.find({
  caveId: {$in: caveIds}
}, {
  sort: {diveDate: -1},
  limit: 1,
  fields: {caveId: 1, "visibility.visibility": 1, diveDate: 1}
});
One solution I came up with is using the aggregate functionality.
var conditionIds = Conditions.aggregate([
  {$match: {caveId: {$in: caveIds}}},
  // sort first so that $last reliably picks the newest dive per cave
  {$sort: {diveDate: 1}},
  {
    $group: {
      _id: "$caveId",
      conditionId: {$last: "$_id"},
      diveDate: {$last: "$diveDate"}
    }
  }
]).map(function(child) { return child.conditionId; });
var conditions = Conditions.find({
  _id: {$in: conditionIds}
}, {
  fields: {caveId: 1, "visibility.visibility": 1, diveDate: 1}
});

You don't want to use $in here as noted. You could solve this problem by looping through the caveIds and running the query on each caveId individually.
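That loop could be sketched like this; the helper name latestFor is introduced here for illustration and is not from the original code:

```javascript
// One findOne per caveId, each using the sort on diveDate to pick
// the newest condition document for that cave.
function latestFor(Conditions, caveIds) {
  return caveIds.map(function (caveId) {
    return Conditions.findOne(
      { caveId: caveId },
      {
        sort: { diveDate: -1 },
        fields: { caveId: 1, 'visibility.visibility': 1, diveDate: 1 }
      }
    );
  });
}
```

With an index on (caveId, diveDate) each of these lookups is cheap, so for a modest number of caveIds this is often fast enough.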

You're basically looking at a join here: you need all the caveIds, then the last dive for each.
In my opinion (and it is only an opinion), this is a schema/denormalization problem:
You could, as mentioned, look up all the caveIds and then run the single query for each, every time you need the last dives.
However, I think you are much better off recording/updating the last dive inside your cave document, and then looking up the caveIds of interest while pulling only the lastDive field.
That gives you what you need immediately, rather than running expensive search/sort queries. The cost is maintaining that field in the document, but that should be fairly trivial, since you only need to update the one field when a new dive is recorded.
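A minimal sketch of that denormalization; the Caves collection, recordDive helper, and lastDive shape are assumptions, not part of the original schema:

```javascript
// When a new dive condition is recorded, also stamp a lastDive summary
// onto the cave document, so reads never need a sort.
function recordDive(Caves, Conditions, condition) {
  // keep the full history as before
  Conditions.insert(condition);
  // denormalize: overwrite the single lastDive summary on the cave itself
  Caves.update(condition.caveId, {
    $set: {
      lastDive: {
        conditionId: condition._id,
        diveDate: condition.diveDate,
        visibility: condition.visibility
      }
    }
  });
}
```

Reading the latest dives then becomes a single indexed find over the caves, pulling only the lastDive field.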

Related

Mongodb to fetch top 100 results for each category

I have a collection of transactions with the schema below:
{
  _id,
  client_id,
  billings,
  date,
  total
}
What I want to achieve is to get the 10 latest transactions, based on the date, for a list of client IDs. I don't think $slice works here, as its use case is mostly embedded arrays.
Currently, I am iterating through the client_ids and using find with a limit, but it is extremely slow.
UPDATE
Example
https://mongoplayground.net/p/urKH7HOxwqC
This shows two clients with 10 transactions each on different days; I want to write a query that returns the latest 5 transactions for each.
Any suggestions of how to query data to make it faster?
The most efficient way would be to just execute multiple queries, 1 for each client, like so:
const clients = await db.collection.distinct('client_id');
const results = await Promise.all(
  clients.map((clientId) =>
    db.collection.find({client_id: clientId}).sort({date: -1}).limit(5).toArray()
  )
);
To improve performance, make sure you have a compound index on client_id and date. If for whatever reason you can't build these indexes, I'd recommend the following pipeline (syntax available starting in version 5.2, which introduced $bottomN):
db.collection.aggregate([
  {
    $group: {
      _id: "$client_id",
      latestTransactions: {
        $bottomN: {
          n: 5,
          sortBy: { date: 1 },
          output: "$$ROOT"
        }
      }
    }
  }
])
Mongo Playground
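The compound index the answer recommends could be declared like this; the collection name transactions is a placeholder:

```javascript
// Equality field (client_id) first, then the sort key (date), descending to
// match the sort direction of the per-client query.
const indexSpec = { client_id: 1, date: -1 };
// In mongosh or the Node.js driver:
// db.transactions.createIndex(indexSpec);
```

With this index, each per-client find can walk the index in order and stop after 5 documents, without an in-memory sort.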

aggregating and sorting based on a Mongodb Relationship

I'm trying to figure out if what I want to do is even possible in Mongodb. I'm open to all sorts of suggestions regarding more appropriate ways to achieve what I need.
Currently, I have 2 collections:
vehicles (Contains vehicle data such as make and model. This data can be highly unstructured, which is why I turned to Mongodb for this)
views (Simply contains an IP, a date/time that the vehicle was viewed and the vehicle_id. There could be thousands of views)
I need to return a list of vehicles that have views between 2 dates. The list should include the number of views. I need to be able to sort by the number of views in addition to any of the usual vehicle fields. So, to be clear, if a vehicle has had 1000 views, but only 500 of those between the given dates, the count should return 500.
I'm pretty sure I could perform this query without any issues in MySQL - however, trying to store the vehicle data in MySQL has been a real headache in the past and it has been great moving to Mongo where I can add new data fields with ease and not worry about the structure of my database.
What do you all think?? TIA!
As it turns out, it's totally possible. It took me a long while to get my head around this, so I'm posting it up for future google searches...
db.statistics.aggregate([
  {
    $match: {
      branch_id: { $in: [14] }
    }
  },
  {
    $lookup: {
      from: 'vehicles',
      localField: 'vehicle_id',
      foreignField: '_id',
      as: 'vehicle'
    }
  },
  {
    $group: {
      _id: "$vehicle_id",
      count: { $sum: 1 },
      vehicleObject: { $first: "$vehicle" }
    }
  },
  { $unwind: "$vehicleObject" },
  {
    $project: {
      daysInStock: { $subtract: [new Date(), "$vehicleObject.date_assigned"] },
      vehicleObject: 1,
      count: 1
    }
  },
  { $sort: { count: -1 } },
  { $limit: 10 }
]);
To explain the above:
The MongoDB aggregation framework is the way forward for complex queries like this. First, I run a $match to filter the records. Then $lookup grabs the vehicle record; it's worth mentioning that this is a many-to-one relationship (lots of stats, each with a single vehicle). I can then group on the vehicle_id field, which returns one record per vehicle with a count of the stats in each group. As it is a group, we technically have many copies of the same vehicle document in each group, so I keep just the first one in the vehicleObject field. Because $lookup produces an array, vehicleObject is an array with a single entry, so I added the $unwind stage to pull the actual vehicle document out. I then added a $project stage to calculate an additional field, sorted by the count descending, and limited the results to 10.
And take a breath :)
I hope that helps someone. If you know of a better way to do what I did, then I'm open to suggestions to improve.
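One gap worth noting: the question asked for views between two dates, but the $match stage above only filters on branch_id. A hedged sketch of a date-constrained $match; the field name date on the statistics/views collection is an assumption:

```javascript
// Only views whose date falls inside the window are counted, so a vehicle
// with 1000 total views but 500 in the window groups to a count of 500.
const from = new Date('2020-01-01');
const to = new Date('2020-02-01');
const matchStage = {
  $match: {
    branch_id: { $in: [14] },
    date: { $gte: from, $lt: to }
  }
};
```

This stage would simply replace the first stage of the pipeline above; the rest of the pipeline is unchanged.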

mongodb how to get a document which has max value of each "group with the same key" [duplicate]

This question already has answers here:
MongoDB - get documents with max attribute per group in a collection
(2 answers)
Closed 5 years ago.
I have a collection:
{'_id':'008','name':'ada','update':'1504501629','star':3.6,'desc':'ok', ...}
{'_id':'007','name':'bob','update':'1504501614','star':4.2,'desc':'gb', ...}
{'_id':'005','name':'ada','update':'1504501532','star':3.2,'desc':'ok', ...}
{'_id':'003','name':'bob','update':'1504501431','star':4.5,'desc':'bg', ...}
{'_id':'002','name':'ada','update':'1504501378','star':3.4,'desc':'no', ...}
{'_id':'001','name':'ada','update':'1504501325','star':3.6,'desc':'ok', ...}
{'_id':'000','name':'bob','update':'1504501268','star':4.3,'desc':'gg', ...}
...
I want, for each 'name', the document with the max value of 'update' (i.e. the newest document per 'name'), returned as the whole document:
{'_id':'008','name':'ada','update':'1504501629','star':3.6,'desc':'ok', ...}
{'_id':'007','name':'bob','update':'1504501614','star':4.2,'desc':'gb', ...}
...
What is the most efficient way to do this?
Currently I do it in Python like this:
result = []
for name in db.collection.distinct('name'):
    result.append(db.collection.find({'name': name}).sort('update', -1)[0])
Doesn't this run 'find' too many times?
=====
I crawl data keyed by 'name', collecting many other keys, and every time I insert a document I set a key named 'update'.
When I use the database, I want the newest document for a specific 'name', so it seems I can't just use $group.
What should I do? Redesign the schema, or is there a better way to find them?
=====
Improved!
I tried creating an index on 'name' and 'update', and the process went from half an hour down to 30 seconds!
But I still welcome a better solution ^_^
Your use case suits aggregation really well. As I see in your question, you already know that but can't figure out how to use $group and take the whole document that has the max update. If you $sort your documents before $group, you can use the $first operator. So there is no need to send a find query for each name.
db.collection.aggregate([
  { $sort: { "name": 1, "update": -1 } },
  { $group: { _id: "$name", "update": { $first: "$update" }, "doc_id": { $first: "$_id" } } }
])
I did not add an extra $project stage to the aggregation; you can just add the fields you want in the result to $group with the $first operator.
Additionally, if you look closely at the $sort stage, you can see it uses your newly created index, so you did well to add that; otherwise I would have recommended it too :)
Update: For your question in comment:
You would have to write all the keys in $group. But if you think that looks bad, or new fields will come in the future and you don't want to rewrite $group each time, I would do this instead:
First get all the _id fields of the desired documents in the aggregation, then fetch those documents in one find query with the $in operator.
db.collection.find({ "_id": { $in: [ /* ids returned by the aggregation */ ] } })
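Another standard aggregation idiom that avoids listing every key in $group is to keep the whole newest document with $first: "$$ROOT" and promote it with $replaceRoot (available since MongoDB 3.4); a hedged sketch:

```javascript
// After the sort, $first: "$$ROOT" captures the entire newest document per
// name, so no second find() query is needed at all.
const pipeline = [
  { $sort: { name: 1, update: -1 } },
  { $group: { _id: '$name', doc: { $first: '$$ROOT' } } },
  { $replaceRoot: { newRoot: '$doc' } }
];
// db.collection.aggregate(pipeline);
```

New fields added to the documents later will appear in the results automatically, since $$ROOT carries the whole document.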

Aggregation with meteorhacks:aggregate (why would I ever use $out)?

This is a significant edit to this question, as I have changed the publication to a method and narrowed the scope of the question.
I am using meteorhacks:aggregate to calculate and publish the average and median of company valuation data for a user-selected series of companies. The selections are saved in a Valuations collection for reference and the data for aggregation comes from the Companies collection.
The code below works fine for one-time use (although it's not reactive). However, users will rerun this aggregation for thousands of valuationIds. Since $out first clears the target collection before inserting the new results, I can't use it here; I need to retain the results of each run. I don't understand why $out would ever be used.
Is there any way to just update the existing Valuation document with the aggregation results and then subscribe to that document?
server/methods
Meteor.methods({
  valuationAggregate: function(valuationId, valuationSelections) {
    //Aggregate and publish the average of company valuation data for a user-selected series of companies.//
    //Selections are saved in the Valuations collection for reference; data for aggregation comes from the Companies collection.//
    check(valuationId, String);
    check(valuationSelections, Array);

    var pipelineSelections = [
      //Match documents in the Companies collection where the 'ticker' value was selected by the user.//
      {$match: {ticker: {$in: valuationSelections}}},
      {
        $group: {
          _id: null,
          avgEvRevenueLtm: {$avg: {$divide: ["$capTable.enterpriseValue", "$financial.ltm.revenue"]}},
          avgEvRevenueFy1: {$avg: {$divide: ["$capTable.enterpriseValue", "$financial.fy1.revenue"]}},
          avgEvRevenueFy2: {$avg: {$divide: ["$capTable.enterpriseValue", "$financial.fy2.revenue"]}},
          avgEvEbitdaLtm: {$avg: {$divide: ["$capTable.enterpriseValue", "$financial.ltm.ebitda"]}},
          //more//
        }
      },
      {
        $project: {
          _id: 0,
          valuationId: {$literal: valuationId},
          avgEvRevenueLtm: 1,
          avgEvRevenueFy1: 1,
          avgEvRevenueFy2: 1,
          avgEvEbitdaLtm: 1,
          //more//
        }
      }
    ];

    var results = Companies.aggregate(pipelineSelections);
    console.log(results);
  }
});
The code above works, as far as viewing the results on the server. In my terminal, I see:
I20150926-23:50:27.766(-4)? [ { avgEvRevenueLtm: 3.988137239679733,
I20150926-23:50:27.767(-4)? avgEvRevenueFy1: 3.8159564713187155,
I20150926-23:50:27.768(-4)? avgEvRevenueFy2: 3.50111769838031,
I20150926-23:50:27.768(-4)? avgEvEbitdaLtm: 11.176476895728268,
//more//
I20150926-23:50:27.772(-4)? valuationId: 'Qg4EwpfJ5uPXyxe62' } ]
I was able to resolve this with the following. I needed to add the forEach to unwind the results array, similar to what $out does.
lib/collections
ValuationResults = new Mongo.Collection('valuationResults');
server/methods
    var results = Companies.aggregate(pipelineSelections);
    results.forEach(function(valuationResult) {
      ValuationResults.update(
        {'result.valuationId': valuationId},
        {result: valuationResult},
        {upsert: true}
      );
    });
    console.log(ValuationResults.find({'result.valuationId': valuationId}).fetch());
  }
});
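The subscription half of this could then be a plain publication over the upserted document; a sketch, where the publication name and the factory wrapper are illustrative rather than anything required by Meteor:

```javascript
// Server side: publish the stored aggregation result for one valuation.
// Returning a cursor keeps the publication reactive, so the client sees
// updated results whenever the method upserts a new one.
function makeValuationPublication(ValuationResults) {
  return function (valuationId) {
    return ValuationResults.find({ 'result.valuationId': valuationId });
  };
}
// Meteor.publish('valuationResult', makeValuationPublication(ValuationResults));
// Client side: Meteor.subscribe('valuationResult', valuationId);
```
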

Publish all fields in document but just part of an array in the document

I have a mongo collection in which the documents have a field that is an array. I want to be able to publish everything in the documents except for the elements in the array that were created more than a day ago. I suspect the answer will be somewhat similar to this question.
Meteor publication: Hiding certain fields in an array document field?
Instead of limiting fields in the array, I just want to limit the elements in the array being published.
Thanks in advance for any responses!
EDIT
Here is an example document:
{
  _id: 123456,
  name: "Unit 1",
  createdAt: (datetime object),
  settings: *some stuff*,
  packets: [
    {
      _id: 32412312,
      temperature: 70,
      createdAt: *datetime object from today*
    },
    {
      _id: 32412312,
      temperature: 70,
      createdAt: *datetime from yesterday*
    }
  ]
}
I want to get everything in this document except for the part of the array that was created more than 24 hours ago. I know I can accomplish this by moving the packets into their own collection and tying them together with keys as in a relational database but if what I am asking were possible, this would be simpler with less code.
You could do something like this in your publish method:
Meteor.publish("pubName", function() {
  var collection = Collection.find().fetch(); //change this to return your data
  var deadline = Date.now() - 86400000; //should equal 24 hrs ago
  _.each(collection, function(collectionItem) {
    //filter rather than splicing inside the loop, which would skip elements
    collectionItem.packets = _.filter(collectionItem.packets, function(packet) {
      return packet.createdAt >= deadline;
    });
  });
  return collection;
});
Though you might be better off storing the last 24 hours worth of packets as a separate array in your document. Would probably be less taxing on the server, not sure.
Also, code above is untested. Good luck.
you can use the $elemMatch projection
http://docs.mongodb.org/manual/reference/operator/projection/elemMatch/
So in your case, it would be
var today = new Date();
var yesterday = new Date(today);
yesterday.setDate(today.getDate() - 1);
collection.find({}, //find anything or something specific
  {
    fields: {
      packets: {
        $elemMatch: {createdAt: {$gt: yesterday /* or some new Date() */}}
      }
    }
  }
);
However, $elemMatch only returns the FIRST element matching your condition. To return more than 1 element, you need to use the aggregation framework, which will be more efficient than _.each or forEach, particularly if you have a large array to loop through.
collection.rawCollection().aggregate([
  {
    $match: {}
  },
  {
    $redact: {
      $cond: {
        if: {$or: [{$gt: ["$createdAt", yesterday]}, "$packets"]},
        then: "$$DESCEND",
        else: "$$PRUNE"
      }
    }
  }
], function(error, result) {
});
You specify the $match just as you would a find({}) query. All the documents that match then get piped into the $redact stage, which is controlled by the $cond expression.
$redact scans the document from the top level down. At the top level, you have _id, name, createdAt, settings, and packets; hence {$or: [***, "$packets"]}.
The presence of $packets in the $or allows $redact to descend to the second level, which contains the _id, temperature, and createdAt of each packet; hence {$gt: ["$createdAt", yesterday]}.
This is async, so you can use Meteor.wrapAsync to wrap the function.
Hope this helps.
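On MongoDB 3.2+, a simpler alternative to $redact for this shape of problem is the $filter expression inside $project, which keeps only the array elements matching a condition; a hedged sketch along the lines of the document above:

```javascript
// Keep every top-level field explicitly, and replace packets with only the
// elements created in the last 24 hours. Unlike $elemMatch, $filter returns
// all matching elements, not just the first.
const yesterday = new Date(Date.now() - 24 * 60 * 60 * 1000);
const pipeline = [
  { $match: {} },
  {
    $project: {
      name: 1,
      createdAt: 1,
      settings: 1,
      packets: {
        $filter: {
          input: '$packets',
          as: 'packet',
          cond: { $gt: ['$$packet.createdAt', yesterday] }
        }
      }
    }
  }
];
// collection.rawCollection().aggregate(pipeline).toArray();
```

This avoids the level-by-level reasoning $redact requires, since the condition applies only to the packets array rather than to every level of the document.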