$in query operator not working in aggregation framework - mongodb

I have a large collection of dataset, which I would like to run some custom queries using mongo aggregation framework.
I am trying to do the following
minLimit and maxLimit are date objects to specify date range.
Tickets.aggregate(
{ $match: { timestamp: { $gte: minLimit, $lte: maxLimit } } },
{ $match: { staffId: { $in: staffIds } } }
).allowDiskUse(true).exec(cb);
If I don't include staff match condition the above is working fine but after I include staff match condition no records are returned.
The staffIds is a array of ids which I get using lodash pluck method.
First I get all staff documents using
Staff.find({some filter condition}, callback);
Then I pluck each of the records using
var staffIds = _.pluck(results, '_id');
I will pass this staffIds to match condition.
Why it is not working? If I remove $in match it is working fine.

Related

MongoDB query over several collections with one sort stage

I have some data with identical layout divided over several collections, say we have collections named Jobs.Current, Jobs.Finished, Jobs.ByJames.
I have implemented a complex query using some aggregation stages on one of these collections, where the last stage is the sorting. It's something like this (but in real it's implemented in C# and additionally doing a projection):
db.ArchivedJobs.aggregate([ { $match: { Name: { $gte: "A" } } }, { $addFields: { "UpdatedTime": { $max: "$Transitions.TimeStamp" } } }, { $sort: { "__tmp": 1 } } ])
My new requirement is to include all these collections into my query. I could do it by simply running the same query on all collections in sequence - but then I still need to sort the results together. As this sort isn't so trivial (using an additional field being created by a $max on a sub-array) and I'm using skip and limit options I hope it's possible to do it in a way like:
Doing the query I already implemented on all relevant collections by defining appropriate aggregation steps
Sorting the whole result afterwards inside the same aggregation request
I found something with a $lookup stage, but couldn't apply it to my request as it needs to do some field-oriented matching (?). I need to access the complete objects.
The data is something like
{
"_id":"4242",
"name":"Stream recording Test - BBC 60 secs switch",
"transitions":[
{
"_id":"123",
"timeStamp":"2020-02-13T14:59:40.449Z",
"currentProcState":"Waiting"
},
{
"_id":"124",
"timeStamp":"2020-02-13T14:59:40.55Z",
"currentProcState":"Running"
},
{
"_id":"125",
"timeStamp":"2020-02-13T15:00:23.216Z",
"currentProcState":"Error"
} ],
"currentState":"Error"
}

Compute Simple Moving Average in Mongo Shell

I am developing a financial application with Nodejs. I wonder would it be possible to compute simple moving average which is the average last N days of price directly in Mongo Shell than reading it and computing it in Node js.
Document Sample.
[{code:'0001',price:0.10,date:'2014-07-04T00:00:00.000Z'},
{code:'0001',price:0.12,date:'2014-07-05T00:00:00.000Z'},{code:'0001',price:0.13,date:'2014-07-06T00:00:00.000Z'},
{code:'0001',price:0.12,date:'2014-07-07T00:00:00.000Z'}]
If you have more than a trivial number of documents you should use the DB server to do the work rather than JS.
You don't say if you are using mongoose or the node driver directly. I'll assume you are using mongoose as that is the way most people are headed.
So your model would be:
// models/stocks.js
const mongoose = require("mongoose");
const conn = mongoose.createConnection('mongodb://localhost/stocksdb');
const StockSchema = new mongoose.Schema(
{
price: Number,
code: String,
date: Date,
},
{ timestamps: true }
);
module.exports = conn.model("Stock", StockSchema, "stocks");
You rightly suggested that aggregation frameworks would be a good way to go here. First though if we are dealing with returning values between date ranges, the records in your database need to be date objects. From your example documents you may have put strings. An example of inserting objects with dates would be:
db.stocks.insertMany([{code:'0001',price:0.10,date:ISODate('2014-07-04T00:00:00.000Z')}, {code:'0001',price:0.12,date:ISODate('2014-07-05T00:00:00.000Z')},{code:'0001',price:0.13,date:ISODate('2014-07-06T00:00:00.000Z')}, {code:'0001',price:0.12,date:ISODate('2014-07-07T00:00:00.000Z')}])
The aggregation pipeline function accepts an array with one or more pipeline stages.
The first pipeline stage we should use is $match, $match docs, this filters the documents down to only the records we are interested in which is important for performance
{ $match: {
date: {
$gte: new Date('2014-07-03'),
$lte: new Date('2014-07-07')
}
}
}
This stage will send only the documents that are on the 3rd to 7th July 2014 inclusive to the next stage (in this case all the example docs)
Next stage is the stage where you can get an average. We need to group the values together based on one field, multiple fields or all fields.
As you don't specify a field you want to average over I'll give an example for all fields. For this we use the $group object, $group docs
{
$group: {
_id: null,
average: {
$avg: '$price'
}
}
}
This will take all the documents and display an average of all the prices.
In the case of your example documents this results in
{ _id: null, avg: 0.1175 }
Check the answer:
(0.10 + 0.12 + 0.12 + 0.13) / 4 = 0.1175
FYI: I wouldn't rely on calculations done with javascript for anything critical as Numbers using floating points. See https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html for more details if you are worried about that.
For completeness here is the full aggregation query
const Stock = require("./models/stocks");
Stock.aggregate([{ $match: {
date: {
$gte: new Date('2014-07-03'),
$lte: new Date('2014-07-07')
}
}},
{
$group: {
_id: null,
avg: {
$avg: '$price'
}
}
}])
.then(console.log)
.catch(error => console.error(error))
Not sure about your moving average formula, but here is how I would do it:
var moving_average = null
db.test.find().forEach(function(doc) {
if (moving_average==null) {
moving_average = doc.price;
}
else {
moving_average = (moving_average+doc.price)/2;
}
})
output:
> moving_average
0.3
And if you wan to define the N days to do the average for, just modify the argument for find:
db.test.find({ "date": { $lt: "2014-07-10T00:00:00.000Z" }, "date": { $gt: "2014-07-07T00:00:00.000Z" } })
And if you want to do the above shell code in one-line, you can assume that moving_average is undefined and just check for that before assigning the first value.

Querying two array fields and getting positional result

I've been trying to do important update operations using two-phase commit method. Unfortunately, the field that will be updated in an array. But the same document have to has a pendingTransactions field to store the current transactions. And according to this documentation, MongoDB doesn't support positional update
if two array fields are in the query document.
Is there any chance to solve this situation?
Other Additional informations and 'Correct Results' using Two Array Fields In The Query
Actually I used a second array in query document without knowing the problem. It works stable. When testing, I encountered with the problem at another update operation. So, the first query response correctly at every time.
the stable part is like the below. With $ne operator it works correct;
invoices and pendingTransactions are Array fields
var query = {
_id: customer_id,
invoices: { $elemMatch: { 'year': 2015, 'month': 8 } },
pendingTransactions: { $ne: transaction_id }
}
Customers.findOne(filter, { 'invoices.$' }, function(err, customer){
/* Positional operator $ works correct */
}
the unstable part is like the below;
var query = {
_id: customer_id,
invoices: { $elemMatch: { 'year': 2015, 'month': 8 } },
pendingTransactions: transaction_id
}
Customers.findOne(filter, { 'invoices.$' }, function(err, customer){
/* Positional operator $ doesn't work correct */
}

Publish all fields in document but just part of an array in the document

I have a mongo collection in which the documents have a field that is an array. I want to be able to publish everything in the documents except for the elements in the array that were created more than a day ago. I suspect the answer will be somewhat similar to this question.
Meteor publication: Hiding certain fields in an array document field?
Instead of limiting fields in the array, I just want to limit the elements in the array being published.
Thanks in advance for any responses!
EDIT
Here is an example document:
{
_id: 123456,
name: "Unit 1",
createdAt: (datetime object),
settings: *some stuff*,
packets: [
{
_id: 32412312,
temperature: 70,
createdAt: *datetime object from today*
},
{
_id: 32412312,
temperature: 70,
createdAt: *datetime from yesterday*
}
]
}
I want to get everything in this document except for the part of the array that was created more than 24 hours ago. I know I can accomplish this by moving the packets into their own collection and tying them together with keys as in a relational database but if what I am asking were possible, this would be simpler with less code.
You could do something like this in your publish method:
Meteor.publish("pubName", function() {
var collection = Collection.find().fetch(); //change this to return your data
_.each(collection, function(collectionItem) {
_.each(collectionItem.packets, function(packet, index) {
var deadline = Date.now() - 86400000 //should equal 24 hrs ago
if (packet.createdAt < deadline) {
collectionItem.packets.splice(index, 1);
}
}
}
return collection;
}
Though you might be better off storing the last 24 hours worth of packets as a separate array in your document. Would probably be less taxing on the server, not sure.
Also, code above is untested. Good luck.
you can use the $elemMatch projection
http://docs.mongodb.org/manual/reference/operator/projection/elemMatch/
So in your case, it would be
var today = new Date();
var yesterday = new Date(today);
yesterday.setDate(today.getDate() - 1);
collection.find({}, //find anything or specifc
{
fields: {
'packets': {
$elemMatch: {$gt : {'createdAt' : yesterday /* or some new Date() */}}
}
}
});
However, $elemMatch only returns the FIRST element matching your condition. To return more than 1 element, you need to use the aggregation framework, which will be more efficient than _.each or forEach, particularly if you have a large array to loop through.
collection.rawCollection().aggregate([
{
$match: {}
},
{
$redact: {
$cond: {
if : {$or: [{$gt: ["$createdAt",yesterday]},"$packets"]},
then: "$$DESCEND",
else: "$$PRUNE"
}
}
}], function (error, result ){
});
You specify the $match in a way similar to find({}). Then all the documents that match your conditions get pipped into the $redact which is specified by the $cond.
$redact scans the document from top level to bottom. At the top level, you have _id, name, createdAt, settings, packets; hence {$or: [***,"$packets"]}
The presence of $packets in the $or allows the $redact to scan the second level which contain the _id, temperature and createdAt; hence {$gt: ["$createdAt",yesterday]}
This is async, you can use Meteor.wrapAsync to wrap around the function.
Hope this help

MongoDB: Search in array

I have a field in a MongoDB Collection like:
{
place = ['London','Paris','New York']
}
I need a query that will return only that particular entry of the array, where a specific character occurs. For Example, I want to search for the terms having the letter 'o'(case-insensitive) in them. It should just return 'London' and 'New York'.
I tried db.cities.find({"place":/o/i}), but it returns the whole array.
You'll need to $unwind using an aggregate query, then match.
db.cities.aggregate([ { $unwind:'$place' }, { $match: { place : {$regex: /o/i } } } ])
In simple you find with $regex below query will work without aggregation
db.collectionName.find({ place : {$regex: /o/i } })