In MongoDB I want to calculate the sum of the partialAmount field, which is of string type; its values are stored as "20,00", "15,00".
How can I calculate the sum of all values? Both queries I have tried return 0.
collection.aggregate([
  {
    $group: {
      _id: null,
      sum: { $sum: "$$partialAmount" }
    }
  }
]);
And:
collection.aggregate([
  {
    $group: {
      _id: null,
      totalAmount: {
        $sum: {
          $toDouble: "$partialAmount"
        }
      }
    }
  }
]);
Your first query will not work because you are trying to sum strings, and you also have an extra "$" in "$$partialAmount".
Your second query would work if the partialAmount values were stored in the format "15.00" and "20.00".
If they are saved as "15,00" and "20,00" in the db, your second query should throw an error, not return 0. (If you are actually getting a zero result, then maybe the "partialAmount" field is misspelled in the db, or the field gets lost in a previous stage of the pipeline.)
In this case you need to either change the values in your db to the "20.00" format or, if this is not feasible, use $split and $concat to convert them to the proper format before converting to double and summing up the values.
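A minimal sketch of the $split / $concat conversion, assuming every partialAmount has the form "20,00" (exactly one comma, no thousands separator). The names sumPartialAmounts and parseCommaDecimal are made up for illustration; the helper only mirrors the pipeline's steps in plain JavaScript so the logic can be checked without a server.

```javascript
// Aggregation pipeline: "20,00" -> ["20", "00"] -> "20.00" -> 20.0, then sum.
// Assumes exactly one comma per value and no thousands separators.
const sumPartialAmounts = [
  {
    $group: {
      _id: null,
      totalAmount: {
        $sum: {
          $toDouble: {
            $let: {
              vars: { parts: { $split: ["$partialAmount", ","] } },
              in: {
                $concat: [
                  { $arrayElemAt: ["$$parts", 0] },
                  ".",
                  { $arrayElemAt: ["$$parts", 1] }
                ]
              }
            }
          }
        }
      }
    }
  }
];

// Hypothetical helper mirroring the pipeline in plain JavaScript.
function parseCommaDecimal(value) {
  const parts = value.split(",");
  return Number(parts[0] + "." + parts[1]);
}

const total = ["20,00", "15,00"]
  .map(parseCommaDecimal)
  .reduce((a, b) => a + b, 0);
console.log(total); // 35
```

The pipeline would be passed as collection.aggregate(sumPartialAmounts). Values that do not match the assumed format would need separate handling.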
I want the total number of cases in all my documents,
This is the query I tried to use:
db.coviddatajson.aggregate([
{ $group: { _id: null, total: { $sum: "$total_cases"} } }
])
For some reason the result is 0, which does not make sense: I expect it to be at least a few thousand.
This is the dataset I am using:
https://covid.ourworldindata.org/data/owid-covid-data.json
What am I doing wrong here?
Any ideas on how to fix this query?
The total_cases field is inside the data array, and $sum in the $group stage needs numeric values. So first compute the sum of data.total_cases within each document using $sum in a $project stage, then pass that per-document total to the $group stage to get the overall sum:
db.coviddatajson.aggregate([
  {
    $project: { total_cases: { $sum: "$data.total_cases" } }
  },
  {
    $group: {
      _id: null,
      total: { $sum: "$total_cases" }
    }
  }
])
The data set has some issues.
The document size is bigger than 16 MiB, and you cannot load documents larger than 16 MiB into MongoDB. This is an internal limitation. You would need to split the document into sub-documents.
The document contains data for each country but also summarized data for "World". Do you have to exclude the "World" data? Can you use it instead of a manual summary?
The data is not consistent. For example, some countries do not provide a number of male/female smokers or a median age. Not all countries provide all data for each date, so you may have missing values. How do you want to deal with them?
Do you want a simple sum of all total_cases? If yes, the query would be easy, but the result would be pointless (15'773'189'214 total cases, twice the population of the world).
I have a document (inside an aggregation, after the $group stage) which has an object (I could form an array if needed) with number values.
MongoPlayground example with data and my aggregate query available here.
And I want to make a new _id field during the next $project stage, consisting of these three number values, like:
item_id | unix time | pointer
_id: 453435-41464556645#1829
The problem is, that when I am trying to use $concat, the query returns me an error like:
$concat only supports strings, not int
So here is my question: is it possible to achieve such results? I have seen the relevant question MongoDB concatenate strings from two fields into a third field, but it didn't cover my case.
$concat only concatenates strings, but $_id.item_id contains an int value and $_id.last_modified a double value.
$toString converts a value to a string:
_id: {
  $concat: [
    {
      $toString: "$_id.item_id"
    },
    " - ",
    {
      $toString: "$_id.last_modified"
    }
  ]
}
Playground: https://mongoplayground.net/p/SSlXW4gIs_X
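A sketch extending the same idea to the three-part format from the question ("453435-41464556645#1829"). The field name _id.pointer is an assumption, since the question does not show it, and $toLong is used here on the assumption that the timestamp should be printed without a fractional part. buildIdStage and buildId are hypothetical names.

```javascript
// $project stage building "<item_id>-<unix time>#<pointer>".
// Assumed field names: _id.item_id, _id.last_modified, _id.pointer.
const buildIdStage = {
  $project: {
    _id: {
      $concat: [
        { $toString: "$_id.item_id" },
        "-",
        // $toLong drops any fractional part of the double before stringifying
        { $toString: { $toLong: "$_id.last_modified" } },
        "#",
        { $toString: "$_id.pointer" }
      ]
    }
  }
};

// Plain-JavaScript equivalent, for checking the expected shape of the result.
function buildId(id) {
  return (
    String(id.item_id) +
    "-" +
    String(Math.trunc(id.last_modified)) +
    "#" +
    String(id.pointer)
  );
}

console.log(buildId({ item_id: 453435, last_modified: 41464556645, pointer: 1829 }));
// 453435-41464556645#1829
```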
I am developing a financial application with Node.js. I wonder whether it is possible to compute a simple moving average (the average price over the last N days) directly in the Mongo shell, rather than reading the data and computing it in Node.js.
Document Sample.
[{code:'0001',price:0.10,date:'2014-07-04T00:00:00.000Z'},
{code:'0001',price:0.12,date:'2014-07-05T00:00:00.000Z'},
{code:'0001',price:0.13,date:'2014-07-06T00:00:00.000Z'},
{code:'0001',price:0.12,date:'2014-07-07T00:00:00.000Z'}]
If you have more than a trivial number of documents you should use the DB server to do the work rather than JS.
You don't say if you are using mongoose or the node driver directly. I'll assume you are using mongoose as that is the way most people are headed.
So your model would be:
// models/stocks.js
const mongoose = require("mongoose");
const conn = mongoose.createConnection('mongodb://localhost/stocksdb');
const StockSchema = new mongoose.Schema(
  {
    price: Number,
    code: String,
    date: Date,
  },
  { timestamps: true }
);
module.exports = conn.model("Stock", StockSchema, "stocks");
You rightly suggested that the aggregation framework would be a good way to go here. First, though, if we are returning values between date ranges, the dates in your database need to be Date objects. In your example documents they appear to be strings. An example of inserting documents with dates would be:
db.stocks.insertMany([
  {code:'0001',price:0.10,date:ISODate('2014-07-04T00:00:00.000Z')},
  {code:'0001',price:0.12,date:ISODate('2014-07-05T00:00:00.000Z')},
  {code:'0001',price:0.13,date:ISODate('2014-07-06T00:00:00.000Z')},
  {code:'0001',price:0.12,date:ISODate('2014-07-07T00:00:00.000Z')}
])
The aggregate function accepts an array with one or more pipeline stages.
The first pipeline stage we should use is $match. It filters the documents down to only the records we are interested in, which is important for performance:
{
  $match: {
    date: {
      $gte: new Date('2014-07-03'),
      $lte: new Date('2014-07-07')
    }
  }
}
This stage sends only the documents dated from 3 to 7 July 2014 inclusive to the next stage (in this case, all the example docs).
The next stage computes the average. We need to group the values together based on one field, multiple fields, or all documents.
As you don't specify a field you want to group by, I'll give an example that averages over all documents. For this we use the $group stage:
{
  $group: {
    _id: null,
    avg: {
      $avg: '$price'
    }
  }
}
This will take all the documents and display an average of all the prices.
In the case of your example documents this results in
{ _id: null, avg: 0.1175 }
Check the answer:
(0.10 + 0.12 + 0.12 + 0.13) / 4 = 0.1175
FYI: I wouldn't rely on calculations done in JavaScript for anything critical, as its Numbers are floating point. See https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html for more details if you are worried about that.
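To illustrate the floating-point caveat, here is a small sketch showing the classic rounding surprise and one common workaround, which is to do the arithmetic in integer minor units (cents). The helper toCents is hypothetical and assumes prices have at most two decimal places.

```javascript
// Binary floating point cannot represent most decimal fractions exactly.
console.log(0.1 + 0.2 === 0.3); // false

// Hypothetical helper: convert a price to an integer number of cents.
// Assumes prices carry at most two decimal places.
function toCents(price) {
  return Math.round(price * 100);
}

const prices = [0.10, 0.12, 0.13, 0.12];
const totalCents = prices.map(toCents).reduce((a, b) => a + b, 0);
console.log(totalCents); // 47
```

On the database side, MongoDB also offers a Decimal128 type (NumberDecimal in the shell) that may suit monetary values better than doubles.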
For completeness here is the full aggregation query
const Stock = require("./models/stocks");
Stock.aggregate([
  {
    $match: {
      date: {
        $gte: new Date('2014-07-03'),
        $lte: new Date('2014-07-07')
      }
    }
  },
  {
    $group: {
      _id: null,
      avg: {
        $avg: '$price'
      }
    }
  }
])
  .then(console.log)
  .catch(error => console.error(error))
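The query above yields a single average over the whole date range rather than a moving one. As a sketch only, on MongoDB 5.0 or later a rolling average can be computed server-side with the $setWindowFields stage. The 3-document window below is an arbitrary choice, and simpleMovingAverage is a hypothetical plain-JavaScript reference that mirrors what the window computes.

```javascript
// Requires MongoDB 5.0+. For each document, average the current price and
// the two preceding ones (a 3-document window), separately per stock code.
const movingAveragePipeline = [
  {
    $setWindowFields: {
      partitionBy: "$code",
      sortBy: { date: 1 },
      output: {
        sma: { $avg: "$price", window: { documents: [-2, 0] } }
      }
    }
  }
];

// Hypothetical plain-JavaScript reference: SMA over the last n values,
// using fewer values at the start of the series, as the window above does.
function simpleMovingAverage(values, n) {
  return values.map((_, i) => {
    const recent = values.slice(Math.max(0, i - n + 1), i + 1);
    return recent.reduce((a, b) => a + b, 0) / recent.length;
  });
}

console.log(simpleMovingAverage([1, 2, 3, 4], 3)); // [ 1, 1.5, 2, 3 ]
```

With the model above, the pipeline would be run as Stock.aggregate(movingAveragePipeline). A window of documents corresponds to N days only if there is exactly one document per day.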
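I'm not sure about your moving average formula, but here is how I would do it (note that this repeatedly averages the running value with the next price, so recent prices weigh more than in a simple moving average):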
var moving_average = null
db.test.find().forEach(function(doc) {
  if (moving_average==null) {
    moving_average = doc.price;
  }
  else {
    moving_average = (moving_average+doc.price)/2;
  }
})
output:
> moving_average
0.3
And if you want to define the N days to average over, just modify the argument for find (both bounds go in a single "date" condition; a repeated key would overwrite the first):
db.test.find({ "date": { $gt: "2014-07-07T00:00:00.000Z", $lt: "2014-07-10T00:00:00.000Z" } })
And if you want to do the above shell code in one-line, you can assume that moving_average is undefined and just check for that before assigning the first value.
I have a mongo collection in which the documents have a field that is an array. I want to be able to publish everything in the documents except for the elements in the array that were created more than a day ago. I suspect the answer will be somewhat similar to this question.
Meteor publication: Hiding certain fields in an array document field?
Instead of limiting fields in the array, I just want to limit the elements in the array being published.
Thanks in advance for any responses!
EDIT
Here is an example document:
{
  _id: 123456,
  name: "Unit 1",
  createdAt: (datetime object),
  settings: *some stuff*,
  packets: [
    {
      _id: 32412312,
      temperature: 70,
      createdAt: *datetime object from today*
    },
    {
      _id: 32412312,
      temperature: 70,
      createdAt: *datetime from yesterday*
    }
  ]
}
I want to get everything in this document except for the part of the array that was created more than 24 hours ago. I know I can accomplish this by moving the packets into their own collection and tying them together with keys as in a relational database but if what I am asking were possible, this would be simpler with less code.
You could do something like this in your publish method:
Meteor.publish("pubName", function () {
  var self = this;
  var deadline = new Date(Date.now() - 86400000); // 24 hours ago
  // change this query to return your data
  Collection.find().forEach(function (doc) {
    // keep only the packets created within the last 24 hours
    doc.packets = _.filter(doc.packets, function (packet) {
      return packet.createdAt >= deadline;
    });
    // a publish function cannot return a plain array, so add documents manually;
    // "collectionName" is a placeholder for the client-side collection name
    self.added("collectionName", doc._id, doc);
  });
  self.ready();
});
Though you might be better off storing the last 24 hours worth of packets as a separate array in your document. Would probably be less taxing on the server, not sure.
Also, code above is untested. Good luck.
You can use the $elemMatch projection:
http://docs.mongodb.org/manual/reference/operator/projection/elemMatch/
So in your case, it would be
var today = new Date();
var yesterday = new Date(today);
yesterday.setDate(today.getDate() - 1);
collection.find({}, // find anything or something specific
  {
    fields: {
      packets: {
        // match packets whose createdAt is later than yesterday
        $elemMatch: { createdAt: { $gt: yesterday /* or some new Date() */ } }
      }
    }
  });
However, $elemMatch only returns the FIRST element matching your condition. To return more than 1 element, you need to use the aggregation framework, which will be more efficient than _.each or forEach, particularly if you have a large array to loop through.
collection.rawCollection().aggregate([
  {
    $match: {}
  },
  {
    $redact: {
      $cond: {
        if: { $or: [{ $gt: ["$createdAt", yesterday] }, "$packets"] },
        then: "$$DESCEND",
        else: "$$PRUNE"
      }
    }
  }
], function (error, result) {
});
You specify the $match in a way similar to find({}). Then all the documents that match your conditions get piped into the $redact stage, whose behaviour is specified by the $cond.
$redact scans the document from top level to bottom. At the top level, you have _id, name, createdAt, settings, packets; hence {$or: [***,"$packets"]}
The presence of $packets in the $or allows the $redact to scan the second level, which contains _id, temperature and createdAt; hence {$gt: ["$createdAt",yesterday]}
This is async; you can use Meteor.wrapAsync to wrap the function.
Hope this helps.
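As an alternative sketch (a different technique from the $redact approach above), the $filter array operator, available since MongoDB 3.2, keeps only the array elements that satisfy a condition. The $addFields stage used here needs MongoDB 3.4 or later. The names recentPacketsPipeline and keepRecentPackets are hypothetical; the helper is a plain-JavaScript equivalent for a single document.

```javascript
// Keep only packets created within the last 24 hours.
const yesterday = new Date(Date.now() - 24 * 60 * 60 * 1000);

const recentPacketsPipeline = [
  {
    $addFields: {
      packets: {
        $filter: {
          input: "$packets",
          as: "packet",
          cond: { $gt: ["$$packet.createdAt", yesterday] }
        }
      }
    }
  }
];

// Plain-JavaScript equivalent of the $filter stage for a single document.
// Returns a copy; the original document is left untouched.
function keepRecentPackets(doc, cutoff) {
  return Object.assign({}, doc, {
    packets: doc.packets.filter((p) => p.createdAt > cutoff)
  });
}
```

All other fields of the document pass through unchanged, and every matching packet is kept rather than only the first one.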
I have a collection of documents that have a value that is known to be a number, but is stored as a string. It is out of my control to change the type of the field, but I want to use that field in an aggregation (say, to average it).
It seems that I should be using a projection prior to grouping, and in that projection convert the field as needed. I can't seem to get the syntax just right - everything I try either gives me NaN, or the new field is simply missing from the next step in the aggregation.
$project: {
value: '$value',
valueasnumber: ????
}
Given the very simple example above, where the contents of $value in all documents are string type, but will parse to a number, what do I do to make valueasnumber a new (non-existing) field that is of type double with the parsed version of $value in it?
I've tried things like the examples below (and about a dozen similar things):
{ $add: new Number('$value').valueOf() }
new Number('$value').valueOf()
Am I barking up the wrong tree entirely? Any help would be greatly appreciated!
(To be 100% clear, below is how I would like to use the new field).
$group: {
  _id: null,
  score: {
    $avg: '$valueasnumber'
  }
}
One way I can think of is to use mongo shell JavaScript to add a new numeric field, valueasnumber (the numeric conversion of the existing string field 'value'), to the existing documents or to new documents, and then use this numeric field for further calculations.
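db.numbertest.find().forEach(function(doc) {
  // parseFloat stores a double, as the question asks, rather than a 32-bit integer
  db.numbertest.updateOne(
    { _id: doc._id },
    { $set: { valueasnumber: parseFloat(doc.value) } }
  );
});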
Using the valueasnumber field for numeric calculation
db.numbertest.aggregate([
  {
    $group: {
      _id: null,
      "score": { $avg: "$valueasnumber" }
    }
  }
]);
The core operation is converting the value from string to number, which the aggregation pipeline could not do at the time of writing.
mapReduce is an alternative, as shown below.
db.c.mapReduce(function() {
  emit( this.groupId, {score: Number(this.value), count: 1} );
}, function(key, values) {
  var score = 0, count = 0;
  for (var i = 0; i < values.length; i++) {
    score += values[i].score;
    count += values[i].count;
  }
  return {score: score, count: count};
}, {finalize: function(key, value) {
  return {score: value.score / value.count};
}, out: {inline: 1}});
Since MongoDB 4.0 the aggregation framework has conversion operators such as $toInt and $toDouble; see:
https://jira.mongodb.org/browse/SERVER-11400
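A minimal sketch of how the conversion operators fit the original question, assuming MongoDB 4.0 or later and that every value parses as a number. The name averagePipeline is made up for illustration.

```javascript
// Requires MongoDB 4.0+. Convert the string field to a double, then average it.
// Assumes every "value" string parses as a number; $toDouble errors otherwise.
const averagePipeline = [
  { $project: { value: 1, valueasnumber: { $toDouble: "$value" } } },
  { $group: { _id: null, score: { $avg: "$valueasnumber" } } }
];

console.log(JSON.stringify(averagePipeline));
```

It would be run as db.numbertest.aggregate(averagePipeline). If some strings may not parse, the more general $convert operator with its onError option is worth a look.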