Mongo equivalent of SQL query

I need to build a Mongo query that returns results from a collection with the same structure as the table used in the following SQL.
My SQL query:
SELECT * FROM (
SELECT
db.date,
db.points,
db.type,
db.name,
db.rank,
YEARWEEK( db.date ) AS year_week
FROM _MyDatabase db
WHERE
db.personId = 100 AND
db.date BETWEEN '2012-10-01' AND '2015-09-30'
ORDER BY
YEARWEEK( db.date ),
db.type,
db.points DESC
) x
GROUP BY
x.year_week DESC,
x.type;
The result looks like this:
date points type name rank year_week
-------------------------------------------------
23.10.2014 2000 1 Fish 2 201442
12.10.2014 2500 1 Fish 2 201441
16.10.2014 800 2 Fish 2 201441
I have tried different group / aggregate queries so far, but I couldn't get a similar result. Hopefully one of you has more Mongo experience than I do and can give me a hint on how to solve this.

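You would want something like the pipeline below. Two details matter: $week on its own only returns the week number (0-53), so the year is folded in to approximate YEARWEEK (the two can still disagree for the first days of a year), and $group only returns the fields you accumulate, so $first is used to keep the top row of each sorted group:
var start = new Date(2012, 9, 1),
    end = new Date(2015, 8, 30),
    pipeline = [
        {
            "$match": {
                "personId": 100,
                "date": { "$gte": start, "$lte": end }
            }
        },
        {
            "$project": {
                "date": 1, "points": 1, "type": 1, "name": 1, "rank": 1,
                // year * 100 + week number, e.g. 201442
                "year_week": {
                    "$add": [
                        { "$multiply": [ { "$year": "$date" }, 100 ] },
                        { "$week": "$date" }
                    ]
                }
            }
        },
        {
            "$sort": { "year_week": 1, "type": 1, "points": -1 }
        },
        {
            "$group": {
                "_id": { "year_week": "$year_week", "type": "$type" },
                // first document of each sorted group, i.e. the one with the most points
                "date": { "$first": "$date" },
                "points": { "$first": "$points" },
                "name": { "$first": "$name" },
                "rank": { "$first": "$rank" }
            }
        },
        // mirrors GROUP BY x.year_week DESC, x.type
        { "$sort": { "_id.year_week": -1, "_id.type": 1 } }
    ];
db.getCollection("_MyDatabase").aggregate(pipeline);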
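The intent of the SQL (the highest-scoring row per week and type) can be checked without a database. The following is a minimal sketch in plain JavaScript, not part of the original answer. The sample rows are made up for illustration (three of them echo the result table in the question), and weekOfYear, yearWeek and topPerWeekAndType are hypothetical helper names. The week number assumes the $week convention (weeks start on Sunday, days before the first Sunday are week 0) evaluated in UTC.

```javascript
// Week number following the $week convention: weeks start on Sunday and
// days before the year's first Sunday belong to week 0 (computed in UTC).
function weekOfYear(date) {
  const startOfYear = Date.UTC(date.getUTCFullYear(), 0, 1);
  const dayOfYear = Math.floor((date.getTime() - startOfYear) / 86400000);
  return Math.floor((dayOfYear + 7 - date.getUTCDay()) / 7);
}

// Rough stand-in for YEARWEEK(): year * 100 + week number.
function yearWeek(date) {
  return date.getUTCFullYear() * 100 + weekOfYear(date);
}

// Sort by week, type and descending points, keep the first row of each
// (year_week, type) group, then order the groups for output.
function topPerWeekAndType(rows) {
  const sorted = rows
    .map((row) => ({ ...row, year_week: yearWeek(row.date) }))
    .sort((a, b) => a.year_week - b.year_week || a.type - b.type || b.points - a.points);
  const firstPerGroup = new Map();
  for (const row of sorted) {
    const key = row.year_week + "|" + row.type;
    if (!firstPerGroup.has(key)) firstPerGroup.set(key, row);
  }
  return [...firstPerGroup.values()].sort(
    (a, b) => b.year_week - a.year_week || a.type - b.type
  );
}

// Hypothetical sample rows; the 1200-point row should lose to the 2500-point one.
const sampleRows = [
  { date: new Date("2014-10-12T00:00:00Z"), points: 2500, type: 1 },
  { date: new Date("2014-10-13T00:00:00Z"), points: 1200, type: 1 },
  { date: new Date("2014-10-16T00:00:00Z"), points: 800, type: 2 },
  { date: new Date("2014-10-23T00:00:00Z"), points: 2000, type: 1 },
];
const topRows = topPerWeekAndType(sampleRows);
console.log(topRows.map((r) => [r.year_week, r.type, r.points]));
```

Keeping the first row of a sorted group is the same idea as sorting before a $group stage and accumulating with $first.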

Related

How to calculate percentage using MongoDB aggregation

I want to calculate a percentage with the help of MongoDB aggregation.
My collection has the following data:
subject_id  gender   other_data
-------------------------------
1           Male     XYZ
1           Male     ABC
1           Male     LMN
2           Female   TBZ
3           Female   NDA
4           Unknown  UJY
I want output something like this:
[{
gender: 'Male',
total: 1,
percentage: 25.0
},{
gender: 'Female',
total: 2,
percentage: 50.0
},{
gender: 'Unknown',
total: 1,
percentage: 25.0
}]
I have tried various methods, but none of them works. The main problem is that I can't count the totals for Male, Female and Unknown together with their sum (needed to calculate the percentage). The trickiest part is that there are only 4 members in the example above, but a subject_id may be repeated once for each other_data value.
Thanks in Advance.
You can use this aggregation query:
First, $group by subject_id so that each distinct value (each person) is counted once.
Then use $facet to run two sub-pipelines over the same input: one uses $count to get the total number of documents, and the other groups the documents by gender.
$facet returns the output of each sub-pipeline as an array, so the count we want sits in the first position of nDocs. Use $arrayElemAt in an $addFields stage to pull it out.
Next, use $unwind on groupValues so that each gender group becomes its own document alongside the nDocs value.
Finally, output the values you want using $project. To get the percentage, $divide total by nDocs and $multiply by 100.
db.collection.aggregate([
{
"$group": {
"_id": "$subject_id",
"gender": {
"$first": "$gender"
}
}
},
{
"$facet": {
"nDocs": [
{
"$count": "nDocs"
},
],
"groupValues": [
{
"$group": {
"_id": "$gender",
"total": {
"$sum": 1
}
}
},
]
}
},
{
"$addFields": {
"nDocs": {
"$arrayElemAt": [
"$nDocs",
0
]
}
}
},
{
"$unwind": "$groupValues"
},
{
"$project": {
"_id": 0,
"gender": "$groupValues._id",
"total": "$groupValues.total",
"percentage": {
"$multiply": [
{
"$divide": [
"$groupValues.total",
"$nDocs.nDocs"
]
},
100
]
}
}
}
])
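The same steps can be traced in plain JavaScript to confirm the expected numbers. This is only an illustrative sketch, not the aggregation itself: genderPercentages is a hypothetical helper, and the rows are copied from the table in the question.

```javascript
// Rows copied from the table in the question.
const rows = [
  { subject_id: 1, gender: "Male", other_data: "XYZ" },
  { subject_id: 1, gender: "Male", other_data: "ABC" },
  { subject_id: 1, gender: "Male", other_data: "LMN" },
  { subject_id: 2, gender: "Female", other_data: "TBZ" },
  { subject_id: 3, gender: "Female", other_data: "NDA" },
  { subject_id: 4, gender: "Unknown", other_data: "UJY" },
];

function genderPercentages(docs) {
  // Step 1: one entry per subject_id, keeping the first gender seen
  // (the role of $group with $first).
  const genderBySubject = new Map();
  for (const doc of docs) {
    if (!genderBySubject.has(doc.subject_id)) {
      genderBySubject.set(doc.subject_id, doc.gender);
    }
  }
  // Step 2: total number of distinct subjects (the $count branch of $facet).
  const nDocs = genderBySubject.size;
  // Step 3: number of subjects per gender (the $group branch of $facet).
  const totals = new Map();
  for (const gender of genderBySubject.values()) {
    totals.set(gender, (totals.get(gender) || 0) + 1);
  }
  // Step 4: one output object per gender (the role of $unwind and $project).
  return [...totals].map(([gender, total]) => ({
    gender,
    total,
    percentage: (total / nDocs) * 100,
  }));
}

const percentages = genderPercentages(rows);
console.log(percentages);
```

With the sample rows this gives one Male, two Female and one Unknown subject out of four, which matches the output requested in the question.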

How to find documents with child collection containing only certain value

I have the following JSON structure:
{
"id": "5cea8bde0c80ee2af9590e7b",
"name": "Sofitel",
"pricePerNight": 88,
"address": {
"city": "Rome",
"country": "Italy"
},
"reviews": [
{
"userName": "John",
"rating": 10,
"approved": true
},
{
"userName": "Marry",
"rating": 7,
"approved": true
}
]
}
I want to find a list of similar documents where ALL rating values of the reviews meet a certain criterion, e.g. less than 8. The document above wouldn't qualify, as one of the reviews has rating 10.
With Querydsl, using the following expression, I still get that document:
BooleanExpression filterByRating = qHotel.reviews.any().rating.lt(8);
You can use $filter and $match to filter out the documents that you don't need. The following query should do it:
Note: The cond in the $filter is the opposite of your criterion. Since you want only ratings less than 8, the filter looks for ratings greater than or equal to 8.
db.qHotel.aggregate([
{
$addFields: {
tempReviews: {
$filter: {
input: "$reviews",
as: "review",
cond: { $gte: [ "$$review.rating", 8 ] } // Opposite of < 8, which is >= 8
}
}
}
},
{
$match : {
tempReviews : [] // This will exclude the documents for which there is at least one review with review.rating >= 8
}
}
]);
The final result will contain an empty array field named tempReviews; you can use $project to remove it.
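The difference between "any review matches" and "all reviews match" is easy to see in plain JavaScript. The sketch below is an illustration only: the first hotel is taken from the question, and the second one ("Hypothetical Inn") is made up. Array.prototype.some plays the role of Querydsl's any(), while the empty-filter check mirrors the $filter plus $match approach.

```javascript
const hotels = [
  {
    name: "Sofitel",
    reviews: [
      { userName: "John", rating: 10 },
      { userName: "Marry", rating: 7 },
    ],
  },
  {
    // Made-up hotel whose reviews are all below 8.
    name: "Hypothetical Inn",
    reviews: [
      { userName: "A", rating: 5 },
      { userName: "B", rating: 7 },
    ],
  },
];

// "any" semantics: at least one review below 8, so Sofitel still matches.
const anyBelow8 = hotels.filter((hotel) =>
  hotel.reviews.some((review) => review.rating < 8)
);

// "all" semantics: collect the offending reviews (rating >= 8) and keep
// only hotels where that list is empty.
const allBelow8 = hotels.filter(
  (hotel) => hotel.reviews.filter((review) => review.rating >= 8).length === 0
);

console.log(anyBelow8.map((h) => h.name));
console.log(allBelow8.map((h) => h.name));
```

Note that a hotel with an empty reviews array also passes the "all" check, both here and in the pipeline, because there is no review that violates the condition.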

Mongo 3.2 query timeseries value at specific time

I have some timeseries data stored in Mongo with one document per account, like so:
{
"account_number": 123,
"times": [
datetime(2017, 1, 2, 12, 34, 56),
datetime(2017, 3, 4, 17, 18, 19),
datetime(2017, 3, 11, 0, 1, 11),
],
"values": [
1,
10,
9001,
]
}
So, to be clear in the above representation account 123 has a value of 1 from 2017-01-02 12:34:56 until it changes to 10 on 2017-03-04 17:18:19, which then changes to 9001 at 2017-03-11, 00:01:11.
There are many accounts and each account's data is all different (could be at different times and could have more or fewer value changes than other accounts).
I'd like to query for each user's value at a given time, e.g. "What was each user's value at 2017-01-30 02:03:04?" This would return 1 for the above account, as it was set to 1 before the given time and did not change until after the given time.
It looks like $zip would be useful, but that's only available in Mongo 3.4, and I'm using 3.2 with no plans to upgrade soon.
Edit:
I can get a small part of the way there using:
> db.account_data.aggregate([{$unwind: '$times'}, {$unwind: '$values'}])
which returns something like:
{"account_number": 123, "times": datetime(2017, 1, 2, 12, 34, 56), "values": 1},
{"account_number": 123, "times": datetime(2017, 1, 2, 12, 34, 56), "values": 10},
#...
which isn't quite right as it is returning the cross product of times/values
This is possible using only 3.2 features. I tested with the Mingo library
var mingo = require('mingo')
var data = [{
"account_number": 123,
"times": [
new Date("2017-01-02T12:34:56"),
new Date("2017-03-04T17:18:19"),
new Date("2017-03-11T00:01:11")
],
"values": [1, 10, 9001]
}]
var maxDate = new Date("2017-01-30T02:03:04")
// 1. filter dates down to those less or equal to the maxDate
// 2. take the size of the filtered date array
// 3. subtract 1 from the size to get the index of the corresponding value
// 4. lookup the value by index in the "values" array into new "valueAtDate" field
// 5. project the extra fields
var result = mingo.aggregate(data, [{
$project: {
valueAtDate: {
$arrayElemAt: [
"$values",
{ $subtract: [ { $size: { $filter: { input: "$times", as: "time", cond: { $lte: [ "$$time", maxDate ] }} } }, 1 ] }
]
},
values: 1,
times: 1
}
}])
console.log(result)
// Outputs
[ { valueAtDate: 1,
values: [ 1, 10, 9001 ],
times:
[ 2017-01-02T12:34:56.000Z,
2017-03-04T17:18:19.000Z,
2017-03-11T00:01:11.000Z ] } ]
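The index arithmetic in the numbered steps can be written as a small plain-JavaScript function. This is a sketch under two assumptions: the times array is sorted in ascending order, and valueAt is a hypothetical helper name. One caveat is handled explicitly: when the requested time is earlier than every stored time the computed index is -1, and $arrayElemAt treats a negative index as counting from the end of the array, so the pipeline would return the last value rather than nothing.

```javascript
// Returns the value that was current at `when`, or undefined if the
// account had no value yet. Assumes doc.times is sorted ascending.
function valueAt(doc, when) {
  // Steps 1 and 2: count the timestamps that are not after `when`.
  const seen = doc.times.filter((t) => t.getTime() <= when.getTime()).length;
  // Step 3: the last of those timestamps has index seen - 1.
  // Guard the "nothing seen yet" case instead of reading index -1.
  if (seen === 0) return undefined;
  // Step 4: look up the value stored at the same index.
  return doc.values[seen - 1];
}

const account = {
  account_number: 123,
  times: [
    new Date("2017-01-02T12:34:56Z"),
    new Date("2017-03-04T17:18:19Z"),
    new Date("2017-03-11T00:01:11Z"),
  ],
  values: [1, 10, 9001],
};

const atJan30 = valueAt(account, new Date("2017-01-30T02:03:04Z"));
const atLastChange = valueAt(account, new Date("2017-03-11T00:01:11Z"));
const beforeFirst = valueAt(account, new Date("2016-12-31T00:00:00Z"));
console.log(atJan30, atLastChange, beforeFirst);
```

In the aggregation itself, a similar guard could be added with $cond around the $arrayElemAt expression.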
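Not sure how to do the same with MongoDB 3.2; however, from 3.4 you can use $indexOfArray when the time you ask for is exactly one of the stored times. The search value has to be a date rather than a string, and a time that is not found yields index -1:
db.test.aggregate([
    {
        $project: {
            index: { $indexOfArray: [ "$times", ISODate("2017-03-11T00:01:11Z") ] },
            values: true
        }
    },
    {
        $project: {
            resultValue: { $arrayElemAt: [ "$values", "$index" ] }
        }
    }
])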

MongoDB calculating score from existing fields and putting it into a new field in the same collection

I'm working on Mongodb and I have one collection, let's say Collection1.
I have to calculate a score from existing fields in Collection1, and put the result into a new field Field8 in Collection1.
Collection1 :
db.Collection1.find().pretty().limit(2) {
"_id": ObjectId("5717a5d4578f3f2556f300f2"),
"Field1": "XXXX",
"Field2": 0,
"Field3": 169,
"Field4": 230,
"Field5": "...4.67", // This field refer to days in a week
"Field6": "ZZ",
"Field7": "LO"
}, {
"_id": ObjectId("17a5d4575f300f278f3f2556"),
"Field1": "YYYY",
"Field2": 1,
"Field3": 260,
"Field4": 80,
"Field5": "1.3....", // This field refer to days in a week
"Field6": "YY",
"Field7": "PK"
}
So I have to do some calculations on the fields of my collection with the following formula, but I don't know how to proceed:
Score = C1*C2*C3*C4
C1 = 10 + 0.03*field3
C2 = 1 if field2 equals 1, or 0.03 if field2 equals 0
C3 = 1, 2, ... or 7, depending on field5; for example, "Field5": "...4.67" should give 3, meaning three days per week
C4 = 1 if field2 equals 0, or field4^-0.6 if field2 equals 1
After calculating this score I should put it in new field Field8 in my Collection1 and get something just like this :
db.Collection1.find().pretty().limit(2) {
"_id": ObjectId("5717a5d4578f3f2556f300f2"),
"Field1": "XXXX",
"Field2": 0,
"Field3": 169,
"Field4": 230,
"Field5": "...4.67", // This field refer to days in a week
"Field6": "ZZ",
"Field7": "LO",
"Field8": Score // My calculated score
}, {
"_id": ObjectId("17a5d4575f300f278f3f2556"),
"Field1": "YYYY",
"Field2": 1,
"Field3": 260,
"Field4": 80,
"Field5": "1.3....", // This field refer to days in a week
"Field6": "YY",
"Field7": "PK",
"Field8": Score // My calculated score
}
How can I achieve the above?
Depending on your application's needs, you can use the aggregation framework to calculate the score and bulkWrite() to update your collection. The example below uses the $project pipeline stage with the arithmetic operators to do the score calculation.
C3 in your question is a number from 1 to 7 that equals exactly 7 minus the number of dots (.) in Field5. The only feasible approach I can think of is to store that value in an extra field before running the aggregation, so your first step is to create that field with bulkWrite() as follows:
Step 1: Modify the schema to accommodate an extra daysInWeek field
var counter = 0, bulkUpdateOps = [];
db.Collection1.find({
    "Field5": { "$exists": true }
}).forEach(function(doc) {
    // count the literal dots in Field5; the dot must be escaped,
    // because an unescaped "." matches every character
    var points, daysInWeek;
    points = (doc.Field5.match(/\./g) || []).length;
    daysInWeek = 7 - points;
    bulkUpdateOps.push({
        "updateOne": {
            "filter": { "_id": doc._id },
            "update": {
                "$set": { "daysInWeek": daysInWeek }
            }
        }
    });
    counter++;
    // send the queued updates in batches of 500
    if (counter % 500 == 0) {
        db.Collection1.bulkWrite(bulkUpdateOps);
        bulkUpdateOps = [];
    }
});
// flush whatever is left over from the last partial batch
if (counter % 500 != 0) { db.Collection1.bulkWrite(bulkUpdateOps); }
Ideally, the above operation could also calculate the other constants in your question and so create Field8 directly. However, I believe computations like this should be done on the client, letting MongoDB do what it does best on the server.
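The dot counting behind daysInWeek is worth testing on its own, because the regular expression is easy to get wrong. The sketch below is plain JavaScript with a hypothetical helper name, countDaysInWeek, and it uses the two Field5 values from the question. It also shows what an unescaped dot pattern would count.

```javascript
// Number of active days: 7 minus the number of literal dots in Field5.
function countDaysInWeek(field5) {
  const dots = (field5.match(/\./g) || []).length;
  return 7 - dots;
}

const daysFirstDoc = countDaysInWeek("...4.67");
const daysSecondDoc = countDaysInWeek("1.3....");

// An unescaped "." matches any character, so this counts the whole string.
const unescapedCount = ("...4.67".match(new RegExp(".", "g")) || []).length;

console.log(daysFirstDoc, daysSecondDoc, unescapedCount);
```

Because the unescaped pattern counts every character of a 7-character string, it would always produce 7 - 7 = 0 days, which in turn would make every score zero.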
Step 2: Use aggregate to add Field8 field
Having created the extra daysInWeek field, you can construct an aggregation pipeline that projects the new variables using the arithmetic operators (again, I would recommend doing such computations in the application layer). The final projection is the product of the computed fields; you can then iterate the aggregate result cursor and add Field8 to each document in the collection:
var pipeline = [
{
"$project": {
"C1": {
"$add": [
10,
{ "$multiply": [ "$Field3", 0.03 ] }
]
},
"C2": {
"$cond": [
{ "$eq": [ "$Field2", 1 ] },
1,
0.03
]
},
"C3": "$daysInWeek",
"C4": {
"$cond": [
{ "$eq": [ "$Field2", 1 ] },
{ "$pow": [ "$Field4", -0.6 ] },
1
]
}
}
},
{
"$project": {
"Field8": { "$multiply": [ "$C1", "$C2", "$C3", "$C4" ] }
}
}
],
counter = 0,
bulkUpdateOps = [];
db.Collection1.aggregate(pipeline).forEach(function(doc) {
    bulkUpdateOps.push({
        "updateOne": {
            "filter": { "_id": doc._id },
            "update": {
                "$set": { "Field8": doc.Field8 }
            }
        }
    });
    counter++;
    if (counter % 500 == 0) {
        db.Collection1.bulkWrite(bulkUpdateOps);
        bulkUpdateOps = [];
    }
});
if (counter % 500 != 0) { db.Collection1.bulkWrite(bulkUpdateOps); }
For MongoDB >= 2.6 and <= 3.0, use the Bulk Operations API: iterate the collection with the cursor's forEach() method and update each document.
Some of the arithmetic operators from the above aggregation pipeline are not available in MongoDB >= 2.6 and <= 3.0 so you will need to do the computations within the forEach() iteration.
Use the bulk API to reduce server write requests by bundling each update in bulk and sending to the server only once in every 500 documents in the collection for processing:
var bulkUpdateOps = db.Collection1.initializeUnorderedBulkOp(),
    cursor = db.Collection1.find(), // cursor
    counter = 0;
cursor.forEach(function(doc) {
    // computations
    var c1, c2, c3, c4, Field8;
    c1 = 10 + (0.03 * doc.Field3);
    c2 = (doc.Field2 == 1) ? 1 : 0.03;
    // count literal dots; an unescaped "." would match every character
    c3 = 7 - (doc.Field5.match(/\./g) || []).length;
    c4 = (doc.Field2 == 1) ? Math.pow(doc.Field4, -0.6) : 1;
    Field8 = c1 * c2 * c3 * c4;
    bulkUpdateOps.find({ "_id": doc._id }).updateOne({
        "$set": { "Field8": Field8 }
    });
    counter++; // needed so that updates are sent in batches of 500
    if (counter % 500 == 0) {
        bulkUpdateOps.execute();
        bulkUpdateOps = db.Collection1.initializeUnorderedBulkOp();
    }
});
if (counter % 500 != 0) { bulkUpdateOps.execute(); }
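Since the score is plain arithmetic, it can be checked outside the database before any document is updated. The following sketch assumes the formula exactly as stated in the question. computeScore is a hypothetical helper, the two documents reuse the field values from the question, and the dots are counted with split here simply as an alternative to a regular expression.

```javascript
// Score = C1 * C2 * C3 * C4, following the formula in the question.
function computeScore(doc) {
  const c1 = 10 + 0.03 * doc.Field3;
  const c2 = doc.Field2 === 1 ? 1 : 0.03;
  // Days per week: 7 minus the number of dots in Field5.
  const c3 = 7 - (doc.Field5.split(".").length - 1);
  const c4 = doc.Field2 === 1 ? Math.pow(doc.Field4, -0.6) : 1;
  return c1 * c2 * c3 * c4;
}

// Field values taken from the two sample documents in the question.
const firstDoc = { Field2: 0, Field3: 169, Field4: 230, Field5: "...4.67" };
const secondDoc = { Field2: 1, Field3: 260, Field4: 80, Field5: "1.3...." };

const firstScore = computeScore(firstDoc);   // 15.07 * 0.03 * 3 * 1, roughly 1.3563
const secondScore = computeScore(secondDoc); // 17.8 * 1 * 2 * 80^-0.6
console.log(firstScore, secondScore);
```

Comparing a few hand-computed values like these against the stored Field8 is a cheap way to confirm that the bulk update wrote what was intended.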
Just write a function that returns your calculated value and call it in your MongoDB update query, like this:
var cal = function(row) {
    return row.Field1 + row.Field2 * row.Field3; // use your formula according to your requirements
};
var rows = db.Collection1.find(); // can use your filter
rows.forEach(function(row) { db.Collection1.update({ "_id": row._id }, { $set: { "Field8": cal(row) } }); });

Criteria Morphia MongoDB

I have a collection like this:
{
"_id": {
"$oid": "53f34ef8ec10d6fa97dcc34b"
},
"editions": [
{
"number": 1,
...
},
{
"number": 2,
...
},
...
]
}
I want to filter the results of my query by some number.
I tried
criterias.add(query.criteria("editions.number").equal(paramNumber));
And
query.filter("editions.number =", paramNumber)
However, I just received the whole collection when I passed paramNumber equal to 2. What I want to receive is the following result:
{
"_id": {
"$oid": "53f34ef8ec10d6fa97dcc34b"
},
"editions": [
{
"number": 2,
...
}
]
}
What am I doing wrong?
You can't receive partial arrays like that. You'll get back the full document/object, or just the fields you've specified in a projection.
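One way to read this answer is that a filter on editions.number only decides which documents match; it does not trim the array inside them. The plain-JavaScript sketch below models that behaviour and shows the array being trimmed in a separate step by the caller. It is not Morphia code: the documents are made up, and findByEditionNumber and keepOnlyEdition are hypothetical helper names.

```javascript
// Made-up documents shaped like the one in the question.
const docs = [
  { _id: "a", editions: [{ number: 1 }, { number: 2 }] },
  { _id: "b", editions: [{ number: 3 }] },
];

// Models the query filter: a document matches if any edition has the
// number, and matching documents come back whole.
function findByEditionNumber(collection, number) {
  return collection.filter((doc) =>
    doc.editions.some((edition) => edition.number === number)
  );
}

// Separate, caller-side step that keeps only the wanted editions.
function keepOnlyEdition(doc, number) {
  return {
    ...doc,
    editions: doc.editions.filter((edition) => edition.number === number),
  };
}

const matched = findByEditionNumber(docs, 2);
const trimmed = matched.map((doc) => keepOnlyEdition(doc, 2));
console.log(JSON.stringify(matched));
console.log(JSON.stringify(trimmed));
```

Here matched still contains both editions of document "a", while trimmed contains only edition 2.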