In MongoDB I have a collection that looks like:
{ low: 1, high: 5 },
{ low: 6, high: 15 },
{ low: 16, high: 412 },
...
I have input that's an array of integers:
[ 4, 16, ...]
I want to find all the documents in the collection which have values included in the range depicted by low and high. In this example it would pick the first and third documents.
I've found lots of Q&A here on how to filter using a single value as the input but could not find one that included an array as input. It could be that my search failed me and that this has been answered.
Update: I should have mentioned that I'm constructing this query in an application and not running this in the CLI. Given that flexibility what if I create a $or query with each of the inputs? Something like:
$or: [{
high: { $gte: 4 },
low: { $lte: 4 },
}, {
high: { $gte: 16 },
low: { $lte: 16 },
},
...
]
It could be massive and have thousands of elements in the $or.
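A minimal sketch of building that $or filter in application code, assuming the inputs arrive as a plain array of numbers. The helper name buildRangeFilter is hypothetical; the object it returns is what would be passed to find().

```javascript
// Hypothetical helper: builds one { low <= v <= high } clause per input value.
function buildRangeFilter(values) {
  // De-duplicate first so repeated inputs do not inflate the $or list.
  const unique = [...new Set(values)];
  // Note: the server rejects an empty $or array, so callers should skip
  // the query entirely when there are no inputs.
  return {
    $or: unique.map((v) => ({
      low: { $lte: v },
      high: { $gte: v },
    })),
  };
}

const filter = buildRangeFilter([4, 16, 4]);
console.log(JSON.stringify(filter));
// With a driver this would be used as: collection.find(filter)
```

With thousands of inputs the filter grows linearly, so de-duplicating before building it is a cheap way to keep it smaller.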
You can use $anyElementTrue along with $map to check if any value is included within a range defined in your documents:
db.collection.find({
$expr: {
$anyElementTrue: {
$map: {
input: [ 4, 16 ],
in: {
$and: [
{ $gte: [ "$$this", "$low" ] },
{ $lte: [ "$$this", "$high" ] },
]
}
}
}
}
})
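Since the query is built in an application, the input array can be injected into the same $expr shape. The sketch below assumes numeric inputs; buildAnyInRangeFilter and matchesAny are hypothetical names, and matchesAny is only a plain-JavaScript mirror of the condition for checking expectations locally, not something MongoDB runs.

```javascript
// Hypothetical helper: wraps an input array in the $expr filter shown above.
function buildAnyInRangeFilter(values) {
  return {
    $expr: {
      $anyElementTrue: {
        $map: {
          input: values,
          in: {
            $and: [
              { $gte: ["$$this", "$low"] },
              { $lte: ["$$this", "$high"] },
            ],
          },
        },
      },
    },
  };
}

// Local mirror of the same condition: true if any value falls in [low, high].
function matchesAny(doc, values) {
  return values.some((v) => v >= doc.low && v <= doc.high);
}

const docs = [
  { low: 1, high: 5 },
  { low: 6, high: 15 },
  { low: 16, high: 412 },
];
const matched = docs.filter((d) => matchesAny(d, [4, 16]));
console.log(matched); // the first and third documents
```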
//working code from Mongo Shell CLI 4.2.6 on windows 10
//you can use forEach and loop through for comparison if a value exists between two numbers
> print("MongoDB",db.version());
MongoDB 4.2.6
> db.lhColl.find();
{ "_id" : ObjectId("5f889258f3b30cd04c8a78e5"), "low" : 1, "high" : 5 }
{ "_id" : ObjectId("5f889258f3b30cd04c8a78e6"), "low" : 6, "high" : 15 }
{ "_id" : ObjectId("5f889258f3b30cd04c8a78e7"), "low" : 16, "high" : 412 }
> var arrayInput = [4,16,500];
> var inputLength = arrayInput.length;
> db.lhColl.aggregate([
... {$match:{}}
... ]).forEach(function(doc){
... for (i=0; i<inputLength; i++){
... if (arrayInput[i]>=doc.low){
... if(arrayInput[i] <= doc.high)
... print("arrayInputs value match:",arrayInput[i]);
... }
... }
... });
arrayInputs value match: 4
arrayInputs value match: 16
I have a DB with about 500K documents in MongoDB. Each document has a number field that starts at 1 and increases continuously. After counting the documents, I realized some are missing (the last number value is greater than the total document count, instead of equal to it). How can I check which documents are missing? A document has the form:
{
"_id" : ObjectId("abc"),
"number" : 499661
}
You can try this approach. Consider you have these documents with missing numbers 1 and 4:
{ number: 0 }
{ number: 2 }
{ number: 3 }
{ number: 5 }
{ number: 6 }
To find the two missing numbers, try this aggregation:
db.test.aggregate([
{
$group: {
_id: null,
nums: { $push: "$number" }
}
},
{
$project: {
_id: 0,
missing_numbers: { $setDifference: [ { $range: [ 0, 7 ] }, "$nums" ] }
}
},
])
The output: { "missing_numbers" : [ 1, 4 ] }
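For comparison, the same set difference can be sketched in plain JavaScript, assuming it is acceptable to pull the number values into the application. The function name findMissingNumbers is hypothetical. Like $range, the end bound is exclusive, so for the real collection it would be the largest number plus one.

```javascript
// In-memory equivalent of $setDifference: [ $range(start, end), nums ].
// Like $range, `end` is exclusive.
function findMissingNumbers(nums, start, end) {
  const present = new Set(nums);
  const missing = [];
  for (let n = start; n < end; n++) {
    if (!present.has(n)) missing.push(n);
  }
  return missing;
}

const nums = [0, 2, 3, 5, 6];
console.log(findMissingNumbers(nums, 0, 7)); // [ 1, 4 ]
```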
I have the following document in student collection:
{
"uid": 1,
"eng": 70
}
Now I want to add 10 to the eng field and get 80 as the result. To do this I am using the following query:
db.getCollection('student').aggregate([{$match:{uid:1}},{$set:{eng:{$sum:10}}}])
but it is not working. How can I add a number to a field to get the required output? Is there an addition operator in MongoDB?
I suggest using the $inc operator here:
db.getCollection('student').update(
{ uid: 1 },
{ $inc: { eng: 10 } }
)
SOLUTION #1: Set sum to the same field eng.
db.student.aggregate([
{ $match: { uid: 1 } },
{
$set: {
eng: { $add: ["$eng", 10] } // $sum: ["$eng", 10] Also works;)
}
}
])
Output:
{
"_id" : ObjectId("6065f94abb72032a689ed61d"),
"uid" : 1,
"eng" : 80
}
SOLUTION #2: Set sum to a different field result.
Use $addFields to add a result field.
Use $add to add 10 to eng and store the sum in result.
db.student.aggregate([
{ $match: { uid: 1 } },
{
$addFields: {
result: { $add: ["$eng", 10] }
}
}
])
Output:
{
"_id" : ObjectId("6065f94abb72032a689ed61d"),
"uid" : 1,
"eng" : 70,
"result" : 80
}
We have a requirement to rank our records based on some defined algorithm. We have 4 fields in our MongoDB like following;
{
"rating" : 3.5,
"review" : 4,
"revenue" : 100,
"used" : 3.9
},
{
"rating" : 1.5,
"review" : 2,
"revenue" : 10,
"used" : 2.1
}
While querying the data, we will send a % weightage for our calculation. So assume we are sending 30% for rating, 30% for review and 20% each for revenue and used.
Now we need to score each record based on following calculation.
Score per column = ((Existing Value - Average(Column)) / StandardDeviation(Column)) * %weightage
for rating = (3.5 - 2.5) / 1 * 30% = 0.3
So we need to compute a score for each column (or field), and then the total of all 4 fields gives the score for each record.
Is it possible to do such a calculation with any built-in MongoDB function?
Thanks in advance
Mongo has default operators for finding standard deviation ($stdDevPop) and for finding the average ($avg) and obviously for subtraction, multiplication, and division as well.
So, using all these operators you can definitely write an aggregation for what you require.
I've done it for rating below; you can use the same logic for the other fields.
Also, replace 0.3 with your %weightage.
db.collection.aggregate([
{
$match: {
rating: {
$ne: null
}
}
},
{
$group: {
_id: null,
ratings: {
$push: "$rating"
},
avg_rating: {
$avg: "$rating"
},
std_deviation_rating: {
$stdDevPop: "$rating"
}
}
},
{
$project: {
ratings: {
$map: {
input: "$ratings",
as: "rating",
in: {
$multiply: [
{
$divide: [
{
$subtract: [
"$$rating",
"$avg_rating"
]
},
"$std_deviation_rating"
]
},
0.3
]
}
}
}
}
}
])
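To check the arithmetic for all four fields at once, here is a rough in-memory sketch of the same scoring formula in plain JavaScript. It is not the aggregation itself: scoreDocuments and the weights object are hypothetical, and treating a zero standard deviation as a zero contribution is an assumption of this sketch.

```javascript
// Population mean and standard deviation (the same definition $stdDevPop uses).
function mean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}
function stdDevPop(xs) {
  const m = mean(xs);
  return Math.sqrt(mean(xs.map((x) => (x - m) ** 2)));
}

// Hypothetical helper: weights maps field name -> fraction (0.3 means 30%).
function scoreDocuments(docs, weights) {
  const stats = {};
  for (const field of Object.keys(weights)) {
    const values = docs.map((d) => d[field]);
    stats[field] = { avg: mean(values), sd: stdDevPop(values) };
  }
  return docs.map((d) => {
    let score = 0;
    for (const [field, w] of Object.entries(weights)) {
      const { avg, sd } = stats[field];
      // Assumption: a constant column (sd === 0) contributes nothing.
      if (sd !== 0) score += ((d[field] - avg) / sd) * w;
    }
    return { ...d, score };
  });
}

const docs = [
  { rating: 3.5, review: 4, revenue: 100, used: 3.9 },
  { rating: 1.5, review: 2, revenue: 10, used: 2.1 },
];
const weights = { rating: 0.3, review: 0.3, revenue: 0.2, used: 0.2 };
const scored = scoreDocuments(docs, weights);
console.log(scored.map((d) => d.score));
```

With only the two sample documents, every per-field z-score is +1 or -1, so the totals come out at roughly 1 and -1 (subject to floating-point rounding).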
I got a problem when I use db.collection.aggregate in MongoDB.
I have a data structure like:
_id:...
Segment:{
"S1":1,
"S2":5,
...
"Sn":10
}
It means the following in Segment: I might have several sub attributes with numeric values. I'd like to sum them up as 1 + 5 + .. + 10
The problem is: I'm not sure about the sub attributes names since for each document the segment numbers are different. So I cannot list each segment name. I just want to use something like a for loop to sum all values together.
I tried queries like:
db.collection.aggregate([
{$group:{
_id:"$Account",
total:{$sum:"$Segment.$"}
}}
])
but it doesn't work.
You have made the classic mistake of using arbitrary field names. MongoDB is "schema-free", but that doesn't mean you don't need to think about your schema. Key names should be descriptive, and in your case, for example, "S2" does not really mean anything. In order to do most kinds of queries and operations, you will need to redesign your schema to store your data like this:
_id:...
Segment:[
{ field: "S1", value: 1 },
{ field: "S2", value: 5 },
{ field: "Sn", value: 10 },
]
You can then run your query like:
db.collection.aggregate( [
{ $unwind: "$Segment" },
{ $group: {
_id: '$_id',
sum: { $sum: '$Segment.value' }
} }
] );
Which then results into something like this (with the only document from your question):
{
"result" : [
{
"_id" : ObjectId("51e4772e13573be11ac2ca6f"),
"sum" : 16
}
],
"ok" : 1
}
Starting in Mongo 3.4.4, this can be achieved by applying inline operations, thus avoiding expensive operations such as $group:
// { _id: "xx", segments: { s1: 1, s2: 3, s3: 18, s4: 20 } }
db.collection.aggregate([
{ $addFields: {
total: { $sum: {
$map: { input: { $objectToArray: "$segments" }, as: "kv", in: "$$kv.v" }
}}
}}
])
// { _id: "xx", total: 42, segments: { s1: 1, s2: 3, s3: 18, s4: 20 } }
The idea is to transform the object (containing the numbers to sum) into an array. This is the role of $objectToArray, which, starting in Mongo 3.4.4, transforms { s1: 1, s2: 3, ... } into [ { k: "s1", v: 1 }, { k: "s2", v: 3 }, ... ]. This way, we don't need to care about the field names since we can access values through their "v" fields.
Having an array rather than an object is a first step towards being able to sum its elements. But the elements obtained with $objectToArray are objects and not simple integers. We can get past this by mapping (the $map operation) these array elements to extract the value of their "v" field. In our case this produces the array [1, 3, 18, 20].
Finally, it's a simple matter of summing elements within this array, using the $sum operation.
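The three steps map closely onto plain JavaScript, which may help when reasoning about what each stage produces. This is only an illustration outside MongoDB: sumSegmentValues is a hypothetical name, and it assumes every value in the object is numeric.

```javascript
// Mirrors $objectToArray -> $map -> $sum on a plain object.
function sumSegmentValues(segments) {
  // $objectToArray: { s1: 1, ... } -> [ { k: "s1", v: 1 }, ... ]
  const pairs = Object.entries(segments).map(([k, v]) => ({ k, v }));
  // $map with in: "$$kv.v": keep only the values
  const values = pairs.map((kv) => kv.v);
  // $sum: add the values together
  return values.reduce((acc, n) => acc + n, 0);
}

console.log(sumSegmentValues({ s1: 1, s2: 3, s3: 18, s4: 20 })); // 42
```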
Segment: {s1: 10, s2: 4, s3: 12}
{$set: {"new_array":{$objectToArray: "$Segment"}}}, //makes field names all "k" or "v"
{$project: {_id:0, total:{$sum: "$new_array.v"}}}
"total" will be 26.
$set is an alias for $addFields, available starting in MongoDB 4.2. (I'm using 4.2.)
"new_array": [
{
"k": "s1",
"v": 10
},
{
"k": "s2",
"v": 4
},
{
"k": "s3",
"v": 12
}
]
You can also use regular expressions. Eg. /^s/i for words starting with "s".
Modeling shift planning app:
I came up with this data structure to describe a shift.
{
"fromHour" : 7,
"fromMinute" : 30,
"toHour" : 9,
"toMinute" : 30,
"week" : 5,
"date" : "2015-01-26",
"user" : {
// ...
},
"_id" : ObjectId("54d0e4a82b9dc26c0c0f36e7")
}
The main thing that I need to store is the information:
1. when shift starts (hour and minute),
2. when shift ends (hour and minute) and
3. what date it's actually happening.
The fields fromHour, fromMinute, toHour, toMinute and date as an ISO string worked pretty well for me for storing and querying the shifts by a particular date.
The problem occurred when I needed to build reports out of it. Say I want to get all shifts from "2015-01-01" to "2015-02-01" that fall between 07:00 and 23:00.
I can add an $and clause to my query, like
[ { fromHour: { '$gte': 7 } },
{ fromMinute: { '$gte': 0 } },
{ toHour: { '$lt': 23 } },
{ toMinute: { '$lt': 0 } } ]
But that doesn't work well: for a shift where toMinute is 30, the $lt condition on toMinute is false, so the shift is excluded even though it ends before 23:00.
I'm trying to find an efficient data structure for storing timespans that would be easy to query.
Storing hours and minutes in two separate fields is too error-prone and makes your job harder. Since Mongo does not have a distinct "Time" data type, only Date, and shifts usually start and end at "easy" times, I would recommend converting the time to a real number in your application, like this:
00:00 --> 0
01:00 --> 1
...
08:00 --> 8
08:15 --> 8.25
08:30 --> 8.5
...
16:30 --> 16.5
...
It is a bit of extra work in the app because you have to convert while saving or displaying but it's still better than having one single time value in two different fields.
So your data would look like this:
{
"shiftStart" : 7.5,
"shiftEnd" : 9.5,
"week" : 5,
"date" : "2015-01-26",
"user" : {
// ...
},
"_id" : ObjectId("54d0e4a82b9dc26c0c0f36e7")
}
and your query:
[ { shiftStart: { '$gte': 7 } },
{ shiftEnd: { '$lt': 23 } } ]
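The conversion described above could look roughly like this in the application. Both helper names are hypothetical, and the sketch assumes times arrive as "HH:MM" strings.

```javascript
// Hypothetical helpers: convert "HH:MM" to a decimal hour and back.
function timeToDecimal(hhmm) {
  const [h, m] = hhmm.split(":").map(Number);
  return h + m / 60;
}

function decimalToTime(value) {
  // Round to whole minutes so values like 7.1666... map back to 07:10.
  const totalMinutes = Math.round(value * 60);
  const h = Math.floor(totalMinutes / 60);
  const m = totalMinutes % 60;
  return String(h).padStart(2, "0") + ":" + String(m).padStart(2, "0");
}

console.log(timeToDecimal("08:15")); // 8.25
console.log(decimalToTime(16.5)); // 16:30
```

Quarter-hour and half-hour times convert to exact decimals; other minute values produce repeating fractions, which is why the reverse conversion rounds to the nearest minute.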
You can store the data with the date type in ISODate(...) format and then use $project and Date Aggregation Operators to query the data.
For your example:
db.shifts.aggregate([
{ $match: //matches the dates first to filter out before $project step
{ datetimeStart:
{ $gte: ISODate("2015-01-01T07:00:00.000Z"),
$lt: ISODate("2015-02-01T00:00:00.000Z")
},
datetimeEnd:
{ $gte: ISODate("2015-01-01T07:00:00.000Z"),
$lt: ISODate("2015-02-01T00:00:00.000Z")
}
}
},
{ $project: // $project step extracts the hours
{ otherNeededFields: 1, // any other fields you want to see
datetimeStart: 1,
datetimeEnd: 1,
hourStart: { $hour: "$datetimeStart" },
hourEnd: { $hour: "$datetimeEnd" }
}
},
{ $match: // match the shift hours
{ hourStart: { $gte: 7 },
hourEnd: { $lte: 23 }
}
}
])
With this system it would be possible, but more complicated, to find something like shifts between 7:30 AM and 10:30 PM:
db.shifts.aggregate([
{ $match: //matches the dates first to filter out before $project step
{ datetimeStart:
{ $gte: ISODate("2015-01-01T07:30:00.000Z"),
$lt: ISODate("2015-02-01T00:00:00.000Z")
},
datetimeEnd:
{ $gte: ISODate("2015-01-01T07:30:00.000Z"),
$lt: ISODate("2015-02-01T00:00:00.000Z")
}
}
},
{ $project: // $project step extracts the hours
{ otherNeededFields: 1, // any other fields you want to see
datetimeStart: 1,
datetimeEnd: 1,
hourStart: { $hour: "$datetimeStart" },
minStart: { $minute: "$datetimeStart" },
hourEnd: { $hour: "$datetimeEnd" },
minEnd: { $minute: "$datetimeEnd" }
}
},
{ $match: // match the shift hours
{ $and: // both conditions must hold; two $or keys in one object would overwrite each other
[
{ $or:
[
{hourStart: 7, minStart: {$gte: 30}}, // hour is 7, minute >= 30
{hourStart: { $gte: 8 }} // hour is >= 8
]
},
{ $or:
[
{hourEnd: 22, minEnd: {$lte: 30}}, // hour is 22, minute <= 30
{hourEnd: { $lte: 21 }} // hour is <= 21
]
}
]
}
}
])