Find documents with field matching another field WITHOUT aggregation - mongodb

so I'm working with a system that can only take the query-argument of a .find() query as input, which means that I can not input an iterative script or an aggregation query.
However, I need to be able to find all documents that contain in one of their array fields the value that is set in another field. So, if my documents look as follows:
{ _id: ..., FieldA: x, FieldB: [x, y, z] }
Then I want to find all documents that contain the value of FieldA in FieldB, in this example contain x in FieldB.
Is that at all possible?

You can use $in with $expr for your case.
db.collection.find({
$expr: {
"$in": [
"$FieldA",
"$FieldB"
]
}
})
Here is the Mongo playground for your reference.

Related

Mongodb: Query on the last N documents(some portion) of a collection only

In my collection that has say 100 documents, I want to run the following query:
collection.find({"$text" : {"$search" : "some_string"})
Assume that a suitable "text" index already exists and thus my question is : How can I run this query on the last 'n' documents only?
All the question that I found on the web ask how to get the last n docs. Whereas My question is how to search on the last n docs only?
More generally my question is How can I run a mongo query on some portion say 20% of a collection.
What I tried
Im using pymongo so I tried to use skip() and limit() to get the last n documents but I didn't find a way to perform a query on cursor that the above mentioned function return.
After #hhsarh's anwser here's what I tried to no avail
# here's what I tried after initial answers
recents = information_collection.aggregate([
{"$match" : {"$text" : {"$search" : "healthline"}}},
{"$sort" : {"_id" : -1}},
{"$limit" : 1},
])
The result is still coming from the whole collection instead of just the last record/document as the above code attempts.
The last document doesn't contain "healthline" in any field therefore the intended result of the query should be empty []. But I get a documents.
Please can someone tell how this can be possible
What you are looking for can be achieved using MongoDB Aggregation
Note: As pointed out by #turivishal, $text won't work if it is not in the first stage of the aggregation pipeline.
collection.aggregate([
{
"$sort": {
"_id": -1
}
},
{
"$limit": 10 // `n` value, where n is the number of last records you want to consider
},
{
"$match" : {
// All your find query goes here
}
},
], {allowDiskUse=true}) // just in case if the computation exceeds 100MB
Since _id is indexed by default, the above aggregation query should be faster. But, its performance reduces in proportion to the n value.
Note: Replace the last line in the code example with the below line if you are using pymongo
], allowDiskUse=True)
It is not possible with $text operator, because there is a restriction,
The $match stage that includes a $text must be the first stage in the pipeline
It means we can't limit documents before $text operator, read more about $text operator restriction.
Second option this might possible if you use $regex regular expression operator instead of $text operator for searching,
And if you need to search same like $text operator you have modify your search input as below:
lets assume searchInput is your input variable
list of search field in searchFields
slice that search input string by space and loop that words array and convert it to regular expression
loop that search fields searchFields and prepare $in condition
searchInput = "This is search"
searchFields = ["field1", "field2"]
searchRegex = []
searchPayload = []
for s in searchInput.split(): searchRegex.append(re.compile(s, re.IGNORECASE));
for f in searchFields: searchPayload.append({ f: { "$in": searchRegex } })
print(searchPayload)
Now your input would look like,
[
{'field1': {'$in': [/This/i, /is/i, /search/i]}},
{'field2': {'$in': [/This/i, /is/i, /search/i]}}
]
Use that variable searchPayload with $or operator in search query at last stage using $in operator,
recents = information_collection.aggregate([
# 1 = ascending, -1 descending you can use anyone as per your requirement
{ "$sort": { "_id": 1 } },
# use any limit of number as per your requirement
{ "$limit": 10 },
{ "$match": { "$or": searchPayload } }
])
print(list(recents))
Note: The $regex regular expression search will cause performance issues.
To improve search performance you can create a compound index on your search fields like,
information_collection.createIndex({ field1: 1, field2: 1 });

MongoDB: What is the fastest / is there a way to get the 200 documents with a closest timestamp to a specified list of 200 timestamps, say using a $in [duplicate]

Let's assume I have a collection with documents with a ratio attribute that is a floating point number.
{'ratio':1.437}
How do I write a query to find the single document with the closest value to a given integer without loading them all into memory using a driver and finding one with the smallest value of abs(x-ratio)?
Interesting problem. I don't know if you can do it in a single query, but you can do it in two:
var x = 1; // given integer
closestBelow = db.test.find({ratio: {$lte: x}}).sort({ratio: -1}).limit(1);
closestAbove = db.test.find({ratio: {$gt: x}}).sort({ratio: 1}).limit(1);
Then you just check which of the two docs has the ratio closest to the target integer.
MongoDB 3.2 Update
The 3.2 release adds support for the $abs absolute value aggregation operator which now allows this to be done in a single aggregate query:
var x = 1;
db.test.aggregate([
// Project a diff field that's the absolute difference along with the original doc.
{$project: {diff: {$abs: {$subtract: [x, '$ratio']}}, doc: '$$ROOT'}},
// Order the docs by diff
{$sort: {diff: 1}},
// Take the first one
{$limit: 1}
])
I have another idea, but very tricky and need to change your data structure.
You can use geolocation index which supported by mongodb
First, change your data to this structure and keep the second value with 0
{'ratio':[1.437, 0]}
Then you can use $near operator to find the the closest ratio value, and because the operator return a list sorted by distance with the integer you give, you have to use limit to get only the closest value.
db.places.find( { ratio : { $near : [50,0] } } ).limit(1)
If you don't want to do this, I think you can just use #JohnnyHK's answer :)

How to $add together a subset of elements of an array in mongodb aggregation?

Here is the problem I want to resolve:
each document contains an array of 30 integers
the documents are grouped under a certain condition (not relevant here)
while grouping them, I want to:
add together the 29 last elements of the array (skipping the first one) of each document
sum the previous result among the same group, and return it
Data structure is very difficult to change and I cannot afford a migration + I still need the 30 values for another purpose. Here is what I tried, unsuccessfully:
db.collection.aggregate([
{$match: {... some matching query ...}},
{$project: {total_29_last_values: {$add: ["$my_array.1", "$my_array.2", ..., "$my_array.29"]}}},
{$group: {
... some grouping here ...
my_result: {$sum: "$total_29_last_values"}
}}
])
Theoretically (IMHO) this should work, given the definition of $add in mongodb documentation, but for some reason it fails:
exception: $add only supports numeric or date types, not Array
Maybe there is not support for adding together elements of an array, but this seems strange...
Thanks for your help !
From the docs,
The $add expression has the following syntax:
{ $add: [ <expression1>, <expression2>, ... ] }
The arguments can be any valid expression as long as they resolve to
either all numbers or to numbers and a date.
It clearly states that the $add operator accepts only numbers or dates.
$my_array.1 resolves to an empty array. for example, []. (You can always look for a match in particular index, such as, {$match:{"a.0":1}} but cannot derive the value from a particular index of an array. For that you need to use the $ or the $slice operators.This is currently an unresolved issue: JIRA1, JIRA2)
And the $add expression becomes $add:[[],[],[],..].
$add does not take an array as input and hence you get the error stating that it does not support Array as input.
What you need to do is:
Match the documents.
Unwind the my_array field.
Group together based on the _id of each document to get the sum
of all the elements in the array skipping the first element.
Project the summed field for each grouped document.
Again group the documents based on the condition to get the sum.
Stage operators:
db.collection.aggregate([
{$match:{}}, // condition
{$unwind:"$my_array"},
{$group:{"_id":"$_id",
"first_element":{$first:"$my_array"},
"sum_of_all":{$sum:"$my_array"}}},
{$project:{"_id":"$_id",
"sum_of_29":{$subtract:["$sum_of_all","$first_element"]}}},
{$group:{"_id":" ", // whatever condition
"my_result":{$sum:"$sum_of_29"}}}
])

mongodb - Find document with closest integer value

Let's assume I have a collection with documents with a ratio attribute that is a floating point number.
{'ratio':1.437}
How do I write a query to find the single document with the closest value to a given integer without loading them all into memory using a driver and finding one with the smallest value of abs(x-ratio)?
Interesting problem. I don't know if you can do it in a single query, but you can do it in two:
var x = 1; // given integer
closestBelow = db.test.find({ratio: {$lte: x}}).sort({ratio: -1}).limit(1);
closestAbove = db.test.find({ratio: {$gt: x}}).sort({ratio: 1}).limit(1);
Then you just check which of the two docs has the ratio closest to the target integer.
MongoDB 3.2 Update
The 3.2 release adds support for the $abs absolute value aggregation operator which now allows this to be done in a single aggregate query:
var x = 1;
db.test.aggregate([
// Project a diff field that's the absolute difference along with the original doc.
{$project: {diff: {$abs: {$subtract: [x, '$ratio']}}, doc: '$$ROOT'}},
// Order the docs by diff
{$sort: {diff: 1}},
// Take the first one
{$limit: 1}
])
I have another idea, but very tricky and need to change your data structure.
You can use geolocation index which supported by mongodb
First, change your data to this structure and keep the second value with 0
{'ratio':[1.437, 0]}
Then you can use $near operator to find the the closest ratio value, and because the operator return a list sorted by distance with the integer you give, you have to use limit to get only the closest value.
db.places.find( { ratio : { $near : [50,0] } } ).limit(1)
If you don't want to do this, I think you can just use #JohnnyHK's answer :)

mongodb array matching

How do I accomplish the following?
db.test.save( {a: [1,2,3]} );
db.test.find( {a: [1,2,3,4]} ); //must match because it contains all required values [1, 2, 3]
db.test.find( {a: [1,2]} ); //must NOT match because the required value 3 is missing
I know about $in and $all but they work differently.
Interesting..The problem is..the $in and the $or operators get applied on the elements of the array that you are comparing against each document in the collection, not on the elements of the arrays in the documents..To summarize your question: You want it to be a match, if any of the documents in the collection happens to be a subset of the passed array. I can't think of a way to do this unless you swap your input and output. What I mean is..Let's take your first input:
db.test.find( {a: [1,2,3,4]} );
Consider putting this in a temporary collection say,temp as:
db.temp.save( {a: [1,2,3,4]} );
Now iterate over each document in test collection and 'find' it in temp, with the $all operator to ensure it is completely contained, i.e., do something like this:
foreach(doc in test)
{
db.temp.find( { a: { $all: doc.a } } );
}
This is definitely a workaround! I am not sure if I am missing any other operator that can do this job.
There is currently no operator that will find documents with a subset values. There is a proposed improvement to add a $subset operator. Please feel free to up vote the issue in JIRA:
https://jira.mongodb.org/browse/SERVER-974
Other potential workarounds, which may not be suitable for your use case, may involve map reduce or the new aggregation framework (available in MongoDB version 2.2):
http://docs.mongodb.org/manual/reference/aggregation/#_S_project