How do I make a mongo query for something that is not in a subdocument array of variable size? - mongodb

I have a mongodb collection full of 65k+ documents, each one with a property named site_histories. Its value is an array that may or may not be empty. If it is not empty, it will have one or more objects similar to this:
"site_histories" : "[{\"site_id\":\"129373\",\"accepted\":\"1\",\"rejected\":\"0\",\"pending\":\"0\",\"user_id\":\"12743\"}]"
I need a query that finds every document in the collection that does not contain a given user_id.
I'm pretty new to Mongo, so I was trying to make a query that would find every instance that does have the given user_id, which I was then planning on adding a "$ne" to, but even that didn't work. This is the query I was using that didn't work:
db.test.find({site_histories: { $elemMatch: {user_id: '12743' }}})
So can anyone tell me why this query didn't work? And can anyone help me format a query that will do what I need the final query to do?

If your site_histories really is an array, it should be as simple as doing:
db.test.find({"site_histories.user_id": "12743"})
That looks in all the elements of the array.
However, I'm a bit scared of all those backslashes. If site_histories is actually stored as a string, that won't work. That would mean the schema is poorly designed; in that case you could try matching with $regex instead.
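As a rough sketch of both cases, the filters for the negated query (documents that do not contain a given user_id) might look like the following. The helper name filtersWithoutUser is made up for illustration, and the string case assumes the field holds JSON text shaped exactly like the sample in the question, with no whitespace around the colon.

```javascript
// Hypothetical helper (not part of MongoDB): builds two candidate filters
// for "documents that do NOT contain this user_id".
function filtersWithoutUser(userId) {
  return {
    // If site_histories is a real array of subdocuments: $ne on a dotted
    // path matches only documents where no element has that user_id
    // (documents with an empty array also match).
    arrayCase: { "site_histories.user_id": { $ne: userId } },
    // If site_histories is a JSON string: negate a regex on the raw text.
    // Assumes userId contains no regex metacharacters (e.g. digits only).
    stringCase: {
      site_histories: { $not: new RegExp('"user_id":"' + userId + '"') },
    },
  };
}

const filters = filtersWithoutUser("12743");
// In the mongo shell: db.test.find(filters.arrayCase)
//                 or: db.test.find(filters.stringCase)
```

The regex variant cannot use an index efficiently, so converting the field to a real array is probably the better long-term fix.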

Related

MongoDB $regex with $in clause

I need a mongodb query something like
db.getCollection("xyz").find({"_id" : {$regex : {$in : [xxxx/*]}}})
My Use case is -- I have a list of Strings such as
[xyz/12/poi, abc/98/mnb, ytn/65/tdx, ...]
The ids that are there in the collection(test) are something like
xyz/12/poi/2019061304.
I will get the values like xyz/12/poi from the input list, the other part of the id being yyyymmddhh format.
So, I need to go to the collection and find all the documents matching the input list with the ID of the documents in the test collection.
I can retrieve the documents individually but that does not seem to be a feasible option as the size of the input list is more than 10000.
Can you suggest a more feasible solution? Thanks in advance.
I tried using $in with $regex, but it seems MongoDB does not support that. I have also tried pattern matching, but even that is not feasible for me. Can you please suggest an alternative to using $in with $regex in MongoDB?
The expected result could be an aggregate query or a normal query, so that we hit the database only once and get the desired output rather than hitting it 10000-odd times.
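One possible direction, offered as a sketch rather than a tested answer: $in does not accept $regex operator expressions, but it does accept regular expression objects, so the input list can be turned into anchored prefix patterns and sent in one query. The helper names escapeRegExp and prefixFilter are hypothetical, and whether this performs acceptably with 10000+ patterns would need to be measured.

```javascript
// Escape regex metacharacters so each input string is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: one filter matching any _id that starts with one of
// the given prefixes followed by "/" (the yyyymmddhh part comes after it).
function prefixFilter(prefixes) {
  return {
    _id: { $in: prefixes.map((p) => new RegExp("^" + escapeRegExp(p) + "/")) },
  };
}

const idFilter = prefixFilter(["xyz/12/poi", "abc/98/mnb", "ytn/65/tdx"]);
// In the mongo shell: db.getCollection("xyz").find(idFilter)
```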

Pymongo - find multiple different documents

My question is very similar to how-to-get-multiple-document-using-array-of-mongodb-id; however, I would like to find multiple documents without using the _id.
That is, consider that I have documents such as
document = { _id: _id, key_1: val_1, key_2: val_2, key_3: val_3}
I need to be able to .find() by multiple parameters, as for example,
query_1 = {key_1: foo, key_2: bar}
query_2 = {key_1: foofoo, key_2: barbar}
Right now, I am running a query for query_1, followed by a query for query_2.
As it turns out, this method is extremely inefficient.
I tried to add concurrency to make it faster, but the speedup was not even 2x.
Is it possible to query multiple documents at once?
I am looking for a method that returns the union of the matches for query_1 AND query_2.
If this is not possible, do you have any suggestions that might speed up a query of this type?
Thank you for your help.
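A minimal sketch of one way to get that union in a single round trip is to wrap the individual filters in $or. The string values below stand in for the placeholders foo, bar, foofoo and barbar from the question. It is written here as a JavaScript object, but the same document can be passed as a dict to PyMongo's find.

```javascript
// Placeholder values; the real ones come from the application.
const query1 = { key_1: "foo", key_2: "bar" };
const query2 = { key_1: "foofoo", key_2: "barbar" };

// A document matches if it satisfies query1 OR query2, so the result is
// the union of the two individual result sets (without duplicates).
const unionQuery = { $or: [query1, query2] };
// In the mongo shell: db.collection.find(unionQuery)
```

A compound index covering key_1 and key_2 would likely help each branch of the $or, though that is an assumption about the workload.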

Mongo DB search based on multiple conditions

I am trying to search based on multiple conditions. The search works, but it does not behave the way I expect.
Assuming I have a search query like
Orders.find({$or: {"status":{"$in":["open", "closed"]},"paymentStatus":{"$in":["unpaid"]}}}
)
If I add another filter parameter such as approvalStatus, the query does not keep the previously found items. Instead it behaves like an AND and returns an empty result if one of the conditions does not match.
How can I write a query that, regardless of what is passed into it, retains the previously found items even if one of the conditions matches no records,
like a simple OR query in SQL?
I hope I explained this well enough.
Using $or here is the right approach, but its value needs to be an array of query expressions, not an object.
So your query should look something like this instead:
Orders.find({$or: [
{"status": {"$in": ["open", "closed"]}},
{"paymentStatus": {"$in": ["unpaid"]}},
{"approvalStatus": {"$in": ["approved"]}}
]})
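If the set of filters varies per request, the $or array can be assembled from whatever was actually supplied. The following is a sketch only: buildOrFilter and the shape of its criteria argument are assumptions, not part of the original code.

```javascript
// Hypothetical helper: turns { field: [allowed values] } into an $or filter,
// skipping fields that were not supplied or have no values.
function buildOrFilter(criteria) {
  const clauses = Object.entries(criteria)
    .filter(([, values]) => Array.isArray(values) && values.length > 0)
    .map(([field, values]) => ({ [field]: { $in: values } }));
  // MongoDB rejects an empty $or array, so fall back to an empty filter,
  // which matches every document.
  return clauses.length > 0 ? { $or: clauses } : {};
}

const orderFilter = buildOrFilter({
  status: ["open", "closed"],
  paymentStatus: ["unpaid"],
  approvalStatus: undefined, // not supplied, so no clause is added
});
// Usage: Orders.find(orderFilter)
```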

Does Mongo make a mistake like this?

Say I have a User Document, filled with arrays of ObjectIds.
They are references to documents in another collection.
I want to load all things from a particular user's array. So I do:
find({ _id: $in : someArrayOfObjectIds})
It's possible that certain references reference something that has been deleted.
So the array returned by the above "find" call can be smaller than someArrayOfObjectIds.
For all the ObjectIds that were not found, can I now safely assume that the document no longer exists, or can my query just fail to find a document (does Mongo make a mistake)?
Yes, you can safely assume that missing documents do not exist. By the way, your query is invalid. Should be this:
find({ _id: {$in : someArrayOfObjectIds}})
"or can my query just fail to find a document"
If that were possible, no one would use it. Pen and paper would be a safer alternative than a DB that makes such mistakes :)
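To see which references are dangling, the requested ids can be compared with the ids that came back. This is a sketch; missingIds is a hypothetical helper, and it compares string forms because two separate ObjectId instances are not equal under ===.

```javascript
// Hypothetical helper: returns the requested ids with no matching document.
function missingIds(requestedIds, foundDocs) {
  // Collect the string form of every _id that the query returned.
  const found = new Set(foundDocs.map((doc) => String(doc._id)));
  // Keep only the requested ids that are absent from that set.
  return requestedIds.filter((id) => !found.has(String(id)));
}

// Plain strings stand in for ObjectIds in this example.
const dangling = missingIds(["a1", "b2", "c3"], [{ _id: "a1" }, { _id: "c3" }]);
```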

how can I manipulate the value field of MapReduce?

When using MapReduce, each resulting document 'result' is structured like this:
{ '_id' : 123, 'value' : { 'sum_donations' : 999, 'nbr_visitors' : 50 } }
I could access _id and value field by using:
db.result.find() OR db.result.find({},{_id:1, value:1})
Is there a way to select _id and sum_donations without selecting the nbr_visitors? Something like this:
{'id': 123, 'sum_donation': 999}
Or should I just create another MapReduce function that return that for me?
I was thinking about having one MapReduce Collection and manipulate it to answer different questions.
I tried
db.result.find({},{_id:1, value.sum_donations:1}) but it didn't work.
There are two problems with doing this:
The value field of the MR output cannot currently be manipulated from the MR itself. There is a JIRA ticket for it, but it's not exactly on the roadmap: https://jira.mongodb.org/browse/SERVER-2517
The query language of Mongo cannot automatically project your fields to the top-level document. Subdocument fields stay in the subdocument.
You could (if you're using MongoDB 2.2) use the aggregation framework here with the $project operator, but I believe this to be overkill, and it would slow down your system and your program.
So the best way to do this at the moment is to extend your application code to grab the field out of that subdocument. Simply coding around it is probably the most performant, direct and easiest method.
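Coding around it could be as small as the following sketch, which reshapes each result document in application code. The function name flattenDonations is hypothetical, and the output keys follow the shape asked for in the question.

```javascript
// Hypothetical helper: lifts sum_donations out of the "value" subdocument
// and drops everything else (such as nbr_visitors).
function flattenDonations(docs) {
  return docs.map((doc) => ({
    id: doc._id,
    sum_donation: doc.value.sum_donations,
  }));
}

// Example input shaped like the MapReduce output collection.
const flattened = flattenDonations([
  { _id: 123, value: { sum_donations: 999, nbr_visitors: 50 } },
]);
```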