MongoDB $in not only one result in case of repeated elements - mongodb

I need to get the users whose ids are contained in an array. For this i'm using the $in operator, however being this inside an aggregate operation, i'd like to get back a specific user all the time it's id is present in the array, not just one. For example:
The ids array is A=[a,b,c,b] and U(x) is user with id x
with users.find({_id:{$in:A}}) i get these users as result: U(a),U(b),U(c)
instead i'd like to get back the result: U(a),U(b),U(c),U(b)
so get the user back every time it's id appears.
I understand that $in is working as expected but does anyone have an idea on how can i achieve this?
Thanks

This isn't possible using a MongoDB query.
MongoDB's query engine iterates over the documents in a collection (or over an index if there's a useful one) and returns to you any documents that match your query, in the order it finds them. Whether b appears once, twice, or a hundred times in your query makes no difference: the document with _id of b matches the query and is returned once, when MongoDB finds it.
You can do a post-processing step in your programming language to repeat documents as many times as you want.

Related

Mongodb query forward from username X

I have problem whit long Mongo find results. Example how can i start query starting from _id X
to forward Example I know I have document where is 1000 users details I know there is user called Peter in list I can make query Users.find({userName: "Peter"}) and get this on user _id but how I can get all users also after this with out I need return JSON from above "Peter"
With the little amount of information you have given, You need to do this in two steps:
Get the id of the first record that matches the name "peter".
db.test.findOne({"userName":"Peter"},{"_id":1});
Returns one document that satisfies the specified query criteria. If
multiple documents satisfy the query, this method returns the first
document according to the natural order which reflects the order of
documents on the disk. In capped collections, natural order is the
same as insertion order.
Once you have the id of the record with peter, you can retrieve the records with their id > the id of this record.
db.test.find({"_id":{$gte:x}});
Where, x is the id of the first record returned by the first query.

MongoDB skip & limit when querying two collections

Let's say I have two collections, A and B, and a single document in A is related to N documents in B. For example, the schemas could look like this:
Collection A:
{id: (int),
propA1: (int),
propA2: (boolean)
}
Collection B:
{idA: (int), # id for document in Collection A
propB1: (int),
propB2: (...),
...
propBN: (...)
}
I want to return properties propB2-BN and propA2 from my API, and only return information where (for example) propA2 = true, propB6 = 42, and propB1 = propA1.
This is normally fairly simple - I query Collection B to find documents where propB6 = 42, collect the idA values from the result, query Collection A with those values, and filter the results with the Collection A documents from the query.
However, adding skip and limit parameters to this seems impossible to do while keeping the behavior users would expect. Naively applying skip and limit to the first query means that, since filtering occurs after the query, less than limit documents could be returned. Worse, in some cases no documents could be returned when there are actually still documents in the collection to be read. For example, if the limit was 10 and the first 10 Collection B documents returned pointed to a document in Collection A where propA2 = false, the function would return nothing. Then the user would assume there's nothing left to read, which may not be the case.
A slightly less naive solution is to simply check if the return count is < limit, and if so, repeat the queries until the return count = limit. The problem here is that skip/limit queries where the user would expect exclusive sets of documents returned could actually return the same documents.
I want to apply skip and limit at the mongo query level, not at the API level, because the results of querying collection B could be very large.
MapReduce and the aggregation framework appear to only work on a single collection, so they don't appear to be alternatives.
This seems like something that'd come up a lot in Mongo use - any ideas/hints would be appreciated.
Note that these posts ask similar sounding questions but don't actually address the issues raised here.
Sounds like you already have a solution (2).
You cannot optimize/skip/limit on first query, depending on search you can perhaps do it on second query.
You will need a loop around it either way, like you write.
I suppose, the .skip will always be costly for you, since you will need to get all the results and then throw them away, to simulate the skip, to give the user consistent behavior.
All the logic would have to go to your loop - unless you can match in a clever way to second query (depending on requirements).
Out of curiosity: Given the time passed, you should have a solution by now?!

What's the easiest way to return the results of a query for a given key/value pair in mongo as an array of the values returned?

I have a field called id (not _id) in documents from two collections. I need to compare the contents of the first collection with the second. Basically, I need to know what documents with a given value 'id' exist in collection 'A', but not 'B'. What's the easiest way to build an array of id's from Collection A that I can use to do something like the following. :
db.B.find({id:{$nin: array_of_ids_from_coll_A}})
Please don't get hung up over why I'm using 'id' in this case, and not '_id'. Thanks.
Strictly speaking, this doesn't answer the question of 'how to build an array that...', but I'd iterate over collection A and, for each element, try to find a match in B. If none is found, add to a list.
This has a lot of roundtrips to the database, so it's not very fast, but it's very simple. Also, if A contains a lot of elements, the array of ids might be too large to throw all of them in the $nin, which otherwise would have to be solved by splitting up the array of ids. To make matters worse, $nin isn't efficient with indexes anyway.
I incorrectly assumed that the function 'distinct' returned a set of distinct documents based on a given 'field'. In fact, it returns an array of distinct values, provided a specific field. So, I was able to construct the array I was looking for with db.A.distinct('id'). Thanks to anyone who took the time to read this question, anyway.

In Mongodb, how to retrieve the subset of an object that matches a condition?

What I'm trying to do:
Filter a field of a collection that matches a given condition. Instead of returning every item in the field (which is an array of items), I only want to see matched items.
Similar to
select items from test where items.histPrices=[10,12]
It is also similar to what's found on the mongodb website here: http://www.mongodb.org/display/DOCS/Retrieving+a+Subset+of+Fields
Here's what I have been trying:
db.test.save({"name":"record", "items":[{"histPrices":[10,12],"name":"stuff"}]})
db.test.save({"name":"record", "items":[{"histPrices":[10,12],"name":"stuff"},
{"histPrices":[12,13],"name":"stuff"},{"histPrices":[11,14],"name":"stuff"}]})
db.test.find({},{"name":1,"items.histPrices":[10, 12]})
It will return all the objects that have a match for items.histPrices:[10,12], including ALL of the items in items[]. But I don't want the ones that don't match the condition.
From the comments left on Mongodb two years ago, the solution to get only the items with that histPrices[10,12] is to do it with javascript code, namely, loop through the result set and filter out the other items.
I wonder if there's a way to do that with just the query.
Your find query is wrong
db.test.find({},{"name":1,"items.histPrices":[10, 12]})
Your condition statement should be in the first part of the find statement.In your query {} means fetch all documents similar to this sql
select items from test (no where clause)
you have to change your mongodb find to
db.test.find({"items.histPrices":[10, 12]},{"name":1})
make it work
since your items is an array and if you wanted to return only the matching sub item, you have to use positional operator
db.test.find({"items.histPrices":[10, 12]},{"name":1,'items.$':1})
When working with arrays Embedded to the Document, the best approach is the one suggested by Chien-Wei Huang.
I would just add another aggregation, with the $group (in cases the document is very long, you may not want to retrieve all its content, only the array elements) Operator.
Now the command would look like:
db.test.aggregate({$match:{name:"record"}},
{$unwind:"$items"},
{$match {"items.histPrices":[10, 12]}},
{$group: {_id: "$_id",items: {$push: "$items"}}});)
If you are interested to return only one element from the array in each collection, then you should use projection instead
The same kind of issue solved here:
MongoDB Retrieve a subset of an array in a collection by specifying two fields which should match
db.test.aggregate({$unwind:"$items"}, {$match:{"items.histPrices":[10, 12]}})
But I don't know whether the performance would be OK. You have to verify it with your data.
The usage of $unwind
If you want add some filter condition like name="record", just add another $march at first, ex:
db.test.aggregate({$match:{name:"record"}}, {$unwind:"$items"}, {$match:{"items.histPrices":[10, 12]}})
https://jira.mongodb.org/browse/SERVER-828
Get particular element from mongoDB array
MongoDB query to retrieve one array value by a value in the array

Mongo query for number of items in a sub collection

This seems like it should be very simple but I can't get it to work. I want to select all documents A where there are one or more B elements in a sub collection.
Like if a Store document had a collection of Employees. I just want to find Stores with 1 or more Employees in it.
I tried something like:
{Store.Employees:{$size:{$ne:0}}}
or
{Store.Employees:{$size:{$gt:0}}}
Just can't get it to work.
This isn't supported. You basically only can get documents in which array size is equal to the value. Range searches you can't do.
What people normally do is that they cache array length in a separate field in the same document. Then they index that field and make very efficient queries.
Of course, this requires a little bit more work from you (not forgetting to keep that length field current).