Use distinct and skip in a query - mongodb

I tried running this:
db.col.find().skip(5).distinct("field1")
But it throws an error.
How to use them together?
I can use aggregation but results are different:
db.col.aggregate([{$group:{_id:'$field1'}}, {$skip:3},{$sort:{"field1":1}}])
What I want is links in sorted order i.e numbers should come first then capital letters and then small letters.

Distinct method must be run on COLLECTION not on cursor and returns an array. Read this
http://docs.mongodb.org/manual/reference/method/db.collection.distinct/
So you can't use skip after distinct.
May be you should use this query
db.col.aggregate([{$group:{_id:'$field1'}}, {$skip:3},{$sort:{"_id":1}}]) because field field1 will not exists in result after first clause of grouping.
Also I think you should do sort at first and then skip because in your query you skip 3 unsorted results and then sort them.
(If you provide more information about structure of your documents and what output you want it would be more clearly and I will correct answer properly)

Related

What does the distinct on clause mean in cloud datastore and how does it effect the reads?

This is what the cloud datastore doc says but I'm having a hard time understanding what exactly this means:
A projection query that does not use the distinct on clause is a small operation and counts as only a single entity read for the query itself.
Grouping
Projection queries can use the distinct on clause to ensure that only the first result for each distinct combination of values for the specified properties will be returned. This will return only the first result for entities which have the same values for the properties that are being projected.
Let's say i have a table for questions and i only want to get the question text sorted by the created date would this be counted as a single read and rest as small operations?
If your goal is to just project the date and text fields, you can create a composite index on those two fields. When you query, this is a small operation with all the results as a single read. You are not trying to de-duplicate (so no distinct/on) in this case and so it is a small operation with a single read.

What's the easiest way to return the results of a query for a given key/value pair in mongo as an array of the values returned?

I have a field called id (not _id) in documents from two collections. I need to compare the contents of the first collection with the second. Basically, I need to know what documents with a given value 'id' exist in collection 'A', but not 'B'. What's the easiest way to build an array of id's from Collection A that I can use to do something like the following. :
db.B.find({id:{$nin: array_of_ids_from_coll_A}})
Please don't get hung up over why I'm using 'id' in this case, and not '_id'. Thanks.
Strictly speaking, this doesn't answer the question of 'how to build an array that...', but I'd iterate over collection A and, for each element, try to find a match in B. If none is found, add to a list.
This has a lot of roundtrips to the database, so it's not very fast, but it's very simple. Also, if A contains a lot of elements, the array of ids might be too large to throw all of them in the $nin, which otherwise would have to be solved by splitting up the array of ids. To make matters worse, $nin isn't efficient with indexes anyway.
I incorrectly assumed that the function 'distinct' returned a set of distinct documents based on a given 'field'. In fact, it returns an array of distinct values, provided a specific field. So, I was able to construct the array I was looking for with db.A.distinct('id'). Thanks to anyone who took the time to read this question, anyway.

MongoDB $in not only one result in case of repeated elements

I need to get the users whose ids are contained in an array. For this i'm using the $in operator, however being this inside an aggregate operation, i'd like to get back a specific user all the time it's id is present in the array, not just one. For example:
The ids array is A=[a,b,c,b] and U(x) is user with id x
with users.find({_id:{$in:A}}) i get these users as result: U(a),U(b),U(c)
instead i'd like to get back the result: U(a),U(b),U(c),U(b)
so get the user back every time it's id appears.
I understand that $in is working as expected but does anyone have an idea on how can i achieve this?
Thanks
This isn't possible using a MongoDB query.
MongoDB's query engine iterates over the documents in a collection (or over an index if there's a useful one) and returns to you any documents that match your query, in the order it finds them. Whether b appears once, twice, or a hundred times in your query makes no difference: the document with _id of b matches the query and is returned once, when MongoDB finds it.
You can do a post-processing step in your programming language to repeat documents as many times as you want.

In Mongodb, how to retrieve the subset of an object that matches a condition?

What I'm trying to do:
Filter a field of a collection that matches a given condition. Instead of returning every item in the field (which is an array of items), I only want to see matched items.
Similar to
select items from test where items.histPrices=[10,12]
It is also similar to what's found on the mongodb website here: http://www.mongodb.org/display/DOCS/Retrieving+a+Subset+of+Fields
Here's what I have been trying:
db.test.save({"name":"record", "items":[{"histPrices":[10,12],"name":"stuff"}]})
db.test.save({"name":"record", "items":[{"histPrices":[10,12],"name":"stuff"},
{"histPrices":[12,13],"name":"stuff"},{"histPrices":[11,14],"name":"stuff"}]})
db.test.find({},{"name":1,"items.histPrices":[10, 12]})
It will return all the objects that have a match for items.histPrices:[10,12], including ALL of the items in items[]. But I don't want the ones that don't match the condition.
From the comments left on Mongodb two years ago, the solution to get only the items with that histPrices[10,12] is to do it with javascript code, namely, loop through the result set and filter out the other items.
I wonder if there's a way to do that with just the query.
Your find query is wrong
db.test.find({},{"name":1,"items.histPrices":[10, 12]})
Your condition statement should be in the first part of the find statement.In your query {} means fetch all documents similar to this sql
select items from test (no where clause)
you have to change your mongodb find to
db.test.find({"items.histPrices":[10, 12]},{"name":1})
make it work
since your items is an array and if you wanted to return only the matching sub item, you have to use positional operator
db.test.find({"items.histPrices":[10, 12]},{"name":1,'items.$':1})
When working with arrays Embedded to the Document, the best approach is the one suggested by Chien-Wei Huang.
I would just add another aggregation, with the $group (in cases the document is very long, you may not want to retrieve all its content, only the array elements) Operator.
Now the command would look like:
db.test.aggregate({$match:{name:"record"}},
{$unwind:"$items"},
{$match {"items.histPrices":[10, 12]}},
{$group: {_id: "$_id",items: {$push: "$items"}}});)
If you are interested to return only one element from the array in each collection, then you should use projection instead
The same kind of issue solved here:
MongoDB Retrieve a subset of an array in a collection by specifying two fields which should match
db.test.aggregate({$unwind:"$items"}, {$match:{"items.histPrices":[10, 12]}})
But I don't know whether the performance would be OK. You have to verify it with your data.
The usage of $unwind
If you want add some filter condition like name="record", just add another $march at first, ex:
db.test.aggregate({$match:{name:"record"}}, {$unwind:"$items"}, {$match:{"items.histPrices":[10, 12]}})
https://jira.mongodb.org/browse/SERVER-828
Get particular element from mongoDB array
MongoDB query to retrieve one array value by a value in the array

How does MongoDB evaluates multiple $or statements?

How will MongoDB evaluate this query:
db.testCol.find(
{
"$or" : [ {a:1, b:12}, {b:9, c:15}, {c:10, d:"foo"} ]
});
When scanning values in a document if first OR statement is TRUE will the other statements be also be evaluated?
Logically if the MongoDB is optimized other values in OR statement should not be evaluated, but I don't know how MongoDB is implemented.
UPDATE:
I updated my query because it was wrong and it didn't explain correctly what I was trying to accomplish. I need to find a set of documents that have different properties and if an exact combination of these properties is found the document must be returned.
The SQL equivalent of my query would be:
SELECT * FROM testCol
WHERE (a = 1 AND b = 12) OR (b = 9 AND c = 15) OR (c = 10 AND d = 'foo');
MongoDB will execute each clause of the $or operation as a seperate query and remove duplicates as a post processing pass. As such each clause can use a seperate index which is often very useful.
In other words, it will NOT look at 1 document, see which of the OR clauses apply and do an early-out if the first clause is a match. Rather it does a full dataset query per clause and de-dupe after the fact. This may seem less than efficient but in practice it's almost always faster since the first approach would only be able to hit at most one index for all clauses which is rarely efficient.
EDIT: Mongo only skips documents during the de-duplication process, not during the table scans.
Mongo won't check documents that are already part of the result set. So if your first {a:1, b:12} returns 100% of the documents, Mongo is done.
You want to put whatever will grab the most documents as your first evaluated statement because of this. If your first item only grabs 1% of documents, the subsequent item will need to scan the other 99%.
That being said, you are using $or to look for values in a single key. I think you want to use $in for this.
See here for more:
http://books.google.com/books?id=BQS33CxGid4C&lpg=PA48&ots=PqvQJPRUoe&dq=mongo%20tips%20and%20tricks%20%22OR-query%22&pg=PA48#v=onepage&q&f=false