Assume we have a collection foo with index {tag: 1} where tag is a single key-value pair (there are a lot more details in my actual use-case, but I'm trying to distill down the problem):
{_id: 1, tag: {bar: "BAR"}}
{_id: 2, tag: {baz: "BAZ"}}
When I query {tag: { $gte: { baz: MinKey() }}}, it returns BOTH documents (unexpected).
When I query {tag: { $gte: { baz: "" }}}, it returns only {_id: 2, tag: {baz: "BAZ"}} (expected).
According to https://docs.mongodb.com/manual/reference/bson-type-comparison-order/#objects, BSON objects are ordered: 1) by field names and 2) by field values.
So why does {tag: { $gte: { baz: MinKey() }}} return {_id: 1, tag: {bar: "BAR"}} when bar is NOT GREATER THAN baz?
Note the passage a few lines down in the documentation you linked:
Non-existent Fields
The comparison treats a non-existent field as if it were an empty BSON Object. As such, a sort on the a field in documents { } and { a: null } would treat the documents as equivalent in sort order.
This is telling you that non-existent fields and fields set to null are treated specially.
In order for the documents {} and {a: null} to be equivalent in sort order, the sort algorithm must be considering the missing sort field to be present, and have a value of null.
If you explicitly add the missing field, just to see how it looks, the ordering makes more sense.
The filter {tag: { $gte: { baz: MinKey() }}} applied to {_id: 1, tag: {bar: "BAR"}} is essentially comparing {baz: MinKey()} with {baz: null, bar: "BAR"}.
Near the top of the documentation you linked, it states that MinKey is less than null, so that is the proper ordering.
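To see that ordering directly, here is a minimal sketch using a hypothetical throwaway collection named scratch:

db.scratch.insertMany([
  { _id: "minkey", v: MinKey() },
  { _id: "null", v: null }
])
// The MinKey document sorts first, confirming MinKey < null
db.scratch.find().sort({ v: 1 })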
EDIT
In general, querying is most efficient when the field names are not themselves data. In terms of a tabular database, which column would contain "baz"?
A slight change in the schema would simplify this type of query. Instead of {tagname: tagvalue}, use {k: tagname, v: tagvalue}. You could then index tag.k and/or tag.v and query on tag.k to find all documents with a "baz" tag, and inequality queries on tags would work more intuitively:
db.collection.find({"tag.k":{$gte:"baz"}})
Exact matches could be done with $elemMatch, like:
db.collection.find({tag: {$elemMatch:{k:"baz",v:"BAZ"}}})
If you really need the returned documents to contain {tagname: tagvalue}, the $arrayToObject aggregation operator can do that:
db.collection.aggregate([
  // Match on the indexed tag key
  {$match: {
    "tag.k": {$gte: "baz"}
  }},
  // Convert the {k, v} pair back to {tagname: tagvalue}
  {$addFields: {
    tag: {$arrayToObject: [["$tag"]]}
  }}
])
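Assuming tag is stored as a single {k, v} subdocument, the [["$tag"]] form wraps it in a one-element array, which $arrayToObject then converts back into {tagname: tagvalue}, so a matching document comes back looking like {_id: 2, tag: {baz: "BAZ"}}.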
Consider the following documents:
{
  _id: 1,
  a: [{b: true}, {b: true}]
},
{
  _id: 2,
  a: [{b: false}, {b: true}]
},
{
  _id: 3,
  a: [{b: true}]
}
I'd like to write a query that will return all of the top level documents that have an array ("a") that contain only elements matching {b : true}. In this example, I'm expecting the first and third document to be returned, but not the second.
When writing the query like this, all 3 documents are returned:
{a : {b : true}}
Is there an operator that I'm missing? I've reviewed quite a few of them ($all, among others) and I'm not sure which would best match this use case.
Thanks so much
Simply use $allElementsTrue on the array a.b.
db.collection.find({
  $expr: {
    "$allElementsTrue": "$a.b"
  }
})
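Here "$a.b" resolves to the array of b values from the elements of a (for example, [false, true] for the document with _id: 2), and $allElementsTrue returns true only when every value in that array is truthy, so only _id: 1 and _id: 3 match.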
I have a MongoDB collection:
{
  user_id: 1,
  items: [ { _id: 1 }, { _id: 2 }, { _id: 3 } ]
}
I want to remove the items of the array having a specific _id. Can anybody explain what is wrong with the following query?
db.col.findOneAndUpdate({user_id:1},{$pull:{items:{$elemMatch:{_id:2}}}})
$pull takes a condition as a parameter, so you don't have to use $elemMatch (it doesn't work in this case). Try:
db.col.update({user_id:1},{$pull:{items:{_id:2}}})
The condition here means that MongoDB will remove any array element whose _id is set to 2, even if that element has other properties as well.
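If you want the document back as well, the same $pull condition works with findOneAndUpdate; a minimal sketch (returnNewDocument: true returns the post-update document rather than the original):

db.col.findOneAndUpdate(
  { user_id: 1 },
  { $pull: { items: { _id: 2 } } },
  { returnNewDocument: true }
)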
In MongoDB, is there any easy way to check the order of elements in an array? For example, I have a document like this:
{
_id: 1,
tags: ["mongodb", "rethinkdb", "couchbase", "others"]
}
I would like to check in the tags field whether mongodb comes before rethinkdb or not (in this array, mongodb is at index 0 and rethinkdb at index 1, so mongodb comes first and our case matches).
But if there is another document (like below) where rethinkdb comes before mongodb, the case does not match:
{
_id: 2,
tags: ["rethinkdb", "mongodb", "couchbase"]
}
Here mongodb (index 1) comes after rethinkdb (index 0), so our case does not match.
Your question is not really as clear as you think it is, which is why there are several ways to answer it:
If you are looking just to find out if a document has "mongodb" as the first element of the array then you just issue a query like this:
db.collection.find({ "tags.0": "mongodb" })
And that will return only the documents that match the given value at the specified index position using "dot notation".
If you actually expect to match when an array contains exactly an expected set of values, then you can get some help from the aggregation pipeline and the set operators available in MongoDB 2.6 (note that $setEquals compares as sets, so it ignores order and duplicates):
db.collection.aggregate([
  { "$project": {
    "_id": "$$ROOT",
    "matched": { "$setEquals": [
      "$tags",
      ["mongodb", "rethinkdb", "couchbase", "others"]
    ]}
  }},
  { "$match": { "matched": true }}
])
Or if what you want is to make sure that the "mongodb" value comes before the "rethinkdb" value, then you will need to evaluate in JavaScript with mapReduce, or something equally not nice like the $where operator:
db.collection.find({
  "$where": function() {
    // Note: indexOf returns -1 when a value is absent
    return this.tags.indexOf("mongodb") < this.tags.indexOf("rethinkdb");
  }
})
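If you would rather avoid JavaScript evaluation, here is a sketch of an aggregation-expression alternative, assuming MongoDB 3.6+ for $expr (like indexOf, $indexOfArray returns -1 when the value is absent):

db.collection.find({
  $expr: {
    $lt: [
      { $indexOfArray: [ "$tags", "mongodb" ] },
      { $indexOfArray: [ "$tags", "rethinkdb" ] }
    ]
  }
})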
I have the following document:
{
  'date': date,
  '_id': ObjectId,
  'Log': [
    {
      'lat': float,
      'lng': float,
      'date': float,
      'speed': float,
      'heading': float,
      'fix': float
    }
  ]
}
For one document, the Log array can contain some hundreds of entries.
I need to query the first and last date elements of Log on each document. I know how to query it, but I need it to be fast, so I would like to build an index for that. I don't want to index Log.date since it is too big... how can I index them?
In fact, it's hard to advise without knowing how you work with the documents. One of the solutions could be to use a sparse index. You just need to add a new field to every first and last array element; let's call it shouldIndex. Then just create a sparse index which includes the shouldIndex and date fields. Here's a short example.
Assume we have this document:
{"Log":
[{'lat': 1, 'lng': 2, 'date': new Date(), shouldIndex : true},
{'lat': 3, 'lng': 4, 'date': new Date()},
{'lat': 5, 'lng': 6, 'date': new Date()},
{'lat': 7, 'lng': 8, 'date': new Date(), shouldIndex : true}]}
Please note that the first element and the last one contain the shouldIndex field.
db.testSparseIndex.ensureIndex(
  { "Log.shouldIndex": 1, "Log.date": 1 },
  { sparse: true }
)
This index should contain entries only for your first and last elements.
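As a sketch, a query like this could then be served by that index (the date bound here is hypothetical):

db.testSparseIndex.find({
  "Log.shouldIndex": true,
  "Log.date": { $gte: ISODate("2014-01-01T00:00:00Z") }
})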
Alternatively, you may store the date fields of the first and last elements in a separate array.
For more info on sparse indexes, please refer to the MongoDB documentation.
Hope it helps!
So there was an answer about indexing that is fundamentally correct. As of writing, though, it seems a little unclear whether you are talking about indexing at all. It almost seems like what you want to do is get the first and last date from the elements in your array.
With that in mind there are a few approaches:
1. The elements in your array have been naturally inserted in increasing date values
So if all writes to this field are made only with the $push operator over a period of time, and you never update these items (at least not so far as changing a date), then your items are already in order.
What this means is you can just get the first and last element from the array:
db.collection.find({ _id: id },{ Log: {$slice: 1 }}); // gets the first element
db.collection.find({ _id: id },{ Log: {$slice: -1 }}); // gets the last element
Now of course that is two queries, but it's a relatively simple operation and not costly.
2. For some reason your elements are not naturally ordered by date
If this is the case, or indeed if you just can't live with the two-query form, then you can get the first and last values in a single aggregation using the $min and $max operators:
db.collection.aggregate([
  // You might want to $match first; matching a single _id is left commented out here
  //{"$match": { "_id": id }},

  // Unwind the array so each Log entry becomes its own document
  {"$unwind": "$Log" },

  // Group back per document, taking the min and max of the dates
  {"$group": {
    "_id": "$_id",
    "firstDate": {"$min": "$Log.date" },
    "lastDate": {"$max": "$Log.date" }
  }}
])
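This returns one document per _id containing just the two dates, e.g. { "_id": 1, "firstDate": ISODate("..."), "lastDate": ISODate("...") }.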
So finally, if your use case here is getting the details of the array elements that hold the first and last date, we can do that as well, somewhat mirroring the initial two-query form, using $first and $last:
db.collection.aggregate([
  // You might want to $match first; matching a single _id is left commented out here
  //{"$match": { "_id": id }},

  // Unwind the array so each Log entry becomes its own document
  {"$unwind": "$Log" },

  // Sort the unwound entries by document and date
  {"$sort": { "_id": 1, "Log.date": 1 }},

  // Group using $first and $last to keep the whole first and last entries
  {"$group": {
    "_id": "$_id",
    "firstLog": {"$first": "$Log" },
    "lastLog": {"$last": "$Log" }
  }}
])
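The $sort stage before $group is what makes this correct: $first and $last simply take the first and last document seen in each group, so the unwound entries must already be ordered by date within each _id.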
Your mileage may vary, but those approaches may obviate the need for an index, if this would indeed be the only usage for that index.
I'm a MongoDB novice so please forgive me if this question has an obvious answer...
Context:
I've followed the example in the MongoDB docs to implement hierarchical aggregation using map-reduce. The example uses a "compound" _id field as a map-reduce key, producing aggregate documents like this:
{
  _id: { u: "rick", d: ISODate("2010-10-10T14:00:00Z") },
  value: {
    ts: ISODate('2010-10-10T15:01:00Z'),
    total: 254,
    count: 10,
    mean: 25.4
  }
}
This is all well and good. My particular use case requires that values for several similar keys be emitted at each map step. For example...
{
  _id: { u: "rick", d: ISODate("2010-10-10T14:00:00Z"), hobby: "wizardry" },
  value: {
    ts: ISODate('2010-10-10T15:01:00Z'),
    total: 254,
    count: 10,
    mean: 25.4
  }
}
{
  _id: { u: "rick", d: ISODate("2010-10-10T14:00:00Z"), gender: "male" },
  value: {
    ts: ISODate('2010-10-10T15:01:00Z'),
    total: 254,
    count: 10,
    mean: 25.4
  }
}
(The values are the same, but the _id keys are slightly different.)
This is also well and good.
Question:
Now I'd like to aggregate over my hierarchical collections (views), which contain documents having several different compound _id fields, but only over documents with $matching _id fields. For example, I'd like to aggregate over just the documents possessing the {u: String, d: Date, hobby: String} type _id or just the documents with an _id of type {u: String, d: Date}.
I'm aware that I can use the $exists operator to restrict which _id fields should and shouldn't be permitted, but I don't want to have to create a separate aggregation for each _id (potentially many).
Is there a simple way of programmatically restricting $matching documents to those containing (or not containing) particular fields in an aggregate?
I think the best way to address this issue is by storing your data differently. Your "_id" sort of has arbitrary values as keys, and that is something you should avoid. I would probably store the documents as:
{
  _id: { u: "rick", d: ISODate("2010-10-10T14:00:00Z"), type: "hobby", value: "wizardry" }
}
{
  _id: { u: "rick", d: ISODate("2010-10-10T14:00:00Z"), type: "gender", value: "male" }
}
And then your $match becomes simple, even without having to create a different match for each type.
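For example, a minimal sketch of such a match stage (the collection name and the rest of the pipeline are assumed):

db.collection.aggregate([
  // Select only the documents whose compound _id carries a "hobby" key
  { $match: { "_id.type": "hobby" } }
  // ...rest of the pipeline
])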