I have a unitScores collection, where each document has an id and an array of documents like this:
"_id": ObjectId("52134edd5b1c2bb503000001"),
"scores": [
{
"userId": ObjectId("5212bf3869bf351223000002"),
"unitId": ObjectId("521160695483658217000001"),
"score": 33
},
{
"unitId": ObjectId("521160695483658217000001"),
"userId": ObjectId("5200f6e4006292d308000008"),
"score": 17
}
]
I have two find queries:
_id:new ObjectID(scoreId)
"scores.userId":new ObjectID(userId)
"scores.unitId":new ObjectID(unitId)
and
_id:new ObjectID(scoreId)
scores:
$elemMatch:
userId:new ObjectID(userId)
unitId:new ObjectID(unitId)
I would expect them to give the same result, but using the input userId and unitId of
userId: 5212bf3869bf351223000002
unitId: 521160695483658217000001
the dot notation version returns the wrong array entry (the one with score:17) and the $elemMatch returns the correct entry (the one with score:33). Why is that?
$elemMatch does not apply the same logic as dot notation. $elemMatch requires a single array element to satisfy all of the conditions. With dot notation, each condition can be satisfied by a different element of the array. You're seeing different results because the query logic is different.
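The difference between the two rules can be modelled in plain JavaScript. This is only a sketch of the matching semantics, not MongoDB driver code; the sample document and the helper names (`matchesDotNotation`, `matchesElemMatch`) are hypothetical, and short strings stand in for ObjectIds.

```javascript
// Plain-JavaScript model of the two matching rules (no database involved).
// Hypothetical sample: short strings stand in for ObjectIds.
const doc = {
  scores: [
    { userId: "u1", unitId: "unitA", score: 33 },
    { userId: "u2", unitId: "unitB", score: 17 },
  ],
};

// Dot notation: each condition may be satisfied by a different array element.
function matchesDotNotation(d, userId, unitId) {
  return (
    d.scores.some((s) => s.userId === userId) &&
    d.scores.some((s) => s.unitId === unitId)
  );
}

// $elemMatch: one single array element must satisfy every condition.
function matchesElemMatch(d, userId, unitId) {
  return d.scores.some((s) => s.userId === userId && s.unitId === unitId);
}

// "u1" and "unitB" both occur in the array, but never in the same element.
console.log(matchesDotNotation(doc, "u1", "unitB")); // true
console.log(matchesElemMatch(doc, "u1", "unitB")); // false
```

The document matches under dot notation because some element has the userId and some other element has the unitId; under the $elemMatch rule it does not match, because no single element has both.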
Related
In a MongoDB collection, I have documents with a "position" field for ordering and an optional "date" field, e.g.
[
{
"_id": "doc1",
"position": 1
},
{
"_id": "doc2",
"position": 2,
"date": "2021-05-20T08:00:00.000Z"
},
{
"_id": "doc3",
"position": 3
},
{
"_id": "doc4",
"position": 4,
"date": "2021-05-20T08:00:00.000Z"
}
]
I would like to query this collection to get the documents "before" a specified date, in position order. The algorithm would be:
find the first element whose date is "after" the specified date
return all the documents whose position is less than the position of the element found, sorted by "position"
I have implemented this algorithm naïvely with 2 independent queries. However, I suspect it can be done with a single call to the database, but I have no idea how to proceed. Maybe with an aggregation pipeline?
Can someone give me a clue how this can be done?
EDIT: Here are the current queries I use (roughly):
limit_element = db.getCollection('collection').find({
"date": { "$gte": ISODate("2021-05-20T08:00:00.000Z") }
}).sort({
"position": 1
}).limit(1)
position = limit_element['position']
elements = db.getCollection('collection').find({
"position": { "$lt": position }
}).sort({
"position": 1
})
You can use an aggregation pipeline with two match clauses. Essentially it's the same thing you do now, but within one DB access, so it may be a bit faster. With aggregation you can access the results of the previous stage in the next stage. Whether that is worth it is for you to decide; I think your naive approach is sensible. In any case this is a conditional problem, so you will have to find the limit element first and then fetch the rest. The only difference is where the steps run.
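To make the two dependent steps concrete, here is an in-memory sketch of the algorithm from the question, run on its sample documents. It is not a database query; the function name `documentsBefore` and the fallback when no limit element exists are assumptions made for illustration.

```javascript
// In-memory model of the two-step algorithm; no database is involved.
const docs = [
  { _id: "doc1", position: 1 },
  { _id: "doc2", position: 2, date: "2021-05-20T08:00:00.000Z" },
  { _id: "doc3", position: 3 },
  { _id: "doc4", position: 4, date: "2021-05-20T08:00:00.000Z" },
];

function documentsBefore(all, limitDate) {
  const sorted = [...all].sort((a, b) => a.position - b.position);
  // Step 1: first document (in position order) dated on or after the limit,
  // mirroring the $gte filter with sort and limit(1) in the question.
  const limit = sorted.find(
    (d) => d.date !== undefined && new Date(d.date) >= limitDate
  );
  // Assumption: if no such document exists, nothing bounds the result,
  // so every document is returned.
  if (limit === undefined) return sorted;
  // Step 2: everything positioned before the limit element.
  return sorted.filter((d) => d.position < limit.position);
}

const result = documentsBefore(docs, new Date("2021-05-20T08:00:00.000Z"));
console.log(result.map((d) => d._id)); // [ 'doc1' ]
```

Step 2 cannot start until step 1 has produced a position, which is why the problem stays conditional wherever the steps run.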
I'm trying to get a negative match for $geoWithin, which will be used in MongoDB Charts.
All of the required information is in the result of the latest stage of an aggregation I'm constructing in MongoDB Compass. The result of that stage looks like this:
{
"PizzaId": "123",
"info": {
"timestamp": {
"$date": "2021-02-15T05:00:00.000Z"
},
"location": {
"type": "Point",
"coordinates": [33.21883773803711, 33.802675247192383]
},
"dayOfWeek": 2
},
"PizzaLocation": [{
"_id": "456",
"location": {
"type": "Point",
"coordinates": [37.83396911621094, 37.07674026489258]
}
}]
}
I want to add a stage after that: a filter that checks that info.location is not within a 100 km radius of PizzaLocation.0.location:
{
$match: {
"info.location.coordinates": {
$not:
{
$geoWithin: {
$centerSphere: [
"$PizzaLocation.0.location.coordinates",
100 / 6378.1
]
}
}
}
}
}
I get an error: Point must be an array or object
Things I tried:
playing with the field name in centerSphere: removing the 0, or $, using:
{$arrayElemAt: ["$PizzaLocation.location.coordinates",0]}
even used the [lon,lat] format and put
[{$arrayElemAt: [{$arrayElemAt: ["$PizzaLocation.location.coordinates",0]},0]},
{$arrayElemAt: [{$arrayElemAt: ["$PizzaLocation.location.coordinates",0]},1]}]
setting literal coordinates instead of a field name: it worked, but I need to use a field.
creating a view that holds the centerSphere itself and using a lookup to get it, but MongoDB didn't recognize $geoWithin or $centerSphere in an $addFields stage
Things I verified:
I used $project stage on {$arrayElemAt: ["$PizzaLocation.location.coordinates",0]} , and indeed it showed in the array: [lon,lat]
I used $project stage on
{$arrayElemAt: [{$arrayElemAt: ["$PizzaLocation.location.coordinates",0]},0]}
and
{$arrayElemAt: [{$arrayElemAt: ["$PizzaLocation.location.coordinates",0]},1]}
and indeed it showed a number for each one.
So, how can I use a field's value(s) in the first argument of $centerSphere?
thank you.
You can't do this directly. Let's first understand why not.
From the $match docs:
The query syntax is identical to the read operation query syntax;
This means $match queries use the same syntax as find queries, and unsurprisingly $geoWithin is a query operator.
Unfortunately, plain query syntax cannot access other values of the document as part of the query ($expr can, but it only accepts aggregation expressions, and $geoWithin is not one of them). This is also why your query fails: the "coordinates" you pass are parsed as a literal string, not as a field reference.
For example this following query:
{
$match: {
field1: {$eq: "$field2"}
}
}
Matches: { field1: "$field2" } but not { field1: 1, field2: 1 }
Again this is just the query language parser's behaviour so there's not much you can do.
The alternative is to use the $geoNear stage, but there is no easy way to combine it with $not logic, and it comes with additional restrictions, such as having to be the first stage of the pipeline.
The best I can recommend is to split your query into two parts: first fetch the document you need, and only then re-query using $geoWithin with the proper coordinates as literal input.
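A minimal sketch of the second step, assuming the document from the first query is already in hand. Only the stage object is built here; running it needs a MongoDB connection, which is left out. `buildNotWithinStage` is a hypothetical helper, not part of any MongoDB API, and the field paths are copied from the question.

```javascript
// Build the $match stage with literal coordinates taken from a fetched document.
const EARTH_RADIUS_KM = 6378.1; // same radius the question divides by

// Hypothetical helper, not part of any MongoDB API.
function buildNotWithinStage(pizzaDoc, radiusKm) {
  // [lon, lat] of the first PizzaLocation entry, as in the question.
  const center = pizzaDoc.PizzaLocation[0].location.coordinates;
  return {
    $match: {
      "info.location.coordinates": {
        $not: {
          // $centerSphere takes the radius in radians: distance / earth radius.
          $geoWithin: { $centerSphere: [center, radiusKm / EARTH_RADIUS_KM] },
        },
      },
    },
  };
}

// Sample taken from the stage output shown in the question.
const fetched = {
  PizzaLocation: [
    {
      _id: "456",
      location: {
        type: "Point",
        coordinates: [37.83396911621094, 37.07674026489258],
      },
    },
  ],
};

const stage = buildNotWithinStage(fetched, 100);
console.log(JSON.stringify(stage, null, 2));
```

Because the coordinates are now literal values in the stage, the query parser no longer has to resolve a field path, which is the case the question reports as working.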
I am a beginner in MongoDB and I am stuck. I am trying to fetch data from a nested array, but it takes a long time because there are around 50K documents, and the results are not accurate either. The schema structure is below:
{
"_id": {
"$oid": "6001df3312ac8b33c9d26b86"
},
"City": "Los Angeles",
"State":"California",
"Details": [
{
"Name": "Shawn",
"age": "55",
"Gender": "Male",
"profession": " A science teacher with STEM",
"inDate": "2021-01-15 23:12:17",
"Cars": [
"BMW","Ford","Opel"
],
"language": "English"
},
{
"Name": "Nicole",
"age": "21",
"Gender": "Female",
"profession": "Law student",
"inDate": "2021-01-16 13:45:00",
"Cars": [
"Opel"
],
"language": "English"
}
],
"date": "2021-01-16"
}
Here I am trying to filter by date and Details.Cars, like this:
db.getCollection('news').find({"Details.Cars":"BMW","date":"2021-01-16"})
It also returns the details of other persons who do not have a BMW. I only want to display the details of a person like Shawn, who has a BMW (or another specific array value) and the matching date, and not Nicole. The rest should not appear, but that is not what happens.
Any help is appreciated. :)
A combination of $match on the top-level fields and $filter on the array elements will do what you seek.
db.foo.aggregate([
{$match: {"date":"2021-01-16"}}
,{$addFields: {"Details": {$filter: {
input: "$Details",
as: "zz",
cond: { $in: ['BMW','$$zz.Cars'] }
}}
}}
,{$match: {$expr: { $gt:[{$size:"$Details"},0] } }}
]);
Notes:
$unwind is overly expensive for what is needed here and it likely means "reassembling" the data shape later.
We use $addFields where the new field to add (Details) already exists. This effectively means "overwrite in place" and is a common idiom when filtering an array.
The second $match will eliminate docs where the date matches but not a single entry in Details.Cars is a BMW i.e. the array has been filtered down to zero length. Sometimes you want to know this info so if this is the case, do not add the final $match.
I recommend you look into using real dates i.e. ISODate instead of strings so that you can easily take advantage of MongoDB date math and date formatting functions.
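The effect of the $filter stage on one document can be modelled in plain JavaScript. This is a sketch of the semantics only, using a trimmed copy of the sample document; `filterDetailsByCar` is a hypothetical name.

```javascript
// In-memory model of overwriting Details with its filtered version.
const doc = {
  City: "Los Angeles",
  date: "2021-01-16",
  Details: [
    { Name: "Shawn", Cars: ["BMW", "Ford", "Opel"] },
    { Name: "Nicole", Cars: ["Opel"] },
  ],
};

// Keep the document shape, but only the Details entries whose Cars
// array contains the requested car (the role of $filter with $in).
function filterDetailsByCar(d, car) {
  return { ...d, Details: d.Details.filter((p) => p.Cars.includes(car)) };
}

const filtered = filterDetailsByCar(doc, "BMW");
console.log(filtered.Details.map((p) => p.Name)); // [ 'Shawn' ]

// A document whose Details end up empty is what the final $match removes.
console.log(filterDetailsByCar(doc, "Audi").Details.length); // 0
```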
A common mistake is to think that find({"nested.array": value}) will return only the matching nested object. In fact, this query returns the whole document that contains a nested object with the desired value.
The query is returning the whole document where value BMW exists in the array Details.Cars. So, Nicole is returned too.
To solve this problem:
To get multiple elements that match the criteria, you can use an aggregation with $unwind to split the array into one document per element, and then match by the criteria you want.
db.collection.aggregate([
{
"$match": { "Details.Cars": "BMW", "date": "2021-01-16" }
},
{
"$unwind": "$Details"
},
{
"$match": { "Details.Cars": "BMW" }
}
])
This query first matches by the criteria, to avoid running $unwind over the whole collection.
Then $unwind produces one document per array element, and the second $match keeps only the elements you want.
To get only one element (for example, if you match by _id and it is unique), you can use $elemMatch in the projection, like this:
db.collection.find({
"Details.Cars": "BMW",
"date": "2021-01-16"
},
{
"Details": {
"$elemMatch": {
"Cars": "BMW"
}
}
})
You can use $elemMatch in either the query or the projection.
Using $elemMatch in the query looks like this:
db.collection.find({
"Details": {
"$elemMatch": {
"Cars": "BMW"
}
},
"date": "2021-01-16"
},
{
"Details.$": 1
})
The result is the same. In the second case you are using the positional operator, which returns, as the docs say:
The first element that matches the query condition on the array.
That is, the first element where "Cars": "BMW".
You can choose the way you want.
I have been stuck on how to query db which the common data structure of every document looks as:
{
"_id": {
"$oid": "5e0983863bcf0dab51f2872b"
},
"word": "never", // get the `word` value for each of below queries
"wordset_id": "a42b50e85e",
"meanings": [{
"id": "1f1bca9d9f",
"def": "not ever",
"speech_part": "adverb",
"synonyms": ["ne'er"]
}, {
"id": "d35f973ed0",
"def": "not at all",
"speech_part": "adverb"
}]
}
1) a query to get every word with speech_part: "adverb" (e.g. never, ...)
2) a query to get every word with a word length of 6 and speech_part: "adverb"
I have learnt from SO that, to search the whole database, I first have to retrieve all of its collections, but writing the query is where I am stuck.
db.collection.find({"meanings.speech_part":"adverb"},{"_id":0, "word":1})
The query above returns every word that has a specific speech_part.
The first part of the query is the filter predicate, in your scenario matching speech_part. If the field you match on were not inside another object, or inside an object within an array, you could just write {column_name: "something"}.
Because speech_part is inside an object that is inside an array, you have to write {"parentColumn.key": "something"}, in your case {"meanings.speech_part": "adverb"}.
The second part of the query is the projection, where you define which fields you want in your result. To get only the word values, use {word: 1}; to get more fields, use {word: 1, etc: 1}. MongoDB projects _id by default, so to remove _id from the result you have to explicitly set {_id: 0}.
db.collection.find({
"meanings.speech_part":"adverb",
"$expr": { "$eq": [ { "$strLenCP": "$word" }, 6 ] }
},{"_id":0, "word":1})
This returns every word of a specific speech_part with a length of exactly 6, as the question asks. This query is a bit more complex; you can look up the $expr documentation. Inside $expr you can apply an operator to a field and match on the result. Here $strLenCP calculates the length of the word value in code points, and the $eq comparison operator checks whether it equals 6. To match words longer than 6 characters instead, replace $eq with $gt.
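The combination of the nested-field filter, the length test, and the projection can be modelled in plain JavaScript. This is only a sketch of the semantics; the sample entries and the names `codePointLength` and `wordsWith` are hypothetical.

```javascript
// In-memory model of the filter and projection; hypothetical sample data.
const entries = [
  { word: "never", meanings: [{ speech_part: "adverb" }] },
  { word: "seldom", meanings: [{ speech_part: "adverb" }] },
  { word: "bright", meanings: [{ speech_part: "adjective" }] },
  { word: "sometimes", meanings: [{ speech_part: "adverb" }] },
];

// Spreading a string iterates by code point, which is what $strLenCP counts.
const codePointLength = (s) => [...s].length;

function wordsWith(docs, speechPart, lengthTest) {
  return docs
    // "meanings.speech_part": at least one meaning has the speech part.
    .filter((d) => d.meanings.some((m) => m.speech_part === speechPart))
    // $expr with $strLenCP: test the length of the word field.
    .filter((d) => lengthTest(codePointLength(d.word)))
    // Projection {_id: 0, word: 1}: keep only the word.
    .map((d) => d.word);
}

console.log(wordsWith(entries, "adverb", (n) => n === 6)); // [ 'seldom' ]
console.log(wordsWith(entries, "adverb", (n) => n > 6)); // [ 'sometimes' ]
```

Passing the length test as a function shows that switching between "exactly 6" and "greater than 6" only changes the comparison, not the rest of the query.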
You may try below query to get the matching rows. You will have to try the same with pymongo.
db.getCollection('test-collection').find(
{
'meanings.speech_part': 'adverb'
},
{
_id: 0,
word: 1
}
);
Read about the projections in mongodb here:
https://docs.mongodb.com/manual/tutorial/project-fields-from-query-results
I'm currently stuck on processing MongoDB documents to get field-wise values. For example, say the collection contains these documents:
[
{ "name": "test1", "age": 20, "gender": "male" },
{ "name": "test2", "age": 21, "gender": "female" },
{ "name": "test3", "age": 30, "gender": "male"}
]
Expected Output:
{
"name": ["test1","test2","test3"],
"age": [20,21,30],
"gender": ["male","female", "male"]
}
Is it possible to retrieve data from MongoDB in the above format? I don't want to write JavaScript functions to process this. I'm looking to retrieve the data using MongoDB functions along with the find query.
You'd need to use the aggregation framework to get the desired result. The following aggregation pipeline first uses the $match operator to filter the documents that enter the pipeline for grouping. This is similar to the find() query filter.
db.collection.aggregate([
{ "$match": { "age": { "$gte": 20 } } }, // filter on users with age >= 20
{
"$group": {
"_id": null,
"name": { "$push": "$name" },
"age": { "$push": "$age" },
"gender": { "$push": "$gender" }
}
},
{
"$project": {
"_id": 0,
"name": 1,
"age": 1,
"gender": 1
}
}
])
Sample Output
{
"name": ["test1", "test2", "test3"],
"age": [20, 21, 30],
"gender": ["male", "female", "male"]
}
In the above pipeline, the first pipeline step is the $match operator which is similar to SQL's WHERE clause. The above example filters incoming documents on the age field (age greater than or equal to 20).
One thing to note here is when executing a pipeline, MongoDB pipes operators into each other. "Pipe" here takes the Linux meaning: the output of an operator becomes the input of the following operator. The result of each operator is a new collection of documents. So Mongo executes the previous pipeline as follows:
collection | $match | $group | $project => result
The next pipeline stage is the $group operator. Inside the $group pipeline, you are now grouping all the filtered documents where you can specify an _id value of null to calculate accumulated values for all the input documents as a whole. Use the available accumulators to return the desired aggregation on the grouped documents. The accumulator operator $push is used in this grouping operation because it returns an array of expression values for each group.
Accumulators used in the $group stage maintain their state (e.g. totals, maximums, minimums, and related data) as documents progress through the pipeline.
To get the documents with the desired field, the $project operator which is similar to SELECT in SQL is used to rename the field names and select/deselect the fields to be returned, out of the grouped fields. If you specify 0 for a field, it will NOT be sent in the pipeline to the next operator.
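What $match followed by $group with $push does to the sample documents can be modelled in plain JavaScript. This is a sketch of the semantics only, not a replacement for the pipeline; `pivot` is a hypothetical name.

```javascript
// In-memory model of $match followed by $group with $push accumulators.
const people = [
  { name: "test1", age: 20, gender: "male" },
  { name: "test2", age: 21, gender: "female" },
  { name: "test3", age: 30, gender: "male" },
];

function pivot(docs, minAge) {
  // One output object for the single group (_id: null in the pipeline).
  // Note: the real $group emits no document at all for empty input,
  // whereas this model returns empty arrays.
  const out = { name: [], age: [], gender: [] };
  for (const d of docs) {
    if (d.age < minAge) continue; // $match: keep age >= minAge
    out.name.push(d.name); // $push: "$name"
    out.age.push(d.age); // $push: "$age"
    out.gender.push(d.gender); // $push: "$gender"
  }
  return out;
}

console.log(pivot(people, 20));
// { name: [ 'test1', 'test2', 'test3' ],
//   age: [ 20, 21, 30 ],
//   gender: [ 'male', 'female', 'male' ] }
```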
You cannot do this with the find command.
Try using mongodb's aggregation pipeline.
Specifically use $group in combination with $push
See here: https://docs.mongodb.com/manual/reference/operator/aggregation/group/#pipe._S_group