Mongo query array.length within an array - mongodb

I tried to use length on an array that is within another array and got an error:
db.quotes.find( { $where: "this.booksWithQuote.authors.length>0" } )
// fails with this.booksWithQuote.authors is undefined
I'm basing this syntax on the example from "MongoDB – find all documents where an array / list size is greater than N", which works:
db.domain.find( {booksWithQuote: {$exists:true}, $where:'this.booksWithQuote.length>0'} )
// Above works
So I'm wondering whether this is possible: how can I find all documents that have array1.array2 where the length of array2 is greater than zero?
I've tried the following but it fails with a syntax error:
db.quotes.find( {
"booksWithQuote": {$exists: true} },
"booksWithQuote.authors": {$exists: true} },
$where: "booksWithQuote.authors.length>0" } )
// fails with this.booksWithQuote.authors is undefined
It's worth pointing out that if I knew the author of a book, the nested array with array searching worked. Pretty cool!
db.quotes.find( { "booksWithQuote.authors" : "Sam Fisher" } )
// Returns all quotes that have a book that has the author Sam Fisher
But in my case I'm just trying to find all of the quotes that have at least one author on any given book.
To follow along, consider this example.
I have a collection of quotes, and each quote has a list of books where the quote was used, and each book has an array of authors.
Below is some sample data so you can understand the structure of the data. It shows one quote with no books, another quote with a book but no authors, and a third quote with books and several authors.
[
{
quoteText: "Love to code",
booksWithQuote: [ ]
},
{
quoteText: "Another Quotesake",
booksWithQuote: [
{
title: "Where is elmo",
authors: [ ]
}
]
},
{
quoteText: "For goodness sake",
booksWithQuote: [
{
title: "The search for Elmo",
authors: [
"John Smith",
"Sam Fisher",
"Jim Wiggins"
]
},
{
title: "Finding Elmo",
authors: [ "Sam Fisher" ]
},
{
title: "Mercy me",
authors: [ ]
}
]
}
]
So to reiterate, how can I find all documents that have an array within another array where the second array has one or more elements?

You can use the $exists operator and refer to the first element of the inner array using dot notation:
db.col.find({ "booksWithQuote.authors.0": { $exists: true } })
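As a rough illustration of why the original $where failed and what the "authors.0" check matches, here is a plain JavaScript sketch run against the sample documents from the question. The helper name hasBookWithAuthors is made up for this example; it only mimics the query's behaviour in memory and is not part of MongoDB.

```javascript
// Sample documents from the question, trimmed to the relevant fields.
const quotes = [
  { quoteText: "Love to code", booksWithQuote: [] },
  {
    quoteText: "Another Quotesake",
    booksWithQuote: [{ title: "Where is elmo", authors: [] }]
  },
  {
    quoteText: "For goodness sake",
    booksWithQuote: [
      { title: "The search for Elmo", authors: ["John Smith", "Sam Fisher", "Jim Wiggins"] },
      { title: "Finding Elmo", authors: ["Sam Fisher"] },
      { title: "Mercy me", authors: [] }
    ]
  }
];

// booksWithQuote is an array, so it has no "authors" property of its own.
// That is why this.booksWithQuote.authors is undefined inside $where.
const authorsOnArray = quotes[2].booksWithQuote.authors;

// Hypothetical helper: true when at least one book has a first author,
// which is the condition { "booksWithQuote.authors.0": { $exists: true } } expresses.
function hasBookWithAuthors(doc) {
  return (doc.booksWithQuote || []).some(
    (book) => Array.isArray(book.authors) && book.authors.length > 0
  );
}

const matches = quotes.filter(hasBookWithAuthors).map((doc) => doc.quoteText);
console.log(matches); // [ 'For goodness sake' ]
```

The same per-book check could be used inside a $where function, but the $exists form avoids running JavaScript on the server.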

MongoDB documentation: https://www.mongodb.com/docs/manual/reference/operator/query/expr/
You can use $expr to achieve this (selectedArray and size are placeholders for your array field and the minimum length):
$expr:{$gte: [{$size: "$selectedArray"}, size]}

Related

MongoDB query match for several subfields

After spending several hours trying to solve this, and not finding my answer in the docs or on StackOverflow, I'm opening a question here.
I have a large collection (3.5M documents) and want to filter out those that match on a specific combination of subfields.
E.g. the documents look like:
{
_id:...,
...<a number of fields>
"ML":[
{
"_id": ...,
... <more fields>
"Op": [
{
"_id": ...,
"Pr": {
"P94": <number>,
"P95" : ...,
...,
"P145": <optional and number>
}
},
{...},
...
]
},
{...},
...
],
...
}
So P145 is sometimes there, sometimes not.
I want to find all documents that have an "ML.Op.Pr" where "P94" is 8 and "P145" exists.
I've tried and failed (as I get no/0 results):
.find({"ML.Op.Pr":{"P94":8,"P145":1}})
.find({"ML.Op.Pr":{$and:[{"P94":8},{"P145":1}]}})
I've also tried $and as a first step,
.find({$and:[{"ML.Op.Pr.P94":8},{"ML.Op.Pr.P145":1}]})
but since both ML and Op are an array with multiple entries, it returns too many results. I need both Pr's to be set in the same array element.
As you can see I'm first trying to find where P145 = 1, because when I replace it with $exists it doesn't parse at all.
How should I do this?
You have to use a nested $elemMatch for each level of sub-array in order to get the desired result.
db.collection.find({
"ML": {
"$elemMatch": {
"Op": {
"$elemMatch": {
"Pr.P94": 8,
"Pr.P145": {
"$exists": true
},
}
}
}
}
})
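To make the difference between plain dot notation and nested $elemMatch concrete, here is a small in-memory sketch in plain JavaScript. The two documents and the function names matchesAnywhere and matchesSameElement are invented for illustration; they only imitate how the two query styles combine conditions.

```javascript
// Two hypothetical documents shaped like the question's schema.
const docs = [
  // P94 = 8 and P145 live in different Op elements.
  { _id: 1, ML: [{ Op: [{ Pr: { P94: 8 } }, { Pr: { P94: 3, P145: 1 } }] }] },
  // P94 = 8 and P145 live in the same Op element.
  { _id: 2, ML: [{ Op: [{ Pr: { P94: 8, P145: 1 } }] }] }
];

// Dot notation: each condition may be satisfied by a different Op element.
function matchesAnywhere(doc) {
  const ops = doc.ML.flatMap((ml) => ml.Op);
  return ops.some((op) => op.Pr.P94 === 8) && ops.some((op) => "P145" in op.Pr);
}

// Nested $elemMatch: both conditions must hold for the same Op element.
function matchesSameElement(doc) {
  return doc.ML.some((ml) =>
    ml.Op.some((op) => op.Pr.P94 === 8 && "P145" in op.Pr)
  );
}

const looseIds = docs.filter(matchesAnywhere).map((d) => d._id);
const strictIds = docs.filter(matchesSameElement).map((d) => d._id);
console.log(looseIds, strictIds); // [ 1, 2 ] [ 2 ]
```

Document 1 is the "too many results" case from the question: it passes the loose check but not the per-element one.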
Mongo Playground sample execution

How can I update a property within an array of objects based on its existing value in Mongo?

I have some documents with the following structure...
{
user: "Joe",
lists: [
{ listId: "1234", listName: "dogs" },
{ listId: "5678", listName: "cats" }
]
}
I am trying to prepend a string to each listId field but I am stuck. Amongst other things I have tried...
db.users.updateMany(
{"lists.listId": /^[0-9a-f]{20,}$/},
[{$set:
{"lists.listId.$[]": {"$concat": ["0000", "$lists.listId"]}}
}]
)
But I got the error message: "FieldPath field names may not start with '$'"
Variations on this write results into the appropriate field, but not the results I'm after.
I've bashed my head against the docs for a few hours now but all the references I can find to using the positional operator to reference the value of the field that is being updated use the field name directly, not referenced as a property like I am doing. I've not really messed with pipelines a lot before and I'm finding it all a bit confusing! Someone kindly helped me with a closely related problem yesterday, using $map, and that worked great for a plain array of strings but I haven't had any luck adapting that to an array of objects with string properties. Sorry if this is Mongo 101, the docs are good, but there's a lot of them and I'm not sure which bits are relevant to this.
You can do it like this:
db.users.update({},
[
{
"$set": {
lists: {
$map: {
input: "$lists",
in: {
$mergeObjects: [
{
"listName": "$$this.listName",
"listId": {
$concat: [
"0000",
"$$this.listId"
]
}
}
]
}
}
}
}
}
],
{
"multi": true
})
Here is the working example: https://mongoplayground.net/p/Q8kUTB6X5JY
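For readers less familiar with pipeline updates, the $map + $concat combination above behaves much like Array.prototype.map in plain JavaScript. The sketch below is only an in-memory analogy using the document from the question; prefixListIds is a hypothetical helper, not a MongoDB API.

```javascript
// Document from the question.
const user = {
  user: "Joe",
  lists: [
    { listId: "1234", listName: "dogs" },
    { listId: "5678", listName: "cats" }
  ]
};

// Hypothetical helper: returns a copy of the document in which every
// lists[i].listId has the prefix prepended, leaving other fields untouched.
function prefixListIds(doc, prefix) {
  return {
    ...doc,
    // The object spread plays the role of merging the original element
    // with the new listId value.
    lists: doc.lists.map((item) => ({ ...item, listId: prefix + item.listId }))
  };
}

const updated = prefixListIds(user, "0000");
console.log(JSON.stringify(updated.lists));
```

Spreading the whole element keeps any additional properties the list items might have. In the pipeline, a comparable effect can be had by passing "$$this" as the first argument to $mergeObjects instead of listing each field by hand.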

How does 'fuzzy' work in MongoDB's $searchBeta stage of aggregation?

I'm not quite understanding how fuzzy works in the $searchBeta stage of aggregation. I'm not getting the result I want when trying to implement full-text search on my backend. Full text search for MongoDB was released last year (2019), so there really aren't many tutorials and/or references to go by besides the documentation. I've read the documentation, but I'm still confused, so I would like some clarification.
Let's say I have these 5 documents in my db:
{
"name": "Lightning Bolt",
"set_name": "Masters 25"
},
{
"name": "Snapcaster Mage",
"set_name": "Modern Masters 2017"
},
{
"name": "Verdant Catacombs",
"set_name": "Modern Masters 2017"
},
{
"name": "Chain Lightning",
"set_name": "Battlebond"
},
{
"name": "Battle of Wits",
"set_name": "Magic 2013"
}
And this is my aggregation in MongoDB Compass:
db.cards.aggregate([
{
$searchBeta: {
search: { //search has been deprecated, but it works in MongoDB Compass; replace with 'text'
query: 'lightn',
path: ["name", "set_name"],
fuzzy: {
maxEdits: 1,
prefixLength: 2,
maxExpansion: 100
}
}
}
}
]);
What I'm expecting my result to be:
[
{
"name": "Lightning Bolt", //lightn is in 'Lightning'
"set_name": "Masters 25"
},
{
"name": "Chain Lightning", //lightn is in 'Lightning'
"set_name": "Battlebond"
}
]
What I actually get:
[] //empty array
I don't really understand why my result is empty, so it would be much appreciated if someone explained what I'm doing wrong.
What I think is happening:
db.cards.aggregate... looks in the "name" and "set_name" fields for terms that are at most one character edit away from the query "lightn". The closest term in the cards collection, "lightning", is three edits away, which is more than maxEdits: 1 (and more than the largest value maxEdits accepts, 2), so an empty array is the expected result. "Fuzzy is used to find strings which are similar to the search term or terms"; it is controlled by maxEdits and prefixLength.
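To check the edit-count reasoning by hand, here is a minimal Levenshtein distance sketch in plain JavaScript. It is an illustration only: editDistance is a textbook implementation written for this example, and the exact metric Atlas Search uses may differ in details (for instance in how transposed characters are counted).

```javascript
// Classic dynamic-programming Levenshtein distance:
// the minimum number of single-character inserts, deletes and substitutions.
function editDistance(a, b) {
  // prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // delete from a
        curr[j - 1] + 1,   // insert into a
        prev[j - 1] + cost // substitute (or keep when equal)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

const query = "lightn";
const term = "lightning";
const distance = editDistance(query, term);
console.log(distance); // 3 (insert "i", "n", "g")
console.log(distance <= 1); // false, so maxEdits: 1 cannot bridge the gap
```

Fuzzy matching compares whole terms, so a query that is merely a prefix of a word does not match unless the remaining characters fit within maxEdits.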
Have you tried the term operator with the wildcard option? I think the below aggregation would get you the results you were actually expecting.
e.g.
db.cards.aggregate([
{$searchBeta:
{"term":
{"path":
["name","set_name"],
"query": "l*h*",
"wildcard":true}
}}]).pretty()
You need to provide an index to use with your search query.
The index determines the analyzer that processes your query and your data, which controls whether you get a full match of the text, a partial match, and so on.
You can read more about analyzers here.
In your case, an index based on the STANDARD analyzer will help.
After you create your index, your code, modified below, will work:
db.cards.aggregate([
{
$search: {
index: 'index_name_for_analyzer', // name of your search index (STANDARD analyzer in your case)
text: { // 'search' has been deprecated and is replaced with 'text'
query: 'lightn',
path: ["name"], // since you only want to search in one field
fuzzy: {
maxEdits: 1,
prefixLength: 2,
maxExpansions: 100
}
}
}
}
]);

mongodb check regex on fields from one collection to all fields in other collection

After digging through Google and SO for a week, I've ended up asking the question here. Suppose there are two collections,
UsersCollection:
[
{...
name:"James"
userregex: "a|regex|str|here"
},
{...
name:"James"
userregex: "another|regex|string|there"
},
...
]
PostCollection:
[
{...
title:"a string here ..."
},
{...
title: "another string here ..."
},
...
]
I need to get all users whose userregex matches any post.title (I need user_id, post_id groups or something similar).
What I've tried so far:
1. Get all users in the collection and run each regex against all posts. It works but it's too dirty: it has to execute a query for each user.
2. Same as above, but using a forEach in the Mongo query. It's the same thing, only in the database layer instead of the application layer.
I searched a lot for available methods such as aggregations, $unwind, etc., with no luck.
So is it possible to do this in Mongo? Should I change my database type? If yes, what type would be good? Performance is my first priority. Thanks.
It is not possible to reference a regex stored in a document field from the $regex operator inside a $match expression.
So it can't be done on the Mongo side with the current structure.
$lookup works well with an equality condition. So one alternative (similar to what Nic suggested) would be to update your posts collection to include an extra field called keywords (an array of keyword values it can be searched on) for each title.
db.users.aggregate([
{$lookup: {
from: "posts",
localField: "userregex",
foreignField: "keywords",
as: "posts"
}
}
])
The above query will do something like this (works from 3.4).
keywords: { $in: [ userregex.elem1, userregex.elem2, ... ] }.
From the docs
If the field holds an array, then the $in operator selects the
documents whose field holds an array that contains at least one
element that matches a value in the specified array (e.g. <value1>,
<value2>, etc.)
It looks like earlier versions (tested on 3.2) will only match if the arrays have the same values, in the same order, and the same length.
Sample Input:
Users
db.users.insertMany([
{
"name": "James",
"userregex": [
"another",
"here"
]
},
{
"name": "John",
"userregex": [
"another",
"string"
]
}
])
Posts
db.posts.insertMany([
{
"title": "a string here",
"keywords": [
"here"
]
},
{
"title": "another string here",
"keywords": [
"another",
"here"
]
},
{
"title": "one string here",
"keywords": [
"string"
]
}
])
Sample Output:
[
{
"name": "James",
"userregex": [
"another",
"here"
],
"posts": [
{
"title": "another string here",
"keywords": [
"another",
"here"
]
},
{
"title": "a string here",
"keywords": [
"here"
]
}
]
},
{
"name": "John",
"userregex": [
"another",
"string"
],
"posts": [
{
"title": "another string here",
"keywords": [
"another",
"here"
]
},
{
"title": "one string here",
"keywords": [
"string"
]
}
]
}
]
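The array-to-array matching that produces the sample output can be imitated in plain JavaScript. This is only an in-memory sketch of the matching rule (a post is joined when its keywords share at least one value with the user's userregex array); lookupPosts is a hypothetical helper and does not reflect how the server executes $lookup.

```javascript
// Sample input from above.
const users = [
  { name: "James", userregex: ["another", "here"] },
  { name: "John", userregex: ["another", "string"] }
];
const posts = [
  { title: "a string here", keywords: ["here"] },
  { title: "another string here", keywords: ["another", "here"] },
  { title: "one string here", keywords: ["string"] }
];

// Hypothetical stand-in for the $lookup stage: keep every post whose
// keywords array contains at least one of the user's values.
function lookupPosts(user, allPosts) {
  const wanted = new Set(user.userregex);
  return allPosts.filter((post) => post.keywords.some((k) => wanted.has(k)));
}

const joined = users.map((u) => ({ ...u, posts: lookupPosts(u, posts) }));
for (const u of joined) {
  console.log(u.name, u.posts.map((p) => p.title));
}
```

Note that this matches exact keyword values only, so it replaces the regex alternation "a|b|c" with a list of literal words rather than evaluating a regular expression.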
MongoDB is good for your use case, but you need a different approach from your current one. Since you are only concerned about whether a user's regex matches any post title, you can store the last result of such a match. Below is some example code:
db.users.find({last_post_id: {$exists: 0}}).forEach(
function(row) {
var regex = new RegExp(row['userregex']);
var found = db.post_collection.findOne({title: regex});
if (found) {
var post_id = found["post_id"];
db.users.updateOne({
user_id: row["user_id"]
}, {
$set :{ last_post_id: post_id}
});
}
}
)
It filters for users that don't have last_post_id set, searches the post records for a title matching each user's regex, and sets last_post_id if a record is found. After running this, you can return the results like this:
db.users.find({last_post_id: {$exists: 1}}, {user_id:1, last_post_id:1, _id:0})
The only thing you need to be concerned about is an edit to or deletion of an existing post. After every edit/delete, you should run the code below so that all matches for that post id are evaluated again.
post_id_changed = 1
db.users.updateMany({last_post_id: post_id_changed}, {$unset: {last_post_id: 1}})
This makes sure that the next time you run the update, these users are processed again. The approach has one drawback: for every user without a matching title, the query runs again each time. You can work around that with a timestamp or post-count check.
Also, make sure to put an index on post_collection.title.
I was thinking that if you pre-tokenized your post titles like this:
{
"_id": ...
"title": "Another string there",
"keywords": [
"another",
"string",
"there"
]
}
but unfortunately $lookup requires that foreignField is a single element, so my idea of something like this will not work :( But maybe it will give you another idea?
db.Post.aggregate([
{$lookup: {
from: "Users",
localField: "keywords",
foreignField: "keywords",
as: "users"
}
},
])
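Pre-tokenizing a title into a keywords array, as in the document shown above, could look roughly like this in application code. The tokenizeTitle helper and its rules (lower-casing, splitting on anything that is not a letter or digit, dropping duplicates) are assumptions for this sketch; a real setup might also want stop-word removal or stemming.

```javascript
// Hypothetical helper: lower-case a title and split it into unique word tokens.
function tokenizeTitle(title) {
  const words = title.toLowerCase().match(/[a-z0-9]+/g) || [];
  return [...new Set(words)]; // keep first occurrence of each word
}

const post = { title: "Another string there" };
post.keywords = tokenizeTitle(post.title);
console.log(post.keywords); // [ 'another', 'string', 'there' ]
```

The resulting array would be stored alongside the title when a post is inserted or edited, so that it stays in sync.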

Query for any nested subdocuments

I would like to perform a query for a given nested value on multiple subdocuments.
In the example below, I would like to perform the search across multiple "product_types" objects.
{
"product_types": {
"type_1": [
{
name: "something",
price: 100
},
{
name: "something else",
price: 50
}
],
"type_2": [
{
name: "another one",
price: 20
},
{
name: "and a last one",
price: 30
}
]
}
}
I understood that the dollar sign matches any subdocument. Here is what I came up with to get all the products with a "price" value of 100. But it doesn't work. Any idea?
db.inventory.find( { product_types.$.price : 100 } )
PS: I anticipate some answers saying that such a db design to store products would be very bad, and I agree; this is just an example to illustrate the kind of query I want to perform.
MongoDB doesn't support any sort of wildcard property like you're trying to do here with $. However, you can search multiple properties using an $or operator:
db.inventory.find({ $or: [
{ "product_types.type_1.price": 100 },
{ "product_types.type_2.price": 100 }
]})
But this is going to return the matching documents in full rather than just the matched array elements, so you'll also have to post-process the docs in your code to pull those out.
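The post-processing step mentioned above might look like the following plain JavaScript sketch, applied to a document already returned by the query. findProductsByPrice is a hypothetical helper; it walks every key under product_types, so it does not need to know the type names in advance.

```javascript
// Document from the question, as it would come back from the query.
const doc = {
  product_types: {
    type_1: [
      { name: "something", price: 100 },
      { name: "something else", price: 50 }
    ],
    type_2: [
      { name: "another one", price: 20 },
      { name: "and a last one", price: 30 }
    ]
  }
};

// Hypothetical helper: collect the matching array elements from every type,
// tagging each hit with the type it came from.
function findProductsByPrice(d, price) {
  const hits = [];
  for (const [type, products] of Object.entries(d.product_types || {})) {
    for (const product of products) {
      if (product.price === price) hits.push({ type, ...product });
    }
  }
  return hits;
}

const hits = findProductsByPrice(doc, 100);
console.log(hits); // [ { type: 'type_1', name: 'something', price: 100 } ]
```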
Searching within embedded documents can be done like this:
db.inventory.find({"product_types.type_1.price":100})
The field name should be inside quotes (" ")! Otherwise it will throw a syntax error.