Query for any nested subdocuments - mongodb

I would like to perform a query for a given nested value on multiple subdocuments.
In the example below, I would like to perform the search across multiple "product_types" objects.
{
"product_types": {
"type_1": [
{
name: "something",
price: 100
},
{
name: "something else",
price: 50
}
],
"type_2": [
{
name: "another one",
price: 20
},
{
name: "and a last one",
price: 30
}
]
}
}
I understood the dollar sign matches any subdocument. Here is what I came up with to get all the product with a "price" value of 100. But it doesn't work. Any idea?
db.inventory.find( { product_types.$.price : 100 } )
PS: I anticipate on some answers saying that such a db design to store products would be very bad and I agree; this is just an example to illustrate the kind of query I want to perform.

MongoDB doesn't support any sort of wildcard property like you're trying to do here with $. However, you can search multiple properties using an $or operator:
db.inventory.find({ $or: [
{ "product_types.type_1.price": 100 },
{ "product_types.type_2.price": 100 }
]})
But this is going to return the matching documents in full rather than just the matched array elements, so you'll also have to post-process the docs in your code to pull those out.
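A minimal sketch of that post-processing step in plain JavaScript, runnable without a database. The helper names (buildPriceFilter, matchingProducts) and the sampleInventoryDoc variable are made up for illustration and are not part of any MongoDB API; the sketch assumes every value under product_types is an array of products.

```javascript
// Hypothetical helper: build the $or filter with one quoted dotted path per type.
function buildPriceFilter(typeNames, price) {
  return {
    $or: typeNames.map(function (type) {
      const clause = {};
      clause["product_types." + type + ".price"] = price;
      return clause;
    }),
  };
}

// Hypothetical helper: pull only the matching array elements out of a
// document that the query returned in full.
function matchingProducts(doc, price) {
  const matches = [];
  Object.keys(doc.product_types || {}).forEach(function (type) {
    doc.product_types[type].forEach(function (product) {
      if (product.price === price) {
        matches.push({ type: type, name: product.name, price: product.price });
      }
    });
  });
  return matches;
}

// The document from the question.
const sampleInventoryDoc = {
  product_types: {
    type_1: [
      { name: "something", price: 100 },
      { name: "something else", price: 50 },
    ],
    type_2: [
      { name: "another one", price: 20 },
      { name: "and a last one", price: 30 },
    ],
  },
};

console.log(JSON.stringify(buildPriceFilter(["type_1", "type_2"], 100)));
console.log(matchingProducts(sampleInventoryDoc, 100));
```

The filter object can be passed to db.inventory.find(...) as is; the list of type names still has to come from somewhere, since the query language has no wildcard for the key.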

Searching with embedded documents can be done as
db.inventory.find({"product_types.type_1.price":100})
The dotted field name must be wrapped in quotes (" "). Otherwise the shell throws a syntax error.

Related

How does 'fuzzy' work in MongoDB's $searchBeta stage of aggregation?

I'm not quite understanding how fuzzy works in the $searchBeta stage of aggregation. I'm not getting the desired result that I want when I'm trying to implement full-text search on my backend. Full text search for MongoDB was released last year (2019), so there really aren't many tutorials and/or references to go by besides the documentation. I've read the documentation, but I'm still confused, so I would like some clarification.
Let's say I have these 5 documents in my db:
{
"name": "Lightning Bolt",
"set_name": "Masters 25"
},
{
"name": "Snapcaster Mage",
"set_name": "Modern Masters 2017"
},
{
"name": "Verdant Catacombs",
"set_name": "Modern Masters 2017"
},
{
"name": "Chain Lightning",
"set_name": "Battlebond"
},
{
"name": "Battle of Wits",
"set_name": "Magic 2013"
}
And this is my aggregation in MongoDB Compass:
db.cards.aggregate([
{
$searchBeta: {
search: { //search has been deprecated, but it works in MongoDB Compass; replace with 'text'
query: 'lightn',
path: ["name", "set_name"],
fuzzy: {
maxEdits: 1,
prefixLength: 2,
maxExpansion: 100
}
}
}
}
]);
What I'm expecting my result to be:
[
{
"name": "Lightning Bolt", //lightn is in 'Lightning'
"set_name": "Masters 25"
},
{
"name": "Chain Lightning", //lightn is in 'Lightning'
"set_name": "Battlebond"
}
]
What I actually get:
[] //empty array
I don't really understand why my result is empty, so it would be much appreciated if someone explained what I'm doing wrong.
What I think is happening:
db.cards.aggregate... searches the "name" and "set_name" fields for terms that are at most one edit (maxEdits: 1) away from the query "lightn". The closest term in the cards collection, "lightning", is three edits away (the missing "ing"), so an empty array is the expected result. "Fuzzy is used to find strings which are similar to the search term or terms"; it is tuned with maxEdits and prefixLength.
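To make the edit counting concrete, here is a small sketch that computes the plain Levenshtein distance between two terms. It is only an illustration: the editDistance function is hypothetical, it assumes the terms are compared in lowercase, and the search engine's own metric may differ in detail (for example in how it counts transposed characters).

```javascript
// Hypothetical helper: classic Levenshtein distance (insert, delete, substitute).
function editDistance(a, b) {
  // prev[j] is the distance between the processed prefix of a and b.slice(0, j).
  let prev = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(editDistance("lightn", "lightning"));    // 3: too far for maxEdits 1
console.log(editDistance("lightming", "lightning")); // 1: within maxEdits 1
```

In other words, fuzzy matching tolerates typos in a whole word; it does not turn a truncated word into a prefix search.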
Have you tried the term operator with the wildcard option? I think the below aggregation would get you the results you were actually expecting.
e.g.
db.cards.aggregate([
{$searchBeta:
{"term":
{"path":
["name","set_name"],
"query": "l*h*",
"wildcard":true}
}}]).pretty()
You need to provide an index to use with your search query.
The index determines the analyzer that processes your query, which controls, for example, whether the text has to match in full or may match partially.
You can read more about Analyzers from here
In your case, an index based on STANDARD analyzer will help.
After you create your index your code, modified below, will work:
db.cards.aggregate([
{
$search: {
index: 'index_name', // name of the search index you created (STANDARD analyzer in your case)
text: { // 'search' has been deprecated; use 'text' instead
query: 'lightn',
path: ["name"], // since you only want to search in one field
fuzzy: {
maxEdits: 1,
prefixLength: 2,
maxExpansions: 100
}
}
}
}
]);

Mongo query array.length within an array

I tried to use length on an array that is within another array and got an error:
db.quotes.find( { $where: "this.booksWithQuote.authors.length>0" } )
// fails with this.booksWithQuote.authors is undefined
I'm basing this syntax on this example MongoDB – find all documents where an array / list size is greater than N that works:
db.domain.find( {booksWithQuote: {$exists:true}, $where:'this.booksWithQuote.length>0'} )
// Above works
So I'm wondering if this is possible, how can I find all documents that have array1.array2 where array2 length is greater than zero.
I've tried the following but it fails with a syntax error:
db.quotes.find( {
"booksWithQuote": {$exists: true} },
"booksWithQuote.authors": {$exists: true} },
$where: "booksWithQuote.authors.length>0" } )
// fails with this.booksWithQuote.authors is undefined
It's worth pointing out that if I knew the author of a book, the nested array with array searching worked. Pretty cool!
db.quotes.find( { "booksWithQuote.authors" : "Sam Fisher" } )
// Returns all quotes that have a book that has the author Sam Fisher
But in my case I'm just trying to find all of the quotes that have more than one author on any given book.
To follow along, consider this example.
I have a collection of quotes, and each quote has a list of books where the quote was used, and each book has an array of authors.
Below is some sample data so you can understand the structure of the data. It shows one quote with no books, another quote with a book but no authors, and a third quote with books and several authors.
[
{
quoteText: "Love to code",
booksWithQuote: [ ]
},
{
quoteText: "Another Quotesake",
booksWithQuote: [
{
title: "Where is elmo",
authors: [ ]
}
]
},
{
quoteText: "For goodness sake",
booksWithQuote: [
{
title: "The search for Elmo",
authors: [
"John Smith",
"Sam Fisher",
"Jim Wiggins"
]
},
{
title: "Finding Elmo",
authors: [ "Sam Fisher" ]
},
{
title: "Mercy me",
authors: [ ]
}
]
}
]
So to reiterate, How can I find all documents that have an array within another array where the 2nd array has one or more elements?
You can use $exists operator and refer to first element of an array using dot notation:
db.col.find({ "booksWithQuote.authors.0": { $exists: true } })
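To see which documents that filter selects, the same condition can be written as a plain JavaScript predicate and run over the sample data from the question. This is a sketch only: hasBookWithAuthors and sampleQuotes are illustrative names, and the predicate mirrors the intent of the query (some book has at least one author) rather than the server's matching rules in every edge case.

```javascript
// Hypothetical predicate: true when any book has a first author,
// i.e. its authors array is non-empty.
function hasBookWithAuthors(quote) {
  return (quote.booksWithQuote || []).some(function (book) {
    return Array.isArray(book.authors) && book.authors.length > 0;
  });
}

// Sample data from the question.
const sampleQuotes = [
  { quoteText: "Love to code", booksWithQuote: [] },
  {
    quoteText: "Another Quotesake",
    booksWithQuote: [{ title: "Where is elmo", authors: [] }],
  },
  {
    quoteText: "For goodness sake",
    booksWithQuote: [
      { title: "The search for Elmo", authors: ["John Smith", "Sam Fisher", "Jim Wiggins"] },
      { title: "Finding Elmo", authors: ["Sam Fisher"] },
      { title: "Mercy me", authors: [] },
    ],
  },
];

const matchedQuotes = sampleQuotes.filter(hasBookWithAuthors).map(function (q) {
  return q.quoteText;
});
console.log(matchedQuotes); // [ 'For goodness sake' ]
```

Changing the index in the query (for example "booksWithQuote.authors.1") corresponds to raising the length threshold in the predicate, which covers the "more than one author" case.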
MongoDB documentation: https://www.mongodb.com/docs/manual/reference/operator/query/expr/
You can use $expr to achieve this, where "$selectedArray" is the array field and size is the minimum length:
$expr:{$gte: [{$size: "$selectedArray"}, size]}

mongodb query to verify embedded array sequence numbers

given a document structure as shown, where the trades array can have thousands of items... how on earth could one do a query that would verify that the sequence always has 'startTradeId' one number higher than the previous items 'endTradeId', all the way through the array? is this even possible?
{
"name": "STOCK",
"trades": [{
"endTradeId": 41306,
"startTradeId": 41302,
...
},
{
"endTradeId": 41301,
"startTradeId": 41297,
...
},
{
"endTradeId": 41296,
"startTradeId": 41240,
...
},
...
]
}
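You can use the $where operator with a function that walks the array and compares each item with its neighbour, like below (this assumes the array is ordered newest first, as in your sample, and that the ids are stored as plain numbers):
db.your_collection.find({ $where: function () { return this.trades.every(function (t, i, a) { return i === 0 || a[i - 1].startTradeId === t.endTradeId + 1; }); } });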
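The same check can also be done client-side, which has the advantage of reporting where the sequence breaks. The sketch below is plain JavaScript; firstGapIndex is a hypothetical helper, and it assumes the trades array is ordered newest first, as in the sample, so each item's startTradeId should be one more than the next item's endTradeId.

```javascript
// Hypothetical helper: returns the index of the first trade that breaks
// the sequence, or -1 if the whole array is contiguous.
function firstGapIndex(trades) {
  for (let i = 1; i < trades.length; i++) {
    if (trades[i - 1].startTradeId !== trades[i].endTradeId + 1) {
      return i;
    }
  }
  return -1;
}

// The three trades shown in the question.
const contiguousTrades = [
  { endTradeId: 41306, startTradeId: 41302 },
  { endTradeId: 41301, startTradeId: 41297 },
  { endTradeId: 41296, startTradeId: 41240 },
];

// Same data with one id missing between the second and third trade.
const brokenTrades = [
  { endTradeId: 41306, startTradeId: 41302 },
  { endTradeId: 41301, startTradeId: 41297 },
  { endTradeId: 41295, startTradeId: 41240 },
];

console.log(firstGapIndex(contiguousTrades)); // -1
console.log(firstGapIndex(brokenTrades));     // 2
```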

mongodb get all keys within a string

Is it possible to search a string if I have some data stored like
Names:
{
name: 'john'
},
{
name: 'pete'
},
{
name: 'jack smith'
}
Then I perform a query like
{ $stringContainsKeys: 'pete said hi to jack smith' }
and it would return
{
name: 'pete'
},
{
name: 'jack smith'
}
I'm not sure that this is even possible in mongoDB or if this kind of searching has a specific name.
Yes, quite possible indeed through the use of the $text operator which performs a text search on the content of the fields indexed with a text index.
Suppose you have the following test documents:
db.collection.insert([
{
_id: 1, name: 'john'
},
{
_id: 2, name: 'pete'
},
{
_id: 3, name: 'jack smith'
}
])
First you need to create a text index on the name field of your document:
db.collection.createIndex( { "name": "text" } )
Then run a text search. $text splits the space-delimited search string into terms and performs a logical OR on them, returning documents that contain any of the terms.
The following query specifies a $search string of six terms delimited by spaces, "pete said hi to jack smith":
db.collection.find( { "$text": { "$search": "pete said hi to jack smith" } } )
This query returns documents that contain either pete or said or hi or to or jack or smith in the indexed name field:
/* 0 */
{
"_id" : 3,
"name" : "jack smith"
}
/* 1 */
{
"_id" : 2,
"name" : "pete"
}
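Because $text ORs the individual terms, a document named "jack jones" would also match on "jack". If the name has to appear in the sentence in full, one option is a client-side substring check over the candidate documents. This swaps in a different technique from the $text query above; namesMentionedIn and nameDocs are illustrative names only.

```javascript
// Hypothetical helper: keep the documents whose whole name occurs in the sentence.
// Note: this is a plain substring test, so "pete" would also match "competed".
function namesMentionedIn(sentence, docs) {
  const haystack = sentence.toLowerCase();
  return docs.filter(function (doc) {
    return haystack.indexOf(doc.name.toLowerCase()) !== -1;
  });
}

const nameDocs = [
  { _id: 1, name: "john" },
  { _id: 2, name: "pete" },
  { _id: 3, name: "jack smith" },
  { _id: 4, name: "jack jones" }, // extra document, added to show the difference
];

console.log(namesMentionedIn("pete said hi to jack smith", nameDocs));
// keeps _id 2 and _id 3 only
```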
Starting from MongoDB 2.6 you can search a collection for documents that match any of the search terms.
db.names.find( { $text: { $search: "pete said hi to jack smith" } } )
This will search for each of the terms separated by space.
You can find more information about this at
http://docs.mongodb.org/manual/reference/operator/query/text/#match-any-of-the-search-terms
However, it works only with individual terms. If you need an exact phrase rather than a single term, e.g. you want to find "jack smith" but not "smith jack", it will not work, so you will have to search for a phrase:
http://docs.mongodb.org/manual/reference/operator/query/text/#search-for-a-phrase which searches for exact phrases in the text.
If you need more advanced text-based search features in your application, you might consider using something like Elasticsearch https://www.elastic.co/guide/en/elasticsearch/reference/1.3/query-dsl-mlt-field-query.html.
Zoran

Query multiple date ranges, return only specific key in MongoDB

In Mongo, I have documents that look like the following:
dateRange: [{
"price": "200",
"dateStart": "2014-01-01",
"dateEnd": "2014-01-30"
},
{
"price": "220",
"dateStart": "2014-02-01",
"dateEnd": "2014-02-15"
}]
Nice and simple, right? Just dates and prices. Now, the tricky part is how I would go about creating a query to find the dateRange that contains 2014-01-12, and then return JUST the price once it's found instead of the entire array of dateRanges.
These dateRanges can get quite large, and I'm trying to minimize the amount of data returned (if this is possible at all with Mongo). Note, the date format I can change up if required, I was just using the above for example purposes.
Any help is appreciated, thanks!
You want to use the $elemMatch projection operator, which is only available from version 2.2 upward. You will also need to make sure you use multikey indexes.
edit: To be clear, you will also have to use the $elemMatch query operator in the find filter, as pointed out in the comment below.
This being said, I agree with the gist of the comment by mnemosyn. It would be better to have each element of the array represented as a single document.
A quick example of $elemMatch to demonstrate the projection. Simply add $elemMatch to the find filter as well.
> db.test.save ( {
_id: 1,
zipcode: 63109,
students: [
{ name: "john", school: 102, age: 10 },
{ name: "jess", school: 102, age: 11 },
{ name: "jeff", school: 108, age: 15 }
]
} );
> db.test.find( { zipcode: 63109 }, { students: { $elemMatch: { school: 102 } } } ).pretty();
{
"_id" : 1,
"students" : [
{
"name" : "john",
"school" : 102,
"age" : 10
}
]
}
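Applied to the dateRange documents from the question, the filter and projection might look like the sketch below. The helper names (buildDateRangeQuery, priceOn) and the sampleProduct variable are illustrative; the sketch assumes dates are stored as "YYYY-MM-DD" strings, which compare correctly as plain strings. priceOn applies the same range condition locally so the logic can be tried without a database.

```javascript
// Hypothetical helper: the same condition is used in the filter (to pick
// documents) and in the projection (to return only the matching element).
function buildDateRangeQuery(date) {
  const condition = { dateStart: { $lte: date }, dateEnd: { $gte: date } };
  return {
    filter: { dateRange: { $elemMatch: condition } },
    projection: { _id: 0, dateRange: { $elemMatch: condition } },
  };
}

// Hypothetical helper: local equivalent of the range condition.
function priceOn(doc, date) {
  const hit = doc.dateRange.find(function (range) {
    return range.dateStart <= date && range.dateEnd >= date;
  });
  return hit ? hit.price : null;
}

// The document from the question.
const sampleProduct = {
  dateRange: [
    { price: "200", dateStart: "2014-01-01", dateEnd: "2014-01-30" },
    { price: "220", dateStart: "2014-02-01", dateEnd: "2014-02-15" },
  ],
};

const dateQuery = buildDateRangeQuery("2014-01-12");
console.log(JSON.stringify(dateQuery.filter));
console.log(priceOn(sampleProduct, "2014-01-12")); // "200"
```

The two objects would be passed as db.collection.find(dateQuery.filter, dateQuery.projection). The projection still returns the whole matching array element (price plus both dates), so the price itself has to be read from it in application code.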
Well, the problem with that schema is that it uses large embedded arrays - this can be quite inefficient, because a mongodb query will always find a document, not a subset of an embedded object. Even if you're using a projection, mongodb will have to read the entire object internally, so if the array becomes huge, say 100k entries, that will slow things down to a halt.
Why not simply separate these array elements into documents, e.g.
{
price : 200,
productId : ObjectId("foo"), // or whatever the price refers to
dateStart : "2014-01-01",
dateEnd : "2014-01-30"
}
This way, mongodb doesn't need to pull the entire object with all prices, but only the prices that match your date range. This will minimize the amount of data transferred. You can then also use the query projection to only return the price, i.e. db.collection.find({ criteria }, {"price" : 1, "_id" : 0}).
Of course, the number of objects will increase dramatically, but efficient indexing will solve that problem. The only inefficiency induced is the duplication of the productId, which is cheaper than dealing with huge embedded arrays.
P.S: I'd suggest using actual dates (ISODate) instead of strings, even if their format is sortable.
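As a rough sketch of that restructuring, the function below turns one embedded dateRange array into separate documents with numeric prices and real Date values. The name toRangeDocuments and the "product-1" id are placeholders for illustration; in the shell the resulting array could be handed to an insert call, and the dates would be stored as ISODate values.

```javascript
// Hypothetical helper: one output document per embedded range.
function toRangeDocuments(productId, dateRange) {
  return dateRange.map(function (range) {
    return {
      productId: productId,
      price: Number(range.price),
      // Parse "YYYY-MM-DD" as midnight UTC so the result does not depend
      // on the local time zone.
      dateStart: new Date(range.dateStart + "T00:00:00Z"),
      dateEnd: new Date(range.dateEnd + "T00:00:00Z"),
    };
  });
}

const embeddedRanges = [
  { price: "200", dateStart: "2014-01-01", dateEnd: "2014-01-30" },
  { price: "220", dateStart: "2014-02-01", dateEnd: "2014-02-15" },
];

const rangeDocs = toRangeDocuments("product-1", embeddedRanges);
console.log(rangeDocs.length);                      // 2
console.log(rangeDocs[0].dateStart.toISOString());  // 2014-01-01T00:00:00.000Z
```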