How can I query several words in indexed fields in pymongo? - mongodb

When I want to execute an indexed text search, I use the following command:
text_results = db.command('text', 'foo', search=query)
I am now wondering how I can query several words. I already tried setting the query to query = ['word1', 'word2'], but that does not work.

Using the search string "word1 word2" searches for the term word1 OR the term word2:
text_results = db.command('text', 'foo', search='word1 word2')
Also, here's a quote from docs:
If the search string includes phrases, the search performs an AND with
any other terms in the search string; e.g. search for "\"twinkle
twinkle\" little star" searches for "twinkle twinkle" and ("little" or
"star").
So to search where the field contains "word1" AND "word2", go for
text_results = db.command('text', 'foo', search="\"word1\" \"word2\"")
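When the words arrive as a list, as in the question, the quoted search string can be assembled programmatically. The following is a minimal sketch; the helper name build_and_search is hypothetical, and stripping embedded double quotes is an assumption made to keep each word inside its own phrase.

```python
def build_and_search(words):
    """Wrap each word in double quotes so that every term is required (AND)."""
    # Assumption: embedded double quotes are dropped so that a word
    # cannot terminate its own phrase early.
    cleaned = [w.replace('"', '').strip() for w in words]
    return ' '.join('"{}"'.format(w) for w in cleaned if w)


search = build_and_search(['word1', 'word2'])
print(search)

# The result can then be passed as the search argument, as in the answer:
# text_results = db.command('text', 'foo', search=search)
```

The helper only builds the string; the search call itself is unchanged.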

Related

Is there a ts (text search) function that would return the found string instead of a boolean?

I am using PostgreSQL to find the matched strings in an article by using tsvector and tsquery.
I read section 12.3, Controlling Text Search, of the PostgreSQL manual, but nothing there helped me get the exact output I wanted.
Query:
SELECT ts_headline('english',
'The most common type of search
is to find all documents containing given query terms
and return them in order of their similarity to the
query.',
to_tsquery('query & similarity'),
'StartSel = <, StopSel = >');
ts_headline output
The most common type of search
is to find all documents containing given <query> terms
and return them in order of their <similarity> to the
<query>.
I'm looking for only the matched strings, as shown below:
query, similarity
If you pick delimiters for StartSel and StopSel that you are sure do not exist elsewhere in the string, then it is pretty easy to do this with a regexp.
SELECT distinct regexp_matches[1] from
regexp_matches(
ts_headline('english',
'The most common type of search
is to find all documents containing given query terms
and return them in order of their similarity to the
query.',
to_tsquery('query & similarity'),
'StartSel = <, StopSel = >'
),
'<(.*?)>','g'
);
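If the headline text is post-processed in application code rather than in SQL, the same idea can be sketched in Python. This is an illustration only: the function name highlighted_terms is hypothetical, and it assumes, as the answer does, that the delimiters do not occur elsewhere in the text.

```python
import re


def highlighted_terms(headline, start='<', stop='>'):
    """Return the distinct delimited terms, in order of first appearance."""
    pattern = re.escape(start) + r'(.*?)' + re.escape(stop)
    matches = re.findall(pattern, headline)
    # dict.fromkeys removes duplicates while keeping insertion order.
    return list(dict.fromkeys(matches))


# The ts_headline output quoted in the question.
headline = (
    "The most common type of search\n"
    "is to find all documents containing given <query> terms\n"
    "and return them in order of their <similarity> to the\n"
    "<query>."
)
print(highlighted_terms(headline))
```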

Text Indexes MongoDB, Minimum length of search string

I have created a text index for collection X from mongo shell
db.X.ensureIndex({name: 'text', cusines: 'text', 'address.city': 'text'})
Now suppose a document's name property has the value seasons, which has length 7.
If I run the find query with a search string of length <= 5:
db.X.find({$text: {$search: 'seaso'}})
it does not return anything. If I change the search string to season (length >= 6), it returns the document.
My question is: does the search string have some minimum length constraint for fetching records?
If yes, is there any way to change it?
MongoDB $text searches do not support partial matching. MongoDB supports text search queries on string content, with case insensitivity and stemming.
Looking at your examples:
// this returns nothing because there is no inferred association between
// the value: 'seasons' and your input: 'seaso'
db.X.find({$text: {$search: 'seaso'}})
// this returns a match because 'season' is seen as a stem of 'seasons'
db.X.find({$text: {$search: 'season'}})
So, this is not an issue with the length of your input. Searching on seaso returns no matches because:
Your text index does not contain the whole word: seaso
Your text index does not contain a whole word for which seaso is a recognised stem
This presumes that the language of your text index is English. You can confirm this by running db.X.getIndexes(), and you'll see this in the definition of your text index:
"default_language" : "english"
FWIW, if your index is case insensitive then the following will also return matches:
db.X.find({$text: {$search: 'SEaSON'}})
db.X.find({$text: {$search: 'SEASONs'}})
Update 1: in response to this question "is it possible to use RegExp".
Assuming the name attribute contains the value seasons and you are searching with seaso, then the following will match your document:
db.X.find({name: {$regex: /^seaso/}})
More details are in the docs, but note:
This will not use your text index, so if you proceed with using the $regex operator then you won't need the text index.
Index coverage with the $regex operator is probably not what you expect. The brief summary is this: if your search value is anchored to the start of the string (e.g. /^seaso/, rather than /easons/) then MongoDB can use a regular index on the field, but otherwise it cannot.
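For pymongo users, an anchored prefix query of this kind can be built from arbitrary user input. The following is a minimal sketch; the helper name prefix_query is hypothetical, and it assumes the input should be matched literally, so regex metacharacters are escaped with re.escape.

```python
import re


def prefix_query(field, prefix):
    """Build a query document matching values that start with `prefix`."""
    # The leading ^ anchors the pattern to the start of the string.
    return {field: {'$regex': '^' + re.escape(prefix)}}


query = prefix_query('name', 'seaso')
print(query)

# Local sanity check of the pattern against the value from the question.
print(bool(re.search(query['name']['$regex'], 'seasons')))

# With a pymongo collection handle (assumed here to be db.X):
# results = list(db.X.find(query))
```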

Pymongo find document whose field is a substring of a given string

Let's say we have a collection with the following documents:
{_id : 1, str : 'hello'}
{_id : 2, str : 'hello world'}
{_id : 3, str : 'world'}
And I would like to find documents whose str field is a substring of hello world!. Is there a way to do this in pymongo?
I know the opposite - getting documents whose field contains a string can be done using $regex, but what I want is getting documents whose field is contained by a string.
You can use text indexes for this, which support text search queries on string content. Text indexes can include any field whose value is a string or an array of string elements.
Here's a minimal example using pymongo:
import pymongo

# Get database connection
conn = pymongo.MongoClient('mongodb://localhost:27017/')
coll = conn.get_database('test').get_collection('test')
# Create text index
coll.create_index([('str', pymongo.TEXT)])
# Text search
print(list(coll.find({'$text': {'$search': 'hello world'}})))
With your example documents, this will result in:
[{u'_id': 3.0, u'str': u'world'},
{u'_id': 2.0, u'str': u'hello world'},
{u'_id': 1.0, u'str': u'hello'}]
For more information, please see:
Text Indexes
$text operator
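To make the condition from the question concrete, here is a plain-Python sketch of "the field is a substring of a given string", applied to documents already loaded into memory. It is only an illustration of the condition, not a MongoDB query; the helper name contained_in and the fourth document are hypothetical. Note that a text search matches on terms, so it may also return documents that merely share a word with the search string.

```python
def contained_in(docs, target, field='str'):
    """Keep documents whose `field` value occurs inside `target`."""
    return [
        d for d in docs
        if isinstance(d.get(field), str) and d[field] in target
    ]


docs = [
    {'_id': 1, 'str': 'hello'},
    {'_id': 2, 'str': 'hello world'},
    {'_id': 3, 'str': 'world'},
    # Hypothetical extra document: shares a word but is not a substring.
    {'_id': 4, 'str': 'hello there'},
]

matches = contained_in(docs, 'hello world!')
print([d['_id'] for d in matches])
```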

Mongodb count query to search for strings containing either one string or the other

Hi, I am trying to get a count of the documents in a MongoDB collection that contain any of a set of strings (words). I have around 50 words (or strings). I am aware that I need to use an "or" query here.
Here is the query I tried, but I am not sure if it is correct:
db.collection.find({"created_at": /^sep 23.*/i, "$and": [{ "text": /.*abc.*/i },{ "text": /.*efg.*/i }]}).count()
You can do this by using $in which acts as an OR match against a single field:
db.collection.find({created_at: /^sep 23/i, text: {$in: [/abc/i, /efg/i] }}).count()
And you can simplify your regular expressions a bit to remove the .* parts because those are already implied.
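With around 50 words, writing the regular expressions by hand gets tedious. In pymongo the $in list can be generated from a word list, since compiled Python patterns are accepted as query values. This is a minimal sketch; the helper name any_word_query is hypothetical, and it assumes the words should be matched literally and case-insensitively.

```python
import re


def any_word_query(words, field='text'):
    """Build a query matching documents whose field contains any of the words."""
    patterns = [re.compile(re.escape(w), re.IGNORECASE) for w in words]
    return {field: {'$in': patterns}}


query = any_word_query(['abc', 'efg'])
# Same date filter as in the answer above.
query['created_at'] = re.compile(r'^sep 23', re.IGNORECASE)

# With a pymongo collection handle (assumed here to be db.collection):
# count = db.collection.count_documents(query)
```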
Given you have not specified that you want to search a specific field in a document, I would suggest the text search option described here: http://docs.mongodb.org/manual/reference/operator/query/text/
With reference to the doc mentioned above you could use:
db.collection.find( { $text: { $search: "word1 word2 word3" } } )
Space-delimited strings are treated as having a logical OR operator between them.

MongoDB - Contains (LIKE) query on concatenated field

I am new to MongoDB.
I am programming an application with spring data and mongodb and I have one class with two fields: firstname and lastname.
I need one query for documents that contain one string in the full name (firstname + lastname).
For example: firstname = "Hansen David", lastname = "Gonzalez Sastoque" and I have a query to find David Gonzalez. In this example I expect there to be a match.
Concatenating the two strings would solve it, but I don't know how to do this.
Create a new array field (call it names) in the document and in that array put each name split by space. In your example the array would have the following contents:
hansen
david
gonzalez
sastoque
(make them all lower case to prevent case sensitivity issues)
Before you do your query, convert your input to lower case and split it by spaces as well.
Now, you can use the $all operator to achieve your objective:
db.persons.find( { names: { $all: [ "david", "gonzalez" ] } } );
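The two preparation steps, building the names array when a document is saved and splitting the search input before querying, can be sketched as follows. The helper names name_tokens and all_names_query are hypothetical; the field name names follows the answer above.

```python
def name_tokens(firstname, lastname):
    """Lower-case the full name and split it on whitespace."""
    return (firstname + ' ' + lastname).lower().split()


def all_names_query(search):
    """Build an $all query from a space-separated search input."""
    return {'names': {'$all': search.lower().split()}}


tokens = name_tokens('Hansen David', 'Gonzalez Sastoque')
query = all_names_query('David Gonzalez')
print(tokens)
print(query)

# Local check mirroring $all: every searched token must be in the array.
print(set(query['names']['$all']) <= set(tokens))
```

Storing the tokens at write time keeps the query itself a simple array match, at the cost of maintaining an extra field.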
You can use the $where operator in your queries:
db.users.findOne({$where: %JavaScript to match the document%})
In your case it may look like this:
db.users.findOne({$where: "this.firstname + ' ' + this.lastname == 'Gonzalez Sastoque'"})
or this:
db.users.findOne({$where: "this.firstname.match(/Gonzalez/) && this.lastname.match(/Sastoque/)"})
My last example does exactly what you want.
Update: Try the following code:
db.users.findOne({$where: "(this.firstname + ' ' + this.lastname).match('David Gonzalez'.replace(' ', '( .*)? '))"})
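To see what the pattern in that update does, the string manipulation can be reproduced outside the database. The following Python sketch mirrors the JavaScript expression under the assumption that replace is called with a plain string, in which case JavaScript replaces only the first occurrence; the helper name loose_pattern is hypothetical.

```python
import re


def loose_pattern(search):
    """Mirror 'David Gonzalez'.replace(' ', '( .*)? ') from the $where clause."""
    # count=1 matches JavaScript's behaviour for a string (non-regex) argument.
    return search.replace(' ', '( .*)? ', 1)


pattern = loose_pattern('David Gonzalez')
full_name = 'Hansen David' + ' ' + 'Gonzalez Sastoque'
print(pattern)
print(bool(re.search(pattern, full_name)))
```

Because only the first space is replaced, a search input with three or more words would get the optional gap only between its first two words.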
You should split your full name into a first name and a last name, then do your query on both fields, using the appropriate MongoDB query selectors.