How can I write this query in Sphinx: select * from vehicle_details where make LIKE "%john%" OR id IN (1,2,3,4)? I've searched a lot and can't find the answer. Please help.
Well, if you really want to use Sphinx, you could perhaps make id into a fake keyword so you can use it in the MATCH, e.g.
sql_query = SELECT id, CONCAT('id',id) as _id, make, description FROM ...
Now you have an id-based keyword you can match on.
SELECT * FROM index WHERE MATCH('(@make *john*) | (@_id id1|id2|id3|id4)')
But do read up on Sphinx keyword matching: by default Sphinx only matches whole words, so you need to enable part-word matching with wildcards (e.g. with min_infix_len) to get close to a simple LIKE %..% match (which doesn't take words into account).
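If the query string is assembled in application code, a small helper can build it. This is only a sketch in Python: build_match is a hypothetical name, the field names are the ones from the example above, and it does no escaping of Sphinx special characters, which real input would need.

```python
def build_match(field, term, id_field, ids):
    """Build an extended-syntax query: infix match on one field OR any fake id keyword."""
    # Wildcards on both sides only work if infix indexing (e.g. min_infix_len) is enabled.
    text_part = f"(@{field} *{term}*)"
    # Each id becomes the fake keyword produced by CONCAT('id', id) at index time.
    id_part = "(@{} {})".format(id_field, "|".join(f"id{i}" for i in ids))
    return f"{text_part} | {id_part}"


print(build_match("make", "john", "_id", [1, 2, 3, 4]))
```

The resulting string is what would go inside MATCH('...').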
Actually pretty hard to do, because you are mixing a string search (the LIKE, which will be a MATCH) with an attribute filter.
I would suggest two separate queries: one to Sphinx for the text filter, and the IN filter done directly in the database (MySQL?). Merge the results in the application.
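The merge step could look roughly like this in Python. It is a sketch under assumptions: merge_ids is a made-up helper, and the two input lists stand in for whatever ids your Sphinx client and your database driver actually return.

```python
def merge_ids(sphinx_ids, db_ids):
    """Union of two id lists, keeping first-seen order and dropping duplicates."""
    seen = set()
    merged = []
    for doc_id in list(sphinx_ids) + list(db_ids):
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged


# Hypothetical results: ids from the Sphinx text match, then ids from IN (1,2,3,4).
text_hits = [7, 3, 12]
id_hits = [1, 2, 3, 4]
print(merge_ids(text_hits, id_hits))
```

Putting the Sphinx hits first keeps their relevance order; rows that only matched the IN filter are appended after them.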
We hit a bug with our PostgreSQL full text search system where a user whose first name is "Don" was not being included in search results. After some digging, we found that "don" is listed as a stopword in the default full text search dictionary in PostgreSQL (https://github.com/postgres/postgres/blob/master/src/backend/snowball/stopwords/english.stop).
We are using a hosted DB solution so we don't have access to the file system and thus can't create a modified version of the stopword file.
Are there any workarounds for this other than doing a string comparison check? Given that there can be multiple search tokens, it seems pretty bad to have to perform a string comparison of the name fields against every search token.
All the other words in the English stopword file seem pretty reasonable, but I'm really surprised I don't see any other Google/SO results complaining about users named "Don".
Maybe this makes it clear why don is a stopword:
SELECT to_tsvector('english', 'don''t');
to_tsvector
-------------
(1 row)
You wouldn't want to remove that stop word.
Full text search is not useful for proper names.
Normally, trigram indexes are better for that.
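To see why trigrams cope with names, here is a rough Python approximation of trigram similarity. It assumes the padding scheme documented for PostgreSQL's pg_trgm (two spaces before each word, one after) but splits on whitespace only, so it is an illustration and not a reimplementation of the extension.

```python
def trigrams(text):
    """Set of 3-character substrings of each lowercased, space-padded word."""
    grams = set()
    for word in text.lower().split():
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(similarity("Don", "don"))
print(similarity("Don", "Donald"))
```

No dictionary or stopword list is involved, so "Don" is compared like any other string.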
I need to redact proper names from text fields in SQL Server. Let's say I have the following table:
PersonTable
FirstName
LastName
Notes
I could do this:
UPDATE PersonTable
SET Notes = REPLACE(REPLACE(Notes, FirstName, 'REDACTED'), LastName, 'REDACTED')
That should work fine for the exact match condition, but what if someone has misspelled the first or last name in the Notes field, or worse yet, used a nickname like Jim?
I think Full Text searching using Contains is good for this sort of thing where the deviation is meaning or derivation-based, but will it work for names? Even if it worked for finding rows where Notes contained a name, I don't think it works with the Replace scenario.
I have also considered SOUNDEX, but I am also not seeing how to do this using Replace for a text field. The only way I can see using Soundex or something like that would be to split the text field into words and do a comparison on each word. I have to do this on many text fields in very heavily populated tables, so I'm not excited about doing that if there's a better way.
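For what it's worth, the word-by-word idea can be sketched outside the database. The Python below is an illustration only: it swaps SOUNDEX for difflib's similarity ratio from the standard library, redact and the 0.8 threshold are arbitrary choices of mine, and it would not catch nicknames such as Jim.

```python
import re
from difflib import SequenceMatcher


def redact(notes, names, threshold=0.8, mask="REDACTED"):
    """Replace every word in notes that closely resembles one of the names."""
    lowered = [n.lower() for n in names]

    def replace_word(match):
        word = match.group(0).lower()
        for name in lowered:
            # ratio() is 1.0 for identical strings and lower for looser matches.
            if SequenceMatcher(None, word, name).ratio() >= threshold:
                return mask
        return match.group(0)

    # Treat runs of letters and apostrophes as words; leave everything else as-is.
    return re.sub(r"[A-Za-z']+", replace_word, notes)


print(redact("Spoke with Jon Smith today", ["John", "Smith"]))
```

A lower threshold catches more misspellings but also starts redacting ordinary words, so it would need tuning on real data.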
Does anyone have experience doing something like this?
Thanks
sql_query=SELECT id,headline,summary,body,tags,issues,published_at
FROM sphinx_search
I am working on the search feature of my Web site and I am using Sphinx, Perl and Sphinx::Search. As long as I want to search in all the attributes and I don't restrict it to just one, everything goes well. However when the user searches for a specific tag, I can't just give the result of a fuzzy search, I want to use the power of Sphinx to search only on tags or issues, maybe sometimes the user wants to search on headline and issues.
How can I perform such a task?
You need to put it in Extended Match Mode
https://metacpan.org/module/JJSCHUTZ/Sphinx-Search-0.27.2/lib/Sphinx/Search.pm#SetMatchMode
Then you can use Extended Query syntax
http://sphinxsearch.com/docs/current.html#extended-syntax
Which includes the field search operator
@tags keyword1
(Be careful: in Sphinx the word "attribute" has a specific meaning: values attached to the document, useful for sorting, grouping, filtering and returning with the result set. I think you are talking about fields. Every column from the sql_query that you don't mark as an attribute is a field, and is full-text searchable.)
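If the user picks which fields to search, the query string can be built from that choice. A minimal Python sketch follows, where field_query is a hypothetical helper; it relies on the extended syntax forms @field and @(field1,field2) and does no escaping of the keywords.

```python
def field_query(fields, keywords):
    """Prefix keywords with a field-limit operator for the given fields."""
    if not fields:
        # No restriction: search every full-text field.
        return keywords
    if len(fields) == 1:
        return f"@{fields[0]} {keywords}"
    # Several fields are listed in parentheses after a single @.
    return "@({}) {}".format(",".join(fields), keywords)


print(field_query(["tags"], "keyword1"))
print(field_query(["headline", "issues"], "keyword1"))
```

The returned string is what you would pass as the query once the extended match mode is set.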
I'm using PostgreSQL's full-text search capability to implement a search feature on a client's site. I'm using the ts_headline function to get the context that the search terms appear in, but the client is not happy with the selection of words displayed. In particular, the headline seems to consistently begin with the search term, whereas the client would like it to start a few words earlier.
Is there any way to either configure PostgreSQL to have this behavior, or modify the ts_headline call to get the desired results?
Edit: Apologies for not including some sample SQL in the first place.
SELECT
ts_headline('english', "text", plainto_tsquery('"endpoints"'))
FROM "Page"
WHERE to_tsvector("text") @@ plainto_tsquery('"endpoints"')
ORDER BY ts_rank(to_tsvector("text"), plainto_tsquery('"endpoints"'))
Using the MaxFragments option, you might get better results. Similarly you can play with MinWords and MaxWords, e.g.
SELECT
ts_headline('english', "text", plainto_tsquery('"endpoints"'), 'MaxFragments=0, MinWords=5, MaxWords=9')
FROM "Page"
WHERE
to_tsvector("text") @@ plainto_tsquery('"endpoints"')
ORDER BY
ts_rank(to_tsvector("text"), plainto_tsquery('"endpoints"'))
You will probably need to experiment.
See MinWords, MaxWords and MaxFragments in http://www.postgresql.org/docs/current/interactive/textsearch-controls.html
One would normally have this query in their sphinx.conf file:
sql_query = SELECT id,text_field1,text_field2,text_field3 FROM table_name
Would there be much difference if I combine all fields into one searchable text field like so?
sql_query = SELECT id, CONCAT(text_field1,text_field2,text_field3) as searchable_text FROM table_name
What benefits does one have over the other?
Thanks!
I think either way is generally fine... however, Sphinx has the ability to focus queries at certain fields (see the extended query syntax examples). If you merge all the columns into one field, you'll lose that ability.
You'll also lose the ability to weight certain fields higher than others.
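CONCAT(text_field1,text_field2,text_field3) is wrong: with no separator, the last word of one field runs into the first word of the next.
Use CONCAT(text_field1,' ',text_field2,' ',text_field3) instead.
That said, it is better to let the index keep separate fields.
A search returns the same results, but you can restrict it to one of the fields if needed:
'@text_field2 foo'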
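On the missing separator: the effect is easy to reproduce with plain string joins. The Python below uses made-up field values and only mimics what CONCAT does to the text before it is split into words.

```python
# Hypothetical values for text_field1, text_field2, text_field3.
fields = ["red car", "pet friendly", "low mileage"]

glued = "".join(fields)    # like CONCAT(f1, f2, f3)
spaced = " ".join(fields)  # like CONCAT(f1, ' ', f2, ' ', f3)

# Without a separator, "car" + "pet" becomes the unrelated word "carpet".
print(glued.split())
print(spaced.split())
```

With the glued version a search for "car" finds nothing and a search for "carpet" gets a false hit.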