Is it possible to perform a Sphinx search on one string attribute? - perl

sql_query=SELECT id,headline,summary,body,tags,issues,published_at
FROM sphinx_search
I am working on the search feature of my website, using Sphinx, Perl and Sphinx::Search. As long as I search across all the attributes without restricting the search to one of them, everything works well. However, when the user searches for a specific tag, I can't just return the result of a fuzzy search. I want to use the power of Sphinx to search only on tags or issues, and sometimes the user may want to search on headline and issues.
How can I perform such a task?

You need to put it in Extended Match Mode
https://metacpan.org/module/JJSCHUTZ/Sphinx-Search-0.27.2/lib/Sphinx/Search.pm#SetMatchMode
Then you can use Extended Query syntax
http://sphinxsearch.com/docs/current.html#extended-syntax
Which includes the field search operator
@tags keyword1
(Be careful with Sphinx: the word "attribute" has a specific meaning. Attributes are values attached to the document, useful for sorting/grouping/filtering and for returning with the result set, whereas I think you are talking about fields. All the columns from the sql_query that you don't mark as an attribute are fields, and are full-text searchable.)
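As a rough illustration of the field search operator, here is a small language-neutral sketch in Python that builds the query string; the same string would then be handed to the query call of whichever client you use (Sphinx::Search in the question) once extended match mode is set. The helper names `escape_term` and `field_query` are made up for this example, and the set of escaped characters is an assumption based on the characters the extended syntax treats as operators, so check it against your Sphinx version.

```python
# Characters treated as operators by the extended query syntax.
# This set is an assumption for illustration, not an official list.
SPECIAL = set('\\()|-!@~"&/^$=')


def escape_term(term):
    """Backslash-escape operator characters in user-supplied text."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in term)


def field_query(fields, text):
    """Build an extended-syntax query limited to the given field names."""
    fields = list(fields)
    if not fields:
        # No restriction requested: search every full-text field.
        return escape_term(text)
    if len(fields) == 1:
        selector = "@" + fields[0]
    else:
        # Several fields are written as @(field1,field2).
        selector = "@(" + ",".join(fields) + ")"
    return selector + " " + escape_term(text)


print(field_query(["tags"], "keyword1"))              # @tags keyword1
print(field_query(["headline", "issues"], "budget"))  # @(headline,issues) budget
```

Escaping the user's text matters here because a tag typed with an `@` or `|` in it would otherwise be read as an operator.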

Related

Searching for a user named "Don" with PostgreSQL full text search

We hit a bug with our PostgreSQL full text search system where a user whose first name is "Don" was not being included in search results. After some digging, we found that "don" is listed as a stopword in the default full text search dictionary in PostgreSQL (https://github.com/postgres/postgres/blob/master/src/backend/snowball/stopwords/english.stop).
We are using a hosted DB solution so we don't have access to the file system and thus can't create a modified version of the stopword file.
Are there any workarounds for this other than doing a string comparison check? Given that there can be multiple search tokens, it seems pretty bad to have to perform a string comparison of the name fields against every search token.
All the other words in the English stopword file seem pretty reasonable, but I'm really surprised I don't see any other Google/SO results complaining about users named "Don".
Maybe this makes it clear why don is a stopword:
SELECT to_tsvector('english', 'don''t');
to_tsvector
-------------
(1 row)
You wouldn't want to remove that stop word.
Full text search is not useful for proper names.
Normally, trigram indexes are better for that.
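To show why trigram matching does not have the stopword problem, here is a minimal Python sketch of how a trigram similarity score can be computed for single words. It follows the padding described in the pg_trgm documentation (two spaces in front, one behind) and ignores the extension's handling of multi-word strings and non-alphanumeric characters, so treat it as an approximation rather than the extension's exact behaviour. The function names are chosen for this example only.

```python
def trigrams(word):
    """Trigrams of one word, padded with two leading and one trailing space."""
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both words."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


print(sorted(trigrams("Don")))
print(similarity("don", "Don"))     # identical after lowercasing
print(similarity("Don", "Donald"))  # partial overlap
```

Because the comparison works on character sequences rather than dictionary words, "Don" is never discarded the way a stopword is.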

adding up specific mergefield values in word

I have a table in a Word document that has three columns, and all fields are mail merge fields from an external IT system.
There are three columns displaying the fields:
Charge Description
Charge Value (£)
Eligible? (yes/no)
I am trying to create a field that adds up all eligible charges, so that only charge values that show a "yes" in the eligible field are included. Does anyone know if this is possible? I have tried creating a formula but can't get it to work. Also, I assume an IF statement is required at some point so that it only includes the eligible charges.
Has anyone done anything similar before and if so, would they mind sharing how it was achieved?
Many thanks
You can do some things with expression fields (created in Word with CTRL-F9). This will look like {} and you can insert the expression, e.g. { = {MERGEFIELD charge} + {MERGEFIELD charge2} }. However, since you want to check multiple values and then create an expression, it's probably easier to use a macro. The macro would contain your logic, then set the fields in the document accordingly.
Here are two external links, since I can't reproduce a useful amount of the content here; it's a verbose answer to a potentially deep question:
Expression Fields
Merge fields
I hope that helps.

Mysql to Sphinx query conversion

How can I write this query in Sphinx: select * from vehicle_details where make LIKE "%john%" OR id IN (1,2,3,4)? Can anyone help me? I've searched a lot and I can't find the answer. Please help.
Well, if you really want to use Sphinx, you could perhaps make id into a fake keyword, so you can use it in the MATCH, e.g.
sql_query = SELECT id, CONCAT('id',id) as _id, make, description FROM ...
Now you have an id-based keyword you can match on.
SELECT * FROM index WHERE MATCH('(@make *john*) | (@_id id1|id2|id3|id4)')
But do read up on Sphinx keyword matching: by default Sphinx only matches whole words, so you need to enable part-word matching with wildcards (e.g. with min_infix_len) to get close to a simple LIKE %..% match (which doesn't take words into account).
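For illustration, the fake-keyword MATCH expression can be assembled on the application side. This Python sketch assumes the index has a `make` field and the `_id` field produced by the CONCAT in the sql_query above, and that infix wildcards are enabled; the function name `id_keyword_match` is hypothetical.

```python
def id_keyword_match(make_fragment, ids):
    """Build '(@make *frag*) | (@_id id1|id2|...)' for use inside MATCH()."""
    if not make_fragment.isalnum():
        # Keep the sketch simple: only a single plain word is accepted,
        # so no query operators can sneak in through the fragment.
        raise ValueError("make_fragment must be a single alphanumeric word")
    # int() guards against non-numeric ids ending up in the query text.
    id_terms = "|".join("id%d" % int(i) for i in ids)
    return "(@make *%s*) | (@_id %s)" % (make_fragment, id_terms)


print(id_keyword_match("john", [1, 2, 3, 4]))
# (@make *john*) | (@_id id1|id2|id3|id4)
```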
This is actually pretty hard to do, because you are mixing a string search (the LIKE, which becomes a MATCH) with an attribute filter.
I would suggest two separate queries: one to Sphinx for the text filter, and the IN filter run directly in the database (MySQL?). Merge the results in the application.
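The merge step for the two-query approach could look something like this Python sketch. It assumes each query has already returned a list of document ids; running the queries themselves is left out, and `merge_ids` is a name made up for the example.

```python
def merge_ids(text_match_ids, in_filter_ids):
    """Union of two id lists, keeping first-seen order and dropping duplicates."""
    seen = set()
    merged = []
    for doc_id in list(text_match_ids) + list(in_filter_ids):
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged


# Ids from the Sphinx text match come first, then the ids from the IN filter.
print(merge_ids([5, 2, 9], [1, 2, 3, 4]))  # [5, 2, 9, 1, 3, 4]
```

Putting the Sphinx ids first keeps its relevance ordering intact for the text matches; rows that only satisfy the IN list are appended afterwards.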

Search SharePoint Foundation 2013 Picture Library by terms defined in Keywords field

Since Term Store functionality (and probably most of the metadata functionality) isn't available in SharePoint Foundation 2013, I couldn't find a way to search through the pictures using some sort of tagging. Thus I decided to employ something that is already available in the Foundation version.
When you edit the picture, you can see 3 fields: Title, Description and Keywords like so:
It would be nice if I could make Search index the terms (tags) added to the Keywords field. However, after some testing I saw that only Title is indexed and presented in search results. Although I could put my search terms in the Title field, that wouldn't be elegant.
So, is there any way to make use of Keywords entity in my case? Please note, it's a Foundation version, so there is no Enterprise Keywords functionality either (or at least I couldn't find one).
OK, so I used this kind of workaround in the end:
1. Went to my Picture Library's Settings
2. Chose to create a new column (this can also be done in Site Settings > Site Columns, if you want to reuse it for more sites)
3. Called the column Primary Tags
4. Chose the Single line of text option, because the multiple lines option cannot be indexed
5. Because the single line option is limited to 255 characters, I repeated steps 2-4 to create another column and named it Secondary Tags
6. Then went to the Indexed Columns page and added those 2 new columns to the index
Now I have Title, Primary Tags and Secondary Tags indexed fields available for each picture with a total of 765 characters available.
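If the tags are prepared outside SharePoint before being typed or imported, splitting them across the two 255-character columns can be sketched as below. The "; " separator, the two-column default and the name `split_tags` are assumptions for illustration; nothing here talks to SharePoint.

```python
MAX_LEN = 255  # limit of a "Single line of text" column


def split_tags(tags, columns=2, sep="; "):
    """Pack tags, in order, into `columns` strings of at most MAX_LEN characters."""
    cells = [""] * columns
    col = 0
    for tag in tags:
        if len(tag) > MAX_LEN:
            raise ValueError("a single tag exceeds the column limit")
        while True:
            candidate = tag if not cells[col] else cells[col] + sep + tag
            if len(candidate) <= MAX_LEN:
                cells[col] = candidate
                break
            # Current column is full: move on to the next one.
            col += 1
            if col >= columns:
                raise ValueError("tags do not fit in the available columns")
    return cells


primary, secondary = split_tags(["beach", "sunset", "family"])
print(primary)    # beach; sunset; family
print(secondary)  # empty string: everything fit in the first column
```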

PostgreSQL full-text search headlines do not contain enough context

I'm using PostgreSQL's full-text search capability to implement a search feature on a client's site. I'm using the ts_headline function to get the context that the search terms appear in, but the client is not happy with the selection of words displayed. In particular, the headline seems to consistently begin with the search term, whereas the client would like it to start a few words earlier.
Is there any way to either configure PostgreSQL to have this behavior, or modify the ts_headline call to get the desired results?
Edit: Apologies for not including some sample SQL in the first place.
SELECT
ts_headline('english', "text", plainto_tsquery('"endpoints"'))
FROM "Page"
WHERE to_tsvector("text") @@ plainto_tsquery('"endpoints"')
ORDER BY ts_rank(to_tsvector("text"), plainto_tsquery('"endpoints"'))
Using the MaxFragments option, you might get better results. Similarly you can play with MinWords and MaxWords, e.g.
SELECT
ts_headline('english', "text", plainto_tsquery('"endpoints"'), 'MaxFragments=0, MinWords=5, MaxWords=9')
FROM "Page"
WHERE
to_tsvector("text") @@ plainto_tsquery('"endpoints"')
ORDER BY
ts_rank(to_tsvector("text"), plainto_tsquery('"endpoints"'))
You will probably need to experiment.
See MinWords, MaxWords and MaxFragments in http://www.postgresql.org/docs/current/interactive/textsearch-controls.html