In our PostgreSQL database we have a field called Description. As you can guess, this field contains a lot of text, and we would like to search inside these descriptions for a certain word.
We tried the contains and CHARINDEX functions, but neither works.
Any idea how we can solve this?
Thank you very much!
Luca
You can use regular expressions with word delimiter markers:
select * from table
where description ~ ('\m' || 'yourword' || '\M');
Use ~* instead of ~ for case insensitive searches.
Note that using description LIKE '%yourword%', as @JNevill suggests, will also find your word inside other words, e.g. 'Jean-Luc Picard' LIKE '%car%' is true.
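To see the difference between a substring match and a whole-word match outside the database, here is a minimal Python sketch. It is only an analogy: Python's \b stands in for PostgreSQL's \m and \M, the two regex engines are not identical, and the helper names are made up for illustration. Unlike the SQL above, the sketch also escapes the search word.

```python
import re


def contains_substring(text: str, word: str) -> bool:
    # Rough analogue of: text LIKE '%word%' (case-sensitive substring test).
    return word in text


def contains_whole_word(text: str, word: str, ignore_case: bool = False) -> bool:
    # Rough analogue of: text ~ ('\m' || word || '\M').
    # \b is used on both sides as a stand-in for \m and \M.
    flags = re.IGNORECASE if ignore_case else 0
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags) is not None


print(contains_substring("Jean-Luc Picard", "car"))    # True
print(contains_whole_word("Jean-Luc Picard", "car"))   # False
print(contains_whole_word("Rent a Car today", "car", ignore_case=True))  # True
```

The ignore_case flag plays the role of switching from ~ to ~*.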
I'm using Postgres full text search to provide (among other things) autocomplete functionality for usernames and tags. However, I'd like autocomplete to match the column value 'dashed-tag-example' against a tsquery like 'dashedtag:*'.
My understanding is that, to do this without duplicating the column in my table, I need to create a dictionary along the lines of the simple dictionary that strips characters like '-'. Is it possible to create such a dictionary using SQL (i.e. something I could put in a Rails migration)?
It seems like it should somehow be possible to define a dictionary (or do I need a parser?) that uses Postgres's regexp substitution functions, but I can't find any examples online of how to create a dictionary (parser?) like that. Is this possible? How?
A dictionary comes too late: by the time it runs, the parser has already split the text into tokens. You would need a different parser, which would require writing C code.
The simple and pragmatic solution is to use replace() to strip the - when you construct the tsvector.
You don't need to create a new column for that, simply search like this:
SELECT ... FROM ...
WHERE to_tsvector('english', replace(col, '-', ''))
@@ to_tsquery('english', replace('search-string', '-', ''));
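If the search string is assembled in application code instead, the same normalization can be done before the query is sent. A minimal Python sketch, assuming the column side is still stripped with replace() as above; the function names are hypothetical, and real user input would also need the tsquery special characters handled:

```python
def strip_dashes(value: str) -> str:
    # Same effect as SQL replace(value, '-', '').
    return value.replace("-", "")


def prefix_query(search: str) -> str:
    # Build a prefix-match tsquery string such as 'dashedtag:*'.
    return strip_dashes(search) + ":*"


print(strip_dashes("dashed-tag-example"))  # dashedtagexample
print(prefix_query("dashed-tag"))          # dashedtag:*
```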
I would like to use a postgres tsquery on a column that has strings that all contain numbers, like this:
FRUIT-239476234
If I try to make a tsquery out of this:
select to_tsquery('FRUIT-239476234');
What I get is:
'fruit' & '-239476234'
I want to be able to search by just the numeric portion of this value like so:
239476234
It seems that it is unable to match this because it is interpreting my hyphen as a "negative sign" and doesn't think 239476234 matches -239476234. How can I tell postgres to treat all of my characters as text and not try to be smart about numbers and hyphens?
An answer from the future. Once version 13 of PostgreSQL is released, you will be able to use the dict_int module to do this.
create extension dict_int ;
ALTER TEXT SEARCH DICTIONARY intdict (MAXLEN = 100, ABSVAL=true);
ALTER TEXT SEARCH CONFIGURATION english ALTER MAPPING FOR int WITH intdict;
select to_tsquery('FRUIT-239476234');
to_tsquery
-----------------------
'fruit' & '239476234'
But you would probably be better off creating your own TEXT SEARCH DICTIONARY, copying the 'english' CONFIGURATION, and modifying the copy, rather than modifying the default ones in place. Otherwise you run the risk that an upgrade will silently discard your changes.
If you don't want to wait for v13, you could back-patch this change and compile into your own version of the extension for a prior server.
This is done by the text search parser, which is not configurable (short of writing your own parser in C, which is supported).
The simplest solution is to pre-process all search strings by replacing - with a space.
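A minimal sketch of that pre-processing in Python, with hypothetical function names. One assumption to flag: to_tsquery expects an operator between terms, so the second helper joins the pieces with & instead of leaving a bare space; with plainto_tsquery the space-separated string should be enough.

```python
def hyphens_to_spaces(search: str) -> str:
    # Replace every hyphen so the parser never sees a leading minus sign.
    return search.replace("-", " ")


def to_tsquery_input(search: str) -> str:
    # Assumption: the result is passed to to_tsquery, which needs an
    # explicit operator (& here) between the individual terms.
    terms = hyphens_to_spaces(search).split()
    return " & ".join(terms)


print(hyphens_to_spaces("FRUIT-239476234"))  # FRUIT 239476234
print(to_tsquery_input("FRUIT-239476234"))   # FRUIT & 239476234
```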
I'm trying to use Sphinx to find rows having words in their title column.
The query looks like this:
SELECT * FROM my_table WHERE MATCH ('@title "words"')
But it also returns rows having word (without the s) instead of words in the title.
What am I doing wrong?
Sounds like you have morphology (specifically stemming?) enabled on the index.
You should consider enabling index_exact_words
http://sphinxsearch.com/docs/current.html#conf-index-exact-words
which gives you exact form operator.
MATCH('@title =words')
Also gives you the possibility of the interesting expand_keywords option :)
http://sphinxsearch.com/docs/current.html#conf-expand-keywords
... or if you don't ever want these matches, you could disable stemming :) Alas, there isn't a 'stemming optional' mode (e.g. a ~ fuzzy operator to specifically stem).
If I want to capture all Descriptions that have "Dillard s" (with the space being any single alphanumeric wildcard), is it more appropriate to use:
DESCRIPTION iLIKE '%Dillard_s%'
or use
DESCRIPTION Similar To '%Dillard_s%'
Thanks!
I would use like:
DESCRIPTION LIKE '%Dillard_s%'
If you don't want case-insensitivity (and your question suggests you don't), then just use like.
I tend to use either LIKE or go whole-hog and use regular expressions:
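DESCRIPTION ~ 'Dillard[a-zA-Z0-9]s'
A regular expression match is unanchored, so no leading or trailing wildcard is needed (a % inside a regex is just a literal percent sign). Also note that _ in LIKE matches any single character, whereas the bracket expression restricts the match to one alphanumeric character.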
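To check which strings each variant would accept, here is a small Python sketch. It is an approximation only: "." is used as a stand-in for LIKE's _ (the two differ on newlines), and the sample descriptions are made up.

```python
import re

# Stand-in for LIKE '%Dillard_s%': any single character between "Dillard" and "s".
ANY_CHAR = re.compile(r"Dillard.s")
# Same idea as the regex variant: exactly one alphanumeric character.
ALNUM_ONLY = re.compile(r"Dillard[a-zA-Z0-9]s")


def matches(pattern: re.Pattern, description: str) -> bool:
    # search() is unanchored, like ~ in PostgreSQL.
    return pattern.search(description) is not None


for sample in ["Dillard's Inc", "Dillard s", "DillardXs", "Dillards"]:
    print(sample, matches(ANY_CHAR, sample), matches(ALNUM_ONLY, sample))
```

The apostrophe and the space are accepted only by the first pattern, so which one is appropriate depends on whether "alphanumeric" in the question is meant literally.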
Is there a query I can run to search all packages to see if a particular table and/or column is used in the package? There are too many packages to open each one and do a find on the value(s) I'm looking for.
You can do this:
select *
from user_source
where upper(text) like upper('%SOMETEXT%');
Alternatively, SQL Developer has a built-in report to do this under:
View > Reports > Data Dictionary Reports > PLSQL > Search Source Code
USER_SOURCE is documented in the Oracle Database 11g Reference.
You can also use the *_DEPENDENCIES views, for example:
SELECT owner, NAME
FROM dba_dependencies
WHERE referenced_owner = :table_owner
AND referenced_name = :table_name
AND TYPE IN ('PACKAGE', 'PACKAGE BODY')
Sometimes the column you are looking for may be part of the name of many other things that you are not interested in.
For example I was recently looking for a column called "BQR", which also forms part of many other columns such as "BQR_OWNER", "PROP_BQR", etc.
So I would like to have the checkbox that word processors have to indicate "Whole words only".
Unfortunately LIKE has no such functionality, but REGEXP_LIKE can help.
SELECT *
FROM user_source
WHERE regexp_like(text, '(\s|\.|,|^)bqr(\s|,|$)');
This is the regular expression to find this column and exclude the other columns with "BQR" as part of the name:
(\s|\.|,|^)bqr(\s|,|$)
The regular expression matches white-space (\s), or (|) period (.), or (|) comma (,), or (|) start-of-line (^), followed by "bqr", followed by white-space, comma or end-of-line ($).
By the way, if you need to add other characters such as "(" or ")" because the column may be used as "UPPER(bqr)", then those options can be added to the lists of before and after characters.
(\s|\(|\.|,|^)bqr(\s|,|\)|$)
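Before running this against a large schema it can help to try the pattern on a few sample lines. A minimal Python sketch follows; Python's re module is not Oracle's regex engine, so treat it as a sanity check only, and the sample source lines are invented.

```python
import re

# The extended pattern from above, including "(" before and ")" after.
WHOLE_WORD_BQR = re.compile(r"(\s|\(|\.|,|^)bqr(\s|,|\)|$)")


def mentions_bqr(line: str) -> bool:
    # True when "bqr" appears as a standalone name on the line.
    return WHOLE_WORD_BQR.search(line) is not None


samples = [
    "select bqr from t",
    "where upper(bqr) = 'X'",
    "select bqr_owner from t",
    "select prop_bqr from t",
    "x := bqr;",
]
for line in samples:
    print(line, "->", mentions_bqr(line))
```

The last sample is not matched because ";" is not in the list of allowed trailing characters, which shows how the lists may need extending for your own code. The pattern is also case-sensitive as written.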