PostgreSQL SELECT if string contains

So I have a table in my PostgreSQL database:
TAG_TABLE
==========================
id tag_name
--------------------------
1 aaa
2 bbb
3 ccc
To simplify my problem,
What I want to do is SELECT 'id' from TAG_TABLE when a string "aaaaaaaa" contains the 'tag_name'.
So ideally, it should only return "1", which is the ID for tag name 'aaa'
This is what I am doing so far:
SELECT id FROM TAG_TABLE WHERE 'aaaaaaaaaaa' LIKE '%tag_name%'
But obviously, this does not work, since Postgres treats '%tag_name%' as a pattern containing the literal substring 'tag_name' rather than the actual value stored in that column.
How do I pass the tag_name to the pattern??

You should use tag_name outside of quotes; then it's interpreted as a field of the record. Concatenate using '||' with the literal percent signs:
SELECT id FROM TAG_TABLE WHERE 'aaaaaaaa' LIKE '%' || tag_name || '%';
And remember that LIKE is case-sensitive. If you need a case-insensitive comparison, lowercase both sides (or use ILIKE instead of LIKE):
SELECT id FROM TAG_TABLE WHERE LOWER('aaaaaaaa') LIKE '%' || LOWER(tag_name) || '%';

A proper way to search for a substring is to use the position function instead of a LIKE expression, which requires escaping %, _ and the escape character (\ by default):
SELECT id FROM TAG_TABLE WHERE position(tag_name in 'aaaaaaaaaaa')>0;

I personally prefer the simpler syntax of the ~ operator.
SELECT id FROM TAG_TABLE WHERE 'aaaaaaaa' ~ tag_name;
Worth reading through Difference between LIKE and ~ in Postgres to understand the difference.

In addition to the solution with 'aaaaaaaa' LIKE '%' || tag_name || '%' there
are position (reversed order of args) and strpos.
SELECT id FROM TAG_TABLE WHERE strpos('aaaaaaaa', tag_name) > 0
Apart from the question of which is more efficient (LIKE looks less efficient, but an index might change things), there is a minor issue with LIKE: tag_name must not contain % or, especially, _ (the single-character wildcard), otherwise it can produce false positives.
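The false-positive issue can be reproduced with a small sketch. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for PostgreSQL, which is an assumption made only to keep the example self-contained: the '||' concatenation and the LIKE wildcards behave the same way for this case, and SQLite's instr() plays the role of strpos(). The rows with ids 3 and 4 are hypothetical and not part of the original table.

```python
import sqlite3

# SQLite in-memory database as a stand-in for PostgreSQL (assumption).
# instr(haystack, needle) is SQLite's counterpart of strpos(haystack, needle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag_table (id INTEGER PRIMARY KEY, tag_name TEXT)")
conn.executemany(
    "INSERT INTO tag_table (id, tag_name) VALUES (?, ?)",
    # ids 3 and 4 are hypothetical tags that contain LIKE wildcards
    [(1, "aaa"), (2, "bbb"), (3, "a_a"), (4, "%")],
)

haystack = "aaaaaaaa"

# LIKE interprets '_' and '%' inside tag_name as wildcards.
like_ids = [row[0] for row in conn.execute(
    "SELECT id FROM tag_table WHERE ? LIKE '%' || tag_name || '%' ORDER BY id",
    (haystack,),
)]

# A plain substring search treats them as ordinary characters.
instr_ids = [row[0] for row in conn.execute(
    "SELECT id FROM tag_table WHERE instr(?, tag_name) > 0 ORDER BY id",
    (haystack,),
)]

print("LIKE: ", like_ids)
print("instr:", instr_ids)
```

Under these assumptions the LIKE query also returns the two hypothetical wildcard tags, while the substring search returns only the row for 'aaa'.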

SELECT id FROM TAG_TABLE WHERE 'aaaaaaaa' LIKE '%' || "tag_name" || '%';
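tag_name may need to be in double quotes: if the column was created with a quoted, mixed-case name, the unquoted identifier is folded to lower case and Postgres reports that the column does not exist.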

Related

Match column values and ignoring the special characters in the strings in PostgreSQL 11.0

I have the following table in PostgreSQL 11.0
name                                               | id      | name_matched
0.5% timolol fixed combination ophthalmic solution | 3275172 | brimonidine
drop)                                              | 3275173 | brimonidine tartrate
0.2% w                                             | 3275174 | chlorhexidine digluconate
0.2μg act-hib®                                     | 3275175 | act hib
1.0% prednisolone acetate association              | 3275176 | gatifloxacin
0.3% topical minocycline ointment                  | 3275177 | minocycline.
I would like to keep only those rows where name = name_matched
When I tried the query below, rows 4 and 6 were missing due to special characters in the name value. How can I ignore those characters and get those rows in my output?
select *
FROM tbl
where name not ilike '%' || name_matched || '%'
I think the logic you are trying to articulate here is that the name_matched needs to be a substring of the name to make a match:
SELECT *
FROM tbl
WHERE name ILIKE '%' || name_matched || '%';
If the above might be giving a few false positives (or negatives), then we could consider using regex here, but maybe you don't need to do that.

Postgres reverse LIKE lookup indexing and performance

We have a musicians table containing records with multiple string fields, say:
"Jimi", "Hendrix", "Guitar"
"Phil", "Collins", "Drums"
"Sting", "", "Bass"
"Ringo", "Starr", "Drums"
"Paul", "McCartney", "Bass"
I want to pass postgres a long string, say:
"It is known that Jimi liked to set light to his guitar and smash up
all the drums while on stage."
and i want to get returned the fields that have any matches - preferably in order of the most matches first:
"Jimi", "Hendrix", "Guitar"
"Phil", "Collins", "Drums"
"Ringo", "Starr", "Drums"
because i need the search to be case insensitive, i'm constructing a query like this...
select * from musicians where lowercase_string like '%'||firstname||'%' or lowercase_string like '%'||lastname||'%' or lowercase_string like '%'||instrument||'%'
and then looping through (in ruby in my case) to capture the result with the most matches.
this is however very slow in the sql stage (1 minute+).
i've tried adding lower-case GIN index using pg_trgm as suggested here - but it's not helping - presumably because the like query is back to front?
Thanks!
With my testing, it seems that no trigram index could help your query at all. And no other index type could possibly speed up an (I)LIKE / FTS based search.
I should mention that all of the queries below do use the trigram indexes when they are queried "reversed", i.e. when the table contains the document (which is indexed) and your parameter is the query. The (I)LIKE variant, for example, is 2-3 times faster with it.
These are the queries I've tested:
select *
from musicians
where :input_string ilike '%' || firstname || '%'
or :input_string ilike '%' || lastname || '%'
or :input_string ilike '%' || instrument || '%'
At first, FTS seemed a great idea, but my testing shows that even without ranking, it is 60-100 times slower than the (I)LIKE variant. (So even when you don't have to post-process the results with these methods, they are not worth it.)
select *
from musicians
where to_tsvector(:input_string) @@ (plainto_tsquery(firstname) || plainto_tsquery(lastname) || plainto_tsquery(instrument))
However, ORDER BY rank doesn't slow down that much further: it is 70-120 times slower than the (I)LIKE variant.
select *
from musicians
where to_tsvector(:input_string) @@ (plainto_tsquery(firstname) || plainto_tsquery(lastname) || plainto_tsquery(instrument))
order by ts_rank(to_tsvector(:input_string), plainto_tsquery(firstname) || plainto_tsquery(lastname) || plainto_tsquery(instrument))
Then, for a last effort, I tried the (fairly new) "word similarity" operators of the trigram module: <% and %> (available from PostgreSQL 9.6).
select *
from musicians
where :input_string %> firstname
or :input_string %> lastname
or :input_string %> instrument
select *
from musicians
where firstname <% :input_string
or lastname <% :input_string
or instrument <% :input_string
These were somewhat faster than FTS: around 50-70 times slower than the (I)LIKE variant.
(Partially working) rextester: it is run against PostgreSQL 9.5, so the 9.6 operators obviously won't run here.
Update: if a full-word match is enough for you, you can actually reverse your query so that it can use indexes. You'll need to "parse" your query (a.k.a. the "long string") though:
with long_string(ls) as (
values (:input_string)
),
words(word) as (
select s
from long_string, regexp_split_to_table(ls, '[^[:alnum:]]+') s
where s <> ''
)
select musicians.*
from musicians, words
where firstname ilike word
or lastname ilike word
or instrument ilike word
group by musicians.id
Note: I parsed the query into complete words. You can use some other logic there, or it can even be parsed on the client side.
The default, btree index shines here, as it is much faster than the trigram index with (I)LIKE (we won't need them anyway, as we are looking for complete word match here):
with long_string(ls) as (
values (:input_string)
),
words(word) as (
select s
from long_string, regexp_split_to_table(lower(ls), '[^[:alnum:]]+') s
where s <> ''
)
select musicians.*
from musicians, words
where lower(firstname) = word
or lower(lastname) = word
or lower(instrument) = word
group by musicians.id
http://rextester.com/PSABJ6745
You could even get the match count with something like
sum((lower(firstname) = word)::int
+ (lower(lastname) = word)::int
+ (lower(instrument) = word)::int)
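The reversed, full-word approach with a match count can be sketched end to end as follows. This is only an illustration under stated assumptions: it runs against an in-memory SQLite database via Python's sqlite3 module instead of PostgreSQL, the long string is parsed on the client side (which the note above allows), the words are de-duplicated so a repeated word is not counted twice, and the id column and the temporary words table are assumed names. SQLite already returns 0 or 1 for a comparison, so the ::int casts from the answer are not needed here.

```python
import re
import sqlite3

# SQLite in-memory database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE musicians ("
    "id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, instrument TEXT)"
)
conn.executemany(
    "INSERT INTO musicians (firstname, lastname, instrument) VALUES (?, ?, ?)",
    [
        ("Jimi", "Hendrix", "Guitar"),
        ("Phil", "Collins", "Drums"),
        ("Sting", "", "Bass"),
        ("Ringo", "Starr", "Drums"),
        ("Paul", "McCartney", "Bass"),
    ],
)

long_string = (
    "It is known that Jimi liked to set light to his guitar "
    "and smash up all the drums while on stage."
)

# Client-side parsing: lower-case, split on non-alphanumerics,
# drop empty strings and duplicate words.
words = sorted({w for w in re.split(r"[^0-9a-z]+", long_string.lower()) if w})
conn.execute("CREATE TEMP TABLE words (word TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO words (word) VALUES (?)", [(w,) for w in words])

# Join each musician to the words that equal one of its fields,
# then add up the per-field hits to get the match count.
rows = conn.execute(
    """
    SELECT m.firstname, m.lastname, m.instrument,
           SUM((lower(m.firstname) = w.word)
             + (lower(m.lastname) = w.word)
             + (lower(m.instrument) = w.word)) AS matches
    FROM musicians AS m
    JOIN words AS w
      ON lower(m.firstname) = w.word
      OR lower(m.lastname) = w.word
      OR lower(m.instrument) = w.word
    GROUP BY m.id
    ORDER BY matches DESC, m.id
    """
).fetchall()

for row in rows:
    print(row)
```

With the sample data from the question this yields Jimi with two matches, followed by Phil and Ringo with one match each.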
The ilike option with match ordering:
with long_string (ls) as (values
('It is known that Jimi liked to set light to his guitar and smash up all the drums while on stage.')
)
select musicians.*, matches
from
musicians
cross join
long_string
cross join lateral
(select
(ls ilike format ('%%%s%%', first_name) and first_name != '')::int +
(ls ilike format ('%%%s%%', last_name) and last_name != '')::int +
(ls ilike format ('%%%s%%', instrument) and instrument != '')::int
as matches
) m
where matches > 0
order by matches desc
;
first_name | last_name | instrument | matches
------------+-----------+------------+---------
Jimi | Hendrix | Guitar | 2
Phil | Collins | Drums | 1
Ringo | Starr | Drums | 1

Postgres SQL - different results from LIKE query using OR vs ||

I have a table with an integer column. It has 12 records numbered 1000 to 1012. Remember, these are ints.
This query returns, as expected, 12 results:
select count(*) from proposals where qd_number::text like '%10%'
as does this:
SELECT COUNT(*) FROM "proposals" WHERE (lower(first_name) LIKE '%10%' OR qd_number::text LIKE '%10%' )
but this query returns 2 records:
SELECT COUNT(*) FROM "proposals" WHERE (lower(first_name) || ' ' || qd_number::text LIKE '%10%' )
which implies using || in concatenated where expressions is not equivalent to using OR. Is that correct or am I missing something else here?
You probably have nulls in first_name. For these records, lower(first_name) || ' ' || qd_number::text results in null, so those rows no longer match the pattern.
"using || in concatenated where expressions is not equivalent to using OR. Is that correct or am I missing something else here?"
That is correct.
|| is the string concatenation operator in SQL, not the OR operator.
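The NULL explanation above can be checked with a small sketch. It uses Python's sqlite3 module and an in-memory SQLite database as a stand-in for PostgreSQL (an assumption; both propagate NULL through '||' in the same way), with four hypothetical rows, two of which have no first_name. CAST(... AS TEXT) replaces the Postgres-specific ::text cast, and the count helper is a made-up name for this example.

```python
import sqlite3

# SQLite in-memory database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proposals (first_name TEXT, qd_number INTEGER)")
conn.executemany(
    "INSERT INTO proposals (first_name, qd_number) VALUES (?, ?)",
    # hypothetical rows: two of them have a NULL first_name
    [("Ann", 1000), (None, 1001), (None, 1002), ("Bob", 1003)],
)


def count(where_clause):
    """Hypothetical helper: count the rows matching a fixed WHERE clause."""
    sql = f"SELECT COUNT(*) FROM proposals WHERE {where_clause}"
    return conn.execute(sql).fetchone()[0]


# OR: a NULL on one side does not hide a match on the other side.
with_or = count(
    "lower(first_name) LIKE '%10%' OR CAST(qd_number AS TEXT) LIKE '%10%'"
)

# Concatenation: NULL || anything is NULL, so rows without a name drop out.
concatenated = count(
    "lower(first_name) || ' ' || CAST(qd_number AS TEXT) LIKE '%10%'"
)

# coalesce() substitutes an empty string for NULL before concatenating.
with_coalesce = count(
    "coalesce(lower(first_name), '') || ' ' || CAST(qd_number AS TEXT) LIKE '%10%'"
)

print(with_or, concatenated, with_coalesce)
```

If the concatenated form is preferred, wrapping the nullable column in coalesce(), as in the last query, is one way to keep rows without a first_name in the result.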

how to replace the first "9" in a string with another character in postgresql database column?

I have a column "pnum" in a "test" table.
I'd like to replace the leading "9" in pnum with "*" for every record.
testdb=# select * from test limit 5;
id name pnum
===========================================
1 jk 912312345
2 tt 9912333333
I would like the pnums to look like this:
id name pnum
===========================================
1 jk *12312345
2 tt *912333333
How would I do something like this in postgres?
EDIT 1:
I have tried something like this so far:
select id, name, '*' && substring(pnum FROM 2 FOR CHAR_LENGTH(pnum)-1 ) from test limit 3;
Also tried this:
select id, name, '*' || substring(pnum FROM 2 FOR CHAR_LENGTH(pnum)-1 ) from test limit 3;
Neither one has worked...
EDIT 2:
I figured it out:
select id, name, '*'::text || substring(pnum FROM 2 FOR CHAR_LENGTH(pnum)-1 ) from test limit 3;
See the function regexp_replace(string text, pattern text, replacement text [, flags text]) under String Functions and Operators in the documentation:
SELECT regexp_replace('9912333333', '^[9]', '*');
regexp_replace
----------------
*912333333
You can use Postgres' string manipulation functions for this. In your case "Substring" and "Char_Length"
'*' || Substring(<yourfield> FROM 2 FOR CHAR_LENGTH(<yourfield>)-1) as outputfield
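One difference between the two answers is worth seeing on data: the substring concatenation replaces the first character whatever it is, while the '^[9]' pattern only touches values that start with 9. The sketch below illustrates this with Python's sqlite3 module and an in-memory SQLite database as a stand-in for PostgreSQL (an assumption). substr(pnum, 2) stands in for Substring(pnum FROM 2), a CASE with LIKE '9%' stands in for the anchored regex, pnum is assumed to be a text column, and the row with id 3 is hypothetical.

```python
import sqlite3

# SQLite in-memory database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT, pnum TEXT)")
conn.executemany(
    "INSERT INTO test (id, name, pnum) VALUES (?, ?, ?)",
    # row 3 is hypothetical: a number that does not start with 9
    [(1, "jk", "912312345"), (2, "tt", "9912333333"), (3, "zz", "812345678")],
)

# Unguarded: always replaces the first character, even if it is not a 9.
unguarded = [row[0] for row in conn.execute(
    "SELECT '*' || substr(pnum, 2) FROM test ORDER BY id"
)]

# Guarded: only rewrites values that actually start with 9.
guarded = [row[0] for row in conn.execute(
    "SELECT CASE WHEN pnum LIKE '9%' THEN '*' || substr(pnum, 2) ELSE pnum END "
    "FROM test ORDER BY id"
)]

print(unguarded)
print(guarded)
```

If every pnum is known to start with 9, both forms give the same result; the guard only matters for values like the hypothetical third row.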

LIKE with % on column names

Here is my query that results in a syntax error:
SELECT *
FROM account_invoice,sale_order
WHERE sale_order.name LIKE %account_invoice.origin%
The account_invoice.origin field contains the text of sale_order.name, plus other text as well, so I need to match sale_order.name string anywhere in the account_invoice.origin string.
I'm using PostgreSQL 8.4.
Try this
SELECT *
FROM account_invoice,sale_order
WHERE sale_order.name LIKE '%' || account_invoice.origin || '%'
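The % wildcards need single quotes because the pattern is a string, and || is the operator for concatenation.
If account_invoice.origin is the longer text that contains sale_order.name, as described in the question, swap the operands: account_invoice.origin LIKE '%' || sale_order.name || '%'.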