Newbie needing help in PostgreSQL - postgresql

My current SQL statement is:
SELECT *
FROM names
WHERE UPPER(first_name) LIKE UPPER('John Smith%')
OR UPPER(last_name) LIKE UPPER('John Smith%')
OR UPPER(first_name || ' ' || last_name) LIKE UPPER('John Smith%')
I want to search my table for "John Smith", and this SQL statement works for that.
But if I have an entry with the first name 'John Kevin' and the last name 'Smith', the query does not return it. What do I need to add? Thanks all! :)

You can use the SIMILAR TO operator to cover all possible combinations. Note that this pattern matches any row whose full name contains either word.
SELECT * FROM names
WHERE UPPER(first_name || ' ' || last_name) SIMILAR TO '%(JOHN|SMITH)%';
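If every search word must appear in the full name (so that 'John Kevin' / 'Smith' matches but 'Jane' / 'Smith' does not), one option is to split the search term and add one LIKE condition per word. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, the helper name search_names and the sample rows are made up for illustration, and the words are matched as plain substrings (so 'John' would also match 'Johnson').

```python
import sqlite3

def search_names(conn, term):
    """Return rows whose full name contains every word of `term`, ignoring case."""
    words = term.split()
    if not words:
        return []
    # One LIKE condition per search word; all of them must match the full name.
    condition = " AND ".join(
        ["UPPER(first_name || ' ' || last_name) LIKE UPPER(?)"] * len(words)
    )
    sql = (
        "SELECT first_name, last_name FROM names "
        f"WHERE {condition} ORDER BY first_name, last_name"
    )
    # Words are passed as bound parameters; % and _ inside a word are not escaped.
    params = [f"%{word}%" for word in words]
    return conn.execute(sql, params).fetchall()

# Hypothetical sample data, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [("John", "Smith"), ("John Kevin", "Smith"), ("Jane", "Smith"), ("John", "Doe")],
)

matches = search_names(conn, "John Smith")
print(matches)  # [('John', 'Smith'), ('John Kevin', 'Smith')]
```

In PostgreSQL itself, the same WHERE clause could presumably be written with ILIKE instead of wrapping both sides in UPPER().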

Related

How to concatenate 2 or more columns in beamSql

I'm trying to concatenate 2 columns using delimiter as "."
code :
PCollection<BeamRecord> first = apps.apply(BeamSql.query(
"SELECT *,('CatLib' || 'ProdKey') AS CatLibKey from PCOLLECTION"));
How do I specify a delimiter between the 2 columns?
I'd say go for
SELECT
COALESCE(CatLib, '') || '.' || COALESCE(ProdKey, '') AS CatLibKey,
(any other columns here...)
FROM
PCOLLECTION;
Note that SQL has no "select everything but column X" or "select everything else" syntax, so you have to list every column you want to select.
Thanks @Impulse The Fox.
I have modified my query to :
PCollection<BeamRecord> first = apps.apply(BeamSql.query(
"SELECT Outlet, CatLib, ProdKey, Week, SalesComponent, DuetoValue, PrimaryCausalKey, CausalValue, ModelIteration, Published, (CatLib || '.' || ProdKey) AS CatLibKey from PCOLLECTION"));
and this worked perfectly.

Postgres SQL - different results from LIKE query using OR vs ||

I have a table with an integer column. It has 12 records numbered 1000 to 1012. Remember, these are ints.
This query returns, as expected, 12 results:
select count(*) from proposals where qd_number::text like '%10%'
as does this:
SELECT COUNT(*) FROM "proposals" WHERE (lower(first_name) LIKE '%10%' OR qd_number::text LIKE '%10%' )
but this query returns 2 records:
SELECT COUNT(*) FROM "proposals" WHERE (lower(first_name) || ' ' || qd_number::text LIKE '%10%' )
which implies using || in concatenated where expressions is not equivalent to using OR. Is that correct or am I missing something else here?
You probably have nulls in first_name. For these records, lower(first_name) || ' ' || qd_number::text evaluates to null, so the LIKE no longer matches the number.
"using || in concatenated where expressions is not equivalent to using OR. Is that correct or am I missing something else here?"
That is correct.
|| is the string concatenation operator in SQL, not the OR operator.
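The effect of the nulls can be reproduced on a small table. The sketch below uses Python's built-in sqlite3 module rather than PostgreSQL (so CAST(... AS TEXT) replaces ::text), with four made-up rows, two of which have a null first_name. Wrapping the nullable column in COALESCE is one way to keep the concatenated form from losing those rows.

```python
import sqlite3

# Hypothetical sample data: two of the four rows have a NULL first_name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proposals (first_name TEXT, qd_number INTEGER)")
conn.executemany(
    "INSERT INTO proposals VALUES (?, ?)",
    [("Ann", 1000), (None, 1001), (None, 1002), ("Bob", 1003)],
)

def count(where):
    """Count the rows of proposals that satisfy the given WHERE clause."""
    return conn.execute(f"SELECT COUNT(*) FROM proposals WHERE {where}").fetchone()[0]

# OR: a NULL on one side does not hide a match on the other side.
with_or = count("lower(first_name) LIKE '%10%' OR CAST(qd_number AS TEXT) LIKE '%10%'")

# ||: NULL || anything is NULL, so rows with a NULL first_name never match.
concatenated = count("lower(first_name) || ' ' || CAST(qd_number AS TEXT) LIKE '%10%'")

# COALESCE turns the NULL into an empty string before concatenating.
coalesced = count(
    "COALESCE(lower(first_name), '') || ' ' || CAST(qd_number AS TEXT) LIKE '%10%'"
)

print(with_or, concatenated, coalesced)  # 4 2 4
```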

Postgres Full Text Search on multiple columns

I am using Postgres 9.3 w/ Laravel 5 and I have set up the following migration:
DB::statement("ALTER TABLE users ADD COLUMN searchtext TSVECTOR");
DB::statement("UPDATE users SET searchtext = to_tsvector('english', first_name || ' ' || last_name || ' ' || email)");
DB::statement("CREATE INDEX searchtext_gin ON users USING GIN(searchtext)");
DB::statement("CREATE TRIGGER ts_searchtext BEFORE INSERT OR UPDATE ON users FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger('searchtext', 'pg_catalog.english', 'first_name', 'last_name', 'email')");
If I have an entry with the first name "Christopher", and I run the following query I get no results
return User::whereRaw("searchtext @@ to_tsquery('Chris')")->get();
If I search for "Christopher" I get the record. What do I need to do to be able to search with a partial match?
The english dictionary doesn't stem nicknames.
regress=> SELECT to_tsvector('english', 'Christopher'), to_tsquery('english', 'Chris');
  to_tsvector  | to_tsquery
---------------+------------
 'christoph':1 | 'chris'
(1 row)
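You'll need to overlay a dictionary that maps nicknames too, so christopher can be stemmed to chris. If prefix matching is enough, a prefix query such as to_tsquery('english', 'Chris:*') also matches the stored lexeme 'christoph'.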

Return first and last words in a person name - postgres

I have a list of names and I want to separate the first and last words in a person's name.
I tried to use the "trim" function, without success.
Can someone explain how I could do it?
table:
Names
Mary Johnson Angel Smith
Dinah Robertson Donald
Paul Blank Power Silver
Then I want to have as a result:
Names
Mary Smith
Dinah Donald
Paul Silver
Thanks,
You can do it simply with regular expressions, like:
substring(trim(name) FROM '^([^ ]+)') || ' ' || substring(trim(name) FROM '([^ ]+)$')
Of course, this only works if you are 100% sure that at least a first and a last name are always supplied. I'm not sure that is the case for everybody in the world. For instance, would it work for names in Chinese? I'm not sure, and I avoid making any assumptions about people's names. The best approach is simply to ask the user for two fields: one for the "name" and another for "How would you like to be called?".
Another approach, which takes advantage of Postgres string processing built-in functions:
SELECT split_part(name, ' ', 1) as first_token,
split_part(name, ' ', array_length(regexp_split_to_array(name, ' '), 1)) as last_token
FROM mytable
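To sanity-check the expected output outside the database, the same first-token/last-token logic can be sketched in Python. The function name first_and_last_word is hypothetical, and returning a single-word name unchanged is an assumption, since the question does not say what should happen in that case.

```python
def first_and_last_word(name):
    """Return the first and last whitespace-separated words of a name."""
    words = name.split()  # split() also drops leading/trailing/repeated spaces
    if not words:
        return ""
    if len(words) == 1:
        # Assumption: a single-word name is returned as-is, not duplicated.
        return words[0]
    return f"{words[0]} {words[-1]}"

names = [
    "Mary Johnson Angel Smith",
    "Dinah Robertson Donald",
    "Paul Blank Power Silver",
]
results = [first_and_last_word(n) for n in names]
print(results)  # ['Mary Smith', 'Dinah Donald', 'Paul Silver']
```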
Here's how I extracted full names from emails with a dot in them, e.g. Jeremy.Thompson@abc.com
SELECT split_part(email, '.', 1) || ' ' || replace(split_part(email, '.', 2), '@abc','')
FROM people
Result:
Jeremy | Thompson
You can easily replace the dot with a space:
SELECT split_part(email, ' ', 1) || ' ' || replace(split_part(email, ' ', 2), '@abc','')
FROM people

PGSQL - Joining two tables on complicated condition

I got stuck during database migration on PostgreSQL and need your help.
I have two tables that I need to join: drzewa_mateczne.migracja (data I need to migrate) and ibl_as.t_adres_lesny (dictionary I need to join with migracja).
I need to join them on replace(drzewa_mateczne.migracja.adresy_lesne, ' ', '') = replace(ibl_as.t_adres_lesny.adres, ' ', ''). However my data is not very regular, so I want to join it on first good match with the dictionary.
I've created the following query:
select
count(*)
from
drzewa_mateczne.migracja a
where
length(a.adresy_lesne) > 0
and replace(a.adresy_lesne, ' ', '') = (select substr(replace(al.adres, ' ', ''), 1, length(replace(a.adresy_lesne, ' ', ''))) from ibl_as.t_adres_lesny al limit 1)
The query doesn't return any rows.
It does successfully join empty rows if run without
length(a.adresy_lesne) > 0
The two following queries return rows (as expected):
select replace(adres, ' ', '')
from ibl_as.t_adres_lesny
where substr(replace(adres, ' ', ''), 1, 16) = '16-15-1-13-180-c'
limit 1
select replace(adresy_lesne, ' ', ''), length(replace(adresy_lesne, ' ', ''))
from drzewa_mateczne.migracja
where replace(adresy_lesne, ' ', '') = '16-15-1-13-180-c'
I suspect that there might be a problem in the sub-query inside the 'where' clause of my query. If you could help me resolve this issue, or at least point me in the right direction, I'd be very grateful.
Thanks in advance,
Jan
You can largely simplify to:
SELECT count(*)
FROM drzewa_mateczne.migracja a
WHERE a.adresy_lesne <> ''
AND EXISTS (
SELECT 1 FROM ibl_as.t_adres_lesny al
WHERE replace(al.adres, ' ', '')
LIKE (replace(a.adresy_lesne, ' ', '') || '%')
)
a.adresy_lesne <> '' does the same as length(a.adresy_lesne) > 0, just faster.
Replace the correlated subquery with an EXISTS semi-join (to get only one match per row).
Replace the complex string construction with a simple LIKE expression.
More information on pattern matching and index support in these related answers:
PostgreSQL LIKE query performance variations
Difference between LIKE and ~ in Postgres
speeding up wildcard text lookups
What you're basically telling the database to do is to get you the count of rows from drzewa_mateczne.migracja that have a non-empty adresy_lesne field that is a prefix of the adres field of a semi-random ibl_as.t_adres_lesny row...
Lose the "limit 1" in the subquery and substitute the "=" with "in" and see if that is what you wanted...
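Both suggestions (the EXISTS semi-join with LIKE, and replacing "= ... limit 1" with "in") can be tried on toy data. This is a rough sketch only: it uses Python's built-in sqlite3 module instead of PostgreSQL, drops the schema prefixes, and the address values are invented. It also assumes the addresses contain no % or _ characters, which LIKE would treat as wildcards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE migracja (adresy_lesne TEXT)")
conn.execute("CREATE TABLE t_adres_lesny (adres TEXT)")

# Hypothetical rows: one prefix match (modulo spaces), one empty, one unmatched.
conn.executemany(
    "INSERT INTO migracja VALUES (?)",
    [("16-15-1-13-180 -c",), ("",), ("99-99-9",)],
)
conn.executemany(
    "INSERT INTO t_adres_lesny VALUES (?)",
    [("01-02-3-04-005-a-00",), ("16-15-1-13-180-c -00",)],
)

# EXISTS semi-join: a row counts once if any dictionary entry starts with it.
exists_count = conn.execute("""
    SELECT count(*)
    FROM migracja a
    WHERE a.adresy_lesne <> ''
      AND EXISTS (
          SELECT 1 FROM t_adres_lesny al
          WHERE replace(al.adres, ' ', '')
                LIKE (replace(a.adresy_lesne, ' ', '') || '%')
      )
""").fetchone()[0]

# IN variant: compare against the truncated form of every dictionary entry,
# not just one arbitrary entry picked by LIMIT 1.
in_count = conn.execute("""
    SELECT count(*)
    FROM migracja a
    WHERE length(a.adresy_lesne) > 0
      AND replace(a.adresy_lesne, ' ', '') IN (
          SELECT substr(replace(al.adres, ' ', ''), 1,
                        length(replace(a.adresy_lesne, ' ', '')))
          FROM t_adres_lesny al
      )
""").fetchone()[0]

print(exists_count, in_count)  # 1 1
```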