Extract strings with multiple words in Postgres 11.0 - postgresql

I have following column in Postgres table. I would like to only get values where there are multiple words in a string.
col1
nilotinib hydrochloride
ergamisol
ergamisol
methdilazine hydrochloride
The desired output is
col1
nilotinib hydrochloride
methdilazine hydrochloride
I am using following pattern to extract the strings but it's not working
SELECT regexp_match(col1, '^\w+\s+.*') from tb1;

To filter rows, use a WHERE clause in your statement:
SELECT col1
FROM tb1
WHERE col1 ~ '^\w+\s+.*';
See the string matching documentation for alternatives to your pattern. For your case, col1 ~ '\s' should be sufficient, or col1 SIMILAR TO '%[[:space:]]%'.

Related

Converting a postgresql array to a set

I'm using the (nifty!) string_agg function in postgresql to accumulate values of a given field
string_agg(r.pmid, ',' order by pmid)
This gives results like this - due to duplicate values of id's in underlying data:
15364708,15364708,15364708,15364709,15364709,15364709
How can this array / list be converted to a set ?
You could try using DISTINCT with STRING_AGG:
SELECT STRING_AGG(DISTINCT r.pmid, ',' ORDER BY pmid)
FROM yourTable
...

SQL sql query with regexp

I have column like
Column1 column 2
Audy. 123
Baza.z. 675
Sco,da#. 432
Here I am trying
Select column2 from table where lower(column1)='audy';
select column2 from table where lower(column1) regexp'[.,]' ='audy';
I need column data with out special characters in it using either sql or postgre sql and should take regexp from where condition only.
You can use regex_replace:
select column2
from table
where regexp_replace(lower(column1), '\W', '', 'g') = 'audy'
The regex \W matches any character that is not a letter or number. The flag 'g' (global) means remove all matches, not just the first one.

Need help in parsing column value based on value in other column

I have two columns, COL1 and COL2. COL1 has value like 'Birds sitting on $1 and enjoying' and COL2 has value like 'the.location_value[/tree,\building]'
I need to update third column COL3 with values like 'Birds sitting on /tree and enjoying'
i.e. $1 in 1st column is replaced with /tree
which is the 1st word from list of comma separated words with in square brackets [] in COL2 i.e. [/tree,\building]
I wanted to know the best suitable combination of string function in postgresql to use to achieve this.
You need to first extract the first element from the comma separated list, to do that, you can use split_part() but you first need to extract the actual list of values. This can be done using substring() with a regular expression:
substring(col2 from '\[(.*)\]')
will return /tree,\building
So the complete query would be:
select replace(col1, '$1', split_part(substring(col2 from '\[(.*)\]'), ',', 1))
from the_table;
Online example: http://rextester.com/CMFZMP1728
This one should work with any (int) number after $:
select t.*, c.col3
from t,
lateral (select string_agg(case
when o = 1 then s
else (string_to_array((select regexp_matches(t.col2, '\[(.*)\]'))[1], ','))[(select regexp_matches(s, '^\$(\d+)'))[1]::int] || substring(s from '^\$\d+(.*)')
end, '' order by o) col3
from regexp_split_to_table(t.col1, '(?=\$\d+)') with ordinality s(s, o)) c
http://rextester.com/OKZAG54145
Note:it is not the most efficient though. It splits col2's values (in the square brackets) each time for replacing $N.
Update: LATERAL and WITH ORDINALITY is not supported in older versions, but you could try a correlating subquery instead:
select t.*, (select array_to_string(array_agg(case
when s ~ E'^\\$(\\d+)'
then (string_to_array((select regexp_matches(t.col2, E'\\[(.*)\\]'))[1], ','))[(select regexp_matches(s, E'^\\$(\\d+)'))[1]::int] || substring(s from E'^\\$\\d+(.*)')
else s
end), '') col3
from regexp_split_to_table(t.col1, E'(?=\\$\\d+)') s) col3
from t

postgres coalesce fields, sum and group by

maybe someone can help me out with a postgres query.
the table structure looks like this
nummer nachname vorname cash
+-------+----------+----------+------+
2 Bert Brecht 0,758
2 Harry Belafonte 1,568
3 Elvis Presley 0,357
4 Mark Twain 1,555
4 Ella Fitz 0,333
…
How can I coalesce the fields where "nummer" are the same and sum the cash values?
My output should look like this:
2 Bert, Brecht 2,326
Harry, Belafonte
3 Elvis, Presley 0,357
4 Mark, Twain 1,888
Ella, Fitz
I think the part to coalesce should work something like this:
array_to_string(array_agg(nachname|| ', ' ||coalesce(vorname, '')), '<br />') as name,
Thanks for any help,
tony
SELECT
nummer,
string_agg(nachname||CASE WHEN vorname IS NULL THEN '' ELSE ', '||vorname END, E'\n') AS name,
sum(cash) AS total_cash
FROM Table1
GROUP BY nummer;
See this SQLFiddle; note that it doesn't display the newline characters between names, but they're still there.
The CASE statement is used instead of coalesce so you don't have a trailing comma on entries with a last name but no first name. If you want a trailing comma, use format('%s, %s',vorname,nachname) instead and avoid all that ugly string concatenation business:
SELECT
nummer, string_agg(format('%s, %s', nachname, vorname), E'\n'),
sum(cash) AS total_cash
FROM Table1
GROUP BY nummer;
If string_agg doesn't work, get a newer PostgreSQL, or mention the version in your questions so it's clear you're using an obsolete version. The query is trivially rewritten to use array_to_string and array_agg anyway.
If you're asking how to sum numbers that're actually represented as text strings like 1,2345 in the database: don't do that. Fix your schema. Format numbers on input and output instead, store them as numeric, float8, integer, ... whatever the appropriate numeric type for the job is.

handle escape sequence char in DB2

I want to search a column and get values where value containts \ .
I tried select * from "Values" where "ValueName" like '\'. But returns no value.
Also tried like "\" and like'\''%' etc. But no results.
See the DB2 Documentation on the LIKE predicate, in particular the parts about escape expressions.
What you want is
select * from Values where ValueName like '\\%' escape '\'
To give an example of usage:
create table backslash_escape_test
(
backslash_escape_test_column varchar(20)
);
insert into backslash_escape_test(backslash_escape_test_column)
values ('foo\');
insert into backslash_escape_test(backslash_escape_test_column)
values ('no slashes here');
insert into backslash_escape_test(backslash_escape_test_column)
values ('foo\bar');
insert into backslash_escape_test(backslash_escape_test_column)
values ('\bar');
select count(*) from backslash_escape_test where
backslash_escape_test_column like '%\\%' escape '\';
returns 3 (all 3 rows with \ in them).
select count(*) from backslash_escape_test where
backslash_escape_test_column like '\\%' escape '\';
returns 1 (the \bar row).
select * from Values where ValueName like '%\\%'
values is a not so good name because it may be confused with the values keyword
Don't escape it. You just need wildcards around it like this:
select count(*)
from escape_test
where test_column like '%\%'
But, suppose you really do need to escape the slash. Here's a simpler, more straightforward answer:
The escape-expression allows you to specify whatever character for escaping that you wish. So why use a character that you're looking for, thus requiring you to escape it? Use any other character instead. I'll use a plus sign as an example, but it could be a backslash, pound-sign, question-mark, anything other than a character you are looking for or one of the wildcard characters (% or _).
select count(*)
from escape_test
where test_column like '%\%' escape '+';
Now you don't have to add anything into your like-pattern.
To hold myself to the same standard of proof that #Michael demonstrated --
create table escape_test
( test_column varchar(20) );
insert into escape_test
(test_column)
values ('foo\'),
('no slashes here'),
('foo\bar'),
('\bar');
select 'test1' trial, count(*) result
from escape_test
where test_column like '%\%'
UNION
select 'test2', count(*)
from escape_test
where test_column like '%\\%' escape '\'
UNION
select 'test3', count(*)
from escape_test
where test_column like '%\%' escape '+'
;
Which returns the same number of rows for each method:
TRIAL RESULT
----- ------
test1 3
test2 3
test3 3