String Include some other strings - postgresql

How do I check in postgres that a varchar contains 'aaa' or 'bbb'?
I tried myVarchar IN ('aaa', 'bbb') but, obviously, it's true when myvarchar is exactly equal to 'aaa' or 'bbb'.

for multiple similarity check the best fit in terms of speed and laconic syntax would be
SIMILAR TO '%(aaa|bbb|ccc)%'

you can use ANY & LIKE operators together.
SELECT * FROM "myTable" WHERE "myColumn" LIKE ANY( ARRAY[ '%aaa%', '%bbb%' ] );

Assuming this is your table:
CREATE TABLE t
(
myVarchar varchar
) ;
INSERT INTO t (myVarchar)
VALUES
('something aaa else'),
('also some bbb'),
('maybe ccc') ;
-- (some random data, this query is PostgreSQL specific)
INSERT INTO t (myVarchar)
SELECT
random()::varchar
FROM
generate_series(1, 10000) ;
SQL Standard approach:
You can do (in all SQL standard databases):
SELECT
*
FROM
t
WHERE
myVarchar LIKE '%aaa%' or myVarchar LIKE '%bbb%' ;
and you'll get:
| myvarchar |
| :----------------- |
| something aaa else |
| also some bbb |
PostgreSQL specific approaches
Specifically for PostgreSQL, you can use a (single) regex with multiple values to look for:
SELECT
*
FROM
t
WHERE
myVarchar ~ 'aaa|bbb' ;
| myvarchar |
| :----------------- |
| something aaa else |
| also some bbb |
dbfiddle here
If you need quick finds, you can use trigram indexes, like this:
CREATE EXTENSION pg_trgm; -- Only needed if extension not already installed
CREATE INDEX myVarchar_like_idx
ON t
USING GIST (myVarchar gist_trgm_ops);
... the query using LIKE will be much faster.

Related

PostgreSQL Stored procedure performance

I use PostgreSQL and I have a table 'PERSON' in schema 'public' that looks like this:
+----+-------------+-------+----------------------------+
| id | internal_id | name | created |
+----+-------------+-------+----------------------------+
| 1 | P0001-XX00 | Bob | 2021-05-24 22:10:01.93025 |
+----+-------------+-------+----------------------------+
| 2 | P0001-CX00 | Tom | 2021-06-27 22:10:01.93025 |
+----+-------------+-------+----------------------------+
| 3 | P0002-XX00 | Anna | 2021-05-24 22:10:01.93025 |
+----+-------------+-------+----------------------------+
id -> bigint; internal_id -> character varying; name -> character varying; created -> timestamp without timezone
I need to write procedure that delete those records that are older than fixed timestamp, for example: now(). But as soon as such an old record has been found, I need to check if there are other records in the table which are not old yet and with the same first 5 characters in internal_id as the found old record. If there are such records, then I should not delete the old record.
So I wrote the following procedure with plpgsql and it seems to work:
BEGIN
DELETE FROM public."PERSON" AS t1
WHERE t1.created < now()
AND NOT EXISTS
(
SELECT
FROM public."PERSON" AS t2
WHERE left(t2.internal_id, 5) = left(t1.internal_id, 5)
AND t2.created >= now()
);
COMMIT;
END;
Questions:
Could it have been made more correct or prettier or cleaner? Perhaps, instead of using the left() function, it was necessary to use LIKE, or, in principle, to do it somehow differently?
Do you think this procedure has normal performance or it can be improved?
Thank you in advance!
That should work just fine.
For good performance, create an index:
CREATE INDEX ON public."PERSON" (left(internal_id, 5), created);

Get cutted JSON in query

I have a column in db which constains JSON values like:
{"key-1": "val-1", "key-2": "val-2", "key-3": "val-3"}
By query like..
SELECT column->>'key-1' FROM table;
I can get my val-1.
Is there a way to get value with key as JSON in sql query from already existed JSON value?
I want to get result like:
{"key-1": "val-1"}
from
{"key-1": "val-1", "key-2": "val-2", "key-3": "val-3"}
using sql query.
Use ampersand operator, &, e.g.,
Live test: https://www.db-fiddle.com/f/9izCEH75JhwVDvsGvsZomG/0
with the_table as
(
select '{"key-1": "val-1", "key-2": "val-2", "key-3": "val-3"}'::jsonb as d
)
select d & 'key-1' as j from the_table
Output:
| j |
| ----------------- |
| {"key-1":"val-1"} |
Just kidding :) Create a function that extracts the desired key value pair, and then create your own user-defined operator for it.
create or replace function extract_one_jsonb(j jsonb, key text)
returns jsonb
as
$$
select jsonb_build_object(key, j->key)
$$ language sql;
create operator & (
leftarg = jsonb,
rightarg = text,
procedure = extract_one_jsonb
);
Of course you can just use a function, or if creating a user-defined operator is not an option:
with the_table as
(
select '{"key-1": "val-1", "key-2": "val-2", "key-3": "val-3"}'::jsonb as d
)
select extract_one_jsonb(d, 'key-1') as j from the_table
Output:
| j |
| ----------------- |
| {"key-1":"val-1"} |
If extracting a key value pair from jsonb is being done many times, it's desirable to give an operator for it, e.g., &. Postgres is pretty flexible when you want to create your own operator, this can be created too: ->>>.
Live test: https://www.db-fiddle.com/f/9izCEH75JhwVDvsGvsZomG/1
create operator ->>> (
leftarg = jsonb,
rightarg = text,
procedure = extract_one_jsonb
);
Output:
| j |
| ----------------- |
| {"key-1":"val-1"} |
->> is already used by Postgres: https://www.postgresql.org/docs/11/functions-json.html
You can create '->>>' instead. ->>> looks more like an extractor operator than ampersand &. Besides it looks good even you stick it to the source field (that is without spaces)
with the_table as
(
select '{"key-1": "val-1", "key-2": "val-2", "key-3": "val-3"}'::jsonb as d
)
select d->>>'key-1' as j from the_table
Tried the following, it works too, looks like a scissor (for cutting): %>
select d%>'key-1' as j from the_table
The only thing I can think of is to get the key/value pair and assemble that back into a single JSON value:
select jsonb_build_object(j.k, j.v)
from the_table t, jsonb_each(t.json_col) as j(k,v)
where j.k = 'key-1'
and ... more conditions ...;
Online example: https://rextester.com/VGSX43955

problems with full-text search in postgres

I have the next table, and data:
/* script for people table, with field tsvector and gin */
CREATE TABLE public.people (
id INTEGER,
name VARCHAR(30),
lastname VARCHAR(30),
complete TSVECTOR
)
WITH (oids = false);
CREATE INDEX idx_complete ON public.people
USING gin (complete);
/* data for people table */
INSERT INTO public.people ("id", "name", "lastname", "complete")
VALUES
(1, 'MICHAEL', 'BRYANT BRYANT', '''bryant'':2,3 ''michael'':1'),
(2, 'HENRY STEVEN', 'BUSH TIESSEN', '''bush'':3 ''henri'':1 ''steven'':2 ''tiessen'':4'),
(3, 'WILLINGTON STEVEN', 'STEPHENS FLINN', '''flinn'':4 ''stephen'':3 ''steven'':2 ''willington'':1'),
(4, 'BRET', 'MARTINEZ AROCH', '''aroch'':3 ''bret'':1 ''martinez'':2'),
(5, 'TERENCE BERT', 'CAVALIERE ENRON', '''bert'':2 ''cavalier'':3 ''terenc'':1');
I need retrieve the names and lastnames, according the tsvector field. Actually I have the query:
SELECT * FROM people WHERE complete ## to_tsquery('WILLINGTON & FLINN');
And the result is right (the third record). BUT if I try with
SELECT * FROM people WHERE complete ## to_tsquery('STEVEN & FLINN');
/* the same record! */
I don't have results. Why? What can I do?
You should use the same language to search your table as the values in your field 'complete' where inserted.
Check the result of that query compared english and german:
select * ,
to_tsvector('english', concat_ws(' ', name, lastname )) as english,
to_tsvector('german', concat_ws(' ', name, lastname )) as german
from public.people
so that should work for you :
SELECT * FROM people WHERE complete ## to_tsquery('english','STEVEN & FLINN');
You are probably using a text search configuration where either STEVEN or FLINN are modified by stemming.
I can reproduce this here:
test=> SHOW default_text_search_config;
default_text_search_config
----------------------------
pg_catalog.german
(1 row)
test=> SELECT complete FROM public.people WHERE id = 3;
complete
-------------------------------------------------
'flinn':4 'stephen':3 'steven':2 'willington':1
(1 row)
test=> SELECT * FROM ts_debug('STEVEN & FLINN');
alias | description | token | dictionaries | dictionary | lexemes
-----------+-----------------+--------+---------------+-------------+---------
asciiword | Word, all ASCII | STEVEN | {german_stem} | german_stem | {stev}
blank | Space symbols | | {} | |
blank | Space symbols | & | {} | |
asciiword | Word, all ASCII | FLINN | {german_stem} | german_stem | {flinn}
(4 rows)
test=> SELECT * FROM public.people
WHERE complete ## to_tsquery('STEVEN & FLINN');
id | name | lastname | complete
----+------+----------+----------
(0 rows)
So you see, the German Snowball dictionary stems STEVEN to stev.
Since complete contains the unstemmed version steven, no match is found.
You should use the same text search configuration when you populate complete and in the query.

Split a string and populate a table for all records in table in SQL Server 2008 R2

I have a table EmployeeMoves:
| EmployeeID | CityIDs
+------------------------------
| 24 | 23,21,22
| 25 | 25,12,14
| 29 | 1,2,5
| 31 | 7
| 55 | 11,34
| 60 | 7,9,21,23,30
I'm trying to figure out how to expand the comma-delimited values from the EmployeeMoves.CityIDs column to populate an EmployeeCities table, which should look like this:
| EmployeeID | CityID
+------------------------------
| 24 | 23
| 24 | 21
| 24 | 22
| 25 | 25
| 25 | 12
| 25 | 14
| ... and so on
I already have a function called SplitADelimitedList that splits a comma-delimited list of integers into a rowset. It takes the delimited list as a parameter. The SQL below will give me a table with split values under the column Value:
select value from dbo.SplitADelimitedList ('23,21,1,4');
| Value
+-----------
| 23
| 21
| 1
| 4
The question is: How do I populate EmployeeCities from EmployeeMoves with a single (even if complex) SQL statement using the comma-delimited list of CityIDs from each row in the EmployeeMoves table, but without any cursors or looping in T-SQL? I could have 100 records in the EmployeeMoves table for 100 different employees.
This is how I tried to solve this problem. It seems to work and is very quick in performance.
INSERT INTO EmployeeCities
SELECT
em.EmployeeID,
c.Value
FROM EmployeeMoves em
CROSS APPLY dbo.SplitADelimitedList(em.CityIDs) c;
UPDATE 1:
This update provides the definition of the user-defined function dbo.SplitADelimitedList. This function is used in above query to split a comma-delimited list to table of integer values.
CREATE FUNCTION dbo.fn_SplitADelimitedList1
(
#String NVARCHAR(MAX)
)
RETURNS #SplittedValues TABLE(
Value INT
)
AS
BEGIN
DECLARE #SplitLength INT
DECLARE #Delimiter VARCHAR(10)
SET #Delimiter = ',' --set this to the delimiter you are using
WHILE len(#String) > 0
BEGIN
SELECT #SplitLength = (CASE charindex(#Delimiter, #String)
WHEN 0 THEN
datalength(#String) / 2
ELSE
charindex(#Delimiter, #String) - 1
END)
INSERT INTO #SplittedValues
SELECT cast(substring(#String, 1, #SplitLength) AS INTEGER)
WHERE
ltrim(rtrim(isnull(substring(#String, 1, #SplitLength), ''))) <> '';
SELECT #String = (CASE ((datalength(#String) / 2) - #SplitLength)
WHEN 0 THEN
''
ELSE
right(#String, (datalength(#String) / 2) - #SplitLength - 1)
END)
END
RETURN
END
Preface
This is not the right way to do it. You shouldn't create comma-delimited lists in SQL Server. This violates first normal form, which should sound like an unbelievably vile expletive to you.
It is trivial for a client-side application to select rows of employees and related cities and display this as a comma-separated list. It shouldn't be done in the database. Please do everything you can to avoid this kind of construction in the future. If at all possible, you should refactor your database.
The Right Answer
To get the list of cities, properly expanded, from a table containing lists of cities, you can do this:
INSERT dbo.EmployeeCities
SELECT
M.EmployeeID,
C.CityID
FROM
EmployeeMoves M
CROSS APPLY dbo.SplitADelimitedList(M.CityIDs) C
;
The Wrong Answer
I wrote this answer due to a misunderstanding of what you wanted: I thought you were trying to query against properly-stored data to produce a list of comma-separated CityIDs. But I realize now you wanted the reverse: to query the list of cities using existing comma-separated values already stored in a column.
WITH EmployeeData AS (
SELECT
M.EmployeeID,
M.CityID
FROM
dbo.SplitADelimitedList ('23,21,1,4') C
INNER JOIN dbo.EmployeeMoves M
ON Convert(int, C.Value) = M.CityID
)
SELECT
E.EmployeeID,
CityIDs = Substring((
SELECT ',' + Convert(varchar(max), CityID)
FROM EmployeeData C
WHERE E.EmployeeID = C.EmployeeID
FOR XML PATH (''), TYPE
).value('.[1]', 'varchar(max)'), 2, 2147483647)
FROM
(SELECT DISTINCT EmployeeID FROM EmployeeData) E
;
Part of my difficulty in understanding is that your question is a bit disorganized. Next time, please clearly label your example data and show what you have, and what you're trying to work toward. Since you put the data for EmployeeCities last, it looked like it was what you were trying to achieve. It's not a good use of people's time when questions are not laid out well.

Filter an ID Column against a range of values

I have the following SQL:
SELECT ',' + LTRIM(RTRIM(CAST(vessel_is_id as CHAR(2)))) + ',' AS 'Id'
FROM Vessels
WHERE ',' + LTRIM(RTRIM(CAST(vessel_is_id as varCHAR(2)))) + ',' IN (',1,2,3,4,5,6,')
Basically, I want to filter the vessel_is_id against a variable list of integer values (which is passed in as a varchar into the stored proc). Now, the above SQL does not work. I do have rows in the table with a `vessel__is_id' of 1, but they are not returned.
Can someone suggest a better approach to this for me? Or, if the above is OK
EDIT:
Sample data
| vessel_is_id |
| ------------ |
| 1 |
| 2 |
| 5 |
| 3 |
| 1 |
| 1 |
So I want to returned all of the above where vessel_is_id is in a variable filter i.e. '1,3' - which should return 4 records.
Cheers.
Jas.
IF OBJECT_ID(N'dbo.fn_ArrayToTable',N'FN') IS NOT NULL
DROP FUNCTION [dbo].[fn_ArrayToTable]
GO
CREATE FUNCTION [dbo].fn_ArrayToTable (#array VARCHAR(MAX))
-- =============================================
-- Author: Dan Andrews
-- Create date: 04/11/11
-- Description: String to Tabled-Valued Function
--
-- =============================================
RETURNS #output TABLE (data VARCHAR(256))
AS
BEGIN
DECLARE #pointer INT
SET #pointer = CHARINDEX(',', #array)
WHILE #pointer != 0
BEGIN
INSERT INTO #output
SELECT RTRIM(LTRIM(LEFT(#array,#pointer-1)))
SELECT #array = RIGHT(#array, LEN(#array)-#pointer),
#pointer = CHARINDEX(',', #array)
END
RETURN
END
Which you may apply like:
SELECT * FROM dbo.fn_ArrayToTable('2,3,4,5,2,2')
and in your case:
SELECT LTRIM(RTRIM(CAST(vessel_is_id AS CHAR(2)))) AS 'Id'
FROM Vessels
WHERE LTRIM(RTRIM(CAST(vessel_is_id AS VARCHAR(2)))) IN (SELECT data FROM dbo.fn_ArrayToTable('1,2,3,4,5,6')
Since Sql server doesn't have an Array you may want to consider passing in a set of values as an XML type. You can then turn the XML type into a relation and join on it. Drawing on the time-tested pubs database for example. Of course you're client may or may not have an easy time generating the XML for the parameter value, but this approach is safe from sql-injection which most "comma seperated" value approaches are not.
declare #stateSelector xml
set #stateSelector = '<values>
<value>or</value>
<value>ut</value>
<value>tn</value>
</values>'
select * from authors
where state in ( select c.value('.', 'varchar(2)') from #stateSelector.nodes('//value') as t(c))