Postgres function to turn regexp_matches SETOF result into an ARRAY - postgresql

I am having a go at pgSQL for the first time.
Many thanks in advance indeed - Eugen
The problem I like to solve
I like to return an array of words with a minimum length from a text. Picked regexp_matches which returns SETOF text[] (not array). There may be better ways but this is what I picked.
regexp_matches ( string text, pattern text [, flags text ] ) → setof text[]
Postgres Reference for String functins
The Issue
When trying to use individual results from the resut set (seemingly arrays) they are in fact of type text but include the curly brackets of a Postgres array.
Example
Functionally the below example works because I added a workaround removing curly brackets with the trim function.
The output for the function described below
# select lower_words_arr('there is an answer','\w{3,}') AS words;
NOTICE: match = {there}
NOTICE: match = {answer}
words
----------------
{there,answer}
(1 row)
My Question
Why come, that individual regexp_matches entries match and return data as "{txt}" but behave as type text and not ARRAY?
Is there a way without my workaround?
Code
CREATE OR REPLACE FUNCTION lower_words_arr(input_str text, match_expr text) RETURNS text[] AS $$
DECLARE
words_arr text[];
one_match text;
BEGIN
FOR one_match IN SELECT regexp_matches(lower(input_str), match_expr, 'g')
AS match
LOOP
RAISE NOTICE 'match = %', one_match;
-- But if I write:
--
-- RAISE NOTICE 'match = %', one_match[0];
--
-- fails with:
---
-- ERROR: cannot subscript type text because it is not an array
-- CONTEXT: SQL statement "SELECT one_match[0]"
-- PL/pgSQL function lower_words_arr(text,text) line 10 at RAISE
--
-- even though this the result returned
---
-- NOTICE: match = {there}
-- NOTICE: match = {answewr}
words_arr := array_append(words_arr,trim(BOTH '{}' FROM one_match));
END LOOP;
RETURN words_arr;
END;
$$ LANGUAGE plpgsql;

I have now solved the issue without the trim workaround. The solution came right from the Postgres documentation. Look for SELECT (regexp_match('foobarbequebaz', 'bar.*que'))[1]; on that page. With that knowledge in hand I changed the function to
Improved Solution
CREATE OR REPLACE FUNCTION lower_words_arr(input_str text, match_expr text) RETURNS text[] AS $$
DECLARE
words_arr text[];
one_match text;
BEGIN
FOR one_match IN SELECT(regexp_matches(lower(input_str), match_expr, 'g'))[1] AS match
LOOP
RAISE NOTICE 'match = %', one_match;
words_arr := array_append(words_arr,one_match);
END LOOP;
RETURN words_arr;
END;
$$ LANGUAGE plpgsql STABLE;
Output
select lower_words_arr('there is an answer - you will find it in the Postgres documentation.','\w{4,}') AS words;
NOTICE: match = there
NOTICE: match = answer
NOTICE: match = will
NOTICE: match = find
NOTICE: match = postgres
NOTICE: match = documentation
words
-------------------------------------------------
{there,answer,will,find,postgres,documentation}
(1 row)
Conclusion
My original problem has been solved, but a puzzling piece understanding is still missing (not subject of my initial question):
Why does the one_match variable not behave like an array when [1] is removed from the SELECT?
Maybe I should open a question specific to that behaviour ...

Related

How to use FOREACH in a PostgreSQL LOOP

I am trying to write a very simple pgsql statement to loop through a simple array of state abbreviations.
CREATE OR REPLACE FUNCTION my_schema.showState()
RETURNS text AS
$$
DECLARE
my_array text[] := '["az","al", "ak", "ar"]'
BEGIN
FOREACH state IN my_array
LOOP
RETURN SELECT format('%s', state);
END LOOP;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM showState();
I am using PostgresSQL version 11+. I keep getting an error ERROR: syntax error at or near "BEGIN" The output I want to see here is just seeing the state abbreviation printed in the results window for now. Like this:
What am I doing wrong here?
There's a ; missing after my_array text[] := '["az","al", "ak", "ar"]'.
'["az","al", "ak", "ar"]' isn't a valid array literal.
If you want a set returning function, you need to declare its return type as a SETOF.
The ARRAY keyword is missing in the FOREACH's head.
state must be declared.
You need to use RETURN NEXT ... to push a value into the set to be returned.
format() is pointless here, it doesn't effectively do anything.
With all that rectified one'd get something along the lines of:
CREATE
OR REPLACE FUNCTION showstate()
RETURNS SETOF text
AS
$$
DECLARE
my_array text[] := ARRAY['az',
'al',
'ak',
'ar'];
state text;
BEGIN
FOREACH state IN ARRAY my_array
LOOP
RETURN NEXT state;
END LOOP;
END;
$$
LANGUAGE plpgsql;
db<>fiddle

Accessing jsonb object in postgress throws error

I have a table called "junaid", which has a column "connections" which is of type "jsonb".
create table junaid (
connection jsonb
}
The value in the "connections" column is array of objects.
conections = [{"name":"abc", "age":123},{"name":"xyz", "age":222}]
I have a stored procedure to access these values.
CREATE OR REPLACE FUNCTION test() RETURNS INTEGER AS $$
DECLARE
myconnection jsonb;
i jsonb;
BEGIN
select connections into myconnection from junaid;
FOR i IN SELECT * FROM jsonb_array_elements(myconnection)
LOOP
RAISE NOTICE 'output from space %', i->>’name’;
END LOOP;
return 0;
EXCEPTION WHEN others THEN
return 1;
END;
$$ LANGUAGE plpgsql;
When I run the stored proc, I get this error:
column "’name’" does not exist
You're using wrong quote characters. Instead of backticks or forward ticks or whatever, you should be using the single-quote character for the keyname too, as you seem to be using for the format string. I.e. it should be i->>'name'.
P.S. SO syntax highlighting shows that something fishy is going on...

How to get the value of a dynamically generated field name in PL/pgSQL

Sample code trimmed down the the bare essentials to demonstrate question:
CREATE OR REPLACE FUNCTION mytest4() RETURNS TEXT AS $$
DECLARE
wc_row wc_files%ROWTYPE;
fieldName TEXT;
BEGIN
SELECT * INTO wc_row FROM wc_files WHERE "fileNumber" = 17117;
-- RETURN wc_row."fileTitle"; -- This works. I get the contents of the field.
fieldName := 'fileTitle';
-- RETURN format('wc_row.%I',fieldName); -- This returns 'wc_row."fileTitle"'
-- but I need the value of it instead.
RETURN EXECUTE format('wc_row.%I',fieldName); -- This gives a syntax error.
END;
$$ LANGUAGE plpgsql;
How can I get the value of a dynamically generated field name in this situation?
Use a trick with the function to_json(), which for a composite type returns a json object with column names as keys:
create or replace function mytest4()
returns text as $$
declare
wc_row wc_files;
fieldname text;
begin
select * into wc_row from wc_files where "filenumber" = 17117;
fieldname := 'filetitle';
return to_json(wc_row)->>fieldname;
end;
$$ language plpgsql;
You don't need tricks. EXECUTE does what you need, you were on the right track already. But RETURN EXECUTE ... is not legal syntax.
CREATE OR REPLACE FUNCTION mytest4(OUT my_col text) AS
$func$
DECLARE
field_name text := 'fileTitle';
BEGIN
EXECUTE format('SELECT %I FROM wc_files WHERE "fileNumber" = 17117', field_name)
INTO my_col; -- data type coerced to text automatically.
END
$func$ LANGUAGE plpgsql;
Since you only want to return a scalar value use EXECUTE .. INTO ... - optionally you can assign to the OUT parameter directly.
RETURN QUERY EXECUTE .. is for returning a set of values.
Use format() to conveniently escape identifiers and avoid SQL injection. Provide identifiers names case sensitive! filetitle is not the same as fileTitle in this context.
Are PostgreSQL column names case-sensitive?
Use an OUT parameter to simplify your code.

Postgres function - using table mapping to update value

I have a general question. I have a function that creates a file. However, within that function presently I am hard coding the file name pattern based on argument inputs. Now I have come to a point where I need to have more than one file name pattern. I devised a way of using another table as the file name map that the function can simply call if the user inputs that file name pattern id. Here is my example to help better illustrate my point:
Here is the table creation and data insertion for referential purposes:
create table some_schema.file_mapper(
mapper_id integer not null,
file_name_template varchar);
insert into some_schema.file_mapper (mapper_id, file_name_template)
values
(1, '||v_first_name||''_''||v_last_name||')
(2, '||v_last_name||''_''||v_first_name||')
(3, '||v_last_name||''_''||v_last_name||');
Now the function itself
create or replace function some_schema.some_function(integer)
returns varchar as
$body$
Declare
v_filename_config_id alias for $1;
v_filename varchar;
v_first_name varchar;
v_last_name varchar;
cmd text;
Begin
v_first_name :='Joe';
v_last_name :='Shmoe';
cmd := 'select file_name_template
from some_schema.file_mapper
where mapper_id = '||v_filename_config||'';
execute cmd into v_filename;
raise notice 'checking "%"',v_filename;
return 'done';
end;
$body$
LANGUAGE plpgsql;
Now that I have this. I want to be able to mix and match file name patterns. So for instance I wanted to use mapper_id 3, I would expect a returned file of "Shmoe_Shmoe.csv" if I execute the script:
select from some_schema.some_function(2)
The Problem is whenever I get it to read the "v_filename" variable it will not evaluate and return the values from the function's variables. Originally, I believed it to be a quoting issue(and it probably still is). After messing with the quoting I have gotten about as far the error below:
ERROR: syntax error at or near "_"
LINE 4: ...s/dir/someplace/||v_last_name||'_'||v_firs...
^
QUERY: copy(
select *
from some_schema.some_table)
to '/dir/someplace/||v_last_name||'_'||v_first_name||.csv/;
DELIMITER,
csv HEADER;
CONTEXT: PL/pgSQL function some_schema.some_function(integer) line 27 at EXECUTE statement
As you can tell it is pretty much telling me it is a quoting issue. Is there a way I can get the function to properly evaluate the variable and return the proper file name? Let me know if I am not clear and need to elaborate.
This more or less illustrates the usage of format(). (Slightly reduced wrt the original question):
CREATE TABLE file_mapper
( mapper_id INTEGER NOT NULL PRIMARY KEY
, file_name_template TEXT
);
INSERT INTO file_mapper(mapper_id, file_name_template) VALUES (1,'one'), (2, 'two');
CREATE OR REPLACE FUNCTION dump_the_shit(INTEGER)
RETURNS VARCHAR AS
$body$
DECLARE
v_filename_config_id alias for $1;
v_filename VARCHAR;
name_cmd TEXT;
copy_cmd TEXT;
BEGIN
name_cmd := format( e'select file_name_template
from file_mapper
where mapper_id = %L;', v_filename_config_id );
EXECUTE name_cmd into v_filename;
RAISE NOTICE 'V_filename := %', v_filename;
copy_cmd := format( e'copy(
select *
from %I)
to \'/tmp/%s.csv\'
csv HEADER;' , 'file_mapper' , v_filename);
EXECUTE copy_cmd;
RETURN copy_cmd;
END;
$body$
LANGUAGE plpgsql;
SELECT dump_the_shit(1);
format function description
summary:
use %I for identifiers (tablenames and column names) [ FROM %I ]
use %L for literals such as query constants [ WHERE the_date = %L ]
use %s for ordinary strings [ to \'/tmp/%s.csv\' ]

PL/pgSQL column name the same as variable

I'm new to plpgsql and I'm trying to create function that will check if a certain value exists in table and if not will add a row.
CREATE OR REPLACE FUNCTION hire(
id_pracownika integer,
imie character varying,
nazwisko character varying,
miasto character varying,
pensja real)
RETURNS TEXT AS
$BODY$
DECLARE
wynik TEXT;
sprawdzenie INT;
BEGIN
sprawdzenie = id_pracownika;
IF EXISTS (SELECT id_pracownika FROM pracownicy WHERE id_pracownika=sprawdzenie) THEN
wynik = "JUZ ISTNIEJE";
RETURN wynik;
ELSE
INSERT INTO pracownicy(id_pracownika,imie,nazwisko,miasto,pensja)
VALUES (id_pracownika,imie,nazwisko,miasto,pensja);
wynik = "OK";
RETURN wynik;
END IF;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
The issue is that I'm getting errors saying that id_pracownika is a column name and a variable.
How to specify that "id_pracownika" in such context refers to column name?
Assuming id_pracownika is The PRIMARY KEY of the table. Or at least defined UNIQUE. (If it's not NOT NULL, NULL is a corner case.)
SELECT or INSERT
Your function is another implementation of "SELECT or INSERT" - a variant of the UPSERT problem, which is more complex in the face of concurrent write load than it might seem. See:
Is SELECT or INSERT in a function prone to race conditions?
With UPSERT in Postgres 9.5 or later
In Postgres 9.5 or later use UPSERT (INSERT ... ON CONFLICT ...) Details in the Postgres Wiki. This new syntax does a clean job:
CREATE OR REPLACE FUNCTION hire(
_id_pracownika integer
, _imie varchar
, _nazwisko varchar
, _miasto varchar
, _pensja real)
RETURNS text
LANGUAGE plpgsql AS
$func$
BEGIN
INSERT INTO pracownicy
( id_pracownika, imie, nazwisko, miasto, pensja)
VALUES (_id_pracownika,_imie,_nazwisko,_miasto,_pensja)
ON CONFLICT DO NOTHING;
IF FOUND THEN
RETURN 'OK';
ELSE
RETURN 'JUZ ISTNIEJE'; -- already exists
END IF;
END
$func$;
About the special variable FOUND:
Why is IS NOT NULL false when checking a row type?
Table-qualify column names to disambiguate where necessary. (You can also prefix function parameters with the function name, but that gets awkward quickly.)
But column names in the target list of an INSERT may not be table-qualified. Those are never ambiguous anyway.
Best avoid ambiguities a priori. Some (including me) like to prefix all function parameters and variables with an underscore.
If you positively need a column name as function parameter name, one way to avoid naming collisions is to use an ALIAS inside the function. One of the rare cases where ALIAS is actually useful.
Or reference function parameters by ordinal position: $1 for id_pracownika in this case.
If all else fails, you can decide what takes precedence by setting #variable_conflict. See:
Naming conflict between function parameter and result of JOIN with USING clause
There is more:
There are intricacies to the RETURNING clause in an UPSERT. See:
How to use RETURNING with ON CONFLICT in PostgreSQL?
String literals (text constants) must be enclosed in single quotes: 'OK', not "OK". See:
Insert text with single quotes in PostgreSQL
Assigning variables is comparatively more expensive than in other programming languages. Keep assignments to a minimum for best performance in plpgsql. Do as much as possible in SQL statements directly.
VOLATILE COST 100 are default decorators for functions. No need to spell those out.
Without UPSERT in Postgres 9.4 or older
...
IF EXISTS (SELECT FROM pracownicy p
WHERE p.id_pracownika = hire.id_pracownika) THEN
RETURN 'JUZ ISTNIEJE';
ELSE
INSERT INTO pracownicy(id_pracownika,imie,nazwisko,miasto,pensja)
VALUES (hire.id_pracownika,hire.imie,hire.nazwisko,hire.miasto,hire.pensja);
RETURN 'OK';
END IF;
...
But there is a tiny race condition between the SELECT and the INSERT, so not bullet-proof under heavy concurrent write-load.
In an EXISTS expression, the SELECT list does not matter. SELECT id_pracownika, SELECT 1, or even SELECT 1/0 - all the same. Just use an empty SELECT list. Only the existence of any qualifying row matters. See:
What is easier to read in EXISTS subqueries?
It is a example tested by me where I use EXECUTE to run a select and put its result in a cursor, using dynamic column names.
1. Create the table:
create table people (
nickname varchar(9),
name varchar(12),
second_name varchar(12),
country varchar(30)
);
2. Create the function:
CREATE OR REPLACE FUNCTION fun_find_people (col_name text, col_value varchar)
RETURNS void AS
$BODY$
DECLARE
local_cursor_p refcursor;
row_from_people RECORD;
BEGIN
open local_cursor_p FOR
EXECUTE 'select * from people where '|| col_name || ' LIKE ''' || col_value || '%'' ';
raise notice 'col_name: %',col_name;
raise notice 'col_value: %',col_value;
LOOP
FETCH local_cursor_p INTO row_from_people; EXIT WHEN NOT FOUND;
raise notice 'row_from_people.nickname: %', row_from_people.nickname ;
raise notice 'row_from_people.name: %', row_from_people.name ;
raise notice 'row_from_people.country: %', row_from_people.country;
END LOOP;
END;
$BODY$ LANGUAGE 'plpgsql'
3. Run the function
select fun_find_people('name', 'Cristian');
select fun_find_people('country', 'Chile');
inspire with Erwin Brandstetter's answers.
CREATE OR REPLACE FUNCTION test_upsert(
_parent_id int,
_some_text text)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE a text;
BEGIN
INSERT INTO parent_tree (parent_id, some_text)
VALUES (_parent_id,_some_text)
ON CONFLICT DO NOTHING
RETURNING 'ok' into a;
return a;
IF NOT FOUND THEN
return 'JUZ ISTNIEJE';
END IF;
END
$func$;
Follow Erwin's answer. I make a variable hold the return type text.
If conflict do nothing then the function will return nothing. For example, already have parent_id = 10, Then the result would be as following:
test_upsert
------------
(1 row)
NOT Sure the usage of:
IF NOT FOUND THEN
return 'JUZ ISTNIEJE';
END IF;