I need a function to insert rows because one column's (serialno) default value should be the same as the PK id.
I have defined this table:
CREATE SEQUENCE some_table_id_seq
INCREMENT 1
START 1
MINVALUE 1
MAXVALUE 9223372036854775807
CACHE 1;
CREATE TABLE some_table
(
id bigint NOT NULL DEFAULT nextval('some_table_id_seq'::regclass),
itemid integer NOT NULL,
serialno bigint,
CONSTRAINT stockitem_pkey PRIMARY KEY (id),
CONSTRAINT stockitem_serialno_key UNIQUE (serialno)
);
and a function to insert a given count of rows:
CREATE OR REPLACE FUNCTION insert_item(itemid int, count int DEFAULT 1) RETURNS SETOF bigint AS
$func$
DECLARE
ids bigint[] DEFAULT '{}';
id bigint;
BEGIN
FOR counter IN 1..count LOOP
id := NEXTVAL( 'some_table_id_seq' );
INSERT INTO some_table (id, itemid, serialno) VALUES (id, itemid, id);
ids := array_append(ids, id);
END LOOP;
RETURN QUERY SELECT unnest(ids);
END
$func$
LANGUAGE plpgsql;
And inserting with it works fine:
$ select insert_item(123, 10);
insert_item
-------------
1
2
3
4
5
6
7
8
9
10
(10 rows)
$ select * from some_table;
id | itemid | serialno
----+--------+----------
1 | 123 | 1
2 | 123 | 2
3 | 123 | 3
4 | 123 | 4
5 | 123 | 5
6 | 123 | 6
7 | 123 | 7
8 | 123 | 8
9 | 123 | 9
10 | 123 | 10
(10 rows)
But if I want to use the insert_item function as a subquery, it no longer seems to work:
$ select id, itemid from some_table where id in (select insert_item(123, 10));
id | itemid
----+--------
(0 rows)
I created a dummy function insert_dumb to test in a subquery:
CREATE OR REPLACE FUNCTION insert_dumb(itemid int, count int DEFAULT 1) RETURNS SETOF bigint AS
$func$
DECLARE
ids bigint[] DEFAULT '{}';
BEGIN
FOR counter IN 1..count LOOP
ids := array_append(ids, counter::bigint);
END LOOP;
RETURN QUERY SELECT unnest(ids);
END
$func$
LANGUAGE plpgsql;
and this works in a subquery as expected:
$ select id, itemid from some_table where id in (select insert_dumb(123, 10));
id | itemid
----+--------
1 | 123
2 | 123
3 | 123
4 | 123
5 | 123
6 | 123
7 | 123
8 | 123
9 | 123
10 | 123
(10 rows)
Why does the insert_item function not insert new rows when called as a subquery? I added a RAISE NOTICE to the loop and it runs as expected, printing a new id every time (and incrementing the sequence), but no new rows are appended to the table.
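The loop with the notice looked roughly like this:
FOR counter IN 1..count LOOP
    id := nextval('some_table_id_seq');
    RAISE NOTICE 'inserting id %', id;
    INSERT INTO some_table (id, itemid, serialno) VALUES (id, itemid, id);
    ids := array_append(ids, id);
END LOOP;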
I made the whole setup available as a fiddle.
I am using Postgres 11 on Ubuntu.
EDIT
Of course, I left out my real reason, and it pays off...
I need the insert_item function to return the ids so that I can use them in an UPDATE statement, like:
update some_table set some_text = 'x' where id in (select insert_item(123, 10));
And in addition to the why-question: it is understandable that I get no ids in return (because the queries share the same snapshot), but the function runs all the needed INSERTs without affecting the table. Shouldn't those rows be available in the next query?
The problem is that the subquery and the surrounding query share the same snapshot, that is, they see the same state of the database. Hence the outer query cannot see the rows inserted by the inner query.
See the documentation (which explains that in the context of WITH, although it also applies here):
The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot (see Chapter 13), so they cannot “see” one another's effects on the target tables.
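The same effect can be reproduced without the function, using a data-modifying CTE (a minimal sketch against the table from the question):
WITH new_rows AS (
    INSERT INTO some_table (id, itemid, serialno)
    SELECT n, 123, n
    FROM (SELECT nextval('some_table_id_seq') AS n
          FROM generate_series(1, 10)) AS seq
    RETURNING id
)
SELECT s.id, s.itemid
FROM some_table s
WHERE s.id IN (SELECT id FROM new_rows);
-- returns 0 rows: the rows are inserted, but the outer SELECT
-- runs with the same snapshot and cannot see them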
In addition, there is a second problem with your approach: if you run EXPLAIN (ANALYZE) on your statement, you will find that the subquery is not executed at all! Since the table is empty, there is no id, and running the subquery is not necessary to calculate the (empty) result.
You will have to run that in two different statements. Or, better, do it in a different fashion: updating a row that you just inserted is unnecessarily wasteful.
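A two-statement version could look like this (a sketch; it assumes the some_text column from your UPDATE actually exists on some_table):
-- statement 1: run the inserts and capture the ids
CREATE TEMP TABLE new_ids AS
SELECT insert_item(123, 10) AS id;

-- statement 2: gets a fresh snapshot, so the new rows are visible
UPDATE some_table
SET some_text = 'x'
WHERE id IN (SELECT id FROM new_ids);
Better still, let insert_item set some_text in its INSERT directly, so the extra UPDATE disappears altogether.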
Laurenz explained the visibility problem, but you don't need the subquery at all if you rewrite your function to return the actual table rows, rather than just the IDs:
CREATE OR REPLACE FUNCTION insert_item(itemid int, count int DEFAULT 1)
RETURNS setof some_table
AS
$func$
INSERT INTO some_table (id, itemid, serialno)
SELECT nextval('some_table_id_seq'), itemid, currval('some_table_id_seq')
FROM generate_series(1, count)
RETURNING *;
$func$
LANGUAGE sql;
Then you can use it like this:
select id, itemid
from insert_item(123, 10);
And you get the complete inserted rows.
Online example
Related
I want to write a function that gets data from two different tables.
My code:
create function
return table(a integer,b integer,c integer,k integer,l integer,m integer);
if(x=1) then
select a,b,c from mst_1
else
select k,l,m from mst_2
end IF;
end;
The problem is that the two tables have different columns, so I'm getting an error.
I replicated a case similar to yours, and it's just a matter of using the correct syntax.
If you have two tables like test and test_other, as in my case:
create table test (id serial, name varchar, surname varchar);
insert into test values(1,'Carlo', 'Rossi');
insert into test values(2,'Giovanni', 'Galli');
create table test_other (id_other serial, name_other varchar, surname_other varchar);
insert into test_other values(1,'Beppe', 'Bianchi');
insert into test_other values(2,'Salmo', 'Verdi');
you now want a function that returns the 3 columns from test if an input parameter is 1, and the 3 columns from test_other otherwise.
Your function will look like the following:
create or replace function case_return(x integer)
returns table(id integer,value_1 varchar, value_2 varchar)
language plpgsql
as
$$
begin
if(x=1) then
return query select test.id,test.name,test.surname from test;
else
return query select test_other.id_other, test_other.name_other, test_other.surname_other from test_other;
end IF;
end;
$$
;
The function always returns the columns id, value_1 and value_2, as per its definition, even if the source columns are different:
defaultdb=> select * from case_return(0);
 id | value_1 | value_2
----+---------+---------
1 | Beppe | Bianchi
2 | Salmo | Verdi
(2 rows)
defaultdb=> select * from case_return(1);
 id | value_1 | value_2
----+----------+---------
1 | Carlo | Rossi
2 | Giovanni | Galli
(2 rows)
I have data with a date range; some dates don't arrive for a few days, and for the missed window I just want to insert the previous data.
Is there a way to take care of this during the insert itself?
For example
create table foo (ID VARCHAR(10), foo_count int, actual_date date);
insert into foo values ('234534', 100, '2017-01-01'), ('234534', 200, '2017-01-02');
insert into foo values ('234534', 300, '2017-01-03');
insert into foo values ('234534', 300, '2017-01-08');
After the last insert I want to make sure the previous data gets generated, so it should look something like this:
ID | foo_count | actual_date
-----------+-----------------+------------
234534 | 100 | 2017-01-01
234534 | 200 | 2017-01-02
234534 | 300 | 2017-01-03
234534 | 300 | 2017-01-04
234534 | 300 | 2017-01-05
234534 | 300 | 2017-01-06
234534 | 300 | 2017-01-07
234534 | 300 | 2017-01-08
I am using JPA to insert the data. Currently I query the table for the latest date and populate the missing data myself.
I would think about a better INSERT statement. Inserting from a SELECT statement would make things easier. The SELECT statement could be used to generate the requested date series.
INSERT INTO foo
SELECT
--<advanced query>
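For illustration, such a generating SELECT might look like this (a sketch; the gap boundaries and the carried-forward count are hard-coded here, while a real query would derive them from the table):
INSERT INTO foo (id, foo_count, actual_date)
SELECT '234534', 300, d::date
FROM generate_series(date '2017-01-04', date '2017-01-07', interval '1 day') AS g(d);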
However, I guess that's not simply possible, since you are not using JDBC directly or don't want to use native queries for inserting your data.
In that case, you could install a trigger in your database which does the magic:
demo:db<>fiddle
Trigger function:
CREATE FUNCTION insert_missing()
RETURNS TRIGGER AS
$$
DECLARE
max_record record;
BEGIN
SELECT -- 1
id,
foo_count,
actual_date
FROM
foo
ORDER BY actual_date DESC
LIMIT 1
INTO max_record;
IF (NEW.actual_date - 1 > max_record.actual_date) THEN -- 2
INSERT INTO foo
SELECT
max_record.id,
max_record.foo_count,
generate_series(max_record.actual_date + 1, NEW.actual_date - 1, interval '1 day'); -- 3
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Query the record with the current maximum date.
If the maximum date is more than one day before the new date...
... Insert a date series (from day after current max date until the date before the new one). This can be generated with generate_series().
Afterwards create the BEFORE INSERT trigger:
CREATE TRIGGER insert_missing
BEFORE INSERT
ON foo
FOR EACH ROW
EXECUTE PROCEDURE insert_missing();
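With the trigger in place, the gap is filled automatically (a sketch using the data from the question):
insert into foo values ('234534', 100, '2017-01-01');
insert into foo values ('234534', 300, '2017-01-08');
-- the trigger generates rows for 2017-01-02 .. 2017-01-07,
-- each carrying the previous foo_count of 100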
I have got a composite primary key in a table in PostgreSQL (I am using pgAdmin4).
Let's call the two primary key columns productno and version.
version represents the version of productno.
So if I create a new dataset, then it needs to be checked if a dataset with this productno already exists.
If productno doesn't exist yet, then version should be (version) 1
If productno exists once, then version should be 2
If productno exists twice, then version should be 3
... and so on
So that we get something like:
productno | version
----------+--------
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
I found a quite similar problem: auto increment on composite primary key
But I can't use this solution because PostgreSQL syntax is obviously a bit different. I tried a lot with functions and triggers but couldn't figure out the right way to do it.
You can keep the version numbers in a separate table (one for each "base PK" value). That is way more efficient than doing a max() + 1 on every insert and has the additional benefit that it's safe for concurrent transactions.
So first we need a table that keeps track of the version numbers:
create table version_counter
(
product_no integer primary key,
version_nr integer not null
);
Then we create a function that increments the version for a given product_no and returns that new version number:
create function next_version(p_product_no int)
returns integer
as
$$
insert into version_counter (product_no, version_nr)
values (p_product_no, 1)
on conflict (product_no)
do update
set version_nr = version_counter.version_nr + 1
returning version_nr;
$$
language sql
volatile;
The trick here is the INSERT ... ON CONFLICT, which increments an existing value or inserts a new row if the passed product_no does not yet exist.
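Called on its own, the function behaves like this (42 is just an arbitrary example value):
select next_version(42); -- returns 1: a new counter row is inserted
select next_version(42); -- returns 2: the existing row is incremented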
For the product table:
create table product
(
product_no integer not null,
version_nr integer not null,
created_at timestamp default clock_timestamp(),
primary key (product_no, version_nr)
);
Then create a trigger:
create function increment_version()
returns trigger
as
$$
begin
new.version_nr := next_version(new.product_no);
return new;
end;
$$
language plpgsql;
create trigger base_table_insert_trigger
before insert on product
for each row
execute procedure increment_version();
This is safe for concurrent transactions because the row in version_counter will be locked for that product_no until the transaction inserting the row into the product table is committed - which will commit the change to the version_counter table as well (and free the lock on that row).
If two concurrent transactions insert the same value for product_no, one of them will wait until the other finishes.
If two concurrent transactions insert different values for product_no, they can work without having to wait for the other.
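A sketch of this with two concurrent sessions:
-- session 1
begin;
insert into product (product_no) values (1); -- locks the version_counter row for product_no 1

-- session 2
insert into product (product_no) values (1); -- blocks, waiting for session 1

-- session 1
commit; -- releases the lock; session 2 proceeds and gets version_nr 2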
If we then insert these rows:
insert into product (product_no) values (1);
insert into product (product_no) values (2);
insert into product (product_no) values (3);
insert into product (product_no) values (1);
insert into product (product_no) values (3);
insert into product (product_no) values (2);
The product table looks like this:
select *
from product
order by product_no, version_nr;
product_no | version_nr | created_at
-----------+------------+------------------------
1 | 1 | 2019-08-23 10:50:57.880
1 | 2 | 2019-08-23 10:50:57.947
2 | 1 | 2019-08-23 10:50:57.899
2 | 2 | 2019-08-23 10:50:57.989
3 | 1 | 2019-08-23 10:50:57.926
3 | 2 | 2019-08-23 10:50:57.966
Online example: https://rextester.com/CULK95702
You can do it like this:
-- Check whether the pk already exists (my_table stands in for your table)
SELECT pk INTO temp_pk FROM my_table a WHERE a.pk = v_pk1;
-- If it exists, insert the new row with the existing pk as versionpk
IF temp_pk IS NOT NULL THEN
    INSERT INTO my_table(pk, versionpk) VALUES (v_pk1, temp_pk);
END IF;
So - I got it to work now.
So if you want a column to update depending on another column in PostgreSQL, have a look at this:
This is the function I use:
CREATE FUNCTION public.testfunction()
RETURNS trigger
LANGUAGE 'plpgsql'
COST 100
VOLATILE NOT LEAKPROOF
AS $BODY$
DECLARE v_productno INTEGER := NEW.productno;
BEGIN
IF NOT EXISTS (SELECT *
FROM testtable
WHERE productno = v_productno)
THEN
NEW.version := 1;
ELSE
NEW.version := (SELECT MAX(testtable.version)+1
FROM testtable
WHERE testtable.productno = v_productno);
END IF;
RETURN NEW;
END;
$BODY$;
And this is the trigger that runs the function:
CREATE TRIGGER testtrigger
BEFORE INSERT
ON public.testtable
FOR EACH ROW
EXECUTE PROCEDURE public.testfunction();
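With both in place, inserts pick up their version automatically (a sketch; the testtable definition is not shown in the question, so the column list is an assumption):
insert into testtable (productno) values (1); -- version is set to 1
insert into testtable (productno) values (1); -- version is set to 2
insert into testtable (productno) values (2); -- version is set to 1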
Thank you @ChechoCZ, you definitely helped me get moving in the right direction.
I have a table like this:
id | group_id | parent_group
---+----------+-------------
1 | 1 | null
2 | 1 | null
3 | 2 | 1
4 | 2 | 1
Is it possible to add a constraint such that a row is automatically deleted when there is no row with a group_id equal to the row's parent_group? For example, if I delete rows 1 and 2, I want rows 3 and 4 to be deleted automatically because there are no more rows with group_id 1.
The answer that clemens posted led me to the following solution. I'm not very familiar with triggers, though; could there be any problems with this, and is there a better way to do it?
CREATE OR REPLACE FUNCTION on_group_deleted() RETURNS TRIGGER AS $$
BEGIN
IF NOT EXISTS (SELECT 1 FROM my_table WHERE group_id = OLD.group_id) THEN
DELETE FROM my_table WHERE parent_group = OLD.group_id;
END IF;
RETURN OLD;
END;
$$ LANGUAGE PLPGSQL;
CREATE TRIGGER my_table_delete_trigger AFTER DELETE ON my_table
FOR EACH ROW
EXECUTE PROCEDURE on_group_deleted();
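With the sample data from the question, the cascade then works like this (a sketch):
DELETE FROM my_table WHERE group_id = 1;
-- once the last group-1 row is gone, the trigger deletes
-- rows 3 and 4, whose parent_group is 1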
What is the best way to optimize a join query that joins the same table on the next id value within a subgroup? For now I have something like this:
CREATE OR REPLACE FUNCTION select_next_id(bigint, bigint) RETURNS bigint AS $body$
DECLARE
_id bigint;
BEGIN
SELECT id INTO _id FROM table WHERE id_group = $2 AND id > $1 ORDER BY id ASC LIMIT 1;
RETURN _id;
END;
$body$ LANGUAGE plpgsql;
And the JOIN query:
SELECT * FROM table t1
JOIN table t2 ON t2.id = select_next_id(t1.id, t1.id_group)
The table has more than 2 million rows, and the query takes very, very long. Is there a better way to do this quickly? I also have a UNIQUE INDEX on the id column. Not very helpful, I guess.
Some sample data:
id | id_group
=============
1 | 1
2 | 1
3 | 1
4 | 2
5 | 2
6 | 2
20 | 4
25 | 4
37 | 4
40 | 1
55 | 2
And I want to receive something like this:
id | id_next
1 | 2
2 | 3
3 | null
4 | 5
5 | 6
6 | 55
and so on.
For the query in the function, you need an index on (id_group, id), not just (id).
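For example (a sketch that keeps the question's placeholder table name; substitute your real table):
CREATE INDEX table_id_group_id_idx ON "table" (id_group, id);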
Next, you don't need the overhead of PL/pgSQL in the function itself, and you can give the planner a few hints by marking it STABLE and giving it a small cost:
CREATE OR REPLACE FUNCTION select_next_id(bigint, bigint) RETURNS bigint AS $body$
SELECT id FROM table WHERE id_group = $2 AND id > $1 ORDER BY id ASC LIMIT 1;
$body$ LANGUAGE sql STABLE COST 10;
In the final query, depending on what you're actually trying to do, you might be able to get rid of the join and the function call by using lead(), as highlighted by the horse:
http://www.postgresql.org/docs/current/static/tutorial-window.html
I'm not entirely sure, but I think you want something like this:
select id,
lead(id) over (partition by id_group order by id) as id_next
from the_table
order by id, id_next;