Counter for inserted rows in PSQL

I'm looking for a PostgreSQL equivalent of SQL Server's @@ROWCOUNT. I have a script that batch inserts into multiple tables; the idea is to declare some variables that count how many rows of each kind were inserted during execution.
So far I have this, but there's no straightforward counterpart to @@ROWCOUNT:
BEGIN TRANSACTION;
DO $$
DECLARE
EmailModulesTotal integer := 0;
DependenciesTotal integer := 0;
ModuleTypesTotal integer := 0;
ModuleSectionsTotal integer := 0;
BEGIN
-- inserts go here
RAISE NOTICE 'Total Inserted/Updated Email Modules: %
Total Inserted Dependencies: %
Total Inserted Module Types: %
Total Inserted Module Sections: %',
EmailModulesTotal,
DependenciesTotal,
ModuleTypesTotal,
ModuleSectionsTotal;
END $$;
COMMIT TRANSACTION;

In PL/pgSQL you can access the number of affected (i.e. inserted in your case) rows using get diagnostics. Here is an illustration.
create temporary table t (id serial, txt text);
do language plpgsql
$$
declare
counter integer;
begin
insert into t(txt) values ('One'), ('Two'), ('Three');
get diagnostics counter = row_count;
raise notice 'Inserted % rows', counter;
end;
$$;
The result is Inserted 3 rows
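Applied to the script in the question, it could look like this (a sketch: the table names and inserted values are placeholders, not the real schema):
create temporary table email_modules (id serial, txt text);
create temporary table dependencies (id serial, txt text);

do language plpgsql
$$
declare
    EmailModulesTotal integer := 0;
    DependenciesTotal integer := 0;
    affected integer;
begin
    insert into email_modules(txt) values ('One'), ('Two');
    get diagnostics affected = row_count;
    EmailModulesTotal := EmailModulesTotal + affected;

    insert into dependencies(txt) values ('Three');
    get diagnostics affected = row_count;
    DependenciesTotal := DependenciesTotal + affected;

    raise notice 'Total Inserted Email Modules: %
Total Inserted Dependencies: %',
        EmailModulesTotal,
        DependenciesTotal;
end;
$$;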
Another way, in plain SQL, is to use a data-modifying CTE:
with cte as
(
insert into t(txt) values ('One'), ('Two'), ('Three')
returning 1
)
select count(*) from cte;
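If you need that number inside PL/pgSQL rather than as a query result, the CTE combines with INTO (a sketch, reusing the table t from above):
do language plpgsql
$$
declare
    counter integer;
begin
    with cte as
    (
        insert into t(txt) values ('Four'), ('Five')
        returning 1
    )
    select count(*) into counter from cte;
    raise notice 'Inserted % rows', counter;
end;
$$;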
Whichever suits you better.


Postgres Count records inserted/ updated

I'm trying to keep track of a client's database with which we sync. I need to record records_added (INSERTs) and records_updated (UPDATEs) to our table.
I'm using an UPSERT to handle the sync, and a trigger to update a table keeping track of insert/updates.
The issue is counting records that are updated. I have 40+ columns to check; do I have to put all of these in my check logic? Is there a more elegant way?
Section of code in question:
select
case
when old.uuid = new.uuid
and (
old.another_field != new.another_field
or old.and_another_field != new.and_another_field
-- many more columns here << This is particularly painful
) then 1
else 0
end into update_count;
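(An aside on the painful block above: PL/pgSQL can compare entire row values with IS DISTINCT FROM, which avoids listing every column and also handles NULLs, unlike !=. A sketch of how the two counting blocks could collapse, assuming the same FOR EACH ROW trigger on INSERT OR UPDATE:)
if tg_op = 'UPDATE' then
    -- 1 if any column differs between the old and new row, 0 otherwise
    update_count := (old is distinct from new)::int;
    insert_count := 0;
else  -- tg_op = 'INSERT'
    update_count := 0;
    insert_count := 1;
end if;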
Reproducible example:
-- create tables
CREATE TABLE IF NOT EXISTS example (uuid serial primary key, another_field int, and_another_field int);
CREATE TABLE IF NOT EXISTS tracker_table (
records_added integer DEFAULT 0,
records_updated integer DEFAULT 0,
created_at date unique
);
-- create function
CREATE OR REPLACE FUNCTION update_records_inserted () RETURNS TRIGGER AS $body$
DECLARE update_count INT;
DECLARE insert_count INT;
BEGIN
-- ---------------- START OF BLOCK IN QUESTION -----------------
select
case
when old.uuid = new.uuid
and (
old.another_field != new.another_field
-- many more columns here
) then 1
else 0
end into update_count;
-- ------------------ END OF BLOCK IN QUESTION ------------------
-- count INSERTs
select
case
when old.uuid is null
and new.uuid is not null then 1
else 0
end into insert_count;
-- log the counts
-- raise notice 'update %', update_count;
-- raise notice 'insert %', insert_count;
-- insert or update count to tracker table
insert into
tracker_table(
created_at,
records_added,
records_updated
)
VALUES
(CURRENT_DATE, insert_count, update_count) ON CONFLICT (created_at) DO
UPDATE
SET
records_added = tracker_table.records_added + insert_count,
records_updated = tracker_table.records_updated + update_count;
RETURN NEW;
END;
$body$ LANGUAGE plpgsql;
-- Trigger
DROP TRIGGER IF EXISTS example_trigger ON example;
CREATE TRIGGER example_trigger
AFTER
INSERT
OR
UPDATE
ON example FOR EACH ROW EXECUTE PROCEDURE update_records_inserted ();
-- A query that inserts on the first run, then updates on subsequent runs
insert into example(uuid, another_field, and_another_field) values (1, 2, 3) ON CONFLICT(uuid) DO UPDATE SET another_field = excluded.another_field + 1;

ERROR: query has no destination for result data: Postgresql

I am using PostgreSQL 11, and a function that works well in a single run fails when I add a LOOP statement with:
"ERROR: query has no destination for result data HINT: If you want to
discard the results of a SELECT, use PERFORM instead."
The function has VOID as return value; it selects data from a source table into a temp table, calculates some values, and inserts the result into a target table. The temp table is then dropped and the function ends. I would like to repeat this procedure at defined intervals, so I added a LOOP statement. With LOOP it does not insert into the target table and does not actually loop at all.
create function transfer_cs_regular_loop(trading_pair character varying) returns void
language plpgsql
as
$$
DECLARE
first_open decimal;
first_price decimal;
last_close decimal;
last_price decimal;
highest_price decimal;
lowest_price decimal;
trade_volume decimal;
n_trades int;
start_time bigint;
last_entry bigint;
counter int := 0;
time_frame int := 10;
BEGIN
WHILE counter < 100 LOOP
SELECT max(token_trades.trade_time) INTO last_entry FROM token_trades WHERE token_trades.trade_symbol = trading_pair;
RAISE NOTICE 'Latest Entry: %', last_entry;
start_time = last_entry - (60 * 1000);
RAISE NOTICE 'Start Time: %', start_time;
CREATE TEMP TABLE temp_table AS
SELECT * FROM token_trades where trade_symbol = trading_pair and trade_time > start_time;
SELECT temp_table.trade_time,temp_table.trade_price INTO first_open, first_price FROM temp_table ORDER BY temp_table.trade_time ASC FETCH FIRST 1 ROW ONLY;
SELECT temp_table.trade_time,temp_table.trade_price INTO last_close, last_price FROM temp_table ORDER BY temp_table.trade_time DESC FETCH FIRST 1 ROW ONLY;
SELECT max(temp_table.trade_price) INTO highest_price FROM temp_table;
SELECT min(temp_table.trade_price) INTO lowest_price FROM temp_table;
SELECT INTO trade_volume sum(temp_table.trade_quantity) FROM temp_table;
SELECT INTO n_trades count(*) FROM temp_table;
INSERT INTO candlestick_data_5min_test(open, high, low, close, open_time, close_time, volume, number_trades, trading_pair) VALUES (first_price, highest_price, lowest_price, last_price, first_open, last_close, trade_volume, n_trades, trading_pair);
DROP TABLE temp_table;
counter := counter + 1;
SELECT pg_sleep(time_frame);
RAISE NOTICE '**************************Counter: %', counter;
END LOOP;
END;
$$;
The error refers to the last SELECT statement in the function. In PL/pgSQL, every SELECT needs a destination for its result, normally an INTO target; a SELECT whose result goes nowhere raises exactly this error when the statement is executed.
If you only want a query's side effects, as with pg_sleep(), use PERFORM: it runs the query exactly like SELECT would but discards the result.
Therefore change the last SELECT into a PERFORM and the error will go away:
PERFORM pg_sleep(time_frame);
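A minimal illustration of the difference (added for clarity, not from the original answer):
do language plpgsql
$$
begin
    -- select pg_sleep(0);  -- raises: query has no destination for result data
    perform pg_sleep(0);    -- same query, result discarded: no error
end;
$$;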

Postgresql 9.5 Performance degradation when updating JSON field using jsonb_set

I have a performance problem when using jsonb_set to replace values in a jsonb column.
It seems that every time values are replaced in the JSONB document, the time taken to update the field increases.
This is my test table:
CREATE TABLE IF NOT EXISTS TEMP_package_var(
id VARCHAR,
val jsonb
);
My test function is:
create or replace function set_package_var_any(package_name text, var_name text, var_subname text, var_value anyelement) returns VOID as $func$
DECLARE
wk_var jsonb;
wk_mainvar jsonb;
wk_newvar jsonb;
wk_name text := lower(var_name);
wk_subname text := lower(var_subname);
start_at timestamp;
end_at timestamp;
BEGIN
select val from TEMP_package_var INTO wk_var WHERE id = lower(package_name);
if not found then
insert into TEMP_package_var(id, val) values(lower(package_name),jsonb_build_object(wk_name, jsonb_build_object(wk_subname, var_value)));
return;
end if;
wk_subname := var_name || ',' || wk_subname;
start_at := clock_timestamp();
UPDATE TEMP_package_var SET val = jsonb_set(wk_var, string_to_array(wk_subname,','), to_jsonb(var_value), true) WHERE id = lower(package_name);
end_at := clock_timestamp();
raise notice 'UPDATE Time__________________________________________: % - [%]', end_at - start_at, var_value;
END;
$func$
LANGUAGE PLPGSQL
SECURITY DEFINER;
And my main function is:
CREATE OR REPLACE FUNCTION pkgvar_set_test(loops bigint, rerun bigint) returns void as $body$
declare
start_at timestamp;
end_at timestamp;
begin
start_at := clock_timestamp();
for ix0 in 1 .. rerun loop
for ix in 1 .. loops loop
perform set_package_var_any('package_name', 'var_name', 'var_name_'||to_char(ix,'FM00000'), 'var_value_'||to_char(ix0,'FM00000')||'_'||to_char(ix,'FM00000'));
end loop;
end loop;
end_at := clock_timestamp();
raise notice 'time is %', end_at - start_at;
end;
$body$
LANGUAGE PLPGSQL
SECURITY DEFINER;
Then I execute the main function several times.
The test looks like this:
select pkgvar_set_test(2,2);
This creates the following JSONB row:
{"var_name": {"var_name_00001": "var_value_00001_00001", "var_name_00002": "var_value_00001_00002"}}
And then replaces the values:
{"var_name": {"var_name_00001": "var_value_00002_00001", "var_name_00002": "var_value_00002_00002"}}
But when running the same test with more records:
First time:
select pkgvar_set_test(50,50);
NOTICE: time is 00:00:00.667468
Second time:
select pkgvar_set_test(50,50);
NOTICE: time is 00:00:01.348275
Third time:
select pkgvar_set_test(50,50);
NOTICE: time is 00:00:01.920818
And so on; as you can see, replacing keeps getting slower.
I understand the difference between the first run and the second run, but what I cannot figure out is why the time keeps increasing with the number of executions.
Can somebody help me out?
Postgresql: 9.5.2
OS: Centos 7.1
Thank you
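(Editorial aside, not from the original thread: in PostgreSQL every UPDATE writes a new row version and leaves the old one behind as a dead tuple until vacuum reclaims it, and TEMP_package_var has no index, so each call scans an increasingly bloated table. A quick way to test that theory between runs:)
-- a growing n_dead_tup across runs (with last_autovacuum unchanged) would
-- point at table bloat as the cause of the slowdown
SELECT n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'temp_package_var';
If bloat is the cause, a VACUUM between runs should bring the timing back down.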

Using a row as a table in a query within a function PLpgSQL

I am trying to write a plpgsql function that loops through a table. On each loop, it pulls a row from the table, stores it in a record, then uses that record in the join clause of a query. Here is my code:
CREATE OR REPLACE FUNCTION "testfncjh2" () RETURNS int
IMMUTABLE
SECURITY DEFINER
AS $dbvis$
DECLARE
counter int;
tablesize int;
rec1 record;
tablename text;
rec2 record;
BEGIN
counter = 0;
for rec1 in SELECT * FROM poilocations_sridconv loop
raise notice 'here';
execute $$ select count(*) from $$||rec1||$$ $$ into tablesize;
while counter < tablesize loop
counter = counter + 1;
raise notice 'hi';
execute $$ select count(*) from cities_sridconv $$ into tablesize;
end loop;
end loop;
return counter;
END;
$dbvis$ LANGUAGE plpgsql;
Each time I run this, I get the following error:
ERROR: could not find array type for data type record
Is there a way to use the row as a table in the query within the nested loops?
My end goal is to build a function that loops through a table, pulling a row from that table on each loop. In each loop, a number COUNTER is computed using the row, then a query is executed depending on the row and COUNTER. Knowing that this code is currently very flawed, I am posting it below to give an idea of what I am trying to do:
CREATE OR REPLACE FUNCTION "testfncjh" () RETURNS void
IMMUTABLE
SECURITY DEFINER
AS $dbvis$
DECLARE
counter int;
tablesize int;
rec1 record;
tablename text;
rec2 record;
BEGIN
for rec1 in SELECT * FROM poilocations_sridconv loop
counter = 0;
execute $$ select count(*)
from $$||rec1||$$ a
join
cities_srid_conv b
on right(a.geom_wgs_pois,$$||counter||$$) = right(b.geom_wgs_pois,$$||counter||$$) $$ into tablesize;
raise notice 'got through first execute';
while tablesize = 0 loop
counter = counter + 1;
execute $$ select count(*)
from '||rec1||' a
join
cities_srid_conv b
on right(a.geom_wgs_pois,'||counter||') = right(b.geom_wgs_pois,'||counter||') $$ into tablesize;
raise notice 'hi';
end loop;
EXECUTE
'select
poiname,
name as cityname,
postgis.ST_Distance(postgis.ST_GeomFromText(''POINT(poilat poilong)''),
postgis.ST_GeomFromText(''POINT(citylat citylong)'')
) as distance
from (select a.poiname,
a.latitude::text as poilat,
a.longitude::text as poilong,
b.geonameid,
b.name,
b.latitude as citylat,
b.longitude as citylong
from '||rec1||' a
join cities_srid_conv b
on right(a.geom_wgs_pois,'||counter||') = right(b.geom_wgs_pois,'||counter||'))
) x
order by distance
limit 1'
poi_cities_match (poiname, cityname, distance); ------SQL STATEMENT TO INSERT CLOSEST CITY TO TABLE POI_CITIES_MATCH
end loop;
END;
$dbvis$ LANGUAGE plpgsql;
I am running on a PostgreSQL 8.2.15 database.
Also, sorry for reposting. I had to remove some data from the original.
I think you should be able to use composite types for what you want. I simplified your top example and used composite types in the following way.
CREATE OR REPLACE FUNCTION "testfncjh2" () RETURNS int
IMMUTABLE
SECURITY DEFINER
AS $dbvis$
DECLARE
counter int;
tablesize int;
rec1 poilocations_sridconv;
tablename text;
rec2 record;
BEGIN
counter = 0;
for rec1 in SELECT * FROM poilocations_sridconv loop
raise notice 'here';
select count(*) from (select (rec1).*) theRecord into counter;
end loop;
return counter;
END;
$dbvis$ LANGUAGE plpgsql;
The main changes are declaring rec1 with the table's composite type (the rec1 poilocations_sridconv; line) and the (select (rec1).*) subquery.
Hope it helps.
EDIT: I should note that the function is not doing the same thing as it does in the question above. This is just as an example of how you could use a record as a table in a query.
You have a few issues with your code (apart, perhaps, from your logic).
Foremost, you should not use a record as a table source in a JOIN. Instead, filter the second table for rows that match some field from the record.
Second, you should use the format() function instead of assembling strings with the || operator. But you can't because you are using the before-prehistoric version 8.2. This is from the cave-painting era (yes, it's that bad). UPGRADE!
Thirdly, don't over-complicate your queries. The sub-query is not necessary here.
Put together, the second dynamic query from your real code would reduce to this:
EXECUTE format(
'SELECT b.name,
postgis.ST_Distance(postgis.ST_SetSRID(postgis.ST_MakePoint(%1$I.longitude, %1$I.latitude), 4326),
postgis.ST_SetSRID(postgis.ST_MakePoint(b.longitude, b.latitude), 4326)) AS distance
FROM cities_srid_conv b
WHERE right(%1$I.geom_wgs_pois, %2$L) = right(b.geom_wgs_pois, %2$L)
ORDER BY distance
LIMIT 1', rec1, counter) INTO cityname, distance;
poi_cities_match (rec1.poiname, cityname, distance); ------SQL STATEMENT TO INSERT CLOSEST CITY TO TABLE POI_CITIES_MATCH
Here %1$I refers to the first parameter after the format string, which is an identifier: rec1; %2$L is the second parameter, a literal value: counter. I leave it to you to re-work this into pre-8.4 string concatenation. The results of the query are stored in a couple of additional variables which you can then use in the following function call.
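(A quick illustration of how those placeholders expand, added for clarity:)
SELECT format('right(%1$I.geom_wgs_pois, %2$L) = right(b.geom_wgs_pois, %2$L)',
              'poilocations_sridconv', 3);
-- right(poilocations_sridconv.geom_wgs_pois, '3') = right(b.geom_wgs_pois, '3')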
Lastly, you had longitude and latitude reversed. In PostGIS longitude always comes first.

postgresql copy with schema support

I'm trying to load some data from CSV using the postgresql COPY command. The trick is that I'd like to implement multi-tenancy on a userid (which is contained in the CSV). Is there an easy way to tell the postgres copy command to filter based on this userid when loading the csv?
i.e. all rows with userid=x go to schema=x, rows with userid=y go to schema=y.
There is not a way of doing this with just the COPY command, but you could copy all your data into a master table, and then put together a simple PL/PGSQL function that does this for you. Something like this -
CREATE OR REPLACE FUNCTION public.spike()
RETURNS void AS
$BODY$
DECLARE
user_id integer;
destination_schema text;
BEGIN
FOR user_id IN SELECT userid FROM master_table GROUP BY userid LOOP
CASE user_id
WHEN 1 THEN
destination_schema := 'foo';
WHEN 2 THEN
destination_schema := 'bar';
ELSE
destination_schema := 'baz';
END CASE;
EXECUTE 'INSERT INTO '|| destination_schema ||'.my_table SELECT * FROM master_table WHERE userid=$1' USING user_id;
-- EXECUTE 'DELETE FROM master_table WHERE userid=$1' USING user_id;
END LOOP;
TRUNCATE TABLE master_table;
RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
This gets all unique user_ids from master_table, uses a CASE statement to determine the destination schema, executes an INSERT ... SELECT to copy each user's rows over, and finally truncates master_table (a per-user DELETE is left commented out as an alternative).
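Usage would then be along these lines (a sketch; the file path and CSV layout are placeholders):
-- load everything into the staging table first
COPY master_table FROM '/tmp/users.csv' WITH (FORMAT csv, HEADER true);
-- (from psql, \copy works the same way for a client-side file)

-- then fan the rows out to the per-tenant schemas
SELECT public.spike();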