How can we capture modified dates for all the tables in PostgreSQL (apart from writing data triggers on each table for inserts/deletes)? Can we do it at a generic level for all the tables using event triggers?
For example, the query below captures modified dates for every function in a generic way:
select now(), nspname, proname, command_tag, prosrc
from pg_event_trigger_ddl_commands() e
join pg_proc p on p.oid = e.objid
join pg_namespace n on n.oid = pronamespace;
Likewise, is there any way to capture last-modified-date logs for all tables?
I have tried:
CREATE FUNCTION test_event_trigger_table_rewrite_oid()
RETURNS event_trigger
LANGUAGE plpgsql AS
$$
BEGIN
RAISE NOTICE 'rewriting table % for reason %',
pg_event_trigger_table_rewrite_oid()::regclass,
pg_event_trigger_table_rewrite_reason();
END;
$$;
CREATE EVENT TRIGGER test_table_rewrite_oid
ON table_rewrite
EXECUTE FUNCTION test_event_trigger_table_rewrite_oid();
but the above code only captures the time when the table's DDL is changed. I want it to happen on inserts/deletes.
To capture the time when DML statements are run, use a trigger AFTER INSERT OR UPDATE OR DELETE defined FOR EACH STATEMENT.
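A minimal sketch of that approach with a single shared trigger function (the names table_mod_log, log_table_modified and my_table are assumptions for illustration). The trigger itself still has to be created once per table, but the function stays generic:

CREATE TABLE table_mod_log (
    table_name    text PRIMARY KEY,
    last_modified timestamptz NOT NULL
);

CREATE FUNCTION log_table_modified()
RETURNS trigger
LANGUAGE plpgsql AS
$$
BEGIN
    -- TG_TABLE_SCHEMA / TG_TABLE_NAME identify the firing table,
    -- so one function serves every table
    INSERT INTO table_mod_log (table_name, last_modified)
    VALUES (TG_TABLE_SCHEMA || '.' || TG_TABLE_NAME, now())
    ON CONFLICT (table_name)
    DO UPDATE SET last_modified = EXCLUDED.last_modified;
    RETURN NULL;  -- return value is ignored for AFTER ... FOR EACH STATEMENT
END;
$$;

CREATE TRIGGER trg_log_modified
AFTER INSERT OR UPDATE OR DELETE ON my_table
FOR EACH STATEMENT EXECUTE FUNCTION log_table_modified();

The CREATE TRIGGER statements for all existing tables can be generated from pg_class, and an event trigger on ddl_command_end could attach the trigger to newly created tables as well.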
I have created a procedure that generates a temp table using colpivot (https://github.com/hnsl/colpivot) and saves the result into a physical table in PostgreSQL, as below:
create or replace procedure create_report_table()
language plpgsql
as $$
begin
drop table if exists reports;
select colpivot('_report',
'select
u.username,
c.shortname as course_short_name,
to_timestamp(cp.timecompleted)::date as completed
FROM mdl_course_completions AS cp
JOIN mdl_course AS c ON cp.course = c.id
JOIN mdl_user AS u ON cp.userid = u.id
WHERE c.enablecompletion = 1
ORDER BY u.username' ,array['username'], array['course_short_name'], '#.completed', null);
create table reports as (SELECT * FROM _report);
commit;
end; $$
The colpivot call, the DROP TABLE and the CREATE TABLE all work fine in isolation, but when I create the procedure as above and call it, it throws the error query has no destination for result data.
Is there any way I can use colpivot in combination with several other statements, as I am currently trying?
Use PERFORM instead of SELECT. That will execute the statement, without the need to keep the result somewhere. This is what the manual says:
Sometimes it is useful to evaluate an expression or SELECT query but discard the result, for example when calling a function that has side-effects but no useful result value. To do this in PL/pgSQL, use the PERFORM statement.
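Applied to your procedure, only the colpivot line changes (a sketch; it assumes colpivot is installed and creates the temp table _report, as in your original):

create or replace procedure create_report_table()
language plpgsql
as $$
begin
    drop table if exists reports;
    -- PERFORM runs the function and discards its result,
    -- so no SELECT ... INTO destination is needed
    perform colpivot('_report',
        'select
             u.username,
             c.shortname as course_short_name,
             to_timestamp(cp.timecompleted)::date as completed
         from mdl_course_completions as cp
         join mdl_course as c on cp.course = c.id
         join mdl_user as u on cp.userid = u.id
         where c.enablecompletion = 1
         order by u.username', array['username'], array['course_short_name'], '#.completed', null);
    create table reports as (select * from _report);
    commit;
end; $$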
I am trying to remove duplicated data from some of our databases based on unique IDs. All deleted data should be stored in a separate table for auditing purposes. Since it concerns quite a few databases and different schemas and tables, I wanted to start using variables to reduce the chance of errors and the amount of work it will take me.
This is the best example query I could think of, but it doesn't work:
do $$
declare #source_schema varchar := 'my_source_schema';
declare #source_table varchar := 'my_source_table';
declare #target_table varchar := 'my_target_schema' || source_table || '_duplicates'; --target schema and appendix are always the same, source_table is a variable input.
declare #unique_keys varchar := ('1', '2', '3')
begin
select into #target_table
from #source_schema.#source_table
where id in (#unique_keys);
delete from #source_schema.#source_table where export_id in (#unique_keys);
end ;
$$;
The query syntax works with hard-coded values.
Most of the time my variables are interpreted as columns, or not recognized at all. :(
You need to create and then call a plpgsql procedure with input parameters:
CREATE OR REPLACE PROCEDURE duplicates_suppress
(my_target_schema text, my_source_schema text, my_source_table text, unique_keys text[])
LANGUAGE plpgsql AS
$$
BEGIN
EXECUTE FORMAT(
'WITH list AS (INSERT INTO %1$I.%3$I_duplicates SELECT * FROM %2$I.%3$I WHERE array[id] <@ %4$L :: integer[] RETURNING id)
DELETE FROM %2$I.%3$I AS t USING list AS l WHERE t.id = l.id', my_target_schema, my_source_schema, my_source_table, unique_keys :: text) ;
END ;
$$ ;
The procedure duplicates_suppress inserts into my_target_schema.my_source_table || '_duplicates' the rows from my_source_schema.my_source_table whose id is in the array unique_keys, and then deletes these rows from the table my_source_schema.my_source_table.
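For example, with the sample values from the question (the schema and table names are placeholders):

CALL duplicates_suppress('my_target_schema', 'my_source_schema', 'my_source_table', array['1','2','3']);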
See the test result in dbfiddle.
As has been commented, you need some kind of dynamic SQL. In a FUNCTION, PROCEDURE or a DO statement to do it on the server.
You should be comfortable with PL/pgSQL. Dynamic SQL is no beginners' toy.
Example with a PROCEDURE, like Edouard already suggested. You'll need a FUNCTION instead to wrap it in an outer transaction (like you very well might). See:
When to use stored procedure / user-defined function?
CREATE OR REPLACE PROCEDURE pg_temp.f_archive_dupes(_source_schema text, _source_table text, _unique_keys int[], OUT _row_count int)
LANGUAGE plpgsql AS
$proc$
-- target schema and appendix are always the same, source_table is a variable input
DECLARE
_target_schema CONSTANT text := 's2'; -- hardcoded
_target_table text := _source_table || '_duplicates';
_sql text := format(
'WITH del AS (
DELETE FROM %I.%I
WHERE id = ANY($1)
RETURNING *
)
INSERT INTO %I.%I TABLE del', _source_schema, _source_table
, _target_schema, _target_table);
BEGIN
RAISE NOTICE '%', _sql; -- debug
EXECUTE _sql USING _unique_keys; -- execute
GET DIAGNOSTICS _row_count = ROW_COUNT;
END
$proc$;
Call:
CALL pg_temp.f_archive_dupes('s1', 't1', '{1, 3}', 0);
db<>fiddle here
I made the procedure temporary, since I assume you don't need to keep it permanently. Create it once per database. See:
How to create a temporary function in PostgreSQL?
Passed schema and table names are case-sensitive strings! (Unlike unquoted identifiers in plain SQL.) Either way, be wary of SQL-injection when concatenating SQL dynamically. See:
Are PostgreSQL column names case-sensitive?
Table name as a PostgreSQL function parameter
Made _unique_keys type int[] (array of integer) since your sample values look like integers. Use the actual data type of your id columns!
The variable _sql holds the query string, so it can easily be inspected before actually executing it; RAISE NOTICE '%', _sql; serves that purpose.
I suggest commenting out the EXECUTE line until you are sure.
I made the PROCEDURE return the number of processed rows. You didn't ask for that, but it's typically convenient. At hardly any cost. See:
Dynamic SQL (EXECUTE) as condition for IF statement
Best way to get result count before LIMIT was applied
Last, but not least, use DELETE ... RETURNING * in a data-modifying CTE. Since it has to find the rows only once, it comes at about half the cost of a separate SELECT and DELETE. And it's perfectly safe: if anything goes wrong, the whole transaction is rolled back anyway.
Two separate commands can also run into concurrency issues or race conditions which are ruled out this way, as DELETE implicitly locks the rows to delete. Example:
Replicating data between Postgres DBs
Or you can build the statements in a client program. Like psql, and use \gexec. Example:
Filter column names from existing table for SQL DDL statement
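A minimal psql sketch of that route, reusing the s1/s2 schemas and the t1_duplicates target table from the example above (the query builds the statement as text, and \gexec executes it):

SELECT format(
    'WITH del AS (DELETE FROM s1.%1$I WHERE id = ANY(''{1,3}''::int[]) RETURNING *)
     INSERT INTO s2.%1$I_duplicates TABLE del', 't1')
\gexec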
Based on Erwin's answer, with a minor optimization...
create or replace procedure pg_temp.p_archive_dump
(_source_schema text, _source_table text,
_unique_key int[],_target_schema text)
language plpgsql as
$$
declare
_row_count bigint;
_target_table text := '';
BEGIN
_target_table := _source_table || '_' || array_to_string(_unique_key, '_');
raise notice 'the deleted records will be stored in %.%', _target_schema, _target_table;
execute format('create table %I.%I as select * from %I.%I limit 0',_target_schema, _target_table,_source_schema,_source_table );
execute format('with mm as ( delete from %I.%I where id = any (%L) returning * ) insert into %I.%I table mm'
,_source_schema,_source_table,_unique_key, _target_schema, _target_table);
GET DIAGNOSTICS _row_count = ROW_COUNT;
RAISE NOTICE 'rows affected: %', _row_count;
end
$$;
--
If your _unique_key array is small, this solution also creates the target table for you; obviously you need to create the target schema yourself.
If the array is large, the generated table name becomes unwieldy, so you may want to customize how the dumped table is named.
Let's call it.
call pg_temp.p_archive_dump('s1','t1', '{1,2}','s2');
s1 is the source schema, t1 is the source table, {1,2} are the unique keys you want to extract to the new table, and s2 is the target schema.
I'm trying to, somehow, trigger an automatic function drop when a table is dropped, and I can't figure out how to do it.
TL;DR: Is there a way to trigger a function drop when a specific table is dropped? (POSTGRESQL 11.7)
Detailed explanation
I'll try to explain my problem using a simplified use case with dummy names.
I have three tables: sensor1, sensor2 and sumSensors;
A FUNCTION (sumdata) was created to INSERT data into the sumSensors table. Inside this function I fetch data from the sensor1 and sensor2 tables and insert their sum into sumSensors;
A trigger was created for each sensor table, like this:
CREATE TRIGGER trig1
AFTER INSERT ON sensor1
FOR EACH ROW EXECUTE
FUNCTION sumdata();
Now, when a new row is inserted on table sensor1 OR sensor2, the function sumdata is executed and inserts the sum of the last values from both into table sumSensors.
If I wanted to DROP FUNCTION sumdata CASCADE;, the triggers would be automatically removed from tables sensor1 and sensor2. Up to here everything is fine! But that's not what I want.
My problem is:
Q: And if I just DROP TABLE sumSensors CASCADE;? What would happen to the function which was meant to insert into this table?
A: As expected, since there's no association between the sumSensors table and the sumdata function, the function won't be dropped (it still exists)! The same happens to the triggers which use it (they still exist). This means that when a new row is inserted on the sensor tables, the function sumdata will still be executed and will fail, and that failure aborts the statement (even the INSERT which fired the trigger won't actually be inserted).
Is there a way to trigger a function drop when a specific table is dropped?
Thank you in advance
There is no dependency tracking for functions in PostgreSQL (as of version 12).
You can use event triggers to maintain the dependencies yourself.
Full example follows.
More information: documentation of event triggers feature, support functions.
BEGIN;
CREATE TABLE _testtable ( id serial primary key, payload text );
INSERT INTO _testtable (payload) VALUES ('Test data');
CREATE FUNCTION _testfunc(integer) RETURNS integer
LANGUAGE SQL AS $$ SELECT $1 + count(*)::integer FROM _testtable; $$;
SELECT _testfunc(100);
CREATE FUNCTION trg_drop_dependent_functions()
RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
_dropped record;
BEGIN
FOR _dropped IN
SELECT schema_name, object_name
FROM pg_catalog.pg_event_trigger_dropped_objects()
WHERE object_type = 'table'
LOOP
IF _dropped.schema_name = 'public' AND _dropped.object_name = '_testtable' THEN
EXECUTE 'DROP FUNCTION IF EXISTS _testfunc(integer)';
END IF;
END LOOP;
END;
$$;
CREATE EVENT TRIGGER trg_drop_dependent_functions ON sql_drop
EXECUTE FUNCTION trg_drop_dependent_functions();
DROP TABLE _testtable;
ROLLBACK;
Below is a great function to check the real count of all tables in a PostgreSQL database. I found it here.
From my local test, it seems that the function returns all results only after it has finished counting all 100 tables.
I am trying to make it more practical. If we could save the result for each table as soon as its count finishes, we could check the progress of the counting job instead of waiting for the end.
I think if I could UPDATE the result in this function immediately after finishing the first table, it would be great for my requirement.
Can you let me know how I can update the result into the table after this function finishes counting the first table?
CREATE FUNCTION rowcount_all(schema_name text default 'public')
RETURNS table(table_name text, cnt bigint) as
$$
declare
table_name text;
begin
for table_name in SELECT c.relname FROM pg_class c
JOIN pg_namespace s ON (c.relnamespace=s.oid)
WHERE c.relkind = 'r' AND s.nspname=schema_name
ORDER BY c.relname
LOOP
RETURN QUERY EXECUTE format('select cast(%L as text), count(*) from %I.%I',
table_name, schema_name, table_name);
END LOOP;
end
$$ language plpgsql;
-- Query
WITH rc(schema_name,tbl) AS (
select s.n,rowcount_all(s.n) from (values ('schema1'),('schema2')) as s(n)
)
SELECT schema_name,(tbl).* FROM rc;
Updated
I have decided to use a shell script to run the function below as a background process. The function would generate a processing log file so that I can check the current progress.
I think your idea is good, but I also don't think it will work "out of the box" on PostgreSQL. I'm by no means the expert on this, but the way MVCC works in PostgreSQL, a function basically does all of its DML in what can best be understood as temporary space, and then, if and when everything works as expected, it all becomes visible at the end.
This has a lot of advantages, most notably that when someone is updating tables it doesn't prevent others from querying those same tables.
If this were Oracle, I think you could accomplish this within the stored proc by using commit, but this isn't Oracle. And to be fair, Oracle doesn't allow truncates to be rolled back within a stored proc the way PostgreSQL does, so there are gives and takes.
Again, I'm not the expert, so if I've messed up a detail or two, feel free to correct me.
So, back to the solution. One way you COULD accomplish this is to set up your server as a remote server. Something like this would work:
CREATE SERVER pgprod
FOREIGN DATA WRAPPER dblink_fdw
OPTIONS (dbname 'postgres', host 'localhost', port '5432');
Assuming you have a table that stores the tables and counts:
create table table_counts (
table_name text not null,
record_count bigint,
constraint table_counts_pk primary key (table_name)
);
Were it not for your desire to see these results as they occur, something like this would work for a single schema. It's easy enough to extend this to all schemas, so this is for illustration:
CREATE or replace FUNCTION rowcount_all(schema_name text)
returns void as
$$
declare
rowcount integer;
tablename text;
begin
for tablename in SELECT c.relname FROM pg_class c
JOIN pg_namespace s ON (c.relnamespace=s.oid)
WHERE c.relkind = 'r' AND s.nspname=schema_name
ORDER BY c.relname
LOOP
EXECUTE format('select count(*) from %I.%I', schema_name, tablename) into rowcount;
insert into table_counts values (schema_name || '.' || tablename, rowcount)
on conflict (table_name) do
update set record_count = rowcount;
END LOOP;
end
$$ language plpgsql;
(this presupposes 9.5 or greater -- if not, hand-roll your own upsert).
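For older servers, a hand-rolled upsert inside the loop might look like this (a sketch; unlike ON CONFLICT it is not safe against concurrent writers without retry logic):

update table_counts
set    record_count = rowcount
where  table_name = schema_name || '.' || tablename;
if not found then
    insert into table_counts values (schema_name || '.' || tablename, rowcount);
end if;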
However, since you want real-time updates to the table, you could then put that same upsert into a dblink expression:
perform dblink_exec('pgprod', '
<< your upsert statement here >>
');
Of course the formatting of the SQL within the dblink call is now a little extra tricky, but the upside is that once you nail it, you can run the function in the background and query the table while it's running to see the dynamic results.
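format() takes most of the quoting pain out of it. A sketch of the loop body (it assumes the dblink extension is installed and that the current user can connect to the pgprod server defined above):

EXECUTE format('select count(*) from %I.%I', schema_name, tablename)
    INTO rowcount;
-- dblink_exec runs in a separate connection, so the upsert commits
-- independently of the still-running function
PERFORM dblink_exec('pgprod', format(
    'insert into table_counts values (%L, %s)
     on conflict (table_name) do update set record_count = excluded.record_count',
    schema_name || '.' || tablename, rowcount));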
I'd weigh that against the need to really have the information real-time.
I have a table called assignments. I would like to be able to read/write to all the columns in this table using either assignments.column or homework.column, how can I do this?
I know this is not something you would normally do. I need to be able to do this to provide backwards compatibility for a short period of time.
We have an iOS app that currently does direct postgresql queries against the DB. We're updating all of our apps to use an API. In the process of building the API the developer decided to change the name of the tables because we (foolishly) thought we didn't need backwards compatibility.
Now, V1.0 and the API both need to be able to write to this table so I don't have to do some voodoo to transfer/combine data later...
We're using Ruby on Rails for the API.
With Postgres 9.3 the following should be enough:
CREATE VIEW homework AS SELECT * FROM assignments;
It works because simple views are automatically updatable (see docs).
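A quick, self-contained way to convince yourself (the table and column names here are throwaway examples, not your schema):

BEGIN;
CREATE TABLE assignments (id serial PRIMARY KEY, title text);
CREATE VIEW homework AS SELECT * FROM assignments;
INSERT INTO homework (title) VALUES ('essay');            -- writes through to assignments
UPDATE homework SET title = 'revised essay' WHERE id = 1;
SELECT * FROM assignments;                                -- shows the row written via the view
ROLLBACK;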
In Postgres 9.3 or later, a simple VIEW is "updatable" automatically. The manual:
Simple views are automatically updatable: the system will allow INSERT, UPDATE and DELETE statements to be used on the view in the same way as on a regular table. A view is automatically updatable if it satisfies all of the following conditions:
The view must have exactly one entry in its FROM list, which must be a table or another updatable view.
The view definition must not contain WITH, DISTINCT, GROUP BY, HAVING, LIMIT, or OFFSET clauses at the top level.
The view definition must not contain set operations (UNION, INTERSECT or EXCEPT) at the top level.
The view's select list must not contain any aggregates, window functions or set-returning functions.
If one of these conditions is not met (or for the now outdated Postgres 9.2 or older), a manual setup may do the job.
Building on your work in progress:
Trigger function
CREATE OR REPLACE FUNCTION trg_ia_insupdel()
RETURNS trigger
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl CONSTANT regclass := 'iassignments_assignments';
_cols text;
_vals text;
BEGIN
CASE TG_OP
WHEN 'INSERT' THEN
INSERT INTO iassignments_assignments
VALUES (NEW.*);
RETURN NEW;
WHEN 'UPDATE' THEN
SELECT INTO _cols, _vals
string_agg(quote_ident(attname), ', ') -- incl. pk col!
, string_agg('n.' || quote_ident(attname), ', ')
FROM pg_attribute
WHERE attrelid = _tbl -- _tbl converted to oid automatically
AND attnum > 0 -- no system columns
AND NOT attisdropped; -- no dropped (dead) columns
EXECUTE format('
UPDATE %s t
SET (%s) = (%s)
FROM (SELECT ($1).*) n
WHERE t.published_assignment_id
= ($2).published_assignment_id' -- match to OLD value of pk
, _tbl, _cols, _vals) -- _tbl converted to text automatically
USING NEW, OLD;
RETURN NEW;
WHEN 'DELETE' THEN
DELETE FROM iassignments_assignments
WHERE published_assignment_id = OLD.published_assignment_id;
RETURN OLD;
END CASE;
RETURN NULL; -- control should never reach this
END
$func$;
Trigger
CREATE TRIGGER insupbef
INSTEAD OF INSERT OR UPDATE OR DELETE ON assignments_published
FOR EACH ROW EXECUTE PROCEDURE trg_ia_insupdel();
Notes
assignments_published must be a VIEW; an INSTEAD OF trigger is only allowed for views.
Dynamic SQL (in the UPDATE section) is not strictly necessary, only to cover future changes to the table layout automatically. The names of table and PK are still hard coded.
Simpler and probably cheaper without sub-block (like you had).
Using (SELECT ($1).*) instead of the shorter VALUES ($1.*) to preserve column names.
My naming convention: I prepend trg_ for trigger functions, followed by an abbreviation indicating the target table, and finally one or more of the tokens ins, up and del for INSERT, UPDATE and DELETE respectively. The name of the trigger is a copy of the function name, stripped of the first two parts. This is purely a matter of convention and taste, but it has proven useful for me since the names tell the purpose and are still short.
More explanation in the related answer that has already been mentioned:
Update multiple columns in a trigger function in plpgsql
This is where I am with the trigger functions so far; any feedback would be greatly appreciated. It's a combination of http://vibhorkumar.wordpress.com/2011/10/28/instead-of-trigger/ and Update multiple columns in a trigger function in plpgsql
Table: iassignments_assignments
Columns:
published_assignment_id
name
filepath
filename
link
teacher
due date
description
published
classrooms
View: assignments_published - SELECT * FROM iassignments_assignments
Trigger Function for assignments_published
CREATE OR REPLACE FUNCTION assignments_published_trigger_func()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $function$
BEGIN
IF TG_OP = 'INSERT' THEN
EXECUTE format('INSERT INTO %s SELECT ($1).*', 'iassignments_assignments')
USING NEW;
RETURN NEW;
ELSIF TG_OP = 'UPDATE' THEN
DECLARE
tbl text := 'iassignments_assignments';
cols text;
vals text;
BEGIN
SELECT INTO cols, vals
string_agg(quote_ident(attname), ', ')
,string_agg('x.' || quote_ident(attname), ', ')
FROM pg_attribute
WHERE attrelid = tbl::regclass
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0; -- no system columns
EXECUTE format('
UPDATE %s t
SET (%s) = (%s)
FROM (SELECT ($1).*) x
WHERE t.published_assignment_id = ($2).published_assignment_id'
, tbl, cols, vals)
USING NEW, OLD;
RETURN NEW;
END;
ELSIF TG_OP = 'DELETE' THEN
DELETE FROM iassignments_assignments WHERE published_assignment_id=OLD.published_assignment_id;
RETURN OLD;
END IF;
RETURN NEW;
END;
$function$;
Trigger
CREATE TRIGGER assignments_published_trigger
INSTEAD OF INSERT OR UPDATE OR DELETE ON
assignments_published FOR EACH ROW EXECUTE PROCEDURE assignments_published_trigger_func();
Table: iassignments_classes
Columns:
class_assignment_id
guid
assignment_published_id
View: assignments_class - SELECT * FROM iassignments_classes
Trigger Function for assignments_class
I'll create this function once I have received feedback on the other one and know it's correct, so I will (hopefully) need very few changes to this function.
Trigger
CREATE TRIGGER assignments_class_trigger
INSTEAD OF INSERT OR UPDATE OR DELETE ON
assignments_class FOR EACH ROW EXECUTE PROCEDURE assignments_class_trigger_func();