Setting up postgres ts_vector column - postgresql

I have a search table in postgres with a ts_vector column. It looks like when I insert a string into this column it vectorizes it, but it doesn't do any stemming or removal of stop words:
test=# create table sample_ts_vec ( id varchar(255), tsv tsvector);
CREATE TABLE
test=# insert into sample_ts_vec values ('t1234', 'this is a test');
INSERT 0 1
test=# select * from sample_ts_vec;
id | tsv
-------+------------------------
t1234 | 'a' 'is' 'test' 'this'
(1 row)
test=# insert into sample_ts_vec values ('t1235', to_tsvector('this is a test'));
INSERT 0 1
test=# select * from sample_ts_vec;
id | tsv
-------+------------------------
t1234 | 'a' 'is' 'test' 'this'
t1235 | 'test':4
(2 rows)
You'll notice that in the second insert, the 3 stop words are removed, and the word is stemmed (in this case, no stemming necessary), whereas in the first example each word gets added. How can I apply the to_tsvector function automagically to the string value prior to insert?

Jasen's answer was close, but it had a few important errors - here's the corrected version:
CREATE FUNCTION tsvfix() RETURNS TRIGGER LANGUAGE PLPGSQL AS $$
BEGIN
NEW.tsv=to_tsvector(NEW.tsv);
RETURN NEW;
END
$$;
CREATE TRIGGER "tsvfix" BEFORE UPDATE OR INSERT ON "sample_ts_vec" FOR EACH ROW EXECUTE PROCEDURE tsvfix();
Even this doesn't work, however. I get an error: ERROR: function to_tsvector(tsvector) does not exist
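For the record, the error happens because to_tsvector() expects text (optionally preceded by a regconfig), while NEW.tsv has already been cast to tsvector by the time the trigger runs, so the original string is gone. The usual fix is to keep the raw text in its own column and vectorize that instead. A minimal sketch, assuming the source text lives in a (new) column named data, as in Jasen's answer, and the english configuration:
ALTER TABLE sample_ts_vec ADD COLUMN data text;
CREATE OR REPLACE FUNCTION tsvfix() RETURNS TRIGGER LANGUAGE PLPGSQL AS $$
BEGIN
-- vectorize the raw text column; the tsvector column no longer receives the source string
NEW.tsv := to_tsvector('english', NEW.data);
RETURN NEW;
END
$$;
INSERT INTO sample_ts_vec (id, data) VALUES ('t1236', 'this is a test');
Alternatively, Postgres ships a built-in trigger function for exactly this, which replaces the hand-written function entirely:
CREATE TRIGGER tsvfix BEFORE INSERT OR UPDATE ON sample_ts_vec FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger(tsv, 'pg_catalog.english', data);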

You could create a TRIGGER for ON UPDATE OR INSERT.
Assuming the table has a column data which you want to build the tsv index from, something like this:
CREATE FUNCTION tsvfix() RETURNS TRIGGER LANGUAGE PLPGSQL AS $$
BEGIN
NEW.tsv=to_tsvector(NEW.data);
RETURN NEW;
END
$$;
CREATE TRIGER "tsvfix" ON UPDATE OR INSERT TO "sample_ts_vec" EXECUTE PROCEDURE tsvfix;

Related

Is it possible to capture all records into table A from table B on or after truncating table B?

I am trying to write a trigger with reference to the Postgres docs, but it's not even allowing me to create a trigger based on TRUNCATE; I tried different approaches but they didn't work.
CREATE TRIGGER delete_after_test
AFTER truncate
ON tableA
FOR EACH ROW
EXECUTE PROCEDURE delete_after_test3();
Function:
CREATE OR REPLACE FUNCTION econnect.delete_after_test3()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
declare
query text;
begin
insert into econnect.delete_after_test_2 (
"name",
age1,
log_time
)
values
(
old."name",
old.age1,
CURRENT_TIMESTAMP
)
;
return old;
END;
$function$
;
Reference: https://www.postgresql.org/docs/current/sql-createtrigger.html
"TRUNCATE will not fire any ON DELETE triggers that might exist for the tables. But it will fire ON TRUNCATE triggers. If ON TRUNCATE triggers are defined for any of the tables, then all BEFORE TRUNCATE triggers are fired before any truncation happens, and all AFTER TRUNCATE triggers are fired after the last truncation is performed and any sequences are reset. The triggers will fire in the order that the tables are to be processed (first those listed in the command, and then any that were added due to cascading)"
A solution using ON DELETE:
create table delete_test(id integer, fld1 varchar, fld2 boolean);
create table delete_test_save(id integer, fld1 varchar, fld2 boolean);
insert into delete_test values (1, 'test', 't'), (2, 'dog', 'f'), (3, 'cat', 't');
CREATE OR REPLACE FUNCTION public.delete_save()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
INSERT INTO delete_test_save SELECT * FROM del_recs;
RETURN OLD;
END;
$function$;
CREATE TRIGGER trg_del_save
AFTER DELETE ON delete_test referencing OLD TABLE AS del_recs FOR EACH statement
EXECUTE FUNCTION delete_save ();
delete from delete_test;
DELETE 3
select * from delete_test;
id | fld1 | fld2
----+------+------
(0 rows)
select * from delete_test_save;
id | fld1 | fld2
----+------+------
1 | test | t
2 | dog | f
3 | cat | t
(3 rows)
The example uses a transition relation (REFERENCING OLD TABLE AS del_recs) to collect all the deleted records for use in the function. That makes it possible to do INSERT INTO delete_test_save SELECT * FROM del_recs; to transfer the records to the other table. Note that transition relations will not work with a TRUNCATE trigger.
Transition relations are explained here Create Trigger:
The REFERENCING option enables collection of transition relations, which are row sets that include all of the rows inserted, deleted, or modified by the current SQL statement. This feature lets the trigger see a global view of what the statement did, not just one row at a time. This option is only allowed for an AFTER trigger that is not a constraint trigger; also, if the trigger is an UPDATE trigger, it must not specify a column_name list. OLD TABLE may only be specified once, and only for a trigger that can fire on UPDATE or DELETE; it creates a transition relation containing the before-images of all rows updated or deleted by the statement. Similarly, NEW TABLE may only be specified once, and only for a trigger that can fire on UPDATE or INSERT; it creates a transition relation containing the after-images of all rows updated or inserted by the statement.

How to DELETE/INSERT rows in the same table using a UPDATE Trigger?

I want to create a trigger function which copies certain columns of a recently updated row and deletes the old data. After that I want to insert the copied columns into exactly the same table, in the same row (overwrite). I need the data to be INSERTED because this function will be embedded in an existing program with predefined triggers.
That's what I have so far:
CREATE OR REPLACE FUNCTION update_table()
RETURNS TRIGGER AS
$func$
BEGIN
WITH tmp AS (DELETE FROM table
WHERE table.id = NEW.id
RETURNING id, geom )
INSERT INTO table (id, geom) SELECT * FROM tmp;
END;
$func$ language plpgsql;
CREATE TRIGGER T_update
AFTER UPDATE OF geom ON table
EXECUTE PROCEDURE update_table();
But I get the Error message:
ERROR: cannot perform DELETE RETURNING on relation "table"
HINT: You need an unconditional ON DELETE DO INSTEAD rule with a RETURNING clause.
Why should I use a rule here?
I'm using PostgreSQL 9.6
UPDATE:
A little bit of clarification: I have two columns in my table (id, geom). After I update geom I want to make a copy of this (new) row and insert it into the same table, overwriting the updated row. (I'm not interested in any value before the update.) I know that this is odd, but I need this row to be inserted again because the program I embed this function in listens for an INSERT statement and cannot be changed by me.
Right after you update a row, its old values will no longer be available. So, if you simply want to preserve the old row in case of an update you need to create a BEFORE UPDATE trigger, so that you can still access the OLD values and create a new row, e.g.
CREATE TABLE t (id int, geom geometry(point,4326));
CREATE OR REPLACE FUNCTION update_table() RETURNS TRIGGER AS $$
BEGIN
INSERT INTO t (id, geom) VALUES (OLD.id,OLD.geom);
RETURN NEW;
END; $$ LANGUAGE plpgsql;
CREATE TRIGGER t_update
BEFORE UPDATE OF geom ON t FOR EACH ROW EXECUTE PROCEDURE update_table();
INSERT INTO t VALUES (1,'SRID=4326;POINT(1 1)');
If you update the record 1 ..
UPDATE t SET geom = 'SRID=4326;POINT(2 2)', id = 2 WHERE id = 1;
UPDATE t SET geom = 'SRID=4326;POINT(3 3)', id = 3 WHERE id = 2;
.. you get a new record in the same table as you wished
SELECT id, ST_AsText(geom) FROM t;
id | st_astext
----+------------
1 | POINT(1 1)
2 | POINT(2 2)
3 | POINT(3 3)
Demo: db<>fiddle
Unrelated note: consider upgrading your PostgreSQL version! 9.6 will reach EOL in November, 2021.
First, thanks to @JimJones for the answer. I'd like to post his answer modified for this purpose. This code "overwrites" the updated row by inserting a copy of itself and then deleting the old duplicate. That way I can trigger on INSERT.
CREATE TABLE t (Unique_id SERIAL,id int, geom geometry(point,4326));
CREATE OR REPLACE FUNCTION update_table() RETURNS TRIGGER AS $$
BEGIN
INSERT INTO t (id, geom) VALUES (NEW.id,NEW.geom);
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER t_update
BEFORE UPDATE OF geom ON t FOR EACH ROW EXECUTE PROCEDURE update_table();
CREATE OR REPLACE FUNCTION delete_table() RETURNS TRIGGER AS $$
BEGIN
DELETE FROM t a
USING t b
WHERE a.Unique_id < b.Unique_id
AND a.geom = b.geom;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER t_delete
AFTER UPDATE OF geom ON t FOR EACH ROW EXECUTE PROCEDURE delete_table();
INSERT INTO t VALUES (1,1,'SRID=4326;POINT(1 1)');
UPDATE t SET geom = 'SRID=4326;POINT(2 2)' WHERE id = 1;
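One caveat on the deduplication step: matching on geom alone will also delete older rows of other ids that happen to share the same geometry. A slightly safer variant (a sketch, not part of the original answer) also matches on id:
DELETE FROM t a
USING t b
WHERE a.Unique_id < b.Unique_id
AND a.id = b.id
AND a.geom = b.geom;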

Loop on rows produced by SELECT - looping variable does not exist?

The function is created fine, but when I try to execute it, I get this error:
ERROR: relation "column1" does not exist
SQL state: 42P01
Context: SQL statement "ALTER TABLE COLUMN1 ADD COLUMN locationZM geography (POINTZM, 4326)"
PL/pgSQL function addlocationzm() line 6 at SQL statement
Code:
CREATE OR REPLACE FUNCTION addlocationZM()
RETURNS void AS
$$
DECLARE
COLUMN1 RECORD;
BEGIN
FOR COLUMN1 IN SELECT f_table_name FROM *schema*.geography_columns WHERE type LIKE 'Point%' LOOP
ALTER TABLE COLUMN1 ADD COLUMN locationZM geography (POINTZM, 4326);
END LOOP;
END;
$$
LANGUAGE 'plpgsql';
SELECT addlocationZM()
I'm probably just being dumb, but I've been at this for a while now and I just can't get it. The SELECT f_table_name ... statement executed on its own returns 58 rows of a single column, each of which is the name of a table in my schema. The idea of this is to create a new column, type PointZM, in each table pulled by the SELECT.
The function would work like this:
CREATE OR REPLACE FUNCTION addlocationZM()
RETURNS void AS
$func$
DECLARE
_tbl text;
BEGIN
FOR _tbl IN
SELECT f_table_name FROM myschema.geography_columns WHERE type LIKE 'Point%'
LOOP
EXECUTE
format('ALTER TABLE %I ADD COLUMN location_zm geography(POINTZM, 4326)', _tbl);
END LOOP;
END
$func$ LANGUAGE plpgsql;
Note how I use a simple text variable to simplify matters. You don't need the record to begin with.
If it's a one-time operation, use a DO command instead of creating a function:
DO
$do$
BEGIN
EXECUTE (
SELECT string_agg(
format(
'ALTER TABLE %I ADD COLUMN location_zm geography(POINTZM, 4326);'
, f_table_name)
, E'\n')
FROM myschema.geography_columns
WHERE type LIKE 'Point%'
);
END
$do$;
This is concatenating a single string comprised of all commands (separated with ;) for a single EXECUTE.
Or, especially while you are not familiar with plpgsql and dynamic SQL, just generate the commands, copy/paste the result and execute as 2nd step:
SELECT 'ALTER TABLE '
|| quote_ident(f_table_name)
|| ' ADD COLUMN locationZM geography(POINTZM, 4326);'
FROM myschema.geography_columns
WHERE type LIKE 'Point%';
(Demonstrating quote_ident() this time.)
Related:
Table name as a PostgreSQL function parameter
Aside: Unquoted CaMeL-case identifiers like locationZM or your function name addlocationZM may not be such a good idea:
Are PostgreSQL column names case-sensitive?

PostgreSQL modifying fields dynamically in NEW record in a trigger function

I have a user table with IDs and usernames (and other details) and several other tables referring to this table with various column names (CONSTRAINT some_name FOREIGN KEY (columnname) REFERENCES "user" (userid)). What I need to do is add the usernames to the referring tables (in preparation for dropping the whole user table). This is of course easily accomplished with a single ALTER TABLE and UPDATE, and keeping these up-to-date with triggers is also (fairly) easy. But it's the trigger function that is causing me some annoyance. I could have used individual functions for each table, but this seemed redundant, so I created one common function for this purpose:
CREATE OR REPLACE FUNCTION public.add_username() RETURNS trigger AS
$BODY$
DECLARE
sourcefield text;
targetfield text;
username text;
existing text;
BEGIN
IF (TG_NARGS != 2) THEN
RAISE EXCEPTION 'Need source field and target field parameters';
END IF;
sourcefield = TG_ARGV[0];
targetfield = TG_ARGV[1];
EXECUTE 'SELECT username FROM "user" WHERE userid = ($1).' || sourcefield INTO username USING NEW;
EXECUTE format('SELECT ($1).%I', targetfield) INTO existing USING NEW;
IF ((TG_OP = 'INSERT' AND existing IS NULL) OR (TG_OP = 'UPDATE' AND (existing IS NULL OR username != existing))) THEN
CASE targetfield
WHEN 'username' THEN
NEW.username := username;
WHEN 'modifiername' THEN
NEW.modifiername := username;
WHEN 'creatorname' THEN
NEW.creatorname := username;
.....
END CASE;
END IF;
RETURN NEW;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
And using the trigger function:
CREATE TRIGGER some_trigger_name BEFORE UPDATE OR INSERT ON my_schema.my_table FOR EACH ROW EXECUTE PROCEDURE public.add_username('userid', 'username');
The way this works is the trigger function receives the original source field name (for example userid) and the target field name (username) via TG_ARGV. These are then used to fill in the (possibly) missing information. All this works nicely enough, but how can I get rid of that CASE mess? Is there a way to dynamically modify the values in the NEW record when I don't know the name of the field in advance (or rather, it can be a lot of things)? It is in the targetfield parameter, but obviously NEW.targetfield does not work, nor does something like NEW[targetfield] (as in JavaScript, for example).
Any ideas how this could be accomplished? Besides using for instance PL/Python..
There are no simple plpgsql-based solutions. Some possible solutions:
Using the hstore extension:
CREATE TYPE footype AS (a int, b int, c int);
postgres=# select row(10,20,30);
row
------------
(10,20,30)
(1 row)
postgres=# select row(10,20,30)::footype #= 'b=>100';
?column?
-------------
(10,100,30)
(1 row)
hstore based function can be very simple:
create or replace function update_fields(r anyelement,
variadic changes text[])
returns anyelement as $$
select $1 #= hstore($2);
$$ language sql;
postgres=# select *
from update_fields(row(10,20,30)::footype,
'b', '1000', 'c', '800');
a | b | c
----+------+-----
10 | 1000 | 800
(1 row)
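For the trigger in the question, this means the whole CASE block can collapse into a single assignment; a sketch, assuming the hstore extension is installed and reusing the targetfield and username variables from the original function:
-- inside add_username(), replacing the CASE ... END CASE block:
NEW := NEW #= hstore(ARRAY[targetfield, username]);
-- or, via the helper above:
NEW := update_fields(NEW, targetfield, username);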
Some years ago I wrote an extension, pl toolbox. There is a function record_set_fields:
pavel=# select * from pst.record_expand(pst.record_set_fields(row(10,20),'f1',33));
name | value | typ
------+-------+---------
f1 | 33 | integer
f2 | 20 | integer
(2 rows)
You can probably find some plpgsql-only solutions based on tricks with system tables and arrays like this, but I cannot suggest them. They are far less readable, and for a non-advanced user they are just black magic. hstore is simple and available almost everywhere, so it should be the preferred way.
On PostgreSQL 9.4 (maybe 9.3) you can try some black magic with JSON manipulations:
postgres=# select json_populate_record(NULL::footype, jo)
from (select json_object(array_agg(key),
array_agg(case key when 'b'
then 1000::text
else value
end)) jo
from json_each_text(row_to_json(row(10,20,30)::footype))) x;
json_populate_record
----------------------
(10,1000,30)
(1 row)
So I am able to write a function:
CREATE OR REPLACE FUNCTION public.update_field(r anyelement,
fn text, val text,
OUT result anyelement)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
declare jo json;
begin
jo := (select json_object(array_agg(key),
array_agg(case key when fn then val
else value end))
from json_each_text(row_to_json(r)));
result := json_populate_record(r, jo);
end;
$function$
postgres=# select * from update_field(row(10,20,30)::footype, 'b', '1000');
a | b | c
----+------+----
10 | 1000 | 30
(1 row)
A JSON-based function will not be terribly fast; hstore should be faster.
UPDATE/caution: Erwin points out that this is currently undocumented, and the docs indicate it should not be possible to alter records this way.
Use Pavel's solution or hstore.
The JSON-based solution is almost as fast as hstore when simplified. json_populate_record() modifies existing records for us, so we only have to create a json object from the keys we want to change.
See my similar answer, where you'll find benchmarks that compares the solutions.
The simplest solution requires Postgres 9.4:
SELECT json_populate_record (
record
,json_build_object('key', 'new-value')
);
But if you only have Postgres 9.3, you can build the JSON with string concatenation and a cast instead of json_build_object:
SELECT json_populate_record(
record
, ('{"'||'key'||'":"'||'new-value'||'"}')::json
);

strange behavior in table column in postgres

I am currently using Postgres 8.3. I have created a table that acts as a dirty-flag table for members that exist in another table. I have applied triggers after insert or update on the members table that will insert/update a record in the modifications table with a value of true. The trigger seems to work; however, I am noticing that something is flipping the boolean is_modified value. I have no idea how to isolate what could be flipping it.
Trigger function:
BEGIN;
CREATE OR REPLACE FUNCTION set_member_as_modified() RETURNS TRIGGER AS $set_member_as_modified$
BEGIN
LOOP
-- first try to update the key
UPDATE member_modification SET is_modified = TRUE, updated = current_timestamp WHERE "memberID" = NEW."memberID";
IF FOUND THEN
RETURN NEW;
END IF;
--member doesn't exist in modification table, so insert them
-- if someone else inserts the same key conncurrently, raise a unique-key failure
BEGIN
INSERT INTO member_modification("memberID",is_modified,updated) VALUES(NEW."memberID", TRUE,current_timestamp);
RETURN NEW;
EXCEPTION WHEN unique_violation THEN
-- do nothing, and loop to try the update again
END;
END LOOP;
END;
$set_member_as_modified$ LANGUAGE plpgsql;
COMMIT;
CREATE TRIGGER set_member_as_modified AFTER INSERT OR UPDATE ON members FOR EACH ROW EXECUTE PROCEDURE set_member_as_modified();
Here is the SQL I run and the results:
$CREATE TRIGGER set_member_as_modified AFTER INSERT OR UPDATE ON members FOR EACH ROW EXECUTE PROCEDURE set_member_as_modified();
Results:
UPDATE 1
bluesky=# select * from member_modification;
-[ RECORD 1 ]---+---------------------------
modification_id | 14
is_modified | t
updated | 2011-05-26 09:49:47.992241
memberID | 182346
bluesky=# select * from member_modification;
-[ RECORD 1 ]---+---------------------------
modification_id | 14
is_modified | f
updated | 2011-05-26 09:49:47.992241
memberID | 182346
As you can see, something flipped the is_modified value. Is there anything in Postgres I can use to determine what queries/processes are acting on this table?
Are you sure you've posted everything needed? The two queries on member_modification suggest that a separate query is being run in between, which sets is_modified back to false.
You could add a text[] column to member_modification, e.g. query_trace text[] not null default '{}', then add a BEFORE INSERT OR UPDATE ... FOR EACH ROW trigger on that table which does something like:
NEW.query_trace := NEW.query_trace || current_query();
If current_query() is not available in 8.3, see this:
http://www.postgresql.org/docs/8.3/static/monitoring-stats.html
SELECT pg_stat_get_backend_pid(s.backendid) AS procpid,
pg_stat_get_backend_activity(s.backendid) AS current_query
FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;
You could then get the list of subsequent queries that affected it:
select m.query_trace[i] from member_modification m, generate_series(1, array_length(m.query_trace, 1)) as i;
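Putting that suggestion together, a minimal sketch of the tracing setup (assuming current_query() is available on your version; the column, function, and trigger names are only illustrative):
ALTER TABLE member_modification ADD COLUMN query_trace text[] NOT NULL DEFAULT '{}';
CREATE OR REPLACE FUNCTION trace_query() RETURNS trigger AS $trace_query$
BEGIN
-- append the statement that is touching this row
NEW.query_trace := NEW.query_trace || current_query();
RETURN NEW;
END;
$trace_query$ LANGUAGE plpgsql;
CREATE TRIGGER trace_query BEFORE INSERT OR UPDATE ON member_modification
FOR EACH ROW EXECUTE PROCEDURE trace_query();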