Look up a unique value using two variables - PostgreSQL

I need to populate the ISSUES column using the REFERENCE and LOCALISATION variables for each row, with the unique values stored in Table_issues_Localisation.
The problem is that those two variables form a two-dimensional table, so I have to dynamically select the right LOCALISATION column.
Here is an explanation of what I need to do, with an image. I'm sorry for posting an image, but I think it is much easier to understand this way.
I tried to make a query that updates the Table_Observation.ISSUES column row by row with information stored in variable columns (a SELECT inside a SELECT) of Table_issues_Localisation.
In Table_Observation, the ROW_NUMBER column indicates the number of each row; it is used for the loop.
DO $$
DECLARE
    my_variable TEXT;
BEGIN
    FOR i IN 1..35 LOOP
        my_variable = SELECT((SELECT LOCALISATION FROM Table_Observation WHERE Table_Observation.ROW_NUMBER = i) FROM Table__issues_Localisation ON Table_Observation.REFERENCE = Table__issues_Localisation.REFERENCE);
        UPDATE Table_Observation
        SET ISSUES = my_variable
        WHERE Table_Observation.ROW_NUMBER = i;
    END LOOP;
END;
$$
Postgres v9.4.
I hope I'm clear enough.

You don't need PL/pgSQL or a loop for this. You can do that with a single update statement:
update observation o
set issues = row_to_json(il) ->> o.localisation
from issues_localisation il
where il.reference = o.reference;
This requires that the values in observation.localisation exactly map to the column names in issues_localisation.
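To make the mechanism explicit: row_to_json turns the joined row into a JSON object keyed by column name, and the ->> operator then extracts one key as text. A minimal check, using the sample data below:
select row_to_json(il) ->> 'issues_27' as issues
from issues_localisation il
where il.reference = 27568;
-- returns 'TF' with the sample data below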
With the following test data:
create table observation
(
    rn integer primary key,
    reference integer,
    localisation text,
    issues text
);
create table issues_localisation
(
    reference integer,
    issues_12 text,
    issues_17 text,
    issues_27 text,
    issues_34 text
);
insert into observation (rn, reference, localisation)
values
(1, 27568, 'issues_27'),
(2, 6492, 'issues_34'),
(3, 1529, 'issues_34'),
(4, 1529, 'issues_34'),
(5, 709, 'issues_12');
insert into issues_localisation (reference, issues_12, issues_17, issues_27, issues_34)
values
(29, 'FB', 'FB', 'TFB', null),
(506, 'M', null, 'M', null),
(709, 'TF', null, null, null),
(1234, null, 'TF', 'TF', null),
(1529, 'FB', 'FB', 'FB', 'M'),
(3548, null, 'M', null, null),
(6492, 'FB', 'FB', 'FB', null),
(18210, 'TFB', null, 'TFB', 'TFB'),
(27568, 'TF', null, 'TF', 'TF');
The update will result in this data in the table observation:
 rn | reference | localisation | issues
----+-----------+--------------+--------
  1 |     27568 | issues_27    | TF
  2 |      6492 | issues_34    |
  3 |      1529 | issues_34    | M
  4 |      1529 | issues_34    | M
  5 |       709 | issues_12    | TF
Online example: http://rextester.com/OCGFM81609
For your next question, you should supply the sample data (and the expected output) the way I did in my answer.
I also removed the completely useless prefix table_ from the table names. That is a horrible naming convention.

And here is an (unfinished; you still need to execute the generated statements) example of dynamic SQL:
CREATE FUNCTION bagger(_name text) RETURNS text
AS
$func$
DECLARE
    upd text;
BEGIN
    upd := format('
        UPDATE observation dst
        SET    issues = src.%I
        FROM   issues_localisation src
        WHERE  src.reference = dst.reference
        AND    dst.localisation = %L
        ;', _name, _name);
    -- RAISE NOTICE '%', upd;
    RETURN upd;
END
$func$
LANGUAGE plpgsql;
-- SELECT bagger('issues_12');
WITH loc AS (
    SELECT DISTINCT localisation AS loc
    FROM observation
)
SELECT bagger(loc.loc)
FROM loc;
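This only generates the statements; to actually apply them, one way to finish (a sketch, assuming the same observation and issues_localisation tables as above) is a DO block that EXECUTEs each generated UPDATE:
DO $$
DECLARE
    stmt text;
BEGIN
    -- bagger() only builds each UPDATE statement; EXECUTE runs it
    FOR stmt IN
        SELECT bagger(localisation)
        FROM (SELECT DISTINCT localisation FROM observation) t(localisation)
    LOOP
        EXECUTE stmt;
    END LOOP;
END
$$;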

Related

Postgres merge two rows with common array elements

I have a postgres table with a column named "ids".
+----+--------------+
| id | ids          |
+----+--------------+
| 1  | {1, 2, 3}    |
| 2  | {2, 7, 10}   |
| 3  | {14, 11, 1}  |
| 4  | {12, 13}     |
| 5  | {15, 16, 12} |
+----+--------------+
I want to merge rows with at least one common array element and create a new row from that (or merge into one existing row). So finally the table would look like the following:
+----+--------------------------+
| id | ids                      |
+----+--------------------------+
| 6  | {1, 2, 3, 7, 10, 14, 11} |
| 7  | {12, 13, 15, 16}         |
+----+--------------------------+
Order of array elements in the resulting table does not really matter but they must be unique.
The rows are added independently from another system. For example we could add a new row where ids are {16, 18, 1}.
Right now to make sure we combine all the rows with at least one common array element, I am doing the calculations in my server (Node.js).
So before I create a new row, I pull all the existing rows in the database that have at least one item in common using:
await t.any('SELECT * FROM arraytable WHERE $1 && ids', [16, 18, 1])
This gives me all the rows that contain at least one of 16, 18 or 1. Then I merge those rows with [16, 18, 1] and remove the duplicates.
With this new array available, I delete all the existing rows fetched above and insert the new row into the database. As you can see, most of the work is being done in Node.js.
Instead of this, I am trying to create a trigger which will do all these steps for me as soon as I add the new row. How do I go about doing this with a trigger? Also, are there better ways?
Can a procedure suffice?
CREATE OR REPLACE PROCEDURE add_ids(new_ids INT[])
AS $$
DECLARE
    sum_array INT[];
BEGIN
    SELECT ARRAY(SELECT UNNEST(ids) FROM table1 WHERE table1.ids && new_ids) INTO sum_array;
    sum_array := sum_array || new_ids;
    SELECT ARRAY(SELECT DISTINCT UNNEST(sum_array)) INTO sum_array;
    DELETE FROM table1 WHERE table1.ids && sum_array;
    INSERT INTO table1(ids) SELECT sum_array;
END;
$$
LANGUAGE plpgsql;
Unfortunately, inserting a row inside the trigger fires another trigger, causing an infinite loop. I do not know a way around that.
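One guard that is often suggested for exactly this situation (a sketch only, not tested against this schema) is pg_trigger_depth(), which returns 0 for statements run directly and a value greater than 0 inside a trigger, so the nested INSERT no longer re-fires the trigger:
CREATE TRIGGER sum_tables_trigger_
BEFORE INSERT ON table1
FOR EACH ROW
WHEN (pg_trigger_depth() = 0)
EXECUTE PROCEDURE sum_tables_trigger();
-- sum_tables_trigger() is the trigger function from the answer below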
PS. Sorry if creating another answer is bad practice. I want to leave it for now for reference. I will delete it when the problem is resolved.
Edit by pewpewlasers:
To prevent the loop another table is probably needed. I have created a new temporary table2. New arrays can be added to this table. This table will have a trigger which does the calculations and saves it to table1. It also deletes this temporarily created row.
CREATE OR REPLACE FUNCTION on_insert_temp() RETURNS TRIGGER AS $f$
DECLARE
    sum_array BIGINT[];
BEGIN
    SELECT ARRAY(SELECT UNNEST(ids) FROM table1 WHERE table1.ids && NEW.ids) INTO sum_array;
    sum_array := sum_array || NEW.ids;
    SELECT ARRAY(SELECT DISTINCT UNNEST(sum_array)) INTO sum_array;
    DELETE FROM table1 WHERE table1.ids && sum_array;
    INSERT INTO table1(ids) SELECT sum_array;
    DELETE FROM table2 WHERE id = NEW.id;
    -- the return value of an AFTER trigger is ignored, and OLD is not assigned for INSERT
    RETURN NULL;
END
$f$ LANGUAGE plpgsql;
CREATE TRIGGER on_insert_temp AFTER INSERT ON table2 FOR EACH ROW EXECUTE PROCEDURE on_insert_temp();
Given the tables
CREATE TABLE table1(id serial, ids INT[]);
CREATE TABLE table2(id serial, ids INT[]);
the trigger can look like this:
CREATE OR REPLACE FUNCTION sum_tables_trigger() RETURNS TRIGGER AS $table1$
BEGIN
    INSERT INTO table2(ids)
    SELECT ARRAY(SELECT DISTINCT UNNEST(table1.ids || new.ids) ORDER BY 1)
    FROM table1
    WHERE table1.ids && new.ids;
    RETURN NEW;
END;
$table1$ LANGUAGE plpgsql;
CREATE TRIGGER sum_tables_trigger_ BEFORE INSERT ON table1
FOR EACH ROW EXECUTE PROCEDURE sum_tables_trigger();
tableA.ids && tableB.ids returns true if the arrays have at least one common element.
tableA.ids || tableB.ids concatenates the arrays.
ARRAY(SELECT DISTINCT UNNEST(table1.ids || new.ids) ORDER BY 1) removes the duplicates.
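Each building block can be checked on its own:
SELECT ARRAY[1,2,3] && ARRAY[3,4];  -- true, the arrays share the element 3
SELECT ARRAY[1,2,3] || ARRAY[3,4];  -- {1,2,3,3,4}
SELECT ARRAY(SELECT DISTINCT UNNEST(ARRAY[1,2,3] || ARRAY[3,4]) ORDER BY 1);  -- {1,2,3,4}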

Replace ID column in LTREE with different column Postgres

I have a hierarchical structure in postgres which uses LTREE to calculate path based on a trigger. I am following the examples from here: https://coderwall.com/p/whf3-a/hierarchical-data-in-postgres
Table, trigger and data being inserted look like this:
CREATE TABLE section (
    id INTEGER PRIMARY KEY,
    asset_name TEXT,
    parent_id INTEGER REFERENCES section,
    parent_path LTREE
);
CREATE INDEX section_parent_path_idx ON section USING GIST (parent_path);
CREATE INDEX section_parent_id_idx ON section (parent_id);
CREATE OR REPLACE FUNCTION update_section_parent_path() RETURNS TRIGGER AS $$
DECLARE
    path ltree;
    -- working variables for building the s3 path
    ltree_array text[];
    entry text;
    v_asset_name text;
    s3path text := '';
BEGIN
    IF NEW.parent_id IS NULL THEN
        NEW.parent_path = '6'::ltree;
    ELSEIF TG_OP = 'INSERT' OR OLD.parent_id IS NULL OR OLD.parent_id != NEW.parent_id THEN
        SELECT parent_path || id::text FROM section WHERE id = NEW.parent_id INTO path;
        IF path IS NULL THEN
            RAISE EXCEPTION 'Invalid parent_id %', NEW.parent_id;
        END IF;
        NEW.parent_path = path;
    END IF;
    ltree_array = (STRING_TO_ARRAY(path::TEXT, '.'))[2:2147483647];
    IF ltree_array IS NOT NULL THEN
        FOREACH entry IN ARRAY ltree_array
        LOOP
            v_asset_name = (SELECT asset_name
                            FROM section AS s
                            WHERE s.id = entry::integer);
            s3path = s3path || v_asset_name || '/';
        END LOOP;
    END IF;
    s3path = NEW.asset_store_id || '/' || s3path || NEW.asset_name;
    NEW.s3_path = s3path;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER parent_path_tgr
BEFORE INSERT OR UPDATE ON section
FOR EACH ROW EXECUTE PROCEDURE update_section_parent_path();
INSERT INTO section (id, asset_name, parent_id ) VALUES (1, 'Main Folder', NULL);
INSERT INTO section (id, asset_name, parent_id ) VALUES (2, 'Rejected', 1);
INSERT INTO section (id, asset_name, parent_id ) VALUES (3, 'Records', NULL);
INSERT INTO section (id, asset_name, parent_id ) VALUES (4, 'Expired', 3);
INSERT INTO section (id, asset_name, parent_id ) VALUES (5, 'Selected', 3);
INSERT INTO section (id, asset_name, parent_id ) VALUES (6, 'Useless', 5);
The table output looks like this:
id | asset_name  | parent_id | parent_path
---+-------------+-----------+------------
 1 | Main Folder |           | 6
 2 | Rejected    |         1 | 6.1
 3 | Records     |           | 6
 4 | Expired     |         3 | 6.3
 5 | Selected    |         3 | 6.3
 6 | Useless     |         5 | 6.3.5
parent_path is all numerical because the trigger concatenates id, but I want it to concatenate the asset_name column instead.
I have tried replacing id with asset_name, but that throws a syntax error. I can't seem to figure out how to replace id within the trigger.
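A hedged sketch of the likely change (untested here): concatenate a sanitised asset_name instead of id. Note that ltree labels may only contain letters, digits and underscores, so a value like 'Main Folder' cannot be used as a label as-is:
SELECT parent_path || regexp_replace(asset_name, '[^A-Za-z0-9_]', '_', 'g') FROM section WHERE id = NEW.parent_id INTO path;
The root branch (NEW.parent_path = '6'::ltree) would need the same treatment with NEW.asset_name if the root label should also be a name.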

Postgresql data validation checks based on table values

I'm trying to create a database that tracks electrical cables. Each cable contains 1 or more cores that are connected to terminals at each end. The number of cores in each cable is defined in a table.
| number_of_cores | cable_id |
|-----------------|----------|
| 2               | 1        |
The core table is as follows:
cable_no | from_id | core_mark | to_id
---------|---------|-----------|------
    1001 |       1 | 1 Black   |     2
    1001 |       2 | 1 White   |     4
I want to create a check that will prevent another core for cable 1001 from being inserted.
Is this possible in PostgreSQL?
Ideally, if I tried to insert another core for cable 1001, the error would be something like "all cores used on cable 1001".
Thanks,
I think what you need is something like a check constraint. (https://www.postgresql.org/docs/current/ddl-constraints.html)
Follow these steps:
1. Create the tables properly
create table cable (
    cable_id int primary key,
    number_of_cores int
);
create table core (
    core_id int primary key,
    cable_id int references cable (cable_id),
    from_id int,
    core_mark varchar(50),
    to_id int
);
2. Create the function that will verify the inserts
create or replace function test_max_core_number(in_cable_id int)
returns boolean
language plpgsql
as $function$
declare
    res boolean := false;
begin
    if exists (
        select *
        from cable
        where cable_id = in_cable_id
        and number_of_cores > (select count(*) from core where cable_id = in_cable_id)
    )
    then
        res := true;
    end if;
    return res;
end;
$function$;
3. Add the constraint to your table
alter table core
add constraint cstr_check check (test_max_core_number(cable_id));
4. Now it is time for some testing :)
insert into cable (cable_id, number_of_cores) values (1, 2), (2, 3);
insert into core (core_id, cable_id, from_id, core_mark, to_id)
values
(1, 1, 1, '1 Black', 2)
,(2, 1, 2, '1 White', 4);
All of this should go through fine so far.
5. And now the wanted error!
insert into core (core_id, cable_id, from_id, core_mark, to_id)
values
(3, 1, 3, '1 Green', 2);
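This one should be rejected, because cable 1 is declared with number_of_cores = 2 and already has two cores; expect an error along the lines of (exact wording may differ):
ERROR:  new row for relation "core" violates check constraint "cstr_check"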
Hope this helps!
I think @Jaisus gave a good answer.
I would only add a cross-check on cable, to prevent setting bad values for number_of_cores:
create or replace function test_cable_number_of_cores(in_cable_id int, in_number_of_cores int)
returns boolean
language plpgsql
as $function$
declare
    res boolean := false;
begin
    res := (in_number_of_cores > 0
            and (select count(cable_id) from core where cable_id = in_cable_id) <= in_number_of_cores);
    return res;
end;
$function$;
alter table cable add check(test_cable_number_of_cores(cable_id, number_of_cores));
-- ok
insert into cable(cable_id, number_of_cores) values (3, 2);
update cable set number_of_cores=3 where cable_id=3;
-- error
update cable set number_of_cores=1 where cable_id=1;

Create function in postgresql to update column values from a table with preferred values and aliases

I want to create a function that will update a column of type varchar to a preferred string that is referenced in a column of another table, to help me clean this column more iteratively.
CREATE TABLE big_table (
mn_uid NUMERIC PRIMARY KEY,
user_name VARCHAR
);
INSERT INTO big_table VALUES
(1, 'DAVE'),
(2, 'Dave'),
(3, 'david'),
(4, 'Jak'),
(5, 'jack'),
(6, 'Jack'),
(7, 'Grant');
CREATE TABLE nameKey_table (
nk_uid NUMERIC PRIMARY KEY,
correct VARCHAR,
wrong VARCHAR
);
INSERT INTO nameKey_table VALUES
(1, 'David', 'Dave_DAVE_dave_DAVID_david'),
(2, 'Jack', 'JACK_jack_Jak_jak');
I want to perform the following procedure:
UPDATE big_table
SET user_name = (SELECT correct
FROM nameKey_table
WHERE wrong
LIKE '%DAVE%')
WHERE user_name = 'DAVE';
but looped over each user_name in big_table so that I have a function that can do something like this:
UPDATE big_table SET user_name = corrected_name_fn();
Here is my attempt to do something like this but I can't seem to get it to work:
CREATE FUNCTION corrected_name_fn() RETURNS VARCHAR AS $$
DECLARE
    entry RECORD;
    correct_name VARCHAR;
BEGIN
    FOR entry IN SELECT DISTINCT user_name FROM big_table LOOP
        EXECUTE 'SELECT correct
                 FROM nameKey_table
                 WHERE wrong
                 LIKE ''%$1%'''
        INTO correct_name
        USING entry;
        RETURN correct_name;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
I want the final output in big_table to be:
| mn_uid | user_name |
|--------|-----------|
| 1      | 'David'   |
| 2      | 'David'   |
| 3      | 'David'   |
| 4      | 'Jack'    |
| 5      | 'Jack'    |
| 6      | 'Jack'    |
| 7      | 'Grant'   |
I realize rows 6 and 7 provide two unique cases that I want to build into the function with IF ELSE statements:
If user_name is already in nameKey_table.correct, go to the next row.
If user_name is not in nameKey_table.correct or does not match a string in nameKey_table.wrong, leave it as is.
Thanks for any help on this!!
It sounds like you want a trigger on the table. Here is my suggestion:
CREATE OR REPLACE FUNCTION tf_fix_name() RETURNS TRIGGER AS
$$
DECLARE
    corrected_name TEXT;
BEGIN
    SELECT correct INTO corrected_name FROM nameKey_table WHERE expression ~* NEW.user_name;
    IF FOUND THEN
        NEW.user_name := corrected_name;
    END IF;
    RETURN NEW;
END;
$$
LANGUAGE plpgsql;
CREATE TEMP TABLE big_table (
mn_uid INT PRIMARY KEY,
user_name TEXT NOT NULL
);
CREATE TRIGGER trigger_fix_name
BEFORE INSERT
ON big_table
FOR EACH ROW
EXECUTE PROCEDURE tf_fix_name();
CREATE TEMP TABLE nameKey_table (
nk_uid INT PRIMARY KEY,
correct TEXT NOT NULL,
expression TEXT NOT NULL
);
INSERT INTO nameKey_table VALUES
(1, 'David', '(dave|david)'),
(2, 'Jack', '(jack|jak)');
INSERT INTO big_table VALUES
(1, 'DAVE'),
(2, 'Dave'),
(3, 'david'),
(4, 'Jak'),
(5, 'jack'),
(6, 'Jack'),
(7, 'Grant');
SELECT * FROM big_table;
+--------+-----------+
| mn_uid | user_name |
+--------+-----------+
|      1 | David     |
|      2 | David     |
|      3 | David     |
|      4 | Jack      |
|      5 | Jack      |
|      6 | Jack      |
|      7 | Grant     |
+--------+-----------+
(7 rows)
Note: I think you can do what you want a lot easier with a case-insensitive regular expression. I also changed your primary keys to INTs; not sure why they were numerics, but it doesn't really change the solution. My solution was developed and tested on PostgreSQL 9.6.
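A quick standalone illustration of the case-insensitive regex operator (~*) used in the trigger; note the trigger matches NEW.user_name as the pattern against the expression column as the string:
SELECT '(dave|david)' ~* 'DAVE';  -- true: the pattern 'DAVE' matches 'dave' case-insensitively
SELECT 'Jak' ~* '(jack|jak)';     -- true: the alternation used as a pattern works as well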
You don't need a function; you can just update one table from the contents of another table:
UPDATE big_table dst
SET user_name = src.correct
FROM nameKey_table src
WHERE src.wrong LIKE '%' || dst.user_name || '%'
AND dst.user_name <> src.correct -- avoid idempotent updates
;
And if you need performance, don't rely on the LIKE operator; it cannot use indexes for patterns with a leading %. Instead, use a lookup table with one entry per misspelling:
CREATE TABLE bad_spell (
correct VARCHAR,
wrong VARCHAR PRIMARY KEY -- This will cause an unique index to be created.
);
INSERT INTO bad_spell VALUES
('David', 'Dave')
,('David', 'DAVE')
,('David', 'dave')
,('David', 'DAVID')
,('David', 'david')
,('Jack', 'JACK')
,('Jack', 'jack')
,('Jack', 'Jak')
,('Jack', 'jak')
;
-- This index could be temporary
CREATE INDEX ON big_table(user_name);
-- EXPLAIN
UPDATE big_table dst
SET user_name = src.correct
FROM bad_spell src
WHERE dst.user_name = src.wrong
AND dst.user_name <> src.correct -- avoid idempotent updates
;
SELECT * FROM big_table;
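If I'm reading the lookup data right, this should produce the same corrected names as the trigger-based version above ('Jack' and 'Grant' stay unchanged because they have no entry in bad_spell):
+--------+-----------+
| mn_uid | user_name |
+--------+-----------+
|      1 | David     |
|      2 | David     |
|      3 | David     |
|      4 | Jack      |
|      5 | Jack      |
|      6 | Jack      |
|      7 | Grant     |
+--------+-----------+
(7 rows)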

Filter a row by column types in a function

Right now I have a generic notification function that is triggered after create on a couple of tables in my database (there's a node process on the other end listening for notifications). Here's what my create trigger function looks like:
CREATE OR REPLACE FUNCTION notify_create() RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    PERFORM pg_notify('update_watchers',
        json_build_object(
            'eventType', 'new',
            'type', TG_TABLE_NAME,
            'payload', row_to_json(NEW)
        )::text
    );
    RETURN NEW;
END;
$$;
The problem is, if NEW is too big, this will overflow pg_notify's 8000-byte payload limit in a couple of corner cases (I rarely have a new item in the table that is that big). In the notify_update function, I just report which columns have changed by listing the column names. That would work here too, but what I would rather do is have row_to_json pull out only the entries of NEW that are of type integer.
That is because sometimes what I'm notifying is "hey there's a new entry in an entity table". The new entry could be from a couple of different tables (documents, profiles, etc). In that case, I really only need the id, since anyone who is interested in the new value ends up fetching it later anyway.
Sometimes I'm notifying "hey, there's a new entry in a join table", in which case I don't have an id field but instead have something like documents_id and profiles_id.
I could just write a bunch of different notify_create functions, for each scenario. I'd prefer to have one that did something like
row_to_json(NEW.filter(t => typeof t === 'number'))
to mix plpgsql and JavaScript notation, but I'm sure you get the point: only include the fields of NEW that are number-typed.
Is this possible, or should I just write a bunch of different notifiers?
You can easily eliminate JSON values of any type other than number, e.g.:
with my_table(int1, text1, int2, date1, float1) as (
values
(1, 'text1', 100, '2017-01-01'::date, 123.54)
)
select jsonb_object_agg(key, value) filter (where jsonb_typeof(value) = 'number')
from my_table,
jsonb_each(to_jsonb(my_table))
jsonb_object_agg
--------------------------------------------
{"int1": 1, "int2": 100, "float1": 123.54}
(1 row)
The function below leaves only integers:
create or replace function leave_integers(jdata jsonb)
returns jsonb language sql as $$
    select jsonb_object_agg(key, value)
           filter (where jsonb_typeof(value) = 'number'
                   and value::text not like '%.%')
    from jsonb_each(jdata)
$$;
with my_table(int1, text1, int2, date1, float1) as (
values
(1, 'text1', 100, '2017-01-01'::date, 123.54)
)
select leave_integers(to_jsonb(my_table))
from my_table;
leave_integers
--------------------------
{"int1": 1, "int2": 100}
(1 row)
Alternative (better) solution
This function checks Postgres types directly and returns values strictly from integer columns.
create or replace function integer_columns_to_jsonb(anyelement)
returns jsonb language sql as $$
    select jsonb_object_agg(key, value)
    from jsonb_each(to_jsonb($1))
    where key in (
        select attname
        from pg_type t
        join pg_attribute on typrelid = attrelid
        where t.oid = pg_typeof($1)
        and atttypid = 'int'::regtype)
$$;
The example below shows that integer_columns_to_jsonb() handles some corner cases that leave_integers() gets wrong:
create table my_table (int1 int, int2 int, float1 float, text1 text);
insert into my_table values (1, 2, 3, '4');
select integer_columns_to_jsonb(t), leave_integers(to_jsonb(t))
from my_table t;
 integer_columns_to_jsonb |            leave_integers
--------------------------+--------------------------------------
 {"int1": 1, "int2": 2}   | {"int1": 1, "int2": 2, "float1": 3}
(1 row)
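To close the loop with the original question, a hedged sketch of plugging integer_columns_to_jsonb() into the notify function (the same body as the question's notify_create, only the payload expression changes):
CREATE OR REPLACE FUNCTION notify_create() RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    PERFORM pg_notify('update_watchers',
        json_build_object(
            'eventType', 'new',
            'type', TG_TABLE_NAME,
            -- only integer columns, keeping the payload well under the 8000-byte limit
            'payload', integer_columns_to_jsonb(NEW)
        )::text
    );
    RETURN NEW;
END;
$$;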