PostgreSQL: why does this update work incorrectly? - postgresql

There is a table t1 in which I need to replace id with a new value.
A second table, t_changes, contains the substitutions
old_id -> new_id.
But when I run the UPDATE, t1 ends up with the same new_id value for all records.
What is incorrect?
The same update works correctly in T-SQL.
drop table if exists t1;
drop table if exists t_changes;
create table t1
(id INT, name text, new_id INT default 0);
create table t_changes
(old_id INT, new_id INT);
insert into t1(id, name)
VALUES (1,'n1'),(2,'n2'),(3,'n3');
insert into t_changes(old_id, new_id)
values (1,11),(2,12),(3,13),(4,13);
select * from t1;
select * from t_changes;
-------!!!!
update t1
set new_id = n.new_id
from t1 t
inner join t_changes n
on n.old_id=t.id;
select * from t1
------------------------------
"id" "name" "new_id"
-----------------
"1" "n1" "11"
"2" "n2" "11"
"3" "n3" "11"

This is your Postgres update statement:
update t1
set new_id = n.new_id
from t1 t inner join
t_changes n
on n.old_id = t.id;
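The problem is that the t1 named in the update and the t1 t in the from clause are two independent references to the same table. Nothing links them, so every row of the update target is joined to every row produced by the from clause, and Postgres uses just one of those joined rows (here the one with new_id = 11) for each target row. You intend for them to be the same reference. You can do this by leaving t1 out of the from clause and joining to the target in the where clause: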
update t1
set new_id = n.new_id
from t_changes n
where n.old_id = t1.id;
Your syntax is fairly close to the syntax supported by some other databases (such as SQL Server). However, for them, you would need to use the table alias in the update:
update t
set new_id = n.new_id
from t1 t inner join
t_changes n
on n.old_id = t.id;
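Back on the Postgres side, the corrected form can be checked against the sample data from the question. This is a minimal sketch that rebuilds the two tables exactly as the question defines them; nothing else is assumed.

```sql
-- Postgres: rebuild the sample data, run the corrected update, inspect the result.
drop table if exists t1;
drop table if exists t_changes;
create table t1 (id int, name text, new_id int default 0);
create table t_changes (old_id int, new_id int);
insert into t1 (id, name) values (1, 'n1'), (2, 'n2'), (3, 'n3');
insert into t_changes (old_id, new_id) values (1, 11), (2, 12), (3, 13), (4, 13);

-- The target table appears only once; the where clause ties it to t_changes.
update t1
set new_id = n.new_id
from t_changes n
where n.old_id = t1.id;

select * from t1 order by id;
-- id | name | new_id
--  1 | n1   |     11
--  2 | n2   |     12
--  3 | n3   |     13
```

The t_changes row with old_id = 4 has no partner in t1 and is simply ignored.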

How about doing this instead:
update t1
set new_id = (SELECT new_id FROM t_changes WHERE old_id=id);
Note that if for some row in t1 there is no corresponding row in t_changes, this will change t1.new_id to NULL.
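If overwriting unmatched rows with NULL is not wanted, one possible variation is to restrict the update to rows that actually have a substitution. The sketch below assumes old_id is unique in t_changes; if it is not, the scalar subquery fails with a "more than one row returned" error.

```sql
-- Postgres: update only the rows of t1 that have an entry in t_changes.
-- Assumption: t_changes.old_id is unique.
update t1
set new_id = (select c.new_id
              from t_changes c
              where c.old_id = t1.id)
where exists (select 1
              from t_changes c
              where c.old_id = t1.id);
```

Rows of t1 without a match keep whatever new_id they had before (0 by default in the question's table definition).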

Related

Postgresql strange behavior with update trigger

I have a table1: id int, id_2 int, date timestamp, vec float[]
And table2 : id int, vec float[]
My goal is to create a trigger on update of table1 that takes the last 10 rows (by date) for each id_2, averages the vectors along the first axis (10 x N -> N), and writes the result to table2 under id = id_2.
My code:
CREATE OR REPLACE FUNCTION public.foo()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
WITH rows AS (
SELECT DISTINCT t1.id_2, t2.id, t2.vec, t2.date, DENSE_RANK() OVER (PARTITION BY t1.id_2 ORDER BY t2.date desc) AS counter
FROM new_table AS t1
LEFT JOIN table1 t2 ON t1.id_2 = t2.id_2
),
elements_average AS (
SELECT id_2, AVG(unnest::float) AS av
FROM rows,
unnest(vec) with ORDINALITY
WHERE counter < 11
GROUP BY id_2, ORDINALITY
ORDER BY ORDINALITY
),
avr AS (
SELECT id_2, array_agg(av::float) AS averages
FROM elements_average
GROUP BY id_2
)
UPDATE table2 SET vec = averages FROM avr WHERE table2.id = avr.id_2;
RETURN NULL;
END;
$function$
;
CREATE TRIGGER foo_trigger AFTER
UPDATE
ON
public.table1 REFERENCING NEW TABLE AS new_table FOR EACH STATEMENT EXECUTE FUNCTION foo()
The problem: when I update a few rows in table1 with different id_2 values in one transaction, the value written to table2 is wrong. It is not the average.
What's even stranger is that this code gives correct values in the same situation:
...
avr AS (
SELECT id_2, array_agg(av::float) AS averages
FROM elements_average
GROUP BY id_2
),
strange_thing AS (
SELECT * from elements_average
)
UPDATE table2 SET vec = averages FROM avr WHERE table2.id = avr.id_2;
RETURN NULL;
END;
$function$
;
So a small, seemingly meaningless SELECT changes the behavior of the function. Is it a Postgres bug or my fault?
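One possible explanation, offered as a guess rather than a confirmed diagnosis: array_agg without an ORDER BY gives no guarantee about element order, and the ORDER BY inside the elements_average CTE does not carry over to the aggregate in avr. An unrelated extra CTE can change the query plan, and with it the order in which rows happen to arrive. The sketch below keeps the ordinality and sorts inside the aggregate; the names ranked, u, elem and ord are introduced here for illustration only.

```sql
-- Postgres sketch: same trigger function, with the element order made explicit.
CREATE OR REPLACE FUNCTION public.foo()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
    WITH ranked AS (
        SELECT DISTINCT t1.id_2, t2.id, t2.vec, t2.date,
               DENSE_RANK() OVER (PARTITION BY t1.id_2 ORDER BY t2.date DESC) AS counter
        FROM new_table AS t1
        LEFT JOIN table1 t2 ON t1.id_2 = t2.id_2
    ),
    elements_average AS (
        -- Keep the array position (ord) so it can be used for sorting later.
        SELECT r.id_2, u.ord, AVG(u.elem) AS av
        FROM ranked r
        CROSS JOIN LATERAL unnest(r.vec) WITH ORDINALITY AS u(elem, ord)
        WHERE r.counter < 11
        GROUP BY r.id_2, u.ord
    ),
    avr AS (
        -- ORDER BY inside the aggregate fixes the element order of the result.
        SELECT id_2, array_agg(av ORDER BY ord) AS averages
        FROM elements_average
        GROUP BY id_2
    )
    UPDATE table2 SET vec = avr.averages FROM avr WHERE table2.id = avr.id_2;
    RETURN NULL;
END;
$function$;
```

If the wrong values in table2 turn out to be the right averages in the wrong positions, that would support this explanation.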

IF Condition Returning too Many Values

I am pretty new to the T-SQL world and am trying to create a query that will change a value based on multiple criteria.
TSH1 is the main table that values will be changed in.
Freightview is the table that has the shipping amount I need to add into TSH1.
I want the query to look for matches between the tables and when there is one make a change to the FREIGHT line if it exists. If the FREIGHT line doesn't exist then it needs to add a line with the invoice amount from Freightview table.
My issue is the IF statement. It is returning too many values for the query to work. What do I need to change?
The last two queries are to return values that are not in each table.
SELECT *
FROM TSH1 T
JOIN Freightview FR on FR.[Shippers number] = T.sonum
IF
((SELECT [Shippers number] FROM Freightview) = (SELECT sonum FROM TSH1 T WHERE EXISTS(SELECT * FROM TSH1 T WHERE T.productnum = 'FRT-OUT' OR T.productnum = 'FRT-IN' OR T.productnum = 'FRT')))
BEGIN
UPDATE TSH1 SET tcost = FR.[Invoice Amount] FROM TSH1 T INNER JOIN Freightview FR on FR.[Shippers number] = T.sonum
WHERE T.productnum = 'FRT-OUT' OR T.productnum = 'FRT-IN' OR T.productnum = 'FRT';
END
ELSE IF
((SELECT [Shippers number] FROM Freightview) = (SELECT sonum FROM TSH1 T WHERE NOT EXISTS(SELECT * FROM TSH1 T WHERE T.productnum = 'FRT-OUT' OR T.productnum = 'FRT-IN' OR T.productnum = 'FRT')))
BEGIN
SELECT * INTO temp_table FROM TSH1 T INNER JOIN Freightview FR on FR.[Shippers number] = T.sonum
WHERE FR.[Shippers number] = T.sonum AND NOT EXISTS (SELECT productnum from TSH1 T where T.productnum = 'FRT-OUT' OR T.productnum = 'FRT-IN' OR T.productnum = 'FRT');
UPDATE temp_table SET temp_table.productnum = 'FRT', [Invoice Amount] = TT.tcost, temp_table.productid = '7240', temp_table.pd = 'FREIGHT', temp_table.qtyfulfilled = 1,
temp_table.tprice = 0, temp_table.stdcost = 0, temp_table.flag = 'D', temp_table.avgcost = NULL
FROM temp_table TT
INNER JOIN Freightview FR on TT.sonum = FR.[Shippers number];
UPDATE temp_table SET ID=NULL;
DELETE x FROM (
SELECT *, rn=row_number() over (partition by TT.sonum order by TT.soid)
FROM temp_table TT
) x
WHERE rn > 1;
INSERT INTO TSH1 SELECT * FROM temp_table;
DROP TABLE temp_table;
END
ELSE
BEGIN
SELECT *
FROM TSH1 T
LEFT JOIN Freightview FR on T.sonum = FR.[Shippers number]
WHERE FR.[Shippers number] IS NULL;
END
BEGIN
SELECT *
FROM Freightview FR
LEFT JOIN TSH1_Backup T on T.sonum = FR.[Shippers number]
WHERE T.sonum IS NULL;
END
END
With SQL, you typically have to "think in sets". For example, a select statement returns a set of values, not just a single value [1].
If I select * from T, the result might have multiple rows.
If I insert T1 select * from T2, multiple rows might be inserted into T1.
So, a statement like
if ((select c from T1) = (select c from T2))
is sort of an odd construct. What exactly are we comparing here? On the left hand side we have zero or more rows from T1, and on the right hand side we have zero or more rows from T2.
Now, you might be thinking to yourself...
Well the answer is obvious. If the two result sets are identical, then the equality comparison should return true, right?
Well... yes. It would be nice if we could do that. But that would require that SQL think of the result of a select statement as "an anonymous collection type with member-wise value equality semantics". And SQL is not that sophisticated as a language. In SQL, if you're comparing one thing to another with =, the left hand side and the right hand side should both be scalar types. "Single values", like an int, or a float, or a boolean. Not sets.
Fundamentally, it's the same reason why you can't do this:
create table T1(i int);
create table T2(j int);
if (T1 = T2) print 'tables had exactly the same content';
So, how do you get the semantics "tell me if the contents of T1 and T2 exactly match"? There's no compact syntax for this. You have to be verbose about it, there are many ways to phrase the question, and it's easy to make a mistake. Here's one correct way:
create table T1(i int);
create table T2(j int);
if not exists
(
select *
from T1
full join T2 on T1.i = T2.j
where T1.i is null or T2.j is null
) print 'tables had exactly the same content';
The logic is "match every row that you can, and tell me if there are any rows that couldn't be matched".
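Another way to phrase the same question, shown here only as an alternative sketch, uses EXCEPT in both directions. It reuses the T1 and T2 tables from the example above. Unlike the full join, EXCEPT treats two NULLs as equal, and both approaches ignore how many times a value is repeated.

```sql
-- T-SQL: the tables hold the same set of values if neither side has a value
-- that the other side lacks.
create table T1(i int);
create table T2(j int);
insert T1 values (1), (2);
insert T2 values (2), (1);

if not exists (select i from T1 except select j from T2)
   and not exists (select j from T2 except select i from T1)
    print 'tables contain the same set of values';
```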
Now, interestingly enough SQL doesn't "validate" the comparison until it actually gets its results, so if your select statements each happen to return just a single row and single column, then the result of the select statement is treated as a scalar value, not a set, and then the equality comparison works. I sort of wish it didn't, because it's inconsistent and confuses people:
create table T1(i int);
create table T2(j int);
insert T1 values (1);
insert T2 values (1);
-- This will unfortunately succeed, and do what you intuitively "expect".
if ((select i from T1) = (select j from T2))
print 'tables both have exactly one row with the same value';
But what if I put more rows into one of the tables?
create table T1(i int);
create table T2(j int);
insert T1 values (1), (2);
insert T2 values (1);
-- This will fail
if ((select i from T1) = (select j from T2))
print 'tables both have exactly one row with the same value';
The error is:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
You have some SQL that makes this same mistake:
if ((select [shippers number] from Freightview) = -- ...
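A set-friendly way to write that kind of test is to ask whether any matching row exists, rather than comparing two subqueries with =. The sketch below is a guess at the intent of the first branch, using only the table and column names from the question; it has not been run against the real schema.

```sql
-- T-SQL sketch: "is there at least one freight line with a matching Freightview row?"
if exists (
    select 1
    from TSH1 T
    join Freightview FR on FR.[Shippers number] = T.sonum
    where T.productnum in ('FRT-OUT', 'FRT-IN', 'FRT')
)
begin
    -- Update through the alias so the joined rows and the updated rows are the same.
    update T
    set tcost = FR.[Invoice Amount]
    from TSH1 T
    join Freightview FR on FR.[Shippers number] = T.sonum
    where T.productnum in ('FRT-OUT', 'FRT-IN', 'FRT');
end
```

Note that the update already touches only the matching rows, so the surrounding if is optional: with no matches it would simply update zero rows.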
I hope this answers your specific question about why you're getting the error. But hang on, let's go back and look at your requirements:
I want the query to look for matches between the tables and when there is one make a change to the FREIGHT line if it exists. If the FREIGHT line doesn't exist then it needs to add a line with the invoice amount from Freightview table.
So, you want a combination of insert and update, depending on the data. An "upsert".
T-SQL has a statement which can do exactly this: MERGE. Here's a simplified example to demonstrate how to use it.
create table T1(i int, c char);
create table T2(j int, c char);
insert T1 values (1, 'a');
insert T2 values (1, 'b'), (2, 'c');
merge T1 -- T1 will be "target" in the rest of the merge statement
using T2 on T2.j = T1.i -- T2 will be "source" in the rest of the merge statement
when matched then
update
set T1.c = T2.c
-- "target" isn't an alias defined by me. It's defined by the structure of "merge"
-- So this condition translates to "if there is a row in T2 with no matching row in T1"
when not matched by target then
insert (i, c)
values (T2.j, T2.c);
select * from T1;
/* result:
i c
----
1 b
2 c
*/
Formatting merge statements is hard; I've never found a way to do it that I am totally happy with.
[1] That's not really accurate. SQL allows duplicate rows to exist in tables, result sets, and so on. In mathematics sets cannot have duplicate members. So technically you have to "think in bags". But people tend to say "think in sets" despite this.

Postgres join involving tables having join condition defined on an text array

I have two tables in postgresql
One table is of the form
Create table table1(
ID serial PRIMARY KEY,
Type text[]
);
Create table table2(
type text,
sellerID int
);
Now I want to get all the rows from table1 whose type matches a type in table2, but the problem is that in table1 the type is an array.
If type is stored as a delimited string (with a delimiter such as ',' or ';'), you can split it into rows with regexp_split_to_table(type, ','). For a real array column, the unnest function does the same job.
For example:
select *
from (select id, regexp_split_to_table(type, ',') as type from table1) t1
inner join table2 t2
on trim(t1.type) = trim(t2.type);
Another good example can be found - https://www.dbrnd.com/2017/03/postgresql-regexp_split_to_array-to-split-string-using-different-delimiters/
SELECT
a[1] AS DiskInfo
,a[2] AS DiskNumber
,a[3] AS MessageKeyword
FROM (
SELECT regexp_split_to_array('Postgres Disk information , disk 2 , failed', ',')
) AS dt(a)
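For a genuine text[] column, as in the question's table1, the unnest route mentioned above might look like the following sketch. The aliases u and elem are introduced here for illustration.

```sql
-- Postgres: expand each array into one row per element, then join on the element.
select t1.id, u.elem as type, t2.sellerid
from table1 t1
cross join lateral unnest(t1.type) as u(elem)
join table2 t2 on t2.type = u.elem;
```

Each table1 row appears once per array element that has a match in table2.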
You can use the ANY operator in the JOIN condition:
select *
from table1 t1
join table2 t2 on t2.type = any (t1.type);
Note that if the types in the table1 match multiple rows in table2, you would get duplicates (from table1) because that's how a join works. Maybe you want an EXISTS condition instead:
select *
from table1 t1
where exists (select *
from table2 t2
where t2.type = any(t1.type));

Same name attributes in select list in pg-promise

Is it possible to get the same name attributes in the select list (as JSON deduplicates them)?
For instance:
CREATE TABLE t1 (
id int
);
INSERT INTO t1 VALUES(1);
INSERT INTO t1 VALUES(2);
CREATE TABLE t2 (
id int
);
INSERT INTO t2 VALUES(1);
SELECT *
FROM t1 LEFT JOIN t2 ON t1.id = t2.id;
should return:
id id
-----
1 1
2 null
but will return instead:
id
---
1
null
I'm trying to build a web-based SQL editor, and this is kind of a showstopper.
Sorry, found it, it was solved in:
pg: https://github.com/brianc/node-postgres/pull/393
and subsequently in pg-promise: https://github.com/vitaly-t/pg-promise/releases/tag/v.4.0.5
One can use rowMode argument to get results as an array:
http://vitaly-t.github.io/pg-promise/PreparedStatement.html#rowMode
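When the query text is under your control (which is not the case for a general-purpose SQL editor), a simpler option is to give the colliding columns distinct aliases so that each one gets its own key in the result object. A minimal sketch against the t1 and t2 tables above, with t1_id and t2_id as made-up alias names:

```sql
-- Distinct aliases keep both id columns when rows are returned as objects.
SELECT t1.id AS t1_id, t2.id AS t2_id
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
ORDER BY t1.id;
-- t1_id | t2_id
--     1 |     1
--     2 | (null)
```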

How do I avoid listing all the table columns in a PostgreSQL returns statement?

I have a PostgreSQL function similar to this:
CREATE OR REPLACE FUNCTION dbo.MyTestFunction(
_ID INT
)
RETURNS dbo.MyTable AS
$$
SELECT *,
(SELECT Name FROM dbo.MySecondTable WHERE RecordID = PersonID)
FROM dbo.MyTable
WHERE PersonID = _ID
$$ LANGUAGE SQL STABLE;
I would really like to NOT have to replace the RETURNS dbo.MyTable AS with something like:
RETURNS TABLE(
col1 INT,
col2 TEXT,
col3 BOOLEAN,
col4 TEXT
) AS
and list out all the columns of MyTable and Name of MySecondTable. Is this something that can be done? Thanks.
--EDIT--
To clarify I have to return ALL columns in MyTable and 1 column from MySecondTable. If MyTable has >15 columns, I don't want to have to list out all the columns in a RETURNS TABLE (col1.. coln).
You just list the columns that you want returned in the SELECT portion of your SQL statement:
SELECT t1.column1, t1.column2,
(SELECT Name FROM dbo.MySecondTable WHERE RecordID = PersonID)
FROM dbo.MyTable t1
WHERE PersonID = _ID
Now you'll just get column1, column2, and name returned.
Furthermore, you'll probably find better performance using a LEFT OUTER JOIN in your FROM portion of the SQL statement as opposed to the correlated subquery you have now:
SELECT t1.column1, t1.column2, t2.Name
FROM dbo.MyTable t1
LEFT OUTER JOIN dbo.MySecondTable t2 ON
t2.RecordID = t1.PersonID
WHERE PersonID = _ID
Took a bit of a guess on where RecordID and PersonID were coming from, but that's the general idea.
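If the real goal is to avoid typing out the column list in the function's return type, one possible workaround is to wrap the join in a view and return the view's row type. This is a sketch under assumptions: the view name MyTableWithName is made up here, MyTable is assumed not to have its own Name column already, and RecordID is assumed to match at most one row per person.

```sql
-- Postgres sketch: the view supplies a row type holding all MyTable columns plus Name.
CREATE VIEW dbo.MyTableWithName AS
SELECT t.*, s.Name
FROM dbo.MyTable t
LEFT JOIN dbo.MySecondTable s ON s.RecordID = t.PersonID;

-- The function returns that row type, so no column list is needed here.
CREATE OR REPLACE FUNCTION dbo.MyTestFunction(_ID INT)
RETURNS SETOF dbo.MyTableWithName AS
$$
    SELECT * FROM dbo.MyTableWithName WHERE PersonID = _ID;
$$ LANGUAGE SQL STABLE;
```

The t.* in the view is expanded when the view is created, so columns added to MyTable later only show up after the view is recreated.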