Building a query which sets a column according to the data in a join table - postgresql

I have a table af with columns af.id, etc. and a table af_pb with columns af_id and pb_id (which assigns entities from table pb to the entities of table af).
What I want:
add a new column precedence in table af
for each af.id in af:
if there is a pair (af_id, pb_id) with af.id = af_id and some pb_id in the join table af_pb, then set af.precedence = 0
if there is no such pair, set af.precedence = 1
How can I achieve this in PostgreSQL? I have already read about the CASE ... WHEN ... ELSE expression, but I didn't manage to implement it so that the column precedence is set correctly.

While this can be done with a case expression, it is not necessary. If you want a default value for later inserts into table af, then alter the table to add the column with that default, and update to set the non-default value:
alter table af add column precedence integer default 1;
update af
set precedence = 0
where exists (select null
              from af_pb
              where af.id = af_pb.af_id);
If a default is not desired, then just add the column and afterward update it to the appropriate value:
alter table af add column precedence integer;
update af
set precedence =
    (not exists (select null
                 from af_pb
                 where af.id = af_pb.af_id))::integer;
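As a quick sanity check, here is a minimal sketch with invented sample rows (the table and column names are taken from the question; the values are hypothetical):
-- hypothetical sample data
insert into af (id) values (1), (2), (3);
insert into af_pb (af_id, pb_id) values (1, 10), (3, 30);
-- after running the alter/update above:
-- id 1 -> precedence 0 (has a row in af_pb)
-- id 2 -> precedence 1 (no row in af_pb)
-- id 3 -> precedence 0
select id, precedence from af order by id;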

Related

Postgres Query to delete duplicated rows by many parameters

I have a table with the fields Guide, FileSource, and CodeName, all of type text. Some rows have the same values in these fields, and sometimes the values are NULL. I wrote a query that deletes duplicate rows with the same values, but when some of the values are NULL, no rows are deleted. How can I change the query so that it also deletes rows whose matching values are NULL?
DELETE FROM public.TableName as T1
USING public.TableName as T2
WHERE T1.ctid > T2.ctid
AND T1."Guide" = T2."Guide"
AND T1."FileSource" = T2."FileSource"
AND T1."CodeName" = T2."CodeName";
NULLs cannot be compared directly (with "="), because NULL = NULL does not evaluate to true. You can handle NULLs using the COALESCE function like this:
DELETE FROM public.TableName as T1
USING public.TableName as T2
WHERE T1.ctid > T2.ctid
AND COALESCE(T1."Guide",'') = COALESCE(T2."Guide",'')
AND COALESCE(T1."FileSource",'') = COALESCE(T2."FileSource",'')
AND COALESCE(T1."CodeName",'') = COALESCE(T2."CodeName",'');

Problems with create policy of update

I want to use row-level security to create a policy for UPDATE, so that tb.idx can never be updated to less than 2 if cls = 'great2':
create table tb (
idx integer,
cls text);
create role user1;
grant all on tb to user1;
......
create policy up_p on tb for update
using(true)
with check (idx >2 and cls='great2');
output:
set role user1;
select * from tb;
update tb set idx = 1, cls = 'great2';
There are two problems:
when using select * from tb, it shows an empty table.
it allows the update with idx = 1, cls = 'great2'.
it shows an empty table.
Quote from the manual
If row-level security is enabled for a table, but no applicable policies exist, a “default deny” policy is assumed, so that no rows will be visible or updatable.
So you need to create a policy that allows selecting:
create policy tb_select on tb
for select
using (true);
it allows the update with idx = 1, cls = 'great2'.
Quote from the manual
Existing table rows are checked against the expression specified in USING, while new rows that would be created via INSERT or UPDATE are checked against the expression specified in WITH CHECK
Because you created the policy with using (true), all rows can be updated.
So you need:
create policy up_p on tb
for update
using (idx > 2 and cls='great2');
Assuming there is a row with (1, 'great2'), the following update would not update anything:
update tb
set cls = 'great2'
where idx = 1;
Note that for the policy to actually be active, you also need:
alter table tb enable row level security;
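Putting the pieces together, a minimal sketch of the whole setup (same table, role, and policy names as above; the sample rows are invented):
create table tb (idx integer, cls text);
insert into tb values (10, 'great2'), (1, 'foo');
alter table tb enable row level security;
grant all on tb to user1;
create policy tb_select on tb for select using (true);
create policy up_p on tb for update using (idx > 2 and cls = 'great2');
set role user1;
select * from tb;                            -- both rows visible via tb_select
update tb set idx = 1 where idx = 10;        -- error: new row violates the policy
update tb set cls = 'great2' where idx = 1;  -- UPDATE 0: row not visible to up_p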
However, if you simply want to ensure that values for idx are always greater than 2 for rows with cls = 'great2', a check constraint might be the better option:
create table tb
(
idx integer,
cls text,
constraint check_idx check ( (idx > 2 and cls = 'great2') or (cls <> 'great2'))
);
insert into tb
values
(10, 'great2'),
(1, 'foo');
Now running:
update tb
set idx = 1
where idx = 10
results in:
ERROR: new row for relation "tb" violates check constraint "check_idx"
Detail: Failing row contains (1, great2).
The same happens if you change the cls value for a row with idx <= 2:
update tb
set cls = 'great2'
where idx = 1;

PostgreSQL inserts values into the wrong columns

I want to add a denormalized table for some data of a GTFS feed. For that, I created a new table:
CREATE TABLE denormalized_trips (
stops_coords json NOT NULL,
stops_object json NOT NULL,
agency_key text NOT NULL,
trip_id text NOT NULL,
route_id text NOT NULL,
service_id text NOT NULL,
shape_id text,
route_color text,
route_long_name text,
route_desc text,
direction_id text
);
CREATE INDEX denormalized_trips_index ON denormalized_trips (agency_key, trip_id);
-- the second index needs a distinct name, otherwise this statement fails
CREATE UNIQUE INDEX denormalized_trips_route_index ON denormalized_trips (agency_key, route_id);
Now I want to transfer data from one table to the other via an insert statement. The statement is rather complex.
INSERT INTO denormalized_trips
SELECT
trps.stops_coords,
trps.stops_object,
trps.trip_id,
trps.service_id,
trps.route_id,
trps.direction_id,
trps.agency_key,
trps.shape_id,
trps.route_color,
trps.route_long_name,
trps.route_desc
FROM (
SELECT
array_to_json(ARRAY_AGG(array[stop_lat, stop_lon])) AS stops_coords,
array_to_json(ARRAY_AGG(array[
stops.stop_id,
CAST ( stop_times.stop_sequence AS TEXT ),
stops.stop_name,
stop_times.departure_time,
CAST ( stop_times.departure_time_seconds AS TEXT ),
stop_times.arrival_time,
CAST ( stop_times.arrival_time_seconds AS TEXT )
])) AS stops_object,
trips.trip_id,
trips.service_id,
trips.direction_id,
trips.agency_key,
trips.shape_id,
routes.route_id,
routes.route_color,
routes.route_long_name,
routes.route_desc
FROM gtfs_stop_times AS stop_times
INNER JOIN gtfs_trips AS trips
ON trips.trip_id = stop_times.trip_id AND trips.agency_key = stop_times.agency_key
INNER JOIN gtfs_routes AS routes ON trips.agency_key = routes.agency_key AND routes.route_id = trips.route_id
INNER JOIN gtfs_stops AS stops
ON stops.stop_id = stop_times.stop_id
AND stops.agency_key = stop_times.agency_key
AND NOT EXISTS (
SELECT 0
FROM denormalized_max_stop_sequence AS max
WHERE max.agency_key = stop_times.agency_key
AND max.trip_id = stop_times.trip_id
AND max.trip_max = stop_times.stop_sequence
)
GROUP BY
trips.trip_id,
trips.service_id,
trips.direction_id,
trips.agency_key,
trips.shape_id,
routes.route_id,
routes.route_color,
routes.route_long_name,
routes.route_desc
) as trps
If I just run the inner select statement, I get the right results (screenshot omitted; the result set is too wide to show all columns).
But if I execute the insert statement and display the content of the table, the contents are not inserted into the right columns: agency_key now holds the values of trip_id, direction_id holds the values of service_id, and several other columns are mixed up in the same way.
So my question is: what am I doing wrong that my insert statement puts the contents into the wrong columns of the newly created table?
Thanks for your help.
Postgres, by default, will insert your values in the order the columns are declared in the table; it has nothing to do with what your columns are named in the query.
https://www.postgresql.org/docs/9.5/static/sql-insert.html
If no list of column names is given at all, the default is all the columns of the table in their declared order; or the first N column names, if there are only N columns supplied by the VALUES clause or query.
You can alter your insert to declare the order of the columns you're inserting, or you can change the order of your select to match the order of columns in the table.
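For illustration, a minimal self-contained sketch of the pitfall and the fix (the demo table and values are invented):
CREATE TABLE demo (a text, b text);
-- without a column list, values map to (a, b) in declared order:
INSERT INTO demo SELECT 'x', 'y';           -- a = 'x', b = 'y'
-- with an explicit column list, the list controls the mapping:
INSERT INTO demo (b, a) SELECT 'x', 'y';    -- a = 'y', b = 'x'
Applied to the statement above, that means adding the explicit list (stops_coords, stops_object, trip_id, service_id, route_id, direction_id, agency_key, shape_id, route_color, route_long_name, route_desc) after INSERT INTO denormalized_trips, or reordering the outer SELECT to match the table's declared column order.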

What does a column assignment using an aggregate in the columns area of a select do?

I'm trying to decipher another programmer's code who is long-gone, and I came across a select statement in a stored procedure that looks like this (simplified) example:
SELECT #Table2.Col1, #Table2.Col2, #Table2.Col3,
       MysteryColumn = CASE WHEN y.Col3 IS NOT NULL THEN #Table2.MysteryColumn - y.Col3 ELSE #Table2.MysteryColumn END
INTO #Table1
FROM #Table2
LEFT OUTER JOIN (
    SELECT Table3.Col1, Table3.Col2, Col3 = SUM(Table3.Col3)
    FROM Table3
    INNER JOIN #Table4 ON #Table4.Col1 = Table3.Col1 AND #Table4.Col2 = Table3.Col2
    GROUP BY Table3.Col1, Table3.Col2
) AS y ON #Table2.Col1 = y.Col1 AND #Table2.Col2 = y.Col2
WHERE #Table2.Col2 < @EnteredValue
My question: what does the fourth column of the primary selection do? Does it produce a boolean value checking whether the values are equal? Or does it set #Table2.MysteryColumn to some value and then insert it into #Table1? Or does it just update #Table2.MysteryColumn and not output a value into #Table1?
The same thing seems to happen inside the sub-query on the third column, and I am equally at a loss as to what that does.
MysteryColumn = gives the expression a name, also called a column alias; it is equivalent to writing the expression followed by AS MysteryColumn. The fact that a column in #Table2 also has the same name is beside the point.
Since the query uses the INTO syntax, the alias also becomes the column's name in the resulting temporary table #Table1. See the SELECT clause documentation (note the | column_alias = expression form) and the INTO clause.
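A minimal sketch of the equivalence (the expression is invented):
-- both return a single column named MysteryColumn; the first uses the
-- T-SQL "alias = expression" form, the second the standard AS form
SELECT MysteryColumn = 1 + 2;
SELECT 1 + 2 AS MysteryColumn;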

Update from two differents table with postgresql

I want to update a table with PostgreSQL.
I have a table (TABLE_ONE) with two columns (old_id and new_id), and a second table (TABLE_TWO) with columns (id, column1, column2, ...).
I want to update the column id of TABLE_TWO. The wanted behavior is that when TABLE_TWO.id = TABLE_ONE.old_id, the id is set to TABLE_ONE.new_id.
How can I do that?
You want an UPDATE ... FROM statement. Note that in PostgreSQL the column in the SET clause must not be qualified with a table name:
UPDATE table_two
SET id = table_one.new_id
FROM table_one
WHERE table_two.id = table_one.old_id;
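A quick sketch with invented data to show the effect:
create table table_one (old_id integer, new_id integer);
create table table_two (id integer, column1 text);
insert into table_one values (1, 100), (2, 200);
insert into table_two values (1, 'a'), (2, 'b'), (3, 'c');
update table_two
set id = table_one.new_id
from table_one
where table_two.id = table_one.old_id;
select id, column1 from table_two order by column1;
-- id is now 100, 200, 3 (the row with id 3 had no match and is unchanged)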