How to nest a SELECT into an UPDATE statement in PL/pgSQL - postgresql

I have the code below, which works well; the problem is that I am creating a new table each time, which means I need to recreate all the indexes and delete the old tables once the new ones have been created.
DO
$do$
DECLARE
    m   text;
    arr text[] := array['e09000001','e09000007','e09000033','e09000019'];
BEGIN
    FOREACH m IN ARRAY arr
    LOOP
        EXECUTE format($fmt$
            CREATE TABLE %I AS
            SELECT a.ogc_fid,
                   a.poly_id,
                   a.title_no,
                   a.wkb_geometry,
                   a.distcode,
                   SUM(COALESCE((ST_Area(ST_Intersection(a.wkb_geometry, b.wkb_geometry))/ST_Area(a.wkb_geometry))*100, 0)) AS aw
            FROM %I a
            LEFT OUTER JOIN filter_ancientwoodlands b ON
                 ST_Overlaps(a.wkb_geometry, b.wkb_geometry) OR ST_Within(b.wkb_geometry, a.wkb_geometry)
            GROUP BY a.ogc_fid,
                     a.poly_id,
                     a.title_no,
                     a.wkb_geometry,
                     a.distcode;
        $fmt$, m || '_splitv2_aw', m || '_splitv2_distcode');
    END LOOP;
END
$do$
Instead I would like to just create a new column in the existing table and update it. I have done this with simple queries like:
ALTER TABLE e09000001 ADD COLUMN area double precision;
UPDATE e09000001 SET area=ST_AREA(wkb_geometry);
I am having a lot of trouble figuring out how to use UPDATE and SET with my more complicated SELECT statement above. Does anyone know how I can achieve this?
UPDATE: So I tried doing what @abelisto suggested:
UPDATE test_table
SET aw = subquery.aw_temp
FROM (SELECT SUM(COALESCE((ST_Area(ST_Intersection(a.wkb_geometry, b.wkb_geometry))/ST_Area(a.wkb_geometry))*100, 0)) AS aw_temp
FROM test_table a
LEFT OUTER JOIN filter_ancientwoodlands b ON
ST_Overlaps(a.wkb_geometry, b.wkb_geometry) OR ST_Within(b.wkb_geometry, a.wkb_geometry)
GROUP BY a.ogc_fid,
a.poly_id,
a.title_no,
a.wkb_geometry,
a.distcode) AS subquery;
But the query just runs for a long time (it has been going on an hour now) when it should only take a few seconds. Can anyone see an error in my code?

You need a WHERE clause to join the FROM expression to the table being updated, perhaps like this:
UPDATE test_table
SET aw = subquery.aw_temp
FROM (SELECT SUM(COALESCE((ST_Area(ST_Intersection(a.wkb_geometry, b.wkb_geometry))/ST_Area(a.wkb_geometry))*100, 0)) AS aw_temp,a.wkb_geometry
FROM test_table a
LEFT OUTER JOIN filter_ancientwoodlands b ON
ST_Overlaps(a.wkb_geometry, b.wkb_geometry) OR ST_Within(b.wkb_geometry, a.wkb_geometry)
GROUP BY a.ogc_fid,
a.poly_id,
a.title_no,
a.wkb_geometry,
a.distcode) AS subquery
WHERE
subquery.wkb_geometry = test_table.wkb_geometry;
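If test_table has a unique key, joining the subquery back on that key is usually a better bet than joining on the geometry: comparing geometries with = can be slow, and in some PostGIS versions it compares only bounding boxes. A sketch, assuming ogc_fid uniquely identifies rows in test_table (it is already the first GROUP BY column), in which case the subquery only needs to group by it:
UPDATE test_table
SET aw = subquery.aw_temp
FROM (SELECT a.ogc_fid,
             SUM(COALESCE((ST_Area(ST_Intersection(a.wkb_geometry, b.wkb_geometry))/ST_Area(a.wkb_geometry))*100, 0)) AS aw_temp
      FROM test_table a
      LEFT OUTER JOIN filter_ancientwoodlands b ON
           ST_Overlaps(a.wkb_geometry, b.wkb_geometry) OR ST_Within(b.wkb_geometry, a.wkb_geometry)
      GROUP BY a.ogc_fid) AS subquery
WHERE subquery.ogc_fid = test_table.ogc_fid;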

Related

insert into select - stored procedure affects 0 rows

I am using SQL Server 2014. I created a stored procedure to update a table, but when I run it, it affects 0 rows. I'm expecting to see 501 rows affected, as the actual INSERT statement returns that when run alone. The table being updated is pre-populated.
I also tried pre-populating the table with 500 records to see whether the last row would be pulled in by the stored procedure, but it still affects 0 rows.
Create PROCEDURE UPDATE_STAGING
(@StatementType NVARCHAR(20) = '')
AS
BEGIN
IF @StatementType = 'Insertnew'
BEGIN
INSERT INTO owner.dbo.MVR_Staging
(
policy_number,
quote_number,
request_id,
CreateTs,
mvr_response_raw_data
)
select
p.pol_num,
A.pol_number,
R.Request_ID,
R.CreateTS,
R._raw_data
from TABLE1 A with (NOLOCK)
left join TABLE2 R with (NOLOCK)
on R.Request_id = isnull(A.CACHE_REQUEST_ID, A.Request_id)
inner join TABLE3 P
on p.quote_policy_num = a.policy_number
where
A.[SOURCE] = 'MVR'
and A.CREATED_ON >= '2020-01-01'
END
IF @StatementType = 'Select'
BEGIN
SELECT *
FROM owner.dbo.MVR_Staging
END
END
to run:
exec UPDATE_STAGING insertnew
GO
A few corrections to your code that are not related to your issue but are good practice for clean code: when declaring a stored procedure parameter there is no point wrapping it in parentheses, so @StatementType NVARCHAR(20) = '' is enough. You should also use ELSE IF @StatementType = 'Select'; without the ELSE, the second IF condition is always checked. Execute the procedure as exec UPDATE_STAGING 'insertnew', quoting the argument since the parameter is NVARCHAR. As for your real issue, try commenting out the INSERT part and leaving only the SELECT to see whether any rows are returned.
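Putting those suggestions together, a corrected skeleton might look like the sketch below (a sketch only: replace the PRINT placeholder with the INSERT ... SELECT from the question):
CREATE PROCEDURE UPDATE_STAGING
    @StatementType NVARCHAR(20) = ''    -- no parentheses needed around the parameter
AS
BEGIN
    IF @StatementType = 'Insertnew'
    BEGIN
        -- the INSERT ... SELECT from the question goes here
        PRINT 'insert branch'
    END
    ELSE IF @StatementType = 'Select'   -- ELSE IF, so this branch is only checked when the first one is skipped
    BEGIN
        SELECT * FROM owner.dbo.MVR_Staging
    END
END
GO
-- quote the argument, since the parameter is NVARCHAR:
EXEC UPDATE_STAGING 'Insertnew';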

using recursive CTE within a function

I wanted to try out using a recursive CTE for the first time, so I wrote a query to show the notes in a musical scale based on the root note and the steps given the different scales.
When running the script by itself all is well, but as soon as I try to turn it into a function, I get the error "relation "temp_scale_steps" does not exist".
I am using PostgreSQL 9.4.1. I can't see any reason why this would not work. Any advice would be gratefully received.
The code below:
create or replace function scale_notes(note_id int, scale_id int)
returns table(ordinal int, note varchar(2))
as
$BODY$
drop table if exists temp_min_note_seq;
create temp table temp_min_note_seq
as
select min(note_seq_id) as min_note_id from note_seq where note_id = $1
;
drop table if exists temp_scale_steps;
create temp table temp_scale_steps
as
with recursive steps (ordinal, step) as
(
select ordinal
,step
from scale_steps
where scale_id = $2
union all
select ordinal+1
,step
from steps
where ordinal < (select max(ordinal) from scale_steps where scale_id = $2)
)
select ordinal
,sum(step) as temp_note_seq_id
from steps
group by 1
order by 1
;
select x.ordinal
,n.note
from
(
select ordinal
,min_note_id + temp_note_seq_id as temp_note_seq_id
from temp_scale_steps
join temp_min_note_seq on (1=1)
) x
join note_seq ns on (x.temp_note_seq_id = ns.note_seq_id)
join notes n on (ns.note_id = n.note_id)
order by ordinal;
$BODY$
language sql volatile;
In response to the comments, I have changed the script so that the query is done in one step, and now everything works. However, I would still be interested to know why the version above does not work.
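For reference, a one-step version along those lines (a sketch assuming the same note_seq, notes and scale_steps tables, with the two temp tables folded into CTEs) could look like:
create or replace function scale_notes(note_id int, scale_id int)
returns table(ordinal int, note varchar(2))
as
$BODY$
with recursive min_note as
(
    select min(note_seq_id) as min_note_id from note_seq where note_id = $1
),
steps (ordinal, step) as
(
    select ordinal, step from scale_steps where scale_id = $2
    union all
    select ordinal+1, step from steps
    where ordinal < (select max(ordinal) from scale_steps where scale_id = $2)
),
scale as
(
    select ordinal, sum(step) as temp_note_seq_id
    from steps
    group by 1
)
select s.ordinal
      ,n.note
from scale s
join min_note m on (1=1)
join note_seq ns on (m.min_note_id + s.temp_note_seq_id = ns.note_seq_id)
join notes n on (ns.note_id = n.note_id)
order by s.ordinal;
$BODY$
language sql volatile;
Since nothing is created and then queried inside the same function body, the missing-relation error never comes up here.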

How can I check if an element in a table exists?

I'm writing a Postgres function which should delete from 3 tables successively.
The deletion order is mobgroupdata -> mobilenums -> terminals, and when there is no matching row in mobgroupdata I still want to delete from mobilenums and then from terminals. But what should the condition be? I've tried
IF mRec.id != 0, but it didn't work; then I tried exists, which also didn't work. Also, when the selected id has no row in mobgroupdata, the code breaks, but when I select an element that exists in all three tables it works. Does anybody know what the IF statement should be to make this work?
CREATE OR REPLACE FUNCTION "Delete_From_Terminals_Casc_final12"(
"Id_list" bigint,
"Curuser_id" bigint)
RETURNS SETOF term_mgd_mobnums AS
$BODY$
declare
mRec "term_mgd_mobnums"%ROWTYPE;
BEGIN
for mRec in select mn."id_terminals", t.sn , t.imei ,t.les ,t.category ,t.model ,t.tswv ,t.status ,t.activation_date ,t.deactivation_date ,t.paytype ,t.ip_address ,t.pin1 ,t.pin2 ,t.puk1 ,t.puk2 ,t.notes ,t.units ,t.validtill, t.responsible_user,t.id_clients,t.currentuser, t.isn,
md.id_mobilenums, mn.current_status, mn.start_date ,mn.streason ,mn.unit ,mn.mobnumber ,mn.service ,mn.status as mn_status,mn.activator ,mn.responsible_department,mn.date_changed ,mn.reason ,mn.installed_on ,mn.usedby ,mn.regnumber ,mn.responsible_user as mn_responsible_user ,mn.description,
md.id,md.les1 ,md.les2,md.les3,md.les4,md.les5,md.member1 ,md.member2,md.member3,md.member4,md.member5,md.user1 ,md.user2,md.user3,md.user4,md.user5,md.pass1 ,md.pass2,md.pass3,md.pass4,md.pass5 from terminals t
inner join mobilenums mn on t."id" = mn."id_terminals"
inner join mobgroupdata md on md."id_mobilenums" = mn."id"
where mn."id_terminals" = $1
loop
IF exists THEN
PERFORM "Delete_From_Mobgroupdata2"(mRec.id,$2);
PERFORM "Delete_From_Mobilenums"(mRec.id_mobilenums::text,$2);
PERFORM "Delete_From_Terminals"(mRec.id_terminals::text,$2);
ELSE
PERFORM "Delete_From_Mobilenums"(mRec.id_mobilenums::text,$2);
PERFORM "Delete_From_Terminals"(mRec.id_terminals::text,$2);
END IF;
RETURN NEXT mRec;
end loop;
return;
end;$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION "Delete_From_Terminals_Casc_final12"(bigint, bigint)
OWNER TO postgres;
Two problems with your code, if I am reading your question correctly:
You are using INNER JOIN to join to mobgroupdata. This will only retrieve results for rows which do exist in all of your tables. Use LEFT OUTER JOIN instead.
You tried mRec.id != 0, but you are looking for NULL, not 0. 0 and NULL are not the same thing in SQL. The condition you want is mRec.id IS NOT NULL.
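Applied to the loop in the question, the two fixes might look like this sketch (same tables and helper functions as in the question; the long column list is abbreviated, and mn."id" is selected as id_mobilenums so it is still populated when no mobgroupdata row exists):
for mRec in select mn."id_terminals",
    -- ... same terminals and mobilenums columns as in the question ...
    mn."id" as id_mobilenums,
    md."id"
from terminals t
inner join mobilenums mn on t."id" = mn."id_terminals"
left outer join mobgroupdata md on md."id_mobilenums" = mn."id"  -- LEFT OUTER JOIN instead of INNER JOIN
where mn."id_terminals" = $1
loop
    IF mRec.id IS NOT NULL THEN  -- a mobgroupdata row exists for this mobilenums row
        PERFORM "Delete_From_Mobgroupdata2"(mRec.id, $2);
    END IF;
    PERFORM "Delete_From_Mobilenums"(mRec.id_mobilenums::text, $2);
    PERFORM "Delete_From_Terminals"(mRec.id_terminals::text, $2);
    RETURN NEXT mRec;
end loop;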

PostgreSQL: How to figure out missing numbers in a column using generate_series()?

SELECT commandid
FROM results
WHERE NOT EXISTS (
SELECT *
FROM generate_series(0,119999)
WHERE generate_series = results.commandid
);
I have a column in results of type int but various tests failed and were not added to the table. I would like to create a query that returns a list of commandid that are not found in results. I thought the above query would do what I wanted. However, it does not even work if I use a range that is outside the expected possible range of commandid (like negative numbers).
Given sample data:
create table results ( commandid integer primary key);
insert into results (commandid) select * from generate_series(1,1000);
delete from results where random() < 0.20;
This works:
SELECT s.i AS missing_cmd
FROM generate_series(0,1000) s(i)
WHERE NOT EXISTS (SELECT 1 FROM results WHERE commandid = s.i);
as does this alternative formulation:
SELECT s.i AS missing_cmd
FROM generate_series(0,1000) s(i)
LEFT OUTER JOIN results ON (results.commandid = s.i)
WHERE results.commandid IS NULL;
Both of the above appear to result in identical query plans in my tests, but you should compare with your data on your database using EXPLAIN ANALYZE to see which is best.
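For example, with the sample data above you can prefix either query with EXPLAIN ANALYZE and compare the reported plans and timings:
EXPLAIN ANALYZE
SELECT s.i AS missing_cmd
FROM generate_series(0,1000) s(i)
WHERE NOT EXISTS (SELECT 1 FROM results WHERE commandid = s.i);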
Explanation
Note that instead of NOT IN I've used NOT EXISTS with a subquery in one formulation, and an ordinary OUTER JOIN in the other. It's much easier for the DB server to optimise these and it avoids the confusing issues that can arise with NULLs in NOT IN.
I initially favoured the OUTER JOIN formulation, but at least in 9.1 with my test data the NOT EXISTS form optimizes to the same plan.
Both will perform better than the NOT IN formulation below when the series is large, as in your case. NOT IN used to require Pg to do a linear search of the IN list for every tuple being tested, but examination of the query plan suggests Pg may be smart enough to hash it now. The NOT EXISTS (transformed into a JOIN by the query planner) and the JOIN work better.
The NOT IN formulation is both confusing in the presence of NULL commandids and can be inefficient:
SELECT s.i AS missing_cmd
FROM generate_series(0,1000) s(i)
WHERE s.i NOT IN (SELECT commandid FROM results);
so I'd avoid it. With 1,000,000 rows the other two completed in 1.2 seconds and the NOT IN formulation ran CPU-bound until I got bored and cancelled it.
As I mentioned in the comment, you need to do the reverse of the above query.
SELECT
generate_series
FROM
generate_series(0, 119999)
WHERE
NOT generate_series IN (SELECT commandid FROM results);
At that point, you should find values that do not exist within the commandid column within the selected range.
I am not an experienced SQL guru, but I like exploring other ways of solving a problem.
Just today I had a similar problem: finding unused numbers in a character column.
I solved it using PL/pgSQL and was curious how fast my procedure would be.
I used @Craig Ringer's approach to generate a table with a serial column, added one million records, and then deleted every 99th record. The procedure takes about 3 seconds to find the missing numbers:
-- creating table
create table results (commandid character(7) primary key);
-- populating table with serial numbers formatted as characters
insert into results (commandid) select cast(num_id as character(7)) from generate_series(1,1000000) as num_id;
-- delete some records
delete from results where cast(commandid as integer) % 99 = 0;
create or replace function unused_numbers()
returns setof integer as
$body$
declare
i integer;
r record;
begin
-- looping through the table with a synchronized counter:
i := 1;
for r in
(select distinct cast(commandid as integer) as num_value
from results
order by num_value asc)
loop
if not (i = r.num_value) then
while true loop
return next i;
i = i + 1;
if (i = r.num_value) then
i = i + 1;
exit;
else
continue;
end if;
end loop;
else
i := i + 1;
end if;
end loop;
return;
end;
$body$
language plpgsql volatile
cost 100
rows 1000;
select * from unused_numbers();
Maybe it will be useful to someone.
If you're on AWS Redshift, you may need to take a different approach, since it doesn't support generate_series. You'll end up with something like this:
select
startpoints.id gapstart,
min(endpoints.id) resume
from (
select id+1 id
from yourtable outer_series
where not exists
(select null
from yourtable inner_series
where inner_series.id = outer_series.id + 1
)
order by id
) startpoints,
yourtable endpoints
where
endpoints.id > startpoints.id
group by
startpoints.id;

How to execute this query without imploding my array?

Can I execute a query like this? If not, can you give me a better way to do it without walking through my array or imploding it?
....
DECLARE
examples example[];
myinput myinput[];
BEGIN
select array(select e from mytable e where row_id in (myinput)) into examples
...
SELECT e
FROM mytable e
WHERE row_id = ANY(myinput)
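Plugged back into the original block, that might look roughly like this (a sketch, assuming myinput is an array of the same type as row_id):
SELECT array(
    SELECT e
    FROM mytable e
    WHERE e.row_id = ANY(myinput)
) INTO examples;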