How to access a record's field in an Execute sentence? - postgresql

I need to access a dynamic field from a dynamic table, and I have this code for doing this
Here f_field takes the name of a column from t_table dynamicly.
When t_table is declared as a record this code gets an error. But when declared as a static column fieldType it runs as expected.
The question is, How can I declared t_row for this code to run properly, or how can I achieve the same by other way. Remember that t_table and f_field are dynamic and therefore their values change.
FOR t_row IN EXECUTE 'SELECT * from ' || t_table loop
EXECUTE format('select $1.%I', f_field) USING t_row into f;
RAISE notice '%', f;
END LOOP;

if I get your task right, you dont need $1 and USING here at all, eg:
do
$$
declare
t_table text;
f text;
f_field text := 'oid';
r record;
begin
for r in (select relname from pg_class where relname = 'pg_database') loop
EXECUTE format('select %I from %I', f_field, r.relname) into f;
RAISE notice '%', f;
end loop;
end;
$$
;
mind it raises only first row out of many this way
NOTICE: 12669
DO

When I was doing my research I found out that sentences like EXECUTE... doesn't understand records structure, this was by experimenting with plpgsql because there is not doc about this. So I need to convert t_row in another kind of object that has methods to extract a value from a dynamic field, I chose json because I think it is easy to use. This is my code.
FOR t_row IN EXECUTE 'SELECT * from dd.a' loop
f_json := row_to_json(layer_row);
f := json_extract_path(f_json, f_field);
RAISE notice '%', f;
END LOOP;
It work for fine for me, but this is only a walkaround, if someone could point out why is my first code not working I will apreciate it. Thanks in advance

Related

Postgres Update Statement: how to set a column's value only if the column name contains a certain substring?

Each of the following tables ['table1', 'table2'] are part of the public schema, knowing that each table may contain multiple columns containing the substring 'substring' in the name for Example let's look at the following :
Table_1 (xyz, xyz_substring,...some_other_columns... abc,
abc_substring)
Table_2 (xyz, xyz_substring,..some_other_columns... abc,
abc_substring)
I am coming to this from a pythonic way of thinking, but basically, how could one execute a statement without knowing exactly what to set since the columns we need to target need to meet a certain criteria ?
I had the idea to add a another loop over the column names of the current table and check if the name meets the criteria. then execute the query but It feels very far away from optimal.
DO $$
declare
t text;
tablenames TEXT ARRAY DEFAULT ARRAY['table_1', 'table_2'];
BEGIN
FOREACH t IN ARRAY tablenames
LOOP
raise notice 'table(%)', t;
-- update : for all the column that contain 'substring' in their names set a value
END LOOP;
END$$;
EDIT:
Thanks for the answer #Stefanov.sm , i followed exactly your thought and was able to base your logic to have just one statement :
DO $$
declare
t text;
tablenames TEXT ARRAY DEFAULT ARRAY['table_1', 'table_2'];
dynsql text;
colname text;
BEGIN
FOREACH t IN ARRAY tablenames LOOP
raise notice 'table (%)', t;
dynsql := format('update %I set', t);
for colname in select column_name from information_schema.columns where table_schema = 'public' and table_name = t
loop
if colname like '%substring%' then
dynsql := concat(dynsql,format(' %I = ....whatever expression here (Make sure to check if you should use Literal formatter %L if needed) ....,',colname,...whateverargs...));
end if;
end loop;
dynsql := concat(dynsql,';'); -- not sure if required.
raise notice 'SQL to execute (%)', dynsql;
execute dynsql;
END LOOP;
END;
$$;
Extract the list of columns of each table and then format/execute dynamic SQL. Something like
DO $$
declare
t text;
tablenames text[] DEFAULT ARRAY['table_1', 'table_2'];
dynsql text;
colname text;
BEGIN
FOREACH t IN ARRAY tablenames LOOP
raise notice 'table (%)', t;
for colname in select column_name
from information_schema.columns
where table_schema = 'public' and table_name = t loop
if colname ~ '__substring$' then
dynsql := format('update %I set %I = ...expression... ...other clauses if any...', t, colname);
raise notice 'SQL to execute (%)', dynsql;
execute dynsql;
end if;
end loop;
END LOOP;
END;
$$;
This will cause excessive bloat so do not forget to vacuum your tables. If your tables' schema is not public then edit the select from information_schema accordingly. You may use pg_catalog resource instead of information_schema too.

Using a row as a table in a query within a function PLpgSQL

I am trying to write a plpgsql function that loops through a table. On each loop, it pulls a row from the table, stores it in a record, then uses that record in the join clause of a query. Here is my code:
CREATE OR REPLACE FUNCTION "testfncjh2" () RETURNS int
IMMUTABLE
SECURITY DEFINER
AS $dbvis$
DECLARE
counter int;
tablesize int;
rec1 record;
tablename text;
rec2 record;
BEGIN
counter = 0;
for rec1 in SELECT * FROM poilocations_sridconv loop
raise notice 'here';
execute $$ select count(*) from $$||rec1||$$ $$ into tablesize;
while counter < tablesize loop
counter = counter + 1;
raise notice 'hi';
execute $$ select count(*) from cities_sridconv $$ into tablesize;
end loop;
end loop;
return counter;
END;
$dbvis$ LANGUAGE plpgsql;
Each time I run this, I get the following error:
ERROR: could not find array type for data type record
Is there a way to use the row as a table in the query within the nested loops?
My end goal is to build a function that loops through a table, pulling a row from that table on each loop. In each loop, a number COUNTER is computed using the row, then a query is executed depending on the row and COUNTER. Knowing that this code is currently very flawed, I am posting it below to give an idea of what I am trying to do:
CREATE OR REPLACE FUNCTION "testfncjh" () RETURNS void
IMMUTABLE
SECURITY DEFINER
AS $dbvis$
DECLARE
counter int;
tablesize int;
rec1 record;
tablename text;
rec2 record;
BEGIN
for rec1 in SELECT * FROM poilocations_sridconv loop
counter = 0;
execute $$ select count(*)
from $$||rec1||$$ a
join
cities_srid_conv b
on right(a.geom_wgs_pois,$$||counter||$$) = right(b.geom_wgs_pois,$$||counter||$$) $$ into tablesize;
raise notice 'got through first execute';
while tablesize = 0 loop
counter = counter + 1;
execute $$ select count(*)
from '||rec1||' a
join
cities_srid_conv b
on right(a.geom_wgs_pois,'||counter||') = right(b.geom_wgs_pois,'||counter||') $$ into tablesize;
raise notice 'hi';
end loop;
EXECUTE
'select
poiname,
name as cityname,
postgis.ST_Distance(postgis.ST_GeomFromText(''POINT(poilat poilong)''),
postgis.ST_GeomFromText(''POINT(citylat citylong)'')
) as distance
from (select a.poiname,
a.latitude::text as poilat,
a.longitude::text as poilong,
b.geonameid,
b.name,
b.latitude as citylat,
b.longitude as citylong
from '||rec1||' a
join cities_srid_conv b
on right(a.geom_wgs_pois,'||counter||') = right(b.geom_wgs_pois,'||counter||'))
) x
order by distance
limit 1'
poi_cities_match (poiname, cityname, distance); ------SQL STATEMENT TO INSERT CLOSEST CITY TO TABLE POI_CITIES_MATCH
end loop;
END;
$dbvis$ LANGUAGE plpgsql;
I am running on a PostgreSQL 8.2.15 database.
Also, sorry for reposting. I had to remove some data from the original.
I think you should be able to use composite types for what you want. I simplified your top example and used composite types in the following way.
CREATE OR REPLACE FUNCTION "testfncjh2" () RETURNS int
IMMUTABLE
SECURITY DEFINER
AS $dbvis$
DECLARE
counter int;
tablesize int;
rec1 poilocations_sridconv;
tablename text;
rec2 record;
BEGIN
counter = 0;
for rec1 in SELECT * FROM poilocations_sridconv loop
raise notice 'here';
select count(*) FROM (select (rec1).*)theRecord into counter;
end loop;
return counter;
END;
$dbvis$ LANGUAGE plpgsql;
The main changes being the rec1 poilocations_sridconv; line and using (select (rec1).*)
Hope it helps.
EDIT: I should note that the function is not doing the same thing as it does in the question above. This is just as an example of how you could use a record as a table in a query.
You have a few issues with your code (apart, perhaps, from your logic).
Foremost, you should not use a record as a table source in a JOIN. Instead, filter the second table for rows that match some field from the record.
Second, you should use the format() function instead of assembling strings with the || operator. But you can't because you are using the before-prehistoric version 8.2. This is from the cave-painting era (yes, it's that bad). UPGRADE!
Thirdly, don't over-complicate your queries. The sub-query is not necessary here.
Put together, the second dynamic query from your real code would reduce to this:
EXECUTE format(
'SELECT b.name,
postgis.ST_Distance(postgis.ST_SetSRID(postgis.ST_MakePoint(%1$I.longitude, %1$I.latitude), 4326),
postgis.ST_SetSRID(postgis.ST_MakePoint(b.longitude, b.latitude), 4326))
FROM cities_srid_conv b
WHERE right(%1$I.geom_wgs_pois, %2$L) = right(b.geom_wgs_pois, %2$L)
ORDER BY distance
LIMIT 1', rec1, counter) INTO cityname, distance;
poi_cities_match (rec1.poiname, cityname, distance); ------SQL STATEMENT TO INSERT CLOSEST CITY TO TABLE POI_CITIES_MATCH
Here %1$I refers to the first parameter after the string, which is an idenifier: rec1; %2$L is the second parameter, being a literal value: counter. I leave it to yourself to re-work this to a pre-8.4 string concatenation. The results from the query are stored in a few additional variables which you can then use in the following function call.
Lastly, you had longitude and latitude reversed. In PostGIS longitude always comes first.

Variables for identifiers inside IF EXISTS in a plpgsql function

CREATE OR REPLACE FUNCTION drop_now()
RETURNS void AS
$BODY$
DECLARE
row record;
BEGIN
RAISE INFO 'in';
FOR row IN
select relname from pg_stat_user_tables
WHERE schemaname='public' AND relname LIKE '%test%'
LOOP
IF EXISTS(SELECT row.relname.tm FROM row.relname
WHERE row.relname.tm < current_timestamp - INTERVAL '90 minutes'
LIMIT 1)
THEN
-- EXECUTE 'DROP TABLE ' || quote_ident(row.relname);
RAISE INFO 'Dropped table: %', quote_ident(row.relname);
END IF;
END LOOP;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Could you tell me how to use variables in SELECT which is inside IF EXISTS? At the present moment, row.relname.tm and row.relname are treated literally which is not I want.
CREATE OR REPLACE FUNCTION drop_now()
RETURNS void AS
$func$
DECLARE
_tbl regclass;
_found int;
BEGIN
FOR _tbl IN
SELECT relid
FROM pg_stat_user_tables
WHERE schemaname = 'public'
AND relname LIKE '%test%'
LOOP
EXECUTE format($f$SELECT 1 FROM %s
WHERE tm < now() - interval '90 min'$f$, _tbl);
GET DIAGNOSTICS _found = ROW_COUNT;
IF _found > 0 THEN
-- EXECUTE 'DROP TABLE ' || _tbl;
RAISE NOTICE 'Dropped table: %', _tbl;
END IF;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Major points
row is a reserved word in the SQL standard. It's use is allowed in Postgres, but it's still unwise. I make it a habbit to prepend psql variable with an underscore _ to avoid any naming conflicts.
You don't don't select the whole row anyway, just the table name in this example. Best use a variable of type regclass, thereby avoiding SQL injection by way of illegal table names automatically. Details in this related answer:
Table name as a PostgreSQL function parameter
You don't need LIMIT in an EXISTS expression, which only checks for the existence of any rows. And you don't need meaningful target columns for the same reason. Just write SELECT 1 or SELECT * or something.
You need dynamic SQL for queries with variable identifiers. Plain SQL does not allow for that. I.e.: build a query string and EXECUTE it. Details in this closely related answer:
Dynamic SQL (EXECUTE) as condition for IF statement
The same is true for a DROP statement, should you want to run it. I added a comment.
You'll need to build your query as a string then execute that - see the section on executing dynamic commands in the plpgsql section of the manual.

Update record of a cursor where the table name is a parameter

I am adjusting some PL/pgSQL code so my refcursor can take the table name as parameter. Therefore I changed the following line:
declare
pointCurs CURSOR FOR SELECT * from tableName for update;
with this one:
OPEN pointCurs FOR execute 'SELECT * FROM ' || quote_ident(tableName) for update;
I adjusted the loop, and voilĂ , the loop went through. Now at some point in the loop I needed to update the record (pointed by the cursor) and I got stuck. How should I properly adjust the following line of code?
UPDATE tableName set tp_id = pos where current of pointCurs;
I fixed the quotes for the tableName and pos and added the EXECUTE clause at the beginning, but I get the error on the where current of pointCurs.
Questions:
How can I update the record?
The function was working properly for tables from the public schema and failed for tables from other schemas (e.g., trace.myname).
Any comments are highly appreciated..
Answer for (i)
1. Explicit (unbound) cursor
EXECUTE is not a "clause", but a PL/pgSQL command to execute SQL strings. Cursors are not visible inside the command. You need to pass values to it.
Hence, you cannot use the special syntax WHERE CURRENT OFcursor. I use the system column ctid instead to determine the row without knowing the name of a unique column. Note that ctid is only guaranteed to be stable within the same transaction.
CREATE OR REPLACE FUNCTION f_curs1(_tbl text)
RETURNS void AS
$func$
DECLARE
_curs refcursor;
rec record;
BEGIN
OPEN _curs FOR EXECUTE 'SELECT * FROM ' || quote_ident(_tbl) FOR UPDATE;
LOOP
FETCH NEXT FROM _curs INTO rec;
EXIT WHEN rec IS NULL;
RAISE NOTICE '%', rec.tbl_id;
EXECUTE format('UPDATE %I SET tbl_id = tbl_id + 10 WHERE ctid = $1', _tbl)
USING rec.ctid;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Why format() with %I?
There is also a variant of the FOR statement to loop through cursors, but it only works for bound cursors. We have to use an unbound cursor here.
2. Implicit cursor in FOR loop
There is normally no need for explicit cursors in plpgsql. Use the implicit cursor of a FOR loop instead:
CREATE OR REPLACE FUNCTION f_curs2(_tbl text)
RETURNS void AS
$func$
DECLARE
_ctid tid;
BEGIN
FOR _ctid IN EXECUTE 'SELECT ctid FROM ' || quote_ident(_tbl) FOR UPDATE
LOOP
EXECUTE format('UPDATE %I SET tbl_id = tbl_id + 100 WHERE ctid = $1', _tbl)
USING _ctid;
END LOOP;
END
$func$ LANGUAGE plpgsql;
3. Set based approach
Or better, yet (if possible!): Rethink your problem in terms of set-based operations and execute a single (dynamic) SQL command:
-- Set-base dynamic SQL
CREATE OR REPLACE FUNCTION f_nocurs(_tbl text)
RETURNS void AS
$func$
BEGIN
EXECUTE format('UPDATE %I SET tbl_id = tbl_id + 1000', _tbl);
-- add WHERE clause as needed
END
$func$ LANGUAGE plpgsql;
SQL Fiddle demonstrating all 3 variants.
Answer for (ii)
A schema-qualified table name like trace.myname actually consists of two identifiers. You have to
either pass and escape them separately,
or go with the more elegant approach of using a regclass type:
CREATE OR REPLACE FUNCTION f_nocurs(_tbl regclass)
RETURNS void AS
$func$
BEGIN
EXECUTE format('UPDATE %s SET tbl_id = tbl_id + 1000', _tbl);
END
$func$ LANGUAGE plpgsql;
I switched from %I to %s, because the regclass parameter is automatically properly escaped when (automatically) converted to text.
More details in this related answer:
Table name as a PostgreSQL function parameter

Create a function to get column from multiple tables in PostgreSQL

I'm trying to create a function to get a field value from multiple tables in my database. I made script like this:
CREATE OR REPLACE FUNCTION get_all_changes() RETURNS SETOF RECORD AS
$$
DECLARE
tblname VARCHAR;
tblrow RECORD;
row RECORD;
BEGIN
FOR tblrow IN SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname='public' LOOP /*FOREACH tblname IN ARRAY $1 LOOP*/
RAISE NOTICE 'r: %', tblrow.tablename;
FOR row IN SELECT MAX("lastUpdate") FROM tblrow.tablename LOOP
RETURN NEXT row;
END LOOP;
END LOOP;
END
$$
LANGUAGE 'plpgsql' ;
SELECT get_all_changes();
But it is not working, everytime it shows this error
tblrow.tablename" not defined in line "FOR row IN SELECT MAX("lastUpdate") FROM tblrow.tablename LOOP"
Your inner FOR loop must use the FOR...EXECUTE syntax as shown in the manual:
FOR target IN EXECUTE text_expression [ USING expression [, ... ] ] LOOP
statements
END LOOP [ label ];
In your case something along this line:
FOR row IN EXECUTE 'SELECT MAX("lastUpdate") FROM ' || quote_ident(tblrow.tablename) LOOP
RETURN NEXT row;
END LOOP
The reason for this is explained in the manual somewhere else:
Oftentimes you will want to generate dynamic commands inside your PL/pgSQL functions, that is, commands that will involve different tables or different data types each time they are executed. PL/pgSQL's normal attempts to cache plans for commands (as discussed in Section 39.10.2) will not work in such scenarios. To handle this sort of problem, the EXECUTE statement is provided[...]
Answer to your new question (mislabeled as answer):
This can be much simpler. You do not need to create a table just do define a record type.
If at all, you would better create a type with CREATE TYPE, but that's only efficient if you need the type in multiple places. For just a single function, you can use RETURNS TABLE instead :
CREATE OR REPLACE FUNCTION get_all_changes(text[])
RETURNS TABLE (tablename text
,"lastUpdate" timestamp with time zone
,nums integer) AS
$func$
DECLARE
tblname text;
BEGIN
FOREACH tblname IN ARRAY $1 LOOP
RETURN QUERY EXECUTE format(
$f$SELECT '%I', MAX("lastUpdate"), COUNT(*)::int FROM %1$I
$f$, tblname)
END LOOP;
END
$func$ LANGUAGE plpgsql;
A couple more points:
Use RETURN QUERY EXECUTE instead of the nested loop. Much simpler and faster.
Column aliases would only serve as documentation, those names are discarded in favor of the names declared in the RETURNS clause (directly or indirectly).
Use format() with %I to replace the concatenation with quote_ident() and %1$I to refer to the same parameter another time.
count() usually returns type bigint. Cast the integer, since you defined the column in the return type as such: count(*)::int.
Thanks,
I finally made my script like:
CREATE TABLE IF NOT EXISTS __rsdb_changes (tablename text,"lastUpdate" timestamp with time zone, nums bigint);
CREATE OR REPLACE FUNCTION get_all_changes(varchar[]) RETURNS SETOF __rsdb_changes AS /*TABLE (tablename varchar(40),"lastUpdate" timestamp with time zone, nums integer)*/
$$
DECLARE
tblname VARCHAR;
tblrow RECORD;
row RECORD;
BEGIN
FOREACH tblname IN ARRAY $1 LOOP
/*RAISE NOTICE 'r: %', tblrow.tablename;*/
FOR row IN EXECUTE 'SELECT CONCAT('''|| quote_ident(tblname) ||''') AS tablename, MAX("lastUpdate") AS "lastUpdate",COUNT(*) AS nums FROM ' || quote_ident(tblname) LOOP
/*RAISE NOTICE 'row.tablename: %',row.tablename;*/
/*RAISE NOTICE 'row.lastUpdate: %',row."lastUpdate";*/
/*RAISE NOTICE 'row.nums: %',row.nums;*/
RETURN NEXT row;
END LOOP;
END LOOP;
RETURN;
END
$$
LANGUAGE 'plpgsql' ;
Well, it works. But it seems I can only create a table to define the return structure instead of just RETURNS SETOF RECORD. Am I right?
Thanks again.