plpgsql Select statement in For loop to create multiple CSV files - postgresql

I would like to repeat the following query 8760 times, replacing ‘2’ with 1 to 8760 for every hour in the year. The idea is to create a separate CSV file for each hour for further processing.
COPY
(SELECT *
FROM
public.completedsolarirad2012
WHERE
completedsolarirad2012."UniquetmstmpID" = 2)
TO 'C:\temp\2012hour2.csv' WITH DELIMITER ','
CSV HEADER
I have put together the following function (testing with only a few hours):
CREATE OR REPLACE FUNCTION everyhour()
RETURNS void AS
$BODY$BEGIN
FOR i IN 0..5 LOOP
EXECUTE $x$
COPY (
SELECT *
FROM
public.completedsolarirad2012
WHERE
completedsolarirad2012."UniquetmstmpID" = i
)
TO $concat$ 'C:\temp.' || i::text
|| '.out.csv' WITH DELIMITER ',' CSV HEADER $concat$
$x$;
END LOOP;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION everyhour()
OWNER TO postgres;
I seem to be having two separate problems:
Firstly, I’m getting:
Error: column "i" does not exist.
Secondly, when I test the concatenation statement only by replacing “i” with e.g. “2”, I get:
Error: relative path not allowed for COPY to file
I am the postgres superuser, so I do not understand why I am having this problem.
Note: Removing the $concat$ double-quoting around the concatenation statement gives the following error:
ERROR: syntax error at or near "||"
LINE 9: TO 'C:\temp.' || i::text
I would be very grateful for any help.

Assuming your server OS is Windows.
CREATE OR REPLACE FUNCTION everyhour()
RETURNS void AS
$func$
BEGIN
FOR i IN 0..5 LOOP
EXECUTE '
COPY (
SELECT *
FROM public.completedsolarirad2012 c
WHERE c."UniquetmstmpID" = ' || i || $x$)
TO 'C:/temp.'$x$ || i || $x$'.out.csv' WITH DELIMITER ',' CSV HEADER$x$;
END LOOP;
END
$func$ LANGUAGE plpgsql;
You had one layer of dollar-quoting too many.
You also accidentally concatenated the letter "i" instead of its value.
Use forward-slashes, even with windows. Or you may have to double up the backslashes, depending on your settings. More in this related answer:
PostgreSQL: export resulting data from SQL query to Excel/CSV
Simpler with format()
Since Postgres 9.1 you can use format() to simplify complex concatenations:
CREATE OR REPLACE FUNCTION everyhour()
RETURNS void AS
$func$
BEGIN
FOR i IN 0..5 LOOP
EXECUTE format($x$COPY (
SELECT *
FROM public.completedsolarirad2012 c
WHERE c."UniquetmstmpID" = %1$s)
TO 'C:/temp.%1$s.out.csv' WITH DELIMITER ',' CSV HEADER$x$, i);
END LOOP;
END
$func$ LANGUAGE plpgsql;

Related

liquibase sql migration issue

I am writing 1 PostgreSQL function for some operation.
Writing SQL migration for that function but facing formatting error as liquibase is not able to recognize some portion.
Function Liquibase Migration:
CREATE OR REPLACE FUNCTION schema.fncn(trId integer, sts integer, stIds character varying)
RETURNS double precision
LANGUAGE plpgsql
AS '
DECLARE
abc integer;
query CHAR(1500);
xyz integer;
BEGIN
query := ''select sum(t.a)
FROM schema.tbl t
where t.id in(1,2)
and t.status ='' || sts ||
'' and t.status <> 2
and t.tr_id ='' || trId ||
'' and t.sw in('''', ''N'')'';
IF stIds is not null then
query := query || '' AND t.st_id IN ('' || stIds || '')'';
ELSE
END IF;
EXECUTE query INTO abc;
SELECT abc INTO xyz;
RETURN xyz;
END;
'
;
Following error it throwing:
Caused by: org.postgresql.util.PSQLException: ERROR: syntax error at or near "N"
Reason: liquibase.exception.DatabaseException: ERROR: syntax error at or near "N"
Any suggestion what I am missing?
The immediate problem is the nesting of ' of single quotes. To make that easier, use dollar quoting for the function body. You can nest dollar quoted string by choosing different delimiters.
To avoid any problems with concatenation of parameters, use parameter place holders in the query and pass the values with the USING clause. That will however require two different execute calls.
I assume stIds is a comma separated string of values. To use that as a (single) placeholder, convert it to an array using string_to_array() - or even better: change the type of the input parameter to text[] and pass an array directly.
The query variable is better defined as text, don't use char. There is also no need to copy the result of the query into a different variable (which by the way would be more efficient using xyz := abc; rather than a select into)
CREATE OR REPLACE FUNCTION schema.fncn(trId integer, sts integer, stIds character varying)
RETURNS double precision
LANGUAGE plpgsql
AS
$body$
DECLARE
abc integer;
query text;
BEGIN
query := $q$ select sum(t.a)
FROM schema.tbl t
where t.id in (1,2)
and t.status = $1
and t.status <> 2
and t.tr_id = $2
and t.sw in ('''', 'N') $q$;
IF stIds is not null then
query := query || $sql$ AND t.st_id = ANY (string_to_array($4, ',') $sql$;
EXECUTE query INTO abc
using trid, sts, stids;
ELSE
EXECUTE query INTO abc
using trid, sts;
END IF;
RETURN abc;
END;
$body$
;
Note that in the Liquibase change, you must use splitStatements=false in order to run this without errors.

PL/pgSQL Looping through multiple schema, tables and rows

I have a database with multiple identical schemas. There is a number of tables all named 'tran_...' in each schema. I want to loop through all 'tran_' tables in all schemas and pull out records that fall within a specific date range. This is the code I have so far:
CREATE OR REPLACE FUNCTION public."configChanges"(starttime timestamp, endtime timestamp)
RETURNS SETOF character varying AS
$BODY$DECLARE
tbl_row RECORD;
tbl_name VARCHAR(50);
tran_row RECORD;
out_record VARCHAR(200);
BEGIN
FOR tbl_row IN
SELECT * FROM pg_tables WHERE schemaname LIKE 'ivr%' AND tablename LIKE 'tran_%'
LOOP
tbl_name := tbl_row.schemaname || '.' || tbl_row.tablename;
FOR tran_row IN
SELECT * FROM tbl_name
WHERE ch_edit_date >= starttime AND ch_edit_date <= endtime
LOOP
out_record := tbl_name || ' ' || tran_row.ch_field_name;
RETURN NEXT out_record;
END LOOP;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE plpgsql;
When I attempt to run this, I get:
ERROR: relation "tbl_name" does not exist
LINE 1: SELECT * FROM tbl_name WHERE ch_edit_date >= starttime AND c...
#Pavel already provided a fix for your basic error.
However, since your tbl_name is actually schema-qualified (two separate identifiers in : schema.table), it cannot be escaped as a whole with %I in format(). You have to escape each identifier individually.
Aside from that, I suggest a different approach. The outer loop is necessary, but the inner loop can be replaced with a simpler and more efficient set-based approach:
CREATE OR REPLACE FUNCTION public.config_changes(_start timestamp, _end timestamp)
RETURNS SETOF text AS
$func$
DECLARE
_tbl text;
BEGIN
FOR _tbl IN
SELECT quote_ident(schemaname) || '.' || quote_ident(tablename)
FROM pg_tables
WHERE schemaname LIKE 'ivr%'
AND tablename LIKE 'tran_%'
LOOP
RETURN QUERY EXECUTE format (
$$
SELECT %1$L || ' ' || ch_field_name
FROM %1$s
WHERE ch_edit_date BETWEEN $1 AND $2
$$, _tbl
)
USING _start, _end;
END LOOP;
RETURN;
END
$func$ LANGUAGE plpgsql;
You have to use dynamic SQL to parametrize identifiers (or code), like #Pavel already told you. With RETURN QUERY EXECUTE you can return the result of a dynamic query directly. Examples:
Return SETOF rows from PostgreSQL function
Refactor a PL/pgSQL function to return the output of various SELECT queries
Remember that identifiers have to be treated as unsafe user input in dynamic SQL and must always be sanitized to avoid syntax errors and SQL injection:
Table name as a PostgreSQL function parameter
Note how I escape table and schema separately:
quote_ident(schemaname) || '.' || quote_ident(tablename)
Consequently I just use %s to insert the already escaped table name in the later query. And %L to escape it a string literal for output.
I like to prepend parameter and variable names with _ to avoid naming conflicts with column names. No other special meaning.
There is a slight difference compared to your original function. This one returns an escaped identifier (double-quoted only where necessary) as table name, e.g.:
"WeIRD name"
instead of
WeIRD name
Much simpler yet
If possible, use inheritance to obviate the need for above function altogether. Complete example:
Select (retrieve) all records from multiple schemas using Postgres
You cannot use a plpgsql variable as SQL table name or SQL column name. In this case you have to use dynamic SQL:
FOR tran_row IN
EXECUTE format('SELECT * FROM %I
WHERE ch_edit_date >= starttime AND ch_edit_date <= endtime', tbl_name)
LOOP
out_record := tbl_name || ' ' || tran_row.ch_field_name;
RETURN NEXT out_record;
END LOOP;

Passing table names in an array

I need to do the same deletion or purge operation (based on several conditions) on a set of tables. For that I am trying to pass the table names in an array to a function. I am not sure if I am doing it right. Or is there a better way?
I am pasting just a sample example this is not the real function I have written but the basic is same as below:
CREATE OR REPLACE FUNCTION test (tablename text[]) RETURNS int AS
$func$
BEGIN
execute 'delete * from '||tablename;
RETURN 1;
END
$func$ LANGUAGE plpgsql;
But when I call the function I get an error:
select test( {'rajeev1'} );
ERROR: syntax error at or near "{"
LINE 10: select test( {'rajeev1'} );
^
********** Error **********
ERROR: syntax error at or near "{"
SQL state: 42601
Character: 179
Array syntax
'{rajeev1, rajeev2}' or ARRAY['rajeev1', 'rajeev2']. Read the manual.
TRUNCATE
Since you are deleting all rows from the tables, consider TRUNCATE instead. Per documentation:
Tip: TRUNCATE is a PostgreSQL extension that provides a faster
mechanism to remove all rows from a table.
Be sure to study the details. If TRUNCATE works for you, the whole operation becomes very simple, since the command accepts multiple tables:
TRUNCATE rajeev1, rajeev2, rajeev3, ..
Dynamic DELETE
Else you need dynamic SQL like you already tried. The scary missing detail: you are completely open to SQL injection and catastrophic syntax errors. Use format() with %I (not %s to sanitize identifiers like table names. Or, better yet in this particular case, use an array of regclass as parameter instead:
CREATE OR REPLACE FUNCTION f_del_all(_tbls regclass)
RETURNS void AS
$func$
DECLARE
_tbl regclass;
BEGIN
FOREACH _tbl IN ARRAY _tbls LOOP
EXECUTE format('DELETE * FROM %s', _tbl);
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT f_del_all('{rajeev1,rajeev2,rajeev3}');
Explanation here:
Table name as a PostgreSQL function parameter
You used wrong syntax for text array constant in the function call. But even if it was right, your function is not correct.
If your function has text array as argument you should loop over the array to execute query for each element.
CREATE OR REPLACE FUNCTION test (tablenames text[]) RETURNS int AS
$func$
DECLARE
tablename text;
BEGIN
FOREACH tablename IN ARRAY tablenames LOOP
EXECUTE FORMAT('delete * from %s', tablename);
END LOOP;
RETURN 1;
END
$func$ LANGUAGE plpgsql;
You can then call the function for several tables at once, not only for one.
SELECT test( '{rajeev1, rajeev2}' );
If you do not need this feature, simply change the argument type to text.
CREATE OR REPLACE FUNCTION test (tablename text) RETURNS int AS
$func$
BEGIN
EXECUTE format('delete * from %s', tablename);
RETURN 1;
END
$func$ LANGUAGE plpgsql;
SELECT test('rajeev1');
I recommend using the format function.
If you want to execute a function (say purge_this_one_table(tablename)) on a group of tables identified by similar names you can use this construction:
create or replace function purge_all_these_tables(mask text)
returns void language plpgsql
as $$
declare
tabname text;
begin
for tabname in
select relname
from pg_class
where relkind = 'r' and relname like mask
loop
execute format(
'purge_this_one_table(%s)',
tabname);
end loop;
end $$;
select purge_all_these_tables('agg_weekly_%');
It should be:
select test('{rajeev1}');

PostgreSQL - Writing dynamic sql in stored procedure that returns a result set

How can I write a stored procedure that contains a dynamically built SQL statement that returns a result set? Here is my sample code:
CREATE OR REPLACE FUNCTION reporting.report_get_countries_new (
starts_with varchar,
ends_with varchar
)
RETURNS TABLE (
country_id integer,
country_name varchar
) AS
$body$
DECLARE
starts_with ALIAS FOR $1;
ends_with ALIAS FOR $2;
sql VARCHAR;
BEGIN
sql = 'SELECT * FROM lookups.countries WHERE lookups.countries.country_name >= ' || starts_with ;
IF ends_with IS NOT NULL THEN
sql = sql || ' AND lookups.countries.country_name <= ' || ends_with ;
END IF;
RETURN QUERY EXECUTE sql;
END;
$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100 ROWS 1000;
This code returns an error:
ERROR: syntax error at or near "RETURN"
LINE 1: RETURN QUERY SELECT * FROM omnipay_lookups.countries WHERE o...
^
QUERY: RETURN QUERY SELECT * FROM omnipay_lookups.countries WHERE omnipay_lookups.countries.country_name >= r
CONTEXT: PL/pgSQL function "report_get_countries_new" line 14 at EXECUTE statement
I have tried other ways instead of this:
RETURN QUERY EXECUTE sql;
Way 1:
RETURN EXECUTE sql;
Way 2:
sql = 'RETURN QUERY SELECT * FROM....
/*later*/
EXECUTE sql;
In all cases without success.
Ultimately I want to write a stored procedure that contains a dynamic sql statement and that returns the result set from the dynamic sql statement.
There is room for improvements:
CREATE OR REPLACE FUNCTION report_get_countries_new (starts_with text
, ends_with text = NULL)
RETURNS SETOF lookups.countries AS
$func$
DECLARE
sql text := 'SELECT * FROM lookups.countries WHERE country_name >= $1';
BEGIN
IF ends_with IS NOT NULL THEN
sql := sql || ' AND country_name <= $2';
END IF;
RETURN QUERY EXECUTE sql
USING starts_with, ends_with;
END
$func$ LANGUAGE plpgsql;
-- the rest is default settings
Major points
PostgreSQL 8.4 introduced the USING clause for EXECUTE, which is useful for several reasons. Recap in the manual:
The command string can use parameter values, which are referenced in
the command as $1, $2, etc. These symbols refer to values supplied in
the USING clause. This method is often preferable to inserting data
values into the command string as text: it avoids run-time overhead of
converting the values to text and back, and it is much less prone to
SQL-injection attacks since there is no need for quoting or escaping.
IOW, it is safer and faster than building a query string with text representation of parameters, even when sanitized with quote_literal().
Note that $1, $2 in the query string refer to the supplied values in the USING clause, not to the function parameters.
While you return SELECT * FROM lookups.countries, you can simplify the RETURN declaration like demonstrated:
RETURNS SETOF lookups.countries
In PostgreSQL there is a composite type defined for every table automatically. Use it. The effect is that the function depends on the type and you get an error message if you try to alter the table. Drop & recreate the function in such a case.
This may or may not be desirable - generally it is! You want to be made aware of side effects if you alter tables. The way you have it, your function would break silently and raise an exception on it's next call.
If you provide an explicit default for the second parameter in the declaration like demonstrated, you can (but don't have to) simplify the call in case you don't want to set an upper bound with ends_with.
SELECT * FROM report_get_countries_new('Zaire');
instead of:
SELECT * FROM report_get_countries_new('Zaire', NULL);
Be aware of function overloading in this context.
Don't quote the language name 'plpgsql' even if that's tolerated (for now). It's an identifier.
You can assign a variable at declaration time. Saves an extra step.
Parameters are named in the header. Drop the nonsensical lines:
starts_with ALIAS FOR $1;
ends_with ALIAS FOR $2;
Use quote_literal() to avoid SQL injection (!!!) and fix your quoting problem:
CREATE OR REPLACE FUNCTION report_get_countries_new (
starts_with varchar,
ends_with varchar
)
RETURNS TABLE (
country_id integer,
country_name varchar
) AS
$body$
DECLARE
starts_with ALIAS FOR $1;
ends_with ALIAS FOR $2;
sql VARCHAR;
BEGIN
sql := 'SELECT * FROM lookups.countries WHERE lookups.countries.country_name ' || quote_literal(starts_with) ;
IF ends_with IS NOT NULL THEN
sql := sql || ' AND lookups.countries.country_name <= ' || quote_literal(ends_with) ;
END IF;
RETURN QUERY EXECUTE sql;
END;
$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100 ROWS 1000;
This is tested in version 9.1, works fine.

postgresql: USING CURSOR for extracting data from one database and inserting them to another

here is another algorithm using cursor but i'm having a hard time fixing its error ...
CREATE OR REPLACE FUNCTION extractstudent()
RETURNS VOID AS
$BODY$
DECLARE
studcur SCROLL cursor FOR SELECT fname, lname, mname, address FROM student;
BEGIN
open studcur;
Loop
--fetching 1 row at a time
FETCH First FROM studcur;
--every row fetched is being inserted to another database on the local site
--myconT is the name of the connection to the other database in the local site
execute 'SELECT * from dblink_exec(''myconT'', ''insert into temp_student values(studcur)'')';
--move to the next row and execute again
move next from studcur;
--exit when the row content is already empty
exit when studcur is null;
end loop;
close studcur;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION extractstudent() OWNER TO postgres;
You rarely need to explicitly use cursors in postgresql or pl/pgsql. What you've written looks suspiciously like a SQL Server cursor loop construct, and you don't need to do that. Also, you can use "PERFORM" instead of "EXECUTE" to run a query and discard the results: this will avoid re-parsing the query each time (although it can't avoid dblink parsing the query each time).
You can do something more like this:
DECLARE
rec student%rowtype;
BEGIN
FOR rec IN SELECT * FROM student
LOOP
PERFORM dblink_exec('myconT',
'insert into temp_student values ('
|| quote_nullable(rec.fname) || ','
|| quote_nullable(rec.lname) || ','
|| quote_nullable(rec.mname) || ','
|| quote_nullable(rec.address) || ')');
END LOOP;
END;
Why not try it by yourself , according the error, you can try to solve them step by step !