Use function variable in dynamic COPY statement - postgresql

According to docs of PostgreSQL it is possible to copy data to csv file right from a query without using an intermediate table. I am curious how to do that.
CREATE OR REPLACE FUNCTION m_tbl(my_var integer)
RETURNS void AS
$BODY$
DECLARE
BEGIN
COPY (
select my_var
)
TO 'c:/temp/out.csv';
END;
$$ LANGUAGE plpgsql;
I get an error: no such column 'my_var'.

Yes, it is possible to COPY from any query, whether or not it refers to a table.
However, COPY is a non-plannable statement, a utility statement. It doesn't support query parameters - and query parameters are how PL/PgSQL implements the insertion of variables into statements.
So you can't use PL/PgSQL variables with COPY.
You must instead use dynamic SQL with EXECUTE. See the Pl/PgSQL documentation for examples. There are lots of examples here on Stack Overflow and on https://dba.stackexchange.com/ too.
Something like:
EXECUTE format('
COPY (
select %L
)
TO ''c:/temp/out.csv'';
', my_var);
The same applies if you want the file path to be dynamic - you'd use:
EXECUTE format('
COPY (
select %L
)
TO %L;
', my_var, 'file_name.csv');
It also works for dynamic column names but you would use %I (for identifier, like "my_name") instead of %L for literal like 'my_value'. For details on %I and %L, see the documentation for format.

Related

Stored procedure in Redshift to use string replacement with unload

I want to achieve something among the following lines:
create or replace procedure my_sp_name(p_date date) as
$$
begin
unload ('
select *
from my_table
where my_date = ''DATE_GOES_HERE''
')
to 's3://my-bucket/some_key/DATE_GOES_HERE/some_file'
iam_role 'my_iam_role'
allowoverwrite csv parallel off;
end ;
$$
language plpgsql;
I have been trying different things, however, can't seem to find what is the correct way to to do the string replacement in both parts of the unload. What is the correct way of doing this?
You need to EXECUTE "SQL statement" in the procedure. See: https://docs.aws.amazon.com/redshift/latest/dg/c_PLpgSQL-statements.html under "dynamic SQL" section.

Return variable in postgres select

This covers most use cases How do you use variables in a simple PostgreSQL script? but not the select clause.
This code produces an error column "ct" does not exist"
DO
$$
declare CT timestamp := '2020-09-04 23:59:59';
select CT,5 from job;
$$;
I can see why it would interpret CT as a column name. What's the Postgres syntax required to refer to a variable in the context of the select clause?
I would expect that query to return
'2020-09-04 23:59:59',5
for each row in the job table.
Addendum to the accepted answer
My use case doesn't return rows. Instead, the result of the select is consumed by an insert statement. I'm transforming rows from staging tables into other tables and adding value like the import date and the identity owning the inserts. It's these values that are provided by the variables - they are used in several such transforms and the point of the variable is to let me set each value once up the top of the script.
Because the rows are consumed like this, it turns out that I don't need a function wrapping this code. It's a bit inconvenient to test since I can't run the select and look at the outcome without copying it and pasting in literals, but at least it's possible to use variables. My working script looks like this:
do
$$
declare ct timestamp := '2020-09-04 23:59:59';
declare cb int := 2;
declare iso8601 varchar(50) := 'YYYY-MM-DD HH24:MI:SS';
declare USAdate varchar(50) := 'MM-DD-YYYY HH24:MI:SS';
begin
delete from dozer_wheel_loader_equipment_movement where created = ct;
INSERT INTO dozer_wheel_loader_equipment_movement
(site, primary_category_id, machine, machine_class, x, y, z, timestamp_local, created, created_by)
select site ,mc.id ,machine , machineclass ,x,y,z,to_timestamp(timestamplocal, iso8601), ct, cb
from stage_dozer_csv d join machine_category mc on d.primarycategory = mc.short_code;
...
end
$$
There is a lot of worthwhile related reading at How to declare a variable in a PostgreSQL query
There are few things about variables in PostgreSQL.
Variable can not be used in Plain SQL in Postgres. So you have to use any pl language i.e. plpgsql to use this. You have tried the same in your example.
In your DO block you have missed the Begin and End, So you have to write it like below
DO
$$
begin
declare CT timestamp := '2020-09-04 23:59:59';
select CT,5 from job;
end
$$;
But when you read the official documentation of DO Statement, it says DO will allow to run the anonymous code but it returns void, that's why above code will throw following error:
ERROR: query has no destination for result data
HINT: If you want to discard the results of a SELECT, use PERFORM instead.
CONTEXT: PL/pgSQL function inline_code_block line 4 at SQL statement
So there is only one way - wrap this code block in a Function like below:
create or replace function func() returns table(col1 timestamp, col2 int )
AS
$$
declare ct timestamp := '2020-09-04 23:59:59';
begin
return query
select CT,5 from job;
end;
$$
language plpgsql
and you can call it like below:
select * from func()
DEMO
Conclusion
You can not use variable in normal SQL statement in Postgres.
You have to use any Procedural Language i.e. plpgsql to use variable.
DO Block doesn't return any value so you can not use select statement like above in DO block. It is good for non-returning queries i.e. insert, update, delete or grant etc.
Only way to return a value from procedural language code block is - you have to wrap it in a suitable PostgreSQL Function.

Dynamic table name for INSERT INTO query

I am trying to figure out how to write an INSERT INTO query with table name and column name of the source as parameter.
For starters I was just trying to parametrize the source table name. I have written the following query. For now I am declaring and assigning the value of the variable tablename directly, but in actual example it would come from some other source/list. The target table has only one column.
CREATE OR REPLACE FUNCTION foo()
RETURNS void AS
$$
DECLARE
tablename text;
BEGIN
tablename := 'Table_1';
EXECUTE 'INSERT INTO "Schemaname"."targettable"
SELECT "Col_A"
FROM "schemaname".'
||quote_ident(tablename);
END
$$ LANGUAGE PLPGSQL;
Although the query runs without any error no changes are reflected at the target table. On running the query I get the following output.
Query OK, 0 rows affected (execution time: 296 ms; total time: 296 ms)
I want the changes to be reflected at the target table. I don't know how to resolve the problem.
Audited code
CREATE OR REPLACE FUNCTION foo()
RETURNS void AS
$func$
DECLARE
_tbl text := 'Table_1'; -- or 'table_1'?
BEGIN
EXECUTE 'INSERT INTO schemaname.targettable(column_name)
SELECT "Col_A"
FROM schemaname.' || quote_ident(_tbl); -- or "Schemaname"?
END
$func$ LANGUAGE plpgsql;
Always use an explicit target list for persisted INSERT statements.
You can assign variables at declare time.
It's a wide-spread folly to use double-quoted identifiers to preserve otherwise illegal spelling. You have to keep double-quoting the name for the rest of its existence. One or more of those errors seem to have crept into your code: "Schemaname" or "schemaname"? Table_1 or "Table_1"?
Are PostgreSQL column names case-sensitive?
When you provide an identifier like a table name as text parameter and escape it with quote_ident(), it is case sensitive!
Identifiers in SQL code are cast to lower case unless double-quoted. But quote-ident() (which you must use to defend against SQL injection) preserves the spelling you provide with double-quotes where necessary.
Function with parameter
CREATE OR REPLACE FUNCTION foo(_tbl text)
RETURNS void AS
$func$
BEGIN
EXECUTE 'INSERT INTO schemaname.targettable(column_name)
SELECT "Col_A"
FROM schemaname.' || quote_ident(_tbl);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT foo('tablename'); -- tablename is case sensitive
There are other ways:
Table name as a PostgreSQL function parameter

Print ASCII-art formatted SETOF records from inside a PL/pgSQL function

I would love to exploit the SQL output formatting of PostgreSQL inside my PL/pgSQL functions, but I'm starting to feel I have to give up the idea.
I have my PL/pgSQL function query_result:
CREATE OR REPLACE FUNCTION query_result(
this_query text
) RETURNS SETOF record AS
$$
BEGIN
RETURN QUERY EXECUTE this_query;
END;
$$ LANGUAGE plpgsql;
..merrily returning a SETOF records from an input text query, and which I can use for my SQL scripting with dynamic queries:
mydb=# SELECT * FROM query_result('SELECT ' || :MYVAR || ' FROM Alice') AS t (id int);
id
----
1
2
3
So my hope was to find a way to deliver this same nicely formatted output from inside a PL/pgSQL function instead, but RAISE does not support SETOF types, and there's no magic predefined cast from SETOF records to text (I know I could create my own CAST..)
If I create a dummy print_result function:
CREATE OR REPLACE FUNCTION print_result(
this_query text
) RETURNS void AS
$$
BEGIN
SELECT query_result(this_query);
END;
$$ LANGUAGE plpgsql;
..I cannot print the formatted output:
mydb=# SELECT print_result('SELECT ' || :MYVAR || ' FROM Alice');
ERROR: set-valued function called in context that cannot accept a set
...
Thanks for any suggestion (which preferably works with PostgreSQL 8.4).
Ok, to do anything with your result set in print_result you'll have to loop over it. That'll look something like this -
Here result_record is defined as a record variable. For the sake of explanation, we'll also assume that you have a formatted_results variable that is defined as text and defaulted to a blank string to hold the formatted results.
FOR result_record IN SELECT * FROM query_result(this_query) AS t (id int) LOOP
-- With all this, you can do something like this
formatted_results := formatted_results ||','|| result_record.id;
END LOOP;
RETURN formatted_results;
So, if you change print_results to return text, declare the variables as I've described and add this in, your function will return a comma-separated list of all your results (with an extra comma at the end, I'm sure you can make use of PostgreSQL's string functions to trim that). I'm not sure this is exactly what you want, but this should give you a good idea about how to manipulate your result set. You can get more information here about control structures, which should let you do pretty much whatever you want.
EDIT TO ANSWER THE REAL QUESTION:
The ability to format data tuples as readable text is a feature of the psql client, not the PostgreSQL server. To make this feature available in the server would require extracting relevant code or modules from the psql utility and recompiling them as a database function. This seems possible (and it is also possible that someone has already done this), but I am not familiar enough with the process to provide a good description of how to do that. Most likely, the best solution for formatting query results as text will be to make use of PostgreSQL's string formatting functions to implement the features you need for your application.

PLPGSQL: Passing an argument into a function breaks my quotation marks

Without functions, I can do:
DELETE FROM table1
WHERE something='hello'
And my rows with the value of something='hello' get deleted, but as soon I implement functions, I begin to have problems with quotation marks.
CREATE OR REPLACE FUNCTION somefunc(varchar)
RETURNS varchar AS $$
BEGIN
DELETE FROM table1
WHERE something='$1';
DELETE FROM table2
WHERE something='$1';
RETURN $1;
END;
$$ LANGUAGE plpgsql;`
Nothing seems to work. I have tried (all variations that I saw on SO or elsewhere):
something=$1 <-- says column "hello" doesn't exist (because no quotes are given)
something=''$1''
something='''$1'''
something=''''$1''''
something='''||$1||'''
something=$Q$$1$Q1$ <--- gives syntax error
something=$Q1$ $1 $Q1$
something=$$ $1 $$
something=quote_literal($1)
And many other variations. How do I get around this??
Btw, I am using a python script to run the function. Here's the line that runs it. I've also tried adding quotes into this line as well to no avail:
cur.execute("SELECT somefunc(%s);" % (sys.argv[2]))
Thank you!
This behavior is based on the implicit use of prepare statements. When prepared statements are used, query and parameters are passed to the database server separately. Do not quote values in that scenario.
PL/pgSQL uses prepared statements, psycopg2 uses prepared statements, too:
...
DECLARE myvar int;
BEGIN
DELETE FROM mytab WHERE column = myvar; -- quietly using prepared statement
versus
DECLARE myvar int;
BEGIN
-- using dynamic SQL is similar to classic languages, quoting is necessary
-- but use the quote_literal() function to protect against SQL injection
EXECUTE 'DELETE FROM mytab WHERE column = ' || quote_literal(myvar);
-- or dynamic SQL with "USING" clause
EXECUTE 'DELETE FROM mytab WHERE column = $1' USING myvar;