How to insert a text file into a field in PostgreSQL? - postgresql

How to insert a text file into a field in PostgreSQL?
I'd like to insert a row with fields from a local or remote text file.
I'd expect a function like gettext() or geturl() in order to do the following:
% INSERT INTO collection(id, path, content) VALUES(1, '/etc/motd', gettext('/etc/motd'));
-S.

The easiest method would be to use one of the embeddable scripting languages. Here's an example using plpythonu:
CREATE FUNCTION gettext(url TEXT) RETURNS TEXT
AS $$
import urllib2
try:
f = urllib2.urlopen(url)
return ''.join(f.readlines())
except Exception:
return ""
$$ LANGUAGE plpythonu;
One drawback to this example function is its reliance on urllib2 means you have to use "file:///" URLs to access local files, like this:
select gettext('file:///etc/motd');

Thanks for the tips. I've found another answer with a built in function.
You need to have super user rights in order to execute that!
-- 1. Create a function to load a doc
-- DROP FUNCTION get_text_document(CHARACTER VARYING);
CREATE OR REPLACE FUNCTION get_text_document(p_filename CHARACTER VARYING)
RETURNS TEXT AS $$
-- Set the end read to some big number because we are too lazy to grab the length
-- and it will cut of at the EOF anyway
SELECT CAST(pg_read_file(E'mydocuments/' || $1 ,0, 100000000) AS TEXT);
$$ LANGUAGE sql VOLATILE SECURITY DEFINER;
ALTER FUNCTION get_text_document(CHARACTER VARYING) OWNER TO postgres;
-- 2. Determine the location of your cluster by running as super user:
SELECT name, setting FROM pg_settings WHERE name='data_directory';
-- 3. Copy the files you want to import into <data_directory>/mydocuments/
-- and test it:
SELECT get_text_document('file1.txt');
-- 4. Now do the import (HINT: File must be UTF-8)
INSERT INTO mytable(file, content)
VALUES ('file1.txt', get_text_document('file1.txt'));

Postgres's COPY command is exactly for this.
My advice is to upload it to a temporary table, and then transfer the data across to your main table when you're happy with the formatting. e.g.
CREATE table text_data (text varchar)
COPY text_data FROM 'C:\mytempfolder\textdata.txt';
INSERT INTO main_table (value)
SELECT string_agg(text, chr(10)) FROM text_data;
DROP TABLE text_data;
Also see this question.

You can't. You need to write a program that will read file content's (or URL's) and store it into the desired field.

Use COPY instead of INSERT
reference: http://www.commandprompt.com/ppbook/x5504#AEN5631

Related

Trigger to insert rows in remote database after deletion

I have created a trigger that works like this:
After deleting data from table flux_tresorerie_historique it insert this row in the table flux_tresorerie_historique that is located in another database archive
I use dblink to insert data in the remote database, the problem is that the creation of the query is too hard especially that the table contain more than 20 columns, and I want to create similar functions for 10 other tables.
Is there another rapid way to ensure this task?
Here an example that works fine:
CREATE OR REPLACE FUNCTION flux_tresorerie_historique_backup_row()
RETURNS trigger AS
$BODY$
DECLARE date_rapprochement_flux TEXT;
DECLARE code_commission TEXT;
DECLARE reference_flux TEXT;
BEGIN
IF OLD.date_rapprochement_flux is null
THEN
date_rapprochement_flux = 'NULL';
ELSE
date_rapprochement_flux = ''''||to_char(OLD.date_rapprochement_flux, 'YYYY-MM-DD')||'''';
END IF;
IF OLD.code_commission is null
THEN
code_commission = 'NULL';
ELSE
code_commission = ''''||replace(OLD.code_commission,'''','''''')||'''';
END IF;
IF OLD.reference_flux is null
THEN
reference_flux = 'NULL';
ELSE
reference_flux = ''''||replace(OLD.reference_flux,'''','''''')||'''';
END IF;
perform dblink_connect('dbname=gtr_bd_archive user=postgres password=postgres');
perform dblink_exec('insert into flux_tresorerie_historique values('||OLD.id_flux_historique||','''||OLD.date_operation_flux||''','''||OLD.date_valeur_flux||''','||date_rapprochement_flux||','''||replace(OLD.libelle_flux,'''','''''')||''','||OLD.montant_flux||','||OLD.contre_valeur_dzd||','''||replace(OLD.rib_compte_bancaire,'''','''''')||''','||OLD.frais_flux||','''||replace(OLD.sens_flux,'''','''''')||''','''||replace(OLD.statut_flux,'''','''''')||''','''||replace(OLD.code_devise,'''','''''')||''','''||replace(OLD.code_mode_paiement,'''','''''')||''','''||replace(OLD.code_agence,'''','''''')||''','''||replace(OLD.code_compte,'''','''''')||''','''||replace(OLD.code_banque,'''','''''')||''','''||OLD.date_maj_flux||''','''||replace(OLD.statut_frais,'''','''''')||''','||reference_flux||','||code_commission||','||OLD.id_flux||');');
perform dblink_disconnect();
RETURN NULL;
END;
This is a limited application of replication. Requirements vary a lot, so there are a number of different established solutions, addressing different situations. Consider the overview in the manual.
Your hand-knit, trigger-based solution is one viable option for relatively few deletions. Opening and closing a separate connection for every row incurs quite an overhead. There are other various options.
While working with dblink I suggest some modifications. Most importantly:
Use format() to escape strings more elegantly.
Pass the whole row instead of passing and escaping every single column.
Don't place the password in every single trigger function.
Use a FOREIGN SERVER plus USER MAPPING. Detailed instructions here:
Persistent inserts in a UDF even if the function aborts
Basically, run once on the source server:
CREATE SERVER myserver FOREIGN DATA WRAPPER dblink_fdw
OPTIONS (hostaddr '127.0.0.1', dbname 'gtr_bd_archive');
CREATE USER MAPPING FOR role_source SERVER myserver
OPTIONS (user 'postgres', password 'secret');
Preferably, don't log in as superuser at the target server. Use a dedicated role with limited privileges to avoid privilege escalation.
And use a password file on the target server to allow password-less access. This way you don't even have to store the password in the USER MAPPING. Instructions in the last chapter of this related answer:
Run batch file with psql command without password
Then:
CREATE OR REPLACE FUNCTION pg_temp.flux_tresorerie_historique_backup_row()
RETURNS trigger AS
$func$
BEGIN
PERFORM dblink_connect('myserver'); -- name of foreign server from above
PERFORM dblink_exec( format(
$$
INSERT INTO flux_tresorerie_historique -- provide target column list!
SELECT (r).id_flux_historique
, (r).date_operation_flux
, (r).date_valeur_flux
, (r).date_rapprochement_flux::date -- 'YYYY-MM-DD' is default ISO format anyway
, (r).libelle_flux
, (r).montant_flux
, (r).contre_valeur_dzd
, (r).rib_compte_bancaire
, (r).frais_flux
, (r).sens_flux
, (r).statut_flux
, (r).code_devise
, (r).code_mode_paiement
, (r).code_agence
, (r).code_compte
, (r).code_banque
, (r).date_maj_flux
, (r).statut_frais
, (r).reference_flux
, (r).code_commission
, (r).id_flux
FROM (SELECT %L::flux_tresorerie_historique) t(r)
$$, OLD::text)); -- cast whole row type
PERFORM dblink_disconnect();
RETURN NULL; -- only for AFTER trigger
END
$func$ LANGUAGE plpgsql;
You should spell out the list of columns for the target table if the row types don't match.
If you are serious about this:
insert this row in the table flux_tresorerie_historique
I.e., you insert the whole row and the target row type is identical (no extracting a date from a timestamp etc.), you can simplify much further passing the whole row.
CREATE OR REPLACE FUNCTION flux_tresorerie_historique_backup_row()
RETURNS trigger AS
$func$
BEGIN
PERFORM dblink_connect('myserver'); -- name of foreign server
PERFORM dblink_exec( format(
$$
INSERT INTO flux_tresorerie_historique
SELECT (%L::flux_tresorerie_historique).*
$$
, OLD::text));
PERFORM dblink_disconnect();
RETURN NULL; -- only for AFTER trigger
END
$func$ LANGUAGE plpgsql;
Related:
How do I do large non-blocking updates in PostgreSQL?
You can use quote_nullable for this! Also, concat_ws comes very handy:
CREATE OR REPLACE FUNCTION flux_tresorerie_historique_backup_row()
RETURNS trigger AS
$BODY$
BEGIN
perform dblink_connect('dbname=gtr_bd_archive user=postgres password=postgres');
perform dblink_exec('insert into flux_tresorerie_historique values('||
concat_ws(', ', quote_nullable(OLD.id_flux_historique),
quote_nullable(OLD.date_operation_flux),
quote_nullable(OLD.date_valeur_flux),
quote_nullable(to_char(OLD.date_rapprochement_flux, 'YYYY-MM-DD')),
quote_nullable(OLD.libelle_flux),
quote_nullable(OLD.montant_flux),
quote_nullable(OLD.contre_valeur_dzd),
quote_nullable(OLD.rib_compte_bancaire),
quote_nullable(OLD.frais_flux),
quote_nullable(OLD.sens_flux),
quote_nullable(OLD.statut_flux),
quote_nullable(OLD.code_devise),
quote_nullable(OLD.code_mode_paiement),
quote_nullable(OLD.code_agence),
quote_nullable(OLD.code_compte),
quote_nullable(OLD.code_banque),
quote_nullable(OLD.date_maj_flux),
quote_nullable(OLD.statut_frais),
quote_nullable(OLD.reference_flux),
quote_nullable(OLD.code_commission),
quote_nullable(OLD.id_flux)
)||');');
perform dblink_disconnect();
RETURN NULL;
END;
Note that it is OK to place non-sting values between single quotes, since a quoted literal is for PostgreSQL just as good a literal value as one without the quotes, so it is convenient to place all of the columns processed by quote_nullable. Also note that quote_nullable will already output dates in YYYY-MM-DD format (e.g. select quote_nullable(now()::date) would result in '2016-05-04'), so you may want to simplify OLD.date_rapprochement_flux even further by removing the to_char.

Get values from varying columns in a generic trigger

I am new to PostgreSQL and found a trigger which serves my purpose completely except for one little thing. The trigger is quite generic and runs across different tables and logs different field changes. I found here.
What I now need to do is test for a specific field which changes as the tables change on which the trigger fires. I thought of using substr as all the column will have the same name format e.g. XXX_cust_no but the XXX can change to 2 or 4 characters. I need to log the value in theXXX_cust_no field with every record that is written to the history_ / audit table. Using a bunch of IF / ELSE statements to accomplish this is not something I would like to do.
The trigger as it now works logs the table_name, column_name, old_value, new_value. I however need to log the XXX_cust_no of the record that was changed as well.
Basically you need dynamic SQL for dynamic column names. format helps to format the DML command. Pass values from NEW and OLD with the USING clause.
Given these tables:
CREATE TABLE tbl (
t_id serial PRIMARY KEY
,abc_cust_no text
);
CREATE TABLE log (
id int
,table_name text
,column_name text
,old_value text
,new_value text
);
It could work like this:
CREATE OR REPLACE FUNCTION trg_demo()
RETURNS TRIGGER AS
$func$
BEGIN
EXECUTE format('
INSERT INTO log(id, table_name, column_name, old_value, new_value)
SELECT ($2).t_id
, $3
, $4
,($1).%1$I
,($2).%1$I', TG_ARGV[0])
USING OLD, NEW, TG_RELNAME, TG_ARGV[0];
RETURN NEW;
END
$func$ LANGUAGE plpgsql;
CREATE TRIGGER demo
BEFORE UPDATE ON tbl
FOR EACH ROW EXECUTE PROCEDURE trg_demo('abc_cust_no'); -- col name here.
SQL Fiddle.
Related answer on dba.SE:
How to access NEW or OLD field given only the field's name?
List of special variables visible in plpgsql trigger functions in the manual.

Print ASCII-art formatted SETOF records from inside a PL/pgSQL function

I would love to exploit the SQL output formatting of PostgreSQL inside my PL/pgSQL functions, but I'm starting to feel I have to give up the idea.
I have my PL/pgSQL function query_result:
CREATE OR REPLACE FUNCTION query_result(
this_query text
) RETURNS SETOF record AS
$$
BEGIN
RETURN QUERY EXECUTE this_query;
END;
$$ LANGUAGE plpgsql;
..merrily returning a SETOF records from an input text query, and which I can use for my SQL scripting with dynamic queries:
mydb=# SELECT * FROM query_result('SELECT ' || :MYVAR || ' FROM Alice') AS t (id int);
id
----
1
2
3
So my hope was to find a way to deliver this same nicely formatted output from inside a PL/pgSQL function instead, but RAISE does not support SETOF types, and there's no magic predefined cast from SETOF records to text (I know I could create my own CAST..)
If I create a dummy print_result function:
CREATE OR REPLACE FUNCTION print_result(
this_query text
) RETURNS void AS
$$
BEGIN
SELECT query_result(this_query);
END;
$$ LANGUAGE plpgsql;
..I cannot print the formatted output:
mydb=# SELECT print_result('SELECT ' || :MYVAR || ' FROM Alice');
ERROR: set-valued function called in context that cannot accept a set
...
Thanks for any suggestion (which preferably works with PostgreSQL 8.4).
Ok, to do anything with your result set in print_result you'll have to loop over it. That'll look something like this -
Here result_record is defined as a record variable. For the sake of explanation, we'll also assume that you have a formatted_results variable that is defined as text and defaulted to a blank string to hold the formatted results.
FOR result_record IN SELECT * FROM query_result(this_query) AS t (id int) LOOP
-- With all this, you can do something like this
formatted_results := formatted_results ||','|| result_record.id;
END LOOP;
RETURN formatted_results;
So, if you change print_results to return text, declare the variables as I've described and add this in, your function will return a comma-separated list of all your results (with an extra comma at the end, I'm sure you can make use of PostgreSQL's string functions to trim that). I'm not sure this is exactly what you want, but this should give you a good idea about how to manipulate your result set. You can get more information here about control structures, which should let you do pretty much whatever you want.
EDIT TO ANSWER THE REAL QUESTION:
The ability to format data tuples as readable text is a feature of the psql client, not the PostgreSQL server. To make this feature available in the server would require extracting relevant code or modules from the psql utility and recompiling them as a database function. This seems possible (and it is also possible that someone has already done this), but I am not familiar enough with the process to provide a good description of how to do that. Most likely, the best solution for formatting query results as text will be to make use of PostgreSQL's string formatting functions to implement the features you need for your application.

How to get postgres (8.4) query results with unknown columns

Edit: After posting I found Erwin Brandstetter's answer to a similar question. It sounds like in 9.2+ I could use the last option he listed, but none of the other alternatives sound workable for my situation. However, the comment from Jakub Kania and reiterated by Craig Ringer suggesting I use COPY, or \copy, in psql appears to solve my problem.
My goal is to get the results of executing a dynamically created query into a text file.
The names and number of columns are unknown; the query generated at run time is a 'pivot' one, and the names of columns in the SELECT list are taken from values stored in the database.
What I envision is being able, from the command line to run:
$ psql -o "myfile.txt" -c "EXECUTE mySQLGeneratingFuntion(param1, param2)"
But what I'm finding is that I can't get results from an EXECUTEd query unless I know the number of columns and their types that are in the results of the query.
create or replace function carrier_eligibility.createSQL() returns varchar AS
$$
begin
return 'SELECT * FROM carrier_eligibility.rule_result';
-- actual procedure writes a pivot query whose columns aren't known til run time
end
$$ language plpgsql
create or replace function carrier_eligibility.RunSQL() returns setof record AS
$$
begin
return query EXECUTE carrier_eligibility.createSQL();
end
$$ language plpgsql
-- this works, but I want to be able to get the results into a text file without knowing
-- the number of columns
select * from carrier_eligibility.RunSQL() AS (id int, uh varchar, duh varchar, what varchar)
Using psql isn't a requirement. I just want to get the results of the query into a text file, with the column names in the first row.
What format of a text file do you want? Something like csv?
How about something like this:
CREATE OR REPLACE FUNCTION sql_to_csv(in_sql text) returns setof text
SECURITY INVOKER -- CRITICAL DO NOT CHANGE THIS TO SECURITY DEFINER
LANGUAGE PLPGSQL AS
$$
DECLARE t_row RECORD;
t_out text;
BEGIN
FOR t_row IN EXECUTE in_sql LOOP
t_out := t_row::text;
t_out := regexp_replace(regexp_replace(t_out, E'^\\(', ''), E'\\)$', '');
return next t_out;
END LOOP;
END;
$$;
This should create properly quoted csv strings without the header. Embedded newlines may be a problem but you could write a quick Perl script to connect and write the data or something.
Note this presumes that the tuple structure (parenthesized csv) does not change with future versions, but it currently should work with 8.4 at least through 9.2.

In postgres (plpgsql), how to make a function that returns select * on a variable table_name?

Basically, at least for proof of concept, I want a function where I can run:
SELECT res('table_name'); and this will give me the results of SELECT * FROM table_name;.
The issue I am having is schema...in the declaration of the function I have:
CREATE OR REPLACE FUNCTION res(table_name TEXT) RETURNS SETOF THISISTHEPROBLEM AS
The problem is that I do not know how to declare my return, as it wants me to specify a table or a schema, and I won't have that until the function is actually run.
Any ideas?
You can do this, but as mentioned before you have to add a column definiton list in the SELECT query.
CREATE OR REPLACE FUNCTION res(table_name TEXT) RETURNS SETOF record AS $$
BEGIN
RETURN QUERY EXECUTE 'SELECT * FROM ' || table_name;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM res('sometable') sometable (col1 INTEGER, col2 INTEGER, col3 SMALLINT, col4 TEXT);
Why for any real practical purpose would you just want to pass in table and select * from it? For fun maybe?
You can't do it without defining some kind of known output like jack and rudi show. Or doing it like depesz does here using output parameters http://www.depesz.com/index.php/2008/05/03/waiting-for-84-return-query-execute-and-cursor_tuple_fraction/.
A few hack around the wall approachs are to issue raise notices in a loop and print out a result set one row at a time. Or you could create a function called get_rows_TABLENAME that has a definition for every table you want to return. Just use code to generate the procedures creations. But again not sure how much value doing a select * from a table, especially with no constraints is other than for fun or making the DBA's blood boil.
Now in SQL Server you can have a stored procedure return a dynamic result set. This is both a blessing and curse as you can't be certain what comes back without looking up the definition. For me I look at PostgreSQL's implementation to be the more sound way to go about it.
Even if you manage to do this (see rudi-moore's answer for a way if you have 8.4 or above), You will have to expand the type explicitly in the select - eg:
SELECT res('table_name') as foo(id int,...)