Use psql's \copy for a multi-line query - postgresql

This is a follow-up question from this answer for "Save PL/pgSQL output from PostgreSQL to a CSV file".
I need to write a client-side CSV file using psql's \copy command. A one liner works:
db=> \copy (select 1 AS foo) to 'bar.csv' csv header
COPY 1
However, I have long queries that span several lines. I don't need to show the query, as I can't seem to extend this past one line without a parse error:
db=> \copy (
\copy: parse error at end of line
db=> \copy ( \\
\copy: parse error at end of line
db=> \copy ("
\copy: parse error at end of line
db=> \copy "(
\copy: parse error at end of line
db=> \copy \\
\copy: parse error at end of line
Is it possible to use \copy with a query that spans multiple lines? I'm using psql on Windows.

The working solution I have right now is to create a temporary view, which can be declared over multiple lines, then select from it in the \copy command, which fits comfortably on one line.
db=> CREATE TEMP VIEW v1 AS
db-> SELECT i
db-> FROM generate_series(1, 2) AS i;
CREATE VIEW
db=> \cd /path/to/a/really/deep/directory/structure/on/client
db=> \copy (SELECT * FROM v1) TO 'out.csv' csv header
COPY 2
db=> DROP VIEW v1;
DROP VIEW

We can use a HEREDOC to feed multi-line SQL to psql:
# Putting the SQL using a HEREDOC
tr '\n' ' ' << SQL | psql mydatabase
\COPY (
SELECT
provider_id,
provider_name,
...
) TO './out.tsv' WITH (DELIMITER E'\t', NULL '')
SQL
Source: https://minhajuddin.com/2017/05/18/how-to-pass-a-multi-line-copy-sql-to-psql/
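To see what the tr step does without touching a database, pipe the HEREDOC into cat instead of psql; the multi-line SQL comes out as a single line that \copy can parse (table and file names here are placeholders):

```shell
# Flatten a multi-line \copy command exactly as the pipeline above does,
# but send the result to cat so it is just printed, not executed:
tr '\n' ' ' << 'SQL' | cat
\COPY (
  SELECT 1 AS foo
) TO 'out.csv' CSV HEADER
SQL
```

Every newline becomes a space, so psql receives the whole \copy command on one line.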

You can combine the server-side COPY command with the \g psql command to write a multi-line query's output to a local file:
db=# COPY (
SELECT department, count(*) AS employees
FROM emp
WHERE role = 'dba'
GROUP BY department
ORDER BY employees
) TO STDOUT WITH CSV HEADER \g department_dbas.csv
COPY 5
I describe this technique in detail here: https://hakibenita.com/postgresql-unknown-features#use-copy-with-multi-line-sql

Related

Using pg_dump to get only a portion of the data in a table (e.g. created_on > '2020-01-01')

I want to use pg_dump to dump partial data. What should I do?
You can use the \copy command, like this:
\copy (SELECT * FROM "Object" WHERE created_on > '2020-01-01') TO '/Users/lethean/sql/object.csv' csv header;
\copy "Object" FROM '/Users/lethean/sql/object.csv' DELIMITER ',' csv header;

PostgreSQL Syntax error, can't find it

I have code which creates six temp tables, adds data to them, merges them, and exports the result. I can make it work by pressing F5 on each statement separately, but I want to run the whole script at once. Can someone help me? I am pretty new to this.
CREATE TEMP TABLE john1
(email VARCHAR(200));
COPY john1(email) from 'E:\WORK\FXJohn1.csv' DELIMITER ',' CSV HEADER
CREATE TEMP TABLE john2
(email VARCHAR(200));
COPY john2(email) from 'E:\WORK\FXJohn2.csv' DELIMITER ',' CSV HEADER
CREATE TEMP TABLE john3
(email VARCHAR(200));
COPY john3(email) from 'E:\WORK\FXJohn3.csv' DELIMITER ',' CSV HEADER
CREATE TEMP TABLE john4
(email VARCHAR(200));
COPY john4(email) from 'E:\WORK\FXJohn4.csv' DELIMITER ',' CSV HEADER
CREATE TEMP TABLE john5
(email VARCHAR(200));
COPY john5(email) from 'E:\WORK\FXJohn5.csv' DELIMITER ',' CSV HEADER
CREATE TEMP TABLE john6
(email VARCHAR(200));
COPY john6(email) from 'E:\WORK\FXJohn6.csv' DELIMITER ',' CSV HEADER
CREATE TABLE finished AS
(SELECT * FROM john1
UNION
SELECT * FROM john2
UNION
SELECT * FROM john3
UNION
SELECT * FROM john4
UNION
SELECT * FROM john5
UNION
SELECT * FROM john6);
DO $func$
BEGIN
EXECUTE $$
COPY public."finished" TO 'E:\$$ || to_char(CURRENT_DATE, 'YYYY_MM_DD') || $$.csv' DELIMITER ',' CSV HEADER;
$$;
END;
$func$ LANGUAGE plpgsql;
@Rupert
Sorry, but for some reason this script is not running for me. I get this error:
ERROR: syntax error at or near "for" LINE 1: for x in $(ls FXJohn1*.csv);
Did I change the variables correctly?
for x in $(ls file_name*.csv);
(I changed file_name to one of the .csv files in my folder.)
do psql -c "copy table_name from
(I changed table_name to the name of the table I created.)
'/path/to/dir/$x' csv" db_name; done
(I changed the path to E:\WORK, where all my csv files are.)
Firstly you can load multiple .csv files into the same table. So let's set that up first:
CREATE TABLE finished
(
email varchar(200)
);
Then you can load multiple files from the same folder using a simple bash script:
for x in $(ls file_name*.csv);
do psql -c "copy table_name from
'/path/to/dir/$x' csv" db_name; done
This saves you doing multiple 'copies' and then the multiple UNIONs.
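A quick way to check the loop before pointing it at a live database is to echo the command it would run for each file (file names and the path below are stand-ins for your own):

```shell
# Dry run: print the psql invocation for each CSV instead of executing it.
# Swap echo for the real psql call once the commands look right.
for x in FXJohn1.csv FXJohn2.csv; do
  echo "psql -c \"copy finished from '/path/to/dir/$x' csv\" db_name"
done
```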
Then you can run your script:
DO $func$
BEGIN
EXECUTE $$
COPY public."finished" TO 'E:\$$ || to_char(CURRENT_DATE,
'YYYY_MM_DD') || $$.csv'
DELIMITER ',' CSV HEADER;
$$;
END;
$func$ LANGUAGE plpgsql;

COPY column order

I'm trying to use COPY with the HEADER option, but the header line in my file is in a different order than the column order in the database.
Does the column order in my file have to match the table's column order?
My code is as below:
COPY table_name (
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'schema_name'
AND table_name = 'table_name'
)
FROM 'file.csv'
WITH DELIMITER ',' CSV HEADER;
My database table has a different column order from file.csv, and I wanted to select the table's column order and copy the data from the CSV into the table.
You can't issue an SQL query in copy from; you can only list the columns.
If the CSV columns are in the order b, a, c, then list them in that order in the copy from command:
copy target_table (b, a, c)
from 'file.csv'
with (delimiter ',', format csv, header);
Assuming the column order we need is that of the table we are copying from, the next logical step is to simulate a sub-query with a Bash script.
psql schema_origin -c 'COPY table_origin TO stdout' | \
psql schema_destination -c \
"$(echo 'COPY table_destination (' \
$(psql schema_origin -t -c "select string_agg(column_name, ',') \
from information_schema.columns where table_name = 'table_origin'") \
') FROM stdin')"
StackOverflow answer on COPY command
StackExchange answer on fetching column names
StackOverflow answer on fetching results as tuples
I came up with the following setup for making COPY TO/FROM successful even for quite sophisticated JSON columns:
COPY "your_schema_name"."your_table_name" (
SELECT string_agg(
quote_ident(column_name),
','
) FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'your_table_name'
AND TABLE_SCHEMA = 'your_schema_name'
) FROM STDIN WITH CSV DELIMITER E'\t' QUOTE '\b' ESCAPE '\';
--here rows data
\.
The most important parts:
Be explicit when filtering information_schema.columns and also filter on table_schema. Otherwise, you may end up with unexpected columns when the same table name occurs in multiple schemas.
Use quote_ident to make sure your command does not crash if someone named table columns after Postgres reserved keywords like user or unique. Thanks to quote_ident they come back wrapped in double quotes, which makes them safe for importing.
I also found the following settings useful:
QUOTE '\b' - quote with a backspace character
DELIMITER E'\t' - delimit with tabs
ESCAPE '\' - escape with a backslash
These make both COPY TO and COPY FROM reliable, even when dealing with sophisticated/nested JSON columns.
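To see what quote_ident buys you: wrapping each name in double quotes (doubling any embedded quotes) keeps reserved words like user or unique valid in the generated column list. Here is a simplified shell imitation that always quotes, whereas Postgres itself quotes only when needed:

```shell
# Always-quote imitation of Postgres quote_ident(): wrap in double quotes,
# doubling any embedded double-quote characters.
quote_ident() { printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"; }

# Build a column list the way string_agg(quote_ident(column_name), ',') would:
printf '%s,%s,%s\n' "$(quote_ident id)" "$(quote_ident user)" "$(quote_ident unique)"
# "id","user","unique"
```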

postgresql - how to use a cursor or select statement to generate multiple DML statements

New to postgres and I'm using Postgresql 9.3. Is there a way with postgresql to generate a file with multiple DML statements?
For example, I want to select table name where tablename like '_foo%' and then rename all those tables to '_bar%'. Do I need to do this in a cursor or can I do this within a select statement? (like in Oracle)
ALTER TABLE tst1_foo RENAME TO tst1_bar;
ALTER TABLE tst2_foo RENAME TO tst2_bar;
ALTER TABLE tst3_foo RENAME TO tst3_bar;
I'd like to print those out to a .sql file.
Please provide a basic example if possible. Thanks.
You can use psql and the pg_tables system view. Set the output to unaligned mode:
\a
Set the output to show only rows:
\t on
Send output to your file:
\o yourfile.sql
Run the query:
SELECT 'ALTER TABLE ' || tablename || ' RENAME TO ' ||
REGEXP_REPLACE ( tablename, '_foo$', '_bar' ) || ';'
FROM pg_tables
WHERE tablename LIKE '%_foo';
Close the file:
\o
and/or close psql:
\q

How to copy from CSV file to PostgreSQL table with headers in CSV file?

I want to copy a CSV file to a Postgres table. There are about 100 columns in this table, so I do not want to rewrite them if I don't have to.
I am using the \copy table from 'table.csv' delimiter ',' csv; command, but without a table created I get ERROR: relation "table" does not exist. If I add a blank table I get no error, but nothing happens. I tried the command two or three times with no output or messages, but the table was not updated when I checked it through pgAdmin.
Is there a way to import a table with headers included like I am trying to do?
This worked. The first row had column names in it.
COPY wheat FROM 'wheat_crop_data.csv' DELIMITER ';' CSV HEADER
With the Python library pandas, you can easily create column names and infer data types from a csv file.
from sqlalchemy import create_engine
import pandas as pd
engine = create_engine('postgresql://user:pass@localhost/db_name')
df = pd.read_csv('/path/to/csv_file')
df.to_sql('pandas_db', engine)
The if_exists parameter can be set to replace or append to an existing table, e.g. df.to_sql('pandas_db', engine, if_exists='replace'). This works for additional input file types as well; see the pandas I/O and DataFrame.to_sql documentation.
Alternative from the terminal, without server-side file permissions
The PostgreSQL documentation says, under COPY's Notes:
The path will be interpreted relative to the working directory of the server process (normally the cluster's data directory), not the client's working directory.
So, generally, using psql or any client, even against a local server, you have path problems. And if you're writing a COPY command for other users, e.g. in a GitHub README, the reader will have the same problems ...
The only way to express a relative path with client-side permissions is to use STDIN:
When STDIN or STDOUT is specified, data is transmitted via the connection between the client and the server.
as shown here:
psql -h remotehost -d remote_mydb -U myuser -c \
"copy mytable (column1, column2) from STDIN with delimiter as ','" \
< ./relative_path/file.csv
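If you drive this from a script, the part worth getting right is the COPY ... FROM STDIN string itself. A minimal sketch of building it (host, database, table, and column names are placeholders; the real psql invocation is shown only as a comment):

```shell
# Assemble the SQL string passed to psql -c for a client-side COPY from STDIN.
table=mytable
cols="column1, column2"
sql="copy $table ($cols) from STDIN with delimiter as ','"
echo "$sql"
# psql -h remotehost -d remote_mydb -U myuser -c "$sql" < ./relative_path/file.csv
```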
I have been using this function for a while with no problems. You just need to provide the number of columns in the CSV file, and it will take the header names from the first row and create the table for you:
create or replace function data.load_csv_file
(
target_table text, -- name of the table that will be created
csv_file_path text,
col_count integer
)
returns void
as $$
declare
iter integer; -- dummy integer to iterate columns with
col text; -- to keep column names in each iteration
col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet
begin
set schema 'data';
create table temp_table ();
-- add just enough number of columns
for iter in 1..col_count
loop
execute format ('alter table temp_table add column col_%s text;', iter);
end loop;
-- copy the data from csv file
execute format ('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_file_path);
iter := 1;
col_first := (select col_1
from temp_table
limit 1);
-- update the column names based on the first row which has the column names
for col in execute format ('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
loop
execute format ('alter table temp_table rename column col_%s to %s', iter, col);
iter := iter + 1;
end loop;
-- delete the columns row // using quote_ident or %I does not work here!?
execute format ('delete from temp_table where %s = %L', col_first, col_first);
-- change the temp table name to the name given as parameter, if not blank
if length (target_table) > 0 then
execute format ('alter table temp_table rename to %I', target_table);
end if;
end;
$$ language plpgsql;
## csv with header
$ psql -U$db_user -h$db_host -p$db_port -d DB_NAME \
-c "\COPY TB_NAME FROM 'data_sample.csv' WITH (FORMAT CSV, header);"
## csv without header
$ psql -U$db_user -h$db_host -p$db_port -d DB_NAME \
-c "\COPY TB_NAME FROM 'data_sample.csv' WITH (FORMAT CSV);"
## csv without header, specify column
$ psql -U$db_user -h$db_host -p$db_port -d DB_NAME \
-c "\COPY TB_NAME(COL1,COL2) FROM 'data_sample.csv' WITH (FORMAT CSV);"
All columns in the CSV should match the table's columns (or the specified column list).
About COPY:
https://www.postgresql.org/docs/9.2/sql-copy.html
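Since the CSV columns must line up with the table (or the column list you give), a cheap pre-flight check is to compare the file's header line against the expected column list before running \copy. The file name and column names below are stand-ins:

```shell
# Pre-flight check: does the CSV header match the expected column list?
expected="col1,col2"
printf 'col1,col2\n1,2\n' > /tmp/data_sample.csv   # stand-in for your real file
header=$(head -n 1 /tmp/data_sample.csv)
[ "$header" = "$expected" ] && echo "header OK" || echo "header mismatch"
# header OK
```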
You can use d6tstack, which creates the table for you and is faster than pd.DataFrame.to_sql() because it uses native DB import commands. It supports Postgres as well as MySQL and MS SQL.
import pandas as pd
import d6tstack.utils
df = pd.read_csv('table.csv')
uri_psql = 'postgresql+psycopg2://usr:pwd@localhost/db'
d6tstack.utils.pd_to_psql(df, uri_psql, 'table')
It is also useful for importing multiple CSVs, handling data schema changes, and/or preprocessing with pandas (e.g. for dates) before writing to the db; see the examples notebook:
import glob
import d6tstack.combine_csv
# apply_fun is your per-file preprocessing function
d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'),
    apply_after_read=apply_fun).to_psql_combine(uri_psql, 'table')