I'm trying to write a script that copies data from a crosstab query to a .csv file in Postgres 8.4. I am able to run the command in the psql command line but when I put the command in a file and run it using the -f option, I get a syntax error.
Here's an example of what I'm looking at (from this great answer):
CREATE TEMP TABLE t (
section text
,status text
,ct integer
);
INSERT INTO t VALUES
('A', 'Active', 1), ('A', 'Inactive', 2)
,('B', 'Active', 4), ('B', 'Inactive', 5)
, ('C', 'Inactive', 7);
\copy (
SELECT * FROM crosstab(
'SELECT section, status, ct
FROM t
ORDER BY 1,2'
,$$VALUES ('Active'::text), ('Inactive')$$)
AS ct ("Section" text, "Active" int, "Inactive" int)
) TO 'test.csv' HEADER CSV
I then run this and get the following syntax error:
$ psql [system specific] -f copy_test.sql
CREATE TABLE
INSERT 0 5
psql:copy_test.sql:12: \copy: parse error at end of line
psql:copy_test.sql:19: ERROR: syntax error at or near ")"
LINE 7: ) TO 'test.csv' HEADER CSV
^
A similar exercise doing just a simple query without crosstab works without incident.
What is causing the syntax error and how can I copy this table to a csv file using script file?
psql thinks your first command is just \copy ( and that the lines below it belong to another, unrelated statement. Meta-commands cannot span multiple lines, because a newline terminates them.
Relevant excerpts from psql manpage with some emphasis added:
Meta-Commands
Anything you enter in psql that begins with an unquoted backslash is a
psql meta-command that is processed by psql itself. These commands
make psql more useful for administration or scripting. Meta-commands
are often called slash or backslash commands.
....
....(skipped)
Parsing for arguments stops at the end of the line, or when another
unquoted backslash is found. An unquoted backslash is taken as the
beginning of a new meta-command. The special sequence \\ (two
backslashes) marks the end of arguments and continues parsing SQL
commands, if any. That way SQL and psql commands can be freely mixed
on a line. But in any case, the arguments of a meta-command cannot
continue beyond the end of the line.
So the first error is \copy ( failing; the lines below it are then interpreted as an independent SELECT, which looks fine until line 7, where there is a spurious closing parenthesis.
As mentioned in the comments, the fix is to put the whole meta-command on a single line.
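A minimal sketch of that fix as a preprocessing step, in Python. It assumes the \copy line opens a parenthesis that is closed on the line holding the rest of the command (as in the script above), and that no string literal contains an unbalanced parenthesis. The function name join_copy_lines is made up for this illustration.

```python
def join_copy_lines(script):
    """Collapse each multi-line \\copy block into a single line.

    Assumption: the block ends on the first line where the parentheses
    opened since the \\copy line are balanced again.
    """
    out, buf, depth = [], [], 0
    for line in script.splitlines():
        starts_copy = line.lstrip().lower().startswith("\\copy")
        if not buf and not starts_copy:
            out.append(line)          # ordinary SQL, leave untouched
            continue
        buf.append(line.strip())
        depth += line.count("(") - line.count(")")
        if depth <= 0:                # parentheses balanced: block is complete
            out.append(" ".join(buf))
            buf, depth = [], 0
    if buf:                           # unterminated block: emit what we have
        out.append(" ".join(buf))
    return "\n".join(out)


if __name__ == "__main__":
    example = "SELECT 1;\n\\copy (\nSELECT *\nFROM t\n) TO 'test.csv' HEADER CSV\n"
    print(join_copy_lines(example))
```

The result can be written to a new file and passed to psql -f. Ordinary SQL statements keep their line breaks, so error messages for them still carry useful line numbers.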
As with this answer, create a multi-line VIEW with a single-line \copy command, e.g.:
CREATE TEMP TABLE t (
section text
,status text
,ct integer
);
INSERT INTO t VALUES
('A', 'Active', 1), ('A', 'Inactive', 2)
,('B', 'Active', 4), ('B', 'Inactive', 5)
, ('C', 'Inactive', 7);
CREATE TEMP VIEW v1 AS
SELECT * FROM crosstab(
'SELECT section, status, ct
FROM t
ORDER BY 1,2'
,$$VALUES ('Active'::text), ('Inactive')$$)
AS ct ("Section" text, "Active" int, "Inactive" int);
\copy (SELECT * FROM v1) TO 'test.csv' HEADER CSV
-- optional
DROP VIEW v1;
The answers listed here explain the reasoning quite clearly. Here is a small hack that allows you to have your sql contain multiple lines and work with psql.
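# Newlines are replaced with spaces (not deleted) so that tokens on adjacent lines do not run together
# Using a file
psql -f <(tr '\n' ' ' < ~/s/test.sql )
# or
psql < <(tr '\n' ' ' < ~/s/test.sql )
# Putting the SQL using a HEREDOC
cat <<SQL | tr '\n' ' ' | \psql mydatabase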
\COPY (
SELECT
provider_id,
provider_name,
...
) TO './out.tsv' WITH (DELIMITER E'\t', NULL '')
SQL
According to the psql documentation:
-f filename
--file filename
Use the file filename as the source of commands instead of reading commands interactively. After the file is processed, psql terminates. This is in many ways equivalent to the internal command \i.
If filename is - (hyphen), then standard input is read.
Using this option is subtly different from writing psql < filename. In general, both will do what you expect, but using -f enables some nice features such as error messages with line numbers. There is also a slight chance that using this option will reduce the start-up overhead. On the other hand, the variant using the shell's input redirection is (in theory) guaranteed to yield exactly the same output that you would have gotten had you entered everything by hand.
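This may be one of those cases where the -f option treats your input differently from the command line. Removing the newlines worked; redirecting the original file to psql's stdin might have worked as well, although the manual says the arguments of a meta-command never continue past the end of the line, so joining the lines is the safer fix.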
Related
I'm attempting to dynamically create a script that gets saved as a bat file that will be scheduled to execute daily via Windows Task Scheduler. The script performs full database backups for each Postgres database using pg_dump.
The current script is as follows:
COPY (SELECT 'pg_dump '|| datname || ' > e:\postgresbackups\FULL\' || datname || '_%date:~4,2%-%date:~7,2%-%date:~10,4%_%time:~0,2%_%time:~3,2%_%time:~6,2%.dump' FROM pg_database) TO 'E:\PostgresBackups\Script\FULL_Postgres_Backup_Job_TEST.bat' (format csv, delimiter ';');
An example of the output is as follows:
pg_dump postgres > e:\postgresbackups\FULL\postgres_%date:~4,2%-%date:~7,2%-%date:~10,4%_%time:~0,2%_%time:~3,2%_%time:~6,2%.dump
I need help with updating my code so that the output will include double quotes around the name of the dump file; however, when I add this to my COPY script it adds more than what is necessary to the output. I would like the output to look like the following which includes the double-quotes:
pg_dump postgres > "e:\postgresbackups\FULL\postgres_%date:~4,2%-%date:~7,2%-%date:~10,4%_%time:~0,2%_%time:~3,2%_%time:~6,2%.dump"
Any help would be greatly appreciated!
Thanks to @Mike Organek's comment, my issue has been resolved by switching the format from CSV to TEXT. Now when I enclose the dump filename in double quotes, the output is much closer to what I expected and works as intended. The only odd thing is that the output now contains a second backslash after each backslash in the filename. My code has been updated as follows:
COPY (SELECT 'pg_dump '|| datname || ' > "e:\postgresbackups\FULL\' || datname || '_%date:~4,2%-%date:~7,2%-%date:~10,4%_%time:~0,2%_%time:~3,2%_%time:~6,2%.dump"' FROM pg_database) TO 'E:\PostgresBackups\Script\FULL_Postgres_Backup_Job.bat' (format text, delimiter ';');
An example of the output that gets created within the bat file is as follows:
pg_dump postgres > "e:\\postgresbackups\\FULL\\postgres_%date:~4,2%-%date:~7,2%-%date:~10,4%_%time:~0,2%_%time:~3,2%_%time:~6,2%.dump"
As you can see, each backslash is doubled, which is how COPY's text format escapes a literal backslash; however, pg_dump still executes successfully!
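To make the doubled backslash less mysterious, here is a small Python sketch that imitates a subset of the escaping COPY applies in text format: backslash, newline, carriage return, tab, and the delimiter character. It is an illustration only, not PostgreSQL's implementation, and the function name copy_text_escape is invented for this example.

```python
def copy_text_escape(value, delimiter="\t"):
    """Imitate a subset of COPY text-format escaping (illustration only)."""
    replacements = {
        "\\": "\\\\",  # a literal backslash is written as two backslashes
        "\n": "\\n",
        "\r": "\\r",
        "\t": "\\t",
    }
    out = []
    for ch in value:
        if ch in replacements:
            out.append(replacements[ch])
        elif ch == delimiter:
            out.append("\\" + ch)  # the delimiter is protected with a backslash
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    line = r'pg_dump postgres > "e:\postgresbackups\FULL\postgres.dump"'
    print(copy_text_escape(line, delimiter=";"))
```

With the delimiter set to ';' as in the COPY command above, every backslash in the generated batch line comes out doubled, matching the output shown.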
I tried to export a SELECT query result to a CSV file using the psql \copy meta-command from the command line, but I got a syntax error and can't understand why. The query looks fine to me. Could the cause be that I am using the meta-command instead of COPY?
The query
\copy
(
SELECT geo_name, state_us_abbreviation, housing_unit_count_100_percent
FROM us_counties_2010
ORDER BY housing_unit_count_100_percent DESC
LIMIT 20
)
TO '/username/Desktop/us_counties_2010_export.csv'
WITH(FORMAT CSV, HEADER, DELIMITER '|');
Error message
ERROR: syntax error at or near "TO"
LINE 7: TO '/username/Desktop/us_counties_2010_export.csv'
\copy is a metacommand given to psql, not a regular command sent to the server. So, like other metacommands, the entire \copy command must be given on one line, and it is terminated by a newline rather than by a ;.
If you look closely, you will see that the first error you got was: \copy: arguments required
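A rough pre-flight check for this mistake can be sketched in Python: flag every \copy line that has no arguments or whose parentheses are not balanced on that same line. The helper name find_broken_copy_lines is invented here, and the check is a heuristic (it does not understand quoted strings).

```python
def find_broken_copy_lines(script):
    """Return 1-based numbers of \\copy lines that look incomplete."""
    broken = []
    for number, line in enumerate(script.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.lower().startswith("\\copy"):
            continue
        args = stripped[len("\\copy"):].strip()
        # No arguments at all, or a parenthesis left open on this line.
        if not args or args.count("(") != args.count(")"):
            broken.append(number)
    return broken


if __name__ == "__main__":
    query = "\\copy\n(\nSELECT geo_name\nFROM us_counties_2010\n)\nTO 'export.csv'\n"
    print(find_broken_copy_lines(query))
```

Any line number it reports is a \copy that psql would cut off at the newline.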
I am using the \copy command to migrate my data, but the table is 30 GB and the migration takes hours. Can I use a WHERE clause so that I migrate only the data that was available a month back?
\copy hotel_room_types TO | (select hotel_room_types.* from hotel_room_types limit 1) $liocation CSV DELIMITER ',';
ERROR: syntax error at or near "."
LINE 1: ...otel_room_types TO STDOUT (select hotel_room_types.* from h...
You can specify a query with psql's \copy just like you can with the SQL command COPY:
\copy (SELECT ... WHERE ...) TO 'filename'
After all, \copy just calls COPY ... TO STDOUT under the hood.
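As an illustration, the one-line command with a date filter could be generated like this in Python. The column name created_at and the helper build_filtered_copy are assumptions made for this sketch (the question does not say which column holds the date), and the comparison direction should be adapted to whichever rows are wanted.

```python
from datetime import date, timedelta


def build_filtered_copy(table, date_column, cutoff, filename):
    """Build a single-line \\copy that exports rows on or after `cutoff`.

    `date_column` is a hypothetical column name; identifiers and the file
    name are assumed to be plain (no quotes or backslashes).
    """
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {date_column} >= DATE '{cutoff.isoformat()}'"
    )
    return f"\\copy ({query}) TO '{filename}' CSV HEADER"


if __name__ == "__main__":
    one_month_ago = date.today() - timedelta(days=30)
    print(build_filtered_copy("hotel_room_types", "created_at",
                              one_month_ago, "hotel_room_types.csv"))
```

The printed line can be pasted into psql or saved to a file and run with -f.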
Is it possible, in the psql console, to export a file with the current date at the end of the file name?
The exported file should be named like table_20140710.csv. Is it possible to do this dynamically? The date format can differ from the one above; that is not so important.
This is an example of what I mean:
\set curdate current_date
\copy (SELECT * FROM table) To 'C:/users/user/desktop/table_ ' || :curdate || '.csv' WITH DELIMITER AS ';' CSV HEADER
The fact that the \copy meta-command, as an exception, does not expand variables is (by now) documented:
Unlike most other meta-commands, the entire remainder of the line is always taken to be the arguments of \copy, and neither variable interpolation nor backquote expansion are performed in the arguments.
To work around this, you can build, store, and execute the command in multiple steps (similar to the solution Clodoaldo Neto has given):
\set filename 'my fancy dynamic name'
\set command '\\copy (SELECT * FROM generate_series(1, 5)) to ' :'filename'
:command
With this, you need to double (escape) the \ in the embedded meta command. Keep in mind that \set concatenates all further arguments to the second one, so quote spaces between the arguments. You can show the command before execution (:command) with \echo :command.
As an alternative to the local \set command, you could also build the command server side with SQL (the best way depends on where the dynamic content is originating):
SELECT '\copy (SELECT * FROM generate_series(1, 5)) to ''' || :'filename' || '''' AS command \gset
Dynamically build the \copy command and store it in a file. Then execute it with \i
First set tuples only output
\t
Set the output to a file
\o 'C:/users/user/desktop/copy_command.txt'
Build the \copy command
select format(
$$\copy (select * from the_table) To 'C:/users/user/desktop/table_%s.csv' WITH DELIMITER AS ';' CSV HEADER$$
, current_date
);
Restore the output to stdout
\o
Execute the generated command from the file
\i 'C:/users/user/desktop/copy_command.txt'
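If the script is generated outside psql anyway, the same command file can be produced with a few lines of Python. This is only a sketch of the idea above: the helper name dated_copy_command is invented, and the table name and directory are taken from the example.

```python
from datetime import date


def dated_copy_command(table, directory, day):
    """Build a one-line \\copy whose target file name ends in YYYYMMDD."""
    filename = f"{directory}/{table}_{day:%Y%m%d}.csv"
    return (
        f"\\copy (SELECT * FROM {table}) TO '{filename}' "
        "WITH DELIMITER AS ';' CSV HEADER"
    )


if __name__ == "__main__":
    command = dated_copy_command("the_table", "C:/users/user/desktop", date.today())
    # Write the command to a file that psql can run with -f or \i.
    with open("copy_command.txt", "w") as handle:
        handle.write(command + "\n")
    print(command)
```

Running the generated file with \i or psql -f then behaves like the last step above.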
I'm running PostgreSQL 9.2.6 on OS X 10.6.8. I would like to import data from a CSV file with column headers into a database. I can do this with the COPY statement, but only if I first manually create a table with a column for each column in the CSV file. Is there any way to automatically create this table based on the headers in the CSV file?
Per this question I have tried
COPY test FROM '/path/to/test.csv' CSV HEADER;
But I just get this error:
ERROR: relation "test" does not exist
And if I first create a table with no columns:
CREATE TABLE test ();
I get:
ERROR: extra data after last expected column
I can't find anything in the PostgreSQL COPY documentation about automatically creating a table. Is there some other way to automatically create a table from a CSV file with headers?
There is a very good tool that imports tables into Postgres from a csv file.
It is a command-line tool called pgfutter (with binaries for windows, linux, etc.). One of its big advantages is that it recognizes the attribute/column names as well.
The usage of the tool is simple. For example if you'd like to import myCSVfile.csv:
pgfutter --db "myDatabase" --port "5432" --user "postgres" --pw "mySecretPassword" csv myCSVfile.csv
This will create a table (called myCSVfile) with the column names taken from the csv file's header. Additionally the data types will be identified from the existing data.
A few notes: The command pgfutter varies depending on the binary you use, e.g. it could be pgfutter_windows_amd64.exe (rename it if you intend to use this command frequently). The above command has to be executed in a command line window (e.g. in Windows run cmd and ensure pgfutter is accessible). If you'd like to have a different table name, add --table "myTable"; to select a particular database schema, use --schema "mySchema". In case you are accessing an external database, use --host "myHostDomain".
A more elaborate example of pgfutter to import myFile into myTable is this one:
pgfutter --host "localhost" --port "5432" --db "myDB" --schema "public" --table "myTable" --user "postgres" --pw "myPwd" csv myFile.csv
Most likely you will change a few data types (from text to numeric) after the import:
alter table myTable
alter column myColumn type numeric
using (trim(myColumn)::numeric);
There is a second approach, which I found here (from mmatt). Basically you call a function within Postgres (last argument specifies the number of columns).
select load_csv_file('myTable','C:/MyPath/MyFile.csv',24)
Here is mmatt's function code, which I had to modify slightly, because I am working on the public schema. (copy&paste into PgAdmin SQL Editor and run it to create the function)
CREATE OR REPLACE FUNCTION load_csv_file(
target_table text,
csv_path text,
col_count integer)
RETURNS void AS
$BODY$
declare
iter integer; -- dummy integer to iterate columns with
col text; -- variable to keep the column name at each iteration
col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet
begin
set schema 'public';
create table temp_table ();
-- add just enough number of columns
for iter in 1..col_count
loop
execute format('alter table temp_table add column col_%s text;', iter);
end loop;
-- copy the data from csv file
execute format('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_path);
iter := 1;
col_first := (select col_1 from temp_table limit 1);
-- update the column names based on the first row which has the column names
for col in execute format('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
loop
execute format('alter table temp_table rename column col_%s to %s', iter, col);
iter := iter + 1;
end loop;
-- delete the columns row
execute format('delete from temp_table where %s = %L', col_first, col_first);
-- change the temp table name to the name given as parameter, if not blank
if length(target_table) > 0 then
execute format('alter table temp_table rename to %I', target_table);
end if;
end;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION load_csv_file(text, text, integer)
OWNER TO postgres;
Note: There is a common issue with importing text files related to encoding. The csv file should be in UTF-8 format. However, sometimes this is not quite achieved by the programs, which try to do the encoding. I have overcome this issue by opening the file in Notepad++ and converting it to ANSI and back to UTF8.
I am using csvsql to generate the table layout (it will automatically guess the format):
head -n 20 table.csv | csvsql --no-constraints --tables table_name
And then I use \COPY in psql. For me, that is the fastest way to import a CSV file.
You can also use sed with csvsql in order to get the desired datatype:
head -n 20 table.csv | csvsql --no-constraints --tables table_name | sed 's/DECIMAL/NUMERIC/' | sed 's/VARCHAR/TEXT/' | sed 's/DATETIME/TIMESTAMP/'
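For readers without csvkit, the general idea of guessing column types from sample rows can be sketched with the Python standard library. This is not csvsql's algorithm, only a deliberately simple stand-in that distinguishes INTEGER, NUMERIC, and TEXT; the function names are invented for the example.

```python
import csv
import io


def quote_ident(name):
    """Double-quote an SQL identifier, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def guess_type(values):
    """Pick INTEGER, NUMERIC, or TEXT for a column (naive heuristic)."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "TEXT"
    for pg_type, caster in (("INTEGER", int), ("NUMERIC", float)):
        try:
            for v in non_empty:
                caster(v)            # raises ValueError if v does not parse
            return pg_type
        except ValueError:
            continue
    return "TEXT"


def create_table_sql(csv_text, table_name):
    """Build a CREATE TABLE statement from a CSV header plus sample rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = []
    for i, name in enumerate(header):
        col_type = guess_type([row[i] for row in data])
        columns.append(f"    {quote_ident(name)} {col_type}")
    return f"CREATE TABLE {quote_ident(table_name)} (\n" + ",\n".join(columns) + "\n);"


if __name__ == "__main__":
    sample = "id,price,name\n1,2.5,apple\n2,3,pear\n"
    print(create_table_sql(sample, "fruits"))
```

Python's int() and float() accept a few spellings PostgreSQL would reject (for example "1_000" or "nan"), so the generated types are worth reviewing before running the statement.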
Use sqlite as intermediate step.
Steps:
In the command prompt type: sqlite3
In the sqlite3 CLI type: .mode csv
.import my_csv.csv my_table
.output my_table_sql.sql
.dump my_table
Finally execute that sql in your Postgresql
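The same steps can be sketched with Python's built-in sqlite3 module instead of the sqlite3 CLI. The sketch assumes every column is stored as TEXT and uses an in-memory database; the function name csv_to_sql_dump is invented. The dump is SQLite-flavoured SQL, so it may need small edits before PostgreSQL accepts it.

```python
import csv
import io
import sqlite3


def csv_to_sql_dump(csv_text, table):
    """Load CSV text into an in-memory SQLite table and return its SQL dump."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    def quote(name):
        return '"' + name.replace('"', '""') + '"'

    conn = sqlite3.connect(":memory:")
    try:
        columns = ", ".join(f"{quote(c)} TEXT" for c in header)
        conn.execute(f"CREATE TABLE {quote(table)} ({columns})")
        placeholders = ", ".join("?" for _ in header)
        conn.executemany(
            f"INSERT INTO {quote(table)} VALUES ({placeholders})", data
        )
        conn.commit()
        # iterdump() yields the statements needed to recreate the database.
        return "\n".join(conn.iterdump())
    finally:
        conn.close()


if __name__ == "__main__":
    print(csv_to_sql_dump("a,b\n1,x\n2,y\n", "my_table"))
```

The returned text plays the role of my_table_sql.sql in the steps above.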
You can't find anything in the COPY documentation, because COPY cannot create a table for you.
You need to do that before you can COPY to it.
I achieved it with these steps:
Convert the csv file to utf8
iconv -f ISO-8859-1 -t UTF-8 file.txt -o file.csv
Use this python script to create the sql to create table and copy
#!/usr/bin/env python3
import os

# pip install python-slugify
from slugify import slugify

origem = 'file.csv'    # source CSV file
destino = 'file.sql'   # SQL script to generate
arquivo = os.path.abspath(origem)

# Read only the header line; fields are separated by ';'
with open(origem, 'r') as f:
    header = f.readline().strip().split(';')

# Turn each header cell into a valid column name
head_cells = []
for cell in header:
    value = slugify(cell, separator="_")
    if value in head_cells:
        value = value + '_2'   # crude de-duplication of repeated names
    head_cells.append(value)

# Every column is created as text; adjust the types afterwards if needed
fields = ["    {} text".format(cell) for cell in head_cells]
table = origem.split('.')[0]
sql = "create table {} (\n{}\n);".format(table, ",\n".join(fields))
sql += "\nCOPY {} FROM '{}' DELIMITER ';' CSV HEADER;".format(table, arquivo)
print(sql)

with open(destino, 'w') as d:
    d.write(sql)
Run the script with
python3 importar.py
Optional: Edit the sql script to adjust the field types (all are text by default)
Run the SQL script. A short way to do it from the console:
sudo -H -u postgres bash -c "psql mydatabase < file.sql"
Automatic creation seems to be pretty easy with Python+Pandas
Install sqlalchemy library in your Python environment
pip install SQLAlchemy==1.4.31
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('postgresql://username:password@localhost:5432/mydatabase')
df=pd.read_csv('example.csv')
df.to_sql('table_name', engine)
I haven't used it, but pgLoader (https://pgloader.io/) is recommended by the pgfutter developers (see answer above) for more complicated problems. It looks very capable.
You can create a new table in DBeaver out of a CSV.
For a single table, I did it simply and quickly online, using one of the many good converters that can be found on the web.
Just google convert csv to sql online and choose one.