Copying data from local .CSV file to pgsql table in remote server - postgresql

I am trying to copy data from a CSV file on my local machine into a remote pgsql table named states, but I am getting ERROR: syntax error at or near "FROM". Can someone guide me as to why I am receiving this error?
COPY FROM STDIN states FROM '/Users/Shared/data.csv' DELIMITER AS ',';

The problem is that the path to the file is interpreted on the remote server, not on your local machine.
You need to use psql and pipe the file to STDIN:
psql -h host -d remoteDB -U myuser -c "copy states from STDIN with delimiter as ',';" < /path/file.csv
Alternatively you can also do:
cat /path/file.csv | psql -h host -d remoteDB -U myuser -c "copy states from STDIN with delimiter as ',';"
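Note that delimiter as ',' alone will not handle quoted fields; if the file is a standard CSV (an assumption about your data), the csv format option is safer:
psql -h host -d remoteDB -U myuser -c "copy states from STDIN with (format csv)" < /path/file.csv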

You can't do it directly with a filename unless the file is on the Postgres server. The docs of COPY state:
COPY with a file name instructs the PostgreSQL server to directly read from or write to a file. The file must be accessible to the server and the name must be specified from the viewpoint of the server. When STDIN or STDOUT is specified, data is transmitted via the connection between the client and the server.
You'll have to pipe the file in via STDIN.
Many Postgres drivers provide a method to make this easier. For example, ruby-pg provides copy_data.
conn.copy_data "COPY states FROM STDIN WITH (FORMAT csv)" do
  # Stream the local file to the server one line at a time.
  File.foreach('/Users/Shared/data.csv') do |line|
    conn.put_copy_data(line)
  end
end
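Here conn is an already-open PG::Connection; for example (host and credentials are placeholders):
conn = PG.connect(host: 'host', dbname: 'remoteDB', user: 'myuser')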

Related

Postgres Copy command with Log redirection

I am using the Postgres COPY utility to load data into a Postgres table from a CSV file. Currently I am using the command below:
psql -h 127.0.0.1 -d target -U postgres -c "\copy TableName FROM 'E:\Dev\XXX_1_0.csv' delimiter '^'" -o E:/Dev/XXX.log
When there is an issue in the data, the error information is not written to the log file, whereas when there is no error the log file is updated with the loaded row count, for example (COPY 25).
When I execute the above command from the command prompt, the error below is reported:
ERROR: value too long for type character varying(255)
CONTEXT: COPY TableName, line 2, column Name: "NickName..."
Let me know how to get the error information, or how to redirect the errors to a log file for reference.
I don't know of a way to redirect the error output directly in psql. You can get your shell to do it for you.
The following combines both stdout and stderr into one file named "log", and it works both in bash and in Windows CMD:
psql -c "whatever" > log 2>&1

PostgreSQL COPY pipe output to gzip and then to STDOUT

The following command works well
$ psql -c "copy (select * from foo limit 3) to stdout csv header"
# output
column1,column2
val1,val2
val3,val4
val5,val6
However the following does not:
$ psql -c "copy (select * from foo limit 3) to program 'gzip -f --stdout' csv header"
# output
COPY 3
Why do I have COPY 3 as the output from this command? I would expect that the output would be the compressed CSV string, after passing through gzip.
The command below works, for instance:
$ psql -c "copy (select * from foo limit 3) to stdout csv header" | gzip -f -c
# output (this garbage is just the compressed string and is as expected)
߉T`M�A �0 ᆬ}6�BL�I+�^E�gv�ijAp���qH�1����� FfВ�,Д���}������+��
How to make a single SQL command that directly pipes the result into gzip and sends the compressed string to STDOUT?
When you use COPY ... TO PROGRAM, the PostgreSQL server process (backend) starts a new process and pipes the file to the process's standard input. The standard output of that process is lost. It only makes sense to use COPY ... TO PROGRAM if the called program writes the data to a file or similar.
If your goal is to compress the data that go across the network, you could use sslmode=require sslcompression=on in your connect string to use the SSL network compression feature built into PostgreSQL 9.2. Unfortunately this has been deprecated and most OpenSSL binaries are shipped with the feature disabled.
There is currently a native network compression patch under development, but it is questionable whether that will make v14.
Other than that, you cannot get what you want at the moment.
copy is running gzip on the server and not forwarding the STDOUT from gzip on to the client.
You can use \copy instead, which would run gzip on the client:
psql -q -c "\copy (select * from foo limit 3) to program 'gzip -f --stdout' csv header"
This is fundamentally the same as piping to gzip, which you show in your question.
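If the goal is a compressed file on the client rather than raw bytes on the terminal, redirect psql's output as usual (gzip inherits psql's stdout; foo.csv.gz is just an illustrative name):
psql -q -c "\copy (select * from foo limit 3) to program 'gzip -f --stdout' csv header" > foo.csv.gz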
If the goal is to compress the output of copy so it transfers faster over the network, then...
psql "postgresql://ip:port/dbname?sslmode=require&sslcompression=1"
It should display "compression active" if it's enabled. That probably requires some server config variable to be enabled though.
Or you can simply use ssh:
ssh user@dbserver "psql -c \"copy (select * from foo limit 3) to stdout csv header\" | gzip -f -c" >localfile.csv.gz
But... of course, you need ssh access to the db server.
If you don't have ssh to the db server, maybe you have ssh to another box in the same datacenter that has a fast network link to the db server, in that case you can ssh to it instead of the db server. Data will be transferred uncompressed between that box and the database, compressed on the box, and piped via ssh to your local machine. That will even save cpu on the database server since it won't be doing the compression.
If that doesn't work, well then, why not put the ssh command into the "to program" and have the server send it via ssh to your machine? You'll have to set up your router and open a port, but you can do that. Of course you'll have to find a way to put the password in the ssh command line, which is usually a big no-no, but maybe just for once. Or just use netcat instead, which doesn't require a password.
Also, if you want speed, please, use zstd instead of gzip.
Here's an example with netcat. I just tested it and it worked.
On destination machine which is 192.168.0.1:
nc -lp 65001 | zstd -d >file.csv
In another terminal:
psql -c "copy (select * from foo) to program 'zstd -9 |nc -N 192.168.0.1 65001' csv header" test
Note -N option for netcat.
You can use COPY ... TO PROGRAM (note that the output file ends up on the database server, not on the client):
COPY foo_table TO PROGRAM 'gzip > /tmp/foo_table.csv' DELIMITER ',' CSV HEADER;

How to export all tables in a PostgreSQL database to csv files?

I went into psql command-line mode, connected to the correct database, and I can list all the tables.
Now, I tried the following commands:
copy some_table_name1 to '/var/lib/pgsql/csv_exports/some_table_name1.csv' csv header
copy some_table_name2 to '/var/lib/pgsql/csv_exports/some_table_name2.csv' csv header
And so on...
There were no error messages or anything after the commands, and I used tab completion to ensure that I was always referring to the correct table names.
After doing this to all the tables I went to the directory and there were no files at all.
Am I doing something wrong?
EDIT: I should clarify that I was looking at that directory on the server machine (using PuTTY and WinSCP), the same machine where I ran the psql commands.
The files are written to that directory on the server machine, not the client.
Use COPY ... TO STDOUT to send the data to the client.
Using psql from the console, you can use the following command to get your data onto the client machine:
$ psql yourdb -c "COPY yourtable TO STDOUT DELIMITER ',' CSV HEADER" > output.csv
If you're wondering about how to do the other way around (import), take a look at this question.
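If you want every table without repeating the command by hand, a small shell loop over pg_tables can drive the same COPY ... TO STDOUT form (a sketch, assuming the tables are in the public schema of a database called yourdb):
for t in $(psql yourdb -At -c "select tablename from pg_tables where schemaname = 'public'"); do
    psql yourdb -c "copy public.\"$t\" to stdout with csv header" > "$t.csv"
done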
Assuming that what you really want to do is to output the file to someplace on your local machine (ie your developer workstation), I suggest that you use the "\copy" command, instead of the "COPY" command.
psql -c "\copy (SELECT * FROM account) to '/tmp/account.csv' with csv;"
or
psql -c "\copy account TO '/tmp/account.csv' DELIMITER ',' CSV HEADER;
Otherwise, unless you explicitly redirect the command to stdout, as @LaurenzAlbe suggested, the "COPY" command will check whether you are authorized to write files on the actual database server. Normally, this is not the behavior that you want, and it requires access and permissions greater than most developers have.

Output to CSV in postgres with double-quotes

Trying to dump the output of a query into a CSV file in an automated job and running into an issue with fields where the column contains my comma delimiter. With the nature of this particular network, I have to jump through a couple of hoops to get things done, and there's a good chance I'm missing something very obvious.
In a nutshell, I kick off my script from a client machine that uses PLINK to run a remote psql command on another box over an SSH connection. That psql command is hitting a Postgres server on a third machine (I can't connect directly from client to DB, hence the extra step in between).
If I manually SSH from client to server 1, connect to the Postgres box, and use \copy... with CSV header, the file that's created is perfect, and any fields that contain a comma are automatically surrounded by double quotes.
However, if I try to issue that \copy (or copy) command as a single command, the output doesn't contain those double quotes, so I end up in the situation where commas in a field are interpreted as a delimiter later on.
In other words, this has the necessary double-quotes:
SSH from client to server1.
psql -Uuser -h server2 database
\copy (select ...) to '~/myfile.csv' with CSV header;
But doesn't:
SSH from client to server1
psql -Uuser -h server2 database -c "\copy (select ...) to '~/myfile.csv' with CSV header;"
Using FORCE_QUOTE
Here is how to do it:
psql -U user -h server2 database -c "\copy (select ...) to '~/myfile.csv' WITH (FORMAT CSV, HEADER TRUE, FORCE_QUOTE *);"
COPY command documentation

How can I use DBI to execute a "\copy from remote table" command in Postgres?

I need to copy from a remote PostgreSQL server to a local one. I cannot use any ETL tools, it must be done using Perl with DBI. This data will be large, so I don't want to use "select from source" and "insert into local". I was looking to use COPY to create a file, but this file will be created on the remote server. I can't do that either. I want to use \COPY instead.
How can I use DBI to execute a "\copy from remote table" command and create a local file using DBI in Perl?
You can do it in perl with DBD::Pg, details can be found here:
https://metacpan.org/pod/DBD::Pg#COPY-support
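A minimal sketch of the COPY TO STDOUT side with DBD::Pg (connection parameters, table name, and output path are placeholders):
use strict;
use warnings;
use DBI;
# Connect to the remote database (placeholder credentials).
my $dbh = DBI->connect('dbi:Pg:dbname=remotedb;host=remote.host', 'remoteuser', 'remotepwd',
                       { RaiseError => 1, AutoCommit => 1 });
# Start the COPY and pull each row over the connection into a local file.
$dbh->do("COPY remote_table TO STDOUT WITH DELIMITER '|'");
open my $fh, '>', '/local/copied_from_remote.txt' or die "cannot open output: $!";
my $row;
while ($dbh->pg_getcopydata($row) >= 0) {
    print {$fh} $row;
}
close $fh;
$dbh->disconnect;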
You definitely want to use the "copy from" and "copy to" commands to get the data in and out of the databases efficiently. They are orders of magnitude faster than iterating over rows of data. You may also want to turn off the indexes while you're copying data into the target table, then enable them (and let them build) when the copy is complete.
Assuming you are simply connecting to the listener ports of the two databases, simply open a connection to the source database, copy the table(s) to a file, open a connection to the destination database and copy the file back to the target table.
Hmm. \copy to ... is a psql directive, not SQL, so it won't be understood by DBI or by the PostgreSQL server at the other end.
I see that the PostgreSQL's SQL COPY command has FROM STDIN and TO STDOUT options -- but I doubt that DBI has a way to perform the "raw reads" necessary to access the result data. (I'm sure TO STDOUT is how psql internally implements \copy to ....)
So: In your case, I would mount a folder on your source box back to your target box using e.g. samba or nfs, and use plain old COPY TO '/full/path/to/mounted/folder/data.txt' ....
I got it to work using \copy (select * from remote_table) to '/local/file.txt' ... then \copy local_table from '/local/file.txt' to load the file into the local db. I executed the \copy command from a psql script.
Here's my script
export PGUSER=remoteuser
export PGPASSWORD=remotepwd
/opt/PostgreSQL/8.3/bin/psql -h xx.xx.xx -p 5432 -d remotedb -c "\COPY (select * from remote_table where date(reccreationtime) = date((current_date - interval '4 day'))) TO '/local/copied_from_remote.txt' DELIMITER '|'"
export PGUSER=localuser
export PGPASSWORD=localpwd
/opt/PostgreSQL/8.3/bin/psql -h xx.xx.xx.xx -p 5432 -d localdb -c "\COPY local_table FROM '/local/copied_from_remote.txt' DELIMITER '|'"
You could use ~/.pgpass and save yourself the export PGUSER stuff, and keep the password out of the environment... (always a good idea from a security perspective)
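For reference, ~/.pgpass takes one hostname:port:database:username:password line per server and must be readable only by you (chmod 600); for the script above it would contain something like:
xx.xx.xx:5432:remotedb:remoteuser:remotepwd
xx.xx.xx.xx:5432:localdb:localuser:localpwd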