COPY Postgres table with Delimiter as double byte - postgresql

I want to copy a Postgres (version 11) table into a csv file with delimiter as double byte character. Please assist if this can be achieved.
I am trying this:
COPY "Tab1" TO 'C:\Folder\Tempfile.csv' with (delimiter E'অ');
Getting an error:
COPY delimiter must be a single one-byte character

You could use COPY TO PROGRAM. On a Unix system that could look like this:
COPY "Tab1" TO PROGRAM 'sed -e ''s/|/অ/g'' > /outfile.csv' (FORMAT 'csv', delimiter '|');
Choose a delimiter that does not occur in the data. On Windows, perhaps you can write a PowerShell command that translates the characters.
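Where sed is not available, the same post-processing step could be sketched in Python. This is only an illustration, not part of the original answer: the function name translate_delimiter is hypothetical, and like the sed command it assumes the placeholder delimiter | never occurs in the data (a plain replace would also rewrite a | inside a quoted field).

```python
import io


def translate_delimiter(src, dst, old="|", new="অ"):
    """Copy text from src to dst, replacing every `old` with `new`.

    Mirrors `sed -e 's/|/অ/g'`: a plain textual replace, so `old` must not
    occur inside the data itself.
    """
    for line in src:
        dst.write(line.replace(old, new))


if __name__ == "__main__":
    # Stand-in for the output of COPY ... (FORMAT csv, DELIMITER '|').
    exported = io.StringIO("1|alpha|x\n2|beta|y\n")
    result = io.StringIO()
    translate_delimiter(exported, result)
    print(result.getvalue(), end="")
```

For real files, src and dst would be file objects opened with encoding="utf-8" so that the multi-byte delimiter is written correctly.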

Related

PostgreSQL Copy To - CSV filename encoding

I have a database setup with a UTF-8 encoding. Trying to copy a table to csv, where the filename has a special character writes out the filename wrong to disk.
On a Windows 10 localhost PostgreSQL installation:
copy
(select 'tønder')
to 'C:\temp\Sønderborg.csv' (FORMAT CSV, HEADER TRUE, DELIMITER ';', ENCODING 'UTF8');
Names the csv file: SÃ¸nderborg.csv and not Sønderborg.csv.
Both
SHOW CLIENT ENCODING;
SHOW SERVER_ENCODING;
return UTF8.
How can one control the csv filename encoding? The encoding inside the csv is ok writing Tønder!
UPDATE
I have run the copy command from pgAdmin, DataGrip and a psql console. DataGrip uses JDBC and will only handle UTF8. All three applications write the csv filename in the wrong encoding. The only difference is that the psql console says the client encoding is WIN1252.
I don't think it's possible to change this behaviour. It looks like Postgres assumes that the filename encoding matches the server_encoding (as suggested in a couple of mailing-list threads). The only workaround I could find was to run the command while connected to a WIN1252-encoded database, which is probably not very helpful.
If you're trying to run this on the same machine as the server itself, then instead of using the server-side COPY, you can run psql's client-side \copy, which will respect your client_encoding when interpreting the file path:
psql -c "\copy (select 'tønder') to 'C:\temp\Sønderborg.csv' (FORMAT CSV, HEADER TRUE, DELIMITER ';', ENCODING 'UTF8')"
Note that cmd.exe (and even powershell.exe) still uses legacy DOS encodings by default, so you might need to run chcp 1252 to set the console codepage before launching psql.
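The wrong file name in the question is consistent with the UTF-8 bytes of the path being read back as WIN1252. A small Python sketch reproduces the effect outside PostgreSQL. It only illustrates the byte-level mismatch; that the server does exactly this internally is an assumption, and the function name is hypothetical.

```python
def misread_utf8_as_cp1252(name: str) -> str:
    """Encode a name as UTF-8, then decode those bytes as Windows-1252."""
    return name.encode("utf-8").decode("cp1252")


if __name__ == "__main__":
    # 'ø' is the two bytes C3 B8 in UTF-8, which WIN1252 reads as 'Ã' and '¸'.
    print(misread_utf8_as_cp1252("Sønderborg.csv"))
```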

how to pass variable to copy command in Postgresql

I tried to use a variable in an SQL statement in PostgreSQL, but it did not work.
There are many CSV files stored under one path. I want to set that path once in PostgreSQL so that the copy command knows where to find the CSV files.
SQL statement sample:
\set outpath '/home/clients/ats-dev/'
\COPY licenses (_id, name,number_seats ) FROM :outpath + 'licenses.csv' CSV HEADER DELIMITER ',';
\COPY uploaded_files (_id, added_date ) FROM :outpath + 'files.csv' CSV HEADER DELIMITER ',';
It did not work; I got the error: no such file. The two files licenses.csv and files.csv are stored under /home/clients/ats-dev on Ubuntu. I found a solution that uses "\set file 'license.csv'", but it does not work for me because I have many csv files. I also tried "from : outpath || 'licenses.csv'", which did not work either. I would appreciate any help.
Using 9.3.
It looks like psql does not support :variable substitution within psql backslash commands.
test=> \set somevar fred
test=> \copy z from :somevar
:somevar: No such file or directory
so you will need to do this via an external tool like the Unix shell, e.g.
for f in *.csv; do
# table name = file name without the .csv extension
psql -c "\\copy $(basename "$f" .csv) FROM '$f'"
done
You can try the COPY command instead:
\set outpath '\'/home/clients/ats-dev/'
COPY licenses (_id, name,number_seats ) FROM :outpath/licenses.csv' WITH CSV HEADER DELIMITER ',';
COPY uploaded_files (_id, added_date ) FROM :outpath/files.csv' WITH CSV HEADER DELIMITER ',';
Note: Files named in a COPY command are read or written directly by the server, not by the client application. Therefore, they must reside on or be accessible to the database server machine, not the client. They must be accessible to and readable or writable by the PostgreSQL user (the user ID the server runs as), not the client. Similarly, the command specified with PROGRAM is executed directly by the server, not by the client application, must be executable by the PostgreSQL user. COPY naming a file or command is only allowed to database superusers, since it allows reading or writing any file that the server has privileges to access.
Documentation: Postgresql 9.3 COPY
It may have been true when this was originally asked, that psql backslash commands didn't support variable interpolation, but in my PostgreSQL 14 instance that's no longer the case. However, the psql manpage is clear that \copy specifically does not support variable interpolation.
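Since \copy does not interpolate variables, one option is to build each \copy line outside psql and pass it in with -c, much like the shell loop earlier in this thread. A minimal Python sketch of that idea follows. The helper name build_copy_command is hypothetical; the path, table names and column names are taken from the question.

```python
import posixpath


def build_copy_command(outpath, table, columns, filename):
    """Return one \\copy line with the directory prefix already filled in."""
    path = posixpath.join(outpath, filename)
    return "\\copy {} ({}) FROM '{}' CSV HEADER DELIMITER ','".format(
        table, ", ".join(columns), path
    )


if __name__ == "__main__":
    outpath = "/home/clients/ats-dev/"
    jobs = [
        ("licenses", ["_id", "name", "number_seats"], "licenses.csv"),
        ("uploaded_files", ["_id", "added_date"], "files.csv"),
    ]
    for table, columns, filename in jobs:
        # Each printed line can be handed to psql, e.g. psql -c "<line>".
        print(build_copy_command(outpath, table, columns, filename))
```

Keeping the directory in one Python variable plays the role that :outpath was meant to play in the question.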

Character with byte sequence 0x9d in encoding 'WIN1252' has no equivalent in encoding 'UTF8'

I am reading a CSV file in my SQL script and copying its data into a PostgreSQL table. The line of code is below:
\copy participants_2013 from 'C:/Users/Acrotrend/Desktop/mip_sahil/mip/reelportdata/Participating_Individual_Extract_Report_MIPJunior_2013_160414135957.Csv' with CSV delimiter ',' quote '"' HEADER;
I am getting the following error: character with byte sequence 0x9d in encoding 'WIN1252' has no equivalent in encoding 'UTF8'.
Can anyone help me understand the cause of this issue and how I can resolve it?
The problem is that 0x9D is not a valid byte value in WIN1252.
There's a table here: https://en.wikipedia.org/wiki/Windows-1252
The problem may be that you are importing a UTF-8 file and PostgreSQL is defaulting to Windows-1252 (which I believe is the default on many Windows systems).
You need to change the character set on your Windows command line with chcp before running the script. Or, in PostgreSQL, you can run the following before importing the file:
SET CLIENT_ENCODING TO 'utf8';
Simply specify 'UTF8' as the encoding in the \copy command, e.g. (I broke it into two lines for readability, but keep it all on one line):
\copy dest_table from 'C:/src-data.csv'
(format csv, header true, delimiter ',', encoding 'UTF8');
More details:
The problem is that the client encoding is set to WIN1252, most likely because psql is running on a Windows machine, but the file contains a UTF-8 character.
You can check the Client Encoding with
SHOW client_encoding;
client_encoding
-----------------
WIN1252
Every encoding has a defined range of valid byte values. Are you sure that your data is in WIN1252 encoding?
Postgres is very strict and will not import files with broken encoding. You can use iconv, which can work in a tolerant mode (the -c option) and drop the broken characters. After cleaning the file with iconv, you can import it.
I had this problem today and it was because inside of a TEXT column I had fancy quotes that had been copy/pasted from an external source.
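The 0x9d byte and the fancy-quotes observation fit together: the UTF-8 encoding of a right curly quote (U+201D) is E2 80 9D, and 0x9D is one of the bytes WIN1252 leaves undefined. A Python sketch can locate such bytes before an import. The function name is hypothetical, and the sketch assumes Python's cp1252 codec treats the same bytes as undefined as PostgreSQL's WIN1252 does.

```python
def undefined_cp1252_offsets(data: bytes) -> list:
    """Return offsets of bytes that have no character in Windows-1252."""
    offsets = []
    for i, byte in enumerate(data):
        try:
            bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            offsets.append(i)
    return offsets


if __name__ == "__main__":
    # UTF-8 text containing curly quotes, as pasted from an external source.
    sample = "say \u201chello\u201d".encode("utf-8")
    print(undefined_cp1252_offsets(sample))
```

A non-empty result suggests the file is not really WIN1252, in which case declaring the file as UTF8 is the more likely fix than cleaning the data.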

How to skip first several lines when importing CSV from Postgresql?

I was trying to import a CSV file into PostgreSQL, skipping the first 8 lines and starting from the 9th line. My code below reads from the second line and treats the first line as a header:
create table report(
id integer,
name character(3),
orders integer,
shipments float
);
COPY report
FROM 'C:\Users\sample.csv' DELIMITER ',' CSV HEADER;
How can I change this code to start reading from the 9th line?
Thank you!
With PostgreSQL 9.3 or newer, COPY can refer to a program to preprocess the data, for instance Unix tail.
To start an import at line 9:
COPY report FROM PROGRAM 'tail -n +9 /path/to/file.csv' delimiter ',' csv;
Apparently you're using Windows, so tail might not be immediately available. Personally I would install it from MSYS; otherwise there are alternatives mentioned in "Looking for a windows equivalent of the unix tail command" or "Windows equivalent of the 'tail' command".
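If installing tail is not an option, the preprocessing step can be sketched in Python instead. This is an illustrative stand-in for tail -n +9, not something from the original answer; skip_lines is a hypothetical helper.

```python
import io
from itertools import islice


def skip_lines(src, dst, start_line=9):
    """Copy src to dst beginning at 1-based line `start_line`, like `tail -n +9`."""
    for line in islice(src, start_line - 1, None):
        dst.write(line)


if __name__ == "__main__":
    # Eight preamble lines followed by two data rows.
    raw = "".join("preamble {}\n".format(i) for i in range(1, 9))
    raw += "1,abc,10,2.5\n2,def,20,3.5\n"
    out = io.StringIO()
    skip_lines(io.StringIO(raw), out)
    print(out.getvalue(), end="")
```

The trimmed output can be written to a new file and loaded with a plain COPY report FROM ... CSV, without the HEADER option if line 9 is already data.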

Postgresql copying data into a table

I am using the copy command in PostgreSQL. I have a line of data in a text file that is tab-separated, and I would like to copy it into the db table.
I get an error saying:
ERROR: invalid byte sequence for encoding "UTF8": 0x00
SQL state: 22021
Context: COPY real_acct1, line 113038
So I went to line 113038 of the text file and copied it, along with 4 or 5 neighboring lines, into a new text file and, lo and behold, that new data went in.
Any helpful thoughts? This is parcel data attributes info.
Your problem is actually one of character encoding.
The easiest way to deal with this is running your import data through iconv (assuming you're on a unix machine).
iconv -f original_charset -t utf-8 originalfile > newfile
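Where iconv is not available, the same conversion can be sketched in Python. The source encoding is something you have to know or find out; utf-16-le below is only an assumed example, chosen because ASCII text stored as UTF-16 contains 0x00 bytes, which would match the reported error. The function name to_utf8 is hypothetical.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from source_encoding to UTF-8, like iconv -f ... -t utf-8."""
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Assumed example: tab-separated text saved as UTF-16 (little endian).
    original = "12\tMain St\n".encode("utf-16-le")
    print(b"\x00" in original)  # whether the raw bytes contain NUL bytes
    print(to_utf8(original, "utf-16-le"))
```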