Incorrect characters after restore with Docker Postgres database - postgresql

I make a backup with the command
docker exec -t arcane-aio pg_dump arcane-aio -c -U admin > arcane_aio_db.sql
I restore the backup with the command
cat arcane_aio_db.sql | docker exec -i arcane-aio psql -U admin -d arcane-aio
All is good, but all Russian symbols are replaced by "?".
The string value before the restore is Привет, hi!.
The string value after the restore of the backup is ??????, hi!.
I checked the encoding of the backup, of the database before the backup, and of the database after the restore, and they are all the same (en_US.utf8). Could it be that this encoding doesn't support the Russian language?
We are using Windows.
After a change of the system encoding from Cyrillic to UTF-8,
the values in the data dump become correct.
But after the restore, we still see "?" instead of Russian symbols in the database.

The cat command uses your shell's character encoding.
Did you try running simply the first part:
cat arcane_aio_db.sql
I bet it also shows the ???.
You need to set the charset to the same encoding on both sides. You probably have UTF-8 on one side and a Russian (Cyrillic) encoding on the other.
The redirection that writes to the file is binary and doesn't care about the encoding, but cat does.
You can check your encoding with
echo $LANG
Make sure it is UTF-8 on both sides; that should fix your issue.
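One way to narrow down where the characters are lost is to look at the bytes in the dump file rather than at what the terminal displays. The following is a minimal sketch, assuming a POSIX shell with tr and wc; the helper name count_non_ascii and the two sample files are made up for illustration. Cyrillic text in UTF-8 always uses bytes above 0x7F, so a dump that should contain Russian text but has no such bytes was already damaged when it was written.

```shell
# Hypothetical helper: count the bytes >= 0x80 in a file.
count_non_ascii() {
  LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}

# Sample files standing in for a dump; the octal escapes spell "Привет" in UTF-8.
printf '\320\237\321\200\320\270\320\262\320\265\321\202, hi!\n' > good_sample.sql
printf '??????, hi!\n' > bad_sample.sql

good_count=$(count_non_ascii good_sample.sql)
bad_count=$(count_non_ascii bad_sample.sql)
echo "good_sample.sql: $good_count non-ASCII bytes"
echo "bad_sample.sql:  $bad_count non-ASCII bytes"

rm -f good_sample.sql bad_sample.sql
```

For these samples the first count is 12 (six Cyrillic letters, two bytes each) and the second is 0. Running count_non_ascii on the real arcane_aio_db.sql shows whether the question marks are already in the file or only appear during the restore.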
EDIT:
A workaround is to do the backup and restore within the container:
# get into the container
docker exec -it arcane-aio /bin/bash
# in the container run:
pg_dump arcane-aio -c -U admin > arcane_aio_db.sql
# try restore:
cat arcane_aio_db.sql | psql -U admin -d arcane-aio
If that works, then it's an encoding issue between your Docker container and your local machine.
You can do the dump and restore within the container and copy the file in and out with docker cp.
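The in-container dump and restore combined with docker cp might look like the sketch below. It reuses the container, database and user names from the question; the /tmp path inside the container and the two function names are assumptions for illustration. Because pg_dump -f and psql -f write and read the file themselves, no host-side redirection or pipe touches the data.

```shell
CONTAINER=arcane-aio
DB=arcane-aio
DB_USER=admin
REMOTE_FILE=/tmp/arcane_aio_db.sql   # assumed scratch path inside the container

# Dump inside the container, then copy the file out to the host path given as $1.
backup_via_cp() {
  docker exec "$CONTAINER" pg_dump -U "$DB_USER" -c -f "$REMOTE_FILE" "$DB" &&
  docker cp "${CONTAINER}:${REMOTE_FILE}" "$1"
}

# Copy the host file given as $1 into the container, then restore it there.
restore_via_cp() {
  docker cp "$1" "${CONTAINER}:${REMOTE_FILE}" &&
  docker exec "$CONTAINER" psql -U "$DB_USER" -d "$DB" -f "$REMOTE_FILE"
}

# Usage (needs a running container):
#   backup_via_cp arcane_aio_db.sql
#   restore_via_cp arcane_aio_db.sql
```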
On second thought, the SQL you cat may contain quotes, $, # or other characters that cause problems when they pass through a shell or a TTY.
So you may want to try this instead: feed the file to psql through plain input redirection, using -i (keep stdin open) rather than -t (allocate a TTY), so that nothing in between interprets the contents:
docker exec -i arcane-aio psql -U admin -d arcane-aio < arcane_aio_db.sql

Since a pg_dump includes instructions to set the client_encoding correctly, the data in your target database will be correct. That is, unless the database encoding is SQL_ASCII, in which case you are lost anyway if you need Cyrillic characters.
The problem must be with your client software or your terminal encoding.
To ascertain that the characters are correct in the database, connect with psql and cast the string to bytea so that you can see the bytes:
SELECT charcol, CAST(charcol AS bytea)
FROM tab WHERE ...;
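To know what to look for in the bytea column, it can help to print the expected bytes on the client side. This is a small sketch, assuming a POSIX shell with od; the to_hex helper is hypothetical and the octal escapes spell the question's string Привет in UTF-8. psql shows bytea in hex by default (a value prefixed with \x), so correctly stored text should match the first value, while text that was really stored as question marks consists only of 3f bytes.

```shell
# Print the bytes read from stdin as one lowercase hex string.
to_hex() {
  od -An -tx1 | tr -d ' \n'
}

utf8_hex=$(printf '\320\237\321\200\320\270\320\262\320\265\321\202' | to_hex)
lost_hex=$(printf '??????' | to_hex)

echo "UTF-8 bytes of the Cyrillic word: $utf8_hex"   # d09fd180d0b8d0b2d0b5d182
echo "bytes of six question marks:      $lost_hex"   # 3f3f3f3f3f3f
```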

Related

pg_dump on Windows puts extraneous characters in the dump file?

Using PostgreSQL v13.x, Windows 2019 Server
I'm using the following command either from a Powershell or command window
> pg_dump -h 127.0.0.1 -U postgres -W --format custom --file my_db_dump.sql my_db
(p/w prompted and entered)
The dump is created successfully. However it's full of extraneous non-printable control-sequence characters, e.g., ^A^N^#^X^#^#^#...., you name it. Post processing, e.g., dos2unix, :set ff=unix, :%!col -xb doesn't eliminate the characters. Is there a switch in pg_dump to control this? I didn't see it in the pg_dump documentation.
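You have specified the custom format (--format custom), which is not plain SQL. It is a structured, compressed archive that is meant to be restored with pg_restore, so the non-printable bytes are expected. If you want plain SQL, that is also available as a format: use --format plain, which is the default.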
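A quick way to tell the two formats apart is to look at the first bytes of the file: custom-format archives start with the signature PGDMP, while a plain dump is readable SQL text. Here is a minimal sketch, assuming a POSIX shell with dd; the function name and the sample files are hypothetical.

```shell
# Report whether a dump file looks like a custom-format archive or plain text.
dump_kind() {
  magic=$(dd if="$1" bs=5 count=1 2>/dev/null)
  if [ "$magic" = "PGDMP" ]; then
    echo custom
  else
    echo plain
  fi
}

# Sample files standing in for real dumps.
printf 'PGDMP\001\016' > sample_custom.dump
printf '%s\n' '--' '-- PostgreSQL database dump' '--' > sample_plain.sql

custom_kind=$(dump_kind sample_custom.dump)
plain_kind=$(dump_kind sample_plain.sql)
echo "sample_custom.dump: $custom_kind"
echo "sample_plain.sql:   $plain_kind"

rm -f sample_custom.dump sample_plain.sql
```

A file reported as custom is restored with pg_restore rather than psql, for example pg_restore -h 127.0.0.1 -U postgres -d my_db my_db_dump.sql.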

Postgresql - PSQL encoding issue

I am trying to set the client encoding for a session. I am running the following commands in a Linux terminal (PostgreSQL is installed on a remote server).
psql -h localhost -p 5432 -U user -d dbase -c "SET client_encoding to 'LATIN1';"
Output: SET
psql -h localhost -p 5432 -U user -d dbase -c "show client_encoding"
Output: UTF8
Why is this happening? Do I need to run this command as a superuser? (I don't think so)
If I run the following commands in pgAdmin 4, it correctly shows LATIN1 as the output.
SET client_encoding to 'LATIN1'
show client_encoding
The server encoding is set to UTF8.
Setting the client encoding with SET only lasts for the current session, and each psql -c invocation opens a new session, so the SET from your first command is gone by the time the second command runs. To make the setting apply to every connection, set the environment variable PGCLIENTENCODING.
From the PostgreSQL documentation: If the environment variable PGCLIENTENCODING is defined in the client's environment, that client encoding is automatically selected when a connection to the server is made.
So, if you want the client encoding to persist, try setting this environment variable.
Or, if you want to execute multiple queries in just one session, have a look at the -f parameter, which makes psql read its commands from a file.
Ex: psql -d myDataBase -f myFile
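Two ways to keep the SET and the SHOW in the same session are sketched below, using the connection parameters from the question wrapped in two hypothetical functions. The first is an alternative to the -f file: psql accepts several -c options in one invocation (supported since PostgreSQL 9.6) and runs them in the same session. The second sets PGCLIENTENCODING for a single command only.

```shell
# Run SET and SHOW in one psql session by repeating -c.
show_encoding_same_session() {
  psql -h localhost -p 5432 -U user -d dbase \
    -c "SET client_encoding TO 'LATIN1'" \
    -c "SHOW client_encoding"
}

# Let libpq pick up the client encoding from the environment for this command only.
show_encoding_from_env() {
  PGCLIENTENCODING=LATIN1 psql -h localhost -p 5432 -U user -d dbase \
    -c "SHOW client_encoding"
}

# Usage (needs a reachable server); both calls should report LATIN1:
#   show_encoding_same_session
#   show_encoding_from_env
```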

Loading PostgreSQL Database Backup Into Docker/Initial Docker Data

I am migrating an application into Docker. One of the issues I am running into is finding the correct way to load the initial data into PostgreSQL running in Docker. My typical methods of restoring a database backup file are not working. I have tried the following:
gunzip -c mydbbackup.sql.gz | psql -h <docker_host> -p <docker_port> -U <dbuser> -d <db> -W
That does not work, because PostgreSQL is prompting for a password, and I cannot enter a password because psql is reading its data from STDIN. I cannot use the $PGPASSWORD environment variable, because any environment variable I set on my host is not set in my container.
I also tried a command similar to the one above, but with the -f flag and the path to a SQL backup file. This does not work because the file is not in my container. I could copy the file into my container with the ADD statement in my Dockerfile, but this does not seem right.
So, I ask the community: what is the preferred method for loading PostgreSQL database backups into Docker containers?
I cannot use the $PGPASSWORD environment variable, because any
environment variable I set on my host is not set in my container.
I don't use Docker, but in the command shown your container looks like a remote host, with psql running locally. So PGPASSWORD never has to be set on the remote host, only locally.
If the problem boils down to adding a password to this command:
gunzip -c mydbbackup.sql.gz |
psql -h <docker_host> -p <docker_port> -U <dbuser> -d <db> -W
you can supply it in several ways (in all cases, don't use the -W option to psql).
Hardcoded in the invocation:
gunzip -c mydbbackup.sql.gz |
PGPASSWORD=something psql -h <docker_host> -p <docker_port> -U <dbuser> -d <db>
Typed on the keyboard:
echo -n "Enter password:"
read -s PGPASSWORD
export PGPASSWORD
gunzip -c mydbbackup.sql.gz |
psql -h <docker_host> -p <docker_port> -U <dbuser> -d <db>
Note about the -W or --password option to psql.
The point of this option is to ask for a password to be typed first thing, even if the context makes it unnecessary.
It's frequently misunderstood as the equivalent of the -p option of mysql. This is a mistake: while -p is required on password-protected connections, -W is never required and actually gets in the way when scripting.
-W, --password
Force psql to prompt for a password before connecting to a
database.
This option is never essential, since psql will automatically
prompt for a password if the server demands password
authentication. However, psql will waste a connection attempt
finding out that the server wants a password. In some cases it is
worth typing -W to avoid the extra connection attempt.

How to batch load data with psql with different encodings?

I have a UTF8 database and a UTF8 script to fill tables with data. I want to run this script with psql -d instance -U user -f fillTables.sql. However, as my system uses the Windows CP1252 encoding, it looks like psql uses that to parse the file. I found this documentation and saw these backslash commands, but couldn't get them working:
psql \encoding UTF8 -d instance -U user -f fillTables.sql
It looks like these are meant for starting psql and entering commands inside the psql console, right? How can I set different encoding for a batch processing of different files?
I got it working with export PGCLIENTENCODING=UTF8 (in Cygwin; a native Windows shell uses a different syntax), but would accept other answers if they can achieve the same with an option of psql.
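For files that are not all in the same encoding, the environment variable can be set per psql invocation instead of being exported once. This is a sketch assuming a POSIX shell such as Cygwin's; the load_file helper, the second file name and its encoding are made up for illustration.

```shell
# Load one SQL file, telling the server which encoding the file is written in.
load_file() {
  # $1 = client encoding of the file, $2 = path to the SQL file
  PGCLIENTENCODING="$1" psql -d instance -U user -f "$2"
}

# Usage (needs a reachable database):
#   load_file UTF8 fillTables.sql
#   load_file WIN1252 legacyTables.sql
```

In cmd.exe the variable is set on its own line with set PGCLIENTENCODING=UTF8 before psql is called.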

pg_dump: too many command line arguments

what is wrong with this command:
pg_dump -U postgres -W admin --disable-triggers -a -t employees -f D:\ddd.txt postgres
This is giving error of too many command-line arguments
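Looks like it's the -W option. That option does not take a value, so admin is read as a positional argument (the database name) and the postgres at the end becomes one argument too many.
-W, --password force password prompt (should happen automatically)
If you want to run the command without typing a password, use a .pgpass file.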
http://www.postgresql.org/docs/9.1/static/libpq-pgpass.html
For posterity, note that pg_dump and pg_restore (and many other commands) cannot process long hyphens that word processors create. If you are cut-pasting command lines from a word processor, be sure it hasn't converted your hyphens to something else in editing. Else you will get command lines that look correct but hopelessly confuse the argument parsers in these tools.
pg_dump and pg_restore prompt for the password themselves; if you put it on the command line, they always give the "too many command-line arguments" error. In a Windows command prompt or batch file, you can set the corresponding environment variable instead:
SET PGPASSWORD=<password>
so that you are not asked to enter the password manually. The tools read it from that environment variable.
Instead of passing the password with the -W flag, set a temporary environment variable for the command:
PGPASSWORD="mypass" pg_dump -U postgres --disable-triggers -a -t employees -f D:\ddd.txt postgres
-W will prompt for a password.
To take a full database dump, use something like:
pg_dump -h 192.168.44.200 -p 5432 -U postgres -W -c -C -Fc -f C:\MMM\backup10_3.backup DATABASE_NAME
I got this error after copy-pasting a command in which one of the dashes was different.
Was: –-host= (the first dash is a "long" dash)
Correcting it to --host= solved it.
Another option is to add ~/.pgpass file with content like this:
hostname:port:database:username:password
read more here
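Creating such a file could look like the sketch below. The function name, the scratch file name and the sample values are placeholders. On Unix-like systems the file must not be readable by group or others (mode 0600), otherwise libpq ignores it; the PGPASSFILE environment variable can point libpq at a file in a non-default location.

```shell
# Append one entry to a password file and restrict its permissions.
add_pgpass_entry() {
  # $1 = file, $2 = host, $3 = port, $4 = database, $5 = user, $6 = password
  printf '%s:%s:%s:%s:%s\n' "$2" "$3" "$4" "$5" "$6" >> "$1"
  chmod 600 "$1"
}

# Example with placeholder values, written to a scratch file.
pgpass_file=./pgpass.sample
rm -f "$pgpass_file"
add_pgpass_entry "$pgpass_file" localhost 5432 postgres postgres 'example-password'
cat "$pgpass_file"
```

On Windows the default location is %APPDATA%\postgresql\pgpass.conf and the permission check does not apply.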
Additionally, if you don't want a password prompt, you can put the credentials directly in a connection string:
pg_dump 'postgresql://<username>:<password>@localhost:5432/<dbname>'
So, combined with the options from the original question:
pg_dump 'postgresql://postgres:<password>@localhost:5432/postgres' --table='"employees"' --format='t' --file='D:\ddd.txt' --data-only --disable-triggers
(Don't forget the quotes around the table name if it contains upper-case letters.)
reference:
https://www.postgresql.org/docs/current/app-pgdump.html
Postgres dump specific table with a capital letter
2021-11-30, pg v12, windows 10
pg_dump -U postgres -W -F t postgres > C:\myfolder\pg.tar
-U "postgres" as username,
-W to prompt for the password,
-F t means format is .tar,
> C:\myfolder\pg.tar is the destination path and filename