pg_upgrade tool failed: invalid "unknown" user columns - postgresql

A PostgreSQL upgrade from 9.6 to 10.4 (on Fedora 28) has me stuck: one table in one database has a column of data type "unknown". I would gladly remove the column, but since I cannot get the postgresql service to start (because "An old version of the database format was found"), I have no access to the database. In more detail:
postgresql-setup --upgrade fails.
/var/lib/pgsql/upgrade_postgresql.log attributes this failure to a column with data type "unknown": "...Checking for invalid 'unknown' user columns: fatal .... check tables_using_unknown.txt". And "tables_using_unknown.txt" specifies one column in one table that I wish I could drop, but can't, because I can't get the server to start:
systemctl start postgresql.service fails, and
systemctl status postgresql.service complains about the "old version of the database"
I have found no obvious way to install postgresql 9.6 on Fedora 28.
Is there a way to drop the column without a running server? Or at least produce a dump of the database? Or can I force the upgrade tool to drop columns with data type "unknown"? Or is there any other obvious solution that I'm missing?

Here's what finally worked for me:
I used a docker container (on the same machine) with postgres 9.6 to access the "old" database directory,
converted the problematic column from "unknown" to "text" in the container,
dumped the relevant database to a file on the container's host, and then
loaded the dumped db into the postgres 10.4 environment.
Not pretty, but worked. In more detail:
I copied postgresql's data directory (/var/lib/pgsql/data/ in Fedora) -- containing the database that could not be converted -- to a new, empty directory /home/hj/pg-problem/.
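The copy itself can be a plain recursive cp of the stopped cluster's data directory; a minimal sketch, using the paths above (you may also need to adjust ownership so the container's postgres user can read the files):
root@host: mkdir -p /home/hj/pg-problem
root@host: cp -a /var/lib/pgsql/data /home/hj/pg-problem/data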
I created a Dockerfile (text file) called "Docker-pg-problem" reading:
FROM postgres:9.6
# my databases need German locale;
# if you just need en_US, comment the next two lines out.
RUN localedef -i de_DE -c -f UTF-8 -A /usr/share/locale/locale.alias de_DE.UTF-8
ENV LANG de_DE.utf8
and saved it as the only file in the new, empty folder /home/hj/pg-problem/docker/.
I started the docker daemon and ran a container that uses the data from my copy of the problematic data (in /home/hj/pg-problem/data/) as data directory for the postgres 9.6 server in the container. (NB: the "docker build" command in line three needs a working internet connection, takes a while, and should finish saying "Successfully built").
root@host: cd /home/hj/pg-problem/docker
root@host: service docker start
root@host: docker build -t hj/failed-update -f Docker-pg-problem .
root@host: docker run -it --rm -p 5431:5432 -v /home/hj/pg-problem/data:/var/lib/postgresql/data:z --name failed-update -e POSTGRES_PASSWORD=secret hj/failed-update
Then, I opened a terminal in the container to fix the database:
hj@host: docker exec -it failed-update bash
Inside the container, I fixed and dumped the database:
root@container: su postgres
postgres@container: psql <DB-name>
postgres@container: alter table <Table-name> alter column <Col-Name> type text;
postgres@container: \q
postgres@container: pg_dump <DB-name> > /var/lib/postgresql/data/dbREPAIRED.sql
I dumped the db right into the data directory so I could easily access the dumped file from the docker host.
On the docker host, the dumped database was, obviously, in /home/hj/pg-problem/data/dbREPAIRED.sql, and from there I could load it into postgresql 10:
postgres@host: createdb <DB-name>
postgres@host: psql <DB-name> < /home/hj/pg-problem/data/dbREPAIRED.sql
Since I was on a laptop with limited disk space, I deleted the docker stuff:
root@host: docker rm $(docker ps -a -q)
root@host: docker rmi $(docker images -q)

Related

Backup PostgreSQL database running in a container on Digital Ocean

I'm a newbie to Docker. I'm working on a project written by another developer; it runs on Digital Ocean (Ubuntu 18.04) and consists of 2 containers (1 container for the Django app, 1 container for the PostgreSQL database).
I now need to get a backup of the database. I found a bash file written by the previous programmer:
#!/usr/bin/env bash
### Create a database backup.
###
### Usage:
### $ docker-compose -f <environment>.yml (exec |run --rm) postgres backup
set -o errexit
set -o pipefail
set -o nounset
working_dir="$(dirname ${0})"
source "${working_dir}/_sourced/constants.sh"
source "${working_dir}/_sourced/messages.sh"
message_welcome "Backing up the '${POSTGRES_DB}' database..."
if [[ "${POSTGRES_USER}" == "postgres" ]]; then
message_error "Backing up as 'postgres' user is not supported. Assign 'POSTGRES_USER' env with another one and try again."
exit 1
fi
export PGHOST="${POSTGRES_HOST}"
export PGPORT="${POSTGRES_PORT}"
export PGUSER="${POSTGRES_USER}"
export PGPASSWORD="${POSTGRES_PASSWORD}"
export PGDATABASE="${POSTGRES_DB}"
backup_filename="${BACKUP_FILE_PREFIX}_$(date +'%Y_%m_%dT%H_%M_%S').sql.gz"
pg_dump | gzip > "${BACKUP_DIR_PATH}/${backup_filename}"
message_success "'${POSTGRES_DB}' database backup '${backup_filename}' has been created and placed in '${BACKUP_DIR_PATH}'."
My first question is: is that command right? I mean, if I ran:
docker-compose -f production.yml (exec |run --rm) postgres backup
Would that create a backup of my database at the location written in the script?
Second question: can I run this command while the database container is running, or should I run docker-compose down, then run the backup command, then run docker-compose up again?
Sure, you can run that script to take a backup. One way to do it is to execute a shell in the container with docker-compose exec postgres /bin/bash and then run the script there.
Another way is to run a new postgres container attached to the compose network:
docker run -it --name pgback -v /path/backup/host:/var/lib/postgresql/data --network composeNetwork postgres /bin/bash
This creates a new postgres container attached to the network created by compose, with a bind-mounted volume, so you can put the script in this container and back up the database to the volume, keeping the backup outside the container.
Then, whenever you want to back up, simply start the container again:
docker start -a -i pgback
You don't need to create another compose file; just copy the script into the container and run it. You could also build a new postgres image that includes the script and runs it from CMD, so you only have to start the container. There are plenty of ways to do it.
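Note that the script above also sources helper files (constants.sh, messages.sh) and expects several environment variables, so inside a standalone helper container it may be easier to run its core command by hand. Once inside it (docker start -a -i pgback), the backup boils down to something like this sketch, assuming the database service is reachable as postgres on that network and using placeholders for the credentials and database name:
PGPASSWORD=<password> pg_dump -h postgres -U <user> <db> | gzip > /var/lib/postgresql/data/backup_$(date +'%Y_%m_%dT%H_%M_%S').sql.gz
Because /var/lib/postgresql/data is bind-mounted from /path/backup/host, the compressed dump lands directly on the host.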

A container is a database server. How to ask its Dockerfile to complete its construction after that container has started?

I am using a postgis/postgis Docker image to set a database server for my application.
The database server must have a tablespace created, then a database.
Then each time another application will start from another container, it will run a Liquibase script that will update the database schema (create tables, index...) when needed.
In a terminal, to prepare the database container, I run these commands:
# Run a naked Postgis container
sudo docker run --name ecoemploi-postgis \
  -e POSTGRES_PASSWORD=postgres \
  -d -v /data/comptes-france:/data/comptes-france postgis/postgis
# Send 'bash level' commands to create the directory for the tablespace
sudo docker exec -it ecoemploi-postgis \
  bin/sh -c 'mkdir /tablespace && chown postgres:postgres /tablespace'
Then to complete my step 1, I have to run SQL statements to create the tablespace in a PostGIS point of view, and create the database by a CREATE DATABASE.
I connect manually to psql inside my container:
sudo docker exec -it ecoemploi-postgis \
  bin/sh -c 'exec psql -h "$POSTGRES_PORT_5432_TCP_ADDR" -p "$POSTGRES_PORT_5432_TCP_PORT" -U postgres'
And I run these commands manually:
CREATE TABLESPACE data LOCATION '/tablespace';
CREATE DATABASE comptesfrance TABLESPACE data;
exit
But I would like to have a container, created from a single Dockerfile, that has already done all the needed work. The difficulty is that it has to be done in two parts:
One before the container is started (creating directories and setting their user:group ownership).
One after it is started for the first time: declaring the tablespace and creating the database. If I understand the base image I took correctly, this should be done after its entrypoint, docker-entrypoint.sh, has run?
What is the right way to write a Dockerfile that creates a container with all these steps already done?
The PostGIS image "is based on the official postgres image", so it should be able to use the /docker-entrypoint-initdb.d mechanism. Any files you put in that directory will be run the first time the database container is started. The postgis Dockerfile already uses this directory to install the PostGIS extensions into the default database.
That means you can put your build-time setup directly into the Dockerfile, and copy the startup-time script into that directory.
FROM postgis/postgis:12-3.0
RUN mkdir /tablespace && chown postgres:postgres /tablespace
COPY createdb.sql /docker-entrypoint-initdb.d/20-createdb.sql
# Use default ENTRYPOINT/CMD from base image
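Here createdb.sql is just the SQL you were running by hand, taken from the question (the 20- prefix only controls the order in which the init scripts run):
-- createdb.sql: executed automatically by the entrypoint the first time
-- the container starts with an empty data directory
CREATE TABLESPACE data LOCATION '/tablespace';
CREATE DATABASE comptesfrance TABLESPACE data;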
For the particular setup you describe, this may not be necessary. Each database container runs in an isolated filesystem space and starts with an empty data directory, so there's no specific need to create an alternate data directory; the Docker style is to just run multiple database containers if you need isolated storage. Similarly, the base postgres image will create a database for you at first start (named by the POSTGRES_DB environment variable).
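For instance, a sketch of that simpler route, with no tablespace and the database created automatically at first start:
sudo docker run --name ecoemploi-postgis \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=comptesfrance \
  -d -v /data/comptes-france:/data/comptes-france postgis/postgis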
In order to run a container, your Dockerfile must be functional and complete.
You must put the queries in a bash script, and on the last line add an ENTRYPOINT that runs this bash script.

How to properly copy data into postgres database through docker via COPY?

I am looking to run a postgresql docker container to hold some data. I've used docker before for clickhouse but not for postgresql, and I'm having a bit of a basic issue here: loading data with COPY. Here are the details:
os: UBUNTU 16.04 running on a NUC
using docker postgres server container from here:
https://docs.docker.com/engine/examples/postgresql_service/
docker ps shows the server running no problem:
29fb9b39e293 eg_postgresql "/usr/lib/postgresql…" 43 hours ago Up 3 hours 0.0.0.0:5432->5432/tcp pg_test
I'd like to copy a file that is currently located in the same NUC in the following folder:
Desktop/ems/ems_raw.csv
I've given the postgres user rights to the file just in case:
-rw-rw-r-- 1 postgres me 4049497429 Mar 22 12:17 Desktop/ems/ems_raw.csv
me@docker1:~$
I've tried running the following in psql. METHOD 1:
me@docker1:~$ docker exec -ti pg_test psql -U postgres
psql (9.3.17)
Type "help" for help.
postgres=# COPY ems_stage FROM "Desktop/ems/ems_raw.csv" WITH (format csv,header);
ERROR: syntax error at or near ""Desktop/ems/ems_raw.csv""
LINE 1: COPY ems_stage FROM "Desktop/ems/ems_raw.csv" WITH (format c...
^
postgres=#
I've also tried running this via terminal direct just in case, METHOD 2:
me@docker1:~$ docker exec -ti pg_test psql -U postgres -c "COPY ems_stage FROM "Desktop/ems/ems_raw.csv" WITH (format csv,header);"
ERROR: syntax error at or near "Desktop"
LINE 1: COPY ems_stage FROM Desktop/ems/ems_raw.csv WITH (format csv...
^
me@docker1:~$
I know there is something basic here I may not be wrapping my head around. How would I go about running this properly? I'm assuming I am making a mistake with the path? Appreciate the help, guys.
So after a combination of suggestions this is what worked for me:
psql -h localhost -p 5432 -d docker -U docker --password -c "\COPY ems_stage FROM 'Desktop/ems/ems_raw.csv' WITH (format csv, header);"
This references the docker database and docker username from the Dockerfile in Docker's documentation.
You should install the postgresql-client package on your host, if you haven't already, and use psql from there to connect to the database:
host$ psql -h localhost -U postgres
Once you've connected, run the COPY FROM ... command as above.
In this scenario it helps to think of the Docker container like a remote system, and the docker exec command as equivalent to ssh root@.... The container has an isolated filesystem and can't see files on the host; since in this case you're launching psql from inside the container, the COPY can't see the file it's trying to copy.
In principle you can use the docker run -v option to mount a directory from the host into the container. Since you're on native Linux, it might work to start the database with the external data file mounted, run the COPY FROM ... as you've shown, and then restart the database without it. Restarting the database for this doesn't seem like a desirable path, though. (On other host configurations bind mounts can be pretty slow, and this could be a substantial problem for a 4 GB data file.)
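To make that concrete: had the container been created with the host directory mounted, a server-side COPY would be able to see the file. A rough sketch, where the mount point /import and the full host path are assumptions, and which presumes the old pg_test container has been removed first (exactly the disruption mentioned above):
docker run -d --name pg_test -p 5432:5432 \
  -v /home/me/Desktop/ems:/import:ro \
  eg_postgresql
docker exec -ti pg_test psql -U postgres \
  -c "COPY ems_stage FROM '/import/ems_raw.csv' WITH (format csv, header);"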
I know there is something basic here I may not be wrapping my head around.
Indeed. The name of the file needs to be in single quotes (or dollar quotes), not in double quotes. There may be later errors as well, but you have to fix this in order to get to those.
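So METHOD 1 becomes, with correct quoting and a placeholder path that must exist inside the container:
postgres=# COPY ems_stage FROM '/path/inside/container/ems_raw.csv' WITH (format csv, header);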

Artifactory upgrade fail, postgres 9.5 -> 9.6 upgrade instructions needed

I had planned an upgrade of artifactory from 6.7.5 to 6.8.1. As part of the upgrade I checked jfrog's repo on github and it looks like they have a new recommended nginx and postgres version.
The current docker-compose is using postgres 9.5 and the new default version is 9.6. Simply pulling down postgres 9.6, however, does not do an in-place upgrade.
FATAL: database files are incompatible with server
DETAIL: The data directory was initialized by PostgreSQL version 9.5, which is not compatible with this version 9.6.11.
The upgrade instructions do not mention anything about how to do the upgrade.
The examples provided in github (https://github.com/jfrog/artifactory-docker-examples) are just examples.
Using them in production could cause issues and backwards compatibility is not guaranteed.
To get over the PostgreSQL matter when upgrading, I would suggest:
$ docker-compose -f yml-file-name.yml stop
edit the yml-file-name.yml and change the image docker.bintray.io/postgres:9.6.11 to docker.bintray.io/postgres:9.5.2 (see the snippet after these steps)
$ docker-compose -f yml-file-name.yml up -d
Artifactory should be upgraded after following this; however, it will keep using the previous version of the PostgreSQL DB.
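The edit in the middle step amounts to pinning the postgresql service's image tag in the compose file, roughly:
services:
  postgresql:
    image: docker.bintray.io/postgres:9.5.2   # was docker.bintray.io/postgres:9.6.11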
I have been able to upgrade the database using the following approach:
Dump the whole database to an SQL script using the old database image; store it in a volume for future import:
# Override PostgreSQL image used to export using old binaries
printf "version: '2.1'\nservices:\n postgresql:\n image: docker.bintray.io/postgres:9.5.2\n" > image_override.yml
started_container=$(docker-compose -f artifactory-pro.yml -f image_override.yml run -d -v sql_dump_volume:/tmp/dump --no-deps postgresql)
# Dump database to a text file in a volume (to make it available for import)
docker exec "${started_container}" bash -c "until pg_isready -q; do sleep 1; done"
docker exec "${started_container}" bash -c "pg_dumpall --clean --if-exists --username=\${POSTGRES_USER} > /tmp/dump/dump.sql"
docker stop "${started_container}"
docker rm --force "${started_container}"
Back up the old database directory and prepare a new one:
mv -fv /data/postgresql /data/postgresql.old
mkdir -p /data/postgresql
chown --reference=/data/postgresql.old /data/postgresql
chmod --reference=/data/postgresql.old /data/postgresql
Run a new database image, mounting the dump script from step 1. The image processes SQL scripts on startup when setting up a new database, provided the container's command starts with postgres. We just don't need to leave the server running afterwards, so I passed --version to make the entrypoint run, import the data, and quit:
docker-compose -f artifactory-pro.yml run --rm --no-deps -e POSTGRES_DB=postgres -e POSTGRES_USER=root -v sql_dump_volume:/docker-entrypoint-initdb.d postgresql postgres --version
After all this was done, I was able to start Artifactory normally with docker-compose -f artifactory-pro.yml up -d, and it came up fine, applying the rest of the schema and file upgrade procedure as usual.
I have also prepared a script that basically does the above steps along with some additional checks and cleanup. Feel free to use it if you find it useful.

How to upgrade Cloudera Manager Postgres database

I have Cloudera Manager 5.9 installed on Ubuntu 12.04 with embedded postgres database. I upgraded Ubuntu to 14.04 using do-release-upgrade. In the process, Postgres also got upgraded from 8.4 to 9.3. Now when I try to start the CM database via:
# sudo service cloudera-scm-server-db start
I get the following error in CM db.log:
FATAL: database files are incompatible with server
DETAIL: The data directory was initialized by PostgreSQL version 8.4, which is not compatible with this version 9.3.15.
How do I get past this? I have looked at a lot of documentation that talks about dumping the postgres database via pg_dump and restoring via psql, but I don't know how this applies in the context of Cloudera Manager, especially when the database is not coming up.
On Ubuntu 12.04 when everything is working, I believe the dump can be taken like this:
# pg_dump -h localhost -p 7432 -U scm > /tmp/scm_server_db_backup.$(date +%Y%m%d)
I can try to create an empty database and restore the dump to this one using psql. But how do I configure cdh to point to this database?
As you suggest, you need to find a way to "convert" 8.4 data files to 9.3 data files.
Using pg_dump will need a working PostgreSQL 8.4 instance. So, basically, you need a working PostgreSQL 8.4 (think VM or Docker), then copy your existing 8.4 files to that VM/Docker, have that VM/Docker produce a plain-text dump (plain SQL, so compatible with any version), and restore that plain-text dump into your 9.3 instance.
You can try:
Create a virtual machine or a Docker instance with PostgreSQL 8.4 deployed.
Locate the main data directory (usually /var/lib/postgresql/8.4/main on Ubuntu, but this might differ) of your upgraded Cloudera machine. Back up this directory and keep it somewhere safe.
Stop PostgreSQL on your VM/Docker if necessary, and locate its main data directory (again, usually /var/lib/postgresql/8.4/main on Ubuntu, but this might differ).
Replace that directory with a copy of your existing 8.4/main content (the one on your upgraded machine, which now runs PG 9.3).
Restart PostgreSQL 8.4 on the VM/Docker.
Use pg_dumpall to create a full backup:
pg_dumpall > dump.sql
Transfer dump.sql to your Cloudera machine and restore it. You might need to drop previous schemas/databases first:
psql -f dump.sql postgres
I was able to resolve this problem using the following process:
Step 1: Take a dump of the running postgres database on Ubuntu 14.04
# sudo su
# su - postgres
# pg_dump -h localhost -p 7432 -U scm scm > scm.sql
Step 2: Upgrade Ubuntu to 16.04
# sudo do-release-upgrade
...
Step 3: Rename the old data directory
# mv /var/lib/cloudera-scm-server-db/data/ /var/lib/cloudera-scm-server-db/data9-3
Step 4: Restart the cloudera-scm-server-db service. This will create an empty database, which we will populate with the backup taken in step 1
# sudo service cloudera-scm-server-db restart
Step 5: Now restore the database
# sudo su
# su - postgres
# psql -h localhost -p 7432 -U scm
(password can be obtained like this: grep password /etc/cloudera-scm-server/db.properties)
scm> \i scm.sql
Step 6: Now restart the cloudera-scm-server service:
# sudo service cloudera-scm-server restart