How to properly copy data into postgres database through docker via COPY? - postgresql

I am looking to run a postgresql docker container to hold some data. I've used docker before for clickhouse but not for postgresql and I'm having a bit of a basic issue here which is loading data with COPY. Here are the details:
os: UBUNTU 16.04 running on a NUC
using docker postgres server container from here:
https://docs.docker.com/engine/examples/postgresql_service/
docker ps shows the server running no problem:
29fb9b39e293 eg_postgresql "/usr/lib/postgresql…" 43 hours ago Up 3 hours 0.0.0.0:5432->5432/tcp pg_test
I'd like to copy a file that is currently located in the same NUC in the following folder:
Desktop/ems/ems_raw.csv
I've given user rights to postgres user just in case:
-rw-rw-r-- 1 postgres me 4049497429 Mar 22 12:17 Desktop/ems/ems_raw.csv
me@docker1:~$
I've tried running the following in psql. METHOD 1:
me@docker1:~$ docker exec -ti pg_test psql -U postgres
psql (9.3.17)
Type "help" for help.
postgres=# COPY ems_stage FROM "Desktop/ems/ems_raw.csv" WITH (format csv,header);
ERROR: syntax error at or near ""Desktop/ems/ems_raw.csv""
LINE 1: COPY ems_stage FROM "Desktop/ems/ems_raw.csv" WITH (format c...
^
postgres=#
I've also tried running this via terminal direct just in case, METHOD 2:
me@docker1:~$ docker exec -ti pg_test psql -U postgres -c "COPY ems_stage FROM "Desktop/ems/ems_raw.csv" WITH (format csv,header);"
ERROR: syntax error at or near "Desktop"
LINE 1: COPY ems_stage FROM Desktop/ems/ems_raw.csv WITH (format csv...
^
me@docker1:~$
I know there is something basic here I may not be wrapping my head around. How would I go about running this properly? I'm assuming I'm making a mistake with the path? Appreciate the help, guys.

So after a combination of suggestions this is what worked for me:
psql -h localhost -p 5432 -d docker -U docker --password -c "\COPY ems_stage FROM 'Desktop/ems/ems_raw.csv' WITH (format csv,header);"
This references the docker database and docker username from the Dockerfile in Docker's documentation.

You should install the postgresql-client package on your host, if you haven't already, and use psql from there to connect to the database:
host$ psql -h localhost -U postgres
Once you've connected, run the copy as a client-side \COPY (as in the command above), since a plain COPY ... FROM 'file' is read by the server process, which in this setup is running inside the container.
In this scenario it helps to think of the Docker container like a remote system, and the docker exec command as equivalent to ssh root@.... The container has an isolated filesystem and can't see files on the host; since in this case you're launching psql from inside the container, the COPY can't see the file it's trying to copy.
In principle you can use the docker run -v option to mount a directory from the host into the container. Since you're on native Linux, it might work to start the database with the external data file mounted, run the COPY FROM ... as you've shown, and then restart the database without it. Restarting the database for this doesn't seem like a desirable path, though. (On other host configurations bind mounts can be pretty slow, and this could be a substantial problem for a 4 GB data file.)
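A minimal sketch of that bind-mount route, assuming your home directory is /home/me and reusing the container, table, and file names from the question (the image and startup options depend on how pg_test was originally created):
host$ docker run -d --name pg_test -p 5432:5432 -v /home/me/Desktop/ems:/import:ro eg_postgresql
host$ docker exec -ti pg_test psql -U postgres -c "COPY ems_stage FROM '/import/ems_raw.csv' WITH (format csv, header);"
Here the CSV is mounted read-only at /import inside the container, so the server-side COPY can find it; afterwards you would recreate the container without the mount.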

I know there is something basic here I may not be wrapping my head around.
Indeed. The name of the file needs to be in single quotes (or dollar quotes), not in double quotes. There may be later errors as well, but you have to fix this in order to get to those.
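For example, with the quoting fixed (same relative path as in the question; whether the server can actually see that path is the separate containerization issue covered above):
COPY ems_stage FROM 'Desktop/ems/ems_raw.csv' WITH (format csv, header);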

Related

Running a Chainlink Node - Remote DATABASE_URL Config PostgreSQL problem

I have been trying since yesterday to connect to a ChainLink node and I was not able to.
I followed the steps at this website
I am having a problem with "Set the Remote DATABASE_URL Config" (I think this is my only error because of the [ERROR] listed below; I do not know if I am doing something else wrong, since every command was executed without error).
I am using the Docker option to create the database listed here.
I am always having this error:
"[ERROR] unable to lock ORM: failed to connect to host=localhost user=some-postgres database=postgres: dial error (dial tcp [::1]:5432: connect: cannot assign requested address) logger/default.go:155 stacktrace=github.com/smartcontractkit/chainlink/core/logger.Errorf
/chainlink/core/logger/default.go:155"
After writing in my Ubuntu Terminal (ON WINDOWS 10):
"cd ~/.chainlink-kovan && docker run -p 6688:6688 -v ~/.chainlink-kovan:/chainlink -it --env-file=.env smartcontract/chainlink:0.10.1 local n"
I do not know how to connect to the database and what to write as attributes. All of the other steps and installs I have accomplished successfully.
I just want to know how to create a database on PostgreSQL and connect it to Docker as explained on the ChainLink website, and what to write in the Ubuntu terminal (for the "Remote DATABASE_URL Config PostgreSQL" step) so that I can run my node.
Thanks! (PS: I am a beginner and your help is much appreciated, and if I forgot to mention any important information please let me know so that I add it)
A comprehensive 101 for docker-postgres can be found here: https://hackernoon.com/dont-install-postgres-docker-pull-postgres-bee20e200198
Basically, you need to deploy a postgres db with docker.
Pre-Reqs:
Create a dir for your docker/postgres setup:
mkdir -p $HOME/docker/volumes/postgres
Example:
docker run --rm --name pg-docker -e POSTGRES_USER=<any_desired_name> -e POSTGRES_PASSWORD=docker -e POSTGRES_DB=<any_db_name> -d -p 5432:5432 -v $HOME/docker/volumes/postgres:/var/lib/postgresql/data postgres
For the postgres username, it can be anything, e.g. "super_chain".
For the postgres db, it can be "chainlink".
After that, docker is up and running. Just follow the rest of the docs tutorial, where you need to write the DB URL into the .env file.
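Purely as an illustration (the credentials and database name below are just the example values from the docker run above, and the host depends on your setup; with Docker Desktop on Windows the container can typically reach the host's published port via host.docker.internal), the .env entry looks something like:
DATABASE_URL=postgresql://super_chain:docker@host.docker.internal:5432/chainlink?sslmode=disable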
Cheers

Custom PostgreSQL Dockerfile

I am trying to create multiple PostgreSQL databases using Dockerfile and create a container from this image.
My sample setup looks like this:
Dockerfile:
FROM postgres:11.8
COPY init.sql /docker-entrypoint-initdb.d
init.sql
CREATE DATABASE firstdb;
CREATE DATABASE seconddb;
CREATE DATABASE thirddb;
In order to build the docker image and get a shell in a running container, I run the following commands:
docker build -t postgres:v11.8 .
docker run -it postgres:v11.8 bash
One of the problems that I'm facing right now is the error below as soon as I try to connect using psql -U postgres command:
psql: could not connect to server: No such file or directory Is the server running locally and accepting connections on Unix domain socket
"/var/run/postgresql/.s.PGSQL.5432"?
The second issue I have is how to turn the separate CREATE DATABASE lines within init.sql into a single line or a loop.
Thanks, guys!
I'm not sure, but when you run your docker image with only the -it flags and bash like that, it isn't actually running the PostgreSQL process; bash replaces the image's default command. First, run your container as it should run, without overriding the command:
docker container run --name db <your-custom-image>:<tag>
After that, if you want to enter the container's bash, run docker exec with the -it flags and the correct container name (db).
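A rough sketch of that flow, using the image built above (the POSTGRES_PASSWORD value is only an example; recent official postgres images refuse to initialize without a password or an explicit auth setting):
docker container run -d --name db -e POSTGRES_PASSWORD=example postgres:v11.8
docker exec -it db bash            # a shell inside the running container, if you need one
docker exec -it db psql -U postgres   # or connect straight to the server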

Postgresql as docker container not starting with data from mapped volume

On my macbook I have postgresql running in a docker container and I use a mapped volume to persist the data. This works perfectly locally. However, when I try to do the same on the Ubuntu server, the 'initial' data from the mapped volume is not picked up. Postgres starts up in an 'empty' initial state.
However, when I add a table and data in that table in the default postgres database it IS persistent. So the volume mapping seems to work.
Furthermore, it is interesting to note that I'm getting an error when I try to create a table in a new database. The new database is persistent as well, but the table can't be saved as there is an error thrown:
could not open file "base/16384/2611": No such file or directory
This is expected as the folder base/16384 doesn't exist.
To me this seems like a user/rights issue, but I have no clue how to fix it.
I tried running the container as root, which didn't help.
Any suggestions?
I'm starting the container with either docker-compose or from the command line using:
docker run --rm --name pg -e POSTGRES_PASSWORD=[password] -d -p 5432:5432 -v /root/docker/volumes/postgres:/var/lib/postgresql/data postgres -c listen_addresses='*'
Instead of moving the actual data folder around, I used pg_dump and pg_restore within the docker containers, per a suggestion on the Docker forums. This did the trick.
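For reference, a rough sketch of that route (the container names old_pg/new_pg and the database name mydb are placeholders; the custom -Fc format is what pg_restore expects):
docker exec old_pg pg_dump -U postgres -Fc mydb > mydb.dump
docker exec new_pg createdb -U postgres mydb
docker exec -i new_pg pg_restore -U postgres -d mydb < mydb.dump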

Configuration issue Postgres on Ubuntu?

I have installed Postgres 12 on Ubuntu by building it from source and I am facing two issues:
Although I followed the installation manual from Postgres, every time I restart my computer, my Postgres server stops and is no longer seen as a running process.
To start it the first time after install, I do this from the terminal:
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
After a restart, to start DB again when I run: /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data, it throws this error:
initdb: error: directory "/usr/local/pgsql/data" exists but is not empty
If you want to create a new database system, either remove or empty
the directory "/usr/local/pgsql/data" or run initdb
with an argument other than "/usr/local/pgsql/data".
Does that mean that every time I start Postgres after a restart, I have to create a new /data directory?
Upon installing Postgres using pip or pip3, one can just switch user to postgres and run psql to enter postgres; however, now I have to run "/usr/local/bin/psql". Please note I have exported all the paths per https://www.postgresql.org/docs/12/installation.html. How can I fix this? Can an alias be set for this?
After a restart, to start DB again when I run: /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data, it throws this error:
Does that mean that every time I start Postgres after a restart, I have to create a new /data directory?
No, quite the opposite. You don't need to initdb after the first time, you just need to start. It is your attempt to initdb when you don't need to which is causing the error message. Note that attempting to initdb isn't doing any harm, because it refused to run. It just generates log/console noise.
Upon installing Postgres using pip or pip3, one can just switch user to postgres and run psql to enter postgres, however now I have to run "/usr/local/bin/psql". Please note I have exported all the paths per https://www.postgresql.org/docs/12/installation.html. How can I fix this?
I don't know what your first sentence means, as you don't use pip or pip3 to install PostgreSQL (or at least, the docs don't describe doing so) although you might use them to install psycopg2 to enable python to talk to PostgreSQL.
You could use an alias, but it would probably make more sense to edit ~/.bash_profile to set the PATH, as described from the page you linked to under Environment Variables.
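Put concretely, using the paths from the question (the ~/.bash_profile line is just one way of doing what the Environment Variables section describes):
# after a reboot: no initdb, just start the existing cluster
/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
# in ~/.bash_profile of the user that runs psql
PATH=/usr/local/pgsql/bin:$PATH
export PATH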
You have to register PostgreSQL as a service.
run this:
pg_ctl register [-N servicename] [-U username] [-P password] [-D datadir] [-S a[uto] | d[emand] ] [-w] [-t seconds] [-s] [-o options]
Example:
pg_ctl register -N postgresql -U OS_username -P OS_password -D '/etc/postgresql/12/data' -w
More info in the manual: pg_ctl
Notes:
Username and password here relate to the OS account, not to PostgreSQL.
If you have doubts, read the manual.
/usr/local/pgsql/bin/pg_ctl start -D '/usr/local/pgsql/data'
Export the following in the postgres user account's ~/.bashrc:
LD_LIBRARY_PATH=/usr/local/pgsql/lib
export LD_LIBRARY_PATH
PATH=/usr/local/pgsql/bin:$PATH
export PATH

pg_upgrade tool failed: invalid "unknown" user columns

Postgresql update from 9.6 to 10.4 (on Fedora 28) has me stuck: one table in one database has a column of data type "unknown". I would gladly remove the column, but since I cannot get postgresql service to start (because "An old version of the database format was found"), I have no access to the database. In more detail:
postgresql-setup --upgrade fails.
/var/lib/pgsql/upgrade_postgresql.log attributes this failure to column with data type "unknown": "...Checking for invalid 'unknown' user columns: fatal .... check tables_using_unknown.txt". And "tables_using_unknown.txt" specifies one column in one table that I wish I could drop, but can't, because I can't get the server to start:
systemctl start postgresql.service fails, and
systemctl status postgresql.service complains about the "old version of the database"
I have found no obvious way to install postgresql 9.6 on Fedora 28.
Is there a way to drop the column without a running server? Or at least produce a dump of the database? Or can I force the upgrade tool to drop columns with data type "unknown"? Or is there any other obvious solution that I'm missing?
Here's what finally worked for me:
I used a docker container (on the same machine) with postgres 9.6 to access the "old" database directory,
converted the problematic column from "unknown" to "text" in the container,
dumped the relevant database to a file on the container's host, and then
loaded the dumped db into the postgres 10.4 environment.
Not pretty, but worked. In more detail:
I copied postgresql's data directory (/var/lib/pgsql/data/ in Fedora) -- containing the database that could not be converted -- to a new, empty directory /home/hj/pg-problem/.
I created a Dockerfile (text file) called "Docker-pg-problem" reading
FROM postgres:9.6
# my databases need German locale;
# if you just need en_US, comment the next two lines out.
RUN localedef -i de_DE -c -f UTF-8 -A /usr/share/locale/locale.alias de_DE.UTF-8
ENV LANG de_DE.utf8
and saved it as the only file in the new, empty folder /home/hj/pg-problem/docker/.
I started the docker daemon and ran a container that uses the data from my copy of the problematic data (in /home/hj/pg-problem/data/) as data directory for the postgres 9.6 server in the container. (NB: the "docker build" command in line three needs a working internet connection, takes a while, and should finish saying "Successfully built").
root@host: cd /home/hj/pg-problem/docker
root@host: service docker start
root@host: docker build -t hj/failed-update -f Dockerfile .
root@host: docker run -it --rm -p 5431:5432 -v /home/hj/pg-problem/data:/var/lib/postgresql/data:z --name failed-update -e POSTGRES_PASSWORD=secret hj/failed-update
Then, I opened a terminal in the container to fix the database:
hj@host: docker exec -it failed-update bash
Inside the container, I fixed and dumped the database:
root@container: su postgres
postgres@container: psql <DB-name>
postgres@container: alter table <Table-name> alter column <Col-Name> type text;
postgres@container: \q
postgres@container: pg_dump <DB-name> > /var/lib/postgresql/data/dbREPAIRED.sql
I dumped the db right into the data directory so I could easily access the dumped file from the docker host.
On the docker host, the dumped database was, obviously, in /home/hj/pg-problem/data/dbREPAIRED.sql, and from there I could load it into postgresql 10:
postgres@host: createdb <DB-name>
postgres@host: psql <DB-name> < /home/hj/pg-problem/data/dbREPAIRED.sql
Since I was on a laptop with limited disk space, I deleted the docker stuff:
root@host: docker rm $(docker ps -a -q)
root@host: docker rmi $(docker images -q)