Docker mongodb - add database on disk to container - mongodb

I am running Docker on windows and I have a database with some entries on disk at C:\data\db.
I want to add this database to my container. I have tried numerous ways to do this but failed.
I tried: docker run -p 27017:27017 -v //c/data/db:/data/db --name mongodb devops-mongodb
In my dockerfile I have:
RUN mkdir -p /data/db
VOLUME /data/db
But this doesn't add my current database on disk to the container. It creates a fresh /data/db directory and persists the data I add to it.
The docs here https://docs.docker.com/userguide/dockervolumes/ under 'Mount a host directory as a data volume' specifically told me to execute the -v //c/data/db:/data/db but this isn't working.
Any ideas?

You're using Boot2Docker (which runs inside a Virtual Machine). Boot2Docker uses VirtualBox guest additions to make directories on your Windows machine available to Docker running inside the Virtual Machine.
By default, only the C:\Users directory (on Windows), or /Users/ directory (on OS X) is shared with the virtual machine. Anything outside those directories is not shared with the Virtual Machine, which results in Docker creating an empty directory at the specified location for the volume.
To share directories outside C:\Users\ with the Virtual Machine, you have to manually configure Boot2Docker to share those. You can find the steps needed in the VirtualBox guest addition section of the README:
If some other path or share is desired, it can be mounted at run time by doing something like:
$ mount -t vboxsf -o uid=1000,gid=50 your-other-share-name /some/mount/location
It is also important to note that in the future, the plan is to have any share which is created in VirtualBox with the "automount" flag turned on be mounted during boot at the directory of the share name (ie, a share named home/jsmith would be automounted at /home/jsmith).
Please be aware that using VirtualBox guest additions has a really bad impact on performance (reading from and writing to the volume will be really slow). That could be fine for development, but it should be used with caution.
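To make the path rule above concrete, here is a minimal sketch in POSIX shell. The helper names to_docker_path and is_shared_by_default are hypothetical (they are not part of Docker or Boot2Docker); the sketch only converts a Windows path to the /c/... form and checks whether it falls under the C:\Users share that is available by default.

```shell
#!/bin/sh
# Hypothetical helper: convert a Windows path such as C:\data\db
# into the /c/data/db form used on the VM side of `docker run -v`.
to_docker_path() {
  # First character is the drive letter; lower-case it.
  drive=$(printf '%s' "$1" | cut -c1 | tr '[:upper:]' '[:lower:]')
  # Drop the leading "C:" and turn backslashes into forward slashes.
  rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
  printf '/%s%s\n' "$drive" "$rest"
}

# Hypothetical helper: succeed only for paths under the default share.
is_shared_by_default() {
  case "$1" in
    /c/Users|/c/Users/*) return 0 ;;
    *) return 1 ;;
  esac
}

p=$(to_docker_path 'C:\data\db')
if is_shared_by_default "$p"; then
  echo "$p is shared by default"
else
  echo "$p must be shared manually in VirtualBox first"
fi
```

For the path in the question, the check fails, which matches the behaviour described: the VM has no C:\data share, so Docker creates an empty directory instead.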

Related

Postgres volume mounting on WSL2 and Docker desktop: Permission Denied on PGDATA folder

There are some similar posts, but this one is specifically about running Postgres with the WSL2 backend on Docker Desktop. WSL2 brings the full Linux experience to Windows. Volumes can be mounted on both the Windows and the Linux file systems, but the best practice is to use the Linux file system for performance reasons; see the Docker documentation:
Performance is much higher when files are bind-mounted from the Linux filesystem, rather than remoted from the Windows host. Therefore avoid docker run -v /mnt/c/users:/users (where /mnt/c is mounted from Windows).
Instead, from a Linux shell use a command like docker run -v ~/my-project:/sources where ~ is expanded by the Linux shell to $HOME.
My WSL distro is Ubuntu 20.04 LTS. I'm bind mounting the Postgres data directory to a directory on the Linux filesystem, and I'm also configuring Postgres PGDATA to use a sub-directory, as instructed in the official Docker image docs:
PGDATA
This optional variable can be used to define another location - like a subdirectory - for the database files. The default is /var/lib/postgresql/data. If the data volume you're using is a filesystem mountpoint (like with GCE persistent disks) or remote folder that cannot be chowned to the postgres user (like some NFS mounts), Postgres initdb recommends a subdirectory be created to contain the data.
So this is how I start Postgres with the volume mounting to WSL2 Ubuntu file system:
docker run -d \
--name some-postgres -e POSTGRES_PASSWORD=root \
-e PGDATA=/var/lib/postgresql/data/pgdata \
-v ~/custom/mount:/var/lib/postgresql/data \
postgres
I can exec into the running container and verify that the data folder exists and that it's configured correctly.
Now, if I try to access that folder from the host machine (WSL2 Linux), I get a permission denied error.
I would appreciate if anyone can provide a solution. None of the existing posts worked to resolve the issue.
This has got nothing to do with PostgreSQL. Docker containers run as root and so any directory created by Docker will also belong to root.
When you attach to the container and list the directory under /var/lib/postgresql/data it shows postgres as the owner.
Check the "Arbitrary --user Notes" section in the official documentation of the postgres image.
The second option "bind-mount /etc/passwd read-only from the host" worked for me.
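As a rough sketch of that second option, the function below only prints a docker run command (it does not execute it) that combines --user with a read-only bind mount of /etc/passwd. The function name postgres_run_cmd is hypothetical; the container name, password and PGDATA value are copied from the question, and the host directory is assumed to already exist and be owned by the current user.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a docker command that starts
# Postgres as the current host user instead of the image's default user.
postgres_run_cmd() {
  host_dir="$1"   # host directory to bind-mount; assumed to exist already
  printf 'docker run -d --name some-postgres'
  # Run the container process with the caller's uid:gid.
  printf ' --user %s:%s' "$(id -u)" "$(id -g)"
  # Let initdb resolve that uid to a user name inside the container.
  printf ' -v /etc/passwd:/etc/passwd:ro'
  printf ' -v %s:/var/lib/postgresql/data' "$host_dir"
  printf ' -e POSTGRES_PASSWORD=root'
  printf ' -e PGDATA=/var/lib/postgresql/data/pgdata'
  printf ' postgres\n'
}

postgres_run_cmd "$HOME/custom/mount"
```

Printing the command first makes it easy to review the uid:gid that will own the files; once it looks right, the printed line can be pasted into the WSL2 shell.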
Two things that were blocking us working with WSL2 on Windows were:
Folder c:\Program files\WindowsApps didn't have admin account listed as owner
McAfee was blocking the WSL. In order to disable blocking we had to remove following rule: Open McAfee -> Threat Prevention -> Show Advanced (button in Right upper corner) -> scroll down to Rules -> name of the rule is "Executing Subsystem for Linux"

docker postgres, fail to map volume in windows

I wish to store my persistent data in my local D:\dockerData\postgres9.6. Below are my docker commands:
docker pull postgres
docker run -d -v /d/dockerData/postgres9.6:/var/lib/postgresql/data -p 5432:5432 postgres
It successfully creates a container, and I can use pgAdmin to access it and create a database.
But I found out that there are no files in my D:\dockerData\postgres9.6. When I exec bash into the container, there are at least 20 files inside /var/lib/postgresql/data.
Can anyone point out what is going wrong?
It depends what kind of Docker you are using on Windows:
Docker Toolbox with VirtualBox: only C:\Users\mylogin is shared by default. D:\ is not mounted.
Docker for Windows with Hyper-V: only C:\ is mounted by default. Make sure D:\ is configured as a shared drive in the Docker settings.

Docker cannot start MongoDb with attached volume through data-only container

I'm trying to run a docker-compose on my Windows machine spinning up a MongoDB instance and a data-only container which proxies an attached volume containing the database files.
mongodata:
  image: mongo:2.6.8
  volumes:
    - ./data/db:/data/db
  command: --break-mongo
mongo:
  image: mongo:2.6.8
  volumes_from:
    - mongodata
  ports:
    - "27017:27017"
  command: --smallfiles --rest
*p.s. the --break-mongo command is there on purpose as it just needs to create the volume
To my understanding, using a data-only volume pattern would handle permission issues but I can see the following error during the Mongo container startup:
2016-01-26T00:23:52.340+0000 [initandlisten] info preallocateIsFaster couldn't run due to: couldn't open file /data/db/journal/tempLatencyTest for writing errno:1 Operation not permitted; returning false
2016-01-26T00:23:52.341+0000 [initandlisten] Unable to remove temporary file due to: boost::filesystem::remove: Text file busy: "/data/db/journal/tempLatencyTest"
2016-01-26T00:23:52.344+0000 [initandlisten] exception in initAndListen: 13516 couldn't open file /data/db/journal/j._0 for writing errno:1 Operation not permitted, terminating
Therefore I'm unable to use MongoDb with an attached volume from my local machine. Is there any way around this issue?
The documentation states
If you are using Docker Machine on Mac or Windows, your Docker daemon
has only limited access to your OS X or Windows filesystem. Docker
Machine tries to auto-share your /Users (OS X) or C:\Users (Windows)
directory. So, you can mount files or directories on OS X using.
docker run -v /Users/<path>:/<container path> ...
On Windows, mount directories using:
docker run -v /c/Users/<path>:/<container path> ...
All other paths come from your virtual machine’s filesystem. For example, if you are
using VirtualBox some other folder available for sharing, you need to
do additional work. In the case of VirtualBox you need to make the
host folder available as a shared folder in VirtualBox. Then, you can
mount it using the Docker -v flag.
Basically, either try to give a full path beginning from your C:\Users folder as shown above, or if you can't have that, make the host folder a shared folder in Virtualbox.
Update
No need to give a full path. docker-compose will handle that. You have to make sure that your docker-compose.yml is inside (somewhere down the line) of your Users folder. It can't be in some root folder. If you are already doing that, then you will have to adjust your permissions. Just give full permissions to that folder.
Update: Check out the latest Docker for Windows and MacOS X.
Faster and more reliable: no more VirtualBox! The Docker engine is
running in an Alpine Linux distribution on top of an xhyve Virtual
Machine on Mac OS X or on a Hyper-V VM on Windows, and that VM is
managed by the Docker application. You don’t need docker-machine to
run Docker for Mac and Windows.
Note: on Windows, you need Windows 10 Pro (or another edition that includes Hyper-V) to make it work; the Home edition does not include Hyper-V.
With the earlier Docker Toolbox, there seems to be no solution at all on Windows and OS X because of VirtualBox. The image documentation indeed states:
WARNING (Windows & OS X): The default Docker setup on Windows and OS X
uses a VirtualBox VM to host the Docker daemon. Unfortunately, the
mechanism VirtualBox uses to share folders between the host system and
the Docker container is not compatible with the memory mapped files
used by MongoDB (see vbox bug, docs.mongodb.org and related
jira.mongodb.org bug). This means that it is not possible to run a
MongoDB container with the data directory mapped to the host
As a workaround, I just copy the files from a folder before the mongo daemon starts. In my case I don't care about journal files, so I only copy the database files.
I've used this command in my docker-compose.yml:
command: bash -c "(rm -f /data/db/*.lock && cd /prev && cp *.* /data/db) && mongod"
And every time before stopping the container I use:
docker exec <container_name> bash -c 'cd /data/db && cp $(ls *.* | grep -v "[.]lock$") /prev'
Note: /prev is set as a volume: path/to/your/prev:/prev
Another workaround is to use mongodump and mongorestore.
In docker-compose.yml: command: bash -c "(sleep 30; mongorestore --quiet) & mongod"
In the terminal: docker exec <container_name> mongodump
Note: I use sleep because I want to make sure that mongo started, and it takes a while.
I know this involves manual work etc, but I am happy that at least I got mongo with existing data running on my Windows 10 machine, and still can work on my Macbook when I want.
(cross-posted from https://stackoverflow.com/a/42044756/1894856)
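The fixed sleep 30 in the mongodump/mongorestore workaround could be replaced by polling until the server answers. The following is only a sketch: wait_for is a hypothetical helper, and the mongo ping command shown in the comment is an assumption about what is available inside the container image.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping
# DELAY seconds after each failure; succeed as soon as the command does.
# usage: wait_for ATTEMPTS DELAY command [args...]
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Intended use inside the container (assumes the mongo shell is installed):
#   wait_for 30 1 mongo --quiet --eval 'db.runCommand({ ping: 1 })' \
#     && mongorestore --quiet
```

Because the function is not part of the image, it would have to live in a small script that is copied or mounted into the container and called from the compose command in place of the sleep.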

Why doesn't postgres official docker repo start db service at build time?

Context: https://github.com/docker-library/postgres (GitHub repo) and https://registry.hub.docker.com/_/postgres/ (Docker Hub).
It can be seen that the database is started through ENTRYPOINT and CMD, using the bash script
/docker-entrypoint.sh
with
ENTRYPOINT ["/docker-entrypoint.sh"]
EXPOSE 5432
CMD ["postgres"]
Another hook provided for customizing the database is the directory
/docker-entrypoint-initdb.d
which means the database server starts (and can be reached with psql) only at run time, when the docker run command is executed.
This causes a problem: we cannot customize the database at build time, for example by adding extensions or populating the db with data.
Of course, this could be done at run time, but that has the disadvantage of repeating the operation every time the image is run.
So, what is the logic behind this design from the Docker or Postgres perspective? How can I add extensions and populate data at build time?
If you were to customize (create, populate data) a database at build time, that would imply that the database data is written into the docker image filesystem itself (as one cannot mount a volume at build time).
The issue with that is that the docker image filesystem is a special one (AUFS, btrfs, etc.) that doesn't deliver good I/O performance for data-intensive applications such as a database server.
As a consequence, you want your data written to a volume instead of the docker container filesystem. As you don't know at build time which volume will be used at run time, and as there is no means of mounting volumes at build time anyway, no one should create a database at build time.
Furthermore, if you take a close look at the Dockerfile of the official PostgreSQL image, you will see that there is a VOLUME instruction that makes the path at which the data is written a volume. That means that the image is designed so that the data will never hit the docker container filesystem.
If you take a look at the Dockerfiles of other databases or data-intensive applications, you will notice that they all operate in this manner. Another reason is that it is accepted as good practice to make your docker containers immutable.
If you want to install additional modules to your image, it is fine as long as those do not depend on data that would be written on a volume, and as long as you make sure to declare a volume for any path they would write data on.
tl;dr
Application code/binary → docker image filesystem
Application data → docker volume
This is right from the docker page for the postgres image (library/postgres):
If you would like to do additional initialization in an image derived from this one, add a *.sql or *.sh script under /docker-entrypoint-initdb.d (creating the directory if necessary). After the entrypoint calls initdb to create the default postgres user and database, it will run any *.sql files and source any *.sh script found in that directory to do further initialization before starting the service.
You can also extend the image with a simple Dockerfile to set the locale. The following example will set the default locale to de_DE.utf8:
FROM postgres:9.4
RUN localedef -i de_DE -c -f UTF-8 -A /usr/share/locale/locale.alias de_DE.UTF-8
ENV LANG de_DE.utf8
Since database initialization only happens on container startup, this allows us to set the language before it is created.
You have the ability to extend an image just as the example shows from the docs that I pasted above. You can also use the exec command and execute virtually anything within the container right from your host machine. It took me a little while to get used to it, I continue to discover things as I play with it more and more.
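As an illustration of the /docker-entrypoint-initdb.d hook quoted above, this sketch writes a small SQL init script into a local directory. The directory name initdb.d, the file name 10-extensions.sql and the choice of the hstore extension are assumptions made for the example; the extension is assumed to be shipped with the image.

```shell
#!/bin/sh
# Hypothetical local directory holding init scripts for the image.
init_dir="${INIT_DIR:-./initdb.d}"
mkdir -p "$init_dir"

# The entrypoint runs *.sql files from /docker-entrypoint-initdb.d
# after initdb, i.e. only when the data directory is first created.
cat > "$init_dir/10-extensions.sql" <<'SQL'
-- Example initialization: enable an extension in the default database.
CREATE EXTENSION IF NOT EXISTS hstore;
SQL

echo "wrote $init_dir/10-extensions.sql"

# To bake the script into a derived image, a Dockerfile could contain:
#   FROM postgres:9.4
#   COPY initdb.d/ /docker-entrypoint-initdb.d/
```

The script is part of the image at build time, but it still executes at first container start, so the data it creates lands on the volume rather than in the image filesystem.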
UPDATE:
sudo docker run --name some-postgres -v ~/PATH/TO/some-postgres/data:/var/lib/postgresql/data -p 127.0.0.1:5432:5432 -e POSTGRES_PASSWORD=test -d postgres

How can I move postgresql data to another directory on Ubuntu over Amazon EC2?

We've been running postgresql 8.4 for quite some time. As with any database, we are slowly reaching our threshold for space. I added another 8 GB EBS drive and mounted it to our instance and configured it to work properly on a directory called /files
Within /files, I manually created
Correct me if I'm wrong, but I believe all postgresql data is stored in /var/lib/postgresql/8.4/main
I backed up the database and ran sudo /etc/init.d/postgresql stop, which stops the postgresql server. I tried to copy and paste the contents of /var/lib/postgresql/8.4/main into the /files directory, but that turned out to be a HUGE MESS due to file permissions. I had to go in and chmod the contents of that folder just so that I could copy and paste them. Some files did not copy fully because of root permissions. I modified the data_directory parameter in postgresql.conf to point to the files directory:
data_directory = '/files/postgresql/main'
and I ran sudo /etc/init.d/postgresql restart and the server failed to start. Again probably due to permission issues. Amazon EC2 only allows you to access the service as ubuntu by default. You can only access root from within the terminal which makes everything a lot more complicated.
Is there a much cleaner and more efficient step by step way of doing this?
Stop the server.
Copy the datadir while retaining permissions - use cp -aRv.
Then (easiest, as it avoids the need to modify initscripts) just move the old datadir aside and symlink the old path to the new location.
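The three steps above can be rehearsed safely on a throwaway directory tree before touching a real server. This sketch uses mktemp, so every path in it is a stand-in: $work/var/lib/postgresql/8.4/main plays the old datadir and $work/files/postgresql/main the new location. On a real system the same commands would be run with sudo, with the server stopped.

```shell
#!/bin/sh
# Build a disposable stand-in for the old data directory.
work=$(mktemp -d)
old="$work/var/lib/postgresql/8.4/main"
new="$work/files/postgresql/main"
mkdir -p "$old" "$(dirname "$new")"
echo "8.4" > "$old/PG_VERSION"
chmod 700 "$old"

# 1. Copy the datadir; -a preserves modes, ownership and timestamps.
cp -a "$old" "$new"

# 2. Move the old datadir aside and symlink the old path to the new one.
mv "$old" "$old.bak"
ln -s "$new" "$old"

# 3. Check that the old path now resolves to the copied data.
ls -ld "$old"
cat "$old/PG_VERSION"
```

Keeping the renamed original (here with a .bak suffix) gives a quick way back if the server refuses to start from the new location.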
Thanks for the accepted answer. Instead of the symlink you can also use a bind mount. That way it is independent of the file system. If you want to use a dedicated hard drive for the database, you can also mount it normally to the data directory.
I did the latter. Here are my steps if someone needs a reference. I ran this as a script on many AWS instances.
# stop postgres server
sudo service postgresql stop
# create new filesystem in empty hard drive
sudo mkfs.ext4 /dev/xvdb
# mount it
mkdir /tmp/pg
sudo mount /dev/xvdb /tmp/pg/
# copy the entire postgres home dir content
sudo cp -a /var/lib/postgresql/. /tmp/pg
# mount it to the correct directory
sudo umount /tmp/pg
sudo mount /dev/xvdb /var/lib/postgresql/
# see if it is mounted
mount | grep postgres
# add the mount point to fstab
echo "/dev/xvdb /var/lib/postgresql ext4 rw 0 0" | sudo tee -a /etc/fstab
# when database is in use, observe that the correct disk is being used
watch -d grep xvd /proc/diskstats
A clarification. It is the particular AMI that you used that sets ubuntu as the default user, this may not apply to other AMIs.
In essence, if you are trying to move data manually, you will probably need to do so as the root user, and then make sure it's available to whatever user postgres is running as.
You also have the option of snapshotting the volume and increasing the size of a volume created from the snapshot. Then you could replace the volume on your instance with the new volume (you will probably have to resize the partition to take advantage of all the space).