docker-compose mongo needs read permissions for all - mongodb

The mongo image has this nice mount option to initialize a database. You can mount ./initdb.d/ and *.js will be executed.
However, my project files and directories are 600 and 700 respectively, owned by redsandro:redsandro. The mongo image cannot read them.
It doesn't matter if I add group read + execute (dir) (i.e. 640 and 750). Only when I add read permissions for all on the file, and read + execute permissions for all on the directory (i.e. 644 and 755, let's call that "plan C"), will the mongo image execute the script.
Is there a way I can keep my files private on my machine (e.g. no permissions for all AKA umask 007) and still have the mongo image read them?
Update: I'm looking for a way to do this with docker-compose options, variables, environment etc. Not with building a custom image. E.g. some images that have similar problems allow me to simply set user: 1000 (uid for local user). This doesn't work with the mongo image.

You can copy files instead of mounting the directory. E.g. your Dockerfile can be something like that:
FROM mongo
COPY --chown=mongodb:mongodb /host/path/to/scripts/* /docker-entrypoint-initdb.d/
It means each time you change the scripts you will need to rebuild the image, not just re-create the container. It's a tiny layer on top of the base image so it shouldn't take much time to build the image.

Related

dockerfile for backend and a seperate one for dbms because compose wont let me copy sql file into dbms container?

I have a dockerfile for frontend, one for backend, and one for the database.
In the backend portion of the project, I have a dockerfile and a docker-compose.yml file.
the dockerfile is great for the backend because it configures the backend, copies and sets up the information etc. I like it alot.
The issue i have come to though is that if i can easily create a dockerfile for the dbms, but it requires me to put it in a different directory, where i was hoping to just define it in the same directory as the backend, and because of the fact the backend and the dbms is so tightly coupled, i figured this is where docker-compose would go.
My issue I ran into is that in a compose file, I cant do a COPY into the dbms container. I would just have to create another dockerfile to set that up. I was thinking that would work.
When looking on github, there was a big enhancement thread about it, but the closest people would get is just creating volume relationship, which fails to do what I want.
Ideally, All i want to be able to do is to stand up a postgres dbms in a fashion such that i could conduct load balancing on it later down the line with 1 write, 5 read or something, and have its initial db defined in my one sql file.
Am I missing something? I thought i was going about it correctly, but maybe I need to create a whole new directory with a dockerfile for the dbms.
Thoughts on how I should accomplish this?
Right now i was doing something like:
version: '2.0'
services:
backend:
build: .
ports:
- "8080:8080"
database:
image: "postgres:10"
environment:
POSTGRES_USER: "test"
POSTGRES_PASSWORD: "password"
POSTGRES_DB: "foo"
# I shouldnt have volumes as it would copy the entire folder and its contents to db.
volumes:
- ./:/var/lib/postgresql/data
To copy things with docker there an infinite set of possibilities.
At image build time:
use COPY or ADD instructions
use shell commands including cp,ssh,wget and many others.
From the docker command line:
use docker cp to copy from/to hosts and containers
use docker exec to run arbitrary shell commands including cp, ssh and many others...
In docker-compose / kubernetes (or through command line):
use volume to share data between containers
volume can be local or distant file systems (network disk for example)
potentially combine that with shell commands for example to perform backups
Still how you should do it dependy heavily of the use case.
If the data you copy is linked to the code and versionned (in the git repo...) then treat as it was code and build the image with it thanks to the Dockerfile. This is for me a best practice.
If the data is a configuration dependrnt of the environement (like test vs prod, farm 1 vs farm 2), then go for docker config/secret + ENV variables.
If the data is dynamic and generated at production time (like a DB that is filled with user data as the app is used), use persistant volumes and be sure you understand well the impact of container failure for your data.
For a database in a test system it can make sense to relauch the DB from a backup dump, a read only persistant volume or much simpler backup the whole container at a known state (with docker commit).

File ownership and permissions in Singularity containers

When I run singularity exec foo.simg whoami I get my own username from the host, unlike in Docker where I would get root or the user specified by the container.
If I look at /etc/passwd inside this Singularity container, an entry has been added to /etc/passwd for my host user ID.
How can I make a portable Singularity container if I don't know the user ID that programs will be run as?
I have converted a Docker container to a Singularity image, but it expects to run as a particular user ID it defines, and several directories have been chown'd to that user. When I run it under Singularity, my host user does not have access to those directories.
It would be a hack but I could modify the image to chmod 777 all of those directories. Is there a better way to make this image work on Singularity as any user?
(I'm running Singularity 2.5.2.)
There is actually a better approach than just chmod 777, which is to create a "vanilla" folder with your application data/conf in the image, and then copy it over to a target directory within the container, at runtime.
Since the copy will be carried out by the user actually running the container, you will not have any permission issues when working within the target directory.
You can have a look at what I done here to create a portable remote desktop service, for example: https://github.com/sarusso/Containers/blob/c30bd32/MinimalMetaDesktop/files/entrypoint.sh
This approach is compatible with both Docker and Singularity, but it depends on your use-case if it is a viable solution or not. Most notably, it requires you to run the Singularity container with --writable-tmpfs.
As a general comment, keep in mind that even if Singularity is very powerful, it behaves more as an environment than a container engine. You can make it work more container-like using some specific options (in particular --writable-tmpfs --containall --cleanenv --pid), but it will still have limitations (variable usernames and user ids will not go away).
First, upgrade to the v3 of Singularity if at all possible (and/or bug your cluster admins to do it). The v2 is no longer supported and several versions <2.6.1 have security issues.
Singularity is actually mounting the host system's /etc/passwd into the container so that it can be run by any arbitrary user. Unfortunately, this also effectively clobbers any users that may have been created by a Dockerfile. The solution is as you thought, to chmod any files and directories to be readable by all. chmod -R o+rX /path/to/base/dir in a %post step is simplest.
Since the final image is read-only, allowing write permission doesn't do anything and it's useful to get into the mindset about only writing to files/directories that have been mounted to the image.

Why does my postgres docker image not contain any data?

I'd like to create a docker image with data.
My attempt is as follows:
FROM postgres
COPY dump_testdb /image
RUN pg_restore /image
RUN rm -rf /image
Then I run docker build -t testdb . and docker run -d testdb.
When I connect to the container I don't see the restored db. How do I get an image with the restored data?
COPY the dump file, with a .sql extension, into /docker-entrypoint-initdb.d/. Do not try to RUN anything. The postgres image will run everything in that directory the first time a container is started on a particular data directory.
You generally can’t RUN commands that interact with the database in a Dockerfile because the database won’t be running at that point. (There is a script in the base image that goes through some complicated gymnastics to do the first-time setup.) In any case, because of the mechanics of Docker’s volume system, you can’t create an image that contains prepopulated database data; you have to use a mechanism like this to cause the image to restore a dump or otherwise set itself up at first start.

Moving postgres data folder on Ubuntu

I have a web application querying a Postgresql database (successfully) and I'm looking to move the data folder from location /var/lib/postgres/9.3/main to a customisable location.
Right now I'm prevented from even copying the folder due to permission errors, but I can't assign myself the permissions because that breaks the postgres server.
(I broke the server by running sudo chown <username> -R /var/lib/postgres/9.3/main - which worked as a command but stopped the postgres server from working)
I would simply create a new folder and change the location there, but I'll lose the current instance of my database if that was done.
How can I move the current folder to a new location, so that I can point to it in the .conf file? I need to explicitly move the folder, I can't create a new DB.
You can just copy or move the directory, including all subdirs and files
cp -rp or mv should be enough for this.
Postgres must not be running while you are messing with the files
The base of the data-drectory (PG_DATA) must be owned by postgres and have file mode 0700 . (when not: pg will refuse to start)
[the rest of the files must at least be readable/writeble by postgres]
the new location must also be known to the startup process (in /etc/init.d/ and (possibly) in the postgres.conf file within the data directory. (for the log file location)

Where are the db files? at /var/lib/mongodb I cant find any increase in size. I ran very big loop to create lakhs of objects

I am using UBUNTU, from /etc/mongod.conf, I found that /var/lib/mongdb is the path for data.
I have found some files like collectionname.0, .1, .ns in that directory. but when I run a very big loop(lakhs), I am able to get them back using mongo shell, but that mongodb directory size is not increasing, so there must be someother place where this data is being stored
What is that place?
There is no another place. As indicated by #itsbruce, in Ubuntu it's /var/lib/mongodb.
On a non-packaged installation (on Linux), i.e. without a /etc/mongodb.conf file, the default is /data/db/ (unless otherwise specified).
You can modify the switch "--dbpath" on the command line to designate a different directory for the mongod instance to store its data. Typical locations include: /srv/mongodb, /var/lib/mongodb or /opt/mongodb.
(Windows systems use the \data\db directory.) If you installed using a package management system.
I recommend using the db.collection.stats command as outlined here to monitor the size of your collection as you insert data. The link also explains what each field (in the output) means.
That is the correct data location for MongoDB on Ubuntu. MongoDB pre-allocates filespace. Are you sure you have generated more data than would fit into the initial pre-allocated files? Try blowing away any existing data files and restarting Mongo with the --noprealloc flag. Then add data.