How do you pass an environment variable to Solr running inside Docker when the environment variable only exists inside the container?

I need to do a dataimport from a PostgreSQL container running inside docker to a Solr server also running inside of Docker.
In my docker run command I specify the --link option which creates the environment variable $POSTGRESQL_PORT_5432_TCP_ADDR inside the solr docker container, and I need to pass this into Solr to use in my solrconfig.xml file.
I've heard that this is possible by passing JVM environment variables to the Solr startup command, but docker run starts Solr automatically. The only workaround I've found is doing something like:
docker run --name solr -d -p 8983:8983 --link postgresql --volumes-from solr_cores makuk66/docker-solr /bin/true
Starting the container with bin/true so it does nothing, and then
docker exec -it solr /bin/bash
to get into the container, finally running the solr startup command myself with the flag
-Dsolr.database.ip=$POSTGRESQL_PORT_5432_TCP_ADDR
However this is an involved manual process, and I'm wondering if there's a better way.

Looking on the page Taking Solr to Production you see
The bin/solr script simply passes options starting with -D on to the JVM during startup. For running in production, we recommend setting these properties in the SOLR_OPTS variable defined in the include file. Keeping with our soft-commit example, in /var/solr/solr.in.sh, you would do:
SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=10000"
So all you need to do is edit the SOLR_OPTS environment variable in solr.in.sh.
It's a bit different for Docker because you don't have direct access to solr.in.sh, but after some trial and error, it was as easy as adding this to my Dockerfile:
RUN echo 'SOLR_OPTS="$SOLR_OPTS -Dsolr.database.ip=$POSTGRESQL_PORT_5432_TCP_ADDR"' >> /opt/solr/bin/solr.in.sh
Then you can use it in the solrconfig.xml file as
${solr.database.ip}
An important thing to note is that you can name the JVM system property whatever you want, as long as you make sure not to overwrite anything important. I could have called it
-Dsolr.potato
if I wanted to.
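The reason the single quotes in that RUN line matter can be sketched outside Docker. This is a minimal illustration, assuming bin/solr sources the include file at startup; the temp file stands in for /opt/solr/bin/solr.in.sh and the IP address is made up.

```shell
# Stand-in for /opt/solr/bin/solr.in.sh, so the sketch runs without Solr or Docker.
include_file=$(mktemp)

# Build time: single quotes keep both variable references literal in the file.
echo 'SOLR_OPTS="$SOLR_OPTS -Dsolr.database.ip=$POSTGRESQL_PORT_5432_TCP_ADDR"' >> "$include_file"

# Run time: --link would provide this variable; the address here is invented.
POSTGRESQL_PORT_5432_TCP_ADDR=172.17.0.2
SOLR_OPTS="-Dsolr.autoSoftCommit.maxTime=10000"

# Sourcing the include file is the moment the variables are actually expanded.
. "$include_file"
echo "$SOLR_OPTS"
rm -f "$include_file"
```

With double quotes in the echo, the variable would instead be expanded during docker build, when the link does not exist yet and the value is empty.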

For some reason the solr.in.cmd file looks exactly the same as solr.in.sh, which confused me about how to set variables there. In Windows containers, the equivalent command in a Dockerfile would be:
RUN Add-Content C:\solr\bin\solr.in.cmd 'set SOLR_OPTS=%SOLR_OPTS% -Dsolr.database.ip=%POSTGRESQL_PORT_5432_TCP_ADDR%'

Related

I can't enter into the mongo db cli in my docker project

I am learning docker, and in my project I can't enter the mongo db with this command:
mongo -u "username" -p "mypassword"
It throws me this error:
bash: mongo: command not found
I am not sure what the issue is. I have installed the community edition of mongo db and I also tried different terminals, but I can't enter the db.
Any suggestions?
Thanks in advance!
I assume you did the following: you created docker-compose.yml as you wrote before and ran docker compose up. This starts a container on your system with mongodb installed in it. It does not affect your "normal" system outside this container. (You can think of it as a kind of virtual machine, though it is not really the same.) So if you did not also install mongodb on your local host system, the error you encounter is to be expected.
If you want to access the mongodb running within the container, you have two possibilities:
1. From outside the container (which is the more common use case)
You will have to install mongo on your regular PC (or anywhere you want to access your db from) as well. Then you would issue mongo 127.0.0.1:3000. The 3000 matters because, according to your docker-compose.yml, mongo is listening on port 3000. Note that you might have to adapt your network configuration before this works, especially from other PCs, where 127.0.0.1 won't be correct.
2. From within the container
Once your container is started, you can also execute a command inside it, like this: docker exec -it ${container_id} /bin/bash. You'll have to find out the container's ID beforehand, using something like docker-compose ps -q. This will start a bash shell inside the container and "connect" you to it. (If there's no /bin/bash installed in the container, this will not work. Try e.g. /bin/sh instead.) Now your terminal is inside the container and can only use the commands present there. So, to get back to your local PC, don't forget to issue exit.
Conclusion
IMHO, the crucial point is that the physical PC you are working on and the container running inside it are almost completely separate systems, connected only by the docker daemon and some virtual network access. You'll have to keep that in mind and decide what you want to run inside the container and what to run outside, on the host.
Here is a little further reference that might help you. And this answer is about how to find out your container ID in an automated way. (Assuming that you are running just that one container!)
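The choice between the two options above can be sketched as a small shell helper. This is only an illustration: have and mongo_entry_command are hypothetical names, my_container_id is a placeholder for the real ID, and port 3000 is the one taken from the docker-compose.yml discussed above.

```shell
# Report whether a command is available on this machine's PATH.
have() { command -v "$1" >/dev/null 2>&1; }

# Print the command that gets you to the database, depending on whether
# the client binary is installed on the host.
mongo_entry_command() {
    client=$1
    container_id=$2
    if have "$client"; then
        # Option 1: connect from outside the container.
        echo "$client 127.0.0.1:3000"
    else
        # Option 2: open a shell inside the container first.
        echo "docker exec -it $container_id /bin/bash"
    fi
}

mongo_entry_command mongo my_container_id
```

A "command not found" error on the host corresponds to the second branch: the client exists only inside the container.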

Docker Postgres data host volume mapping

I'm trying to containerize a PostgreSQL server with Docker, and this container will have many other applications as well. The PostgreSQL server data should be mapped to a host volume so that we won't lose the data when the container is stopped. Also, the next time we start the container, the same directory can be mapped again and postgres can use the old data. Below is the Dockerfile. Note that I'm using Ubuntu 22.04 on the host.
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND=noninteractive
# Refresh the package lists first; the base image ships without them.
RUN apt-get update && apt-get install -y postgresql
ENTRYPOINT ["tail", "-f", "/dev/null"]
Docker image is built using the command
docker build -t pg_test .
and the container is run using the command
docker run --name test -v /home/me/data:/var/lib/postgresql/14/main pg_test
'/home/me/data' is the host directory which is empty where I want to map the postgres server data. '/var/lib/postgresql/14/main' is the directory inside the docker container where the postgres is supposed to store the data.
Once the docker container starts, I enter the docker container using the command
docker exec -it test bash
and once I'm inside, I'm trying to start the PostgreSQL service. But PostgreSQL fails to start as there is no data in '/var/lib/postgresql/14/main' directory. I understand that since I have mapped an empty host directory to '/var/lib/postgresql/14/main' directory, postgres doesn't have the files required to start.
I understand that I'm doing it the wrong way, but I couldn't find a way around it. Can anyone please help me to do this the right way, if there is one?
Any help would be appreciated.
You should use the postgres docker image. It will set up the db for you when you start the container. You can find instructions at https://hub.docker.com/_/postgres
If you must use a custom image, you will need to initialize the db yourself, usually by running initdb or whatever your system provides.
But really you should use the appropriate docker image, and if you need more services, start them in their own containers and connect them to the postgres one.
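If you do stay with a custom image, the initialization step mentioned above could be guarded roughly like this. It is a minimal sketch, assuming an initialized data directory is recognizable by a non-empty PG_VERSION file; needs_initdb is a hypothetical helper, and the default path is the one from the question.

```shell
# True when the directory holds no initialized cluster (no non-empty PG_VERSION file).
needs_initdb() {
    [ ! -s "$1/PG_VERSION" ]
}

# Data directory from the question; override with PGDATA if yours differs.
datadir=${PGDATA:-/var/lib/postgresql/14/main}

if needs_initdb "$datadir"; then
    echo "no cluster in $datadir: run initdb -D $datadir as the postgres user first"
else
    echo "existing cluster found in $datadir"
fi
```

On the first start with an empty host directory the first branch applies; on later starts the mapped directory already contains the cluster and postgres can be started directly.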

In ansible can I run docker_compose with parameters?

Is there a way to run docker_compose with parameters?
Something like the following:
docker-compose run --rm app_service python init_script
Now I use shell module for this.
Can I use the docker_compose module instead?
The documentation for the docker_compose module suggests that it can only do the equivalents of docker-compose up, down, and build. None of the other Ansible Docker modules connect to Compose at all.
You could use docker_container as an equivalent to a separate docker run command, but this has the same drawbacks as trying to docker run a separate container in a mostly-Compose environment (you don't get networks or volumes or dependencies declared in the docker-compose.yml file).
Falling back to shell is probably your best option here.

How to Start Cron / Crond inside the Official Postgres Container

crond is not running by default in the official postgres alpine image. How could I define my Dockerfile to make sure that the daemon runs in the background? I want it to run by default, if possible even when the container gets restarted.
I tried to add CMD ["/usr/sbin/crond"] to my Dockerfile but I didn't succeed. Any thoughts how to run this in combination with postgres?
Update
I have added the answer of tianon:
[...]
If you must run crond inside a container, I'd recommend instead using
a separate container which runs nothing but crond (and thus Docker can
both track its lifecycle, and restart it when/if it fails, the machine
restarts, etc). You should be able to connect to the PostgreSQL
instance from a second container, but if absolutely necessary, one
could use things like --network container:some-postgres in order to
join the network namespace of the database container directly.
pg_cron must be added to shared_preload_libraries. Per the docs:
# add to postgresql.conf:
shared_preload_libraries = 'pg_cron'
and you must then restart PostgreSQL.

How am I supposed to use a Postgresql docker image/container?

I'm new to docker. I'm still trying to wrap my head around all this.
I'm building a node application (REST api), using Postgresql to store my data.
I've spent a few days learning about docker, but I'm not sure whether I'm doing things the way I'm supposed to.
So here are my questions:
I'm using the official docker postgres 9.5 image as base to build my own (my Dockerfile only adds plpython on top of it, and installs a custom python module for use within plpython stored procedures). I created my container as suggested by the postgres image docs:
docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres
After I stop the container I cannot run it again using the above command, because the container already exists. So I start it using docker start instead of docker run. Is this the normal way to do things? I will generally use docker run the first time and docker start every other time?
Persistence: I created a database and populated it on the running container. I did this using pgadmin3 to connect. I can stop and start the container and the data is persisted, although I'm not sure why or how this is happening. I can see in the Dockerfile of the official postgres image that a volume is created (VOLUME /var/lib/postgresql/data), but I'm not sure that's the reason persistence is working. Could you please briefly explain (or point to an explanation) about how this all works?
Architecture: from what I read, it seems that the most appropriate architecture for this kind of app would be to run 3 separate containers. One for the database, one for persisting the database data, and one for the node app. Is this a good way to do it? How does using a data container improve things? AFAIK my current setup is working ok without one.
Is there anything else I should pay attention to?
Thanks
EDIT: adding to my confusion, I just ran a new container from the debian official image (no Dockerfile, just docker run -i -t -d --name debtest debian /bin/bash). With the container running in the background, I attached to it using docker attach debtest and then proceeded to apt-get install postgresql. Once installed I ran (still from within the container) psql and created a table in the default postgres database, and populated it with 1 record. Then I exited the shell and the container stopped automatically since the shell wasn't running anymore. I started the container again using docker start debtest, then attached to it and finally ran psql again. I found everything is persisted since the first run. Postgresql is installed, my table is there, and of course the record I inserted is there too. I'm really confused as to why I need a VOLUME to persist data, since this quick test didn't use one and everything appears to work just fine. Am I missing something here?
Thanks again
1.
docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres
After I stop the container I cannot run it again using the above command, because the container already exists.
Correct. You named it (--name some-postgres), so before starting a new one, the old one has to be deleted, e.g. with docker rm -f some-postgres
So I start it using
docker start instead of docker run. Is this the normal way to do
things? I will generally use docker run the first time and docker
start every other time?
No, that is not the usual Docker workflow. Containers are normally meant to be ephemeral, that is, easily thrown away and started anew.
Persistence: ... I can stop and start
the container and the data is persisted, although I'm not sure why or
how is this happening. ...
That's because you are reusing the same container. Remove the container and the data is gone.
Architecture: from what I read, it seems that the most appropriate
architecture for this kind of app would be to run 3 separate
containers. One for the database, one for persisting the database
data, and one for the node app. Is this a good way to do it? How does
using a data container improve things? AFAIK my current setup is
working ok without one.
Yes, having separate containers for separate concerns is a good way to go. This comes in handy in many cases, for example when you need to upgrade the postgres base image without losing your data (that is where the data container in particular starts to play its role).
Is there anything else I should pay attention to?
Once you are acquainted with the docker basics, you may take a look at Docker Compose or similar tools that will help you run multi-container applications more easily.
Short and simple:
What you get from the official postgres image is a ready-to-go postgres installation along with some gimmicks which can be configured through environment variables. With docker run you create a container. The container lifecycle commands are docker start/stop/restart/rm. Yes, this is the Docker way of things.
Everything inside a volume is persisted. Every container can have an arbitrary number of volumes. Volumes are directories either defined inside the Dockerfile, the parent Dockerfile or via the command docker run ... -v /yourdirectoryA -v /yourdirectoryB .... Everything outside volumes is lost with docker rm. With docker rm -v, the container's anonymous volumes are removed as well.
It's easier to show than to explain. See this readme with Docker commands on Github, read how I use the official PostgreSQL image for Jira and also add NGINX to the mix: Jira with Docker PostgreSQL. Also a data container is a cheap trick to being able to remove, rebuild and renew the container without having to move the persisted data.
Congratulations, you have managed to grasp the basics! Keep it up! Try docker-compose to better manage those nasty docker run ... commands and to manage multiple containers and data containers.
Note: You need a blocking thread in order to keep a container running! Either this command must be explicitly set inside the Dockerfile, see CMD, or given at the end of the docker run -d ... /usr/bin/myexamplecommand command. If your command is NON blocking, e.g. /bin/bash, then the container will always stop immediately after executing the command.
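The container lifecycle and volume behaviour described in this answer can be sketched as a short shell sequence. This is a dry run by default: commands are only printed unless DRY_RUN=0 is set. The run wrapper, the lifecycle function and the named volume pgdata are hypothetical names for illustration, and the mount path is the VOLUME path quoted in the question for the 9.5 image.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "0" ]; then "$@"; else echo "+ $*"; fi
}

lifecycle() {
    # First start: creates the container and attaches the named volume.
    run docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword \
        -v pgdata:/var/lib/postgresql/data -d postgres:9.5
    run docker stop some-postgres
    # Same container again, so its filesystem and data are still there.
    run docker start some-postgres
    # Container removed; the named volume is kept.
    run docker rm -f some-postgres
    # New container, old data, because the same volume is mounted again.
    run docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword \
        -v pgdata:/var/lib/postgresql/data -d postgres:9.5
}

lifecycle
```

This also explains the debian test in the question: docker start reuses the same container, so nothing is lost until docker rm is issued.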