Init container in OpenShift - Kubernetes

I am new to OpenShift. I have gone through the OpenShift website for more details, but I wanted to know if anyone has deployed an init container.
I want to use an init container to take a dump from the database and restore it to a new version of the database.
We are using a PostgreSQL database.
Any help would be appreciated.
Thanks!

I want to use an init container to take a dump from the database and restore it to a new version of the database.
I would say you should rather use an operator instead of an init container. Take a look at the init container design considerations below.
There are some considerations you should take into account when you create init containers:
They always get executed before other containers in the Pod, so they shouldn't contain complex logic that takes a long time to complete. Startup scripts are typically small and concise. If you find that you're adding too much logic to init containers, you should consider moving part of it to the application container itself.
Init containers are started and executed in sequence. An init container is not invoked unless its predecessor has completed successfully. Hence, if the startup task is very long, you may consider breaking it into a number of steps, each handled by an init container, so that you know which steps fail.
If any of the init containers fail, the whole Pod is restarted (unless you set restartPolicy to Never). Restarting the Pod means re-executing all the containers again, including any init containers. So you may need to ensure that the startup logic tolerates being executed multiple times without causing duplication. For example, if a DB migration is already done, executing the migration command again should just be ignored.
An init container is a good candidate for delaying the application initialization until one or more dependencies are available. For example, if your application depends on an API that imposes an API request-rate limit, you may need to wait for a certain time period to be able to receive responses from that API. Implementing this logic in the application container may be complex, as it needs to be combined with health and readiness probes. A much simpler way would be creating an init container that waits until the API is ready before it exits successfully. The application container would start only after the init container has done its job successfully (see the sketch after these considerations).
Init containers cannot use health and readiness probes as application containers do. The reason is that they are meant to start and exit successfully, much like how Jobs and CronJobs behave.
All containers in the same Pod share the same volumes and network. You can make use of this feature to share data between the application and its init containers.
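As a minimal illustration of the "wait for a dependency" consideration above, an init container can simply poll the dependency until it answers. This is only a sketch; the image, service name, URL and timing are illustrative, not taken from the question:
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  initContainers:
    # Blocks Pod startup until the (hypothetical) my-api service answers its health endpoint
    - name: wait-for-api
      image: busybox
      command: ["sh", "-c", "until wget -q --spider http://my-api:8080/healthz; do echo waiting for my-api; sleep 5; done"]
  containers:
    - name: app
      image: my-app:latest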
The only thing I found about using an init container for loading dumped data is this example, which does it with MySQL; maybe it can guide you on how to do the same with PostgreSQL.
In this scenario, we are serving a MySQL database. This database is used for testing an application. It doesn't have to contain real data, but it must be seeded with enough data so that we can test the application's query speed. We use an init container to handle downloading the SQL dump file and restoring it to the database, which is hosted in another container.
The definition file may look like this:
apiVersion: v1
kind: Pod
metadata:
  name: mydb
  labels:
    app: db
spec:
  initContainers:
    - name: fetch
      image: mwendler/wget
      command: ["wget", "--no-check-certificate", "https://sample-videos.com/sql/Sample-SQL-File-1000rows.sql", "-O", "/docker-entrypoint-initdb.d/dump.sql"]
      volumeMounts:
        - mountPath: /docker-entrypoint-initdb.d
          name: dump
  containers:
    - name: mysql
      image: mysql
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: "example"
      volumeMounts:
        - mountPath: /docker-entrypoint-initdb.d
          name: dump
  volumes:
    - emptyDir: {}
      name: dump
The above definition creates a Pod that hosts two containers: the init container and the application one. Let’s have a look at the interesting aspects of this definition:
The init container is responsible for downloading the SQL file that contains the database dump. We use the mwendler/wget image because we only need the wget command.
The destination directory for the downloaded SQL is the directory used by the MySQL image to execute SQL files (/docker-entrypoint-initdb.d). This behavior is built into the MySQL image that we use in the application container.
The init container mounts /docker-entrypoint-initdb.d to an emptyDir volume. Because both containers are hosted on the same Pod, they share the same volume. So, the database container has access to the SQL file placed on the emptyDir volume.
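A hedged sketch of the same idea for PostgreSQL: the official postgres image (like the MySQL one) executes *.sql files placed in /docker-entrypoint-initdb.d, but only on first initialization, i.e. when the data directory is still empty. The dump URL, password and image tag below are placeholders:
apiVersion: v1
kind: Pod
metadata:
  name: mypg
  labels:
    app: db
spec:
  initContainers:
    - name: fetch
      image: busybox
      # download the dump into the directory scanned by the postgres entrypoint
      command: ["wget", "-O", "/docker-entrypoint-initdb.d/dump.sql", "https://example.com/dump.sql"]
      volumeMounts:
        - mountPath: /docker-entrypoint-initdb.d
          name: dump
  containers:
    - name: postgres
      image: postgres:13
      env:
        - name: POSTGRES_PASSWORD
          value: "example"
      volumeMounts:
        - mountPath: /docker-entrypoint-initdb.d
          name: dump
  volumes:
    - name: dump
      emptyDir: {}
For the upgrade scenario in the question, the dump itself would typically be produced beforehand with pg_dump against the old database, and the init container would then fetch it from wherever you stored it.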
Additionally, as a best practice, I would suggest taking a look at Kubernetes operators; as far as I know, that is the recommended way to manage databases in Kubernetes.
If you're not familiar with operators, I would suggest starting with the Kubernetes documentation and this short video on YouTube.
Operators are a method of packaging Kubernetes applications that enable you to more easily manage and monitor stateful applications. There are many operators already available, such as the
Crunchy Data PostgreSQL Operator
Postgres Operator
both of which automate and simplify deploying and managing open source PostgreSQL clusters on Kubernetes by providing the essential features you need to keep your PostgreSQL clusters up and running.

Related

How to mount a single file from the local Kubernetes cluster into the pods

I set up a local Kubernetes cluster using Kind, and then I run Apache-Airflow on it using Helm.
To actually create the pods and run Airflow, I use the command:
helm upgrade -f k8s/values.yaml airflow bitnami/airflow
which uses the chart airflow from the bitnami/airflow repo, and "feeds" it with the configuration of values.yaml.
The file values.yaml looks something like:
web:
  extraVolumeMounts:
    - name: functions
      mountPath: /dir/functions/
  extraVolumes:
    - name: functions
      hostPath:
        path: /dir/functions/
        type: Directory
where web is one component of Airflow (and one of the pods on my setup), and the directory /dir/functions/ is successfully mapped from the cluster inside the pod. However, I fail to do the same for a single, specific file, instead of a whole directory.
Does anyone know the syntax for that? Or have an idea for an alternative way to map the file into the pod (its whole directory is successfully mapped into the cluster)?
There is a File type for hostPath which should behave as you desire, as stated in the docs:
File: A file must exist at the given path
which you can then use with the precise file path in mountPath. Example:
web:
  extraVolumeMounts:
    - name: singlefile
      mountPath: /path/to/mount/the/file.txt
  extraVolumes:
    - name: singlefile
      hostPath:
        path: /path/on/the/host/to/the/file.txt
        type: File
Or if it's not a problem, you could mount the whole directory containing it at the expected path.
With this said, I want to point out that using hostPath is almost never a good idea.
If you have a cluster with more than one node, declaring that your Pod mounts a hostPath does not restrict it to run on a specific host (even though you can enforce that with nodeSelectors and so on), which means that if the Pod starts on a different node, it may behave differently, not finding the directory and/or file it was expecting.
But even if you restrict the application to run on a specific node, you need to be okay with the idea that, if that node becomes unavailable, the Pod will not be rescheduled somewhere else on its own, meaning you will need manual intervention to recover from a single-node failure (unless the application is multi-instance and can tolerate one instance going down).
To conclude:
if you want to mount a path on a particular host, for whatever reason, I would go for local volumes, or at least use hostPath and restrict the Pod to run on the specific node it needs to run on.
if you want to mount small, textual files, you could consider mounting them from ConfigMaps (see the sketch after this list).
if you want to configure an application, providing a set of files at a certain path when the app starts, you could go for an init container which prepares files for the main container in an emptyDir volume.
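For the ConfigMap option above, a sketch in the context of this chart might look like the following, assuming the file content lives in a ConfigMap named functions-file (names and content are illustrative; note that a subPath mount does not pick up later ConfigMap updates):
apiVersion: v1
kind: ConfigMap
metadata:
  name: functions-file
data:
  my_function.py: |
    def hello():
        return "hi"
and then, in values.yaml:
web:
  extraVolumeMounts:
    - name: functions-file
      mountPath: /dir/functions/my_function.py
      subPath: my_function.py       # mount only this key as a single file
  extraVolumes:
    - name: functions-file
      configMap:
        name: functions-file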

What is the root password of postgresql-ha/helm?

I installed PostgreSQL in AWS EKS through Helm: https://bitnami.com/stack/postgresql-ha/helm
I need to perform some tasks in the deployment with root rights, but when I run
su -
it asks for a password that I don't know and don't know where to find, and I need it to access folders such as /opt/bitnami/postgresql/:
Error: Permission denied
How do I get the necessary rights, or what is the password?
Image attached: bitnami root error
I need [...] to place the .so libraries I need for postgresql in [...] /opt/bitnami/postgresql/lib
I'd consider this "extending" rather than "configuring" PostgreSQL; it's not a task you can do with a Helm chart alone. On a standalone server it's not something you could configure with only a text editor, for example, and while the Bitnami PostgreSQL-HA chart has a pretty wide swath of configuration options, none of them allow providing extra binary libraries.
The first step to doing this is to create a custom Docker image that includes the shared library. That can start FROM the Bitnami PostgreSQL image this chart uses:
ARG postgresql_tag=11.12.0-debian-10-r44
FROM bitnami/postgresql:${postgresql_tag}
# assumes the shared library is in the same directory as
# the Dockerfile
COPY whatever.so /opt/bitnami/postgresql/lib
# or RUN curl ..., or RUN apt-get, or ...
#
# You do not need EXPOSE, ENTRYPOINT, CMD, etc.
# These come from the base image
Build this image and push it to a Docker registry, the same way you do for your application code. (In a purely local context you might be able to docker build the image in minikube's context.)
When you deploy the chart, it has options to override the image it runs, so you can point it at your own custom image. Your Helm values could look like:
postgresqlImage:
  registry: registry.example.com:5000
  repository: infra/postgresql
  tag: 11.12.0-debian-10-r44
  # `docker run registry.example.com:5000/infra/postgresql:11.12.0-debian-10-r44`
and then you can provide this file via the helm install -f option when you deploy the chart.
You should almost never try to manually configure a Kubernetes pod by logging into it with kubectl exec. It is extremely routine to delete pods, and in many cases Kubernetes does this automatically (if the image tag in a Deployment or StatefulSet changes; if a HorizontalPodAutoscaler scales down; if a Node is taken offline); in these cases your manual changes will be lost. If there are multiple replicas of a pod (with an HA database setup there almost certainly will be) you also need to make identical changes in every replica.
Like they told you in the comments, you are using the wrong approach to the problem. Exec'ing into a container to make manual changes is (most of the time) pointless, since Pods (and the containers that are part of them) are ephemeral entities which will be lost whenever the Pod restarts.
Unless the path you are trying to interact with is backed by a persistent volume, all your changes will be lost as soon as the container is restarted.
Helm charts like the bitnami postgresql-ha chart expose several ways to refine / modify the default installation:
You could build a custom Docker image starting from the one used by default, adding the libraries and whatever else you need. This way the container will already be "ready" the way you want as soon as it starts.
You could add an additional init container to perform operations such as preparing files for the main container on emptyDir volumes, which can then be mounted at the expected path.
You could inject an entrypoint script which does what you want at startup, before calling the main entrypoint.
Check the chart's README, as it lists all the possibilities offered by the chart (such as how to override the image with your custom one, and more).

From docker-compose to Kubernetes as a development environment using minikube

Right now I use docker-compose for development. It is a great tool that comes in handy on simple projects with a maximum of 3-6 active services, but when it comes to 6-8 and more it becomes hard to manage.
So I've started to learn k8s on minikube and now I have a few questions:
How to make "two-way" binding for volumes? For example, if I have a folder named "my-frontend" and I want to sync a specific folder in a deployment, how do I "link" them using a PV and PVC?
Very often it comes in handy to make a service with a specific environment like node:12.0.0 and then use it as a command executor, like this: docker-compose run workspace npm install
How to achieve something like this using k8s?
How to make "two-way" binding for volumes? For example, if I have a folder named "my-frontend" and I want to sync a specific folder in a deployment, how do I "link" them using a PV and PVC?
You need to create a PersistentVolume which in your case will use a specific directory on the host; the official Kubernetes documentation has an example with this same use case.
Then create a PersistentVolumeClaim to request some space from this volume (also shown in the previous documentation), and mount the PVC on the pod/deployment where you need it:
volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: my-pvc
containers:
  - ...
    volumeMounts:
      - mountPath: "/mount/path/in/pod"
        name: my-volume
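For completeness, a minimal hostPath-backed PersistentVolume and matching claim might look like this (a sketch; names, size and storageClassName are illustrative, and on minikube the path refers to the filesystem inside the minikube VM):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  hostPath:
    path: /data/my-frontend
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi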
Very often it comes in handy to make a service with a specific environment like node:12.0.0 and then use it as a command executor, like this: docker-compose run workspace npm install
How to achieve something like this using k8s?
You need to use kubectl. It offers functionality very similar to the docker CLI; in particular, it supports a run command with very similar parameters. Alternatively, you can start your pod once and then run commands multiple times in the same instance by using kubectl exec.
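If you prefer a declarative sketch of the docker-compose run equivalent, a throwaway Pod with restartPolicy: Never plays the same role (image, paths and claim name below are illustrative assumptions):
apiVersion: v1
kind: Pod
metadata:
  name: workspace
spec:
  restartPolicy: Never              # run the command once and stop
  containers:
    - name: workspace
      image: node:12.0.0
      workingDir: /app
      command: ["npm", "install"]
      volumeMounts:
        - name: source
          mountPath: /app
  volumes:
    - name: source
      persistentVolumeClaim:
        claimName: my-frontend-pvc  # assumed claim holding the project source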

Where to store files in GKE container?

I'm having trouble understanding where to store files in a GKE container. I've seen the following documentation of the file system layout:
https://cloud.google.com/kubernetes-engine/docs/concepts/node-images#file_system_layout
But then there are also Dockerfile examples on the web that copy executable files to other paths not listed in the layout, such as /usr or /go. One of these examples is here:
https://github.com/GoogleCloudPlatform/kubernetes-engine-samples/blob/master/hello-app/Dockerfile
Another question is: if I have runtime code that needs to download certain configuration information after the container starts, can I write the configuration file to the same directory as my executable, or do I have to choose /etc or /tmp?
And finally, the layout documentation states that /home and /var store data for the lifetime of the boot disk. What does that mean? How does that compare to the lifetime of the pod or the node?
When you want to store something in a container, you can store it either ephemerally or permanently.
To store it ephemerally, just choose a path such as /tmp, /var or /opt (this depends on the container setup as well); once the container is restarted, the information you have is the same as at the moment the container was created, for instance your binary files and initial config files.
To store it permanently, you have to mount a volume: a container path is linked with external storage, so if your container is restarted the volume will be mounted again once the container is ready and you are not going to lose anything.
In Kubernetes this is called Persistent Volumes, and you can leverage this even if you are in another cloud provider.
Steps to use it:
Define a path where you would mount the volume in your source code, for example /myfiles/private
Create a storage class in your GKE cluster https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/ssd-pd
Create a Persistent Volume Claim in your GKE cluster https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/ssd-pd
Relate this claim to your Kubernetes deployment
Example:
Link the volume with your container:
volumeMounts:
  - mountPath: /myfiles/private
    name: any-name-you-want
Relate the persistent volume claim with your deployment:
volumes:
  - name: any-name-you-want
    persistentVolumeClaim:
      claimName: my-claim-name
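A sketch of steps 2 and 3 above, i.e. an SSD-backed StorageClass and the PersistentVolumeClaim referenced by the deployment (names and size are illustrative):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd-storage
provisioner: kubernetes.io/gce-pd   # GCE persistent disk provisioner
parameters:
  type: pd-ssd
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim-name
spec:
  storageClassName: ssd-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi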
This is really up to you. By default most base images will leave /tmp writable as normal, but anything written inside the container will be gone if/when the container restarts for any reason. For something like config data that might be fine; for a database, probably less so. To get more stable storage you need to use a volume. The exact type to use depends on your environment and how long the data should live. An emptyDir volume lives only as long as the pod but can be shared between containers in the same pod. Beyond that you would probably use a PersistentVolumeClaim to dynamically provision a new Google Cloud disk, which will last until the claim is deleted (or forever, depending on your reclaim policy).

Does Kubernetes provide a colocated Job container?

I wonder how one would implement a colocated auxiliary container in a Pod within a Deployment which does not provide a service but rather a job/batch workload.
The background of my question is that I want to deploy a scalable service where each instance needs configuration after it starts. This configuration is done via an HTTP POST to its local, colocated service instance. I've implemented an auxiliary container for this in order to benefit from colocation, so the auxiliary container always knows which instance needs to be configured.
The problem is that the restartPolicy needs to be defined at the Pod level. I am looking for something like a restart policy of Always for the service and a different restart policy of OnFailure for the configuration job.
I know that k8s provides the Job resource for such workloads. But is there an option to colocate those jobs with Pods?
Furthermore, I've stumbled across the so-called init containers, which can be defined via annotations. But these suffer from the drawback that k8s ensures the actual Pod is only started after the init container has run, so for my scenario they seem unsuitable.
As I understand it, you need your service running in order to configure it.
Your solution is workable and you can set restartPolicy: Always; you just need a way to tell your one-off configuration container that it has already run. You could create and attach an emptyDir volume to your configuration container, create a file on it to mark your configuration as successful, and check for this file from your process. After the initialization you enter a sleep loop (see the sketch below). The downside is that some resources will be taken up by that container too.
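A sketch of that sidecar approach, assuming the service exposes a hypothetical /configure endpoint on localhost; image names, port and paths are illustrative:
spec:
  restartPolicy: Always
  containers:
    - name: service
      image: my-service:latest
    - name: configure
      image: curlimages/curl
      command:
        - sh
        - -c
        - |
          # configure once, leave a stamp on the shared volume, then idle
          if [ ! -f /mnt/guard-vol/stamp ]; then
            until curl -sf -X POST http://localhost:8080/configure; do sleep 2; done
            touch /mnt/guard-vol/stamp
          fi
          while true; do sleep 3600; done
      volumeMounts:
        - name: guard-vol
          mountPath: /mnt/guard-vol
  volumes:
    - name: guard-vol
      emptyDir: {}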
Or you can just add an extra process in the same container and do the configuration (maybe with the file mentioned above as a guard to avoid configuring twice). So write a simple shell script like this and run it instead of your main process:
#!/bin/sh
# Run the one-off configuration in the background, guarded by a stamp file
# on a mounted volume so it only ever runs once.
(
  [ -f /mnt/guard-vol/stamp ] && exit 0
  /opt/my-config-process parameters && touch /mnt/guard-vol/stamp
) &
# Hand over to the main process, forwarding all arguments.
exec /opt/my-main-process "$@"
Alternatively, you could implement a separate pod that queries the Kubernetes API for pods of your service with the label configured=false, configures them, and replaces the label with configured=true via the API. You should also modify your Service to select only configured=true pods.
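For that label-based variant, the Service would only select pods that have already been flagged as configured (label keys, values and ports below are illustrative):
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-service
    configured: "true"   # pods only receive traffic after the configurator flips this label
  ports:
    - port: 80
      targetPort: 8080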