Best practice for docker webserver, multiple layers or single layer? - webserver

I want to build a Docker web server that serves multiple websites with Nginx, PHP and MySQL.
Would it be better to run a separate Docker container for each component, i.e. one for MySQL, one for Nginx, one for PHP and another for my site data?
Or run multiple containers that each include all the services (MySQL, PHP, Nginx and site data) together, one for each web app?
Or just one with all the services, and another with site data?
My main concerns here are data backup, and using the hardware efficiently.

Normally we think of a container as a process: one process does one job.
So I recommend keeping each container as simple as you can and setting up a separate container for each service.

As BMW mentioned, simplifying containers is what Docker recommends. Containers should be lightweight, portable microservices. It is easy enough to use the Docker CLI's --link option to connect each service.
Tutum has an example of a separated LAMP stack. The web server runs in the same container as the PHP code, but that should be sufficient. I don't see why moving Nginx to a different container is necessary unless you intend to load balance.
Here is the example from Tutum:
https://github.com/tutumcloud/lamp

Related

Why kubernetes does not work directly with containers

Could somebody please explain (or point me to a detailed resource) why Kubernetes uses this wrapper (the Pod) to work with containers? Every resource I come across just quotes the same words: "it is the smallest unit in k8s". What I am looking for is the reason for it from an engineering perspective. I do understand that it provides a namespace for storage and networking for the containers inside, but the best practice is to keep a single container in a Pod anyway.
I used docker-compose a lot before I familiarized myself with k8s, and I have a hard time understanding the need for this additional layer (wrapper) around a pretty straightforward entity, the container.
The reason for this decision is simply that a Pod may contain more than one container, each doing different things.
First of all, a Pod may have an init container, which is responsible for performing some startup operations to ensure that the main container or containers work properly. I could have an init container load some configuration and prepare it for the main application, or do some basic operations such as restoring a backup.
I can basically inject a series of operations to execute before the main application starts, without rebuilding the main application's container image.
Second, even if the majority of applications are perfectly fine with only one container per Pod, there are several situations where more than one container in the same Pod may be useful.
One example is having the main application running with a sidecar container acting as a proxy in front of it, perhaps responsible for checking JWT tokens. Another example is a secondary application that extracts metrics from the main application.
Last, let me quote the Kubernetes documentation (https://kubernetes.io/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/):
The primary reason that Pods can have multiple containers is to support helper applications that assist a primary application. Typical examples of helper applications are data pullers, data pushers, and proxies. Helper and primary applications often need to communicate with each other. Typically this is done through a shared filesystem, as shown in this exercise, or through the loopback network interface, localhost. An example of this pattern is a web server along with a helper program that polls a Git repository for new updates.
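To make the pattern concrete, here is a minimal sketch in Python that builds a Pod manifest with one init container, one main container and one sidecar, and prints it as JSON (kubectl accepts JSON manifests as well as YAML). The image names example/app and example/auth-proxy, the ports, the file path and the function name build_pod_manifest are all assumptions made up for illustration; only the manifest field names come from the Kubernetes Pod API.

```python
import json


def build_pod_manifest(name: str) -> dict:
    """Return a Pod manifest with an init container, an app container and a sidecar."""
    # Hypothetical mount shared by the init container and the app container.
    shared_mount = {"name": "shared-data", "mountPath": "/data"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Scratch volume that lives as long as the Pod does.
            "volumes": [{"name": "shared-data", "emptyDir": {}}],
            # Init containers run to completion before any app container starts.
            "initContainers": [
                {
                    "name": "prepare-config",
                    "image": "busybox:1.36",
                    "command": ["sh", "-c", "echo 'greeting=hello' > /data/app.conf"],
                    "volumeMounts": [shared_mount],
                }
            ],
            "containers": [
                {
                    # Main application (placeholder image).
                    "name": "app",
                    "image": "example/app:1.0",
                    "ports": [{"containerPort": 8080}],
                    "volumeMounts": [shared_mount],
                },
                {
                    # Sidecar proxy; it can reach the app on localhost:8080
                    # because containers in a Pod share a network namespace.
                    "name": "auth-proxy",
                    "image": "example/auth-proxy:1.0",
                    "ports": [{"containerPort": 8443}],
                },
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_pod_manifest("web"), indent=2))
```

Saving the output to a file and passing it to kubectl apply -f would create the Pod, assuming the placeholder images were replaced with real ones.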
Update
As you said, init containers and multiple containers in the same Pod are not a must. All the functionality I listed can also be obtained in other ways, such as entrypoints, or two separate Pods communicating with each other instead of two containers in the same Pod.
There are several benefits to using those features though; let me quote the Kubernetes documentation once more (https://kubernetes.io/docs/concepts/workloads/pods/init-containers/):
Because init containers have separate images from app containers, they have some advantages for start-up related code:
Init containers can contain utilities or custom code for setup that are not present in an app image. For example, there is no need to make an image FROM another image just to use a tool like sed, awk, python, or dig during setup.
The application image builder and deployer roles can work independently without the need to jointly build a single app image.
Init containers can run with a different view of the filesystem than app containers in the same Pod. Consequently, they can be given access to Secrets that app containers cannot access.
Because init containers run to completion before any app containers start, init containers offer a mechanism to block or delay app container startup until a set of preconditions are met. Once preconditions are met, all of the app containers in a Pod can start in parallel.
Init containers can securely run utilities or custom code that would otherwise make an app container image less secure. By keeping unnecessary tools separate you can limit the attack surface of your app container image.
The same applies to multiple containers running in the same Pod: they can communicate safely with each other without exposing that communication to others on the cluster, because they keep it local.

What are the best practices for accessing a kubernetes pod with sftp or ssh?

I have deployed a wordpress pod on Kubernetes and I want to be able to use sftp or ssh to access it.
Containers are a bit different from full virtual machines. With containers you typically run only a single process: your app. Unless your app is an SSH daemon or an FTP server, it does not support the sftp or ssh protocols. It is common for apps in Kubernetes to use only HTTP.
That said, it is possible to run one-off commands in containers using kubectl exec; see Get a Shell to a Running Container.
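As a rough illustration, the Python helper below assembles the argument list for a kubectl exec call. The helper name kubectl_exec_argv and the pod name wordpress-0 are hypothetical; the flags --namespace, --stdin, --tty and --container are standard kubectl exec options.

```python
import shlex
from typing import List, Optional, Sequence


def kubectl_exec_argv(
    pod: str,
    command: Sequence[str],
    namespace: str = "default",
    container: Optional[str] = None,
    interactive: bool = False,
) -> List[str]:
    """Build the argv for running a command inside a pod with kubectl exec."""
    argv = ["kubectl", "exec", "--namespace", namespace]
    if interactive:
        # Needed for an interactive shell; omit for one-off commands.
        argv += ["--stdin", "--tty"]
    if container:
        # Only required when the pod has more than one container.
        argv += ["--container", container]
    argv.append(pod)
    # Everything after "--" is passed to the container, not to kubectl.
    argv.append("--")
    argv += list(command)
    return argv


if __name__ == "__main__":
    # Prints the shell-quoted command line without running it.
    print(shlex.join(kubectl_exec_argv("wordpress-0", ["/bin/sh"], interactive=True)))
```

The returned list can be handed to subprocess.run on a machine where kubectl is installed and configured for the cluster.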
So what are the best practices for managing the files of a web-server-type pod? You have to publish the files and their updates.
There are two common ways to do this:
Copy the files into a new container image with a Dockerfile (the image also contains the web server) and build it.
Upload the files to a bucket, e.g. AWS S3 or Google Cloud Storage, and let the server serve those files.

Does CloudFoundry support multiple containers per app?

A Kubernetes Pod and an AWS ECS Task Definition both support multiple different container images: each instance of the pod / task will run all images as containers together.
Does CloudFoundry support a similar concept to allow apps that consist of multiple, separate processes?
Actually, CloudFoundry has a community project for container orchestration tools based on Kubernetes, so that will accept pods the same way Kubernetes does.
You can read more about it here
CloudFoundry also has the CF Application Runtime, which is essentially their PaaS: it lets you deploy applications Heroku-style, and under the hood they run as 'containers'. It's not clear from the docs what type of containers they are. I presume you could find out more by reading the code, but that is not exposed to users, nor is it exposed as Pods.
tl;dr
No. You can only run a single container per application instance.
Longer Answer
Most of the answers quickly point you to PKS; however, Cloud Foundry itself is separate from that.
Cloud Foundry runs each application via Diego. Each application runs as a standalone container on a diego-cell. This is different from Kubernetes, where you think in terms of Pods, or groups of colocated containers.
Cloud Foundry allows you to run multiple instances of each container, but I believe this is different from what you are asking.
Workaround
You may not be able to run multiple containers, but you can run multiple processes. For an example of this, check out how CF-FaaS runs. It uses the CF-Space-Security processes in a collocated scheme.
Pivotal now provides PAS - Pivotal Application Service, which is the traditional PaaS.
As a developer, I cf push my archive, the platform creates the container, and the Diego orchestrator runs my application. And yes, I can run multiple instances of my app.
PKS - Pivotal Container Service (cool kids spell it with a 'K') is Pivotal's implementation of Kubernetes. It is CaaS - Container as a Service. As a developer, I create my own container (a Docker container), or a vendor provides me with one, and PKS runs the container in a Pod inside a PKS cluster.
The next one coming out from Pivotal, some time in the next 3 - 6 months, is PFS - Pivotal Function Service. It is Pivotal's implementation of Function as a Service. As a developer, I can create and deploy a function to PFS. I have to identify the triggers for this function, based on which PFS will spin up new instances of the function and, when done, destroy them.
Which one you use depends on your use case.
This deck is for the presentation at Dallas Cloud Native Meetup's last session. Parth did a great job simplifying and explaining the differences and how you choose. Hope you can access it. Take a look.

Running multiple applications on Kubernetes. How to create the structure?

This is more of a theoretical question. How do you guys structure the Kubernetes deployments/services/pods when running multiple applications?
Let's say I want to run 3 Wordpress websites on my servers. For this I need: Nginx, MySQL, PHP-FPM and the Wordpress code base.
Is it better to spin off separate pods/services for Nginx, MySQL and PHP-FPM that will serve all 3 Wordpress websites, and create 3 Wordpress pods/services for the 3 websites?
Or is it better to create a separate pod/service for each one of the websites, so that the grouping would be:
Pod1: Nginx, MySQL, PHP-FPM, Wordpress
Pod2: Nginx, MySQL, PHP-FPM, Wordpress
Pod3: Nginx, MySQL, PHP-FPM, Wordpress
With option 2 I would somehow need to route each website's traffic to its specific service/pod.
Kubernetes is extremely flexible, as you are discovering, and allows you to architect your application in numerous ways. As a general rule of thumb, only run one process per container per pod. However, there are definitely valid use cases for running multiple containers in a pod. I think for your use case, you can use both approaches.
Let me attempt to break down each of your components:
MySQL
I would definitely run this in its own pod. I would wrap it in a StatefulSet and front it with its own Service.
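A minimal sketch of that layout, written as a Python function that returns the two manifests as dictionaries. The names mysql, mysql-secret and root-password, the storage size and the single replica are assumptions for illustration. Using a headless Service (clusterIP set to None) is one common choice for a StatefulSet, not something the setup strictly requires.

```python
import json


def build_mysql_manifests(name: str = "mysql", storage: str = "10Gi") -> list:
    """Return a headless Service and a StatefulSet for a single MySQL instance."""
    labels = {"app": name}
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        # clusterIP "None" makes the Service headless, giving the pod a stable DNS name.
        "spec": {"clusterIP": "None", "selector": labels, "ports": [{"port": 3306}]},
    }
    stateful_set = {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "mysql",
                            "image": "mysql:8.0",
                            "ports": [{"containerPort": 3306}],
                            # The password is read from a Secret assumed to exist already.
                            "env": [
                                {
                                    "name": "MYSQL_ROOT_PASSWORD",
                                    "valueFrom": {
                                        "secretKeyRef": {"name": "mysql-secret", "key": "root-password"}
                                    },
                                }
                            ],
                            "volumeMounts": [{"name": "data", "mountPath": "/var/lib/mysql"}],
                        }
                    ]
                },
            },
            # One PersistentVolumeClaim is created per replica and survives pod restarts.
            "volumeClaimTemplates": [
                {
                    "metadata": {"name": "data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": storage}},
                    },
                }
            ],
        },
    }
    return [service, stateful_set]


if __name__ == "__main__":
    for manifest in build_mysql_manifests():
        print(json.dumps(manifest, indent=2))
```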
Nginx + Wordpress
In my opinion, whether you run these two processes in one pod or two depends on how you are using TLS, if at all. As we know, Wordpress is very vulnerable to attacks. Hence, perhaps you have rules in your Nginx config to limit access to certain paths, methods, etc. If you run Nginx and Wordpress in the same pod, then you can expose only the Nginx port, and the only way traffic will get to the Wordpress container is through Nginx. If you run these containers as separate pods, then from a security standpoint you'll need some other way to make sure that inbound traffic to your Wordpress pod only comes from your Nginx pod. You can accomplish this with the NetworkPolicy resource, or you can just use mutual TLS between these two pods.
In summary, in a microservice architecture, you want your processes to be as decoupled as possible so that they can be managed and deployed separately. Hence, a single process per container per Pod is attractive. However, there are instances that require you to run more than one container per Pod. In my example I used security as such a motivation.
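For the separate-pods variant, a NetworkPolicy along the following lines could restrict inbound traffic to the Wordpress pods. This is a sketch only: the labels app=wordpress and app=nginx, the policy name and port 9000 (a typical PHP-FPM port) are assumptions, and the policy only takes effect if the cluster's network plugin enforces NetworkPolicy.

```python
import json


def build_wordpress_ingress_policy(namespace: str = "default", port: int = 9000) -> dict:
    """Return a NetworkPolicy that lets only nginx pods reach wordpress pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "wordpress-from-nginx-only", "namespace": namespace},
        "spec": {
            # The pods this policy protects (hypothetical label).
            "podSelector": {"matchLabels": {"app": "wordpress"}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # Only pods carrying this label in the same namespace may connect.
                    "from": [{"podSelector": {"matchLabels": {"app": "nginx"}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_wordpress_ingress_policy(), indent=2))
```

Once a pod is selected by an Ingress policy, any inbound traffic not matched by a rule is dropped, which is what makes Nginx the only path to Wordpress here.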

How to get files into pod?

I have a fully functioning Kubernetes cluster with one master and one worker, running on CoreOS.
Everything is working and my pods and services are running fine. Now I have no clue how to proceed with my web server idea.
Before I go further: I have no configs at the moment for the idea I am about to explain. I have just done a lot of research.
When you set up a pod (nginx) with a service, you get the default nginx page. After that you can set up a volume mount with a host volume (volume mapping from host to container).
But let's say I want to separate every site (multiple sites in different pods): how can I let my users add files to their pod's nginx document root? Running FTP on the CoreOS node goes against the Kubernetes way and adds security vulnerabilities.
If someone can help me shed some light on this issue, that would be great.
Thanks for your time.
I'm assuming that you want to have multiple nginx servers running. The content of each nginx server is managed by a different admin (you called them users).
TL;DR:
Option 1: Each admin builds their own nginx docker image every time the static files change and deploys that new image. This applies if you consider these static files part of the source code of the nginx application.
Option 2: Use a persistent volume for nginx. The init script for the nginx image should sync all its files from something like s3 and then start nginx.
Before you proceed with building an application with kubernetes, the most important thing is to separate your services into 2 conceptual categories and give up your desire to touch the underlying nodes directly:
1) Stateless: These are services that are built by the developers and can be released. They can be stopped, started and moved from one node to another, and their filesystem can be reset during restart, and they will still work perfectly fine. The majority of your web services will fit this category.
2) Stateful: These services cannot be stopped and restarted willy nilly like the ones above. Primarily, their underlying filesystem must be persistent and remain the same across runs of the service. Databases, file-servers and similar services are in this category. These need special care and should use k8s persistent-volumes and now stateful-sets.
Typical application:
nginx: build the nginx.conf into the docker image, and deploy it as a stateless service
rails/nodejs/python service: build the source code into the docker image, configure with env-vars, deploy as a stateless service
database: mount a persistent volume, configure with env-vars, deploy as a stateful service.
Separate sites:
Typically, I think at the level of a k8s deployment and a k8s service. Each site can be one k8s deployment plus k8s service pair. You can then have separate ways to expose them (different external DNS/IPs).
Application users storing files:
This is firmly in the category of a stateful service. Use a persistent volume mounted at a /media kind of directory.
Developers changing files:
Say developers or admins want to use FTP to change the files that nginx serves. The correct pattern is to build a docker image with the new files and then use that docker image. If there are too many files, and you don't consider those files to be a part of the 'source' of the nginx, then use something like s3 and a persistent volume. In your docker image init script, don't directly start nginx. Contact s3, sync all your files onto your persistent volume, then start nginx.
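The sync-then-start sequence could be sketched as a small Python entrypoint, assuming the image has both the AWS CLI and nginx installed. The environment variable names SITE_BUCKET_URI and SITE_ROOT are hypothetical; aws s3 sync with --delete and nginx -g 'daemon off;' are the standard invocations.

```python
import os
import subprocess
from typing import List

# Run nginx in the foreground so it stays the container's main process.
NGINX_COMMAND = ["nginx", "-g", "daemon off;"]


def build_sync_command(bucket_uri: str, target_dir: str) -> List[str]:
    """Build the AWS CLI call that mirrors the bucket into the document root."""
    # --delete removes local files that no longer exist in the bucket.
    return ["aws", "s3", "sync", bucket_uri, target_dir, "--delete"]


def main() -> None:
    # Hypothetical variable names; set them in the pod spec.
    bucket_uri = os.environ["SITE_BUCKET_URI"]
    target_dir = os.environ.get("SITE_ROOT", "/usr/share/nginx/html")
    # Fail fast: if the sync fails, the container exits instead of serving stale files.
    subprocess.run(build_sync_command(bucket_uri, target_dir), check=True)
    # Replace this process with nginx so it receives signals directly.
    os.execvp(NGINX_COMMAND[0], NGINX_COMMAND)

# In the image's entrypoint script, call main() as the last line.
```

With the persistent volume mounted at the document root, a restarted pod only has to download files that changed since the last sync.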
While the options and reasoning listed by iamnat are right, there's at least one more option to add to the list. You could consider using ConfigMap objects: maintain your files within the ConfigMap and mount them into your containers.
A good example can be found in the official documentation - check the Real World Example configuring Redis section to get some actionable input.
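As a rough sketch of the ConfigMap approach, the Python functions below build a ConfigMap holding site files and a pod that mounts it as nginx's document root. The function names, the object names and the file contents are made up for illustration. Keep in mind that a ConfigMap is limited to about 1 MiB of data, so this suits a handful of small files rather than a whole site.

```python
import json
from typing import Dict


def build_configmap(name: str, files: Dict[str, str]) -> dict:
    """Return a ConfigMap whose keys become file names when mounted."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": dict(files),
    }


def build_nginx_pod(name: str, configmap_name: str,
                    mount_path: str = "/usr/share/nginx/html") -> dict:
    """Return a pod that serves the ConfigMap's files with nginx."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Each key in the ConfigMap appears as a file in this volume.
            "volumes": [{"name": "site-files", "configMap": {"name": configmap_name}}],
            "containers": [
                {
                    "name": "nginx",
                    "image": "nginx:1.25",
                    "ports": [{"containerPort": 80}],
                    "volumeMounts": [
                        {"name": "site-files", "mountPath": mount_path, "readOnly": True}
                    ],
                }
            ],
        },
    }


if __name__ == "__main__":
    site = build_configmap("site-a-files", {"index.html": "<h1>Site A</h1>\n"})
    pod = build_nginx_pod("site-a", "site-a-files")
    print(json.dumps([site, pod], indent=2))
```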