What are the options for syncing static content to Nginx within Kubernetes? - kubernetes

I'm currently building a Kubernetes cluster. I plan on using Nginx containers as a server for static content, and to act as a web socket proxy. If you restart Nginx, you lose your web socket connection, so I do not want to restart the containers. But I will want to update the content within the container.

I do that same exact thing in my Kubernetes cluster. Our solution is for application to handle the web socket disconnect with consistent state kept intact.
However, other options you have are mount a volume to serve from the host; however, you cannot guarantee all nginx pods will have that volume on multi hosts, unless you use a kubernetes' persistent volume http://kubernetes.io/v1.1/docs/user-guide/persistent-volumes.html.
Another option you have is to have your static content on an object store like S3, Google Cloud Storage or Ceph, and then proxy the object store through nginx along with the websocket.

Related

Access an application running on a random port in Kubernetes

My application is running within a pod container in a kubernetes cluster. Every time it is started in the container it allocates a random port. I would like to access my application from outside (from another pod or node for example) but since it allocates a random port I cannot create a serivce (NodePort or LoadBalancer) to map the application port to a specific port to be able to access it.
What are the options to handle this case in a kubernetes cluster?
Not supported, checkout the issue here. Even with docker, if your range is overly broad, you can hit issue as well.
One option to solve this would be to use a dynamically configured sidecar proxy that takes care of routing the pods traffic from a fixed port to the dynamic port of the application.
This has the upside that the service can be scaled even if all instances have different ports.
But this also has the potentially big downside that every request has an added overhead due to the extra network step and even if it's just some milliseconds this can have quite an impact depending on how the application works.
Regarding the dynamic configuration of the proxy you could e.g. use shared pod volumes.
Edit:
Containers within a pod are a little bit special. They are able to share predefined volumes and their network namespace.
For shared volumes this means. you are able to define a volume e.g. of type emptyDir and mount it in both containers. The result is that you can share information between both containers by writing them into that specific volume in the first pod and reading the information in the second.
For networking this makes communication between containers of one pod easier because you can use the loopback to communicate with between your containers. In your case this means the sidecar proxy container can call your service by calling localhost:<dynamic-port>.
For further information take a look at the official docs.
So how is this going to help you?
You can use a proxy like envoy and configure it to listen to dynamic configuration changes. The source for the dynamic configuration should be a shared volume between both your application container and the sidecar proxy container.
After your application started and allocated the dynamic port you can now automatically create the configuration for the envoy proxy and save it in the shared volume. The same source envoy is listening to due to the aforementioned configuration.
Envoy itself acts as a reverse proxy and can listen statically on port 8080 (or any other port you like) and routes the incoming network traffic to your application with dynamic port by calling localhost:<dynamic-port>.
Unfortunately I don't have any predefined configuration for this use case but if you need some inspiration you can take a look at istio - they use something quite similar.
As gohm'c mentioned there is no built in way to do this.
As a workaround you can run a script to adjust the service:
Start the application and retrieve the choosen random port from the application
Modify the Kubernetes service, loadbalancer, etc. with the new port
The downside of this is that if the port changes there will be short delay to update the service afterwards.
How about this:
Deploy and start your application in a Kubernetes cluster.
run kubectl exec <pod name here> -- netstat -tulpn | grep "application name" to identify the port number associated with your application.
run kubectl port-forward <pod name here> <port you like>:<random application port> to access the pod of your application.
Would this work for you?
Edit. This is very similar to #Lukas Eichler's response.

Deploying a mobile app backend with kubernetes

I need to some advice regarding how to deploy a high traffic mobile app back-end using kubernetes. This deployment should support HA at-least. We have plans to run a DR site as well, but scope of this question does not include a DR.
We currently use hardware load-balancers to route incoming traffic to different IP addresses attached to different boxes. Each such box runs a nginx instance as a reverse proxy which also act as the https terminator. After https termination, traffic is directed to an apache web-server. Each box has one apacher server receiving all traffic from nginx running in the same box.
We want to introduce kubernetes to this setup so that we can utilize boxes better. Our traffic patterns are highly fluctuating and we believe kubernetes can help us utilize boxes in a more efficient manner.
My current plan is as follows:
-- Keep the hardware load balancer to route incoming traffic to different boxes. (this may not be needed but getting rid of HLB could become very political).
-- Run a kubenetes cluster utilizing all available boxes
-- pack apache + our app as docker image and deploy this image on docker container which in tern is run inside pods in the kubenetes cluster
-- setup ingress to accept external traffic, do https termination and load balance to above pods. A simple round robin or random load balancing algo is fine as our back ends are stateless
Does this sound right? Are there any alternatives? In the above case, where does the ingress controller run?
Your plan seems right. You can either pack apache with the code but it shall be better to keep it separate so that they can contact each other and any one of the version upgrades won't be dependent upon this one.
Also, the hardware load balancer will tickle the traffic on to the ingress which shall further bring it down to the k8s cluster and eventually on the pods.
The ingress controller runs inside the cluster. I guess you're looking to run kuberentes on-premise with your existing hardware. To use the existing hardware loadbalancer outside of kubernetes you could run the nginx ingress controller as a daemonset so that there'd be one instance on each node and expose it via HostPort so that each is exposed on the same port. Or if there are lots of nodes then you'd want to just use a Deployment. Then you'd would want to use NodePort so that Kuberentes would send the traffic to a node where an ingress controller pod runs.
Another alternative would be to expose the nginx ingress controller through LoadBalancer - to do that you'd need to integrate your loadbalancer with kubernetes using something like https://hackernoon.com/metallb-a-load-balancer-for-bare-metal-kubernetes-clusters-f7320fde52f2
Alternatively, you wouldn't necessarily have to use ingress. You could just run nginx in the cluster and expose it via NodePort.
It's not clear to me that you'd need apache http server in your container. I guess it depends how you are using it currently.

GCE Kubernetes Session Persistence

I'm running a wordpress / woocommerce site on GCE Kubernetes and having trouble scaling because of session persistence.
The LoadBalancer (GCE Ingress) sends all traffic to a reverse proxy that then sends the traffic to different services I have set up, one of which is wordpress.
If I use SessionAffinity: ClientIP on the WordPress service all of the traffic goes to one pod and the others are ignored. This seems to because the service is seeing the LoadBalancer's ip address rather than the Client's. This is in spite of externalTrafficPolicy: Local set on both nginx reverse proxy and the wordpress NodePort services.
I've also tried using the wordpress service as the default backend and I managed to get traffic to go to all pods but lost session affinity.
The Ingress also performs TLS termination, which I've seen can effect ClientIP visibility, but I think that issue is resolved by the external traffic policy.
We are also using Cloudflare, I'm wondering if that could have an effect. But we are using the ngx_http_realip_module to try to get the correct Client IP address.
I had a similar issue in one of the PHP services deployed in my cluster. Sessions are evil :) but sometimes you do need to use them. You can cluster your session data in PHP in couple ways, so that you do not need to use sticky sessions on loadbalancer(s).
shared RWX volume in your pod(s) that will keep the session files available to all instances in your deployment. Unless you use something like S3 for wordpress uploads, you probably already do something similar for binaries, as I suggest for session files.
session handler with Memcached or Redis as the session storage (this is what I have now)
you can even keep them in your MySQL, same as WP database, although I've seen that it can be of significant performance impact.
You can find simple mamcache example here. If you'd need a clustered storage, you could look into Redis clustering, or, as I would, into Couchbase

How to get files into pod?

I have a fully functioning Kubernetes cluster with one master and one worker, running on CoreOS.
Everything is working and my pods and services are running fine. Now I have no clue how to proceed in a webserver idea.
Before I go further: I have no configs at the moment about my idea I am going to explain. I just did a lot of research.
When setting up a pod (nginx) with a service. You get the default nginx page. After that you can setup a mount volume with a hostvolume (volume mapping from host to container).
But lets say I want to seperate every site (multiple sites separated with different pods), how can I let my users add files to their pod/nginx document root? Having FTP in the CoreOS node removes the Kubernetes way and adds security vulnerabilities.
If someone can help me shed some light on this issue, that would be great.
Thanks for your time.
I'm assuming that you want to have multiple nginx servers running. The content of each nginx server is managed by a different admin (you called them users).
TL;DR:
Option 1: Each admin needs to build their own nginx docker image every time the static files change and deploy that new image. This is if you consider these static files as a part of the source-code of the nginx application
Option 2: Use a persistent volume for nginx, the init-script for the nginx image should use something like s3 to sync all its files with s3 and then start nginx
Before you proceed with building an application with kubernetes. The most important thing is to separate your services into 2 conceptual categories, and give up your desire to touch the underlying nodes directly:
1) Stateless: These are services that are built by the developers and can be released. They can be stopped, started, moved from one node to another, their filesystem can be reset during restart and they will work perfectly fine. Majority of your web-services will fit this category.
2) Stateful: These services cannot be stopped and restarted willy nilly like the ones above. Primarily, their underlying filesystem must be persistent and remain the same across runs of the service. Databases, file-servers and similar services are in this category. These need special care and should use k8s persistent-volumes and now stateful-sets.
Typical application:
nginx: build the nginx.conf into the docker image, and deploy it as a stateless service
rails/nodejs/python service: build the source code into the docker image, configure with env-vars, deploy as a stateless service
database: mount a persistent volume, configure with env-vars, deploy as a stateful service.
Separate sites:
Typically, I think at a k8s deployment and a k8s service level. Each site can be one k8s deployment and k8s service set. You can then have separate ways to expose them (different external DNS/IPs)
Application users storing files:
This is firmly in the category of a stateful service. Use a persistent volume to mount to a /media kind of directory
Developers changing files:
Say developers or admins want to use FTP to change the files that nginx serves. The correct pattern is to build a docker image with the new files and then use that docker image. If there are too many files, and you don't consider those files to be a part of the 'source' of the nginx, then use something like s3 and a persistent volume. In your docker image init script, don't directly start nginx. Contact s3, sync all your files onto your persistent volume, then start nginx.
While the options and reasoning listed by iamnat are right, there's at least one more option to add to the list. You could consider using ConfigMap objects, maintain your file within the configmap and mount them to your containers.
A good example can be found in the official documentation - check the Real World Example configuring Redis section to get some actionable input.

Can I make calls directly to pods from outside Kubernetes?

I'm attempting to transition existing applications to Kubernetes that work as follows:
An outside service calls our application through a load balancer with a new session.
Our application returns the ip of the server that processed the request.
All subsequent calls from the outside service for that session are made directly to the same server (bypassing the load balancer)
Is there any way to do this in kubernetes? I understand that pod ip's are not exposed externally, is there some way to expose them directly?
Also, I don't think I can use sessionAffinity="ClientIP" because the requests will all come in from the same place. Is there a way to write custom sessionAffinity type?
It kind of depends on how your network is set up and what you mean by an "outside service", but the answer is most likely "no".
If you're running using one of the default cluster creation scripts in a cloud environment, pod IP addresses are not routable from the Internet, so any service not in the same private network as your cluster won't be able to talk directly to pods.
However, depending on what cloud provider you're on, you'll likely get the behavior that you want anyways by just continuing to make all calls through to the external IP of a service of type LoadBalancer. For instance, on the Google Cloud Platform, the cloud load balancer that gets created for such services by default maintains connection affinity by 5-tuple (src ip and port, dst ip and port, L4 protocol), which sounds like it's what you want, since you want balancing per session rather than per IP.
As for creating a new sessionAffinity type, that's not an easy thing to extend, since it requires changing Kubernetes source code. If that's really a path you want to take, it's likely that you'd want to run your own load balancer within your cluster rather than relying on the built-in load balancing.