Reverse proxy for Thrift hiveservers running in a Kubernetes cluster

I have a requirement to run multiple hiveservers as pods on a Kubernetes cluster, each serving users belonging to different AD groups. These hiveservers need to be exposed outside the Kubernetes cluster, but each hiveserver cannot be exposed as a separate service. Ideally, I would like a reverse proxy implemented with an ingress controller, with an Ingress defined for each hiveserver, as the servers could be dynamically created and destroyed.
I see that the nginx ingress controller can be used for HTTP, but I don't see a way to make it work as a reverse proxy for Thrift-based hiveservers. I also had a look at Knox, but that seems to support HTTP transport only.
Is there a known way to set up an ingress controller as a reverse proxy in front of non-HTTP endpoints like Thrift hiveservers?

You may try a service mesh, if that is an option for you.
In Istio, such a use case (managing TCP traffic) can be handled by the Istio ingress gateway, which acts as the entry point for the services inside your cluster (similar to a Kubernetes Ingress, but not limited to HTTP traffic). There is even built-in support for custom protocols such as Apache Thrift, which allows you to use features like rate limiting.
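As a rough illustration of the TCP gateway idea, the sketch below builds an Istio Gateway and a matching VirtualService as plain Python dictionaries and prints them as JSON, which kubectl apply -f - accepts. The resource names, the Service host and port 10000 are assumptions made up for this example, as is the default istio: ingressgateway selector. Plain TCP carries no host name to route on, so this sketch assumes one gateway port per hiveserver, and that the ingress gateway's own Service also exposes that port.

```python
import json


def tcp_gateway(name, port):
    """Istio Gateway that accepts raw TCP on the given port."""
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "Gateway",
        "metadata": {"name": name},
        "spec": {
            # Assumption: the default Istio ingress gateway deployment is used.
            "selector": {"istio": "ingressgateway"},
            "servers": [
                {
                    "port": {"number": port, "name": f"tcp-{port}", "protocol": "TCP"},
                    "hosts": ["*"],
                }
            ],
        },
    }


def tcp_route(name, gateway, gateway_port, service_host, service_port):
    """VirtualService that forwards TCP arriving on gateway_port to one Service."""
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": name},
        "spec": {
            "hosts": ["*"],
            "gateways": [gateway],
            "tcp": [
                {
                    "match": [{"port": gateway_port}],
                    "route": [
                        {
                            "destination": {
                                "host": service_host,
                                "port": {"number": service_port},
                            }
                        }
                    ],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names: one gateway port mapped to one hiveserver Service.
    manifests = [
        tcp_gateway("hive-gateway", 10000),
        tcp_route("hive-group-a", "hive-gateway", 10000, "hiveserver-group-a", 10000),
    ]
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```

Each dynamically created hiveserver would need its own route and gateway port under this scheme, so it suits a modest number of servers rather than an unbounded one.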

Related

Installing knative on existing kind cluster

I have an existing kind k8s cluster with a bunch of running services and nginx-ingress setup and I would like to add knative to it.
Is there a way of doing this with nginx-ingress? It seems like networking for knative is a bit more complex than a normal service installation.
Knative needs more capabilities out of the HTTP routing layer than are exposed via the Kubernetes Ingress resource. (Percentage splits and header rewrite are two of the big ones.)
Unfortunately, no one has written an adapter from Knative's "KIngress" implementation to the nginx ingress. In the future, the Gateway API (aka "Ingress V2") may provide these capabilities; in the meantime, you'll need to install one of the other network adapters and ingress implementations. Kourier provides the smallest implementation, while Contour also provides an Ingress implementation if you want to switch from nginx entirely.

Connecting to many kubernetes services from local machine

From my local machine I would like to be able to port forward to many services in a cluster.
For example, I have services named serviceA-type1, serviceA-type2, serviceA-type3, etc. None of these services are accessible externally, but they can be accessed using the kubectl port-forward command. However, there are so many services that port forwarding to each one is infeasible.
Is it possible to create some kind of proxy service in kubernetes that would allow me to connect to any of the serviceA-typeN services by specifying them in a URL? I would like to be able to port-forward to the proxy service from my local machine, and it would then forward the requests to the serviceA-typeN services.
So for example, if I have set up a port forward on 8080 to this proxy, then the URL to access the serviceA-type1 service might look like:
http://localhost:8080/serviceA-type1/path/to/endpoint?a=1
I could maybe create a small application that would do this, but does kubernetes provide this functionality already?
The kubectl proxy command provides this functionality.
Read more here: https://kubernetes.io/docs/tasks/administer-cluster/access-cluster-services/#manually-constructing-apiserver-proxy-urls
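As a small sketch of the URL scheme described on that page, the helper below builds an apiserver proxy URL for a Service. It assumes kubectl proxy is running on its default local address (http://localhost:8001) and that the services live in the default namespace; the function name and parameters are made up for this example.

```python
from urllib.parse import quote, urlencode


def apiserver_proxy_url(service, path="", namespace="default", port_name=None,
                        query=None, proxy_base="http://localhost:8001"):
    """Build the apiserver proxy URL for a Service reached through `kubectl proxy`.

    proxy_base assumes the default listen address of `kubectl proxy`.
    """
    # A named port is appended to the service name as "service:port_name".
    target = f"{service}:{port_name}" if port_name else service
    url = (
        f"{proxy_base}/api/v1/namespaces/{namespace}"
        f"/services/{target}/proxy/{quote(path.lstrip('/'))}"
    )
    if query:
        url += "?" + urlencode(query)
    return url


if __name__ == "__main__":
    # The service name and path are taken from the question above.
    print(apiserver_proxy_url("serviceA-type1", "/path/to/endpoint", query={"a": 1}))
```

With this approach a single kubectl proxy session gives access to every serviceA-typeN service, with no per-service port-forward.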
A good option is to use Ingress to achieve this.
Read more about what Ingress is in the Kubernetes documentation.
Main concepts are:
Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.
An Ingress may be configured to give Services externally-reachable URLs, load balance traffic, terminate SSL / TLS, and offer name-based virtual hosting.
An Ingress controller is responsible for fulfilling the Ingress, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic.
An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically uses a service of type Service.Type=NodePort or Service.Type=LoadBalancer.
In Kubernetes there are 4 types of Services, and the default type is ClusterIP, which means the service is only reachable from within the cluster. Ingress exposes your service outside the cluster, so it acts as the entry point into your cluster.
If you plan to move to the cloud later, Ingress is compatible with cloud services, so using it now will save time and make it easier to migrate from a local environment.
To start with ingress you need to install an Ingress controller first.
There are different ingress controllers which you can use.
You can start with the most common one, ingress-nginx, which is supported by the Kubernetes community.
If you're using minikube, it can be enabled as an addon (minikube addons enable ingress).
Once you have installed an ingress controller in your cluster, you need to create a rule for it to work. Simple fanout is an example with two services and path-based routing to them.
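To make the fanout idea concrete, here is a minimal sketch that builds a path-based Ingress as a Python dictionary and prints it as JSON, which kubectl apply -f - accepts. It assumes the ingress-nginx controller with the ingress class name nginx, and that each service listens on port 80; the Ingress name, class name and ports are assumptions for this example, while the service names come from the question.

```python
import json


def fanout_ingress(name, routes, ingress_class="nginx"):
    """Build a networking.k8s.io/v1 Ingress that fans out by URL path prefix.

    routes maps a path prefix to a (service_name, service_port) pair.
    """
    paths = [
        {
            "path": path,
            "pathType": "Prefix",
            "backend": {"service": {"name": service, "port": {"number": port}}},
        }
        for path, (service, port) in routes.items()
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            "ingressClassName": ingress_class,
            # No host is set, so the rule applies to all inbound HTTP traffic.
            "rules": [{"http": {"paths": paths}}],
        },
    }


if __name__ == "__main__":
    # Hypothetical ports: both services are assumed to listen on port 80.
    ingress = fanout_ingress(
        "service-a-fanout",
        {
            "/serviceA-type1": ("serviceA-type1", 80),
            "/serviceA-type2": ("serviceA-type2", 80),
        },
    )
    print(json.dumps(ingress, indent=2))
```

Note that the backends receive the full request path including the prefix unless the controller is configured to rewrite it.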

How to expose multiple services on the same port in kubernetes using OpenStack

I have a Kubernetes cluster on a private cloud based on OpenStack. My service is required to be exposed on a specific port. I am able to do this using NodePort. However, if I try to create another service similar to the first one, I am not able to expose it, since I have to use the same port and it is already occupied by the first one.
I've noticed that I can use LoadBalancer in public clouds for this, but I assume this is not possible in OpenStack?
I also tried to use a Kubernetes Ingress controller, but it did not work. However, I am not sure whether I went about it the correct way.
Is there any way to do this other than LoadBalancer or Ingress? (My first assumption was that if I dedicate my pods to specific nodes, then I should be able to expose each of the services on the same port on different nodes, but this approach also did not work.)
Please let me know if you have any thoughts on this.
You have to set up the OpenStack Cloud Provider: basically, this Deployment watches for LoadBalancer Services and provides an {internal,external} IP address you can use to interact with your application, even at L4 and not only at L7 like many Ingress controllers.
If you want to expose only one port, then to the best of my knowledge the only answer is an ingress controller. The two most famous ones are Nginx and Traefik. I agree that setting up an ingress controller can be difficult, and I have had problems with them before, but you have to solve them one by one.
Another thing you can do is build your own ingress controller. What I mean is to use a reverse proxy such as Nginx, configure it to route the traffic based on your topology, and then expose just this reverse proxy so that all the traffic goes through it. This should be done only if you need something very customized.

Synchronize HTTP requests between several service instances in Kubernetes

We have a service with multiple replicas which works with storage without transactions or blocking approaches. So we need to somehow synchronize concurrent requests between multiple instances by some "sharding" key. Right now we host this service in a Kubernetes environment as a ReplicaSet.
Do you know of any simple out-of-the-box approaches for this, so that we don't have to implement it from scratch?
Here are several of our ideas on how to do this:
Deploy the service as a StatefulSet and implement some proxy API which will route traffic to the specific pod in this StatefulSet by sharding key from the HTTP request. In this scenario, all requests which should be synchronized will be handled by one instance and it wouldn't be a problem to handle this case.
Deploy the service as a StatefulSet and implement some custom logic in the same service to re-route traffic to the specific instance (or process on this exact instance). As I understand it, an abstract implementation is not possible here, and it would work only in a Kubernetes environment.
Somehow expose each pod IP outside the cluster and implement routing logic on the client-side.
Just implement synchronization between instances through some third-party service like Redis.
I would like to try to route traffic to specific pods. If you know standard approaches for handling this case, I'd much appreciate it.
Thanks a lot in advance!
Another approach would be to put a message queue (like Kafka or RabbitMQ) in front of your service.
Then your pods will subscribe to the MQ topic/stream. The pod will decide if it should process the message or not.
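One way the "pod decides" check could look is sketched below. It assumes the service runs as a StatefulSet, so that each pod's hostname ends in a stable ordinal (for example my-service-2), that every pod receives every message, and that the replica count is known to the pod. The function names are made up for this example, and the modulo scheme is only one possible choice: it remaps most keys whenever the replica count changes.

```python
import hashlib
import socket


def shard_for(key: str, replicas: int) -> int:
    """Map a sharding key to a replica index with a hash that is stable across processes."""
    # The built-in hash() is randomized per process, so a digest is used instead.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % replicas


def pod_ordinal(hostname: str) -> int:
    """Extract the ordinal from a StatefulSet pod hostname such as 'my-service-2'."""
    return int(hostname.rsplit("-", 1)[1])


def should_process(key: str, hostname: str, replicas: int) -> bool:
    """Return True if the pod with this hostname owns the given sharding key."""
    return shard_for(key, replicas) == pod_ordinal(hostname)


if __name__ == "__main__":
    # Outside a StatefulSet the hostname has no ordinal suffix; fall back to 0.
    hostname = socket.gethostname()
    try:
        ordinal = pod_ordinal(hostname)
    except (IndexError, ValueError):
        ordinal = 0
    print(f"ordinal {ordinal} owns 'customer-42':",
          shard_for("customer-42", 3) == ordinal)
```

Because every key maps to exactly one ordinal, all requests for the same key are handled by the same pod, which serializes them without a shared lock.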
Also, try looking into service meshes like Istio or Linkerd.
They might have an OOTB solution for your use-case, although I wasn't able to find one.
Remember that Network Policy is not traffic routing!
Pods are intended to be stateless and indistinguishable from one another (see pod-networking).
I recommend Istio. It has a special component which is responsible for routing: Envoy. It is a high-performance proxy developed in C++ to mediate all inbound and outbound traffic for all services in the service mesh.
Useful article: istio-envoy-proxy.
Istio documentation: istio-documentation.
Useful Istio explanation: https://www.youtube.com/watch?v=e2kowI0fAz0.
But you should be able to create a Deployment per customer group, and a Service per Deployment. The nginx Ingress can then be configured to map incoming requests, by whatever attributes are relevant, to specific customer group Services.
Another solution is to use kube-router.
Kube-router can be run as an agent or a Pod (via DaemonSet) on each node and leverages standard Linux technologies such as iptables, ipvs/lvs, ipset and iproute2.
Kube-router uses the IPVS/LVS technology built into Linux to provide L4 load balancing. Each ClusterIP, NodePort, and LoadBalancer Kubernetes Service type is configured as an IPVS virtual service. Each Service Endpoint is configured as a real server of the virtual service. The standard ipvsadm tool can be used to verify the configuration and monitor the active connections.
How it works: service-proxy.

Deploying a mobile app backend with kubernetes

I need some advice on how to deploy a high-traffic mobile app back-end using kubernetes. This deployment should support HA at least. We have plans to run a DR site as well, but DR is outside the scope of this question.
We currently use hardware load balancers to route incoming traffic to different IP addresses attached to different boxes. Each such box runs an nginx instance as a reverse proxy, which also acts as the https terminator. After https termination, traffic is directed to an apache web server. Each box has one apache server receiving all traffic from the nginx running on the same box.
We want to introduce kubernetes to this setup so that we can utilize the boxes better. Our traffic patterns fluctuate heavily, and we believe kubernetes can help us utilize the boxes more efficiently.
My current plan is as follows:
-- Keep the hardware load balancer to route incoming traffic to different boxes. (This may not be needed, but getting rid of the HLB could become very political.)
-- Run a kubernetes cluster utilizing all available boxes
-- Pack apache + our app as a docker image and deploy this image in docker containers, which in turn run inside pods in the kubernetes cluster
-- Set up ingress to accept external traffic, do https termination and load balance to the above pods. A simple round-robin or random load balancing algorithm is fine, as our back ends are stateless
Does this sound right? Are there any alternatives? In the above case, where does the ingress controller run?
Your plan seems right. You can pack apache together with the code, but it would be better to keep them separate, so that they can still communicate with each other and upgrading one does not depend on the other.
Also, the hardware load balancer will trickle the traffic down to the ingress, which will pass it on into the k8s cluster and eventually to the pods.
The ingress controller runs inside the cluster. I guess you're looking to run kubernetes on-premise with your existing hardware. To use the existing hardware load balancer outside of kubernetes, you could run the nginx ingress controller as a DaemonSet, so that there'd be one instance on each node, and expose it via HostPort so that each instance is exposed on the same port. Or, if there are lots of nodes, you'd want to just use a Deployment. Then you would want to use NodePort, so that kubernetes sends the traffic to a node where an ingress controller pod runs.
Another alternative would be to expose the nginx ingress controller through a LoadBalancer Service. To do that, you'd need to integrate your load balancer with kubernetes using something like https://hackernoon.com/metallb-a-load-balancer-for-bare-metal-kubernetes-clusters-f7320fde52f2
Alternatively, you wouldn't necessarily have to use ingress. You could just run nginx in the cluster and expose it via NodePort.
It's not clear to me that you'd need the apache http server in your container. I guess it depends on how you are using it currently.