Kubernetes Service and Ingress - kubernetes

I'm learning Kubernetes at the moment and just had a question I'd like to have clarified regarding exposing Services and making Pods accessible to the public internet.
Lets say I have an Java Spring boot application which has an embedded Tomcat server using JSP, MySQL Pod and Memcached (All on separate Pods), and I'd like to expose them as a Service making them publicly available.
I'm confused as to which type of Service each of these Pods would need , and also why. I'm aware that of Ingress and using a single Load Balancer to route requests from Services instead of multiple Load Balancers, but the actual Service type is what I'm finding hard to understand based on the what work the Pod needs to do.

Answering which Service type do you need: it's always ClusterIP.
LoadBalancers and NodePort are reserved for very specific use cases. One requiring to be integrated with a cloud (provisioning loadbalancers), the other requiring your kubernetes nodes being exposed to external clients, allowing connections to non-default ports.
When you don't know or you're not sure: just assume you can't use NodePort or LoadBalancers. As a cluster end-user, developer or Kubernetes beginner: ClusterIP is the only Service type you need.
Exposing your application to clients outside of your SDN, you want to use Ingresses. As again, while LoadBalancers or NodePorts might be suitable technical solutions on paper, they usually aren't in practice -- and when they are, there are security aspect to consider: better dealt with by your cluster administrator.

Related

My understanding of headless service in k8s and two questions to verify

I am learning the headless service of kubernetes.
I understand the following without question (please correct me if I am wrong):
A headless service doesn't have a cluster IP,
It is used for communicating with stateful app
When client app container/pod communicates with a database pod via headless service the pod IP address is returned instead of the service's.
What I don't quite sure:
Many articles on internet explaining headless service is vague in my opinion. Because all I found only directly state something like :
If you don't need load balancing but want to directly connect to the
pod (e.g. database) you can use headless service
But what does it mean exactly?
So, following are my thoughts of headless service in k8s & two questions with an example
Let's say I have 3 replicas of PostgreSQL database instance behind a service, if it is a regular service I know by default request to database would be routed in a round-robin fasion to one of the three database pod. That's indeed a load balancing.
Question 1:
If using headless service instead, does the above quoted statement mean the headless service will stick with one of the three database pod, never change until the pod dies? I ask this because otherwise it would still be doing load balancing if not stick with one of the three pod. Could some one please clarify it?
Question 2:
I feel no matter it is regular service or headless service, client application just need to know the DNS name of the service to communicate with database in k8s cluster. Isn't it so? I mean what's the point of using the headless service then? To me the headless service only makes sense if client application code really needs to know the IP address of the pod it connects to. So, as long as client application doesn't need to know the IP address it can always communicate with database either with regular service or with headless service via the service DNS name in cluster, Am I right here?
A normal Service comes with a load balancer (even if it's a ClusterIP-type Service). That load balancer has an IP address. The in-cluster DNS name of the Service resolves to the load balancer's IP address, which then forwards to the selected Pods.
A headless Service doesn't have a load balancer. The DNS name of the Service resolves to the IP addresses of the Pods themselves.
This means that, with a headless Service, basically everything is up to the caller. If the caller does a DNS lookup, picks the first address it's given, and uses that address for the lifetime of the process, then it won't round-robin requests between backing Pods, and it will not notice if that Pod disappears. With a normal Service, so long as the caller gets the Service's (cluster-internal load balancer's) IP address, these concerns are handled automatically.
A headless Service isn't specifically tied to stateful workloads, except that StatefulSets require a headless Service as part of their configuration. An individual StatefulSet Pod will actually be given a unique hostname connected to that headless Service. You can have both normal and headless Services pointing at the same Pods, though, and it might make sense to use a normal Service for cases where you don't care which replica is (initially) contacted.
A headless service will return all Pod IPs that are associated through the selector. The order is not stable, so if a client is making repeated DNS queries and uses only the first returned IP, this will result in some kind of load balancing as well.
Regarding your second question: That is correct. In general, if a client does not need to know all instances - and handle the unstable IPs - a regular service provides more benefits.

Can somebody explain whay I have to use external (MetalLB, HAProxy etc) Load Balancer with Bare-metal kubernetes cluster?

For instance, I have a bare-metal cluster with 3 nodes ich with some instance exposing the port 105. In order to expose it on external Ip address I can define a service of type NodePort with "externalIPs" and it seems to work well. In the documentation it says to use a load balancer but I didn't get well why I have to use it and I worried to do some mistake.
Can somebody explain whay I have to use external (MetalLB, HAProxy etc) Load Balancer with Bare-metal kubernetes cluster?
You don't have to use it, it's up to you to choose if you would like to use NodePort or LoadBalancer.
Let's start with the difference between NodePort and LoadBalancer.
NodePort is the most primitive way to get external traffic directly to your service. As the name implies, it opens a specific port on all the Nodes (the VMs), and any traffic that is sent to this port is forwarded to the service.
LoadBalancer service is the standard way to expose a service to the internet. It gives you a single IP address that will forward all traffic to your service.
You can find more about that in kubernetes documentation.
As for the question you've asked in the comment, But NodePort with "externalIPs" option is doing exactly the same. I see only one tiny difference is that the IP should be owned by one of the cluster machin. So where is the profit of using a loadBalancer? let me answer that more precisely.
There are the advantages & disadvantages of ExternalIP:
The advantages of using ExternalIP is:
You have full control of the IP that you use. You can use IP that belongs to your ASN >instead of a cloud provider’s ASN.
The disadvantages of using ExternalIP is:
The simple setup that we will go thru right now is NOT highly available. That means if the node dies, the service is no longer reachable and you’ll need to manually remediate the issue.
There is some manual work that needs to be done to manage the IPs. The IPs are not dynamically provisioned for you thus it requires manual human intervention
Summarizing the pros and cons of both, we can conclude that ExternalIP is not made for a production environment, it's not highly available, if node dies the service will be no longer reachable and you will have to manually fix that.
With a LoadBalancer if node dies the service will be recreated automatically on another node. So it's dynamically provisioned and there is no need to configure it manually like with the ExternalIP.

Best way to go between private on-premises network and kubernetes

I have setup an on-premises Kubernetes cluster, and I want to be ensure that my services that are not in Kubernetes, but exist on a separate class B are able to consume those services that have migrated to Kubernetes. There's a number of ways of doing this by all accounts and I'm looking for the simplest one.
Ingress + controller seems to be the one favoured - and it's interesting because of the virtual hosts and HAProxy implementation. But where I'm getting confused is how to set up the Kubernetes service:
We've not a great deal of choice - ClusterIP won't be sufficient to expose it to the outside, or NodePort. LoadBalancer seems to be a simpler, cut down way of switching between network zones - and although there are OnPrem solutions (metalLB), seems to be far geared towards cloud solutions.
But if I stick with NodePort, then my entry into the network is going to be on a non-standard port number, and I would prefer it to be over standard port; particuarly if running a percentage of traffic for that service over non-kube, and the rest over kubernetes (for testing purposes, I'd like to monitor the traffic over a period of time before I bite the bullet and move 100% of traffic for the given microservice to kubernetes). In that case it would be better those services would be available across the same port (almost always 80 because they're standard REST micro-services). More than that, if I have to re-create the service for whatever reason, I'm pretty sure the port will change, and then all traffic will not be able to enter the Kubernetes cluster and that's a frightening proposition.
What are the suggested ways of handling communication between existing on-prem and Kubernetes cluster (also on prem, different IP/subnet)?
Is there anyway to get traffic coming in without changing the network parameters (class B's the respective networks are on), and not being forced to use NodePort?
NodePort service type may be good at stage or dev environments. But i recommend you to go with LoadBalancer type service (Nginx ingress controller is one). The advantage for this over other service types are
You can use standard port (Rather random Nodeport generated by your kubernetes).
Your service is load balanced. (Load balancing will be taken care by ingress controller).
Fixed port (it will not change unless you modify something in ingress object).

How to install Kubernetes dashboard on external IP address?

How to install Kubernetes dashboard on external IP address?
Is there any tutorial for this?
You can expose services and pods in several ways:
expose the internal ClusterIP service through Ingress, if you have that set up.
change the service type to use 'type: LoadBalancer', which will try to create an external load balancer.
If you have external IP addresses on your kubernetes nodes, you can also expose the ports directly on the node hosts; however, I would avoid these unless it's a small, test cluster.
change the service type to 'type: NodePort', which will utilize a port above 30000 on all cluster machines.
expose the pod directly using 'type: HostPort' in the pod spec.
Depending on your cluster type (Kops-created, GKE, EKS, AKS and so on), different variants may not be setup. Hosted clusters typically support and recommend LoadBalancers, which they charge for, but may or may not have support for NodePort/HostPort.
Another, more important note is that you must ensure you protect the dashboard. Running an unprotected dashboard is a sure way of getting your cluster compromised; this recently happened to Tesla. A decent writeup on various way to protect yourself was written by Jo Beda of Heptio

How to access Kubernetes pod in local cluster?

I have set up an experimental local Kubernetes cluster with one master and three slave nodes. I have created a deployment for a custom service that listens on port 10001. The goal is to access an exemplary endpoint /hello with a stable IP/hostname, e.g. http://<master>:10001/hello.
After deploying the deployment, the pods are created fine and are accessible through their cluster IPs.
I understand the solution for cloud providers is to create a load balancer service for the deployment, so that you can just expose a service. However, this is apparently not supported for a local cluster. Setting up Ingress seems overkill for this purpose. Is it not?
It seems more like kube proxy is the way to go. However, when I run kube proxy --port <port> on the master node, I can access http://<master>:<port>/api/..., but not the actual pod.
There are many related questions (e.g. How to access services through kubernetes cluster ip?), but no (accepted) answers. The Kubernetes documentation on the topic is rather sparse as well, so I am not even sure about what is the right approach conceptually.
I am hence looking for a straight-forward solution and/or a good tutorial. It seems to be a very typical use case that lacks a clear path though.
If an Ingress Controller is overkill for your scenario, you may want to try using a service of type NodePort. You can specify the port, or let the system auto-assign one for you.
A NodePort service exposes your service at the same port on all Nodes in your cluster. If you have network access to your Nodes, you can access your service at the node IP and port specified in the configuration.
Obviously, this does not load balance between nodes. You can add an external service to help you do this if you want to emulate what a real load balancer would do. One simple option is to run something like rocky-cli.
An Ingress is probably your simplest bet.
You can schedule the creation of an Nginx IngressController quite simply; here's a guide for that. Note that this setup uses a DaemonSet, so there is an IngressController on each node. It also uses the hostPort config option, so the IngressController will listen on the node's IP, instead of a virtual service IP that will not be stable.
Now you just need to get your HTTP traffic to any one of your nodes. You'll probably want to define an external DNS entry for each Service, each pointing to the IPs of your nodes (i.e. multiple A/AAAA records). The ingress will disambiguate and route inside the cluster based on the HTTP hostname, using name-based virtual hosting.
If you need to expose non-HTTP services, this gets a bit more involved, but you can look in the nginx ingress docs for more examples (e.g. UDP).