How to add external GCP loadbalancer to kubespray cluster? - kubernetes

I deployed a kubernetes cluster on Google Cloud using VMs and Kubespray.
Right now I am trying to expose a simple Node app on an external IP using a LoadBalancer service, but assigning my external IP from gcloud to the service does not work. The service stays in the pending state when I run kubectl get services.
According to this, kubespray does not have any load balancer mechanism included/integrated by default. How should I proceed?

Let me start off by summarizing the problem we are trying to solve here.
The problem is that you have a self-hosted Kubernetes cluster and you want to create a service of type=LoadBalancer and have k8s create an LB with an external IP for you in a fully automated way, just like it would on GKE (Kubernetes as a service).
Additionally, I have to mention that I don't know much about kubespray, so I will only describe the steps that need to be done to make it work and leave the rest to you. So if you want to make changes in the kubespray code, it's on you.
I did all my tests on a kubeadm cluster, but it should not be very difficult to apply them to kubespray.
I will start off by summarizing everything that has to be done in 4 steps:
tagging the instances
enabling cloud-provider functionality
IAM and service accounts
additional info
Tagging the instances
All worker node instances on GCP have to be tagged with a unique tag that is the name of the instance; these tags are later used to create firewall rules and target lists for the LB. So let's say that you have an instance called worker-0; you need to tag that instance with the tag worker-0.
Otherwise it will result in an error (that can be found in controller-manager logs):
Error syncing load balancer: failed to ensure load balancer: no node tags supplied and also failed to parse the given lists of hosts for tags. Abort creating firewall rule
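As a rough illustration of the tagging step, here is a minimal Python sketch that builds the gcloud command for tagging each worker with its own name. The instance names and the zone are made-up placeholders, and the sketch only prints the commands rather than running them, so check them against your own project before using them.

```python
import shlex


def add_tag_command(instance, zone):
    """Build the gcloud argv that tags an instance with its own name."""
    return [
        "gcloud", "compute", "instances", "add-tags", instance,
        "--tags", instance,   # the tag equals the instance name
        "--zone", zone,
    ]


# Hypothetical worker names and zone, for illustration only.
workers = ["worker-0", "worker-1"]
for name in workers:
    print(shlex.join(add_tag_command(name, "us-central1-a")))
```

Each printed line can be pasted into a shell, or the argv list can be passed to subprocess.run once you are happy with it.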
Enabling cloud-provider functionality
K8s has to be informed that it is running in a cloud, and which cloud provider that is, so that it knows how to talk to the API.
Otherwise the controller manager logs will inform you that it won't create an LB:
WARNING: no cloud provider provided, services of type LoadBalancer will fail
The Controller Manager is responsible for creating the LoadBalancer. It accepts a --cloud-provider flag. You can add this flag manually to the controller manager pod manifest file; or, since you are running kubespray, you can add it somewhere in the kubespray code (maybe it's already automated and just requires you to set some variable, but you need to find that out yourself).
Here is what this file looks like with the flag:
apiVersion: v1
kind: Pod
metadata:
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    ...
    - --cloud-provider=gce # <----- HERE
As you can see, the value in our case is gce, which stands for Google Compute Engine. It informs k8s that it's running on GCE/GCP.
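If you prefer to script the manifest change, the following Python sketch shows one way to add the flag idempotently. It assumes the manifest has already been parsed into a plain dict (for example with a YAML library of your choice) and that the controller manager is the first container; the function name is made up for this example.

```python
def ensure_cloud_provider(manifest, provider="gce"):
    """Make sure the first container's command carries --cloud-provider=<provider>.

    Returns True if the manifest was modified, False if it was already correct.
    """
    command = manifest["spec"]["containers"][0]["command"]
    flag = f"--cloud-provider={provider}"
    for i, arg in enumerate(command):
        if arg.startswith("--cloud-provider="):
            if arg == flag:
                return False      # already set to the wanted provider
            command[i] = flag     # replace a different provider value
            return True
    command.append(flag)          # flag was missing entirely
    return True


# A trimmed-down stand-in for the parsed manifest shown above.
manifest = {
    "spec": {"containers": [{"command": ["kube-controller-manager"]}]}
}
print(ensure_cloud_provider(manifest))   # first call changes the manifest
print(manifest["spec"]["containers"][0]["command"])
```

On a kubeadm cluster the manifest usually lives at /etc/kubernetes/manifests/kube-controller-manager.yaml, and the kubelet restarts the pod when the file changes.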
IAM and service accounts
Now that you have your provider enabled, and tags covered, I will talk about IAM and permissions.
For k8s to be able to create an LB in GCE, it needs to be allowed to do so. Every GCE instance has a default service account assigned. The Controller Manager uses the instance service account, stored within the instance metadata, to access the GCP API.
For this to happen you need to set Access Scopes for the GCE instance (the master node; the one where the controller manager is running) so it can use the Compute Engine API.
Access scopes -> Set access for each API -> compute engine=Read Write
To change the scopes the instance has to be stopped, so stop the instance first. It's better to set these scopes during instance creation so that you don't need any extra steps.
You also need to go to the IAM & Admin page in the GCP Console and add permissions so that the master instance's service account has the Kubernetes Engine Service Agent role assigned. This is a predefined role that has many more permissions than you probably need, but I have found that everything works with this role, so I decided to use it for demonstration purposes. You probably want to follow the principle of least privilege instead.
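The same steps can probably be done from the command line instead of the Console. The Python sketch below only assembles the gcloud commands I would expect to need; the project, instance, zone and service account values are placeholders, and the flag spellings (including the compute-rw scope alias and the roles/container.serviceAgent role ID) are assumptions that should be checked against gcloud help before running anything.

```python
def scope_and_role_commands(project, instance, zone, sa_email):
    """Return the gcloud argv lists for the scope change and the role binding."""
    return [
        # Scopes can only be changed while the instance is stopped.
        ["gcloud", "compute", "instances", "stop", instance, "--zone", zone],
        ["gcloud", "compute", "instances", "set-service-account", instance,
         "--zone", zone, "--service-account", sa_email,
         "--scopes", "compute-rw"],          # Compute Engine read/write
        ["gcloud", "compute", "instances", "start", instance, "--zone", zone],
        # Broad predefined role used for the demo; narrow it down later.
        ["gcloud", "projects", "add-iam-policy-binding", project,
         "--member", f"serviceAccount:{sa_email}",
         "--role", "roles/container.serviceAgent"],
    ]


# Placeholder values, for illustration only.
for cmd in scope_and_role_commands(
    "my-project", "master-0", "us-central1-a",
    "1234567890-compute@developer.gserviceaccount.com",
):
    print(" ".join(cmd))
```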
Additional info
There is one more thing I need to mention. It does not affect you, but while testing I found out an interesting thing.
At first I created a one-node cluster (a single master node). Even though this is allowed from the k8s point of view, the controller manager would not let me create an LB and point it at the master node where my application was running. This leads to the conclusion that one cannot use an LB with only a master node and has to create at least one worker node.
PS
I had to figure it out the hard way; by looking at logs, changing things and looking at logs again to see if the issue got solved. I didn't find a single article/documentation page where it is documented in one place. If you manage to solve it for yourself, write the answer for others. Thank you.

Related

Dynamic deployment of stateful applications in GKE

I'm trying to figure out which tools from GKE stack I should apply to my use case which is a dynamic deployment of stateful application with dynamic HTTP endpoints.
Stateful in my case means that I don't want any replicas and load-balancing (because the app doesn't scale horizontally at all). I understand though that in k8s/gke nomenclature I'm still going to be using a 'load-balancer' even though it'll act as a reverse proxy and not actually balance any load.
The use case is as follows. I have some web app where I can request for a 'new instance' and in return I get a dynamically generated url (e.g. http://random-uuid-1.acme.io). This domain should point to a newly spawned, single instance of a container (Pod) hosting some web application. Again, if I request another 'new instance', I'll get a http://random-uuid-2.acme.io which will point to another (separate), newly spawned instance of the same app.
So far I have figured out the following setup. Every time I request a 'new instance' I do the following:
create a new Pod with dynamic name app-${uuid} that exposes HTTP port
create a new Service with NodePort that "exposes" the Pod's HTTP port to the Cluster
create or update (if exists) Ingress by adding a new http rule where I specify that domain X should point at NodePort X
The Ingress mentioned above uses a LoadBalancer as its controller, which is an automated process in GKE.
A few issues that I've already encountered which you might be able to help me out with:
While Pod and NodePort are separate resources per each app, Ingress is shared. I am thus not able to just create/delete a resource but I'm also forced to keep track of what has been added to the Ingress to be then able to append/delete from the yaml which is definitely not the way to do that (i.e. editing yamls). Instead I'd probably want to have something like an Ingress to monitor a specific namespace and create rules automatically based on Pod labels. Say I have 3 pods with labels, app-1, app-2 and app-3 and I want Ingress to automatically monitor all Pods in my namespace and create rules based on the labels of these pods (i.e. app-1.acme.io -> reverse proxy to Pod app-1).
Updating the Ingress with a new HTTP rule takes around a minute before traffic is allowed into the Pod; until then I keep getting 404 even though both the Ingress and the LoadBalancer look 'ready'. I can't figure out what I should watch/wait for to get a clear signal that the Ingress Controller is ready to accept traffic for the newly spawned app.
What would be good practice for managing such a cluster, where you can't strictly define Pod/Service manifests because you are creating them dynamically (with different names, endpoints or rules)? You surely don't want to create and maintain a bunch of YAML files for every application you spawn. I would imagine something similar to consul templates in the case of Consul, but for k8s?
I participated in a similar project and our decision was to use a Kubernetes client library to spawn instances. The instances were managed by a simple web application, which took some customisation parameters, saved them into its database, and then created an instance. Because of the database, there was no problem keeping track of what had been created so far. By querying the database we were able to tell whether such a deployment had already been created, and to update or delete any associated resources.
Each instance consisted of:
a deployment (single or multi-replica, depending on the instance);
a ClusterIP service (no reason to reserve a machine port with NodePort);
an ingress object for shared ingress controller;
and some shared configMaps.
We also used external DNS and cert manager, one to manage DNS records and the other to issue SSL certificates for the ingress. With this setup it took about 10 minutes to deploy a new instance. The pod and ingress controller were ready in seconds, but we had to wait for the certificate, and its readiness depended on whether the issuer's DNS had picked up our new record. This problem might be avoided by using a wildcard domain, but we had to use many different domains so it wasn't an option in our case.
Other than that you might consider writing a Helm chart and make use of helm list command to find existing instances and manage them. Though, this is a rather 'manual' solution. If you want this functionality to be a part of your application - better use a client library for Kubernetes.
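To make the per-instance resource set more concrete, here is a small Python sketch that generates the three manifests (deployment, ClusterIP service, ingress) for one instance as plain dicts. It is not the code from our project; the function name, the app-<uuid> naming, the acme.io domain and the port are assumptions borrowed from the question, and the resulting dicts would still have to be submitted with whatever client library you use.

```python
def instance_manifests(uuid, image, domain="acme.io", port=80):
    """Build Deployment, Service and Ingress manifests for one app instance."""
    name = f"app-{uuid}"
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,  # the app does not scale horizontally
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": "app",
                    "image": image,
                    "ports": [{"containerPort": port}],
                }]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "type": "ClusterIP",  # no node port reserved
            "selector": labels,
            "ports": [{"port": port, "targetPort": port}],
        },
    }
    ingress = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name, "labels": labels},
        "spec": {"rules": [{
            "host": f"{uuid}.{domain}",
            "http": {"paths": [{
                "path": "/",
                "pathType": "Prefix",
                "backend": {"service": {"name": name,
                                        "port": {"number": port}}},
            }]},
        }]},
    }
    return [deployment, service, ingress]


for m in instance_manifests("random-uuid-1", "example/web-app:latest"):
    print(m["kind"], m["metadata"]["name"])
```

Because every instance gets its own Ingress object, deleting an instance means deleting three objects that share a label, and there is no shared YAML to edit.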

Having question about publishing service in Kubernetes

My cluster has one master and two slaves (not on any cloud platform), and I created a deployment with 2 replicas so that each slave has one pod; the image I'm running is tensorflow-jupyter. Then I created a NodePort type service for this deployment, thinking I could use these two pods separately at the same time, but I was wrong.
Tensorflow-jupyter requires the token it generates in order to log in. Everything is fine if there is only 1 pod, but with 2 or more replicas I get a server error after logging in, I am logged out after pressing F5, and then I can't use the token to log in anymore. A similar situation happens with Wordpress, too.
I think I shouldn't use the NodePort type for this, but I don't know if another service type can solve this problem. I don't have a load balancer to try and I don't know how to use ExternalName.
Is there any way to expose a service for a deployment with 2 or more replicas (one pod per slave)? Or can I only create many deployments with 1 pod each and then expose a separate service for each deployment?
It seems the application you're trying to deploy requires sticky session support. This is not supported out of the box with a NodePort Service; you have to expose your application through an Ingress resource, controlled by an Ingress Controller, to take advantage of its reverse-proxy capabilities (in this case, sticky sessions).
I'm not suggesting you use the sessionAffinity=ClientIP Service option since it's allowed only for ClusterIP Service resources and according to your question it seems the application has to be accessed outside of the cluster.

How to achieve hazelcast syncing in kubernetes with different pods (App and Hazel instance)?

They should be able to communicate, and updates should be visible to each other; I mainly mean syncing.
DiscoveryStrategyConfig strategyConfig = new DiscoveryStrategyConfig(factory);
// strategyConfig.addProperty("service-dns", "my-serice-name.my-namespace.svc.cluster.local");
// strategyConfig.addProperty("service-dns-timeout", "300");
strategyConfig.addProperty("service-name", "my-service-name");
strategyConfig.addProperty("service-label-name", "my-service-label");
strategyConfig.addProperty("service-label-value", true);
strategyConfig.addProperty("namespace", "my-namespace");
I have followed https://github.com/hazelcast/hazelcast-kubernetes. I used the first approach and was able to see the instances (one per pod, not in one members list), but they were not communicating (if I do CRUD in one Hazelcast instance, it is not reflected in the other). I want to use the DNS strategy, but with it I was not even able to create the instance.
Please check the following:
1. Discovery Strategy
For Kubernetes you need to use the HazelcastKubernetesDiscoveryStrategy class. It can be defined in the XML configuration or in the code (as in your case).
2. Labels
Check that the service for your Hazelcast cluster has the labels you specified. The same when it comes to the service name and namespace.
3. Configuration
There are two ways to configure the discovery: DNS Lookup and REST API. Each has special requirements. You mentioned DNS Lookup, but the configuration you've sent actually uses REST API.
DNS Lookup
Your Hazelcast cluster service must be a headless ClusterIP service.
spec:
  type: ClusterIP
  clusterIP: None
REST API
You need to grant your app access to the Kubernetes API. Please check: https://github.com/hazelcast/hazelcast-code-samples/blob/master/hazelcast-integration/kubernetes/rbac.yaml
Other helpful resources
Hazelcast Kubernetes Code Sample
Hazelcast OpenShift Client app (should also work in Kubernetes)

Intercluster RBAC with service-account

Our infrastructure currently has 2 Kubernetes clusters, with one cluster (cluster-1) creating pods in the other cluster (cluster-2). Since we are on Kubernetes 1.7.x, we are able to make this work.
However, Kubernetes 1.8 added support for RBAC, and as a result we can no longer create pods in the new cluster.
We already added support for Service Accounts and made sure that the RoleBindings are properly set up. But the main issue is that the service account is not propagated outside of the cluster (and rightly so). The user that cluster-2 sees on the request is called 'client', so when we added a RoleBinding with 'client' as a User, everything worked.
This is most certainly not the correct solution, as now any cluster that talks to Kubernetes API server can create a pod.
Is there support for RBAC that works cross cluster? Or, is there a way to propagate the service info through to the cluster we want to create the pods in?
P.S.: Our Kubernetes clusters are currently on GKE, but we would like this to work on any Kubernetes engine.
Your cluster-1 SA uses a kubecfg (for cluster-2) which resolves to the user "client". The only way to solve this is to generate a kubecfg (for cluster-2) with an identity (cert/token) associated with your cluster-1 SA. There are many ways to do that: https://kubernetes.io/docs/admin/authentication/
Simplest way is to create an identical SA in cluster-2 and use its token in the kubecfg in cluster-1. Give RBAC only to that SA.
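As a sketch of what such a kubecfg could look like, the Python snippet below assembles a kubeconfig structure that authenticates with a service account token. The function name, the cluster and SA names, the server URL, the CA data and the token are all placeholders; how you obtain the real token from cluster-2 is up to you.

```python
import json


def kubeconfig_for_sa(cluster_name, server, ca_data_b64, sa_name, token):
    """Build a kubeconfig dict that uses a service account bearer token."""
    context_name = f"{sa_name}@{cluster_name}"
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [{
            "name": cluster_name,
            "cluster": {
                "server": server,
                "certificate-authority-data": ca_data_b64,
            },
        }],
        "users": [{"name": sa_name, "user": {"token": token}}],
        "contexts": [{
            "name": context_name,
            "context": {"cluster": cluster_name, "user": sa_name},
        }],
        "current-context": context_name,
    }


# Placeholder values, for illustration only.
cfg = kubeconfig_for_sa(
    "cluster-2", "https://cluster-2.example.com",
    "BASE64_CA_CERT", "pod-creator", "SA_TOKEN_FROM_CLUSTER_2",
)
print(json.dumps(cfg, indent=2))
```

The output is JSON, which is also valid YAML, so it should be usable as a kubeconfig file. With this in place cluster-2 sees the request as coming from the SA rather than from "client", and RBAC can be granted to that SA only.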

Should the Kubernetes API server be accessible as https://kubernetes:443 from any pod in the cluster?

According to the Kubernetes docs,
The kubernetes service (in all namespaces) is configured with a virtual IP address that is redirected (via kube-proxy) to the HTTPS endpoint on the apiserver.
For some reason I can't access kubernetes from a non-default namespace, unless I manually create the service there (or use kubernetes.default). Looking at the code I see the kubernetes service is created in namespace default, is it also available in other namespaces? If so, how is that accomplished? How might I debug it?
I've been finding it difficult to Google this, since "kubernetes service" is not really a great search keyword.
For the record, I'm using GKE.
Service kubernetes is only available in Namespace default.
If you want to access API server using this service, you need to use kubernetes.default
Services are assigned a DNS A record for a name of the form
my-svc.my-namespace.svc.cluster.local
This resolves to the cluster IP of the Service.
That means, you need to use kubernetes.default.svc.cluster.local
You can skip svc.cluster.local.
So to access a kubernetes Service, you need to provide kubernetes.default.
If you want to access from default namespace, you can skip namespace part.
See the details here.
Also,
When you create a pod, if you do not specify a service account, it is automatically assigned the default service account in the same namespace.
You can access the API from inside a pod using automatically mounted service account credentials, as described in Accessing the Cluster.
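To tie the naming rule and the mounted credentials together, here is a short Python sketch. The helper names are made up for this example; the credential directory is the standard mount point for service account credentials inside a pod, and the cluster domain is assumed to be the default cluster.local.

```python
# Standard mount point of the service account credentials inside a pod.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def service_dns_names(service, namespace, cluster_domain="cluster.local"):
    """Return the short and the fully qualified DNS name of a Service."""
    short = f"{service}.{namespace}"
    return short, f"{short}.svc.{cluster_domain}"


def in_cluster_api_settings():
    """Collect what a client inside a pod needs to reach the API server."""
    short, _fqdn = service_dns_names("kubernetes", "default")
    return {
        "url": f"https://{short}:443",
        "token_file": f"{SA_DIR}/token",   # bearer token of the pod's SA
        "ca_file": f"{SA_DIR}/ca.crt",     # CA used to verify the API server
    }


print(service_dns_names("kubernetes", "default"))
print(in_cluster_api_settings())
```

From a pod in any namespace, the short name kubernetes.default is enough; only pods in the default namespace can drop the namespace part as well.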