When I try to install the metrics server on an EKS Fargate cluster, it throws this error:
0/4 nodes are available: 4 Insufficient pods.
I am following the instructions here to install the metrics server: http://arun-gupta.github.io/hpa-app-metrics/
Can someone tell me why this error occurs?
Update:
When I create an additional deployment, it allocates new pods and works fine, but this error appears when the metrics server is installed by following the instructions in the link above.
From the AWS Fargate docs here
Fargate profiles support specifying subnets from VPC secondary CIDR blocks. You may want to specify a secondary CIDR block because there are a limited number of IP addresses available in a subnet. As a result, there are a limited number of pods that can be created in the cluster. Using different subnets for pods allows you to increase the number of available IP addresses. For more information, see Adding IPv4 CIDR blocks to a VPC.
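Each Fargate pod gets its own ENI and IP address from the subnet, so the subnet size directly caps how many pods can run. A rough back-of-the-envelope sketch using Python's `ipaddress` module (the 5-address reservation per subnet is standard AWS behavior; the helper name is just illustrative):

```python
import ipaddress

def max_fargate_pods(cidr: str) -> int:
    """Rough upper bound on Fargate pods a subnet can hold.

    Each Fargate pod consumes one IP from the subnet, and AWS
    reserves 5 addresses in every subnet (network address, VPC
    router, DNS, one reserved for future use, and broadcast).
    """
    subnet = ipaddress.ip_network(cidr)
    return subnet.num_addresses - 5

print(max_fargate_pods("10.0.1.0/24"))  # 251 usable addresses
print(max_fargate_pods("10.0.0.0/20"))  # 4091 usable addresses
```

This is why a secondary CIDR block helps: it adds whole new subnets' worth of usable addresses rather than trying to squeeze more pods into an exhausted one.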
I am having an issue with GKE (Autopilot).
I am deploying StatefulSets, and for each StatefulSet I deploy a service with a public IP.
But after deploying around 10-12 StatefulSets, any new one I try to deploy remains red (Unschedulable) with the message "Insufficient cpu".
When I go to the cluster section, it shows a different message:
Can’t scale up because instances in managed instance groups hosting node pools ran out of IPs
Image of error: https://i.imgur.com/t8I4Yij.png
I am new to GKE and tried the suggestions linked from that image, but most of the steps fail with an error saying they are not supported in Autopilot mode.
Any help/suggestion is appreciated.
Thanks.
If you are on GKE Autopilot, it will normally create new nodes in the cluster when it is out of CPU or has no space left to schedule Pods.
However, if the issue is IP exhaustion, you can read more here: https://cloud.google.com/kubernetes-engine/docs/how-to/alias-ips#not_enough_space
Cluster autoscaler might not have enough unallocated IP address space to use to add new nodes or Pods, resulting in scale-up failures, which are indicated by eventResult events with the reason scale.up.error.ip.space.exhausted. You can add more IP addresses for nodes by expanding the primary subnet, or add new IP addresses for Pods using discontiguous multi-Pod CIDR. For more information, see Not enough free IP space for Pods.
However, since you are on Autopilot, you likely won't be able to access the cluster's underlying subnets and node pools.
Unfortunately, the only option at this point is to create a new cluster and make sure that the CIDR ranges you assign to the cluster have enough available IPs for the number of nodes you believe you'll need. The default setting for Autopilot should be enough.
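When sizing the new cluster's Pod range, keep in mind that Standard GKE defaults to a maximum of 110 Pods per node and therefore reserves a /24 per node from the Pod secondary range; Autopilot may use a smaller per-node block, which the parameter below can model. A rough sketch of the resulting node ceiling (helper name is illustrative):

```python
import ipaddress

def max_nodes(pod_range: str, per_node_prefix: int = 24) -> int:
    """Upper bound on nodes a GKE Pod secondary range can support.

    Standard GKE reserves a /24 per node by default (for the
    110-Pods-per-node maximum); pass a different per_node_prefix
    for other max-Pods settings.
    """
    rng = ipaddress.ip_network(pod_range)
    return 2 ** (per_node_prefix - rng.prefixlen)

print(max_nodes("10.16.0.0/14"))  # 1024 nodes with the /24 default
print(max_nodes("10.0.0.0/21"))   # only 8 nodes from a /21
```

A too-small Pod range is exactly the kind of thing that cannot be changed after cluster creation, hence the advice to get it right up front.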
I'm wondering if anyone can help with my issue, here's the setup:
We have 2 separate kubernetes clusters in GKE, running on v1.17, and they each sit in a separate project
We have set up VPC peering between the two projects
On cluster 1, we have 'service1' which is exposed by an internal HTTPS load balancer, we don't want this to be public
On cluster 2, we intend on being able to access 'service1' via the internal load balancer, and it should do this over the VPC peering connection between the two projects
Here's the issue:
When I'm connected via SSH on a GKE node on cluster 2, I can successfully run a curl request to access https://service1.domain.com running on cluster 1, and get the expected response, so traffic is definitely routing from cluster 2 > cluster 1. However, when I'm running the same curl command from a POD, running on a GKE node, the same curl request times out.
I have done as much troubleshooting as I can, including telnet, traceroute, etc., and I'm really stuck on why this might be. If anyone can shed light on the difference here, that would be great.
I did wonder whether pod networking is somehow forwarding traffic over the cluster's public IP rather than over the VPC peering connection.
So it seems you're not using a "VPC-native" cluster, and what you need is "IP masquerading".
From this document:
"A GKE cluster uses IP masquerading so that destinations outside of the cluster only receive packets from node IP addresses instead of Pod IP addresses. This is useful in environments that expect to only receive packets from node IP addresses."
You can use ip-masq-agent or k8s-custom-iptables. After this it will work, because the call will appear to come from the node rather than from inside the pod.
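If you go the ip-masq-agent route, it is configured through a ConfigMap in the kube-system namespace. A minimal sketch (the CIDR below is just an example of a range that should not be masqueraded; adjust it to your own VPC ranges):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ip-masq-agent
  namespace: kube-system
data:
  config: |
    nonMasqueradeCIDRs:
      - 10.0.0.0/8
    masqLinkLocal: false
    resyncInterval: 60s
```

With this in place, traffic from pods to destinations outside the nonMasqueradeCIDRs list is SNAT'ed to the node IP, which is the behavior the curl-from-a-pod test was missing.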
As mentioned in one of the answers, IP aliases (VPC-native) should work out of the box. If you are using a route-based GKE cluster rather than a VPC-native one, you need to export custom routes.
As per this document
By default, VPC Network Peering with GKE is supported when used with IP aliases. If you don't use IP aliases, you can export custom routes so that GKE containers are reachable from peered networks.
This is also explained in this document
If you have GKE clusters without VPC native addressing, you might have multiple static routes to direct traffic to VM instances that are hosting your containers. You can export these static routes so that the containers are reachable from peered networks.
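If you do need to export custom routes, it is enabled on the peering itself. A sketch with placeholder names (PEERING_NAME and the network names are whatever you used when creating the peering):

```shell
# On the network hosting the GKE cluster: export its routes to the peer
gcloud compute networks peerings update PEERING_NAME \
    --network=NETWORK_NAME \
    --export-custom-routes

# On the consuming network: import the peer's custom routes
gcloud compute networks peerings update PEERING_NAME \
    --network=OTHER_NETWORK_NAME \
    --import-custom-routes
```

Export on one side only takes effect if the other side imports, so both commands are usually needed.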
The problem you're facing seems similar to the one mentioned in this SO question; perhaps your pods are using IPs outside of the VPC range and for that reason cannot access the peered VPC?
UPDATE: In Google Cloud, I tried to access the service from another cluster which had VPC-native networking enabled, which I believe allows pods to use the VPC routing and possibly the internal IPs.
Problem solved :-)
We upgraded our existing development cluster from 1.13.6-gke.13 to 1.14.6-gke.13 and our pods can no longer reach our in-house network over our Google Cloud VPN. Our production cluster (still on 1.13) shares the same VPC network and VPN tunnels and is still working fine. The only thing that changed was the upgrade of the admin node and node pool to 1.14 on the development cluster.
I have opened a shell into a pod on the development cluster and attempted to ping the IP address of an in-house server to which we need access. No response received. Doing the same on a pod in our production cluster works as expected.
I SSH'd into a node in the cluster and was able to ping the in-house network, so it's just the pods that have networking issues.
Access to the publicly exposed services in the cluster is still working as expected. Health checks are OK.
UPDATE:
I created a new node pool using the latest 1.13 version, drained the pods from the 1.14 pool and all is well with the pods running on the 1.13 pool again. Something is definitely up with 1.14. It remains to be seen if this is an issue cause by some new configuration option or just a bug.
RESOLUTION:
IP masquerading is discussed here https://cloud.google.com/kubernetes-engine/docs/how-to/ip-masquerade-agent. My resolution was to add the pod subnets for each of my clusters to the list of advertised networks in my VPN Cloud Routers on GCP. So now the pod networks can traverse the VPN.
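On the gcloud side, adding the Pod subnets to the Cloud Router's advertisements could look roughly like this (router name, region, and the advertised ranges are placeholders for your own values):

```shell
# Switch the Cloud Router to custom advertisements and include the
# Pod CIDR(s) alongside the regular subnet routes
gcloud compute routers update VPN_ROUTER_NAME \
    --region=REGION \
    --advertisement-mode=custom \
    --set-advertisement-groups=all_subnets \
    --set-advertisement-ranges=POD_CIDR_1,POD_CIDR_2
```

Keeping the all_subnets group in place preserves the existing subnet advertisements while the Pod ranges are added on top.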
Until GKE 1.13.x, even when not strictly necessary, GKE would masquerade pods trying to reach external IPs, even on the same VPC as the cluster, unless the destination was in the 10.0.0.0/8 range.
Since the 1.14.x versions, this rule is no longer added by default on clusters. This means that pods trying to reach any endpoint are now seen with their Pod IP instead of the node IP, because the masquerade rule was removed.
You could try recreating your Cloud VPN to include the Pod IP range.
I have deployed a Kubernetes cluster on GCP and added some deployments to it. Those deployments use external resources that are protected by a security policy that rejects connections from unallowed IP addresses.
So, for a pod to connect to an external resource, I need to manually allow the IP address of the node hosting that pod.
It is also possible for me to allow a range of IP addresses in which my nodes are expected to be running.
Until now, I have only found their internal IP address range. It looks like this:
Pod address range 10.16.0.0/14
The question is: how do I find the range of external IP addresses for my nodes?
Let's begin with the IPs that are assigned to Nodes:
When you create a Kubernetes cluster, GCP creates Compute Engine machines in the background, each with a specific internal and external IP address.
In your case, just go to the Compute Engine section of the Google Cloud Console and capture all the external IPs of the VMs whose names start with gke-, then whitelist them.
As for the range: in GCP, only the internal IP ranges are known in advance; external IP addresses are randomly assigned from a pool of IPs, so you need to whitelist them one at a time.
To get the pod descriptions and IPs, run kubectl describe pods.
If you go to the Compute Engine instance page, it shows the instances that make up the cluster, with the external IPs shown on the right side. For the IPs of the actual pods, use kubectl.
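A couple of ways to pull those addresses from the command line (the gcloud filter simply matches the usual gke- name prefix mentioned above):

```shell
# Node external IPs straight from Kubernetes (EXTERNAL-IP column)
kubectl get nodes -o wide

# The same via the Compute Engine API
gcloud compute instances list \
    --filter="name ~ ^gke-" \
    --format="table(name, networkInterfaces[0].accessConfigs[0].natIP)"

# Pod IPs (these are internal, drawn from the 10.16.0.0/14 range)
kubectl get pods -o wide
```

Note that the node external IPs are ephemeral by default, so the whitelist has to be updated whenever nodes are recreated.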
I am trying to install Kubernetes on my on-premise Ubuntu 16.04 server, referring to the following documentation:
https://medium.com/@Grigorkh/install-kubernetes-on-ubuntu-1ac2ef522a36
After installing kubelet, kubeadm, and kubernetes-cni, I found that kubeadm is initialized with the following command:
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.133.15.28 --kubernetes-version stable-1.8
Here I am totally confused about why we are setting the CIDR and the API server advertise address. I am adding a few of my points of confusion about Kubernetes here:
Why are we specifying the CIDR and --apiserver-advertise-address here?
How can I find these two addresses for my server?
And why is flannel used in the Kubernetes installation?
I am new to this containerization and Kubernetes world.
Why are we specifying the CIDR and --apiserver-advertise-address here?
And why is flannel used in the Kubernetes installation?
Kubernetes uses the Container Network Interface (CNI) to create a special virtual network inside your cluster for communication between pods.
Here is some explanation "why" from documentation:
Kubernetes imposes the following fundamental requirements on any networking implementation (barring any intentional network segmentation policies):
all containers can communicate with all other containers without NAT
all nodes can communicate with all containers (and vice-versa) without NAT
the IP that a container sees itself as is the same IP that others see it as
Kubernetes applies IP addresses at the Pod scope - containers within a Pod share their network namespaces - including their IP address. This means that containers within a Pod can all reach each other’s ports on localhost. This does imply that containers within a Pod must coordinate port usage, but this is no different than processes in a VM. This is called the “IP-per-pod” model.
So, flannel is one of the CNI plugins that can be used to create the network that connects all your pods, and the CIDR option defines a subnet for that network. There are many alternative CNI plugins with similar functions.
If you want more details about how networking works in Kubernetes, you can follow the link above or, for example, read here.
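For completeness, the usual flannel flow is to initialize the control plane and then apply flannel's manifest. A sketch (the manifest URL changes between flannel releases, so check the project's releases page; flannel's default configuration expects the 10.244.0.0/16 Pod CIDR used in the question's kubeadm command):

```shell
kubeadm init --pod-network-cidr=10.244.0.0/16 \
    --apiserver-advertise-address=10.133.15.28

# Install flannel once the control plane is up
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
```

Nodes stay NotReady until a CNI plugin such as flannel is installed, which is why this step comes right after kubeadm init.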
How can I find these two addresses for my server?
The API server advertise address has to be a single, static address. That address is used by all components to communicate with the API server. Unfortunately, Kubernetes has no support for multiple API server addresses per master.
However, you can still use as many addresses on your server as you want; only one of them can be defined as --apiserver-advertise-address. The only requirement for it is that it must be accessible from all the nodes in your cluster.
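One practical check before running kubeadm init: make sure the Pod CIDR you pass does not overlap the network your nodes live on, or routing inside the cluster will break. A small sketch with Python's `ipaddress` module, using the addresses from the question (the helper name is just illustrative):

```python
import ipaddress

def cidrs_overlap(pod_cidr: str, host_cidr: str) -> bool:
    """Check whether the Pod network chosen for --pod-network-cidr
    collides with the network the nodes themselves live on."""
    return ipaddress.ip_network(pod_cidr).overlaps(
        ipaddress.ip_network(host_cidr))

# Pod range from the kubeadm command vs. the node LAN from the question:
print(cidrs_overlap("10.244.0.0/16", "10.133.15.0/24"))  # False - safe
print(cidrs_overlap("10.244.0.0/16", "10.244.1.0/24"))   # True - pick another range
```

As for finding the advertise address itself: it is simply one of the server's own IPs (shown by ip addr), chosen so that every node can reach it.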