" error: metrics not available yet " even metrics-server is running - kubernetes

I am new to Kubernetes and was trying to create a horizontal pod autoscaler. For this I deployed the metrics server, using the official GitHub repository for metrics-server. I can see it running, as shown below:
NAME READY STATUS RESTARTS AGE
pod/metrics-server-766c9b8df-dltgd 1/1 Running 0 13m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/metrics-server ClusterIP 10.106.14.34 <none> 443/TCP 37m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/metrics-server 1/1 1 1 37m
I deployed a pod which is running on the worker nodes, and hence I can see the metrics below:
ubuntu@master:~/metrics-server$ kubectl top pods
NAME CPU(cores) MEMORY(bytes)
demo-deploy-d86b8cfcc-2jg9w 2m 289Mi
demo-deploy-d86b8cfcc-5xtww 1m 284Mi
demo-deploy-d86b8cfcc-hk2bq 1m 278Mi
demo-deploy-d86b8cfcc-jkdmc 1m 286Mi
But the issue is:
ubuntu@master:~/metrics-server$ kubectl top nodes
error: metrics not available yet
I searched a lot but unfortunately couldn't find an answer for this. Can someone help me understand why this is happening?
ubuntu@master:~/metrics-server$ kubectl top nodes
This should show the metrics for the worker nodes, but unfortunately I am not getting them - not even a blank status.
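A few checks that usually narrow this down (a rough sketch, assuming metrics-server was deployed the usual way; add -n kube-system or whichever namespace yours actually runs in):
# Is the metrics API registered and reporting as Available?
kubectl get apiservice v1beta1.metrics.k8s.io
# Does the API return node metrics at all?
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
# Check metrics-server's own logs; kubelet scrape errors here often mean the
# deployment needs the --kubelet-insecure-tls and/or
# --kubelet-preferred-address-types=InternalIP flags.
kubectl logs deploy/metrics-server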

Related

kube-apiserver: constantly 5 to 10% CPU, although there is no single request

I installed kind to play around with Kubernetes.
If I use top and sort by CPU usage (key C), then I see that kube-apiserver is constantly consuming 5 to 10% CPU.
Why?
I haven't installed anything yet:
guettli@p15:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-558bd4d5db-ntg7c 1/1 Running 0 40h
kube-system coredns-558bd4d5db-sx8w9 1/1 Running 0 40h
kube-system etcd-kind-control-plane 1/1 Running 0 40h
kube-system kindnet-9zkkg 1/1 Running 0 40h
kube-system kube-apiserver-kind-control-plane 1/1 Running 0 40h
kube-system kube-controller-manager-kind-control-plane 1/1 Running 0 40h
kube-system kube-proxy-dthwl 1/1 Running 0 40h
kube-system kube-scheduler-kind-control-plane 1/1 Running 0 40h
local-path-storage local-path-provisioner-547f784dff-xntql 1/1 Running 0 40h
guettli@p15:~$ kubectl get services --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 40h
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 40h
guettli@p15:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kind-control-plane Ready control-plane,master 40h v1.21.1
guettli@p15:~$ kubectl get nodes --all-namespaces
NAME STATUS ROLES AGE VERSION
kind-control-plane Ready control-plane,master 40h v1.21.1
I am curious. Where does the CPU usage come from? How can I investigate this?
Even in an empty cluster with just one master node, there are at least 5 components that reach out to the API server on a regular basis:
kubelet for the master node
Controller manager
Scheduler
CoreDNS
Kube proxy
This is because the API server acts as the single entry point for all components in Kubernetes to learn what the cluster state should be and to take action if needed.
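A lightweight first check is to ask the API server which clients, resources and verbs generate the most requests, via its own Prometheus metrics (a hedged sketch; the metric is called apiserver_request_total on recent releases and may be named differently on older ones):
# top request counters by code/resource/verb
kubectl get --raw /metrics | grep '^apiserver_request_total' | sort -k2 -rn | head -20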
If you are interested in the details, you could enable audit logs in the API server and get a very verbose file with all the requests being made.
How to do so is beyond the scope of this answer, but you can start from the apiserver documentation.
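As a rough sketch of what that involves (the file paths are assumptions, and wiring the flags into a kind control plane needs a kind cluster config, so treat this as a starting point rather than a recipe):
# audit-policy.yaml - log only request metadata to keep the output manageable
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata

# kube-apiserver flags (e.g. in the static pod manifest):
#   --audit-policy-file=/etc/kubernetes/audit-policy.yaml
#   --audit-log-path=/var/log/kubernetes/audit.log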

Openshift: How to Delete or Manage Specific Pod

I spun up an OpenShift 4.6 cluster on AWS (3 masters, 2 workers) for learning/play. Since it was just for learning, I shut down all nodes at once using the AWS web console. When I brought them back up a few days later, the console would not spin up.
So I followed this doc on restarting the cluster. That didn't work, so I started looking at how the pods for the console itself were doing. I ran oc get pods -n openshift-console -o wide -w and got:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
console-59f557f67d-ks5kq 0/1 Pending 0 13m <none> <none> <none> <none>
console-59f557f67d-q6zrc 0/1 UnexpectedAdmissionError 0 4d6h <none> ip-10-0-139-41.us-west-2.compute.internal <none> <none>
console-59f557f67d-w4q7l 0/1 UnexpectedAdmissionError 0 56m <none> ip-10-0-131-234.us-west-2.compute.internal <none> <none>
console-59f557f67d-zvxzn 0/1 UnexpectedAdmissionError 0 59m <none> ip-10-0-131-234.us-west-2.compute.internal <none> <none>
downloads-55f4ff79-lqdj7 1/1 Running 0 59m 10.131.0.4 ip-10-0-208-19.us-west-2.compute.internal <none> <none>
downloads-55f4ff79-mrfzn 1/1 Running 0 59m 10.131.0.13 ip-10-0-208-19.us-west-2.compute.internal <none> <none>
So seeing that there were a few messed up pods, I wanted to look at their logs, but I don't know how to target just a specific pod by the NAME column above. I tried each of these:
oc get pods/console-59f557f67d-q6zrc
oc get podtemplates/console-59f557f67d-q6zrc
I consistently get Error from server (NotFound): pods "console-59f557f67d-q6zrc" not found.
I then found the command oc get pods -n openshift-console -o name which reveals:
pod/console-59f557f67d-ks5kq
pod/console-59f557f67d-q6zrc
pod/console-59f557f67d-w4q7l
pod/console-59f557f67d-zvxzn
pod/downloads-55f4ff79-lqdj7
pod/downloads-55f4ff79-mrfzn
So I was right, it's a "pod", but if I try to run anything like oc logs <name>, it returns the same error that it can't be found. Is this a bug? Does OpenShift think there are Pods around that no longer exist, and is it routing to those nonexistent Pods?
If not, what resource type is the thing under the NAME column? How do I target it with say an oc logs or oc delete command?
I discovered the correct syntax in this Red Hat Bugzilla issue.
The correct syntax is to place the name as another argument after the namespace declaration.
Examples:
oc describe pod -n openshift-console console-59f557f67d-zvxzn
oc logs -n openshift-console console-59f557f67d-zvxzn
oc delete pod -n openshift-console console-59f557f67d-zvxzn
I'll follow up and update when I find this reference in the official docs or command line help.
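The TYPE/NAME form works too, which is handy when copying names straight out of oc get ... -o name (a small sketch using the same pod):
oc logs -n openshift-console pod/console-59f557f67d-zvxzn
oc delete -n openshift-console pod/console-59f557f67d-zvxzn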

How to scale a kubernetes cluster while limiting costs on GCP

We have a GKE cluster set up on google cloud platform.
We have an activity that requires 'bursts' of computing power.
Imagine that we usually do 100 computations an hour on average, then suddenly we need to be able to process 100,000 in less than two minutes. However, most of the time everything is close to idle.
We do not want to pay for idle servers 99% of the time, and want to scale the cluster depending on actual use (no data persistence needed, servers can be deleted afterwards). I looked up the documentation available on Kubernetes regarding autoscaling: adding more pods with the HPA and adding more nodes with the cluster autoscaler.
However, it doesn't seem like any of these solutions would actually reduce our costs or improve performance, because they do not seem to scale past the GCP plan:
Imagine that we have a Google plan with 8 CPUs. My understanding is that if we add more nodes with the cluster autoscaler, then instead of having e.g. 2 nodes using 4 CPUs each we will have 4 nodes using 2 CPUs each, but the total available computing power will still be 8 CPUs.
The same reasoning goes for the HPA, with more pods instead of more nodes.
If we have the 8 CPU payment plan but only use 4 of them, my understanding is that we still get billed for 8, so scaling down is not really useful.
What we want is autoscaling to change our payment plan temporarily (imagine from n1-standard-8 to n1-standard-16) and get actual new computing power.
I can't believe we are the only ones with this use case, but I cannot find any documentation on this anywhere! Did I misunderstand something?
TL;DR:
Create a small persistent node-pool
Create a powerful node-pool that can be scaled to zero (and stop incurring charges) while not in use.
Tools used:
GKE’s Cluster Autoscaling, Node selector, Anti-affinity rules and Taints and tolerations.
GKE Pricing:
From GKE Pricing:
Starting June 6, 2020, GKE will charge a cluster management fee of $0.10 per cluster per hour. The following conditions apply to the cluster management fee:
One zonal cluster per billing account is free.
The fee is flat, irrespective of cluster size and topology.
Billing is computed on a per-second basis for each cluster. The total amount is rounded to the nearest cent, at the end of each month.
From Pricing for Worker Nodes:
GKE uses Compute Engine instances for worker nodes in the cluster. You are billed for each of those instances according to Compute Engine's pricing, until the nodes are deleted. Compute Engine resources are billed on a per-second basis with a one-minute minimum usage cost.
Enter Cluster Autoscaler:
It automatically resizes your GKE cluster's node pools based on the demands of your workloads. When demand is high, cluster autoscaler adds nodes to the node pool. When demand is low, cluster autoscaler scales back down to a minimum size that you designate. This can increase the availability of your workloads when you need it, while controlling costs.
Cluster Autoscaler cannot scale the entire cluster to zero; at least one node must always be available in the cluster to run system pods.
Since you already have a persistent workload, this won't be a problem. What we will do is create a new node pool:
A node pool is a group of nodes within a cluster that all have the same configuration. Every cluster has at least one default node pool, but you can add other node pools as needed.
For this example I'll create two node pools:
A default node pool with a fixed size of one node with a small instance size (emulating the cluster you already have).
A second node pool with more compute power to run the jobs (I'll call it power-pool).
Choose a machine type with the power you need to run your AI jobs; for this example I'll create an n1-standard-8.
This power-pool will have autoscaling set to allow a maximum of 4 nodes and a minimum of 0 nodes.
If you would like to add GPUs, you can check this great guide: Scale to almost zero + GPUs.
Taints and Tolerations:
Only the jobs related to the AI workload will run on the power-pool; for that, use a node selector in the job pods to make sure they run on the power-pool nodes.
Set an anti-affinity rule to ensure that two of your training pods cannot be scheduled on the same node (this optimizes the price-performance ratio and is optional depending on your workload).
Add a taint to the power-pool to prevent other workloads (and system pods) from being scheduled on the autoscalable pool (a manual equivalent is sketched right after this list).
Add matching tolerations to the AI jobs to let them run on those nodes.
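For reference, outside of GKE the same label and taint could be applied to an existing node by hand (the node name is a placeholder; on GKE the --node-labels/--node-taints flags used in the reproduction below do this for you):
kubectl label nodes <power-pool-node> load=on-demand
kubectl taint nodes <power-pool-node> reserved-pool=true:NoSchedule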
Reproduction:
Create the Cluster with the persistent default-pool:
PROJECT_ID="YOUR_PROJECT_ID"
GCP_ZONE="CLUSTER_ZONE"
GKE_CLUSTER_NAME="CLUSTER_NAME"
AUTOSCALE_POOL="power-pool"
gcloud container clusters create ${GKE_CLUSTER_NAME} \
--machine-type="n1-standard-1" \
--num-nodes=1 \
--zone=${GCP_ZONE} \
--project=${PROJECT_ID}
Create the auto-scale pool:
gcloud container node-pools create ${AUTOSCALE_POOL} \
--cluster=${GKE_CLUSTER_NAME} \
--machine-type=n1-standard-8 \
--node-labels=load=on-demand \
--node-taints=reserved-pool=true:NoSchedule \
--enable-autoscaling \
--min-nodes=0 \
--max-nodes=4 \
--zone=${GCP_ZONE} \
--project=${PROJECT_ID}
Note about parameters:
--node-labels=load=on-demand: Add a label to the nodes in the power pool to allow selecting them in our AI job using a node selector.
--node-taints=reserved-pool=true:NoSchedule: Add a taint to the nodes to prevent any other workload from accidentally being scheduled in this node pool.
Here you can see the two pools we created: the static pool with 1 node and the autoscalable pool with 0-4 nodes.
Since we don't have any workload running on the autoscalable node pool, it shows 0 nodes running (and there is no charge while no node is running).
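The same view is available from the command line (reusing the variables defined above):
gcloud container node-pools list \
--cluster=${GKE_CLUSTER_NAME} \
--zone=${GCP_ZONE} \
--project=${PROJECT_ID}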
Now we'll create a job that creates 4 parallel pods that run for 5 minutes.
This job will have the following parameters to differentiate it from normal pods:
parallelism: 4: to use all 4 nodes to enhance performance
nodeSelector.load: on-demand: to assign to the nodes with that label.
podAntiAffinity: to declare that we do not want two pods with the same label app: greedy-app running on the same node (optional).
tolerations: to match the taint that we attached to the nodes, so these pods are allowed to be scheduled on those nodes.
apiVersion: batch/v1
kind: Job
metadata:
  name: greedy-job
spec:
  parallelism: 4
  template:
    metadata:
      name: greedy-job
      labels:
        app: greedy-app
    spec:
      containers:
      - name: busybox
        image: busybox
        args:
        - sleep
        - "300"
      nodeSelector:
        load: on-demand
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - greedy-app
            topologyKey: "kubernetes.io/hostname"
      tolerations:
      - key: reserved-pool
        operator: Equal
        value: "true"
        effect: NoSchedule
      restartPolicy: OnFailure
Now that our cluster is in standby, we will apply the job YAML we just created (I'll call it greedyjob.yaml). This job will run four processes in parallel, completing after about 5 minutes.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 42m v1.14.10-gke.27
$ kubectl get pods
No resources found in default namespace.
$ kubectl apply -f greedyjob.yaml
job.batch/greedy-job created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
greedy-job-2xbvx 0/1 Pending 0 11s
greedy-job-72j8r 0/1 Pending 0 11s
greedy-job-9dfdt 0/1 Pending 0 11s
greedy-job-wqct9 0/1 Pending 0 11s
Our job was applied, but the pods are Pending. Let's see what's going on with those pods:
$ kubectl describe pod greedy-job-2xbvx
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 28s (x2 over 28s) default-scheduler 0/1 nodes are available: 1 node(s) didn't match node selector.
Normal TriggeredScaleUp 23s cluster-autoscaler pod triggered scale-up: [{https://content.googleapis.com/compute/v1/projects/owilliam/zones/us-central1-b/instanceGroups/gke-autoscale-to-zero-clus-power-pool-564148fd-grp 0->1 (max: 4)}]
The pod can't be scheduled on the current node due to the rules we defined, so this triggers a scale-up routine on our power-pool. This is a very dynamic process; after about 90 seconds the first node is up and running:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
greedy-job-2xbvx 0/1 Pending 0 93s
greedy-job-72j8r 0/1 ContainerCreating 0 93s
greedy-job-9dfdt 0/1 Pending 0 93s
greedy-job-wqct9 0/1 Pending 0 93s
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 44m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-qxkw Ready <none> 11s v1.14.10-gke.27
Since we set pod anti-affinity rules, the second pod can't be scheduled on the node that was brought up, and this triggers the next scale-up. Take a look at the events on the pod again:
$ kubectl describe pod greedy-job-2xbvx
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal TriggeredScaleUp 2m45s cluster-autoscaler pod triggered scale-up: [{https://content.googleapis.com/compute/v1/projects/owilliam/zones/us-central1-b/instanceGroups/gke-autoscale-to-zero-clus-power-pool-564148fd-grp 0->1 (max: 4)}]
Warning FailedScheduling 93s (x3 over 2m50s) default-scheduler 0/1 nodes are available: 1 node(s) didn't match node selector.
Warning FailedScheduling 79s (x3 over 83s) default-scheduler 0/2 nodes are available: 1 node(s) didn't match node selector, 1 node(s) had taints that the pod didn't tolerate.
Normal TriggeredScaleUp 62s cluster-autoscaler pod triggered scale-up: [{https://content.googleapis.com/compute/v1/projects/owilliam/zones/us-central1-b/instanceGroups/gke-autoscale-to-zero-clus-power-pool-564148fd-grp 1->2 (max: 4)}]
Warning FailedScheduling 3s (x3 over 68s) default-scheduler 0/2 nodes are available: 1 node(s) didn't match node selector, 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules.
The same process repeats until all requirements are satisfied:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
greedy-job-2xbvx 0/1 Pending 0 3m39s
greedy-job-72j8r 1/1 Running 0 3m39s
greedy-job-9dfdt 0/1 Pending 0 3m39s
greedy-job-wqct9 1/1 Running 0 3m39s
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 46m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-qxkw Ready <none> 2m16s v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-sf6q Ready <none> 28s v1.14.10-gke.27
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
greedy-job-2xbvx 0/1 Pending 0 5m19s
greedy-job-72j8r 1/1 Running 0 5m19s
greedy-job-9dfdt 1/1 Running 0 5m19s
greedy-job-wqct9 1/1 Running 0 5m19s
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 48m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-39m2 Ready <none> 63s v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-qxkw Ready <none> 4m8s v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-sf6q Ready <none> 2m20s v1.14.10-gke.27
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
greedy-job-2xbvx 1/1 Running 0 6m12s
greedy-job-72j8r 1/1 Running 0 6m12s
greedy-job-9dfdt 1/1 Running 0 6m12s
greedy-job-wqct9 1/1 Running 0 6m12s
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 48m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-39m2 Ready <none> 113s v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-ggxv Ready <none> 26s v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-qxkw Ready <none> 4m58s v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-sf6q Ready <none> 3m10s v1.14.10-gke.27
Here we can see that all nodes are now up and running (and thus being billed by the second).
Now all pods are running; after a few minutes the jobs complete their tasks:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
greedy-job-2xbvx 1/1 Running 0 7m22s
greedy-job-72j8r 0/1 Completed 0 7m22s
greedy-job-9dfdt 1/1 Running 0 7m22s
greedy-job-wqct9 1/1 Running 0 7m22s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
greedy-job-2xbvx 0/1 Completed 0 11m
greedy-job-72j8r 0/1 Completed 0 11m
greedy-job-9dfdt 0/1 Completed 0 11m
greedy-job-wqct9 0/1 Completed 0 11m
Once the task is completed, the autoscaler starts downsizing the cluster.
You can learn more about the rules for this process here: GKE Cluster AutoScaler
$ while true; do kubectl get nodes ; sleep 60; done
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 54m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-39m2 Ready <none> 7m26s v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-ggxv Ready <none> 5m59s v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-qxkw Ready <none> 10m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-sf6q Ready <none> 8m43s v1.14.10-gke.27
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 62m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-39m2 Ready <none> 15m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-ggxv Ready <none> 14m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-qxkw Ready <none> 18m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-sf6q NotReady <none> 16m v1.14.10-gke.27
Once the conditions are met, the autoscaler flags the nodes as NotReady and starts removing them:
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 64m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-39m2 NotReady <none> 17m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-ggxv NotReady <none> 16m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-qxkw Ready <none> 20m v1.14.10-gke.27
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 65m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-39m2 NotReady <none> 18m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-ggxv NotReady <none> 17m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-qxkw NotReady <none> 21m v1.14.10-gke.27
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 66m v1.14.10-gke.27
gke-autoscale-to-zero-clus-power-pool-564148fd-ggxv NotReady <none> 18m v1.14.10-gke.27
NAME STATUS ROLES AGE VERSION
gke-autoscale-to-zero-cl-default-pool-9f6d80d3-x9lb Ready <none> 67m v1.14.10-gke.27
Here is the confirmation that the nodes were removed both from GKE and from the VM list (remember that every node is a virtual machine billed as a Compute Engine instance):
Compute Engine: (note that gke-cluster-1-default-pool is from another cluster; I added it to the screenshot to show that there is no node from the gke-autoscale-to-zero cluster other than the default persistent one.)
GKE:
Final Thoughts:
When scaling down, the cluster autoscaler respects the scheduling and eviction rules set on Pods; these restrictions can prevent a node from being deleted by the autoscaler. For example, an application's PodDisruptionBudget can also prevent autoscaling: if deleting nodes would cause the budget to be exceeded, the cluster does not scale down.
Note that the process is really fast: in our example it took around 90 seconds to scale up a node and about 5 minutes to finish scaling down a standby node, providing a HUGE improvement in your billing.
Preemptible VMs can reduce your billing even further, but you will have to consider the kind of workload you are running:
Preemptible VMs are Compute Engine VM instances that last a maximum of 24 hours and provide no availability guarantees. Preemptible VMs are priced lower than standard Compute Engine VMs and offer the same machine types and options.
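If your jobs can tolerate being interrupted, the burst pool from the reproduction above could be created as preemptible with a single extra flag (a sketch reusing the same variables and settings):
gcloud container node-pools create ${AUTOSCALE_POOL} \
--cluster=${GKE_CLUSTER_NAME} \
--machine-type=n1-standard-8 \
--preemptible \
--node-labels=load=on-demand \
--node-taints=reserved-pool=true:NoSchedule \
--enable-autoscaling \
--min-nodes=0 \
--max-nodes=4 \
--zone=${GCP_ZONE} \
--project=${PROJECT_ID}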
I know you are still considering the best architecture for your app.
Using App Engine and AI Platform might be optimal solutions as well, but since you are currently running your workload on GKE, I wanted to show you an example as requested.
If you have any further questions let me know in the comments.

Autoscaler not scaling up leaving nodes in NotReady state and pods in Unknown state

I am running a cluster on GKE with a single node pool. It has 3 nodes and can scale from 1 to 99 nodes. The cluster uses the nginx-ingress controller.
On this cluster, I want to deploy apps. An app is scoped by a namespace and consists of 3 deployments and one ingress (defining paths to access the application from the internet). Each deployment runs a single replica of a container.
Deploying a couple of apps works fine, but deploying many apps (requiring the node pool to scale up) breaks everything:
All pods start having warnings (including those successfully deployed earlier)
kubectl get pods --namespace bcd
NAME READY STATUS RESTARTS AGE
actions-664b7d79f5-7qdkw 1/1 Unknown 1 35m
actions-664b7d79f5-v8s2m 1/1 Running 1 18m
core-85cb74f89b-ns49z 1/1 Unknown 1 35m
core-85cb74f89b-qqzfp 1/1 Running 1 18m
nlu-77899ddbf-8pd7k 1/1 Running 1 27m
All nodes become unready:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-clients-projects-default-pool-f9af73d4-gzwr NotReady <none> 42m v1.9.7-gke.6
gke-clients-projects-default-pool-f9af73d4-p5l2 NotReady <none> 21m v1.9.7-gke.6
gke-clients-projects-default-pool-f9af73d4-wnxc NotReady <none> 37m v1.9.7-gke.6
Deleting the namespace to remove all resources from the cluster also seems to fail: after a long while the pods remain active, but still in an unknown state.
How can I safely add more apps and let the cluster autoscale?
The reason seems to be that, not knowing the resources needed for each pod, the scheduler schedules them on any available node, potentially exhausting the available resources and putting the Docker daemon in an inconsistent state.
The solution is to specify resource requests and limits: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#resource-requests-and-limits-of-pod-and-container
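As a minimal sketch of what that looks like in each deployment's pod template (the values are placeholders to be tuned per app):
spec:
  containers:
  - name: core
    image: my-registry/core:latest   # placeholder image
    resources:
      requests:        # what the scheduler reserves on a node
        cpu: 100m
        memory: 256Mi
      limits:          # hard cap enforced at runtime
        cpu: 500m
        memory: 512Mi
With requests in place, pods that don't fit stay Pending instead of overloading a node, and a Pending pod is exactly what triggers the GKE cluster autoscaler to add a node.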

How do I expose the Kubernetes UI Dashboard?

Per the documentation at: https://kubernetes.io/docs/tasks/web-ui-dashboard/
I ran :
kubectl create -f https://rawgit.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml
Then I tried running this to expose the service
cluster/kubectl.sh expose svc/kubernetes
but I keep getting an error:
error: couldn't retrieve selectors via --selector flag or introspection: the service has no pod selector set
See 'kubectl expose -h' for help and examples.
I have looked at the examples but can't understand what I am doing wrong.
kubernetes# cluster/kubectl.sh get all
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc/kubernetes 10.0.0.1 <none> 443/TCP 7h
kubernetes# cluster/kubectl.sh get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system kube-dns-806549836-r6wtk 0/3 Pending 0 7h
kube-system kubernetes-dashboard-2396447444-9675d 0/1 Pending 0 6h
To get access to the dashboard, usually you would just type:
kubectl cluster-info
This gives you all the required URLs for accessing your cluster.
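If the dashboard URL reported there isn't reachable directly, a commonly used alternative is to tunnel through the API server (a hedged sketch; the exact proxy path depends on the dashboard version and the namespace/service name it was deployed with):
kubectl proxy --port=8001
# then open something like:
# http://localhost:8001/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy/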