Can I run a small project with Kubernetes on GCP with one node (g1-small)?

I'm operating (developing and hosting) a community project on a volunteer basis, meaning time and money are tight. Currently it runs on a single machine at AWS (t2.micro: 1 vCPU, 1 GB memory).
For learning purposes I would like to containerize my application, so now I'm looking for hosting. The Google Cloud Platform seems the cheapest to me.
I set up a Kubernetes cluster with 1 node (1.10.9-gke.5, g1-small: 1 vCPU shared, 1.7 GB memory).
After setting up the one-node Kubernetes cluster I checked how much memory and CPU was already used by the Kubernetes system (see the kubectl describe node output below).
I was wondering if I can run the following application with the 30% CPU and 30% memory left on the node. Unfortunately I don't have experience with how much the containers in my example will need in terms of resources, but having only 30% CPU and 30% memory left doesn't seem like much for my kind of application.
kubectl describe node
Non-terminated Pods: (9 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system event-exporter-v0.2.3-54f94754f4-bznpk 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system fluentd-gcp-scaler-6d7bbc67c5-pbrq4 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system fluentd-gcp-v3.1.0-fjbz6 100m (10%) 0 (0%) 200Mi (17%) 300Mi (25%)
kube-system heapster-v1.5.3-66b7745959-4zbcl 138m (14%) 138m (14%) 301456Ki (25%) 301456Ki (25%)
kube-system kube-dns-788979dc8f-krrtt 260m (27%) 0 (0%) 110Mi (9%) 170Mi (14%)
kube-system kube-dns-autoscaler-79b4b844b9-vl4mw 20m (2%) 0 (0%) 10Mi (0%) 0 (0%)
kube-system kube-proxy-gke-spokesman-cluster-default-pool-d70d068f-wjtk 100m (10%) 0 (0%) 0 (0%) 0 (0%)
kube-system l7-default-backend-5d5b9874d5-cgczj 10m (1%) 10m (1%) 20Mi (1%) 20Mi (1%)
kube-system metrics-server-v0.2.1-7486f5bd67-ctbr2 53m (5%) 148m (15%) 154Mi (13%) 404Mi (34%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
681m (72%) 296m (31%) 807312Ki (67%) 1216912Ki (102%)
Here is my app:
PROD:
API: ASP.NET core 1.1 (microsoft/dotnet:1.1-runtime-stretch)
Frontend: Angular app (nginx:1.15-alpine)
Admin: Angular app (nginx:1.15-alpine)
TEST:
API: ASP.NET core 1.1 (microsoft/dotnet:1.1-runtime-stretch)
Frontend: Angular app (nginx:1.15-alpine)
Admin: Angular app (nginx:1.15-alpine)
SHARED:
Database: Postgres (postgres:11-alpine)
Any suggestions are more than welcome.
Thanks in advance!

If you intend to run a containerized app on a single node, a plain GCE instance could be a better starting point.
When moving to GKE, check out this GCP guide explaining how much of each machine type's resources is already reserved by the system and kube-system pods before any of your workload runs. You'd still need to estimate resource usage per app component or container, for instance by monitoring your dev or GCE environment.
If you want to explore other alternatives on GCP for your app (e.g. App Engine supports .NET), here's a post with a decision tree that might help you. I also found this article/tutorial about running containers on App Engine and GKE, comparing both with load tests. (If you do stay on GKE, see the sketch below for pinning down each container's requests.)
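If you do try to fit everything onto the g1-small, giving each container an explicit, small resource request helps the scheduler pack pods into the roughly 30% that is left. A minimal sketch, assuming a hypothetical deployment name and guessed request values (tune them against kubectl top pod); this is illustrative, not a measured recommendation:

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend               # assumed name for the Angular frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: nginx:1.15-alpine
        resources:
          requests:
            cpu: 20m           # guessed: a static nginx site needs very little
            memory: 32Mi
          limits:
            cpu: 100m
            memory: 64Mi
EOF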

Related

Requesting CPU for a pod on Azure AKS is failing

I'm using an AKS cluster running K8s v1.16.15.
I'm following this simple example to assign some CPU to a pod, and it does not work:
https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/
After applying this YAML file for the request:
apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo
  namespace: cpu-example
spec:
  containers:
  - name: cpu-demo-ctr
    image: vish/stress
    resources:
      limits:
        cpu: "1"
      requests:
        cpu: "0.5"
    args:
    - -cpus
    - "2"
If I run kubectl describe pod ... I get the following:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 Insufficient cpu.
But CPU seems available; if I run kubectl top nodes, I get:
CPU(cores) CPU% MEMORY(bytes) MEMORY%
702m 36% 4587Mi 100%
Maybe it is related to some AKS configuration, but I can't figure it out.
Do you have an idea of what is happening?
Thanks a lot in advance!!
Kubernetes decides where a pod can be scheduled using node allocatable resources, not real resource usage. You can see your node's allocatable resources using kubectl describe node <your node name>; refer to Capacity and Allocatable for more details. As the event log 0/1 nodes are available: 1 Insufficient cpu. shows, you have just one worker node, and it does not have enough unallocated CPU to run your pod with requests.cpu: "0.5". Pod scheduling is based on the requests resource size, not the limits.
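A quick way to compare what the scheduler thinks the node can offer against what is already claimed (the node name is a placeholder):

# Allocatable capacity as the scheduler sees it
kubectl get node <your-node-name> -o jsonpath='{.status.allocatable.cpu}{"\n"}{.status.allocatable.memory}{"\n"}'

# Requests already claimed by scheduled pods (bottom of the describe output)
kubectl describe node <your-node-name> | grep -A 6 "Allocated resources"

If the allocatable CPU minus the summed requests is below 500m, the cpu-demo pod above cannot fit.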
The previous answer explains well why this can happen. What can be added is that when scheduling pods that have requests, you have to be aware of the resources your other cluster objects consume. System objects use your resources too: even on a small cluster you may have enabled some add-on that consumes node resources.
So your node has a certain amount of CPU and memory it can allocate to pods. While scheduling, the scheduler only takes into consideration nodes with enough unallocated resources to meet the pod's requests.
If the amount of unallocated CPU or memory is less than what the pod requests, Kubernetes will not schedule the pod to that node, because the node can't provide the minimum amount required by the pod.
If you describe your node, you will see the pods that are already running and consuming resources, as well as the totals of all allocated resources:
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
default elasticsearch-master-0 1 (25%) 1 (25%) 2Gi (13%) 4Gi (27%) 8d
default test-5487d9b57b-4pz8v 0 (0%) 0 (0%) 0 (0%) 0 (0%) 27d
kube-system coredns-66bff467f8-rhbnj 100m (2%) 0 (0%) 70Mi (0%) 170Mi (1%) 35d
kube-system etcd-minikube 0 (0%) 0 (0%) 0 (0%) 0 (0%) 16d
kube-system httpecho 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34d
kube-system ingress-nginx-controller-69ccf5d9d8-rbdf8 100m (2%) 0 (0%) 90Mi (0%) 0 (0%) 34d
kube-system kube-apiserver-minikube 250m (6%) 0 (0%) 0 (0%) 0 (0%) 16d
kube-system kube-controller-manager-minikube 200m (5%) 0 (0%) 0 (0%) 0 (0%) 35d
kube-system kube-scheduler-minikube 100m (2%) 0 (0%) 0 (0%) 0 (0%) 35d
kube-system traefik-ingress-controller-78b4959fdf-8kp5k 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1750m (43%) 1 (25%)
memory 2208Mi (14%) 4266Mi (28%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Now the most important part: what you can do about it.
You can enable autoscaling so that the system automatically provisions nodes and the extra resources needed. This of course assumes that you ran out of resources and need more (see the sketch after this list).
You can provision an appropriate node yourself (depending on how you bootstrapped your cluster).
Turn off any add-on services you don't need that might be taking up resources.
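For the first option on AKS (which this question is about), enabling the cluster autoscaler could look like the following; the resource group and cluster names are placeholders:

# Hypothetical names; adds nodes automatically when pods fail to schedule
az aks update \
  --resource-group my-rg \
  --name my-aks-cluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 3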

Kubernetes: Which pod uses the most CPU on a node?

Is there any way to list the pods that are using the most CPU on a node using a kubectl command? I could not find this in the official documentation.
You can get this by using:
kubectl top pods # This will give you which pod is using how much CPU and Memory
kubectl top nodes # This will give you which node is using how much CPU and Memory
Make sure the metrics server is deployed on the cluster.
To know which pod scheduled on a specific node has the most CPU requests, you can describe that node and check the Non-terminated Pods section.
kubectl describe node masternode
Non-terminated Pods: (8 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system calico-kube-controllers-76d4774d89-vmsnf 0 (0%) 0 (0%) 0 (0%) 0 (0%) 30d
kube-system calico-node-t4qzr 250m (12%) 0 (0%) 0 (0%) 0 (0%) 30d
kube-system coredns-66bff467f8-v9mn5 100m (5%) 0 (0%) 70Mi (1%) 170Mi (4%) 30d
kube-system etcd-ip-10-0-0-38 0 (0%) 0 (0%) 0 (0%) 0 (0%) 30d
kube-system kube-apiserver-ip-10-0-0-38 250m (12%) 0 (0%) 0 (0%) 0 (0%) 30d
kube-system kube-controller-manager-ip-10-0-0-38 200m (10%) 0 (0%) 0 (0%) 0 (0%) 30d
kube-system kube-proxy-nf7jp 0 (0%) 0 (0%) 0 (0%) 0 (0%) 30d
kube-system kube-scheduler-ip-10-0-0-38 100m (5%) 0 (0%) 0 (0%) 0 (0%) 30d
If the cluster has the metrics server deployed, then the commands below are useful to see pod and node CPU utilization:
kubectl top pod <podname>
kubectl top node <nodename>
For nodes that have many pods across multiple namespaces, I use an alias in .bash_profile. It outputs the CPU and memory for all pods on a given node.
kntp () {
  # List every non-completed pod scheduled on node $1 (header stripped),
  # then print each pod's CPU/memory usage, again without the header row.
  for p in $(kubectl get pods --all-namespaces --field-selector spec.nodeName=$1 | grep -v "Completed" | tail -n +2 | awk '{print $2}'); do
    kubectl top pod --all-namespaces --field-selector metadata.name=$p | tail -n +2
  done
}
Run it like
source ~/.bash_profile
kntp my-node-name-here
You can use:
kubectl top pods --all-namespaces --sort-by=cpu
This finds the CPU and memory usage of all pods across all namespaces, sorted by CPU.
The CPU(cores) column is the CPU usage:
338m means 338 millicpu. 1000m is equal to 1 CPU, hence 338m means 33.8% of 1 CPU.
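Two related invocations that may also help (available in reasonably recent kubectl versions):

kubectl top pods --all-namespaces --sort-by=memory   # sort by memory instead of CPU
kubectl top pods --all-namespaces --containers       # per-container breakdown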

How to get allocated GPUs for a node in Kubernetes [with the kubernetes-client Java API]

I want to get the available GPUs [nvidia.com/gpu] in a Kubernetes cluster. I can already get the total GPUs from the Node API, but how do I get the allocated GPUs from the client API?
Like the kubectl describe node result:
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 350m (8%) 0 (0%)
memory 70Mi (0%) 170Mi (1%)
ephemeral-storage 0 (0%) 0 (0%)
nvidia.com/gpu 1 1
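The allocated figures are not stored on the Node object itself: kubectl describe node computes them by summing the resource requests of every pod scheduled on that node, and a client has to do the same (with the Java client, for instance, by listing pods through CoreV1Api with a spec.nodeName field selector). A hedged shell sketch of that calculation, with a placeholder node name:

NODE=my-gpu-node   # placeholder
# Sum nvidia.com/gpu requests over all containers of pods on this node;
# this reproduces the nvidia.com/gpu row of "Allocated resources".
kubectl get pods --all-namespaces \
  --field-selector spec.nodeName=$NODE,status.phase=Running \
  -o jsonpath="{.items[*].spec.containers[*].resources.requests['nvidia\.com/gpu']}" \
  | tr ' ' '\n' | awk '{sum += $1} END {print sum+0}'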

How many Pods to run on a single Kubernetes Node in Google Kubernetes Engine?

I have multiple Node.js apps / services running on Google Kubernetes Engine (GKE); 8 pods are running in total. I did not set up resource limits when I created the pods, so now I'm getting an unschedulable (insufficient CPU) error.
I understand I have to set up resource limits. From what I know, 1 CPU per node = 1000m (millicores)? My questions are:
1) What's the ideal resource limit I should set up? Like the minimum? For a pod that's rarely used, can I set up 20Mi? Or 50Mi?
2) How many pods are ideal to run on a single Kubernetes node? Right now I have 2 nodes set up, which I want to reduce to 1.
3) What do people use in production? And for a development cluster?
Here are my nodes:
Here are my Nodes
Node 1:
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
default express-gateway-58dff8647-f2kft 100m (10%) 0 (0%) 0 (0%) 0 (0%)
default openidconnect-57c48dc448-9jmbn 100m (10%) 0 (0%) 0 (0%) 0 (0%)
default web-78d87bdb6b-4ldsv 100m (10%) 0 (0%) 0 (0%) 0 (0%)
kube-system event-exporter-v0.1.9-5c8fb98cdb-tcd68 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system fluentd-gcp-v2.0.17-mhpgb 100m (10%) 0 (0%) 200Mi (7%) 300Mi (11%)
kube-system kube-dns-5df78f75cd-6hdfv 260m (27%) 0 (0%) 110Mi (4%) 170Mi (6%)
kube-system kube-dns-autoscaler-69c5cbdcdd-2v2dj 20m (2%) 0 (0%) 10Mi (0%) 0 (0%)
kube-system kube-proxy-gke-qp-cluster-default-pool-7b00cb40-6z79 100m (10%) 0 (0%) 0 (0%) 0 (0%)
kube-system kubernetes-dashboard-7b89cff8-9xnsm 50m (5%) 100m (10%) 100Mi (3%) 300Mi (11%)
kube-system l7-default-backend-57856c5f55-k9wgh 10m (1%) 10m (1%) 20Mi (0%) 20Mi (0%)
kube-system metrics-server-v0.2.1-7f8dd98c8f-5z5zd 53m (5%) 148m (15%) 154Mi (5%) 404Mi (15%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
893m (95%) 258m (27%) 594Mi (22%) 1194Mi (45%)
Node 2:
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
default kube-healthcheck-55bf58578d-p2tn6 100m (10%) 0 (0%) 0 (0%) 0 (0%)
default pubsub-function-675585cfbf-2qgmh 100m (10%) 0 (0%) 0 (0%) 0 (0%)
default servicing-84787cfc75-kdbzf 100m (10%) 0 (0%) 0 (0%) 0 (0%)
kube-system fluentd-gcp-v2.0.17-ptnlg 100m (10%) 0 (0%) 200Mi (7%) 300Mi (11%)
kube-system heapster-v1.5.2-7dbb64c4f9-bpc48 138m (14%) 138m (14%) 301656Ki (11%) 301656Ki (11%)
kube-system kube-dns-5df78f75cd-89c5b 260m (27%) 0 (0%) 110Mi (4%) 170Mi (6%)
kube-system kube-proxy-gke-qp-cluster-default-pool-7b00cb40-9n92 100m (10%) 0 (0%) 0 (0%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
898m (95%) 138m (14%) 619096Ki (22%) 782936Ki (28%)
My plan is to move all of this onto one node.
According to the official Kubernetes documentation:
1) You can go low in terms of memory and CPU, but you need to give pods enough CPU and memory to function properly. I have gone as low as 100m CPU and 200Mi memory (it is highly dependent on the application you're running and on the number of replicas).
2) There should not be more than 100 pods per node (that is the extreme case).
3) Production clusters are never single-node. This is a very good read on Kubernetes in production.
But keep in mind: if you increase the number of pods on a single node, you might need to increase the node's size (in terms of resources).
Memory and CPU usage tends to grow proportionally with the size of and load on the cluster.
Here is the official documentation stating the requirements:
https://kubernetes.io/docs/setup/cluster-large/
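For question 1, a low-friction way to experiment is kubectl set resources, then compare against real usage from kubectl top pod and adjust. The deployment name and values below are placeholders:

# Hypothetical starting point; small requests, looser limits, then tune
kubectl set resources deployment express-gateway \
  --requests=cpu=20m,memory=64Mi \
  --limits=cpu=100m,memory=128Mi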

Cluster autoscaler not downscaling

I have a regional cluster set up in Google Kubernetes Engine (GKE). The node group is a single VM in each zone (3 total). I have a deployment with a minimum of 3 replicas controlled by an HPA.
The node group is configured to autoscale (cluster autoscaling, aka CA).
The problem scenario:
Update the deployment image. Kubernetes automatically creates new pods and the CA identifies that a new node is needed. I now have 4.
The old pods get removed when all the new pods have started, which means I have the exact same CPU requests as the minute before. But after the 10 minute maximum downscale time I still have 4 nodes.
The CPU requests for the nodes are now:
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
358m (38%) 138m (14%) 516896Ki (19%) 609056Ki (22%)
--
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
800m (85%) 0 (0%) 200Mi (7%) 300Mi (11%)
--
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
510m (54%) 100m (10%) 410Mi (15%) 770Mi (29%)
--
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
823m (87%) 158m (16%) 484Mi (18%) 894Mi (33%)
The 38% node is running:
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system event-exporter-v0.1.9-5c8fb98cdb-8v48h 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system fluentd-gcp-v2.0.17-q29t2 100m (10%) 0 (0%) 200Mi (7%) 300Mi (11%)
kube-system heapster-v1.5.2-585f569d7f-886xx 138m (14%) 138m (14%) 301856Ki (11%) 301856Ki (11%)
kube-system kube-dns-autoscaler-69c5cbdcdd-rk7sd 20m (2%) 0 (0%) 10Mi (0%) 0 (0%)
kube-system kube-proxy-gke-production-cluster-default-pool-0fd62aac-7kls 100m (10%) 0 (0%) 0 (0%) 0 (0%)
I suspect it won't downscale because of heapster or kube-dns-autoscaler.
But the 85% node contains:
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system fluentd-gcp-v2.0.17-s25bk 100m (10%) 0 (0%) 200Mi (7%) 300Mi (11%)
kube-system kube-proxy-gke-production-cluster-default-pool-7ffeacff-mh6p 100m (10%) 0 (0%) 0 (0%) 0 (0%)
my-deploy my-deploy-54fc6b67cf-7nklb 300m (31%) 0 (0%) 0 (0%) 0 (0%)
my-deploy my-deploy-54fc6b67cf-zl7mr 300m (31%) 0 (0%) 0 (0%) 0 (0%)
The fluentd and kube-proxy pods are present on every node, so I assume they are not needed once the node is gone. That means my deployment could be relocated to the other nodes, since it only has a request of 300m (31%, since only 94% of the node's CPU is allocatable).
So I figured I'd check the logs. But if I run kubectl get pods --all-namespaces, no pod for the CA is visible on GKE. And the command kubectl get configmap cluster-autoscaler-status -n kube-system -o yaml only tells me whether it is about to scale, not why or why not.
Another option is to look at /var/log/cluster-autoscaler.log on the master node. I SSHed into all 4 nodes and only found a gcp-cluster-autoscaler.log.pos file that says: /var/log/cluster-autoscaler.log 0000000000000000 0000000000000000, meaning the file should be right there but is empty.
The last option, according to the FAQ, is to check the events for the pods, but as far as I can tell they are empty.
Does anyone know why it won't downscale, or at least where to find the logs?
Answering myself for visibility.
The problem is that the CA never considers moving anything unless all the requirements mentioned in the FAQ are met at the same time.
So let's say I have 100 nodes, each with 51% CPU requests. It still won't consider downscaling.
One solution is to raise the utilization threshold below which the CA considers a node removable, currently 50%. Unfortunately that is not supported by GKE; see this answer from Google support (#GalloCedrone):
"Moreover I know that this value might sound too low and someone could be interested in keeping 85% or 90% to avoid your scenario. Currently there is a feature request open to give the user the possibility to modify the flag "--scale-down-utilization-threshold", but it is not implemented yet."
The workaround I found is to decrease the CPU request of the pods (100m instead of 300m) and have the Horizontal Pod Autoscaler (HPA) create more on demand. This is fine for me, but if your application is not suitable for many small instances, you are out of luck. Perhaps a cron job that cordons a node if the total utilization is low? (A manual sketch of that follows below.)
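For reference, the manual version of that cordon idea looks roughly like this (the node name is a placeholder, and drain flags vary a bit across kubectl versions):

# Stop new pods from landing on the node
kubectl cordon <node-name>
# Evict the movable pods; DaemonSet pods (fluentd, kube-proxy) are skipped
kubectl drain <node-name> --ignore-daemonsets

Once the node is empty, the CA should consider it unneeded and remove it after the timeout.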
I agree that according to the [documentation][1] it seems that "gke-name-cluster-default-pool" could be safely deleted; the conditions are:
The sum of CPU and memory requests of all pods running on this node is smaller than 50% of the node's allocatable.
All pods running on the node (except those that run on all nodes by default, like manifest-run pods or pods created by DaemonSets) can be moved to other nodes.
It doesn't have the scale-down disabled annotation (see the sketch after this list).
Therefore the CA should remove the node after it has been considered unneeded for 10 minutes.
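That last condition refers to a real annotation; checking and toggling it looks like this (the node name is a placeholder):

# The annotation shows up in the describe output when set
kubectl describe node <node-name> | grep scale-down-disabled

# Protect a node from scale-down, or lift the protection again
kubectl annotate node <node-name> cluster-autoscaler.kubernetes.io/scale-down-disabled=true
kubectl annotate node <node-name> cluster-autoscaler.kubernetes.io/scale-down-disabled-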
However, checking the [documentation][2], I found:
What types of pods can prevent CA from removing a node?
[...]
Kube-system pods that are not run on the node by default, *
[..]
heapster-v1.5.2--- is running on the node, and it is a kube-system pod that is not run on the node by default.
I will update the answer if I discover more interesting information.
UPDATE
The fact that the node is the last one in the zone is not an issue.
To prove it, I tested on a brand new cluster with 3 nodes, each in a different zone; one of them had no workload apart from kube-proxy and fluentd and was correctly deleted, even though this brought the size of its zone to zero.
[1]: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md
[2]: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node