How does Kubernetes know resource requests and limits? - kubernetes

Here is a YAML file that is to be deployed in Kubernetes. Since the file contains no resource requests or limits, how does Kubernetes know which requests and limits to run it with? How can I fetch that information?
apiVersion: v1
kind: Pod
metadata:
  name: rss-site
  labels:
    app: web
spec:
  containers:
    - name: front-end
      image: nginx
      ports:
        - containerPort: 80
    - name: rss-reader
      image: nickchase/rss-php-nginx:v1
      ports:
        - containerPort: 88

You can run "kubectl describe" on your pod to see which resources were actually assigned. With a LimitRange, Kubernetes can assign default requests and limits to a pod's containers when they are not part of its spec.
If no requests or limits are assigned, your pod gets the BestEffort quality-of-service class and can be evicted when the node comes under resource pressure.

You can use the steps below to fetch the resource limits assigned to a pod.
Create the pod
-------------------
kubectl run test-resource-limits --image=busybox --limits "memory=100Mi" \
--command -- /bin/sh -c "while true; do sleep 2; done"
Test the resource limits that are specified
-------------------------------------------
kubectl get pods test-resource-limits-7b8b46c8c7-jdjgs \
-o=jsonpath='{.spec.containers[0].resources}'

If you don't specify resource requests and limits, Kubernetes will run your workload without them, meaning your pod could potentially use all the CPU and RAM on the node.
One caveat: if your namespace has defaults set with a LimitRange, those defaults are applied to workloads that don't specify resources in their spec.
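The defaulting described above can be sketched in Python. This is a simplified model, not the admission controller itself: the function name apply_limitrange_defaults and the dictionary layout are made up for illustration, and it assumes the LimitRange sets both a default limit and a default request for the resource in question.

```python
from typing import Dict

Resources = Dict[str, Dict[str, str]]  # {"requests": {...}, "limits": {...}}


def apply_limitrange_defaults(
    container: Resources,
    default_limits: Dict[str, str],
    default_requests: Dict[str, str],
) -> Resources:
    """Return a container's resources after namespace defaults are filled in.

    Hypothetical helper: it models only the three cases listed below.
    """
    requests = dict(container.get("requests", {}))
    limits = dict(container.get("limits", {}))
    for resource in ("cpu", "memory"):
        if resource not in requests:
            if resource in limits:
                # Limit given without a request: the request matches the limit.
                requests[resource] = limits[resource]
            elif resource in default_requests:
                # Nothing given: the namespace default request applies.
                requests[resource] = default_requests[resource]
        if resource not in limits and resource in default_limits:
            # No limit given: the namespace default limit applies.
            limits[resource] = default_limits[resource]
    return {"requests": requests, "limits": limits}


if __name__ == "__main__":
    # Example values only; a real LimitRange defines its own numbers.
    defaults = {"cpu": "500m", "memory": "512Mi"}
    default_reqs = {"cpu": "250m", "memory": "256Mi"}
    print(apply_limitrange_defaults({}, defaults, default_reqs))
```

If the namespace has no LimitRange at all, nothing is filled in and the containers keep an empty resources section, which is the situation in the question.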

Related

how to restrict a pod to connect only to 2 pods using networkpolicy and test connection in k8s in simple way?

Do I still need to expose pod via clusterip service?
There are 3 pods - main, front, api. I need to allow ingress+egress connection to main pod only from the pods- api and frontend. I also created service-main - service that exposes main pod on port:80.
I don't know how to test it, tried:
k exec main -it -- sh
nc -z -v -w 5 service-main 80
and
k exec main -it -- sh
curl front:80
The main.yaml pod:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: main
    item: c18
  name: main
spec:
  containers:
    - image: busybox
      name: main
      command:
        - /bin/sh
        - -c
        - sleep 1d
The front.yaml:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: front
  name: front
spec:
  containers:
    - image: busybox
      name: front
      command:
        - /bin/sh
        - -c
        - sleep 1d
The api.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: api
  name: api
spec:
  containers:
    - image: busybox
      name: api
      command:
        - /bin/sh
        - -c
        - sleep 1d
The main-to-front-networkpolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: front-end-policy
spec:
  podSelector:
    matchLabels:
      app: main
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: front
      ports:
        - port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: front
      ports:
        - port: 8080
What am I doing wrong? Do I still need to expose main pod via service? But should not network policy take care of this already?
Also, do I need to write containerPort:80 in main pod? How to test connectivity and ensure ingress-egress works only for main pod to api, front pods?
I tried the lab from ckad prep course, it had 2 pods: secure-pod and web-pod. There was issue with connectivity, the solution was to create network policy and test using netcat from inside the web-pod's container:
k exec web-pod -it -- sh
nc -z -v -w 1 secure-service 80
connection open
UPDATE: ideally I want answers to these:
a clear explanation of the diff btw service and networkpolicy.
If both service and netpol exist - what is the order of evaluation that the traffic/request goes thru? It first goes thru netpol then service? Or vice versa?
if I want front and api pods to send/receive traffic to main - do I need separate services exposing front and api pods?
Network policies and services are two different and independent Kubernetes resources.
Service is:
An abstract way to expose an application running on a set of Pods as a network service.
Good explanation from the Kubernetes docs:
Kubernetes Pods are created and destroyed to match the state of your cluster. Pods are nonpermanent resources. If you use a Deployment to run your app, it can create and destroy Pods dynamically.
Each Pod gets its own IP address, however in a Deployment, the set of Pods running in one moment in time could be different from the set of Pods running that application a moment later.
This leads to a problem: if some set of Pods (call them "backends") provides functionality to other Pods (call them "frontends") inside your cluster, how do the frontends find out and keep track of which IP address to connect to, so that the frontend can use the backend part of the workload?
Enter Services.
Also another good explanation in this answer.
For production you should use workload resources instead of creating pods directly:
Pods are generally not created directly and are created using workload resources. See Working with Pods for more information on how Pods are used with workload resources.
Here are some examples of workload resources that manage one or more Pods:
Deployment
StatefulSet
DaemonSet
And use services to make requests to your application.
Network policies are used to control traffic flow:
If you want to control traffic flow at the IP address or port level (OSI layer 3 or 4), then you might consider using Kubernetes NetworkPolicies for particular applications in your cluster.
Network policies target pods, not services (an abstraction). Check this answer and this one.
Regarding your examples - your network policy is correct (as I tested it below). The problem is that your cluster may not be compatible:
For Network Policies to take effect, your cluster needs to run a network plugin which also enforces them. Project Calico or Cilium are plugins that do so. This is not the default when creating a cluster!
I tested on a kubeadm cluster with the Calico plugin. I created pods similar to yours, but changed the container part:
spec:
  containers:
    - name: main
      image: nginx
      command: ["/bin/sh","-c"]
      args: ["sed -i 's/listen .*/listen 8080;/g' /etc/nginx/conf.d/default.conf && exec nginx -g 'daemon off;'"]
      ports:
        - containerPort: 8080
So the NGINX app is available on port 8080.
Let's check the pod IPs:
user@shell:~$ kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP               NODE                                NOMINATED NODE   READINESS GATES
api     1/1     Running   0          48m   192.168.156.61   example-ubuntu-kubeadm-template-2   <none>           <none>
front   1/1     Running   0          48m   192.168.156.56   example-ubuntu-kubeadm-template-2   <none>           <none>
main    1/1     Running   0          48m   192.168.156.52   example-ubuntu-kubeadm-template-2   <none>           <none>
Let's exec into the running main pod and make a request to 192.168.156.61, which is the api pod in the listing above:
root@main:/# curl 192.168.156.61:8080
<!DOCTYPE html>
...
<title>Welcome to nginx!</title>
It is working.
After applying your network policy:
user@shell:~$ kubectl apply -f main-to-front.yaml
networkpolicy.networking.k8s.io/front-end-policy created
user@shell:~$ kubectl exec -it main -- bash
root@main:/# curl 192.168.156.61:8080
...
The request no longer gets a response, which means the network policy is being enforced.
A good way to get more information about an applied network policy is the kubectl describe command:
user@shell:~$ kubectl describe networkpolicy front-end-policy
Name:         front-end-policy
Namespace:    default
Created on:   2022-01-26 15:17:58 +0000 UTC
Labels:       <none>
Annotations:  <none>
Spec:
  PodSelector:     app=main
  Allowing ingress traffic:
    To Port: 8080/TCP
    From:
      PodSelector: app=front
  Allowing egress traffic:
    To Port: 8080/TCP
    To:
      PodSelector: app=front
  Policy Types: Ingress, Egress
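To reason about which connections such a policy lets through, here is a rough Python model of the ingress side. It is only a sketch under stated assumptions: the names selector_matches and ingress_allowed are invented for illustration, all pods are assumed to live in one namespace, and only matchLabels pod selectors and plain TCP port numbers are modeled.

```python
from typing import Dict, List

Labels = Dict[str, str]


def selector_matches(match_labels: Labels, pod_labels: Labels) -> bool:
    """A matchLabels selector matches when every listed label is present."""
    return all(pod_labels.get(k) == v for k, v in match_labels.items())


def ingress_allowed(
    policies: List[dict], target: Labels, source: Labels, port: int
) -> bool:
    """Simplified check: may `source` open a connection to `target` on `port`?"""
    selecting = [
        p for p in policies
        if "Ingress" in p["policyTypes"]
        and selector_matches(p["podSelector"], target)
    ]
    if not selecting:
        # No policy selects the target, so it is not isolated for ingress.
        return True
    for policy in selecting:
        for rule in policy.get("ingress", []):
            peers = rule.get("from")
            ports = rule.get("ports")
            # An omitted "from" or "ports" list places no restriction.
            peer_ok = not peers or any(selector_matches(s, source) for s in peers)
            port_ok = not ports or port in ports
            if peer_ok and port_ok:
                return True
    return False


# The policy from the question, reduced to the fields this model understands.
front_end_policy = {
    "podSelector": {"app": "main"},
    "policyTypes": ["Ingress", "Egress"],
    "ingress": [{"from": [{"app": "front"}], "ports": [8080]}],
}

if __name__ == "__main__":
    for src in ("front", "api"):
        ok = ingress_allowed([front_end_policy], {"app": "main"}, {"app": src}, 8080)
        print(src, "->", "main:8080", "allowed" if ok else "denied")
```

In this model only front reaches main, and only on port 8080. To let api in as well, the rule would need a second entry under from with app: api, and the same reasoning applies to the egress side.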

what is the default allocation when resources are not specified in kubernetes?

Below is kubernetes POD definition
apiVersion: v1
kind: Pod
metadata:
  name: static-web
  labels:
    role: myrole
spec:
  containers:
    - name: web
      image: nginx
      ports:
        - name: web
          containerPort: 80
          protocol: TCP
As I have not specified any resources, how much memory and CPU will be allocated? Is there a kubectl command to find out what is allocated to the Pod?
If resources are not specified for the Pod, the Pod will be scheduled to any node and resources are not considered when choosing a node.
The Pod might be "terminated" if it uses more memory than is available, or it might get little CPU time, because Pods with specified resources are prioritized. It is good practice to set resources for your Pods.
See Configure Quality of Service for Pods - your Pod will be classified as "Best Effort":
For a Pod to be given a QoS class of BestEffort, the Containers in the Pod must not have any memory or CPU limits or requests.
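The classification rule can be written down as a small Python function. This is a sketch of the rule as I understand it, not Kubernetes source code: the name qos_class is made up, quantities are compared as plain strings (so "1" and "1000m" would wrongly count as different), and only cpu and memory are considered.

```python
from typing import Dict, List


def qos_class(containers: List[Dict]) -> str:
    """Classify a pod from the `resources` sections of its containers."""
    any_set = False
    guaranteed = True
    for container in containers:
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            if name in requests or name in limits:
                any_set = True
            limit = limits.get(name)
            # A request that is left out defaults to the limit.
            request = requests.get(name, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


if __name__ == "__main__":
    # The pod from the question: one container, no resources section.
    print(qos_class([{"name": "web", "image": "nginx"}]))
```

On a live cluster the class that was actually assigned is recorded in the pod's status, in the qosClass field.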
In your case, Kubernetes will assign the QoS class BestEffort to your pod.
That means the kube-scheduler has no resource requests to take into account when placing your pod and simply does its best.
It also means your pod can consume any amount of resources (CPU/memory) it wants, but the kubelet may evict it when the node comes under resource pressure.
To see the current resource usage of your pod, you can use kubectl top pod xxxx

Get the current (latest) CPU and memory usage of all the pods

I would like to know how to get current or the last read metric value of CPU and memory usage of all pods.
I tried to call the Hawkular endpoint. I opened the browser developer tools with F12 and took this endpoint from the list of calls made when the metrics page of a pod is loaded.
https://metrics.mydev.abccomp.com/hakular/metrics/gauges/myservice%dfjlajdflk-lkejre-12112kljdfkl%2Fcpu%2Fusage_rate/data?bucketDuration=1mn&start=-1mn
However, this gives me the CPU usage metrics for the last minute, and only for that particular pod. I am trying to find out whether there is a command or some other way to get just the current snapshot of CPU and memory usage for all pods collectively, like below:
pod     memory usage    memory max    cpu usage     cpu max
pod1    0.4 mb          2.0 mb        20 m cores    25 m cores
pod2    1.5 mb          2.0 mb        25 m cores    25 m cores
To see the pods that use the most cpu and memory you can use the kubectl top command but it doesn't sort yet and is also missing the quota limits and requests per pod. You only see the current usage.
Execute command below:
$ kubectl top pods --all-namespaces --containers=true
Because of these limitations, but also because you want to gather and store this resource usage information on an ongoing basis, a monitoring tool comes in handy. This allows you to analyze resource usage both in real time and historically, and also lets you alert on capacity bottlenecks.
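If the output is not sorted by your kubectl version, a small script can do it. The Python sketch below assumes the three-column layout printed by kubectl top pods for a single namespace (NAME, CPU(cores), MEMORY(bytes)); the helper names and the sample text are hypothetical, and only the m CPU suffix and the Ki/Mi/Gi memory suffixes are handled. Recent kubectl versions also accept --sort-by=cpu or --sort-by=memory on kubectl top pod, which makes this unnecessary.

```python
from typing import List, Tuple

_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_cpu(value: str) -> float:
    """Convert '250m' (millicores) or '2' (cores) to cores."""
    if value.endswith("m"):
        return int(value[:-1]) / 1000
    return float(value)


def parse_memory(value: str) -> int:
    """Convert '15Mi' style quantities to bytes."""
    for suffix, factor in _MEMORY_UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)


def sort_top_output(text: str, key: str = "cpu") -> List[Tuple[str, float, int]]:
    """Parse `kubectl top pods` text and sort it, highest usage first."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        name, cpu, memory = line.split()
        rows.append((name, parse_cpu(cpu), parse_memory(memory)))
    index = 1 if key == "cpu" else 2
    return sorted(rows, key=lambda row: row[index], reverse=True)


# Made-up sample in the assumed format, not real cluster output.
SAMPLE = """
NAME   CPU(cores)   MEMORY(bytes)
pod1   20m          1Mi
pod2   25m          2Mi
"""

if __name__ == "__main__":
    for name, cores, mem_bytes in sort_top_output(SAMPLE, key="cpu"):
        print(name, cores, mem_bytes)
```

The text could come from piping kubectl top pods into the script. The requests and limits columns from the question are not part of this output and would have to be read from the pod specs separately.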
To work around the problem "Error from server (Forbidden): unknown (get services http:heapster":
Make sure the heapster deployment also installed the Service for heapster; otherwise you will have to create it manually.
E.g.:
$ kubectl create -f /dev/stdin <<SVC
apiVersion: v1
kind: Service
metadata:
  name: heapster
  namespace: kube-system
spec:
  selector:
    whatever-label: is-on-heapster-pods
  ports:
    - name: http
      port: 80
      targetPort: whatever-is-heapster-is-listening-on
SVC
List resource CPU & Memory utilization for all containers (pods)
kubectl top pods --all-namespaces --containers=true
The default admin, edit, view, and cluster-reader cluster roles support cluster role aggregation, where the cluster rules for each role are dynamically updated as new rules are created. This feature is relevant only if you extend the Kubernetes API by creating custom resources. Cluster-role-aggregator
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: top-pods-admin
  labels:
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
rules:
  - apiGroups:
      - metrics.k8s.io
    resources:
      - pods
    verbs:
      - get
      - list
      - watch

kubernetes - exposing container info as environment variables

I'm trying to expose some container info as environment variables by reading values such as spec.template.spec.containers[0].name from the pod, which does not seem to work. What would be the API spec for referencing container fields inside the deployment template? The deployment template is as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    run: nginx
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      run: nginx
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: nginx
    spec:
      containers:
        - image: nginx
          name: nginx
          ports:
            - containerPort: 8000
          resources: {}
          env:
            - name: MY_CONTAINER_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.template.spec.containers[0].name
The Downward API enables you to expose the pod’s own metadata to the processes
running inside that pod.
Currently, it allows you to pass the following information to your containers:
The pod’s name
The pod’s IP address
The namespace the pod belongs to
The name of the node the pod is running on
The name of the service account the pod is running under
The CPU and memory requests for each container
The CPU and memory limits for each container
The pod’s labels
The pod’s annotations
And that's it. As you can see, the container name is not part of this list.
In general, the metadata available through the Downward API is fairly limited. If you need more, you’ll need to obtain it from the Kubernetes API server directly which you can do either by using client libraries or by using an ambassador container.
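As a quick way to see why the fieldPath from the question is rejected, the Python sketch below checks a path against the pod-level fields I am confident a fieldRef accepts. The helper name is_supported_field_path is made up, and the set of paths is an assumption based on my reading of the documentation; it may be incomplete for newer Kubernetes versions.

```python
import re

# Pod-level fields assumed to be usable in an env fieldRef (possibly incomplete).
_FIXED_PATHS = {
    "metadata.name",
    "metadata.namespace",
    "metadata.uid",
    "spec.nodeName",
    "spec.serviceAccountName",
    "status.hostIP",
    "status.podIP",
}

# Single labels or annotations are addressed as metadata.labels['<key>'].
_KEYED_PATH = re.compile(r"^metadata\.(labels|annotations)\['[^']+'\]$")


def is_supported_field_path(path: str) -> bool:
    """Return True if `path` looks like a fieldRef path from the list above."""
    return path in _FIXED_PATHS or bool(_KEYED_PATH.match(path))


if __name__ == "__main__":
    for candidate in (
        "metadata.name",
        "metadata.labels['run']",
        "spec.template.spec.containers[0].name",
    ):
        print(candidate, is_supported_field_path(candidate))
```

Note that the paths are relative to the Pod, not to the Deployment, so nothing under spec.template can match.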
Two things: first, the container name is fixed -- it's defined by the PodSpec template -- are you perhaps thinking of the docker container's name (which will be a long generated name composed of the namespace, container name, pod UID, and restart count)? Because the docker container's name will for sure not be present in .spec.containers[0].name
Second, while I agree with David that I doubt kubernetes will let you run arbitrary fieldPath: selectors, if you're open to being flexible with your command: you can actually use the Pod's own ServiceAccount to query the kubernetes API at launch time to retrieve all of the Pod's info, including its status: structure which likely has a ton of the information you're after.
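A minimal Python sketch of that second approach, using only the standard library. It relies on several assumptions: the ServiceAccount files are mounted at the usual path, the HOSTNAME environment variable equals the pod name (the default unless the pod overrides its hostname), and the ServiceAccount has been granted permission to get pods in its namespace. The function names are invented for illustration.

```python
import json
import os
import ssl
import urllib.request
from typing import List

# Usual mount point of the ServiceAccount credentials inside a pod.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def pod_url(host: str, port: str, namespace: str, pod_name: str) -> str:
    """Build the API URL of a single pod."""
    return f"https://{host}:{port}/api/v1/namespaces/{namespace}/pods/{pod_name}"


def container_names(pod: dict) -> List[str]:
    """Extract the container names from a pod object."""
    return [container["name"] for container in pod["spec"]["containers"]]


def fetch_own_pod() -> dict:
    """Fetch this pod's own object; only works when run inside a pod."""
    with open(os.path.join(SA_DIR, "token")) as handle:
        token = handle.read().strip()
    with open(os.path.join(SA_DIR, "namespace")) as handle:
        namespace = handle.read().strip()
    url = pod_url(
        os.environ["KUBERNETES_SERVICE_HOST"],
        os.environ["KUBERNETES_SERVICE_PORT"],
        namespace,
        os.environ["HOSTNAME"],  # assumption: hostname equals the pod name
    )
    # Trust the cluster CA that is mounted next to the token.
    context = ssl.create_default_context(cafile=os.path.join(SA_DIR, "ca.crt"))
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


if __name__ == "__main__":
    print(container_names(fetch_own_pod()))
```

Which entry in the returned list belongs to the calling container still has to be decided by the caller, since the API returns the whole pod.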

Running kubectl proxy from same pod vs different pod on same node - what's the difference?

I'm experimenting with this, and I'm noticing a difference in behavior that I'm having trouble understanding, namely between running kubectl proxy from within a pod vs running it in a different pod.
The sample configuration runs kubectl proxy and the container that needs it* in the same pod of a daemonset, i.e.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  # ...
spec:
  template:
    metadata:
      # ...
    spec:
      containers:
        # this container needs kubectl proxy to be running:
        - name: l5d
          # ...
        # so, let's run it:
        - name: kube-proxy
          image: buoyantio/kubectl:v1.8.5
          args:
            - "proxy"
            - "-p"
            - "8001"
When doing this on my cluster, I get the expected behavior. However, I will run other services that also need kubectl proxy, so I figured I'd rationalize that into its own daemon set to ensure it's running on all nodes. I thus removed the kube-proxy container and deployed the following daemon set:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube-proxy
  labels:
    app: kube-proxy
spec:
  template:
    metadata:
      labels:
        app: kube-proxy
    spec:
      containers:
        - name: kube-proxy
          image: buoyantio/kubectl:v1.8.5
          args:
            - "proxy"
            - "-p"
            - "8001"
In other words, the same container configuration as previously, but now running in independent pods on each node instead of within the same pod. With this configuration "stuff doesn't work anymore"**.
I realize the solution (at least for now) is to just run the kube-proxy container in any pod that needs it, but I'd like to know why I need to. Why isn't just running it in a daemonset enough?
I've tried to find more information about running kubectl proxy like this, but my search results drown in results about running it to access a remote cluster from a local environment, i.e. not at all what I'm after.
I include these details not because I think they're relevant, but because they might be even though I'm convinced they're not:
*) a Linkerd ingress controller, but I think that's irrelevant
**) in this case, the "working" state is that the ingress controller complains that the destination is unknown because there's no matching ingress rule, while the "not working" state is a network timeout.
namely between running kubectl proxy from within a pod vs running it in a different pod.
Assuming your cluster has a software-defined network, such as flannel or calico, a Pod has its own IP and all containers within a Pod share the same network namespace. Thus:
containers:
  - name: c0
    command: ["curl", "127.0.0.1:8001"]
  - name: c1
    command: ["kubectl", "proxy", "-p", "8001"]
will work, whereas in a DaemonSet, they are by definition not in the same Pod and thus the hypothetical c0 above would need to use the DaemonSet's Pod's IP to contact 8001. That story is made more complicated by the fact that kubectl proxy by default only listens on 127.0.0.1, so you would need to alter the DaemonSet's Pod's kubectl proxy to include --address='0.0.0.0' --accept-hosts='.*' to even permit such cross-Pod communication. I believe you also need to declare the ports: array in the DaemonSet configuration, since you are now exposing that port into the cluster, but I'd have to double-check whether ports: is merely polite, or is actually required.
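The effect of the bind address can be illustrated with plain sockets. This Python sketch is only an analogy for what kubectl proxy does with its listen address, not its implementation; the helper names are made up. A listener bound to 127.0.0.1 accepts only connections addressed to loopback inside its own network namespace, while one bound to 0.0.0.0 accepts connections on every interface, including the Pod IP.

```python
import socket


def listen_on(address: str) -> socket.socket:
    """Open a TCP listener on `address`, letting the OS pick a free port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((address, 0))
    server.listen(1)
    return server


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    loopback_only = listen_on("127.0.0.1")    # like the kubectl proxy default
    all_interfaces = listen_on("0.0.0.0")     # like --address='0.0.0.0'
    for server in (loopback_only, all_interfaces):
        bound_address, port = server.getsockname()
        print(bound_address, can_connect("127.0.0.1", port))
        server.close()
```

Both listeners are reachable over 127.0.0.1 from the same network namespace, which is the sidecar case. From another Pod only the second kind would be reachable, and only through the Pod IP.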