What is the meaning of 'window' in results from the k8s metrics-server API?

When I run this command on the CLI:
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/<NAMESPACE>/pods/<POD_NAME> | jq
I get these results:
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "busybox",
    "namespace": "default",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/busybox",
    "creationTimestamp": "2019-12-10T18:23:20Z"
  },
  "timestamp": "2019-12-10T18:23:12Z",
  "window": "30s",
  "containers": [
    {
      "name": "busybox",
      "usage": {
        "cpu": "0",
        "memory": "364Ki"
      }
    }
  ]
}
What is the meaning of that "window" field?
I really want to know what it means exactly.

According to the k8s source code:
// PodMetrics sets resource usage metrics of a pod.
type PodMetrics struct {
    metav1.TypeMeta
    metav1.ObjectMeta
    // The following fields define time interval from which metrics were
    // collected from the interval [Timestamp-Window, Timestamp].
    Timestamp metav1.Time
    Window metav1.Duration
    // Metrics for all containers are collected within the same time window.
    Containers []ContainerMetrics
}
You are most likely interested in this comment:
The following fields define time interval from which metrics were collected from the interval [Timestamp-Window, Timestamp].
So the usage result is averaged data gathered over this window/interval.
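For example, to see the window and timestamp next to the per-container usage, you can pipe the same raw call through a slightly different jq filter (namespace and pod name are placeholders, as in the question):
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/<NAMESPACE>/pods/<POD_NAME> | jq '{window: .window, timestamp: .timestamp, usage: [.containers[].usage]}'
The reported usage is the average over [timestamp - window, timestamp], not an instantaneous sample.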

Related

Filter kubernetes file

I have the following file:
# HELP container_cpu_usage_seconds_total [ALPHA] Cumulative cpu time consumed by the container in core-seconds
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{container="coredns",namespace="kube-system",pod="coredns-64897985d-qzvj8"} 1075.30302335 1641411355244
container_cpu_usage_seconds_total{container="etcd",namespace="kube-system",pod="etcd-minikube"} 7948.244422673 1641411341787
container_cpu_usage_seconds_total{container="kindnet-cni",namespace="kube-system",pod="kindnet-v9rn4"} 253.401092815 1641411342227
container_cpu_usage_seconds_total{container="kube-apiserver",namespace="kube-system",pod="kube-apiserver-minikube"} 21314.526032702 1641411341706
container_cpu_usage_seconds_total{container="kube-controller-manager",namespace="kube-system",pod="kube-controller-manager-minikube"} 9960.616171401 1641411346752
container_cpu_usage_seconds_total{container="kube-proxy",namespace="kube-system",pod="kube-proxy-ktclh"} 220.17024815 1641411352327
container_cpu_usage_seconds_total{container="kube-scheduler",namespace="kube-system",pod="kube-scheduler-minikube"} 1216.162832124 1641411355059
container_cpu_usage_seconds_total{container="metrics-server",namespace="kube-system",pod="metrics-server-6b76bd68b6-lpx4q"} 715.97119974 1641411344274
container_cpu_usage_seconds_total{container="storage-provisioner",namespace="kube-system",pod="storage-provisioner"} 47.685435216 1641411354429
# HELP container_memory_working_set_bytes [ALPHA] Current working set of the container in bytes
# TYPE container_memory_working_set_bytes gauge
container_memory_working_set_bytes{container="coredns",namespace="kube-system",pod="coredns-64897985d-qzvj8"} 1.5364096e+07 1641411355244
container_memory_working_set_bytes{container="etcd",namespace="kube-system",pod="etcd-minikube"} 5.9752448e+07 1641411341787
container_memory_working_set_bytes{container="kindnet-cni",namespace="kube-system",pod="kindnet-v9rn4"} 1.0326016e+07 1641411342227
container_memory_working_set_bytes{container="kube-apiserver",namespace="kube-system",pod="kube-apiserver-minikube"} 2.66002432e+08 1641411341706
container_memory_working_set_bytes{container="kube-controller-manager",namespace="kube-system",pod="kube-controller-manager-minikube"} 5.9129856e+07 1641411346752
container_memory_working_set_bytes{container="kube-proxy",namespace="kube-system",pod="kube-proxy-ktclh"} 2.00704e+07 1641411352327
container_memory_working_set_bytes{container="kube-scheduler",namespace="kube-system",pod="kube-scheduler-minikube"} 2.3130112e+07 1641411355059
container_memory_working_set_bytes{container="metrics-server",namespace="kube-system",pod="metrics-server-6b76bd68b6-lpx4q"} 2.6923008e+07 1641411344274
container_memory_working_set_bytes{container="storage-provisioner",namespace="kube-system",pod="storage-provisioner"} 1.4209024e+07 1641411354429
A few questions:
What format is this? I know it isn't JSON.
Can I use jq to parse/filter this data? I would like to get all metrics on the coredns container:
container_cpu_usage_seconds_total{container="coredns",namespace="kube-system",pod="coredns-64897985d-qzvj8"} 1075.30302335 1641411355244
container_memory_working_set_bytes{container="coredns",namespace="kube-system",pod="coredns-64897985d-qzvj8"} 1.5364096e+07 1641411355244
This is the Prometheus text exposition format (what a /metrics endpoint serves). You could convert your file to JSON using https://github.com/prometheus/prom2json
Then it's jq all the way down, if you wish. E.g. with your input:
prom2json sample.prom | jq '
.[] | .metrics |= map(select(.labels.container=="coredns") )'
yields
{
  "name": "container_memory_working_set_bytes",
  "help": "[ALPHA] Current working set of the container in bytes",
  "type": "GAUGE",
  "metrics": [
    {
      "labels": {
        "container": "coredns",
        "namespace": "kube-system",
        "pod": "coredns-64897985d-qzvj8"
      },
      "timestamp_ms": "1641411355244",
      "value": "1.5364096e+07"
    }
  ]
}
{
  "name": "container_cpu_usage_seconds_total",
  "help": "[ALPHA] Cumulative cpu time consumed by the container in core-seconds",
  "type": "COUNTER",
  "metrics": [
    {
      "labels": {
        "container": "coredns",
        "namespace": "kube-system",
        "pod": "coredns-64897985d-qzvj8"
      },
      "timestamp_ms": "1641411355244",
      "value": "1075.30302335"
    }
  ]
}
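If you only need the matching raw lines rather than structured JSON, a plain grep over the same file works too (sample.prom is just the assumed filename from the command above):
grep 'container="coredns"' sample.prom
This keeps the original Prometheus text format; the prom2json route is better when you want structured output for further jq processing.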

Pod is being terminated and created again due to scale-up, and it's running twice

I have an application that runs some code and, at the end, sends an email with a report of the data. When I deploy pods on GKE, certain pods get terminated and re-created due to autoscaling, but the problem is that the termination happens after my code has finished, so the email is sent twice for the same data.
Here is the JSON file I send to the deploy API:
{
  "apiVersion": "batch/v1",
  "kind": "Job",
  "metadata": {
    "name": "$name",
    "namespace": "$namespace"
  },
  "spec": {
    "template": {
      "metadata": {
        "name": "********"
      },
      "spec": {
        "priorityClassName": "high-priority",
        "containers": [
          {
            "name": "******",
            "image": "$dockerScancatalogueImageRepo",
            "imagePullPolicy": "IfNotPresent",
            "env": $env,
            "resources": {
              "requests": {
                "memory": "2000Mi",
                "cpu": "2000m"
              },
              "limits": {
                "memory": "2650Mi",
                "cpu": "2650m"
              }
            }
          }
        ],
        "imagePullSecrets": [
          {
            "name": "docker-secret"
          }
        ],
        "restartPolicy": "Never"
      }
    }
  }
}
and here is a screenshot of the pod events:
Any idea how to fix that?
Thank you in advance.
"Perhaps you are affected by this "Note that even if you specify .spec.parallelism = 1 and .spec.completions = 1 and .spec.template.spec.restartPolicy = "Never", the same program may sometimes be started twice." from doc. What happens if you increase terminationgraceperiodseconds in your yaml file? – "
#danyL
My problem was that I had other jobs deploying pods on my nodes with higher priority, so Kubernetes was trying to terminate my running pods even though the job was already done and the email had already been sent. I fixed the problem by adjusting the resource requests and limits in all my JSON files. I don't know if it's the perfect solution, but for now it solved my problem.
Thank you all for your help.
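As a general debugging sketch for similar cases (pod name and namespace are placeholders), you can check which priority and QoS class the scheduler gave your pod, and look at the events recorded against it to confirm whether it was preempted by higher-priority workloads:
# Show the pod's priority class and QoS class
kubectl get pod <POD_NAME> -n <NAMESPACE> -o jsonpath='{.spec.priorityClassName}{"\t"}{.status.qosClass}{"\n"}'
# List the events recorded for that pod (look for preemption/killing reasons)
kubectl get events -n <NAMESPACE> --field-selector involvedObject.name=<POD_NAME>
If the events show the pod being preempted by higher-priority jobs, adjusting requests, limits, and priorities (as above) is the usual lever.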

What thresholds should be set in Service Fabric Placement / Load balancing config for Cluster with large number of guest executable applications?

I am having trouble with Service Fabric trying to place too many services onto a single node too fast.
To give an example of cluster size: there are 2-4 worker node types, there are 3-6 worker nodes per node type, each node type may run 200 guest executable applications, and each application has at least 2 replicas. The nodes are more than capable of running the services once they are up; it is only at startup time that CPU is too high.
The problem seems to be the thresholds or defaults for placement and load balancing rules set in the cluster config. As examples of what I have tried: I have turned on InBuildThrottlingEnabled and set InBuildThrottlingGlobalMaxValue to 100, and I have set the Global Movement Throttle settings to various percentages of the total application count.
At this point there are two distinct scenarios I am trying to solve for. In both cases, the nodes go to 100% for an amount of time such that Service Fabric declares the node as down.
1st: Starting an entire cluster from all nodes being off without overwhelming nodes.
2nd: A single node being overwhelmed by too many services starting after a host comes back online.
Here are my current parameters on the cluster:
"Name": "PlacementAndLoadBalancing",
"Parameters": [
{
"Name": "UseMoveCostReports",
"Value": "true"
},
{
"Name": "PLBRefreshGap",
"Value": "1"
},
{
"Name": "MinPlacementInterval",
"Value": "30.0"
},
{
"Name": "MinLoadBalancingInterval",
"Value": "30.0"
},
{
"Name": "MinConstraintCheckInterval",
"Value": "30.0"
},
{
"Name": "GlobalMovementThrottleThresholdForPlacement",
"Value": "25"
},
{
"Name": "GlobalMovementThrottleThresholdForBalancing",
"Value": "25"
},
{
"Name": "GlobalMovementThrottleThreshold",
"Value": "25"
},
{
"Name": "GlobalMovementThrottleCountingInterval",
"Value": "450"
},
{
"Name": "InBuildThrottlingEnabled",
"Value": "false"
},
{
"Name": "InBuildThrottlingGlobalMaxValue",
"Value": "100"
}
]
},
Based on discussion in the answer below, I wanted to leave a graph image: if a node goes down, the act of shuffling services onto the remaining nodes will cause a second node to go down, as noted here. The green node goes down, then the purple one goes down due to too many resources being shuffled onto it.
From SF's perspective, 1 & 2 are the same problem. Also, as a note, SF doesn't evict a node just because CPU consumption is high. So "the nodes go to 100% for an amount of time such that Service Fabric declares the node as down" needs some more explanation. The machines might be failing for other reasons, or I guess they could be so loaded that the kernel-level failure detectors can't ping other machines, but that isn't very common.
For config changes: I would remove all of these and go with the defaults:
{
  "Name": "PLBRefreshGap",
  "Value": "1"
},
{
  "Name": "MinPlacementInterval",
  "Value": "30.0"
},
{
  "Name": "MinLoadBalancingInterval",
  "Value": "30.0"
},
{
  "Name": "MinConstraintCheckInterval",
  "Value": "30.0"
},
For the in-build throttle to work, this needs to flip to true:
{
  "Name": "InBuildThrottlingEnabled",
  "Value": "false"
},
Also, since these are likely constraint violations and placement (not proactive rebalancing), we need to explicitly instruct SF to throttle those operations as well. There is config for this in SF; although it is not documented or publicly supported at this time, you can see it in the settings. By default only balancing is throttled, but you should be able to turn on throttling for all phases and set appropriate limits via something like the below.
These first two settings are also within PlacementAndLoadBalancing, like the ones above.
{
  "Name": "ThrottlePlacementPhase",
  "Value": "true"
},
{
  "Name": "ThrottleConstraintCheckPhase",
  "Value": "true"
},
These next settings to set the limits are in their own sections, and are a map of the different node type names to the limit you want to throttle for that node type.
{
  "name": "MaximumInBuildReplicasPerNodeConstraintCheckThrottle",
  "parameters": [
    {
      "name": "YourNodeTypeNameHere",
      "value": "100"
    },
    {
      "name": "YourOtherNodeTypeNameHere",
      "value": "100"
    }
  ]
},
{
  "name": "MaximumInBuildReplicasPerNodePlacementThrottle",
  "parameters": [
    {
      "name": "YourNodeTypeNameHere",
      "value": "100"
    },
    {
      "name": "YourOtherNodeTypeNameHere",
      "value": "100"
    }
  ]
},
{
  "name": "MaximumInBuildReplicasPerNodeBalancingThrottle",
  "parameters": [
    {
      "name": "YourNodeTypeNameHere",
      "value": "100"
    },
    {
      "name": "YourOtherNodeTypeNameHere",
      "value": "100"
    }
  ]
},
{
  "name": "MaximumInBuildReplicasPerNode",
  "parameters": [
    {
      "name": "YourNodeTypeNameHere",
      "value": "100"
    },
    {
      "name": "YourOtherNodeTypeNameHere",
      "value": "100"
    }
  ]
}
I would make these changes and then try again. Additional information like what is actually causing the nodes to be down (confirmed via events and SF health info) would help identify the source of the problem. It would probably also be good to verify that starting 100 instances of the apps on the node actually works and whether that's an appropriate threshold.

How can I filter events for the cluster autoscaler in kubernetes?

I see the following event from kubectl get events:
{
  "apiVersion": "v1",
  "count": 1,
  "eventTime": null,
  "firstTimestamp": "2019-12-04T19:52:51Z",
  "involvedObject": {
    "apiVersion": "v1",
    "kind": "Pod",
    "name": "example-deployment-55f789d54c-tlwnz",
    "namespace": "default",
    "resourceVersion": "82663",
    "uid": "2fdbd034-16cf-11ea-bc4a-42010a800186"
  },
  "kind": "Event",
  "lastTimestamp": "2019-12-04T19:52:51Z",
  "message": "Unable to mount volumes for pod \"example-deployment-55f789d54c-tlwnz_default(2fdbd034-16cf-11ea-bc4a-42010a800186)\": timeout expired waiting for volumes to attach or mount for pod \"default\"/\"example-deployment-55f789d54c-tlwnz\". list of unmounted volumes=[nfs-volume]. list of unattached volumes=[nfs-volume default-token-kc7ks]",
  "metadata": {
    "creationTimestamp": "2019-12-04T19:52:51Z",
    "name": "example-deployment-55f789d54c-tlwnz.15dd430deb31e8fd",
    "namespace": "default",
    "resourceVersion": "1529",
    "selfLink": "/api/v1/namespaces/default/events/example-deployment-55f789d54c-tlwnz.15dd430deb31e8fd",
    "uid": "a7c80266-16cf-11ea-bc4a-42010a800186"
  },
  "reason": "FailedMount",
  "reportingComponent": "",
  "reportingInstance": "",
  "source": {
    "component": "kubelet",
    "host": "gke-test-a2e50ea5b9f1dd9-my-node-pool-5a20b1ac-vk9q"
  },
  "type": "Warning"
}
....
I've tried filtering by: kubectl get events --all-namespaces -o json --field-selector source.component=cluster-autoscaler but that errors with:
{
  "apiVersion": "v1",
  "items": [],
  "kind": "List",
  "metadata": {
    "resourceVersion": "",
    "selfLink": ""
  }
}
Error from server (BadRequest): Unable to find "/v1, Resource=events" that match label selector "", field selector "source.component=cluster-autoscaler": field label not supported: source.component
How can I filter this?
This can be done using jq (though it does not return a JSON array, but individual JSON objects separated by newlines):
kubectl get events --all-namespaces -o json | jq '.items[]|select(.source.component=="cluster-autoscaler")'
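If you want a proper JSON array instead of newline-delimited objects, or only a few fields per event, small variations of the same filter work (the field names come from the event object shown above):
kubectl get events --all-namespaces -o json | jq '[.items[] | select(.source.component=="cluster-autoscaler")]'
kubectl get events --all-namespaces -o json | jq -r '.items[] | select(.source.component=="cluster-autoscaler") | "\(.lastTimestamp)\t\(.reason)\t\(.message)"'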

Extract LoadBalancer name from kubectl output with go-template

I'm trying to write a Go template that extracts the load balancer's hostname. Using --go-template={{status.loadBalancer.ingress}} returns [map[hostname:GUID.us-west-2.elb.amazonaws.com]]%. When I add .hostname to the template, I get an error saying "can't evaluate field hostname in type interface {}". I've tried using the range keyword, but I can't seem to get the syntax right.
{
  "apiVersion": "v1",
  "kind": "Service",
  "metadata": {
    "creationTimestamp": "2018-07-30T17:22:12Z",
    "labels": {
      "run": "nginx"
    },
    "name": "nginx-http",
    "namespace": "jx",
    "resourceVersion": "495789",
    "selfLink": "/api/v1/namespaces/jx/services/nginx-http",
    "uid": "18aea6e2-941d-11e8-9c8a-0aae2cf24842"
  },
  "spec": {
    "clusterIP": "10.100.92.49",
    "externalTrafficPolicy": "Cluster",
    "ports": [
      {
        "nodePort": 31032,
        "port": 80,
        "protocol": "TCP",
        "targetPort": 8080
      }
    ],
    "selector": {
      "run": "nginx"
    },
    "sessionAffinity": "None",
    "type": "LoadBalancer"
  },
  "status": {
    "loadBalancer": {
      "ingress": [
        {
          "hostname": "GUID.us-west-2.elb.amazonaws.com"
        }
      ]
    }
  }
}
As you can see from the JSON, the ingress element is an array. You can use the template function index to grab the first element of this array.
Try:
kubectl get svc <name> -o=go-template --template='{{(index .status.loadBalancer.ingress 0).hostname}}'
This assumes, of course, that you're only provisioning a single load balancer; if you have multiple, you'll have to use range.
Or try this:
kubectl get svc <name> -o go-template='{{range .status.loadBalancer.ingress}}{{.hostname}}{{printf "\n"}}{{end}}'
(If you list all services instead of naming one, wrap the template in an outer {{range .items}}…{{end}}, since the output is then a List object.)
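If you don't need a Go template specifically, the same value can also be pulled with JSONPath; this is just an alternative sketch, not part of the original answers:
kubectl get svc <name> -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'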