How can I get more Replicas of Istio Running? - kubernetes

I am trying to upgrade the nodes in my Kubernetes cluster. When I go to do that, I get a notification saying:
PDB istio-ingressgateway in namespace istio-system allows 0 pod disruptions
PDB stands for Pod Disruption Budget. Basically, Istio is saying that it can't lose that pod and keep things working right.
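(For reference, you can see the budget that is blocking the node drain with a couple of quick kubectl checks; the namespace and resource name come from the notification above:)
kubectl get pdb -n istio-system
kubectl describe pdb istio-ingressgateway -n istio-system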
There is a really long discussion about this over on the Istio GitHub issues; it has been ongoing for over two years. Most of the discussion centers on the defaults being wrong. There are a few workaround suggestions, but most of them predate 1.4 (and the introduction of Istiod). The closest workaround I could find that might be compatible with the current version is to add some additional replicas via the IstioOperator.
I tried that with a patch operation (run in PowerShell):
kubectl patch IstioOperator installed-state --patch $(Get-Content istio-ha-patch.yaml -Raw) --type=merge -n istio-system
Where istio-ha-patch.yaml is:
spec:
  components:
    egressGateways:
    - enabled: true
      k8s:
        hpaSpec:
          minReplicas: 2
      name: istio-egressgateway
    ingressGateways:
    - enabled: true
      k8s:
        hpaSpec:
          minReplicas: 2
      name: istio-ingressgateway
    pilot:
      enabled: true
      k8s:
        hpaSpec:
          minReplicas: 2
I applied that and checked the YAML of the IstioOperator, and the patch did show up in the resource's YAML. But the replica count for the ingress pod did not go up. (It stayed at 1 of 1.)
At this point, my only option seems to be to uninstall Istio, apply my node upgrade, then re-install Istio. (Yuck.)
Is there any way to get the replica count of Istio's ingress gateway up so that I can keep it running while I do a rolling node upgrade?

It turns out that if you did not install Istio using the Istio Kubernetes Operator, you cannot use the option I tried.
Once I uninstalled Istio and reinstalled it using the Operator, I was able to get it to work.
I did not use the patch operation this time, though; I just did a kubectl apply -f istio-operator-spec.yaml, where istio-operator-spec.yaml is:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-controlplane
  namespace: istio-system
spec:
  components:
    ingressGateways:
    - enabled: true
      k8s:
        hpaSpec:
          minReplicas: 2
      name: istio-ingressgateway
    pilot:
      enabled: true
      k8s:
        hpaSpec:
          minReplicas: 2
  profile: default
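(For completeness, the reinstall sequence was roughly the following; the istioctl commands are a sketch and vary by Istio version, so check the docs for yours:)
istioctl experimental uninstall --purge     # uninstall method depends on how Istio was originally installed
istioctl operator init                      # installs the Istio operator controller
kubectl apply -f istio-operator-spec.yaml   # the spec shown above
kubectl -n istio-system get pods -l app=istio-ingressgateway   # should show 2 replicas once reconciled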

Related

Kubernetes Cronjob labeling

I have seen a few related posts, but none answered my question, so I thought I would ask a new one based on suggestions from other users here as well.
I need a selector label for a network policy that targets a running CronJob which connects to some other services within the cluster. As far as I know, there is no easy, straightforward way to make a selector label for the Job's pods, since duplicate job labels would be problematic if they ever existed. I'm not sure why the CronJob can't have a selector itself that is then applied to the Job and the pod.
There might also be a possibility to put this CronJob in its own namespace and then allow everything from that namespace to whatever is needed in the network policy, but that does not feel like the right way to solve the problem.
Using k8s v1.20
First of all, to select pods (spawned by your CronJob) that should be allowed by the NetworkPolicy as ingress sources or egress destinations, you may set a specific label for those pods.
You can easily set a label for Jobs spawned by a CronJob using the labels field (another example with an explanation can be found in the OpenShift CronJobs documentation):
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: mysql-test
spec:
  ...
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            workload: cronjob # Sets a label for jobs spawned by this CronJob.
            type: mysql       # Sets another label for jobs spawned by this CronJob.
  ...
Pods spawned by this CronJob will have the labels type=mysql and workload=cronjob. Using these labels, you can create/customize your NetworkPolicy:
$ kubectl get pods --show-labels
NAME                          READY   STATUS      RESTARTS   AGE    LABELS
mysql-test-1615216560-tkdvk   0/1     Completed   0          2m2s   ...,type=mysql,workload=cronjob
mysql-test-1615216620-pqzbk   0/1     Completed   0          62s    ...,type=mysql,workload=cronjob
mysql-test-1615216680-8775h   0/1     Completed   0          2s     ...,type=mysql,workload=cronjob
$ kubectl describe pod mysql-test-1615216560-tkdvk
Name:         mysql-test-1615216560-tkdvk
Namespace:    default
...
Labels:       controller-uid=af99e9a3-be6b-403d-ab57-38de31ac7a9d
              job-name=mysql-test-1615216560
              type=mysql
              workload=cronjob
...
For example, this mysql-workload NetworkPolicy allows connections to all pods in the mysql namespace from any pod with the labels type=mysql and workload=cronjob (logical conjunction) in a namespace with the label namespace-name=default:
NOTE: Be careful to use correct YAML (take a look at this namespaceSelector and podSelector example).
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mysql-workload
  namespace: mysql
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          namespace-name: default
      podSelector:
        matchLabels:
          type: mysql
          workload: cronjob
Also keep in mind that network policies only take effect if your networking solution supports them:
Network policies are implemented by the network plugin. To use network policies, you must be using a networking solution which supports NetworkPolicy. Creating a NetworkPolicy resource without a controller that implements it will have no effect.
You can learn more about creating Kubernetes NetworkPolicies in the Network Policies documentation.
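To double-check that the policy and the pod labels line up, a quick sketch (the resource names and namespaces are taken from the example above):
kubectl describe networkpolicy mysql-workload -n mysql
kubectl get pods -n default -l type=mysql,workload=cronjob --show-labels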

Azure Kubernetes - prometheus is deployed as a part of ISTIO not showing the deployments?

I have used the following configuration to set up Istio:
cat << EOF | kubectl apply -f -
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: istio-control-plane
spec:
  # Use the default profile as the base
  # More details at: https://istio.io/docs/setup/additional-setup/config-profiles/
  profile: default
  # Enable the addons that we will want to use
  addonComponents:
    grafana:
      enabled: true
    prometheus:
      enabled: true
    tracing:
      enabled: true
    kiali:
      enabled: true
  values:
    global:
      # Ensure that the Istio pods are only scheduled to run on Linux nodes
      defaultNodeSelector:
        beta.kubernetes.io/os: linux
    kiali:
      dashboard:
        auth:
          strategy: anonymous
  components:
    egressGateways:
    - name: istio-egressgateway
      enabled: true
EOF
and exposed the Prometheus service as shown below:
kubectl expose service prometheus --type=LoadBalancer --name=prometheus-svc --namespace istio-system
kubectl get svc prometheus-svc -n istio-system -o json
export PROMETHEUS_URL=$(kubectl get svc prometheus-svc -n istio-system -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}"):$(kubectl get svc prometheus-svc -n istio-system -o 'jsonpath={.spec.ports[0].port}')
echo http://${PROMETHEUS_URL}
curl http://${PROMETHEUS_URL}
I have deployed an application; however, I can't see its deployments in Prometheus.
The standard Prometheus installation by Istio does not configure your pods to send metrics to Prometheus. It just collects data from the Istio resources.
To have your own pods scraped as well, add the following annotations in the deployment.yml of your application:
apiVersion: apps/v1
kind: Deployment
[...]
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"   # determines if a pod should be scraped; set to "true" to enable scraping
        prometheus.io/path: /metrics   # determines the path to scrape metrics at; defaults to /metrics
        prometheus.io/port: "80"       # determines the port to scrape metrics at; defaults to 80
[...]
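(After redeploying, you can check whether the pod shows up as a target; a quick sketch, assuming the bundled Prometheus service in istio-system listens on its default port 9090:)
kubectl -n istio-system port-forward svc/prometheus 9090:9090
# then open http://localhost:9090/targets in a browser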
By the way: the Prometheus instance installed with istioctl should not be used for production. From the docs:
[...] pass --set values.prometheus.enabled=true during installation. This built-in deployment of Prometheus is intended for new users to help them quickly getting started. However, it does not offer advanced customization, like persistence or authentication and as such should not be considered production ready.
You should set up your own Prometheus and configure Istio to report to it. See:
Reference: https://istio.io/latest/docs/ops/integrations/prometheus/#option-1-metrics-merging
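A minimal sketch of what the metrics-merging option from that doc can look like in the IstioOperator (field name per Istio 1.7; verify it against your version before relying on it):
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: istio-control-plane
spec:
  meshConfig:
    # lets the sidecars merge application metrics with Envoy metrics
    # under the standard prometheus.io/* scrape annotations
    enablePrometheusMerge: true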
The following YAML provided by Istio can be used as a reference for setting up Prometheus:
https://raw.githubusercontent.com/istio/istio/release-1.7/samples/addons/prometheus.yaml
Furthermore, if I remember correctly, installation of addons like Kiali, Prometheus, ... with istioctl will be removed with Istio 1.8 (release date December 2020). So you might want to set up your own instances with Helm anyway.
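(As a sketch of that route using the community chart; the repo URL and chart name are my assumptions, adjust to your own setup:)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/prometheus -n istio-system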

prometheus operator - enable monitoring for everything in all namespaces

I want to monitor a couple of applications running on a Kubernetes cluster, in namespaces named development and production, through prometheus-operator.
The installation command used (as per GitHub) is:
helm install prometheus-operator stable/prometheus-operator -n production --set prometheusOperator.enabled=true,prometheus.service.type=NodePort,prometheusOperator.service.type=NodePort,alertmanager.service.type=NodePort,grafana.service.type=NodePort,grafana.service.nodePort=30906
What parameters do I need to add to the above command to have prometheus-operator discover and monitor all apps/services/pods running in all namespaces?
With this, Service Discovery only shows some prometheus-operator-related services, but not the app that I am running within the 'production' namespace, even though prometheus-operator is installed in the same namespace.
Is there anything I am missing?
Note: I am performing all actions as the same user (which uses the $HOME/.kube/config file), so I assume permissions are not an issue.
kubectl version - v1.17.3
helm version - 3.1.2
P.S. There are numerous articles on this on different forums, but I am still not finding a simple and direct answer.
I had the same problem. After some investigation, I'm answering with more details.
I've installed the Prometheus stack via Helm charts, which include the Prometheus operator chart directly as a sub-chart. The Prometheus operator monitors the namespaces specified by the following Helm values:
prometheusOperator:
  namespaces: ''
  denyNamespaces: ''
  prometheusInstanceNamespaces: ''
  alertmanagerInstanceNamespaces: ''
  thanosRulerInstanceNamespaces: ''
The namespaces value specifies the monitored namespaces for the ServiceMonitor and PodMonitor CRDs. Other CRDs have their own settings, which, if not set, default to namespaces. Helm values are passed as command-line arguments to the operator. See here and here.
Prometheus CRDs are picked up by the operator from the mentioned namespaces, by default everywhere. However, as the operator is designed with multiple simultaneous Prometheus releases in mind, what a particular Prometheus app instance picks up is controlled by the corresponding Prometheus CRD. CRD selectors and the corresponding namespace selectors are controlled via the following Helm values:
prometheus:
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: true
    serviceMonitorSelector: {}
    serviceMonitorNamespaceSelector: {}
Similar values are present for other CRDs: alertmanagerConfigXXX, ruleNamespaceXXX, podMonitorXXX, probeXXX. XXXSelectorNilUsesHelmValues set to true means to look for CRDs with a particular release label, e.g. release=myrelease. See here.
An empty selector (for a namespace, CRD, or any other object) means no filtering. So for the Prometheus object to pick up a ServiceMonitor from other namespaces, there are a few options (a command-line sketch follows the list):
Set serviceMonitorSelectorNilUsesHelmValues: false. This leaves serviceMonitorSelector empty.
Apply the release label, e.g. release=myrelease, to your ServiceMonitor CRD.
Set a non-empty serviceMonitorSelector that matches your ServiceMonitor.
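For instance, option 1 can be applied straight from the Helm command line; this is only a sketch, with the value paths taken from the chart values shown above:
helm upgrade --install prometheus-operator stable/prometheus-operator -n production \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false \
  --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false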
For the curious ones here are links to the operator sources:
Enqueue of Prometheus CRD processing
Processing of Prometheus CRD
I used values.yaml from https://github.com/helm/charts/blob/master/stable/prometheus-operator/values.yaml, changed the *NilUsesHelmValues parameters to false, and it seems to work fine with that.
helm install prometheus-operator stable/prometheus-operator -n monitoring -f values.yaml
Also, like https://stackoverflow.com/users/7889479/anish-kumar-mourya stated, the services do show in the Grafana dashboard even though they don't appear in the Prometheus UI under Service Discovery or Targets.
Hope this helps other newbies like me.
No, it's fine, but you can create a new namespace for monitoring and install Prometheus there; that makes it easier to manage everything related to monitoring:
helm install prometheus-operator stable/prometheus-operator -n monitoring
You need to create a Service for the pod and a ServiceMonitor custom resource to configure which services in which namespaces should be discovered by Prometheus.
kube-state-metrics Service example
apiVersion: v1
kind: Service
metadata:
  labels:
    app: kube-state-metrics
    k8s-app: kube-state-metrics
  annotations:
    alpha.monitoring.coreos.com/non-namespaced: "true"
  name: kube-state-metrics
spec:
  ports:
  - name: http-metrics
    port: 8080
    targetPort: metrics
    protocol: TCP
  selector:
    app: kube-state-metrics
This Service targets all Pods with the label app: kube-state-metrics (see spec.selector above).
Generic ServiceMonitor example
This ServiceMonitor targets all Services that have the label k8s-app (spec.selector) with any value, in the namespaces kube-system and monitoring (spec.namespaceSelector).
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: k8s-apps-http
  labels:
    k8s-apps: http
spec:
  jobLabel: k8s-app
  selector:
    matchExpressions:
    - {key: k8s-app, operator: Exists}
  namespaceSelector:
    matchNames:
    - kube-system
    - monitoring
  endpoints:
  - port: http-metrics
    interval: 15s
https://github.com/coreos/prometheus-operator/blob/master/Documentation/user-guides/running-exporters.md

How to specify a GKE node pool configuration in a YAML file instead of using gcloud container node-pools create?

It seems that the only way to create node pools on Google Kubernetes Engine is with the command gcloud container node-pools create. I would like to have all the configuration in a YAML file instead. What I tried is the following:
apiVersion: v1
kind: NodeConfig
metadata:
  annotations:
    cloud.google.com/gke-nodepool: ares-pool
spec:
  diskSizeGb: 30
  diskType: pd-standard
  imageType: COS
  machineType: n1-standard-1
  metadata:
    disable-legacy-endpoints: 'true'
  oauthScopes:
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring
  - https://www.googleapis.com/auth/service.management.readonly
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/trace.append
  serviceAccount: default
But kubectl apply fails with:
error: unable to recognize "ares-pool.yaml": no matches for kind "NodeConfig" in version "v1"
I am surprised that Google yields almost no relevant results for all my searches. The only documentation that I found was the one on Google Cloud, which is quite incomplete in my opinion.
Node pools are not Kubernetes objects, they are part of the Google Cloud API. Therefore Kubernetes does not know about them, and kubectl apply will not work.
What you actually need is a solution called "infrastructure as code" - code that tells GCP what kind of node pool you want.
If you don't strictly need YAML, you can check out Terraform, which handles this use case. See: https://terraform.io/docs/providers/google/r/container_node_pool.html
You can also look into Google Deployment Manager or Ansible (it has a GCP module and uses YAML syntax); they also address this need.
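(Whichever tool you pick, it ultimately issues the same API call the CLI makes. For reference, the imperative equivalent of the YAML in the question is roughly the following; the cluster and zone placeholders are mine, and the flags are mapped from the node config above:)
gcloud container node-pools create ares-pool \
  --cluster=<my-cluster> --zone=<my-zone> \
  --machine-type=n1-standard-1 --image-type=COS \
  --disk-type=pd-standard --disk-size=30 \
  --metadata=disable-legacy-endpoints=true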
I don't know if it accurately answers your needs, but if you want to do IaC in general with Kubernetes, you can use Crossplane CRDs. If you already have a running cluster, you just have to install their Helm chart and you can provision a cluster this way:
apiVersion: container.gcp.crossplane.io/v1beta1
kind: GKECluster
metadata:
  name: gke-crossplane-cluster
spec:
  forProvider:
    initialClusterVersion: "1.19"
    network: "projects/development-labs/global/networks/opsnet"
    subnetwork: "projects/development-labs/regions/us-central1/subnetworks/opsnet"
    ipAllocationPolicy:
      useIpAliases: true
    defaultMaxPodsConstraint:
      maxPodsPerNode: 110
And then you can define an associated node pool as follows:
apiVersion: container.gcp.crossplane.io/v1alpha1
kind: NodePool
metadata:
  name: gke-crossplane-np
spec:
  forProvider:
    autoscaling:
      autoprovisioned: false
      enabled: true
      maxNodeCount: 2
      minNodeCount: 1
    clusterRef:
      name: gke-crossplane-cluster
    config:
      diskSizeGb: 100
      # diskType: pd-ssd
      imageType: cos_containerd
      labels:
        test-label: crossplane-created
      machineType: n1-standard-4
      oauthScopes:
      - "https://www.googleapis.com/auth/devstorage.read_only"
      - "https://www.googleapis.com/auth/logging.write"
      - "https://www.googleapis.com/auth/monitoring"
      - "https://www.googleapis.com/auth/servicecontrol"
      - "https://www.googleapis.com/auth/service.management.readonly"
      - "https://www.googleapis.com/auth/trace.append"
    initialNodeCount: 2
    locations:
    - us-central1-a
    management:
      autoRepair: true
      autoUpgrade: true
If you want, you can find a full example of GKE provisioning with Crossplane here.

Kubernetes HPA fails to detect a successfully published custom metric from Stackdriver

I'm trying to scale a Kubernetes Deployment using a HorizontalPodAutoscaler that listens to a custom metric through Stackdriver.
I have a GKE cluster with the Stackdriver adapter enabled.
I'm able to publish the custom metric type to Stackdriver, and this is how it is displayed in Stackdriver's Metrics Explorer.
This is how I have defined my HPA:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: custom.googleapis.com|worker_pod_metrics|baz
      targetValue: 400
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1
After successfully creating example-hpa, executing kubectl get hpa example-hpa always shows TARGETS as <unknown>, and it never detects the value from the custom metric.
NAME          REFERENCE                       TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
example-hpa   Deployment/test-app-group-1-1   <unknown>/400   1         10        1          18m
I'm using a Java client which runs locally to publish my custom metrics.
I have given the appropriate resource labels as mentioned here (hard-coded, so that it can run without a problem in a local environment). I have followed this document to create the Java client.
private static MonitoredResource prepareMonitoredResourceDescriptor() {
    Map<String, String> resourceLabels = new HashMap<>();
    resourceLabels.put("project_id", "<<<my-project-id>>>");
    resourceLabels.put("pod_id", "<my pod UID>");
    resourceLabels.put("container_name", "");
    resourceLabels.put("zone", "asia-southeast1-b");
    resourceLabels.put("cluster_name", "my-cluster");
    resourceLabels.put("namespace_id", "mynamespace");
    resourceLabels.put("instance_id", "");
    return MonitoredResource.newBuilder()
            .setType("gke_container")
            .putAllLabels(resourceLabels)
            .build();
}
What am I doing wrong in the above-mentioned steps please? Thank you in advance for any answers provided!
EDIT [RESOLVED]:
I think I had some misconfigurations, since kubectl describe hpa [NAME] --v=9 showed me some 403 status codes, and I was also using type: External instead of type: Pods (thanks MWZ for your answer pointing out this mistake).
I managed to fix it by creating a new project, a new service account, and a new GKE cluster (basically everything from the beginning again). Then I changed my YAML file as follows, exactly as this document explains.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test-app-group-1-1
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: test-app-group-1-1
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods  # Earlier this was type: External
    pods:       # Earlier this was external:
      metricName: baz # metricName: custom.googleapis.com|worker_pod_metrics|baz
      targetAverageValue: 20
I'm now exporting as custom.googleapis.com/baz, and NOT as custom.googleapis.com/worker_pod_metrics/baz. Also, now I'm explicitly specifying the namespace for my HPA in the yaml.
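(To verify that the adapter actually serves the metric before pointing the HPA at it, a check along these lines can help; the metric name baz and the default namespace come from the example above, and the exact resource path may differ by adapter version:)
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/baz" | jq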
Since you can see your custom metric in the Stackdriver GUI, I'm guessing the metrics are correctly exported. Based on Autoscaling Deployments with Custom Metrics, I believe you have wrongly defined the metric to be used by the HPA to scale the deployment.
Please try using this YAML:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: baz
      targetAverageValue: 400
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1
Please keep in mind that:
The HPA uses the metrics to compute an average and compare it to the target average value. In the application-to-Stackdriver export example, a Deployment contains Pods that export metrics. The following manifest file describes a HorizontalPodAutoscaler object that scales a Deployment based on the target average value for the metric.
Troubleshooting steps described on the page above can also be useful.
Side-note
Since the above HPA is using the beta API autoscaling/v2beta1, I got an error when running kubectl describe hpa [DEPLOYMENT_NAME]. I ran kubectl describe hpa [DEPLOYMENT_NAME] --v=9 and got the response in JSON.
It is good practice to put some unique labels on your metrics so you can target them. Right now, based on the metric labels set in your Java client, only pod_id looks unique, and it can't be used because pods are ephemeral.
So, I would suggest you try introducing a deployment-wide/metrics-wide unique identifier:
resourceLabels.put("<identifier>", "<could-be-deployment-name>");
After this, you can try modifying your HPA to something similar to the following:
apiVersion: autoscaling/v2beta1 # assumed: same API version as the HPAs above
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: custom.googleapis.com|worker_pod_metrics|baz
      metricSelector:
        matchLabels:
          # define labels to target
          metric.labels.identifier: <deployment-name>
      # scale +1 whenever it crosses multiples of mentioned value
      targetAverageValue: "400"
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1
Apart from this, this setup has no issues and should work smoothly.
Helper command to see what metrics are exposed to the HPA:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/custom.googleapis.com|worker_pod_metrics|baz" | jq