Dynamic targets for Prometheus in Kubernetes?

In my Docker setup, I maintain a targets.json file which is dynamically updated with targets to probe. The file starts empty and targets are appended to it during certain use cases.
Sample targets.json:
[
  {
    "targets": [
      "x.x.x.x"
    ],
    "labels": {
      "app": "testApp1"
    }
  },
  {
    "targets": [
      "x.x.x.x"
    ],
    "labels": {
      "app": "testApp2"
    }
  }
]
This file is then provided to the Prometheus configuration via file_sd_configs. Everything works fine: targets get added to the targets.json file in response to application events, and Prometheus starts monitoring them, with Blackbox performing the health checks.
scrape_configs:
  - job_name: 'test-run'
    metrics_path: /probe
    params:
      module: [icmp]
    file_sd_configs:
      - files:
          - targets.json
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox:9115
Inside my Node.js application I am able to append data to the targets.json file, but now I am trying to replicate this in Kubernetes on minikube. I tried adding the targets in a ConfigMap as follows and it works, but I don't want to populate the targets in the configuration; I would rather keep maintaining a JSON file.
Can this be done using Persistent Volumes? The pod running Prometheus will always read the targets file, and the pod running the application will write to the targets file.
kind: ConfigMap
apiVersion: v1
metadata:
  name: prometheus-cm
data:
  targets.json: |-
    [
      {
        "targets": [
          "x.x.x.x"
        ],
        "labels": {
          "app": "testApp1"
        }
      }
    ]
Simply put: what strategy is recommended in Kubernetes so that one pod can read a JSON file and another pod can write to that file?

In order to achieve your goal you need to use a PersistentVolume and a PersistentVolumeClaim (PVC):
A PersistentVolume (PV) is a piece of storage in the cluster that has
been provisioned by an administrator. It is a resource in the cluster
just like a node is a cluster resource. PVs are volume plugins like
Volumes, but have a lifecycle independent of any individual pod that
uses the PV. This API object captures the details of the
implementation of the storage, be that NFS, iSCSI, or a
cloud-provider-specific storage system.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It
is similar to a pod. Pods consume node resources and PVCs consume PV
resources. Pods can request specific levels of resources (CPU and
Memory). Claims can request specific size and access modes (e.g., can
be mounted once read/write or many times read-only).
The JSON file needs to be persisted if one pod has to write to it and another one has to read it. There is an official guide describing that concept in steps:
Create a PersistentVolume
Create a PersistentVolumeClaim
Create a Pod that uses your PersistentVolumeClaim as a volume
I also recommend reading this: Create ReadWriteMany PersistentVolumeClaims on your Kubernetes Cluster as a supplement.
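As a rough illustration of those steps (a minimal sketch, not from the guide: the resource names, the hostPath location and the node image are assumptions, and hostPath only makes sense on a single-node cluster like minikube; on a real cluster you would pick a ReadWriteMany-capable backend such as NFS):

# PersistentVolume backed by a host directory (fine for minikube experiments).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: targets-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: /data/prometheus-targets
---
# Claim bound to the static PV above (the empty storageClassName disables
# dynamic provisioning so the claim binds to targets-pv).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: targets-pvc
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
# The writer pod; the Prometheus pod would mount the same claim read-only
# and point file_sd_configs at /etc/prometheus/targets/targets.json.
apiVersion: v1
kind: Pod
metadata:
  name: targets-writer
spec:
  containers:
    - name: app
      image: node:18
      command: ["sleep", "infinity"]   # placeholder for the real application
      volumeMounts:
        - name: targets
          mountPath: /etc/prometheus/targets
  volumes:
    - name: targets
      persistentVolumeClaim:
        claimName: targets-pvc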

Related

Prometheus file based service discovery

I tried file-based service discovery, but every time I change the ConfigMap (which contains the static targets) I have to delete the Prometheus pod manually for it to pick up the config changes. Is there any way for Prometheus to pick up config changes automatically, without deleting the Prometheus pod? Any help on this issue?
I am installing prometheus-operator using the Helm chart.
target.json file
[
  {
    "labels": {
      "app": "web",
      "env": "dev"
    },
    "targets": [
      "web.dev.svc.cluster.local"
    ]
  }
]
The command I used to create the ConfigMap:
kubectl create cm static-config --from-file=target.json -n monitoring
prometheus-operator.yaml
volumes:
  - name: config-volume
    configMap:
      name: static-config
volumeMounts:
  - name: config-volume
    mountPath: /etc/prometheus/config
additionalScrapeConfigs:
  - job_name: 'file-based-targets'
    file_sd_configs:
      - files:
          - '/etc/prometheus/config/target.json'
Prometheus reloads file_sd_configs automatically using file watches according to the documentation:
It reads a set of files containing a list of zero or more <static_config>s. Changes to all defined files are detected via disk watches and applied immediately. Files may be provided in YAML or JSON format. Only changes resulting in well-formed target groups are applied.
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config
If you need to add more target files you can use a wildcard for the files, for example:
scrape_configs:
  - job_name: 'file-based-targets'
    file_sd_configs:
      - files:
          - '/etc/prometheus/targets/*.json'
If you still need reloading from configmaps you can add another container to the Prometheus CRD and use either the built-in prometheus-operator/prometheus-config-reloader (this is how the Prometheus configuration and rules are reloaded) or one of the following:
https://github.com/kiwigrid/k8s-sidecar
https://github.com/jimmidyson/configmap-reload
https://github.com/stakater/Reloader
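For illustration, such a sidecar added through the Prometheus CRD's containers field might look roughly like this (a sketch, not from the question: the container name, image tag and mount path are assumptions, and the webhook assumes the Prometheus lifecycle endpoint /-/reload is enabled):

# Sketch of a configmap-reload sidecar that watches the mounted ConfigMap
# and pokes Prometheus' reload endpoint when the file changes.
containers:
  - name: static-config-reloader
    image: jimmidyson/configmap-reload:v0.5.0
    args:
      - --volume-dir=/etc/prometheus/config
      - --webhook-url=http://localhost:9090/-/reload
    volumeMounts:
      - name: config-volume
        mountPath: /etc/prometheus/config
        readOnly: true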

K8s Service as DaemonSet

Is there a way to have a Service dynamically deployed in all namespaces of k8s?
Right now, the GlusterFS Endpoints object (which is namespace-dependent) is being deleted by k8s if the port is not in use anymore.
Ex:
{
  "kind": "Endpoints",
  "apiVersion": "v1",
  "metadata": {
    "name": "glusterfs"
  },
  "subsets": [
    {
      "addresses": [
        {
          "ip": "172.0.0.1"
        }
      ],
      "ports": [
        {
          "port": 1
        }
      ]
    }
  ]
}
So I made a Service for port 1 so that the port is considered in use all the time and I don't end up with a missing/deleted Endpoints object in any namespace.
apiVersion: v1
kind: Service
metadata:
  name: glusterfs
spec:
  ports:
    - port: 1
It would be interesting to have the above service deployed dynamically every time someone creates a new namespace.
A DaemonSet is used to deploy exactly one pod replica per node.
Coming to your question: why do you need to create the same Service across namespaces?
It is not supported out of the box, though you can create a custom script to achieve it. Kubernetes doesn't have any replication of Services, Pods, Deployments, Secrets, etc. across namespaces out of the box.
Introducing...The Kubernetes Controller/Operator Pattern.
Deploy a controller pod that has read/list permissions on the namespaces resource. This controller will "watch" the namespaces and deploy whatever resources you want when they show up or change.
To get started building your own operator or controller please look at kubebuilder. https://book.kubebuilder.io/
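For reference, the RBAC such a controller needs might look roughly like this (a sketch: the names are made up, and the rules assume the controller also creates Services/Endpoints in the namespaces it discovers):

# ClusterRole/ClusterRoleBinding for a namespace-watching controller.
# All names below are illustrative.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: namespace-watcher
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: namespace-watcher
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: namespace-watcher
subjects:
  - kind: ServiceAccount
    name: namespace-watcher
    namespace: kube-system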

How can I mount a single distinct Secret into each Pod managed by a StatefulSet?

I have 3 different Kubernetes Secrets and I want to mount each one into its own Pod managed by a StatefulSet with 3 replicas.
Is it possible to configure the StatefulSet such that each Secret is mounted into its own Pod?
Not really. A StatefulSet (and any workload controller, for that matter) allows only a single pod definition template (which can have multiple containers). The issue is that a StatefulSet is designed to have N replicas, but its single template cannot map N different Secrets onto them. That would have to be a "SecretStatefulSet": a different controller.
Some solutions:
You could define a single Kubernetes Secret that contains all the required secrets for all of your pods. The downside is that the Secret is shared between the pods. For example:
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
data:
  pod1: xxx
  pod2: xxx
  pod3: xxx
  ...
  podN: xxx
Use something like HashiCorp's Vault and store your secrets remotely with keys such as pod1, pod2, pod3, ..., podN. You can also use an HSM. This seems to be the more solid solution IMO, but it might take longer to implement.
In all cases, you will have to make sure that the number of secrets matches the number of pods in your StatefulSet.
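As a rough illustration of the shared-Secret option (a sketch, not from the answer: deriving the per-pod key from the pod name via the Downward API is my own suggestion, and the names, image and mount path are placeholders):

# Each replica mounts the whole shared Secret, but the application only
# reads the file named after its own pod (web-0, web-1, ...), which the
# Downward API exposes as POD_NAME. Secret keys must match the pod names.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: app
          image: nginx
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          volumeMounts:
            - name: shared-secret
              mountPath: /etc/secrets
              readOnly: true
      volumes:
        - name: shared-secret
          secret:
            secretName: mysecret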
This is exactly what you're looking for, I guess: https://github.com/spoditor/spoditor
Essentially, it uses a custom annotation on the PodSpec template, like:
annotations:
  spoditor.io/mount-volume: |
    {
      "volumes": [
        {
          "name": "my-volume",
          "secret": {
            "secretName": "my-secret"
          }
        }
      ],
      "containers": [
        {
          "name": "nginx",
          "volumeMounts": [
            {
              "name": "my-volume",
              "mountPath": "/etc/secrets/my-volume"
            }
          ]
        }
      ]
    }
Now, the nginx container in each Pod of the StatefulSet will try to mount its own dedicated Secret, following the pattern my-secret-{pod ordinal}.
You will just need to make sure my-secret-0, my-secret-1, and so forth exist in the same namespace as the StatefulSet.
There is more advanced usage of the annotation in the project's documentation.
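Creating the per-ordinal Secrets that the annotation expects could be as simple as (a sketch; the namespace, replica count and secret contents are placeholders):

# One Secret per StatefulSet ordinal: my-secret-0, my-secret-1, my-secret-2.
for i in 0 1 2; do
  kubectl create secret generic "my-secret-$i" \
    --from-literal=password="changeme-$i" \
    --namespace my-namespace
done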

Provisioning persistent disks for horizontally scaled pods

In our cluster we have a horizontally-scaling deployment of an application that uses a lot of local disk space, which has been causing major cluster stability problems (docker crashes, nodes recreate, etc).
We are trying to have each pod provision a gcePersistentDisk of its own so its disk usage is isolated from the cluster. We created a storage class and a persistent volume claim that uses that class, and have specified a volume mount for that claim in our deployment's pod template spec.
However, when we set the autoscaler to use multiple replicas, they apparently try to use the same volume, and we get this error:
Multi-Attach error for volume
Volume is already exclusively attached to one node and can't be attached to another
Here are the relevant parts of our manifests. Storage Class:
{
  "apiVersion": "storage.k8s.io/v1",
  "kind": "StorageClass",
  "metadata": {
    "annotations": {},
    "name": "some-storage",
    "namespace": ""
  },
  "parameters": {
    "type": "pd-standard"
  },
  "provisioner": "kubernetes.io/gce-pd"
}
PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: some-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: some-class
Deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: some-deployment
spec:
  template:
    spec:
      volumes:
        - name: some-storage
          persistentVolumeClaim:
            claimName: some-pvc
      containers:
        - # [omitted]
          volumeMounts:
            - name: some-storage
              mountPath: /var/path
With those applied, we update the deployment's autoscaler to a minimum of 2 replicas and get the above error.
Is this not how persistent volume claims should work?
We definitely don't care about volume sharing, and we don't really care about persistence, we just want storage that is isolated from the cluster -- is this the right tool for the job?
A Deployment is meant to be stateless. There is no way for the deployment controller to determine which disk belongs to which pod once a pod gets rescheduled, which would lead to corrupted state. That is the reason why a Deployment can only have one disk shared across all its pods.
Concerning the error you are seeing:
Multi-Attach error for volume
Volume is already exclusively attached to one node and can't be attached to another
You are getting this because you have pods across multiple nodes but only one volume (a Deployment can only have one), and multiple nodes are trying to attach and mount this volume for your Deployment's pods. The volume doesn't appear to be NFS, which could be mounted by multiple nodes at the same time. If you do not care about state at all and still want to use a Deployment, then you must use a volume type that supports mounts from multiple nodes at the same time, like NFS. Further, you would need to change your PVC's accessModes to ReadWriteMany, as multiple pods would write to the same physical volume.
If you need a dedicated disk for each pod, then you might want to use a StatefulSet instead. As the name suggests, its pods are meant to keep state, thus you can also define a volumeClaimTemplates section in it, which will create a dedicated disk for each pod as described in the documentation.
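For illustration, a dedicated-disk-per-pod setup with volumeClaimTemplates could look roughly like this (a sketch: names, image and size are placeholders loosely based on the question's manifests):

# Each replica gets its own PVC (and thus its own GCE PD) created from the
# volumeClaimTemplates; pod some-app-0 gets scratch-some-app-0, and so on.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: some-app
spec:
  serviceName: some-app
  replicas: 2
  selector:
    matchLabels:
      app: some-app
  template:
    metadata:
      labels:
        app: some-app
    spec:
      containers:
        - name: app
          image: my-app:latest   # placeholder
          volumeMounts:
            - name: scratch
              mountPath: /var/path
  volumeClaimTemplates:
    - metadata:
        name: scratch
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: some-storage
        resources:
          requests:
            storage: 20Gi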

Is there a way to add arbitrary records to kube-dns?

I will use a very specific way to explain my problem, but I think it is better to be specific than to explain it in an abstract way...
Say there is a MongoDB replica set outside of a Kubernetes cluster but in the same network. The IP addresses of all members of the replica set are resolved via /etc/hosts on the app servers and db servers.
In an experiment/transition phase, I need to access those mongo db servers from kubernetes pods.
However, kubernetes doesn't seem to allow adding custom entries to /etc/hosts in pods/containers.
The MongoDB replica sets are already working with a large data set, so creating a new replica set inside the cluster is not an option.
Because I use GKE, I suppose changing any of the kube-dns resources should be avoided. Configuring or replacing kube-dns to suit my needs is the last thing I want to try.
Is there a way to resolve the IP addresses of custom hostnames in a Kubernetes cluster?
It is just an idea, but if kube2sky could read some ConfigMap entries and use them as DNS records, that would be great.
e.g. repl1.mongo.local: 192.168.10.100.
EDIT: I referenced this question from https://github.com/kubernetes/kubernetes/issues/12337
There are 2 possible solutions for this problem now:
Pod-wise (adding the changes to every pod that needs to resolve these domains)
Cluster-wise (adding the changes to a central place which all pods have access to, which in our case is the DNS)
Let's begin with the pod-wise solution:
As of Kubernetes 1.7, it's possible to add entries to a Pod's /etc/hosts directly using .spec.hostAliases.
For example: to resolve foo.local, bar.local to 127.0.0.1 and foo.remote,
bar.remote to 10.1.2.3, you can configure HostAliases for a Pod under
.spec.hostAliases:
apiVersion: v1
kind: Pod
metadata:
  name: hostaliases-pod
spec:
  restartPolicy: Never
  hostAliases:
    - ip: "127.0.0.1"
      hostnames:
        - "foo.local"
        - "bar.local"
    - ip: "10.1.2.3"
      hostnames:
        - "foo.remote"
        - "bar.remote"
  containers:
    - name: cat-hosts
      image: busybox
      command:
        - cat
      args:
        - "/etc/hosts"
The Cluster-wise solution:
As of Kubernetes v1.12, CoreDNS is the recommended DNS Server, replacing kube-dns. If your cluster originally used kube-dns, you may still have kube-dns deployed rather than CoreDNS. I'm going to assume that you're using CoreDNS as your K8S DNS.
In CoreDNS it's possible to add arbitrary entries inside the cluster domain; that way all pods will resolve these entries directly from the DNS, without the need to change each and every /etc/hosts file in every pod.
First:
Let's edit the coredns ConfigMap and add the required changes:
kubectl edit cm coredns -n kube-system
apiVersion: v1
kind: ConfigMap
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        hosts /etc/coredns/customdomains.db example.org {
            fallthrough
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . "/etc/resolv.conf"
        cache 30
        loop
        reload
        loadbalance
    }
  customdomains.db: |
    10.10.1.1 mongo-en-1.example.org
    10.10.1.2 mongo-en-2.example.org
    10.10.1.3 mongo-en-3.example.org
    10.10.1.4 mongo-en-4.example.org
Basically we added two things:
The hosts plugin before the kubernetes plugin and used the fallthrough option of the hosts plugin to satisfy our case.
To shed some more light on the fallthrough option: any given backend is usually the final word for its zone - it either returns a result, or it returns NXDOMAIN for the query. However, occasionally this is not the desired behavior, so some of the plugins support a fallthrough option. When fallthrough is enabled, instead of returning NXDOMAIN when a record is not found, the plugin passes the request down the chain. A backend further down the chain then has the opportunity to handle the request, and that backend in our case is kubernetes.
We added a new file to the ConfigMap (customdomains.db) and added our custom domains (mongo-en-*.example.org) in there.
The last thing is to remember to add the customdomains.db file to the config-volume of the CoreDNS pod template:
kubectl edit -n kube-system deployment coredns
volumes:
  - name: config-volume
    configMap:
      name: coredns
      items:
        - key: Corefile
          path: Corefile
        - key: customdomains.db
          path: customdomains.db
and finally to make kubernetes reload CoreDNS (each pod running):
$ kubectl rollout restart -n kube-system deployment/coredns
OxMH's answer is fantastic, and can be simplified for brevity. CoreDNS allows you to specify hosts directly in the hosts plugin (https://coredns.io/plugins/hosts/#examples).
The ConfigMap can therefore be edited like so:
$ kubectl edit cm coredns -n kube-system
apiVersion: v1
kind: ConfigMap
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        hosts {
            10.10.1.1 mongo-en-1.example.org
            10.10.1.2 mongo-en-2.example.org
            10.10.1.3 mongo-en-3.example.org
            10.10.1.4 mongo-en-4.example.org
            fallthrough
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . "/etc/resolv.conf"
        cache 30
        loop
        reload
        loadbalance
    }
You will still need to restart coredns so it rereads the config:
$ kubectl rollout restart -n kube-system deployment/coredns
Inlining the contents of the hosts file removes the need to map it in from the ConfigMap. Both approaches achieve the same outcome; it comes down to personal preference as to where you want to define the hosts.
A Service of type ExternalName can be used to access hosts or IPs outside of Kubernetes.
The following worked for me.
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "tiny-server-5",
    "namespace": "default"
  },
  "spec": {
    "type": "ExternalName",
    "externalName": "192.168.1.15",
    "ports": [
      { "port": 80 }
    ]
  }
}
For the record, an alternate solution for those not checking the referenced github issue.
You can define an "external" Service in Kubernetes, by not specifying any selector or ClusterIP. You have to also define a corresponding Endpoint pointing to your external IP.
From the Kubernetes documentation:
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "my-service"
  },
  "spec": {
    "ports": [
      {
        "protocol": "TCP",
        "port": 80,
        "targetPort": 9376
      }
    ]
  }
}
{
  "kind": "Endpoints",
  "apiVersion": "v1",
  "metadata": {
    "name": "my-service"
  },
  "subsets": [
    {
      "addresses": [
        { "ip": "1.2.3.4" }
      ],
      "ports": [
        { "port": 9376 }
      ]
    }
  ]
}
With this, you can point your app inside the containers to my-service:9376 and the traffic should be forwarded to 1.2.3.4:9376
Limitations:
The DNS name used needs to be only letters, numbers or dashes. You can't use multi-level names (something.like.this). This means you probably have to modify your app to point just to your-service, and not yourservice.domain.tld.
You can only point to a specific IP, not to a DNS name. For that, you can define a kind of DNS alias with an ExternalName-type Service.
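For completeness, such a DNS alias might look like this (a sketch; the Service name and the external hostname are placeholders):

# ExternalName Service acting as an in-cluster DNS alias (CNAME) for an
# external hostname.
apiVersion: v1
kind: Service
metadata:
  name: mongo-alias
spec:
  type: ExternalName
  externalName: mongo1.example.com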
UPDATE 2017-07-03: Kubernetes 1.7 now supports adding entries to a Pod's /etc/hosts with HostAliases.
The solution is not about kube-dns, but /etc/hosts.
Anyway, the following trick seems to work so far...
EDIT: Changing /etc/hosts may have a race condition with the Kubernetes system, so let it retry.
1) Create a ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: db-hosts
data:
  hosts: |
    10.0.0.1 db1
    10.0.0.2 db2
2) Add a script named ensure_hosts.sh.
#!/bin/sh
while true
do
  grep db1 /etc/hosts > /dev/null || cat /mnt/hosts.append/hosts >> /etc/hosts
  sleep 5
done
Don't forget chmod a+x ensure_hosts.sh.
3) Add a wrapper script start.sh to your image
#!/bin/sh
$(dirname "$(realpath "$0")")/ensure_hosts.sh &
exec your-app args...
Don't forget chmod a+x start.sh
4) Use the configmap as a volume and run start.sh
apiVersion: extensions/v1beta1
kind: Deployment
...
spec:
  template:
    ...
    spec:
      volumes:
        - name: hosts-volume
          configMap:
            name: db-hosts
      ...
      containers:
        - command:
            - ./start.sh
          ...
          volumeMounts:
            - name: hosts-volume
              mountPath: /mnt/hosts.append
          ...
Using a ConfigMap seems like a better way to set DNS, but it's a little heavy when you just want to add a few records (in my opinion). So I add records to /etc/hosts with a shell script executed by the Docker CMD.
for example:
Dockerfile
...(ignore)
COPY run.sh /tmp/run.sh
CMD bash /tmp/run.sh
run.sh
#!/bin/bash
# Note: /etc/hosts expects the IP address first, then the hostname.
echo 192.168.10.100 repl1.mongo.local >> /etc/hosts
# some other commands...
Note: if you run MORE THAN ONE container in a pod, you have to add the script to each container, because Kubernetes starts the containers in an arbitrary order and /etc/hosts may be overridden by another container (one that starts later).